How to encode images for vector search?

michael_hoeller · December 16, 2023, 9:51pm

Hi there,

I am urgently looking for a way to encode pictures of textiles (always the same shape).
Encoding and vector search are easy for text. For images, I am already struggling to find an encoding service. I want to focus on material and patterns, not on shape.
Can you help me and point me to a good source for encoding images of textiles?

Thanks a lot!
Michael

Jack_Woehr · December 17, 2023, 5:52pm

@michael_hoeller it seems to be a difficult problem!

There’s a seminal paper here.

Dave_Nielsen · December 18, 2023, 8:58pm

@Luca_Napoli created an image search demo that might be useful. He used SqueezeNet, a simple model that you can download from PyTorch, to generate the embeddings. He also includes a link to the GitHub repo. Reply here if you have any questions; I’ll bet Luca would be happy to help out.
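For readers who want to try this, here is a minimal sketch of turning an image into an embedding with SqueezeNet. It is not taken from Luca's demo. It assumes torchvision 0.13 or newer and Pillow are installed, picks the `squeezenet1_1` variant with ImageNet weights, and averages the last convolutional feature map over all spatial positions, which discards most layout information. All of those choices are assumptions made for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so cosine and dot-product similarity agree."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def embed_image(path):
    """Return a 512-dimensional, unit-length embedding for the image at `path`."""
    # Imported here so that l2_normalize can be used without PyTorch installed.
    import torch
    from PIL import Image
    from torchvision.models import SqueezeNet1_1_Weights, squeezenet1_1

    weights = SqueezeNet1_1_Weights.DEFAULT    # ImageNet weights, downloaded on first use
    model = squeezenet1_1(weights=weights).eval()
    preprocess = weights.transforms()          # resize, crop, normalize as the weights expect

    image = Image.open(path).convert("RGB")
    batch = preprocess(image).unsqueeze(0)     # shape: (1, 3, H, W)
    with torch.no_grad():
        feature_map = model.features(batch)    # shape: (1, 512, h, w)
        pooled = feature_map.mean(dim=(2, 3))  # average over spatial positions
    return l2_normalize(pooled[0].tolist())
```

Calling `embed_image("fabric_sample.jpg")` (a hypothetical file name) would return a list of 512 floats that can be stored next to the image's metadata. Whether ImageNet features separate fabrics well enough is something to check on real data.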

Dave_Nielsen · December 18, 2023, 9:03pm

Here’s another one that might be helpful: Source Digital analysed images in videos to trigger ads. They use models from Google Vision AI. @Mat_Keep and Elliott Gluck wrote up an overview of the story in the third example in this blog post, titled "Creating a new media currency with video detection and monetization".

Hubert_Nguyen1 · January 2, 2024, 5:32pm

@michael_hoeller, I haven’t tried this myself, but I’ve seen some information on this topic.

It looks like you’re looking for a “texture-recognition” model. Depending on your use case and data sets, some models might be able to recognize the appearance of a fabric, while others are trained to categorize the type of fabric, in which case you might want to search for “fabric structure” models. I haven’t seen these in popular places like Hugging Face, but maybe I missed them.
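To make "texture rather than shape" concrete, here is a small sketch of one classic hand-crafted texture descriptor, the local binary pattern (LBP) histogram. It is not one of the trained models mentioned above, just an illustration of a feature vector that describes local patterns and ignores where they occur in the image. The function name and the plain 2D-list input format are assumptions made for this example.

```python
def lbp_histogram(gray):
    """Return a normalized 256-bin histogram of 8-neighbour LBP codes.

    `gray` is a 2D list of grayscale intensities (rows of equal length).
    """
    # Neighbour offsets, clockwise starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    height = len(gray)
    width = len(gray[0]) if height else 0
    hist = [0] * 256
    # Border pixels are skipped because they lack a full neighbourhood.
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            center = gray[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # Set the bit when the neighbour is at least as bright as the center.
                if gray[r + dr][c + dc] >= center:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist)
    if total == 0:
        return [0.0] * 256
    return [count / total for count in hist]
```

The resulting 256 numbers can be used directly as a vector for similarity search. Because the histogram only counts which local patterns occur, two crops of the same fabric tend to produce similar vectors even if the crops are positioned differently.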

michael_hoeller · January 2, 2024, 9:22pm

Hello @Hubert_Nguyen1

Thanks for the update. I am currently testing a slightly different approach, unfortunately without MongoDB (yet). The use case changed or, if you like, iterated into a proof of concept that identifies pairs of socks. One option I am pursuing is to encode the images based on texture and run a vector search. The other, which I currently prefer, is to ask “the AI” only one question. Let me reword this: focus on simplicity. Asking for only one “parameter” reduces complexity tremendously. I am currently training a model to answer only one question: are these socks a pair? It still takes quite a bit of training, but surprisingly less than expected. The approach of working with multiple nodes in my net exploded the number of options and therefore the amount of training. One result, though not the initial goal, is that I have learned to think carefully about the question to be answered, which makes it possible to build small models that still work very well.
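As a rough illustration of the first option (texture-driven encoding plus vector search), the sketch below finds the most similar other sock for a given sock by cosine similarity. The three-dimensional embeddings and the sock names are made up for the example; real embeddings would come from whatever encoder is chosen and would have many more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_match(query_id, embeddings):
    """Return (other_id, similarity) for the item most similar to `query_id`."""
    query = embeddings[query_id]
    candidates = (
        (other_id, cosine_similarity(query, vec))
        for other_id, vec in embeddings.items()
        if other_id != query_id
    )
    return max(candidates, key=lambda item: item[1])


# Made-up embeddings, purely for illustration.
socks = {
    "striped_a": [0.9, 0.1, 0.0],
    "striped_b": [0.8, 0.2, 0.1],
    "dotted_a": [0.1, 0.9, 0.2],
    "dotted_b": [0.0, 0.8, 0.3],
}

match_id, score = best_match("striped_a", socks)
print(match_id, round(score, 3))
```

This brute-force comparison is fine for a handful of items. A vector store does the same ranking with an index once the collection grows.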
Unfortunately, another project will eat up some time over the next two weeks. After that I’ll continue to work on the actual textile / image encoding.
Best, Michael

Hubert_Nguyen1 · January 3, 2024, 1:08am

Thanks for letting us know. It makes sense to nail down the model before starting to use a vector store, no worries. You’ll see how super-easy MongoDB Vector Search is once you have your embeddings ready for vector-search testing.
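For anyone reading along, a query could look roughly like the sketch below, which only builds the aggregation pipeline and does not connect to a database. The index name `image_vector_index` and the field names `embedding` and `filename` are hypothetical. The index itself has to be created in Atlas beforehand with a dimension matching the embeddings.

```python
def build_vector_search_pipeline(query_vector, limit=5):
    """Build an aggregation pipeline for MongoDB Atlas Vector Search."""
    return [
        {
            "$vectorSearch": {
                "index": "image_vector_index",  # hypothetical index name
                "path": "embedding",            # hypothetical field holding the vector
                "queryVector": list(query_vector),
                "numCandidates": limit * 20,    # neighbours considered before ranking
                "limit": limit,                 # number of documents returned
            }
        },
        {
            "$project": {
                "_id": 0,
                "filename": 1,                  # hypothetical metadata field
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]


pipeline = build_vector_search_pipeline([0.12, 0.48, 0.31], limit=3)
```

With PyMongo, the pipeline would then be passed to `collection.aggregate(pipeline)`. The ratio between `numCandidates` and `limit` is a tuning choice, and the factor of 20 used here is only a starting point.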

Just curious, if you have examples and results later I’d love to see what you came up with!

Cheers, Hubert

michael_hoeller · January 3, 2024, 6:30am

Hi @Hubert_Nguyen1

You’ll see how super-easy MongoDB Vector Search is once you have your embeddings ready for vector-search testing.

It is, I have implemented several already.

Just curious, if you have examples and results later I’d love to see what you came up with!

Sure, I will. I may need to wait until the end of the project, at the end of February, before I can share more specific details.

I have not dropped the initial idea and will be on the lookout for “texture-recognition” models.

Cheers, Michael