Its description of the art piece is so awful.
I thought the same, but the description of the cat picture is pretty spot on. I wonder if this is a dataset issue. Cat pictures are far more prevalent than abstract art on the internet, so they might well be overrepresented. Can small vision LLMs deal with a long tail of underrepresented objects? Or can they only do so at scale?
Easy to try here: https://huggingface.co/spaces/NexaAIDev/omnivlm-dpo-demo
https://i.imgur.com/44XYyXU.png
I saw a turntable at a shop recently and my inner classifier went: "Oh, a DSOTM turntable, that's sweet!"
https://www.project-audio.com/en/product/the-dark-side-of-th...
I was kinda expecting the model in your picture to make the link with the album cover.
Need to try this directly before passing judgement, but this could unlock a few project ideas I have if the quality lives up to the examples with such low resource requirements.
Can GitHub please acquire all these model-hub companies like fal, replicate, ollama, hf, and (checks notes) "nexa.ai"? That way we can get past the inevitable fragmentation and ultimate breaking of everyone's workflow w.r.t. ML-oriented devops.
When faced with a diversity of implementations, why is the go-to “let’s have a corporate entity acquire them all” instead of “let’s come up with a good runtime standard”? The company is going to do the same thing anyway, except with the additional risk of messing up the API and throwing away the hard work of so many people.
You want everything under the control of Microsoft?
Satya is that you?
I definitely want to try this: https://nexa.ai/blogs/omni-vision