GPT-4 by OpenAI, Grok 4, ultralyticsplus/yolov8s, immich-app/ViT-H-14-378-quickgelu__dfn5b, and LLaVA are the most popular tools in the “Multimodal Models” category.
GPT-4 by OpenAI
67 stacks
Grok 4
4 stacks
ultralyticsplus/yolov8s
2 stacks
immich-app/ViT-H-14-378-quickgelu__dfn5b
1 stack
LLaVA
openai/clip-vit-large-patch14
patrickjohncyh/fashion-clip
nlpconnect/vit-gpt2-image-captioning
FoxAIHub
AI Image to Text
A large multimodal model that can solve difficult problems with greater accuracy