GPT-4 by OpenAI, ultralyticsplus/yolov8s, patrickjohncyh/fashion-clip, openai/clip-vit-large-patch14, and LLaVA are the most popular tools in the “Multimodal Models” category.
A large multimodal model that can solve difficult problems with greater accuracy
A Gradio web UI for Large Language Models
A framework for vision-and-language multimodal research