nlpconnect/vit-gpt2-image-captioning vs google/owlvit-base-patch32 | What are the differences? | StackShare