nlpconnect/vit-gpt2-image-captioning vs amunchet/rorshark-vit-base | What are the differences? | StackShare