nlpconnect/vit-gpt2-image-captioning vs facebook/detr-resnet-101 | What are the differences? | StackShare