Guide to using vision language models for image understanding tasks
Deployment Name | Parameters | Architecture | License | Context Window | Always-On Shared Endpoint |
---|---|---|---|---|---|
qwen2-vl-7b-instruct | 7B | Qwen2 | Tongyi Qianwen | 32K | ❌ |
qwen2-5-vl-3b-instruct | 3B | Qwen2.5 | Tongyi Qianwen | 32K | ❌ |
qwen2-5-vl-7b-instruct | 7B | Qwen2.5 | Tongyi Qianwen | 32K | ❌ |
llama-3-2-11b-vision* | 11B | Llama-3 | Meta (request for commercial use) | 32K | ❌ |
llama-3-2-11b-vision-instruct* | 11B | Llama-3 | Meta (request for commercial use) | 32K | ❌ |