Qwen-Omni-Turbo
Multimodal turbo model supporting text, image, audio, and video with fast output.
Model Details
Provider: Alibaba Cloud
Model Type: multimodal
Context Window: 32,768 tokens
Pricing (USD per 1M tokens)
Input: $0.06
Output: $0.23
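To make the listed rates concrete, here is a minimal sketch of a per-request cost estimate. The rates and the context limit come from this listing. The function name, the parameter names, and the assumption that input and output tokens share the 32,768-token window are illustrative and are not taken from any official SDK. It also assumes every token is billed at the single listed rate regardless of modality, which the provider's actual billing may not do.

```python
# Rates and context limit are copied from the listing above.
# Names below are hypothetical and not part of any official SDK.
INPUT_USD_PER_MILLION = 0.06
OUTPUT_USD_PER_MILLION = 0.23
CONTEXT_WINDOW_TOKENS = 32_768


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request at the listed rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Assumption: input and output tokens share the same context window.
    if input_tokens + output_tokens > CONTEXT_WINDOW_TOKENS:
        raise ValueError("request exceeds the listed context window")
    return (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    ) / 1_000_000
```

For example, `estimate_cost_usd(20_000, 2_000)` works out to roughly $0.00166: $0.0012 for input plus $0.00046 for output.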
Capabilities
1. Fast multimodal understanding
- Handles text, audio, image, and video input.
2. Text and audio output
- Well suited to assistants and educational applications.
3. Strong cross-modal alignment
- Suited to recognition, instruction-following, and cross-modal conversion tasks.