Qwen-Omni-Turbo

Multimodal turbo model supporting text, image, audio, and video with fast output.

Model Details

Provider

Alibaba Cloud

Model Type

multimodal

Context Window

32,768 tokens

Pricing

Input (per 1M tokens): $0.06
Output (per 1M tokens): $0.23
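
The listed rates can be turned into a per-request estimate with simple arithmetic. The sketch below is illustrative only: the function name `estimate_cost` is hypothetical, and it assumes that all input and output is billed at the flat per-token rates shown above. The page does not say whether audio, image, or video content is metered differently.

```python
from decimal import Decimal

# Rates copied from the pricing listed above (USD per 1M tokens).
INPUT_PRICE_PER_M = Decimal("0.06")
OUTPUT_PRICE_PER_M = Decimal("0.23")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request.

    Assumption: every token is billed at the flat rates above,
    regardless of modality.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_PRICE_PER_M
        + Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    )
    return total / TOKENS_PER_M


if __name__ == "__main__":
    # Example: a 20,000-token prompt with a 2,000-token reply.
    print(estimate_cost(20_000, 2_000))
```

Under these assumptions, a 20,000-token prompt with a 2,000-token reply costs (20,000 × 0.06 + 2,000 × 0.23) / 1,000,000 = $0.00166. `Decimal` is used instead of `float` so that small per-request amounts add up without rounding drift.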

Capabilities

1. Fast multimodal understanding

  • Handles text, audio, images, and video.

2. Supports text+audio outputs

  • Great for assistants and education.

3. Strong cross-modal alignment

  • Solid for recognition, instructions, and conversion tasks.
