MULTIMODAL EDGE | Node-Based Inference Canvas
10x Vision Models
Input Image
ID: 01
Upload Image
Click or drop image here
Model Selector
ID: 02
Active Model
Qwen3-VL-2B-Instruct
Qwen3-VL-4B-Instruct
Qwen3.5-4B-Unredacted-MAX
Qwen3.5-4B
Qwen3.5-2B
LFM2.5-VL-450M (LiquidAI)
Gemma4-E2B-it (Google)
LFM2.5-VL-1.6B (LiquidAI)
Qwen3.5-2B-Unredacted-MAX
Qwen2.5-VL-3B-Instruct
QWEN3-VL · 2B
Qwen3-VL-2B-Instruct: a dedicated vision-language model by Alibaba Cloud, with strong spatial grounding, OCR, and instruction following.
Task Config
ID: 03
Task Category
Query
Caption
Point
Detect
Prompt Directive
Execute
Output Stream
ID: 04
Streamed Result
Results will stream here...
View Grounding
ID: 05
Point / Detect Overlay
Active for Point / Detect tasks.
Run inference to visualise.