Arcee AI: Spotlight
OpenAI ⢠text ⢠vision
arcee-ai/spotlightSpotlight is a 7âbillionâparameter visionâlanguage model derived from QwenâŻ2.5âVL and fineâtuned by Arcee AI for tight imageâtext grounding tasks. It offers a 32âŻkâtoken context window, enabling rich multimodal conversations that combine lengthy documents with one or more images. Training emphasized fast inference on consumer GPUs while retaining strong captioning, visualâquestionâanswering, and diagramâanalysis accuracy. As a result, Spotlight slots neatly into agent workflows where screenshots, charts or UI mockâups need to be interpreted on the fly. Early benchmarks show it matching or outâscoring larger VLMs such as LLaVAâ1.6 13âŻB on popular VQA and POPE alignment tests.
Best For:
High-volume, low-latency tasks where cost efficiency is paramount
Pricing:
$0.00/1M input tokens, $0.00/1M output tokens
Context Window:
131,072 tokens
Key Differentiator:
Cost-optimized for high-volume usage