StarVector: A Multimodal Model for Generating SVG Code from Images and Text

Posted on: 2025-03-22 • Updated on: 2025-03-22 • 4 min read

What is StarVector?

StarVector is a multimodal vision-language model (VLM) designed specifically for Scalable Vector Graphics (SVG) generation. It can produce high-precision, semantically rich SVG code through both Image-to-SVG and Text-to-SVG methods. Unlike traditional curve vectorization techniques, StarVector operates directly at the SVG code level, allowing it to accurately utilize SVG primitives (such as ellipses, rectangles, polygons, and text), thus avoiding common distortions and artifacts seen in conventional methods.

Core Technologies of StarVector

1. Multimodal Architecture

StarVector employs a multimodal architecture capable of processing both images and text as inputs:

Image-to-SVG: Converts images into visual tokens and then generates SVG code.
Text-to-SVG: Creates new SVGs purely from text instructions, without needing an image.

The model is built on StarCoder, which lets it carry code-generation ability over to the SVG domain and helps it produce concise, syntactically valid code.
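Because the output is code, its syntactic validity can be checked mechanically. The following is a minimal sketch of such a check using only the Python standard library; the helper name `is_well_formed_svg` and both sample strings are illustrative assumptions, not part of StarVector.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

def is_well_formed_svg(code: str) -> bool:
    """Return True if `code` parses as XML and its root element is <svg>."""
    try:
        root = ET.fromstring(code)
    except ET.ParseError:
        return False
    # ElementTree reports namespaced tags as "{namespace}name".
    return root.tag in ("svg", f"{{{SVG_NS}}}svg")

# Hypothetical model outputs: one complete, one cut off mid-generation.
good = ('<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">'
        '<circle cx="12" cy="12" r="10"/></svg>')
truncated = '<svg xmlns="http://www.w3.org/2000/svg"><rect width="10" height="10">'

print(is_well_formed_svg(good))       # True
print(is_well_formed_svg(truncated))  # False
```

A check like this only confirms that the markup parses; it says nothing about whether the drawing matches the input image.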

Challenges in SVG Generation & StarVector’s Advantages

1. Overcoming Traditional Method Limitations

Traditional vectorization tools such as AutoTrace, Potrace, and VTracer rely primarily on curve fitting and have no semantic understanding of the image. The result is often distorted or overly complex path data, and these tools struggle with SVG elements beyond paths, such as shape primitives and text.

StarVector’s advantages:

Semantic Understanding: The model can analyze image content and correctly select appropriate SVG primitives (e.g., circles, rectangles, polylines).
Concise Code: Directly outputs structured and compact SVG code instead of complex <path> data.
Supports Various SVG Generation Scenarios: Including logos, technical diagrams, and icons.
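To make the "primitives versus paths" contrast concrete, the sketch below compares two hand-written SVGs that draw the same circle: one uses a `<circle>` primitive, the other a `<path>` made of four cubic Bézier segments, which is the kind of output a curve-fitting tool tends to produce. Both strings and the helper `primitive_counts` are illustrative assumptions, not actual output from StarVector or any tracer.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def primitive_counts(svg_code: str) -> Counter:
    """Count child element types in an SVG document, ignoring the XML namespace."""
    root = ET.fromstring(svg_code)
    # Tags look like "{http://www.w3.org/2000/svg}circle"; keep only the local name.
    return Counter(el.tag.rsplit("}", 1)[-1] for el in root.iter() if el is not root)

# The same circle (center 12,12, radius 10) expressed two ways.
primitive_svg = ('<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">'
                 '<circle cx="12" cy="12" r="10"/></svg>')
traced_svg = ('<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24">'
              '<path d="M 22 12 C 22 17.52 17.52 22 12 22 '
              'C 6.48 22 2 17.52 2 12 C 2 6.48 6.48 2 12 2 '
              'C 17.52 2 22 6.48 22 12 Z"/></svg>')

print(primitive_counts(primitive_svg))  # one circle element
print(primitive_counts(traced_svg))     # one path element
print(len(primitive_svg), len(traced_svg))
```

The primitive version is shorter and states the intent directly (a circle with a center and a radius), whereas the path version only approximates the shape with control points.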

2. More Accurate Evaluation Metrics

Many previous SVG generation methods relied on pixel-level evaluation metrics (e.g., MSE), which fail to measure the true semantic accuracy of SVGs. To address this, the StarVector team developed SVG-Bench, a benchmark specifically designed to assess SVG generation quality, covering 10 datasets and 3 types of SVG generation tasks:

Image-to-SVG
Text-to-SVG
Diagram-to-SVG
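The weakness of pixel-level metrics can be shown with a toy example. In the sketch below (pure Python, with made-up 8×8 binary images), a prediction that draws the right shape in a slightly wrong place gets a worse MSE than a prediction that draws nothing at all. The helper names `mse` and `square` are assumptions for illustration only.

```python
def mse(a, b):
    """Mean squared error between two equally sized 2D grids of pixel values."""
    n = len(a) * len(a[0])
    return sum((pa - pb) ** 2
               for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b)) / n

def square(size, top, left, side=2):
    """Binary size x size image containing one filled square."""
    return [[1 if top <= y < top + side and left <= x < left + side else 0
             for x in range(size)]
            for y in range(size)]

target = square(8, top=1, left=1)
shifted = square(8, top=4, left=4)       # correct shape, wrong position
blank = [[0] * 8 for _ in range(8)]      # no shape at all

print(mse(target, shifted))  # 0.125
print(mse(target, blank))    # 0.0625
```

Under MSE the blank image looks twice as good as the shifted square, even though only the shifted square contains the right content. This is the kind of mismatch that motivates feature-based or semantic metrics.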

StarVector Models & Evaluation Results

Currently, StarVector offers two model versions, both available for download on Hugging Face:

💫 StarVector-8B
💫 StarVector-1B

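In SVG-Bench testing, StarVector-8B achieved the highest DinoScore on every dataset reported, tying VTracer on SVG-Emoji. The smaller StarVector-1B led the baselines on SVG-Fonts, SVG-Icons, and SVG-Diagrams but trailed some of them on SVG-Stack and SVG-Emoji (higher is better; "-" means no result was reported):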

Method	SVG-Stack	SVG-Fonts	SVG-Icons	SVG-Emoji	SVG-Diagrams
AutoTrace	0.942	0.954	0.946	0.975	0.874
Potrace	0.898	0.967	0.972	0.882	0.875
VTracer	0.954	0.964	0.940	0.981	0.882
Im2Vec	0.692	0.733	0.754	0.732	-
LIVE	0.934	0.956	0.959	0.969	0.870
DiffVG	0.810	0.821	0.952	0.814	0.822
GPT-4-V	0.852	0.842	0.848	0.850	-
💫 StarVector-1B	0.926	0.978	0.975	0.929	0.943
💫 StarVector-8B	0.966	0.982	0.984	0.981	0.959
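The table can be checked programmatically. The sketch below copies the scores above into a dictionary and reports the top-scoring method(s) per dataset; `None` stands for the "-" entries, and the helper name `best_methods` is an assumption for illustration.

```python
DATASETS = ["SVG-Stack", "SVG-Fonts", "SVG-Icons", "SVG-Emoji", "SVG-Diagrams"]

# DinoScore values copied from the table above; None marks a missing result.
SCORES = {
    "AutoTrace":     [0.942, 0.954, 0.946, 0.975, 0.874],
    "Potrace":       [0.898, 0.967, 0.972, 0.882, 0.875],
    "VTracer":       [0.954, 0.964, 0.940, 0.981, 0.882],
    "Im2Vec":        [0.692, 0.733, 0.754, 0.732, None],
    "LIVE":          [0.934, 0.956, 0.959, 0.969, 0.870],
    "DiffVG":        [0.810, 0.821, 0.952, 0.814, 0.822],
    "GPT-4-V":       [0.852, 0.842, 0.848, 0.850, None],
    "StarVector-1B": [0.926, 0.978, 0.975, 0.929, 0.943],
    "StarVector-8B": [0.966, 0.982, 0.984, 0.981, 0.959],
}

def best_methods(col):
    """Return all methods sharing the highest score in column `col`."""
    valid = {m: s[col] for m, s in SCORES.items() if s[col] is not None}
    top = max(valid.values())
    return sorted(m for m, v in valid.items() if v == top)

for i, name in enumerate(DATASETS):
    print(name, best_methods(i))
```

StarVector-8B appears in the winning set for every column, sharing first place with VTracer on SVG-Emoji.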

Note: StarVector is not designed for natural images or illustrations since its training data primarily consists of icons, technical diagrams, charts, and logos.

SVG-Bench Dataset Overview

StarVector is trained and evaluated on SVG-Bench, which comprises 10 sub-datasets targeting different SVG generation scenarios. Not every sub-dataset has a training split; SVG-Diagrams, for example, is test-only:

Dataset	Training Set	Validation Set	Test Set	Avg. Token Length	Supported SVG Primitives	Annotation Type
SVG-Stack	2.1M	108k	5.7k	1,822 ± 1,808	All SVG Primitives	Image Annotation
SVG-Stack_sim	601k	30.1k	1.5k	2,000 ± 918	Vector path	-
SVG-Diagrams	-	-	472	3,486 ± 1,918	All SVG Primitives	-
SVG-Fonts	1.8M	91.5k	4.8k	2,121 ± 1,868	Vector path	Font Annotation
SVG-Fonts_sim	1.4M	71.7k	3.7k	1,722 ± 723	Vector path	Font Annotation
SVG-Emoji	8.7k	667	668	2,551 ± 1,805	All SVG Primitives	-
SVG-Emoji_sim	580	57	96	2,448 ± 1,026	Vector path	-
SVG-Icons	80.4k	6.2k	2.4k	2,449 ± 1,543	Vector path	-
SVG-Icons_sim	80.4k	2.8k	1.2k	2,005 ± 824	Vector path	-
SVG-FIGR	270k	27k	3k	5,342 ± 2,345	Vector path	Image Classification & Annotation
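The split sizes in the table are abbreviated ("2.1M", "108k", "-"). For anyone who wants to work with these figures, here is a small sketch that converts them to integers and finds sub-datasets without a training split. Only three rows are copied for brevity, the counts are rounded as in the table, and the helper name `parse_count` is an assumption.

```python
from decimal import Decimal

SUFFIXES = {"k": 1_000, "M": 1_000_000}

def parse_count(text):
    """Convert '2.1M', '108k', '472' or '-' to an int (None for '-')."""
    text = text.strip()
    if text == "-":
        return None
    if text[-1] in SUFFIXES:
        # Decimal avoids binary floating-point rounding on values like 2.1.
        return int(Decimal(text[:-1]) * SUFFIXES[text[-1]])
    return int(text)

# (train, validation, test) entries copied from three rows of the table above.
splits = {
    "SVG-Stack":     ("2.1M", "108k", "5.7k"),
    "SVG-Diagrams":  ("-", "-", "472"),
    "SVG-Emoji_sim": ("580", "57", "96"),
}

test_only = [name for name, (train, _val, _test) in splits.items()
             if parse_count(train) is None]
print(test_only)  # ['SVG-Diagrams']
```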

Conclusion: Why StarVector Matters

SVGs play a crucial role in icons, branding, technical diagrams, and map design. At the time of writing, StarVector is one of the most advanced models for Image-to-SVG and Text-to-SVG generation. Compared with traditional curve-fitting methods, it offers:

✅ Semantic Understanding: recognizes image structure rather than only tracing outlines
✅ Concise Code: generates compact, more efficient SVGs
✅ More Accurate Evaluation Metrics: SVG-Bench goes beyond pixel-based measures
✅ Hugging Face Support: models are available to developers for training and testing

StarVector makes AI-generated SVGs more precise and reliable, opening up new possibilities for vector graphics applications. 💡
