StarVector: A Multimodal Model for Generating SVG Code from Images and Text

What is StarVector?

StarVector is a multimodal vision-language model (VLM) designed specifically for Scalable Vector Graphics (SVG) generation. It can produce high-precision, semantically rich SVG code through both Image-to-SVG and Text-to-SVG methods. Unlike traditional curve vectorization techniques, StarVector operates directly at the SVG code level, allowing it to accurately utilize SVG primitives (such as ellipses, rectangles, polygons, and text), thus avoiding common distortions and artifacts seen in conventional methods.


Core Technologies of StarVector

1. Multimodal Architecture

StarVector employs a multimodal architecture capable of processing both images and text as inputs:

  • Image-to-SVG: Converts images into visual tokens and then generates SVG code.
  • Text-to-SVG: Creates new SVGs purely from text instructions, without needing an image.

The model is built upon StarCoder, a code-generation language model, which lets it transfer coding capabilities to the SVG generation domain and favors concise, syntactically valid code.
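One practical consequence of generating code rather than pixels is that the output can be checked mechanically. The sketch below is not part of StarVector; it is a minimal illustration using only Python's standard library, and the helper name `summarize_svg` and the two sample strings are made up for this example. It parses a candidate SVG string and counts the primitives it uses, returning `None` when the string is not well-formed XML.

```python
import xml.etree.ElementTree as ET

# Element names treated as "primitives" for this illustration (an assumption,
# not an official list from the StarVector project).
PRIMITIVES = {"rect", "circle", "ellipse", "line", "polyline", "polygon", "path", "text"}


def summarize_svg(svg_code):
    """Return {tag: count} for primitives, or None if the code is not well-formed XML."""
    try:
        root = ET.fromstring(svg_code)
    except ET.ParseError:
        return None
    counts = {}
    for element in root.iter():
        # ElementTree writes namespaced tags as "{namespace}tag"; keep only the tag.
        tag = element.tag.split("}")[-1]
        if tag in PRIMITIVES:
            counts[tag] = counts.get(tag, 0) + 1
    return counts


# Hypothetical model outputs: one complete, one cut off mid-element.
good = (
    '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 100 100">'
    '<rect x="10" y="10" width="80" height="30"/>'
    '<circle cx="50" cy="70" r="20"/>'
    '<text x="20" y="30">Hi</text>'
    "</svg>"
)
broken = '<svg xmlns="http://www.w3.org/2000/svg"><rect x="10" y="10"'

print(summarize_svg(good))    # {'rect': 1, 'circle': 1, 'text': 1}
print(summarize_svg(broken))  # None
```

A check like this only confirms that the markup parses; it says nothing about whether the drawing resembles the input image.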


Challenges in SVG Generation & StarVector’s Advantages

1. Overcoming Traditional Method Limitations

Traditional vectorization tools, such as AutoTrace, Potrace, and VTracer, primarily rely on curve fitting and lack semantic understanding of images. Their output is often distorted or overly complex path data, and they struggle to reproduce higher-level SVG elements such as basic shapes and text.

StarVector’s advantages:

  • Semantic Understanding: The model can analyze image content and correctly select appropriate SVG primitives (e.g., circles, rectangles, polylines).
  • Concise Code: Directly outputs structured and compact SVG code instead of complex <path> data.
  • Supports Various SVG Generation Scenarios: Including logos, technical diagrams, and icons.
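To make the "concise code" point concrete, here is the same circle written once as a `<circle>` primitive and once as the kind of cubic Bézier `<path>` a curve-fitting tool might emit. Both SVG strings are hand-written for this illustration (the path is the usual four-segment approximation of a circle, not actual tool output), and `count_numbers` is a hypothetical helper that simply counts numeric parameters in the drawing elements.

```python
import re
import xml.etree.ElementTree as ET

# A circle with center (50, 50) and radius 40, as a primitive...
as_primitive = (
    '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 100 100">'
    '<circle cx="50" cy="50" r="40"/></svg>'
)
# ...and approximated by four cubic Bezier segments inside a single path.
as_path = (
    '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 100 100">'
    '<path d="M 90 50 C 90 72.09 72.09 90 50 90 C 27.91 90 10 72.09 10 50 '
    'C 10 27.91 27.91 10 50 10 C 72.09 10 90 27.91 90 50 Z"/></svg>'
)


def count_numbers(svg_code):
    """Count numeric values in the attributes of every element below the root."""
    root = ET.fromstring(svg_code)
    total = 0
    for element in root.iter():
        if element is root:
            continue  # skip the <svg> wrapper (viewBox etc.)
        for value in element.attrib.values():
            total += len(re.findall(r"-?\d+(?:\.\d+)?", value))
    return total


print(count_numbers(as_primitive))  # 3
print(count_numbers(as_path))       # 26
```

The primitive needs three numbers that a person can read and edit directly, while the path needs 26 coordinates and is still only an approximation of the circle.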

2. More Accurate Evaluation Metrics

Many previous SVG generation methods relied on pixel-level evaluation metrics (e.g., MSE), which fail to measure the true semantic accuracy of SVGs. To address this, the StarVector team developed SVG-Bench, a benchmark specifically designed to assess SVG generation quality, covering 10 datasets and 3 types of SVG generation tasks:

  1. Image-to-SVG
  2. Text-to-SVG
  3. Diagram-to-SVG
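The weakness of pixel-level metrics mentioned above can be shown with a toy case. The sketch below is an illustration only and is not taken from SVG-Bench: it builds small binary images in pure Python (the helpers `square_image` and `mse` are hypothetical names) and compares a target square against the same square moved a few pixels and against an empty canvas.

```python
def square_image(size, top, left, side):
    """Binary image (list of rows) with a filled square of 1s on a background of 0s."""
    return [
        [1 if top <= r < top + side and left <= c < left + side else 0
         for c in range(size)]
        for r in range(size)
    ]


def mse(a, b):
    """Mean squared error between two equally sized images."""
    pixel_count = len(a) * len(a[0])
    squared_error = sum(
        (pa - pb) ** 2
        for row_a, row_b in zip(a, b)
        for pa, pb in zip(row_a, row_b)
    )
    return squared_error / pixel_count


target = square_image(16, 2, 2, 4)
shifted = square_image(16, 6, 6, 4)  # same shape, moved by four pixels
blank = square_image(16, 0, 0, 0)    # no shape at all

print(mse(target, shifted))  # 0.125
print(mse(target, blank))    # 0.0625
```

By MSE, the empty canvas is a "better" match than a correctly drawn but displaced square, which is the kind of result that motivates metrics based on image features rather than raw pixels.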

StarVector Models & Evaluation Results

Currently, StarVector offers two model versions, both available for download on Hugging Face:

  • 💫 StarVector-8B
  • 💫 StarVector-1B

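In SVG-Bench testing, StarVector-8B achieved the highest DinoScore on every dataset listed below, tying VTracer on SVG-Emoji. The smaller StarVector-1B leads the baselines on SVG-Fonts, SVG-Icons, and SVG-Diagrams but trails some of them on SVG-Stack and SVG-Emoji: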

Method            SVG-Stack  SVG-Fonts  SVG-Icons  SVG-Emoji  SVG-Diagrams
AutoTrace         0.942      0.954      0.946      0.975      0.874
Potrace           0.898      0.967      0.972      0.882      0.875
VTracer           0.954      0.964      0.940      0.981      0.882
Im2Vec            0.692      0.733      0.754      0.732      -
LIVE              0.934      0.956      0.959      0.969      0.870
DiffVG            0.810      0.821      0.952      0.814      0.822
GPT-4-V           0.852      0.842      0.848      0.850      -
💫 StarVector-1B  0.926      0.978      0.975      0.929      0.943
💫 StarVector-8B  0.966      0.982      0.984      0.981      0.959
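The table is easier to read once the per-dataset leaders are pulled out. The short script below copies the scores from the table as-is and lists the top-scoring method or methods for each dataset; `leaders` is a hypothetical helper written for this article, and missing entries ("-") are represented as `None`.

```python
DATASETS = ["SVG-Stack", "SVG-Fonts", "SVG-Icons", "SVG-Emoji", "SVG-Diagrams"]

# DinoScore values copied from the table above; None marks a missing entry.
SCORES = {
    "AutoTrace":     [0.942, 0.954, 0.946, 0.975, 0.874],
    "Potrace":       [0.898, 0.967, 0.972, 0.882, 0.875],
    "VTracer":       [0.954, 0.964, 0.940, 0.981, 0.882],
    "Im2Vec":        [0.692, 0.733, 0.754, 0.732, None],
    "LIVE":          [0.934, 0.956, 0.959, 0.969, 0.870],
    "DiffVG":        [0.810, 0.821, 0.952, 0.814, 0.822],
    "GPT-4-V":       [0.852, 0.842, 0.848, 0.850, None],
    "StarVector-1B": [0.926, 0.978, 0.975, 0.929, 0.943],
    "StarVector-8B": [0.966, 0.982, 0.984, 0.981, 0.959],
}


def leaders(scores, datasets):
    """Map each dataset to the alphabetically sorted list of top-scoring methods."""
    result = {}
    for index, dataset in enumerate(datasets):
        column = {
            method: values[index]
            for method, values in scores.items()
            if values[index] is not None
        }
        best = max(column.values())
        result[dataset] = sorted(m for m, s in column.items() if s == best)
    return result


for dataset, methods in leaders(SCORES, DATASETS).items():
    print(f"{dataset}: {', '.join(methods)}")
```

StarVector-8B appears in every row of the output, sharing the SVG-Emoji row with VTracer.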

Note: StarVector is not designed for natural images or illustrations since its training data primarily consists of icons, technical diagrams, charts, and logos.


SVG-Bench Dataset Overview

StarVector’s training data comes from SVG-Bench, a collection of 10 sub-datasets that each target a different SVG generation scenario. Most of them provide training, validation, and test splits; SVG-Diagrams is test-only:

Dataset        Training Set  Validation Set  Test Set  Avg. Token Length  Supported SVG Primitives  Annotation Type
SVG-Stack      2.1M          108k            5.7k      1,822 ± 1,808      All SVG Primitives        Image Annotation
SVG-Stack_sim  601k          30.1k           1.5k      2,000 ± 918        Vector path               -
SVG-Diagrams   -             -               472       3,486 ± 1,918      All SVG Primitives        -
SVG-Fonts      1.8M          91.5k           4.8k      2,121 ± 1,868      Vector path               Font Annotation
SVG-Fonts_sim  1.4M          71.7k           3.7k      1,722 ± 723        Vector path               Font Annotation
SVG-Emoji      8.7k          667             668       2,551 ± 1,805      All SVG Primitives        -
SVG-Emoji_sim  580           57              96        2,448 ± 1,026      Vector path               -
SVG-Icons      80.4k         6.2k            2.4k      2,449 ± 1,543      Vector path               -
SVG-Icons_sim  80.4k         2.8k            1.2k      2,005 ± 824        Vector path               -
SVG-FIGR       270k          27k             3k        5,342 ± 2,345      Vector path               Image Classification & Annotation

Conclusion: Why StarVector Matters

SVGs play a crucial role in icons, branding, technical diagrams, and map design. StarVector is a leading Image-to-SVG and Text-to-SVG generation model: in the DinoScore comparison above, StarVector-8B scores highest or ties for highest on every dataset. Compared to traditional curve-fitting methods, it offers:

  • Semantic Understanding: recognizes image structure and maps it to suitable SVG primitives.
  • Concise Code: generates compact, structured SVGs rather than long path data.
  • More Accurate Evaluation Metrics: SVG-Bench moves beyond the limits of pixel-based measures such as MSE.
  • Hugging Face Availability: both model versions can be downloaded by developers for testing.

StarVector makes AI-generated SVGs more precise and reliable, opening up new possibilities for vector graphics applications. 💡
