StarVector: A Multimodal Model for Generating SVG Code from Images and Text

What is StarVector?

StarVector is a multimodal vision-language model (VLM) designed specifically for Scalable Vector Graphics (SVG) generation. It can produce high-precision, semantically rich SVG code through both Image-to-SVG and Text-to-SVG methods. Unlike traditional curve vectorization techniques, StarVector operates directly at the SVG code level, allowing it to accurately utilize SVG primitives (such as ellipses, rectangles, polygons, and text), thus avoiding common distortions and artifacts seen in conventional methods.


Core Technologies of StarVector

1. Multimodal Architecture

StarVector employs a multimodal architecture capable of processing both images and text as inputs:

  • Image-to-SVG: Converts images into visual tokens and then generates SVG code.
  • Text-to-SVG: Creates new SVGs purely from text instructions, without needing an image.

The model is built upon StarCoder, enabling it to transfer coding capabilities to the SVG generation domain, ensuring concise and syntactically correct code.
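
Because the model emits code rather than pixels, its output can be checked like any other XML document. The following is a minimal sketch of such a check using only Python's standard library. The function name `is_well_formed_svg` and both sample strings are illustrative assumptions written by hand; they are not part of StarVector or its output.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

def is_well_formed_svg(svg_text: str) -> bool:
    """Return True if svg_text parses as XML and its root element is <svg>."""
    try:
        root = ET.fromstring(svg_text)
    except ET.ParseError:
        return False
    # ElementTree reports namespaced tags as "{namespace}tag".
    return root.tag in ("svg", f"{{{SVG_NS}}}svg")

# Hypothetical generated strings, written by hand for illustration.
good = (
    '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 10 10">'
    '<circle cx="5" cy="5" r="4"/></svg>'
)
truncated = '<svg xmlns="http://www.w3.org/2000/svg"><rect width="4" height="4"'

print(is_well_formed_svg(good))       # True
print(is_well_formed_svg(truncated))  # False
```

A check like this only confirms well-formed XML with an `<svg>` root; it says nothing about whether the drawing matches the input image.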


Challenges in SVG Generation & StarVector’s Advantages

1. Overcoming Traditional Method Limitations

Traditional vectorization tools such as AutoTrace, Potrace, and VTracer rely primarily on curve fitting and have no semantic understanding of the image. The result is often distorted or overly complex path data, and shapes that could be expressed as simple SVG primitives end up as long paths.

StarVector’s advantages:

  • Semantic Understanding: The model can analyze image content and correctly select appropriate SVG primitives (e.g., circles, rectangles, polylines).
  • Concise Code: Directly outputs structured and compact SVG code instead of complex <path> data.
  • Supports Various SVG Generation Scenarios: Including logos, technical diagrams, and icons.
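
To make the difference between a primitive and raw path data concrete, the sketch below draws the same circle twice: once with a `<circle>` element and once as a `<path>` of four cubic Bézier segments, the form a curve-fitting tool would typically emit. Both documents are hand-written for illustration and are not actual StarVector or tracer output; `element_tags` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# The same circle (centre 50,50; radius 40) written two ways.
primitive_svg = (
    '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 100 100">'
    '<circle cx="50" cy="50" r="40"/>'
    '</svg>'
)

# Four cubic Bezier segments; 22.09 is approximately 40 * 0.5523, the usual
# control-point offset for approximating a quarter circle.
path_svg = (
    '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 100 100">'
    '<path d="M 90 50 '
    'C 90 72.09 72.09 90 50 90 '
    'C 27.91 90 10 72.09 10 50 '
    'C 10 27.91 27.91 10 50 10 '
    'C 72.09 10 90 27.91 90 50 Z"/>'
    '</svg>'
)

def element_tags(svg_text):
    """List element names in document order, with XML namespaces stripped."""
    root = ET.fromstring(svg_text)
    return [el.tag.split("}")[-1] for el in root.iter()]

print(element_tags(primitive_svg))          # ['svg', 'circle']
print(element_tags(path_svg))               # ['svg', 'path']
print(len(primitive_svg) < len(path_svg))   # True
```

The primitive version is shorter and states the intent (a circle with a given centre and radius) directly, which is also what makes it easier to edit by hand.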

2. More Accurate Evaluation Metrics

Many previous SVG generation methods relied on pixel-level evaluation metrics (e.g., MSE), which fail to measure the true semantic accuracy of SVGs. To address this, the StarVector team developed SVG-Bench, a benchmark specifically designed to assess SVG generation quality, covering 10 datasets and 3 types of SVG generation tasks:

  1. Image-to-SVG
  2. Text-to-SVG
  3. Diagram-to-SVG
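
To see why a pixel-level metric such as MSE can mislead, consider the toy comparison below. It scores two candidates against a target image of a thin vertical line: the same line shifted by one pixel, and a completely blank image. This is only an illustration on hand-made 5x5 grids, not how SVG-Bench computes its metrics; the helper names are assumptions.

```python
def mse(a, b):
    """Mean squared error between two equally sized grayscale grids."""
    n = len(a) * len(a[0])
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n

def vertical_line(col, size=5):
    """A size x size binary image with a one-pixel-wide vertical line at column col."""
    return [[1 if c == col else 0 for c in range(size)] for _ in range(size)]

target = vertical_line(2)
shifted = vertical_line(3)            # same shape, one pixel to the right
blank = [[0] * 5 for _ in range(5)]   # no shape at all

print(mse(target, shifted))  # 0.4
print(mse(target, blank))    # 0.2
```

The blank image gets the lower (better) MSE even though it contains no line, while the shifted line, which a viewer would call nearly identical, is penalized twice: once for the missing pixels and once for the extra ones.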

StarVector Models & Evaluation Results

Currently, StarVector offers two model versions, both available for download on Hugging Face:

  • 💫 StarVector-8B
  • 💫 StarVector-1B

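In SVG-Bench testing, StarVector-8B achieved the highest DinoScore on every dataset reported below, tying VTracer on SVG-Emoji. StarVector-1B is competitive but trails some baselines on SVG-Stack and SVG-Emoji: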

Method             SVG-Stack  SVG-Fonts  SVG-Icons  SVG-Emoji  SVG-Diagrams
AutoTrace          0.942      0.954      0.946      0.975      0.874
Potrace            0.898      0.967      0.972      0.882      0.875
VTracer            0.954      0.964      0.940      0.981      0.882
Im2Vec             0.692      0.733      0.754      0.732      -
LIVE               0.934      0.956      0.959      0.969      0.870
DiffVG             0.810      0.821      0.952      0.814      0.822
GPT-4-V            0.852      0.842      0.848      0.850      -
💫 StarVector-1B   0.926      0.978      0.975      0.929      0.943
💫 StarVector-8B   0.966      0.982      0.984      0.981      0.959
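
The comparison can be checked mechanically. The sketch below transcribes the DinoScore values from the table above and reports the top-scoring method (or methods, in case of a tie) for each dataset. The names `SCORES`, `DATASETS`, and `best_methods` are illustrative, and `None` stands for the entries shown as "-".

```python
DATASETS = ["SVG-Stack", "SVG-Fonts", "SVG-Icons", "SVG-Emoji", "SVG-Diagrams"]

# DinoScore per method, in the column order of DATASETS; None marks a missing entry.
SCORES = {
    "AutoTrace":     [0.942, 0.954, 0.946, 0.975, 0.874],
    "Potrace":       [0.898, 0.967, 0.972, 0.882, 0.875],
    "VTracer":       [0.954, 0.964, 0.940, 0.981, 0.882],
    "Im2Vec":        [0.692, 0.733, 0.754, 0.732, None],
    "LIVE":          [0.934, 0.956, 0.959, 0.969, 0.870],
    "DiffVG":        [0.810, 0.821, 0.952, 0.814, 0.822],
    "GPT-4-V":       [0.852, 0.842, 0.848, 0.850, None],
    "StarVector-1B": [0.926, 0.978, 0.975, 0.929, 0.943],
    "StarVector-8B": [0.966, 0.982, 0.984, 0.981, 0.959],
}

def best_methods(scores, datasets):
    """Map each dataset to (top score, list of methods that reach it)."""
    result = {}
    for i, name in enumerate(datasets):
        available = {m: v[i] for m, v in scores.items() if v[i] is not None}
        top = max(available.values())
        result[name] = (top, [m for m, s in available.items() if s == top])
    return result

for name, (top, methods) in best_methods(SCORES, DATASETS).items():
    print(f"{name}: {top:.3f} {methods}")
```

Run on these values, StarVector-8B appears as a top method for all five datasets, sharing first place with VTracer on SVG-Emoji at 0.981.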

Note: StarVector is not designed for natural images or illustrations since its training data primarily consists of icons, technical diagrams, charts, and logos.


SVG-Bench Dataset Overview

StarVector’s training data comes from SVG-Bench, a specialized dataset for SVG generation models, covering 10 sub-datasets, each targeting different SVG generation scenarios:

Dataset         Training Set  Validation Set  Test Set  Avg. Token Length  Supported SVG Primitives  Annotation Type
SVG-Stack       2.1M          108k            5.7k      1,822 ± 1,808      All SVG Primitives        Image Annotation
SVG-Stack_sim   601k          30.1k           1.5k      2,000 ± 918        Vector path               -
SVG-Diagrams    -             -               472       3,486 ± 1,918      All SVG Primitives        -
SVG-Fonts       1.8M          91.5k           4.8k      2,121 ± 1,868      Vector path               Font Annotation
SVG-Fonts_sim   1.4M          71.7k           3.7k      1,722 ± 723        Vector path               Font Annotation
SVG-Emoji       8.7k          667             668       2,551 ± 1,805      All SVG Primitives        -
SVG-Emoji_sim   580           57              96        2,448 ± 1,026      Vector path               -
SVG-Icons       80.4k         6.2k            2.4k      2,449 ± 1,543      Vector path               -
SVG-Icons_sim   80.4k         2.8k            1.2k      2,005 ± 824        Vector path               -
SVG-FIGR        270k          27k             3k        5,342 ± 2,345      Vector path               Image Classification & Annotation

Conclusion: Why StarVector Matters

SVGs play a crucial role in icons, branding, technical diagrams, and map design. In the SVG-Bench comparison above, StarVector-8B matches or exceeds every baseline on DinoScore. Compared to traditional curve-fitting methods, StarVector offers:

  • ✅ Semantic Understanding: recognizes image structure accurately
  • ✅ Concise Code: generates more efficient SVGs
  • ✅ More Accurate Evaluation Metrics: SVG-Bench moves beyond pixel-based limitations
  • ✅ Hugging Face Support: models available to developers for training and testing

StarVector makes AI-generated SVGs more precise and reliable, opening up new possibilities for vector graphics applications. 💡

© 2025 Communeify. All rights reserved.