Apple's Rare Move! Open-Sourcing AI Model FastVLM, But Developers Shouldn't Get Too Excited Just Yet
Apple recently and quietly published FastVLM, the vision-language model it introduced a few months ago, on the Hugging Face platform. The move surprised much of the AI community, since Apple is known for its closed ecosystem. However, this ‘open source’ release comes with a strict condition: it is limited to academic research. Is this a small step for Apple toward embracing an open culture, or is there another plan behind it?
In the past, when we talked about Apple, words like “walled garden” and “ecosystem barrier” came to mind. Their hardware and software have always been tightly integrated, creating their own unique system. But recently, this tech giant seems to be loosening up.
Apple has released a series of machine learning models on the well-known AI developer community Hugging Face, with FastVLM and MobileCLIP2 being the most notable. This is undoubtedly good news for researchers, but for developers who want to apply these models to commercial products, it may be a disappointment.
What’s so great about this model called FastVLM?
Let’s first talk about the protagonist, FastVLM. It is a “Vision-Language Model” (VLM), which simply means it’s an AI that can understand both images and text. You can give it a picture and ask it a question in text, and it can understand the picture and answer you like a human.
Sounds cool, right?
The power of FastVLM lies in its efficiency. As the “Fast” in its name suggests, its response speed and processing efficiency have been heavily optimized. Apple has also thoughtfully provided versions in several sizes, from the lightweight 0.5B (500 million parameters) model up to the 7.76B version (marketed as 7B; the actual count runs a bit higher, which is common in the industry).
- FastVLM-0.5B: https://huggingface.co/apple/FastVLM-0.5B
- FastVLM-1.5B: https://huggingface.co/apple/FastVLM-1.5B
- Complete model set: https://huggingface.co/collections/apple/fastvlm-68ac97b9cd5cacefdd04872e
Not only that, but Apple also provides an online demo so you can experience the power of FastVLM firsthand, and even includes the source code, which is a real show of good faith.
- Online demo and source code: https://huggingface.co/spaces/apple/fastvlm-webgpu
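To make the “picture in, question in, answer out” workflow concrete, here is a minimal sketch of how a LLaVA-family vision-language model is queried through the standard Hugging Face transformers API. To be clear, this is not Apple’s official FastVLM code: the FastVLM checkpoints ship their own loading snippet (using trust_remote_code) on the model cards linked above, so treat the model ID, the “photo.jpg” file, and the prompt format below as illustrative assumptions.

```python
# Illustrative sketch of the generic VLM workflow (image + question -> text).
# Uses a public LLaVA-family checkpoint as a stand-in; FastVLM's own loading
# snippet lives on its Hugging Face model card and differs in the details.
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-7b-hf"  # stand-in model, not FastVLM
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# "photo.jpg" is a hypothetical local image; any RGB picture works.
image = Image.open("photo.jpg").convert("RGB")
prompt = "USER: <image>\nWhat is happening in this picture? ASSISTANT:"

# Pack the image and the question together, then generate an answer.
inputs = processor(images=image, text=prompt, return_tensors="pt").to(
    model.device, torch.float16
)
output_ids = model.generate(**inputs, max_new_tokens=128)
print(processor.batch_decode(output_ids, skip_special_tokens=True)[0])
```

The interaction pattern with FastVLM is the same idea; the exact loading code and prompt template are given on the model cards linked above.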
So what is MobileCLIP2?
Along with FastVLM, MobileCLIP2 was also released. CLIP-style models are designed specifically to learn connections between text and images. You can think of them as “translators” that tell the AI the word “cat” is related to a photo of a cat.
And the word “Mobile” points to its intended home: mobile devices. MobileCLIP2 has been specially optimized for speed and power consumption, making it well suited to running on devices like the iPhone or iPad.
- MobileCLIP2 model set: https://huggingface.co/collections/apple/mobileclip2-68ac947dcb035c54bcd20c47
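To see what “connections between text and images” means in practice, here is a minimal sketch of CLIP-style matching using the standard transformers API, with OpenAI’s original CLIP checkpoint as a stand-in. It is not MobileCLIP2’s loading code (Apple documents that on the collection page above); the model ID, the “cat.jpg” file, and the candidate captions are illustrative assumptions.

```python
# Illustrative CLIP-style sketch: score an image against candidate captions.
# Uses OpenAI's original CLIP checkpoint as a stand-in for the same idea
# that MobileCLIP2 implements in a mobile-optimized form.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model_id = "openai/clip-vit-base-patch32"  # stand-in model, not MobileCLIP2
model = CLIPModel.from_pretrained(model_id)
processor = CLIPProcessor.from_pretrained(model_id)

image = Image.open("cat.jpg")  # hypothetical local photo
captions = ["a photo of a cat", "a photo of a dog", "a city skyline at night"]

inputs = processor(text=captions, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# logits_per_image holds the image-to-caption similarity scores.
probs = outputs.logits_per_image.softmax(dim=-1)[0]
for caption, p in zip(captions, probs.tolist()):
    print(f"{caption}: {p:.2%}")
```

MobileCLIP2 performs the same kind of text-image matching, just with encoders small and fast enough to run comfortably on an iPhone or iPad.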
Behind the open source: the “research only” red line
Seeing this, you might be thinking, “Great! I can use Apple’s models to develop new apps!”
Please calm down first.
This open-source release is far from unrestricted. In the license terms, Apple clearly states that it grants a “personal, non-exclusive, worldwide, non-transferable, royalty-free, revocable, limited license.” The most crucial phrase is: “for research purposes only”.
What does this sentence mean? In simple terms:
- Academic researchers: Congratulations, you can freely use, copy, and modify these models to publish papers or conduct academic experiments.
- Commercial developers: Sorry, you cannot use these models or their derivatives in any commercial products or services.
This red line is drawn very clearly. Apple is willing to share its technology with the academic community to accelerate innovation in the AI field, but for now, it does not want these results to flow directly into the commercial market and be used by competitors or independent developers.
What is Apple’s next move?
This move marks an important shift in Apple’s AI strategy. In the past, Apple’s AI technology was mostly “heard of but not seen,” silently integrated into its own products, such as Siri and the camera’s image-processing algorithms.
Now, through conditional open source, Apple can not only attract top AI talent, but also leverage the power of the global research community to verify and improve its own models, while maintaining its exclusive advantage in commercial applications.
This is a very smart move. It lets Apple keep its closed ecosystem while still claiming a place in the open-source AI wave and growing its influence in academia and research. Perhaps it also paves the way for more powerful on-device AI features, giving future iPhones and Macs a smarter experience.
In summary, Apple’s “open source” is a great gift to the academic community and a positive signal to the entire AI community. Although commercial developers cannot yet enjoy this bonus, it does show us Apple’s potential for greater openness in the AI era.