Meta is aggressively ramping up its artificial intelligence efforts in a bid to catch up to rivals like Google, Microsoft, and OpenAI. The social media giant has introduced a new text-to-image model called CM3leon that it claims achieves state-of-the-art performance for generating images from text prompts. But it’s not yet available for testing or commercial use.
CM3leon marks a breakthrough for Meta’s AI capabilities. The model can not only generate high-fidelity images from text descriptions, but also write coherent captions for existing images. This lays the groundwork for more advanced image understanding models in the future.
Meta is leveraging its formidable data science team and computing infrastructure to advance state-of-the-art models like CM3leon. While diffusion-based models like Midjourney's have grabbed headlines, Meta is betting on autoregressive transformer architectures (the same underlying tech as ChatGPT). The company claims CM3leon needs five times less training compute than comparable methods.
In head-to-head comparisons, CM3leon appears to handle complex objects and constraints in text prompts better than models like OpenAI’s DALL-E 2 and even Midjourney. Images shared by Meta show that its new text-to-image generator can accurately represent human anatomy (no more spaghetti hands) and even render accurate text (no more random words in AI images).
Author: Jose Antonio Lanz