Meta Platforms Inc (NASDAQ: META), the parent company of Facebook and Instagram, has unveiled a suite of artificial intelligence (AI) models designed to revolutionize video generation and image editing on its social media platforms.
According to a company blog post on Thursday, the generative AI tools Emu Video and Emu Edit are designed to help content creators generate videos and edit images without hassle on Facebook and Instagram.
Transformative Video and Image Editing Tools
Both tools build on Emu, the company’s foundational image generation model, which was unveiled at the Meta Connect event in September 2023. According to the company, the two AI models, which it describes as “fundamental research right now,” extend the generative AI capabilities of the parent model.
During the event, Mark Zuckerberg, the company’s founder and CEO, revealed that Meta trained its Emu model using 1.1 billion pieces of data, including photos and captions shared by users on Facebook and Instagram.
Meta has now released Emu Video, a model engineered to generate four-second videos from text and image inputs.
By leveraging a “factorized” approach, the model efficiently divides the video generation process into two steps. The company said the approach ensures responsiveness to different inputs, allowing creators to craft engaging videos easily.
Unlike traditional models, Emu Video employs only two diffusion models to create 512×512 four-second-long videos at a smooth 16 frames per second, eliminating the need for complex cascades of models.
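To make the two-step idea concrete, here is a minimal Python sketch of a factorized pipeline: a first stage produces an image from the text prompt, and a second stage produces video frames conditioned on both the prompt and that image. The function and class names (generate_image, generate_video, VideoSpec) are hypothetical placeholders, not Meta's API, and the stages return stand-in data rather than running any diffusion model. Only the resolution, duration, and frame rate come from the figures quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoSpec:
    """Output format quoted in the article: 512x512, four seconds, 16 fps."""
    width: int = 512
    height: int = 512
    seconds: int = 4
    fps: int = 16

    @property
    def frame_count(self) -> int:
        # 4 seconds at 16 frames per second gives 64 frames.
        return self.seconds * self.fps


def generate_image(prompt: str) -> dict:
    """Stage 1 (hypothetical placeholder): text prompt -> image."""
    return {"kind": "image", "prompt": prompt}


def generate_video(prompt: str, image: dict, spec: VideoSpec) -> dict:
    """Stage 2 (hypothetical placeholder): text prompt + image -> video frames."""
    frames = [{"index": i, "conditioned_on": image} for i in range(spec.frame_count)]
    return {"kind": "video", "prompt": prompt, "frames": frames, "spec": spec}


def text_to_video(prompt: str, spec: VideoSpec = VideoSpec()) -> dict:
    """Chain the two stages: the image from stage 1 conditions stage 2."""
    image = generate_image(prompt)
    return generate_video(prompt, image, spec)


def animate_image(image: dict, prompt: str, spec: VideoSpec = VideoSpec()) -> dict:
    """Animate a user-provided image by skipping stage 1 entirely."""
    return generate_video(prompt, image, spec)
```

In this sketch, animating a user-supplied image amounts to calling the second stage directly, which is one way to picture how a single model could cover both use cases.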
In addition to generating videos from text, Meta said the AI tool can animate user-provided images based on the user’s instructions.
“Finally, the same model can ‘animate’ user-provided images based on a text prompt where it once again sets a new state-of-the-art outperforming prior work by a significant margin,” the company said.
Believable Images with Precise Alteration
Complementing Emu Video, Meta introduced Emu Edit, a precision-focused model dedicated to image manipulation. This tool allows users to seamlessly add or remove backgrounds, perform color and geometry transformations, and conduct local and global edits on images.
Meta emphasized precision, asserting that the primary goal of the tool is not just to produce believable images but to alter only the pixels relevant to the edit request. For instance, when adding text to an object, the model ensures that the object itself remains unchanged.
“We argue that the primary objective shouldn’t just be producing a believable image. Instead, the model should focus on precisely altering only the pixels relevant to the edit request. Unlike many generative AI models today, Emu Edit precisely follows instructions, ensuring that pixels in the input image unrelated to the instructions remain untouched,” reads the blog post.
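The property described in the quote can be pictured as a simple check: given an original image, an edited image, and a mask marking the region the instruction refers to, count how many pixels changed outside that region. The sketch below is an illustration of that idea only, using small grids of numbers in place of real images. The function name and the mask representation are assumptions, and this is not how Meta evaluates Emu Edit.

```python
def changed_outside_mask(original, edited, mask):
    """Count pixels that differ between the images where mask is False.

    All three arguments are equally sized 2D lists; mask[r][c] is True
    for pixels the edit instruction is allowed to change.
    """
    count = 0
    for orig_row, edit_row, mask_row in zip(original, edited, mask):
        for orig_px, edit_px, in_region in zip(orig_row, edit_row, mask_row):
            if not in_region and orig_px != edit_px:
                count += 1
    return count


# Toy 2x3 "image": only the top-middle pixel is inside the edit region.
original = [[10, 10, 10], [10, 10, 10]]
mask = [[False, True, False], [False, False, False]]

precise_edit = [[10, 99, 10], [10, 10, 10]]  # changes only the masked pixel
sloppy_edit = [[10, 99, 10], [10, 10, 11]]   # also disturbs an unrelated pixel
```

A precise edit in this sense yields a count of zero, while an edit that disturbs unrelated pixels yields a positive count.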
The company trained Emu Edit using an extensive dataset of 10 million synthesized images, making it one of the largest datasets of its kind. The model’s training involved computer vision tasks, where each photo was accompanied by a description of the task and the desired output image.
Despite being in the research stage, Meta anticipates Emu Video and Emu Edit becoming valuable tools for creators, artists, and animators on its social media platforms.
Businesses Explore the Potential of Generative AI
Meanwhile, the launch of the two AI models aligns with a broader trend as companies explore the potential of generative AI technologies to scale their businesses.
In the past year, there has been a significant increase in interest in the burgeoning generative AI market, partially fueled by the success of OpenAI’s ChatGPT.
Earlier this week, the South Korea-based electronics behemoth Samsung unveiled its AI chatbot, named after the renowned German physicist and mathematician Carl Friedrich Gauss.
As Coinspeaker reported, the AI tool, Samsung Gauss, boasts three main features: text generation, image enhancement, and coding to help businesses streamline their operations.