Sora's AI Videos Arrive as Musk Pitches Image Tool for Making Memes


If OpenAI gets it right, Sora, the AI tool announced in February that's able to create photorealistic and high-quality video from simple text prompts, will upend the way video is created, with professional video creatives and Hollywood filmmakers among those expected to most feel its effect on their industry.

Now those creatives, and anyone with a $20 monthly subscription to ChatGPT Plus, can test a version called Sora Turbo, which became available in the US last week (as part of OpenAI's 12 days of announcements). You can write a text prompt to create short clips, as well as upload photos and other videos as reference material for your prompts. The results are five to 20 seconds long, with resolutions from 480p to 1080p, and they can be delivered in wide-screen, vertical or square aspect ratios. There are also post-generation editing features, like storyboard, remix and loop, for more fine-tuning, CNET's Katelyn Chedraoui notes.

And though the demos of Sora over the past few months have been impressive — with The Wall Street Journal noting that the AI videos "are good enough to freak us out" — there are still some glitches to resolve, OpenAI acknowledged in a blog post. "It often generates unrealistic physics and struggles with complex actions over long durations," the post says. Think characters with extra arms and legs. The company is also limiting the testers who can create videos of humans, as it works to "address concerns about the misappropriation of likeness and deepfakes."

In a nod to the ongoing issues about what training data is being used to feed AI engines (with some publishers alleging that OpenAI and others have co-opted their copyrighted material without permission), OpenAI wrote that Sora is trained on a "mix of publicly available data, proprietary data accessed through partnerships, and custom datasets developed in-house," including images supplied by humans such as employees.

But there are already reports that OpenAI may have trained Sora on unlicensed video-game content (see TechCrunch and ExtremeTech). OpenAI didn't respond to my request for comment on the whole question of video-game content.

The Sora news is a reminder that creating visuals with AI is going to be a big deal in the year ahead, with Google (Gemini image creator) and Meta (AI Studio) also creating image tools to get more users to engage with AI. Elon Musk's xAI last week also announced a photorealistic image editor, code-named Aurora, for its Grok chatbot. Musk, owner of social media platform X, touted Aurora as a way to "create awesome memes superfast."

As for Sora, OpenAI also acknowledged the need for safety guardrails around tools that may make it easier to create deepfakes, a problem it wants others to help it solve. "We're introducing our video generation technology now to give society time to explore its possibilities and co-develop norms and safeguards that ensure it's used responsibly as the field advances," the company wrote.

Great, we're crowdsourcing AI safety parameters. I'm sure nothing can go wrong with that.

Here are the other doings in AI worth your attention.

Google experiments with AI agents that do your bidding

Back in May, Google CEO Sundar Pichai debuted a bunch of AI tools, saying the company's vision included having its tech do the thinking for you — specifically, by letting Google do all the "Googling for you" with features like AI Overviews.

I bring this up because the company continues to deliver on that pitch and last week introduced a new version of its Gemini chatbot and a prototype of a tool called Project Mariner, under the blog headline "Our new AI model for the agentic era."

What does that mean? "We have been investing in developing more agentic models, meaning they can understand more about the world around you, think multiple steps ahead, and take action on your behalf, with your supervision."

Welcome to the new world of AI agents. That may include searching out a product for you and finding the best deal, scheduling meetings, interacting with spreadsheets to get answers to complex questions, and even playing games. This new generation of AI tech "will enable us to build new AI agents that bring us closer to our vision of a universal assistant," Pichai said in the blog post.

"It can understand that it needs to press a button to make something happen," Demis Hassabis, who oversees Google's DeepMind AI lab, said in an interview with The New York Times. "It can take action in the world."

Mariner, developed as an extension for Google's Chrome browser, is designed to be used with "humans in the loop," company executives told the NYT. So, though it may fill your online shopping cart with groceries, you still need to press the buy button and make the purchase, they said.

Still, the next generation of AI agents from Google, as well as from rivals including OpenAI and Anthropic, seems destined to bring us closer to a world where AI does more of the work for us.

And in many cases, instead of us.

The state of AI now — and next year

If you think you've heard a lot about AI issues, challenges, innovations, products and services in 2024, that's nothing compared with how AI will continue to dominate the conversation in 2025.

That's because more people are now trying out chatbots. OpenAI's No. 1 ranked ChatGPT chatbot saw its number of visitors double, to 3.9 billion, in November from the year before, according to Similarweb. And organizations, spending money on AI tools, continue to work on figuring out how to bring AI into the workplace pragmatically and ethically.

So, I thought it was worthwhile to call out a few of the reports looking at the state of AI as we head into the New Year.

Deloitte's survey of 1,874 early career versus longer-tenured workers highlighted their differing views on AI, with newer workers more excited about how AI could change the workforce for the better. "Their use of AI is such that one person we interviewed described AI as 'that first person you ask before going to a manager' for feedback and advice," the firm found. Deloitte's report is here.

In its 2025 AI Business Predictions, PwC says most workforces will double in 2025 as they "welcome a host of new members to the team this year: digital workers known as AI agents." And while there's a lot of talk about which AI chatbots will win market share, PwC says having an AI strategy is more important for companies than picking the right large language model. "There will be many good options. Everyone will be using them. A shrewd strategy will instead emphasize what can set you apart" in using AI.

When it comes to AI and higher education, Pearson found in its survey of more than 1,000 US college students that 58% said gen AI helped them get better grades. Meanwhile, 77% of nearly 3,500 faculty members in the US surveyed said they expect to use "AI to enhance their teaching methods."

And when it comes to AI and the music industry, the International Confederation of Societies of Authors and Composers, which says it represents more than 5 million creators, conducted a global survey to determine how AI may affect the music industry in the not-too-distant future. It found that gen AI will "enrich tech companies while substantially jeopardising the income of human creators in the next five years."

Also worth knowing...

In a CNET survey, a quarter of smartphone owners said they don't find AI features helpful, 45% said they're reluctant to pay a monthly subscription fee for AI capabilities, and 34% said they have privacy concerns.

Chatterbox Labs tested the leading large language models as part of its AI safety research. Across several categories of harm (fraud; hate speech; illegal activity; misinformation; security and malware; self-harm; sexually explicit content; and violence), it found that AI models from Anthropic and Amazon "showed the most progress in AI safety." Detailed test results can be found here.

Local readers of the historic Ashland Daily Tidings in Oregon have been duped by scammers who used AI and, in some cases, stolen reporters' identities to produce fake news or AI slop for the newspaper, which shut down in 2023, according to an investigation by the nonprofit OPB media org. "Almost as soon as it closed, a website for the Tidings reemerged, boasting a team of eight reporters ... who cranked out densely reported stories every few days," it found. "The mysterious takeover of a more than 140-year-old news outlet offers a warning of ... what an online future supercharged by the next unregulated wave of technology from Silicon Valley companies may hold for news consumers."

Though ChatGPT is positioning itself as a search engine working with publishers, a Tow Center report for the Columbia Journalism Review found that ChatGPT may actually misrepresent publishers' content. "While the company presents inclusion in its search as an opportunity to 'reach a broader audience,' a Tow Center analysis finds that publishers face the risk of their content being misattributed or misrepresented regardless of whether they allow OpenAI's crawlers." Asked for comment, OpenAI told the center that it's "collaborated with partners to improve in-line citation accuracy and respect publisher preferences."
