'When intelligence and trust move together, AI stops being an experiment and starts becoming how work gets done': Microsoft and OpenAI are making AI research tools smarter to help answer even your trickiest questions


Multi-model agents will check each other before sharing research with you to ensure maximum quality
Researcher mode with Critique enabled scores highly on the DRACO benchmark
Copilot Cowork is now available to Frontier program customers

Microsoft has announced plans to upgrade its M365 Copilot Researcher agent, with a clear focus on combining the strengths of multiple models within a single AI workflow.

Under this shift away from single-model systems, multiple AI agents will collaborate and hand off different parts of the task to each other.

To begin with, the Researcher agent will use GPT models to generate the initial response, with Claude stepping in to review it for accuracy, completeness and quality.
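The draft-then-review handoff described above can be sketched in a few lines of Python. This is only an illustration, not Microsoft's implementation: the names `research_with_critique`, `stub_generator` and `stub_critic` are hypothetical, the two stubs stand in for real model calls, and the revise-and-recheck loop is an assumption that goes beyond what the announcement states.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Review:
    approved: bool  # True when the reviewer accepts the draft
    notes: str      # feedback for the generator when not approved


def research_with_critique(
    question: str,
    generate: Callable[[str, str], str],
    critique: Callable[[str, str], Review],
    max_rounds: int = 2,
) -> str:
    """Draft with one model, review with another, revise until approved.

    The revision loop is an assumption made for this sketch.
    """
    draft = generate(question, "")
    for _ in range(max_rounds):
        review = critique(question, draft)
        if review.approved:
            break
        # Hand the reviewer's notes back to the generator for a new draft.
        draft = generate(question, review.notes)
    return draft


# Hypothetical stand-ins for the generating and reviewing models.
def stub_generator(question: str, feedback: str) -> str:
    if feedback:
        return f"Answer to {question!r} (revised: {feedback})"
    return f"Answer to {question!r}"


def stub_critic(question: str, draft: str) -> Review:
    if "revised" in draft:
        return Review(approved=True, notes="")
    return Review(approved=False, notes="add sources")


if __name__ == "__main__":
    print(research_with_critique("q", stub_generator, stub_critic))
```

Keeping the generator and the critic as separate callables mirrors the idea that each role can be filled by a different model.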


M365 Copilot's Researcher agent will pass responses through other agents

The update, outlined by Microsoft AI at Work Chief Marketing Officer Jared Spataro, follows the success of Anthropic's Claude Cowork, which has since been integrated into M365 Copilot. The aptly named Copilot Cowork, now available in the Frontier program ahead of a broader rollout, allows humans to delegate work to AI.

Spataro explained how Copilot Cowork moves AI's usefulness from single, basic prompts to end-to-end task execution, ideal for long-running and multi-step workflows.

As for Researcher mode with the new Claude-based Critique function, it has already outperformed single-model systems in early testing, with the second model reviewing the first model's output before it is shared. It scores 13.8% higher on the DRACO benchmark (Deep Research Accuracy, Completeness and Objectivity), deemed the industry standard.

Achieving 57.4% thanks to the multi-model setup, it's more than twice as reliable as Deep Research with OpenAI's o4-mini model. It's also better than o3-based Deep Research, Gemini Deep Research, Claude Opus 4.6 and Perplexity's Deep Research when using Opus 4.5 and 4.6. Microsoft didn't compare it to newer flagship models, such as GPT-5.4, acting on their own.


"When intelligence and trust move together, AI stops being an experiment and starts becoming how work gets done," Spataro wrote, speaking about Microsoft's progress towards Wave 3 of M365 Copilot – an intelligence it defines as "understand[ing] the context of work."


With several years’ experience freelancing in tech and automotive circles, Craig’s specific interests lie in technology that is designed to better our lives, including AI and ML, productivity aids, and smart fitness. He is also passionate about cars and the decarbonisation of personal transportation. Craig is an avid bargain-hunter too, so you can be sure that any deal he finds is top value!
