In the rapidly evolving landscape of artificial intelligence, the launch of Manus, a new “agentic” AI platform, has stirred excitement akin to a high-profile celebrity event. Marketed as one of the most impressive AI tools to date, Manus is indeed creating waves, though not all of them for the right reasons.
Manus: The New AI Darling?
Described by the head of product at Hugging Face as “the most impressive AI tool I’ve ever tried,” Manus has quickly become a topic of intense discussion and speculation. Dean Ball, an AI policy researcher, echoed this sentiment by calling it the “most sophisticated computer-using AI.” The buzz helped its Discord server skyrocket to over 138,000 members shortly after launch, with invite codes being hawked on the Chinese reseller app Xianyu for thousands of dollars.
However, beneath the shimmering surface of hype, the true capabilities of Manus remain somewhat ambiguous. Rather than being built entirely from the ground up, Manus orchestrates a mix of existing and fine-tuned AI models, such as Anthropic’s Claude and Alibaba’s Qwen, for tasks ranging from drafting research reports to analyzing financial filings, suggesting a foundation built on the familiar rather than the revolutionary.
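To give a sense of what “leveraging existing models” looks like in practice, here is a minimal sketch of the general agent pattern such platforms follow: ask a hosted model for a plan, then execute each step with a tool. Everything below is illustrative and assumed; it is not Manus’s actual code, and stub_model merely stands in for a real API call to a provider such as Anthropic or Alibaba.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical sketch of an "agentic" wrapper around an off-the-shelf chat
# model. Class and function names are invented for illustration; stub_model
# stands in for a real hosted-model API call.

@dataclass
class AgentStep:
    plan: str     # the overall plan the model produced
    action: str   # the individual step being executed
    result: str   # what the tool or model returned for that step

def stub_model(prompt: str) -> str:
    """Placeholder for a call to an existing hosted model."""
    if prompt.lower().startswith("plan"):
        return "1. search filings  2. summarize findings  3. draft report"
    return f"(model output for: {prompt})"

def run_agent(task: str, tools: Dict[str, Callable[[str], str]]) -> List[AgentStep]:
    """Tiny agent loop: get a plan from the model, then run each step with a tool."""
    plan = stub_model(f"Plan the task: {task}")
    steps: List[AgentStep] = []
    for action in plan.split("  "):
        tool = tools["search"] if "search" in action else tools["write"]
        steps.append(AgentStep(plan=plan, action=action, result=tool(action)))
    return steps

if __name__ == "__main__":
    tools = {
        "search": lambda q: f"top web results for '{q}'",
        "write": lambda q: f"draft text for '{q}'",
    }
    for step in run_agent("analyze a company's financial filings", tools):
        print(step.action, "->", step.result)
```

The point of the sketch is only that the “agent” layer is coordination logic around models that already exist, which is consistent with how Manus is described above.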
The Butterfly Effect, the Chinese company behind Manus, makes lofty claims on its website, stating that Manus can handle tasks as diverse as buying real estate and programming video games. Yet these assertions stand on shaky ground when weighed against the practical experiences shared by early users.
Real-World Performance: Expectation vs. Reality
Yichao “Peak” Ji, a research lead for Manus, touted the platform’s superiority over other agentic tools in a viral video, claiming it outperforms competitors on the GAIA benchmark, which measures an AI’s ability to perform tasks involving web browsing and software use. Ji presented Manus as a paradigm shift in human-machine collaboration, envisioning it as a bridge between conception and execution.
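For readers unfamiliar with GAIA, scoring on benchmarks of this kind typically reduces to exact-match accuracy over a set of question-and-answer tasks that require browsing or tool use to solve. The sketch below illustrates that idea only; the sample tasks and the agent callable are placeholders, not the actual GAIA harness or data.

```python
from dataclasses import dataclass
from typing import Callable, List

# Illustrative scoring loop for a GAIA-style agent benchmark: each task pairs
# a question (usually needing web browsing or software use) with one
# ground-truth answer, and the headline number is exact-match accuracy.
# The sample tasks below are made up for demonstration.

@dataclass
class Task:
    question: str
    answer: str  # ground-truth answer the agent must produce

def accuracy(agent: Callable[[str], str], tasks: List[Task]) -> float:
    """Fraction of tasks where the agent's answer exactly matches the reference."""
    hits = sum(
        1 for t in tasks
        if agent(t.question).strip().lower() == t.answer.strip().lower()
    )
    return hits / len(tasks)

if __name__ == "__main__":
    tasks = [
        Task("How many products are listed on the example vendor's page?", "12"),
        Task("Which city hosts the example conference's 2024 edition?", "Vienna"),
    ]
    # A naive agent that always answers "12", purely for demonstration.
    naive_agent = lambda q: "12"
    print(f"exact-match accuracy: {accuracy(naive_agent, tasks):.0%}")
```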
Despite these bold claims, feedback from users like Alexander Doria, co-founder of AI startup Pleias, paints a less rosy picture. Doria’s encounters with error messages and endless loops while testing Manus highlight a gap between the hype and the actual utility of the platform. Moreover, other users have noted that Manus often falters on factual questions and fails to consistently cite its sources, overlooking information that is readily available online.
Frustration in Simplicity
Even basic tasks seem to challenge Manus. An attempt to order a fried chicken sandwich from a highly rated fast food outlet resulted in a system crash and, upon a second try, ended without a way to complete the order. Similar shortcomings emerged when Manus was tasked with booking a flight from NYC to Japan or reserving a restaurant table; it could not complete either job satisfactorily.
The Hype Machine: Why the Fuss?
So what fuels the hype around Manus if its performance is currently underwhelming? A combination of media exuberance, the allure created by scarce invite codes, and misinformation circulating on social platforms all contribute. Chinese media outlets such as QQ News have held Manus up as a point of national pride, while some AI influencers misrepresent its capabilities, further muddying public perception.
A statement from Manus’s spokesperson to TechCrunch suggests that the current focus is on enhancing the system and ironing out kinks during the closed beta phase. It’s clear that while Manus may hint at the future of AI development, it remains a work in progress, with its real-world efficacy yet to match the enthusiastic claims made by its creators and marketers.
As Manus continues to be tested and developed, it serves as a reminder that in the world of technology, especially AI, the journey from hype to practical utility can be fraught with challenges and setbacks. For now, Manus remains a testament to the intrigue and excitement that new technologies can foster, even if it has yet to prove itself as the next major milestone in AI development.