Anthropic's Claude 3.5 Sonnet Outperforms GPT-4o


Anthropic has launched Claude 3.5 Sonnet, a mid-tier model that excels across a range of benchmarks, even surpassing the company’s current top-tier model, Claude 3 Opus. It is available for free on Claude.ai and the Claude iOS app, with enhanced access for Claude Pro and Team plan subscribers, and can also be used through the Anthropic API, Amazon Bedrock, and Google Cloud’s Vertex AI. The model is priced at $3 per million input tokens and $15 per million output tokens and features a 200K token context window.
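For developers, access via the Anthropic API follows the standard Messages API pattern. The following is a minimal sketch using the official anthropic Python SDK; it assumes the SDK is installed, an ANTHROPIC_API_KEY is set in the environment, and that the commonly published model identifier "claude-3-5-sonnet-20240620" is the one in use, which may vary by release.

```python
# Minimal sketch: calling Claude 3.5 Sonnet through the Anthropic Messages API.
# Assumes the `anthropic` SDK is installed and ANTHROPIC_API_KEY is set;
# the model ID is the commonly published one and may differ by release.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": "Summarize the key benefits of a 200K token context window.",
        }
    ],
)

print(response.content[0].text)
```

Billing follows the stated per-token pricing, so input and output token counts (reported on the response's usage field) determine the cost of each call.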

More Details on Claude 3.5 Sonnet

Anthropic claims that Claude 3.5 Sonnet sets new industry benchmarks in graduate-level reasoning (GPQA), undergraduate-level knowledge (MMLU), and coding proficiency (HumanEval). The model shows enhanced capabilities in understanding nuance, humor, and complex instructions while producing high-quality content with a natural tone. Operating at twice the speed of Claude 3 Opus, Claude 3.5 Sonnet is well suited to complex tasks such as context-sensitive customer support and multi-step workflow orchestration. In an internal agentic coding evaluation, it solved 64% of problems, significantly outperforming Claude 3 Opus at 38%.

The model also demonstrates improved vision capabilities, surpassing Claude 3 Opus on standard vision benchmarks. This advancement is particularly noticeable in tasks requiring visual reasoning, such as interpreting charts and graphs. Claude 3.5 Sonnet can accurately transcribe text from imperfect images, a valuable feature for industries like retail, logistics, and financial services.
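Image inputs are sent through the same Messages API as image content blocks. The sketch below shows one way to ask the model to transcribe text from a local image; it assumes the anthropic Python SDK, an API key in the environment, and an illustrative file name (receipt.jpg) that is not from the original article.

```python
# Minimal sketch: transcribing text from an image with Claude 3.5 Sonnet.
# Assumes the `anthropic` SDK and ANTHROPIC_API_KEY; file name is illustrative.
import base64
import anthropic

client = anthropic.Anthropic()

# Read and base64-encode the image so it can be passed as an image block.
with open("receipt.jpg", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")

response = client.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/jpeg",
                        "data": image_data,
                    },
                },
                {"type": "text", "text": "Transcribe all legible text in this image."},
            ],
        }
    ],
)

print(response.content[0].text)
```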

New Features and Safety Commitment

Alongside the model launch, Anthropic introduced Artifacts on Claude.ai, a feature that enhances user interaction with the AI by allowing users to view, edit, and build upon Claude’s generated content in real time. This creates a more collaborative work environment. Despite its significant intelligence leap, Claude 3.5 Sonnet maintains Anthropic’s commitment to safety and privacy. The company states that its models undergo rigorous testing to reduce misuse. External experts, including the UK’s AI Safety Institute (UK AISI) and child safety experts at Thorn, tested and refined the model’s safety mechanisms.

Anthropic emphasizes its dedication to user privacy, stating, “We do not train our generative models on user-submitted data unless a user gives us explicit permission to do so. To date, we have not used any customer or user-submitted data to train our generative models.”

Future Plans

Looking ahead, Anthropic plans to release Claude 3.5 Haiku and Claude 3.5 Opus later this year to complete the Claude 3.5 model family. The company is also developing new modalities and features to support more business use cases, including integrations with enterprise applications and a memory feature for more personalized user experiences.

This strategic advancement highlights Anthropic’s continuous innovation in the AI space, providing more efficient and powerful tools for businesses and users alike.

See also: Marketing AI Conference 2024: The Future Of Marketing
