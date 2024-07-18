OpenAI announced the release of GPT-4o mini, the company’s “most cost-efficient small model” aimed at making AI “as broadly accessible as possible.”

OpenAI unveiled GPT-4o in mid-May, showing off the AI model’s real-time capabilities. GPT-4o is the company’s most powerful model to date, boasting impressive abilities ranging from deciphering written math equations to understanding mood and context.

The company is building on that success with GPT-4o mini, which “outperforms GPT-4 on chat preferences in LMSYS leaderboard,” the crowdsourced platform that evaluates large language models. Just as impressive, OpenAI says the new model is 60% cheaper than GPT-3.5 Turbo.

GPT-4o mini currently includes support for text and vision, but will add support for image, audio, and video inputs and outputs in future updates.

GPT-4o mini surpasses GPT-3.5 Turbo and other small models on academic benchmarks across both textual intelligence and multimodal reasoning, and supports the same range of languages as GPT-4o. It also demonstrates strong performance in function calling, which can enable developers to build applications that fetch data or take actions with external systems, and improved long-context performance compared to GPT-3.5 Turbo.
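
For readers unfamiliar with function calling, the idea can be sketched without any network access: the application describes a tool to the model, the model replies with the name of a tool and JSON-encoded arguments, and the application runs the matching function itself. The minimal Python sketch below assumes a hypothetical `get_weather` tool with canned data and a simulated model reply. None of these names come from OpenAI's announcement, and the schema merely follows the JSON Schema style commonly used for tool descriptions.

```python
import json

# Hypothetical tool description, written in the JSON Schema style
# commonly used to describe callable tools to a model (illustrative only).
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current temperature for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def get_weather(city: str) -> dict:
    """Stand-in for a real external lookup; returns canned data."""
    canned = {"Paris": 21, "Oslo": 12}
    return {"city": city, "temp_c": canned.get(city)}


# Registry mapping tool names the model may request to local callables.
TOOLS = {"get_weather": get_weather}


def dispatch(tool_call: dict) -> str:
    """Run the function named in a tool call and return its result as JSON."""
    name = tool_call["name"]
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    # Arguments are assumed to arrive as a JSON-encoded string.
    args = json.loads(tool_call["arguments"])
    return json.dumps(TOOLS[name](**args))


# Simulated model output; in a real application this would be parsed
# from the model's response rather than hard-coded.
simulated_call = {"name": "get_weather", "arguments": '{"city": "Paris"}'}
result = dispatch(simulated_call)
print(result)
```

In a real application, the JSON string returned by `dispatch` would typically be sent back to the model so it can compose a final answer for the user.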

OpenAI highlights three key areas where GPT-4o mini performs well in benchmarks.

Reasoning tasks: GPT-4o mini is better than other small models at reasoning tasks involving both text and vision, scoring 82.0% on MMLU, a textual intelligence and reasoning benchmark, as compared to 77.9% for Gemini Flash and 73.8% for Claude Haiku.

Math and coding proficiency: GPT-4o mini excels in mathematical reasoning and coding tasks, outperforming previous small models on the market. On MGSM, measuring math reasoning, GPT-4o mini scored 87.0%, compared to 75.5% for Gemini Flash and 71.7% for Claude Haiku. GPT-4o mini scored 87.2% on HumanEval, which measures coding performance, compared to 71.5% for Gemini Flash and 75.9% for Claude Haiku.

Multimodal reasoning: GPT-4o mini also shows strong performance on MMMU, a multimodal reasoning eval, scoring 59.4% compared to 56.1% for Gemini Flash and 50.2% for Claude Haiku.

The company says users can now access GPT-4o mini in place of GPT-3.5 across plans.