OpenAI has unveiled its latest model, GPT-4o mini, a compact AI model designed to be more affordable and efficient than its predecessors. GPT-4o mini is now available to developers and to users through the ChatGPT web and mobile apps, with enterprise access scheduled for next week.
GPT-4o mini surpasses other leading small AI models on reasoning tasks involving text and vision, making it a valuable tool for developers who need fast, cost-effective solutions for high-volume, simple tasks. As small AI models grow more capable, they are increasingly popular among developers for their speed and cost-efficiency compared with larger models like GPT-4o or Claude 3.5 Sonnet.
Replacing GPT-3.5 Turbo, GPT-4o mini is now OpenAI's smallest model. Its performance metrics are impressive: it scores 82% on the MMLU benchmark, which measures reasoning ability, outpacing competitors like Gemini 1.5 Flash (79%) and Claude 3 Haiku (75%), as reported by Artificial Analysis. On mathematical reasoning (MGSM), it scored 87%, significantly higher than Flash (78%) and Haiku (72%).
GPT-4o mini is also markedly cheaper to run, costing over 60% less than GPT-3.5 Turbo. It initially supports text and vision in the API, and OpenAI plans to extend its capabilities to video and audio in the future. Olivier Godement, who leads product for OpenAI's API platform, emphasized the model's potential to democratize AI by making it more accessible and affordable.
For developers utilizing OpenAI’s API, GPT-4o mini is priced at 15 cents per million input tokens and 60 cents per million output tokens. With a context window of 128,000 tokens — about the length of a book — the model has a knowledge cutoff of October 2023.
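At those rates, the per-request cost is simple arithmetic. The sketch below (a minimal illustration, not an official OpenAI utility; the `request_cost` helper is hypothetical) estimates the dollar cost of a single API call from its input and output token counts:

```python
# Estimated per-request cost at GPT-4o mini's published API rates:
# $0.15 per million input tokens, $0.60 per million output tokens.
INPUT_RATE = 0.15 / 1_000_000   # dollars per input token
OUTPUT_RATE = 0.60 / 1_000_000  # dollars per output token

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in dollars for one API request."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Filling the full 128,000-token context window and getting a
# 1,000-token reply costs roughly two cents:
print(f"${request_cost(128_000, 1_000):.4f}")
```

Even a maxed-out context window comes to about $0.02 per request, which is what makes the model attractive for high-volume workloads.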
While OpenAI hasn't disclosed the exact size of GPT-4o mini, it is comparable to other small models like Llama 3 8B, Claude 3 Haiku, and Gemini 1.5 Flash. OpenAI asserts that GPT-4o mini is faster, more cost-efficient, and smarter, based on pre-launch testing in the LMSYS.org Chatbot Arena. Early independent tests confirm its speed, with a median output speed of 202 tokens per second, more than twice as fast as GPT-4o and GPT-3.5 Turbo. George Cameron, co-founder of Artificial Analysis, highlights the model's suitability for speed-dependent applications, including many consumer and agentic uses.
New Tools for ChatGPT Enterprise
Alongside the release of GPT-4o mini, OpenAI introduced new tools for enterprise customers. The new Enterprise Compliance API aims to assist businesses in regulated industries — such as finance, healthcare, legal services, and government — in meeting logging and audit requirements.
The API will enable admins to audit and manage their ChatGPT Enterprise data by providing time-stamped records of interactions, including conversations, uploaded files, and workspace user activity. OpenAI has also expanded admin controls for workspace GPTs, the custom versions of ChatGPT tailored to specific business needs. Previously, admins could only fully allow or block GPT actions within their workspace; now they can create an approved list of domains that GPTs may interact with.
With these advancements, OpenAI continues to push the boundaries of AI technology, making it more accessible and efficient for developers and businesses alike.