OpenAI has officially launched the O1 model through its API, replacing the previous O1-preview version. The updated model restores key features missing from the experimental version while introducing enhancements tailored to improve developer applications powered by AI.
The O1 model now includes advanced tools like “developer-guided prompts,” allowing developers to customize AI for specific use cases, such as assisting tax professionals. It also introduces the “thinking effort” feature, which adjusts the time the model spends processing queries. This optimization reduces costs and processing time for simpler tasks.
Another standout improvement is the ability to input visual data, such as scanned images of documents, as model inputs. Additionally, the O1 model boasts enhanced functionality for calling internal and external APIs, with structured output formatting based on developer-defined templates.
OpenAI revealed that the new O1 model uses 60% fewer tokens than the previous version, delivering faster and more cost-effective results. Despite the token reduction, performance benchmarks indicate a 25-35% improvement in model accuracy for various tasks.
Developers can access the O1 model starting today, while the professional-grade O1 Pro version will be released soon.
WebRTC Integration for Voice AI Applications
The update also includes full support for WebRTC in OpenAI’s voice APIs, simplifying the creation of voice-enabled AI applications. WebRTC reduces the required codebase from approximately 250 lines to just 12 lines, enabling developers to build smart applications for gaming, wearable devices, smart glasses, and cameras with ease.
OpenAI is providing pre-built WebRTC-compatible code snippets to accelerate development workflows.
New Direct Preference Optimization Method
OpenAI introduced a novel customization method called “Direct Preference Optimization.” This feature allows developers to submit two distinct responses and highlight their preference. The system learns to distinguish between preferred and non-preferred responses, refining aspects like creativity, style, and tone based on developer input.


                                    