ChatGPT for macOS major upgrade: OpenAI is preparing for a voice model release

ChatGPT Voice UI on macOS desktop

After Apple's WWDC 2024 conference, OpenAI released a major upgrade for its apps and desktop clients, updating them to version 2. Visual changes were minimal, but users noticed a subtle change in the memory management interface: clicking the name of the GPT now opens a list of GPTs. The list currently contains only ChatGPT, which suggests that OpenAI is working on letting custom GPTs keep their own memories, a feature that is not yet available.

Approximately two months ago, custom GPTs briefly had an option to enable memory from the GPT editor, but it was later removed. On the iOS app, users quickly discovered support for screen broadcasting, which allows the device’s screen to be recorded. This functionality is expected to let ChatGPT in voice mode interact with on-screen data once the new voice and vision model is released.

Screen recording can be activated from the new conversational UI, which is still hidden behind a feature flag and offers controls to start and stop screen capture. The most intriguing feature, also hidden behind feature flags, appears in the macOS desktop client: the voice conversational UI seems to run in a separate window that can be moved around the screen and can appear on top of the ChatGPT app. This widget will have a dedicated shortcut for quick activation, similar to the current chat widget.

ChatGPT Desktop

This new conversational UI includes buttons for screen sharing and camera activation. The camera option will open the device camera, allowing ChatGPT to see the user, while the screen capturing option will let ChatGPT view dynamic on-screen content. For instance, users could show a movie clip to ChatGPT, which, with the new voice and vision capabilities, will be able to understand the content.


These new vision capabilities are expected to be a significant improvement for power users, enabling ChatGPT to act as a true copilot. It could observe everything happening on the screen and provide assistance through voice interaction, making it well suited to tasks like pair programming. The quick on/off shortcut adds to the convenience.

However, some time later the version 2 rollout was reverted, and users were prompted to update their app to the previous version. The reason for the rollback remains unclear, but the episode suggests that OpenAI is preparing for the upcoming release of its new voice model. Some longstanding issues, such as the inability to open generated images in a pop-up, remain unresolved. Once these issues are addressed, the release of the new voice model is expected to follow shortly.


