Copilot Studio gains early access computer use tool to automate complex GUIs

Image: Microsoft

Microsoft has announced computer use functionality in Copilot Studio, available as an early access research preview. The feature lets Copilot Studio agents interact with the graphical user interfaces of websites and desktop applications without requiring an API. Agents can click buttons, select menus, and type into on-screen fields, which makes it possible to automate processes on systems that offer no direct API integration. The capability is designed to adapt automatically when a GUI changes, so that automations keep running rather than breaking. It runs on Microsoft-hosted infrastructure, keeping enterprise data within Microsoft Cloud boundaries.

The new functionality is particularly suited to use cases such as automated data entry, market research, and invoice processing. For example, marketing teams can automate the collection of online data for analysis, while finance departments can streamline invoice processing by extracting data from invoices and entering it into accounting systems. The approach targets long-standing weaknesses of robotic process automation (RPA), such as scripts that break when UI elements change, and extends automation to complex, dynamic interfaces.

Microsoft positions Copilot Studio as a platform for building AI agents that automate workflows. It integrates with the rest of Microsoft's ecosystem, including Power Automate, so that users with minimal coding experience can build automation solutions. Workflows can be described in natural language rather than assembled by hand.

While the announcement highlights the potential of computer use in transforming RPA and operational efficiency, user feedback on Copilot Studio has been mixed. Some have reported challenges with its functionality and limitations in certain scenarios, such as generative AI tasks or integrating with external tools like Omnichannel widgets. However, others acknowledge its promise in automating complex processes within enterprise environments.

Microsoft plans to share further details about this feature at its Build conference in May 2025. Interested users can sign up to participate in the early access program.