How Will OthersideAI’s New Framework Make AI Seem More Human?
By TechDogs Bureau
Updated on Thu, Nov 30, 2023
This is a mantra followed by many artists, who travel and soak up the world so that they can express themselves in a way that brings life to the phrase “a picture is worth a thousand words”.
Software developers, by contrast, are generally pictured as being perpetually glued to a device.
However, that wasn’t the case with OthersideAI Lead Software Engineer Josh Bickett, who found a breakthrough software idea while taking care of his newborn daughter.
As Bickett said, “I’ve been really enjoying time with my daughter, who’s four weeks old now and I had a lot of new lessons in fatherhood and all that stuff. But I also had a little bit of time, and this idea kind of came to me because I saw different demos of GPT-4 vision. The thing we’re working on now can actually happen with GPT-4 vision.”
So, what idea did he have? Let’s explore!
What Was Josh Bickett’s Breakthrough Idea?
- Josh Bickett came up with a “self-operating computer framework”.
- Released on the OthersideAI GitHub page as open-source software, the AI framework can control where and what the mouse clicks on a computer and what keys are hit on a keyboard.
- The application is not text-based but instead is vision-based. It takes screenshots of the computer and then makes decisions on mouse clicks and keystrokes, just like a person would.
- The framework is designed to be compatible with various multimodal models. However, the current integration uses GPT-4V as the default model.
- The ultimate objective of the framework is to reduce the high error rate GPT-4V faces in estimating XY mouse click locations and “achieve human-level performance in computer operation” by tracking the progress of multimodal models.
- OthersideAI’s HyperwriteAI is pursuing this objective as it develops “a multimodal model with more accurate click location predictions”.
- When Matt Shumer, Co-founder and CEO of OthersideAI, was shown the framework, he too saw great potential in it.
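The screenshot-then-act cycle described in the points above can be sketched roughly as follows. This is only an illustration, not the framework's actual code: the names `Action`, `run_agent` and `scripted_model` are hypothetical, and the screen capture, model and input functions are passed in as stand-ins so the loop can run without touching a real mouse or keyboard.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Action:
    """One step chosen by the model (hypothetical structure)."""
    kind: str                   # "click", "type", or "done"
    x: Optional[int] = None     # click position in pixels
    y: Optional[int] = None
    text: Optional[str] = None  # keys to type


def run_agent(
    objective: str,
    capture: Callable[[], bytes],
    model: Callable[[str, bytes, List[Action]], Action],
    execute: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Action]:
    """Look at the screen, ask the model for one action, perform it, repeat."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = capture()                          # what is on screen now
        action = model(objective, screenshot, history)  # model picks next step
        history.append(action)
        if action.kind == "done":                       # model says objective is met
            break
        execute(action)                                 # click or type
    return history


def scripted_model(objective: str, screenshot: bytes, history: List[Action]) -> Action:
    """Stand-in for a multimodal model: replays a fixed script."""
    script = [
        Action("click", x=640, y=360),
        Action("type", text=objective),
        Action("done"),
    ]
    return script[len(history)]


if __name__ == "__main__":
    performed: List[Action] = []
    log = run_agent("open the docs", lambda: b"", scripted_model, performed.append)
    print([a.kind for a in log])  # ['click', 'type', 'done']
```

In a real setup, `capture` would grab an actual screenshot, `model` would call a vision model such as GPT-4V, and `execute` would drive the mouse and keyboard. Keeping those three pieces separate is one way to let different models be swapped in.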
What Did Matt Shumer Say About The Framework?
- Speaking about the framework, Shumer said, “This is a milestone in the road to getting to the equivalent of a self-driving car but for a computer. We have the sensors now. We have the LIDAR systems. Next, we build the intelligence.”
- Shumer added, “A lot of things that people do on computers, right, you can’t really do with APIs, which is how a lot of other people are approaching this problem, [when] they want to build an agent. They built it on top of the publicly available APIs for this service, but that doesn’t extend to everything.”
- [Contd.] “If you truly want to solve something that is autonomous [and] can actually help us or get more done. You have to allow it to work like a person because the world is built for people.”
What Did Josh Bickett Say About Releasing The Framework?
- In an X post, Josh Bickett, the creator of the framework and a developer at OthersideAI, wrote, “We are excited to open-source the Self-Operating Computer Framework that enables multimodal models, including GPT-4-Vision to simulate human-like mouse clicks and keyboard inputs on a computer. Based on a given objective, the model estimates the correct X & Y locations for mouse clicks and the appropriate keyboard inputs at each step. A vision-based agent working at the OS level allows for maximum context and adaptability.”
- [Contd.] “The framework is designed to work with any vision-text multimodal model to evaluate its ability to operate a computer. While significant improvements are needed to achieve human-level performance, this code repository serves as a plugin framework. We are also excited to announce we’ll be integrating our ‘Agent-1’ model with the framework in the coming weeks. Help us build the future of agents in public.”
- Bickett also included a link to the framework’s code on GitHub.
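To picture what “a plugin framework” for evaluating models could mean, here is a minimal sketch under stated assumptions. The interface `ClickModel`, the toy `CenterGuesser` and the function `mean_click_error` are all hypothetical names invented for illustration; they are not taken from the OthersideAI repository. The idea shown is simply that any model exposing the same method can be scored on how far its predicted click lands from the intended target.

```python
import math
from typing import List, Protocol, Tuple

Point = Tuple[float, float]             # (x, y) in pixels
Case = Tuple[str, bytes, Point]         # objective, screenshot, correct click


class ClickModel(Protocol):
    """Hypothetical plugin interface: any model with this method fits."""

    def predict_click(self, objective: str, screenshot: bytes) -> Point:
        ...


class CenterGuesser:
    """Toy model that always clicks the middle of a 1000x1000 screen."""

    def predict_click(self, objective: str, screenshot: bytes) -> Point:
        return (500.0, 500.0)


def mean_click_error(model: ClickModel, cases: List[Case]) -> float:
    """Average pixel distance between predicted and correct click points."""
    if not cases:
        raise ValueError("at least one test case is required")
    distances = [
        math.dist(model.predict_click(objective, screenshot), target)
        for objective, screenshot, target in cases
    ]
    return sum(distances) / len(distances)


if __name__ == "__main__":
    cases: List[Case] = [
        ("press OK", b"", (500.0, 500.0)),      # exact hit, distance 0
        ("press Cancel", b"", (503.0, 504.0)),  # 3-4-5 triangle, distance 5
    ]
    print(mean_click_error(CenterGuesser(), cases))  # 2.5
```

A lower average distance would indicate more accurate click predictions, which is the kind of improvement the post says is still needed.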
Both Josh Bickett and Matt Shumer believe that open-source collaboration will boost productivity and speed up progress, with Shumer saying there’s plenty of room for “players in this space.”
Do you think open-source software collaboration will benefit development and competitiveness or would it lead to complications?
Let us know in the comments below!
First published on Thu, Nov 30, 2023