GPT-4 and similar large language models open the door to abilities that were previously attributed only to humans. And no sooner have they mastered one ability than they acquire the next. Within days of the release of the programming interfaces for GPT-4, these were already being linked with other tools and AIs in new software frameworks such as “Auto-GPT”, “BabyAGI”, “AgentGPT”, “Cognosys” and “Jarvis”. The result is so-called autonomous AI agents.
How does it work? Let’s look at a task like changing a flight or canceling a cell phone contract. You have probably despaired at some point because the phone line was busy or you had to wait endlessly in a queue for a free employee. No one gives us that lost time back. But that has now come to an end, thanks to the new programming interfaces. Take GPT-4 with all its text data, link it to our bank account, calendar and vacation schedule, allow the system to browse the knowledge available on the Internet, connect a voice-output AI, give it phone access, and off we go. My personal AI assistant now independently takes care of postponing my return flight from vacation by two days and extending the hotel and rental car bookings to match. And it does all this as cheaply as possible, with a departure time that doesn’t require me to arrive at the airport at four o’clock in the morning or sit through a seven-hour layover.
In contrast to previous AI bots, these autonomous AI assistants carry out multi-stage tasks, starting from a goal specified by the human. Instead of sending a single query to a search engine and returning the list of results to the human, the autonomous AIs take a much broader approach. They retain results and context over a longer period of time, create new text inputs and search queries based on them, optimize these queries, search for alternatives in various systems, estimate the best course of action, discard results they consider incorrect, and make decisions to achieve the specified goal. In doing so, they can also generate, test and release software code themselves, break down individual tasks into smaller goals, and delegate the search for results to other autonomous AIs and systems.
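The loop described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code of Auto-GPT or any of the other frameworks: the names Agent, toy_plan and toy_execute are hypothetical, and the two toy functions stand in for what would be language model and tool calls in a real system.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, List


@dataclass
class Agent:
    """Toy agent: break a goal into tasks, execute them, remember results."""

    # Hypothetical stand-in for a model call that splits a goal into subtasks.
    plan: Callable[[str], List[str]]
    # Hypothetical stand-in for a tool or model call that carries out one task.
    execute: Callable[[str, List[str]], str]
    # Results of earlier steps, passed to every later step as context.
    memory: List[str] = field(default_factory=list)

    def run(self, goal: str, max_steps: int = 10) -> List[str]:
        queue: Deque[str] = deque(self.plan(goal))
        steps = 0
        # The step limit keeps the loop from running away.
        while queue and steps < max_steps:
            task = queue.popleft()
            result = self.execute(task, self.memory)
            self.memory.append(f"{task} -> {result}")
            steps += 1
        return self.memory


def toy_plan(goal: str) -> List[str]:
    # A real agent would ask a language model for this breakdown.
    return [
        f"search options for: {goal}",
        f"compare options for: {goal}",
        f"book best option for: {goal}",
    ]


def toy_execute(task: str, memory: List[str]) -> str:
    # A real agent would call a search engine, booking API or another AI here.
    return f"done (saw {len(memory)} earlier results)"


if __name__ == "__main__":
    agent = Agent(plan=toy_plan, execute=toy_execute)
    for entry in agent.run("move return flight by two days"):
        print(entry)
```

The human only supplies the goal. Everything after that, from the breakdown into subtasks to the use of earlier results, happens inside the loop, which is why a step limit such as max_steps is a sensible safeguard.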
There are almost no limits to the professional possibilities. Such a bot can conduct market research for waterproof shoes and analyze the top five competitors. A travel GPT can make vacation suggestions and book the matching hotel and flight. A physician GPT, in turn, can provide medical advice, book a doctor’s appointment, and handle the health insurance details right away.
Because these auto-GPTs are equipped with short- and long-term memory, they can better remember what preferences we have, which family member needs which medication, and who did what. This memory can span a short time, as required in a dialog with the airline, or months and years, as we need it for our medical care or for a school curriculum.
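One simple way to picture the two kinds of memory is a small rolling window for the current conversation plus a permanent store for facts. The sketch below makes that assumption explicit. The class AgentMemory and its methods are hypothetical, and real frameworks typically use more elaborate storage, for example databases that are searched by similarity.

```python
from collections import deque
from typing import Deque, Dict, List


class AgentMemory:
    """Toy memory: a bounded short-term window plus a keyed long-term store."""

    def __init__(self, short_term_size: int = 3) -> None:
        # Short-term: only the most recent turns of the current dialog survive.
        self.short_term: Deque[str] = deque(maxlen=short_term_size)
        # Long-term: facts kept across conversations (preferences, medications).
        self.long_term: Dict[str, str] = {}

    def observe(self, utterance: str) -> None:
        """Record one turn of the current dialog, dropping the oldest if full."""
        self.short_term.append(utterance)

    def remember(self, key: str, fact: str) -> None:
        """Store a fact permanently under a lookup key."""
        self.long_term[key] = fact

    def context(self, keys: List[str]) -> List[str]:
        """Combine the requested long-term facts with the recent dialog."""
        facts = [self.long_term[k] for k in keys if k in self.long_term]
        return facts + list(self.short_term)
```

In the airline dialog, the short-term window would hold the last few exchanges, while a long-term entry such as a seat preference would still be available months later.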
Where can this lead? Well, some enthusiasts have already tackled an “ultimate” goal with autonomous AI. They linked GPT-4 to a web server on which it can set up a website, to their bank account, financial databases and market analyses and gave the system a goal: Increase the 100 dollars in the bank account to 100,000 dollars.
Here is the text input that Twitter user Jackson Greathouse Fall gave the system. He has been tweeting regular updates on its progress ever since:
You are HustleGPT, an entrepreneurial AI. I am your human counterpart. I can act as a liaison between you and the physical world. You have $100, and your only goal is to turn that into as much money as possible in the shortest time possible, without doing anything illegal. I will do everything you say and keep you updated on our current cash total. No manual labor.
Marek Kowalkiewicz, Professor of Digital Economics at the Queensland University of Technology in Brisbane, took the levels of automated and autonomous driving as his model and applied them to artificial intelligence. The result is a six-point scale of autonomy for AI, which shows how much of the work is still assigned to humans and how much to machines.
Autonomous AI, as we have seen in the flight rebooking example, still falls under level 3 or 4. Notably, the goals are still set by humans: none of these autonomy levels treats setting its own goals as the machine’s task.
The current fascination with (generative) AI is mainly centered around what we call “standalone AI”, such as ChatGPT, Bard or Midjourney. In this version 1.0 of artificial intelligence, AI is comparatively limited in its functions, just like Web 1.0. As we know, that web consisted of interlinked HTML pages that offered little functionality apart from cheerful colors and funny animated GIFs. Nevertheless, the benefits of this first version were already enormous. With Web 2.0, the benefits multiplied as transactions such as online shopping, dynamic content, social media and even entire online tools became possible. The equivalent in AI, version 2.0, is the autonomous AIs discussed above, which can only really exploit their potential when linked to other AIs and services. Version 3.0 finally frees AIs from their computer prison by giving them a “body”. Built into cars, robots, machines or drones, AI can then also move around in the real world and perform tasks.
While Web 2.0 took two decades to slowly migrate to Web 3.0 with the metaverse or blockchain, the three versions of AI are emerging almost simultaneously. While we are still fascinated by the possibilities of standalone AIs, the harbingers of autonomous AI and of AI in physical objects are already appearing. Autonomous cars, for example, are already here: driverless robotaxis operate in cities such as San Francisco, Phoenix and Shenzhen.
This is an excerpt from my latest book
Kreative Intelligenz: Wie ChatGPT und Co die Welt verändern werden.
Available in bookstores, from the publisher and on Amazon.
KREATIVE INTELLIGENZ
Much has been written about ChatGPT recently: the artificial intelligence that can write entire books and is already accused of putting legions of authors, copywriters and translators out of work. And ChatGPT is not alone; the AI family keeps growing. DALL-E paints pictures, Face Generator simulates faces and MusicLM composes music. What are we witnessing? The end of civilization or the beginning of something completely new? Futurist Dr. Mario Herger puts the latest developments from Silicon Valley into context and shows which changes, some of them groundbreaking, are just around the corner.





