I used AutoGPT for the first time today, an early entry into the world of autonomous AI agents that can make plans and solve problems. From my understanding, AutoGPT runs an iterative loop that lets the AI learn and adapt as it works toward an objective. It has short- and long-term memory, and it can break a prompt down into multiple steps and then work through each of them. Again, I am not terrified of current AI technology. I am terrified that current AI technology will improve, which it will. For AutoGPT, you simply put in the goal of the AI agent, such as "make me a bunch of money," and then a few sub-goals, such as "search the web to find good companies to start" and "keep track of all of your research and sources and store them in a folder." It doesn't work well at the moment, but it has only been out for a couple of weeks. The promise of autonomous agents is clear. Many white-collar jobs could be replaced, and individuals could become much more productive. Research and administrative work will become much easier, and there is a massive incentive to have a smarter agent than your competition. Every advance in AI increases my conviction that we should lean heavily on AI agents to do alignment research. This year really has been quite the revolution.
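To make that loop concrete, here is a minimal sketch of a goal → plan → execute → remember cycle. This is not AutoGPT's actual code; `call_llm` and `execute_step` are hypothetical stand-ins for the model call and the tool layer, and the list-based "memory" is a crude simplification.

```python
# Minimal sketch of an autonomous-agent loop (illustrative only, not AutoGPT's code).

def call_llm(prompt: str) -> str:
    """Hypothetical language-model call; plug in a real API here."""
    raise NotImplementedError

def execute_step(step: str) -> str:
    """Hypothetical action layer (web search, file I/O, etc.) -- the part
    that touches the real world."""
    raise NotImplementedError

def run_agent(goal: str, sub_goals: list[str], max_iterations: int = 10) -> list[str]:
    memory: list[str] = []  # crude stand-in for short/long-term memory
    for _ in range(max_iterations):
        # Ask the model to turn the goal plus everything learned so far into next steps.
        plan = call_llm(
            f"Goal: {goal}\nSub-goals: {sub_goals}\n"
            f"Progress so far: {memory}\n"
            "List the next concrete steps, one per line."
        )
        steps = [s.strip() for s in plan.splitlines() if s.strip()]
        if not steps:
            break  # the model thinks it is done
        for step in steps:
            result = execute_step(step)
            memory.append(f"{step} -> {result}")  # feed results back into the next iteration
    return memory
```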
The speed at which these developments keep coming is paralyzing. I am further convinced that alignment is important, as now every person on Earth will have access to prompting technology that can actually do destructive things in the real world. Anyone can create a website or a business without any technical knowledge, and everyone is vulnerable to whatever chaos this causes. AutoGPT requires the user to answer "yes" or "no" before it moves forward with a real-world interaction, such as scraping a bunch of websites or moving files around. Future agents will not have this, or if they do I really do not see how it will be useful. I just kept clicking yes, with no clue whether AutoGPT would follow the robots.txt policies of a website (which determine whether you are even allowed to scrape it). I've built my own web scrapers, and even I didn't have the wisdom to walk away from the tempting prompt "hey AI agent, increase my net worth," despite having no clue what the AI would end up doing. How are non-technical people supposed to weigh any of these trade-offs? Most people probably won't even know that there are laws or policies they could be breaking, and they are probably liable for whatever their autonomous agent does. The cost of running these agents is already very low (today's run cost me 8 cents), and as competition heats up it will be virtually free. Saying that this is a legal nightmare is an understatement.
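For reference, checking a site's robots.txt before scraping takes only a few lines of standard-library Python; whether an autonomous agent bothers to do this is exactly the kind of thing its user will never see. The user-agent string below is a made-up placeholder.

```python
# Checking robots.txt before scraping a page, using only the Python standard library.
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

def allowed_to_scrape(page_url: str, user_agent: str = "MyAutonomousAgent") -> bool:
    parsed = urlparse(page_url)
    robots_url = f"{parsed.scheme}://{parsed.netloc}/robots.txt"
    rp = RobotFileParser()
    rp.set_url(robots_url)
    rp.read()  # fetches and parses the site's robots.txt
    return rp.can_fetch(user_agent, page_url)

# Example: allowed_to_scrape("https://example.com/companies/list") -> True or False
```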
Users will clearly have no idea what their agent is doing, and they probably won't care. The chaos that these point-and-click machines will cause is unknown, but it is clear that if they are unaligned they could do a lot of damage. For example, you prompt "make me a lot of money" and the AI illegally siphons money away from a children's hospital, because that is outside of its objective function. What I want to emphasize here, though, is that even aligned AI can be really, really bad. A scammer can say "create a Facebook profile pretending to be my target's uncle, generate a bunch of realistic photos of the uncle, build up a bunch of friends, and then reach out to the target claiming to be the uncle. Say that you are in trouble and need money. Leave realistic voice memos. Do whatever else you think could be convincing." The AI agent will read that, develop a plan, and break that plan down into discrete steps. Then it will iterate through each of those steps and execute the plan. Fraud and deceit become easy. And cheap. Simpler example: a terrorist uses a perfectly aligned agent and says "cripple the US financial system." Even if this agent totally understands the terrorist's intentions, the outcome will be very bad. Even just pursuing the first few steps of this goal could cause a lot of damage. It is probably better if all of the autonomous agents of the future are perfectly aligned, but we shouldn't necessarily celebrate that as a victory. Agents can be aligned to the wrong values. The genie problem mentioned in a previous post rings even truer now. May the person with the most powerful genie win.