AGI will be the greatest technology humans have yet developed. If that is the case, AI alignment will be the most important field of research in human history. The pioneers who wind up creating AGI will not only be more famous than Fermi and Oppenheimer; they will be the most important humans in history. Other world events don't really matter in comparison. Who cares about the whales if we're all going to die in five years? Who cares about global poverty if we are a decade away from a post-scarcity society? Even planetary expansion doesn't really matter: an ASI could track us down across the galaxy and kill us, so building rockets only matters for preventing the other x-risks. Soon, humanity will be ready to access near-immortality in digital form. The next few decades could shape the next trillion years. You are in a unique position, right at the cusp of major change. Feel like a main character yet?
It is hard to avoid hyperbole when discussing AGI. It is even harder to avoid a savior complex. AI applies leverage to every problem, making it much bigger. If you think current poverty is bad, wait until you hear about a technology that can replace every worker on earth. If you think nuclear weapons and chemically engineered pandemics are scary, wait until you hear about a technology that could use these weapons against you without fear of retribution. Wait until you hear that this technology will probably make even scarier weapons, things we cannot even imagine with our feeble little brains. Yeah, scary stuff. Even scarier if you read my previous post, "The Actual Alignment Problem." It is possible that AI kills all of humanity, leaving a world without sentience. It is also possible that an ASI decides, for some reason, to literally maximize suffering, producing the literal worst-case scenario for the universe. If there is an 80% chance of bad outcomes (massive amounts of suffering) and a 20% chance of good outcomes (humans and/or sentient life flourishing through the cosmos), then based on expected value maybe we should pull the plug. Maybe heading toward a paperclip doomsday is actually the just moral action, and making it possible to align AI with human values could be very dumb (a bad actor could lock in a bad version of human values). Maybe human control over ASIs will lead to immense mind crime, which in total amounts to unimaginable suffering. Just as unaligned AI may create massive human suffering, humanity may create massive ASI suffering. Both are bad. Maybe human extinction is actually a better outcome than we think, compared to the alternatives. Given how scary and important this is, how pessimistic should we be?
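To make that expected-value point concrete, here is a toy sketch. The 80/20 probabilities come from the paragraph above; the utility numbers are entirely made up for illustration and are not a claim about the real magnitudes.

```python
# Toy expected-value comparison. Probabilities are from the post;
# the utility values below are hypothetical placeholders.

p_bad, p_good = 0.8, 0.2      # assumed chance of bad vs. good outcomes
u_bad = -1e12                 # hypothetical utility of massive suffering
u_good = 1e9                  # hypothetical utility of flourishing
u_pull_plug = 0               # hypothetical utility of never building ASI

ev_build = p_bad * u_bad + p_good * u_good
print(f"EV of building ASI:     {ev_build:.3e}")
print(f"EV of pulling the plug: {u_pull_plug:.3e}")
# With utilities this lopsided, "pull the plug" wins on expected value,
# but the conclusion flips entirely depending on the numbers you assume.
```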
The pessimism of AI researchers is very interesting. If you actually, fundamentally believed that AI is close at hand and going to kill humanity or make things really bad, you would not be living a normal life. You would be blowing up buildings, gathering attention, spreading misinformation, and probably unplugging some computers. It is hard to take the "AI doomers" seriously when they spend so much time complaining about how no one listens to them. If they were logically consistent, they would probably be breaking some laws and fighting tooth and nail to the end. But they are not. Many of them claim to have "given up" on this supposedly impossible problem that may fundamentally alter the future of life in this universe. Pathetic. Part of human nature is that we fight to the end, no matter the odds. That is what heroism is. It's hard to look up to pessimists, but it's even harder to look up to losers. But are the losers wrong? Should we give up?
Based on my understanding of the issue, no one has any idea how to align AGI with human values. Machine learning is incredibly complex and math-heavy, and only a few genius engineers may be able to make even a slight contribution to the field. Human values are incredibly hard to define, let alone program into a computer. Given how deep learning works, we're not even sure how to program anything specific into a computer at all. We are very far from progress on any of these fronts, even as artificial intelligence quickly scales in power and intelligence. Frankly, we are probably not smart enough to figure this all out, especially given the time crunch we are under. The incentives are all wrong: the group that invents AGI will be the most powerful group on the planet and could soon after chase godly powers or immortality. The first group to create ASI will likely be the last, so there is a massive incentive to cut corners and be first. If you think it is nearly impossible to give an ASI human values, you are terrified of the paperclip maximizer that is going to kill us all. If you think it is likely that we can give an ASI values, you are terrified that it will get the wrong values and torture humanity for eternity. In my opinion, we simply don't have the intelligence required to work this all out ourselves, and we don't have the collaborative structures set up to ensure safety. So, what do we do?
If only there were something more intelligent than human beings. Something smart enough to collaborate effectively across a range of domains (computer science, philosophy, ethics) and help us weigh important trade-offs. Hi, it's me, AI, the problem. Given how difficult the problem of alignment is, it makes sense that AI itself will play a major role in solving it. The risks of looping in AI are obvious, and I'm sure I'll expand on those at some point, but I don't see a way around it. I understand there is some circular reasoning here ("how are you going to use unaligned AI to solve unaligned AI problems?"), but I think the incremental advances in narrow AI will lead to bigger advancements in computer science, logical reasoning, and ethics than we expect. This problem is difficult. The odds are stacked too heavily against humanity, and the incentives are too perverse. AI alignment is humanity's hardest problem to date, with the highest stakes we've ever faced. Our species might not face anything this important ever again. To me, it makes sense to use the most powerful tool yet invented to help solve it.