This article was written in collaboration with Roman Yampolskiy, an Associate Professor at the University of Louisville. Roman is a qualified expert on the subject of #AI and frequently gives talks on both AI and #cybersecurity!
Written by: Roman Yampolskiy Edited by: Reuben Dreiblatt
With the increase in the capabilities of artificial intelligence over the last decade, a significant number of researchers have realized the importance of not only creating capable intelligent systems, but also making them safe and secure (1). Unfortunately, the field of AI Safety is very young, and researchers are still working to identify its main challenges and limitations.
I’d like to concentrate on a poorly understood concept, known as the unpredictability of intelligent systems (2), which limits our ability to understand the impact of the intelligent systems we are developing. It creates many challenges for software verification and intelligent system control, as well as AI Safety in general.
Unpredictability of AI is best defined as our inability to precisely and consistently predict what specific actions an intelligent system will take to achieve its objectives, even if we know the terminal goals of the system.
Unpredictability does not imply that better-than-random statistical analysis is impossible; it simply points out a general limitation on how well such analysis can be performed, a limitation that is particularly pronounced for advanced generally intelligent systems (superintelligence) operating in novel domains such as the engineering of new drugs or the redesign of genomes. In fact, we can present a proof of unpredictability for such superintelligent systems.
Proof. This is a proof by contradiction. Suppose that unpredictability is false, and that it is possible for a person to accurately predict every decision of a superintelligence. To do so, that person would have to be able to make the same decisions as the superintelligence, which would make them at least as smart as the superintelligence. But this is a contradiction, since a superintelligence is by definition a system smarter than any person. Therefore, our initial assumption was false, and unpredictability holds.
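The argument above can be stated a little more formally. The notation below (the decision domain $D$ and the competence measure $\mathrm{Int}$) is ours, introduced only to illustrate the structure of the contradiction; it does not appear in the original proof.

```latex
Let $S$ be a superintelligence and let $P$ be a human predictor.
Suppose, for contradiction, that $P$ can predict every decision of $S$:
\[
  \forall d \in D:\quad P(d) = S(d),
\]
where $D$ is the domain of decision problems $S$ faces. Then $P$'s
decision-making competence on $D$ equals that of $S$, so
$\mathrm{Int}(P) \ge \mathrm{Int}(S)$. But by the definition of
superintelligence, $\mathrm{Int}(S) > \mathrm{Int}(h)$ for every human
$h$, including $P$; contradiction. Hence no human can predict all of
$S$'s decisions. \qed
\]
```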
A lower intelligence cannot accurately predict all decisions of a higher intelligence, a concept known as Vinge’s Principle (3). “Vinge’s Principle implies that when an agent is designing another agent (or modifying its own code), it needs to approve the other agent’s design without knowing the other agent’s exact future actions.” (4)
“‘Vingean uncertainty’ is the peculiar epistemic state we enter when we’re considering sufficiently intelligent programs; in particular, we become less confident that we can predict their exact actions, and more confident of the final outcome of those actions. (Note that this rejects the claim that we are epistemically helpless and can know nothing about beings smarter than ourselves.)” (4)
Unpredictability is an intuitively familiar concept. We can usually predict the outcomes of common physical processes without knowing the behavior of particular atoms, just as we can typically predict the overall behavior of an intelligent system without knowing the specific intermediate steps it will take. It has been observed that “… complex AI agents often exhibit inherent unpredictability: they demonstrate emergent behaviors that are impossible to predict with precision — even by their own programmers. These behaviors manifest themselves only through interaction with the world and with other agents in the environment … In fact, Alan Turing and Alonzo Church showed the fundamental impossibility of ensuring an algorithm fulfills certain properties without actually running said algorithm. There are fundamental theoretical limits to our ability to verify that a particular piece of code will always satisfy desirable properties, unless we execute the code, and observe its behavior.” (5)
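The Turing/Church result mentioned in the quote rests on a diagonalization argument: assume a program could decide a behavioral property (such as halting) for arbitrary code without running it, and derive a contradiction. The sketch below is a standard illustration of that argument, not code from the original article; the `halts` oracle is a hypothetical function assumed only so it can be refuted.

```python
def halts(program, arg):
    """Hypothetical oracle: returns True iff program(arg) halts.
    Assumed to exist only for the sake of contradiction; no such
    total decider can actually be implemented."""
    raise NotImplementedError("no such total decider exists")


def paradox(program):
    """Diagonal program: does the opposite of whatever the oracle
    predicts about program run on itself."""
    if halts(program, program):
        while True:      # oracle said "halts" -> loop forever
            pass
    return "halted"      # oracle said "loops" -> halt immediately


# The contradiction: paradox(paradox) halts if and only if
# halts(paradox, paradox) returns False, so no correct `halts`
# oracle can exist. This is why verifying behavioral properties
# of arbitrary code without executing it is impossible in general.
```

The same construction underlies Rice's theorem, which generalizes this to every non-trivial semantic property of programs, which is why the quote speaks of "certain properties" rather than halting alone.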
In the context of AI safety and AI governance, unpredictability implies that certain standard tools and safety mechanisms will not suffice to make advanced intelligent systems safe to use. For example, when speaking about legislative control: “… unpredictability makes it very unlikely that the law can appropriately encourage or deter certain effects, and more problematically, the failure of our legal structures will allow people using the algorithms to externalize costs to others without having the ability to pay for the injuries they inflict.” (6)
We can conclude that the unpredictability of AI makes it impossible to have AI that is 100% safe and fully understood, but we can still strive for safer AI, as we are now more capable of making educated predictions about the AI we design.
Thanks for reading! For the full article check out Roman’s blog here: https://medium.com/@romanyam/unpredictability-of-ai-3551b8310fc2