Professor Jürgen Schmidhuber
Jürgen Schmidhuber, Ph.D., Habil., holds positions at the Swiss AI lab IDSIA in Lugano, Switzerland; at TU Munich (Extraordinarius); at the University of Lugano, where he is Adjunct Professor of the faculty of computer science; and in Manno (Switzerland).
Jürgen is well known for his work on machine learning, universal Artificial Intelligence (AI), artificial neural networks, digital physics, and low-complexity art. His contributions also include generalizations of Kolmogorov complexity and the Speed Prior.
Recurrent Neural Networks
The dynamic recurrent neural networks developed in his lab are simplified mathematical models of the biological neural networks found in human brains. A particularly successful model of this type is called Long Short-Term Memory. From training sequences it learns to solve numerous tasks unsolvable by previous such models. Applications range from automatic music composition to speech recognition, reinforcement learning and robotics in partially observable environments.
Artificial Evolution / Genetic Programming
As an undergrad at TUM, Jürgen evolved computer programs through genetic algorithms. The method was published in 1987 as one of the first papers in the emerging field that later became known as genetic programming. Since then he has coauthored numerous additional papers on artificial evolution. Applications include robot control, soccer learning, drag minimization, and time series prediction.
In 1989 he created the first learning algorithm for neural networks based on principles of the market economy, inspired by John Holland’s bucket brigade algorithm for classifier systems. Adaptive neurons compete to be active in response to certain input patterns. Those that are active when there is external reward get stronger synapses, but active neurons have to pay those that activated them by transferring parts of their synapse strengths. This rewards “hidden” neurons that set the stage for later success.
In 1990 he published the first in a long series of papers on artificial curiosity for an autonomous agent. The agent is equipped with an adaptive predictor trying to predict future events from the history of previous events and actions. A reward-maximizing, reinforcement learning, adaptive controller is steering the agent and gets a curiosity reward for executing action sequences that improve the predictor. This discourages it from executing actions leading to boring outcomes that are either predictable or totally unpredictable. Instead the controller is motivated to learn actions that help the predictor to learn new, previously unknown regularities in its environment, thus improving its model of the world, which in turn can greatly help to solve externally given tasks. This has become an important concept of developmental robotics.
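The idea of rewarding predictor improvement can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the formulation used in the papers: the hypothetical `RunningMeanPredictor` tracks a single scalar outcome, and the curiosity reward is taken to be the drop in prediction error between consecutive observations.

```python
class RunningMeanPredictor:
    """Hypothetical toy predictor: keeps a running estimate of a scalar outcome."""

    def __init__(self, learning_rate=0.5):
        self.estimate = 0.0
        self.learning_rate = learning_rate

    def error(self, outcome):
        # Absolute prediction error before learning from this outcome.
        return abs(outcome - self.estimate)

    def update(self, outcome):
        # Move the estimate a fraction of the way towards the outcome.
        self.estimate += self.learning_rate * (outcome - self.estimate)


def curiosity_rewards(outcomes, learning_rate=0.5):
    """Reward at each step = previous prediction error minus current one."""
    predictor = RunningMeanPredictor(learning_rate)
    rewards = []
    previous_error = None
    for outcome in outcomes:
        current_error = predictor.error(outcome)
        if previous_error is not None:
            rewards.append(previous_error - current_error)
        predictor.update(outcome)
        previous_error = current_error
    return rewards


# A regularity the predictor can learn: reward is positive, then fades
# towards zero once the outcome has become predictable.
learnable = curiosity_rewards([1.0, 1.0, 1.0, 1.0])

# A stream this toy predictor cannot learn: no net reward accumulates.
unlearnable = curiosity_rewards([1.0, -1.0, 1.0, -1.0])
```

In this sketch the rewards telescope, so the cumulative curiosity reward equals the total reduction in prediction error. It is positive only while something is actually being learned, and it stays at or below zero for outcomes the predictor cannot improve on.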
Unsupervised Learning / Factorial Codes
During the early 1990s Jürgen also invented a neural method for nonlinear independent component analysis (ICA) called predictability minimization. It is based on co-evolution of adaptive predictors and initially random, adaptive feature detectors processing input patterns from the environment. For each detector there is a predictor trying to predict its current value from the values of neighboring detectors, while each detector is simultaneously trying to become as unpredictable as possible. It can be shown that the best the detectors can do is to create a factorial code of the environment, that is, a code that conveys all the information about the inputs such that the code components are statistically independent, which is desirable for many pattern recognition applications.
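The factorial-code criterion itself (not the predictability minimization training procedure) can be checked numerically. The sketch below is an illustration under stated assumptions: codes are binary tuples, each listed code vector is equally likely, and the helper name `factorial_gap` is hypothetical. It measures how far the joint distribution of the code components is from the product of their marginals.

```python
from collections import Counter
from itertools import product


def factorial_gap(code_vectors):
    """Largest |P(joint) - product of marginals| over all binary code words.

    A gap of 0.0 means the components are statistically independent,
    i.e. the code is factorial for this (equiprobable) set of inputs.
    """
    n = len(code_vectors)
    width = len(code_vectors[0])
    joint = Counter(code_vectors)
    # Marginal probability that component i equals 1.
    p_one = [sum(v[i] for v in code_vectors) / n for i in range(width)]
    gap = 0.0
    for candidate in product((0, 1), repeat=width):
        independent = 1.0
        for i, bit in enumerate(candidate):
            independent *= p_one[i] if bit else 1.0 - p_one[i]
        gap = max(gap, abs(joint[candidate] / n - independent))
    return gap


# Four equiprobable inputs encoded by two independent bits: factorial.
independent_code = [(0, 0), (0, 1), (1, 0), (1, 1)]

# Two equiprobable inputs encoded by two identical bits: redundant.
redundant_code = [(0, 0), (1, 1)]
```

Here `factorial_gap(independent_code)` is zero, while the redundant code has a nonzero gap because each component can be predicted perfectly from the other.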
Kolmogorov Complexity / Computer-Generated Universe
In 1997 Jürgen published a paper based on Konrad Zuse’s assumption (1967) that the history of the universe is computable. He pointed out that the simplest explanation of the universe would be a very simple Turing machine programmed to systematically execute all possible programs computing all possible histories for all types of computable physical laws. He also pointed out that there is an optimally efficient way of computing all computable universes based on Leonid Levin’s universal search algorithm (1973). In 2000 he expanded this work by combining Ray Solomonoff’s theory of inductive inference with the assumption that quickly computable universes are more likely than others. This work on digital physics also led to limit-computable generalizations of algorithmic information or Kolmogorov Complexity and the concept of Super Omegas, which are limit-computable numbers that are even more random (in a certain sense) than Gregory Chaitin’s number of wisdom Omega.
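The time-sharing scheme behind Levin’s universal search can be sketched as follows: in phase i, every program of length l ≤ i is run for at most 2^(i − l) steps, so shorter programs receive a larger share of the time. The program space (bit strings), the interpreter `toy_run`, and the helper names are hypothetical stand-ins chosen to keep the example self-contained; a real universal search would run programs on a universal machine.

```python
from itertools import product


def levin_search(run, is_solution, max_phase=20):
    """Return (program, phase) for the first solution found, or None.

    `run(program, max_steps)` must return the program's output, or None
    if the program does not finish within `max_steps` steps.
    """
    for phase in range(1, max_phase + 1):
        for length in range(1, phase + 1):
            # Shorter programs get exponentially more time per phase.
            budget = 2 ** (phase - length)
            for bits in product("01", repeat=length):
                program = "".join(bits)
                output = run(program, budget)
                if output is not None and is_solution(output):
                    return program, phase
    return None


def toy_run(program, max_steps):
    """Hypothetical interpreter: the bit string encodes an integer n and
    'computing' n is assumed to take n + 1 steps."""
    value = int(program, 2)
    steps_needed = value + 1
    return value if steps_needed <= max_steps else None


# Search for a program whose output squares to 49.
result = levin_search(toy_run, lambda value: value * value == 49)
```

With this toy interpreter the shortest program producing 7 is `"111"`, which needs 8 steps and therefore first fits its budget in phase 6.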
Important recent research topics of his group include universal learning algorithms and universal AI. Contributions include the first theoretically optimal decision makers living in environments obeying arbitrary unknown but computable probabilistic laws, and mathematically sound general problem solvers such as the remarkable asymptotically fastest algorithm for all well-defined problems, by his former postdoc Marcus Hutter. Based on the theoretical results obtained in the early 2000s, he is actively promoting the view that in the new millennium the field of general AI has matured and become a real formal science.
An old dream of computer scientists is to build an optimally efficient universal problem solver. Jürgen uses Gödel’s self-reference trick to achieve this. A Gödel Machine is a computer whose original software includes axioms describing the hardware and the original software (this is possible without circularity), plus whatever is known about the (probabilistic) environment, plus some formal goal in the form of an arbitrary user-defined utility function, e.g., cumulative future expected reward in a sequence of optimization tasks. The original software also includes a proof searcher which uses the axioms (and possibly an online variant of Levin’s universal search) to systematically generate pairs (“proof”, “program”) until it finds a proof that a rewrite of the original software through “program” will increase utility. The machine can be designed such that each self-rewrite is necessarily globally optimal in the sense of the utility function, even those rewrites that destroy the proof searcher.
Low-Complexity Art / Theory of Beauty
Jürgen’s low-complexity artworks (since 1997) can be described by very short computer programs containing very few bits of information, and reflect his formal theory of beauty based on the concepts of Kolmogorov complexity and minimum description length.
He writes that since age 15 or so his main scientific ambition has been to build an optimal scientist, then retire. First he wants to build a scientist better than himself (he jokes that his colleagues claim this should be easy) who will then do the remaining work. He says he “cannot see any more efficient way of using and multiplying the little creativity he’s got”.
Jürgen’s more than 100 peer-reviewed articles cover topics including Recurrent Neural Networks; Artificial Evolution; Active Exploration, Artificial Curiosity & What’s Interesting; Nonlinear Independent Component Analysis (ICA), Unsupervised Learning, Redundancy Reduction; Generalized Algorithmic Information, Generalized Algorithmic Probability, Super Omegas; Speed Prior: A New Simplicity Measure for Near-Optimal Computable Predictions (based on the fastest way of describing objects, not the shortest); Computable Universes & Algorithmic Theory of Everything; and Theory of Beauty & Low-Complexity Art. Read the full list of his publications!
Jürgen earned his diploma in computer science in 1987, his Ph.D. in 1991 and his Habilitation in 1993, all from TUM. Watch “In the beginning was the code: Jürgen Schmidhuber” at TEDxUHasselt.