Marcus Hutter - 1 - Universal Induction & Intelligence


Published on 14 December 2012
Language: English

Foundations of Machine Learning
Marcus Hutter
Canberra, ACT, 0200, Australia
http://www.hutter1.net/
ANU RSISE NICTA
Machine Learning Summer School
MLSS-2008, 2-15 March, Kioloa

Overview
• Setup: Given (non)iid data $D = (x_1, \ldots, x_n)$, predict $x_{n+1}$
• Ultimate goal is to maximize profit or minimize loss
• Consider models/hypotheses $H_i \in \mathcal{M}$
• Max. Likelihood: $H_{\text{best}} = \arg\max_i p(D|H_i)$ (overfits if $\mathcal{M}$ is large)
• Bayes: Posterior probability of $H_i$ is $p(H_i|D) \propto p(D|H_i)\,p(H_i)$
• Bayes needs prior$(H_i)$
• Occam+Epicurus: High prior for simple models.
• Kolmogorov/Solomonoff: Quantification of simplicity/complexity
• Bayes works if $D$ is sampled from $H_{\text{true}} \in \mathcal{M}$
• Universal AI = Universal Induction + Sequential Decision Theory

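The contrast between maximum likelihood and Bayes in the overview can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and does not come from the slides: the model class is a grid of eleven Bernoulli hypotheses, the prior is uniform, and the data are three observed 1s.

```python
from fractions import Fraction

def likelihood(theta, data):
    """p(D | H_theta) for i.i.d. Bernoulli(theta) data given as a list of 0/1."""
    ones = sum(data)
    zeros = len(data) - ones
    return theta**ones * (1 - theta)**zeros

def posterior(hypotheses, prior, data):
    """Bayes: p(H_i | D) = p(D | H_i) p(H_i) / sum_j p(D | H_j) p(H_j)."""
    joint = [likelihood(h, data) * w for h, w in zip(hypotheses, prior)]
    evidence = sum(joint)
    return [j / evidence for j in joint]

def predict_one(hypotheses, post):
    """Posterior predictive probability that x_{n+1} = 1."""
    return sum(h * w for h, w in zip(hypotheses, post))

# Hypothetical model class: theta in {0, 0.1, ..., 1} with a uniform prior.
hypotheses = [Fraction(k, 10) for k in range(11)]
prior = [Fraction(1, len(hypotheses))] * len(hypotheses)
data = [1, 1, 1]  # illustrative observations

# Maximum likelihood picks the single best-fitting hypothesis.
h_ml = max(hypotheses, key=lambda h: likelihood(h, data))
p_ml = h_ml  # ML prediction for x_{n+1} = 1

# Bayes averages the predictions of all hypotheses, weighted by posterior.
post = posterior(hypotheses, prior, data)
p_bayes = predict_one(hypotheses, post)

print(p_ml, float(p_bayes))
```

In this toy setting maximum likelihood selects theta = 1 and predicts the next 1 with certainty after only three observations, while the Bayesian mixture stays below 1. A simplicity-weighted prior in the sense of Occam and Epicurus would replace the uniform `prior` list.
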
Abstract
Machine learning is concerned with developing algorithms that learn
from experience, build models of the environment from the acquired
knowledge, and use these models for prediction. Machine learning is
usually taught as a bunch of methods that can solve a bunch of
problems (see my Introduction to SML last week). The following
tutorial takes a step back and asks about the foundations of machine
learning, in particular the (philosophical) problem of inductive inference,
(Bayesian) statistics, and artificial intelligence. The tutorial concentrates
on principled, unified, and exact methods.

Table of Contents
• Overview
• Philosophical Issues
• Bayesian Sequence Prediction
• Universal Inductive Inference
• The Universal Similarity Metric
• Universal Artificial Intelligence
• Wrap Up
• Literature

Philosophical Issues: Contents
• Problems
• On the Foundations of Machine Learning
• Example 1: Probability of Sunrise Tomorrow
• Example 2: Digits of a Computable Number
• Example 3: Number Sequences
• Occam’s Razor to the Rescue
• Grue Emerald and Confirmation Paradoxes
• What this Tutorial is (Not) About
• Sequential/Online Prediction - Setup

Philosophical Issues: Abstract
I start by considering the philosophical problems concerning machine
learning in general and induction in particular. I illustrate the problems
and their intuitive solution on various (classical) induction examples.
The common principle behind their solution is Occam’s simplicity principle.
Based on Occam’s and Epicurus’ principles, Bayesian probability theory,
and Turing’s universal machine, Solomonoff developed a formal theory
of induction. I describe the sequential/online setup considered in this
tutorial and place it into the wider machine learning context.

Philosophical Problems
• Does inductive inference work? Why? How?
• How to choose the model class?
• How to choose the prior?
• How to make optimal decisions in unknown environments?
• What is intelligence?

On the Foundations of Machine Learning
• Example: Algorithm/complexity theory: The goal is to find fast
algorithms solving problems and to show lower bounds on their
computation time. Everything is rigorously defined: algorithm,
Turing machine, problem classes, computation time, ...
• Most disciplines start with an informal way of attacking a subject.
With time they get more and more formalized, often to the point
where they are completely rigorous. Examples: set theory, logical
reasoning, proof theory, probability theory, infinitesimal calculus,
energy, temperature, quantum field theory, ...
• Machine learning: Tries to build and understand systems that learn
from past data, make good predictions, are able to generalize, act
intelligently, ... Many terms are only vaguely defined or there are
many alternate definitions.

Example 1: Probability of Sunrise Tomorrow
What is the probability $p(1|1^d)$ that the sun will rise tomorrow?
($d$ = past # days sun rose, 1 = sun rises, 0 = sun will not rise)
• $p$ is undefined, because there has never been an experiment that
tested the existence of the sun tomorrow (ref. class problem).
• $p = 1$, because the sun rose in all past experiments.
• $p = 1 - \epsilon$, where $\epsilon$ is the proportion of stars that explode per day.
• $p = \frac{d+1}{d+2}$, which is Laplace's rule derived from Bayes' rule.
• Derive $p$ from the type, age, size and temperature of the sun, even
though we never observed another star with those exact properties.
Conclusion: We predict that the sun will rise tomorrow with high
probability independent of the justification.

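Laplace's rule in the sunrise example can be checked with a short sketch. It assumes the standard derivation, which this slide does not spell out: a uniform prior over the unknown sunrise probability theta, so that the probability of observing d sunrises in a row is the integral of theta^d over [0, 1]. The function names are hypothetical.

```python
from fractions import Fraction

def integral_theta_power(k):
    """Exact value of the integral of theta**k over [0, 1], i.e. 1 / (k + 1)."""
    return Fraction(1, k + 1)

def prob_sunrise(d):
    """p(1 | 1^d) under a uniform prior on the sunrise probability theta.

    By Bayes' rule this is p(1^{d+1}) / p(1^d), a ratio of two evidences.
    """
    evidence_d_plus_1 = integral_theta_power(d + 1)  # p(1^{d+1}) = 1 / (d + 2)
    evidence_d = integral_theta_power(d)             # p(1^d)     = 1 / (d + 1)
    return evidence_d_plus_1 / evidence_d

for d in (0, 1, 10, 1000):
    print(d, prob_sunrise(d))
```

The ratio simplifies to (d+1)/(d+2): with no observations the prediction is 1/2, and it approaches 1 as d grows without ever reaching it.
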
Example 2: Digits of a Computable Number
• Extend 14159265358979323846264338327950288419716939937?
• Looks random?!
• Frequency estimate: $n$ = length of sequence, $k_i$ = number of
occurrences of digit $i$ $\Longrightarrow$ probability of next digit being $i$ is $\frac{k_i}{n}$.
Asymptotically $\frac{k_i}{n} \to \frac{1}{10}$ (seems to be) true.
• But we have the strong feeling that (i.e. with high probability) the
next digit will be 5 because the previous digits were the expansion
of $\pi$.
• Conclusion: We prefer answer 5, since we see more structure in the
sequence than just random digits.
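
The two ways of extending the digit sequence can be compared in a small sketch. The frequency estimate follows the slide. Generating the decimals of pi with Machin's formula in integer arithmetic is an assumption chosen for illustration, not something the slides prescribe, and the helper names and the `guard` parameter are hypothetical.

```python
from collections import Counter

OBSERVED = "14159265358979323846264338327950288419716939937"

def frequency_estimate(sequence):
    """Return {digit: k_i / n}, the relative frequency of each digit 0-9."""
    counts = Counter(sequence)
    n = len(sequence)
    return {str(i): counts[str(i)] / n for i in range(10)}

def arctan_inverse(x, unity):
    """Integer approximation of unity * arctan(1/x) via its Taylor series."""
    term = unity // x
    total = term
    x_squared = x * x
    divisor, sign = 3, -1
    while term:
        term //= x_squared
        total += sign * (term // divisor)
        divisor += 2
        sign = -sign
    return total

def pi_decimals(count, guard=10):
    """First `count` decimals of pi as a string, using Machin's formula
    pi = 16 arctan(1/5) - 4 arctan(1/239) with `guard` extra digits."""
    unity = 10 ** (count + guard)
    pi_scaled = 4 * (4 * arctan_inverse(5, unity) - arctan_inverse(239, unity))
    return str(pi_scaled // 10 ** guard)[1:]  # drop the leading "3"

freq = frequency_estimate(OBSERVED)
decimals = pi_decimals(len(OBSERVED) + 1)
next_digit = decimals[len(OBSERVED)]

print("frequency estimate for 5:", freq["5"])
print("observed digits match pi:", decimals.startswith(OBSERVED))
print("next digit under the pi hypothesis:", next_digit)
```

The frequency model spreads its probability over all ten digits, whereas the short program above reproduces the whole observed sequence and commits to a single continuation. This is the sense in which the sequence has more structure than random digits.
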
