Title: Statistical Bases of Machine Learning

Presenters:

Professor Bruno Apolloni

Dipartimento di Scienze dell'Informazione

Università degli Studi di Milano

Via Comelico 39, 20135 Milano, Italy

E-mail: apolloni@dsi.unimi.it

Assistant Professor Dario Malchiodi

Dipartimento di Scienze dell'Informazione

Università degli Studi di Milano

Via Comelico 39, 20135 Milano, Italy

E-mail: malchiodi@dsi.unimi.it

Abstract: Machine Learning represents the new deal of statistical inference once powerful computational tools have been made available to scientists. The objects we want to infer are not yet simple parameters but entire functions. The data we process are not simple independent observations of a phenomenon; rather they represent complex links between different variables characterizing it. The inference's success depends highly on the sagacity of the algorithms processing these data in relation to their inner structure we want to discover. This tutorial provides a statistical framework for perceiving, discussing and solving the key inference problems on which a large family of machine learning instances are rooted.

The paradigmatic context is a string of data (possibly of infinite length) that we partition into a prefix we assume to be known at present (and therefore call a sample) and a suffix of unknown future data (we call a population). All these data share the feature of being observations of the same phenomenon, which is exactly the object of our inference. The basic inference tool is a twisting between properties we establish on the sample and random properties we are wondering about the population, such as the probability of matching a specific digit.

Moving from the elementary problem of estimating the parameter of a Bernoulli variable, we will revisit two basic inference tools: the computation of confidence intervals and the search for point estimators with nice properties. Then we will go on to learning problems: while the theoretical tools remain unchanged, the sample properties to be twisted on the population must be wisely devised and smartly computed. As for Boolean functions, we restate the bases of PAC learning theory facing the usual related issues, such as: i) curse of dimensionality, ii) corrupted examples, and iii) special learning devices such as Support Vector Machines. Passing to continuous variables, we will touch a few general statistical sentences that can be stated around neural network learning algorithms.

Tutorial outline:

Part I - Foundations 1hour - Presenter: Apolloni

1. Knowledge versus randomness. Fair properties that hold for sets of uncertain data. Basic properties of random variables

2. Algorithmic inference. The predictive approach: a string of bits partitioned by a cursor. A universal sample generator. Twisting argument to compute the distribution law of parameters. Confidence intervals. Solving inverse problems to find the confidence interval extremes.

Part II - Machine learning 2.5 hours

3. Computational learning. Learning Boolean functions. Error measure distribution law. The PAC learning goal. Sentry Points. A twisting argument for learning. Confidence intervals and point estimators for the learning error. Distribution-free complexity; noisy examples. 1 hour -- Presenter: Apolloni

4. Computing the hypotheses. Classification: Boolean Normal forms, SVM. Regression: Learning a confidence region for a regression line. Non linear regression. 1 hour -- Presenter: Malchiodi

5. Subsymbolic learning. Main statistical issues In search of a skillful statistic. Dimensioning training and test sets. Managing the available examples wisely. Principled stopping rules. 0.5 hour -- Presenter: Apolloni

Intended audience: The tutorial is intended for people who want to deal with the management of uncertain data in a rigorous and affordable way. This is the favorite task of the machine learning community. Their specific applications find robust theoretical roots in the tutorial within an unusually compact and well organized corpus of concepts and results. The presenters address them without sacrificing clarity and intuitiveness of results to an excessive formalism. Rather the conceptual aspects of the matter are privileged, while technical details are touched within examples. Hence the tutorial should satisfy the curiosity of many researchers involved at various levels of the vast field known today as computational intelligence. Basic knowledge of calculus and probability is helpful but not a prerequisite.

Format: We will present the material in electronic form via a videobeam. We will use both powerpoint slides and demos executed in Mathematica. Special applicative examples will run on proprietary software either in JAVA or C++. Wi will use extemporary transparencies to elucidate special points on demand.

Biographies:

Bruno Apolloni received his degree in Mechanical Engineering from the Università degli Studi di Napoli, Italy, in 1969. He is a full professor in Computer Science and teaches Cybernetics and Information Theory at the Dipartimento di Scienze dell'Informazione of the University of Milano, Italy. His main research interests are in the frontier area between probability and statistics on the one hand and theoretical computer science on the other, with special regard to computational learning, pattern recognition, optimization, control theory, probabilistic analysis of algorithms, epistemological aspects of probability and fuzziness. Since 1989 he has been Head of the Neural Networks Research Laboratory (LAREN http://laren.dsi.unimi.it) of the above department. Since 2001 he has been President of the Italian Society of Neural Networks (SIREN, http://siren.dsi.unimi.it/). He is a member of Neural Neworks and International Journal of Hybrid Intelligent Systems Editorial Boards and is also a member of the Advisdory Board book series "Advanced Intelligence" published by Advanced Knowledge International. He has authored over a hundred papers.

Dario Malchiodi received a degree in Computer Science in 1996 and a Ph.D. in Applied Mathematics in 2000, both from the University of Milano, Italy. He is currently assistant professor at the same University, where he teaches Computer Programming and Foundations of Computer Science. His main research area ranges from probability theory and mathematical statistics to various aspects of computational learning theory, including applications of these fields to neural networks, pRAMs and support-vector machines.

Sample material

The course will be based on the book: B. Apolloni, D. Malchiodi, S. Gaito: Algorithmic Inference in Machine Learning, Advanced Knowledge International, Pty Ltd Magill, Adelaide, South Australia, 2003. A PDF copy of the book may be downloaded for the sole reviewers' use from the web site http://laren.dsi.unimi.it/algorithmicinference.pdf

An extended version of the tutorial constituted a set of lectures of:

Presenters' relevant bibliography

Journals articles

  1. B. Apolloni, D. Malchiodi, C. Orovas and G. Palmas, From synapses to rules, Cognitive Systems Research 3/2 (2002) 167-201.

  1. B. Apolloni and D. Malchiodi, Gaining degrees of freedom in subsymbolic learning, Theoretical Computer Science 255 (2001) 295-321.

  1. B. Apolloni, A. Piccolboni, E. Sozio A hybrid symbolic subsymbolic controller for complex dynamical systems. Neurocomputing 37 (2001) 127-163.

  1. Apolloni B., E. Battistini, D. de Falco (1999) Higher order Boltzmann machines and entropy. J. of Physics a: Mathematical and General,32, 5529-5538.

  2. Angeleri E., B. Apolloni, D. De Falco, L. Grandi (1999) Speeding up and improving DNA fragment assembling using neural prediction techniques. Int J. of Neural Ntework, 9,523-544.

  3. Apolloni B., G. Zamponi, A.M. Zanaboni (2000) An integrated symbolic/connectionist architecture for automatic parsing of Italian sentences containing PP-attachment ambiguities. Applied Artificial Intelligence,14,202-230.

  4. Apolloni B., Zoppis I. (1999) Subymbolically managing pieces of symbolical functions for sorting. IEEE Trans on Neural Ntworks,10(5), 1099-1122.

  5. Apolloni B., C. Gentile (2000) P-sufficient statistics for PAC learning k-term-DNF formula through enumeration. Theoretical Computer Science, 230,1-37.

  6. Apolloni B., Zamponi G., Zanaboni A.M. (1998) Learning fuzzy decision trees, Neural Networks,11,885-895.

  7. Apolloni B., C. Gentile (1998) Sample Size Lower Bounds in PAC Learning by Algorithmic Complexity Theory, Theoretical Computer Science, 209,141-162.

  8. Apolloni B., Battini F., F. Lucisano (1997) A co-operating neural approach for spacecrafts attitude control, Neural Computing, 16,4,279-290.

  9. Apolloni B., D. De Falco, J.G. Taylor (1997) pRAM layout optimization, Neural Networks,10,9, 1709-1716.

  10. Apolloni B., Chiaravalli S. (1997) PAC Learning of concept classes through the boundaries of their items. Theoretical Computer Science, 172,91-120.

Books

  1. B. Apolloni, D. Malchiodi, S. Gaito: Algorithmic Inference in Machine Learning, Advanced Knowledge International, Pty Ltd Magill, Adelaide, South Australia, 2003.

  1. B. Apolloni, Marinaro, M. and Tagliaferri, R (Eds), Proceedings of WIRN2003: Italian Workshop on Neural Networks, Lecture Notes on Computer Science 2859, Berlin. Springer.

  1. B. Apolloni and F. J. Kurfess (Eds), From Synapses to Rules: Discovering Symbolic Rules from Neural Processed Data (Kluwer Academic / Plenum Publishers, New York, 2002).

Proceedings

  1. B. Apolloni, A. Brega, D. Malchiodi, G. Palmas and A. M. Zanaboni, Learning rule representations from boolean data, in Artificial Neural Networks and Neural Information Processing - ICANN/ICONIP 2003, Istanbul, Turkey, June 26-29, 2003, Proceedings, Springer, Lecture Notes in Computer Science 2714, 875-882.

  1. S. Kasderidis, J. G. Taylor, N. Tsapatoulis and D. Malchiodi, Driving Attention to the Dangerous, in Artificial Neural Networks and Neural Information Processing - ICANN/ICONIP 2003, Istanbul, Turkey, June 26-29, 2003, Proceedings, Springer, Lecture Notes in Computer Science 2714, 909-916.

  1. B. Apolloni, S. Bassis, A. Brega, S. Gaito, D. Malchiodi, and A. M. Zanaboni, A man-machine human interface for a special device of the pervasive computing world, in Proceedings of DC Tales: Tales of the Disapearing Computer Santorini, Greece, June 1-4, 2003), CTI Press, 263-267.

  1. B. Apolloni, A. Brega, D. Malchiodi, N. Valcamonica and A. M. Zanaboni, A symbolic description of the awareness state in car driving, in Proceedings of DC Tales: Tales of the Disapearing Computer Santorini, Greece, June 1-4, 2003), CTI Press, 93-96.

  1. B. Apolloni, S. Bassis, A. Brega, S. Gaito, D. Malchiodi, N. Valcamonica and A. M. Zanaboni, Monitoring of car driving awareness from biosignals, in Proceedings of WIRN2003: Italian Workshop on Neural Networks Vietri sul Mare (SA), Italy, June 5-7, 2003), Lecture Notes in Computer Science 2859 269-277.

  1. B. Apolloni, S. Bassis, S. Gaito and D. Malchiodi, Cooperative games in a stochastic environment, in Proceedings of WIRN2003: Italian Workshop on Neural Networks Vietri sul Mare (SA), Italy, June 5-7, 2003), Lecture Notes in Computer Science 2859 25-34.

  1. B. Apolloni, D. Malchiodi, C. Orovas e A. M. Zanaboni, Fuzzy Methods for Simplifying a Boolean Formula Inferred from Examples, in L. Wang, S. Halgamuge and X. Yao (Eds.), FSKD'02. Proceedings of the First International Conference on Fuzzy Systems and Knowledge Discovery (November 18-22, 2002 Singapore), Vol. 2, 554-558.

  1. B. Apolloni, S. Baraghini and G. Palmas, PAC meditation on boolean formulas, in Proceedings of SARA 2002: Symposium on Abstraction, Reformulation and Approximation Lecture Notes in Computer Science 2371, 274-281.

  1. B. Apolloni, S. Bassis, D. Malchiodi and S. Gaito, Cooperative games in a stochastic environment, in E. Damiani, R. J. Howlett, L. C. Jain and N. Ichalkaranje (Eds.), Knowledge-Based Intelligent Information Engineering Systems and Allied Technologies - KES 2002 (Proceedings of KES'2002: Sixth International Conference on Knowledge-Based Intelligent Information & Engineering Systems Crema, Italy, September 18-19, 2002), 296-300.

  1. B. Apolloni and D. Malchiodi, Twisting statistics with properties, in A. N. Morozevich, V. G. Levashenko and E. N. Zaitseva (Eds.), Proceedings of ICINASTE 2001: International Conference on Information, Networks And System Technologies (Minsk, Belarus, October 2-4, 2001) 48-56.

  1. B. Apolloni, D. Malchiodi, S. Gaito and I. Zoppis, Twisting features with properties, in M. Marinaro and R. Tagliaferri (Eds.), Proceedings of WIRN 2001: Italian Workshop on Neural Networks (Vietri sul Mare (SA), Italy, May 17-19 2001) 301-312.

Book chapters

  1. B. Apolloni, S. Bassis, S. Gaito and D. Malchiodi, Statistical bases for learning, in B. Apolloni and F. Kurfess (Eds.), From synapses to rules. Discovering symbolic rules from neural processed data, Chapter 1, New York: Kluwer Academic/Plenum Publishers, 2002, 5-40.

  1. B. Apolloni, S. Gaito, D. Iannizzi and D. Malchiodi, Learning regression functions, in B. Apolloni and F. Kurfess (Eds.), From synapses to rules. Discovering symbolic rules from neural processed data, Chapter 3, New York: Kluwer Academic/Plenum Publishers, 2002, 61-73.

  1. B. Apolloni, S. Bassis, S. Gaito and D. Malchiodi, Cooperative games in a stochastic environment, in B. Apolloni and F. Kurfess (Eds.), From synapses to rules. Discovering symbolic rules from neural processed data, Chapter 4, New York: Kluwer Academic/Plenum Publishers, 2002, 75-86.

  1. B. Apolloni, D. Malchiodi, C. Orovas and A, M. Zanaboni, Fuzzy methods for simplifying a Boolean formula inferred from examples, in B. Apolloni and F. Kurfess (Eds.), From synapses to rules. Discovering symbolic rules from neural processed data, Chapter 7, New York: Kluwer Academic/Plenum Publishers, 2002, 117-128.

  1. Agnati LF, Benfenati F, Ferri M, Morpurgo A, Apolloni B, and Fuxe K (2002)Molecular basis of learning and memory: modeling based on receptor mosaics, in B. Apolloni and F. Kurfess (Eds.), From synapses to rules. Discovering symbolic rules from neural processed data, Chapter 9, New York: Kluwer Academic/Plenum Publishers, 2002, 165-195.

  1. B. Apolloni, Morpurgo Physiological and logical brain functionalities: a hypothesis for self-referential brain activity, in B. Apolloni and F. Kurfess (Eds.), From synapses to rules. Discovering symbolic rules from neural processed data, Chapter 10, New York: Kluwer Academic/Plenum Publishers, 2002, 197-218.

  1. B. Apolloni, S. Gaito and D. Malchiodi, Learning and checking confidence regions for the hazard function of biomedical data, in B. Apolloni and F. Kurfess (Eds.), From synapses to rules. Discovering symbolic rules from neural processed data, Chapter 13, New York: Kluwer Academic/Plenum Publishers, 2002, 251-260.