First Australasian Computational Intelligence Summer School (ACISS’09)

30 November – 1 December 2009, Melbourne, Australia

 

 

ACISS'09 Tutorials


Tutorial 1: Computational Intelligence in Games

Abstract:  These tutorials are for those interested in the applications of computational intelligence to games. Machine intelligence and games have a long and illustrious association stretching from the pioneering days of computing right up to the present day. Traditional games are a marvellous lens through which to study such machine intelligence problems as planning, reasoning, learning and pattern recognition, while modern interactive computer games pose challenging problems that computational intelligence methods can help to solve.

Two tutorials are offered: an introductory one for novices, and an advanced one for those having some familiarity with the field. Participants can elect to take either or both, depending on their background.

Computational Intelligence in Games – Part I

This tutorial introduces how computational intelligence techniques can be applied for learning in games and will consist of two parts: the first, an introduction to basic techniques, approaches, and existing software frameworks; the second, cases studies and a brief history of past successes.  No prior knowledge is required.  The emphasis will be on the application rather than the theory.

Computational Intelligence in Games – Part II

This tutorial will be a practical tutorial in which participants will work in small teams to develop an agent to control a game character (a bot) for Unreal Tournament 2004, a popular First Person Shooter.  Participants will need some Java programming skills, and familiarity with basic computational intelligence methods (the introductory tutorial CI in Games – Part I would provide this). The aim of the exercise will be to make the bot act in a human-like way, in the spirit of the BotPrize Contest.

Presenters: Dr. Luigi Barone and A/Prof. Philip Hingston

    

Luigi Barone (luigi@csse.uwa.edu.au) and Philip Hingston (p.hingston@ecu.edu.au) were General Chairs of the 2008 IEEE Symposium on Computational Intelligence and Games (www.csse.uwa.edu.au/cig08 ) and are both associate editors of the new IEEE Transactions on Computational Intelligence and AI in Games. They have published both separately and independently on topics encompassing mathematical games such as the Iterated Prisoner’s Dilemma and the Hawks and Doves game, board and card games such as Poker, Othello and Hnefatafl, drinking games like Spoof, and computer games including PacMan and Unreal Tournament. Philip is the organiser of the BotPrize Contest (www.botprize.org), an international competition for bot programmers.


Tutorial 2: How to do good research in database/data mining, and get it published!

Abstract: While the top data mining/databases conferences have traditionally enjoyed an unusually high quality of reviewing, there is no doubt that publishing in them (and other high quality conferences) is very challenging. This is especially true for young faculty, grad students whose primary advisor is not an experienced data mining/databases author, or people from outside the community (i.e. a biologist or mathematician who has a result that might greatly interest the database/data mining community).

In this tutorial Dr. Keogh will demonstrate some simple ideas to enhance the probability of success in getting your paper published in a top data mining conference; and after the work is published, getting it highly cited.

These tips and tricks are based on 12 years experience as a prolific author and reviewer, and wisdom solicited from many of the most prolific data mining researchers/reviewers.

Topics covered in the tutorial include:

  • Finding the right problems to work on (80% of the battle).
  • Don't summarize, sell! Writing abstracts that put the reviewer on your side from the start.
  • Getting or creating the perfect dataset. Experiments that tell a story.
  • Making effective and interesting figures.
  • Getting the reviewers on your side.
  • The top-ten avoidable reasons why papers get rejected from ICDE.
  • Three simple tricks to increase the number of citations to your work.

While Dr. Keogh does not claim to have a “magic bullet” for publishing in database/data mining, his significant track record of publishing in top data mining venues, combined with extensive (and deliberately uncredited) experience in helping younger researchers “break-in” to ICDE/ICDM/VLDB/SIGKDD have placed him in a unique position to share useful and actionable advice.

This tutorial is sponsored by the Monash University Centre for Research in Intelligent Systems (CRIS).

Presenter: Professor Eamonn Keogh

Dr. Keogh is a prolific author in data mining/database conferences. As of June 2009, he is one of only three people to have at least ten papers in each of the top three data mining conferences, ACM SIGKDD, IEEE ICDM and SIAM SDM (The other two are Philip Yu and Jiawei Han). While he is only 8 years out from his PhD he has already obtained an H-index of 34, and obtained full professorship at the University of California-Riverside.

He has given well-received tutorials at SIGKDD (three times), ICDM (twice), VLDB, SDM, ACM Multimedia, CIKM and a host of other venues.

Several of his papers have won “best paper” awards. In addition he has won several teaching awards, and he was the sole recipient of the University of California Riverside University Scholar for 2008. He is the recipient of a 5-year NSF Career Award for “Efficient Discovery of Previously Unknown Patterns and Relationships in Massive Time Series Databases” and two additional large NSF awards for data mining.

What people have said about a previous version of this tutorial.

The tutorial is quite long, but informative and entertaining. I recommend (it) to you. William Webber.

I wish I have read your slides earlier. Very helpful. Lei Tang, Tempe, AZ

I just got great recommendation from a co-worker about your tutorial. Mandis S. Beigi. IBM Watson Research Center

It is very useful for the beginners to research. Xin Zhao

..found it very interesting... it should be very useful.. Huan Liu

Very nice slides! It is very useful, especially as I am currently preparing a paper... Ardian Kristanto Poernomo

I am even grateful for the advice you provide in your SIGKDD tutorial. It makes me felt like i have spent the last years in vain. Sian Lun.

Excellent stuff! I have forwarded it to my group for serious reading! Naren Ramakrishnan. Virginia Tech

Very glad I attended your tutorial in KDD2009. It was very informative. Ding Feng. Nokia

This tutorial is so important, helpful, interesting, beautiful and valuable to me that I have read the tutorial five times. I like this tutorial so much that I am going to recommend it to all my classmates and friends. Suke Li

..wonderful presentation in KDD`09.. ...very useful for me. Chang Cheng, HP Labs China

..quite helpful for me, my colleagues, and my students. Sang-Wook Kim

Today I read your tutorial and liked it. It will be in my archives always. Ahmet Yukselturk

I am using it as a resource in my machine learning class.. Elena Filatova

an amazing tutorial at SIGKDD. Supheakmungkol Sarin


Tutorial 3: Mining Massive Collections of Shapes and Time Series: With Case Studies in Anthropology and Astronomy

Abstract: Time series and multimedia data are ubiquitous; large volumes of such data are routinely created in scientific, industrial, entertainment, medical and biological domains. Examples include gene expression data, medical imagery, electrocardiograms, electroencephalograms, gait analysis, stock market quotes, space telemetry, microarrays, zoology etc. To mine such data we must chose algorithms and data representations. While most representations used in the past have been real valued (i.e wavelets and Fourier methods) in this tutorial I advocate for using discrete (symbolic) representations of the data. Symbolic representations allow us to use very useful algorithms and data structures which are not available for real data, for example suffix trees, hashing and Markov models.

The tutorial will be illustrated with numerous real world examples created just for this tutorial, including examples from archeology (petroglyphs and projectile points), microscopy (nematodes and blood cells), historical manuscripts, zoology, motion capture and biometrics. The data mining tasks considered include indexing, classification, clustering, novelty discovery, motif discovery and visualization.

This tutorial is sponsored by the Monash University Centre for Research in Intelligent Systems (CRIS).

Presenter: Professor Eamonn Keogh

Dr. Keoghs research interests are in Data Mining, Machine Learning and Information Retrieval. He has published over 100 papers on time series/shape mining in venues such as SIGMOD, SIGKDD, SIGIR, SIGGRAPH, VLDB, EDBT, PKDD, PAKDD, IEEE ICDM, IEEE ICDE, SIAM SDM, IDEAL, FQAS, SSDM, AI and INTERFACE conferences and in the TODS, DMKD, KAIS, INFORMATION VISUALIZATION and IJTAI journals. Several of his papers have won “best paper” awards. In addition he has won several teaching awards. He is the recipient of a 5-year NSF Career Award for “Efficient Discovery of Previously Unknown Patterns and Relationships in Massive Time Series Databases”, two additional million dollar NSF grants, and a grant from Aerospace Corp to develop a time series visualization tool for monitoring space launch telemetry. His papers on time series data mining have been referenced well over 5,000 times (see www.cs.ucr.edu/~eamonn/selected_publications.htm).

What people have said about a previous version of this tutorial.

“I thought it was a phenomenal introduction to the material..” Neal J. Rothleder, Ph.D. | Microsoft - CRM Customer Profiling and Analysis

“…an excellent tutorial… made a convincing case at the right level of detail for a tutorial” Darren Vengroff. Amazon.com

“The (SIGKDD tutorial) content is informative, technical and interesting, and lots of fun to read” Raymond Tsang: Copperleaf Consulting Group

“ (I) was very impressed by your tutorial at KDD-04 on time series - great work!… … quite a few people who have attended your KDD-04 tutorial told me it was great!” Dr. Gregory Piatetsky-Shapiro. Editor, KDnuggets.com

“May I use some of your (tutorial slides) in my teaching? They are very nice. I like them very much. In addition to the great technical stuff, the art work is amazing” Dr. Jian Pei Simon Fraser University

“…your tutorials on time series are great” Dr George Kollios, Boston University

“… excellent tutorial concerning temporal mining ...” Dr. Margaret H. Dunham, in her book, Data Mining Introductory and Advanced Topics. pp 273

“I have read your tutorial, great work!!” Dr Paolo Capitani, University of Bologna - Italy

“My students and I enjoyed your tutorials very much. They cover the important concepts in an easy to understand and entertaining way.” Dr. Yan Huang, University of North Texas

“Your presentation on ICDM '01-A Tutorial on Indexing and Mining Time Series Data, helped me a lot in understanding time series data mining, they are the clearest slides I had ever seen.” Zengchang Qin, University of Bristol.

“…I think its contents (SIGKDD tutorial) are FANTASTIC!” Magdiel Galan, Grad Student Arizona State University

“The tutorial is a fantastic resource. Your capacity for work is amazing!”Howard J. Hamilton. University of Regina, Canada.

Tutorial 4: Particle Swarm Optimization

Introduction to Particle Swarm Optimization - Part I

Since the first publication of Particle Swarm Optimization (PSO) in 1995, the number of research papers on PSO, and the number of researcher in PSO, have exploded. Many variations of the PSO have been developed to improve its performance, studies have been done to understand the dynamics of particles, and adaptations have been developed to apply the PSO to different optimization problem types.

This tutorial begins with a gentle introduction to PSO, followed by some specialized topics. The introductory part will have as its objective to provide the attendee with an overview of PSO and its basic variations. A significant problem with the standardPSO will be illustrated, and a few results from studies of particle trajectories will be presented. The specialized topics will consider PSO models that are specifically designed for multimodal optimization, multiobjective optimization, coevolution, and constraint handling. The details of these topics are:

1. Basic PSO: The philosophy of PSO will be discussed, and the basic (original) PSO algorithms will be explained and illustrated. The need for social network structures will be discussed, as well as the importance of PSO control parameters, basic variations (velocity clamping, inertia, constriction). An overview of performance criteria will be given. 2. Particle Trajectories: Illustrations of particle trajectories and the influence of parameter choices on trajectories will be given. Heuristics for the selection of control parameter values will be given. 3. PSO problem: A problem with the PSO that causes premature convergence will be discussed, and solutions proposed. 4. Multimodal optimization: speciation and niching techniques used in PSO will be described. Illustration of a speciation-based PSO will be provided, as well as analysis of its performance and some research issues. 5. Multiobjective optimization: an overview of existing multiobjective PSO models will be provided. In addition, a multiobjective PSO will be shown to have a fast convergence and nice spread of solutions across the Pareto optimal front. 6: Coevolutionary PSO: an example of employing a cooperative coevolutionary PSO for solving large scale optimization optimization problem will be presented. 7. Constraint handling: a brief overview of existing constraint handling techniques adopted in PSO will be provided.

Presenter: Dr. Xiaodong Li

Dr Xiaodong Li  is an associate editor of the IEEE Transactions on Evolutionary Computation, and International Journal of Swarm Intelligence Research (IJSIR). He is a member of IEEE CIS Task Force on Swarm Intelligence since its inception, and a member of IEEE CIS Task Force on Evolutionary Computation in Dynamic and Uncertain Environments (ECiDUE). He is a member of the Technical Committee on Soft Computing, Systems, Man and Cybernetics Society, IEEE, and a member of IASR Board of Editors for the Journal of Advanced Research in Evolutionary Algorithms (JAREA). He is a Vice Chair of Computational Intelligence Society, IEEE Victorian Section, Australia. He was the General Chair of the 7th International Conference on Simulated Evolution And Learning (SEAL'08), and currently a Program Co-chair of the 22nd Australasian Joint Conference on Artificial Intelligence (AI'09).

Particle Swarm Optimization in Dynamic Environments - Part II

Abstract:

The original particle swarm optimization (PSO) algorithms have been developed to solve unconstraint, static continuous-valued optimization problems. Due to the characteristics of PSO, it can not be applied to find solutions in dynamically changing environments. The PSO approach has to be adapted in order to inject diversity into swarms such that the exploration abilities of the swarm is increased. This then allows PSO to find and track optima in dynamic environments.

This tutorial will start by formally defining dynamic environments and discussing different classes of dynamic environments, as well as classes of dynamic optimization problems. Then an introduction to PSO will be provided, with an explanation of why the original PSO can not be used in dynamic environments. Adaptations of PSO to find and track single solutions in dynamic, single objective, and unconstrained environments will then be discussed.

The tutorial will then continue to discuss more complex dynamic optimization problems. It will be shown how PSO can be adapted to track multiple solutions in a dynamic environment, and results will be given to illustrate the performance of PSO in this task. Dynamic multi-objective optimization problems will be considered, discussing how a vector-evaluated PSO can be used to solve dynamic multi-objective optimization problems. Finally, the ability of PSO to cluster temporal data and to train neural networks in the presence of concept drift will be illustrated.

Presenter: Professor Andries P. Engelbrecht bedau
 

 

 

 

 

Dr. Andries P. Engelbrecht is a Full Professor in Computer Science at the Department of Computer Science, University of Pretoria. He manages a research group of 40 Masters and PhD students, most of whom do research in swarm intelligence. He has recently authored a book, “Fundamentals of Computational Swarm Intelligence”, published by Wiley. He is also the author of a book, “Computational Intelligence: An Introduction”, also published by Wiley. He has presented tutorials on PSO and Coevolutionary methods for evolving game agents at IEEE CEC 2005 and IEEE CIG 2005 respectively. He is co-presenter of a tutorial on PSO and DE at IEEE CEC 2007. He has published approximately 100 papers in the last decade, serves as a reviewer for a number of conferences and journals, and is an associate-editor of IEEE TEC, and serves on the editorial board of two other journals. He served as a member of a large number of conference program committees, and is in the organizing committee of several conferences.


Tutorial 5: Introduction to Bayesian Network Models

Abstract: These tutorials will explain the central ideas of Bayesian network (BN) technology, including what BNs can do, how they work, how to knowledge-engineer them, automated causal discovery and reviewing applications of the technology. They are intended for students and researchers new to Bayesian networks.

Presenters: A/Prof. Ann Nicholson and A/Prof. Kevin Korb

Dr Ann Nicholson (ann.nicholson/at/infotech()monash()edu()au) is an Associate Professor in the Clayton School of Information Technology, Monash University. She did her doctorate in the robotics research group at Oxford University (1992), working on dynamic Bayesian networks for discrete monitoring. Since then she has focused on AI methods for reasoning under uncertainty, including knowledge engineering with Bayesian networks and applications of Bayesian networks. She is currently Co-Program-Chair of the 22nd Australasian Joint Conference on Artificial Intelligence (Melbourne, 2009).

Dr Kevin Korb (kbkorb/at/gmail()com) is a Reader in the Clayton School of Information Technology, Monash University. His PhD examined the philosophical foundations of automating causal learning (Indiana University, 1992). His research interests include Bayesian philosophy of science, causal discovery algorithms, Bayesian networks, artificial life simulation, and evolutionary simulation. He is currently Co-Chair of the Australian Conference on Artificial Life (Melbourne, 2009). Korb and Nicholson are joint authors of the monograph "Bayesian Artificial Intelligence" (CRC, 2004).

http://www.csse.monash.edu.au/bai/

Introduction to Bayesian Network Models - Part I

Topics: Bayesian networks, inference in BNs, dynamic and decision networks, causal Bayesian networks, knowledge engineering BNs, causal discovery, applications.

Introduction to Bayesian Network Models - Part II

This tutorial will begin with hands-on use of the Netica Baysian Network software (www.norsys.com). Attendees will work through a series of practical exercises to learn how to build a BNs and decision networks in Netica, as well how to use Netica for sensitivity analysis and learning parameters from data. In the second part of this tutorial, attendees will data mine BN structures from sample data and from the combination of expert priors and data.


Tutorial 6: Estimation of distribution algorithms

Abstract: Estimation of Distribution Algorithms (EDAs) are a class of metaheuristic optimization algorithms that have developed mainly within the field of Evolutionary Computation. They combine ideas and techniques from evolutionary algorithms, machine learning and statistical modelling. EDAs attempt to learn and exploit the structure of a solution space by learning a probabilistic model and then use this model to drive the search process.

This tutorial will provide an introduction to EDAs, assuming minimal prior knowledge (background concepts will be reviewed as required). It will give an overview of the history of the field, discuss key algorithms in detail and highlight the main properties of interest. The final part of the tutorial will focus on what is "state of the art" in EDAs and identify key open questions and directions of future research.

Presenter: Dr. Marcus Gallagher

Dr Marcus Gallagher is a Senior Lecturer in the School of Information Technology and Electrical Engineering at the University of Queensland. His main research interests are metaheuristic optimization and machine learning algorithms, in particular techniques based on statistical modelling. He is also interested in biologically inspired algorithms, methodology for empirical evaluation of algorithms and the visualization of high-dimensional data.


Tutorial 7: Practical introduction to backpropagation

Presenter: Professor Tom Gedeon

(Note: Unfortunately this tutorial has just been cancelled due to unforeseen reasons).


Tutorial 8: Genetic Programming for Data Mining

Abstract: Genetic Programming (GP) is a population based, general methodology for solving specific problems for different kinds of tasks. It is one of the major/largest paradigms in evolutionary computation. GP can be considered a specialisation of genetic algorithms where each individual in the population is a computer program rather than bitstring chromosomes. It can be regarded as an automatic programming technique for automatically evolve computer programs for a specific task. GP is also considered a machine learning and data mining technique as it can automatically learn/evolve mathematical models and other (hidden) patterns for data mining tasks.

Since popularised by John Koza in 1992, GP has been successfully applied to many areas. GP also has one of the largest bibliographies in Artificial Intelligence even Computer Science maintained by William Bill Langdon. This tutorial will focus on the use of GP for data mining tasks particularly regression and classification.

The tutorial will consist of the following aspects. (1) GP basics, including program representation, program generation, terminals and functions, fitness functions, parameter setting and stopping criteria. (2) GP for symbolic regression tasks, including the major advantages over conventional statistical regression techniques. (3) GP for classification: basic method and problems; (4) GP for multiclass classification: fitness function and program class translation; (5) GP for multiclass classification: program structure; and (6) Understanding of GP programs: not a black box. If we have time, examples of using GP for object tracking and detection will be discussed/demonstrated.

Presenter: A/Prof. Mengjie Zhang

Mengjie Zhang (http://homepages.ecs.vuw.ac.nz/~mengjie/) is an Associate Professor at Victoria University of Wellington, New Zealand. His research is focused on Evolutionary Computation especially Genetic Programming and Particle Swarm Optimisation, Intelligent and Evolutionary Computer Vision, and Data Mining and Machine learning. He was a Program Co-chair of the 21st Australasian Joint Conference on Artificial Intelligence (AI'08) and a Program Co-Chair of the 7th International Conference on Simulated Evolution And Learning (SEAL'08). He is a board member of the Evolutionary Computation Technical Committee in the IEEE Computational Intelligence Society.


Tutorial 9: Spatially-Structured Evolutionary Computation

Abstract: This tutorial will focus on properties and potential benefits of implementing evolutionary algorithms using spatially structured populations. A traditional evolutionary algorithm uses a fully-connected, or panmictic, population structure, and consequently places no restriction on the possible paring of individuals for mating. Spatially-structured evolutionary algorithms (SSEAs) alter this behaviour by imposing a structure on the population and restricting interactions to individuals that are topologically close. This simple change introduces many interesting and potentially useful behavioural properties that can improve the likelihood of success of the evolutionary search method. The tutorial will begin by examining the techniques used to implement spatially-structured evolutionary algorithms. Then, it will examine the properties of SSEAs that are useful in extending the use evolutionary computation in various problem domains. Finally, the tutorial will present several examples of recent work using SSEAs to illustrate how they can be successfully used to solve difficult problems.

This tutorial would essentially be at an introductory level. A basic understanding of evolutionary computation methods (particularly genetic algorithms, genetic programming and niching methods) would be useful but, given that the basic algorithms would be outlined, not necessary.

The tutorial would cover the following points:

  • Basic evolutionary computation using panmictic population structures

  • Biological motivations for population structure – selection pressure, genetic drift and speciation models (such as, allopatric, sympatric, parapatric)

  • Models of population structure:

    o Island models

    o Fine-grained (diffusion) models

  • Implementation of SSEAs:

    o Population structure (topology, neighbourhood construction)

    o Selection and recombination within neighbourhoods

    o Basic algorithms – generational versus steady-state

    o Elitist replacement Behavioural properties of SSEAs:

    o Selection pressure o Genetic drift

    o Parapatric speciation

  • Examples of using SSEAs:

    o Optimisation (Numerical, combinatorial)

    o Multiobjective & multimodal fitness landscapes o Bloat control in genetic programming.

Presenter: Dr. Grant Dick

The tutorial contents will be developed jointly by Dr. Grant Dick (who will present the tutorial) and A/Prof. Peter Whigham of the Department of Information Science, University of Otago, Dunedin, New Zealand. Both researchers have an extensive background in the implementation and use of spatially-structured evolutionary algorithms and have co-authored several journal articles investigating the empirical and theoretical aspects of spatially-structured evolutionary algorithm behaviour. Additionally, Dr. Dick’s PhD thesis explored the used of spatially-structured evolutionary algorithms to discover multiple solutions to multimodal problems.


Tutorial 10: Foundations of Intelligent Agents

Abstract: The dream of creating artificial devices that reach or outperform human intelligence is an old one, however a computationally efficient theory of true intelligence has not been found yet, despite considerable efforts in the last 50 years. Nowadays most research is more modest, focussing on solving more narrow, specific problems, associated with only some aspects of intelligence, like playing chess or natural language translation, either as a goal in itself or as a bottom-up approach. The dual, top down approach, is to first find a formal (mathematical, not necessarily computational) solution of the general AI problem, and then to consider computationally feasible approximations. Note that the AI problem remains non-trivial even when ignoring computational aspects. A key property of intelligence is to learn from experience, build models of the environment from the acquired knowledge, and use these models for prediction. In philosophy this is called inductive inference, in statistics it is called estimation and prediction, and in computer science it is addressed by machine learning. The second key property of intelligence is to exploit the learned predictive model for making intelligent decisions or actions. Together, in computer science this is called reinforcement learning, in engineering it is called adaptive control, and in statistics and other fields it is called sequential decision theory. The tutorial will introduce the philosophical, statistical, and computational perspective of inductive inference, and Solomonoff's unifying universal solution. Also the unified view of the intelligent agent framework will be introduced. Putting everything together, we arrive at an elegant mathematical parameter-free theory of an optimal reinforcement learning agent embedded in an arbitrary unknown environment that possesses essentially all aspects of rational intelligence. We will argue that it represents a conceptual solution to the AI problem, thus reducing it to a pure computational problem. Despite the grand vision above, most of the tutorial necessarily is devoted to introducing the key ingredients of this theory, which are important subjects in their own right: Occam's razor; Turing machines; Kolmogorov complexity; probability theory; Solomonoff induction; Bayesian sequence prediction; minimum description length principle; intelligent agents; sequential decision theory; adaptive control theory; reinforcement learning; Levin search and extensions.

Literature:

M. Hutter, On Universal Prediction and Bayesian Confirmation Theoretical Computer Science, 384:1 (2007) 33-48 M.

Hutter, Universal Artificial Intelligence: Sequential Decisions based on Algorithmic Probability EATCS Book, Springer, Berlin (2005)

Presenter: A/Prof. Marcus Hutter

Marcus Hutter is Associate Professor in the RSISE at the Australian National University in Canberra, Australia, and NICTA adjunct. He holds a PhD and BSc in physics and a Habilitation, MSc, and BSc in informatics. Since 2000, his research is centered around the information-theoretic foundations of inductive reasoning and reinforcement learning, which resulted in 50+ published research papers and several awards. His book "Universal Artificial Intelligence" (Springer, EATCS, 2005) develops the first sound and complete theory of AI. He also runs the Human Knowledge Compression Contest (50'000 ? H-prize).

Homepage: http://www.hutter1.net

Tutorial 11: Deep Belief Nets

Deep belief nets are neural networks composed of several layers of stochastic hidden units. Although they are appealing as a rich generative model for data, until recently they have proven very difficult to train. In 2006 Geoffrey Hinton showed how, surprisingly, they could be learned by a greedy layer-wise procedure based on a particular kind of Boltzmann machine, an 'old' idea from the 1980s.

This tutorial introduces the key ideas, and discusses applications. It will cover:

  • Sigmoidal belief nets

    - what they are

    - why explaining away makes them difficult to train, and why greedily training them layer-by-layer won't work.

  • Restricted Boltzmann machines (RBMs)

    - Boltzmann machines in general, and why restricting them helps

    - what sorts of distributions RBMs can capture (example: parity!)

    - the "contrastive divergence" learning algorithm used to train them

    - why they provide a natural way to do greedy training of deep nets

  • Deep sigmoidal belief nets

    - how RBMs relate to (really) deep belief nets

    - how to learn deep belief nets by training RBMs

    - how to do better: fine tuning with the wake-sleep algorithm

    - example: digit recognition

    - using deep belief nets as auto-encoders

    - example: semantic hashing

Presenter: Dr. Marcus Frean

Marcus Frean is a Senior Lecturer in the School of Engineering and Computer Science, Victoria University of Wellington, New Zealand. Marcus has interests in machine learning, complex adaptive systems, and evolutionary dynamics.

   

Last updated: 21 September 2009 - Maintained by Xiaodong Li