Sparse Feature Learning for Deep Belief Networks
M. Ranzato, Y. Boureau, Y. LeCun
Courant Institute, New York University
Poster ID M55

Unsupervised Algorithm
- Learns an encoder by coupling it with a decoder.
- [Diagram: each stage maps INPUT -> ENCODER (W) -> CODE -> DECODER (W') -> OUTPUT; the 2nd-stage encoder-decoder is trained on the 1st-stage codes.]
- The encoder-decoder symmetry avoids the need for filter normalization.
- Simple iterative online algorithm.
- A sparsity penalty on the code removes the need to consider the partition function. (A code sketch of this encoder-decoder appears at the end of this summary.)

Training Deep Networks
- Training proceeds stage by stage; the top stage produces higher-level representations. (See the stacking sketch below.)

Inference
- A feedforward pass through the chain of encoders.

Comparison with other algorithms: PCA and the Restricted Boltzmann Machine (RBM)
- Experiments on the MNIST dataset: by trading off RMSE against the sparsity level, this machine achieves better performance using fewer bits in the code.

Results
- [Figure: some features learned at the 1st stage.]
- [Figure: reconstructions from 1-of-N codes at the 2nd stage.]
- The nonlinear mapping from input pixel intensities to class labels was discovered in a totally unsupervised way.
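
Since the poster gives no equations, the following is a minimal sketch of the encoder-decoder idea under stated assumptions: a logistic encoder with weights W, a decoder tied as W' = W.T (the symmetry mentioned above), and an L1 penalty standing in for the paper's sparsity term. The names SparseCoder, train_step, lam, and alpha are illustrative, not from the poster.

```python
# Minimal sketch, assuming a logistic encoder, a tied linear decoder
# (W' = W.T), and an L1 sparsity penalty on the code. Illustrative only.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class SparseCoder:
    def __init__(self, n_input, n_code, lam=0.1, alpha=0.01):
        # W: encoder weights; the decoder reuses W.T, so the symmetry
        # removes the need to renormalize filters after each update.
        self.W = rng.normal(0.0, 0.01, size=(n_code, n_input))
        self.lam = lam      # weight of the sparsity penalty
        self.alpha = alpha  # learning rate for online updates

    def encode(self, x):
        return sigmoid(self.W @ x)

    def decode(self, z):
        return self.W.T @ z

    def train_step(self, x):
        # One online update: follow the gradient of
        # 0.5 * ||x - W.T z||^2 + lam * ||z||_1 with respect to W.
        z = self.encode(x)
        x_hat = self.decode(z)
        err = x_hat - x                      # reconstruction residual
        dz = self.W @ err + self.lam * np.sign(z)
        dz *= z * (1.0 - z)                  # back through the logistic
        self.W -= self.alpha * (np.outer(dz, x) + np.outer(z, err))
        return 0.5 * np.dot(err, err) + self.lam * np.abs(z).sum()
```

Because the loss is a plain reconstruction-plus-penalty objective, each update is a simple gradient step; no partition function over codes ever needs to be evaluated, in contrast to RBM training.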
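Likewise, a minimal sketch of the stage-by-stage training and feedforward inference described above, reusing the hypothetical SparseCoder: each stage is trained on the codes produced by the stage below it, and inference is a single pass through the chain of encoders. The names train_stack and infer, and the layer sizes shown, are illustrative.

```python
# Minimal sketch of stage-wise training and feedforward inference,
# assuming the SparseCoder class defined above.
def train_stack(data, layer_sizes, epochs=10):
    stages, inputs = [], data
    for n_in, n_code in zip(layer_sizes[:-1], layer_sizes[1:]):
        stage = SparseCoder(n_in, n_code)
        for _ in range(epochs):
            for x in inputs:
                stage.train_step(x)
        # The next stage sees this stage's codes as its input.
        inputs = [stage.encode(x) for x in inputs]
        stages.append(stage)
    return stages

def infer(stages, x):
    # Inference: a feedforward pass through the chain of encoders.
    for stage in stages:
        x = stage.encode(x)
    return x

# Usage with 784-dim MNIST-like vectors (sizes are illustrative):
# stages = train_stack(train_vectors, [784, 200, 10])
# code = infer(stages, train_vectors[0])
```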