**Stages of training a convolutional neural network (CNN).**
Each point in this animation represents one of the 10,000 handwritten
digits in the MNIST
test data set. Each of the ten colors represents one of the ten digits
(dark blue = 0, ..., yellow = 9, see summary figure below).
The time evolution, measured in training epochs, tracks the
training of the neural network. In the initial frame, the untrained network
is unable to classify images -- it has a 90% error rate, just as one
would expect from random guessing. As the training progresses and the
error rate decreases, the digits in the cloud begin to cluster. (The
network was trained on a separate set of 55,000 images, so the clustering
represents success in recognizing new digits.)
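The 90% figure for the untrained network follows directly from random guessing over ten equally likely classes. A minimal sketch (using mock labels, not the actual MNIST data) checks this baseline:

```python
import numpy as np

rng = np.random.default_rng(0)
n_classes, n_samples = 10, 10_000

# Mock "true labels" for a test set the size of MNIST's (10,000 digits).
labels = rng.integers(0, n_classes, size=n_samples)

# An untrained network is statistically equivalent to guessing a class
# uniformly at random for each image.
guesses = rng.integers(0, n_classes, size=n_samples)

# A random guess matches the true label with probability 1/10,
# so the expected error rate is 9/10.
error_rate = np.mean(guesses != labels)
print(f"untrained error rate ~ {error_rate:.2f}")
```

With 10,000 samples the observed rate fluctuates around 0.90 by only a few tenths of a percent.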

This animation makes use of a new tool, InPCA, for visualizing high-dimensional data (see "Visualizing probabilistic models: Intensive Principal Component Analysis", by Katherine N. Quinn, Colin B. Clement, Francesco De Bernardis, Michael D. Niemack, and James P. Sethna). Our method bypasses the 'curse of dimensionality' that arises for large models with too much information by taking the limit of zero information using the 'replica trick'.
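The spirit of the method can be sketched in a few lines of numpy. This is a simplified, assumed reading of the InPCA paper, not its reference implementation: take the Bhattacharyya-style intensive divergence between pairs of discrete probability distributions as a squared "distance", double-center as in classical multidimensional scaling, and eigendecompose. Eigenvalues of both signs are kept (a Minkowski-like embedding), so coordinates are scaled by the square root of the eigenvalue's magnitude.

```python
import numpy as np

def inpca_embed(P, n_components=2):
    """Sketch of an InPCA-style embedding (assumed reading of the method).

    P: (m, k) array; each row is a probability distribution over k outcomes.
    """
    # Bhattacharyya overlap between every pair of distributions:
    # sum_x sqrt(p_i(x) p_j(x)); equals 1 on the diagonal.
    overlap = np.sqrt(P) @ np.sqrt(P).T
    # Intensive squared "distance": -log of the overlap (zero on the diagonal).
    D2 = -np.log(np.clip(overlap, 1e-300, None))
    # Classical-MDS double centering to form a Gram-like matrix.
    m = len(P)
    J = np.eye(m) - np.ones((m, m)) / m
    W = -0.5 * J @ D2 @ J
    # Eigendecompose; keep components with the largest |eigenvalue|,
    # allowing negative eigenvalues (Minkowski-like directions).
    vals, vecs = np.linalg.eigh(W)
    order = np.argsort(-np.abs(vals))
    vals, vecs = vals[order], vecs[:, order]
    coords = vecs[:, :n_components] * np.sqrt(np.abs(vals[:n_components]))
    return coords, vals[:n_components]

# Toy example: 20 random distributions over 5 outcomes.
rng = np.random.default_rng(1)
P = rng.dirichlet(np.ones(5), size=20)
coords, vals = inpca_embed(P)
print(coords.shape)  # (20, 2)
```

In the animation above, each row of `P` would be a network's predicted class distribution for one test digit, recomputed at each epoch.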

**Summary figure, showing digits.**

- Parameter Space Compression Underlies Emergent Theories and Predictive Models, Benjamin B. Machta, Ricky Chachra, Mark K. Transtrum, James P. Sethna, Science **342**, 604-607 (2013). See also "Physicists unify the structure of scientific theories" in the Cornell Chronicle (Anne Ju).
- Jesse Silverberg's Huffington Post blog and Kathryn McGill's vlog "Soft Matters with Jim Sethna" from The Physics Factor.
- (Unedited) Interview of Sethna by Steven Reiner, Stony Brook School of Journalism, from a workshop by the Alan Alda Center for Communicating Science sponsored by the Kavli Institute at Cornell, May 2013. Mobile version.
- Other papers on sloppy models

- Sloppy Models
- A sloppy systems biology model
- What is Sloppiness?
- What are Sloppy Models?
- Fitting Exponentials: Prediction without parameters
- Fitting Polynomials: Where is sloppiness from?
- Why sloppiness? The Sloppy Universality Class
- Differential Geometry and Sloppy Models (Transtrum)
- The Model Manifold and Hyperribbons (Transtrum)
- Sloppy Curvature (Transtrum)
- Model Manifold Comparisons of Algorithms (Kloumann)

- Why is science possible? Sloppy models in Physics.

- Sloppy model applications
- Do parameters matter? Fits versus measurements.
- Experimental design in sloppy systems
- Robustness and sloppiness
- Estimating systematic errors for interatomic potentials and for density functional theory.
- Learning digits with InPCA

Last Modified: May 9, 2019

James P. Sethna, sethna@lassp.cornell.edu; This work supported by the Division of Materials Research of the U.S. National Science Foundation, through grant DMR-1719490.

Statistical Mechanics: Entropy, Order Parameters, and Complexity, now available at Oxford University Press (USA, Europe).