So what am I looking at?
Each dot here represents a neuron from the SqueezeNet network, pre-trained to recognize 1000 types of objects. Different input images will excite different regions of neurons. The neural net runs directly in your browser (and on your GPU if you have one).
A neuron receives signals from neurons in the layer before it. The strengths of these connections get 'learned' when a neural network is trained, so a neuron may learn to ignore the input from Neuron A or give extra weight to Neuron B. If the weighted sum of all its inputs is big enough (imagine many rivers flowing into a reservoir until a dam is breached; see GIF), the neuron gets 'activated' and sends a signal out to neurons in the next layer, continuing the cycle. If the threshold is not reached, even by a smidge, there is no output (this is not strictly true of every activation function, but it works for illustrative purposes).
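The weighted-sum-and-threshold idea above can be sketched in a few lines of Python. The weights and threshold here are made up for illustration; a real SqueezeNet neuron has learned values.

```python
def neuron(inputs, weights, threshold=0.0):
    """A toy neuron: weighted sum of inputs, fire only past a threshold."""
    total = sum(i * w for i, w in zip(inputs, weights))
    # The "dam is breached": output the signal only if it clears the threshold
    return total if total > threshold else 0.0

# Neuron A's input is effectively ignored (weight 0.0),
# while Neuron B's input gets extra weight (2.0)
print(neuron([1.0, 0.5], [0.0, 2.0]))   # 1.0 -- fires
print(neuron([1.0, -0.5], [0.0, 2.0]))  # 0.0 -- below threshold, silent
```

This is essentially a ReLU-style activation: below the threshold the neuron is silent, above it the signal passes through.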
Each block in this visualization represents a layer of neurons in the neural network. These layers are arranged sequentially and you can see a vague outline of the original image in the first layer; this is because neurons in earlier layers tend to detect low-level details like edges, corners, colors, etc (known as 'features' in computer vision) and will become excited when they see these features.
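To see why early layers respond to edges, consider what a tiny filter does when slid across a row of pixels: it responds strongly exactly where brightness changes. The 1-D difference filter below is an illustrative stand-in, not an actual SqueezeNet kernel.

```python
# A row of pixels with a dark-to-bright edge in the middle
row = [0, 0, 0, 9, 9, 9]

# Slide a [-1, +1] difference filter across the row: each output is
# the change in brightness between neighboring pixels
response = [row[i + 1] - row[i] for i in range(len(row) - 1)]
print(response)  # [0, 0, 9, 0, 0] -- the peak marks the edge
```

Trained convolutional filters in the first layers behave much like this, which is why a faint outline of the input image survives there.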
Interestingly, no one explicitly programmed these neurons to identify these features. This is in sharp contrast to pre-deep-learning Computer Vision algorithms, where artisanal hand-crafted features were the norm. The network figured out on its own that these are important pieces of information for recognizing objects!
After the signals pass through the entire network, they reach the final layer, composed of 1000 neurons, each representing a type of object. The most excited neuron in this last layer is the network's guess as to what is in the image.
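Picking the "most excited neuron" is just an argmax over the final layer's activations. The scores and labels below are invented for illustration; the real network outputs 1000 scores, one per object category.

```python
# Hypothetical activations of three final-layer neurons
scores = [0.1, 2.7, 0.4]
labels = ["cat", "dog", "toaster"]

# The network's guess: the label of the neuron with the highest activation
guess = labels[max(range(len(scores)), key=scores.__getitem__)]
print(guess)  # dog
```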
This was created using keras.js and three.js. Source code is on GitHub. Tweet feedback to @albertlai!