MNIST Digit Recognition — V1

Java

An earlier, educational feedforward neural network written in pure Java (sigmoid activations and backpropagation), with MSE-based convergence checking and CSV weight serialization. It was built without any external ML library and focuses on the core mechanics of training and inference.

This is the first iteration of my from-scratch neural network work, built for the Artificial Intelligence course. It uses no external libraries, just double[][] and the chain rule.

What’s inside

Training process

  1. Forward pass: compute the network output for each sample.
  2. Error: difference between predicted and expected.
  3. Backward pass: compute gradients via backpropagation through each layer.
  4. Update: adjust weights and biases via gradient descent.
  5. Convergence check: monitor MSE across epochs; stop when it falls below the threshold or starts climbing.
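
The five steps above can be sketched in plain Java as follows. This is a minimal illustration, not the project's actual source: the class name TinyNet, its method names, the per-sample update schedule, the weight initialization range, and the maxEpochs safety cap are all assumptions made for the example. Constant factors of the MSE gradient are folded into the learning rate.

```java
import java.util.Random;

/** Hypothetical one-hidden-layer sigmoid network; names and layout are assumptions. */
final class TinyNet {
    final double[][] w1, w2; // w1: [hidden][inputs], w2: [outputs][hidden]
    final double[] b1, b2;

    TinyNet(int inputs, int hidden, int outputs, long seed) {
        Random rng = new Random(seed);
        w1 = randomMatrix(hidden, inputs, rng);
        w2 = randomMatrix(outputs, hidden, rng);
        b1 = new double[hidden];
        b2 = new double[outputs];
    }

    static double[][] randomMatrix(int rows, int cols, Random rng) {
        double[][] m = new double[rows][cols];
        for (double[] row : m)
            for (int j = 0; j < cols; j++) row[j] = rng.nextDouble() - 0.5; // assumed init range
        return m;
    }

    static double sigmoid(double z) { return 1.0 / (1.0 + Math.exp(-z)); }

    /** Computes sigmoid(W x + b) for one layer. */
    static double[] layer(double[][] w, double[] b, double[] x) {
        double[] a = new double[w.length];
        for (int i = 0; i < w.length; i++) {
            double z = b[i];
            for (int j = 0; j < x.length; j++) z += w[i][j] * x[j];
            a[i] = sigmoid(z);
        }
        return a;
    }

    double[] predict(double[] x) { return layer(w2, b2, layer(w1, b1, x)); }

    /** Runs steps 1-4 on one sample and returns its summed squared error. */
    double trainSample(double[] x, double[] target, double lr) {
        double[] h = layer(w1, b1, x);              // step 1: forward pass
        double[] y = layer(w2, b2, h);
        double[] deltaOut = new double[y.length];
        double squared = 0;
        for (int k = 0; k < y.length; k++) {
            double err = y[k] - target[k];          // step 2: error
            squared += err * err;
            deltaOut[k] = err * y[k] * (1 - y[k]);  // step 3: chain rule through the sigmoid
        }
        double[] deltaHid = new double[h.length];   // computed before w2 is modified
        for (int i = 0; i < h.length; i++) {
            double s = 0;
            for (int k = 0; k < y.length; k++) s += deltaOut[k] * w2[k][i];
            deltaHid[i] = s * h[i] * (1 - h[i]);
        }
        for (int k = 0; k < y.length; k++) {        // step 4: gradient descent update
            for (int i = 0; i < h.length; i++) w2[k][i] -= lr * deltaOut[k] * h[i];
            b2[k] -= lr * deltaOut[k];
        }
        for (int i = 0; i < h.length; i++) {
            for (int j = 0; j < x.length; j++) w1[i][j] -= lr * deltaHid[i] * x[j];
            b1[i] -= lr * deltaHid[i];
        }
        return squared;
    }

    /** Step 5: stops when MSE drops below threshold, rises, or maxEpochs is reached. */
    double train(double[][] xs, double[][] ts, double lr, double threshold, int maxEpochs) {
        double prev = Double.POSITIVE_INFINITY;
        for (int epoch = 0; epoch < maxEpochs; epoch++) {
            double sum = 0;
            for (int n = 0; n < xs.length; n++) sum += trainSample(xs[n], ts[n], lr);
            double mse = sum / (xs.length * ts[0].length);
            if (mse < threshold || mse > prev) return mse;
            prev = mse;
        }
        return prev;
    }

    public static void main(String[] args) {
        double[][] xs = {{0, 0}, {0, 1}, {1, 0}, {1, 1}};
        double[][] ts = {{0}, {1}, {1}, {1}}; // logical OR as a tiny stand-in for digit data
        TinyNet net = new TinyNet(2, 3, 1, 42L);
        System.out.println("final MSE: " + net.train(xs, ts, 0.1, 0.0001, 5000));
    }
}
```

The MSE reported for an epoch is accumulated while the weights are still changing, which is a common shortcut; evaluating it in a separate pass after each epoch would give a cleaner curve at the cost of extra forward passes.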

Defaults

Parameter        Value
Learning rate    0.1
MSE threshold    0.0001
Architecture     1 hidden layer
Training data    Lines 1–280 of dataset.csv
Test data        Lines 1–800 of dataset.csv
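
As an illustration of how the line ranges above could be turned into training and test arrays, here is a small sketch. The helper name CsvSlice.readRows is hypothetical, and the file layout is assumed: comma-separated numeric values, one sample per line, no header row. The real layout of dataset.csv may differ.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

/** Hypothetical helper for slicing a numeric CSV by line range. */
final class CsvSlice {
    /** Reads lines first..last (1-based, inclusive) and parses each as comma-separated doubles. */
    static double[][] readRows(Path file, int first, int last) throws IOException {
        if (first < 1) throw new IllegalArgumentException("first must be >= 1");
        List<String> lines = Files.readAllLines(file);
        int end = Math.min(last, lines.size());      // clamp to the file length
        int count = Math.max(0, end - first + 1);
        double[][] rows = new double[count][];
        for (int i = 0; i < count; i++) {
            String[] cells = lines.get(first - 1 + i).split(",");
            rows[i] = new double[cells.length];
            for (int j = 0; j < cells.length; j++) {
                rows[i][j] = Double.parseDouble(cells[j].trim());
            }
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get(args.length > 0 ? args[0] : "dataset.csv");
        if (!Files.exists(file)) {
            System.out.println("no such file: " + file);
            return;
        }
        double[][] train = readRows(file, 1, 280);   // ranges taken from the defaults above
        double[][] test = readRows(file, 1, 800);
        System.out.println(train.length + " training rows, " + test.length + " test rows");
    }
}
```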

Output artifacts

File              Contents
pesos.csv         Saved weights after training
mse_values.txt    MSE per epoch; useful for plotting the learning curve
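
A rough sketch of how these two artifacts might be written and read back is shown below. Only the file names come from this project; the Artifacts class, its method names, and the on-disk layout (one weight-matrix row per CSV line, one MSE value per text line) are assumptions for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

/** Hypothetical serialization helpers; the real file layout may differ. */
final class Artifacts {
    /** Writes each weight-matrix row as one comma-separated line. */
    static void saveWeights(Path file, double[][] weights) throws IOException {
        List<String> lines = new ArrayList<>();
        for (double[] row : weights) {
            StringJoiner joiner = new StringJoiner(",");
            for (double w : row) joiner.add(Double.toString(w)); // round-trips exactly
            lines.add(joiner.toString());
        }
        Files.write(file, lines);
    }

    /** Reads a matrix written by saveWeights, skipping blank lines. */
    static double[][] loadWeights(Path file) throws IOException {
        List<double[]> rows = new ArrayList<>();
        for (String line : Files.readAllLines(file)) {
            if (line.trim().isEmpty()) continue;
            String[] cells = line.split(",");
            double[] row = new double[cells.length];
            for (int j = 0; j < cells.length; j++) row[j] = Double.parseDouble(cells[j]);
            rows.add(row);
        }
        return rows.toArray(new double[0][]);
    }

    /** Writes one MSE value per line, in epoch order. */
    static void saveMse(Path file, List<Double> msePerEpoch) throws IOException {
        List<String> lines = new ArrayList<>();
        for (double mse : msePerEpoch) lines.add(Double.toString(mse));
        Files.write(file, lines);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("nn-artifacts"); // avoids overwriting real output
        saveWeights(dir.resolve("pesos.csv"), new double[][] {{0.1, -0.2}, {0.3, 0.4}});
        saveMse(dir.resolve("mse_values.txt"), Arrays.asList(0.25, 0.12, 0.07));
        System.out.println("wrote artifacts to " + dir);
    }
}
```

A network with several weight matrices and bias vectors would need either one file per matrix or a separator convention between them; that choice is left open here.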

How it relates to the V2 work

This V1 network laid the groundwork for the much more substantial MNIST Digit Recognition NN V2, which replaces sigmoid with ReLU, adds a softmax output layer with cross-entropy loss, switches to mini-batch SGD, and ships a live web playground that runs the trained weights directly in the browser.