Neural Network from Scratch (V1)
Java
The original from-scratch feed-forward neural network in pure Java — single hidden layer with sigmoid activation, backpropagation, MSE convergence tracking, and persistent weights. The starting point that became the V2 MNIST classifier.
The first iteration of my from-scratch neural network work, built for the Artificial Intelligence course. No external libraries — just double[][] and the chain rule.
What’s inside
- Neuron: weights, bias, sigmoid activation and its derivative; handles forward propagation locally.
- Layer: a collection of neurons; manages forward propagation for the entire layer and initializes weights randomly.
- NeuralNetwork: orchestrates training and testing, implements backpropagation, computes MSE, and saves/loads trained weights to/from CSV.
- Main: entry point with three execution modes: full training, test-with-saved-weights, and an interactive console mode.
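To make the Neuron's responsibilities concrete, here is a minimal sketch of a neuron with sigmoid activation and local forward propagation. The field and method names are illustrative assumptions, not the repo's actual API.

```java
import java.util.Random;

// Hypothetical sketch of the Neuron described above; names are assumptions.
public class Neuron {
    private final double[] weights;
    private double bias;

    public Neuron(int inputs, Random rng) {
        weights = new double[inputs];
        for (int i = 0; i < inputs; i++) {
            weights[i] = rng.nextDouble() * 2 - 1; // random init in [-1, 1]
        }
        bias = rng.nextDouble() * 2 - 1;
    }

    // Sigmoid activation and its derivative (expressed in terms of the output).
    public static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    public static double sigmoidDerivative(double output) {
        return output * (1.0 - output);
    }

    // Local forward propagation: weighted sum of inputs plus bias, then sigmoid.
    public double forward(double[] inputs) {
        double z = bias;
        for (int i = 0; i < inputs.length; i++) {
            z += weights[i] * inputs[i];
        }
        return sigmoid(z);
    }
}
```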
Training process
- Forward pass: compute the network output for each sample.
- Error: difference between predicted and expected.
- Backward pass: compute gradients via backpropagation through each layer.
- Update: adjust weights and biases via gradient descent.
- Convergence check: monitor MSE across epochs; stop when below threshold or when MSE starts climbing.
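The five steps above map onto a per-epoch training loop. The sketch below is a standalone, simplified version for one hidden layer of sigmoid units with per-sample gradient descent; the variable names and structure are assumptions for illustration and may differ from the repo's NeuralNetwork class.

```java
// A minimal sketch of the five training steps for a 1-hidden-layer sigmoid network.
public class TrainingSketch {

    static double sigmoid(double z) { return 1.0 / (1.0 + Math.exp(-z)); }

    // w1[h][i]: input->hidden weights, w2[o][h]: hidden->output weights
    static double trainEpoch(double[][] inputs, double[][] targets,
                             double[][] w1, double[] b1,
                             double[][] w2, double[] b2,
                             double learningRate) {
        double sumSquaredError = 0.0;
        for (int s = 0; s < inputs.length; s++) {
            double[] x = inputs[s];
            double[] t = targets[s];

            // 1. Forward pass: hidden layer, then output layer.
            double[] hidden = new double[w1.length];
            for (int h = 0; h < w1.length; h++) {
                double z = b1[h];
                for (int i = 0; i < x.length; i++) z += w1[h][i] * x[i];
                hidden[h] = sigmoid(z);
            }
            double[] out = new double[w2.length];
            for (int o = 0; o < w2.length; o++) {
                double z = b2[o];
                for (int h = 0; h < hidden.length; h++) z += w2[o][h] * hidden[h];
                out[o] = sigmoid(z);
            }

            // 2. Error and 3. backward pass: deltas via the chain rule.
            double[] deltaOut = new double[out.length];
            for (int o = 0; o < out.length; o++) {
                double error = out[o] - t[o];
                sumSquaredError += error * error;
                deltaOut[o] = error * out[o] * (1 - out[o]);
            }
            double[] deltaHidden = new double[hidden.length];
            for (int h = 0; h < hidden.length; h++) {
                double acc = 0.0;
                for (int o = 0; o < out.length; o++) acc += deltaOut[o] * w2[o][h];
                deltaHidden[h] = acc * hidden[h] * (1 - hidden[h]);
            }

            // 4. Update weights and biases via gradient descent.
            for (int o = 0; o < out.length; o++) {
                for (int h = 0; h < hidden.length; h++) w2[o][h] -= learningRate * deltaOut[o] * hidden[h];
                b2[o] -= learningRate * deltaOut[o];
            }
            for (int h = 0; h < hidden.length; h++) {
                for (int i = 0; i < x.length; i++) w1[h][i] -= learningRate * deltaHidden[h] * x[i];
                b1[h] -= learningRate * deltaHidden[h];
            }
        }
        // 5. Return MSE for the epoch so the caller can check convergence.
        return sumSquaredError / (inputs.length * targets[0].length);
    }
}
```

A caller would invoke this once per epoch, stopping when the returned MSE falls below the threshold or starts climbing relative to the previous epoch.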
Defaults
| Parameter | Value |
|---|---|
| Learning rate | 0.1 |
| MSE threshold | 0.0001 |
| Architecture | 1 hidden layer |
| Training data | Lines 1–280 of dataset.csv |
| Test data | Lines 1–800 of dataset.csv |
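Since the training and test rows are read straight out of dataset.csv, a loader for a line range is all that is needed. The sketch below shows one way this could look; the hypothetical DatasetLoader helper and the assumption that every column is numeric are illustrative, not the repo's code.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of loading a 1-based line range from dataset.csv into double[][].
public class DatasetLoader {
    public static double[][] loadRange(String path, int fromLine, int toLine) throws IOException {
        List<double[]> rows = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new FileReader(path))) {
            String line;
            int lineNumber = 0;
            while ((line = reader.readLine()) != null) {
                lineNumber++;
                if (lineNumber < fromLine) continue;
                if (lineNumber > toLine) break;
                String[] parts = line.split(",");
                double[] row = new double[parts.length];
                for (int i = 0; i < parts.length; i++) row[i] = Double.parseDouble(parts[i].trim());
                rows.add(row);
            }
        }
        return rows.toArray(new double[0][]);
    }
}
```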
Output artifacts
| File | Contents |
|---|---|
| pesos.csv | Saved weights after training |
| mse_values.txt | MSE per epoch, useful for plotting the learning curve |
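Both artifacts are plain text, so writing them is straightforward. The sketch below shows one possible layout; the exact row format of pesos.csv in the repo is not documented here, so the bias-then-weights convention is an assumption.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.util.List;

// Illustrative sketch of writing the two output artifacts.
public class ArtifactWriter {
    // One CSV row per neuron: bias first, then that neuron's weights.
    public static void saveWeights(String path, double[][] weights, double[] biases) throws IOException {
        try (PrintWriter out = new PrintWriter(path)) {
            for (int n = 0; n < weights.length; n++) {
                StringBuilder row = new StringBuilder(Double.toString(biases[n]));
                for (double w : weights[n]) row.append(',').append(w);
                out.println(row);
            }
        }
    }

    // One MSE value per line, in epoch order, for plotting the learning curve.
    public static void saveMseHistory(String path, List<Double> mseByEpoch) throws IOException {
        try (PrintWriter out = new PrintWriter(path)) {
            for (double mse : mseByEpoch) out.println(mse);
        }
    }
}
```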
How it relates to the V2 work
This V1 network laid the groundwork for the much more substantial MNIST Digit Recognition NN V2, which replaces sigmoid with ReLU, adds a softmax output layer with cross-entropy loss, switches to mini-batch SGD, and ships a live web playground that runs the trained weights directly in the browser.
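For a concrete sense of the activation change, here is a minimal sketch of V1's sigmoid next to the ReLU and softmax that V2 moved to. It illustrates the math only and is not code from either repository.

```java
// Activation functions contrasted: sigmoid (V1) vs. ReLU and softmax (V2).
public class Activations {
    static double sigmoid(double z) { return 1.0 / (1.0 + Math.exp(-z)); }

    static double relu(double z) { return Math.max(0.0, z); }

    // Softmax over the output layer's pre-activations (numerically stabilized).
    static double[] softmax(double[] z) {
        double max = Double.NEGATIVE_INFINITY;
        for (double v : z) max = Math.max(max, v);
        double sum = 0.0;
        double[] out = new double[z.length];
        for (int i = 0; i < z.length; i++) {
            out[i] = Math.exp(z[i] - max);
            sum += out[i];
        }
        for (int i = 0; i < out.length; i++) out[i] /= sum;
        return out;
    }
}
```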