How My Neural Net Sees Blackboards

For the last few weeks, I’ve been taking part in a small weekly neural net study group run by Michael Nielsen. It’s been really awesome! Neural nets are very very cool! They’re so cool, I had to use them somehow. Having been interested in mathematical handwriting recognition for a long time, I decided to train a neural net to clean up images of blackboards as a short weekend project/break. Thanks to Dror Bar-Natan’s Academic Pensieve, it was easy to get a bunch of data.

These results are hardly amazing, but I think they’re good enough to be useful.

How does this work? First, I took a lot of images and cleaned them by hand with GIMP. I imported them into Python with the Python Imaging Library (PIL), cut them into lots of 30×30 pixel patches, and transformed them into numpy arrays, creating a list of input images and a corresponding list of cleaned output images. (Forgive my ugly code, I hacked all of this together very quickly.)

```
from glob import glob
import os

import numpy as np
from PIL import Image

inputs, outputs = [], []

for fname in glob("data-images/*-cleaned.jpg"):
    # strip the "-cleaned" suffix to recover the original image's name
    name = os.path.splitext(fname)[0][:-8]
    print "processing image %s*..." % name
    origI  = Image.open(name + ".jpg")
    cleanI = Image.open(name + "-cleaned.jpg")
    tx, ty = origI.size
    for dx in range(20, tx-50, 11):
        for dy in range(20, ty-50, 11):
            box = (dx, dy, dx+30, dy+30)
            # average the RGB channels to grayscale and scale to [0, 1]
            origA  = np.array(origI .crop(box)).mean(2)/255.0
            cleanA = np.array(cleanI.crop(box)).mean(2)/255.0
            inputs.append(origA)
            outputs.append(cleanA)
```

From there, I basically just used the neural net code Michael provided as examples for the group. We make a neural net that takes 900 inputs (our 30×30 pixel patch), goes through a 650-unit hidden layer, and gives us 900 outputs (the cleaned patch).

```
net = Network([900, 650, 900])
```

And then we train on the data using stochastic gradient descent.

```
# flatten each 30x30 patch into the (900, 1) column vectors
# that the Network class expects
train_data = [(x.reshape(900, 1), y.reshape(900, 1))
              for x, y in zip(inputs, outputs)]

net.SGD(
    train_data, # training data
    1,          # epochs (repeats)
    10,         # mini-batch size
    0.0035,     # learning rate
    0.065)      # regularization constant
```

Nothing very profound here. We’re really just making a function approximator and getting it to approximate a bunch of data.

And then we stitch them back together. And voila!
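The stitching step could look something like the sketch below. I’m assuming a `feedforward` method like the one in Nielsen’s sample `Network` class, which maps a (900, 1) column vector to a (900, 1) output; the exact interface may differ.

```python
import numpy as np

def clean_image(net, orig, patch=30):
    """Run the trained net over a grayscale image patch-by-patch
    and stitch the cleaned patches back into one array.
    Any leftover border narrower than `patch` is left as zeros."""
    ty, tx = orig.shape
    out = np.zeros_like(orig)
    for dy in range(0, ty - patch + 1, patch):
        for dx in range(0, tx - patch + 1, patch):
            block = orig[dy:dy+patch, dx:dx+patch]
            # flatten to a (900, 1) column vector, clean, reshape back
            cleaned = net.feedforward(block.reshape(-1, 1))
            out[dy:dy+patch, dx:dx+patch] = cleaned.reshape(patch, patch)
    return out
```

Non-overlapping patches keep this simple; overlapping patches with averaging would hide seams at patch boundaries, at the cost of more forward passes.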

In any case, it wouldn’t surprise me if one could get similar or better results using much simpler machine vision techniques. But it’s still kind of interesting and possibly useful.
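For instance, one such simpler baseline (my guess at what a non-neural approach might look like, not anything from the project above) is a local-contrast threshold: keep only pixels noticeably brighter than their neighborhood, on the theory that chalk strokes are bright against a dark board.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def threshold_clean(gray, window=31, bias=0.05):
    """Crude non-neural baseline: compare each pixel of a grayscale
    image in [0, 1] to the mean brightness of a window x window
    neighborhood, and keep pixels at least `bias` brighter than it."""
    local_mean = uniform_filter(gray, size=window)
    return (gray > local_mean + bias).astype(float)
```

No training data needed, but also no way for it to learn what handwriting looks like, so it keeps glare and smudges just as happily as chalk.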

Obviously, the next step is to apply deep neural nets. I expect to get much better results. Unfortunately, I’ve been very busy and probably won’t have time to work on this again for a few weeks… But at some point! Once I have something more interesting, I’ll put it up on GitHub.