How My Neural Net Sees Blackboards

For the last few weeks, I’ve been taking part in a small weekly neural net study group run by Michael Nielsen. It’s been really awesome! Neural nets are very very cool! They’re so cool, I had to use them somehow. Having been interested in mathematical handwriting recognition for a long time, I decided to train a neural net to clean up images of blackboards as a short weekend project/break. Thanks to Dror Bar-Natan’s Academic Pensieve, it was easy to get a bunch of data.

[Image pairs: original blackboard photos (Etingof-130323-110039) alongside their cleaned-up versions]

These results are hardly amazing, but I think they’re good enough to be useful.

How does this work? First, I took a lot of images and cleaned them by hand with GIMP. I imported them into Python with the Python Imaging Library (PIL), cut them into lots of 30×30 pixel patches, and transformed them into numpy arrays, building a list of original patches and a corresponding list of cleaned output patches. (Forgive my ugly code, I hacked all of this together very quickly.)

from glob import glob
import os
import numpy as np
from PIL import Image

inputs, outputs = [], []

for fname in glob("data-images/*-cleaned.jpg"):
    name = os.path.splitext(fname)[0][:-8]   # strip the "-cleaned" suffix
    print "processing image %s*..." % name
    origI  = Image.open(name + ".jpg")
    cleanI = Image.open(name + "-cleaned.jpg")
    tx, ty = origI.size
    for dx in range(20, tx-50, 11):
        for dy in range(20, ty-50, 11):
            box = (dx, dy, dx+30, dy+30)
            # average the RGB channels to grayscale and rescale to [0, 1]
            origA  = np.array(origI.crop(box)).mean(2)/255.0
            cleanA = np.array(cleanI.crop(box)).mean(2)/255.0
            # flatten each 30x30 patch to the 900-vector the net expects
            inputs.append(origA.reshape((900, 1)))
            outputs.append(cleanA.reshape((900, 1)))

From there, I basically just used the neural net code Michael provided as examples for the group. We make a neural net that takes 900 inputs (our 30×30 pixel patch), passes them through a hidden layer of 650 neurons, and gives us 900 outputs (the cleaned patch).

net = Network([900, 650, 900])
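To make the architecture concrete, here's a sketch (in Python 3 / NumPy, not Michael's actual `Network` class) of the forward pass such a net computes. The weights and biases here are random stand-ins, not trained values, and the sigmoid activation is an assumption based on the example code.

```python
import numpy as np

rng = np.random.default_rng(0)
sizes = [900, 650, 900]
# one weight matrix and bias vector per layer transition
weights = [rng.standard_normal((m, n)) / np.sqrt(n)
           for n, m in zip(sizes[:-1], sizes[1:])]
biases = [rng.standard_normal((m, 1)) for m in sizes[1:]]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def feedforward(a):
    """Map a 900x1 input patch to a 900x1 output patch."""
    for w, b in zip(weights, biases):
        a = sigmoid(w @ a + b)
    return a

patch = rng.random((900, 1))   # a flattened 30x30 grayscale patch
out = feedforward(patch)       # "cleaned" patch, values in (0, 1)
```

Because the output layer is also sigmoid, every output pixel lands in (0, 1), which matches the grayscale encoding of the training patches.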

And then we train on the data using stochastic gradient descent.

train_data = zip(inputs, outputs)
split = len(train_data)/100

net.SGD(
  train_data, # training data
  1,          # epochs (passes over the data)
  10,         # mini-batch size
  0.0035,     # learning rate
  0.065)      # regularization constant

Nothing very profound here. We're really just making a function approximator and getting it to approximate a bunch of data.

And then we run the trained net over each patch of an image and stitch the cleaned patches back together. Voilà!
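The stitching step can be sketched like so, assuming non-overlapping 30×30 tiles and image dimensions that are multiples of 30. Here `clean_patch` is a stand-in for the trained net's per-patch output, not the real model:

```python
import numpy as np

def clean_patch(patch):
    # placeholder for the net: e.g. feedforward(patch.reshape(900, 1))
    # reshaped back to 30x30; here we just invert, to have a visible effect
    return 1.0 - patch

def clean_image(image, patch_size=30):
    """Tile the image, clean each tile, and reassemble the result."""
    out = np.empty_like(image)
    h, w = image.shape
    for y in range(0, h, patch_size):
        for x in range(0, w, patch_size):
            tile = image[y:y+patch_size, x:x+patch_size]
            out[y:y+patch_size, x:x+patch_size] = clean_patch(tile)
    return out

img = np.random.default_rng(1).random((60, 90))  # a 2x3 grid of tiles
result = clean_image(img)
```

A real pipeline might instead use overlapping tiles and average the overlaps, which hides seams at patch boundaries; the non-overlapping version above is just the simplest thing that works.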

In any case, it wouldn’t surprise me if one could get similar or better results using much simpler machine vision techniques. But it’s still kind of interesting and possibly useful.
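One such simpler technique would be adaptive thresholding against a local mean, with no learning at all. A sketch in plain NumPy follows; the window size and bias are made-up parameters, and this is an assumed baseline, not something the post actually tested:

```python
import numpy as np

def local_mean(img, k):
    """Mean over a (2k+1)x(2k+1) window at each pixel, via an integral image."""
    p = np.pad(img, k, mode="edge")                      # replicate borders
    s = np.pad(p, ((1, 0), (1, 0))).cumsum(0).cumsum(1)  # integral image
    n = 2 * k + 1
    return (s[n:, n:] - s[:-n, n:] - s[n:, :-n] + s[:-n, :-n]) / (n * n)

def clean(img, k=15, bias=0.05):
    # keep pixels noticeably brighter than their neighbourhood (chalk),
    # map everything else to black (board)
    return np.where(img > local_mean(img, k) + bias, 1.0, 0.0)

board = np.full((40, 40), 0.3)   # uniformly grey "board"
board[18:22, 5:35] = 0.9         # one bright chalk stroke
result = clean(board, k=5, bias=0.1)
```

Comparing local brightness rather than a single global threshold makes this robust to uneven lighting across the board, which is exactly the hard part of these photos.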

Obviously, the next step is to apply deep learning neural nets. I expect to get much better results. Unfortunately, I’ve been very busy and probably won’t have time to work on this again for a few weeks… But at some point! Once I have something more interesting, I’ll put it up on github.


