Monday, May 18, 2015

Learning Torch7 - Part 2: The Basic Tutorial

From this part of the page:
https://github.com/torch/torch7/wiki/Cheatsheet#newbies

I got to the tutorials part:
https://github.com/torch/torch7/wiki/Cheatsheet#tutorials-demos-by-category

And to the first tutorial:
http://code.madbits.com/wiki/doku.php

And to its first part, about the basics:
http://code.madbits.com/wiki/doku.php?id=tutorial_basics

The tutorial is straightforward, but it ends with a request to normalize the MNIST train data.

This is the key line of the code I came up with, after a very painful realization:

train.data = train.data:type(torch.getdefaulttensortype())


It appears that the data is loaded by Torch not into Lua tables or Torch tensors but into something called "userdata". This is the type assigned to things coming from the dark depths of the C/C++ interop with Lua. This userdata had one particularly interesting feature: its contents were forced (through wrap-around) into the range [0, 256). Storing -26, for instance, would result in the value 256 - 26 = 230 appearing in the data. So casting was the first step to getting my sanity back.
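Here's a quick sketch of the wrap-around. I'm assuming the loader hands back a torch.ByteTensor (unsigned 8-bit storage) - the exact type is a guess, but the effect is the same:

b = torch.ByteTensor(1)
b[1] = -26                  -- unsigned 8-bit storage wraps modulo 256
print(b[1])                 -- prints 230

d = b:type(torch.getdefaulttensortype())  -- DoubleTensor by default
d[1] = -26
print(d[1])                 -- prints -26, as expected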

After casting back to a Torch tensor you can use :mean(), :std() and the other tensor methods, which keep this code short and quick.

By the way - this normalization is called the "z-score": http://en.wikipedia.org/wiki/Standard_score
I didn't quite catch the right way to describe the whole procedure in proper English, but that's what we are doing here: subtracting the mean and dividing by the standard deviation.
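Put together, this is roughly what the full normalization looks like (a sketch, assuming train.data holds all the training images in a single tensor, as the tutorial's loader provides):

train.data = train.data:type(torch.getdefaulttensortype())
local mean = train.data:mean()
local std = train.data:std()
train.data:add(-mean):div(std)   -- z-score: (x - mean) / std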

More useful things learned along the way:

for k,v in pairs(o) do -- or pairs(getmetatable(o))
   print(k, v)
end

is the equivalent of Python's dir(o).
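For instance, a sketch of listing the methods a tensor exposes this way (the exact keys you see depend on the Torch version):

t = torch.Tensor(3)
for k, v in pairs(getmetatable(t)) do
   print(k, v)
end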

I also started working with this IDE: ZeroBrane Studio - http://studio.zerobrane.com/

Moving on with the tutorials, supervised learning will be the subject of my next post.
