Monthly Archives: September 2014

The following is an excerpt from an e-mail exchange I had with a colleague who was unclear on autoencoders. It is reproduced here in the hopes that others who are looking for an explanation of the material might be able to find something with more intuition and less academic rigor.

Hi [Redacted],
I’ve not done this exercise specifically, but I can try and provide my intuitive, informal understanding of autoencoders. I’ve not gone through that article, so this understanding may be weak, but I hope it will provide some amount of guidance.

An autoencoder, at its simplest, is a neural network with as many inputs as outputs. It is trained by using the same value as both its input and its target output.

That is to say, we want to learn to reproduce our input value.

That may not appear to make much sense, but if you think of it as ‘denoising’ an input, it becomes more reasonable. We want our network to reproduce the sort of things we’ve seen in the past. If we train it to accomplish this task, then when we later use the network on new data, aberrant pixels, blobs, and noise will tend to be filtered out by the network.

There are a few types of autoencoders, some of them ‘deep’, others, like the one in this article, ‘sparse’. In the case of sparse autoencoders, we keep the link between the number of inputs and number of outputs, but add a large number of hidden nodes. In doing this, a single hidden node may correspond to one complete output reconstruction. The difficulty in this case is making sure that the network doesn’t simply learn a one-to-one-to-one mapping of input -> hidden -> output. There are ways to avoid this, but I’ll try and save that for later.

To make things sparse, in addition to having a lot of nodes in our hidden layer, we want only a few of them to be active at any point in time. We can encourage this using the L1 norm, which is basically the sum of the absolute values. If we say the cost of something is the sum of the absolute values of the activations [foreach(x in activations) { cost += Math.abs(x) }], then you’ll notice that ANY non-zero value increases our cost in proportion to its size. To minimize our cost function, we want the smallest number of units active at a time which can still reconstruct an input.

Compare this with the (squared) L2 norm: [foreach(x in activations) { cost += x*x }]. In the L2 case, we can have a lot of small (but non-zero) values contributing to our reconstruction. A node with activation 0.01, for example, adds only 0.0001 to our L2 norm cost, where it would add 0.01 under L1.
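To put numbers on the L1 versus L2 comparison, here is a minimal sketch in D. The function names `l1Cost` and `l2Cost` and the two activation vectors are made up for illustration; they are not from the article being discussed. It only computes the two penalties and does not train anything.

```d
import std.math : abs;
import std.stdio : writefln;

// L1 penalty: sum of the absolute values of the activations.
double l1Cost(const(double)[] activations) {
	double cost = 0.0; // doubles default to NaN in D, so start from zero explicitly
	foreach (x; activations) cost += abs(x);
	return cost;
}

// L2-style penalty as described in the text: sum of the squared activations.
double l2Cost(const(double)[] activations) {
	double cost = 0.0;
	foreach (x; activations) cost += x * x;
	return cost;
}

void main() {
	// Two hypothetical hidden layers with the same total activation.
	double[] sparse = [1.0, 0.0, 0.0, 0.0];    // one unit fully on
	double[] dense = [0.25, 0.25, 0.25, 0.25]; // every unit slightly on
	writefln("sparse: L1=%s L2=%s", l1Cost(sparse), l2Cost(sparse));
	writefln("dense:  L1=%s L2=%s", l1Cost(dense), l2Cost(dense));
}
```

Both vectors have an L1 cost of 1.0, but the dense one has an L2 cost of only 0.25 against 1.0 for the sparse one. In other words, the squared penalty rewards spreading activation over many small values, while the absolute-value penalty gives no such discount.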

I hope that makes it a little easier to understand. If things are still fuzzy I may be able to go through the article and explain some of the formulas. (Maybe.) I’m still happy to try and answer questions.

As a disclaimer: this information is true to the best of my understanding, but I am still learning and my understanding may be incomplete or even wrong. If you find something that seems to disagree with what I’ve said here, please let me know and we’ll try to figure out what’s right.

Best Regards,
— Jo

Welcome back to Let’s Build. With connecting and command processing out of the way, we open today with a bit of planning and design. First, let’s see how the code looked at the end of the last lesson.


// NOTE: I replaced the tabs with two spaces in this code bar because it was getting a little wide for the website display.  I may change it back in a later example.
import std.stdio;
import std.file;
import std.string;
import std.socket;

void main() {
  string command = "";
  bool run = true;
  while(run) {
    string line = chomp(readln());
    string[] args = split(line);
    command = args[0];
    switch(command) {
      case "quit":
        run = false;
        writeln("Shutting down.");
        break;
      case "say":
        writeln(args[1..$].join(" "));
        break;
      case "start_server":
        writeln("Opening socket on 8001");

        Socket sock = new TcpSocket();
        sock.bind(new InternetAddress(8001));
        sock.listen(1);

        Socket client = sock.accept();
        char[1024] buffer;
        auto received = client.receive(buffer);
        client.send(buffer.idup);
        client.shutdown(SocketShutdown.BOTH);
        client.close();

        writeln("Done listening!");
        break;
      case "start_client":
        writeln("Connecting to 8001");
        Socket sock = new TcpSocket();
        sock.connect(new InternetAddress(8001));
        sock.send([1, 2, 3, 4]);
        char[1024] buffer;
        sock.receive(buffer);
        sock.close();
        writeln("Done.");
        break;
      default: 
        writeln("Command '", command, "' not recognized.");
        break;
    }
  }
}

There are a few things which bother me about the material above. First, we’ve got magic numbers like a hard-coded port and buffer size. More notable to me, though, is that we have one gigantic main function. I think that going forward it would be nicer to spin off the behaviors into individual functions. Let’s do that, but without modifying our switch case. We’ll keep it simple at the start, splitting off only the quit function.


...

void main() {
	string command = "";
	bool run = true;
	while(run) {
		string line = chomp(readln());
		string[] args = split(line);
		command = args[0];
		switch(command) {
			case "quit":
				quit(&run, line, args);
				break;
			case "say":
... // All the same
				break;
		}
	}
}

void quit(bool* run, string line, string[] args) {
	*run = false;
	writeln("Shutting down.");
}

There’s nothing that prevents us from doing the same thing for each of the say and start functions, but that gigantic switch is kinda’ gross. We can simplify this and borrow a trick from Python, which lacks the switch statement (last time I checked). Instead, we create an associative array which maps strings into function pointers. Our big branch statement becomes something like…


	while(run) {
		string line = chomp(readln());
		string[] args = split(line);
		command = args[0];
		functionTable[command](&run, line, args);
	}

Much better, no? This also gives us an intuitive way of dealing with unknown commands. D’s associative arrays provide a get(key, default) lookup that returns a fallback when the key is missing, which for us will be the ‘command not found’ handler. Let’s combine all these things together. Our code now is something like this:


import std.stdio;
import std.file;
import std.string;
import std.socket;

void main() {
	string command = "";
	bool run = true;
	void function(bool*, string, string[]) commandPointer;
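	typeof(commandPointer)[string] commandTable; // Plain function pointers suffice here; delegates would be needed for non-static member functions.

	// Set up the command table
	commandTable["quit"] = &quit;
	commandTable["say"] = &say;
	commandTable["start_server"] = &start_server;
	commandTable["start_client"] = &start_client;

	while(run) {
		string raw = readln();
		if (raw.length == 0) break; // End of input: readln() returns an empty string.
		string line = chomp(raw);
		string[] args = split(line);
		if (args.length == 0) continue; // Blank line: nothing to dispatch, and args[0] would be out of range.
		command = args[0];
		// Look up the handler, falling back to 'unrecognized' for unknown commands.
		commandTable.get(command, &unrecognized)(&run, line, args);
	}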
}

void unrecognized(bool* run, string line, string[] args) {
	writeln("Command not recognized.");
}

void start_server(bool* run, string line, string[] args) {
	writeln("Opening socket on 8001");

	Socket sock = new TcpSocket();
	sock.bind(new InternetAddress(8001));
	sock.listen(1);

	Socket client = sock.accept();
	char[1024] buffer;
	auto received = client.receive(buffer);
	client.send(buffer.idup);
	client.shutdown(SocketShutdown.BOTH);
	client.close();

	writeln("Done listening!");
}

void start_client(bool* run, string line, string[] args) {
	writeln("Connecting to 8001");
	Socket sock = new TcpSocket();
	sock.connect(new InternetAddress(8001));
	sock.send([1, 2, 3, 4]);
	char[1024] buffer;
	sock.receive(buffer);
	sock.close();
	writeln("Done.");
}

void say(bool* run, string line, string[] args) {
	writeln(args[1..$].join(" "));
}

void quit(bool* run, string line, string[] args) {
	*run = false;
	writeln("Shutting down.");
}

A thousand times better. I’d planned on getting more into the key exchange and crypto stuff, but one trick per lesson seems like enough. Join us next time and we’ll interface with OpenSSL.
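One loose end from the start of this lesson: the hard-coded port and buffer size are still sitting in the listing above. A minimal sketch of one way to name them follows. The identifiers `serverPort` and `bufferSize` are hypothetical, picked for this example, and nothing else in the series depends on them.

```d
import std.socket : InternetAddress;
import std.stdio : writeln;

// Hypothetical names; the listings above write 8001 and 1024 inline.
enum ushort serverPort = 8001;
enum size_t bufferSize = 1024;

void main() {
	// Compile-time constants can size a static array directly.
	char[bufferSize] buffer;
	// InternetAddress(port) addresses any local interface on that port.
	auto address = new InternetAddress(serverPort);
	writeln("Port ", address.port, ", buffer of ", buffer.length, " bytes.");
}
```

With something like this in place, start_server and start_client could both refer to `serverPort` rather than each repeating the literal.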

Welcome back again for hour two. Last time we made a simple read/evaluate/print loop (REPL). This time we’re going to add two commands, one to listen on a socket, another to send data to a listening socket. Last time we rounded off with printing. Let’s review the code as we last saw it.


import std.stdio;
import std.file;
import std.string;

void main() {
	string command = "";
	bool run = true;
	while(run) {
		string line = chomp(readln());
		string[] args = split(line);
		command = args[0];
		switch(command) {
			case "quit":
				run = false;
				break;
			case "say":
				writeln(args[1..$].join(" "));
				break;
			default:
				writeln("Command '", command, "' not recognized.");
				break;
		}
	}
}

Looks good. Let’s start with the listening server. We want to open a TCP socket to listen for inbound connections. After we have communication, we’ll echo it back to the client and close our connection.


import std.socket; // Add an extra import for socket stuff.
...
case "start_server":
	writeln("Opening socket on 8001");

	Socket sock = new TcpSocket();
	sock.bind(new InternetAddress(8001));
	sock.listen(1); // Allow a backlog of one pending connection; accept() below does the actual waiting.

	Socket client = sock.accept();
	char[1024] buffer;
	auto received = client.receive(buffer);
	client.send(buffer.idup); // idup makes an immutable copy of the whole buffer. send takes const(void)[], so the copy is not strictly required.
	client.shutdown(SocketShutdown.BOTH);
	client.close();

	writeln("Done listening!");
	break;

Okay, let’s build and test it.

$ dmd ./main.d
$ ./main
> start_server
Opening socket on 8001

Now it probably looks as though the application has hung here. In some sense, it has: the process is blocked in accept(), waiting for a connection. How does one supply socket data, you ask? Well, arguably the simplest way is Telnet. Telnet allows us to connect to a port and vomit raw data at it. We can open a new terminal or start a new tmux/screen session. We could also background the process and throw data at it, but then the output from both our programs would get tangled and it would be hard to make sure we’re doing what we need to.

$ telnet localhost 8001
Trying ::1...
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
> foo
foo
?????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????Connection closed by foreign host.
$

What the hell? Oh. You’ll notice that in our server, we are sending a clone of the ENTIRE 1024-byte buffer, not just the text that was sent to us. Still, at the start of that exchange we saw the message we were looking for. That works well enough for me. Cutting off the extra text is left as an exercise for the reader. Let’s switch back to the terminal running main and see what it’s printing.

$ ./main
> start_server
Opening socket on 8001
Done listening!
>
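For anyone who wants a head start on that exercise, here is a minimal sketch, assuming the fix is simply to slice the buffer down to the byte count that receive() returned. The helper name `receivedSlice` is hypothetical. The example stands in a fixed count for a real socket call so it can run on its own.

```d
import std.stdio : writeln;

// Hypothetical helper: keep only the bytes that receive() actually filled in.
const(char)[] receivedSlice(const(char)[] buffer, ptrdiff_t received) {
	// Socket.receive returns -1 (Socket.ERROR) on failure and 0 when the peer has closed.
	if (received <= 0) return null;
	return buffer[0 .. received];
}

void main() {
	char[1024] buffer; // D initializes each char to 0xFF, likely the source of the junk telnet showed
	buffer[0 .. 5] = "foo\r\n"; // what telnet would have sent for "foo"
	ptrdiff_t received = 5; // stand-in for the value client.receive(buffer) would return
	auto payload = receivedSlice(buffer[], received);
	writeln("Echoing ", payload.length, " of ", buffer.length, " bytes");
}
```

In the server this would amount to something like client.send(buffer[0 .. received]) in place of client.send(buffer.idup). The listings in this lesson keep the original full-buffer echo.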

Looks good! We’ve listened to incoming traffic. Let’s cover the other side now and make something which sends a message.


case "start_client":
	writeln("Connecting to 8001");
	Socket sock = new TcpSocket();
	sock.connect(new InternetAddress(8001));
	sock.send([1, 2, 3, 4]);
	char[1024] buffer;
	sock.receive(buffer);
	sock.close();
	writeln("Done.");
	break;

Now compile and run. Let’s run the server AND the client to see if they play well together.

Terminal1 $ ./main
> start_server
Opening socket on 8001

Terminal2 $ ./main
> start_client
Connecting to 8001
Done.
> quit
Shutting down.

Terminal1 $
Done listening!
> quit
Shutting down.

That’s it. We have an application that can launch a client and a server now. This has some limitations, though. In particular, the server doesn’t do anything except wait for a connection. If we were running a server, it would be nice if we could do more, such as issuing commands to shut down or to kick users. Next time, we’ll do some more design work and see if we can’t find some solutions.