Monthly Archives: September 2014

The following is an excerpt from an e-mail exchange I had with a colleague who was unclear on autoencoders. It is reproduced here in the hopes that others who are looking for an explanation of the material might be able to find something with more intuition and less academic rigor.

Hi [Redacted],
I've not done this exercise specifically, but I can try and provide my intuitive, informal understanding of autoencoders. I've not gone through that article, so this understanding may be weak, but I hope it will provide some amount of guidance.

An autoencoder, at its simplest, is a neural network with as many inputs as outputs. It is trained by using the same value for its input and output.

That is to say, we want to learn to reproduce our input value.

That may not appear to make much sense, but if you think of it as 'denoising' an input, it becomes more reasonable. We want our network to reproduce the sort of things we've seen in the past. If we train it to accomplish this task, then when we later use the network on new data, aberrant pixels, blobs, and noise will tend to be filtered out by the network.

There are a few types of autoencoders, some of them 'deep', others, like the one in this article, 'sparse'. In the case of sparse autoencoders, we keep the number of inputs equal to the number of outputs, but add a large number of hidden nodes. In doing this, a single hidden node may correspond to one complete output reconstruction. The difficulty in this case is making sure that the network doesn't simply learn a one-to-one-to-one mapping of input -> hidden -> output. There are ways to avoid this, but I'll try and save that for later.

To make things sparse, in addition to having a lot of nodes in our hidden layer, we want only a few of them to be active at any point in time. We can encourage this by penalizing the L1 norm of the hidden activations, which is basically the sum of their absolute values. If we say the cost of something is the sum of the absolute values of the activations [foreach(x in activations) { cost += Math.abs(x) }], then you'll notice that ANY non-zero value increases our cost in proportion to its size. To minimize our cost function, we want the smallest number of units active at a time which can still reconstruct an input.

Compare this with the (squared) L2 norm: [foreach(x in activations) { cost += x*x }]. In the L2 case, we can have a lot of small (but non-zero) values contributing to our reconstruction. A node with activation 0.01, for example, adds only 0.0001 to our L2 norm cost, where it would add 0.01 to the L1 cost.
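To make the comparison concrete, here is a small sketch in D. It only turns the two bracketed loops above into runnable code; the function names l1Cost and l2Cost and the two example activation vectors are made up for illustration and don't come from the article.

```d
import std.math : abs;
import std.stdio : writeln;

// L1 penalty: sum of the absolute values of the activations.
double l1Cost(const(double)[] activations) {
	double cost = 0.0;
	foreach (double x; activations) {
		cost += abs(x);
	}
	return cost;
}

// Squared L2 penalty: sum of the squared activations.
double l2Cost(const(double)[] activations) {
	double cost = 0.0;
	foreach (double x; activations) {
		cost += x * x;
	}
	return cost;
}

void main() {
	// Two hypothetical hidden layers carrying the same total activation.
	double[] sparse = [1.0, 0.0, 0.0, 0.0];     // one unit does all the work
	double[] spread = [0.25, 0.25, 0.25, 0.25]; // every unit is a little bit on

	writeln("L1 sparse: ", l1Cost(sparse), ", L1 spread: ", l1Cost(spread));
	writeln("L2 sparse: ", l2Cost(sparse), ", L2 spread: ", l2Cost(spread));
}
```

Both vectors cost 1.0 under L1, but under the squared L2 penalty the spread-out vector costs only 0.25 against 1.0 for the sparse one. Put differently, L2 gives a discount for smearing activation across many small units, and L1 gives no such discount.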

I hope that makes it a little easier to understand. If things are still fuzzy I may be able to go through the article and explain some of the formulas. (Maybe.) I'm still happy to try and answer questions.

As a disclaimer: this information is true to the best of my understanding, but I am still learning and my understanding may be incomplete or even wrong. If you find something that seems to disagree with what I've said here, please let me know and we'll try to figure out what's right.

Best Regards,
-- Jo

Welcome back to Let's Build. With connecting and command processing out of the way, we open today with a bit of planning and design. First, let's see how the code looked at the end of the last lesson.


// NOTE: I replaced the tabs with two spaces in this code bar because it was getting a little wide for the website display.  I may change it back in a later example.
import std.stdio;
import std.file;
import std.string;
import std.socket;

void main() {
  string command = "";
  bool run = true;
  while(run) {
    string line = chomp(readln());
    string[] args = split(line);
    command = args[0];
    switch(command) {
      case "quit":
        run = false;
        writeln("Shutting down.");
        break;
      case "say":
        writeln(args[1..$].join(" "));
        break;
      case "start_server":
        writeln("Opening socket on 8001");

        Socket sock = new TcpSocket();
        sock.bind(new InternetAddress(8001));
        sock.listen(1);

        Socket client = sock.accept();
        char[1024] buffer;
        auto received = client.receive(buffer);
        client.send(buffer.idup);
        client.shutdown(SocketShutdown.BOTH);
        client.close();

        writeln("Done listening!");
        break;
      case "start_client":
        writeln("Connecting to 8001");
        Socket sock = new TcpSocket();
        sock.connect(new InternetAddress(8001));
        sock.send([1, 2, 3, 4]);
        char[1024] buffer;
        sock.receive(buffer);
        sock.close();
        writeln("Done.");
        break;
      default: 
        writeln("Command '", command, "' not recognized.");
        break;
    }
  }
}

There are a few things which bother me about the code above. First, we've got magic numbers like a hard-coded port and buffer size. More notable to me, though, is that we have one gigantic main function. I think that going forward it would be nicer to spin off the behaviors into individual functions. Let's do that, but without modifying our switch statement. We'll keep it simple at the start, splitting off only the quit function.


...

void main() {
	string command = "";
	bool run = true;
	while(run) {
		string line = chomp(readln());
		string[] args = split(line);
		command = args[0];
		switch(command) {
			case "quit":
				quit(&run, line, args);
				break;
			case "say":
... // All the same
				break;
		}
	}
}

void quit(bool* run, string line, string[] args) {
	*run = false;
	writeln("Shutting down.");
}

There's nothing that prevents us from doing the same thing for each of the say and start functions, but that gigantic switch is kinda gross. We can simplify this and borrow a trick from Python, which lacks the switch statement (last time I checked). Instead, we create an associative array which maps strings to function pointers. Our big branch statement becomes something like...


	while(run) {
		string line = chomp(readln());
		string[] args = split(line);
		command = args[0];
		commandTable[command](&run, line, args);
	}

Much better, no? This also gives us an intuitive way of dealing with unknown commands. Any decent associative array will also support getting a default value, which for us will be the 'command not found' message. Let's combine all these things together. Our code now is something like this:


import std.stdio;
import std.file;
import std.string;
import std.socket;

void main() {
	string command = "";
	bool run = true;
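	void function(bool*, string, string[]) commandPointer; // Only used to name the function pointer type.
	typeof(commandPointer)[string] commandTable; // Plain function pointers are enough for free functions; delegates would be needed for methods or closures.

	// Set up the command table
	commandTable["quit"] = &quit;
	commandTable["say"] = &say;
	commandTable["start_server"] = &start_server;
	commandTable["start_client"] = &start_client;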

	while(run) {
		string line = chomp(readln());
		string[] args = split(line);
		command = args[0];
		commandTable.get(command, &unrecognized)(&run, line, args);
	}
}

void unrecognized(bool* run, string line, string[] args) {
	writeln("Command '", args[0], "' not recognized."); // Same message the switch version printed.
}

void start_server(bool* run, string line, string[] args) {
	writeln("Opening socket on 8001");

	Socket sock = new TcpSocket();
	sock.bind(new InternetAddress(8001));
	sock.listen(1);

	Socket client = sock.accept();
	char[1024] buffer;
	auto received = client.receive(buffer);
	client.send(buffer.idup);
	client.shutdown(SocketShutdown.BOTH);
	client.close();

	writeln("Done listening!");
}

void start_client(bool* run, string line, string[] args) {
	writeln("Connecting to 8001");
	Socket sock = new TcpSocket();
	sock.connect(new InternetAddress(8001));
	sock.send([1, 2, 3, 4]);
	char[1024] buffer;
	sock.receive(buffer);
	sock.close();
	writeln("Done.");
}

void say(bool* run, string line, string[] args) {
	writeln(args[1..$].join(" "));
}

void quit(bool* run, string line, string[] args) {
	*run = false;
	writeln("Shutting down.");
}

A thousand times better. I'd planned on getting more into the key exchange and crypto stuff, but one trick per lesson seems like enough. Join us next time and we'll interface with OpenSSL.
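One loose end from the start of this lesson: the hard-coded port and buffer size. A minimal sketch of one way to name them, assuming module-level enum constants are acceptable here. The names serverPort and bufferSize are hypothetical and don't appear anywhere in the lesson's code.

```d
import std.stdio : writeln;

// Hypothetical names; the lesson's code writes 8001 and 1024 inline.
enum ushort serverPort = 8001;
enum size_t bufferSize = 1024;

void main() {
	char[bufferSize] buffer; // Same type as char[1024], but the size now has a name.
	writeln("Opening socket on ", serverPort);
	writeln("Buffer size: ", buffer.length);
}
```

In start_server and start_client, new InternetAddress(serverPort) and char[bufferSize] buffer would then stand in for the literals, so changing the port means touching one line instead of four.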

Welcome back again for hour two. Last time we made a simple read/execute/print loop (REPL). This time we're going to add two commands: one to listen on a socket, another to send data to a listening socket. Last time we rounded off with printing. Let's review the code as we last saw it.


import std.stdio;
import std.file;
import std.string;

void main() {
	string command = "";
	bool run = true;
	while(run) {
		string line = chomp(readln());
		string[] args = split(line);
		command = args[0];
		switch(command) {
			case "quit":
				run = false;
				writeln("Shutting down.");
				break;
			case "say":
				writeln(args[1..$].join(" "));
				break;
			default:
				writeln("Command '", command, "' not recognized.");
				break;
		}
	}
}

Looks good. Let's start with the listening server. We want to open a TCP socket to listen for inbound connections. Once a client connects and sends us something, we'll echo it back and close the connection.


import std.socket; // Add an extra import for socket stuff.
...
case "start_server":
	writeln("Opening socket on 8001");

	Socket sock = new TcpSocket();
	sock.bind(new InternetAddress(8001));
	sock.listen(1); // Start listening, with a backlog of one pending connection. The actual waiting happens in accept().

	Socket client = sock.accept();
	char[1024] buffer;
	auto received = client.receive(buffer);
	client.send(buffer.idup); // idup makes an immutable copy of the whole buffer. send takes const(void)[], so the copy isn't strictly required.
	client.shutdown(SocketShutdown.BOTH);
	client.close();

	writeln("Done listening!");
	break;

Okay, let's build and test it.

$ dmd ./main.d
$ ./main
> start_server
Opening socket on 8001

Now it probably looks as though the application has hung here. In some sense, it has: the process is blocked, waiting for a connection to come in. How does one supply socket data, you ask? Well, arguably the simplest way is Telnet. Telnet allows us to connect to a port and vomit raw data at it. We can open a new terminal or start a new mux/screen. We could also background the process and throw data at it, but then the output from both our programs would get tangled and it would be hard to make sure we're doing what we need to.

$ telnet localhost 8001
Trying ::1...
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
> foo
foo
?????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????Connection closed by foreign host.
$

What the hell? Oh. You'll notice that in our server, we are sending a clone of the ENTIRE buffer, not just the text that was sent to us. Still, at the start of that exchange we saw the droids, er, message we were looking for. That works well enough for me. Cutting off the extra text is left as an exercise to the reader. Let's switch back to the terminal running main and see what it's printing.

$ ./main
> start_server
Opening socket on 8001
Done listening!
>
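As for the exercise of trimming the echo: receive returns how many bytes actually arrived, so one option is to slice the buffer to that length before sending. Here's a minimal sketch under a few assumptions. The helper echoThroughPair is hypothetical, and it uses std.socket's socketPair() to get two already-connected sockets so it can run without a port or a second terminal. It also assumes the whole message arrives in a single receive, which is reasonable for a few bytes but not something a stream socket promises in general.

```d
import std.socket;
import std.stdio : write;

// Hypothetical helper: sends `message` to a stand-in server and returns the echo.
string echoThroughPair(string message) {
	// pair[0] plays the telnet client, pair[1] plays our server.
	Socket[2] pair = socketPair();
	scope(exit) {
		pair[0].close();
		pair[1].close();
	}

	pair[0].send(message);

	char[1024] buffer;
	auto received = pair[1].receive(buffer[]);
	if (received <= 0) {
		return ""; // 0 means the peer closed; Socket.ERROR (negative) means failure.
	}

	// Echo only the bytes that arrived, not all 1024.
	pair[1].send(buffer[0 .. received]);

	char[1024] reply;
	auto echoed = pair[0].receive(reply[]);
	if (echoed <= 0) {
		return "";
	}
	return reply[0 .. echoed].idup;
}

void main() {
	write(echoThroughPair("foo\n"));
}
```

In the real server the only change would be sending buffer[0 .. received] in place of buffer.idup, after checking that received is positive.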

Looks good! We've listened to incoming traffic. Let's cover the other side now and make something which sends a message.


case "start_client":
	writeln("Connecting to 8001");
	Socket sock = new TcpSocket();
	sock.connect(new InternetAddress(8001));
	sock.send([1, 2, 3, 4]);
	char[1024] buffer;
	sock.receive(buffer);
	sock.close();
	writeln("Done.");
	break;

Now compile and run. Let's run the server AND the client to see if they play well together.

Terminal1 $ ./main
> start_server
Opening socket on 8001

Terminal2 $ ./main
> start_client
Connecting to 8001
Done.
> quit
Shutting down.

Terminal1 $
Done listening!
> quit
Shutting down.

That's it. We have an application that can launch a client and a server now. This has some limitations, though. In particular, the server doesn't do anything except wait for a connection. If we were running a server, it would be nice if we could do more, like issue commands to shut down or to kick users. Next time, we'll do some more design work and see if we can't find some solutions.

Welcome to Let's Build. In this series, I'm going to walk through the creation of an application from concept to finished product. We'll be building a file sharing application in D. Why D? D is a time-tested, compiled, C-like language, and I have no idea how to use it. Learning for everyone! In the course of this project you'll likely see a lot of Python-isms written in D by someone with formal training in C and Java. It should make for some deliciously haphazard semantic salad. This series will target people with a basic familiarity with ADA-Class languages who know something of flow control and variables. We brake for nobody. It should illuminate the thought process of a professional (but not very good) software developer, and the mechanisms/brute-force by which a goal is achieved.

What are we building?

In short, I'd like an application which can securely share medium to large files with groups of arbitrary size. Revision history isn't important to me, so it will be less like git and more like BitTorrent. Perhaps Napster is a reasonable descriptor, but I don't like the piracy connotations. Let's get to designing.

What features do we want our application to have? What do we want to be able to do?

  • Host a server effortlessly, connect effortlessly.
  • Invite people.
  • Search for files owned by many peers.
  • Send files peer-to-peer in a secure fashion, piecewise or in whole. MAYBE fall back to centralized distribution if we can't get things right.
  • Broadcast messages from user-to-user or from user-to-all.

That seems to touch nicely on the core of the matter. If we can accomplish this, embellishing things on the client side shouldn't be too hard. That might include ignoring someone in chat, or doing private messages (which is really just sending a specific kind of data peer-to-peer). Most of these things could be handled by a pretty simple command-line or curses interface. This would also extend nicely to a GUI layer set on top. We'll have to expand each of these elements later on and make sure they fit together, but in the meanwhile, let's drill down and lay out a simple command-parsing interface.

Ohai World


import std.stdio;
import std.file;
import std.string;

void main() {
	string command = "";
	bool run = true;
	while(run) {
		command = chomp(readln());
		switch(command) {
			case "quit":
				run = false;
				writeln("Shutting down.");
				break;
			default:
				writeln("Command '", command, "' not recognized.");
				break;
		}
	}
}

Let's compile and test this app.

$ dmd ./main.d
$ ./main
> derp
Command 'derp' not recognized.
> quit
Shutting down.
$

Hooray! We can read whole word commands. That's a little bit limited, though. It would be nice if we could issue commands of the form, "exec [something]". Let's add the 'say' command and split up our arguments.


	// Replace `command = chomp(readln());` with...
	string line = chomp(readln());
	string[] args = split(line);
	command = args[0];
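Here's a small standalone sketch of what those three lines produce, plus one thing to watch for: on an empty line split returns an empty array, so args[0] would fail with a range error. The helper commandOf is hypothetical and isn't part of the program we're building; it's just one way to guard against that case.

```d
import std.array : split;
import std.stdio : writeln;

// Hypothetical helper: returns the command word, or an empty string
// when the line holds nothing but whitespace.
string commandOf(string line) {
	string[] args = split(line); // With no separator given, split breaks on whitespace.
	return args.length > 0 ? args[0] : "";
}

void main() {
	string line = "This is my command to you!";
	string[] args = split(line);

	writeln(line);            // The full line, untouched.
	writeln(args.length);     // Number of whitespace-separated words.
	writeln(commandOf(line)); // The first word.

	// An empty line no longer blows up; we just get an empty command back.
	writeln(commandOf("").length);
}
```

In the main loop, the equivalent guard would be checking args.length before touching args[0] and skipping the iteration when it's zero.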

So, if we are prompted and enter "This is my command to you!", we have access to the variables line ("This is my command to you!"), args (["This", "is", ..., "you!"]), and command ("This"). Let's add a command 'say', which takes everything after the first argument (the command itself) and writes it to the screen.


	// Rest of case statements up here.
	case "say":
		writeln(args[1..$].join(" ")); // Write words 1 through the end, joined by a single space.
		break;
	default:
	// Rest of code down here
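One side effect worth knowing about: split breaks on any run of whitespace and join(" ") puts back exactly one space, so doubled spaces in the input get collapsed on the way out. A minimal sketch, using a hypothetical sayText helper that mirrors the 'say' case (and adds a length check, which the case above doesn't have):

```d
import std.array : join, split;
import std.stdio : writeln;

// Hypothetical helper mirroring the 'say' case: drop the command word, rejoin the rest.
string sayText(string line) {
	string[] args = split(line);
	if (args.length < 2) {
		return ""; // Nothing after the command word.
	}
	return args[1 .. $].join(" ");
}

void main() {
	// Note the two spaces after "Hello." in the input.
	writeln(sayText("say Hello.  My name is bob."));
}
```

If preserving the original spacing ever matters, working from line instead of args would be the place to start.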

Compile and test.

$ dmd main.d
$ ./main
> say Hello.  My name is bob.
Hello. My name is bob.
> quit
Shutting down.

Smashing. I think that's enough for a first foray into D. I'm going to have a sandwich and, when we get back, I'll do a simple socket connection.