It's done enough!

tl;dr: Download a Runnable Jar Here

Standalone PC/OSX builds are pending.

Kudos to Peter Queckenstedt (@scutanddestroy) for doing an amazing job on the Proctor, Hillary, and Trump.


This has been a positive experience. I love games that have genuinely nontrivial interactions and completely open-ended text input. I'm a fan of interactive fiction, but I hate that feeling when you're digging around and grasping for action words like some sort of textual pixel-hunt.

The language-processing systems in DS2016 aren't particularly complicated -- simpler than I'd like, in fact. In the first week of the jam I started writing a recurrent neural network to parse and analyze the sentiment of the player's comments. I realized, perhaps too late, that there wasn't enough clean data for me to accurately gauge sentiment and map it to social groups. Instead, I wrote a basic multinomial naive Bayes classifier that takes a sentence, tokenizes it, and maps it to 'like' or 'dislike'. Each group has its own classifier and tokenizer, so I could program demographics with a base voting likelihood, give each of them a few sentences on the "agrees with" and "disagrees with" sides, and then have them automatically parse the player's comments and update their feelings.
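
For the curious, here's a minimal sketch of the idea -- a hand-rolled multinomial naive Bayes with add-one smoothing. The class and method names are mine for illustration, not the game's actual code:

```python
import math
from collections import Counter

class DemographicClassifier:
    """One per demographic: classifies a comment as 'like' or 'dislike'."""

    def __init__(self, base_turnout):
        self.base_turnout = base_turnout  # base voting likelihood for this group
        self.word_counts = {"like": Counter(), "dislike": Counter()}
        self.totals = {"like": 0, "dislike": 0}

    def tokenize(self, sentence):
        return sentence.lower().split()

    def train(self, sentence, label):
        for token in self.tokenize(sentence):
            self.word_counts[label][token] += 1
            self.totals[label] += 1

    def classify(self, sentence):
        # Multinomial naive Bayes: sum log-probabilities per class,
        # with add-one smoothing and a uniform prior.
        vocab = len(set(self.word_counts["like"]) | set(self.word_counts["dislike"])) or 1
        scores = {}
        for label in ("like", "dislike"):
            scores[label] = sum(
                math.log((self.word_counts[label][token] + 1) / (self.totals[label] + vocab))
                for token in self.tokenize(sentence)
            )
        return max(scores, key=scores.get)

hipsters = DemographicClassifier(base_turnout=0.4)
hipsters.train("I support local artisanal coffee shops", "like")
hipsters.train("we need more big box chain stores", "dislike")
print(hipsters.classify("artisanal coffee for everyone"))  # -> 'like'
```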

One usability change came in later than you might guess. I had originally grabbed the demographic with the largest emotional response to a comment and displayed it with the sentiment change. Unfortunately, this turned out to over-represent one particularly noisy group. Another change, shortly thereafter, was masking the exact magnitude of the change: instead of saying +1.05% opinion, the display simply became "+Conservatives" or "-Hipsters". This was visually far easier to parse and I think helped the overall readability of the game.
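
The display logic boils down to something like this (a sketch with hypothetical names, not the game's actual code):

```python
def headline_reaction(deltas):
    """deltas: {demographic: opinion change, in percent}.
    Pick the loudest reaction, but show only its sign and group name."""
    group = max(deltas, key=lambda g: abs(deltas[g]))
    sign = "+" if deltas[group] >= 0 else "-"
    return sign + group

print(headline_reaction({"Conservatives": 1.05, "Hipsters": -0.3}))  # +Conservatives
```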

There have still been calls to add more direct public-opinion tracking to the game, letting players know in closer to real time how they're doing among the demographics. I may yet find it in myself to add that.

One last interesting thing I noticed during playtesting: I had slightly over-tuned the language models to my own style of writing. Instead of opining on matters at any length, playtesters wrote enormous run-on sentences that appealed to every demographic at once. These statements, often self-contradictory, were not something I expected or handled well. I found the game rather difficult, but it looks like playtesters had a dandy time making the states go all blue.

Here I sit, overwhelmed at the insanity taking place in the political arena. I'm taking a moment to collect my thoughts for the same reason as anyone else that keeps a journal: so when we look back at the injustices and failures of the past we get some sense of their context. Maybe it will also remind me that we tried.

The Net Neutrality Repeal

The FCC, as chaired by Ajit Pai, has stated its intention to roll back the Title II protections afforded in 2015 under President Barack Obama. There are five members of the board: three Republicans and two Democrats. The Democrats have voiced their opposition to the changes. The three majority members favor the repeal of the consumer protections and have given the order the compelling title "Restoring Internet Freedom Order." Their argument is that regulations are stifling innovation -- yet Comcast and Verizon, in investor meetings, have both declared that Net Neutrality rules do not, in fact, hinder innovation. Millions of consumers have filed comments in favor of the added Title II protections (some of them form letters), and millions of automated bot comments have been filed in opposition. It seems reasonably likely that the major ISPs don't expect to get away with the fabricated opposition comments, but hope to muddy the waters enough to make public feedback seem unusable.

It's looking like the repeal will go through, followed by a litany of confusing legal actions which will likely ALSO be muddied by telecom providers. (This can happen because only one appellate court can hear the petitions, and it's chosen more or less at random -- first come, first served. If a telecom files a petition against some part of the FCC order, its preferred jurisdiction is entered into the lottery, letting it shift the case toward a more favorable venue.)

Healthcare Repeal

The House and the Senate have both voted to try to dismantle key provisions of the Affordable Care Act. The ACA has insured a record number of people (in b4 smarmy individual-mandate comment) and has started to restrain the growth of health care costs. It has been life-saving for more than a few people and has protected countless others from bankruptcy. Health care costs could be reduced further if states wouldn't refuse federal funds. (This has actually happened.) Additionally, since the president is required to basically sign the checks reimbursing insurers for the high-risk pools, that requirement adds uncertainty to the market and makes it harder for insurance providers to plan ahead -- pushing smaller providers out and driving up costs for all participants.

Tax Reform

After a massive public outcry against the mass repeal of healthcare, it looks like Republicans have doubled down on the "Look, we're all about taxes" mantra. The new tax bill contains provisions to break one of the three legs of the ACA: the individual mandate. The mandate brings young, healthy, low-risk people into the insured pool, which decreases the cost of providing care and drives down premiums. Without it, there's no incentive for anyone to join the pool until they actually need insurance -- and, due to the rules on preexisting conditions, they can't be refused service when they do (a good thing, if coupled with the individual mandate). This makes Obamacare untenable and gives Republicans deniability: "Look, we always said it was doomed. It had nothing to do with us sabotaging it. It just fell apart due to nothing we did. All we did was pass this one unrelated tax bill and suddenly it exploded."

In Short Supply

I've been in fairly regular contact with my representatives at the House and Senate level. (Props to Jamario and Alex L. You guys rock.) Every day, though, it feels like the best we can hope for is to throw lives at the battlefront while we retreat. Corporate profits continue to skyrocket. Dividends are paid to board members and shareholders instead of employees. The middle class's wages stagnate or shrink while the ranks of the working poor grow. A handful of self-serving people are setting our country up for failure to further their own personal gains, and they're manipulating countless thousands into believing it's for their own good. No hope for a better tomorrow.

Legislation should be driven by evidence and inspired by axioms.
Check your results and revisit your decisions. Did anything change? What? Why?
There's no shame in making mistakes -- learn from them and do better.
A good solution now is better than a perfect solution never.

Seek the greatest good for the greatest number of people.
What we're able to do is more important than what we're allowed to do.
Social safety nets make for a healthier society.

Science is fundamental to our economy and our future.
Learning makes for a better world -- it should be accessible to all people.

Employment is a right -- a person should be able to thrive on the fruits of their labor, not merely survive.
There's honor in trades. Craftsmanship should be celebrated.
A person's worth is not contingent upon their employment.

Be decent to everyone.
Be excellent to those who are excellent to you.
In all things, try and maintain a sense of humor.

I had the distinct honor of working with a number of talented individuals at the unofficial unconference.

We were ambitious, maybe too much so, but I comfort myself by saying that incremental progress is progress nonetheless, and that it's a worthwhile endeavor to report both our successes and our failures. The following is an account of what we tried and our reason for trying.

We began with the problem phrased as follows: "Given a high-level input description of a program, produce source code which meets the requirements." We didn't expect to solve this problem in a day, but figured that generating a dataset would be both feasible in the time frame and worthwhile. Collectively, we decided to produce tuples of problem descriptions, programs, and example outputs. The next question was one of output language: we wanted to generate examples in a language simple enough for a machine to learn and also usable by a human creating the programming problems. Python was the favorite language among the group, but it had its limitations -- it's a big language with lots of minutiae. Learning to generate Python would require handling list comprehensions and more syntactic subtlety than we felt was strictly necessary. Instead, we opted for a high-level, Turing-complete computational-graph representation of a language (basically, an AST). The graph could be "compiled" to an English description or "compiled" to Python and then run, giving all the required outputs of the problem.
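
To make the dual-compilation idea concrete, here's a minimal sketch. The node names are mine, not our actual implementation, but the shape is the same: every node in the graph knows how to render itself as both English and Python.

```python
class Var:
    def __init__(self, name):
        self.name = name
    def to_english(self):
        return self.name
    def to_python(self):
        return self.name

class BinOp:
    WORDS = {"+": "plus", "-": "minus", "*": "times", ">": "is greater than"}
    def __init__(self, op, left, right):
        self.op, self.left, self.right = op, left, right
    def to_english(self):
        return f"{self.left.to_english()} {self.WORDS[self.op]} {self.right.to_english()}"
    def to_python(self):
        return f"({self.left.to_python()} {self.op} {self.right.to_python()})"

class IfElse:
    def __init__(self, then, cond, other):
        self.then, self.cond, self.other = then, cond, other
    def to_english(self):
        return (f"{self.then.to_english()} if {self.cond.to_english()}, "
                f"otherwise {self.other.to_english()}")
    def to_python(self):
        return (f"({self.then.to_python()} if {self.cond.to_python()} "
                f"else {self.other.to_python()})")

# The example expression from below, rendered both ways:
a, b, c = Var("a"), Var("b"), Var("c")
expr = IfElse(BinOp("-", BinOp("*", a, c), b), BinOp(">", c, a), BinOp("*", c, b))
print(expr.to_python())   # (((a * c) - b) if (c > a) else (c * b))
print(expr.to_english())  # a times c minus b if c is greater than a, otherwise c times b
```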

The next issue was generating programming problems of sufficient variety to be useful. Too few examples would basically guarantee overfitting, so manually constructing programming examples was out. Too repetitive, and the model would ignore the details of the English and just pick up on the structure of the sentences. That seemed okay at the time -- we figured we could remove details without too much effort to make the problems more 'programming-challenge-esque'. It quickly became apparent that selecting which details to omit to frame the problem was almost as big a challenge as the original one.

Our graph approach was producing problems like "Given variables a, b, c, return a*c - b if c > a else c*b." Not a particularly interesting problem, since it basically amounts to a direct translation from the description to machine code, and we wanted to avoid building "a compiler, but with ML."

We spent the remainder of the day on two efforts. The first was trying to construct more elaborate program descriptions and more subtle, interesting problems. The second took an entirely different approach: we tried using one autoencoder to learn the Python AST and another to learn the structure of English, then bridging the two using our limited dataset scraped from Project Euler and from OpenAI's sample dataset.
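
In outline, the bridging idea looked something like this. This is a sketch assuming PyTorch and flat feature vectors; the real models would need to be sequence models, and every dimension here is made up:

```python
import torch
import torch.nn as nn

LATENT = 64  # hypothetical shared latent dimensionality

class AutoEncoder(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(dim, 128), nn.ReLU(), nn.Linear(128, LATENT))
        self.decoder = nn.Sequential(nn.Linear(LATENT, 128), nn.ReLU(), nn.Linear(128, dim))
    def forward(self, x):
        return self.decoder(self.encoder(x))

# Each autoencoder trains on plentiful *unpaired* data:
english_ae = AutoEncoder(dim=300)  # English problem descriptions
python_ae = AutoEncoder(dim=200)   # serialized Python ASTs

# The bridge maps English latents to code latents; it's the only part
# that needs the small paired dataset (Project Euler + OpenAI samples).
bridge = nn.Linear(LATENT, LATENT)

def describe_to_code(english_features):
    with torch.no_grad():
        return python_ae.decoder(bridge(english_ae.encoder(english_features)))
```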

I'm not yet sure what we'd do differently. Most of the time, it seemed we made the right choices given the available information -- the biggest oversight, to me, remains misjudging the quality of the English output from the graph. I have to wonder if that could be improved through clever graph-optimization tricks. Perhaps that's a project for another hackathon.

On a personal note, I was overwhelmingly impressed by the depth of skill among the people there. As much as I don't enjoy being the least smart person in a room, that's a good state for a programming collaboration. I look forward to working with everyone again after I've had a chance to study some more. Maybe a lot more.

Thank you to everyone who participated. It has been a pleasure.

I did my master's in machine learning, so I'm a little touchy on the subject. It always stands out to me when someone says 'big data punishes poor people', because it sounds like "polynomials are anti-Semitic" or "bolt cutters are racist".

Machine learning is a tool like any other, and it can be used for nefarious purposes. I don't think it's unreasonable to assert that things like search bubbling contribute to echo-chamber effects, since they result in people seeing only data that reinforces their viewpoints (as a side effect of being more relevant). But casting a blanket statement like this is, I think, catchy and unnecessarily negative.

I hope the book doesn't overlook the positive contributions that data mining has made, like discovering genetic markers for diseases, finding new antibiotics, finding treatments for cancers, decreasing water consumption in agriculture, tracking diminishing animal populations, or even more mundane things like providing automatic subtitles to videos for the hearing impaired.

The most interesting question I have to raise is this: is it _more_ humane to remove the biases of a human? Humans are REALLY good at seeing patterns. We're so good at seeing patterns that we see them where there are none -- we see Jesus in toast, we see faces in the sky, we see people as part of a group. That last one is racist, and while we can't alter our perceptions we can be made aware of them and do everything we can to try and work around our 'feelings'. Machines are getting good at recognizing patterns too, now. They even beat us in a lot of cases. If we train a model with racist data, though, it will generate racist predictions. Can we efficiently sanitize data to be sure that it's fair to everyone involved? Is it inevitable that people will abuse statistics to further their own ends? Equally curious: if data suggests a 99% chance that someone will default on a loan, should we chide the operator of the tool for using it? What if they're trying to protect their own best interests? I don't know if there's a winner there.

There are a lot of answers I don't have and, ironically, an inability to predict the future, but I do have an emotional response to the article: it's unpleasant and bothersome. I can't say it's wrong, but I can say it's an incomplete picture, and that it furthers the author's agenda of making a boogeyman out of an emerging technology. I don't like that.

tl;dr: This is a nuanced topic, and I'm dubious that the author can cover it reasonably; I fear instead that it devolves into fear-mongering.