Hacker News from Y Combinator

Links for the intellectually curious, ranked by readers.

Forgotten Corners of World of Warcraft


The giant random crystals of Silithus (All images by Eric Grundhauser)

As more and more of our time and lives move into the digital world, the online landscape has become so vast that we have begun leaving some things behind.

Much like in the physical world, whole swaths of towns, islands, forests, palaces, and simple shacks have been constructed digitally and then abandoned. These are the forgotten wonders of the digital world; in this case, the World of Warcraft.

For the uninitiated, World of Warcraft (WoW) is a massively multiplayer online role-playing game (MMORPG). Players download the game and pay a monthly fee to log in, taking part in a shared world called Azeroth. The essential size of the world is fixed, but at any given time WoW has around 10 million active subscribers, and to manage that unbelievable number of players, subscribers are split across hundreds of realms. What this means, structurally, is that not all of Warcraft's millions of players exist in the same space at the same time.

And the world is vast. At the game’s launch in 2004, Azeroth consisted of just two continents, split into 41 zones that can best be thought of as countries, each defined by its own look and flavor. The Barrens is a bit of the African veldt, while the Burning Steppes are a blasted, molten wasteland. In real-world terms, Azeroth has been compared both in size and thematic construction to Disney World. (For geography nerds, the real world-game world ratio has been explored a few different ways, including delving into the code to find the in-game distance measurements and using the average stride length of a player’s sprite to extrapolate some numbers; people tend to agree that the original continents are around 8 miles long.)

The world of Azeroth

As the game has grown, new continents have been introduced in add-on content expansions that players can purchase, and whenever players rush into the newer areas, the original spots get ignored. Among them are the Asian-inspired Pandaria, the icy northern lands of Northrend, and Outland, which literally exists on the remnants of another world. Today, there are 91 zones split across six landmasses, to say nothing of the dozens of dungeon spaces that exist as separate little places for specific adventures. With new continents providing players quicker ways to advance and novel locations to explore, many older zones have simply become bygone curiosities.

So what are the abandoned parts of WoW? Unfortunately, the game designers declined to comment, so it was left to me to do some on-the-digital-ground reporting.

What an attractive young man. 

I reactivated Baerf, my level 86 troll rogue. Turns out he was right where I had left him countless months before, in Pandaria, the fifth continent to be introduced. I had purchased the Mists of Pandaria expansion, which gave me access to a new continent, but I quickly canceled my account because I knew it would be too much of a delightful time drain. Adulthood is a bummer.

I used my hearthstone (an item given to every player at the start that can return you to a populated place) to travel back to Orgrimmar, the central Horde city, located on one of the original continents. When I had last played, this orc metropolis was bustling with so many other live players that it made my computer wheeze. Now it was almost empty.

Entering a command that listed all of the players in the zone, I found that there were just 19 people in the hub city. The computer-controlled, non-player characters (NPCs) were there of course, but flesh-and-blood players were scarce. Nonetheless, I sent out a public chat to everyone in the city, asking after the least inhabited places.

In the nearly empty city, I got little response. A couple of players responded with their favorite corners of the world, inspired more by nostalgia than any confirmed numbers. But my question may not even have been understood. WoW is a game designed to be shared with other players; that sharing is its raison d'être. Looking for places that no one goes is kind of antithetical to most players’ view of the game. Nonetheless, some of the most beautiful moments in the game occur in its loneliest locales.

The Scarab Wall in Silithus

Venturing out into the world in search of solitude, my first stop was Silithus, in the south of the continent of Kalimdor. This zone was part of the original release and was designed for players between levels 55 and 60. As the world grew, better options for leveling characters emerged, and Silithus, despite being an evocative wasteland, seems to have lost its appeal. The overriding theme is bugs. The area is full of titanic crystalline plinths floating against an orange sky, and ant mounds buzzing with rings of pests. Organic claws reach out of the ground everywhere you look, and if you run for more than a few seconds in any direction, you will end up in a pit filled with chittering Silithid enemies.

The Swarming Pillar in Silithus

Nearby is another zone from the original game, the Un’Goro Crater, which was inspired by the very real Ngorongoro Crater in Tanzania. A strangely disconnected area even in the beginning, this massive depression is a mist-shrouded primordial jungle full of gorillas, dinosaurs, and mysterious ancient ruins that look like they could have been created by Roman artisans. The jungle forms a ring on the crater floor with a roiling volcano at its center. The ground is littered with bones left behind by the prehistoric beasts roaming the grounds, juxtaposed with the colorful natural crystals that grow nowhere else on Azeroth. Meant for players in the 50-55 level range, the Un’Goro Crater suffered much the same fate as its insect-infested neighbor.

Fire Plume Ridge in the center of the Un'Goro Crater

The Shaper's Terrace

As I flew through these zones, each held just over 20 players.

The gates of Bogpaddle in the Swamp of Sorrows

Across the sea on the continent known as the Eastern Kingdoms, there are zones such as the Swamp of Sorrows. One of a handful of areas in the game designed as a dank swamp environment, the Swamp suffers for being a small sliver of an area sandwiched between two zones with more bombastic scenery. However, its sunken temples and oppressive hanging moss create a haunting space to explore. When I visited, there were only 14 others trudging through the swamps.

The sunken Temple of Atal'Hakkar in the Swamp of Sorrows

One of the more recent areas that seems to have been almost immediately tossed aside is the underwater zone of Vashj’ir. It was introduced in the Cataclysm expansion, which reshaped a number of the original zones to try to bring people back to these forgotten lands. Vashj’ir was the first zone in the game to be entirely underwater, and the sub-oceanic playground is big enough to be split into three separate zones itself. All three are jam-packed with wonders: fantastical forests of rising kelp, the titanic shells of dead crustaceans that can be explored like a cave system, and a swirling abyssal vortex that sucks players off to a dungeon instance. As of my visit, there were all of nine active players exploring the vortex.

The massive living caves of Nespirah

The Abyssal Breach

Kelp'thar Forest

The most unloved areas of WoW, however, may not even be on Azeroth. Outland, introduced in the first expansion, The Burning Crusade, was a whole new continent on the other side of a legendary portal, where players could join the battle against demons and evil(er) orcs. Unlike the mostly seamless continents of the original game, Outland’s landscapes seemed like a barely connected selection of dreams. You could walk from the spiked geological impossibilities of the Blade’s Edge Mountains right into the scattered scraps of untethered land known as the Netherstorm. It was bright, crazy, and now, almost entirely empty. I visited every zone on Outland for this piece, and not one of them had more than 30 players.

Hellfire Citadel in Outland's Hellfire Peninsula

The anti-dragon spikes of Blade's Edge Mountains

Eco-Dome Midrealm in the Netherstorm

The most recent expansion to the World of Warcraft, Warlords of Draenor, was released in November of 2014, drawing players to the newly created areas. Nonetheless, all of these older corners of WoW still exist. The generated bugs of Silithus still hunt; Vashj’ir’s kelp forests continue to sway in the currents; and the demon hordes are still waiting in Outland. Like the real world, it can get lonely wandering solo through a kelp forest or spiky mountain range, but it can be beautiful, too. 

Baerf flies on.

Rust once, run everywhere


Rust's quest for world domination was never destined to happen overnight, so Rust needs to be able to interoperate with the existing world just as easily as it talks to itself. For this reason, Rust makes it easy to communicate with C APIs without overhead, and to leverage its ownership system to provide much stronger safety guarantees for those APIs at the same time.

To communicate with other languages, Rust provides a foreign function interface (FFI). Following Rust's design principles, the FFI provides a zero-cost abstraction where function calls between Rust and C have identical performance to C function calls. FFI bindings can also leverage language features such as ownership and borrowing to provide a safe interface that enforces protocols around pointers and other resources. These protocols usually appear only in the documentation for C APIs -- at best -- but Rust makes them explicit.

In this post we'll explore how to encapsulate unsafe FFI calls to C in safe, zero-cost abstractions. Working with C is, however, just an example; we'll also see how Rust can easily talk to languages like Python and Ruby just as seamlessly as with C.

Rust talking to C

Let's start with a simple example of calling C code from Rust and then demonstrate that Rust imposes no additional overhead. Here's a C program which will simply double all the input it's given:

int double_input(int input) { return input * 2; }

To call this from Rust, you might write a program like this:

extern crate libc;

extern {
    fn double_input(input: libc::c_int) -> libc::c_int;
}

fn main() {
    let input = 4;
    let output = unsafe { double_input(input) };
    println!("{} * 2 = {}", input, output);
}

And that's it! You can try this out for yourself by checking out the code on GitHub and running cargo run from that directory. At the source level we can see that there's no burden in calling an external function beyond stating its signature, and we'll see soon that the generated code indeed has no overhead, either. There are, however, a few subtle aspects of this Rust program, so let's cover each piece in detail.

First up we see extern crate libc. The libc crate provides many useful type definitions for FFI bindings when talking with C, and it makes it easy to ensure that both C and Rust agree on the types crossing the language boundary.

This leads us nicely into the next part of the program:

extern {
    fn double_input(input: libc::c_int) -> libc::c_int;
}

In Rust this is a declaration of an externally available function. You can think of this along the lines of a C header file. Here's where the compiler learns about the inputs and outputs of the function, and you can see above that this matches our definition in C. Next up we have the main body of the program:

fn main() {
    let input = 4;
    let output = unsafe { double_input(input) };
    println!("{} * 2 = {}", input, output);
}

We see one of the crucial aspects of FFI in Rust here, the unsafe block. The compiler knows nothing about the implementation of double_input, so it must assume that memory unsafety could happen whenever you call a foreign function. The unsafe block is how the programmer takes responsibility for ensuring safety -- you are promising that the actual call you make will not, in fact, violate memory safety, and thus that Rust's basic guarantees are upheld. This may seem limiting, but Rust has just the right set of tools to allow consumers to not worry about unsafe (more on this in a moment).

Now that we've seen how to call a C function from Rust, let's see if we can verify this claim of zero overhead. Almost all programming languages can call into C one way or another, but it often comes at a cost with runtime type conversions or perhaps some language-runtime juggling. To get a handle on what Rust is doing, let's go straight to the assembly code of the above main function's call to double_input:

mov    $0x4,%edi
callq  3bc30 <double_input>

And as before, that's it! Here we can see that calling a C function from Rust involves precisely one call instruction after moving the arguments into place, exactly the same cost as it would be in C.

Safe Abstractions

Most features in Rust tie into its core concept of ownership, and the FFI is no exception. When binding a C library in Rust you not only have the benefit of zero overhead, but you are also able to make it safer than C can! Bindings can leverage the ownership and borrowing principles in Rust to codify comments typically found in a C header about how its API should be used.

For example, consider a C library for parsing a tarball. This library will expose functions to read the contents of each file in the tarball, probably something along the lines of:

// Gets the data for a file in the tarball at the given index, returning NULL if
// it does not exist. The `size` pointer is filled in with the size of the file
// if successful.
const char *tarball_file_data(tarball_t *tarball, unsigned index, size_t *size);

This function is implicitly making an assumption about how it can be used, however: the returned char* pointer must not outlive the input tarball. When bound in Rust, this API might look like this instead:

pub struct Tarball { raw: *mut tarball_t }

impl Tarball {
    pub fn file(&self, index: u32) -> Option<&[u8]> {
        unsafe {
            let mut size = 0;
            let data = tarball_file_data(self.raw, index as libc::c_uint,
                                         &mut size);
            if data.is_null() {
                None
            } else {
                Some(slice::from_raw_parts(data as *const u8, size as usize))
            }
        }
    }
}

Here the *mut tarball_t pointer is owned by a Tarball, which is responsible for any destruction and cleanup, so we already have rich knowledge about the lifetime of the tarball's memory. Additionally, the file method returns a borrowed slice whose lifetime is implicitly connected to the lifetime of the source tarball itself (the &self argument). This is Rust's way of indicating that the returned slice can only be used within the lifetime of the tarball, statically preventing dangling pointer bugs that are easy to make when working directly with C. (If you're not familiar with this kind of borrowing in Rust, have a look at Yehuda Katz's blog post on ownership.)

A key aspect of the Rust binding here is that it is a safe function, meaning that callers do not have to use unsafe blocks to invoke it! Although it has an unsafe implementation (due to calling an FFI function), the interface uses borrowing to guarantee that no memory unsafety can occur in any Rust code that uses it. That is, due to Rust's static checking, it's simply not possible to cause a segfault using the API on the Rust side. And don't forget, all of this is coming at zero cost: the raw types in C are representable in Rust with no extra allocations or overhead.

Rust's amazing community has already built some substantial safe bindings around existing C libraries, including OpenSSL, libgit2, libdispatch, libcurl, sdl2, Unix APIs, and libsodium. This list is also growing quite rapidly on crates.io, so your favorite C library may already be bound or will be bound soon!

C talking to Rust

Despite guaranteeing memory safety, Rust does not have a garbage collector or runtime, and one of the benefits of this is that Rust code can be called from C with no setup at all. This means that the zero overhead FFI not only applies when Rust calls into C, but also when C calls into Rust!

Let's take the example above, but reverse the roles of each language. As before, all the code below is available on GitHub. First we'll start off with our Rust code:

#[no_mangle]
pub extern fn double_input(input: i32) -> i32 {
    input * 2
}

As with the Rust code before, there's not a whole lot here but there are some subtle aspects in play. First off, we've labeled our function definition with a #[no_mangle] attribute. This instructs the compiler to not mangle the symbol name for the function double_input. Rust employs name mangling similar to C++ to ensure that libraries do not clash with one another, and this attribute means that you don't have to guess a symbol name like double_input::h485dee7f568bebafeaa from C.

Next we've got our function definition, and the most interesting part about this is the keyword extern. This specifies the ABI for the function, making it compatible with a C function call.

Finally, if you take a look at the Cargo.toml you'll see that this library is not compiled as a normal Rust library (rlib) but instead as a static archive which Rust calls a 'staticlib'. This enables all the relevant Rust code to be linked statically into the C program we're about to produce.
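Concretely, that looks something like the following Cargo.toml (the package name here is assumed for illustration; the crate-type line is the relevant part):

```toml
[package]
name = "double_input"
version = "0.1.0"

[lib]
# Produce a static archive that a C program can link against,
# instead of the default Rust rlib.
crate-type = ["staticlib"]
```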

Now that we've got our Rust library squared away, let's write our C program which will call Rust.

#include <stdint.h>
#include <stdio.h>

extern int32_t double_input(int32_t input);

int main() {
    int input = 4;
    int output = double_input(input);
    printf("%d * 2 = %d\n", input, output);
    return 0;
}

Here we can see that C, like Rust, needs to declare the double_input function that Rust defined. Other than that, though, everything is ready to go! If you run make from the directory on GitHub, you'll see these examples getting compiled and linked together, and the final executable should run and print 4 * 2 = 8.

Rust's lack of a garbage collector and runtime enables this seamless transition from C to Rust. The external C code does not need to perform any setup on Rust's behalf, making the transition that much cheaper.

Beyond C

Up to now we've seen how FFI in Rust has zero overhead and how we can use Rust's concept of ownership to write safe bindings to C libraries. If you're not using C, however, you're still in luck! These features of Rust enable it to also be called from Python, Ruby, JavaScript, and many more languages.

When writing code in these languages, you sometimes want to speed up some component that's performance critical, but in the past this often required dropping all the way to C, and thereby giving up the memory safety, high-level abstractions, and ergonomics of these languages.

The fact that Rust can talk easily with C, however, means that it is also viable for this sort of usage. One of Rust's first production users, Skylight, was able to improve the performance and memory usage of their data collection agent almost instantly by using Rust, and the Rust code is all published as a Ruby gem.

Moving from a language like Python and Ruby down to C to optimize performance is often quite difficult as it's tough to ensure that the program won't crash in a difficult-to-debug way. Rust, however, not only brings zero cost FFI, but also makes it possible to retain the same safety guarantees as the original source language. In the long run, this should make it much easier for programmers in these languages to drop down and do some systems programming to squeeze out critical performance when they need it.

FFI is just one of many tools in the toolbox of Rust, but it's a key component to Rust's adoption as it allows Rust to seamlessly integrate with existing code bases today. I'm personally quite excited to see the benefits of Rust reach as many projects as possible!

Crystal promises to help you understand how best to talk to any person


When my editor Joe told me to write this story, I knew with algorithmic certainty how to respond: “Done. Absolutely. It’s taken care of.”

I got this advice from Crystal, a site that promises to help you understand how best to talk to any particular person. All you have to do is pick the subject. Crystal will then slurp up public data from around the web, run it through “proprietary personality detection technology,” and spit out a detailed report on that person’s preferred style of communicating. It’s one part oppo research, one part algorithmic astrology. It’s definitely creepy, perhaps useful, and almost certainly a look at how we’ll communicate in the future.

In the case of my editor, Crystal’s dossier was surprisingly accurate: “Joe is an achiever: fast-paced, ambitious, and persuasive, so get to the bottom line and don’t feel insulted by a direct or blunt comment.” When speaking to him, it told me to use words like “done,” “absolutely,” and “it’s taken care of.” In email, it recommended limiting my message to three sentences and stating my purpose clearly in the first line. After all, Crystal informed me, “it does not come naturally to Joe to be accommodating and forgiving with his time.”

Sorry boss—Crystal said it, not me.


Like anything that cultivates an association with magic, Crystal is slightly less impressive once you know how it works. If someone were looking you up, the site would start by examining things you've written publicly (social media profiles being a primary source) and analyzing factors like writing style and sentence structure. Then it processes what others have written about you. Using those data points, the site identifies you as one of 64 communicative types, which the company has adapted from well-known personality frameworks. Crystal doesn't really know you, in other words; it just knows what you're like.

According to co-founder Drew D’Agostino, that’s usually enough. “The beauty of these frameworks is that, if you know one bit of data about a person and you’re accurate about it, you can make really good assumptions about how they’re likely to communicate,” he says. Building the model required considerable experimentation, but given a certain volume of writing, “we figured out a few algorithms that really nailed it,” D’Agostino says.

It wasn’t immediately obvious what to do with the technology, but D’Agostino and company eventually arrived at an answer: email. Beyond letting you look up reports on individuals, paying customers get access to a Chrome extension that puts Crystal’s oracular advice right in your inbox. While emailing D’Agostino to arrange an interview, I noticed a new ochre button in my Gmail compose window. It bore an exhortation: “Be brief.” I was startled, then followed its instruction.

There’s a lot of anxiety involved in sending email, D’Agostino says. It can be hard to know what sort of greeting to use, or whether to include a joke. Crystal tries to remove some of that ambiguity. Clicking the “Be brief” button pulled up more detailed suggestions, providing specific phrases to use and avoid in that particular scenario. Crystal even goes so far as to offer a fully-written email template, algorithmically derived for the recipient.

That of course is the dream implicit in all this: a button that sends the perfect email every time. Indeed, a number of artists have explored the contours of this queasy future in recent months. A browser extension by Joanne McNeil fills emails with exclamation points and smileys, automating the "emotional labor" required in today's cheerful correspondence. Lauren McCarthy and Kyle McDonald took these ideas a step further with Pplkpr, an app that uses biometric signals to sort the real-life acquaintances who invigorate you from those who aren't worth your time.

Pplkpr is satire—it’s funded by the Andy Warhol Foundation for the Visual Arts—but it seems like an artifact from a plausible, perhaps even likely future. Marshall McLuhan once said artists are always the first to figure out how technology will change culture, and there are signs we’re headed in Pplkpr’s direction. Increasingly, we meet partners on dating sites, paired by algorithms. Our phones become ever more adept at parsing conversations and suggesting programmed replies.

D’Agostino says he built Crystal because he wanted to technologically enhance his emotional intelligence. And such a tool definitely could be helpful, say, for knowing that the person you're sending a job application to hates wordy emails. But surely there's a point at which algorithmically informed communication curls back around, Möbius-strip style, and we end up even more remote and unknowable to each other than we were when we started.

Becoming Productive in Haskell


Sometime recently I realized I had become proficient enough in Haskell to be productive, and I wanted to capture some of my thoughts on the learning experience before it got too far away. I do most of my web prototyping in Haskell now, though I still regularly use and enjoy Python.

This is more of a thought on moving from a dynamic language to a static one: in Haskell, the structure of your data is mostly stated in data declarations and type signatures; in Python, it's mostly implied by the code.

My first thought with a Haskell function is “What does the data look like? This function takes a ____ and returns a ____?”, while in Python my first thought is “What does the code say?”.

Thinking ‘data first’ improved my coding, even when coming back to Python. I more often recognize when the structure of my data changes for no real reason other than it was easy and I was very ‘zoomed in’ on the problem at the time.

Limiting changes in data structure also makes the code less complex, and easier to understand.

One of my main motivations for using Python is readability of code. Haskell originally looked ugly outside of what seemed to be carefully crafted examples. Pieces of it looked very clear, but were surrounded by flotsam and jetsam of nonsense. But it was also obviously powerful.

I definitely wanted to avoid ‘clever’ code that was powerful but confusing.

However, my ability to assess readability came from assessing other imperative languages. It was a bit like criticizing the readability of Mandarin as an English reader.

I found that Haskell is not ‘clever but deceptive’. Of course you can write ‘clever’ code in Haskell, just like any language, but it’s not the common case.

Actually, in Haskell that ‘clever code’ can only do so many clever things, as it’s constrained by the type system. If it says it returns an Int, it will return an Int or fail to compile.

The more powerful and precise abstraction mechanisms that Haskell supplies just sometimes smell like the magic that I try to avoid in Python.

In the beginning though, you kinda have to have faith that yes, people do read it without any trouble and on a regular basis. Once over the hump, Haskell became very readable for me.

  1. Type signatures. They’re like getting a little summary at the top of a chapter of a book, with the added bonus that it’s guaranteed to be true. Wouldn’t that be great to have next time you try to learn another language?

     This is the chapter where Tommy goes to the market and buys a duck.

     chapter :: Tommy -> Market -> Duck

  2. Composing functions out of other, smaller functions offers a big reduction in complexity. If they’re named well, this allows you to write functions that are easy to read.

  3. It’s concise. You don’t need a ton of code to express a powerful idea.

I also want to mention the infix symbols that are common in Haskell code ($, <$>, <-, ->, etc.), as they can create a sort of symbol-induced despair/anger in newcomers.

Don’t despair! I know they reek of deceptive cleverness, but there are only a limited number of common ones and once you know them you’ll see they’re useful and simple. I think there are maybe 5 infix symbols that I use on a regular basis.
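To make that concrete, here is a small sketch of my own (not from any particular codebase) showing two of those everyday symbols: $ is low-precedence function application that saves parentheses, and <$> is just infix fmap:

```haskell
main :: IO ()
main = do
  -- <$> is infix fmap: apply (*2) to every element of the list
  let doubled = (*2) <$> [1, 2, 3]   -- [2, 4, 6]
  -- $ applies the function on its left to everything on its right,
  -- saving us from writing print (sum doubled)
  print $ sum doubled
```

(<- is different: it appears in do-notation, binding the result of an action to a name.)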

That being said, I would say ignore the lens library in the beginning, as it has a ton of infix symbols. It’s a very cool library, but you can get by just fine without it. Wait until you’re comfortable creating medium sized things in Haskell, and then approach it at your leisure.

There are a lot of completely new words to learn when you learn Haskell. Things like Functor and Monad.

These words are going to feel heavier to learn for a few reasons. When starting to learn imperative programming, a lot of the new vocabulary has at least some familiarity. A loop brings to mind…well, loops. Race tracks, roller coasters, uhh….cereal.

We store memories by attaching them to previously made memories, so there is going to be a tendency for your brain to just shut off if too many of these new, heavy words show up in a sentence or paragraph. I had no associations with the word Functor, so it was hard to store.

My strategy in learning these words was to come up with my own name that made sense to me and mentally substitute it every time that heavy word came up. After a while, these made up synonyms anchored me and I had no problem with the ‘heavy word’.

For example: Functor.

In Haskell, this is something that can be mapped over. For example, a list is a Functor. This means there is a mapping function that takes another function and applies it to everything in the list and creates a list with the results.

map (+1) [1,2,3,4] -- results in [2,3,4,5]

So, I started calling it Mappable. Mappable was easy for me to remember and was descriptive of what it did. A list is a Functor; a list is Mappable.
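The "Mappable" intuition carries beyond lists: fmap is the general mapping function, and lists are only one instance. A small sketch of my own to illustrate:

```haskell
main :: IO ()
main = do
  print (fmap (+1) [1, 2, 3, 4])  -- lists are Mappable: [2,3,4,5]
  print (fmap (+1) (Just 3))      -- so is Maybe: Just 4
  print (fmap (+1) Nothing)       -- nothing inside to map over: Nothing
```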

In Python, my main development tool is the print statement/function.

In Haskell, my main development tool is the type system. It checks what I’d normally use print statements to check: what data a function is actually receiving or returning.

But! You can use Debug.Trace as a Python-style print function without having to bother with Haskell’s IO type. This can be very useful to get started. Though, once you get moving in Haskell, you probably won’t use it as much as you think you would.
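A minimal sketch of that style (the function here is made up for illustration): trace prints its message and then returns its second argument unchanged, so it can be dropped into pure code:

```haskell
import Debug.Trace (trace)

-- a pure function with a trace "print statement" tucked inside
addOne :: Int -> Int
addOne x = trace ("addOne received " ++ show x) (x + 1)

main :: IO ()
main = print (addOne 41)  -- the trace message appears, then 42
```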

If you leave trace statements in your code after you’re finished debugging…well, you will feel dirtier when you do that in Haskell than when you do it in Python.

My turning point was a Parsec tutorial.

Mostly when you hear about someone becoming productive in Haskell, it involves a description of how they finally understood Monads. Well, damn, here it goes.

I needed to write a parser. I had something in Python, but due to my inexperience in writing parsers, the growing complexity of my code was slowing me down considerably.

So, I had some extra time, I thought maybe I should give it a go in Haskell.

I found the YouTube video Parsing Stuff in Haskell, which explains how to create a JSON parser in Haskell using the Parsec library.

But it also inadvertently showed me how to use Monads and Applicatives as tools to create something I needed. It showed me how they function (har, har) and how they are related to each other.

After writing a parser with them, I began to understand other code that used them. I then started to understand the abstract nature of them…but that abstractness was a lesson for another day, not for starting out.

Also, Parsec provided enough structure that my inexperience in writing parsers did not really matter. In fact, as someone just learning Haskell, I was able to write a parser that was better in every measure (lower complexity, speed, readability, extensibility) than what I could do as a programmer who has worked with Python for years but has no expertise in parsers.
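For flavor, here is the kind of small parser Parsec makes easy. This is my own sketch (a comma-separated list of numbers), not the JSON parser from the video:

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- one or more digits, read as an Int
number :: Parser Int
number = read <$> many1 digit

-- numbers separated by commas, e.g. "1,20,300"
numbers :: Parser [Int]
numbers = number `sepBy` char ','

main :: IO ()
main = print (parse numbers "" "1,20,300")  -- Right [1,20,300]
```

Each combinator (many1, sepBy, char) is a parser or parser-transformer, and they compose into bigger parsers the same way ordinary functions do.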

Haskell is my main web prototyping language now for several reasons.

Well, reason 0 is I have the opportunity to choose what technology I use. I know that’s a luxury.

  1. I’m able to write a prototype faster, and that prototype is usually my production version.
  2. I don’t have to waste my time on trivial bugs.
  3. The bugs I do encounter are generally more meaningful and lead me to understanding the problem more. Note, meaningful doesn’t always mean harder.
  4. Python taught me not to worry about speed that much. Haskell agreed with that but let me have it anyway.
  5. Refactoring is a breeze. In Python, I always had a nagging feeling that I’d forgotten to change some small part of my code that would be important later.
  6. Excellent libraries. I feel that the basic guarantees of the Haskell language make the standard quality of libraries exceptionally high. Then there are libraries that were game-changers for me (Parsec and QuickCheck immediately come to mind, but there are others.)
  7. A helpful community
  8. Easy to scale code up to using many cores.
  9. Haskell infrastructure is improving all the time. Last year, GHC (the Haskell compiler) 7.8 came out, which doubled the performance of Warp, a prominent web server that was already pretty fast.

And finally, I have to say that writing Haskell code comes with a deep level of satisfaction. It’s more rewarding than almost any coding experience I’ve had.

It can be tough to find a good starting point.

Here’s how I would do it if I had to learn Haskell again.

First, read at least chapters 1 through 8 of Learn You a Haskell for Great Good. Then:


  1. Write a small module that doesn’t worry about IO. Something like a Sudoku module that generates Sudoku puzzles. Don’t worry about using a random number as a seed. Use Debug.Trace as your print statement to see what’s going on. Generate a puzzle and Debug.Trace it to the screen. Create your own data types, and just use functions (i.e., no custom typeclasses).
  2. Turn that into a website using either Scotty or Spock. Keep it simple: a URL that shows a sudoku puzzle, then a URL that produces JSON of a sudoku puzzle.
  3. Mess around with real IO. Try printing the puzzle to the terminal without Debug.Trace.
  4. Find incremental ways to add to it. Design a file format for sudoku puzzles and write a Parsec parser for it! Don’t have the file format be JSON, make something up.
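Step 2 might look something like this with Scotty (the routes and hard-coded puzzle are made up for illustration; a real version would plug in the generator from step 1):

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text.Lazy (pack)
import Web.Scotty

-- A fixed 4x4 grid stands in for a generated sudoku puzzle.
puzzle :: [[Int]]
puzzle = [[1,2,3,4],[3,4,1,2],[2,1,4,3],[4,3,2,1]]

main :: IO ()
main = scotty 3000 $ do
  -- plain-text rendering of the puzzle
  get "/sudoku" $ text (pack (unlines (map show puzzle)))
  -- the same puzzle as JSON ([[Int]] already has a ToJSON instance)
  get "/sudoku.json" $ json puzzle
```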

Good luck!

Legacy of Agent Orange

13 hours 4 min ago

As April 30 approaches, marking 40 years since the end of the Vietnam War, people in Vietnam with severe mental and physical disabilities still feel the lingering effects of Agent Orange.

Respiratory cancer and birth defects amongst both Vietnamese and U.S. veterans have been linked to exposure to the defoliant. The U.S. military sprayed millions of gallons of Agent Orange onto Vietnam's jungles during the conflict to expose northern communist troops.

Reuters photographer Damir Sagolj travelled through Vietnam to meet the people affected, four decades on.

13 Apr 2015. DANANG, VIETNAM. REUTERS/Damir Sagolj

If you are on the plane taking off from Danang airport in Vietnam, look through the window on your right - between the departure building and the yellow wall separating the airport from densely populated neighbourhoods - you will see an ugly scar on the already not very pretty face of the Vietnam War.

This is where barrels of Agent Orange were kept at the airport the U.S. military used to spray the defoliant across the country. Now, more than forty years later, the spot is finally being decontaminated.

12 Apr 2015. DANANG, VIETNAM. REUTERS/Damir Sagolj

When covering an anniversary, it’s easy to fall into the trap of a “before and after” cliché or, even worse, to try to do something different but irrelevant.

Even so, I wanted to do a story on the legacy of Agent Orange. There were several raised eyebrows around me, as colleagues asked: Couldn't I find something new instead of retelling a story told over and over already?

14 Apr 2015. HO CHI MINH CITY, VIETNAM. REUTERS/Damir Sagolj

I can’t say where and when I heard it but I remember the advice well: no matter how many times the story has been done and how many people have done it, do it as if you are the first and only one to witness it. I listened to this advice so many times in the past and I listened to it now.

Such assignments have rules, among the most important being the longer you spend in the unknown, the more chance you have of getting strong pictures.

So a Vietnamese colleague and I set off to travel around Vietnam, a country stretching more than 1,500 kilometres from north to south, with a great many people still affected by Agent Orange.

11 Apr 2015. THAI BINH, VIETNAM. REUTERS/Damir Sagolj

The Vietnam Association of Victims of Agent Orange/Dioxin (VAVA) told Reuters that more than 4.8 million people in Vietnam have been exposed to the herbicide and over 3 million of them have been suffering from deadly diseases.

14 Apr 2015. HO CHI MINH CITY, VIETNAM. REUTERS/Damir Sagolj

But soon after I started taking pictures and talking to victims and their relatives, I realised I would need to think again about how to do this story. My immediate and natural reaction was to get closer, almost into the face of a victim, to show what has happened to human bodies.

A forensic photography approach, almost. In a hospice outside Hanoi, after a few strong portraits of a kid born with no eyes and other victims whose bodies are horribly twisted, my original plan felt wrong. The faces and eyes in the pictures hurt; the focus is there, but I might be missing everything around them, possibly even the story itself.

8 Apr 2015. HANOI, VIETNAM. REUTERS/Damir Sagolj

Former soldier Nguyen Hong Phuc, 63, sits on the bed with his son Nguyen Dinh Loc, 20.

I wanted to put it all in the context of today’s Vietnam, forty years on. To see victims of the second and third generations, where and how they live. To learn why children and grandchildren of people affected are still being born with disabilities, to find out if people know about the dangers, and if so, when they found out.

And to take pictures of all that.

As we got closer to the former front lines travelling from the north, the number of cases increased. We kept in touch with VAVA, the main association helping victims, and they gave us much needed information, including the number of victims and where they live.

Throughout the assignment, VAVA and other local officials together with family members confirmed that the health conditions of people we met and photographed are linked to Agent Orange as their parents or grandparents were exposed to it.

12 Apr 2015. DANANG, VIETNAM. REUTERS/Damir Sagolj

In yet another village, Le Van Dan, an ex South Vietnamese soldier, wearing a worn-out military jacket of the communists, his former enemy force, told me how he was sprayed directly from the U.S. planes not far from his home today.

As the tough man spoke through broken teeth, two of his grandsons in a room behind the kitchen were given milk provided by a government aid agency. Both kids were born severely disabled, doctors say because of Agent Orange.

10 Apr 2015. THAI BINH, VIETNAM. REUTERS/Damir Sagolj

In a small village in Thai Binh province, in a cold room empty of any furniture, Doan Thi Hong Gam shrank under a light blue blanket. The room’s dirty walls suggest anger and some sort of struggle. She’s been kept in isolation since the age of sixteen because of her aggressiveness and severe mental problems. She is 38 now.

I took pictures of the poor woman for about 15 minutes. They were possibly the strongest frames I have taken in a long time. Her father, a former soldier lying in the bed in a room next to hers, also very sick, was exposed to Agent Orange during the war.

11 Apr 2015. THAI BINH, VIETNAM. REUTERS/Damir Sagolj

Then another village and another picture. On a hill above his home, former soldier Do Duc Diu showed me the cemetery he built for his twelve children, who all died soon after being born disabled. There are a few extra plots next to the existing graves for where his daughters, who are still alive but very sick, will be buried.

The man was also a North Vietnamese soldier exposed to the toxic defoliant. For more than twenty years he and his wife were trying to have a healthy child. One by one their babies were dying and they thought it was a curse or bad luck, so they prayed and visited spiritual leaders but that didn’t help.

They found out about Agent Orange only after their fifteenth child was born, also sick. I took a picture of the youngest daughter. It was not an easy thing to do.

9 Apr 2015. THAI BINH, VIETNAM. REUTERS/Damir Sagolj

Lai Van Manh, who has physical and mental disabilities, rests in bed.

Village after village, strong pictures and even stronger stories emerge. My camera stayed at a distance. I shot through mosquito nets and against the light, I shot details and reflections. We took many notes trying not to miss any important details needed to build an accurate picture. Then we drove further south.

Back in Danang, next to its international airport, we visited a young couple who have lived and worked there since the late 1990s. When they first moved there, the man would go fishing, collecting snails and vegetables to bring home to eat.

The family was poor and all food was welcomed. What he didn't know was that Agent Orange, which used to be stored nearby, had contaminated the waters and everything around the lake situated next to the airstrip.

12 Apr 2015. DANANG, VIETNAM. REUTERS/Damir Sagolj

His daughter was born sick in 2000 and died aged seven. Their son was born in 2008, also sick with the same symptoms as his late sister. I took pictures and then we drove the family to the hospital for the boy’s blood transfusion. The blind and very sick boy held my finger and later blew a kiss into the emptiness. I saw it from afar as I walked away.

The United States stopped spraying Agent Orange in 1971 and the war ended in 1975. Twenty years later, some people in villages and cities still didn’t know about it. Forty years later, today, children and their parents still suffer and a large part of the story remains untold. Agent Orange is one big tragedy made of many small tragedies, all man-made.

There is not much I can do about it with my pictures except to retell the story, despite all the raised eyebrows. The pictures I took are not about the before and after, they are all about now. As for how poorly we read history and stories from the past, I’m afraid that is about our future, too.

Microsoft .NET CoreCLR is now running on FreeBSD 10.1 (amd64)

13 hours 4 min ago

@janhenke was asking the same thing in the gitter. @mmitche is looking into it. There is no stock image for FreeBSD in the Azure gallery, so it will require a little work on our side.

Hurrah!! Congrats, FreeBSD Port Team. Nice job. I guess, however, that this is just the first milestone.

Holding On and Letting Go

13 hours 4 min ago
By Judy Tankoos — 2 hours ago

Cristy, what beautiful, eloquent words that brought tears to my eyes. You and your entire family have been traveling a very rough road the last few years. My heart breaks for you all that this journey is coming to an end. Your dad is obviously loved very much by all who have had the pleasure of knowing him. From all that Dave has told me over the many years he has been friends with Paul and your mom, and you girls as well, your father has loved each of you with his whole heart. This is evident in how involved he always was in your lives, from the very first day you were born. Your family has many happy memories to sustain you and hopefully bring you some peace and comfort as you face the coming days. Many fun stories to tell as you reminisce of time spent together. Know that the love and prayers of more people than you can possibly imagine are with each and every one of you as you let go. Your father will always be with you in spirit, watching over you, protecting you. You will see him in the cardinals that rest on the tree branches outside your window, the butterfly that rests momentarily on your hand, that ray of sunshine warm upon your cheek, the light summer breeze that ruffles your hair, the first fall leaf that falls to the ground in front of you, the snowflake that lands on your cheek. Have no fears, he will be there. God bless you all. Give the family hugs from me. I will be thinking of you. Godspeed, Paul. ❤️❤️❤️❤️❤️❤️❤️❤️❤️❤️

By tom bentley — 3 hours ago

I first knew the younger Paul, as a college friend, graduate school roommate (MIT), piano player, and intramural sports teammate. It has been inspirational to watch him since then excel as an engineer, a husband, father, educator, coach, department chair, and mentor to many young engineers. Paul is the most complete package of talent and humility that I have ever known, a part of him stays with everyone he touches.

By Steve Spaulding — 3 hours ago

Keeping the entire Hudak family deep in my heart and prayers and wishing you all peace, love and comfort. My little brother remembers your father at Yale, and has told me about what a respected leader he is. Your words are beautiful and moving.

By Jenny — 6 hours ago

The last words my grandfather spoke to me resonate the strongest- "Peace be with you". I often remember those words, knowing that he was giving me the best advice he could to help us both let go while still holding on. I wish your family peace through this transition. May the web of love in this world surround you always!

By Satnam Singh — 7 hours ago

My brief encounters with Paul over the years have left a lasting impression of a wonderfully kind and brilliant and inspirational man who I admired greatly -- I am very sorry for your loss.

By Mary Jane & David Cope — last edited 8 hours ago

Thank you for posting this heartfelt message, Cristina. We met Paul only once when we visited Yale a few years ago, but we'll never forget his warmth and hospitality - we felt we had known him forever. To Paul: you and all your family are in our prayers and our thoughts during this difficult time; know that you are surrounded by love and concern from near and far; you have touched so many lives! Peace to you.

By Phil Wadler — 10 hours ago

Paul is a friend and a mentor, an inspiration and a guide for my career. I last saw Paul and Cathy in Tarragona two years ago over a fantastic meal, a good memory. Please give him my best, and let me know if there is anything I can do.

By Courtney Bedocs — 10 hours ago

Eloquently written, Christina! I remember my junior year competing in states with Jenny. We had no business winning the game but that wasn't the point, we were just so excited to have made it that far! After the talented opposing team scored multiple goals in a row, Mr. Hudak pulled us in for a timeout and reminded us to have fun. We may have lost, but we all got hugs from him our last game of the season and yes, I know our team had fun. Between your father's thorough practice plans coupled with his unique game strategy to make sure everyone received equal playing time (dragons, remember the pods?), we always had fun! "Mr. Hudak you are the reason I ever picked up a lacrosse stick and I am forever grateful for your mentorship! I love you so much!" Sending hugs to you and your family. xoxo

By Rishiyur Nikhil — 10 hours ago

Paul is a beautiful man. Kind, generous, caring, brilliant, inspired. Our thoughts are with all of you.

By Liz Storch — 10 hours ago

Our heart aches for you all during this difficult time. Please know you are in our thoughts and prayers. Paul has such a wonderful and strong spirit and we know he will always be with and near us all...he will be missed !!
Tears, Hugs & Love,
Liz & Lee


Google Sued by Job Candidate for Age Discrimination

13 hours 4 min ago
Associated Press

A 64-year-old Florida tech worker filed an age-discrimination lawsuit against Google on Wednesday, claiming the company passed on him after a job interview because of his age.

Robert Heath says in his complaint filed in U.S. District Court in San Jose, Calif., that Google unfairly dismissed his application for a software engineering job in 2011 when he was 60 years old, despite his work experience at IBM, Compaq, and General Dynamics. The lawsuit says Google based its decision not to hire Heath on a brief phone interview, despite telling him in an email that the company was “embarking on its largest recruiting / hiring campaign in its history,” and “you would be a great candidate to come work at Google.”

Heath, represented by law firm Kotchen & Low, is seeking a class-action case on behalf of job applicants 40 and older who were not hired by the Internet search company. “There are very qualified older tech workers who are out of work,” Heath said Thursday. “We had to do something about it.”

A Google spokeswoman said: “We believe that the facts will show that this case is without merit and we intend to defend ourselves vigorously.”

The lawsuit cites a survey of employees of different companies by Payscale.com, a workforce information website, that Google had a median age of 29 in 2013, while the U.S. Department of Labor reported that the median age was 43 years in the U.S. for computer programmers. A spokeswoman for Payscale.com said the median age was based on the self-reporting of 840 Google employees. Payscale data shows Google has the sixth-youngest workforce among 22 tech companies. AOL has the youngest workforce, with a median age of 27, and Facebook had the second-youngest at 28, Payscale found in December. Hewlett-Packard has the oldest, with a median age of 39, and Oracle the second-oldest at 38.

The lawsuit also cites an earlier case, Reid v. Google, in which former Google executive Brian Reid said he was referred to at the company as an “old fuddy duddy,” and that his ideas were “too old to matter.” That 2007 case was settled for undisclosed damages.

Google helped to jump-start a new wave of diversity disclosures last year by releasing data on its workforce’s racial and gender makeup. The lawsuit notes that Google’s Diversity webpage does not include age-related workforce data, despite disclosing data about other worker characteristics. Other tech companies releasing workforce diversity data did not typically disclose age data either.

Here are the median ages of 22 tech companies’ employees, which Payscale.com says come from surveys of 100 to 2,000 employees, depending on company size:

Aol, LLC: 27
Facebook Inc: 28
Linkedin Corporation: 29
Salesforce.com, Inc.: 29
Google, Inc.: 30
Amazon.com Inc: 31
Apple Computer, Inc: 31
Yahoo Inc.: 31
eBay Inc.: 32
Nvidia Corp: 32
Adobe Systems Incorporated: 33
Microsoft Corp: 33
Samsung: 33
Intel Corporation: 34
Nokia, Inc.: 34
Sony Electronics Company: 36
Dell, Inc.: 37
Monster.Com: 37
International Business Machines (IBM) Corp.: 38
Oracle Corp.: 38
Hewlett-Packard Company: 39

Correction: The name of law firm Kotchen & Low was incorrectly given as Kotchen & Law in a previous version of this article.



What happens if you remove randomness from Doom?

13 hours 4 min ago

What happens if you remove randomness from Doom?

For some reason, recently I have been thinking about Doom. This evening I was wanting to explore some behaviour in an old version of Doom and to do so, I hex-edited the binary and replaced the random number lookup table with static values.

Rather than consume system randomness, Doom has a fixed 256-value random number table from which numbers are pulled by aspects of the game logic. By replacing the whole table with a constant value, you essentially make the game entirely deterministic.

What does it play like? I tried two values, 0x00 and 0xFF. With either value, the screen "melt" effect that is used at the end of levels is replaced with a level vertical wipe: the randomness was used to offset each column. Monsters do not make different death noises at different times; only one is played for each category of monster. The bullet-based (hitscan) weapons have no spread at all: the shotgun becomes like a sniper rifle, and the chain-gun is likewise always true. You'd think this would make the super-shotgun a pretty lethal weapon, but it seems to have been nerfed: the spread pattern is integral to its function.

With 0x00, monsters never make their idle noises (breathing, etc.). On the other hand, with 0xFF, they always do: so often that each sample collides with the previous one, and you just get a sort of monster drone. This is quite overwhelming with even a small pack of monsters.

With 0xFF, any strobing sectors (flickering lights etc.), are static. However, with 0x00, they strobe like crazy.

With 0x00, monsters seem to choose to attack much more frequently than usual. Damage seems to be worst-case. The most damaging floor type ("super hellslime"/20%) can hurt you even if you are wearing a radiation suit: There was a very low chance of it hurting whilst wearing the suit (~2.6%) each time the game checked; this is rounded up to 100%.

Various other aspects of the game become weird. A monster may always choose to use a ranged attack, regardless of how close you are. They might give up pursuing you. I've seen them walk aimlessly in circles when obstructed by another thing. The chance of monster in-fighting is either zero or a certainty. The player is either mute or cries in pain whenever he's hurt.

If you want to try this yourself, the easiest way is to hack the m_random.c file in the source, but you can also hex-edit a binary. Look for a 256-byte sequence beginning ['0x0', '0x8', '0x6d', '0xdc', '0xde', '0xf1'] and ending ['0x78', '0xa3', '0xec', '0xf9'].
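For intuition, the table mechanism is easy to model. A Haskell sketch (the table entries below are just the first bytes quoted above, and the damage threshold is illustrative, not Doom's exact value):

```haskell
-- m_random.c in miniature: "random" values come from a fixed table,
-- with an index that advances (and wraps) on every call. The real
-- table has 256 entries; these are only its first few bytes.
rndtable :: [Int]
rndtable = [0x00, 0x08, 0x6d, 0xdc, 0xde, 0xf1]

pRandom :: Int -> (Int, Int)   -- current index -> (value, next index)
pRandom i = (rndtable !! i, (i + 1) `mod` length rndtable)

-- Overwrite every entry with one constant and a check of the form
-- "P_Random() < n" collapses from a probability to always/never:
suitLeaks :: Int -> Bool
suitLeaks r = r < 5   -- with a 0x00 table: always; with 0xFF: never
```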


Amazon Finally Discloses Cloud Services Sales, Showing 49% Jump

23 April 2015 - 7:00pm

A truck drives by Amazon.com's Fulfillment Center in Fernley, Nevada on December 15, 2004. Photographer: Ken James/ Bloomberg News.

Turns out there’s real money in the cloud. The Amazon.com Inc. division that serves up computer power, storage, and software via the Internet generated $1.57 billion in first-quarter sales.

The first-ever disclosure of results from the Amazon Web Services division showed revenue increased 49 percent from a year earlier. AWS cranked out operating income of $265 million for Amazon, helping offset losses in other businesses. Net losses for the company as a whole came in at $57 million.

Amazon’s reluctance to break out results from AWS spurred years of speculation as to the true size of its cloud. The numbers suggest Amazon is ahead of rivals Google Inc. and Microsoft Corp., which are estimated to generate lower revenue in comparable businesses. A recent note by Karl Keirstead, an analyst at Deutsche Bank AG, pegged AWS sales at 10 times those of Microsoft’s cloud services.

“Amazon Web Services is a $5 billion business and still growing fast -- in fact it’s accelerating,” Amazon Chief Executive Officer Jeff Bezos said today in a statement.

Amazon introduced its cloud services in 2006 with two services -- rentable storage and computers. These have since become common building blocks of Internet-based computing systems, supporting companies ranging from large enterprises like Infor US Inc. and Netflix Inc. to startups such as Instacart Inc.

Since its inception, Amazon has refined and expanded AWS while competitors including Microsoft and Google have tried to replicate its success with similar projects. Those companies have only recently started to deliver basic services on par with Amazon’s.

Comcast Plans to Drop Time Warner Cable Deal

23 April 2015 - 7:00pm

Comcast Corp. is planning to walk away from its proposed $45 billion takeover of Time Warner Cable Inc., people with knowledge of the matter said, after regulators planned to oppose the deal.

Comcast is planning to make a final decision on its plans Thursday, and an announcement on the deal’s fate may come as soon as Friday, said one of the people, who asked not to be named discussing private information.

This week, U.S. Federal Communications Commission staff joined lawyers at the Justice Department in opposing the planned transaction. FCC officials told the two biggest U.S. cable companies on Wednesday that they are leaning toward concluding the merger doesn’t help consumers, a person with knowledge of the matter said.

An FCC hearing can take months to complete and effectively kill a deal by dragging out the approval process beyond the companies’ time frame for completion. Justice Department staff is also leaning against the deal, Bloomberg reported last week.

Comcast shares rose 2.2 percent to $60.06 at 3:07 p.m. in New York, while Time Warner Cable climbed 0.5 percent.

Sena Fitzmaurice, a spokeswoman for Comcast, declined to comment.

While the DOJ has to present a case in court to block the deal, an FCC hearing referral could prove to be the bigger obstacle to Comcast’s bid to expand its cable and Internet footprint.

The last time the FCC staff proposed sending a merger to a hearing was over AT&T Inc.’s bid to buy T-Mobile USA Inc. in 2011, prompting the companies to drop the deal. The Justice Department had already brought a lawsuit seeking to block the merger.

Comcast representatives came away from the FCC meeting with the impression the deal was in trouble, according to a person familiar with the matter.

‘Gods’ edging out robots at Toyota facility

23 April 2015 - 7:00pm

Inside Toyota Motor Corp.’s oldest plant, there’s a corner where humans have taken over from robots in thwacking glowing lumps of metal into crankshafts. This is Mitsuru Kawai’s vision of the future.

“We need to become more solid and get back to basics, to sharpen our manual skills and further develop them,” said Kawai, a half century-long company veteran tapped by President Akio Toyoda to promote craftsmanship at Toyota’s plants. “When I was a novice, experienced masters used to be called gods, and they could make anything.”

These gods, or “kami-sama” in Japanese, are making a comeback at Toyota, the company that long set the pace for manufacturing prowess in the auto industry and beyond. Toyota’s next step forward is counterintuitive in an age of automation: Humans are taking the place of machines in plants across the nation so workers can develop new skills and figure out ways to improve production lines and the car-building process.

“Toyota views their people who work in a plant like this as craftsmen who need to continue to refine their art and skill level,” said Jeff Liker, who has written eight books on Toyota and visited Kawai last year. “In almost every company you would visit, the workers’ jobs are to feed parts into a machine and call somebody for help when it breaks down.”

The return of the kami-sama is emblematic of how Toyoda, 57, is remaking the company founded by his grandfather, as the chief executive officer has pledged to tilt priorities back toward quality and efficiency from a growth mentality. He’s reining in expansion at the world’s largest automaker with a three-year freeze on new car plants.

The importance of following through on that push has been underscored by the millions of cars General Motors Co. has recalled for faulty ignition switches linked to 13 deaths.

“What Akio Toyoda feared the company lost when it was growing so fast was the time to struggle and learn,” said Liker, who met with Toyoda in November. “He felt Toyota got big-company disease and was too busy getting product out.”

While the freeze and the spread of manual work may bear fruit in the long run, they could come at the expense of near-term sales growth and allow GM and Volkswagen AG to challenge Toyota by deepening their foothold in markets such as China.

The effort comes as Toyota overhauls vehicle development, where the carmaker will shift to manufacturing platforms that could cut costs by 30 percent. It also underscores Toyota’s commitment to maintain annual production of 3 million vehicles in Japan.

Learning how to make car parts from scratch gives younger workers insights they otherwise wouldn’t get from picking parts from bins and conveyor belts, or pressing buttons on machines. At about 100 manual-intensive workspaces introduced over the last three years across Toyota’s factories in Japan, these lessons can then be applied to reprogram machines to cut down on waste and improve processes, Kawai said.

In an area Kawai directly supervises at the forging division of Toyota’s Honsha plant, workers twist, turn and hammer metal into crankshafts instead of using the typically automated process. Experiences there have led to innovations in reducing levels of scrap and have shortened the production line by 96 percent from its length three years ago.

Toyota has eliminated about 10 percent of material-related waste from building crankshafts at Honsha. Kawai said the aim is to apply those savings to the next-generation Prius hybrid.

The work extends beyond crankshafts. Kawai credits manual labor for helping workers at Honsha improve production of axle beams and cut the costs of making chassis parts.

Though Kawai doesn’t envision the day his employer will rid itself of robots — 760 of them take part in 96 percent of the production process at its Motomachi plant in Japan — he has introduced multiple lines dedicated to manual labor in each of Toyota’s factories in its home country, he said.

“We cannot simply depend on the machines that only repeat the same task over and over again,” Kawai said. “To be the master of the machine, you have to have the knowledge and the skills to teach the machine.”

Kawai, 65, started with Toyota during the era of Taiichi Ono, the father of the Toyota Production System envied by the auto industry for decades with its combination of efficiency and quality. That means Kawai has been living most of his life adhering to principles of “kaizen,” or continuous improvement, and “monozukuri,” which translates to the art of making things.

“Fully automated machines don’t evolve on their own,” said Takahiro Fujimoto, a professor at the University of Tokyo’s Manufacturing Management Research Center. “Mechanization itself doesn’t harm, but sticking to a specific mechanization may lead to omission of kaizen and improvement.”

Toyoda turned to Kawai to replicate the atmosphere at Toyota’s Operations Management Consulting Division, established in 1970 by Ono. Early in his career, Toyoda worked in the division, whose principles are now deployed at Toyota plants and its parts suppliers to reduce waste and educate employees.

Newcomers to the division such as Toyoda would be given three months to complete a project at, say, the loading docks of a parts supplier, which their direct boss could finish in three weeks, Liker said. The next higher up could figure out the solution in a matter of three days.

“But they wouldn’t tell him the answer,” Liker said of Toyoda’s time working within the division. “He had to struggle, and they’d give him three months. He told me that’s what he thought Toyota lost in that period of time when it was growing so fast. That was his main concern.”

During its rise to the top of the automotive industry — Toyota has set a target for 2014 to sell more than 10 million vehicles, a milestone no automaker has ever crossed — the company was increasing production at the turn of the century by more than half a million vehicles a year.

A year after the failure of Lehman Brothers Holdings Inc. in 2008 sent car demand tumbling, Toyota began recalling more than 10 million vehicles to fix problems linked to unintended acceleration, damaging its reputation for quality.

Last month, the company agreed to pay a record $1.2 billion penalty to end a probe by the U.S. Justice Department, which said Toyota had covered up information and misled the public at the time. Lawmakers are now considering fines and suggesting criminal penalties for companies after GM took more than a decade to disclose defects with its cars.

In the aftermath of its crisis, Toyoda has held off on announcing any new car assembly plants, even as GM and VW push for further spending on new capacity.

In the years leading up to the recalls, Kawai said, he too had grown increasingly concerned that Toyota was growing too fast. One way for him to help prevent a recurrence is to have humans keep tabs on the machines.

“If there is ever a technology that’s flawless and could always make perfect products, then we will be ready and willing to install that machine,” Kawai said. “There’s no machine that is eternally stable.”

Borg: The Predecessor to Kubernetes

23 April 2015 - 7:00pm
Google has been running containerized workloads in production for more than a decade. Whether it's service jobs like web front-ends and stateful servers, infrastructure systems like BigTable and Spanner, or batch frameworks like MapReduce and Millwheel, virtually everything at Google runs as a container. Today, we took the wraps off of Borg, Google’s long-rumored internal container-oriented cluster-management system, publishing details at the academic computer systems conference Eurosys. You can find the paper here.

Kubernetes traces its lineage directly from Borg. Many of the developers at Google working on Kubernetes were formerly developers on the Borg project. We've incorporated the best ideas from Borg in Kubernetes, and have tried to address some pain points that users identified with Borg over the years.

To give you a flavor, here are four Kubernetes features that came from our experiences with Borg:

1) Pods. A pod is the unit of scheduling in Kubernetes. It is a resource envelope in which one or more containers run. Containers that are part of the same pod are guaranteed to be scheduled together onto the same machine, and can share state via local volumes.

Borg has a similar abstraction, called an alloc (short for “resource allocation”). Popular uses of allocs in Borg include running a web server that generates logs alongside a lightweight log collection process that ships the log to a cluster filesystem (not unlike fluentd or logstash); running a web server that serves data from a disk directory that is populated by a process that reads data from a cluster filesystem and prepares/stages it for the web server (not unlike a Content Management System); and running user-defined processing functions alongside a storage shard. Pods not only support these use cases, but they also provide an environment similar to running multiple processes in a single VM -- Kubernetes users can deploy multiple co-located, cooperating processes in a pod without having to give up the simplicity of a one-application-per-container deployment model.

2) Services. Although Borg’s primary role is to manage the lifecycles of tasks and machines, the applications that run on Borg benefit from many other cluster services, including naming and load balancing. Kubernetes supports naming and load balancing using the service abstraction: a service has a name and maps to a dynamic set of pods defined by a label selector (see next section). Any container in the cluster can connect to the service using the service name. Under the covers, Kubernetes automatically load-balances connections to the service among the pods that match the label selector, and keeps track of where the pods are running as they get rescheduled over time due to failures.
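The mechanics of the service abstraction can be illustrated with a toy Python sketch (all names and data here are hypothetical; this is not actual Kubernetes code): a service resolves its label selector against the current set of pods and round-robins connections among the matches.

```python
import itertools

# Hypothetical cluster state: each pod carries arbitrary key/value labels.
pods = [
    {"name": "web-1", "labels": {"app": "web", "track": "stable"}, "ip": "10.0.0.1"},
    {"name": "web-2", "labels": {"app": "web", "track": "stable"}, "ip": "10.0.0.2"},
    {"name": "web-canary", "labels": {"app": "web", "track": "canary"}, "ip": "10.0.0.3"},
]

def matches(selector, labels):
    """A pod matches when every key/value pair in the selector appears in its labels."""
    return all(labels.get(k) == v for k, v in selector.items())

class Service:
    def __init__(self, name, selector):
        self.name = name
        self.selector = selector
        self._rr = itertools.count()  # round-robin cursor

    def endpoints(self):
        # Recomputed on every call, so pods rescheduled after failures are picked up.
        return [p for p in pods if matches(self.selector, p["labels"])]

    def connect(self):
        """Round-robin a connection among the currently matching pods."""
        eps = self.endpoints()
        return eps[next(self._rr) % len(eps)]["ip"]

svc = Service("web", {"app": "web", "track": "stable"})
print([svc.connect() for _ in range(4)])  # alternates between the two stable pods
```

Because the endpoint set is re-derived from the selector on each lookup, clients that connect by service name never need to know where individual pods are currently running.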

3) Labels. A container in Borg is usually one replica in a collection of identical or nearly identical containers that correspond to one tier of an Internet service (e.g. the front-ends for Google Maps) or to the workers of a batch job (e.g. a MapReduce). The collection is called a Job, and each replica is called a Task. While the Job is a very useful abstraction, it can be limiting. For example, users often want to manage their entire service (composed of many Jobs) as a single entity, or to uniformly manage several related instances of their service, for example separate canary and stable release tracks. At the other end of the spectrum, users frequently want to reason about and control subsets of tasks within a Job -- the most common example is during rolling updates, when different subsets of the Job need to have different configurations.

Kubernetes supports more flexible collections than Borg by organizing pods using labels, which are arbitrary key/value pairs that users attach to pods (and in fact to any object in the system). Users can create groupings equivalent to Borg Jobs by using a “job:<jobname>” label on their pods, but they can also use additional labels to tag the service name, service instance (production, staging, test), and in general, any subset of their pods. A label query (called a “label selector”) is used to select which set of pods an operation should be applied to. Taken together, labels and replication controllers allow for very flexible update semantics, as well as for operations that span the equivalent of Borg Jobs.

4) IP-per-Pod. In Borg, all tasks on a machine use the IP address of that host, and thus share the host’s port space. While this means Borg can use a vanilla network, it imposes a number of burdens on infrastructure and application developers: Borg must schedule ports as a resource; tasks must pre-declare how many ports they need, and take as start-up arguments which ports to use; the Borglet (node agent) must enforce port isolation; and the naming and RPC systems must handle ports as well as IP addresses.

Thanks to the advent of software-defined overlay networks such as flannel or those built into public clouds, Kubernetes is able to give every pod and service its own IP address. This removes the infrastructure complexity of managing ports, and allows developers to choose any ports they want rather than requiring their software to adapt to the ones chosen by the infrastructure. The latter point is crucial for making it easy to run off-the-shelf open-source applications on Kubernetes -- pods can be treated much like VMs or physical hosts, with access to the full port space, oblivious to the fact that they may be sharing the same physical machine with other pods.

With the growing popularity of container-based microservice architectures, the lessons Google has learned from running such systems internally have become of increasing interest to the external DevOps community. By revealing some of the inner workings of our cluster manager Borg, and building our next-generation cluster manager as both an open-source project (Kubernetes) and a publicly available hosted service (Google Container Engine), we hope these lessons can benefit the broader community outside of Google and advance the state-of-the-art in container scheduling and cluster management.  

The Slow Death of the University

23 April 2015 - 7:00pm

A few years ago, I was being shown around a large, very technologically advanced university in Asia by its proud president. As befitted so eminent a personage, he was flanked by two burly young minders in black suits and shades, who for all I knew were carrying Kalashnikovs under their jackets. Having waxed lyrical about his gleaming new business school and state-of-the-art institute for management studies, the president paused to permit me a few words of fulsome praise. I remarked instead that there seemed to be no critical studies of any kind on his campus. He looked at me bemusedly, as though I had asked him how many Ph.D.’s in pole dancing they awarded each year, and replied rather stiffly, "Your comment will be noted." He then took a small piece of cutting-edge technology out of his pocket, flicked it open and spoke a few curt words of Korean into it, probably "Kill him." A limousine the length of a cricket pitch then arrived, into which the president was bundled by his minders and swept away. I watched his car disappear from view, wondering when his order for my execution was to be implemented.

This happened in South Korea, but it might have taken place almost anywhere on the planet. From Cape Town to Reykjavik, Sydney to São Paulo, an event as momentous in its own way as the Cuban revolution or the invasion of Iraq is steadily under way: the slow death of the university as a center of humane critique. Universities, which in Britain have an 800-year history, have traditionally been derided as ivory towers, and there was always some truth in the accusation. Yet the distance they established between themselves and society at large could prove enabling as well as disabling, allowing them to reflect on the values, goals, and interests of a social order too frenetically bound up in its own short-term practical pursuits to be capable of much self-criticism. Across the globe, that critical distance is now being diminished almost to nothing, as the institutions that produced Erasmus and John Milton, Einstein and Monty Python, capitulate to the hard-faced priorities of global capitalism.

Much of this will be familiar to an American readership. Stanford and MIT, after all, provided the very models of the entrepreneurial university. What has emerged in Britain, however, is what one might call Americanization without the affluence — the affluence, at least, of the American private educational sector.

This is even becoming true at those traditional finishing schools for the English gentry, Oxford and Cambridge, whose colleges have always been insulated to some extent against broader economic forces by centuries of lavish endowments. Some years ago, I resigned from a chair at the University of Oxford (an event almost as rare as an earthquake in Edinburgh) when I became aware that I was expected in some respects to behave less as a scholar than a CEO.

When I first came to Oxford 30 years earlier, any such professionalism would have been greeted with patrician disdain. Those of my colleagues who had actually bothered to finish their Ph.D.’s would sometimes use the title of "Mr." rather than "Dr.," since "Dr." suggested a degree of ungentlemanly labor. Publishing books was regarded as a rather vulgar project. A brief article every 10 years or so on the syntax of Portuguese or the dietary habits of ancient Carthage was considered just about permissible. There had been a time earlier when college tutors might not even have bothered to arrange set tutorial times for their undergraduates. Instead, the undergraduate would simply drop round to their rooms when the spirit moved him for a glass of sherry and a civilized chat about Jane Austen or the function of the pancreas.

Today, Oxbridge retains much of its collegial ethos. It is the dons who decide how to invest the college’s money, what flowers to plant in their gardens, whose portraits to hang in the senior common room, and how best to explain to their students why they spend more on the wine cellar than on the college library. All important decisions are made by the fellows of the college in full session, and everything from financial and academic affairs to routine administration is conducted by elected committees of academics responsible to the body of fellows as a whole. In recent years, this admirable system of self-government has had to confront a number of centralizing challenges from the university, of the kind that led to my own exit from the place; but by and large it has stood firm. Precisely because Oxbridge colleges are for the most part premodern institutions, they have a smallness of scale about them that can serve as a model of decentralized democracy, and this despite the odious privileges they continue to enjoy.

Elsewhere in Britain, the situation is far different. Instead of government by academics there is rule by hierarchy, a good deal of Byzantine bureaucracy, junior professors who are little but dogsbodies, and vice chancellors who behave as though they are running General Motors. Senior professors are now senior managers, and the air is thick with talk of auditing and accountancy. Books — those troglodytic, drearily pretechnological phenomena — are increasingly frowned upon. At least one British university has restricted the number of bookshelves professors may have in their offices in order to discourage "personal libraries." Wastepaper baskets are becoming as rare as Tea Party intellectuals, since paper is now passé.

Teaching has for some time been a less vital business in British universities than research. It is research that brings in the money, not courses on Expressionism or the Reformation.

Philistine administrators plaster the campus with mindless logos and issue their edicts in barbarous, semiliterate prose. One Northern Irish vice chancellor commandeered the only public room left on campus, a common room shared by staff and students alike, for a private dining room in which he could entertain local bigwigs and entrepreneurs. When the students occupied the room in protest, he ordered his security guards to smash the only restroom near to hand. British vice chancellors have been destroying their own universities for years, but rarely as literally as that. On the same campus, security staff move students on if they are found hanging around. The ideal would be a university without these disheveled, unpredictable creatures.

In the midst of this debacle, it is the humanities above all that are being pushed to the wall. The British state continues to distribute grants to its universities for science, medicine, engineering, and the like, but it has ceased to hand out any significant resources to the arts. It is not out of the question that if this does not change, whole humanities departments will be closed down in the coming years. If English departments survive at all, it may simply be to teach business students the use of the semicolon, which was not quite what Northrop Frye and Lionel Trilling had in mind.

Humanities departments must now support themselves mainly by the tuition fees they receive from their students, which means that smaller institutions that rely almost entirely on this source of income have been effectively privatized through the back door. The private university, which Britain has rightly resisted for so long, is creeping ever closer. Yet the government of Prime Minister David Cameron has also overseen a huge hike in tuitions, which means that students, dependent on loans and encumbered with debt, are understandably demanding high standards of teaching and more personal treatment in return for their cash at just the moment when humanities departments are being starved of funds.

Besides, teaching has been for some time a less vital business in British universities than research. It is research that brings in the money, not courses on Expressionism or the Reformation. Every few years, the British state carries out a thorough inspection of every university in the land, measuring the research output of each department in painstaking detail. It is on this basis that government grants are awarded. There has thus been less incentive for academics to devote themselves to their teaching, and plenty of reason for them to produce for production’s sake, churning out supremely pointless articles, starting up superfluous journals online, dutifully applying for outside research grants regardless of whether they really need them, and passing the odd pleasant hour padding their CVs.

In any case, the vast increase in bureaucracy in British higher education, occasioned by the flourishing of a managerial ideology and the relentless demands of the state assessment exercise, means that academics have had little enough time to prepare their teaching even if it seemed worth doing, which for the past several years it has not. Points are awarded by the state inspectors for articles with a bristling thicket of footnotes, but few if any for a best-selling textbook aimed at students and general readers. Academics are most likely to boost their institution’s status by taking temporary leave of it, taking time off from teaching to further their research.

They would boost its resources even more were they to abandon academe altogether and join a circus, hence saving their financial masters a much grudged salary and allowing the bureaucrats to spread out their work among an already overburdened professoriate. Many academics in Britain are aware of just how passionately their institution would love to see the back of them, apart from a few household names who are able to pull in plenty of customers. There is, in fact, no shortage of lecturers seeking to take early retirement, given that British academe was an agreeable place to work some decades ago and is now a deeply unpleasant one for many of its employees. In an additional twist of the knife, however, they are now about to have their pensions cut as well.

As professors are transformed into managers, so students are converted into consumers. Universities fall over one another in an undignified scramble to secure their fees. Once such customers are safely within the gates, there is pressure on their professors not to fail them, and thus risk losing their fees. The general idea is that if the student fails, it is the professor’s fault, rather like a hospital in which every death is laid at the door of the medical staff. One result of this hot pursuit of the student purse is the growth of courses tailored to whatever is currently in fashion among 20-year-olds. In my own discipline of English, that means vampires rather than Victorians, sexuality rather than Shelley, fanzines rather than Foucault, the contemporary world rather than the medieval one. It is thus that deep-seated political and economic forces come to shape syllabuses. Any English department that focused its energies on Anglo-Saxon literature or the 18th century would be cutting its own throat.

Hungry for their fees, some British universities are now allowing students with undistinguished undergraduate degrees to proceed to graduate courses, while overseas students (who are generally forced to pay through the nose) may find themselves beginning a doctorate in English with an uncertain command of the language. Having long despised creative writing as a vulgar American pursuit, English departments are now desperate to hire some minor novelist or failing poet in order to attract the scribbling hordes of potential Pynchons, ripping off their fees in full, cynical knowledge that the chances of getting one’s first novel or volume of poetry past a London publisher are probably less than the chances of awakening to discover that you have been turned into a giant beetle.

Education should indeed be responsive to the needs of society. But this is not the same as regarding yourself as a service station for neocapitalism. In fact, you would tackle society’s needs a great deal more effectively were you to challenge this whole alienated model of learning. Medieval universities served the wider society superbly well, but they did so by producing pastors, lawyers, theologians, and administrative officials who helped to sustain church and state, not by frowning upon any form of intellectual activity that might fail to turn a quick buck.

Times, however, have changed. According to the British state, all publicly funded academic research must now regard itself as part of the so-called knowledge economy, with a measurable impact on society. Such impact is rather easier to gauge for aeronautical engineers than ancient historians. Pharmacists are likely to do better at this game than phenomenologists. Subjects that do not attract lucrative research grants from private industry, or that are unlikely to pull in large numbers of students, are plunged into a state of chronic crisis. Academic merit is equated with how much money you can raise, while an educated student is redefined as an employable one. It is not a good time to be a paleographer or numismatist, pursuits that we will soon not even be able to spell, let alone practice.

The effects of this sidelining of the humanities can be felt all the way down the educational system in the secondary schools, where modern languages are in precipitous decline, history really means modern history, and the teaching of the classics is largely confined to private institutions such as Eton College. (It is thus that the old Etonian Boris Johnson, the mayor of London, regularly lards his public declarations with tags from Horace.)

It is true that philosophers could always set up meaning-of-life clinics on street corners, or modern linguists station themselves at strategic public places where a spot of translation might be required. In general, the idea is that universities must justify their existence by acting as ancillaries to entrepreneurship. As one government report chillingly put it, they should operate as "consultancy organisations." In fact, they themselves have become profitable industries, running hotels, concerts, sporting events, catering facilities, and so on.

If the humanities in Britain are withering on the branch, it is largely because they are being driven by capitalist forces while being simultaneously starved of resources. (British higher education lacks the philanthropic tradition of the United States, largely because America has a great many more millionaires than Britain.) We are also speaking of a society in which, unlike the United States, higher education has not traditionally been treated as a commodity to be bought and sold. Indeed, it is probably the conviction of the majority of college students in Britain today that higher education should be provided free of charge, as it is in Scotland; and though there is an obvious degree of self-interest in this opinion, there is a fair amount of justice in it as well. Educating the young, like protecting them from serial killers, should be regarded as a social responsibility, not as a matter of profit.

I myself, as the recipient of a state scholarship, spent seven years as a student at Cambridge without paying a bean for it. It is true that as a result of this slavish reliance on the state at an impressionable age I have grown spineless and demoralized, unable to stand on my own two feet or protect my family with a shotgun if called upon to do so. In a craven act of state dependency, I have even been known to call upon the services of the local fire department from time to time, rather than beat out the blaze with my own horny hands. I am, even so, willing to trade any amount of virile independence for seven free years at Cambridge.

It is true that only about 5 percent of the British population attended university in my own student days, and there are those who claim that today, when that figure has risen to around 50 percent, such liberality of spirit is no longer affordable. Yet Germany, to name only one example, provides free education to its sizable student population. A British government that was serious about lifting the crippling debt from the shoulders of the younger generation could do so by raising taxes on the obscenely rich and recovering the billions lost each year in evasion.

It would also seek to restore the honorable lineage of the university as one of the few arenas in modern society (another is the arts) in which prevailing ideologies can be submitted to some rigorous scrutiny. What if the value of the humanities lies not in the way they conform to such dominant notions, but in the fact that they don’t? There is no value in integration as such. In premodern times, artists were more thoroughly integrated into society at large than they have been in the modern era, but part of what that meant was that they were quite often ideologues, agents of political power, mouthpieces for the status quo. The modern artist, by contrast, has no such secure niche in the social order, but it is precisely on this account that he or she refuses to take its pieties for granted.

Until a better system emerges, however, I myself have decided to throw in my lot with the hard-faced philistines and crass purveyors of utility. Somewhat to my shame, I have now taken to asking my graduate students at the beginning of a session whether they can afford my very finest insights into literary works, or whether they will have to make do with some serviceable but less scintillating comments.

Charging by the insight is a distasteful affair, and perhaps not the most effective way of establishing amicable relations with one’s students; but it seems a logical consequence of the current academic climate. To those who complain that this is to create invidious distinctions among one’s students, I should point out that those who are not able to hand over cash for my most perceptive analyses are perfectly free to engage in barter. Freshly baked pies, kegs of home-brewed beer, knitted sweaters, and stout, handmade shoes: All these are eminently acceptable. There are, after all, more things in life than money.

Terry Eagleton is a distinguished visiting professor of English literature at the University of Lancaster. He is the author of some 50 books, including How to Read Literature (Yale University Press, 2013).

Version control, collaborative editing and undo

23 April 2015 - 7:00am

 - 22 Apr 2015

Collaborative editing, undo/redo and version control are all instances of the same problem. They are also all legendarily hard to get right. With that in mind, I would like to have more eyes on the design I'm proposing for Eve.


The standard solution to any hard problem is to find someone who solved it already and steal their answer. Let's look at existing distributed version-control systems and collaborative editors.

The hardest problem that a DVCS has to solve is figuring out how to take changes made in one context and apply them in another context. This is a hard problem because most of the important semantic information is missing - while the user is thinking about changes like 'inline function foo into function bar' the DVCS can only see changes like 'add this text at line 71'.

This difficulty is compounded by recording changes after the fact (by diffing) rather than as they are made. Detecting even simple operations like moving a function into another file has to rely on heuristics about textual similarity instead of just observing the cut and paste.

With this kind of information the problem isn't even well-defined. There is no guarantee that merging two good branches will result in a project that even parses, let alone compiles and passes tests. All the widely used tools settle for heuristics that try to balance reducing surprise with reducing manual effort. In practice, predictable merges are valued more than clever merges.

In Git and Mercurial the input to these heuristics is the two versions to be merged and their most recent common ancestor (if there are multiple candidates they can be recursively merged).

Darcs additionally considers the chain of patches that led to each version. It transforms individual patches one by one in order to be able to apply them in the context of the other branch. The upside of this bag of patches model is that it makes cherry-picking and rebasing easy. The downside is that it occasionally hits dramatically slow cases and doesn't track the project history.

Operational Transform algorithms solve collaborative editing with a similar approach. Each editor broadcasts changes to the text as they happen and transforms incoming changes to apply to the current text. The editors do not track the merge history and it is very hard to prove that every editor will eventually reach the same state. In fact, Randolph et al. found that every proposed algorithm in the academic literature can fail to converge and that guaranteeing convergence is not possible without additional information.

Treedoc and various other CRDTs provide this additional information in the form of hidden identity tokens for each chunk of text. The change events refer to these tokens rather than to line and column numbers, which makes it easy to apply changes in any order. The proof of convergence is simple and, more importantly, the algorithm is obvious once you have seen the identity tokens. Practical implementations are tricky though since the tokens can massively inflate the memory required to store the document.
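A toy Python sketch of the idea (this is not Treedoc itself, and tie-breaking between concurrent inserts after the same token is omitted for brevity): change events name stable identity tokens rather than line/column positions, so they can be applied in any order.

```python
import uuid

# The document is a sequence of (token, char) pairs; tokens never change,
# which is also why practical CRDTs pay a real memory cost per chunk of text.
doc = []

def local_insert(index, char):
    """Insert locally and emit a change event that references stable tokens."""
    token = uuid.uuid4().hex
    left = doc[index - 1][0] if index > 0 else None
    doc.insert(index, (token, char))
    return {"op": "insert", "token": token, "char": char, "after": left}

def apply_remote(event):
    """Apply a remote insert by looking up its 'after' token, not a position.
    The lookup works no matter what has been inserted elsewhere meanwhile."""
    if event["op"] == "insert":
        pos = 0
        if event["after"] is not None:
            pos = next(i for i, (t, _) in enumerate(doc) if t == event["after"]) + 1
        doc.insert(pos, (event["token"], event["char"]))

e1 = local_insert(0, "H")
e2 = local_insert(1, "i")
# A remote edit made against an older version still lands in the right place:
apply_remote({"op": "insert", "token": "tok-x", "char": "e", "after": e1["token"]})
print("".join(c for _, c in doc))  # Hei
```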

So, things to think about when designing our solution:

  • Recording changes as they happen is easier than inferring them after the fact

  • Preserving history - the context in which a change was made - is necessary for merging correctly

  • Having stable identities reduces the amount of context needed to understand a change

  • Being predictable is more important than being smart


Eve is a functional-relational language. Every input to an Eve program is stored in one of a few insert-only tables. The program itself consists of a series of views written in a relational query language. Some of these views represent internal state. Others represent IO that needs to be performed. Either way there is no hidden or forgotten state - the contents of these views can always be calculated from the input tables.
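A minimal Python sketch of that model (hypothetical and much simplified; not Eve's actual implementation): inputs are append-only facts, and a "view" is just a pure function over them, so the program's state can always be recomputed from the inputs.

```python
# Insert-only input table of (table_name, row) facts -- nothing is ever
# updated or deleted in place, so there is no hidden or forgotten state.
inputs = []

def insert(table, row):
    inputs.append((table, row))

def view_totals():
    """A 'view': derived entirely from the inputs, never stored mutably."""
    totals = {}
    for table, row in inputs:
        if table == "orders":
            totals[row["user"]] = totals.get(row["user"], 0) + row["amount"]
    return totals

insert("orders", {"user": "alice", "amount": 3})
insert("orders", {"user": "alice", "amount": 4})
print(view_totals())  # {'alice': 7}
```

A real implementation would update such views incrementally rather than rescanning the inputs, but the invariant is the same: view contents are always a function of the input tables.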

The code for these views is stored similarly. Every change made in the editor is inserted into an input table. The compiler reads these inputs and emits query plans which will calculate the views.

Eve is designed for live programming. As the user makes changes, the compiler is constantly re-compiling code and incrementally updating the views. The compiler is designed to be resilient and will compile and run as much of the code as possible in the face of errors. The structural editor restricts partially edited code to small sections, rather than rendering entire files unparseable. The pointer-free relational data model and the timeless views make it feasible to incrementally compute the state of the program, rather than starting from scratch on each edit.

We arrived at this design to support live programming but these properties also help with collaborative editing. In particular:

  • Tables are unordered, so inputs can be inserted in any order without affecting the results

  • The editor assigns unique ids to every editable object on creation, so changes require very little context

  • Partial edits and merge conflicts only prevent the edited view from running, not the whole program

  • Runtime errors only prevent data flow through that view, rather than exiting the program

  • If all users are editing on the same server they can share runtime state


As a thought experiment, let's suppose we connect two editors by just unioning their input tables. What would go wrong?

Firstly, the compiler crashes. While each independent editor respects the invariants that the compiler relies on, the union of their actions might not. For example, each editor might set a different human-readable name for some view, breaking the unique-key invariant on the human-readable-name table.

As a first pass, we can resolve this with last-write-wins. Both users share a server and the server is authoritative. For online collaborative editing this is actually good enough - we are able to write programs with multiple editors without problems.

To support offline editing, version control and undo/redo we need to be more careful about what we mean by 'last write'. We can track the causal history of each change. When two conflicting changes are causally ordered we pick the later change. If the changes are concurrent we tag it as a merge conflict and disable that view until the conflict is resolved.
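One plausible way to make "last write" precise is to attach a vector clock to each change; the sketch below (a hypothetical illustration, not Eve's code) picks the causally later of two ordered changes and flags concurrent ones as conflicts.

```python
def happens_before(a, b):
    """True if vector clock a is causally before clock b."""
    keys = set(a) | set(b)
    return all(a.get(k, 0) <= b.get(k, 0) for k in keys) and a != b

def resolve(change1, change2):
    """Causally ordered: later change wins. Concurrent: flag a conflict."""
    c1, c2 = change1["clock"], change2["clock"]
    if happens_before(c1, c2):
        return change2
    if happens_before(c2, c1):
        return change1
    return {"conflict": (change1, change2)}  # disable the view, warn the user

older = {"value": "viewA", "clock": {"alice": 1}}
newer = {"value": "viewB", "clock": {"alice": 1, "bob": 1}}   # made after seeing older
concurrent = {"value": "viewC", "clock": {"alice": 0, "carol": 2}}  # never saw newer

print(resolve(older, newer)["value"])            # viewB: causally later, it wins
print("conflict" in resolve(newer, concurrent))  # True: left for the user to fix
```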

The difficult question is when do two changes conflict? There is no 'correct' answer to this question - merging is an under-defined problem. In any VCS you can make code changes that are individually valid, will be merged automatically and will break in the final program (a simple example in dynamic languages is defining two functions with the same name). Existing tools simply try to guarantee that the user is not surprised by the result and that no information is lost. We will add an extra constraint - the result will always compile (even if we have to emit warnings and disable some views).

This last constraint gives us our answer - two changes conflict when applying them both would break an invariant and crash the compiler (this is remarkably similar to I-confluence). The invariants fall into a few categories:

  • Types: Reject any ill-typed changes and warn the user.

  • Unique keys: If one change is causally later than all other changes then it wins. If there are multiple changes which are not ancestors of any other change, disable the view and warn the user.

  • Foreign keys: Replace the missing value with a default value and warn the user.

  • Ordering: Instead of using consecutive integers, use an ordered CRDT. Resolve ties arbitrarily but consistently (e.g. by comparing the items' uuids).

To summarise: most changes don't conflict, some can be resolved by recording additional information, the remainder are flagged as conflicts and are not compiled until they are fixed.
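As an illustrative sketch (not the editor's actual implementation, and with hypothetical names), the unique-key rule can be expressed over changes that each carry the set of change ids they causally follow:

```javascript
// Sketch: resolving a unique-key conflict by causal order.
// A change wins outright if every other change is among its ancestors;
// otherwise the changes are concurrent and we flag a merge conflict.
function resolveUniqueKey(changes) {
  const winner = changes.find(c =>
    changes.every(other => other.id === c.id || c.ancestors.has(other.id))
  );
  if (winner) return { status: "resolved", value: winner.value };
  // Concurrent changes: disable the view until the conflict is fixed.
  return { status: "conflict", candidates: changes.map(c => c.value) };
}

// 'later' causally follows 'earlier', so it wins:
const earlier = { id: "I0", ancestors: new Set(), value: "Bobs Sandwiches" };
const later = { id: "I2", ancestors: new Set(["I0", "I1"]), value: "Bobs Burgres" };
console.log(resolveUniqueKey([earlier, later]).value); // "Bobs Burgres"

// Two concurrent edits to the same key are a merge conflict:
const a = { id: "I3", ancestors: new Set(["I0", "I1", "I2"]), value: "A" };
const b = { id: "I4", ancestors: new Set(["I0", "I1", "I2"]), value: "B" };
console.log(resolveUniqueKey([a, b]).status); // "conflict"
```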

I glossed over the recording of causal history. This is a core difference in philosophy between darcs and git. In darcs, the history is a mutable bag of patches and is easily rearranged. This makes features like cherry-picking easy to use and implement. In git, the history is an immutable DAG of commits and is hard to change. This is useful for auditing and bisecting changes. The choice of how we record causal history has important ramifications for how we implement undo/redo.

I really wanted undo to behave like undo-tree, where undo moves up the tree of changes, editing creates a branch and redo moves down the current branch. Unfortunately, I can't reconcile this with collaborative editing, where undo should only undo operations made by the same user (imagine Alice is off working in a different section of the program and hits undo, undoing Bob's last change mid-edit). To undo something that is not on the tip, git offers revert and rebase. Rebase creates an entirely new chain of commits, which causes confusing merges if someone else is concurrently working on the same branch. Revert doesn't allow for undo-tree-like behaviour.

I'm considering an approach that gives us both the easy cherry-picking of darcs' bag of patches model and the auditable history of git. It's able to be simpler than both because we are solving an easier problem - the changes we are merging contain much more information about intent than raw text diffs.

There are three kinds of data tracked by the system:

  • a patch has a unique id and a set of input tuples to be added
  • an insert has a unique id, a set of parent commits and names a patch to be applied
  • a delete has a unique id, a set of parent commits and names an insert to be removed

The inserts and deletes form a causal history DAG. Collecting all the inserts and removes between the current tip and the root gives you an observed-removed set of patches. The union of these patches is passed to the compiler, which flags any merge conflicts.
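A minimal sketch of that collection step, with hypothetical names, might look like this: removes cancel the specific inserts they name, and the inputs of every surviving patch are unioned and handed to the compiler.

```javascript
// Sketch: deriving the effective input set from an observed-removed
// set of patches. Inserts add a patch; removes cancel a named insert.
function effectiveInputs(history, patches) {
  const removedInserts = new Set(
    history.filter(op => op.kind === "remove").map(op => op.insert)
  );
  const liveInserts = history.filter(
    op => op.kind === "insert" && !removedInserts.has(op.id)
  );
  // Union the input tuples of every surviving patch.
  return liveInserts.flatMap(op => patches[op.patch]);
}

const patches = {
  P0: [{ table: "title", text: "Bobs Sandwiches" }],
  P1: [{ table: "background", src: "burger.jpg" }],
};
const history = [
  { kind: "insert", id: "I0", parents: [], patch: "P0" },
  { kind: "insert", id: "I1", parents: ["I0"], patch: "P1" },
  { kind: "remove", id: "R0", parents: ["I1"], insert: "I1" },
];
// Only P0's tuples survive: the background insert I1 was removed.
console.log(effectiveInputs(history, patches));
```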


Suppose Alice and Bob are collaborating together on a web page. Alice sets the title of the page to 'Bobs Sandwiches'. Bob sees this, sets the background to 'burger.jpg' and edits the title to read 'Bobs Burgres'.

Alice: Patch{id=P0, inputs=[Title{text="Bobs Sandwiches"}]}
Alice: Insert{id=I0, parents=[], patch=P0}
Bob: Patch{id=P1, inputs=[Background{src="burger.jpg"}]}
Bob: Insert{id=I1, parents=[I0], patch=P1}
Bob: Patch{id=P2, inputs=[Title{text="Bobs Burgres"}]}
Bob: Insert{id=I2, parents=[I1], patch=P2}

There can only be one title. I0 is an ancestor of I2 so 'Bobs Burgres' wins the conflict.

Now Alice and Bob both notice the typo in the title and try to fix it at the same time.

Alice: Patch{id=P3, inputs=[Title{text="Bobs Sandwiches and Burgers"}]}
Alice: Insert{id=I3, parents=[I2], patch=P3}
Bob: Patch{id=P4, inputs=[Title{text="Bobs Burgers"}]}
Bob: Insert{id=I4, parents=[I2], patch=P4}

These inserts are concurrent - neither is an ancestor of the other - so the editor displays a merge conflict warning. Alice resolves the conflict with a compromise:

Alice: Patch{id=P5, inputs=[Title{text="Bobs Burgers (and Sandwiches)"}]}
Alice: Insert{id=I5, parents=[I3, I4], patch=P5}

Every other insert is an ancestor of I5 so the new title wins all conflicts.

Meanwhile, Bob is now tired of the whole thing and hits undo twice:

Bob: Remove{id=R0, parents=[I5], insert=I4}
Bob: Remove{id=R1, parents=[R0], insert=I2}

This removes P2 and P4 from the set of patches that the compiler sees. Removing P4 has no effect, because it has already been superseded by P5. Removing P2 removes the burger background.


Redo is still tricky. Suppose Bob liked the background so he panics and mashes redo.

Bob: Insert{id=I6, parents=[R1], patch=P2}
Bob: Insert{id=I7, parents=[I6], patch=P4}

This brings back the background, but also makes P4 the new winner of the title conflict. This is probably pretty surprising to Bob. One possible solution is for Bob's undo to skip P4 in the first place, because P4 has lost a conflict and has no effect on the current state.

Another potential problem is that we might automatically merge changes which are individually innocent but together cause some unexpected effect. This is possible in any VCS but we automatically merge a much higher percentage of changes. We may want to add a merge review which highlights sections that may need more attention.

I haven't yet figured out how to handle metadata such as branch pointers. If I make edits offline and then reconnect, should other users in the same session receive my changes or should I be booted into a new session? How do we undo changes to metadata (eg accidentally overwriting a branch)? I'm considering storing metadata in a branch of its own so that changes are recorded and can be undone.

On problems with threads in Node.js

23 April 2015 - 7:00am

"Wait a moment!", you might say, "Threads in node.js? That's preposterous! Everyone knows node.js is single-threaded!". And you would be right, to some extent. True, all the JavaScript code that you write will indeed run on a single thread. However, your JavaScript code is not everything node is dealing with. Without revealing too much just yet, let me show you an example.

The following code does very little. It reads the contents of the current directory three times, ignores the results and simply prints out how long, from the beginning of the program, it took to reach the callback for each particular iteration.

var fs = require('fs');
var util = require('util');

var start = process.hrtime();
for (var i = 0; i < 3; ++i) {
  (function (id) {
    fs.readdir('.', function () {
      var end = process.hrtime(start);
      console.log(util.format('readdir %d finished in %ds', id, end[0] + end[1] / 1e9));
    });
  })(i);
}

Nothing out of the ordinary here. Each iteration took roughly the same amount of time to reach the callback. However, watch what happens if we double the number of iterations:
readdir 1 finished in 1.003170344s
readdir 0 finished in 1.052704191s
readdir 2 finished in 1.058100525s
readdir 3 finished in 1.060514229s
readdir 4 finished in 2.003446385s
readdir 5 finished in 2.007682862s

Surely, this cannot be right, can it? The last two calls took twice as much time to finish as the rest. But it is right and there is a very good reason for this behaviour. I hope I've piqued your interest; the explanation will follow shortly.
But first…

I cheated

Not much, mind you, but cheated, nonetheless. You might have wondered why, on any modern PC, reading the contents of a directory would take more than a second. And herein lies the cheat. I made it so. I prepared a small shared library that deliberately delays the operation by 1 second.

#define _GNU_SOURCE
#include <dlfcn.h>
#include <unistd.h>
#include <dirent.h>

int scandir64(const char *dirp, struct dirent64 ***namelist,
              int (*filter)(const struct dirent64 *),
              int (*compar)(const struct dirent64 **, const struct dirent64 **))
{
    int (*real_scandir)(const char *dirp, struct dirent64 ***namelist,
                        int (*filter)(const struct dirent64 *),
                        int (*compar)(const struct dirent64 **, const struct dirent64 **));
    real_scandir = dlsym(RTLD_NEXT, "scandir64");

    sleep(1);
    return real_scandir(dirp, namelist, filter, compar);
}

Nothing fancy here, the code simply sleeps for a second before calling the actual scandir() function. After compiling it to a shared library, I just ran node through it.

$> LD_PRELOAD=./scandir.so node index.js

The only purpose of this modification was to prepare a minimal example showcasing the issue with consistent results. Without it, the original sample runs in a fraction of a second and you cannot really spot the problem.

With that out of the way, we can move on to actually explaining what is going on.

A look inside

Some types of operations, such as file system access or networking, are expected to take orders of magnitude more CPU cycles to complete than, say, RAM access, especially if we combine them into larger functions – an example of which could be node’s fs.readFile(), reading contents of an entire file. As you might imagine, had this function not been asynchronous, trying to read several gigabytes of data at once (which, in itself, doesn’t seem like the best idea but that’s not the point) would’ve left our application completely unresponsive for a noticeable period of time, which obviously would have been unacceptable.

But since it is asynchronous, everything is fine and dandy. We're free to continue doing whatever it is that we're doing while the file we wanted loads itself into memory. And once that's done, we get a gentle reminder in the form of a callback. But how did that happen? To the best of my knowledge, node.js is not powered by magic and fairy dust, and things don't just get done on their own. The answer should hardly be a mystery at this point, as both the title and the introduction reveal it – it's threads.

There are many parts that make node.js whole. The one we’re particularly interested in today is a library providing it with its asynchronous I/O – libuv. Close to the bottom of the list of its features is the one thing we’re after – the thread pool. At this point, we’re moving away from JavaScript and into the mystical land of C++.

The libuv library maintains a pool of threads that is used by node.js to perform long-running operations in the background, without blocking its main thread. Effectively, deep under the hood, node.js is thread-based, whether you like it or not.

The thread pool is used through submitting a work request to a queue. The work request provides:

  • a function to execute in the separate thread,
  • a structure containing data for that function,
  • a function collecting the results of processing done in the separate thread.

This is a bit of a simplification but it’s not a goal of this article to teach libuv programming.

In general, before submitting the work request you’d convert V8 engine’s JavaScript objects (such as Numbers, Strings) that you received in the function call from JavaScript code to their C/C++ representations and pack them into a struct. V8 is not thread-safe so it’s a much better idea to do this here than in the function we’re going to run. After the function running on the separate thread finishes, libuv will call the second function from the work request with the results of the processing. As this function is being executed back in the main thread, it’s safe to use V8 again. Here we’d wrap the results back into V8 objects and call the JavaScript callback.

Now back to the work queue I mentioned briefly. When at least one thread in the thread pool is idle, the first work request from the queue is assigned to that thread. Otherwise, work requests wait for threads to finish their current tasks. This should start making it clear what was going on in our initial example.

Simplified diagram of execution flow for a function using the thread pool

The default size of libuv’s thread pool is 4. That is the reason why, out of our 6 calls to the fs.readdir() function, two of them finished after two seconds instead of one. Since all threads in the thread pool were busy for a whole second, waiting on that sleep() call, the remaining tasks in the work queue had to wait for one of those threads to finish, then get through their sleep(), to finally end after two seconds.
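The arithmetic above can be modelled with a toy calculation (an illustration, not how libuv actually schedules): assuming identical tasks submitted at once to an otherwise idle pool, task i finishes in batch floor(i / poolSize) + 1.

```javascript
// Toy model: N identical tasks of taskSeconds each, on a pool of
// `threads` workers, finish in batches of `threads` at a time.
function finishTimes(tasks, threads, taskSeconds) {
  return Array.from({ length: tasks }, (_, i) =>
    (Math.floor(i / threads) + 1) * taskSeconds
  );
}

// 6 one-second tasks on the default pool of 4: two tasks wait a full batch.
console.log(finishTimes(6, 4, 1)); // [1, 1, 1, 1, 2, 2]
// With a pool of 5, only one task has to wait.
console.log(finishTimes(6, 5, 1)); // [1, 1, 1, 1, 1, 2]
```

This matches the measured output: four readdir calls around one second, two around two seconds.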

Under normal conditions, without that artificial slowdown, this wouldn't be noticeable at all, of course. But in some cases it might be very noticeable, and it's worth being aware, first, that it can happen at all, and second, how to solve it when it does.

At this moment, it should be noted that not all asynchronous operations are performed through the thread pool. This particular mechanism is used mainly for:

  • handling file system operations, which is, as explained by libuv’s docs, caused by significant disparities in asynchronous file system access APIs between OSes,
  • most importantly for us, user code.

Architecture of libuv – source: http://docs.libuv.org/

So when does that all matter?

Imagine you’re writing an application that heavily utilizes a database. It’s Oracle, so you’ll use the brand new, official Oracle database driver. You expect to run a lot of queries so you decide to use a connection pool and let it create, at most, 20 connections. And you’d think „Fine, up to 20 queries running in parallel should be more than enough”. But you’ll never really get those 20 parallel queries, all because of this. As you can see, query execution submits a work request here. And since you’re running node with default number of threads in thread pool, you will never get past 4 queries running in parallel. Moreover, if your database suffers a slowdown, even parts of your application not making use of database access may become unresponsive as their asynchronous tasks are stuck in the work queue behind database queries.

There is, however, something we can do about it. For one, we could lower the size of the database connection pool. But let's say we don't want to do that. Fortunately, there's another solution. Enter the UV_THREADPOOL_SIZE environment variable. By changing its value we can influence the number of threads that will be available in the thread pool. You can choose any value between the hardcoded limits of 1 and 128. Let's try that out on our sample code:

$> UV_THREADPOOL_SIZE=5 LD_PRELOAD=./scandir.so node index.js
readdir 2 finished in 1.005758445s
readdir 0 finished in 1.046712749s
readdir 3 finished in 1.056222923s
readdir 1 finished in 1.057267272s
readdir 4 finished in 1.05897112s
readdir 5 finished in 2.007336631s

As you can see, it worked. Only one work request had to wait for a thread this time around. You can also change this programmatically, from your node application, by writing to the process.env.UV_THREADPOOL_SIZE variable. Keep in mind that this is very limited, though. You cannot affect the size of the thread pool once it has been created, which happens when the first work request is submitted.

var fs = require('fs');

process.env.UV_THREADPOOL_SIZE = 10; // This will work

// First time the thread pool is required
fs.readdir('.', function () {
  process.env.UV_THREADPOOL_SIZE = 20; // This won't
  fs.readdir('.', function () {
  });
});

To finish off, just to give you an idea of what issues stemming from the usage of the thread pool may arise in actual applications, here is an example of a non-trivial problem I helped solve.

Our application required the ability to terminate currently executing queries on demand. The actual implementation was fairly simple: an extension to an already existing module written mainly in C++. All initial tests indicated that everything worked flawlessly, and the issue was marked as resolved. Soon enough, however, we received a report indicating that the feature occasionally did not work at all.

Sparing you the details of a long investigation, the explanation of the problem turned out to be painfully simple. Since we allowed multiple queries to run in parallel, there were situations where all four worker threads were busy, waiting for four queries to finish executing. And since we had implemented the query-termination functionality as an asynchronous function, the request to terminate a query was patiently waiting in the work queue. The end result was that sometimes we couldn't terminate a query because we had to wait for that very query to finish first.

This particular incident is, in fact, one of the key reasons this article exists at all, as the knowledge I gathered during its resolution was not readily available, and to this day doesn't seem to be, and I felt it was worth sharing.

I hope this article gave you enough insight to spot where such problems may occur in your applications making use of native code. If you feel I’ve left something amiss or would like to know more about some of the topics I mentioned, please let me know in the comments below.

Relationships Are More Important Than Ambition (2013)

23 April 2015 - 7:00am


This month, many of the nation's best and brightest high school seniors will receive thick envelopes in the mail announcing their admission to the college of their dreams. According to a 2011 survey, about 60 percent of them will go to their first-choice schools. For many of them, going away to college will be like crossing the Rubicon. They will leave their families -- their homes -- and probably not return for many years, if at all.

That was journalist Rod Dreher's path. Dreher grew up in the small southern community of Starhill, Louisiana, 35 miles northwest of Baton Rouge. His family goes back five generations there. His father was a part-time farmer and sanitarian; his mother drove a school bus. His younger sister Ruthie loved hunting and fishing, even as a little girl.


But Dreher was different. As a bookish teenager, he was desperate to flee what he considered his intolerant and small-minded town, a place where he was bullied and misunderstood by his own father and sister. He felt more at home in the company of his two eccentric and worldly aunts -- great-great aunts, actually -- who lived nearby. One was a self-taught palm reader. She looked into his hand one day when he was a boy and told him, "See this line? You'll travel far in life." Dreher hoped she was right. When he was 16, he decided to leave home for a Louisiana boarding school with the intention of never looking back.

That decision created a divide between him and his sister Ruthie, who was firmly attached to Starhill. Leaving for boarding school was "the fork in the road for us, the moment in our lives in which we diverged," he writes in his new book, The Little Way of Ruthie Leming: A Southern Girl, a Small Town, and the Secret of a Good Life.

In the book, he describes leaving his Starhill home to pursue a career in journalism -- a career that took him to cities like Baton Rouge, Washington DC, Fort Lauderdale, Dallas, New York, and Philadelphia. He was chasing after a bigger and better career with each move. "I was caught up in a culture of ambition," Dreher told me in an interview.

While Dreher was a dreamer, Ruthie was satisfied with what she had. When Dreher was living in big cities, going to fancy restaurants, carousing with media types, writing film reviews for a living, and traveling to Europe, Ruthie was back home in Louisiana, living down the road from her parents, starting a family of her own, and devoting herself to her elementary school students as a teacher. Ruthie could not understand Dreher's lifestyle. Why would he want to leave home for a journalism career? Wasn't Starhill good enough? Did Rod think he was better than all of them?


These "invisible walls" stood between Ruthie and Dreher when, on Mardi Gras of 2010, Ruthie was unexpectedly diagnosed with terminal lung cancer -- devastating news that ripped through her community "like a cyclone" says Dreher, who was living in Philadelphia at the time. She was a healthy non-smoking 40-year-old, beloved by her students, her neighbors, her three daughters, and her husband. Now, she had about three months to live. She actually lived for nineteen. On September 15, 2011, Ruthie passed away.

Watching her struggle with terminal cancer for 19 months, and seeing her small-town community pour its love into supporting her, was a transformational experience for Dreher. "There are some things that we really cannot do by ourselves," Dreher said. "When Ruthie got sick, there were things that her family could not do -- they couldn't get the kids to school without help, they couldn't get meals on the table without help, they couldn't pay the bill without help. It really took a village to care for my sick sister. The idea that we are self-reliant is a core American myth."

When news spread of Ruthie's cancer, some friends planned an aid concert to raise money for her medical bills. Hundreds of people came together, raising $43,000 for their friend. "This is how it's supposed to be," someone told Dreher that night. "This is what folks are supposed to do for each other."


The conflict between career ambition and relationships lies at the heart of many of our current cultural debates, including the ones sparked by high-powered women like Sheryl Sandberg and Anne-Marie Slaughter. Ambition drives people forward; relationships and community, by imposing limits, hold people back. Which is more important? Just the other week, Slate ran a symposium that addressed this question, asking, "Does an Early Marriage Kill Your Potential To Achieve More in Life?" Ambition is deeply entrenched in the American persona, as Yale's William Casey King argues in Ambition, A History: From Vice to Virtue -- but what are its costs?

In psychology, there is surprisingly little research on ambition, let alone the effect it has on human happiness. But a new study, forthcoming in the Journal of Applied Psychology, sheds some light on the connection between ambition and the good life. Using longitudinal data from the nine-decade-long Terman life-cycle study, which has followed the lives and career outcomes of a group of gifted children since 1922, researchers Timothy A. Judge of Notre Dame and John D. Kammeyer-Mueller of the University of Florida analyzed the characteristics of the most ambitious among them. How did their lives turn out?

The causes of ambition were clear, as were its career consequences. The researchers found that the children who were the most conscientious (organized, disciplined, and goal-seeking), extroverted, and from a strong socioeconomic background were also the most ambitious. The ambitious members of the sample went on to become more educated and at more prestigious institutions than the less ambitious. They also made more money in the long run and secured more high-status jobs.

But when it came to well-being, the findings were mixed. Judge and Kammeyer-Mueller found that ambition is only weakly connected with well-being and negatively associated with longevity.

"There really wasn't a big impact from ambition to how satisfied people were with their lives," Kammeyer-Mueller, a business school professor, told me. At the same time, ambitious people were not miserable either. "People who are ambitious are happy that they have accomplished more in their lives," he says.


When I asked about the connection between ambition and personal relationships, Kammeyer-Mueller said that while the more ambitious appeared to be happier, their happiness could come at the expense of personal relationships. "Do these ambitious people have worse relationships? Are they ethical and nice to the people around them? What would they do to get ahead? These are the questions the future research needs to answer."

Existing research by psychologist Tim Kasser can help address this issue. Kasser, the author of The High Price of Materialism, has shown that the pursuit of materialistic values like money, possessions, and social status -- the fruits of career success -- leads to lower well-being and more distress in individuals. It is also damaging to relationships: "My colleagues and I have found," Kasser writes, "that when people believe materialistic values are important, they...have poorer interpersonal relationships [and] contribute less to the community." Such people are also more likely to objectify others, using them as means to achieve their own goals.

So if the pursuit of career success comes at the expense of social bonds, then an individual's well-being could suffer. That's because community is strongly connected to well-being. In a 2004 study, social scientists John Helliwell and Robert Putnam, author of Bowling Alone, examined the well-being of a large sample of people in Canada, the United States, and in 49 nations around the world. They found that social connections -- in the form of marriage, family, ties to friends and neighbors, civic engagement, workplace ties, and social trust -- "all appear independently and robustly related to happiness and life satisfaction, both directly and through their impact on health."

In Canada and the United States, having frequent contact with neighbors was associated with higher levels of well-being, as was the feeling of truly belonging in a group. "If everyone in a community becomes more connected, the average level of subjective well-being would increase," they wrote.

This may explain why Latin Americans, who live in a part of the world fraught with political and economic problems, but strong on social ties, are the happiest people in the world, according to Gallup. It may also explain why Dreher's Louisiana came in as the happiest state in the country in a major study of 1.3 million Americans published in Science in 2009. This surprised many at the time, but makes sense given the social bonds in communities like Starhill. Meanwhile, wealthy states like New York, New Jersey, Connecticut, and California were among the least happy, even though their inhabitants have ambition in spades; year after year, they send the greatest number of students to the Ivy League.

In another study, Putnam and a colleague found that people who attend religious services regularly are, thanks to the community element, more satisfied with their lives than those who do not. Their well-being was not linked to their religious beliefs or worshipping practices, but to the number of friends they had at church. People with ten or more friends at their religious services were about twice as satisfied with their lives as people who had no friends there.


These outcomes are interesting given that relationships and community pose some challenges to our assumptions about the good life. After all, relationships and community impose constraints on freedom, binding people to something larger than themselves. The assumption in our culture is that limiting freedom is detrimental to well-being. That is true to a point. Barry Schwartz, a psychological researcher based at Swarthmore College, has done extensive research suggesting that too much freedom -- or a lack of constraints -- is detrimental to human happiness.

"Relationships are meant to constrain," Schwartz told me, "but if you're always on the lookout for better, such constraints are experienced with bitterness and resentment."

Dreher has come to see the virtue of constraints. Reflecting on what he went through when Ruthie was sick, he told me that the secret to the good life is "setting limits and being grateful for what you have. That was what Ruthie did, which is why I think she was so happy, even to the end."

Meanwhile, many of his East Coast friends, who chased after money and good jobs, certainly achieved success, but felt otherwise empty and alone. As Dreher was writing his book, one told him, "Everything I've done has been for career advancement ... And we have done well. But we are alone in the world." He added: "Almost everybody we know is like that."


For many years, Ruthie and her mother had a Christmas Eve tradition of visiting the Starhill cemetery and lighting candles on each of the hundreds of graves there. On that first Christmas Eve after Ruthie died, her mother could not bring herself to keep the tradition going. And yet, driving past the cemetery after sunset on that Christmas Eve, Dreher saw sparks of light illuminating the graveyard. Someone else had lit the candles on the graves -- but who? It turns out that a member of their community named Susan took it upon herself to pay that tribute to the departed, including Ruthie.

In the final paragraph of the novel Middlemarch, George Eliot pays another kind of tribute to the dead. Eliot writes, "The growing good of the world is partly dependent on unhistoric acts; and that things are not so ill with you and me as they might have been, is half owing to the number who lived faithfully a hidden life, and rest in unvisited tombs."

In other words, those many millions of people who live in the "unvisited tombs" of the world, though they may not be remembered or known by you and me, are the ones who kept the peace of the world when they were alive. For Dreher, Ruthie is one of those people. She was, he told me, "A completely unfussy, ordinary, neighborly person, who you'd never notice in a crowd, but whose deep goodness and sense of order and compassion saved the day."

Dreher also said of his sister, "What I saw over the course of her 19-month struggle with cancer was the power of a quiet life lived faithfully with love and service to others." While Ruthie, an ordinary person, did not live the kind of life our culture celebrates, she "penetrated deeply into the lives of the people she touched," Dreher told me. "She did not live life on the surface."


What's remarkable, though, is that she was not extraordinary in this regard. Most of the people in the Starhill community were like her in their kindness and compassion, Dreher said.

After Ruthie passed away, Rod decided, with his wife and young children, to put aside their East Coast lives and move back home to Louisiana. They have been there for a year and a half and love it. "Community means more than many of us realize," he says. "It certainly means more than your job."

Why Is Spoofing Bad?

23 April 2015 - 7:00am

Yesterday Navinder Singh Sarao was accused of spoofing S&P 500 E-mini futures in a way that might have contributed to the May 2010 flash crash. Prosecutors and the Commodity Futures Trading Commission claim that Sarao put in lots of big orders to sell futures, never intending to actually trade on those orders, to create the impression that there was a lot of selling interest and thus drive the price down. Here's how the CFTC describes the effect:

Many market participants, relying on the information contained in the Order Book, consider the total relative number of bid and ask offers in the Order Book when making trading decisions. For instance, if the total number of sell orders significantly outweighs the total number of buy orders, market participants may believe a price drop is imminent and trade accordingly. Similarly, if the balance of buy and sell orders changes abruptly, market participants may believe the new orders represent legitimate changes to supply and demand and therefore trade accordingly. Further, many market participants utilize automated trading systems that analyze the market for these types of order imbalances and use that information to determine trading strategies. Consequently, actions in the Order Book can, and do, affect the price of the E-mini S&P.

One reaction that a lot of people have is: Tough luck on them! If you're making your trading decisions based on the trading decisions that you think other people have made, you are a bad guy, and you deserve to be tricked. 

This is a reasonable intuition. John Arnold has endorsed it here at Bloomberg View, arguing that spoofers only harm "front-running" high-frequency traders who try to profit by trading ahead of other legitimate orders:

The battles between spoofers and front-runners are games being played between one computer and another in a tenth the time that it takes the human eye to blink. No human can see these trades, much less react to them in real time. The only party that is touched by the spoofer’s deception is the front-running HFT, whose strategies are harmful to every other market participant.

So ... what do we think? One framework that I often like to use in thinking about market structure is that high-frequency trading is controversial because it might make markets too efficient. This sort of depends on your definition of efficiency, but really simply, look at that CFTC description. If there are a lot of sellers at just above the current price, and not a lot of buyers at just below the current price, then that probably is an indication that sentiment is negative and the market will move down. Eventually some of that flock of sellers will get impatient and agree to sell at just below the current price, and down the price will go. If you recognize that fact faster than everyone else, you can make a bit of money by selling at the current price and buying back when the price drops. That is nice for you, but it also in some strict sense makes the market more efficient: If there are more sellers than buyers at (around) the current price, then the current price is "wrong," and your selling at the current price will help it move to the "correct" price more quickly. And by selling at the wrong price and buying back at the right price, you will make a bit of money to reward you for correcting the market.

That's sort of abstract and unsatisfying but basically true. Markets are more efficient when they move quickly to incorporate information. Order-book information is a kind of information. It is sort of second-degree information -- it reflects not, like, "what has changed in the fundamental value of this thing?", but rather "what has changed in people's perceptions of the fundamental value of this thing?" -- but it's still information, and incorporating it quickly makes the price more right.  This is what traditional market makers do -- if you are a market maker, and all your customers want to sell and none of them want to buy, you'll probably move your price down -- but high-frequency traders tend to do it faster and more mechanically.

Who cares if the price is right? Well, you might want the price to be right if you are an individual investor buying stock in your E*Trade account on your lunch break. You're buying at 12:45 not because you think the stock is unusually undervalued at 12:45, but because it's your lunch break. It would be reassuring if the market price was as accurate as possible, so that you'd know you've got a good chance of getting a fair price. Index funds, which are similarly value-agnostic, might have similar preferences. Accurate prices probably also make spreads smaller, so trading costs for both big and small investors are lower.

On the other hand, if you are a big fundamental investor looking to buy a lot of stock, you probably don't want a fair price. If you're a hedge fund manager who has spent months researching a company and come to the conclusion that its stock is undervalued, and you decide to buy 10 million shares, and you put in an order for 10 million shares at the current price, you will be sad if the price jumps up instantly. The whole point of all your research was to identify unfairly priced stocks; if high-frequency traders can just free-ride on your work by reacting to your order, then that feels like cheating. It feels like "front-running," actually, which is how you'd probably describe it.

The other day I said that "Norway's sovereign wealth fund wishes it could trade giant blocks of stock without impacting the price, which many large asset managers consider to be a fundamental human right." For a big fundamental investor, trading without moving prices is the goal of market structure, so anything that makes prices react more quickly to order information is bad.

And so a game exists in which big fundamental investors try to disguise their intentions so that the market doesn't efficiently incorporate those intentions into the price. They don't just blindly put in orders to buy 10 million shares at the current price. They break their orders into smaller pieces, and build (or rent from their brokers) algorithms to make their big orders harder to spot and less likely to move markets. John Arnold "spent millions of dollars developing a proprietary order-entry system to disguise and conceal strategies from external algorithms."
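As a rough illustration of the slicing idea, here is a minimal sketch of how a 10-million-share parent order might be broken into randomized child orders so the pattern is harder for other algorithms to spot. The slicing rule, jitter parameter, and sizes are invented; real execution algorithms (and certainly Arnold's proprietary system) are far more elaborate.

```python
# Minimal sketch of slicing a large parent order into randomized
# child orders. The jitter scheme is invented for illustration only.
import random

def slice_order(total_shares, n_slices, jitter=0.2, seed=42):
    """Split total_shares into n_slices child orders with randomized
    sizes, so the footprint is harder for other algorithms to spot."""
    rng = random.Random(seed)
    base = total_shares / n_slices
    sizes = [base * (1 + rng.uniform(-jitter, jitter)) for _ in range(n_slices)]
    scale = total_shares / sum(sizes)             # renormalize to the exact total
    children = [int(round(s * scale)) for s in sizes]
    children[-1] += total_shares - sum(children)  # absorb rounding drift
    return children

children = slice_order(10_000_000, 20)
print(len(children), sum(children))  # 20 10000000
```

Sending these children over time, across venues, and at varying prices is what disguising intent amounts to in practice -- which is exactly the behavior that sits uncomfortably close to spoofing's "lying about intent."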

Okay, so, spoofing! Spoofing is about filling the order book with lies. Rather than just reflecting how much people want to buy and sell, and at what prices, the spoofed order book will be a more or less random collection of numbers. It will become uninformative. This is:

  • bad for the high-frequency traders who make markets based on order-book information;
  • bad for people who want prices to be maximally and instantaneously efficient, which may or may not include retail investors, index funds, etc.; and
  • good for people who don't want the market to instantly incorporate order information, which probably includes big fundamental investors.
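A toy example makes the "filling the book with lies" mechanics concrete: layering a wall of large, never-to-be-filled sell orders flips the apparent imbalance even when genuine supply and demand are balanced. All quantities here are invented.

```python
# Toy illustration of why spoofed orders make the book uninformative:
# a layer of big sell orders (meant to be canceled, never filled)
# makes a balanced book look one-sided. All quantities are invented.

def imbalance(bids, asks):
    buy = sum(q for _, q in bids)
    sell = sum(q for _, q in asks)
    return (sell - buy) / (sell + buy)

genuine_bids = [(2099.75, 100), (2099.50, 100)]
genuine_asks = [(2100.00, 100), (2100.25, 100)]
spoof_layer  = [(2100.50, 600), (2100.75, 600)]  # large orders, to be canceled

print(imbalance(genuine_bids, genuine_asks))               # 0.0  -- balanced
print(imbalance(genuine_bids, genuine_asks + spoof_layer)) # 0.75 -- "sellers everywhere"
```

Any strategy that reads the second number as real selling pressure -- an HFT market maker, or a big investor's execution algorithm -- will sell into a demand picture that doesn't exist.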

So when regulators write and aggressively enforce rules against spoofing, they are in a sense favoring high-frequency traders over fundamental investors, sure. They are also making some other regulatory choices -- favoring market efficiency over rewarding research, punishing dishonesty regardless of its broader good or bad effects -- that are debatable but defensible. They are also creating gray areas, by banning a practice -- "spoofing," or lying about your trading intent -- that is dangerously close to another practice -- disguising your trading intent -- that is absolutely normal and ethical and even necessary.

There's a complication to this, though. Those algorithms that big funds use to disguise their intentions and avoid being "front-run" sometimes mimic the decision-making that high-frequency traders use. In particular, they rely on the information in the order book to decide how much to buy or sell, how fast, and at what price. Kipp Rogers:

These days, algorithmic trading tools are used by a wide class of traders. There is an entire industry, possibly larger than that of vanilla HFT, focused on creating and marketing these tools. Tremendous volume is executed via algorithms on behalf of traditional long-term traders. I’m not an expert on such algorithms, but my impression is that they tend to be much less sophisticated than a lot of vanilla HFT, and thus more likely to be tricked by spoofing. 

So while an equilibrium with lots of legalized spoofing might, as Arnold argues, be good for big fundamental investors, in the current equilibrium spoofing is probably bad for them: Their algorithms are often the ones that get spoofed.

Finally, what about spoofing and the flash crash? I obviously don't think that Navinder Singh Sarao caused the flash crash. For one thing, he turned off his spoofing algorithm a few minutes before the crash. Also, as Craig Pirrong puts it, "The complaint alleges that Sarao employed the layering strategy about 250 days, meaning that he caused 250 out of the last one flash crashes." But it's certainly possible that he contributed to it. I've heard a couple of theories on that. One, which I mentioned yesterday, is that his spoofing might have interacted badly with the algorithm that Waddell & Reed -- a big fundamental investor -- was using to sell E-mini futures. This is a story of spoofing tricking fundamental investors, with bad results for everyone. Another theory, which I heard from a high-frequency trader today, is that by spoofing high-frequency traders out of $879,018 -- Sarao's alleged profits on the day of the flash crash -- he might have caused them to hit their loss limits and shut off their own trading algorithms. Without the usual market-makers to provide liquidity, the market would have been more easily spooked than usual, and it would have been a lot easier for it to produce the wrong price. Which, for a few minutes on May 6, 2010, it definitely did.
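The loss-limit theory above can be sketched in a few lines: a market maker tallies its P&L and simply stops quoting once losses breach a limit. The class, the limit, and the trade sizes are invented for illustration; real risk controls are considerably more involved.

```python
# Sketch of the loss-limit "kill switch" theory: a market maker that
# withdraws its quotes once cumulative losses breach a limit.
# The $500k limit and trade P&Ls are invented for illustration.

class MarketMaker:
    def __init__(self, loss_limit):
        self.loss_limit = loss_limit
        self.pnl = 0.0
        self.quoting = True

    def record_trade(self, trade_pnl):
        self.pnl += trade_pnl
        if self.pnl <= -self.loss_limit:
            self.quoting = False  # stop providing liquidity

mm = MarketMaker(loss_limit=500_000)
for loss in (-200_000, -150_000, -175_000):  # spoofed into losing trades
    mm.record_trade(loss)
print(mm.quoting)  # False -- this liquidity provider has shut off
```

Multiply that across many market makers at once and you get a market with unusually thin liquidity, which is the state in which prices can go badly wrong in a hurry.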

To contact the author on this story:
Matt Levine at mlevine51@bloomberg.net

To contact the editor on this story:
Tobin Harshaw at tharshaw@bloomberg.net