Hacker News from Y Combinator

Links for the intellectually curious, ranked by readers. // via fulltextrssfeed.com
Updated: 4 hours 4 min ago

Parents with annual family incomes below $125,000 will pay no tuition

4 hours 4 min ago
Stanford Report, March 27, 2015

Stanford has extended undergraduate admission offers to the Class of 2019 and announced an increase in financial aid. Now, parents with annual family incomes below $125,000 and typical assets will be expected to pay no Stanford tuition.

Stanford University has offered admission to 2,144 students, including 742 applicants who were accepted last December through the early action program, the Office of Undergraduate Admission and Financial Aid announced today.

In addition, Stanford announced that it is expanding financial aid by increasing the income thresholds at which parents are not expected to contribute toward educational costs.

Under the new policy, Stanford will expect no parental contribution toward tuition from parents with annual incomes below $125,000 – previously $100,000 – and typical assets. And there will be zero parental contribution toward tuition, room or board for parents with annual incomes below $65,000 – previously $60,000 – and typical assets.

"Our highest priority is that Stanford remain affordable and accessible to the most talented students, regardless of their financial circumstances," said Provost John Etchemendy. "Our generous financial aid program accomplishes that, and these enhancements will help even more families, including those in the middle class, afford Stanford without going into debt. Over half of our undergraduates receive financial aid from Stanford, and we are pleased that this program will make it even easier for students to thrive here."

Admits to Class of 2019

The Class of 2019 was selected from 42,487 candidates, the largest applicant pool in Stanford's history. The admitted students come from 50 states and 77 countries.

Of the admitted class, 16 percent are first-generation college students.

"We are honored by the interest in Stanford and the experiences shared by all prospective students through the application process," said Richard Shaw, dean of admission and financial aid. "The young people admitted to the Class of 2019 will engage their undergraduate years at Stanford with energy and initiative. Their contributions will impact the world in immeasurable ways. We are thrilled to communicate the good news to these accomplished students. The opportunities at Stanford are limitless, and our newly enhanced financial support makes these opportunities more accessible than ever before."

Students admitted under the early and regular decision admission program have until May 1 to accept Stanford's offer.

Expanded financial aid

Stanford has long been committed to need-blind admissions for U.S. students, supported by a financial aid program that meets the demonstrated financial need of all admitted undergraduate students.

Since 2008-09, Stanford has provided two simple benchmarks that make it easy for prospective students to understand the possibilities for getting financial support from Stanford. These two benchmarks are being updated for all undergraduates for the 2015-16 year, with no parental contribution toward tuition expected for those with annual incomes below $125,000 and typical assets, and no parental contribution toward tuition, room or board expected for those below $65,000 with typical assets. Scholarship or grant funds will be provided to cover these costs in lieu of a parental contribution.

In either case, students will still be expected to contribute toward their own educational expenses from summer income, savings and part-time work during the school year. Students are expected to contribute at least $5,000 per year from these sources but are not expected to borrow to make the contribution.

Currently, 77 percent of Stanford undergraduates leave the university at graduation with no student debt.

Families with incomes at higher levels, typically up to $225,000, may also qualify for financial assistance, especially if more than one family member is enrolled in college. Financial aid offers vary by family, but the financial aid expansion for 2015-16 will allow Stanford to reduce the expected parental contribution for many families at these higher income levels.

Annual costs for a typical Stanford student total roughly $65,000 before financial aid.

"This expansion of the financial aid program is a demonstration of Stanford's commitment to access for outstanding students from all backgrounds – including not only those from the lowest socioeconomic status, but also middle- and upper-middle-class families who need our assistance as well," said Karen Cooper, associate dean and director of financial aid.

Richard H. Shaw, dean of admission and financial aid: (650) 723-2091, rhshaw@stanford.edu

Suburban sprawl is stifling the US economy

4 hours 4 min ago

As the nation's metro areas grow ever larger, the sprawling suburbs can come with all sorts of problems: endless commutes, growing pollution, and unwalkable neighborhoods, to name a few. Now, a pair of studies released this week suggests that America's evolving cities are also bad for the economy. Not only are mammoth, spread-out metro areas economically wasteful, but they're also hurting Americans' job prospects as work disperses out into the suburbs.

One new report finds that suburban sprawl in US cities costs the country more than $1 trillion a year. (That's huge; as context, consider that the US annual GDP is around $17 trillion.) According to the London School of Economics Cities project and the Victoria Transport Policy Institute, individuals and cities in the US pay $1 trillion to $1.1 trillion in additional infrastructure, public works, driving, and health costs as a result of these massive metro areas.

The additional spending on road building, parking, and police and emergency services that comes with a spread-out metro area really adds up, they write: sprawl increases infrastructure and public services costs by 10 to 40 percent, according to the report.

It's a huge cost to try to calculate, of course, and considering various cost estimates (for road costs and land values, etc.) that author Todd Litman includes, the $1 trillion figure is perhaps best seen as just one attempt at a ballpark estimate. For his part, Litman writes that his $1 trillion figure doesn't even include cost estimates for other problems sprawl can cause — lower social mobility and more traffic accidents, for example.

More jobs, but fewer opportunities

Yet another study released this week suggests that jobs migrating to the suburbs keep people from working, and that the problem is especially pronounced in America's poorest neighborhoods.

The number of jobs within typical commuting distance to residents of major metro areas fell by 7 percent between 2002 and 2012, according to a new study from the Brookings Institution.

The situation was even worse for poor and minority neighborhoods. While the number of jobs nearby fell by 7 percent for all large-metro-area residents, it fell by 14 percent for blacks and 17 percent for Hispanics, as well as 17 percent for poor neighborhoods.


One big contributor to that trend in hard-to-reach jobs also appears to be sprawl. The distance between people and work grew partly because in large metro areas nationwide, jobs both moved to the suburbs and spread out more, Brookings explains. Between 2002 and 2012, the number of jobs in urban areas fell by nearly 2 percent, but the number in suburban areas grew by more than 4 percent. However, the density of jobs in suburban areas fell, meaning that even as the number of jobs grew, they also spread even farther apart.

This paper deals in the broader phenomenon of the distance between workers and jobs, and not sprawl, per se, but it's true that the trend of far-flung jobs also grew over this time period. Other Brookings research found that between 2000 and 2010, the share of jobs located within three miles of downtown cities declined in 91 out of the 100 largest metro areas. Meanwhile, the share of jobs located 10 or more miles away from those city centers grew in 85 out of 100 metro areas.

This has implications far beyond long commute times; it matters because living farther from jobs tends to mean a longer job search, as researchers found in a 2014 NBER working paper. It also means longer stretches of unemployment — so even as jobs came back during the recent recovery, they didn't all return in places where unemployed Americans could reach them.

The ladder gets even tougher to climb

It's not just that sprawl can help keep unemployment high and make the economy less efficient; it also may play a part in the growing, entrenched divide between the richest and poorest Americans.

In a landmark 2014 study on social mobility across US metro areas, researchers Raj Chetty, Emmanuel Saez, Nathaniel Hendren, and Patrick Kline found that cities with less sprawl (as measured by shorter commute times) tended to have higher social mobility; that is, in those cities, children born to poor parents were more likely to move up the income ladder. The University of Utah's Metropolitan Research Center reinforced this in a 2014 report. Reid Ewing and Shima Hamidi created an index score measuring a city's compactness. And that score had a marked relationship to how well kids fare later in life:

"For every 10 percent increase in an index score, there is a 4.1 percent increase in the probability that a child born to a family in the bottom quintile of the national income distribution reaches the top quintile of the national income distribution by age 30," they wrote.

The researchers stop short of showing a direct causal relationship between sprawl and mobility, but their research contains a few clues to how this relationship might happen. For example, they find there's more economic opportunity in compact cities, as well as lower housing and transportation costs.

All of this points to an important policy dynamic: national-level politicians may tout their social-mobility and job-creation agendas ahead of the 2016 elections, but that can only go so far, particularly as cities spread out and become more economically segregated. State- and local-level policies may be the key to reducing economic gaps — promoting better zoning, affordable housing, and public transit may be the best route to combat sprawl, and it's something state legislators and mayors may be best equipped to do.

Notes from Facebook's Developer Infrastructure at Scale F8 Talk

4 hours 4 min ago

Any time Facebook talks about technical matters I tend to listen. They have a track record of demonstrating engineering leadership in several spaces. And, unlike many companies that just talk, Facebook often gives others access to those ideas via source code and healthy open source projects. It's rare to see a company operating on the frontier of the computing field provide so much insight into their inner workings. You can gain so much by riding their coattails and following their lead instead of clinging to and cargo-culting from the past.

The Facebook F8 developer conference was this past week. All the talks are now available online. I encourage you to glance through the list of talks and watch whatever is relevant to you. There's really a little bit for everyone.

Of particular interest to me is the Big Code: Developer Infrastructure at Facebook's Scale talk. This is highly relevant to my job role as Developer Productivity Engineer at Mozilla.

My notes for this talk follow.

"We don't want humans waiting on computers. We want computers waiting on humans." (This is the common theme of the talk.)

In 2005, Facebook was on Subversion. In 2007 moved to Git. Deployed a bridge so people worked in Git and had distributed workflow but pushed to Subversion under the hood.

New platforms over time. Server code, iOS, Android. One Git repo per platform/project -> 3 Git repos. Initially no code sharing, so no problem. Over time, code sharing between all repos. Lots of code copying and confusion as to what is where and who owns what.

Facebook is weeks away from completing their migration to consolidate big three repos to a Mercurial monorepo.

Reasons:

  1. Easier code sharing.
  2. Easier large-scale changes. Rewrite the universe at once.
  3. Unified set of tooling.

Facebook employees run >1M source control commands per day. >100k commits per week. VCS tool needs to be fast to prevent distractions and context switching, which slow people down.

Facebook implemented sparse checkout and shallow history in Mercurial. Necessary to scale distributed version control to large repos.

Quote from Google: "We're excited about the work Facebook is doing with Mercurial and glad to be collaborating with Facebook on Mercurial development." (Well, I guess the cat is finally out of the bag: Google is working on Mercurial. This was kind of an open secret for months. But I guess now it is official.)

Push-pull-rebase bottleneck: if you rebase and push and someone beats you to it, you have to pull, rebase, and try again. This gets worse as commit rate increases and people do needless legwork. Facebook has moved to server-side rebasing on push to mostly eliminate this pain point. (This is part of a still-experimental feature in Mercurial, which should hopefully lose its experimental flag soon.)
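
To make the bottleneck concrete, here is a toy model of my own (not Facebook's or Mercurial's actual implementation). The `Server` class, its integer `head`, and both push methods are hypothetical, and merge conflicts are ignored entirely. It only counts how many round trips a developer needs when the server rejects pushes that are not based on the current tip, versus a server that rebases on push.

```python
class Server:
    """Toy repository: the tip is just an integer revision number."""

    def __init__(self):
        self.head = 0

    def push_strict(self, base):
        """Reject the push unless the commit was built on the current tip."""
        if base != self.head:
            return False
        self.head += 1
        return True

    def push_with_rebase(self, base):
        """Accept any base; the server replays the commit onto the tip itself."""
        self.head += 1
        return True


def push_with_client_rebase(server, base, interference):
    """Return how many push attempts a developer needs in strict mode.

    `interference` yields how many commits from other developers land
    just before each attempt (an assumption of this simulation).
    """
    attempts = 0
    for others in interference:
        server.head += others          # someone else beats us to it
        attempts += 1
        if server.push_strict(base):
            return attempts
        base = server.head             # pull, rebase onto the new tip, retry
    raise RuntimeError("ran out of simulated attempts")
```

With interference of `[2, 1, 0]` the developer pushes three times before landing. With `push_with_rebase`, one push always suffices in this model, no matter how busy the repo is.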

Starting at 13:00, we have a speaker change and a move away from version control.

IDEs don't scale to Facebook scale. "Developing in Xcode at Facebook is an exercise in frustration." On average 3.5 minutes to open Facebook for iOS in Xcode. 5 minutes on average to index. Indexing pegs the CPU and leaves things not very responsive. 50 Xcode crashes per day across all Facebook iOS developers.

Facebook measures everything about tools. Mercurial operation times. Xcode times. Build times. Data tells them what tools and workflows need to be worked on.

Facebook believes IDEs are worth the pain because they make people more productive.

Facebook wants to support all editors and IDEs since people want to use whatever is most comfortable.

React Native changed things. Supported developing on multiple platforms, which no single IDE supports. People launched several editors and tools to do React Native development. People needed 4 windows to do development. That experience was "not acceptable." So they built their own IDE. Set of plugins on top of Atom. Not a fork. They like the hackable and web-y nature of Atom.

The demo showing iOS development looks very nice! Doing Objective-C, JavaScript, simulator integration, and version control in one window!

It can connect to remote servers and transparently save and deploy changes. It can also get real-time compilation errors and hints from the remote server! (Demo was with Hack. Not sure if other languages are supported. Having beefy central servers for e.g. Gecko development would be a fun experiment.)

Starting at 32:00, the presentation shifts to continuous integration.

Number one goal of CI at Facebook is developer efficiency. We don't want developers waiting on computers to build and test diffs.

3 goals for CI:

  1. High-signal feedback. Don't want developers chasing failures that aren't their fault. Wastes time.
  2. Must provide rapid feedback. Developers don't want to wait.
  3. Provide frequent feedback. Developers should know as soon as possible after they did something. (I think this refers to local feedback.)

Sandcastle is their CI system.

Diff lifecycle discussion.

Basic tests and lint run locally. (My understanding from talking with Facebookers is "local" often means on a Facebook server, not a local laptop. Machines at developers' fingertips are often dumb terminals.)

They appear to use code coverage to determine what tests to run. "We're not going to run a test unless your diff might actually have broken it."
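
The talk did not show how the selection works, so the following is only my guess at the general idea. The `coverage_map` structure (test name to the source files it executed on a previous run) and the `select_tests` helper are hypothetical names, not anything from Sandcastle.

```python
def select_tests(changed_files, coverage_map):
    """Pick the tests whose recorded coverage touches a changed file.

    coverage_map: dict mapping a test name to the iterable of source
    files that test executed on an earlier, fully instrumented run.
    """
    changed = set(changed_files)
    return sorted(
        test for test, files in coverage_map.items()
        if changed & set(files)
    )


# Hypothetical coverage data for three tests.
COVERAGE = {
    "test_login": ["auth.py", "session.py"],
    "test_feed": ["feed.py", "ranking.py"],
    "test_profile": ["profile.py", "session.py"],
}
```

A diff touching only `session.py` would select `test_login` and `test_profile` and skip `test_feed`. A real system also has to cope with stale coverage data and new files, which this sketch ignores.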

They run flaky tests less often.

They run slow tests less often.

Goal is to get feedback to developers in under 10 minutes.

If they run fewer tests and get back to developers quicker, things are less likely to break than if they run more tests but take longer to give feedback.

They also want feedback quickly so reviewers can see results at review time.

They use WebDriver heavily. Love the cross-platform nature of WebDriver.

In addition to test results, performance and size metrics are reported.

They have a "Ship It" button on the diff.

Landcastle handles landing diff.

"It is not OK at Facebook to land a diff without using Landcastle." (Read: developers don't push directly to the master repo.)

Once Landcastle lands something, it runs tests again. If an issue is found, a (Phabricator) task is filed. Task can be "push blocking." Code won't ship to users until the "push blocking" issue is resolved. (I guess they don't do backouts? I guess they rely on catching all serious issues before a backout would be necessary?)

After a while, branch cut occurs. Some cherry picks onto release branches.

In addition to diff-based testing, they do continuous testing runs. Much more comprehensive. No time restrictions. Continuous runs on master and release candidate branches. Auto bisect to pin down regressions.
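
The talk gave no detail on the auto bisect, but the underlying technique is ordinary binary search over an ordered commit range. A minimal sketch, assuming the first commit is known good, the last is known bad, and the regression stays broken once introduced. `first_bad_commit` and `is_bad` are names I made up for illustration.

```python
def first_bad_commit(commits, is_bad):
    """Binary-search an ordered commit list for the first failing commit.

    commits: oldest-to-newest list; commits[0] must pass and
             commits[-1] must fail.
    is_bad:  callable that builds/tests a commit and returns True on failure.
    """
    lo, hi = 0, len(commits) - 1
    if is_bad(commits[lo]) or not is_bad(commits[hi]):
        raise ValueError("need a good first commit and a bad last commit")
    # Invariant: commits[lo] is good, commits[hi] is bad.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]
```

For a range of N commits this needs roughly log2(N) test runs instead of N, which is why it pairs well with slow, comprehensive continuous runs.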

Sandcastle processes >1000 test results per second. 5 years of machine work per day. Thousands of machines in 5 data centers.

They started with buildbot. Single master. Hit scaling limits of single thread single master. Master could not push work to workers fast enough. Sandcastle has distributed queue. Workers just pull jobs from distributed queue.
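
To illustrate the pull model only: in the sketch below, an in-process `queue.Queue` stands in for the distributed queue and threads stand in for worker machines. Both are my simplifications, and `run_jobs` is a hypothetical helper, not part of Sandcastle. The point is that no central dispatcher assigns work; each worker takes the next job whenever it is free.

```python
import queue
import threading


def run_jobs(jobs, num_workers=4):
    """Run (name, callable) jobs with workers that pull from a shared queue."""
    pending = queue.Queue()
    for job in jobs:
        pending.put(job)

    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                name, func = pending.get_nowait()  # pull; nothing is pushed to us
            except queue.Empty:
                return  # queue drained, worker exits
            outcome = func()
            with lock:
                results[name] = outcome

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Adding capacity in this model means starting more workers; the queue does not need to know how many there are or how fast each one is.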

"High-signal feedback is critical." "Flaky failures erode developer confidence." "We need developers to trust Sandcastle."

Extremely careful separating infra failures from other failures. Developers don't see infra failures. Infra failures only reported to Sandcastle team.

Bots look for flaky tests. Stress test individual tests. Run tests in parallel with themselves. Goal: developers don't see flaky tests.
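
A minimal sketch of the stress-testing idea, assuming a test is a plain callable that raises on failure. The `classify` helper and its labels are hypothetical. It reruns a test sequentially, whereas the talk described running tests in parallel with themselves, so this would miss flakiness caused by concurrency.

```python
def classify(test, runs=20):
    """Rerun one test and label it 'pass', 'fail', or 'flaky'.

    A test is called flaky here if the same code both passes and
    fails across repeated runs.
    """
    if runs < 1:
        raise ValueError("runs must be at least 1")
    outcomes = set()
    for _ in range(runs):
        try:
            test()
            outcomes.add("pass")
        except Exception:
            outcomes.add("fail")
        if len(outcomes) == 2:
            return "flaky"  # mixed results: stop early
    return outcomes.pop()
```

Tests labeled flaky could then be quarantined or run less often, so developers only see failures that consistently reproduce.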

There is a "not my fault" button that developers can use to report bad signals.

"Whatever the scale of your engineering organization, developer efficiency is the key thing that your infrastructure teams should be striving for. This is why at Facebook we have some of our top engineers working on developer infrastructure." (Preach it.)

Excellent talk. Mozillians doing infra work or who are in charge of head count for infra work should watch this video.

Clean Up Your Mess – A Guide to Visual Design for Everyone

28 March 2015 - 1:00pm
Preface

1. What is Clean Design?

Have a look at the two flyers below. Which one is more appealing to you? Which one looks cleaner?

If the one on the right looks cleaner to you, then – hooray! I've done my job. But what does it mean to "look cleaner"?

1.1. Clean Designs Reduce the Effort Needed to Find Information

Whatever you're creating — a brochure, a resume, a web page, a party invitation — the basic purpose it must serve is to convey the information your audience is interested in. It must help your audience answer questions like "Is this document the one I'm looking for?" and "What text explains this chart?" and "By what time is the ransom due?" For example, the aikido flyer helps people answer, "What's this about?" and "When is it?" and "Where is it?".

The two versions of the flyer provide the exact same information. The one on the right, however, makes it easier for people to find that information.

That flyer is clean, then, because it helps people find and consume the information within it with less conscious effort. It's designed so that its visual qualities allow the brain's visual thinking capabilities to make correct assumptions about how the different bits of content are organized and about what's relevant.

This is desirable because visual thinking happens much more quickly than conscious, step-by-step logical thinking. For example, if you wanted to understand what the main ideas were from just the text of the aikido flyers, it would take much longer and be more tedious than it would to use even the "messy" flyer. You can see for yourself by looking at the text to the right.

To answer the question "What is clean design?" most succinctly: a clean design is one that supports visual thinking so people can meet their informational needs with a minimum of conscious effort.

Aikido. Beginner class. Starts Sunday, April 27, 2008, 1:00 - 2:00 p.m. 8-week course — $95. Adult class (12 and older). No martial arts experience necessary. Call to reserve a space. Regular classes. Tuesday 7:30 p.m. Thursday 7:45 p.m. Sunday 1:00 & 2:15 p.m. Come visit. Please come and visit any of our classes to determine if aikido is right for you!

1.2. Informational Needs

People have the same basic informational needs when looking at a document you've designed:

  1. Deciding relevance. Do I even care?
  2. Getting an overview. What are the main ideas? What's most important?
  3. Basic comprehension. What text explains this chart?
  4. Retrieving buried details. I remember something about an orangutan... where was that?
  5. Finding actionable details. How do I get in touch?

The informational needs of a single person will usually evolve in roughly the order listed above. For example, if someone's looking at a brochure, his thought process might be something like:

  1. What's this brochure about? Hm... looks like it's about phrenology. I need a phrenologist!
  2. What services do they provide? Are they certified? Are they local? My heavens — yes, yes, and yes!
  3. How do I get in touch? Oh goody, there's the phone number! I'll call when I get home.
  4. (Later, at home) Now where was that number...

The important thing to remember is that you're not making your whatever-it-is-you're-making for you. You're making it to help someone else find the information she needs.

This Institute totally exists

1.3. Supporting Visual Thinking

You convey information by the way you arrange a design's elements in relation to each other. This information is understood immediately, if not consciously, by the people viewing your designs. This is great if the visual relationships are obvious and accurate, but if they're not, your audience is going to get confused. They'll have to examine your work carefully, going back and forth between the different parts to make sure they understand.

If you want to feel what this is like, try saying the colors of the words to the right. For example, for the first word you would say "red". Now, try saying the words themselves. Does your brain hurt yet?

It's harder to say the color of each word than it is to say the word itself because our brains automatically determine the semantic meaning of the word. This conflicts with the identification of the color of the word, a process which is not automatic. Our brains must then resolve the conflicting interpretations of the visual stimuli, a process which takes work. (This is known as the Stroop Effect.)

When you're creating a document, you want to eliminate these conflicts between automatically perceived meaning and actual meaning. The rest of this guide is dedicated to doing just that by explaining three visual features: size, proximity, and alignment. It will also explain a valuable tool: elimination. But before that...

purple blue green yellow red red purple green yellow blue blue yellow green red purple yellow blue green red purple yellow red blue green purple green red yellow purple blue purple green red blue yellow yellow red green purple blue blue green purple yellow red green yellow purple blue red yellow green red blue purple yellow purple red blue

1.4. The Most Important Thing to Remember

The basic cause of messy design is slight, unintentional differences among elements. This causes your brain's visual processing to falter. First, your brain has to determine if there actually is a visual difference. Then, it has to determine the significance of that difference. Because the small discrepancies don't actually signify anything, you end up wasting your audience's brainpower. Strive for a consistent visual style where elements which are logically similar look similar to each other and look unambiguously different otherwise.

You can see this in the image at the right. In each set, a black dot differs in size and a green dot differs in color (well it's no longer green in the bottom set, but you get the picture). By using clear, intentional contrast, it's easier to distinguish the different kinds of elements in your design. That makes it easier to find information. And that's what makes clean design.

2. Size

Have another look at the aikido flyers:

How do they use size differently? How do the differences contribute (or detract from) their understandability?

There are two main differences in how size is used between the two flyers. First, the contrast between the header text and the details is much greater; the header text is much larger. Second, size is used consistently among similar elements.

2.1. Use Size Consistently to Indicate Role

These improvements serve to clearly reveal the roles of the different bits of text. The improved flyer makes it clear what's header text and what's detail text. Even if the text were in another language, you would be able to differentiate the headers from the details.

Can you say the same about the original? It has slight changes in font size between one block of text and the next, and no two blocks look quite alike.

For example, the block starting with "Please come and visit" is slightly smaller, but in all caps and bold. Can you easily tell what's header text and what's detail text?


2.2. Recognizable Roles Help Users Find Information

Why is it important to make these roles (in this case, header and detail) obvious? Making these roles visually distinct helps people find the info they need. We rely on headers to help us get an idea of what the document's about. We also use them to narrow in on the details we're most interested in. At the same time, we know to pay less attention to headers when we're going through detail text.

Using size to clearly distinguish the roles of the different bits of content helps people efficiently direct their attention. They know what to focus on and what to ignore, depending on what they're looking for.

You can see the same idea at work at a supermarket. The aisle signs act as headers — they're short, readily noticeable, and they're meant to help you find the brownie mix that you're for some reason craving at 2:00 am.

2.3. Size Summary

Identify the roles of your visual elements.

What purposes do the different bits of text serve? What questions do they help answer? Headers answer questions like "what is this about?" and "where is the contact info?"

Size elements consistently.

Headers should look like other headers. Detail text should look like other detail text. Small variations in size are confusing, because the brain has to figure out if the differences are meaningful.

Provide strong contrast between elements with different roles.

This helps people easily identify the different kinds of content in your design, and that helps them find the information they need.

3. Proximity

Have a look at the kittens below:

When you looked at the collection on the right, did you immediately recognize that there are two groups of kittens? How about the collection on the left - are the kittens grouped?

3.1. Elements Placed Near Each Other Form Groups

When we see kittens (or any other visual elements) placed closely to each other our brains immediately assume that they form a group sharing a unifying concept. The kittens to the right are grouped by color — grey on the bottom, non-grey on the top.

By grouping related content and visuals together you help the user quickly find the information he needs. To prevent confusion and frustration, group elements which are actually related. That sounds obvious, but it's easy to forget.

Examples of this concept abound. For myself, I have this great habit of absent-mindedly setting down whatever's in my hand wherever I am. This once resulted in my leaving an open can of cat food in a cupboard amongst piles of tupperware. This made it hard to later find the cat food, as it was not in close proximity to the cat dish or to anything remotely related to feeding cats.

I know what you're thinking: "That's gross! Why would you include that story?" I'll tell you why: it's so you'll think twice before you misuse or abuse proximity in your designs.

3.2. Make Sure the Grouping Is Obvious

Check out the photos to the right. In the left group, you can't really tell who the caption applies to. Which guy has a complicated relationship to hotdogs!? You don't really know. Maybe you can guess based on their facial expressions and on your own experience with hotdogs.

In the right group, however, you can see exactly which photo the caption is meant to apply to.

The lesson is that you should leave much more whitespace between non-related elements than related elements in order to make the logical groupings visually clear.


3.3. Proximity Summary

Place related items close to each other.

Our visual brains assume that items placed close to each other form logically related groups.

Make sure the grouping is obvious.

Put enough whitespace between groups to make it clear what elements are actually grouped together. Otherwise people have to go through the trouble of closely examining your design to figure out what goes with what.

4. Alignment

Alignment is crucial to giving your designs a clean appearance and to conveying organization. Slight misalignments are confusing and look messy. On the other hand, using strongly contrasting alignments can make your design more interesting and attractive.

Alignment is one of those features that's easy to overlook — newbies usually don't give any conscious attention to it. However, you can really change the character of your design by changing the alignment.

4.1. Use Alignment to Make It Look Clean

Time for an example! Behold!

Sloppy Alignment

Centerish Alignment

Left Alignment

The slight misalignments in the leftmost "ad" make it look sloppy. If there's one thing you should have learned by now, it's that small, unintentional visual differences are like sand in the engine of your visual brain.

Our brains expect related content to be lined up neatly. If something is slightly out of line, our brains assume there's a reason and try to find it. Since the last line of the ad ("Trapping Hands...") is out of line, our brains search for a reason. When they don't find one they send out minute amounts of grumpiness-inducing chemicals that cause us to consider the ad sloppy.

The center ad employs center alignment and doesn't look much better. To be fair, it's not completely centered, but even if it were it would look bad. Centering gives text ragged left and right edges, which look worse the more text there is.

In general, the uncouth barbarians who haven't read this guide are especially prone to center-aligning everything. You can see this in the original aikido flyer, shown to the right.

Often, the best bet is to use left- or right-alignment. You can see left-alignment in the rightmost "Acatemy of Evil" ad. It won't win any awards, but it does look better than the other two ads.


4.2. Use Alignment to Make It Look Cool

You can make designs look more interesting by intentionally using different alignments. Have a look at the image to the right.

The headlines "A Checklist for Content Work" and "CSS Floats 101" are centered, as are their bylines. By contrast, the headline "Content-tious Strategy" is left-aligned.

This difference in centering subtly helps to distinguish the content in the left column, indicating that it's more noteworthy than the content in the right column. There are other visual features at play, too. For example, the headlines in the left column are slightly larger and the column is wider. Still, the center alignment is a nice touch.

By the way, this example is a cropped screen shot of the A List Apart home page. The web site exhibits superb design, and you can learn a lot just by looking at it.

4.3. Alignment Summary

Avoid Slight Misalignments

Slight misalignments look messy. Because the brain thinks that visual differences mean something, a misalignment makes it waste effort trying to figure out the meaning.

Break Your Center Alignment Habit

Center alignment often gives a much "weaker" impression than left or right alignment. It looks especially bad with many lines of text.

Experiment With Contrasting Alignments

Consciously using different alignments in your design can serve to make it look more interesting. It can also make it easier for people to find information in it by further distinguishing the separate sections.

5. Elimination

Often, people add extra lines, boxes, bullets and other visual flotsam in order to convey information that's adequately conveyed with Size, Proximity and Alignment. Including this fluff makes your user’s brain work harder, as it has to figure out the significance of these elements. For example:

Ah yes, our trusty aikido flyers. See how the original flyer has a bunch of lines? Some of the lines act as boundaries, enclosing the entire space and dividing the left and right sections. The rest of the lines separate sub-sections.

All of the lines are thoroughly unnecessary. Proximity and alignment alone provide our visual brains with enough information to distinguish the various chunks of content.

Usually the folks who add these unneeded visual elements do so because there's too much crammed in the design. It's like when you're at a bar and the people around you are talking loudly, so you start talking louder, so they start talking louder, and so on until everyone passes out or goes home.

The solution isn't to add more noise to the design. If instead you use what you've learned here — Size, Proximity, and Alignment — you'll end up with a design that's a pleasure to look at.

6. Learning More

There's so much more to design! Color, typography, harmony, balance, symmetry, rhythm — the list goes on and on, and it's all super fun. The best way to really "get" design, though, is to do it. Just like pickle backs (google it!).

6.1. Observing and Doing

One of my favorite pastimes is analyzing the visual design of the stuff I encounter in everyday life. Menus are probably my favorite target: is it easy to tell which price goes with which item? Is it easy to find the dish I want?

It's also fun to keep an eye out for designs produced by average Joes and Janes. Basically anything you'd see tacked to a public cork board.

I recommend you take up this practice. Ask yourself what you would do differently. Could you make it cleaner? What would you remove? You can start with this web site, even. What could be made clearer? Could the flyers be improved?

You could also ask to critique the work of your co-workers, or have them critique yours. If you have time, try to redesign their work to see if you can make it look cleaner and clearer.

6.2. Further Reading

Another great way to learn more is to read design books. As you learn more design concepts, you'll be able to notice them in other designs and apply them in your own. Fun!

Below is a list of books I've read myself which I consider useful for beginners. If you click the links I provide and buy stuff, then Amazon gives me some money. Hooray!

Universal Principles of Design
by William Lidwell, Kritina Holden, and Jill Butler

One of the biggest challenges in learning a new discipline is learning the vocabulary. This book gives you the names for 125 design concepts, with great textual descriptions of the concept and plenty of illustrations. Coverage is succinct and engaging at the same time.

The best thing about this book is that it lends itself to random reading. Pop it open to any page and you'll learn something new.
Buy on Amazon

The Non-Designer's Design Book
by Robin Williams

This was the first design book I read. It opened my eyes to the world of design, for which I'm very grateful. Robin covers some of the same material as I cover here, but in a different way.

The book also covers typography, and I still think it's the best beginner's guide out there. Worth it for that alone. You can probably find an older edition for a couple bucks.
Buy on Amazon

Don't Make Me Think
by Steve Krug

This is a quick and readable book on web usability. It really made me understand that design isn't about me trying to show off my creative ability; it's about meeting the needs of your users.

It's a must-have for web developers, but I think almost everyone else would find it interesting too.
Buy on Amazon

Visual Thinking for Design
by Colin Ware

If you're interested in the underlying mechanics of visual perception and how to bring that to design, this is the book for you. The scientific discussion of design will tell you, at the deepest level, why different techniques and principles work.

It assumes a degree of familiarity with design concepts, but don't let that deter you, especially now that you've read this guide!
Buy on Amazon

Here are some additional links which I've hastily thrown together:

7. Closing Thoughts

I hope that you've found this guide informative and entertaining. More than that, I hope you'll feel more creative and confident the next time you have to create any form of visual communication.

I intentionally made this site free. I love design, and want to share that love with as many people as possible. If the site was helpful to you, share it with others :) It makes me happy to know that my work is in some small way helping people. I'd love to hear any feedback on it, either through email, twitter (@nonrecursive), or on the discussion page. If you have any questions or criticism, I would love to hear that too.

Happy designing!

8. Thank Yous

A lot of people took time to give me feedback on this site, for which I am very grateful. Here they are!

  • Pat Shaughnessy
  • Alex Rothenberg
  • Daniel Rodriguez
  • Jess Bulu
  • Lea Downing
  • Su Jones
  • Chris Laws
  • Nicole Rose
  • Nadir Ait-Laoussine
  • Zed Shaw
  • Chad Wegner

A fast, offline reverse geocoder in Python

28 March 2015 - 1:00pm
README.md

A Python library for offline reverse geocoding. It improves on an existing library called reverse_geocode developed by Richard Penman.

@thampiman ajaythampi.com

GeoNames.

PyPI.

  1. Mode 1: Single-threaded K-D Tree (similar to reverse_geocode)
  2. Mode 2: Multi-threaded K-D Tree (default)
import reverse_geocoder as rg
coordinates = (51.5214588, -0.1729636), (13.9280531, 100.3735803)
results = rg.search(coordinates)  # default mode = 2
print(results)

The above code will output the following:

[{'admin1': 'England', 'admin2': 'Greater London', 'cc': 'GB', 'lat': '51.51116', 'lon': '-0.18426', 'name': 'Bayswater'}, {'admin1': 'Nonthaburi', 'admin2': '', 'cc': 'TH', 'lat': '13.91783', 'lon': '100.42403', 'name': 'Bang Bua Thong'}]

If you'd like to use the single-threaded K-D tree, set mode = 1 as follows:

results = rg.search(coordinates,mode=1)

Mode 2 runs ~2x faster for very large inputs (10M coordinates).
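
For readers new to the idea, here is a minimal sketch of what a reverse geocoder does: given a latitude/longitude pair, return the nearest known place. This is not the library's implementation. The library uses K-D trees over GeoNames data, while this sketch does a brute-force linear scan. The two-entry `PLACES` table and the helper names `haversine_km` and `nearest_place` are assumptions made up for illustration, with the place records copied from the sample output above.

```python
import math

# Hypothetical two-entry gazetteer, copied from the sample output above.
# The real library loads its place data from GeoNames.
PLACES = [
    {"name": "Bayswater", "admin1": "England", "cc": "GB",
     "lat": 51.51116, "lon": -0.18426},
    {"name": "Bang Bua Thong", "admin1": "Nonthaburi", "cc": "TH",
     "lat": 13.91783, "lon": 100.42403},
]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


def nearest_place(coord, places=PLACES):
    """Return the place closest to coord, a (lat, lon) tuple, by linear scan."""
    lat, lon = coord
    return min(places,
               key=lambda p: haversine_km(lat, lon, p["lat"], p["lon"]))


coordinates = (51.5214588, -0.1729636), (13.9280531, 100.3735803)
names = [nearest_place(c)["name"] for c in coordinates]
print(names)
```

A linear scan costs one distance computation per place per query, which is fine for two places but not for a full gazetteer. That cost is the reason a spatial index such as a K-D tree is worth having.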

  1. reverse_geocode library
  2. Parallelised implementation of K-D Trees is extended from this article by Sturla Molden
  3. Geocoded data is from GeoNames

A Loss for Words: Can a Dying Language Be Saved?

28 March 2015 - 1:00pm
The consequences of losing a language may not be understood until it is too late. Credit Illustration by Stephen Doyle

It is a singular fate to be the last of one’s kind. That is the fate of the men and women, nearly all of them elderly, who are—like Marie Wilcox, of California; Gyani Maiya Sen, of Nepal; Verdena Parker, of Oregon; and Charlie Mungulda, of Australia—the last known speakers of a language: Wukchumni, Kusunda, Hupa, and Amurdag, respectively. But a few years ago, in Chile, I met Joubert Yanten Gomez, who told me he was “the world’s only speaker of Selk’nam.” He was twenty-one.

Yanten Gomez, who uses the tribal name Keyuk, grew up modestly, in Santiago. His father, Blas Yanten, is a woodworker, and his mother, Ivonne Gomez Castro, practices traditional medicine. As a young girl, she was mocked at school for her mestizo looks, so she hesitated to tell her children—Keyuk and an older sister—about their ancestry. They hadn’t known that their maternal relatives descended from the Selk’nam, a nomadic tribe of unknown origin that settled in Tierra del Fuego. The first Europeans to encounter the Selk’nam, in the sixteenth century, were astonished by their height and their hardiness—they braved the frigid climate by coating their bodies with whale fat. The tribe lived mostly undisturbed until the late eighteen-hundreds, when an influx of sheep ranchers and gold prospectors who coveted their land put bounties on their heads. (One hunter boasted that he had received a pound sterling per corpse, redeemable with a pair of ears.) The survivors of the Selk’nam Genocide, as it is called—a population of about four thousand was reduced to some three hundred—were resettled on reservations run by missionaries. The last known fluent speaker of the language, Angela Loij, a laundress and farmer, died forty years ago.

Many children are natural mimics, but Keyuk could imitate speech like a mynah. His father, who is white, had spent part of his childhood in the Arauco region, which is home to the Mapuche, Chile’s largest native community, and he taught Keyuk their language, Mapudungun. The boy, a bookworm and an A student, easily became fluent. A third-grade research project impassioned him about indigenous peoples, and Ivonne, who descends from a line of shamans, took this as a sign that his ancestors were speaking through him. When she told him of their heritage, Keyuk vowed that he would master Selk’nam and also, eventually, Yagán—the nearly extinct language of a neighboring people in the far south—reckoning that he could pass them down to his children and perhaps reseed the languages among the tribes’ descendants. At fourteen, he travelled with his father to Puerto Williams, a town in Chile’s Antarctic province that calls itself “the world’s southernmost city,” to meet Cristina Calderón, the last native Yagán speaker. She subsequently tutored him by phone.

If it is lonely to be the last of anything, the distinction has a mythic romance: the last emperor, the last of the Just, the last of the Mohicans. Keyuk’s precocity enhanced his mystique. A Chilean television station flew him to Tierra del Fuego as part of a series, “Sons of the Earth,” that focussed on the country’s original inhabitants. He was interviewed, at sixteen, by the Financial Times. A filmmaker who knew him put us in touch, and we met at a café in Santiago.

It was a mild autumn morning during Easter week. The city was quiet after a series of student demonstrations protesting tuition costs. Keyuk, who was studying linguistics on a scholarship at the University of Chile, supported their cause. (“The word ‘Selk’nam’ can mean ‘We are equal,’ ” he noted, “though it can also mean ‘we are separate.’ ”) Keyuk is tall, loose-limbed, and baby-faced, with a thatch of black hair. His style is nonchalant—stovepipe jeans and a leather jacket. Since his teens, Keyuk has composed songs in Selk’nam, and he performs with an “ethno-electronic” band. But he carried himself with solemnity, as if conscious of the flame he tended—or, at least, said that he tended. How, I asked, could I be sure that he really spoke Selk’nam, if no one else did? He smiled slightly and said, “I guess I have the last word.”

Keyuk’s voice is a boyish tenor, but when he speaks Selk’nam it changes; the language is harsher and more percussive than Spanish. To master the grammar and the vocabulary, he had studied, among other texts, a lexicon published in 1915 by José María Beauvoir, a Salesian missionary. The sound of the language was preserved in recordings that the eminent anthropologist Anne Chapman made forty years ago. Chapman, a protégée of Claude Lévi-Strauss, was an early activist for endangered languages in Meso- and South America. Cristina Calderón, Keyuk’s tutor, was one of her subjects, and, having heard of Keyuk’s projects, Chapman sought him out in Santiago, about ten years ago. She was then in her mid-eighties; she died in 2010.

I joined Keyuk and his mother the next evening for dinner at a restaurant in the old fish market, where the local sea bass is a specialty. Ivonne is petite, blond, and animated, but, like Keyuk, she has a regal poise, and it is hard to imagine her as a bullied outcast. We shouted cheerfully above the din, though Keyuk seemed detached—as prodigies grow out of their teens, they sometimes mistrust the curiosity they have inspired. But when he spoke of the Selk’nam it was with intensity. “Our mythology is rich,” he said. “Everything in our world—plants and animals, the sun and stars—has a voice. On our map of the universe, we called the East ‘the space without time’ ”—the realm of the unknown. “We had a Paleolithic skill set yet a boundless imagination. They both existed with a high degree of social conformity. Long after we dispersed, we preserved our beliefs.” He added, “One precious thing, to me, about the language is its vocabulary of words for love. They change according to the age, sex, and kinship of the speakers and the nature of the emotion. There are things you can’t say in Spanish.” 

There are approximately seven billion inhabitants of earth. They conduct their lives in one or several of about seven thousand languages—multilingualism is a global norm. Linguists acknowledge that the data are inexact, but by the end of this century perhaps as many as fifty per cent of the world’s languages will, at best, exist only in archives and on recordings. According to the calculations of the Catalogue of Endangered Languages (ELCat)—a joint effort of linguists at the University of Hawaii, Manoa, and at the University of Eastern Michigan—nearly thirty language families have disappeared since 1960. If the historical rate of loss is averaged, a language dies about every four months.

The mother tongue of more than three billion people is one of twenty, which are, in order of their current predominance: Mandarin Chinese, Spanish, English, Hindi, Arabic, Portuguese, Bengali, Russian, Japanese, Javanese, German, Wu Chinese, Korean, French, Telugu, Marathi, Turkish, Tamil, Vietnamese, and Urdu. English is the lingua franca of the digital age, and those who use it as a second language may outnumber its native speakers by hundreds of millions. On every continent, people are forsaking their ancestral tongues for the dominant language of their region’s majority. Assimilation confers inarguable benefits, especially as Internet use proliferates and rural youth gravitate to cities. But the loss of languages passed down for millennia, along with their unique arts and cosmologies, may have consequences that won’t be understood until it is too late to reverse them.

Little is known about the origins of human speech. It seems unlikely, though, that there was ever a pre-Babel world. The geographic isolation of small groups breeds heterogeneity, both of dialects and of language isolates, as it probably did among Paleolithic hunters. Nowhere is there a richer or more concentrated cluster of languages, some eight hundred, than in Papua New Guinea, with its daunting topography of highlands and rain forests. In New Guinea, as in other hot spots of endangerment, indigenous languages are a user’s guide to ecosystems that are increasingly fragile and—in the face of climate change—increasingly irreplaceable.

Richard Schultes, a professor of biology at Harvard, who died in 2001, is considered the father of modern ethnobotany. He was among the first to study the use of plants, including hallucinogens, by indigenous peoples in the rain forest and to publicize the alarming rate at which both were disappearing. (More than ninety tribes, he noted, vanished in Brazil between 1900 and 1975.) In the nineteen-forties, doing field work in the Amazon, Schultes identified the source of curare, a derivative of which, d-tubocurarine, is used to treat muscle disorders like those associated with Parkinson’s disease. His students Michael Balick, now the director of economic botany at the New York Botanical Garden, and Paul Alan Cox, the executive director of the Institute for Ethnomedicine, in Jackson Hole, Wyoming, continued his explorations. They have written with authority on the “ethnobotanical approach to drug discovery,” which is, in essence, field work guided by shamans and healers.

In Samoa, Cox discovered that Polynesian herbal doctors had an extensive nomenclature for endemic diseases and a separate one for those introduced by Europeans. Their sophistication is not unique. The taxonomies of endangered languages often distinguish hundreds more types of flora and fauna than are known to Western science. The Haunóo, a tribe of swidden farmers on Mindoro, an island in the Philippines, have forty expressions for types of soil. In Southeast Asia, forest-dwelling healers have identified the medicinal properties of some sixty-five hundred species. In the nineteen-fifties, drug researchers for Eli Lilly and Company, working on several continents, studied folk remedies for diabetes based on the rosy periwinkle, and isolated an active ingredient—vinblastine—that is used in chemotherapy for Hodgkin’s disease. (The healers who led the researchers to their discoveries never saw any of the profits. Such “bio-prospecting” by pharmaceutical companies is a controversial practice that was largely unregulated until 1993.) Quinine, aspirin, codeine, ipecac, and pseudoephedrine are among the common remedies that, according to Cox and Balick, we owe to ethnobotanists guided and informed by indigenous peoples.

Daniel Kaufman, a linguist who directs the Endangered Language Alliance, a nonprofit institute on West Eighteenth Street, would be thrilled to hear that a cure for cancer had been discovered in a rain-forest flower for which we have no name, other than one in a dying language, but saving the flower is not his concern. I was introduced to Kaufman last June at a screening of “Language Matters,” a documentary directed by David Grubin and hosted by the poet Bob Holman. Kaufman, who teaches at Columbia University, consulted on the film. He is a slight, studious-looking man in his late thirties, whose expertise is in the Austronesian languages of Madagascar and the Pacific. But the alliance, which he founded six years ago, grew out of his commitment to support the more than eight hundred endangered languages of the New York area, which has a higher concentration of them, Kaufman estimates, than any city in the world.

The alliance has recorded Shughni, from Tajikistan, which is spoken by a few families in Bay Ridge; Kabardian, from the northern Caucasus, which survives in a Circassian community in Wayne, New Jersey; and Amuzgo, from southwestern Mexico, still alive in Sunset Park, Corona, and Port Richmond—enclaves of immigrants from Oaxaca and Guerrero. Mandaic, an ancient Semitic language of Iraq and Iran, has only a few elderly speakers left, in Flushing and Nassau County. Garífuna, however, is firmly based in a mostly working-class community of some two hundred thousand people concentrated in eastern Brooklyn and the South Bronx. The Garífuna are descendants of West Africans who were shipwrecked in 1635 off the coast of St. Vincent, where they intermarried with the indigenous Arawaks and Caribs. The language that evolved combines Arawak grammar with African, English, and Spanish loan words. In the eighteenth century, the British deported the Garífuna to Central America; during the past fifty years, many have settled in New York.

“Let’s be honest,” Kaufman said. “The loss of these languages doesn’t matter much to the bulk of humanity, but the standard for assessing the worth or benefit of a language shouldn’t rest with outsiders, who are typically white and Western. It’s an issue of the speakers’ perceived self-worth.” He suggested that I meet some of those speakers not far from home—members of the Mohawk nation. “The older people are the only ones who can tell you what their youth stands to lose,” he said. “The young are the only ones who can articulate the loss of an identity rooted in a mother tongue that has become foreign to them.” He told me about a two-week immersion program that takes place each summer at the Kanatsiohareke community center, in Fonda, New York, a village on the Mohawk River between Utica and Albany.

Until the eighteenth century, Fonda (which was named for the Dutch ancestors of Henry, Jane, and Peter), the neighboring town of Palatine (named for the Palatine Germans who took refuge there), and much of the land to the north and east, into Canada, was Mohawk territory. The Mohawk were feared for their ferocity, but it was chastened by a matriarchal system of consensus governance. One of the students in the intermediate class at Kanatsiohareke was a local I.B.M. employee who told me that he was learning Mohawk because the tribe had saved the lives of his German ancestors.

During the American Revolution, the Mohawk supported the British, and after the defeat they were forced to cede their territory. Their chiefs led them to Canada, and most of their settlements are still on the border of New York and Ontario. In recent decades, two factions have divided Mohawk loyalties: a party of modernizers that has aggressively championed casino development, and an Old Guard that fears the corruption that casinos invite. The founder of the Kanatsiohareke center, Sakokweniónkwas, whose English name is Tom Porter, belongs to the latter.

Porter is a commanding figure in his early seventies, who speaks in a quietly hypnotic voice. He was born on a reservation, the son of an ironworker—one of the legendary Mohawk who built Manhattan’s skyscrapers. Porter and his son both followed him into the trade. “It’s a myth that Mohawk don’t suffer from vertigo,” he told me. “I was afraid of heights all my life.” His grandmother encouraged him to marry a maiden of old-fashioned virtue, and while he was on a trip to Mississippi, a matchmaker introduced him to Alice Joe, a Choctaw. They settled on Mohawk land west of Albany, where he worked as an ambulance driver, a carpenter, and a teacher. Their six children were raised speaking both Choctaw and Mohawk. When Porter was twenty-one, the clan mothers chose him as one of the nation’s nine chiefs. He retired after twenty-five years, though he is still much in demand for his eloquent funeral orations.

Porter bought the Fonda property at auction, twenty years ago, with help from the local community. Kanatsiohareke was conceived as a bulwark of “longhouse” values: reverence for nature, parents, ancestral spirits, and the language. “Mohawk isn’t just a form of speech,” he said. “It’s a holistic relationship to the cosmos.” The Porters host concerts and lectures in addition to the language camp, and some of their land is farmed organically. But Kanatsiohareke is a homespun operation: the compound includes an old red barn, a ramshackle farmhouse, and a rustic B. and B. with a craft shop that sells T-shirts and baskets.

The Mohawk are one of five hundred and sixty-six tribes recognized by the United States whose presence on the continent predates “contact”—the advent of Europeans. Only about a hundred and seventy indigenous languages are still spoken, the majority by a dwindling number of elders like Marie Wilcox, of the Wukchumni, who is eighty-one, and who spent her youth doing farmwork south of Fresno. About fifteen years ago, she started recording her tribe’s creation myths and compiling a dictionary of its unwritten language. Navajo, which helped to decide the outcome of the Second World War (the Japanese were never able to decrypt messages relayed among native speakers—the celebrated “code talkers”), is an exception. It is used in daily life by two-thirds of the nation’s two hundred and fifty thousand citizens, who refer to it as “Diné bizaad,” “the people’s language.” Fluency, however, is declining. The election of a new tribe president was suspended, in October, by a dispute over the requirement that he or she speak fluent Navajo. A leading candidate, Chris Deschene—a state representative from Arizona and the grandson of a code talker—was disqualified for that reason. “I’m the product of cultural destruction,” he told the Navajo Times, when he was asked why he couldn’t speak Diné. (He is a graduate of the U.S. Naval Academy, and, after retiring as a major in the Marine Corps, he earned two graduate degrees, in engineering and law.) A new election will take place in April.

About twenty-five thousand North Americans identify themselves as Mohawk, but only about fifteen per cent speak the language well enough to conduct their daily lives in it. Transcribing Mohawk is an arduous task. In the eighteen-seventies, Alexander Graham Bell, a recent immigrant to Canada, fell in love with its sound and created an orthography. (The Mohawk made him an honorary chief.) The grammar is at least as challenging as that of Latin. Noun roots are modified by a welter of adjectival prefixes; the addition of the letter “h,” for example, can alter a meaning dramatically. If you err in trying to describe a man as “tall,” you may have said that he has “long balls.” Verbs are muscular and poetic. “To bury” someone is “to wrap his body with the blanket of our Mother Earth.” A man who fathers a child “lends him his life.” In the ethos of Mohawk culture, as in its language, “I” cannot stand on its own—the first-person singular is always part of a relationship. So you don’t say, “I am sick.” “The sickness,” in Mohawk, “has come to me.”

In the advanced seminar at Kanatsiohareke, Mina Beauvais, whose Mohawk name is Tewateronhiakhwa, was teaching students the optative, an arcane mood, akin to the subjunctive, that exists in Kurdish, Albanian, Navajo, Sanskrit, and ancient Greek. The students also had to contend with compound words, some longer than those of German, which aren’t pronounced as they are written. You need a bard’s memory and a singer’s breath to speak Mohawk as Beauvais does: she makes it sound incantatory. I took and failed a test that she gave her class: to repeat tahotenonhwarori’taksen’skwe’tsherakahrhatenia’tonháîtie. (It is a single word that means “the fool comes tumbling down the hill.”)

Beauvais, who grew up near Montreal, is a native speaker in her late seventies. She is small and sturdy, with a wry patience bred of hardship. When she was seven, the state compelled her parents to send her to a school “for Indians,” at which students were beaten for speaking their native tongue. Tom Porter’s grandmother hid him, at the same age, so that the authorities couldn’t put him in a boarding school. The forcible assimilation of First Nation children in punitively austere, mostly church-run institutions was made compulsory by Canadian law in the eighteen-eighties and continued until the nineteen-seventies. “That system almost destroyed us,” Porter said. “When you deprive a kid of his language at the sponge time of life, the most precious learning years, a bond is broken.”

Attendance at the camp was lower than in the past; there were just four students in the advanced seminar, though all were parents who hoped to pass the language on to their young children. Gabrielle Doreen, a stately woman of thirty-seven, who wears her graying hair in a long braid, is the mother of four. While honing her grammar, she was teaching kindergarten at the Mohawk “nest” on the Tyendinaga Mohawk Territory, in Ontario. The nest—totahne—is an immersion program for preschoolers. Doreen had enrolled in the camp with her fiancé, Lou Williams, an Oneida. He was moving from his native Wisconsin to Ontario, he told me, “because in Mohawk tradition men join their women’s clan.”

Iehnhotonkwas—Bonnie Jane Maracle—started as a student at the camp when it began, in 1998, and became its coördinator in 2005. “We originally had much better attendance,” she said. “But eight Mohawk communities now have their own immersion classes, so people can study closer to home.” Other First Nations—the Ojibwe, in Minnesota; the Blackfoot, in Montana; the Iñupiat, of northern Alaska—also have nests, and the trend has been gaining momentum since the passage, in 2006, of the Esther Martínez Native American Language Preservation Act, which provided funding for language survival and restoration programs from pre-K through college. (Martínez, who lived in New Mexico, was a linguist, a storyteller, and a champion of her native Tewa. She died at ninety-four, the year that her namesake legislation was enacted.) There are now some thirty institutions of higher learning on or near reservations that offer instruction in indigenous languages.

K. David Harrison, an associate professor of linguistics at Swarthmore College, is the director of research at the Living Tongues Institute for Endangered Languages, based in Salem, Oregon, and heads National Geographic’s Enduring Voices Project. He is prominent in the field and writes prolifically about endangerment. Part of his mission, he told me, is to help communities “technologize their language.” It heartens him, he said, to see “Mohawk kids texting in Mohawk.” (The tribe also has its own television and radio stations.) The Yurok, of Northern California, are one of many tribes with a Web site. And smartphone users can download apps to study Nishnaabe (of Ontario), Salteaux (of Saskatchewan), Potawatomi (of the Great Lakes), Arikara (of North Dakota), or Mi’kmaq (of Canada’s Atlantic provinces and the Gaspé Peninsula). Harrison’s institute also hosts a YouTube channel. “Living tongues have to evolve to deserve the term,” he said. “I am working on a dictionary of Siletz”—a critically endangered language native to Oregon—“and the community is having an interesting dialogue about contemporary words like ‘computer.’ Should they import it from the English or coin a phrase that means ‘brain in a box’?”

An app, however, can’t replace the live transmission of a language to children at what Porter calls “the sponge time.” The Maori of New Zealand were the first to develop the language-nest concept. (A nest is a sanctuary from predation as much as an incubator.) The nest movement in the United States, which began in Hawaii, where it is called Pūnana Leo, was inspired by the Maori movement, Kōhanga Reo. They both date to the early nineteen-eighties, although they have roots in years of community organizing to reverse colonial policies. The Hawaiian language was banned in public schools from 1896 until 1986—two years after activists, skirting the law, opened the first private nest. Today, some twenty-four hundred students attend one of nineteen Hawaiian language-immersion sites around the state. Researchers have suggested that students taught in Hawaiian perform as well as, if not better than, their peers who, like most Americans, are educated monolingually. At the best immersion-program site, ninety per cent of the class goes on to college. And graduate students at the University of Hawaii, Hilo, can now earn a doctorate in their native tongue.

Political activism has been a catalyst in nearly every narrative of a language rescued from the brink. The most famous example is that of Welsh. Resistance to English rule has an eight-hundred-year history in Wales that is intimately connected with the struggle to preserve its Celtic language, Cymraeg. In the documentary “Language Matters,” Bob Holman and David Grubin pick up the saga in the mid-nineteen-sixties, when the British government flooded the ancient village of Capel Celyn, one of the few remaining Welsh-language communities, to create a reservoir that supplied water to Liverpool. This act fuelled an independence movement and demands to give Cymraeg parity with English in the public sphere. The BBC launched a Welsh radio station in 1977. Since 1999, instruction in Welsh has been compulsory for students in state schools up to the age of sixteen. According to the most recent census, in 2011, nineteen per cent of the population speak the language. That means, of course, that eighty-one per cent do not.

The struggle to preserve a language often creates an atmosphere of siege. I felt that sense of embattlement at Kanatsiohareke and, again, last September, when I sat in on a radio show sponsored by Dan Kaufman and broadcast from the Endangered Language Alliance offices, on Eighteenth Street. The show, “Voces sin Fronteras” (“Voices Without Borders”), was improvised—conversation punctuated by music. There were three hosts of indigenous descent—Leobardo Ambrocio Ajtzalam, José Juarez, and Segundo Angamarca—who alternated between Spanish and their respective native languages: K’iche’, of Guatemala; Totonac, of Mexico; and Kichwa, of Colombia and Ecuador. Their listeners were a small online audience of fewer than two hundred people and a larger one of uncertain size in Guatemala. Radio, Kaufman noted, is an important tool for language activists. It reaches remote populations that might not have access to other media and boosts their morale.

The music was upbeat, but the faded maps on the office wall, the tangle of wires from a jury-rigged console, and the esprit de corps around a scuffed conference table might have been those of a guerrilla redoubt. A fourth endangered language crackled over the airwaves—that of left-wing revolution. “Fellow-combatants!” the men exhorted. “A mother tongue is a human birthright. We must fight for our own!”

If peripheral languages are to survive, they will have to find a way to coexist with what Bob Holman calls the “bully” languages. David Harrison told me, “The ideal of stable bilingualism is a given. Nobody wants these communities to remain isolated.” (China and Russia, however, consider ethnic languages a threat to their hegemony and have taken measures of varying severity to suppress them.) Even when there is persecution, the challenge, as Harrison sees it, is to “increase the prestige of a language so that the young embrace it.” In that respect, the fate of endangered languages may ultimately rest, as Mohawk does, with couples like Gabrielle Doreen and Lou Williams. They are determined to set an example for their children—both of fluency and self-worth. Then it will be up to the kids. Mina Beauvais spoke Mohawk with her only son, but, she said, “he married a Canadian English lady and didn’t pass it on.” Tom Porter told me, “We will do what we can, and if the young don’t cherish our way of life the Mother will take it back.”

On rare occasions, an extinct language has been resurrected. Jessie Little Doe Baird, a member of the Mashpee Wampanoag tribe, in Massachusetts, received a MacArthur grant, in 2010, for her efforts to revive her people’s extinct language, Wôpanâak. The tribe had been decimated by disease in the seventeenth century, and the last speakers died a hundred years ago. But written records of the language were relatively plentiful. A Wôpanâak Bible was published in 1663, the first translation of Scripture in Colonial America. John Eliot, a Puritan missionary who called himself “the Apostle to the Indians,” created an orthography with the tribe’s assistance, and taught its members to read. The Wampanoag welcomed literacy and left an archive of deeds and documents.

When Baird was pregnant with her fifth child, Mae Alice, she had a vision in which her ancestors called on her to fulfill an old prophecy that their language would come back to life. She was a social worker with no experience in linguistics, but she drafted a plan to revive Wôpanâak and was accepted into the Community Fellows Program at the Massachusetts Institute of Technology. A distinguished faculty of linguists, including Noam Chomsky, supported her project. Mae Alice is now the first native speaker of Wôpanâak in some seven generations.

Kaufman also cited the case of Daryl Baldwin—Kinwalaniihsia—a member of the Miami tribe of Oklahoma. The Miami (or Myaamia) originally lived in the Great Lakes area, where Baldwin was born. They spoke an Algonquian language that died out some fifty years ago, but there were texts and recordings of it, and some elders—“rememberers,” as linguists call them—taught him a few words. Baldwin earned a linguistics degree, specializing in Native American languages, from the University of Montana. He and his wife homeschooled their children in the Miami language, and in 2013 he founded the Myaamia Center, at Miami University in Ohio, to provide the community with cultural resources. Miami is now a growing language.

Kaufman was surprised when I told him about Keyuk—he hadn’t heard about his work with Selk’nam. I, in turn, was surprised to hear from Keyuk that he had given up his formal studies of linguistics. “I can reach more people through music than I could have as an academic,” he told me in an e-mail. When I pressed him for details, he was typically reticent, but he did mention that he had been working on a new Selk’nam lexicon and that, last May, he and a friend had met with a community in Tierra del Fuego. “We recorded some fragments that the elders remembered,” he said.

Keyuk’s friend turned out to be a twenty-four-year-old linguist, Luis Miguel Rojas-Berscia, who has corresponded at length on scholarly subjects with David Harrison. Rojas-Berscia himself is a prodigy. I reached him by telephone in his native Lima, where he was visiting his family. His childhood household was trilingual: his father is Peruvian, his mother is Italian, and his grandmother spoke Piedmontese. English was his fourth language—he learned it as a toddler—and the next seventeen tongues in which he is fluent, including Mandarin and Quechua, were, he says, “relatively easy to master.” (He has a working knowledge of fifteen others.)

After graduating from the Pontifical Catholic University of Peru, Rojas-Berscia moved to Holland, where he does research on language and cognition at the Max Planck Institute of Psycholinguistics. His doctoral thesis is on the Shawi, hunter-gatherers of the upper Amazon. The Shawi, he told me, number “about twenty thousand, but I give their language better odds than Quechua, which has ten million speakers.” That sounded counterintuitive but, he said, “Every language has its ecology. If it isn’t useful, the community will be forced to abandon it. Indigenous people in Latin America face all kinds of discrimination, and necessity dictates that, sooner or later, they adopt Spanish. Once that happens, the attrition is fast. Where a group is isolated from external pressures, they aren’t forced to accept the dominant language. So you can’t just go by the demographics.”

Selk’nam was the subject of Rojas-Berscia’s master’s research. A colleague thought that a young Chilean might be of help. It was Keyuk. “When I heard about him, I had my doubts,” Rojas-Berscia said. “I studied with some of the best linguists in the world, but how could a middle-school autodidact have mastered a language that died fifty years ago? I know that old Beauvoir lexicon he used—you can’t learn much grammar from it. So I devised a test. I held up pictures and asked him to describe them. The man is a mystery, but his Selk’nam is good.”

Rojas-Berscia had a travel stipend from the honors academy at Radboud University, in the Netherlands, which paid for the trip to Tierra del Fuego. The Selk’nam survivors whom he and Keyuk interviewed had forgotten their language, though not their identity. One of the elders was a tiny woman named Herminia Vera. She hadn’t spoken Selk’nam in eighty years, she told them, and, initially, she seemed suspicious of their interest. (Like Ivonne Gomez Castro, she had been mocked, as a girl, for her mestizo looks—though in her case it was because she looked “too European.”) As she warmed to Rojas-Berscia, he gave her his picture test, and the language of her childhood began to thaw. She and Keyuk engaged in a halting conversation about food, farming, and family heritage. “I don’t know who among us was the most surprised,” Rojas-Berscia said. Perhaps it was the glaciers (xųṣ), the rivers (ṣįkįn), the beaches (kųxhįjįk), and the sky (sįųn) hearing their own voice. Herminia Vera died two months later. ♦

Book Notes: Founders at Work

28 March 2015 - 1:00pm

In Founders at Work (2007), Jessica Livingston, founding partner at Y Combinator, interviews 30 startup founders.

The Steve Wozniak interview alone is worth the price of admission, but it also happens to be posted free in its entirety on the book’s website. Check that out to get a sense of the questions, answers, and how thorough each interview is. (Put another way: the book is very long and also very good.)

The companies were founded from the 1970s to the 2000s. The interviews take place a few years after the dot-com crash, and slightly before iPods, phones, and internet communications devices became a single thing.

There were a lot of “Ohhh yeah” moments (the nostalgic kind, not the Macho Man kind). They come in roughly two forms: 1.) memories from the founder stories where I could remember how that technology impacted me (90s things like Hotmail and WebTV were real fun for me), and 2.) memories from “present day” 2007, before mobile took over the world, like Digg driving heavy traffic to sites.

pg had this to say in an old HN thread about the book:

“Startups are basically comedies.”

I highlighted a ton of things and picked a few to share. Let’s start with some scenes from pg’s own comedy.

Paul Graham — Cofounder, Viaweb (and some investment firm)

Years ago, working out of an apartment wasn’t quite as accepted. Here, Paul Graham talks about trying to look legitimate when, you know, business people visited.

When that first giant company wanted to buy us and sent people over to check us out, all we had in our so-called office was one computer. Robert and Trevor mostly worked at home or at school. So we borrowed a few more computers and stuck them on desks, so it would look like there was more going on.

It really does sound like a sitcom. And the fun doesn’t stop there. Here, he talks about power going out in Cambridge. After running the generator indoors and deciding it was too loud (not, you know, too dangerous), they put the generator in the street and ran an extension cord:

It was running through our office at chest height and you could kind of twang it and it would go “boinnnnnggg.” Then we started the gas generator up in the street and that was just about bearable, so we ran the servers on that for a couple hours until the power came back.

pg: scrappy. Founders at Work has a lot of stories involving hardware: servers getting overloaded, learning to use a shrink-wrapping machine for CD distribution, a server going offline so someone could drive it somewhere with a T3 line, etc.

Steve Perlman — Cofounder, WebTV

When I mentioned “Ohhh yeah” moments, this was a big one for me. My cousin had a WebTV and so whenever my family visited his we’d use that to browse the web. And I have no idea what we’d even look at. At night we’d use his dad’s computer to play TetriNet.

With so many people connected in the Valley, founders also share early-day anecdotes from companies that aren’t their own. Here, Steve Perlman talks about Apple and dialog boxes:

The following is not something from my personal experience—it’s a story told to me by the Mac team—but they said that, when they first did the dialog boxes for the Lisa, instead of saying “OK,” it said, “Do It.” They found that people were reluctant to click on that, and they couldn’t figure out why. Then, once they had a test subject there who just wouldn’t click on it, they said, “Why didn’t you click on that little button there?” He said, “I’m not a dolt. Why would I click on that?”

So they changed it to “OK”. Don’t be a dolt, test with your users.

Mark Fletcher — Founder, ONElist, Bloglines

A lot of common themes run through the interviews. They’re things you hear all the time about startups and building products. Here, Mark Fletcher talks about finding a problem to solve:

I started ONElist because I wanted to start a mailing list for my parents, and at that time you had to download software and you had to have a computer connected to the Internet. It was just really difficult for an average person to put together a mailing list. So it was the same thing. I guess my advice is: solve a problem that you have, first and foremost, and chances are, other people may have the same problem.

How do you find a good problem to solve? Solve your own. Ben Trott and Mena Trott, founders of Six Apart, created Movable Type because there wasn’t a great existing solution for Mena’s rapidly growing blog. Stephen Kaufer built TripAdvisor after trying to plan a trip and finding the end steps of booking had solutions, but planning where to go and stay wasn’t a great experience.

Another theme Mark Fletcher talks about is iterating:

So just get something out there. If you find really early versions of ONElist or Bloglines on archive.org, the websites are horrible. They are crap, they don’t have any features, they just try to do one thing. And you just iterate because users are going to tell you what they want, and they’re your best feedback. It’s critical just to get something out quickly. Just to start shipping and then you can iterate.

Founders at Work was published before The Lean Startup, when lean methodologies and MVPs weren’t so widespread. Some of the concepts surely existed, but others wouldn’t work when releases were months or years apart. Charles Geschke, cofounder of Adobe, talks about Xerox in the 70s: “They said, ‘Oh wait a minute. At Xerox it takes us at least 7 years to bring a product out.’” Not to mention iterating after that.

Joshua Schachter — Founder, del.icio.us

The variety of people and companies profiled is great. Some grew to thousands of employees, some stayed small, some were funded, some were bootstrapped. Some bordered on ruin and others say things along the lines of “I don’t know, it actually went pretty smoothly!”

Here, Joshua Schachter talks about building del.icio.us in his spare time:

I could come in and look at it, figure out what I’m doing, do it, and be done for the day in 15 minutes. So if I could get one thing done a day, I was happy.

I love this. Success comes through a lot of paths and it’s good to see that something millions of people found useful was built in (sometimes very) small chunks of free time.

Stephen Kaufer — Cofounder, TripAdvisor

Again, a lot of companies built things we take for granted now. I booked a trip to Spain with some friends last year and, along with other sites, TripAdvisor was part of that. Still it was this whole production to plan out.

I can’t imagine planning without those resources. Stephen Kaufer talks about planning a trip before TripAdvisor and learning an island wasn’t all that great through a chat room. Here, he talks about finding content for TripAdvisor:

Then we hired people to read every single travel article we could find on the Net, and classify that article into our database, and write a one-line summary. It’s a fairly significant effort, and people that we talked to said, “You’re nuts. You’ll never finish.”

Do things that don’t scale.

Ron Gruner — Cofounder, Alliant Computer Systems; Founder, Shareholder.com

The most entertaining stories are the most ridiculous. Ron Gruner shares a story about an 800-number mixup. A company printed millions of annual reports with the wrong 800-number for the shareholder’s call. Reprinting wasn’t an option, so Ron Gruner and his team had a few days to figure out how to take that number over.

Because of privacy issues, the paging company wouldn’t give up their customer’s details. Ron shares their solution to this privacy issue:

Then Josiah Cushing, one of the college grads I had hired, during our staff meeting said, “Ron, why don’t you try hiring a private detective?”

The guy who owns the number agrees to transfer the number over as long as they pay for his pager subscription for a year. And a subscription for his wife. “We’ll make it two years.” Not much of a price to save an entire company.

And then there’s the “Ohhh yeah” moment of remembering that pagers were a thing.

If you have any interest in startups or technology, Founders at Work will have something you’ll like. In a few years, maybe it’d be cool to see a sequel with today’s companies. “Our site was getting hammered. We had to spin up more Heroku workers.” Maybe not.

On Saving the World and Other Delusions

28 March 2015 - 1:00pm

1.

I recently had a conversation with a friend of mine who suffered a crisis of faith of sorts. His startup, which initially had an extremely ambitious, world-altering business plan, had to retrench and start to find a more modest product-market fit. He was upset, not so much because of decreased prospects for a big dollar exit, but because, as he put it, “if I’m not trying to save the world, what’s the point of all this?”

It’s a standard narrative in the startup world: “the world is broken; I have a really ambitious plan to fix it.” But what I told him was that this is a totally crazy way of measuring both impact and a meaningful life. Most of the people who make a big impact in the world are doing paperwork, publishing research, working with the constraints of the system. They’re closer to a paper-pushing bureaucrat than a bold maverick. Sometimes the papers you’re pushing are exit visas for Jews.

The nerd’s sense of measuring everything here is a big handicap when it comes to assessing life meaningfulness. Our instincts for impact evolved in a world where only a few dozen people had real agency in your world; you were part of what we’d perceive as a small ingroup by default, and it wouldn’t be too crazy to think you could be one of the most respected and influential people in the known world. Today, it’s more difficult but still possible to achieve that feeling – but crucially, you have to carefully cultivate insensitivity to scope. You could become the manager of a small business, or a local leader in the Mormon church. Despite all the social disruptions of mobility and super-Dunbar living, that could probably still feel pretty similar from the inside to being a tribal elder.

But then nerds have to come in and ruin everything by measuring in terms of real world impact. And by that metric, nobody measures up to our brain’s expectation of impactfulness. Measured in terms of a civilization of billions, even the most successful career is going to feel like a drop in the bucket, and narrative-based dreams of world-changing are cartoonish. In theory, this quantitative thinking should also provide compensating solace, by saying “Yeah, well at least you did 10x what the average person is able to accomplish,” yet in practice I haven’t seen that many people deeply satisfied by that. It’s “save the world” or bust, without a sense of moderation.

It’s also not at all clear that saving the world is the best way to measure your life. Almost all societies in the past had a complex bucket of metrics involving personal virtue, material success, and success of the family – with “impact on the state of the world” being an also-ran at best. I suspect something in that vein is the most sustainable thing for humans, and that the startup bluster is maybe economically adaptive (as a way to overcome risk aversion and to project confidence) but also deeply insane given how human brains work. And the undermining of traditional notions of life success proportionally increases the importance of saving the world.

2.

One of the odd things I’ve noticed in our depictions of great leaders is that a big part of their influence comes from being able to get people to buy into a vision, and thereby get people to do things that they would otherwise never do. An ordinary leader can assemble a bunch of people doing their normal jobs at market wages, but if you can extract an effort or flexibility surplus in service of your vision, that makes it possible to attack a whole different class of coordination problems. Messianic leaders have been a staple throughout history, of course, but it seems that both the supply and the demand for such leaders is at an all-time high. Reading a self-improvement book published in the 1800s, it struck me how much of the leadership advice was personal, almost feudal: to make people follow you, be a publicly virtuous, reliable guy, someone people would be proud to work for. By contrast, for the vision-based leader, the pathos of the vision precedes the ethos of his claim to leadership ability.

I think this demand is related to our dysfunctional sense of meaningfulness. An undermining of traditional sources of meaningfulness leads people to seek meaning in their work, and this produces both a demand and an incentive for narrative-supplying entrepreneurs to fill that gap in exchange for super-market loyalty and dedication. This is potentially a fair bargain – the question, of course, is whether the entrepreneurs end up delivering, or whether they’re just providing the leverage to inflate a meaningfulness bubble that never gets paid off.

3.

There’s a phenomenon in psychiatry where people with two different psychiatric disorders – narcissistic personality disorder and borderline personality disorder – are frequently found in pairs. Commonly, you’d have a narcissist and borderline as close friends, or a (usually) male narcissist in a relationship with a (usually) female borderline. Narcissism is exactly what it sounds like: someone who for whatever reason has a deeply held need to be admired and considers his life story the most important thing in the world. Borderline personality disorder is best defined as a lack of a sense of identity; they tend to have huge emotional swings and identify themselves with a rapid succession of people in their lives. The narcissist needs others to validate his self-narrative; the borderline needs someone to give her a narrative to live. And so, it may not be surprising that relationships between a narcissist and borderline are pretty frequent, and, if not exactly stable, at least as stable as can be expected for people with personality disorders.

You can see where this is going. The need for, and premium on, vision-based leadership sort of looks like a widespread, subclinical version of borderline personality disorder – maybe we could rebrand it as “chronic questlessness.” Of course I’m not suggesting that people are crazy in the Beautiful Mind sense. Psychiatric disorders in general and personality disorders in particular are more a gradient than a Boolean diagnosis; they’re almost always exaggerations of heuristics that normal people use all the time. The threshold for diagnosis is nothing more than “okay, you’ve got some weird stuff going on; does it interfere with your functioning?” So what I’m suggesting can be translated to saying that there’s a broad-based, subtle shift in heuristics resulting in a lot of people seeking outside opinion on what they should value.

4.

For a long time I regarded the save-the-world thing as a basically harmless motivating delusion, the nerd equivalent of the coach’s pre-game pep talk where he tells your team that, against all odds and in the face of all objective evidence to the contrary, you are a bunch of winners and are going to take home the division trophy. But seeing my friend having his motivational system semi-permanently warped was something of a wake-up call, and got me thinking about how to avoid being sucked into that attractor. It’s tough because the tools of quantitative analysis that underpin this change-the-world heuristic are valid and indeed valuable. But these observations suggest that we should be wary of how easy it is to smuggle in the assumption that our benchmark should be a totally unrealistic amount of efficacy. And at the same time they argue for keeping a diversified life-meaning portfolio – you should include things like family success, physical and emotional quality of life, human relationships, and even relative social status as part of how you measure your life.

Han Solo and Bayesian Priors

28 March 2015 - 1:00pm

One of the most memorable errors in statistical analysis is a scene from The Empire Strikes Back. Han Solo, attempting to evade enemy fighters, flies the Millennium Falcon into an asteroid field. The ever knowledgeable C3PO informs Solo that probability is not on his side.

C3PO: Sir, the possibility of successfully navigating an asteroid field is approximately 3,720 to 1!

Han: Never tell me the odds!

Superficially this is just a fun movie dismissing 'boring' data analysis, but there's actually an interesting dilemma here. Even the first time you watch Empire you know that Han can pull it off. But, despite deeply believing that Han will make it through, is C3PO's analysis wrong? Clearly Han believes it's dangerous: 'They'd have to be crazy to follow us.' None of the pursuing TIE fighters make it through, which provides pretty strong evidence that C3PO's numbers are not off. So what are we missing?

What's missing is that we know Han is a badass! C3PO isn't wrong, he's just forgetting to add essential information to his calculation. The question now is: can we find a way to avoid C3PO's error without dismissing probability entirely as Han proposes? To answer this we'll have to model both how C3PO thinks and what we believe about Han, then find a method to blend those models.

C3PO's mind

We'll start by taking apart C3PO's reasoning. We know C3PO well enough by this point in the film to realize that he's not just making numbers up. C3PO is fluent in over 6 million forms of communication, and that takes a lot of data to support. We can assume then that he has actual data to back up his claim of 'approximately 3,720 to 1'. Because C3PO mentions 1:3720 is the approximate odds of successfully navigating an asteroid field we know that the data he has only gives him enough information to suggest a range of possible rates of success.

The only outcomes that C3PO is considering are successfully navigating the asteroid field or not. If we want to look at the various possible probabilities of success given the data C3PO has, the distribution we're going to use is the Beta distribution. We can define C3PO's reasoning with the following equation:

P(RateOfSuccess|Successes) = Beta(α,β)

The Beta distribution is parameterized with an α (number of times success observed) and a β (the number of times failure is observed). This distribution tells us which rates of success are most likely given the data we have. We can't really know what's in C3PO's head, but let's assume that not too many people have actually made it through an asteroid field and in general not that many people try (because it's crazy!). We're going to say that C3PO has records of two people surviving and 7,440 people ending their trip through the asteroid field in a glorious explosion. Below is a plot of the probability density function that represents C3PO's belief in the true rate of success when entering an asteroid field.

For any ordinary pilot entering an asteroid field this looks bad. In Bayesian terms, C3PO's estimate of the true rate of success given observed data is referred to as the likelihood.
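To get a feel for how bad this looks, here is a minimal Python sketch of C3PO's belief, assuming the counts chosen above (2 successes, 7,440 failures). The function names are our own for illustration, and the density is computed directly from the Beta formula with the standard library rather than a statistics package.

```python
import math

# Assumed counts from the text: 2 survivors, 7,440 explosions.
C3PO_ALPHA = 2
C3PO_BETA = 7440


def beta_log_pdf(x, alpha, beta):
    """Log density of Beta(alpha, beta) at x, valid for 0 < x < 1."""
    # log of the normalizing constant B(alpha, beta)
    log_norm = math.lgamma(alpha) + math.lgamma(beta) - math.lgamma(alpha + beta)
    return (alpha - 1) * math.log(x) + (beta - 1) * math.log1p(-x) - log_norm


def beta_pdf(x, alpha, beta):
    """Density of Beta(alpha, beta) at x."""
    return math.exp(beta_log_pdf(x, alpha, beta))


def beta_mean(alpha, beta):
    """Expected rate of success under Beta(alpha, beta)."""
    return alpha / (alpha + beta)


if __name__ == "__main__":
    mean = beta_mean(C3PO_ALPHA, C3PO_BETA)
    print("expected rate of success:", mean)
    # Density at a few candidate success rates near the mean
    for rate in (0.0001, 0.0003, 0.001):
        print(rate, beta_pdf(rate, C3PO_ALPHA, C3PO_BETA))
```

Under these assumed counts the expected rate of success is about 0.00027, and nearly all of the density sits below a rate of 0.001.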

But Han is a badass

The problem with C3PO's analysis is that his data is on all pilots, and Han is far from your average pilot. If we can't put a number to Han's 'badass' then our analysis is broken, not just because Han makes it (we have p-values to blame for that), but because we believe he's going to. Statistics is a tool to aid and organize our reasoning and beliefs about the world. If our statistical analysis not only contradicts our reasoning and beliefs but also fails to change them, then something is wrong with our analysis.

Why do we believe Han will make it? Because Han makes it through everything that's happened so far. What makes Han Solo, Han Solo is that no matter how unlikely he is to make it through something he still succeeds! We have a prior belief that Han will survive the asteroid field. The prior probability is something that is very controversial for people outside of Bayesian analysis. Many people feel that just 'making up' a prior is not objective. This scene from Empire is an object lesson in why it is even more absurd to throw out our prior beliefs. Imagine watching Empire the first time, getting to this scene and having a friend sincerely tell you that 'whelp, Han is dead now'. It is worth pointing out again, C3PO is not entirely wrong. If your friend said 'whelp, those Tie fighters are dead now', you would likely chuckle in agreement.

Now we have to come up with an estimate for our prior probability that Han will successfully navigate the asteroid field. We do have a real problem though, we have a lot of reasons for believing Han will survive but no numbers to back that up. We have to make a guess. Let's start with some sort of upper bound on his badassness. If we believe it was impossible for Han to die then the movie becomes boring. At the other end, I personally feel much more strongly about Han being able to make it than C3PO does about him failing. I'm going to say I roughly feel that Han has a 20,000:1 chance of making it through a situation like this. We'll use another beta distribution to express this for two reasons. First my beliefs are very approximate, so I'm okay with the true rate of survival being variable. Second, it makes calculations we need to do later much easier. Here is our distribution for our prior probability that Han will make it:
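As a rough sketch of what this prior implies, suppose we read the 20,000:1 guess as Beta(20000, 1); that reading is our assumption, as are the function names below. When β is 1 the Beta distribution has the simple CDF x^α, so we can ask how much of our belief sits above a given survival rate without any special libraries.

```python
def odds_to_beta(successes, failures):
    """Treat 'successes:failures' odds as pseudo-counts (alpha, beta)."""
    return float(successes), float(failures)


def beta_mean(alpha, beta):
    """Expected rate of success under Beta(alpha, beta)."""
    return alpha / (alpha + beta)


def prob_rate_above(threshold, alpha):
    """P(rate > threshold) under Beta(alpha, 1).

    Only valid when beta == 1, where the CDF is threshold ** alpha.
    """
    return 1.0 - threshold ** alpha


if __name__ == "__main__":
    alpha, beta = odds_to_beta(20000, 1)  # assumed prior for Han
    print("prior mean:", beta_mean(alpha, beta))
    for threshold in (0.99, 0.999, 0.9999):
        print(threshold, prob_rate_above(threshold, alpha))
```

The point of the spread is the one made above: the prior is extremely confident, but it still leaves a little room for Han to fail.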

Creating suspense with a posterior

We have now established what C3PO believes (likelihood) and modeled our own beliefs in Han (prior probability), but we need a way to combine these. By combining beliefs we create what is called our posterior distribution. In this case the posterior models our sense of suspense upon learning the likelihood from C3PO. The purpose of C3PO's analysis is in part to poke fun at his analytical thinking, but also to create a sense of real danger. Our prior alone would leave us completely unconcerned for Han, but when we adjust it based on C3PO's data we get a new belief in the real danger. The formula for the posterior is actually very simple and intuitive:

Posterior = Likelihood⋅Prior

The only thing not explicitly stated in this formula is that we usually want to normalize everything so it sums up to 1. It also turns out that combining our two beta distributions in this way, including the normalization, is remarkably easy.

Beta(α_posterior, β_posterior) = Beta(α_likelihood + α_prior, β_likelihood + β_prior)

And here is what our final, posterior, belief looks like:

Combining our C3PO belief with our 'Han is badass' belief we find that we have a much less extreme position than either of these. Our Posterior belief is a roughly 75% chance of survival, which means we still think Han has a good shot of making it, but we're much more nervous.
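The parameter-addition rule above is easy to check by hand. Here is a small sketch, assuming C3PO's belief is Beta(2, 7440) and reading our 20,000:1 prior as Beta(20000, 1); both parameter choices and the function names are illustrative assumptions, not something fixed by the film.

```python
import math


def combine_beta(likelihood, prior):
    """Add the (alpha, beta) pairs, following the rule stated above."""
    alpha_l, beta_l = likelihood
    alpha_p, beta_p = prior
    return (alpha_l + alpha_p, beta_l + beta_p)


def beta_mean(params):
    """Expected rate of success under Beta(alpha, beta)."""
    alpha, beta = params
    return alpha / (alpha + beta)


def beta_sd(params):
    """Standard deviation of Beta(alpha, beta)."""
    alpha, beta = params
    total = alpha + beta
    return math.sqrt(alpha * beta / (total ** 2 * (total + 1)))


if __name__ == "__main__":
    c3po = (2, 7440)   # assumed: 2 survivors, 7,440 failures
    han = (20000, 1)   # assumed: 20,000:1 prior in Han's favor
    posterior = combine_beta(c3po, han)
    print("posterior parameters:", posterior)
    print("posterior mean:", beta_mean(posterior))
    print("posterior sd:", beta_sd(posterior))
```

With these particular numbers the posterior is Beta(20002, 7441), whose mean works out to about 0.73, in the neighborhood of the rough figure quoted above. A different guess for the prior would shift it.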

This post has been pretty light on math, but the real aim was to introduce the idea of Bayesian priors and show that they are as rational as believing that Han Solo isn't facing certain doom by entering the asteroid field. At the same time we can't just throw away the information that C3PO has to share with us. The only sane way to understand the situation is to combine our belief in Han with the information C3PO has provided. This concept is a fundamental principle of Bayesian analysis.

This article first appeared on Will Kurt's Count Bayesie blog

Slack was hacked

28 March 2015 - 1:00am

We were recently able to confirm that there was unauthorized access to a Slack database storing user profile information. We have since blocked this unauthorized access and made additional changes to our technical infrastructure to prevent future incidents. We have also released two factor authentication and we strongly encourage all users to enable this security feature.

We are very aware that our service is essential to many teams. Earning your trust through the operation of a secure service will always be our highest priority. We deeply regret this incident and apologize to you, and to everyone who relies on Slack, for the inconvenience.

Here is some specific information we can share about this incident:

  • Slack maintains a central user database which includes user names, email addresses, and one-way encrypted (“hashed”) passwords. In addition, this database contains information that users may have optionally added to their profiles such as phone number and Skype ID.
  • Information contained in this user database was accessible to the hackers during this incident.
  • We have no indication that the hackers were able to decrypt stored passwords, as Slack uses a one-way encryption technique called hashing.
  • Slack’s hashing function is bcrypt with a randomly generated salt per-password which makes it computationally infeasible that your password could be recreated from the hashed form.
  • Our investigation, which remains ongoing, has revealed that this unauthorized access took place during a period of approximately 4 days in February.
  • No financial or payment information was accessed or compromised in this attack.
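To make the idea of a salted one-way hash concrete, here is a minimal Python sketch. It is not Slack's implementation: bcrypt is not in the Python standard library, so this example swaps in PBKDF2-HMAC-SHA256 from `hashlib`, which follows the same pattern of a random per-password salt plus a deliberately slow hash. The iteration count and the function names are assumptions made for illustration.

```python
import hashlib
import hmac
import os
from typing import Tuple

# Assumed work factor for illustration; real systems tune this value.
ITERATIONS = 100_000


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest) using a fresh random salt for this password."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-hash the candidate with the stored salt and compare safely."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    # compare_digest avoids leaking information through timing.
    return hmac.compare_digest(candidate, expected)


salt_a, digest_a = hash_password("correct horse battery staple")
salt_b, digest_b = hash_password("correct horse battery staple")

# Same password, different salts, so the stored hashes differ.
print(digest_a != digest_b)
print(verify_password("correct horse battery staple", salt_a, digest_a))
```

Because each password gets its own salt, an attacker holding the database cannot reuse one precomputed table across all accounts and has to attack each hash separately.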

Since the compromised system was first discovered, we have been working 24 hours a day to methodically examine, rebuild and test each component of our system to ensure it is safe. We are collaborating with outside experts to cross-check assumptions and ensure that we are meticulous in our approach. In addition we have notified law enforcement of this illegal intrusion.

As part of our investigation we detected suspicious activity affecting a very small number of Slack accounts. We have notified the individual users and team owners who we believe were impacted and are sharing details with their security teams. Unless you have been contacted by us directly about a password reset or been advised of suspicious activity in your team’s account, all the information you need is in this blog post.

We are committed to continual improvement of both internal security practices and development of features that help you take control of your own and your team’s security on Slack. In addition to the recent changes to our infrastructure, we have also just released two new features you should know about:

  • Two Factor Authentication (“2FA”; also known as “two step verification”), which is now available for all users/teams. Detailed instructions are available on our help site and if you are signed in, you can set it up right now on your team site. We strongly recommend that everyone use 2FA, both on Slack and everywhere else it is available.
  • A “Password Kill Switch” for team owners, which allows for both instantaneous team-wide resetting of passwords and forced termination of all user sessions for all team members (which means that everyone is signed out of your Slack team in all apps on all devices). Team owners can find this option under the authentication tab of your team settings.

For more on our security practices and policies, see https://slack.com/security. Should you have any questions, see our FAQ below or contact us at security@slack.com.

Again, our most sincere apologies. We are making every effort to prevent any similar occurrence in the future.

Anne Toth
VP, Policy & Compliance Strategy

FAQ

Q: How do I reset my password?

You can reset your password in your Slack profile settings. In addition, team owners and administrators can now easily reset passwords for an entire team at once using our new “password kill switch” feature.

If your Slack team uses single sign-on (SSO) you do not need to reset your password as we do not store passwords for users with this feature enabled.

Q: Why are you releasing Two Factor Authentication now? Why not earlier?

Two Factor Authentication has been in development for the last few months. It is a complicated change which requires additional support resources, administrative capabilities, changes to all applications, mobile and desktop, and extensive testing. We were about a week from release, with just a few small UI tweaks to simplify and clarify the usage experience.

We have decided to release it immediately, despite the remaining bits of clunky-ness: the feature works and it does provide a significant new level of protection against unauthorized access to your Slack account. We will be improving this feature in future releases but the feature functionality is what is most important right now.

Q: What are you doing to prevent additional breaches?

We cannot overemphasize how seriously we take this incident and the importance we place on the security of your information in the broadest sense, from internal compliance processes, audits and physical access control to continual review of our systems design and approach to technical operations.

We have launched Two Factor Authentication and additional administrative security tools to help users and teams better manage the security of their own accounts. You can expect to hear more about new security initiatives and features in Slack and you can count on our commitment to the ongoing investment in and prioritization of Slack’s security.

Q: Were my messages taken/read/accessed?

If you have not been explicitly informed by us in a separate communication that we detected suspicious activity involving your Slack account, we are very confident that there was no unauthorized access to any of your team data (such as messages or files).

Q: Who can I reach if I have additional questions?

If you have questions outside of those covered here please contact security@slack.com.

422 Free Art Books from the Metropolitan Museum of Art

28 March 2015 - 1:00am

You could pay $118 on Amazon for the Metropolitan Museum of Art’s catalog The Art of Illumination: The Limbourg Brothers and the Belles Heures of Jean de France, Duc de Berry. Or you could pay $0 to download it at MetPublications, the site offering “five decades of Met Museum publications on art history available to read, download, and/or search for free.” If that strikes you as an obvious choice, prepare to spend some serious time browsing MetPublications’ collection of free art books and catalogs.

You may remember that we featured the site a few years ago, back when it offered 397 whole books free for the reading, including American Impressionism and Realism: The Painting of Modern Life, 1885–1915; Leonardo da Vinci: Anatomical Drawings from the Royal Library; and Wisdom Embodied: Chinese Buddhist and Daoist Sculpture in The Metropolitan Museum of Art. But the Met has kept adding to their digital trove since then, and, as a result, you can now find there no fewer than 422 art catalogs and other books besides. Those sit alongside the 400,000 free art images the museum put online last year.

So have a look at MetPublications’ current collection and you’ll find you now have unlimited access to such lush as well as artistically, culturally, and historically varied volumes as African Ivories; Chess: East and West, Past and Present; Modern Design in The Metropolitan Museum of Art, 1890–1990; Vincent Van Gogh: The Drawings; French Art Deco; or even a guide to the museum itself (vintage 1972).

Since I haven’t yet turned to art collection — I suppose you need money for that — these books don’t necessarily make me covet the vast sweep of artworks they depict and contextualize. But they do make me wish for something even less probable: a time machine so I could go back and see all these exhibits firsthand.

Related Content:

Download Over 250 Free Art Books From the Getty Museum

The Metropolitan Museum of Art Puts 400,000 High-Res Images Online & Makes Them Free to Use

The Guggenheim Puts 109 Free Modern Art Books Online

Where to Find Free Art Images & Books from Great Museums, and Free Books from University Presses

700 Free eBooks for iPad, Kindle & Other Devices

Colin Marshall hosts and produces Notebook on Cities and Culture as well as the video series The City in Cinema and writes essays on cities, language, Asia, and men’s style. He’s at work on a book about Los Angeles, A Los Angeles Primer. Follow him on Twitter at @colinmarshall or on Facebook.


Ikea's flat-pack refugee shelter is entering production

28 March 2015 - 1:00am

UN refugee agency buys 10,000 lightweight 'Better Shelters' for delivery this summer

(© Ikea Foundation)

Ikea's line of flat-pack refugee shelters is going into production, the Swedish furniture maker announced this week, after being tested among refugee families in Ethiopia, Iraq, and Lebanon. The lightweight "Better Shelter" was developed under a partnership between the Ikea Foundation and the United Nations High Commissioner for Refugees (UNHCR). Each unit takes about four hours to assemble and is designed to last for three years, far longer than conventional refugee shelters, which last about six months.

That's important considering the prolonged refugee crisis that has unfolded across the Middle East. The ongoing war in Syria has spurred nearly 4 million people to leave their homes, according to UN figures, and as the conflict enters its fifth year, there's still no end in sight. Many have sought refuge in neighboring countries, while others have tried to cross into Europe.

The crisis has put considerable strain on refugee camps, but the Ikea Foundation, Ikea's philanthropic arm, hopes the Better Shelter could make life a little easier for those staying there. Measuring about 188 square feet, each shelter accommodates five people and includes a rooftop solar panel that powers a built-in lamp and USB outlet. The structure ships just like any other piece of Ikea furniture, with insulated, lightweight polymer panels, pipes, and wires packed into a cardboard box. According to Ikea, it only takes about four hours to assemble.

"Putting refugee families and their needs at the heart of this project is a great example of how democratic design can be used for humanitarian value," Jonathan Spampinato, the Ikea Foundation's head of strategic planning and communications, said in a statement Tuesday. "We're incredibly proud that the Better Shelter is now available, so refugee families and children can have a safer place to call home."

Production of the Better Shelter is scheduled to begin soon. The UNHCR has agreed to buy 10,000 of the shelters, and will begin providing them to refugee families this summer.

  • Each shelter measures 17.5 square meters (188 square feet), accommodating up to five people. (© BetterShelter.org)

  • A look at the interior of a Better Shelter prototype used in the Kawergosk Refugee Camp in Erbil, Iraq (© BetterShelter.org)

  • Each shelter takes about four hours to assemble. (© Ikea Foundation)

  • The ongoing civil war in Syria has triggered a refugee crisis across the Middle East, forcing millions to seek shelter in Iraq and other neighboring countries. (© BetterShelter.org)

  • According to UN figures, Ethiopia has received about 200,000 new refugees since the beginning of 2014, mostly from South Sudan. Above, families assemble a Better Shelter prototype in the Hilawyen Refugee camp in Dollo Ado, Ethiopia. (© BetterShelter.org)

  • The Better Shelter is designed to last for up to three years, far longer than conventional tents. (© Ikea Foundation)

  • A young Somali refugee stands with her baby in front of a Better Shelter prototype. (© Ikea Foundation)

  • The shelter's textile sheet is designed to reflect sun during the day and retain heat at night. (© Ikea Foundation)

  • Production of the Better Shelter is scheduled to begin soon, and the UNHCR will begin providing them to families this summer. (© Ikea Foundation)

  • Solar panels on top of each shelter power its lights and a USB connector. (© Ikea Foundation)

A standard for building APIs in JSON

28 March 2015 - 1:00am

If you've ever argued with your team about the way your JSON responses should be formatted, JSON API is your anti-bikeshedding weapon.

By following shared conventions, you can increase productivity, take advantage of generalized tooling, and focus on what matters: your application.

Clients built around JSON API are able to take advantage of its features around efficiently caching responses, sometimes eliminating network requests entirely.

Here's an example response from a blog that implements JSON API:

{
  "links": {
    "self": "http://example.com/posts",
    "next": "http://example.com/posts?page[offset]=2",
    "last": "http://example.com/posts?page[offset]=10"
  },
  "data": [{
    "type": "posts",
    "id": "1",
    "title": "JSON API paints my bikeshed!",
    "links": {
      "self": "http://example.com/posts/1",
      "author": {
        "self": "http://example.com/posts/1/links/author",
        "related": "http://example.com/posts/1/author",
        "linkage": { "type": "people", "id": "9" }
      },
      "comments": {
        "self": "http://example.com/posts/1/links/comments",
        "related": "http://example.com/posts/1/comments",
        "linkage": [
          { "type": "comments", "id": "5" },
          { "type": "comments", "id": "12" }
        ]
      }
    }
  }],
  "included": [{
    "type": "people",
    "id": "9",
    "first-name": "Dan",
    "last-name": "Gebhardt",
    "twitter": "dgeb",
    "links": {
      "self": "http://example.com/people/9"
    }
  }, {
    "type": "comments",
    "id": "5",
    "body": "First!",
    "links": {
      "self": "http://example.com/comments/5"
    }
  }, {
    "type": "comments",
    "id": "12",
    "body": "I like XML better",
    "links": {
      "self": "http://example.com/comments/12"
    }
  }]
}

The response above contains the first in a collection of "posts", as well as links to subsequent members in that collection. It also contains resources linked to the post, including its author and comments. Last but not least, links are provided that can be used to fetch or update any of these resources.
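To show how a client might follow the "linkage" entries in the example response to the matching objects in "included", here is a small Python sketch. It uses only the standard `json` module and a trimmed copy of the example document; the helper names `index_included` and `resolve` are hypothetical and are not part of JSON API or of any particular client library.

```python
import json

# A trimmed copy of the example response (only the fields used below).
RESPONSE = json.loads("""
{
  "data": [{
    "type": "posts",
    "id": "1",
    "title": "JSON API paints my bikeshed!",
    "links": {
      "author": {
        "linkage": { "type": "people", "id": "9" }
      },
      "comments": {
        "linkage": [
          { "type": "comments", "id": "5" },
          { "type": "comments", "id": "12" }
        ]
      }
    }
  }],
  "included": [
    { "type": "people", "id": "9", "first-name": "Dan",
      "last-name": "Gebhardt", "twitter": "dgeb" },
    { "type": "comments", "id": "5", "body": "First!" },
    { "type": "comments", "id": "12", "body": "I like XML better" }
  ]
}
""")


def index_included(doc):
    """Map (type, id) to the resource for everything in "included"."""
    return {(res["type"], res["id"]): res for res in doc.get("included", [])}


def resolve(doc, resource, name):
    """Return the included resource(s) that the named link points at."""
    linkage = resource["links"][name]["linkage"]
    lookup = index_included(doc)
    if isinstance(linkage, list):  # to-many relationship
        return [lookup[(item["type"], item["id"])] for item in linkage]
    return lookup[(linkage["type"], linkage["id"])]  # to-one relationship


post = RESPONSE["data"][0]
author = resolve(RESPONSE, post, "author")
comments = resolve(RESPONSE, post, "comments")

print(author["first-name"], author["last-name"])
print([comment["body"] for comment in comments])
```

Keying the lookup on the (type, id) pair mirrors how the response itself identifies resources, so the same helper works for both the single author and the list of comments.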

JSON API covers creating and updating resources as well, not just responses.

Status

This document is a work in progress and will change as implementation work progresses. See the Status page for more information.

MIME Types

JSON API has been properly registered with the IANA. Its media type designation is application/vnd.api+json.

Format documentation

To get started with JSON API, check out documentation for the base specification.

Extensions

JSON API can be extended in several ways.

Official extensions are available for Bulk and JSON Patch operations.

Update history
  • 2015-03-16: Release candidate 3 released.
  • 2013-05-03: Initial release of the draft.
  • 2013-07-22: Media type registration completed with the IANA.

You can subscribe to an RSS feed of individual changes here.

Built with Jekyll and Highlight.js. Hosted by GitHub Pages.
Generated at 2015-03-27T12:28:13+00:00.

Inform: A Language for Interactive Fiction

28 March 2015 - 1:00am

Inform is a design system for interactive fiction based on natural language. It is a radical reinvention of the way interactive fiction is designed, guided by contemporary work in semantics and by the practical experience of some of the world's best-known writers of IF.

Interactive fiction lets the player explore your worlds and stories through text. Write adventure games, historical simulations, gripping stories or experimental digital art.

Inform's source reads like English sentences, making it uniquely accessible to non-programmers. It's very easy to get started. Watch a screencast.

Inform runs under Mac OS X, Windows, Linux, and more. The games it produces can be played on an even wider range of platforms, including handheld devices, legacy computers and the iPhone. Download Inform for your platform.

Inform is used in the classroom by teachers at all levels from late elementary school through university. Playing and writing interactive fiction develops literacy and problem-solving skills and allows the development of historical simulations. See tutorials and reports from the field.

Inform is widely used with screen-readers and other tools serving the visually-impaired.

Inform build 6L02, new for May 2014, makes a major reform of the language. Text handling is better, Inform can now generate adaptive grammar, there's real number support, and much more.

Build 6L02 also introduces a Public Library of extensions, downloadable from within the Inform application. (This feature debuts on Mac OS X, but will spread to other platforms soon.)

Inform now has an open bug tracker (powered by Mantis). Of course, the software is perfect, but this is where faults would be reported if not.

The two books in Inform come built in to the application, but you can also download them as EPUB ebooks.

Inform's first website was a single hand-coded HTML page in the primitive, 10,000-site Web of 1995. Today the Web has a hundred million sites, and we're larger and better too. This is our fourth website, coded by Liza Daly of Threepress Consulting using the Django content management framework. Welcome.

Super Mario 64 HD

27 March 2015 - 1:00pm

Demonstration project for the Super Character Controller, a recreation of Super Mario 64’s first level, Bob-Omb Battlefield. Everything is just as you remember, except some really minor stuff that nobody cared about like red coins or the Wing Cap or the Big Bob-omb. Replacing them are crowd pleasers like giant springs and coin blocks.

Gamepad support is available, so if you have one make sure to open the controls menu to set it up (keyboard controls are default). I’ve tested it with the Xbox One, Xbox 360, and DualShock 3 and 4. If any of these do not work, please post a comment below describing your situation. Likewise, if you use a gamepad that I did not list and it did work, please mention it so I can add it to the list.

If any of the links on this page are down, please post in the comments so that I can get them working again!

A desktop version is also available below for the two most popular operating systems and some lame one nobody outside of its devoted cult following actually likes.

When playing the desktop versions, do not edit the input settings in the Unity input menu. Instead, set up your input configuration in the in-game controls menu.

For Unity developers, the Unity project zip can be downloaded below. While this project was not developed at all using Unity 5, I did upgrade it to Unity 5 and resolve all the various issues, so there should not be any complications. It’s worth noting that the framerate is much, much worse in the Editor, so if you don’t have a 1337 computer like mine you may notice some pretty big hits.

I currently do not have any plans to develop this any further or to resolve any bugs, unless they’re horrendously gamebreaking and horrendously simple to fix. This project is provided as-is, and you are free to use it for any purposes you like, with the exception of selling it for profit. All included code libraries’ previous licences still apply.

Other than the Super Character Controller (and its respective libraries), I use a heavily modified cInput v1.4 from roidz to handle custom input, and pixelplacement’s iTween for path tweening (for the rolling balls). All the art and animations were done by myself, with the exception of the Mario, Goomba and Power Star meshes, which are ripped (without animations) from Super Mario Galaxy. A large portion of the sounds are from existing Mario games, while the ones I found and edited myself are from freesound.org. If I’ve used anyone’s work and missed a citation, please tell me in the comment section (or message me through the Unity forum). The UI elements were painted by me, based on the original Mario 64 user interface.

Termui – Go terminal dashboard

27 March 2015 - 1:00pm

Go terminal dashboard. Inspired by blessed-contrib, but purely in Go.

Cross-platform, easy to compile, and fully-customizable.

Demo:

Grid layout:

Expressive syntax, using a 12-column grid system:

    import ui "github.com/gizak/termui"
    // init and create widgets...

    // build
    ui.Body.AddRows(
        ui.NewRow(
            ui.NewCol(6, 0, widget0),
            ui.NewCol(6, 0, widget1)),
        ui.NewRow(
            ui.NewCol(3, 0, widget2),
            ui.NewCol(3, 0, widget30, widget31, widget32),
            ui.NewCol(6, 0, widget4)))

    // calculate layout
    ui.Body.Align()

    ui.Render(ui.Body)

demo code:

The helloworld color scheme drops in some colors!

demo code

godoc

MIT License

U.S. Air Force overstepped bounds in SpaceX certification: report

27 March 2015 - 1:00pm

The unmanned Falcon 9 rocket, launched by SpaceX and carrying NOAA's Deep Space Climate Observatory Satellite, lifts off from launch pad 40 at the Cape Canaveral Air Force Station in Cape Canaveral, Florida February 11, 2015.

Reuters/Scott Audette

WASHINGTON (Reuters) - The U.S. Air Force overstepped its bounds as it worked to certify privately held SpaceX to launch military satellites, undermining the benefit of working with a commercial provider, an independent review showed on Thursday.

The report cited a "stark disconnect" between the Air Force and SpaceX, or Space Exploration Technologies, about the purpose of the certification process and recommended changes.

Air Force Secretary Deborah James ordered the review after the service missed a December deadline for certifying SpaceX to compete for some launches now carried out solely by United Launch Alliance, a joint venture of Lockheed Martin Corp and Boeing Co.

The Pentagon is eager to certify SpaceX as a second launch provider, given mounting concerns in Congress about ULA's use of a Russian-built engine to power its Atlas 5 rocket.

The Air Force said on Monday it was revamping the certification process and hoped to complete the work by June, but it did not release the report on the review until Thursday.

The report, prepared by former Air Force Chief of Staff General Larry Welch, said the Air Force treated the process like a detailed design review, dictating changes in SpaceX's Falcon 9 rocket and even the company's organizational structure.

That approach resulted in over 400 issues that needed to be resolved, which was "counterproductive" to a national policy aimed at encouraging competition in the sector.

In fact, the process was intended to show that SpaceX met overall requirements to launch military satellites, not carry out the more detailed review required for each launch on a case-by-case basis, he said.

Welch faulted SpaceX for assuming its experience launching other Falcon 9 rockets would suffice to be certified, and not expecting to have to resolve any issues at all.

"The result to date has been ... the worst of all worlds, pressing the Falcon 9 commercially oriented approach into a comfortable government mold that eliminates or significantly reduces the expected benefits to the government of the commercial approach. Both teams need to adjust," he said.

He urged the Air Force's Space and Missile Systems Center to "embrace SpaceX innovation and practices," while SpaceX needed to understand the Air Force's need to mitigate risks, and be more open to benefiting from the government's experience.

(Reporting by Andrea Shalal. Editing by Andre Grenon)

Density – Fast compression library

27 March 2015 - 1:00pm

Superfast compression library

DENSITY is a free C99, open-source, BSD licensed compression library.

It is focused on high-speed compression, at the best ratio possible. DENSITY features a buffer and stream API to enable quick integration in any project.

Program    Library         Compress           Decompress          Size        Ratio   Round trip
sharc -c1  density 0.12.0  0.111s (900 MB/s)  0.085s (1175 MB/s)  61 524 502  61.52%  0.196s
lz4 -1     lz4 r126        0.461s (217 MB/s)  0.091s (1099 MB/s)  56 995 497  57.00%  0.552s
lzop -1    lzo 2.08        0.367s (272 MB/s)  0.309s (324 MB/s)   56 709 096  56.71%  0.676s
sharc -c2  density 0.12.0  0.212s (472 MB/s)  0.217s (460 MB/s)   53 156 782  53.16%  0.429s
sharc -c3  density 0.12.0  0.361s (277 MB/s)  0.396s (253 MB/s)   47 991 605  47.99%  0.757s
lz4 -3     lz4 r126        1.520s (66 MB/s)   0.087s (1149 MB/s)  47 082 421  47.08%  1.607s
lzop -7    lzo 2.08        9.562s (10 MB/s)   0.319s (313 MB/s)   41 720 721  41.72%  9.881s

Squash

Squash is an abstraction layer for compression algorithms, and has an exhaustive set of benchmark results, including density's, available here. You can choose between system architecture and compressed file type. There are even ARM boards tested! A great tool for selecting a compression library.

FsBench

FsBench is a command line utility that enables real-time testing of compression algorithms, but also hashes and much more. A fork with the latest density releases is available here for easy access. The original author's repository can be found here. Very informative tool as well.

Here are the results of a couple of test runs on a MacBook Pro, OSX 10.10.2, 2.3 GHz Intel Core i7, 8 GB 1600 MHz DDR, SSD:

enwik8 (100,000,000 bytes)

Codec               version     args  C.Size (C.Ratio)     E.Speed    D.Speed    E.Eff.  D.Eff.
density::chameleon  2015-03-22        61524474 (x 1.625)   903 MB/s   1248 MB/s  347e6   480e6
density::cheetah    2015-03-22        53156746 (x 1.881)   468 MB/s   482 MB/s   219e6   225e6
density::lion       2015-03-22        47991569 (x 2.084)   285 MB/s   271 MB/s   148e6   140e6
LZ4                 r127              56973103 (x 1.755)   258 MB/s   1613 MB/s  111e6   694e6
LZF                 3.6         very  53945381 (x 1.854)   192 MB/s   370 MB/s   88e6    170e6
LZO                 2.08        1x1   55792795 (x 1.792)   287 MB/s   371 MB/s   126e6   164e6
QuickLZ             1.5.1b6     1     52334371 (x 1.911)   281 MB/s   351 MB/s   134e6   167e6
Snappy              1.1.0             56539845 (x 1.769)   244 MB/s   788 MB/s   106e6   342e6
wfLZ                r10               63521804 (x 1.574)   150 MB/s   513 MB/s   54e6    187e6

silesia (211,960,320 bytes)

Codec               version     args  C.Size (C.Ratio)      E.Speed    D.Speed    E.Eff.  D.Eff.
density::chameleon  2015-03-22        133118910 (x 1.592)   1040 MB/s  1281 MB/s  386e6   476e6
density::cheetah    2015-03-22        101751474 (x 2.083)   531 MB/s   493 MB/s   276e6   256e6
density::lion       2015-03-22        89433997 (x 2.370)    304 MB/s   275 MB/s   175e6   159e6
LZ4                 r127              101634462 (x 2.086)   365 MB/s   1815 MB/s  189e6   944e6
LZF                 3.6         very  102043866 (x 2.077)   254 MB/s   500 MB/s   131e6   259e6
LZO                 2.08        1x1   100592662 (x 2.107)   429 MB/s   578 MB/s   225e6   303e6
QuickLZ             1.5.1b6     1     94727961 (x 2.238)    370 MB/s   432 MB/s   204e6   238e6
Snappy              1.1.0             101385885 (x 2.091)   356 MB/s   1085 MB/s  185e6   565e6
wfLZ                r10               109610020 (x 1.934)   196 MB/s   701 MB/s   94e6    338e6
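
For readers who want to check the C.Ratio column, the short Python sketch below recomputes it from the file sizes and compressed sizes quoted above for density::chameleon. It assumes the ratio is simply original size divided by compressed size, which matches the published figures to three decimals; the function names are hypothetical and not part of DENSITY or FsBench.

```python
def compression_ratio(original_bytes: int, compressed_bytes: int) -> float:
    """Ratio as printed in the tables: how many times smaller the output is."""
    return original_bytes / compressed_bytes


def compressed_percent(original_bytes: int, compressed_bytes: int) -> float:
    """Compressed size as a percentage of the original size."""
    return 100.0 * compressed_bytes / original_bytes


# Sizes taken from the enwik8 and silesia results for density::chameleon.
ENWIK8_BYTES = 100_000_000
SILESIA_BYTES = 211_960_320
CHAMELEON_ENWIK8 = 61_524_474
CHAMELEON_SILESIA = 133_118_910

enwik8_ratio = compression_ratio(ENWIK8_BYTES, CHAMELEON_ENWIK8)
silesia_ratio = compression_ratio(SILESIA_BYTES, CHAMELEON_SILESIA)

print(f"enwik8:  x {enwik8_ratio:.3f}")
print(f"silesia: x {silesia_ratio:.3f}")
print(f"enwik8 compressed to {compressed_percent(ENWIK8_BYTES, CHAMELEON_ENWIK8):.2f}% of original")
```

The percentage form is the inverse view of the same number, so a ratio near x 1.6 corresponds to output a little over 60% of the input size.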

SpookyHash algorithm, which is extremely fast and offers a near-zero performance penalty. An additional integrity check will then be automatically performed during decompression.

https://github.com/tarsa). It is derived from chameleon and uses swapped double dictionary lookups and predictions. It can be extremely good with highly compressible data (ratio reaching 10% or less). On typical compressible data compression ratio is about 50% or less. It is still extremely fast for both compression and decompression and is a great, efficient all-rounder algorithm.

Lion ( DENSITY_COMPRESSION_MODE_LION_ALGORITHM )

Lion is a multiform compression algorithm derived from cheetah. It goes further in the areas of dynamic adaptation and fine-grained analysis. It uses swapped double dictionary lookups, multiple predictions, shifting sub-word dictionary lookups and forms rank entropy coding. Lion provides the best compression ratio of all three algorithms under any circumstance, and is still very fast.

the SHARC project.

Australia outlaws warrant canaries

27 March 2015 - 1:00pm

The Australian Parliament has passed a series of amendments to the country's Telecommunications (Interception and Access) Act 1979, requiring "telecommunications service providers to retain for two years telecommunications data (not content) prescribed by regulations."

The two-year retention period equals the maximum allowed under the EU's earlier Data Retention Directive that was struck down last year by the Court of Justice of the European Union for being "a wide-ranging and particularly serious interference with the fundamental rights to respect for private life and to the protection of personal data." This month, the European Commission announced that it had no plans to introduce a new Data Retention Directive, although Member States are still able to introduce their own national legislation.

Despite that move away from retaining communications metadata by the EU and continuing concerns in the US about the National Security Agency's bulk phone metadata spying program, the Australian government was able to push through the amendments implementing data retention thanks to the support of the main opposition party. Labor agreed to vote in favor of the Bill once a requirement to use special "journalist information warrants" was introduced for access to journalists' metadata, with a view to shielding their sources. No warrant is required for obtaining the metadata of other classes of users, not even privileged communications between lawyers and their clients. Even for journalists, the extra protection is weak, and the definition of what constitutes a journalist is rather narrow—bloggers and occasional writers are probably not covered.

Warrant canaries can't be used in this context either. Section 182A of the new law says that a person commits an offense if he or she discloses or uses information about "the existence or non-existence of such a [journalist information] warrant." The penalty upon conviction is two years' imprisonment.

During the relatively quick passage of the amendments, the Australian government made the usual argument that metadata needs to be retained for long periods in order to fight terrorism and serious crime—even though the German experience is that, in practice, data retention does not help. Toward the end of the debate, when concerns about journalist sources were raised, one senior member of the Australian government adopted a more unusual approach to calming people's fears.

Speaking to Sky News, Australia's Communications Minister Malcolm Turnbull said that there were "always ways for people to get around things." As The Guardian reported, Turnbull went on to list a few ways to dodge the new law: "If... I communicate with you via Skype, for a voice call, or Viber, or I send you a message on Whatsapp or Wickr or Threema or Signal or Telegram—there's a gazillion of them—or indeed if we have a Facetime call, then all that the telco can see insofar as it can see anything is that my device has had a connection with, say, the Skype server or the Whatsapp server… it doesn’t see anything happen with you."

Of course, it won't only be journalists who use these and other tools to mask their metadata. Many of the applications mentioned by Turnbull are already very widely used by members of the public, and they're likely to become even more popular, assuming the Australian data retention scheme survives challenges in the courts. If it does, implementation will make it more expensive to go online, thanks to additional costs passed on by ISPs. The scheme also puts civil liberties at risk from leaks and theft from the stores of personal metadata or from abuse by officials, all while doing little to provide the security benefits claimed by the Australian government.