Archive | May, 2012

I feel vindicated by what Netflix did with the results of the Netflix Prize.

1. Fancy ML techniques don’t matter. The winning BellKor/Pragmatic Chaos teams implemented ensemble methods with something like 112 techniques smushed together. You know how many of those the Netflix team implemented? Exactly two: RBMs and SVD.

2. Domain knowledge trumps statistical sophistication. This has always been the case in the recommendation engines I’ve done for clients. We spend most of our time trying to understand the space of your customers’ preferences: the cells, the topology, the metric, common-sense bounds, and so on. You can model these characteristics in object-oriented code. And (see bottom) doing so seems to improve the ML result a lot.

3. What you measure matters more than what you squeeze out of the data.
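
To make points 1 and 3 concrete, here is a minimal sketch of blending two models' predictions and scoring the result with RMSE, the metric the Netflix Prize used. The rating values are made-up toy numbers and the function names are hypothetical; this is not Netflix's pipeline, just an illustration of what combining two predictors can look like.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length lists of ratings."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

def blend(preds_a, preds_b, weight_a=0.5):
    """Weighted average of two models' predictions (a very simple ensemble)."""
    return [weight_a * a + (1 - weight_a) * b for a, b in zip(preds_a, preds_b)]

# Toy values, invented purely for illustration.
actual = [4.0, 3.0, 5.0, 2.0]
svd_preds = [3.5, 3.5, 4.5, 2.5]   # stand-in for an SVD model's output
rbm_preds = [4.4, 2.8, 5.0, 1.4]   # stand-in for an RBM model's output

blended = blend(svd_preds, rbm_preds)
print("SVD RMSE:    ", rmse(svd_preds, actual))
print("RBM RMSE:    ", rmse(rbm_preds, actual))
print("Blended RMSE:", rmse(blended, actual))
```

On these toy numbers the blend scores better than either model alone, but whether that RMSE gain matters to customers is exactly the measurement question point 3 raises.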

(Full Story: I feel vindicated by what Netflix did with the results of the Netflix Prize.)

Projects Are the New Job Interviews – HBR

Ultimately, the reason why I’m confident that projects are the new job interviews is not simply because I’m observing a nascent trend but because this appears to be a more efficient and effective mechanism for companies and candidates to gain the true measure of each other. Designing great applijects and projeclications will be a craft and art. The most successful utilizers will quickly be copied. Why? Because the brightest and most talented people typically like having real-world opportunities to shine and succeed.

(Full Story: Projects Are the New Job Interviews – HBR)

5 Ways Process Is Killing Your Productivity

1. Empowering with permission – but without action: It’s not empowering when people are given more responsibility, yet must still obtain an unreasonable number of approvals and sign-offs to get anything done. This signals a lack of trust.
2. Leaders focused on process instead of people: Leaders look to processes, not people, to solve problems and it doesn’t work. Where’s the inspiration, the vision? This signals a lack of humanity.
3. Overdependence on meetings: Productive teamwork does not require meetings for every single action or decision. People become overwhelmed and ineffective when they are always stuck in meetings.
4. Lack of (clear) vision
5. Management acts as judge, not jury: If the purpose of a meeting is to think, create, or build, management has to stop tearing people down when they propose new ideas or question the status quo. This signals a lack of perspective and openness.

(Full Story: 5 Ways Process Is Killing Your Productivity)

Startups are Creating a New System of the World for IT

One reason for this revolution is explained by Etsy in terms of Conway’s Law:

When a team makes a product, the product ends up resembling the team that made it.

I’ll extend this notion to say the team and thus the product end up resembling the underlying technology used to make it. When you change the underlying development infrastructure, by moving to a cloud, you are bound to change the teams and the processes they create.

(Full Story: Startups are Creating a New System of the World for IT)

How Amazon saved Zynga’s butt—and why Zynga built a cloud of its own | Ars Technica

For all Amazon’s scalability, the offerings can be a bit rigid. For example, you can rent an Amazon instance with a certain amount of storage and compute power, but adding a few gigabytes of memory or another processor might require buying a whole separate instance, which may have more resources than you really need.

“You can’t go to the public cloud and say I want another 64GB of memory here. They look at you and say ‘buy another instance of this type,’” Leinwand said.

Leinwand said the Amazon instance model leads to over-subscription, meaning you end up buying more storage than necessary. Internally, Zynga uses direct-attached storage striped across multiple servers, providing a big I/O performance boost and more efficient utilization, he said.
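
The article does not describe Zynga's actual storage layout, so the following is only a generic sketch of round-robin striping, assuming fixed-size stripes spread evenly over a set of servers. The function name and parameters are hypothetical. It shows why striping helps I/O: consecutive stripes land on different servers, so a large sequential read is served by several machines in parallel.

```python
def locate(offset, stripe_size, num_servers):
    """Map a logical byte offset to (server index, byte offset on that server).

    Assumes simple round-robin striping: stripe 0 goes to server 0,
    stripe 1 to server 1, and so on, wrapping around.
    """
    stripe_index = offset // stripe_size          # which stripe holds this byte
    server = stripe_index % num_servers           # round-robin server choice
    local_stripe = stripe_index // num_servers    # stripes already on that server
    local_offset = local_stripe * stripe_size + offset % stripe_size
    return server, local_offset

# Four servers, 64-byte stripes (toy sizes chosen for readability).
for logical in (0, 64, 128, 192, 256, 300):
    print(logical, "->", locate(logical, stripe_size=64, num_servers=4))
```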

(Full Story: How Amazon saved Zynga’s butt—and why Zynga built a cloud of its own | Ars Technica)

Pinterest Architecture Update – 18 Million Visitors, 10x Growth, 12 Employees, 410 TB of Data

80 million objects stored in S3 with 410 terabytes of user data, 10x what they had in August. EC2 instances have grown by 3x.

(Full Story: Pinterest Architecture Update – 18 Million Visitors, 10x Growth, 12 Employees, 410 TB of Data)

A Big Data Infographic – taming big data

Big Data includes data sets whose size and type make them impractical to process and analyze with traditional database technologies.

(Full Story: A Big Data Infographic – taming big data)

What Powers Instagram: Hundreds of Instances, Dozens of Technologies

Most of our data (users, photo metadata, tags, etc.) lives in PostgreSQL; we’ve previously written about how we shard across our different Postgres instances. Our main shard cluster involves 12 Quadruple Extra-Large memory instances (and 12 replicas in a different zone).
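
The excerpt does not say how rows are assigned to shards, so this is a rough sketch of one common approach: hash IDs into many logical shards, then map contiguous ranges of logical shards onto the 12 primary instances. The count of 1,200 logical shards and all function names are assumptions made for illustration, not Instagram's actual scheme.

```python
PHYSICAL_INSTANCES = 12   # from the excerpt: 12 primary instances
LOGICAL_SHARDS = 1200     # assumed value, chosen to divide evenly by 12

def logical_shard(user_id: int) -> int:
    """Pick a logical shard for a user by simple modulo."""
    return user_id % LOGICAL_SHARDS

def physical_instance(user_id: int) -> int:
    """Map the user's logical shard to one of the physical instances."""
    shards_per_instance = LOGICAL_SHARDS // PHYSICAL_INSTANCES
    return logical_shard(user_id) // shards_per_instance

for uid in (0, 99, 100, 1199, 1234567):
    print(uid, "-> logical", logical_shard(uid), "-> instance", physical_instance(uid))
```

Keeping many more logical shards than machines means data can later be rebalanced by moving whole logical shards to new instances without changing how IDs map to logical shards.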

We’ve found that Amazon’s network disk system (EBS) doesn’t support enough disk seeks per second, so having all of our working set in memory is extremely important. To get reasonable IO performance, we set up our EBS drives in a software RAID using mdadm.

(Full Story: What Powers Instagram: Hundreds of Instances, Dozens of Technologies)

Timeline

Timeline is also great for pulling in media from different sources. It has built in support for pulling in Tweets and media from Twitter, YouTube, Flickr, Vimeo, Google Maps and SoundCloud. Creating one is as easy as filling in a Google spreadsheet or as detailed as JSON.

(Full Story: Timeline)