Life Grand Challenges – Innocentive but not enough Incentives (Part 3)

In this series I explore three aspects of the Life Grand Challenge. In the first part, I briefly described the Life Grand Challenges, estimated active participation, and proposed that the current unfairness in the challenge may be what is holding back active participation. In the second part, I detailed the resources currently available to accuracy challenge participants and the possibility that our current generation is only looking for quick rewards. In this last part, I will describe the successful NetFlix challenge and its crowd sourcing model.

NetFlix Challenge

Ever used IMDB or Rotten Tomatoes as an indicator of how good a movie is and whether you should watch it? I have, but lately I really need to find out who has been rating crap movies as good. For example, Animal Kingdom is the biggest piece of artsy fartsy crap I’ve ever seen, yet it got 97% on Rotten Tomatoes, while Tekken (best movie ever) only got 5.0 on IMDB. This highlights the problem – just because someone else likes a movie doesn’t mean I would like it.

Netflix wanted to know what a customer would rate a movie, given that customer’s history of movie ratings. They had an existing implementation called Cinematch but wanted people to improve its accuracy. The competition rules/specifications in point form:

  1. The competition opened in 2006 and was scheduled to run until 2011.
  2. A million dollar prize would be awarded to the person/team that improved accuracy by 10%. Once a participant hit this target, other participants were notified and had 30 days to try to beat this submission.
  3. A progress prize of $50,000 was given each year to the person/team that improved accuracy by 1% over the previous progress prize winner. For the first year, this was simply the submission with the best accuracy.

NetFlix Data Set

The challenge participants were given two data sets.

  1. “The training data set consists of more than 100 million ratings from over 480 thousand randomly-chosen, anonymous customers on nearly 18 thousand movie titles.” Customer details were withheld and a random customer ID was used.
  2. “A qualifying test set is provided containing over 2.8 million customer/movie id pairs with rating dates but with the ratings withheld. These pairs were selected from the most recent ratings from a subset of the same customers in the training data set, over a subset of the same movies.”

The movie rating is an integer score between 1 and 5 stars. The goal is to predict the withheld ratings in the test data set. The root mean square error (RMSE) is then calculated from the predicted and actual (i.e. the withheld) ratings. The RMSE achieved by the Cinematch program was 0.9525, therefore the grand prize winner had to achieve 0.8572 (i.e. a 10% accuracy improvement). This was achieved in September 2009.
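The scoring above can be sketched in a few lines of code. The rating vectors below are made up for illustration (the real qualifying set has over 2.8 million customer/movie pairs), but the RMSE formula and the 10% improvement arithmetic are exactly as described:

```python
import math

def rmse(predicted, actual):
    """Root mean square error between predicted and actual (withheld) ratings."""
    assert len(predicted) == len(actual)
    squared_error = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared_error / len(predicted))

# Cinematch baseline and the grand prize target (10% improvement)
cinematch_rmse = 0.9525
grand_prize_target = cinematch_rmse * 0.90  # 0.85725, quoted as 0.8572

# Toy example: five withheld 1-5 star ratings vs. a model's predictions
actual = [4, 3, 5, 2, 4]
predicted = [3.8, 3.4, 4.5, 2.6, 4.1]
print(round(rmse(predicted, actual), 4))  # → 0.405
```

Note that a lower RMSE is better: a model that predicted every withheld rating exactly would score 0.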

How big was the crowd Netflix was sourcing from?

According to the leader board:

“There are currently 51051 contestants on 41305 teams from 186 different countries. We have received 44014 valid submissions from 5169 different teams.” This makes the 500 or so accuracy challenge participants (with currently NO submissions) look rather unimpressive for the Life Grand Challenge. Now I will highlight the differences between the two crowd sourcing models.

Competition Fairness

First and foremost, with the NetFlix challenge you are not competing with the company itself. There is no team of Cinematch programmers with access to their whole database competing against you. Second, everyone has the same data set to try to figure out a better algorithm. In massive contrast, for the Ion Torrent, larger sequencing centers will have larger data sets for their employees to work with. I currently have a pathetic TWO whole 314 data sets to work with. Yes, this is better than having ONE, but not much better. Meanwhile, accuracy challengers working at Sanger, Broad and BGI are swimming in this stuff. Not to mention preferred and early access customers. This clearly highlights that Life Technologies is NOT democratizing sequencing; it looks more like a dictatorship. Don’t be surprised if someone from BGI wins this competition. This unfairness is even worse for the speed and throughput challenges, which also favor the sequencing centers with massive budgets, i.e. Sanger, Broad and BGI!

Moving Target

The target for the NetFlix challenge was an RMSE of 0.8572, set in 2006. This target did not change during the period of the competition. In contrast, the target for the accuracy challenge changes every 3 months. This quarter’s target is actually twice as hard to achieve as last quarter’s. If active participation was next to zero last quarter, this isn’t a good way of encouraging people to take up the challenge.

Milestone rewards

The milestone reward each year was $50,000 for the NetFlix challenge. In massive contrast, the Life Grand Challenge gives you absolutely nothing for all your hard efforts. Kinda makes you wanna code to 4am each night and turn up to work as a zombie 😆 No way, I’m waiting for the release of Diablo 3 for doing that 😛

There are two simple things that made the NetFlix crowd sourcing model successful. These two concepts were taken from a great blog post discussing the NetFlix Challenge. I have added two points of my own: fairness, and the value of ideas and concepts.

Reward system

Both the NetFlix challenge and the Life Grand Challenge have a 1 million dollar grand prize; however, the similarities end there. The NetFlix challenge rewards the best submission of the year with $50,000. Imagine all the instant noodle packets you can buy with that money 😆 In addition, the challenge participants are publicly known, which serves as free advertisement to the world of their talents. This is a non-monetary but valuable reward. At the moment, for the grand challenge, all participants, submissions and the leader board are kept hidden from the public.

The value of Ideas and Concepts

My suggestion is to reward ideas/concepts and not just solutions. This would in effect provide small rewards or milestones on the way to the best solution. These ideas and concepts might be valued at $5,000 or $10,000, and of course a solution that smashes the benchmark would still be worth 1 million. The likelihood of someone in the public coming up with a complete solution is low, but much higher for ideas/concepts that appear to have potential.

To expand a little further on the solution vs. idea/concept distinction: a solution must work within a restrictive framework over which computational analysis has no control. For example, most people working on this problem do not have an Ion Torrent and therefore cannot modify flow order, chemistry, etc. If you view the accuracy challenge as a holistic problem, with many factors coming together to achieve the goal, a solution may sadly be difficult to achieve. However, an idea/concept does not have to be limited to this restrictive framework. For example, an idea/concept could become a solution if things upstream were tweaked to take advantage of it.


Besides some people having bigger brains and no life, the NetFlix challenge was quite fair. Everyone had the same data set available to them and they were not competing with NetFlix employees. The Life Grand Challenge, meanwhile, is as fair as an election in Zimbabwe 🙂 In the Ion Community, the concept of offering a small grant to promising participants was entertained. A 30% discount on a PGM purchase was also offered to accuracy challenge participants. I argued that it would be impossible for someone to prove that they were active. It would go something like this: “Thanks for the discounted PGM, I tried improving accuracy but it was too hard so I gave up after 5 minutes.” This offer on Innocentive was promptly removed for other reasons. I do hope they come up with some innovative initiatives to improve an obvious lack of fairness.


The 454 homopolymer problem has existed since that technology was released. Why would Ion Torrent persist with a similarly flawed design? The 454 problem still persists, so it is not going to be solved in one quarter; therefore it is important to keep the motivation of the challengers up at all times. My suggestion: a leader board that is publicly available. All programmers love to see their name in lights, particularly if it is at the top. Also a substantial quarterly prize, and I’m not talking about a free T-shirt 😛 Even the PGM users in the Ion Community showing their best Ion Torrent runs are given a reward. Developers have been offered nothing; little surprise that no third party plugins have been developed for the Torrent Browser.

This concludes my opinion based blog series on the Grand Challenge and how it may be a “grand challenge” to improve the current model, which appears more like a public relations stunt than a true, fair competition. The criticism is harsh, but I believe I am voicing the concerns of many, and through feedback I have faith the competition will improve. My next few posts will be purely technical, because there is only so much opinion I can write before I start sounding like Abe Simpson.

My next post will be Part 3 on the Fundamentals of base calling, where I will outline the EXACT computational challenges that must be overcome to improve accuracy. This post will be released at the start of next week along with the source code used to generate the data for that series.

Disclaimer: For the good of all mankind! This is purely my opinion and interpretation. I have tried my best to keep all analyses correct. Opinions are like PGMs, everyone should have one 🙂

5 responses to “Life Grand Challenges – Innocentive but not enough Incentives (Part 3)”

  1. All excellent points that Life should seriously consider; however, the Netflix challenge appealed to a wider audience that Life can’t reach. Are there even 50k people on the globe that are aware of Ion Torrent?

  2. Thanks for the comments Dave. Yeah, I forgot to point out that everyone loves movies !!! Especially ones where video games are turned into movies 😆 In contrast, not so many people are interested in staring at a screen full of numbers or nucleotide incorporation curves. Therefore, there are even more reasons why there should be more incentives and the illusion of fairness to encourage more people to actively participate.

  3. This is a roundabout number of Ion Torrents installed around the world
    (click on Ion Torrent to filter).
    Not many…. but more people would know about it if I got a free shirt… *hint hint*. Not that the bogans in my home town would appreciate it.

  4. I agree that there is a fair contrast in the userbase between people looking for a good movie to watch and people wanting to improve the output of their PGM, but the concept is still good. As you pointed out in the first part, there are a lot of very intelligent people out there that you can make use of who either have not been motivated enough to do something or just didn’t know enough about the field to get involved in it. Being able to access them will surely be good for development.
    It is also true that having no intermediate rewards can be seen as a negative to the plan, particularly with the moving target. It seems to me that under these conditions, as time passes, the difficulty of getting more people involved increases dramatically, as if they start now they are already months behind other people working on it, not to mention the actual staff employed to work on the issue. Incorporating a larger number of much smaller rewards as well as the final prize makes sense to me, as people tend to work much more efficiently when you break down a large task into a series of smaller ones. It is far more daunting to have a week to do your giant paper than it is to set yourself the task to finish the page you’re on before you go have some lunch. Smaller rewards for smaller tasks would, in my opinion, not only increase your participation rates but also the productivity of those participants.
    Using the numbers from the NetFlix example, it’s not even going to cost them that much. $50,000 a year isn’t really that much money to a large company, but to me it’s more than a year of my full time wages. This is going to give me a fair bit of motivation to work at a task, particularly if I can see it as a smaller and more easy to complete task than the overall final goal. The result of this sort of motivation from the company is, as you also mentioned earlier, a lot of man hours working on a problem for relatively low costs. If 10 people spend 4 hours a week on it, you have effectively employed a full time staff member to work on the problem (without needing to worry about any of the other issues that come with employing a staff member, like workspace etc). However, if 100 people spend 4 hours a week working on it, for that same expenditure you have basically employed 10 people. To my mind it just seems logical that you would set up a more extensive system of smaller goals for smaller rewards.
    However, at the end of the day the fact that they are willing to work with the community at all with their IP is a good step forwards, which will hopefully be of a benefit to the scientific community as a whole.

    Also, I agree with your comments regarding Diablo III, and may join you as a workplace zombie upon its release (although if I can make enough money out of its auction house system, my job and hobby may reverse roles. . .)

  5. Thanks Andrew for your support and taking the time to write a detailed response. I too hope Life Technologies revisits the reward/incentive structure after this quarter has ended in a few weeks. However, maybe it was never their intention to pay out any money and it was merely for publicity. Only time will tell.
