In the first part of this three-part series, I gave a brief introduction to the Life Grand Challenge, estimated the current participation, and then speculated about why it isn’t higher. To be unbiased and not be seen as a hater, I decided to make this a three-part series. Before discussing the lessons learned from the successful Netflix crowdsourcing model (Part 3), I will write about what Life Technologies has done right with the Grand Challenge.
One million dollar prize money
If this doesn’t motivate you then I guess nothing will. Rothberg and the PR machinery couldn’t have done any more to get it out there all over the internet. Although if the prize were a date with a girl, this would have worked equally well for Life Technologies. See Beauty and the Geek as a reference. I guess the girls working on the challenge would probably want the one million 😛
Life Technologies are quite serious and have provided what I believe are adequate resources. First, the source code, which is an extremely important learning resource even if you aren’t interested in the accuracy challenge. It is a very rare opportunity to be allowed a look at such a valuable piece of intellectual property. In addition, you get an insight into the progress of the code as new versions are released each quarter. The code itself is very impressive, particularly in how it exploits the NVIDIA Tesla GPU, although the GNU matrix and linear algebra libraries may not be fully optimized yet 😦 I can’t wait for more bioinformatics programs to exploit the power of GPU computing. My only criticism: get larger screens for the developers… I feel their pain. I can tell how big their screens are by where they break their lines 😦
Second, a forum space on the Ion Community to engage with the actual people who have written the code. This is an extremely rare opportunity to have a developer tell you why they decided to do things the way they did.
Third, access to computing resources which emulate the Ion Torrent server. This is done by making an Ubuntu server image available on the Amazon EC2 cloud. The cloud includes ONE full 314 data set of an E. coli run, which is approximately 50 GB in size. Access to this cloud, however, comes at a cost that depends on how much CPU time you consume. On InnoCentive, a VMware Ubuntu 64-bit server image is offered as a download for the poor students. Before our Ion Torrent server arrived, I used VirtualBox, which is a free alternative to VMware and is compatible with VMware virtual disks. There is still some work to do with regard to making MORE raw data sets (i.e. DAT files) easily accessible. This would remove the temptation for accuracy challengers to overfit their implementation to the ONE data set they’re given.
Last, a brief introduction to the Life Grand Challenges and the major topics and themes in the accuracy challenge. This is in audio form on BlogTalkRadio. Therefore, if you are like me and don’t like to read, you don’t have an excuse because someone reads it for you … too easy !!! And there are no tricky big words 🙂 There is also a webinar series to help those starting out, particularly with Amazon EC2. Another helpful tool is a custom R package to parse and analyze the data sets.
Why aren’t people motivated?
There is one million dollars on offer and enough resources for someone who is competent in bioinformatics and programming. Besides the reasons I outlined in Part 1, what else could be holding people back?
I have been on this Earth for a while and find humans very weird, and I much prefer lizards because they can grow their tails back. Anyway, humans of our generation like small rewards and milestones while trying to achieve something big or hard, otherwise they lose interest extremely fast. Meanwhile, people from the Facebook/Twitter generation think that if something can’t be done in 5 minutes then it’s just too difficult or impossible. It’s the “Nature publication? That’s easy, that will take one week’s worth of work” attitude. Someone in the Ion Community asked for the mathematical model of nucleotide incorporation (implemented in the source code) to be spoon-fed to them. Probably after their brilliant idea of googling the C++ solution with the query “Life Grand Challenge C++ solution” didn’t work. But what can you expect from a generation that googles its English essays?
Anyway, back on topic. The way the reward system is structured, you get absolutely nothing if you do not produce something twice as good. Although it was hinted that any solution would be considered, this promise does not seem to be set in stone, because it was not written anywhere in the InnoCentive specifications, or I didn’t notice it.
In conclusion, it’s quite obvious that having a challenge is much better than not having a challenge at all. If anything, it has made a load of resources available to the scientific community. In contrast, Roche never shared many resources with its 454 community. That explains why, after years of struggle by scientists, the naive Bayesian approach taken by PyroBayes is still the best publication on dealing with the homopolymer problem. A great publication, but limited by the resources available. Roche and Illumina should loosen their grip on their intellectual property. Yes, people may laugh at and criticize your intellectual property, but eventually someone will help find a better way of doing something you are struggling with, and then everyone wins 🙂
Disclaimer: For the good of all mankind! These are purely my opinions and interpretations. I have tried my best to keep all analyses correct. Open source, funny comments 🙂