The Ion Torrent Accuracy Challenge – A non-biological explanation

My computing friends have read my blog and don’t understand it, but they think my lame jokes are quite good. They’re my friends, so they MUST think the lameness is funny 😛 Anyways, in this blog post I will attempt to explain the Accuracy Challenge in a more computer-science-friendly, albeit nerdier, way. Most of the biological jargon is good to know but not absolutely necessary for solving the problem. The explanation below is aimed at people who are familiar with the ACM problem sets (sorry, I fixed the dodgy link, my bad). I used to solve these problems when I was working at IBM; it was a good way of staying awake after a big lunch. My point with this post is to show that the challenge could be pitched to the talented people who participate in ACM challenges if approached correctly.

Problem:

King Koopa (Bowser?) – Has no respect for thesis writing. This guy is really bad!

Yoshi – I think this dude is a lizard

The pipe used to hide Yoshi’s PhD thesis!

The evil King Koopa has stolen Yoshi’s PhD thesis before he was able to submit it. Yoshi, being a lizard, has a language composed of 4 letters {A,C,T,G} from which words are constructed. Koopa decides to shred Yoshi’s thesis into one million words, places each one of them in one of these green tubes and seals it.

In order to rewrite his thesis, Yoshi must infer what word is in each tube. This can be achieved by tapping one of the four sides of the tube over an 11-second period (a round). Yoshi taps at a rate of 15 taps per second. Yoshi repeats this procedure on the tube sides for 260 rounds. Therefore each tube receives a total of 15 x 11 x 260 = 42,900 taps, spread equally across the 4 sides.
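For anyone who wants to check the tap counts, here is the arithmetic as a few lines of Python. The constant names are mine; the numbers are the ones given above.

```python
# Tap bookkeeping for a single tube, using the numbers from the problem text.
TAPS_PER_SECOND = 15
SECONDS_PER_ROUND = 11
ROUNDS = 260
SIDES = 4  # one side per letter in {A, C, T, G}

taps_per_round = TAPS_PER_SECOND * SECONDS_PER_ROUND  # taps in one round
total_taps = taps_per_round * ROUNDS                  # taps per tube
rounds_per_side = ROUNDS // SIDES                     # rounds spent on each side
taps_per_side = total_taps // SIDES                   # taps landing on each side

print(taps_per_round, total_taps, rounds_per_side, taps_per_side)
# prints: 165 42900 65 10725
```

So each side of a tube gets 65 rounds of 165 taps.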

The echo caused by the tapping allows Yoshi to infer the letters (because that’s what lizards do!). The number of copies of a letter inferred in a round can be 0, 1, 2, etc., depending on how loud the echoes are at crucial taps in that round of tapping. Each of the 4 sides corresponds to one of the letters in Yoshi’s 4-letter language. Yoshi tests each letter in turn and cycles through them in a logical fashion.

Input:

Measured echo intensities can be thought of as a two dimensional array.

int data[n][m]

0 <= n < 260 – The round number. This determines the letter currently being decoded (i.e. side of the tube that’s being tapped on).

0 <= m < 11 x 15 – The tap number. Each round of tapping goes for 11 seconds. Therefore at a tapping rate of 15 taps/sec, there are 11 x 15 taps in total for a round.
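As a rough sketch of this layout in Python: the array below is just a placeholder filled with zeros, and the side order is an assumption on my part. The problem only says Yoshi cycles through the sides in a logical fashion, so `ASSUMED_SIDE_ORDER`, `letter_for_round` and `rounds_for_letter` are hypothetical names for illustration.

```python
N_ROUNDS = 260            # valid round numbers: 0 <= n < 260
TAPS_PER_ROUND = 11 * 15  # valid tap numbers:   0 <= m < 165

# ASSUMPTION: the sides are tapped in the repeating order A, C, T, G.
# The real order is not given in the problem statement.
ASSUMED_SIDE_ORDER = "ACTG"


def letter_for_round(n: int) -> str:
    """Return the letter being tested in round n under the assumed order."""
    if not 0 <= n < N_ROUNDS:
        raise IndexError(f"round number {n} out of range")
    return ASSUMED_SIDE_ORDER[n % len(ASSUMED_SIDE_ORDER)]


def rounds_for_letter(letter: str) -> list[int]:
    """Return every round number in which the given letter is tested."""
    return [n for n in range(N_ROUNDS) if letter_for_round(n) == letter]


# Placeholder echo intensities: data[n][m] is tap m of round n.
data = [[0] * TAPS_PER_ROUND for _ in range(N_ROUNDS)]
```

Under that assumed order, `rounds_for_letter("T")` picks out the 65 rounds where the T side is tapped, and `data[n]` for each of those is one 165-tap trace.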

Output:

Below is the result of 4 different rounds of tapping the same side of the green tube, the side corresponding to the letter T in Yoshi’s alphabet. Yoshi has very good ears and is able to tell the difference between the subwords T, TTT and TTTTT, and the case where there is no T at the current position of the word. Although Yoshi has good ears, over time the echoes created by previous taps persist for longer, making it hard for Yoshi to separate echoes within and between rounds. In addition, in later rounds Yoshi gets tired and doesn’t tap as hard, so the echoes produced aren’t as loud, making it difficult for Yoshi to discriminate between letter lengths. Often Yoshi gives up before getting to the end of the word and at times just guesses.

The accuracy challenge is to help Yoshi filter out the persistent echoes within and between rounds so he can concentrate on the “true echo”. It’s also to get Yoshi to HARDEN UP and not get so tired in the later rounds of tapping.
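To make “filter out the persistent echoes” a bit more concrete, here is a deliberately naive sketch. It is not the challenge solution and not how the real machine does it. It assumes two things the problem does not promise: that a round where the letter is known to be absent can stand in for the lingering echo, and that a round with exactly one letter tells you how loud a single letter is. The function names and the toy numbers are made up for illustration.

```python
def subtract_background(trace, zero_trace):
    """Remove the assumed lingering echo (a no-letter round) tap by tap."""
    return [s - z for s, z in zip(trace, zero_trace)]


def estimate_length(trace, zero_trace, one_trace):
    """Guess how many copies of the letter a round saw.

    ASSUMPTION: after background subtraction, the summed echo grows
    linearly with the number of letters. That is an idealisation.
    """
    signal = sum(subtract_background(trace, zero_trace))
    unit = sum(subtract_background(one_trace, zero_trace))
    if unit <= 0:
        raise ValueError("single-letter round is not louder than background")
    return max(0, round(signal / unit))


# Toy traces (made-up numbers, far shorter than a real 165-tap round).
zero = [10, 40, 60, 50, 30]    # no letter present: background echo only
one = [10, 50, 75, 60, 35]     # one letter
three = [10, 70, 105, 80, 45]  # three letters

print(estimate_length(three, zero, one))  # prints: 3
```

This ignores exactly the hard parts: echoes that carry over between rounds, and Yoshi tapping more softly later on, which means a single fixed `one_trace` stops being a good yardstick.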

Figure 1. Yeah, I didn’t have time to relabel the axes. Anyways, these are the echo intensities plotted from 4 rounds of tube tapping. Each round lasts for 11 seconds at 15 taps per second.

Disclaimer: For the good of all mankind! This is purely my opinion and interpretations. If Koopa did that with my thesis, I would hit him with the red shell!

One response to “The Ion Torrent Accuracy Challenge – A non-biological explanation”

  1. I understand from the technology description on Ion Torrent’s website (without having entered the competition yet; perhaps more detailed information and a more specific problem description become available upon doing so?) that if there is a mismatch, i.e. flooded with A when T is required to continue the sequence, then there would be no signal. In this plot (source?), however, a 0-mer gives a strong signal as well (only at the start of the signal do they seem to differ appreciably). I assume the company’s animation and explanation are a bit oversimplified… Trying to understand your explanation: the buildup (even for a 0-mer) is due to an “echo” of the “taps”, and not due to electron release? What is a “tap”, physically speaking? If it is due to buildup of previous taps, why does the graph decrease after a while? If it were simply echoes, it should saturate due to previous taps… The plot reminds me a lot of experiments I did with PMT tubes etc…

    Also strange from the perspective of the commercial explanation is that the plots show voltage, when it is claimed that a charge is released. This would necessitate a capacitance function, which would depend on the specific setup… Also unclear is what preprocessing (both hardware and software) was done to arrive at the array of integers you describe.

    Does this graph in fact represent an impulse response?

    Something else I don’t understand is how exactly polymerization of one or more nucleotides releases a net charge: chemistry conserves charge, although I can understand that the mobility of hydrogen ions will be much higher than that of the other, heavier ions released.

    Is there more information available once you enter the challenge and agree to the terms?
