What do you think about when you look at a picture like this one? Chances are you try to find the ways in which the child resembles its parents. Sometimes it resembles one parent much more than the other, and this can also change as it grows older. Most often, however, the child will have traits that are a complicated mixture of its parents’ traits (for instance, if you look at the color of this particular baby’s eyes, you will notice that it differs from the color of both of its parents’ eyes). Not surprisingly, the way this mixture works can be explained by mathematics.
Many traits, such as eye color, nose shape, and height, depend on our genetic code, which, as I discussed in my interview with Mathieu Blanchette, can simply be thought of as a very long string of As, Cs, Gs and Ts. But when a child is conceived, it will get a part of its genetic code from the mother and a part from the father. The choice of which part comes from which parent is largely random, although some contiguous regions, called linkage disequilibrium blocks, will usually come entirely from one parent. The majority of traits, however, are influenced by multiple parts of multiple genes, and this is where the complexity comes from.
For instance, if height were entirely determined by a single gene (and nutritional and other environmental factors did not play a role), the child’s height would be equal to either its mother’s or its father’s height. However, if 100 different genes were to influence a child’s height, and each one had an equal probability of coming from the mother or from the father, and contributed equally to the child’s height, what would the height distribution look like? The answer is the same as the answer to this question: if you flip a fair coin 100 times, what is the distribution of the number of heads you will get? This is the binomial distribution, which looks like this:
Notice how closely the histogram matches the red line – a bell curve, or a normal distribution. This is a consequence of the central limit theorem, an important result in mathematics, which says that if you draw a large number of independent samples from the same probability distribution (one with a finite variance), their average will be approximately normally distributed. By the way, height is actually influenced by about 200 genes, but new ones are still being discovered in large GWAS (genome-wide association studies), which I briefly discussed in an earlier post.
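For readers who like to check such claims themselves, here is a minimal Python sketch of the coin-flip model above. It assumes exactly 100 genes, each coming from the mother or the father with probability 1/2, and compares the exact binomial probabilities with a bell curve that has the same mean and standard deviation. The names N_GENES, binomial_pmf and normal_pdf are my own, chosen only for this illustration.

```python
import math

N_GENES = 100   # assumed number of genes, as in the coin-flip example
P = 0.5         # assumed chance that a given gene comes from the mother


def binomial_pmf(k, n=N_GENES, p=P):
    """Exact probability that exactly k of the n genes come from the mother."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def normal_pdf(x, mean, sd):
    """Height of the bell curve with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


# Mean and standard deviation of the binomial distribution.
mean = N_GENES * P
sd = math.sqrt(N_GENES * P * (1 - P))

# Largest difference between the exact probabilities and the bell curve.
max_gap = max(
    abs(binomial_pmf(k) - normal_pdf(k, mean, sd)) for k in range(N_GENES + 1)
)

for k in range(40, 61, 5):
    print(f"k={k}: binomial={binomial_pmf(k):.5f}  normal={normal_pdf(k, mean, sd):.5f}")
print(f"largest gap over all k: {max_gap:.5f}")
```

The two columns should agree to within a small fraction of a percentage point, which is the numerical counterpart of the histogram hugging the red line.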
This example suggests that whenever a large number of random events (such as which gene comes from which parent) are involved, mathematics can provide useful insights and precise statements about the chances of a particular outcome observed in real life. This gets us to how mathematics can expose lies, or at least infidelity.
Suppose that the father in the picture has doubts that the child is really his. There is a simple procedure that can allow him to test whether his doubts are justified. He can swab the child’s cheek, as well as his own, and send the two swabs to an agency. For a small fee, the agency will then extract the genetic code of each and compare a small number of genetic loci (spots in the genome). If the child and the father are indeed genetically related, then we would expect about half of the child’s genetic material at these loci to match his. On the other hand, if they are not, then we would expect the match to be very low, in fact very close to 0.
There is one exception to this, however – if the child is not his, but his identical twin’s, there is essentially no way to determine that. (By the way, I’m not endorsing infidelity with one’s partner’s close relatives by any means, but mathematically, the closer the relative, the harder it is to tell the resulting child apart from one fathered by your partner; however, even the child of a partner’s non-identical twin would be easily distinguishable, provided that enough genetic loci are tested.)
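The comparison described above can be imitated with a toy simulation in Python. Everything in it is an assumption made for illustration rather than a description of how a real agency works: each person carries two gene variants at every locus, every locus has 20 equally common variants, and 1000 loci are compared so that the fractions come out stable. The function names are hypothetical as well.

```python
import random

N_LOCI = 1000     # assumed number of loci compared (chosen for stable statistics)
N_VARIANTS = 20   # assumed number of equally common variants at each locus


def random_person(rng):
    """A person is a list of loci, each holding two randomly drawn variants."""
    return [
        (rng.randrange(N_VARIANTS), rng.randrange(N_VARIANTS))
        for _ in range(N_LOCI)
    ]


def conceive(rng, mother, father):
    """At every locus the child gets one random variant from each parent."""
    return [(rng.choice(m), rng.choice(f)) for m, f in zip(mother, father)]


def shared_fraction(child, man):
    """Fraction of the child's variant copies that the man also carries
    at the same locus."""
    shared = 0
    total = 0
    for child_locus, man_locus in zip(child, man):
        for variant in child_locus:
            total += 1
            if variant in man_locus:
                shared += 1
    return shared / total


rng = random.Random(0)  # fixed seed so the run is repeatable
mother = random_person(rng)
father = random_person(rng)
stranger = random_person(rng)
child = conceive(rng, mother, father)

father_share = shared_fraction(child, father)
stranger_share = shared_fraction(child, stranger)
print(f"shared with the true father: {father_share:.2f}")
print(f"shared with an unrelated man: {stranger_share:.2f}")
```

In this toy model the true father always shares at least half of the child's variant copies, since one of the two copies at every locus came from him, while an unrelated man matches only by coincidence. The more variants there are at each locus, the closer that coincidental match gets to 0.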
However, it is not only human genetics that provides fodder for mathematicians. In a famous case from the mid-nineties, a doctor was convicted of infecting his former lover with HIV. The conviction was based on the virus’s genetic code. The human immunodeficiency virus (HIV) mutates fairly quickly, which results in different people being infected with different strains of it. The particular strain of HIV found in the doctor’s lover, however, matched very closely the strain found in a sample taken from one of the doctor’s patients.
To rule out a simple coincidence, the strains of a number of other HIV-infected people living in the state were collected and compared to it as well. No other strain matched it as closely. With just a little bit more mathematics than I described here, it was possible to estimate the probability of the original match occurring at random, and since it was incredibly small, the doctor was convicted of attempted murder and is now serving a 50-year prison term.
Of course, the potential uses of mathematics in solving crimes are far from limited to genetics. A mathematical observation known as Benford’s law is frequently used to detect fraud (both fiscal and electoral) – but that will be the subject of a future post, so stay tuned!
Picture credit: www.namingforsuccess.com and http://zoonek2.free.fr/UNIX/48_R/07.html