# Microbiology question

wormer
Science 15 Mar '17 15:49
1. 15 Mar '17 15:49
With age, somatic cells are thought to accumulate genomic scars as a result of the inaccurate repair of double-strand breaks by non-homologous end joining (NHEJ). Estimates based on the frequency of breaks in primary human fibroblasts suggest that by the age of 70 each human somatic cell carries some 2000 NHEJ-induced mutations due to inaccurate repair. If these mutations were distributed randomly around the genome, how many genes would you expect to be affected?
(assume 2.5% of the genome is crucial information provided by genes)
2. 15 Mar '17 15:56
I'm not quite sure how I'm supposed to go about answering this question. I need help forming a plan for the calculation.
3. 15 Mar '17 17:10 (2 edits)
Originally posted by wormer
If these mutations were distributed randomly around the genome, how many genes would you expect to be affected?
(assume 2.5% of the genome is crucial information provided by genes)
If one mutation occurs, it has a 2.5% chance of being in a 'crucial region'.
Next you need to figure out the probability that at least one of the mutations falls in the crucial region. I am afraid I don't know the formula, but this site might help:

http://www.mathgoodies.com/lessons/vol6/independent_events.html

An equivalent would be picking a single ball from a bag multiple times and always putting it back. If 2.5% of the balls are black, what is the probability of picking at least one black ball in 2000 attempts?

Actually, you asked how many black balls one would expect to pick, so it is rather more complicated.
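For the "at least one black ball" version, here is a minimal sketch, assuming the draws are independent (the function name is only for illustration). The chance that every one of N draws misses is (1 - p)^N, so the chance of at least one hit is 1 - (1 - p)^N.

```python
def prob_at_least_one(p: float, n_draws: int) -> float:
    """Probability of at least one hit in n_draws independent draws."""
    # The chance that every single draw misses is (1 - p) ** n_draws.
    return 1.0 - (1.0 - p) ** n_draws


if __name__ == "__main__":
    # 2.5% black balls, 2000 draws with replacement.
    print(prob_at_least_one(0.025, 2000))
```

With 2000 draws the result is so close to 1 that it says very little, which is why the expected count is the more useful quantity for this question.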
4. 15 Mar '17 17:16
Could it really be this simple?
The probability that each mutation is in a critical region is 2.5%. Therefore 2.5% of the mutations will be in critical regions.
Therefore the answer is 2.5% of 2000 or 4.5
5. 15 Mar '17 17:37
Originally posted by wormer
With age, somatic cells are thought to accumulate genomic scars as a result of the inaccurate repair of double-strand breaks by NHEJ. Estimates based on the frequency of breaks in primary human fibroblasts suggest that by the age of 70 each human somatic cell carries some 2000 NHEJ-induced mutations due to inaccurate repair. If these mutations were distributed ra ...[text shortened]... you expect to be affected?
(assume 2.5% of the genome is crucial information provided by genes)
Correction: 2% of the genome (1.5% coding and 0.5% regulatory).
6. 15 Mar '17 18:00
Could it really be this simple?
The probability that each mutation is in a critical region is 2.5%. Therefore 2.5% of the mutations will be in critical regions.
Therefore the answer is 2.5% of 2000 or 4.5
These numbers make no sense.
7. 15 Mar '17 19:44
Originally posted by wormer
These numbers make no sense.
Follow the logic, not the numbers. Yes, I got the numbers wrong.
So 2% of the mutations hit 'critical regions'.
2% of 2000 = 40
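As a rough check on this reasoning, here is a small sketch that compares the direct product with a simulation. It assumes each mutation lands independently with probability 0.02 in a critical region; the function names, trial count, and seed are illustrative choices, not part of the original question.

```python
import random


def expected_hits(n_mutations: int, p: float) -> float:
    """Expected number of hits: each mutation contributes p on average."""
    return n_mutations * p


def simulated_mean_hits(n_mutations: int, p: float, trials: int, seed: int = 0) -> float:
    """Average number of critical-region hits over many simulated cells."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        # One simulated cell: count the mutations that land in a critical region.
        total += sum(1 for _ in range(n_mutations) if rng.random() < p)
    return total / trials


if __name__ == "__main__":
    print(expected_hits(2000, 0.02))
    print(simulated_mean_hits(2000, 0.02, trials=200))
```

The simulated average should sit close to the product N * p, with individual cells scattering around it.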
8. DeepThought
16 Mar '17 03:06
Originally posted by wormer
With age, somatic cells are thought to accumulate genomic scars as a result of the inaccurate repair of double-strand breaks by NHEJ. Estimates based on the frequency of breaks in primary human fibroblasts suggest that by the age of 70 each human somatic cell carries some 2000 NHEJ-induced mutations due to inaccurate repair. If these mutations were distributed ra ...[text shortened]... you expect to be affected?
(assume 2.5% of the genome is crucial information provided by genes)
Suppose the probability that a given mutation is a coding mutation is x (which you seem to be saying is 2%, so 0.02). There are N mutations in total (N = 2000). Then the average number of coding mutations is:

<n> = 0 * probability of all non-coding mutations + 1 * probability of exactly 1 coding mutation + 2 * probability of exactly 2 coding mutations + ... + N * probability that all mutations are in coding DNA.

Let's look at the typical term: we need to know the probability of exactly n coding mutations. The probability of getting n coding mutations in a row is x^n (x to the power of n). The probability of then getting (N - n) non-coding mutations is (1 - x)^(N - n). We also have to take into account that the n coding mutations and (N - n) non-coding mutations can occur in any order. The number of such orderings is given by the binomial coefficient (which I'll write C(N, n)). So the typical term in the above sum is:

n * C(N, n) * x^n * (1 - x)^(N - n)

To sum this we need a new variable y = x / (1 - x). Since x^n * (1 - x)^(N - n) = [x / (1 - x)]^n * (1 - x)^N, we can rewrite the typical term as:

n * C(N, n) * y^n * (1 - x)^N

So the average number of coding mutations is now:

<n> = (1 - x)^N * sum(n = 0 ... N) n * C(N, n) * y^n

We can use d/dy y^n = n * y^(n - 1), so that n * y^n = y * d/dy y^n, to do the sum:

<n> = y*(1 - x)^N * d/dy sum(n = 0 ... N) C(N, n) * y^n

The sum is now straightforward, because by the binomial theorem sum(n = 0 ... N) C(N, n) * y^n = (1 + y)^N:

<n> = y * (1 - x)^N * d/dy (1 + y)^N = y * (1 - x)^N * [N * (1+y)^(N - 1)]

1 + y = 1/(1 - x) so that:

<n> = [x/(1 - x)] * [(1 - x)^N] * N * [1/(1 - x)]^(N - 1) = Nx = 2000 * 0.02 = 40
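The result can be checked by evaluating the original sum directly. The sketch below uses exact rational arithmetic, because C(N, n) for N = 2000 is far too large for ordinary floating point; the function name is only for illustration.

```python
from fractions import Fraction
from math import comb


def mean_coding_mutations(N: int, x: Fraction) -> Fraction:
    """Exact value of sum over n of n * C(N, n) * x^n * (1 - x)^(N - n)."""
    x = Fraction(x)
    return sum(
        n * comb(N, n) * x**n * (1 - x) ** (N - n)
        for n in range(N + 1)
    )


if __name__ == "__main__":
    N = 2000
    x = Fraction(2, 100)  # 2% of the genome
    direct = mean_coding_mutations(N, x)
    # The brute-force sum and the closed form N * x should agree exactly.
    print(direct, N * x, direct == N * x)
```

Because the arithmetic is exact, any disagreement between the brute-force sum and N * x would point to a mistake in the derivation rather than to rounding.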

The only catch is whether we have to take into account the possibility that a coding mutation falls in a critical gene that produces a highly conserved protein, so that the mutation kills the cell. Some of these mutations might kill the organism, for example if it is in the PrP gene and causes CJD before age 70. So we need to factor out mutations that kill cells or the entire organism. If there are m coding bases in total, of which p are critical coding bases, and the genome has length l, then where x was m/l we'd need to replace it with (m - p)/(l - p). If p is small compared with m then don't worry about it.
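To see how small that correction is, here is a sketch of the adjusted fraction. The base counts in the example are made-up round numbers chosen so that m/l = 2%; they are not real genome figures, and the function name is hypothetical.

```python
from fractions import Fraction


def adjusted_fraction(m: int, p: int, l: int) -> Fraction:
    """Coding fraction (m - p)/(l - p) once p lethal bases are excluded."""
    if not (0 <= p <= m <= l) or p == l:
        raise ValueError("need 0 <= p <= m <= l and p < l")
    return Fraction(m - p, l - p)


if __name__ == "__main__":
    # Hypothetical numbers: 1,000,000 bases, 20,000 coding, 200 critical.
    l, m, p = 1_000_000, 20_000, 200
    x_plain = Fraction(m, l)
    x_adjusted = adjusted_fraction(m, p, l)
    # Expected coding mutations out of 2000, before and after the correction.
    print(float(2000 * x_plain), float(2000 * x_adjusted))
```

When p is a small share of m, the two expected counts differ only slightly, which matches the advice not to worry about it in that case.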
9. 16 Mar '17 16:50
Originally posted by DeepThought
The only catch is if we have to take into account the possibility that a coding mutation is in a critical gene which produces a highly conserved protein and the mutation kills the cell. Some of these mutations might kill the organism, for example if it is on the PrP gene causing CJD before age 70. So we need to factor out mutations that kill cells or t ...[text shortened]... ed to replace it with (m - p)/(l - p). If p is small compared with m then don't worry about it.
A good point about evolution. It's not quite clear to me, though, what you are calculating.
The question states that there are 2000 mutations at age 70 - which means the cells involved (and the organism) survived to age 70, so the reality is that there were likely more mutations, some of which occurred in super critical regions but were weeded out by evolution (cell or organism death).
10. DeepThought