'Junk' DNA goes Functional

Diodorus Siculus

Science

06 Nov 08

Diodorus Siculus

Joined: 07 Dec 07
Moves: 2100

06 Nov 08

"Junk DNA" was a term introduced to describe DNA sequences with no known function, steming from the surprising finding that much of higher eukaryotic genomes contain few "exonic" regions, i.e. DNA encoding for protein. Humans contain about ~95% junk DNA much of which is rather worryingly obsolete virus material, thought to have arisen from germline integration of viruses, retrotransposons and other transposable selfish elements (McClintock's "jumping genes"😉.

The C-value paradox describes a surprising finding of early genome work: the corn plant (Zea mays), say, has more genomic DNA than a human, whereas any reasonable measure of biological complexity, such as the number of mRNA transcripts (protein-coding RNA from exonic DNA) generated by the species being compared, shows humans to be vastly more complex genetically.

Exciting and profound new work shows that the non-coding/"junk" DNA has important roles in gene regulation and the transposable elements provide many binding sites for transcription factors.

"Junk" DNA which was always meant to be a red rag to a bull as molecular biologists were not happy with 95% non-functional and is a good example of nomenclature provoking a useful reaction experimentally. It took a lot longer than thought but it now appears "junk" DNA is a major driver of evolution. Exciting stuff. Below is a copy and paste that is going up on sites all over the place. Abstract is here: http://genome.cshlp.org/content/current

In a paper published in Genome Research on Nov. 4, scientists at the Genome Institute of Singapore (GIS) report that what was previously believed to be "junk" DNA is one of the important ingredients distinguishing humans from other species.

More than 50 percent of human DNA has been referred to as "junk" because it consists of copies of nearly identical sequences. A major source of these repeats is internal viruses that have inserted themselves throughout the genome at various times during mammalian evolution.

Using the latest sequencing technologies, GIS researchers showed that many transcription factors, the master proteins that control the expression of other genes, bind specific repeat elements. The researchers showed that from 18 to 33% of the binding sites of five key transcription factors with important roles in cancer and stem cell biology are embedded in distinctive repeat families.

Over evolutionary time, these repeats were dispersed within different species, creating new regulatory sites throughout these genomes. Thus, the set of genes controlled by these transcription factors is likely to significantly differ from species to species and may be a major driver for evolution.

This research also shows that these repeats are anything but "junk DNA," since they provide a great source of evolutionary variability and might hold the key to some of the important physical differences that distinguish humans from all other species.

The GIS study also highlighted the functional importance of portions of the genome that are rich in repetitive sequences.

"Because a lot of the biomedical research use model organisms such as mice and primates, it is important to have a detailed understanding of the differences between these model organisms and humans in order to explain our findings," said Guillaume Bourque, Ph.D., GIS Senior Group Leader and lead author of the Genome Research paper.

"Our research findings imply that these surveys must also include repeats, as they are likely to be the source of important differences between model organisms and humans," added Dr. Bourque. "The better our understanding of the particularities of the human genome, the better our understanding will be of diseases and their treatments."

"The findings by Dr. Bourque and his colleagues at the GIS are very exciting and represent what may be one of the major discoveries in the biology of evolution and gene regulation of the decade," said Raymond White, Ph.D., Rudi Schmid Distinguished Professor at the Department of Neurology at the University of California, San Francisco, and chair of the GIS Scientific Advisory Board.

"We have suspected for some time that one of the major ways species differ from one another – for instance, why rats differ from monkeys – is in the regulation of the expression of their genes: where are the genes expressed in the body, when during development, and how much do they respond to environmental stimuli," he added.

FabianFnas

Joined: 11 Nov 05
Moves: 43938

07 Nov 08

Originally posted by Diodorus Siculus
"Junk DNA" was a term introduced to describe DNA sequences with no known function, steming from the surprising finding that much of higher eukaryotic genomes contain few "exonic" regions, i.e. DNA encoding for protein. Humans contain about ~95% junk DNA much of which is rather worryingly obsolete virus material, thought to have arisen from germline ...[text shortened]... , and how much do they respond to environmental stimuli," he added.

Very interesting article, indeed it is!

I'm very impressed by how the genome is being translated, bit by bit, into information about how life works. It's like reading a message from another civilization.

Junk DNA, does it contain information? And if so, what information? Not how to produce proteins, but perhaps some other kind of information, perhaps morphological information? I don't know; in the future we will know, I hope. Or is it only leftovers from earlier stages of evolution? Then it's a history book.

Oh, I find this very interesting!

Diodorus Siculus

Joined: 07 Dec 07
Moves: 2100

07 Nov 08

Originally posted by FabianFnas
Very interesting article, indeed it is!

I'm very impressed by how the genome is being translated, bit by bit, into information about how life works. It's like reading a message from another civilization.

Junk DNA, does it contain information? And if so, what information? Not how to produce proteins, but perhaps some kind of information, perhaps morpholog ...[text shortened]... rlier states of evolution? Then it's a book of history.

Oh, I find this very interesting!

Oh yes, there is lots of information in the DNA between the known genes. Genes typically have promoters in front of the start site, and these contain transcription factor binding sites that regulate transcription. This was already known, but no one thought the "junk" in repeats bound transcription factors. It now appears that what was thought of as "junk", non-coding intergenic DNA from these "jumping genes" or transposons (http://en.wikipedia.org/wiki/Transposon), may act to regulate genes, presumably turning them on and off at certain times and places in development. The really deep part of the work is in the sentence: "Using the age of the repeats, we established that these repeat-associated binding sites (RABS) have been associated with significant regulatory expansions throughout the mammalian phylogeny". Differences will presumably be found even between close genetic relatives (humans and chimpanzees, for example). This review article looks prescient in retrospect:
http://www.springerlink.com/content/u64416g121043688/fulltext.pdf?page=1
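
To make the idea of "X% of binding sites sit inside repeats" concrete, here is a minimal Python sketch of how such a fraction could be computed. Everything in it is an assumption for illustration: the function name `fraction_in_repeats`, the coordinates and the half-open interval convention are made up, and this is not the pipeline used in the Genome Research paper.

```python
from bisect import bisect_right


def fraction_in_repeats(binding_sites, repeats):
    """Return the fraction of binding-site positions that fall inside a repeat.

    binding_sites: integer positions on one chromosome (hypothetical data).
    repeats: (start, end) half-open intervals, assumed not to overlap.
    """
    if not binding_sites:
        return 0.0
    repeats = sorted(repeats)
    starts = [start for start, _ in repeats]
    hits = 0
    for pos in binding_sites:
        # Index of the last repeat that starts at or before this position.
        i = bisect_right(starts, pos) - 1
        if i >= 0 and pos < repeats[i][1]:
            hits += 1
    return hits / len(binding_sites)


# Made-up toy coordinates, not real genome annotations.
repeats = [(100, 400), (1000, 1300), (5000, 5600)]
sites = [50, 150, 399, 400, 1200, 3000, 5599, 7000]
print(fraction_in_repeats(sites, repeats))
```

With real data the repeat intervals would come from a repeat annotation and the sites from the binding experiments, but the counting step would look much the same.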

Andrew Hamilton

Joined: 26 May 08
Moves: 2120

07 Nov 08

I am just curious to know:

Is there any indication of what proportion of "junk" DNA has some kind of biological function, or at least a biological influence?
I mean, are we talking here about, say, less than one percent of the "junk" DNA having a biological influence, or is it, say, about half of it, or what? Does such an estimate exist?

Diodorus Siculus

Joined: 07 Dec 07
Moves: 2100

07 Nov 08

Originally posted by Andrew Hamilton
I am just curious to know:

is there any indication of what proportion of "junk" DNA has some kind of biologic function or at least a biological influence?
-I mean, are we talking here about just, say, less than one percent of the "junk" DNA having a biological influence or is it, say, about half of it or what? -does such an estimate exist?

It is difficult to say, but less than 1.5% of human DNA is known to be protein coding, leaving a lot of unknown material that was previously thought to be simply "junk". Two particular repeats, LINE1 and Alu, make up about 30% of the human genome. My own view is that many of the pseudogenes, other repeat material and large duplications from other chromosomes serve as a genetic repertoire from which new genes can arise. In light of the new data, perhaps these jumping genes can also insert near a silent gene and activate it, much like existing transposon-based mutagenesis? Also, presumably having a code with no buffer around the protein-encoding regions is non-optimal: double-strand breaks from UV light or other environmental mutagens damaging the double helix would be too detrimental, and the "junk" might serve as a genetic "sandbag". Interesting stuff.

Here are some stats:
Total size of the genome: approximately 3,200,000,000 bp*
Percentage of adenine and thymine (A+T) in the genome: 54%
Percentage of guanine and cytosine (G+C) in the genome: 38%
Percentage of bases not yet determined: 9%
Highest gene-dense chromosome: chromosome 19 with 23 genes per 1,000,000 bp*
Least gene-dense chromosomes: chromosome 13 and Y with 5 genes per 1,000,000 bp*
Percentage of DNA spanned by genes: between 25% and 38%
Percentage of exons: 1.1 to 1.4%
Percentage of introns: 24% to 37%
Percentage of intergenic DNA: 74% to 64%
The average size of a gene: 27,000 bp*
The longest gene: dystrophin (a muscle protein) with 2,400,000 bp*
Average length of an intron: 3,300 bp*
Most common length of an intron: 87 bp*
Occurrence rate of SNPs: roughly 1 per 1,500 bp*
Occurrence rate of genes: about 12 per 1,000,000 bp*

* bp = base pairs
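
As a rough sanity check, the figures above can be combined with some simple arithmetic. The sketch below uses only numbers from the list; the function and variable names are my own, and the implied gene count is just what these particular stats give, not an independent estimate.

```python
GENOME_BP = 3_200_000_000   # total genome size from the list above
GENES_PER_MBP = 12          # average gene density per 1,000,000 bp
AVG_GENE_BP = 27_000        # average size of a gene


def genome_summary():
    """Derive a few quantities implied by the quoted statistics."""
    genes = GENES_PER_MBP * GENOME_BP // 1_000_000
    # Fraction of the genome spanned by genes if every gene had average size.
    gene_span_fraction = genes * AVG_GENE_BP / GENOME_BP
    # Exons are quoted as 1.1% to 1.4% of the genome (integer arithmetic).
    exon_bp_low = GENOME_BP * 11 // 1000
    exon_bp_high = GENOME_BP * 14 // 1000
    return {
        "genes": genes,
        "gene_span_fraction": gene_span_fraction,
        "exon_bp_low": exon_bp_low,
        "exon_bp_high": exon_bp_high,
    }


summary = genome_summary()
for name, value in summary.items():
    print(name, value)
```

The gene-spanned fraction that comes out of this lands inside the quoted 25% to 38% range, so the list is at least internally consistent.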

Andrew Hamilton

Joined: 26 May 08
Moves: 2120

07 Nov 08

Originally posted by Diodorus Siculus
It is difficult to say, but less than 1.5% of human DNA is known to be protein coding leaving a lot of unknown material that was previously thought to be simply "junk". Two particular repeats LINE1 and Alu make up about 30% of the human genome. My own view is that many of the pseudogenes and other repeat material and large duplications from other ch ...[text shortened]... NPs: roughly 1 per 1,500 bp*
Occurrence rate of genes: about 12 per 1,000,000 bp*

thanks for that 🙂

…Also, presumably having a code without buffer around the protein encoding regions is non-optimal, where say double strand breaks from UV light or other environmental mutagens damaging the double helix would be too detrimental and the "junk" might serve like a genetic "sandbag"…

I have heard of this theory (although explained in very different words) and I think it is false because I have an alternative theory:

I think the reason we have so much "junk" DNA is simply that, once a new bit of junk DNA is added to our genome, it is very hard for natural selection to get rid of it. Think about this: suppose somebody had a mutation that simply deleted one tiny particular bit of junk DNA from his genome. What survival advantage would it give him? The answer is virtually none, because the biological cost of carrying any particular tiny bit of junk DNA must be so tiny as to be totally insignificant and undetectable.
But if a much larger mutation occurred that resulted in somebody losing a huge chunk of junk DNA then, because the junk DNA is largely mixed up with segments of functional DNA, that mutation would almost certainly result in some important functional DNA being lost along with all that junk DNA. Natural selection wouldn't select for that mutation, because it would actually give him a survival disadvantage!
So my theory is that it is easy, and therefore common, for a new piece of junk DNA to be added to our genome, but difficult, and therefore rare, for an old piece of junk DNA to be deleted from it, and therefore junk DNA accumulates excessively over time.
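
A toy simulation of this argument, purely as a sketch: the per-generation insertion and deletion probabilities and the 300 bp element size are made-up numbers, and `simulate_junk` is a hypothetical helper, not a real population-genetics model. It only illustrates that when insertions persist more often than deletions, the junk total drifts upward.

```python
import random


def simulate_junk(generations, p_insert, p_delete, insert_bp=300, seed=0):
    """Track total junk DNA (in bp) in one lineage over many generations.

    p_insert: chance per generation that a new junk element is added (assumed).
    p_delete: chance per generation that one element is cleanly lost (assumed).
    """
    rng = random.Random(seed)  # fixed seed so repeated runs give the same result
    junk_bp = 0
    for _ in range(generations):
        if rng.random() < p_insert:
            junk_bp += insert_bp
        # A deletion is only possible if there is at least one element to lose.
        if junk_bp >= insert_bp and rng.random() < p_delete:
            junk_bp -= insert_bp
    return junk_bp


# Insertions ten times more likely to persist than deletions (made-up rates).
print(simulate_junk(10_000, p_insert=0.5, p_delete=0.05))
# Equal rates, for comparison.
print(simulate_junk(10_000, p_insert=0.5, p_delete=0.5))
```

The model ignores selection entirely; the asymmetry between the two probabilities stands in for the claim that large deletions tend to take functional DNA with them and are therefore weeded out.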

If my theory is correct, the fact that we have so much junk DNA is yet another example of "evolution's blunders", such as the blind spot on our retinas.
But by an accident of evolution, some of that so-called junk DNA may have developed various "uses"; my theory is that we would still be better off without it and those "uses" could have been better allocated to "non-junk" DNA.
I understand that junk DNA helps us to evolve faster, but that wouldn't change the fact that we could be better off without it once we have finished evolving by becoming perfectly adapted to our niche.

FabianFnas

Joined: 11 Nov 05
Moves: 43938

07 Nov 08

Originally posted by Andrew Hamilton
If my theory is correct here, the fact that we have so much junk DNA is, in fact, yet another example of “evolution’s blunders” -such as the blind spot on our retinas etc.
But by an accident of evolution, some of that so called junk DNA may have developed various “uses” but, my theory is we would still be better off without it and those “uses” coul ...[text shortened]... better off without it once we have finished evolving by becoming perfectly adapted to our niche.

When I have done some creative programming and am proud of the result, the last thing I do before I shut the programming environment down is to tidy up the code. Some code is there only for debugging purposes. I delete all unneeded variables. I get rid of extra memory-consuming areas. I make the code pleasant to read. And so on.

Sometimes when I read others' code, I see dirty code, memory-inefficient code, unnecessarily time-consuming code, ineffective garbage collection. That programmer must be sloppy.

If I were some kind of Creator, and I had just created Homo sapiens, then I would do the same thing. Tidy up the code. Get rid of obsolete junk parts of the genome. Make the DNA pleasant to decode.

All the junk DNA, with no use at all, is proof that there is no Creator. Or that the Creator is a sloppy one.

Diodorus Siculus

Joined: 07 Dec 07
Moves: 2100

07 Nov 08

Originally posted by Andrew Hamilton
thanks for that 🙂

…Also, presumably having a code without buffer around the protein encoding regions is non-optimal, where say double strand breaks from UV light or other environmental mutagens damaging the double helix would be too detrimental and the "junk" might serve like a genetic "sandbag"…

I have heard of this theory ...[text shortened]... er off without it once we have finished evolving by becoming perfectly adapted to our niche.

It is clear that "junk" DNA is co-opted for new functions ("neo-functionalization" in the jargon), and the generation of the antibody repertoire by V(D)J joining is a good example of this. I should mention the pufferfish Fugu rubripes, which has a very small genome (C-value), so abundant non-coding DNA appears not to be the rule.
This is worth reading:
http://www.scientificblogging.com/dna_and_diversity/junk_dna_function_and_non_function