Draft:Original research/Genes

From Wikiversity

A gene is a distinct sequence of nucleotides forming part of a chromosome, the order of which determines the order of monomers in a polypeptide or nucleic acid molecule which a cell (or virus) may synthesize.

Theoretical genes

[Figure: a chromosome unravelling into a long string of DNA, a section of which is highlighted as the gene. Labels: Chromosome (10^7 to 10^10 bp), Gene (10^3 to 10^6 bp), Phenotype (Function).]
A gene is a region of DNA that encodes function. A chromosome consists of a long strand of DNA containing many genes. A human chromosome can have up to roughly 250 million base pairs of DNA with thousands of genes.

Def. a "theoretical unit of heredity of living organisms; [that] in principle predetermines a precise trait of an organism's form (phenotype), such as hair color"[1] or a "segment of DNA or RNA from a cell's or an organism's genome, that may take several forms and thus parameterizes a phenomenon, in general the structure of a protein; locus"[1] is called a gene.

Def. any discrete locus of heritable, genomic sequence that affects an organism's traits by being expressed as a functional product or by regulation of expression[2][3] is called a gene.

Here's a theoretical definition:

Def. a specific nucleotide sequence within a gene locus with its own transcription start site(s), introns, exons, and UTRs, that transcribes a specific RNA product is called an isoform, or gene isoform.

Def. any "of several different forms of the same protein, arising from either single nucleotide polymorphisms,[4] differential splicing of mRNA, or post-translational modifications (e.g. sulfation, glycosylation, etc.)"[5] is called an isoform.

Def. a "region of a transcribed gene present in the final functional RNA molecule"[6] is called an exon.

Def. a "portion of a split gene that is included in pre-RNA transcripts but is removed during RNA processing and rapidly degraded"[7] is called an intron.

Gene structures

[Figure: Eukaryote gene structure diagram. Labels: regulatory sequences (enhancer/silencer; promoter with proximal and core elements), 5'UTR, open reading frame with start and stop, 3'UTR, terminator. Transcription of the DNA (exons and introns) gives pre-mRNA; post-transcriptional modification gives mature mRNA (5' cap, protein coding region, poly-A tail); translation gives protein.]
The structure of a eukaryotic protein-coding gene. Regulatory sequence controls when and where expression occurs for the protein coding region (red). Promoter and enhancer regions (yellow) regulate the transcription of the gene into a pre-mRNA which is modified to remove introns (light grey) and add a 5' cap and poly-A tail (dark grey). The mRNA 5' and 3' untranslated regions (blue) regulate translation into the final protein product.[8]
[Figure: Prokaryote gene structure diagram. Labels: polycistronic operon; regulatory sequences (enhancer/silencer, operator, promoter), 5'UTR, two ORFs each with start and stop, UTR, 3'UTR, terminator. Transcription of the DNA gives mRNA (an RBS before each protein coding region); translation gives protein.]
The structure of a prokaryotic operon of protein-coding genes. Regulatory sequence controls when expression occurs for the multiple protein coding regions (red). Promoter, operator and enhancer regions (yellow) regulate the transcription of the gene into an mRNA. The mRNA untranslated regions (blue) regulate translation into the final protein products.[8]

The promoter is recognized and bound by transcription factors that recruit and help RNA polymerase bind to the region to initiate transcription.[9] A gene can have more than one promoter, resulting in messenger RNAs (mRNA) that differ in how far they extend in the 5' end.[10] Highly transcribed genes have "strong" promoter sequences that form strong associations with transcription factors, thereby initiating transcription at a high rate, while other genes have "weak" promoters that form weak associations with transcription factors and initiate transcription less frequently.[9] Eukaryotic promoter regions are much more complex and difficult to identify than prokaryotic promoters.[9]

Additionally, genes can have regulatory regions many kilobases upstream or downstream of the open reading frame that alter expression. These act by binding to transcription factors, which then cause the DNA to loop so that the regulatory sequence (and bound transcription factor) comes close to the RNA polymerase binding site.[11] For example, enhancers increase transcription by binding an activator protein, which then helps to recruit the RNA polymerase to the promoter; conversely, silencers bind repressor proteins and make the DNA less available for RNA polymerase.[12]

The transcribed pre-mRNA contains untranslated regions at both ends, which contain a ribosome binding site, terminator, and start and stop codons.[13] In addition, most eukaryotic open reading frames contain untranslated introns, which are removed before the exons are translated. The sequences at the ends of the introns dictate the splice sites used to generate the final mature mRNA, which encodes the protein or RNA product.[14]
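
The removal of introns and joining of exons described above can be illustrated with a minimal Python sketch. The toy transcript, the intron coordinates, and the function name `splice` are hypothetical and chosen only for illustration. Each toy intron begins with GU and ends with AG, a common boundary pattern that is an assumption here rather than something stated in the text; real splicing depends on further sequence signals and on the splicing machinery.

```python
def splice(pre_mrna, introns):
    """Join the exons of pre_mrna after removing the given introns.

    introns: list of (start, end) pairs, 0-based, end-exclusive.
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        if not position <= start < end <= len(pre_mrna):
            raise ValueError("introns must be non-overlapping and in range")
        exons.append(pre_mrna[position:start])  # exon before this intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # final exon
    return "".join(exons)


# Hypothetical toy transcript: exons in upper case, introns in lower case.
exon_1, intron_1 = "AUGGCC", "guaaguuuucag"
exon_2, intron_2 = "GAAUUC", "guccgauuacag"
exon_3 = "UAA"
pre_mrna = exon_1 + intron_1 + exon_2 + intron_2 + exon_3

# Intron coordinates that follow from the lengths of the toy pieces above.
introns = [(6, 18), (24, 36)]
mature_mrna = splice(pre_mrna, introns)
print(mature_mrna)  # AUGGCCGAAUUCUAA
```

Keeping the intron coordinates separate from the sequence mirrors how annotations are usually stored alongside, rather than inside, the sequence itself.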

Many prokaryotic genes are organized into operons, with multiple protein-coding sequences that are transcribed as a unit.[15][16] The genes in an operon are transcribed as a continuous messenger RNA, referred to as a polycistronic mRNA; the term cistron in this context is equivalent to gene. Transcription of an operon's mRNA is often controlled by a repressor that can occur in an active or inactive state depending on the presence of specific metabolites.[17] When active, the repressor binds to a DNA sequence at the beginning of the operon, called the operator region, and represses transcription of the operon; when the repressor is inactive, transcription of the operon can occur (see e.g. Lac operon). The products of operon genes typically have related functions and are involved in the same regulatory network.[9]
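
The repressor logic described above can be sketched as a toy model in Python. The class `Operon` and its methods are hypothetical. The lac gene names (lacZ, lacY, lacA) and the inducer allolactose are standard textbook details that do not appear in the text above. The sketch assumes an inducible operon whose repressor is inactivated by a single metabolite, and it ignores every other layer of control.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operon:
    name: str
    genes: tuple   # protein-coding sequences transcribed as one unit
    inducer: str   # metabolite assumed to inactivate the repressor

    def repressor_is_active(self, metabolites):
        # Assumption: the repressor is active unless the inducer is present.
        return self.inducer not in metabolites

    def transcribe(self, metabolites):
        """Return the cistrons on the polycistronic mRNA, or () if repressed."""
        if self.repressor_is_active(metabolites):
            return ()       # repressor bound to the operator, no transcript
        return self.genes   # one continuous mRNA carrying every cistron


lac = Operon(name="lac", genes=("lacZ", "lacY", "lacA"), inducer="allolactose")
print(lac.transcribe({"glucose"}))      # ()
print(lac.transcribe({"allolactose"}))  # ('lacZ', 'lacY', 'lacA')
```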

Gene clusters


The description of GeneID: 348 (APOE, apolipoprotein E) contains this: "This gene maps to chromosome 19 in a cluster with the related apolipoprotein C1 and C2 genes."

Gene expressions

Phylogenetic tree of eutherian A1BG, including opossum DM43 and DM46 and A1BG-like sequences in marsupials. Credit: Katrina M. Morris, Denis O’Meally, Thiri Zaw, Xiaomin Song, Amber Gillett, Mark P. Molloy, Adam Polkinghorne, and Katherine Belova.

Gene expressions are suites of genes, and their isoforms, that appear to be biochemically involved in the appearance of a trait.

Although it is harder to regulate the transcription of genes with multiple transcription start sites, "variations in the expression of a constitutive gene would be minimized by the use of multiple start sites."[18]

Earlier "studies led to the design of a super core promoter (SCP) that contains a TATA, Inr, MTE, and DPE in a single promoter (Juven-Gershon et al., 2006b). The SCP is the strongest core promoter observed in vitro and in cultured cells and yields high levels of transcription in conjunction with transcriptional enhancers. These findings indicate that gene expression levels can be modulated via the core promoter."[18]

On the right is a phylogenetic tree of eutherian A1BG, which includes opossum DM43 and DM46, and A1BG-like sequences in marsupials.

"A peptide identified in the late and early milk proteomes showed homology to eutherian alpha 1B glycoprotein (A1BG), a plasma protein with unknown function, as well as venom inhibitors characterised in the Southern opossum Didelphis marsupialis (DM43 and DM46), all members of the immunoglobulin superfamily. To characterise the relationship between the peptide sequence identified in koala, A1BG, DM43 and DM46, a phylogenetic tree was constructed [diagram on the right] including all marsupial and monotreme homologs (identified by BLAST), three phylogenetically representative eutherian sequences, with human IGSF1 and TARM1, related members of the immunoglobulin super family, used as outgroups. This phylogeny indicates that A1BG-like proteins in marsupials and the Didelphis antitoxic proteins are homologs of eutherian A1BG, with excellent bootstrap support (98%). The marsupial A1BG-like sequences and the Didelphis antitoxic proteins formed a single clade with strong bootstrap support (97%)."[19]

"Human TARM1 and IGSF1, related members of the immunoglobulin superfamily are used as outgroups. The tree was constructed using the maximum likelihood approach and the JTT model with bootstrap support values from 500 bootstrap tests. Bootstrap values less than 50% are not displayed. Accession numbers: Tasmanian devil (Sarcophilus harrisii; XP_012402143), Wallaby (Macropus eugenii; FY619507), Possum (Trichosurus vulpecula; DY596639) Virginia opossum (Didelphis virginiana; AAA30970, AAN06914), Southern opossum (Didelphis marsupialis; AAL82794, P82957, AAN64698), Human (Homo sapiens; P04217, B6A8C7, Q8N6C5), Platypus (Ornithorhychus anatinus; ENSOANP00000000762), Cow (Bos taurus; Q2KJF1), Alpaca (Vicugna pacos; XP_015107031)."[19]

Gene regulations


Each gene, or its isoforms, is likely to have upregulation and downregulation transcription factors. As each gene is investigated, these enhancers and inhibitors are noted as discovered.

For example, submitting "gene regulation" APOE human to the NCBI gene database returns 28 genes and 21 mouse analogs. The first on the list is GeneID: 2099 ESR1 estrogen receptor 1. According to its page (http://www.ncbi.nlm.nih.gov/gene/2099), "This gene encodes an estrogen receptor, a ligand-activated transcription factor composed of several domains important for hormone binding, DNA binding, and activation of transcription. [...] Estrogen and its receptors are essential for sexual development and reproductive function, but also play a role in other tissues such as bone. Estrogen receptors are also involved in pathological processes including breast cancer, endometrial cancer, and osteoporosis." The database also maintains the DNA sequence upstream, downstream, and through the entire gene locus, so that analysis of transcript variants can be attempted: "Alternative promoter usage and alternative splicing result in dozens of transcript variants, but the full-length nature of many of these variants has not been determined. [provided by RefSeq, Mar 2014]" The site lists gene interactions and six variants for three isoforms (1, 2, and 3) and ten experimental transcriptions.

Gene similarities


There are genes on other chromosomes that are similar to each gene being considered. For example, GeneID: 338, Apolipoprotein B, is on chromosome 2.

Eukaryote genes

This diagram of a eukaryote cell shows that the DNA is located in the nucleus. Credit: Sponk.

Def. any "of the single-celled or multicellular organisms, of the taxonomic domain Eukaryota, whose cells contain at least one distinct nucleus"[20] is called a eukaryote.

Those specific genes that cause cells to contain at least one distinct nucleus are eukaryote genes.

Genetics

The last common ancestor of monkeys and apes lived about 25 million years ago. Credit: Smithsonian Institution.

There are "more than 4 million sites where proteins bind to DNA to regulate genetic function, sort of like a switch."[21]

"Humans belong to the biological group known as Primates, and are classified with the great apes, one of the major groups of the primate evolutionary tree. Besides similarities in anatomy and behavior, our close biological kinship with other primate species is indicated by DNA evidence. It confirms that our closest living biological relatives are chimpanzees and bonobos, with whom we share many traits. But we did not evolve directly from any primates living today."[22]

"DNA also shows that our species and chimpanzees diverged from a common ancestor species that lived between 8 and 6 million years ago. The last common ancestor of monkeys and apes lived about 25 million years ago."[22]

Human DNA

This diagram of the structure of DNA shows the four bases; adenine, cytosine, guanine and thymine, and the location of the major and minor groove. Credit: Zephyris.

"[H]uman DNA has millions of on-off switches and complex networks that control the genes' activities. ... [A]t least 80% of the human genome is active, which opposed the previously held idea that most of the DNA are useless."[23]

"DNA contains genes, which hold the instructions for [life. But, these] take up only about 2 percent of the genome ... The human genome is made up of about 3 billion “letters” along strands that make up the familiar double helix structure of DNA. Particular sequences of these letters form genes, which tell cells how to make proteins. People have about 20,000 genes, but the vast majority of DNA lies outside of genes. ... [A]t least three-quarters of the genome is involved in making RNA [...] it appears to help regulate gene activity."[21]

Human genes


"Nine elements were tested, representing a sampling of elements present in the two gene deserts and DACH introns, spread over a 1530-kb region surrounding the human DACH's TATA box."[24]

Gene ID: 1602 is the human gene DACH1 dachshund homolog 1 also known as DACH.[25] DACH1 has three isoforms: a, b, and c.

"[T]he human ... prostaglandin-endoperoxide-synthase-2 [gene contains] a canonical TATA box (nucleotide residues at positions -31 to -25 for the human gene)."[26] This is Gene ID: 5743.
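
Positions such as -31 to -25 are counted backwards from the transcription start site (TSS). A minimal Python sketch of how a TATA-like motif can be located and reported in that coordinate system follows. The function name, the toy upstream sequence, and the pattern TATA(A/T)A(A/T) are assumptions made for illustration; the pattern is a commonly quoted consensus and is not taken from the cited papers, and real TATA boxes vary.

```python
import re

# Assumed consensus for illustration only: TATA(A/T)A(A/T).
# The lookahead lets overlapping motifs be reported as well.
TATA_PATTERN = re.compile(r"(?=(TATA[AT]A[AT]))")


def find_tata_boxes(upstream):
    """Return (first, last) positions of TATA-like motifs relative to the TSS.

    upstream: DNA immediately 5' of the transcription start site, written
    5' to 3', so that its last base is position -1.
    """
    n = len(upstream)
    hits = []
    for match in TATA_PATTERN.finditer(upstream.upper()):
        first = match.start() - n               # position of first motif base
        last = first + len(match.group(1)) - 1  # position of last motif base
        hits.append((first, last))
    return hits


# Hypothetical 40-base upstream region with a motif placed at -31 to -25.
upstream = "GCGCGCGCG" + "TATAAAA" + "GC" * 12
print(find_tata_boxes(upstream))  # [(-31, -25)]
```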

The Drosophila hsp70 gene has a TATA box-containing promoter.[27] This suggests that GeneID: 3308 HSPA4 heat shock 70kDa protein 4 [Homo sapiens], also known as hsp70,[28] has a TATA box in its core promoter.

Genotypes


The genetic information in a genome is held within genes, and the complete set of this information in an organism is called its genotype. A gene is a unit of heredity and is a region of DNA that influences a particular characteristic in an organism. Genes contain an open reading frame that can be transcribed, as well as regulatory sequences such as promoters and enhancers, which control the transcription of the open reading frame.

Only about 1.5% of the human genome consists of protein-coding exons.

Pseudogenes


Pseudogenes, copies of genes that have been disabled by mutation, are an abundant form of noncoding DNA in humans.[29] These sequences are usually just molecular fossils, although they can occasionally serve as raw genetic material for the creation of new genes through the process of gene duplication and divergence.[30]

About 2700 formerly active genes are now pseudogenes.

Deaminations


The deficiency of CpG dinucleotides in genomes with CpG cytosine methylation is attributed to the increased vulnerability of methylcytosines to spontaneous deamination to thymine.[31]
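
The effect of this deamination on sequence composition can be sketched in a few lines of Python. This is a deliberately extreme toy model: it assumes that every CpG cytosine is methylated, that every methylcytosine deaminates to thymine, that nothing is repaired, and that only one strand matters. The function names and the example sequence are hypothetical.

```python
def deaminate_methylated_cpg(sequence):
    """Convert the cytosine of every CpG to thymine (CG -> TG)."""
    return sequence.upper().replace("CG", "TG")


def count_cpg(sequence):
    """Count C-followed-by-G dinucleotides on the given strand."""
    return sequence.upper().count("CG")


ancestral = "ACGTCGGA"  # hypothetical toy sequence
derived = deaminate_methylated_cpg(ancestral)
print(derived)                                   # ATGTTGGA
print(count_cpg(ancestral), count_cpg(derived))  # 2 0
```

Each lost CpG becomes a TpG in this model, which is why CpG depletion is usually discussed together with an excess of TpG.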

Methylations


Cytosines in CpG dinucleotides can be methylated to form 5-methylcytosine. In mammals, methylating the cytosine within a gene can turn the gene off, a mechanism that belongs to epigenetics, a larger field of science that studies gene regulation. Enzymes that add a methyl group are called DNA methyltransferases.

In mammals, 70% to 80% of CpG cytosines are methylated.[32]

CpG dinucleotides have long been observed to occur with a much lower frequency in the sequence of vertebrate genomes than would be expected due to random chance. For example, the human genome has a 42% GC content, so cytosine and guanine each make up about 21% of bases, and a pair of nucleotides consisting of cytosine followed by guanine would be expected to occur 0.21 × 0.21 = 4.41% of the time. The frequency of CpG dinucleotides in human genomes is 1%, less than one-quarter of the expected frequency.
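
The arithmetic above can be reproduced with a short Python sketch. The function names are hypothetical. The calculation assumes that cytosine and guanine each make up half of the GC content and that neighbouring bases are independent. The observed value of 1% is the figure quoted in the text, not something computed here.

```python
def expected_cpg_frequency(gc_content):
    """Expected CpG frequency if C and G each make up half of the GC
    content and neighbouring bases are independent."""
    c_fraction = g_fraction = gc_content / 2
    return c_fraction * g_fraction


def observed_cpg_frequency(sequence):
    """Fraction of adjacent base pairs in sequence that are C followed by G."""
    sequence = sequence.upper()
    if len(sequence) < 2:
        return 0.0
    return sequence.count("CG") / (len(sequence) - 1)


expected = expected_cpg_frequency(0.42)  # 0.21 * 0.21
observed = 0.01                          # figure quoted in the text
print(f"expected: {expected:.4f}")                      # expected: 0.0441
print(f"observed/expected: {observed / expected:.2f}")  # observed/expected: 0.23
```

`observed_cpg_frequency` is included so that the same comparison can be run on an actual sequence instead of the quoted figure.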

Unmethylated CpG sites can be detected by Toll-Like Receptor 9[33] (TLR 9) on plasmacytoid dendritic cells and B cells in humans. This is used to detect intracellular viral, fungal, and bacterial pathogen DNA.

Methylation is central to imprinting, along with histone modifications.[34] Most of the methylation occurs a short distance from the CpG islands (at "CpG island shores") rather than in the islands themselves.[35]

Methylation of CpG sites within the promoters of genes can lead to their silencing, a feature found in a number of human cancers (for example the silencing of tumor suppressor genes). In contrast, the hypomethylation of CpG sites has been associated with the over-expression of oncogenes within cancer cells.[36]

Mutations


Alu elements are a common source of mutation in humans, but such mutations are often confined to non-coding regions where they have little discernible impact on the bearer.[37]

The mutagenic effect of Alu[38] and retrotransposons in general[39] has played a major role in the recent evolution of the human genome.

The first report of Alu-mediated recombination causing a prevalent inherited predisposition to cancer was a 1995 report about hereditary nonpolyposis colorectal cancer.[40]

Alu insertions have been reported to cause a number of human diseases.[41]

Single-nucleotide DNA variations in Alu elements that affect transcription levels have also been associated with disease.[42]

The ACE gene, encoding angiotensin-converting enzyme, has two common variants, one with an Alu insertion (ACE-I) and one with the Alu deleted (ACE-D). This variation has been linked to changes in sporting ability: the presence of the Alu element is associated with better performance in endurance-oriented events (e.g. triathlons), whereas its absence is associated with strength- and power-oriented performance.[43]

The opsin gene duplication which resulted in the re-gaining of trichromacy in Old World primates (including humans) is flanked by an Alu element,[44] implicating Alu in the evolution of three-colour vision.

Hypotheses

  1. Each gene may be expressed by one or more isoforms, usually subject to cell type.


References

  1. Denispir (6 August 2015). "gene". San Francisco, California: Wikimedia Foundation, Inc. Retrieved 2 November 2019.
  2. "Genetics: what is a gene?". Nature 441 (7092): 398–401. May 2006. doi:10.1038/441398a. PMID 16724031. 
  3. "Genomics. DNA study forces rethink of what it means to be a gene". Science 316 (5831): 1556–1557. June 2007. doi:10.1126/science.316.5831.1556. PMID 17569836. 
  4. SemperBlotto (6 January 2007). isoform. San Francisco, California: Wikimedia Foundation, Inc. https://en.wiktionary.org/wiki/isoform. Retrieved 2 December 2018. 
  5. 72.178.245.181 (30 November 2008). isoform. San Francisco, California: Wikimedia Foundation, Inc. https://en.wiktionary.org/wiki/isoform. Retrieved 2 December 2018. 
  6. TransControl~enwiktionary (22 February 2008). exon. San Francisco, California: Wikimedia Foundation, Inc. https://en.wiktionary.org/wiki/exon. Retrieved 2 December 2018. 
  7. SemperBlotto (9 March 2006). intron. San Francisco, California: Wikimedia Foundation, Inc. https://en.wiktionary.org/wiki/intron. Retrieved 2 December 2018. 
  8. Shafee, Thomas; Lowe, Rohan (2017). "Eukaryotic and prokaryotic gene structure". WikiJournal of Medicine 4 (1). doi:10.15347/wjm/2017.002. ISSN 20024436.
  9. Alberts, Bruce; Johnson, Alexander; Lewis, Julian; Raff, Martin; Roberts, Keith; Walter, Peter (2002). Molecular Biology of the Cell (Fourth ed.). New York: Garland Science. ISBN 978-0-8153-3218-3. https://www.ncbi.nlm.nih.gov/books/NBK21054/.
  10. "Mapping and quantifying mammalian transcriptomes by RNA-Seq". Nature Methods 5 (7): 621–628. July 2008. doi:10.1038/nmeth.1226. PMID 18516045. 
  11. Pennacchio, L.A.; Bickmore, W.; Dean, A.; Nobrega, M.A.; Bejerano, G. (2013). "Enhancers: Five essential questions". Nature Reviews Genetics 14 (4): 288–295. doi:10.1038/nrg3458. PMID 23503198. PMC 4445073. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4445073/.
  12. Maston, G.A.; Evans, S.K.; Green, M.R. (2006). "Transcriptional Regulatory Elements in the Human Genome". Annual Review of Genomics and Human Genetics 7: 29–59. doi:10.1146/annurev.genom.7.080505.115623. PMID 16719718. 
  13. Mignone, Flavio; Gissi, Carmela; Liuni, Sabino; Pesole, Graziano (2002-02-28). "Untranslated regions of mRNAs". Genome Biology 3 (3): reviews0004. doi:10.1186/gb-2002-3-3-reviews0004. ISSN 1465-6906. PMID 11897027. PMC 139023. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC139023/.
  14. "Introns in UTRs: why we should stop ignoring them". BioEssays 34 (12): 1025–1034. December 2012. doi:10.1002/bies.201200073. PMID 23108796. 
  15. Salgado, H.; Moreno-Hagelsieb, G.; Smith, T.; Collado-Vides, J. (2000). "Operons in Escherichia coli: Genomic analyses and predictions". Proceedings of the National Academy of Sciences 97 (12): 6652–6657. doi:10.1073/pnas.110147297. PMID 10823905. PMC 18690. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC18690/.
  16. Blumenthal, Thomas (November 2004). "Operons in eukaryotes". Briefings in Functional Genomics & Proteomics 3 (3): 199–211. doi:10.1093/bfgp/3.3.199. ISSN 2041-2649. PMID 15642184. http://bfg.oxfordjournals.org/content/3/3/199. 
  17. "Genetic regulatory mechanisms in the synthesis of proteins". J. Mol. Biol. 3 (3): 318–356. 1961. doi:10.1016/S0022-2836(61)80072-7. PMID 13718526. 
  18. Tamar Juven-Gershon and James T. Kadonaga (15 March 2010). "Regulation of gene expression via the core promoter and the basal transcriptional machinery". Developmental Biology 339 (2): 225-9. doi:10.1016/j.ydbio.2009.08.009. http://www.sciencedirect.com/science/article/pii/S0012160609011166. Retrieved 2016-01-16.
  19. Katrina M. Morris, Denis O’Meally, Thiri Zaw, Xiaomin Song, Amber Gillett, Mark P. Molloy, Adam Polkinghorne, and Katherine Belova (7 October 2016). "Characterisation of the immune compounds in koala milk using a combined transcriptomic and proteomic approach". Scientific Reports 6: 35011. doi:10.1038/srep35011. PMID 27713568. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5054531/. Retrieved 14 March 2020.
  20. eukaryote. San Francisco, California: Wikimedia Foundation, Inc. August 28, 2012. http://en.wiktionary.org/wiki/eukaryote. Retrieved 2012-09-29. 
  21. Malcolm Ritter (September 6, 2012). Far from being mostly junk, human DNA is ‘a jungle’ of complex activity, huge project shows. The Washington Post. http://www.fftimes.com/node/254360. Retrieved 2012-09-06.
  22. Homo sapiens (3 June 2015). Primate Family Tree. Washington, DC USA: Smithsonian Institution. http://humanorigins.si.edu/evidence/genetics. Retrieved 2015-06-09.
  23. Bryan McBournie (September 6, 2012). Human genome study could unlock the biology of disease. Sigma Xi. http://alquemie.smartbrief.com/servlet/ArchiveServlet?issueid=176C7415-9260-447D-A04F-77F3B17D39AF&sid=5df40ccd%252dcb14%252d46a9%252d9d44%252d2dd7cc2984b5. Retrieved 2012-09-06. 
  24. Marcelo A. Nobrega, Ivan Ovcharenko, Veena Afzal, and Edward M. Rubin (October 2003). "Scanning human gene deserts for long-range enhancers". Science 302 (5644): 413. doi:10.1126/science.1088328. PMID 14563999. http://www.sciencemag.org/content/302/5644/413.short. Retrieved 2012-12-26. 
  25. HGNC (December 20, 2012). DACH1 dachshund homolog 1 (Drosophila) ( Homo sapiens ). Bethesda, Maryland, USA: ncbi.nlm.nih. http://www.ncbi.nlm.nih.gov/gene/1602. Retrieved 2012-12-26.
  26. Tetsuya Kosaka, Atsuro Miyata, Hayato Ihara, Shuntaro Hara, Tamiko Sugimoto, Osamu Takeda, Ei-ichi Takahashi, Tadashi Tanabe (May 1994). "Characterization of the human gene (PTGS2) encoding prostaglandin‐endoperoxide synthase 2". European Journal of Biochemistry 221 (3): 889-97. doi:10.1111/j.1432-1033.1994.tb18804.x. http://onlinelibrary.wiley.com/doi/10.1111/j.1432-1033.1994.tb18804.x/full. Retrieved 2012-12-26. 
  27. Thomas W. Burke and James T. Kadonaga (November 15, 1997). "The downstream core promoter element, DPE, is conserved from Drosophila to humans and is recognized by TAFII60 of Drosophila". Genes & Development 11 (22): 3020–31. doi:10.1101/gad.11.22.3020. PMID 9367984. PMC 316699. http://genesdev.cshlp.org/content/11/22/3020.long. 
  28. HGNC (February 3, 2013). HSPA4 heat shock 70kDa protein 4 ( Homo sapiens ). Bethesda, MD, USA: National Center for Biotechnology Information, U.S. National Library of Medicine. http://www.ncbi.nlm.nih.gov/gene/3308. Retrieved 2013-02-07. 
  29. Harrison P, Hegyi H, Balasubramanian S, Luscombe N, Bertone P, Echols N, Johnson T, Gerstein M (2002). "Molecular Fossils in the Human Genome: Identification and Analysis of the Pseudogenes in Chromosomes 21 and 22". Genome Res 12 (2): 272–80. doi:10.1101/gr.207102. PMID 11827946. PMC 155275. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC155275/.
  30. Harrison P, Gerstein M (2002). "Studying genomes through the aeons: protein families, pseudogenes and proteome evolution". J Mol Biol 318 (5): 1155–74. doi:10.1016/S0022-2836(02)00109-2. PMID 12083509. 
  31. Scarano E, Iaccarino M, Grippo P, Parisi E (1967). "The heterogeneity of thymine methyl group origin in DNA pyrimidine isostichs of developing sea urchin embryos". Proc. Natl. Acad. Sci. USA 57 (5): 1394–400. doi:10.1073/pnas.57.5.1394. PMID 5231746. PMC 224485. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC224485/.
  32. Jabbari K, Bernardi G (May 2004). "Cytosine methylation and CpG, TpG (CpA) and TpA frequencies". Gene 333: 143–9. doi:10.1016/j.gene.2004.02.043. PMID 15177689. http://linkinghub.elsevier.com/retrieve/pii/S0378111904000836. 
  33. Ramirez-Ortiz ZG, Specht CA, Wang JP, Lee CK, Bartholomeu DC, Gazzinelli RT, Levitz SM (2008). "Toll-like receptor 9-dependent immune activation by unmethylated CpG motifs in Aspergillus fumigatus DNA". Infect Immun. 76 (5): 2123–9. doi:10.1128/IAI.00047-08. PMID 18332208. PMC 2346696. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2346696/.
  34. Feil R, Berger F (2007). "Convergent evolution of genomic imprinting in plants and mammals". Trends Genet 23 (4): 192–9. doi:10.1016/j.tig.2007.02.004. PMID 17316885. 
  35. Irizarry RA, Ladd-Acosta C, Wen B, Wu Z, Montano C, Onyango P, Cui H, Gabo K, Rongione M, Webster M, Ji H, Potash JB, Sabunciyan S, Feinberg AP (2009). "The human colon cancer methylome shows similar hypo- and hypermethylation at conserved tissue-specific CpG island shores". Nature Genetics 41 (2): 178-86. PMID 19151715. http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2729128/. 
  36. Jones PA, Laird PW (February 1999). "Cancer epigenetics comes of age". Nat. Genet. 21 (2): 163–7. doi:10.1038/5947. PMID 9988266. 
  37. International Human Genome Sequencing Consortium (2001). "Initial sequencing and analysis of the human genome". Nature 409 (6822): 860–921. doi:10.1038/35057062. PMID 11237011. http://www.nature.com/nature/journal/v409/n6822/abs/409860a0.html. 
  38. Shen S, Lin L, Cai JJ, Jiang P, Kenkel EJ, Stroik MR, Sato S, Davidson BL, Xing Y (2011). "Widespread establishment and regulatory impact of Alu exons in human genes". PNAS 108 (7): 2837–42. doi:10.1073/pnas.1012834108. http://www.pnas.org/content/108/7/2837. 
  39. Cordaux R, Batzer MA (2009). "The impact of retrotransposons on human genome evolution". Nature Reviews Genetics 10: 691–703. doi:10.1038/nrg2640. PMID 19763152. PMC 2884099. http://rcordaux.voila.net/pdfs/42.pdf. 
  40. Nyström-Lahti M, Kristo P, Nicolaides NC, et al. (November 1995). "Founding mutations and Alu-mediated recombination in hereditary colon cancer". Nat. Med. 1 (11): 1203–6. doi:10.1038/nm1195-1203. PMID 7584997. 
  41. Batzer MA, Deininger PL (May 2002). "Alu repeats and human genomic diversity". Nat. Rev. Genet. 3 (5): 370–9. doi:10.1038/nrg798. PMID 11988762. http://batzerlab.lsu.edu/Publications/Batzer%20and%20Deininger%202002%20Nature%20Reviews%20Genetics.pdf. 
  42. SNPedia: SNP in the promoter region of the myeloperoxidase MPO gene. http://www.snpedia.com/index.php/Rs2333227. 
  43. Puthucheary Z, Skipworth J, Rawal J, Loosemore M, Van Someren K, Montgomery H (2011). "The ACE Gene and Human Performance: 12 Years On". Sports Medicine 41: 433–448. doi:10.2165/11588720-000000000-00000. PMID 21615186. 
  44. Dulai KS, Von Dornum M, Mollon JD, Hunt DM (1999). "The Evolution of Trichromatic Color Vision by Opsin Gene Duplication in New World and Old World Primates". Genome Research 9 (7): 629–638. doi:10.1101/gr.9.7.629. PMID 10413401. http://genome.cshlp.org/content/9/7/629.full. 