The Breakdown of Symmetry in Word Pairs in 1,092 Human Genomes

Authors

  • Vera Afreixo Department of Mathematics & CIDMA, University of Aveiro, 3810-193 Aveiro, Portugal
  • Sara P. Garcia Department of Electronics, Telecommunications and Informatics & Signal Processing Lab, IEETA, University of Aveiro, 3810-193 Aveiro, Portugal
  • João M. O. S. Rodrigues Department of Electronics, Telecommunications and Informatics & Signal Processing Lab, IEETA, University of Aveiro, 3810-193 Aveiro, Portugal

DOI:

https://doi.org/10.11113/jt.v63.1946

Keywords:

Human genomes, single strand symmetry, equivalence testing, symmetry score, 1000 genomes project

Abstract

Single strand symmetry has been observed in several genomes, and some authors have associated this phenomenon with genome evolution. However, it is still not clear how strong and exceptional this phenomenon is. We use next-generation sequencing data from a sample of 1,092 human individuals made available by the 1000 Genomes Project. To evaluate the symmetry of single-strand human genomic DNA, we analyze these 1,092 human genomes and 1,092 randomly generated sequences, each forced to mimic the nucleotide frequency distribution of its real counterpart. Our methodology combines symmetry measurements with traditional and equivalence statistical tests under different parameters. By statistical testing we find that the global symmetry phenomenon is significant for word lengths smaller than 8. When we evaluate the global symmetry scores, we obtain strong values for all word lengths and for both types of sequences under study. However, the symmetry scores in human genomes reach higher values and have lower dispersion than those in random sequences. We also find that human and random symmetry scores are significantly different. We conclude that the differences between symmetric words are larger in human genomes than in random sequences, but that the correlation between symmetric words is also higher in human genomes.
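The comparison described in the abstract can be illustrated with a minimal Python sketch. It assumes that "symmetric words" are a word and its reverse complement, counts overlapping words of length k, and uses a plain Pearson correlation between the two sets of counts as a stand-in measure. The abstract does not define the paper's symmetry score or its equivalence tests, so the function names, the correlation measure, and the shuffling-based random control below are all illustrative assumptions, not the authors' method.

```python
import math
import random
from collections import Counter
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(word):
    """Return the reverse complement of a DNA word over A, C, G, T."""
    return word.translate(COMPLEMENT)[::-1]


def word_counts(seq, k):
    """Count overlapping words of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def symmetric_pair_correlation(seq, k):
    """Correlate each word's count with its reverse complement's count.

    This is an assumed, simplified measure, not the paper's symmetry score.
    """
    counts = word_counts(seq, k)
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    x = [counts[w] for w in words]
    y = [counts[reverse_complement(w)] for w in words]
    return pearson(x, y)


def shuffled_control(seq, seed=0):
    """Random sequence with exactly the same nucleotide composition as seq.

    Shuffling is one assumed way to mimic the nucleotide frequencies.
    """
    rng = random.Random(seed)
    return "".join(rng.sample(seq, len(seq)))
```

Calling `symmetric_pair_correlation(genome, k)` and `symmetric_pair_correlation(shuffled_control(genome), k)` for several values of `k` would give one value for a real sequence and one for its composition-matched control, which mirrors the human-versus-random comparison in spirit only.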

Published

2013-07-15

Section

Science and Engineering

How to Cite

The Breakdown of Symmetry in Word Pairs in 1,092 Human Genomes. (2013). Jurnal Teknologi (Sciences & Engineering), 63(3). https://doi.org/10.11113/jt.v63.1946