A new class of functional elements in the human genome? Study provides first genome-wide evidence for functional importance of unusual DNA structures

Natural selection appears to have preserved some regions of the human genome where the DNA can fold into three-dimensional structures called G-quadruplexes (G4s). A new study shows that G4s located in regulatory sequences, which control gene expression, or in other functional but non-protein-coding regions of the genome are more common and more stable. Conversely, outside these regions the structures are less common and less stable, and they evolve neutrally, including within the protein-coding regions of genes.

Together, these lines of evidence suggest that G4 elements should be added to the list of functional elements of the genome, along with genes, regulatory sequences, and non-protein-coding RNAs. A paper describing the research, by a team of Penn State researchers, appears June 29, 2021, in the journal Genome Research.

"Our study is the first to examine G4s throughout the genome and determine whether they exhibit the characteristics of functional elements," said Wilfried Guiblet, first author of the paper, who was a Penn State graduate student and is now a postdoctoral researcher at the National Cancer Institute.

G4s are one of several non-canonical forms into which DNA can fold instead of the double helix, collectively called "non-B DNA." As much as 1% of the genome can fold into G4s. (In comparison, protein-coding genes make up approximately 1.5%.) G4 structures form in DNA sequences that are rich in the nucleotide guanine, the "G" of the ACGT alphabet. G4s are implicated in many key cellular processes and have also been linked to several human diseases, including cancer.

To examine the function of G4s at a genome-wide level, the research team looked at their distribution, their thermostability, and whether they are under natural selection. The team confirmed that G4s are more prevalent in areas of the genome that have important cellular functions.
They also found that G4s in these areas are more stable than those elsewhere in the genome.

"The three-dimensional structure of G4s can form transiently," said Guiblet. "The stability of the structure depends on the underlying DNA sequence and other factors. We found that G4s located in functional regions of the genome tend to be more stable. This means that it is more likely that the DNA is folded into a G4 at any given time, and therefore more likely that the G4 is there for a functional purpose."

Functional regions of the genome are maintained by a form of natural selection called purifying selection. Mutations in these regions could disrupt their function, which can be detrimental to the organism. Purifying selection removes such mutations, preserving the integrity of the DNA sequence over time. In non-functional regions of the genome, a mutation may have little impact on the organism and may persist without any consequences. These regions are considered to be evolving neutrally. The research team wanted to know where on this spectrum G4s fall.

"Our tests showed that G4s located in functional regions of the genome are under purifying selection," said Yi-Fei Huang, assistant professor of biology at Penn State and a leader of the research team. "This is more evidence that G4s should be considered functional elements. G4s located in the protein-coding regions of genes are the only exception to this pattern. They are rarer, less stable, and do not appear to evolve under purifying selection. G4s in the protein-coding regions of genes could be non-functional or even costly to maintain."

Recent research has shown that G4s and other non-B DNA have higher mutation rates. Even so, G4s outside of protein-coding regions are maintained by purifying selection.
This adds weight to the evidence that G4s should be classified as functional elements.

"We believe that we are witnessing evidence for a paradigm shift in how scientists define function within the genome," said Kateryna Makova, Verne M. Willaman Chair of Life Sciences at Penn State and a leader of the research team. "Initially, geneticists focused on protein-coding genes. Then we discovered many functional non-coding elements, and now we have G4s and other non-B DNA elements. The three-dimensional structure of a DNA sequence may be as important in defining function as its nucleotide sequence."

"The identification of G4s as new functional elements in the human genome is crucial for precision medicine," said Kristin Eckert, professor of pathology at Penn State College of Medicine, a co-author of the paper and a member of the research team.