Publication

How Structural and Physicochemical Determinants Shape Sequence Constraints in a Functional Enzyme

Related concepts (32)

Protein–protein interactions (PPIs) are physical contacts of high specificity established between two or more protein molecules as a result of biochemical events steered by interactions that include electrostatic forces, hydrogen bonding and the hydrophobic effect. Many are physical contacts with molecular associations between chains that occur in a cell or in a living organism in a specific biomolecular context. Proteins rarely act alone as their functions tend to be regulated.

Molecular evolution

Molecular evolution is the process of change in the sequence composition of cellular molecules such as DNA, RNA, and proteins across generations. The field of molecular evolution uses principles of evolutionary biology and population genetics to explain patterns in these changes. Major topics in molecular evolution concern the rates and impacts of single nucleotide changes, neutral evolution vs. natural selection, origins of new genes, the genetic nature of complex traits, the genetic basis of speciation, the evolution of development, and ways that evolutionary forces influence genomic and phenotypic changes.

Sequence

In mathematics, a sequence is an enumerated collection of objects in which repetitions are allowed and order matters. Like a set, it contains members (also called elements, or terms). The number of elements (possibly infinite) is called the length of the sequence. Unlike a set, the same elements can appear multiple times at different positions in a sequence, and unlike a set, the order does matter. Formally, a sequence can be defined as a function from natural numbers (the positions of elements in the sequence) to the elements at each position.

Enzyme

Enzymes (ˈɛnzaɪmz) are proteins that act as biological catalysts by accelerating chemical reactions. The molecules upon which enzymes may act are called substrates, and the enzyme converts the substrates into different molecules known as products. Almost all metabolic processes in the cell need enzyme catalysis in order to occur at rates fast enough to sustain life. Metabolic pathways depend upon enzymes to catalyze individual steps.

Protein folding

Protein folding is the physical process where a protein chain is translated into its native three-dimensional structure, typically a "folded" conformation, by which the protein becomes biologically functional. Via an expeditious and reproducible process, a polypeptide folds into its characteristic three-dimensional structure from a random coil. Each protein exists first as an unfolded polypeptide or random coil after being translated from a sequence of mRNA into a linear chain of amino acids.

Sanger sequencing

Sanger sequencing is a method of DNA sequencing that involves electrophoresis and is based on the random incorporation of chain-terminating dideoxynucleotides by DNA polymerase during in vitro DNA replication. After first being developed by Frederick Sanger and colleagues in 1977, it became the most widely used sequencing method for approximately 40 years. It was first commercialized by Applied Biosystems in 1986. More recently, higher volume Sanger sequencing has been replaced by next generation sequencing methods, especially for large-scale, automated genome analyses.

Complex traits

Complex traits, also known as quantitative traits, are traits that do not behave according to simple Mendelian inheritance laws. More specifically, their inheritance cannot be explained by the genetic segregation of a single gene. Such traits show a continuous range of variation and are influenced by both environmental and genetic factors. Compared to strictly Mendelian traits, complex traits are far more common, and because they can be hugely polygenic, they are studied using statistical techniques such as quantitative genetics and quantitative trait loci (QTL) mapping rather than classical genetics methods.

Phenotypic trait

A phenotypic trait, simply trait, or character state is a distinct variant of a phenotypic characteristic of an organism; it may be either inherited or determined environmentally, but typically occurs as a combination of the two. For example, having eye color is a character of an organism, while blue, brown and hazel versions of eye colour are traits. The term trait is generally used in genetics, often to describe phenotypic expression of different combinations of alleles in different individual organisms within a single population, such as the famous purple vs.

Whole genome sequencing

Whole genome sequencing (WGS), also known as full genome sequencing, complete genome sequencing, or entire genome sequencing, is the process of determining the entirety, or nearly the entirety, of the DNA sequence of an organism's genome at a single time. This entails sequencing all of an organism's chromosomal DNA as well as DNA contained in the mitochondria and, for plants, in the chloroplast. Whole genome sequencing has largely been used as a research tool, but was being introduced to clinics in 2014.

Enzyme inhibitor

An enzyme inhibitor is a molecule that binds to an enzyme and blocks its activity. Enzymes are proteins that speed up chemical reactions necessary for life, in which substrate molecules are converted into products. An enzyme facilitates a specific chemical reaction by binding the substrate to its active site, a specialized area on the enzyme that accelerates the most difficult step of the reaction.

Neutral theory of molecular evolution

The neutral theory of molecular evolution holds that most evolutionary changes occur at the molecular level, and most of the variation within and between species are due to random genetic drift of mutant alleles that are selectively neutral. The theory applies only for evolution at the molecular level, and is compatible with phenotypic evolution being shaped by natural selection as postulated by Charles Darwin.

Sequence space

In functional analysis and related areas of mathematics, a sequence space is a vector space whose elements are infinite sequences of real or complex numbers. Equivalently, it is a function space whose elements are functions from the natural numbers to the field K of real or complex numbers. The set of all such functions is naturally identified with the set of all possible infinite sequences with elements in K, and can be turned into a vector space under the operations of pointwise addition of functions and pointwise scalar multiplication.

Limit of a sequence

As the positive integer becomes larger and larger, the value becomes arbitrarily close to . We say that "the limit of the sequence equals ." In mathematics, the limit of a sequence is the value that the terms of a sequence "tend to", and is often denoted using the symbol (e.g., ). If such a limit exists, the sequence is called convergent. A sequence that does not converge is said to be divergent. The limit of a sequence is said to be the fundamental notion on which the whole of mathematical analysis ultimately rests.

Enzyme kinetics

Enzyme kinetics is the study of the rates of enzyme-catalysed chemical reactions. In enzyme kinetics, the reaction rate is measured and the effects of varying the conditions of the reaction are investigated. Studying an enzyme's kinetics in this way can reveal the catalytic mechanism of this enzyme, its role in metabolism, how its activity is controlled, and how a drug or a modifier (inhibitor or activator) might affect the rate. An enzyme (E) is typically a protein molecule that promotes a reaction of another molecule, its substrate (S).

Molecular clock

The molecular clock is a figurative term for a technique that uses the mutation rate of biomolecules to deduce the time in prehistory when two or more life forms diverged. The biomolecular data used for such calculations are usually nucleotide sequences for DNA, RNA, or amino acid sequences for proteins. The benchmarks for determining the mutation rate are often fossil or archaeological dates. The molecular clock was first tested in 1962 on the hemoglobin protein variants of various animals, and is commonly used in molecular evolution to estimate times of speciation or radiation.

Protein microarray

A protein microarray (or protein chip) is a high-throughput method used to track the interactions and activities of proteins, and to determine their function, and determining function on a large scale. Its main advantage lies in the fact that large numbers of proteins can be tracked in parallel. The chip consists of a support surface such as a glass slide, nitrocellulose membrane, bead, or microtitre plate, to which an array of capture proteins is bound. Probe molecules, typically labeled with a fluorescent dye, are added to the array.

Coverage (genetics)

In genetics, coverage is one of several measures of the depth or completeness of DNA sequencing, and is more specifically expressed in any of the following terms: Sequence coverage (or depth) is the number of unique reads that include a given nucleotide in the reconstructed sequence. Deep sequencing refers to the general concept of aiming for high number of unique reads of each region of a sequence. Physical coverage, the cumulative length of reads or read pairs expressed as a multiple of genome size.

Functional genomics

Functional genomics is a field of molecular biology that attempts to describe gene (and protein) functions and interactions. Functional genomics make use of the vast data generated by genomic and transcriptomic projects (such as genome sequencing projects and RNA sequencing). Functional genomics focuses on the dynamic aspects such as gene transcription, translation, regulation of gene expression and protein–protein interactions, as opposed to the static aspects of the genomic information such as DNA sequence or structures.

Mutation

In biology, a mutation is an alteration in the nucleic acid sequence of the genome of an organism, virus, or extrachromosomal DNA. Viral genomes contain either DNA or RNA. Mutations result from errors during DNA or viral replication, mitosis, or meiosis or other types of damage to DNA (such as pyrimidine dimers caused by exposure to ultraviolet radiation), which then may undergo error-prone repair (especially microhomology-mediated end joining), cause an error during other forms of repair, or cause an error during replication (translesion synthesis).

Active site

In biology and biochemistry, the active site is the region of an enzyme where substrate molecules bind and undergo a chemical reaction. The active site consists of amino acid residues that form temporary bonds with the substrate, the binding site, and residues that catalyse a reaction of that substrate, the catalytic site. Although the active site occupies only ~10–20% of the volume of an enzyme, it is the most important part as it directly catalyzes the chemical reaction.