Protein–protein interaction: Protein–protein interactions (PPIs) are physical contacts of high specificity established between two or more protein molecules as a result of biochemical events steered by interactions that include electrostatic forces, hydrogen bonding and the hydrophobic effect. Many are physical contacts with molecular associations between chains that occur in a cell or in a living organism in a specific biomolecular context. Proteins rarely act alone, as their functions tend to be regulated.
Protein structure: Protein structure is the three-dimensional arrangement of atoms in an amino acid-chain molecule. Proteins are polymers, specifically polypeptides, formed from sequences of amino acids, which are the monomers of the polymer. A single amino acid monomer may also be called a residue, which indicates a repeating unit of a polymer. Proteins form by amino acids undergoing condensation reactions, in which the amino acids lose one water molecule per reaction in order to attach to one another with a peptide bond.
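The condensation step described above implies a simple bookkeeping rule: a linear chain of n residues contains n - 1 peptide bonds and has released n - 1 water molecules. The following minimal sketch illustrates that arithmetic. The function names are hypothetical, and the mass table is an illustrative three-entry subset of approximate average molar masses, not a complete reference.

```python
# Approximate average molar masses (g/mol) of FREE amino acids.
# Illustrative subset only (assumption): glycine, alanine, serine.
FREE_AA_MASS = {"G": 75.07, "A": 89.09, "S": 105.09}
WATER_MASS = 18.015  # g/mol, lost once per peptide bond formed


def peptide_bond_count(sequence: str) -> int:
    """A linear chain of n residues has n - 1 peptide bonds."""
    return max(len(sequence) - 1, 0)


def linear_peptide_mass(sequence: str) -> float:
    """Sum of free amino acid masses minus one water per peptide bond."""
    total = sum(FREE_AA_MASS[aa] for aa in sequence)
    return total - peptide_bond_count(sequence) * WATER_MASS


# Glycine + alanine joined by one peptide bond (one water released).
dipeptide_mass = linear_peptide_mass("GA")
```

Under these assumed masses, the dipeptide comes out at roughly 146.1 g/mol, that is, the two free amino acids minus a single water molecule.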
Protein: Proteins are large biomolecules and macromolecules that comprise one or more long chains of amino acid residues. Proteins perform a vast array of functions within organisms, including catalysing metabolic reactions, DNA replication, responding to stimuli, providing structure to cells and organisms, and transporting molecules from one location to another. Proteins differ from one another primarily in their sequence of amino acids, which is dictated by the nucleotide sequence of their genes, and which usually results in protein folding into a specific 3D structure that determines its activity.
Essential gene: Essential genes are genes that are indispensable for an organism to grow and reproduce in a particular environment. However, being essential is highly dependent on the circumstances in which an organism lives. For instance, a gene required to digest starch is only essential if starch is the only source of energy. Recently, systematic attempts have been made to identify those genes that are absolutely required to maintain life, provided that all nutrients are available.
Protein quaternary structure: Protein quaternary structure is the fourth (and highest) classification level of protein structure. Protein quaternary structure refers to the structure of proteins which are themselves composed of two or more smaller protein chains (also referred to as subunits). Protein quaternary structure describes the number and arrangement of multiple folded protein subunits in a multi-subunit complex. It includes organizations from simple dimers to large homooligomers and complexes with defined or variable numbers of subunits.
DNA sequencing: DNA sequencing is the process of determining the nucleic acid sequence – the order of nucleotides in DNA. It includes any method or technology that is used to determine the order of the four bases: adenine, guanine, cytosine, and thymine. The advent of rapid DNA sequencing methods has greatly accelerated biological and medical research and discovery. Knowledge of DNA sequences has become indispensable for basic biological research, DNA Genographic Projects and in numerous applied fields such as medical diagnosis, biotechnology, forensic biology, virology and biological systematics.
Proteome: The proteome is the entire set of proteins that is, or can be, expressed by a genome, cell, tissue, or organism at a certain time. It is the set of expressed proteins in a given type of cell or organism, at a given time, under defined conditions. Proteomics is the study of the proteome. While the term generally refers to the proteome of an organism, multicellular organisms may have very different proteomes in different cells, so it is important to distinguish between the proteomes of cells and those of organisms.
Protein function prediction: Protein function prediction methods are techniques that bioinformatics researchers use to assign biological or biochemical roles to proteins. These proteins are usually ones that are poorly studied or that have been predicted from genomic sequence data. The predictions are often driven by data-intensive computational procedures. Information may come from nucleic acid sequence homology, gene expression profiles, protein domain structures, text mining of publications, phylogenetic profiles, phenotypic profiles, and protein–protein interactions.
Protein biosynthesis: Protein biosynthesis (or protein synthesis) is a core biological process, occurring inside cells, balancing the loss of cellular proteins (via degradation or export) through the production of new proteins. Proteins perform a number of critical functions as enzymes, structural proteins or hormones. Protein synthesis is a very similar process for both prokaryotes and eukaryotes, but there are some distinct differences. Protein synthesis can be divided broadly into two phases: transcription and translation.
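The two phases can be illustrated with a deliberately simplified sketch. It assumes the input is the coding (sense) strand of DNA, so transcription reduces to swapping T for U, and it uses only a small subset of the standard genetic code. The function names and the example sequence are hypothetical; real transcription reads the template strand and involves processing steps not modelled here.

```python
# Small subset of the standard genetic code (assumption: illustration only).
# "*" marks a stop codon.
CODON_TABLE = {
    "AUG": "M",  # methionine (start)
    "UUU": "F",  # phenylalanine
    "GGC": "G",  # glycine
    "GCU": "A",  # alanine
    "UAA": "*", "UAG": "*", "UGA": "*",
}


def transcribe(coding_dna: str) -> str:
    """Phase 1 (simplified): coding-strand DNA to mRNA by replacing T with U."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Phase 2 (simplified): read codons in frame until a stop codon."""
    residues = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        residues.append(amino_acid)
    return "".join(residues)


mrna = transcribe("ATGTTTGGCTAA")
protein = translate(mrna)
```

A codon missing from this reduced table raises a `KeyError`, which is intentional for a sketch: it makes the table's incompleteness visible rather than silently skipping residues.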
High-content screening: High-content screening (HCS), also known as high-content analysis (HCA) or cellomics, is a method that is used in biological research and drug discovery to identify substances such as small molecules, peptides, or RNAi that alter the phenotype of a cell in a desired manner. Hence, high-content screening is a type of phenotypic screen conducted in cells, involving the analysis of whole cells or components of cells with simultaneous readout of several parameters.
Proteomics: Proteomics is the large-scale study of proteins. Proteins are vital parts of living organisms, with many functions such as the formation of structural fibers of muscle tissue, enzymatic digestion of food, or synthesis and replication of DNA. In addition, other kinds of proteins include antibodies that protect an organism from infection, and hormones that send important signals throughout the body. The proteome is the entire set of proteins produced or modified by an organism or system.
Escherichia coli: Escherichia coli (/ˌɛʃəˈrɪkiə ˈkoʊlaɪ/) is a Gram-negative, facultative anaerobic, rod-shaped, coliform bacterium of the genus Escherichia that is commonly found in the lower intestine of warm-blooded organisms. Most E. coli strains are harmless, but some serotypes such as EPEC and ETEC are pathogenic and can cause serious food poisoning in their hosts, and are occasionally responsible for food contamination incidents that prompt product recalls.
Protein superfamily: A protein superfamily is the largest grouping (clade) of proteins for which common ancestry can be inferred (see homology). Usually this common ancestry is inferred from structural alignment and mechanistic similarity, even if no sequence similarity is evident. Sequence homology can then be deduced even if not apparent (due to low sequence similarity). Superfamilies typically contain several protein families which show sequence similarity within each family.
Interactome: In molecular biology, an interactome is the whole set of molecular interactions in a particular cell. The term specifically refers to physical interactions among molecules (such as those among proteins, also known as protein–protein interactions, PPIs; or between small molecules and proteins) but can also describe sets of indirect interactions among genes (genetic interactions). The word "interactome" was originally coined in 1999 by a group of French scientists headed by Bernard Jacq.
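One common way to work with such a set of interactions is as an undirected graph, with molecules as nodes and physical interactions as edges. The sketch below is a minimal illustration under that assumption. The protein names P1 to P4 and the interaction pairs are made up, and `build_interactome` is a hypothetical helper rather than part of any established library.

```python
from collections import defaultdict


def build_interactome(pairs):
    """Build an undirected adjacency map from (molecule, molecule) pairs."""
    network = defaultdict(set)
    for a, b in pairs:
        network[a].add(b)
        network[b].add(a)  # physical interactions are symmetric
    return network


# Made-up interaction pairs, for illustration only.
pairs = [("P1", "P2"), ("P2", "P3"), ("P2", "P4")]
network = build_interactome(pairs)

# The node with the most interaction partners (highest degree).
hub = max(network, key=lambda protein: len(network[protein]))
```

Storing partners in sets means a pair reported twice, or in both orders, still counts as a single interaction.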
Essential amino acid: An essential amino acid, or indispensable amino acid, is an amino acid that cannot be synthesized from scratch by the organism fast enough to supply its demand, and must therefore come from the diet. Of the 21 amino acids common to all life forms, the nine amino acids humans cannot synthesize are valine, isoleucine, leucine, methionine, phenylalanine, tryptophan, threonine, histidine, and lysine.
Expressed sequence tag: In genetics, an expressed sequence tag (EST) is a short sub-sequence of a cDNA sequence. ESTs may be used to identify gene transcripts, and were instrumental in gene discovery and in gene-sequence determination. The identification of ESTs has proceeded rapidly, with approximately 74.2 million ESTs available in public databases (e.g. GenBank as of 1 January 2013, all species). EST approaches have largely been superseded by whole genome and transcriptome sequencing and metagenome sequencing.
Gene mapping: Gene mapping or genome mapping describes the methods used to identify the location of a gene on a chromosome and the distances between genes. Gene mapping can also describe the distances between different sites within a gene. The essence of all genome mapping is to place a collection of molecular markers onto their respective positions on the genome. Molecular markers come in all forms. Genes can be viewed as one special type of genetic marker in the construction of genome maps, and are mapped in the same way as any other marker.
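The idea of placing markers at positions and reading off the distances between them can be sketched with a toy physical map. Everything here is an assumption for illustration: the marker names, the base-pair positions, and the helper functions are hypothetical, and real maps may instead use genetic distances derived from recombination frequencies.

```python
# Hypothetical markers with made-up positions (base pairs) on one chromosome.
markers = {"markerC": 48_000, "markerA": 5_000, "markerB": 21_500}


def build_map(marker_positions):
    """Order markers by their position along the chromosome."""
    return sorted(marker_positions.items(), key=lambda item: item[1])


def adjacent_distances(ordered_map):
    """Distance between each pair of neighbouring markers on the map."""
    return [
        (left[0], right[0], right[1] - left[1])
        for left, right in zip(ordered_map, ordered_map[1:])
    ]


genome_map = build_map(markers)
distances = adjacent_distances(genome_map)
```

A gene would enter this structure exactly like any other entry in `markers`, which mirrors the point that genes are mapped in the same way as other markers.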
Protein–protein interaction prediction: Protein–protein interaction prediction is a field combining bioinformatics and structural biology in an attempt to identify and catalog physical interactions between pairs or groups of proteins. Understanding protein–protein interactions is important for the investigation of intracellular signaling pathways, modelling of protein complex structures and for gaining insights into various biochemical processes.
Protein complex: A protein complex or multiprotein complex is a group of two or more associated polypeptide chains. Protein complexes are distinct from multidomain enzymes, in which multiple catalytic domains are found in a single polypeptide chain. Protein complexes are a form of quaternary structure. Proteins in a protein complex are linked by non-covalent protein–protein interactions. These complexes are a cornerstone of many (if not most) biological processes.
Functional genomics: Functional genomics is a field of molecular biology that attempts to describe gene (and protein) functions and interactions. Functional genomics makes use of the vast data generated by genomic and transcriptomic projects (such as genome sequencing projects and RNA sequencing). Functional genomics focuses on the dynamic aspects such as gene transcription, translation, regulation of gene expression and protein–protein interactions, as opposed to the static aspects of the genomic information such as DNA sequence or structures.