Codon

Biological terminology
A codon is a group of three adjacent nucleotides in a messenger RNA (mRNA) molecule; during protein synthesis, each such triplet specifies a particular amino acid.
Messenger RNA determines the type and order of the amino acids in the protein molecules of a cell. This information is carried by the sequence of the four nucleotides (bases) in the mRNA molecule: three consecutive bases on the mRNA specify one amino acid.
Chinese name
Codon [1]
Foreign name
genetic code
Alias
Triplet code [2]
Discipline
biology
Introduction
Triplet nucleotide residue sequence on mRNA (or DNA)
Features
The anticodon of tRNA is complementary to the codon on mRNA [3]

Brief introduction

Codon: a triplet nucleotide residue sequence on mRNA (or DNA) that encodes a specific amino acid. The anticodon of tRNA is complementary to the codon on mRNA.
Initiation codon: the codon specifying the site where protein synthesis starts. The most common start codon is the methionine codon (AUG); the valine codon (GUG) is also used.
Termination codon: a codon that is not normally recognized by any tRNA molecule but can be bound by special proteins, causing the newly synthesized peptide chain to be released from the translation machinery. There are three termination codons: UAG, UAA and UGA.

Origin

Schematic Diagram of ATP Center Hypothesis
Apart from a few differences, the genetic codes of all organisms are very similar; therefore, according to the theory of evolution, the genetic code must have appeared very early in the history of life. Existing evidence suggests that the assignment of the genetic code is not a random result. One explanation is that some amino acids have a selective chemical affinity for their corresponding codons, which suggests that the complex process of protein manufacture may not have existed at first and that the earliest proteins may have formed directly on nucleic acids. [4]

Stereochemistry theory

Woese is the leading representative of the stereochemical theory. He held that the code originated from stereochemical interactions between amino acids and codons or anticodons (or, more generally, RNA). This view can be traced back to 1962, when Woese speculated that the coding relationship might reflect a stereochemical interaction between nucleic acids and amino acids. He regarded the nucleotides that differ between "degenerate" codons as equivalent. In May 1965, Woese published a paper entitled "The Rules of the Code" to set out the rules by which the genetic code is arranged, and argued that amino acid chromatography could be used to analyze "degeneracy", providing useful evidence that the coding relationship is a stereochemical interaction between nucleotides and amino acids. At that time the universal code had not yet been fully established, and the coding relationships Woese studied were still speculative. In fact, a best stereochemical match has never been demonstrated for all codons. However, it is an established fact that the hydrophobicity ranking of the amino acids matches that of the 3' dinucleosides of their anticodons, which indicates that stereochemical factors do influence the recognition between amino acids and anticodons. [5]

Frozen accident theory

Crick is the leading representative of the frozen accident theory. According to this theory, the coding relationship is the result of an accident in the course of evolution that became fixed; once established, the relationship persisted. The correspondence between codons and amino acids was fixed at some period in the history of life and has been difficult to change since. The coding relationship Crick discussed in his paper comes from the code table presented at the Cold Spring Harbor meeting in 1966. Apart from differences concerning the start codon and UGA, this table is basically consistent with today's accepted universal code. Although the hypothesis has been challenged by arguments based on the adaptive, historical and chemical characteristics of the code, the structure of the code itself shows that Crick's work at the time was highly objective and forward-looking. On the other hand, from the perspective of amino acid biosynthesis, amino acids on the same biosynthetic pathway often have codons that differ by only one base. There appears to be a relationship between the codons of amino acids formed at later stages and those of amino acids that appeared earlier. This coevolution of the amino acids and the codon dictionary shows that the coding relationship is not purely accidental.

Coevolutionary hypothesis

The coevolutionary hypothesis proposes that the present-day standard code evolved from a simpler primitive code, and that the evolution of codons paralleled the evolution of amino acid biosynthesis. The main argument is that in the primitive code the 64 codons may have encoded only a small number of amino acids, with high degeneracy. In later evolution, amino acids that differ in physicochemical properties but lie on related biosynthetic pathways came to have similar codons, indicating that codon evolution is closely related to amino acid biosynthesis. Wang Zihui proposed that an amino acid introduced into the code later might have obtained its codons by taking over some of the codons of an amino acid with a related biosynthetic pathway. The coevolution theory identified eight precursor-product pairs. The hypothesis was later developed by M. Di Giulio.

In vitro selection theory

Eigen and colleagues carried out the following experiment while studying the origin of the genetic code: in a test tube, without any enzyme or template and relying only on catalysis by zinc ions, nucleotide monomers were polymerized into oligonucleotides, which then replicated and amplified by serving as templates for one another; finally, under different serial-transfer conditions, different tRNA clones were selected, forming a quasispecies of RNA molecules. This work is known as the "in vitro (test-tube) selection theory"; it shows that ribonucleic acid, which can initiate the formation of life, can form under natural, abiotic conditions. Based on the physical and chemical environment after the formation of the Earth, the active period for the formation of biological macromolecules is estimated at about 3.8 to 4 billion years ago. Drawing on this experiment, researchers believe that conditions on the early Earth influenced the generation of early short RNA sequences and the evolution of the code. However, this theory does not pay enough attention to the structure of the code table itself. [5]
The original genetic code may have been much simpler than today's; as life evolved and produced new amino acids, these were incorporated and made the genetic code more complex. Although much evidence supports this view, the detailed evolutionary process is still being explored. Through natural selection, the current genetic code has come to reduce the adverse effects of mutations. That is, the genetic code was shaped at different stages by three factors: selection, history and chemistry (the comprehensive evolution hypothesis). [6]

History of deciphering

Nirenberg (M. W. Nirenberg, 1927-2010) and Matthaei (H. Matthaei) deciphered the first codon of the genetic code.
Nirenberg and Matthaei used an in vitro (cell-free) protein synthesis technique. They added a single kind of amino acid to each test tube, then added cell extract from which DNA and mRNA had been removed, together with a synthetic RNA, polyuridylic acid (poly-U). As a result, a polyphenylalanine peptide chain appeared in the tube containing phenylalanine. The experiment showed that poly-U directs the synthesis of polyphenylalanine. The base sequence of poly-U consists of many uracils (UUUUUUUUUUU...), so a sequence of uracil bases encodes a peptide chain made of phenylalanine. Combined with Crick's experimental conclusion that three bases determine one amino acid, the codon for phenylalanine must be UUU. Over the next six or seven years, scientists followed this in vitro protein synthesis approach, continually improved the experimental methods, deciphered all the codons, and compiled the codon table. [7]

Types

RNA is made up of four bases, and three bases determine one amino acid. In theory, the number of base combinations is $4^3 = 64$, and these 64 combinations are the 64 codons. How do they specify 20 amino acids? A careful analysis of the codon table for the 20 amino acids shows that the same amino acid can be specified by several different codons. The start codon is AUG (methionine). In addition, the three codons UAA, UAG and UGA do not specify any amino acid; they are the termination codons of protein synthesis. In the 1994 edition of "Structuralism" by Zeng Bangzhe, the combinatorial formulas relating codons and amino acids are given as $C_4^1 + 2C_4^2 + C_4^3 = 4 + 12 + 4 = 20$ amino acids and $C_4^1 + 6(C_4^2 + C_4^3) = 4 + 6 \times 10 = 64$ codons. (The other way to count is $4 \times 4 \times 4 = 64$, since each of the three positions in a codon has 4 possible bases.)
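The counting argument above can be checked with a short sketch. It enumerates every three-base combination and evaluates the two combinatorial formulas quoted from the source; the variable names are illustrative only and do not come from the original text.

```python
from itertools import product
from math import comb

# Enumerate all ordered triplets over the four RNA bases.
codons = ["".join(triplet) for triplet in product("ACGU", repeat=3)]
codon_count = len(codons)  # 4 * 4 * 4

# The two combinatorial formulas quoted in the text.
amino_acid_formula = comb(4, 1) + 2 * comb(4, 2) + comb(4, 3)
codon_formula = comb(4, 1) + 6 * (comb(4, 2) + comb(4, 3))

print(codon_count, amino_acid_formula, codon_formula)
```

The enumeration and the second formula agree on 64, and the first formula gives 20. The sketch confirms only the arithmetic, not any biological interpretation of the formulas.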

Characteristics

① The genetic code is a triplet code: each codon consists of three adjacent bases on messenger ribonucleic acid (mRNA).
② Codons are universal: the codons of different organisms are basically the same, that is, they share one set of codons.
③ The genetic code has no commas: there is no punctuation between two codons and no non-coding nucleotide between codons. The code must be read in a definite reading frame, starting from the correct start point and reading continuously to the termination signal.
④ The genetic code does not overlap: on a polynucleotide chain, no two adjacent codons share any nucleotide.
⑤ Codons are degenerate: except for methionine and tryptophan, every amino acid has at least two codons. As a result, the accidental substitution of a single base does not, to a certain extent, change the amino acid in the sequence.
⑥ Codons are read and translated in a definite direction: from the 5' end to the 3' end.
⑦ There are start codons and termination codons: there are two start codons, AUG (methionine) and GUG (valine), while the three stop codons (UAA, UAG, UGA) have no corresponding transfer ribonucleic acid (tRNA) and are recognized only by release factors, which terminate translation.
In messenger RNA, the base letter A stands for adenine, G for guanine, C for cytosine and U for uracil. (Note: unlike DNA, RNA contains no thymine (T); it is replaced by uracil (U), which pairs with A according to the base complementary pairing principle.)
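The reading rules listed above (triplets, no commas, no overlap, 5' to 3' direction, start and stop codons) can be illustrated with a minimal sketch. It assumes the standard codon table, uses one-letter amino acid abbreviations, and simplifies initiation by starting at the first AUG it finds; the names `CODON_TABLE` and `translate` are hypothetical and not taken from the source.

```python
from itertools import product

# Standard codon table, bases ordered U, C, A, G at each position.
# "*" marks the three termination codons (UAA, UAG, UGA).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate(mrna):
    """Translate an mRNA (written 5'->3') from the first AUG to the first stop codon."""
    start = mrna.find("AUG")          # simplification: first AUG sets the reading frame
    if start == -1:
        return ""
    peptide = []
    # Step by 3: consecutive, non-overlapping triplets with nothing between them.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":         # termination codon ends translation
            break
        peptide.append(amino_acid)
    return "".join(peptide)

print(translate("GGAUGUUUGGCUAAGG"))  # reads AUG UUU GGC, then stops at UAA
```

Shifting the start position by one base changes every downstream triplet, which is why the reading frame must be fixed by the start codon.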

Effect

Code table

First of all, this code table is not a biological fact. It is obtained by taking the one-letter abbreviations of the 20 standard amino acids and adding the six missing letters of the alphabet. Associations are sought from the three-letter abbreviations of the amino acids and the initial letters of the pinyin of their Chinese names, and the amino acids with the strongest codon degeneracy (i.e. the most redundant codons) are chosen first for substitution. The specific reassignments are as follows:
GCA,GCG:A→B
AGA,AGG:R→J
CCA,CCG:P→O
UUA,UUG:L→U
GUA,GUG:V→X
CAC:H→Z
The termination codons have also been reassigned. It should be emphasized that this coding scheme ignores the existing uses of the letters B and Z, as well as the real differences in strength among the stop codons.
Forward translation (substitution scheme)
A:GCU,GCC.
B:GCA,GCG.
C:UGU,UGC.
D:GAU,GAC.
E:GAA,GAG.
F:UUU,UUC.
G:GGU,GGC,GGA,GGG.
H:CAU.
I:AUU,AUC,AUA.
J:AGA,AGG.
K:AAA,AAG.
L:CUU,CUC,CUA,CUG.
M:AUG.
N:AAU,AAC.
O:CCA,CCG.
P:CCU,CCC.
Q:CAA,CAG.
R:CGU,CGC,CGA,CGG.
S:UCU,UCC,UCA,UCG,AGU,AGC.
T:ACU,ACC,ACA,ACG.
U:UUA,UUG.
V:GUU,GUC.
W:UGG.
X:GUA,GUG.
Y:UAU,UAC.
Z:CAC.
Start character: AUG [the same code as M, but with space (UAA)]
Space □: UAA
Break: UAG
Terminator: UGA
Reverse translation: see the figure "Modified Code Table"
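The letter-to-codon scheme listed above can be sketched as a pair of lookup functions. The table is copied from the list; choosing the first listed codon for each letter during forward translation is an assumption made here for illustration, and the break (UAG) and terminator (UGA) symbols are left out for brevity. The names `LETTER_CODONS`, `encode` and `decode` are hypothetical.

```python
# Letter -> codons, copied from the modified code table above.
LETTER_CODONS = {
    "A": "GCU GCC", "B": "GCA GCG", "C": "UGU UGC", "D": "GAU GAC",
    "E": "GAA GAG", "F": "UUU UUC", "G": "GGU GGC GGA GGG", "H": "CAU",
    "I": "AUU AUC AUA", "J": "AGA AGG", "K": "AAA AAG",
    "L": "CUU CUC CUA CUG", "M": "AUG", "N": "AAU AAC", "O": "CCA CCG",
    "P": "CCU CCC", "Q": "CAA CAG", "R": "CGU CGC CGA CGG",
    "S": "UCU UCC UCA UCG AGU AGC", "T": "ACU ACC ACA ACG",
    "U": "UUA UUG", "V": "GUU GUC", "W": "UGG", "X": "GUA GUG",
    "Y": "UAU UAC", "Z": "CAC", " ": "UAA",
}
LETTER_CODONS = {letter: codons.split() for letter, codons in LETTER_CODONS.items()}

# Reverse table: every codon maps back to exactly one letter.
CODON_LETTER = {c: letter for letter, codons in LETTER_CODONS.items() for c in codons}

def encode(text):
    """Forward translation: letters to RNA, using the first listed codon (an assumption)."""
    return "".join(LETTER_CODONS[ch][0] for ch in text.upper())

def decode(rna):
    """Reverse translation: read consecutive triplets back into letters."""
    return "".join(CODON_LETTER[rna[i:i + 3]] for i in range(0, len(rna), 3))

print(encode("DNA"))
print(decode("GCAUAACAC"))
```

Because several codons map to one letter, decoding is unambiguous while encoding involves a free choice among synonymous codons.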

Amino acid characteristics

First base U
UUU (Phe/F) phenylalanine
UUC (Phe/F) phenylalanine
UUA (Leu/L) leucine
UUG (Leu/L) leucine
UCU (Ser/S) serine
UCC (Ser/S) serine
UCA (Ser/S) serine
UCG (Ser/S) serine
UAU (Tyr/Y) tyrosine
UAC (Tyr/Y) tyrosine
UAA (termination)
UAG (termination)
UGU (Cys/C) cysteine
UGC (Cys/C) cysteine
UGA (termination)
UGG (Trp/W) tryptophan
First base C
CUU (Leu/L) leucine
CUC (Leu/L) leucine
CUA (Leu/L) leucine
CUG (Leu/L) leucine
CCU (Pro/P) proline
CCC (Pro/P) proline
CCA (Pro/P) proline
CCG (Pro/P) proline
CAU (His/H) histidine
CAC (His/H) histidine
CAA (Gln/Q) glutamine
CAG (Gln/Q) glutamine
CGU (Arg/R) arginine
CGC (Arg/R) arginine
CGA (Arg/R) arginine
CGG (Arg/R) arginine
First base A
AUU (Ile/I) isoleucine
AUC (Ile/I) isoleucine
AUA (Ile/I) isoleucine
AUG (Met/M) methionine (start)
ACU (Thr/T) threonine
ACC (Thr/T) threonine
ACA (Thr/T) threonine
ACG (Thr/T) threonine
AAU (Asn/N) asparagine
AAC (Asn/N) asparagine
AAA (Lys/K) lysine
AAG (Lys/K) lysine
AGU (Ser/S) serine
AGC (Ser/S) serine
AGA (Arg/R) arginine
AGG (Arg/R) arginine
First base G
GUU (Val/V) valine
GUC (Val/V) valine
GUA (Val/V) valine
GUG (Val/V) valine
GCU (Ala/A) alanine
GCC (Ala/A) alanine
GCA (Ala/A) alanine
GCG (Ala/A) alanine
GAU (Asp/D) aspartic acid
GAC (Asp/D) aspartic acid
GAA (Glu/E) glutamic acid
GAG (Glu/E) glutamic acid
GGU (Gly/G) glycine
GGC (Gly/G) glycine
GGA (Gly/G) glycine
GGG (Gly/G) glycine [10]
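The table above can be turned into a lookup structure and used to count how many codons specify each amino acid. This is a minimal sketch: the compact string encoding of the table and the names `CODON_TABLE` and `degeneracy` are choices made here for illustration, not part of the source.

```python
from collections import Counter
from itertools import product

# The 64 entries of the table above, with bases ordered U, C, A, G at each
# position and one-letter amino acid codes; "*" marks a termination codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Degeneracy: number of codons per amino acid (and per stop signal).
degeneracy = Counter(CODON_TABLE.values())

single_codon = sorted(aa for aa, n in degeneracy.items() if n == 1)
six_codons = sorted(aa for aa, n in degeneracy.items() if n == 6)

print(single_codon)     # amino acids with exactly one codon
print(six_codons)       # amino acids with six codons
print(degeneracy["*"])  # number of termination codons
```

The counts match the degeneracy property described earlier: only methionine (M) and tryptophan (W) have a single codon, while leucine, arginine and serine each have six.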

Differences and connections

Differences and connections among genetic information, codons and anticodons
Genetic information refers to the sequence of deoxynucleotides (bases) in the genes of a DNA molecule; a codon is a sequence of three adjacent bases on messenger RNA that specifies one amino acid; an anticodon is the sequence of three bases at one end of a transfer RNA. The connection is that the genetic information of DNA (the gene) is passed to messenger RNA through transcription, and the transfer RNA carries an amino acid at one end while the anticodon at its other end pairs with the codon (bases) on the messenger RNA. [8]
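The relationship among the three terms can be sketched with base-pairing lookups. The sketch assumes the DNA template strand is written 3' to 5' so that the resulting mRNA reads 5' to 3', and it returns the anticodon in the conventional 5' to 3' orientation; the function names are hypothetical.

```python
# Base-pairing rules: DNA template -> mRNA, and mRNA codon -> tRNA anticodon.
DNA_TO_MRNA = str.maketrans("ATGC", "UACG")
RNA_PAIR = str.maketrans("AUGC", "UACG")

def transcribe(template_3_to_5):
    """Return the mRNA (5'->3') transcribed from a template strand written 3'->5'."""
    return template_3_to_5.translate(DNA_TO_MRNA)

def anticodon(codon):
    """Return the anticodon (written 5'->3') that pairs with an mRNA codon (5'->3')."""
    # Pairing is antiparallel, so complement the codon and reverse it.
    return codon.translate(RNA_PAIR)[::-1]

mrna = transcribe("TACAAA")
print(mrna)                   # the codons carried by the mRNA
print(anticodon(mrna[0:3]))   # anticodon of the tRNA that reads the first codon
```

Written antiparallel, the codon 5'-AUG-3' pairs with 3'-UAC-5', which is the same anticodon printed here as 5'-CAU-3'.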

Application

Improve heterologous expression of genes

The best host for a target gene can be predicted by analyzing codon usage patterns; alternatively, genetic engineering methods can be used to give the target gene the optimal codon usage pattern for expression. These different methods share one aim: to improve the expression of heterologous genes by exploiting codon preference.
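A codon usage pattern can be summarized as the fraction of times each codon is used among the codons for the same amino acid. The sketch below computes that fraction for a coding sequence given as RNA; it assumes the standard codon table and an in-frame, uppercase sequence, and the name `codon_usage` is hypothetical.

```python
from collections import Counter, defaultdict
from itertools import product

BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def codon_usage(coding_rna):
    """Return {codon: fraction of its amino acid's occurrences that use this codon}."""
    usable = len(coding_rna) - len(coding_rna) % 3   # ignore a trailing partial codon
    counts = Counter(coding_rna[i:i + 3] for i in range(0, usable, 3))
    per_amino_acid = defaultdict(int)
    for codon, n in counts.items():
        per_amino_acid[CODON_TABLE[codon]] += n
    return {codon: n / per_amino_acid[CODON_TABLE[codon]] for codon, n in counts.items()}

usage = codon_usage("CUGCUGCUGCUUAAA")
print(usage)  # leucine is encoded three times by CUG and once by CUU
```

Comparing such fractions between a gene and a candidate host is one simple way to see whether the gene relies on codons the host rarely uses.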

Translation initiation effect

mRNA concentration is one of the main factors affecting the rate of translation initiation. Codons directly affect the efficiency of transcription and thereby determine mRNA concentration. For example, in monocots the codon bias in the "translation start region" is greater than that in the "translation end region", suggesting that codon usage in the "translation start region" is more important for improving the efficiency and accuracy of protein translation. It is therefore possible to raise the protein expression level by modifying the DNA sequence at the 5' end of the coding region.

Influence the structure and function of protein

Codon bias is related to the linker regions between the protein domains that a gene encodes and to the linker regions between secondary-structure units; translation slows down in these linker regions. Using cluster analysis, Ma Jianmin and colleagues found that the codon preference of mammalian MHC genes is closely related to the tertiary structure of the encoded proteins, and that it can change the spatial conformation of the encoded protein by affecting the translation speed of different regions of the mRNA. The protein structural units selected in their study were protein fingerprints, which are to a large extent also functional units of proteins, indicating that codon bias is also closely related to protein function. Changing codon usage patterns can deliberately alter the structure and function of specific proteins.

Gene localization

Codon usage patterns also differ between nuclear and cytoplasmic genetic material. For example, the start codon in nuclear genes is only ATG, while the start codon in mitochondrial genes is ATN; the termination codon TGA of nuclear genes codes for tryptophan in mitochondrial genes. Therefore, by comparing codon usage patterns, we can determine the cellular location of eukaryotic ribosomes and locate unknown genes in genomes.
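The TGA difference described above can be illustrated by translating the same DNA coding sequence under two tables. This sketch changes only the one assignment named in the text (TGA as tryptophan in mitochondria); real mitochondrial codes differ in further codons and vary between lineages, and the names used here are hypothetical.

```python
from itertools import product

# Standard (nuclear) table over DNA letters; "*" marks a termination codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
NUCLEAR_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Assumption for illustration: apply only the TGA -> tryptophan change from the text.
MITO_TABLE = dict(NUCLEAR_TABLE, TGA="W")

def translate_dna(coding_dna, table):
    """Translate in-frame coding DNA until the first termination codon."""
    peptide = []
    for i in range(0, len(coding_dna) - 2, 3):
        amino_acid = table[coding_dna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)

gene = "ATGTGATAA"
print(translate_dna(gene, NUCLEAR_TABLE))  # TGA read as a stop
print(translate_dna(gene, MITO_TABLE))     # TGA read as tryptophan
```

A sequence that contains in-frame TGA codons yet yields a full-length product only under the second table is a hint that the gene is mitochondrial rather than nuclear.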

Predicting evolutionary patterns

Similar codon usage patterns indicate that species are closely related or live in similar environments. Some studies have analyzed the phylogeny and evolution of species by comparing differences in codon bias. The mitochondrial genome is an important molecular marker, with the advantages of maternal inheritance, simple molecular structure and rich polymorphism. Studying its codon usage preference can help determine the genetic differentiation and phylogenetic relationships of animal groups. [9]

Scientific research

On the evening of May 10, 2023, the Tianzhou-6 cargo spacecraft was launched carrying 98 scientific experiment items. Among them, in the field of space life science and biotechnology, four scientific experiments will be carried out in the biotechnology experiment cabinet of the Wentian laboratory module, including research on the molecular evolution of the co-origin of proteins and nucleic acids and on the origin of codons. [11]