Secondary structure was originally defined by the hydrogen bonds of a biological macromolecule, as observed at atomic resolution. In proteins, secondary structure is defined by the pattern of hydrogen bonds between backbone amino and carboxyl groups. The hydrogen bonds are those defined by DSSP, which exclude bonds between the main chain and side chains and bonds between side chains. In nucleic acids, secondary structure is defined by the hydrogen bonds between bases. In many RNA molecules, secondary structure is essential to the normal function of the RNA, sometimes more so than the sequence itself. This fact is useful in the analysis of non-coding RNA. RNA secondary structure can be predicted by computer with reasonable accuracy, and many bioinformatics applications use some notion of secondary structure to analyze RNA.
In biochemistry and structural biology, secondary structure refers to the general three-dimensional form of local segments of biological macromolecules such as proteins and nucleic acids (DNA or RNA). It does not describe specific atomic positions, which are handled by tertiary structure.
Detailed information
Because hydrogen bonds correlate with other structural features, less formal definitions of secondary structure have arisen. For example, residues in protein helices generally adopt backbone dihedral angles in a particular region of the Ramachandran plot. A segment of residues with such dihedral angles is therefore often called a "helix", regardless of whether it actually has the corresponding hydrogen bonds. Other less formal definitions have also been proposed, most of them applying concepts from the differential geometry of curves, such as curvature and torsion. Least formally, structural biologists sometimes assign the secondary structure of an atomic-resolution structure by eye and record those assignments.
The secondary-structure content of a biological macromolecule can be roughly estimated by spectroscopy. For proteins, a common method is far-ultraviolet (wavelength 170 to 250 nm) circular dichroism. A pronounced double minimum at 208 nm and 222 nm indicates α-helical structure, whereas a single minimum at 204 nm or 217 nm indicates random-coil or β-sheet structure, respectively. A less commonly used method is infrared spectroscopy, which detects differences in the vibrations of amide groups caused by hydrogen bonding. Finally, secondary-structure content can be estimated accurately from the chemical shifts in a nuclear magnetic resonance (NMR) spectrum.
*E: Extended strand in a parallel and/or antiparallel β-sheet conformation. The minimum length is 2 residues.
*B: Residue in an isolated β-bridge (a single pair of β-sheet hydrogen bonds)
*S: Bend (the only assignment that is not based on hydrogen bonds)
Residues that are in none of the above conformations are designated by DSSP with a space, which is sometimes written as C (coil) or L (loop). The helices (G, H and I) and sheet conformations all require a certain minimum length. This means that two adjacent residues in the primary structure must form the same hydrogen-bonding pattern. If the helix or sheet hydrogen-bonding pattern is too short, the residues are designated T or B, respectively. Other protein secondary-structure assignment codes exist but are less frequently used.
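The DSSP codes described above are often collapsed to the three states (helix, sheet, coil) used by prediction methods. The following is a minimal sketch of one common reduction. The mapping itself is an assumption (other groupings are in use, for example treating G or B as coil), and `reduce_dssp` is a hypothetical helper name, not part of the DSSP program.

```python
# One common (not universal) reduction of DSSP codes to three states.
# The grouping below is an assumption chosen for illustration.
DSSP_TO_THREE_STATE = {
    "H": "H",  # alpha helix
    "G": "H",  # 3-10 helix
    "I": "H",  # pi helix
    "E": "E",  # extended strand in a beta sheet
    "B": "E",  # residue in an isolated beta bridge
    "T": "C",  # hydrogen-bonded turn
    "S": "C",  # bend
}


def reduce_dssp(dssp_string: str) -> str:
    """Map a per-residue DSSP string to H/E/C.

    Spaces (DSSP's "none of the above") and any unknown code become C.
    """
    return "".join(DSSP_TO_THREE_STATE.get(code, "C") for code in dssp_string)


if __name__ == "__main__":
    print(reduce_dssp("HHHH TT EEB GGG"))
```

Keeping the mapping in a dictionary makes it easy to swap in a different reduction scheme when comparing against a benchmark that uses another convention.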
Early methods of protein secondary-structure prediction were based on the propensity of each amino acid to form helices or sheets, sometimes combined with rules for estimating the energy of forming a secondary-structure element. These methods predicted which of three states (helix, sheet or coil) a residue adopts with about 60% accuracy. Using multiple sequence alignment raises the accuracy substantially, to about 80%. An alignment reveals the full distribution of amino acids at a position (and in its vicinity, typically 7 residues on either side) over the course of evolution, which gives a much clearer picture of the structural tendencies there. For example, a protein may have a glycine at a given position, which by itself suggests a random coil. A multiple sequence alignment, however, may reveal that helix-favoring amino acids occupy that position in 95% of homologous proteins spanning nearly a billion years of evolution. Moreover, the average hydrophobicity at that position may show a pattern of residue solvent accessibility consistent with an α-helix. Taken together, these factors suggest that the glycine of the original protein lies in an α-helix rather than in a random coil. Several kinds of methods combine the available data into a three-state prediction, including neural networks, hidden Markov models and support vector machines. Modern prediction methods also provide a confidence score for the prediction at each position.
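The accuracy figures quoted above are per-residue agreement between a predicted and an observed three-state string, a measure usually called Q3 (the name is not used in the text above). A minimal sketch, assuming both strings use the letters H, E and C and have equal length; `q3_accuracy` is a hypothetical helper name:

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Fraction of residues whose predicted state matches the observed state."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("cannot score an empty sequence")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


if __name__ == "__main__":
    # 8 of the 10 positions agree, so the score is 0.8.
    print(q3_accuracy("HHHHCCEEEE", "HHHCCCEEEC"))
```

Because the score depends on how the observed states were assigned, two benchmarks that reduce DSSP codes differently can give different values for the same prediction.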
Secondary-structure prediction methods are continuously benchmarked, for example in the EVA experiment. Based on about 270 weeks of testing, the most accurate methods were PsiPRED, SAM, PORTER, PROF and SABLE. Interestingly, taking a consensus of these methods does not appear to improve their accuracy. The greatest room for improvement appears to be in the prediction of β-strands, because the methods tend to overlook some β-strand segments. Overall, the highest achievable prediction accuracy is probably only about 90%, because of idiosyncrasies in DSSP, the standard assignment method against which the predictions are benchmarked.
Protein secondary structure consists of local interactions between residues mediated by hydrogen bonds. The most common secondary structures are α-helices and β-sheets, together with β-turns and random coil. Calculations show that other helices, such as the 3₁₀ helix and the π helix, have energetically favorable hydrogen-bonding patterns, but they are rare in natural proteins. They are found only at the ends of α-helices, because of unfavorable backbone packing in the center of the helix. Tight turns and loose, flexible loops link the more "regular" secondary-structure elements. Random coil is not a true secondary structure. It is the class of conformations that lack regular secondary structure.
Amino acids differ in their ability to form the various secondary-structure elements. Proline and glycine tend to occur in turns and can disrupt the regular conformation of the α-helical backbone, but both have unusual conformational abilities. Amino acids that prefer helical conformations in proteins include methionine, alanine, leucine, glutamate and lysine ("MALEK" in single-letter amino acid codes). By contrast, the large aromatic residues (tryptophan, tyrosine and phenylalanine) and the Cβ-branched amino acids (isoleucine, valine and threonine) prefer β-strand conformations. These preferences are not strong enough, however, to give a reliable method of predicting secondary structure from sequence alone.
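The residue preferences listed above can be written down as a simple lookup. The sketch below only labels each residue with its stated preference. It is not a predictor, in line with the caveat that these preferences alone are unreliable. The label letters (`h`, `e`, `t`, `-`) and the function name `preference_profile` are assumptions made for illustration.

```python
# Residue groups taken from the text above (single-letter amino acid codes).
HELIX_FORMERS = set("MALEK")   # methionine, alanine, leucine, glutamate, lysine
SHEET_FORMERS = set("WYFIVT")  # aromatic and C-beta-branched residues
TURN_FORMERS = set("PG")       # proline and glycine


def preference_profile(sequence: str) -> str:
    """Label each residue: h (helix), e (strand), t (turn), - (no stated preference)."""
    labels = []
    for residue in sequence.upper():
        if residue in HELIX_FORMERS:
            labels.append("h")
        elif residue in SHEET_FORMERS:
            labels.append("e")
        elif residue in TURN_FORMERS:
            labels.append("t")
        else:
            labels.append("-")
    return "".join(labels)


if __name__ == "__main__":
    print(preference_profile("MALEKGPVIWS"))
```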
Nucleic acid
Nucleic acids also have secondary structure, most notably single-stranded ribonucleic acid (RNA) molecules. RNA secondary structure is generally divided into helices (contiguous base pairs) and various kinds of loops (unpaired nucleotides surrounded by helices). The stem-loop structure is a base-paired helix that ends in a short unpaired loop. It is extremely common and is the basic unit of larger structural motifs such as the cloverleaf structure (the four-helix junction found in transfer RNA). Internal loops (a short run of unpaired bases within a longer base-paired helix) and bulges (extra bases inserted in one strand of a helix, with no partners in the opposite strand) are also common. Finally, pseudoknots and base triples also occur in RNA.
Because almost all RNA secondary structure is mediated by base pairing, describing it amounts to determining which bases are paired in a molecule or complex. Canonical Watson-Crick base pairing is not the only pairing found in RNA, however. Hoogsteen pairing is also common.
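Helices and loops of this kind are often written in dot-bracket notation, where matching parentheses mark a base pair and a dot marks an unpaired nucleotide. The notation is an assumption here (it is not used in the text above), and it cannot represent pseudoknots with a single bracket type. The sketch below recovers the base pairs with a stack and reports the closing pairs of hairpin (stem-loop) loops. Both function names are hypothetical.

```python
def parse_dot_bracket(structure: str) -> list[tuple[int, int]]:
    """Return sorted (i, j) base pairs, 0-based, from a dot-bracket string."""
    stack = []
    pairs = []
    for index, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(index)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {index}")
            pairs.append((stack.pop(), index))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {index}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


def hairpin_closing_pairs(structure: str) -> list[tuple[int, int]]:
    """Return pairs (i, j) that enclose only unpaired nucleotides (a hairpin loop)."""
    return [
        (i, j)
        for i, j in parse_dot_bracket(structure)
        if j - i > 1 and all(symbol == "." for symbol in structure[i + 1:j])
    ]


if __name__ == "__main__":
    stem_loop = "((((....))))"
    print(parse_dot_bracket(stem_loop))
    print(hairpin_closing_pairs(stem_loop))
```

The list of pairs is the whole secondary structure in this representation: internal loops and bulges show up as gaps between consecutive nested pairs.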
Application
One application of bioinformatics uses predicted RNA secondary structures to search a genome for non-coding but functional forms of RNA. For example, microRNAs have long stem-loop structures interrupted by small internal loops. The possible secondary structures of an RNA can be computed by dynamic programming, although this approach cannot detect pseudoknots or other cases in which base pairs are not fully nested. A more general approach uses stochastic context-free grammars. Mfold is a web server that uses dynamic programming.
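To illustrate the dynamic-programming idea, here is a minimal sketch in the style of the Nussinov algorithm, which maximizes the number of nested base pairs. It is a deliberately simplified stand-in: it counts pairs rather than minimizing a thermodynamic free energy as tools such as Mfold do, and the allowed pairs (Watson-Crick plus G-U wobble) and the minimum loop size of 3 are assumptions. Because the recursion only combines nested sub-intervals, it cannot produce pseudoknots.

```python
# Pairs allowed in this sketch: Watson-Crick plus G-U wobble (an assumption).
ALLOWED_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_nested_pairs(sequence: str, min_loop: int = 3) -> int:
    """Maximum number of nested base pairs, with at least min_loop unpaired
    nucleotides between the two bases of any pair."""
    seq = sequence.upper()
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] = maximum number of pairs within seq[i..j] (inclusive).
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            # Case 1: nucleotide j is left unpaired.
            score = best[i][j - 1]
            # Case 2: nucleotide j pairs with some k, splitting the interval.
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in ALLOWED_PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


if __name__ == "__main__":
    print(max_nested_pairs("GGGAAAUCCC"))
```

The table has one entry per sub-interval and each entry scans the possible partners of its last nucleotide, so the run time grows with the cube of the sequence length.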
Alignment
Both protein and RNA secondary structures can be used to aid multiple sequence alignment. Alignments become more accurate when the relevant secondary-structure information is included. This is sometimes less useful for RNA, because base pairing is much more highly conserved than sequence. For proteins whose primary structures cannot be aligned, secondary structure can sometimes reveal the relationship between them.