Viruses are the simplest organisms. A complete virus particle consists of a protein coat and an internal genome of DNA or RNA (some viruses also have an envelope derived from the host cell, which contains glycoproteins encoded by viral genes). A virus cannot replicate independently; it must enter a host cell and use some of the cell's enzymes and organelles in order to replicate. The function of the coat protein (or envelope) is to recognize and invade specific host cells and to protect the viral genome from destruction by nucleases.
Structural characteristics and functions of the viral genome
Structural characteristics
1. Viral genomes vary greatly in size. Compared with the genomes of bacteria or eukaryotic cells, viral genomes are very small, but they also differ widely from one virus to another. For example, HBV DNA is only about 3 kb and carries correspondingly little information, encoding just 4 proteins, whereas the poxvirus genome is far larger and even encodes enzymes of nucleotide metabolism. Poxviruses therefore depend on the host much less than hepatitis B virus does.
2. The viral genome may consist of DNA or of RNA. Each virus particle contains only one kind of nucleic acid, either DNA or RNA; the two generally do not coexist in the same particle. The DNA or RNA that makes up the genome may be single-stranded or double-stranded, and may be a closed circular molecule or a linear molecule. For example, papillomavirus is a closed circular double-stranded DNA virus, while the adenovirus genome is linear double-stranded DNA; poliovirus is a single-stranded RNA virus, while the reovirus genome is a double-stranded RNA molecule. Generally speaking, most DNA viruses have double-stranded DNA genomes, and most RNA viruses have single-stranded RNA genomes.
4. Gene overlap: the same DNA segment can encode two or even three protein molecules. In other organisms this phenomenon is seen only in mitochondrial and plasmid DNA, so it can be regarded as a structural characteristic of viral genomes. This arrangement allows a small genome to carry more genetic information. Overlapping genes were discovered by Sanger in 1977 while he was studying ΦX174. ΦX174 is a single-stranded DNA virus whose host is Escherichia coli, so it is also a bacteriophage. After infecting E. coli it directs the synthesis of 11 protein molecules in all, with a total molecular weight of about 250,000, equivalent to the information content of 6078 nucleotides. The viral DNA itself, however, has only 5375 nucleotides, enough to encode proteins with a total molecular weight of at most about 200,000. Sanger was unable to resolve this contradiction for a long time, until he found that some of the 11 genes of ΦX174 overlap. Overlapping genes occur in the following arrangements:
(1) One gene lies entirely within another. For example, genes A and B are two different genes, yet B is contained within A. Similarly, gene E lies within gene D.
(2) Partial overlap. For example, gene K partially overlaps genes A and C.
(3) Two genes overlap by only one base. For example, the last base of the termination codon of gene D is the first base of the initiation codon of gene J (as in TAATG).
Although most of their DNA is the same, overlapping genes are read in different reading frames when their mRNA is translated, so the protein molecules produced are usually different. Some overlapping genes share the same reading frame and differ only in their start positions. In the SV40 DNA genome, for example, the genes encoding the three coat proteins VP1, VP2, and VP3 overlap by 122 bases but are read in different codon reading frames, whereas the small t antigen gene lies entirely within the large T antigen gene and the two share a common initiation codon.
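The one-base overlap in case (3) can be illustrated with a short Python sketch. The 11-base sequence is a made-up example built around the TAATG motif, not real ΦX174 sequence, and the helper name `codons` is hypothetical. The sketch splits the same DNA into codons in two reading frames and shows that the stop codon TAA in one frame and the start codon ATG in another share a single base.

```python
def codons(seq, frame):
    """Split seq into complete codons, starting at offset `frame` (0, 1 or 2)."""
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]

# Hypothetical sequence containing the TAATG motif (not real phage sequence).
seq = "GCTTAATGTCT"

frame0 = codons(seq, 0)   # upstream gene's frame: contains the stop codon TAA
frame2 = codons(seq, 2)   # downstream gene's frame: contains the start codon ATG

stop_pos = seq.find("TAA")    # index of the first base of the stop codon
start_pos = seq.find("ATG")   # index of the first base of the start codon

# Positions covered by each codon; their intersection is the shared base.
shared = set(range(stop_pos, stop_pos + 3)) & set(range(start_pos, start_pos + 3))

print(frame0)
print(frame2)
print(sorted(shared), [seq[i] for i in sorted(shared)])
```

Because the two codons start at positions that are not a multiple of three apart, the two genes are necessarily read in different frames, which is why the same stretch of DNA yields different proteins.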
5. Most of the viral genome is used to encode proteins, and only a very small portion is not translated, in contrast to the redundancy seen in eukaryotic DNA. For example, the untranslated portion of ΦX174 accounts for only 217/5375 of the genome and that of G4 DNA for 282/5577, in both cases about 5% or less. Untranslated DNA sequences are usually control sequences for gene expression. For example, the 67 bases (3906-3973) between the H gene and the A gene of ΦX174 include gene control regions such as an RNA polymerase binding site, a transcription termination signal, and a ribosome binding site. Papillomavirus is a virus that infects humans and animals; its genome is about 8.0 kb, of which about 1.0 kb is untranslated, and this region likewise regulates the expression of the other genes.
6. In viral genomic DNA, the genes for functionally related proteins, or rRNA genes, are often grouped into a transcription unit. They can be transcribed together into a single molecule containing several mRNAs, called polycistronic mRNA, which is then processed into template mRNAs for the individual proteins. For example, the adenovirus late genes encode 12 viral coat proteins; during late gene transcription a single promoter drives production of a polycistronic mRNA, which is then processed into the various mRNAs encoding the viral coat proteins, all of which are functionally related. Likewise, the D-E-J-F-G-H genes of the ΦX174 genome are transcribed into one mRNA and then translated into the individual proteins. J, F, G, and H all encode coat proteins, the D protein is involved in virus assembly, and the E protein is responsible for bacterial lysis, so these genes too are functionally related.
7. Except for retroviruses, all viral genomes are haploid: each gene appears only once in a virus particle. The retroviral genome is present in two copies.
8. The genes of phages (bacterial viruses) are continuous, whereas the genes of viruses that infect eukaryotic cells are discontinuous and contain introns. Except in positive-strand RNA viruses, the genes of eukaryotic viruses are first transcribed into mRNA precursors, which are then processed to remove the introns and become mature mRNA. More interestingly, in some eukaryotic viruses an intron, or part of one, is an intron for one gene but an exon for another. The early genes of SV40 and polyomavirus are examples. The early genes of SV40, namely the large T and small t antigen genes, start at position 5146 and run counterclockwise. The large T antigen gene terminates at 2676 and the small t antigen gene at 4624. However, the 346 bp segment from 4900 to 4555 is an intron of the large T antigen gene, and within this intron the DNA sequence from 4900 to 4624 is coding sequence for the small t antigen. Similarly, in polyomavirus the intron of the large T antigen gene is coding sequence for the middle T and small t antigens.
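The SV40 coordinates quoted above can be checked with a few lines of Python. This is only a sketch: it assumes the positions are inclusive and are read in descending order along the counterclockwise early strand, and the function name `span_len` is made up for illustration.

```python
def span_len(start, end):
    """Length in bases of a segment read from `start` down to `end`, inclusive."""
    return start - end + 1

# Coordinates quoted in the text (early region, read counterclockwise).
INTRON_START, INTRON_END = 4900, 4555   # intron of the large T antigen gene
SMALL_T_END = 4624                      # small t antigen coding sequence ends here

intron_len = span_len(INTRON_START, INTRON_END)
# Part of the large T intron that is coding sequence (exon) for small t.
small_t_in_intron = span_len(INTRON_START, SMALL_T_END)
# Remainder of the intron, which the text does not assign to small t coding.
intron_only = intron_len - small_t_in_intron

print(intron_len, small_t_in_intron, intron_only)
```

Under the inclusive-coordinate assumption the intron length comes out at the 346 bp stated in the text, and most of that intron doubles as small t coding sequence.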
Genome structure and function of bovine papillomavirus
Papillomavirus is a DNA virus that infects the skin and mucosa of humans and animals and causes papillomatous lesions; it belongs to the papovavirus family. According to the host infected, papillomaviruses are divided into bovine papillomavirus (BPV), human papillomavirus (HPV), and so on. All papillomavirus genomes found so far have a similar structure, so BPV is taken here as an example to illustrate the genome structure and function of papillomavirus. BPV DNA has a total length of 7945 bp and is a closed circular supercoiled molecule; in the host cell it can associate with histones to form nucleosomes. The first base, G, of the single Hpa I restriction site in BPV DNA is designated position 1, and bases are numbered in the 5'→3' direction. DNA sequence analysis indicates that all open reading frames (ORFs) lie on one DNA strand and that the genes overlap one another. The whole BPV genome is divided into a coding region and a non-coding region (NCR); the coding region is further divided into an early transcriptional functional region (E region) and a late transcriptional functional region (L region) according to the functions of the proteins encoded.
1. Non-coding region (NCR): The non-coding region, also known as the upstream regulatory region (URR) or long control region (LCR), is located between the termination codon of the late gene L1 and the first initiation codon of the early gene E6. Its length differs among papillomaviruses and is about 1.0 kb in BPV. The NCR contains a transcriptional promoter sequence that can start the transcription and expression of the early genes. It also contains an enhancer sequence, which can be activated by the gene product E2 protein to further promote expression of the early genes. The enhancer sequences in the BPV NCR have been identified as TTGGCGGNNG and the palindromic structure ATCGGTGCACCGAT. These structural features show that the main function of the NCR is to regulate the expression of BPV genes.
2. Early transcriptional functional region (early gene region, E region): The E region of BPV contains eight open reading frames (ORFs), namely E6, E7, E8, E1, E2, E3, E4, and E5. The E6, E7, and E1 genes partially overlap; E8 lies entirely within E1; E3 and E4 are both contained in E2; and E5 and E2 partially overlap. The protein encoded by the E2 ORF can act on the enhancer and increase or decrease the expression level of the early genes. In addition, the E2 ORF and E1 ORF can keep the viral DNA in a free (episomal) state, not integrated into the chromosomes of the host cell. The proteins encoded by the E6 and E7 ORFs may be oncogenic: E6 and E7 proteins can transform host cells into malignant tumour cells. The mechanism by which E6 and E7 proteins transform cells is not yet clear, but there are two explanations. [1] Cys-x-x-Cys repeat sequences are found in the amino acid sequences of E6 and E7 proteins. This structure is believed to be specific to intracellular nucleic acid binding proteins, so E6 and E7 are thought to be DNA binding proteins that can regulate genes and thereby affect the proliferation and differentiation of host cells, driving the process out of control and forming tumours. [2] Two proteins with molecular weights of 53 kD and 106 kD, called p53 and p106 respectively, have recently been found in normal cells. Loss or inactivation of these two proteins often causes cell malignancy. Studies have found that the E7 and E6 proteins of papillomavirus can bind to p53 and p106 and inactivate them, which may also be a mechanism by which E6 and E7 proteins lead to cell malignancy.
3. Late transcriptional functional region (late gene region, L region): The L region contains two ORFs, L1 and L2, which encode the capsid proteins of papillomavirus. L1 protein is the major capsid protein and L2 protein is the minor capsid protein.
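The palindromic enhancer sequence quoted for the BPV NCR can be verified with a short Python sketch: a DNA palindrome reads the same as its own reverse complement. The function names below are illustrative only, and the check covers just the two sequences given in the text, with N treated as an unspecified base.

```python
# Base-pairing table; "N" (any base) is assumed to pair with "N".
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "N": "N"}

def reverse_complement(seq):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def is_palindrome(seq):
    """A DNA palindrome is identical to its own reverse complement."""
    return seq == reverse_complement(seq)

print(is_palindrome("ATCGGTGCACCGAT"))   # the sequence described as palindromic
print(is_palindrome("TTGGCGGNNG"))       # the other quoted enhancer sequence
```

The first sequence passes the check, consistent with the text's description of it as a palindromic structure; the second does not, and the text does not claim that it is one.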
Genome structure and function of RNA bacteriophages
The best-studied E. coli RNA phages are MS2, R17, f2, and Qβ. Their genomes are small, only 3600-4200 nucleotides, and contain four genes. MS2, R17, and f2 have almost the same genome structure. Two of the four genes encode structural proteins of the phage. One is the A protein gene, 1178 nucleotides long. The A protein (called the maturation protein) enables the phage to recognize the host and allows its RNA genome to enter the host bacterium; each phage generally carries only one molecule of A protein. The other structural protein gene is 399 nucleotides long and encodes the coat protein that forms the virus particle; each phage has 180 molecules of it. The rest of the genome encodes an RNA replicase and a lysis protein. The gene encoding the lysis protein partially overlaps the coat protein and replicase genes, but its reading frame differs from that of the coat protein. The MS2, R17, and f2 genomes contain many secondary structures; this base pairing within the RNA molecule may help protect it from degradation by RNase. In addition, there is an untranslated sequence at the 5' and 3' ends of the coding genes, which also helps to stabilize the RNA molecule.
The genome of another RNA bacteriophage, Qβ, is slightly larger and differs from those of the RNA phages above in two ways: [1] it has no separate lysis protein gene; instead, the structural protein A2 (the maturation protein) carries out the lysis function; [2] it also encodes another coat protein, A1.
Structure and function of hepatitis B virus genome
The DNA structure of the hepatitis B virus (HBV) genome is unusual. It is a circular, partially double-helical molecule about 3.2 kb long. Two thirds of it is double helix and one third is single-stranded, which means that the two DNA strands are unequal in length. The 5' and 3' ends of the long strand are not covalently joined to each other; the 5' end is covalently linked to a protein. The 250-300 bases at the 5' end of the long strand are held in complementary base pairing. The long strand is the negative strand and the short strand is the positive strand. The length of the short strand varies from virus to virus, generally about 1.6-2.8 kb, roughly 2/3 of the long strand. The gap in the short strand can be filled in by DNA polymerase. Hepatitis B virus is the smallest known double-stranded DNA virus that infects humans. In order to replicate in cells, the virus must pack in as much genetic information as it can. The HBV genome is therefore particularly compact and makes full use of its genetic material.
There are many overlapping gene sequences. The HBV genome has four confirmed open reading frames, which encode the viral core (C) protein, the envelope (S) protein, the viral replicase (polymerase), and a protein X related to viral gene expression. The two small ORFs in front of the S gene are in the same reading frame as the S gene ORF and can be read through together with it, encoding two S-protein-related antigens. These two antigens are also present on the surface of virus particles and are called pre-S1 and pre-S2. Similarly, there is a short ORF in front of the C ORF, called pre-C, whose read-through encodes a larger C-protein-related antigen. All of these ORFs are on the negative-strand (long-strand) DNA. The S gene overlaps completely with the polymerase gene, the X gene overlaps the polymerase gene and the C gene, and the C gene and polymerase gene also overlap. Recently, Miller et al. found two further ORFs in the HBV genome, ORF-5 and ORF-6, which also overlap other genes. ORF-6 is encoded not by the negative-strand DNA but by the positive-strand DNA. The functions of these two ORFs are not yet clear.
Regulatory sequences are located inside genes, which is another way in which HBV saves genetic material. The sequences related to HBV genome replication are the short direct repeat sequences (DR1 and DR2) and the U5-like sequence (so named because it resembles the U5 sequence at the ends of retroviral genomes). DR1 and U5 are located in the pre-C ORF, which is the start site for synthesis of the long DNA strand. DR2 is located where the polymerase gene and the X gene overlap, which is the start site for synthesis of the short DNA strand.
There are four kinds of signal sequences related to HBV gene expression: [1] promoters, [2] an enhancer, [3] a polyA addition signal, and [4] a glucocorticoid responsive element (GRE). Because the genes in the HBV genome are transcribed into three kinds of HBV mRNA transcripts, the viral genome should contain at least three RNA polymerase II promoters, one near the 5' end of each transcript. Although the sequences of these promoters are unknown, they evidently lie within protein-coding sequences. The enhancer (ENH) is located in the polymerase gene, the polyA addition signal is located in the C ORF, and the GRE is located in the S ORF and the polymerase gene. The GRE is a DNA fragment that binds a hormone receptor and can raise the transcription level of a given gene.
The GRE has many characteristics of enhancers: [1] it acts in cis, [2] it works in either transcriptional orientation, and [3] it works at different distances from the genes it regulates.
As the above shows, the HBV genome is tightly structured and efficiently organized to a degree that is rare among known viruses. HBV DNA is unusual not only in its structure but also in its DNA replication process. After HBV DNA enters the host cell, it is first converted into complete closed circular double-helical DNA, and the negative strand then serves as the template for synthesis of full-length "+" strand RNA (called pregenomic RNA). The "+" strand RNA is packaged into immature core-like particles, together with DNA polymerase and a protein. Inside the particle, the "+" strand RNA serves as the template for synthesis of "-" strand DNA, catalysed by reverse transcriptase. The specific mechanism is unclear; it may resemble adenovirus DNA replication, because a protein is covalently bound to the 5' end of the "-" strand DNA. Synthesis of the "+" strand DNA uses the negative-strand DNA as the template and a segment of RNA as the primer. During polymerization and extension, the core-like particles mature into virus particles. At this point the positive-strand DNA has not yet been fully synthesized, which is why the two DNA strands of the viral genome differ in length.