HMMER is bioinformatics software for sequence analysis that can quickly assess the similarity between two sets of sequences. The latest version is 3.1. However, the official HMMER website currently offers only a Linux version, and the Windows version has been discontinued. Perhaps sequence analysis software simply runs more smoothly on Linux? After a glance at the official documentation, though, I found this: "We have been developing HMMER4 since 2011, but it has been in a slow state of development", which may be the real reason HMMER was dropped on the Windows side.

Install HMMER

1. Download

Since I did not want to set up a virtual machine just to run the Linux version of HMMER (although that is not impossible), I looked for an older release instead. Following the known download link for the Linux version, I found that HMMER keeps all of its historical releases in one place. The last Windows version of HMMER is 3.0. Portal / backup

After downloading and extracting the archive, you will find a lot of unfamiliar files. None of the .exe files can be launched by double-clicking; the programs have to be started by typing commands. I looked up an installation method [1]; a detailed walkthrough follows.

2. Installation

Windows HMMER is relatively simple to install.

Windows 10 users: open Control Panel, search for "environment variables", click "Edit the system environment variables", then click "Environment Variables".

Windows 11 users: open the Start menu and type "Environment Variables" directly into the search bar at the top.

Click the first match.

Under "System variables", find Path and click Edit.

Click the New button on the right and enter the path to HMMER.

If you don't know what the path is, go to the folder where you put HMMER, copy the contents of the address bar, paste them into the new entry above, and click OK in every dialog to confirm.
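As an optional cross-check of the PATH edit, the following minimal Python sketch looks up the HMMER programs by name, the same way the command line does. It assumes Python is installed and that it is run from a terminal opened after the environment variable was saved; the list of program names is simply the ones mentioned in this article.

```python
import shutil

# Program names from this article; adjust the list as needed.
HMMER_TOOLS = ["hmmbuild", "hmmsearch", "hmmscan"]


def find_tools(names):
    """Map each program name to its full path on PATH, or None if not found."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools(HMMER_TOOLS).items():
        print(f"{name}: {path or 'NOT FOUND on PATH'}")
```

If any program is reported as not found, the Path entry most likely points at the wrong folder, or the terminal was opened before the change was saved.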

3. Test

Next, test whether it works. You need a program that accepts typed commands; the common choices are CMD and Windows PowerShell. Pick either one.

Open CMD: press the Windows key + R, type cmd, and click OK to open the Command Prompt.

Open Windows PowerShell: right-click the Start menu and select Windows PowerShell.

Then enter

 hmmscan -h

Copying and pasting the command is recommended to avoid typos. If output similar to the following appears, HMMER is installed.

 Illustration

Use HMMER

Following an online tutorial [2], I will demonstrate the two programs I currently use: hmmbuild and hmmsearch.
hmmbuild: builds an HMM profile (roughly speaking)
hmmsearch: analyzes similarity (this should be about right)

1. Get the Pfam ID

Sequence alignment requires a hidden Markov model (HMM) for the gene of interest. To obtain the HMM, we first need the Pfam ID of the gene's conserved protein domain. You can get the Pfam ID from the references of published work or by searching NCBI yourself. The latter is described below.

Go to the NCBI Protein database and enter your keywords. Species keywords must be the Latin name or the official English name. Taking the MADS-box gene of Jatropha curcas as an example, search for "MADS box Jatropha curcas".

Select any entry in the search results to view its details, then click "Identify Conserved Domains" on the right to search for and analyze the protein's conserved domains.

The results show that the target protein hits the K-box family, and the Pfam ID is given in the list.

2. Download the conserved protein domain alignment

Click the Pfam ID directly in the results above to jump to the details of the protein family's conserved domain, then click "Source: pfam".

On the Pfam details page, click Alignments, select the Stockholm format, and click Generate to download the multiple sequence alignment. The downloaded file has a .txt extension.
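To confirm that the downloaded file really is a Stockholm alignment, it can be read with a few lines of Python. The sketch below is a simplified reader, not a full parser: it skips blank lines and "#" markup lines, stops at the "//" terminator, and joins the pieces of each sequence. The function name and the sample data are made up for illustration.

```python
def read_stockholm(lines):
    """Collect aligned sequences from Stockholm-format lines.

    Returns a dict mapping sequence name -> aligned sequence string.
    """
    sequences = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and '#' header/markup lines
        if line == "//":
            break  # end of the alignment
        parts = line.split()
        if len(parts) < 2:
            continue  # not a "name sequence" line
        name, chunk = parts[0], "".join(parts[1:])
        # Interleaved files repeat each name once per block; join the pieces.
        sequences[name] = sequences.get(name, "") + chunk
    return sequences


# Made-up miniature alignment, for illustration only.
SAMPLE = """# STOCKHOLM 1.0
#=GF ID   EXAMPLE
seq1  ACDE-FGH
seq2  ACDEYFGH
//
"""

if __name__ == "__main__":
    alignment = read_stockholm(SAMPLE.splitlines())
    print(len(alignment), "sequences read")
```

For the real file, pass an open file handle instead of the sample, for example `read_stockholm(open("PF01486_seed.txt"))`.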

3. Download the species sequences for comparison

Download the genome protein data of Jatropha curcas. Go to the NCBI FTP site, find the genomes directory, and use Ctrl + F to search the page for the species' Latin name. The Latin scientific name here is Jatropha curcas, so we can try searching for Jatropha.

In the next directory, open protein and download protein.fa.gz. Extract it to get the protein.fa file. A backup copy is also available here.
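If no archive tool is at hand, the .gz file can also be unpacked with Python's standard library. The sketch below decompresses the file and counts the FASTA records as a quick sanity check; the helper names are my own, and the file names are the ones used in this article.

```python
import gzip
import shutil


def gunzip(src, dst):
    """Decompress the .gz file at src into a plain file at dst."""
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)


def count_fasta_records(path):
    """Count sequences in a FASTA file by counting '>' header lines."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return sum(1 for line in handle if line.startswith(">"))


# Usage, assuming protein.fa.gz is in the current directory:
# gunzip("protein.fa.gz", "protein.fa")
# print(count_fasta_records("protein.fa"), "protein sequences")
```

A count of zero would suggest the download is incomplete or is not a FASTA file.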

4. Comparative analysis

Copy the two downloaded files into the HMMER folder (they can also be placed in another folder), then open the Command Prompt.

The Command Prompt starts in the C:\Users\username> directory by default, so we need to switch to the directory where HMMER is located before continuing.

If HMMER is not on drive C, you need to switch drives first. For drive D, for example, enter

 D:

Press Enter to switch to drive D; other drives work the same way. Then change to the directory where HMMER is located

 cd <HMMER installation path>

for example

 cd D:\hmmer

If an error is reported, it may be because the path contains spaces, Chinese, or other non-English characters. Enclose the path in straight (English) quotation marks. Double quotes work in both CMD and PowerShell; single quotes work only in PowerShell. For example

 cd "D:\folder with Chinese characters\hmmer"

Next, use the hmmbuild command to convert the downloaded conserved domain alignment into an HMM profile. The file I downloaded is PF01486_seed.txt, so enter

 hmmbuild <output .hmm file> <alignment file to convert>

for example

 hmmbuild PF01486.hmm PF01486_seed.txt

Then search the Jatropha curcas protein sequences with the resulting HMM using the hmmsearch command. Enter

 hmmsearch PF01486.hmm protein.fa > PF01486.out

 Whole process reference
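The two commands above can also be chained from a script, which is handy when the search has to be repeated for several families. This is only a sketch under the assumption that the HMMER programs are on PATH and the input files sit in the current directory; the function names are hypothetical, and the arguments mirror the commands shown above.

```python
import subprocess


def build_commands(seed, hmm, proteins):
    """Return the hmmbuild and hmmsearch argument lists for the given files."""
    return [
        ["hmmbuild", hmm, seed],        # hmmbuild <output hmm> <alignment>
        ["hmmsearch", hmm, proteins],   # hmmsearch <hmm> <sequence file>
    ]


def run_pipeline(seed, hmm, proteins, out_path):
    """Run hmmbuild, then hmmsearch, saving hmmsearch's report to out_path."""
    build_cmd, search_cmd = build_commands(seed, hmm, proteins)
    subprocess.run(build_cmd, check=True)
    # Equivalent to "> PF01486.out" on the command line.
    with open(out_path, "w") as out:
        subprocess.run(search_cmd, stdout=out, check=True)


# Usage with the file names from this article:
# run_pipeline("PF01486_seed.txt", "PF01486.hmm", "protein.fa", "PF01486.out")
```

With `check=True`, a failing command raises an exception instead of silently producing an empty output file.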

When the run finishes, the file PF01486.out is generated. Right-click it and open it in Notepad

to see the comparison results.
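The report in PF01486.out is meant for reading by eye. For further processing, hmmsearch can additionally write a whitespace-delimited table of per-sequence hits through its --tblout option, for example `hmmsearch --tblout PF01486.tbl PF01486.hmm protein.fa > PF01486.out`. The sketch below reads such a table; the file name PF01486.tbl, the function name, and the sample line are assumptions made for illustration, not real results.

```python
def parse_tblout(lines):
    """Extract target name, full-sequence E-value and score from --tblout lines."""
    hits = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and '#' comment/header lines
        # 18 fixed columns, then a free-text description that may contain spaces.
        fields = line.split(None, 18)
        if len(fields) < 6:
            continue  # malformed line
        hits.append({
            "target": fields[0],          # target sequence name
            "evalue": float(fields[4]),   # full-sequence E-value
            "score": float(fields[5]),    # full-sequence bit score
        })
    return hits


# Made-up example line, for illustration only.
SAMPLE = [
    "# target name  accession  query name  accession  E-value  score  bias ...",
    "prot1 - K-box PF01486.17 1.2e-30 100.5 0.1 1.5e-30 100.2 0.1 "
    "1.0 1 0 0 1 1 1 1 hypothetical protein",
]

if __name__ == "__main__":
    for hit in parse_tblout(SAMPLE):
        print(hit["target"], hit["evalue"], hit["score"])
```

For a real run, pass the open table file instead of the sample, for example `parse_tblout(open("PF01486.tbl"))`.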
