HMMERINDEX*

Table of Contents
FUNCTION
DESCRIPTION
EXAMPLE
OUTPUT
INPUT FILES
RELATED PROGRAMS
RESTRICTIONS
CONSIDERATIONS
COMMAND-LINE SUMMARY
ACKNOWLEDGEMENT
LOCAL DATA FILES
PARAMETER REFERENCE

FUNCTION

HmmerIndex creates an index for a profile hidden Markov model database so that profile HMMs can be retrieved from the database with HmmerFetch.

DESCRIPTION

HmmerIndex provides a GCG interface to the hmmindex program of Dr. Sean Eddy's HMMER package. It allows you to access most of hmmindex's parameters from the GCG command line.

In order to retrieve profile HMMs from a database of profile HMMs, the database must first be indexed with HmmerIndex. The index is a binary file referred to as a GSI ("generic sequence index"). The Pfam and PfamFrag databases distributed with the Wisconsin Package are already indexed, but if you create your own profile HMM database, you must index it yourself. Once the index is created, you can use HmmerFetch to retrieve individual profile HMMs from the database, much as you can use Fetch to retrieve sequences from sequence databases.

EXAMPLE

Here is a session using HmmerIndex to index a small profile HMM database called HSP.hmmdb. This database was created by using HmmerFetch to retrieve four profile HMMs (HSP20.hmm, HSP33.hmm, HSP70.hmm, HSP90.hmm) from the Pfam database, then using HmmerConvert with the -MENu=E parameter to concatenate the individual profile HMMs into a single file.


% hmmerindex

HMMERINDEX which HMM database ? HSP.hmmdb

Creating temp file for input to hmmindex .

Calling hmmindex to perform analysis ...

hmmindex -- create GSI index for an HMM database
HMMER 2.1.1 (Dec 1998)
Copyright (C) 1992-1998 Washington University School of Medicine
HMMER is freely distributed under the GNU General Public License (GPL).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
HMM file: /usr/users/share/smith/HSP.hmmdb
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Determining offsets for /usr/users/share/smith/HSP.hmmdb, please be patient...
Sorting keys...
Complete.
GSI /usr/users/share/smith/HSP.hmmdb.gsi indexes 8 keys (names+accessions) for 4 HMMs in /usr/users/share/smith/HSP.hmmdb.

%

OUTPUT

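The output is a binary GSI index file named after the input database with .gsi appended. For the example session, the output file is HSP.hmmdb.gsi, written to the same directory as HSP.hmmdb.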
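
The last line of the example session reports 8 keys for 4 HMMs, because the index holds both names and accessions. As a rough sanity check before indexing, you could estimate those counts yourself. The sketch below is not part of the Wisconsin Package. It assumes an ASCII-format database in which each profile HMM has one line whose first field is NAME and at most one line whose first field is ACC; the function name count_hmm_keys is hypothetical.

```shell
#!/bin/sh
# Sketch only: estimate the HMM and key counts an index should report.
# Assumption: ASCII profile HMM database, one "NAME" line per model and
# at most one "ACC" line per model. count_hmm_keys is a hypothetical name.
count_hmm_keys() {
    db=$1
    # Count lines whose first whitespace-separated field is NAME or ACC.
    awk '$1 == "NAME" { n++ }
         $1 == "ACC"  { a++ }
         END { printf "hmms=%d keys=%d\n", n, n + a }' "$db"
}
```

Calling count_hmm_keys HSP.hmmdb prints the two counts on one line. If the assumptions hold, they should agree with the numbers on the final line of the hmmindex output.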

INPUT FILES

HmmerIndex's only input is a profile HMM database file. You can create your own profile HMM database by concatenating profile HMM files. This can be done in several ways. Within the HMMER package, you can append a new profile HMM to an existing one by using the -APPend parameter with HmmerBuild or the append menu option (-MENu=E) in HmmerConvert.

You can also use UNIX commands to append a profile HMM to an existing one or to concatenate several profiles into a single new file:


     % cat newprofile.hmm_g >> existing.hmm_g
     % cat profile1.hmm_g profile2.hmm_g > two_profiles.hmmdb
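
When many profile HMM files need to be combined, the second command above generalizes to a whole directory. The following is a minimal sketch, not a Wisconsin Package utility. The function name build_hmmdb and the directory layout are assumptions, and the sketch presumes that every file matching *.hmm_g in the directory is a profile HMM of the same type.

```shell
#!/bin/sh
# Sketch only: concatenate every *.hmm_g file in a directory into one database.
# build_hmmdb is a hypothetical helper; the .hmm_g suffix follows the
# cat examples in this section.
build_hmmdb() {
    src_dir=$1
    out_file=$2
    # Load the matching files into the function's positional parameters.
    set -- "$src_dir"/*.hmm_g
    # If the glob matched nothing, $1 is the unexpanded pattern and does not exist.
    [ -e "$1" ] || { echo "no .hmm_g files in $src_dir" >&2; return 1; }
    # '>' creates or overwrites the output file; the glob expands in sorted order.
    cat "$@" > "$out_file"
}
```

For example, build_hmmdb ./profiles HSP.hmmdb would write all profiles found in ./profiles to HSP.hmmdb, which must then be indexed with HmmerIndex. Give the output file a suffix other than .hmm_g if it lives in the same directory, so that a later run does not pick it up as an input.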

RELATED PROGRAMS

PileUp creates a multiple sequence alignment from a group of related sequences. LineUp is a multiple sequence editor used to create multiple sequence alignments. Pretty displays multiple sequence alignments.

ProfileMake makes a profile from a multiple sequence alignment. ProfileSearch uses the profile to search a database for sequences with similarity to the group of aligned sequences. ProfileSegments displays optimal alignments between each sequence in the ProfileSearch output list and the group of aligned sequences (represented by the profile consensus). ProfileGap makes optimal alignments between one or more sequences and a group of aligned sequences represented as a profile. ProfileScan finds structural and sequence motifs in protein sequences, using predetermined parameters to determine significance.

HmmerBuild makes a profile hidden Markov model from a multiple sequence alignment. HmmerAlign aligns one or more sequences to a profile HMM. HmmerPfam searches a database of profile HMMs with a sequence query in order to identify known domains within the sequence. HmmerSearch uses a profile HMM as a query to search a sequence database for sequences similar to the original aligned sequences. HmmerCalibrate calibrates a hidden Markov model so that database searches using it as a query will be more sensitive. HmmerIndex creates a binary GSI ("generic sequence index") for a database of profile HMMs. HmmerFetch retrieves a profile hidden Markov model by name from an indexed database of profile HMMs. HmmerEmit randomly generates sequences that match a profile HMM. HmmerConvert converts between different profile HMM file formats and from profile HMM to GCG profile file format.

MEME finds conserved motifs in a group of unaligned sequences and saves these motifs as a set of profiles. You can search a database of sequences with these profiles using the MotifSearch program.

RESTRICTIONS

The profile HMM database to be indexed must consist of a single file. All of the profile HMMs in the file must be of the same type (ASCII text or binary).

CONSIDERATIONS

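If you add a new profile HMM to a database, you must re-index that database with HmmerIndex. The existing GSI file was built from the earlier contents of the database and does not cover the new profile HMM.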
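
One way to notice a stale index is to compare file modification times. The sketch below is an illustration only, not a feature of HmmerIndex. It assumes the index sits next to the database as <database>.gsi, as in the example session, and that your shell's test command supports the -nt (newer than) operator; the function name needs_reindex is hypothetical.

```shell
#!/bin/sh
# Sketch only: report whether a profile HMM database needs (re-)indexing.
# Assumptions: the index is stored as <database>.gsi beside the database,
# and the shell's test builtin supports -nt. needs_reindex is hypothetical.
needs_reindex() {
    db=$1
    gsi=$db.gsi
    # No index file yet: indexing is required.
    [ -e "$gsi" ] || return 0
    # Succeeds only if the database was modified more recently than its index.
    [ "$db" -nt "$gsi" ]
}
```

The function returns success when indexing is needed, so it can guard the command shown under Minimal Syntax, for instance: needs_reindex HSP.hmmdb && hmmerindex -INfile1=HSP.hmmdb -Default. Because the check relies on timestamps alone, copying or restoring files in a way that changes their modification times can give a misleading answer.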

COMMAND-LINE SUMMARY

All parameters for this program may be added to the command line. Use -CHEck to view the summary below and to specify parameters before the program executes. In the summary below, the capitalized letters in the parameter names are the letters that you must type in order to use the parameter. Square brackets ([ and ]) enclose parameter values that are optional. For more information, see "Using Program Parameters" in Chapter 3, Using Programs in the User's Guide.


Minimal Syntax: % hmmerindex [-INfile1=]HSP.hmmdb -Default

Local Data Files: None

Optional Parameters:

-NOMONitor suppresses the screen monitor

ACKNOWLEDGEMENT

The programs comprising the HMMER package are designed and implemented by Dr. Sean Eddy of the Washington University School of Medicine, St. Louis, Missouri. The GCG front-end programs were written by Christiane van Schlun in collaboration with Dr. Eddy.

Pfam - A database of protein domain family alignments and HMMs Copyright (C) 1996-2000 The Pfam Consortium.

LOCAL DATA FILES

None.

PARAMETER REFERENCE

You can set the parameters listed below from the command line. For more information, see "Using Program Parameters" in Chapter 3, Using Programs in the User's Guide.

-NOMONitor

suppresses the display of the program's progress on the screen.

Printed: February 5, 2001 11:37 (1162)

Documentation Comments: doc-comments@gcg.com
Technical Support: help@gcg.com

Copyright (c) 1982-2001 Genetics Computer Group Inc. A subsidiary of Pharmacopeia, Inc. All rights reserved.

Licenses and Trademarks: Wisconsin Package is a trademark of Genetics Computer Group, Inc. GCG and the GCG logo are registered trademarks of Genetics Computer Group, Inc.

All other product names mentioned in this documentation may be trademarks, and if so, are trademarks or registered trademarks of their respective holders and are used in this documentation for identification purposes only.

Genetics Computer Group

www.gcg.com