Sequences can be found by performing text searches on the contents of the library. As the searches use precomputed indexes they are almost instantaneous. In order to perform a text search the user must select: the library, the field on which to search, the text to search for and the mode of searching.
These options are described in greater detail below.
The library to search is selected using the menu button labelled "EMBL" in the above picture, although the name shown will depend on the libraries available at each site. The default library name can be specified using the .seqlibrc file. See section The .seqlibrc file. Personal files can also be searched by choosing the "personal file" option. Personal files can only be searched on entrynames (all the other field selections are disabled) and this search is case sensitive, in contrast to the other library searches, which are case insensitive. Personal file searches display only the sequence in the output window of the parent window. Extraction of other information is not possible.
For some libraries it is possible to limit the search to specific fields. The field is selected using the menu button labelled "All Text" in the above picture. The fields available depend on the library and on whether srs5.1 or embl indices are being used. Note that entries included in the All Text field are taken from the entry description, keywords, comments, reference titles and feature table. With embl indices, all the fields are offered irrespective of the sequence library chosen, but only certain fields may be applicable. Selecting a field which is not available for a particular library will ring a warning bell and print an error message in the Error window of the parent program.
The text to be searched for within the sequence library is entered in the search word entry box. The search is case insensitive except when a personal file is searched, in which case it is case sensitive.
There are three or four modes of searching, depending on whether srs5.1 or embl indices are used. Both srs5.1 and embl indices can be searched using "word", "word*" or "*word*". Srs5.1 indices can also be searched using "srs_query". This takes the form of a standard srs query string and allows more complex searches to be performed than our simple interface will allow. The first mode (word) will find exact matches. The second mode (word*) will find matches which begin with the search word but may have variable endings. For example, to search for "trypsin" but not "trypsinogen", enter "trypsin" into the text box and select the mode "word". To find all instances of the author "Smith", type in "Smith " and select the mode "word*". Because they use the precomputed indexes, these two modes are very fast.
The third mode (*word*) will find occurrences of the search word within a piece of text. Although this mode also uses the precomputed indexes, it is slower than the other two searches. If embl indices are used, a "Busy" window indicator is displayed which shows how much searching has been performed. To cancel the search, press the "Cancel" button on the "Busy" window.
For example, searching for "trypsin" with the mode *word* would find both antichymotrypsinogen and trypsinogen, whereas word* would find trypsinogen but not antichymotrypsinogen.
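The difference between the three matching modes can be sketched in a few lines of Python. This is only an illustration of the matching rules described above, not the program's own code: the function names matches and search are hypothetical, the word list is made up for the example, and a simple scan over a list stands in for the precomputed indexes.

```python
def matches(mode, query, word):
    """Return True if 'word' is a hit for 'query' under the given mode.

    Comparison is case insensitive, as for the library searches
    (personal file searches, which are case sensitive, are not modelled).
    """
    q, w = query.lower(), word.lower()
    if mode == "word":      # exact match only
        return w == q
    if mode == "word*":     # word begins with the search text
        return w.startswith(q)
    if mode == "*word*":    # search text occurs anywhere in the word
        return q in w
    raise ValueError("unknown mode: " + mode)


def search(mode, query, words):
    """Return the entries of 'words' that match, in their original order."""
    return [w for w in words if matches(mode, query, w)]


# Hypothetical index contents, used only for this illustration.
words = ["trypsin", "trypsinogen", "antichymotrypsinogen"]

for mode in ("word", "word*", "*word*"):
    print(mode, search(mode, "trypsin", words))
```

With this word list, "word" returns only trypsin, "word*" adds trypsinogen, and "*word*" returns all three entries.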
To begin the search, either press "Enter" in the search word entry box or press the button labelled "Search".