Searching

The contig editor's searching ability and its links to the consensus calculation algorithm are crucial in determining the efficiency with which contigs can be checked and corrected. The consensus is calculated "on the fly" and changes in response to edits. For editing, the most important search functions are those which reveal problems in the consensus calculation whilst ignoring all bases that are adequately well determined.

Selecting "Search" brings up a window which can remain present during normal editor operation. The window allows the user to select the direction of search, the type of search, and a value to search on. The value is entered into a value text box, then pressing the "search" button performs the search. If successful, the cursor is positioned accordingly. An audible tone indicates failure. Pressing the "Quit" button removes the search window. The search window is automatically removed when the contig editor is exited.

[picture]

Searches are performed by scanning forwards or backwards, as appropriate, examining the visible data. Hence the "Cutoffs" button can be used to select whether or not searching should find matches within the cutoff data.

The Control-s key binding in the editor is equivalent to searching forward for the next match. The Escape Control-s key sequence performs a reverse search. Both key bindings will bring up the search window if it is not currently displayed.

There are thirteen different search modes.

Search by Position

This positions the cursor at the numeric position specified in the value text window. Eg a value of "1234" causes the cursor to be placed at base number 1234 in the contig. Positioning within a reading is achieved by prefixing the number with the "@" character, eg "@123" positions the cursor at base 123 of the sequence in which the cursor lies. Relative positions can be specified by prefixing the number with a plus or minus character. Eg "+1234" will advance the cursor 1234 bases. If possible, the cursor is positioned within the same sequence. The direction buttons have no effect on this operation.
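
The three value forms described above can be told apart by their first character. The following is a minimal Python sketch of that classification only. The function name and the returned labels are hypothetical and are not part of Gap4.

```python
def parse_position_value(value):
    """Classify a Search by Position value string.

    Returns a (kind, number) pair, where kind is one of:
      "absolute" - base number within the contig, e.g. "1234"
      "reading"  - base number within the current reading, e.g. "@123"
      "relative" - signed offset from the cursor, e.g. "+1234" or "-50"
    The labels are illustrative names, not Gap4 terminology.
    """
    value = value.strip()
    if value.startswith("@"):
        return ("reading", int(value[1:]))
    if value.startswith(("+", "-")):
        # int() accepts a leading sign, so "-50" becomes -50.
        return ("relative", int(value))
    return ("absolute", int(value))


print(parse_position_value("1234"))
print(parse_position_value("@123"))
print(parse_position_value("+1234"))
```

How the editor keeps the cursor within the same sequence after a relative move is not covered by this sketch.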

Search by Problem

This positions the cursor at the next place in the consensus sequence which is not an "A", "C", "G" or "T". The search can be performed either forwards or backwards from the current cursor position. Obviously the characters appearing in the consensus depend on the selected consensus calculation algorithm and the thresholds set.
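
As an illustration of the rule above, the sketch below scans a consensus string for the next character that is not "A", "C", "G" or "T". It assumes an upper case consensus held in a Python string with 0-based indexing. The function name is hypothetical, and this is not the editor's own code.

```python
def find_next_problem(consensus, cursor, forward=True):
    """Return the index of the next non-ACGT consensus character.

    The scan starts one base beyond the cursor in the chosen direction.
    Returns None when no problem base is found (the editor would
    sound its failure tone at this point).
    """
    step = 1 if forward else -1
    i = cursor + step
    while 0 <= i < len(consensus):
        if consensus[i] not in "ACGT":
            return i
        i += step
    return None


consensus = "ACGT-ACNGT"
print(find_next_problem(consensus, 0))
print(find_next_problem(consensus, 7, forward=False))
```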

Search by Annotation Comments

This positions the cursor at the start of the next tag which has a comment containing the string specified in the value text window. The search performed is a regular expression search, and certain characters have special meaning. Be careful when your value string contains ".", "*", "[", "]", "\", "^" or "$". The search can be performed either forwards or backwards from the current cursor position. Searching with an empty value will find all tags.
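
The effect of the special characters can be seen in the following sketch. It uses Python's re module purely as a stand-in. The regular expression dialect used by the editor may differ in detail, and the function name is hypothetical.

```python
import re


def comment_matches(pattern, comment, literal=False):
    """Return True if the tag comment matches the search value.

    With literal=True the special characters in the value are escaped
    first, so "." matches only a real full stop, and so on.
    """
    if literal:
        pattern = re.escape(pattern)
    return re.search(pattern, comment) is not None


# "." matches any character, so "v1.2" also matches "v1x2".
print(comment_matches("v1.2", "vector v1x2"))
print(comment_matches("v1.2", "vector v1x2", literal=True))
# An empty pattern matches every comment.
print(comment_matches("", "any comment"))
```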

Search by Sequence

This positions the cursor at the start of the next segment of sequence that matches the value specified in the text value window. The search is case insensitive. The search is performed on the readings, not on the consensus sequence. The search can be performed either forwards or backwards from the current cursor position.

Search by Quality

This positions the cursor at the next place in the consensus sequence where the consensus sequences for the two strands disagree. Where there is data for only one strand the search will stop at every base. The search can be performed either forwards or backwards from the current cursor position.

Search by Reading Name

This positions the cursor at the left end of the reading specified in the value text window. If the value is prefixed with a hash sign it is assumed to be a gel reading number. Otherwise it is assumed to be a gel reading name. Eg "#123" positions the cursor at the left end of gel reading number 123. "a16a12.s1" positions at the start of reading a16a12.s1. If the value was "a16" the cursor is positioned at the first reading which starts with "a16".
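
The interpretation of the value can be sketched as follows. This assumes the readings are available as a list of (number, name) pairs, that "first" means first in that list, and that an exact name match is preferred over a prefix match. All three are assumptions made for illustration, and the function name is hypothetical.

```python
def find_reading(value, readings):
    """Return the number of the reading selected by the value, or None.

    readings is a list of (number, name) pairs.
    "#123" selects by reading number; anything else selects by name,
    falling back to the first name that starts with the value.
    """
    if value.startswith("#"):
        wanted = int(value[1:])
        for number, _name in readings:
            if number == wanted:
                return number
        return None
    for number, name in readings:
        if name == value:
            return number
    for number, name in readings:
        if name.startswith(value):
            return number
    return None


readings = [(7, "a15b03.s1"), (123, "a16a12.s1"), (124, "a16c02.r1")]
print(find_reading("#123", readings))
print(find_reading("a16", readings))
```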

Search by Edit

This positions the cursor at the next place in the contig where an edit has been made. Edits include base insertions, deletions, replacements and confidence value changes. The search can be performed either forwards or backwards from the current cursor position.

Search by Evidence for Edit (1)

The Evidence for Edit (1) option checks edited bases to find bases in the consensus for which there is no evidence in the original readings. The definition of evidence is that at least one reading had this original base call. It will currently not find such cases where edits have not been made. This option finds such bases where there is no evidence for the forward strand AND there is no evidence for the reverse strand. Hence it will possibly find fewer problems than Evidence for Edit (2). Currently this searches only in the forward direction.

Search by Evidence for Edit (2)

The Evidence for Edit (2) option checks edited bases to find bases in the consensus for which there is no evidence in the original readings. The definition of evidence is that at least one reading had this original base call. It will currently not find such cases where edits have not been made. This option finds such bases where there is no evidence for the forward strand OR there is no evidence for the reverse strand. Hence it will possibly find more problems than Evidence for Edit (1). Currently this searches only in the forward direction.
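
The difference between the two Evidence for Edit options is the AND versus OR combination of the per-strand tests. The sketch below shows only that logic for a single consensus position. The function name and the way original base calls are passed in are assumptions for illustration, not the editor's data structures.

```python
def lacks_evidence(consensus_base, forward_calls, reverse_calls, mode):
    """Report whether a consensus base lacks evidence under mode 1 or 2.

    forward_calls and reverse_calls hold the original (unedited) base
    calls at this position from readings on each strand.
    A strand has evidence if at least one original call equals the
    consensus base.
    """
    no_forward = consensus_base not in forward_calls
    no_reverse = consensus_base not in reverse_calls
    if mode == 1:
        return no_forward and no_reverse
    if mode == 2:
        return no_forward or no_reverse
    raise ValueError("mode must be 1 or 2")


# Evidence on the forward strand only: option (2) reports it, (1) does not.
print(lacks_evidence("A", ["A", "A"], ["C"], mode=1))
print(lacks_evidence("A", ["A", "A"], ["C"], mode=2))
```

Every position reported by option (1) is also reported by option (2), which is why (2) can find more problems.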

Search by Tag Type

This positions the cursor at the start of the next tag which has the same type as specified by the type value menu. To change the type, select from the menu that pops up when the mouse is clicked on the button labeled "Type:". The search can be performed either forwards or backwards from the current cursor position. To find all tags, use "Search by Annotation Comments" with an empty text value string.

Search by Consensus Quality

This positions the cursor on the consensus at a position where the quality of the consensus is below a given threshold. The quality of the consensus is defined as the value at which setting the consensus cutoff to the same value would produce an unknown base type ('-'). The quality threshold is given in the value string and should be within the range of 0 to 100 inclusive.

Search by Discrepancies

This finds positions where two or more bases are above a particular quality level, but in disagreement. The quality threshold is given in the value string and should be within the range of 0 to 100 inclusive.
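
A discrepancy test for one column of aligned bases might look like the sketch below. It assumes each base is given with its quality value, and it treats a quality equal to the threshold as passing. Whether the editor's comparison is inclusive is not stated here, so that choice is an assumption, as is the function name.

```python
def is_discrepancy(column, threshold):
    """Return True if two or more different bases pass the threshold.

    column is a list of (base, quality) pairs for one contig position.
    """
    confident_bases = {base for base, quality in column if quality >= threshold}
    return len(confident_bases) >= 2


print(is_discrepancy([("A", 40), ("A", 35), ("C", 38)], 30))
print(is_discrepancy([("A", 40), ("C", 10)], 30))
```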

Search by File

This reads a series of search hits from a file and displays each position in the editor in turn. The file contains one hit per line, with each line consisting of a reading name, a position within that reading, and an optional comment. If the position of a hit is relative to the start of the contig rather than the start of any particular reading, then simply use the first reading in the contig. Positions that are beyond the ends of the reading are still valid, although the editing cursor is moved onto the consensus sequence.

The hit comment consists of any string. Multiline comments are possible, but they must be written using \n in the comment string rather than an actual newline character (which would signify the start of the next hit). The comment for the current hit is displayed at the bottom of the editor search window in a text panel which is visible only when in the "search by file" mode.

Any hit containing a reading name that is not in this contig is silently ignored. This allows for a search hits file to have hits for all contigs. However at present there is no mechanism for stepping through an entire search file bringing up editors as appropriate. This will be expanded upon in future.

An example hits file follows.

xb63c7.s2 102
xb63c7.s2 30 A multi-\nline comment.
xb32a2.s1 56 Oligo, of length 12
xa17b1.r1 5714 Repeat from 5714 to 5780
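
A hits file in this format could be read with a short script such as the sketch below. It assumes the fields are separated by whitespace, and the function name and the optional contig_readings argument are hypothetical. It is not the editor's own parser.

```python
def parse_hits(text, contig_readings=None):
    """Parse hits file text into (reading_name, position, comment) tuples.

    A literal backslash followed by "n" in a comment is converted to a
    real newline. If contig_readings is given, hits naming readings
    outside that set are skipped, as the editor silently ignores them.
    """
    hits = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split into at most three fields; the comment keeps its spaces.
        parts = line.split(None, 2)
        name, position = parts[0], int(parts[1])
        comment = parts[2].replace("\\n", "\n") if len(parts) > 2 else ""
        if contig_readings is not None and name not in contig_readings:
            continue
        hits.append((name, position, comment))
    return hits


example = "xb63c7.s2 102\nxb32a2.s1 56 Oligo, of length 12\n"
for hit in parse_hits(example):
    print(hit)
```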

This page is maintained by James Bonfield. Last generated on 2 February 1999.
URL: http://www.mrc-lmb.cam.ac.uk/pubseq/manual/gap4_55.html