Find similar spans

This method, generally the most useful and sensitive, was first described by McLachlan (McLachlan, A.D. Tests for comparing related amino acid sequences. J. Mol. Biol. 61, 409-424 (1971)). It calculates a score for each position in the plot by summing the points found when looking forwards and backwards along a diagonal line of a given length (the window length). The algorithm does not simply look for identity but uses a score matrix that contains a score for every possible pair of character types. Wherever the score reaches the minimum score, a match is saved. Each match is plotted as a single point, corresponding to the centre of the matching span, in the sip plot (see section Sip plot).
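The windowed scoring described above can be sketched in a few lines of Python. This is only an illustration, not the sip implementation: the function names are hypothetical, the identity scoring function stands in for a real score matrix, and the cutoff is treated as inclusive (a window scoring exactly the minimum is kept).

```python
def identity_score(a, b):
    # Assumed stand-in for a real score matrix: 1 for identical characters, else 0.
    return 1 if a == b else 0


def find_similar_spans(h, v, window, min_score, score=identity_score):
    """Return (h_centre, v_centre, total) for every diagonal window whose
    summed score is at least min_score. Positions are 1-based."""
    matches = []
    half = window // 2
    for i in range(len(h) - window + 1):
        for j in range(len(v) - window + 1):
            # Sum the pair scores along the diagonal starting at (i, j).
            total = sum(score(h[i + k], v[j + k]) for k in range(window))
            if total >= min_score:
                # Report the centre of the span, as plotted in the sip plot.
                matches.append((i + half + 1, j + half + 1, total))
    return matches


print(find_similar_spans("agcctatcaac", "agcctatgagc", window=11, min_score=9))
```

This naive version recomputes every window from scratch, so its cost grows with the product of both sequence lengths and the window length. A practical implementation would update the running sum as the window slides along each diagonal.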

[picture]

The dialogue box (shown above) requests the horizontal and vertical sequences and their ranges (see section Selecting a sequence), the window span length and the minimum score. Only results that reach this minimum score are plotted. The default minimum score is the one that would produce approximately 500 matches between two random sequences of the same composition as the two under investigation (see section Probabilities and expected numbers of matches). This value of 500 can be changed using the "Configure default number of matches" option of the "Options" menu on the main menubar (see section Changing the default number of matches). The upper and lower limits of the minimum score are determined in the same way: the upper limit is the score for which the expected number of matches is 0, and the lower limit is the score for which it equals the "maximum number of matches". If more matches need to be plotted, the "maximum number of matches" value can be raised using the "Configure maximum number of matches" option of the "Options" menu (see section Changing the maximum number of matches).
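One way such a default could be derived is sketched below. It is an assumption about the approach, not the documented sip calculation: positions are treated as independent, the per-position score distribution is built from the two compositions, and it is convolved once per window position. The function names and the identity scoring function are hypothetical.

```python
from collections import Counter, defaultdict


def identity_score(a, b):
    # Assumed stand-in for a real score matrix.
    return 1 if a == b else 0


def window_score_distribution(h, v, window, score=identity_score):
    """Probability of each total window score for random sequences with the
    same composition as h and v (positions assumed independent)."""
    fh = {a: n / len(h) for a, n in Counter(h).items()}
    fv = {b: n / len(v) for b, n in Counter(v).items()}
    single = defaultdict(float)  # score distribution for one aligned pair
    for a, pa in fh.items():
        for b, pb in fv.items():
            single[score(a, b)] += pa * pb
    dist = {0: 1.0}
    for _ in range(window):  # convolve once per window position
        nxt = defaultdict(float)
        for s, p in dist.items():
            for t, q in single.items():
                nxt[s + t] += p * q
        dist = dict(nxt)
    return dist


def default_min_score(h, v, window, target=500, score=identity_score):
    """Lowest score whose expected number of chance matches is <= target."""
    dist = window_score_distribution(h, v, window, score)
    comparisons = (len(h) - window + 1) * (len(v) - window + 1)
    best, tail = max(dist), 0.0
    for s in sorted(dist, reverse=True):
        tail += dist[s]  # probability of scoring s or more
        if tail * comparisons > target:
            break
        best = s
    return best
```

For example, `default_min_score(h, v, 11)` would return the lowest cutoff for an 11-character window at which no more than about 500 chance matches are expected under these assumptions.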

Further operations available for Find similar spans are:

Information
This command gives a brief description of the sequences used in the comparison, the input parameters used and the number of matches found.

horizontal EMBL: hsproperd 
vertical EMBL: mmproper
window length 11 min match 9 direction f
number of matches 1772

Results
A detailed listing of all the hits found is displayed in the output window.

Positions          2 h        630 v and score          9

 Percentage mismatch  18.2
                2        12
              H agcctatcaac
                ::::::: : :
              V agcctatgagc
              630       640

Positions          7 h        369 v and score          9

 Percentage mismatch  18.2
                7        17
              H atcaacccaga
                :  ::::::::
              V aggaacccaga
              369       379
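The figures in the listing above can be checked with a short sketch. It assumes that the percentage mismatch is the number of differing positions divided by the window length, which reproduces the 18.2 shown for both hits (2 mismatches in 11 positions). The helper names are hypothetical.

```python
def percentage_mismatch(h_span, v_span):
    """Percentage of aligned positions at which the two spans differ."""
    if len(h_span) != len(v_span):
        raise ValueError("spans must have the same length")
    mismatches = sum(a != b for a, b in zip(h_span, v_span))
    return 100.0 * mismatches / len(h_span)


def match_line(h_span, v_span):
    """Build the ':' marker line printed between the H and V spans."""
    return "".join(":" if a == b else " " for a, b in zip(h_span, v_span))


h_span, v_span = "agcctatcaac", "agcctatgagc"
print(f"Percentage mismatch {percentage_mismatch(h_span, v_span):.1f}")
print("H", h_span)
print(" ", match_line(h_span, v_span))
print("V", v_span)
```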

Scores
This option lists scores, probabilities, and their expected and observed numbers of matches.

score    9 probability 1.73e-04 expected          365 observed 1772
score   10 probability 1.17e-05 expected           25 observed 601
score   11 probability 3.60e-07 expected            1 observed 149

Rescan matches
The "Rescan matches" command plots a dot for each residue, within each matching span, whose score is above a minimum value. This is only a temporary result and will be destroyed if the sip plot is altered (see section Permanent and temporary results).
Configure
This option allows the line width and colour of the matches to be altered. A colour browser is displayed from which the desired line width or colour can be configured (see section Colour Selector). Pressing OK will update the sip plot.
Display sequences
Selecting this command invokes the sequence display (see section Sequence display). Moving the cursor in the sequence display will move the cursors of the same sequence in any sip plot (see section Cursors). To force the sequence display to show the nearest match, use the "nearest match" button in the sequence display plot. To force the sequences to maintain their current register, activate the "Lock" button.
Hide
This option removes the points from the sip plot but retains the information in memory.
Reveal
This option will redisplay previously hidden points in the sip plot.
Remove
This command removes all the information regarding this particular invocation of Find similar spans; access to this data is then lost.
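As an illustration of what the "Rescan matches" command described above does, the sketch below walks along one matching span and collects a dot for every aligned pair whose individual score reaches a cutoff. It is a guess at the behaviour, not the sip code: the function name, the 1-based start positions, the inclusive cutoff and the identity scoring function are all assumptions.

```python
def identity_score(a, b):
    # Assumed stand-in for a real score matrix.
    return 1 if a == b else 0


def rescan_match(h, v, h_start, v_start, window, min_residue_score,
                 score=identity_score):
    """Return 1-based (h_pos, v_pos) dots for residues inside one matching
    span whose individual pair score is at least min_residue_score."""
    dots = []
    for k in range(window):
        a = h[h_start - 1 + k]
        b = v[v_start - 1 + k]
        if score(a, b) >= min_residue_score:
            dots.append((h_start + k, v_start + k))
    return dots


# A span starting at position 2 of h and position 1 of v.
h = "x" + "agcctatcaac"
v = "agcctatgagc"
print(rescan_match(h, v, h_start=2, v_start=1, window=11, min_residue_score=1))
```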

This page is maintained by James Bonfield. Last generated on 2 February 1999.
URL: http://www.mrc-lmb.cam.ac.uk/pubseq/manual/sip_9.html