
Align sequences

This function produces an optimal alignment of two sequence segments. The dynamic programming alignment algorithm is based on Huang, X. (1994) On global sequence alignment. CABIOS 10, 227-235. There is no limit on sequence length, but the sequences to be aligned should be of the same type, i.e. both DNA or both protein.


A dialogue box (shown above) requests the horizontal and vertical sequences, the ranges over which they are to be aligned (see section Selecting a sequence), the gap start penalty and the gap extension penalty. In addition, if the sequences are DNA, the "score for match" and "score for mis-match" must be provided. These two values are used to generate a score matrix. For protein sequences, the score matrix can be changed from the "Options" menu (see section Changing the score matrix).
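As a rough illustration of how these four parameters interact, the sketch below builds a DNA score matrix from a match and a mis-match score and computes a global alignment score with separate gap start and gap extension penalties. It uses a textbook Gotoh-style recurrence, not the Huang (1994) algorithm used by the program, and it assumes a gap of length k costs gap_start + k * gap_extend, which may differ from the program's own convention. All function names and parameter values are hypothetical.

```python
# Sketch only: Gotoh-style global alignment score with affine gap penalties.
# Assumption: a gap of length k costs gap_start + k * gap_extend.
NEG_INF = float("-inf")


def dna_score_matrix(match, mismatch, alphabet="acgt"):
    """Build a score lookup table from a match score and a mis-match score."""
    return {(a, b): (match if a == b else mismatch)
            for a in alphabet for b in alphabet}


def global_align_score(h, v, scores, gap_start, gap_extend):
    """Return the best global alignment score of h (horizontal) and v (vertical)."""
    n, m = len(h), len(v)
    open_cost = gap_start + gap_extend  # cost of the first padded position
    # M: last column aligns two bases; X: pad in v; Y: pad in h.
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = -(gap_start + i * gap_extend)
    for j in range(1, m + 1):
        Y[0][j] = -(gap_start + j * gap_extend)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = scores[(h[i - 1], v[j - 1])]
            M[i][j] = s + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            X[i][j] = max(M[i - 1][j] - open_cost,
                          Y[i - 1][j] - open_cost,
                          X[i - 1][j] - gap_extend)
            Y[i][j] = max(M[i][j - 1] - open_cost,
                          X[i][j - 1] - open_cost,
                          Y[i][j - 1] - gap_extend)
    return max(M[n][m], X[n][m], Y[n][m])


# Example with made-up parameter values.
scores = dna_score_matrix(match=1, mismatch=-1)
print(global_align_score("acgt", "agt", scores, gap_start=2, gap_extend=1))
```

In this example the best alignment pads one position in the shorter sequence: three matches minus one gap of length one. Raising the gap start penalty relative to the extension penalty favours fewer, longer stretches of pads.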

The alignment is displayed in the output window, along with the percentage mismatch (see below), and is drawn on the sip plot as a line. The line represents the positions of the bases in the alignment. Stretches of pads appear as straight horizontal or vertical regions, depending on whether the pads are in the vertical or horizontal sequence respectively.

 Percentage mismatch  29.6
                1        11        21        31        41        51
      hsproperd gagcctatcaacccagataaagcgggacctcctctctggtagaggtgcagggggcagtac
       mmproper ************************************************************
             -157      -147      -137      -127      -117      -107

               61        71        81        91       101       111
      hsproperd tcaacatgatcacagagggagcgcaggcccctcgattgttgctgccgccgctgctcctgc
       mmproper ************************************************************
              -97       -87       -77       -67       -57       -47

              121       131       141       151       161       171
      hsproperd tgctcaccctgccagccacaggctcagaccccgtgctctgcttcacccagtatgaagaat
                                                      :: :::::::::::::: :: :
       mmproper **************************************tgtttcacccagtatgaggagt
              -37       -27       -17        -7         3        13

              181       191       201       211       221       231
      hsproperd cctccggcaagtgcaagggcctcctggggggtggtgtcagcgtggaagactgctgtctca
                :::: :::: :::::: ::::: :: ::: : :   :::: :: ::::::::::::::::
       mmproper cctctggcaggtgcaaaggcctacttgggagagacatcagggtagaagactgctgtctca
               23        33        43        53        63        73
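The relationship between the padded alignment, the percentage mismatch and the plotted line can be sketched as follows. This is an illustration under stated assumptions, not the program's implementation: the pad character is taken to be "*" as in the output above, the percentage mismatch is assumed to be the share of alignment columns whose two characters differ (the program may treat pads or end regions differently), and the function names are hypothetical.

```python
# Sketch only: derive a mismatch percentage and a plot path from two padded strings.
PAD = "*"  # assumed pad character, as seen in the example output


def percent_mismatch(h_aligned, v_aligned):
    """Percentage of columns whose characters differ (assumed definition)."""
    if len(h_aligned) != len(v_aligned):
        raise ValueError("aligned sequences must have equal length")
    if not h_aligned:
        return 0.0
    differing = sum(1 for a, b in zip(h_aligned, v_aligned) if a != b)
    return 100.0 * differing / len(h_aligned)


def alignment_path(h_aligned, v_aligned):
    """Return (x, y) points, counting bases consumed in each sequence so far."""
    x, y = 0, 0
    points = [(x, y)]
    for a, b in zip(h_aligned, v_aligned):
        if a != PAD:
            x += 1  # a base in the horizontal sequence moves the line right
        if b != PAD:
            y += 1  # a base in the vertical sequence moves the line along y
        points.append((x, y))
    return points


print(percent_mismatch("ac*g", "acgg"))
print(alignment_path("ac*g", "acgg"))
```

A pad in the horizontal sequence leaves x unchanged while y advances, giving a vertical step; a pad in the vertical sequence gives a horizontal step, which matches the behaviour of the plotted line described above.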

The two aligned sequences are automatically saved in memory and can be accessed through the sequence manager. They are assigned default filenames based on the parent name with the addition of _a"number", where "number" is a unique identifier (see the twelfth and thirteenth entries of the sequence manager picture; see section Sequence manager).

Further operations available for Align sequences are:

Information
This command gives a brief description of the sequences used in the comparison and the input parameters used.

horizontal EMBL: hsproperd
vertical EMBL: mmproper
window length 11 minimum score 9 word length 8 minimum sd 3.000000

Configure
This option allows the line width and colour of the matches to be altered (see section Colour Selector). A colour browser is displayed from which the desired line width or colour can be configured. Pressing OK will update the sip plot.
Display sequences
Selecting this command invokes the sequence display (see section Sequence display). Moving the cursor in the sequence display will move the cursors of the same sequence in any sip plot (see section Cursors). To force the sequence display to show the nearest match, use the "nearest match" button in the sequence display plot.
Hide
This option removes the points from the sip plot but retains the information in memory.
Reveal
This option will redisplay previously hidden points in the sip plot.
Remove
This command removes all the information regarding this particular invocation of Align sequences, and access to this data is lost.

This page is maintained by James Bonfield. Last generated on 2 February 1999.