The local alignment routine is based on the program sim.c by Huang and Miller, which is an implementation of the Smith-Waterman algorithm (Huang, X.Q. & Miller, W. A Time-Efficient, Linear-Space Local Similarity Algorithm. Advances in Applied Mathematics 12, 337-357 (1991)).
SIM finds the k best non-intersecting alignments between two sequences, or within a single sequence, using dynamic programming. The alignments are reported in order of decreasing similarity score and share no aligned pairs. SIM requires space proportional to the sum of the input sequence lengths and the output alignment lengths, so it can accommodate 100,000-base sequences on a workstation. Both sequences must be of the same type, i.e. both DNA or both protein.
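The dynamic programming idea behind a local alignment score can be sketched in a few lines of Python. This is only an illustration and is not the sim.c code: it computes the single best local score with affine gaps while keeping two rows in memory, and it does not find the k best non-intersecting alignments. The function name, the scoring callback and the convention that a gap of length k costs gap_open + k * gap_extend are assumptions made for the example.

```python
def local_alignment_score(h, v, score, gap_open, gap_extend):
    """Best local alignment score of h against v with affine gap penalties.

    Assumed convention: a gap of length k costs gap_open + k * gap_extend.
    Only the previous and current rows are stored, so memory is O(len(h)).
    """
    neg_inf = float("-inf")
    n = len(h)
    best_prev = [0.0] * (n + 1)      # best score of an alignment ending at (i-1, j)
    gap_h_prev = [neg_inf] * (n + 1)  # same, but ending with a gap in h
    best = 0.0
    for i in range(1, len(v) + 1):
        best_cur = [0.0] * (n + 1)
        gap_h_cur = [neg_inf] * (n + 1)
        gap_v = neg_inf               # ending with a gap in v, within this row
        for j in range(1, n + 1):
            # Extend an existing gap or open a new one.
            gap_v = max(gap_v - gap_extend,
                        best_cur[j - 1] - gap_open - gap_extend)
            gap_h_cur[j] = max(gap_h_prev[j] - gap_extend,
                               best_prev[j] - gap_open - gap_extend)
            diagonal = best_prev[j - 1] + score(v[i - 1], h[j - 1])
            # A local alignment may start anywhere, hence the floor at zero.
            best_cur[j] = max(0.0, diagonal, gap_v, gap_h_cur[j])
            best = max(best, best_cur[j])
        best_prev, gap_h_prev = best_cur, gap_h_cur
    return best


def simple_score(a, b):
    """Hypothetical scoring: +1 for a match, -1 for any mismatch."""
    return 1 if a == b else -1


if __name__ == "__main__":
    print(local_alignment_score("acgtacgt", "ttacgtaa", simple_score, 6, 0.2))
```

With a gap-opening penalty as high as 6, a gap is only worthwhile when it joins two well-matching regions that are each long enough to pay for it.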
A dialogue box (shown above) requests the horizontal and vertical sequences and the ranges over which they are to be aligned (see section Selecting a sequence). Either a fixed number of alignments can be requested or, alternatively, all alignments above a certain score. If the sequences are DNA, the scores for a matching aligned pair, a transition and a transversion must be provided. These values are used to generate a score matrix. For protein sequences, the score matrix can be changed from the "Options" menu (see section Changing the score matrix). For both DNA and protein sequences, the penalty for opening a gap and the penalty for extending a gap must also be given.
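As an illustration of how three DNA scores could be expanded into a score matrix, the following Python sketch builds a 4x4 table keyed by base pairs. The function name and the dictionary representation are assumptions for the example; the program's internal representation is not described here. A transition is a purine-purine (a/g) or pyrimidine-pyrimidine (c/t) substitution, and any other substitution is a transversion.

```python
PURINES = {"a", "g"}
PYRIMIDINES = {"c", "t"}


def dna_score_matrix(match, transition, transversion):
    """Return a dict mapping (base, base) pairs to scores."""
    matrix = {}
    for x in "acgt":
        for y in "acgt":
            if x == y:
                matrix[x, y] = match
            elif {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
                # a<->g or c<->t
                matrix[x, y] = transition
            else:
                # purine <-> pyrimidine
                matrix[x, y] = transversion
    return matrix


if __name__ == "__main__":
    scores = dna_score_matrix(match=1, transition=-1, transversion=-1)
    for x in "acgt":
        print(x, [scores[x, y] for y in "acgt"])
```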
The alignments are displayed in the output window along with the percentage mismatch (see below) and on the sip plot as a series of lines, each line corresponding to a single alignment. Each line traces the positions of the bases in the alignment. Stretches of pads appear as straight horizontal or vertical regions, depending on whether the pads are in the vertical or horizontal sequence respectively.
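The relationship between pads and the shape of the plotted line can be sketched as follows. This is a hypothetical helper, not part of the program: it walks the columns of a padded alignment and merges consecutive steps in the same direction into line segments. The pad character "*" and the coordinate convention are assumptions for the example.

```python
def alignment_to_segments(h_aln, v_aln, h_start=1, v_start=1, pad="*"):
    """Convert a padded alignment into ((x0, y0), (x1, y1)) line segments.

    x follows the horizontal sequence and y the vertical sequence.
    """
    if len(h_aln) != len(v_aln):
        raise ValueError("aligned sequences must have equal length")
    segments = []  # entries are (direction, start_point, end_point)
    x, y = h_start, v_start
    for h_char, v_char in zip(h_aln, v_aln):
        dx = 0 if h_char == pad else 1  # pad in horizontal: x does not advance
        dy = 0 if v_char == pad else 1  # pad in vertical: y does not advance
        if dx == 0 and dy == 0:
            continue  # a column of two pads carries no information
        if segments and segments[-1][0] == (dx, dy):
            direction, start, _ = segments[-1]
            segments[-1] = (direction, start, (x + dx, y + dy))
        else:
            segments.append(((dx, dy), (x, y), (x + dx, y + dy)))
        x, y = x + dx, y + dy
    return [(start, end) for _, start, end in segments]


if __name__ == "__main__":
    # Two pads in the horizontal sequence give a vertical stretch.
    for segment in alignment_to_segments("acg**t", "acgtat"):
        print(segment)
```

In this sketch, aligned pairs produce diagonal segments, pads in the horizontal sequence produce vertical segments, and pads in the vertical sequence produce horizontal segments.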
Percentage mismatch 35.7

  438       448       458       468       478       488
h caggcctgtgaggaccagcagtgctgtcctgagatgggcggctggtctggctgggggccc
  :::::::::::   :::: ::  ::: ::       :: : :::: :   :::::: :::
m caggcctgtgacacccagaagacctgccccacacatggggcctgggcatcctggggcccc
  451       461       471       481       491       501

  498       508       518
h tgggagccttgctctgtcacctgc
  :::   ::  :::: :   :::::
m tggagcccccgctcaggatcctgc
  511       521       531
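The percentage mismatch figure can be checked by hand with a short Python sketch. The function name is hypothetical, and the definition used (mismatching columns divided by alignment length) is an assumption that happens to reproduce the value reported for the example above, which contains no pads.

```python
def percentage_mismatch(h_aln, v_aln):
    """Percentage of alignment columns in which the two characters differ."""
    if len(h_aln) != len(v_aln) or not h_aln:
        raise ValueError("aligned sequences must be non-empty and equal in length")
    mismatches = sum(1 for a, b in zip(h_aln, v_aln) if a != b)
    return 100.0 * mismatches / len(h_aln)


if __name__ == "__main__":
    # The two aligned sequences from the example output, joined across lines.
    h = ("caggcctgtgaggaccagcagtgctgtcctgagatgggcggctggtctggctgggggccc"
         "tgggagccttgctctgtcacctgc")
    m = ("caggcctgtgacacccagaagacctgccccacacatggggcctgggcatcctggggcccc"
         "tggagcccccgctcaggatcctgc")
    print(round(percentage_mismatch(h, m), 1))  # prints 35.7
```

Here 30 of the 84 aligned positions differ, which gives 35.7 to one decimal place.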
Further operations available for local alignments are:
horizontal PERSONAL: h from 1 to 1553
vertical PERSONAL: m from 1 to 1358
number of alignments 3
score for match 1
score for transition -1
score for transversion -1
penalty for starting gap 6
penalty for each residue in gap 0.2