
Trace_diff

NAME

trace_diff -- Detect mutations between two SCF files.

SYNOPSIS

trace_diff [-v scf_version] [-p scf_precision] [-n num_sd] [-b band_width] [-s position] [-e position] [-o file] [-S] [-c] [-a] mutant_file [wild_type_file]

DESCRIPTION

trace_diff aligns two trace files, subtracts one from the other, and then uses this `difference trace' to distinguish real sequence mutations from incorrect base calls. See Bonfield, J.K., Rada, C. and Staden, R., Automated detection of point mutations using fluorescent sequence trace subtraction. Nucl. Acids Res. 26, 3404-3409 (1998).

For an overview and more details about mutation detection, see section Search for Mutations.

The alignment is performed by first aligning the two text sequences, and then using this alignment to scale the traces so that the same number of `trace samples' is used for the aligned base calls in the two traces. The differencing algorithm then simply subtracts one trace from the other, without any Y scaling. With the -o option, this difference trace can be written as an SCF file, which can then be viewed with trev.
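The scale-and-subtract step described above can be sketched as follows. This is only an illustration under stated assumptions, not trace_diff's own code: each trace is taken to be a list of samples for a single channel, the sample positions of the aligned base calls are assumed to be already known (the hypothetical mut_peaks and wt_peaks lists), linear interpolation is assumed for the rescaling, and the subtraction direction (mutant minus wild type) is an arbitrary choice.

```python
def resample_segment(samples, start, end, n):
    """Linearly interpolate samples[start..end) onto n evenly spaced points."""
    out = []
    for i in range(n):
        x = start + (end - start) * i / n
        lo = int(x)
        hi = min(lo + 1, len(samples) - 1)
        frac = x - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out


def difference_trace(mutant, wild, mut_peaks, wt_peaks):
    """Return mutant minus rescaled wild type, one value per mutant sample.

    mut_peaks[i] and wt_peaks[i] are the sample positions of the i-th
    aligned base call (assumed input). Between consecutive aligned calls
    the wild type trace is stretched or squeezed to the mutant's sample
    count. No Y (amplitude) scaling is applied.
    """
    diff = []
    mut_segments = zip(mut_peaks, mut_peaks[1:])
    wt_segments = zip(wt_peaks, wt_peaks[1:])
    for (m0, m1), (w0, w1) in zip(mut_segments, wt_segments):
        n = m1 - m0
        scaled = resample_segment(wild, w0, w1, n)
        diff.extend(mutant[m0 + k] - scaled[k] for k in range(n))
    return diff
```

For example, a wild type trace sampled at half the density of the mutant is stretched to match before subtraction, so identical signals give a difference trace of zeros.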

To detect mutations in the difference trace, we first compute the background mean and the standard deviation over the length of the trace that is to be checked for mutations. We then search the trace for locations containing two opposing differences in trace signals. For instance, a change from an A base to a T will include a positive A trace difference and a negative T trace difference. If both the positive and negative differences are more than num_sd multiples of the standard deviation from the mean, then this is a potential mutation. If the -a option has been given, trace_diff will accept this as a real mutation. Otherwise, it also checks that the base calls differ and will ignore the potential mutation if they do not. If an experiment file was given as input, mutations may be written back to the experiment file as tags.
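The thresholding rule above can be illustrated with a short sketch. It is not the trace_diff implementation: it assumes the difference trace is held as a dictionary of per-base sample lists, pools all four channels when computing the mean and standard deviation, and requires the opposing differences to occur at the same sample index. The function names and the dictionaries of base calls keyed by sample index are hypothetical.

```python
from statistics import mean, pstdev

BASES = "ACGT"


def find_candidates(diff, num_sd=4.0):
    """Return (sample_index, positive_base, negative_base) tuples.

    diff maps each base letter to its difference-trace samples. A sample
    is a candidate when one channel lies more than num_sd standard
    deviations above the mean and another lies as far below it.
    """
    pooled = [v for b in BASES for v in diff[b]]
    mu = mean(pooled)
    sd = pstdev(pooled)
    upper = mu + num_sd * sd
    lower = mu - num_sd * sd
    hits = []
    for i in range(len(diff["A"])):
        up = [b for b in BASES if diff[b][i] > upper]
        down = [b for b in BASES if diff[b][i] < lower]
        if up and down:
            hits.append((i, up[0], down[0]))
    return hits


def confirm(hits, mutant_calls, wild_calls, accept_all=False):
    """Keep candidates whose base calls differ, or all of them if accept_all.

    accept_all mirrors the -a option; mutant_calls and wild_calls map a
    sample index to the base called there (assumed input).
    """
    if accept_all:
        return list(hits)
    return [h for h in hits if mutant_calls.get(h[0]) != wild_calls.get(h[0])]
```

Raising num_sd makes the test stricter, which is why -n is the option most likely to need tuning.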

The mutant_file and wild_type_file specify the file to detect mutations in, and the wild type trace to compare against. If the mutant_file is an experiment file (which must contain LN and LT records pointing to the real trace file), the mutations found will be written back to the experiment file as MUTN tags. The experiment file may also contain a WT record referencing the wild type SCF file, in which case the wild_type_file argument is optional.
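As an illustration of how the arguments in the SYNOPSIS fit together, the sketch below builds a trace_diff command line in Python. The helper name trace_diff_argv and all file names are hypothetical; only the flags themselves are taken from this page.

```python
def trace_diff_argv(mutant_file, wild_type_file=None, num_sd=None,
                    start=None, end=None, out_scf=None,
                    silent=False, clip=False, accept_all=False):
    """Build an argument list for trace_diff using flags from the SYNOPSIS."""
    argv = ["trace_diff"]
    if num_sd is not None:
        argv += ["-n", str(num_sd)]
    if start is not None:
        argv += ["-s", str(start)]
    if end is not None:
        argv += ["-e", str(end)]
    if out_scf is not None:
        argv += ["-o", out_scf]
    if silent:
        argv.append("-S")
    if clip:
        argv.append("-c")
    if accept_all:
        argv.append("-a")
    argv.append(mutant_file)
    # The wild type file may be omitted when the experiment file has a WT record.
    if wild_type_file is not None:
        argv.append(wild_type_file)
    return argv
```

The resulting list could be passed to subprocess.run(argv, check=True) on a system where trace_diff is installed.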

OPTIONS

-v scf_version
Specifies the version of the SCF created when using -o. Valid values are 2 and 3. Defaults to 3.
-p scf_precision
Specifies the precision (in bits) of the trace samples stored in the SCF file. Valid values are 8 and 16. Defaults to 16.
-n num_sd
Specifies the threshold at which peaks in the difference trace are to be considered as potential mutations. This is the value most likely to be changed by the user. The default is 4.
-b band_width
Specifies the width of the band along the diagonal of the sequence alignment matrix that is checked when aligning the sequences. Roughly speaking, this is equivalent to the expected difference in the number of pads needed in each sequence, including end gaps. To force a full alignment, specify band_width as the sequence length or greater.
-s position
-e position
Specifies the start and end positions within the mutant sequence between which to check for mutations. Note that looking for mutations in really poor quality data may have detrimental effects on the detection in good data. The default range is from 50 to 300.
-o file
Specifies the name of an SCF file in which to save the difference trace. No default exists, but this is optional.
-S
Silent mode: do not write information on the mutations found to stdout. If an experiment file has been used, the mutations will still be written to the experiment file as tags.
-c
Specifies that the range checked (-s to -e) should be clipped, if necessary, by the QL and QR line types in the experiment file. Hence the start position is the maximum of the QL and -s values, whilst the end position is the minimum of the QR and -e values.
-a
Specifies that all mutations found by analysing the difference trace will be output and/or tagged regardless of whether the base calls in the mutant and wild type sequences differ.
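The interaction between -s, -e and -c can be summarised in a few lines. This is a sketch of the rule as stated under -c, not code from trace_diff; the function name is hypothetical, and the defaults of 50 and 300 are those given for -s and -e above.

```python
def checked_range(start=50, end=300, ql=None, qr=None, clip=False):
    """Return the (start, end) range that would be checked for mutations.

    ql and qr stand for the QL and QR values from the experiment file.
    They are only applied when clip is True, mirroring the -c option.
    """
    if clip:
        if ql is not None:
            start = max(start, ql)
        if qr is not None:
            end = min(end, qr)
    return start, end
```

With QL of 72 and QR of 250, the default range of 50 to 300 is clipped to 72 to 250 when -c is given, and left unchanged otherwise.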

This page is maintained by James Bonfield. Last generated on 2 February 1999.
URL: http://www.mrc-lmb.cam.ac.uk/pubseq/manual/manpages_11.html