
Trace_clip

NAME

trace_clip -- an Experiment File sequence clipper that analyses traces

SYNOPSIS

trace_clip [-w winlen_nonc] [-W winlen_drop] [-s start] [-c cut_nonc_r] [-C cut_drop_r] [-f fract_nonc_r] [-k cut_nonc_l] [-K cut_drop_l] [-F fract_nonc_l] [-m max_right] [-M min_left] [-L] [-R] [-b] [-v] [-t] [-p] file ...

DESCRIPTION

trace_clip is used to "clip" the 3' and 5' ends of machine-produced sequences. It adds QR and QL records to the reading's experiment file; bases to the right of the QR position and to the left of the QL position will be ignored by many subsequent processing steps. Note, however, that the clipped data can still be used to help find joins between contigs (see section Find Internal Joins) and to confirm single stranded regions (see section Double stranding). The clip position is selected by analysing the reading's traces using two simple measures. The first (nonc, or non-called over called) calculates the ratio of the area under the trace for the called base to the largest of the areas under the traces of the non-called bases at the same position. The second (drop) measures, for the called base, the ratio of the height of the trace at its peak to its height at the mid-point between the peak and the next base. In our hands, for ABI-produced traces, both of these calculations give values that, followed from 5' to 3', start off high, drop to a minimum and then increase.

For the majority of the sequence the measures are averaged over windows of length winlen_nonc and winlen_drop, but near the 5' end of the sequence the windows are progressively shortened. For example, if the window length is 101 then from base position 51 rightwards the calculations are averaged over windows of 101 bases; for base position 50 the window length is 99, for 49 it is 97, and so on, until the 5' end of the reading is reached. (Actually, bases at positions 1 to min_left are given the value at position min_left.)
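The window shrinking described above can be sketched in a few lines of Python. This is an illustration only, not the program's source: the function names are hypothetical, positions are 1-based as in the text, and the truncation of the window at the 3' end is an assumption, since the text only describes the 5' end.

```python
def effective_window(pos, winlen):
    """Window length used at 1-based base position pos.

    winlen is expected to be odd. Near the 5' end the window is
    shortened so that it stays centred on pos without running off
    the start of the reading.
    """
    return min(winlen, 2 * pos - 1)


def windowed_average(values, pos, winlen):
    """Mean of values over the window centred on 1-based position pos."""
    half = effective_window(pos, winlen) // 2
    lo = pos - 1 - half                  # 0-based start index
    hi = min(len(values), pos + half)    # exclusive end; 3' truncation is an assumption
    window = values[lo:hi]
    return sum(window) / len(window)


def averaged_profile(values, winlen, min_left):
    """Averaged value at every position, with positions 1..min_left
    all given the value found at position min_left."""
    profile = [windowed_average(values, p, winlen)
               for p in range(1, len(values) + 1)]
    anchor = min(min_left, len(values))
    for p in range(1, anchor):
        profile[p - 1] = profile[anchor - 1]
    return profile
```

With winlen set to 101, effective_window gives 101 at position 51, 99 at position 50 and 97 at position 49, matching the example in the text.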

To select the right end clip point the program starts either at the base having the minimum observed average value or, if the user has defined one, at position start. It then searches rightwards, separately for each measure, until it finds a position at which the averaged value reaches or exceeds the corresponding cutoff, cut_nonc_r or cut_drop_r. The clip point is the weighted mean (using fract_nonc_r) of the positions at which the two searches stop. The left clip point is calculated in a similar manner, searching leftwards and using cut_nonc_l, cut_drop_l and fract_nonc_l.
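As a rough sketch of the right-end search, the following Python takes two already averaged profiles (lists indexed from the first base) and applies the cutoffs and the weighted mean. The function names are hypothetical. Two behaviours are assumptions rather than documented facts: each measure's search starts at that measure's own minimum when no start is given, and a search that never reaches its cutoff stops at the last base. The result is left as a fraction because the text does not say how the program rounds it.

```python
def first_at_or_above(profile, start, cutoff):
    """1-based position of the first value >= cutoff, scanning
    rightwards from start. Falls back to the last position if the
    cutoff is never reached (an assumption)."""
    for pos in range(start, len(profile) + 1):
        if profile[pos - 1] >= cutoff:
            return pos
    return len(profile)


def right_clip(nonc, drop, cut_nonc_r, cut_drop_r, fract_nonc_r, start=None):
    """Weighted mean of the two stop positions for the right end."""
    def begin(profile):
        # Use the user-supplied start if given, else the position of
        # the minimum of this profile (assumed to be per measure).
        if start is not None:
            return start
        return profile.index(min(profile)) + 1

    stop_nonc = first_at_or_above(nonc, begin(nonc), cut_nonc_r)
    stop_drop = first_at_or_above(drop, begin(drop), cut_drop_r)
    return fract_nonc_r * stop_nonc + (1.0 - fract_nonc_r) * stop_drop


# Toy profiles, invented purely for illustration.
nonc = [0.5, 0.2, 0.1, 0.2, 0.25, 0.35, 0.4]
drop = [1.5, 0.9, 0.8, 1.0, 1.2, 1.3, 1.4]
clip = right_clip(nonc, drop, cut_nonc_r=0.3, cut_drop_r=1.1, fract_nonc_r=0.25)
```

In this toy case the nonc search stops at position 6 and the drop search at position 5, so the weighted mean is 0.25 * 6 + 0.75 * 5 = 5.25.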

The file arguments, of which there can be several, are processed one at a time. Each argument is assumed to be a valid Experiment File. The trace file name is read from the Experiment File; clipping is performed; and a QR or QL identifier is appended to the Experiment File.

The default arguments are -w 101 -W 101 -c 0.3 -C 1.1 -f 0.25 -k 0.3 -K 0.3 -F 0.5 -m 550 -M 5. Also by default, only 3' clipping is performed. Left end clipping can be forced using the -L or -b options. -L clips the left end only, unless the -R option is also used, and -b clips both ends.
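The interaction of -L, -R and -b can be summarised as a small decision function. This Python sketch only restates the rules given above; the function name and the "left"/"right" labels are hypothetical and are not part of the program.

```python
def ends_to_clip(left=False, right=False, both=False):
    """Return the set of ends clipped for a combination of -L, -R and -b."""
    if both or (left and right):
        return {"left", "right"}
    if left:
        return {"left"}
    # Default behaviour: only the 3' (right) end is clipped.
    return {"right"}
```

For example, ends_to_clip() describes a run with no end-selection options, and ends_to_clip(left=True, right=True) describes a run with both -L and -R.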

Using the -p option the program will output the averaged nonc and drop values to two files with names derived from the input file name. For example, for an input file fred.1, the output files fred.1.d and fred.1.n will contain respectively the drop and nonc values as x,y coordinates suitable for use by a graph plotting program.

Using the -t option will leave the experiment file unchanged. If used in conjunction with the -v option the program will write the clip points to the terminal screen.

The parameters cut_nonc_r, cut_drop_r, fract_nonc_r, cut_nonc_l, cut_drop_l and fract_nonc_l can be chosen by use of scale_trace_clip. See section scale_trace_clip.

OPTIONS

-w winlen_nonc
Set the length of the non-called over called window to winlen_nonc. This should be an odd number.
-W winlen_drop
Set the length of the drop window to winlen_drop. This should be an odd number.
-c cut_nonc_r
Stop searching the non-called over called values rightwards when the score is greater than or equal to cut_nonc_r.
-C cut_drop_r
Stop searching the drop values rightwards when the score is greater than or equal to cut_drop_r.
-f fract_nonc_r
Set the weight (or fraction) of the non-called over called window stop point that is to be used in the weighted mean calculation for the right clip point. If stop_nonc and stop_drop are the stop positions found for the two measures then the clip point is given by: clip = fract_nonc_r * stop_nonc + ( 1.0 - fract_nonc_r ) * stop_drop
-k cut_nonc_l
Stop searching the non-called over called values leftwards when the score is greater than or equal to cut_nonc_l.
-K cut_drop_l
Stop searching the drop values leftwards when the score is greater than or equal to cut_drop_l.
-F fract_nonc_l
Set the weight (or fraction) of the non-called over called window stop point that is to be used in the weighted mean calculation for the left clip point. If stop_nonc and stop_drop are the stop positions found for the two measures then the clip point is given by: clip = fract_nonc_l * stop_nonc + ( 1.0 - fract_nonc_l ) * stop_drop
-v
Enable verbose output. This outputs information on which files are currently being clipped and their clip points.
-t
Enable test only mode in which no changes are made to the experiment file.
-p
Enable plotting output. This outputs the drop and nonc values to two files fred.1.d and fred.1.n if the input file is called fred.1.
-s start
Force the searches to start from position start in the sequence. If this option is not given, the search starts from the lowest value of drop or nonc found.
-M min_left
Force the left clip point to be to the right of min_left.
-m max_right
Force the right clip point to be to the left of max_right.
-L
Clip the left end of the sequence.
-R
Clip the right end of the sequence (default).
-b
Clip both ends of the sequence.

SEE ALSO

See section ExperimentFile(4). See section scale_trace_clip.


This page is maintained by James Bonfield. Last generated on 2 February 1999.
URL: http://www.mrc-lmb.cam.ac.uk/pubseq/manual/manpages_10.html