Trace_diff compares individual sequence traces against a known wild
type or reference trace to detect mutations. The reference trace
should therefore have the correct sequence and preferably be as
`typical' as possible. Note that this does not imply the wild type
trace should be perfect. If the mutant sequences have a sequence
dependent problem (such as a compression), then in order for
trace_diff to work reliably the wild type trace should also have the
same problem.
The reference trace can either be that of a single reading (perhaps run as a standard on the same gel as the potential mutant DNA) or a "consensus" trace produced by combining several traces. To produce a consensus trace you need to select the readings you wish to combine and then assemble them into a gap4 database. Then from within the gap4 contig editor (see section Save Consensus Trace) the traces can be combined and written out to a new SCF file ready for use in trace_diff. For work in our laboratory we have selected the readings to use by assembling a batch into gap4 databases and then examining their traces. We found that it was best not to use the very highest quality readings but to combine several that were of slightly lower quality.
As explained below, one way to produce a consensus trace is first to use pregap to sort a set of wild type readings by quality, and then to assemble them into gap4. Do not be put off by the UNIX commands!
Firstly, add "do_sort=Yes" to your `.pregaprc' file and run pregap once
without mutation detection. Assemble the results into Gap4 using the
normal shotgun assembly algorithm (see section Normal shotgun assembly).
The sequences have been sorted by pregap into quality order with the
best sequences first. If you wish to produce a consensus trace from only
the better quality sequences, first use the Unix head command on the
pregap `passed' file (for example,
"head -10 fofn.passed > fofn10.passed"). As it is known that the
sequences will differ, but should also align at or near the same start
position, a relatively high mismatch figure can be used.
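The selection step can be wrapped in a small shell function, shown here only as a sketch. The name `top_reads' is hypothetical and is not part of pregap or gap4. It assumes the input file lists one reading per line, best first, as described above. It uses "head -n N", the POSIX spelling of the older "head -N" form.

```shell
# top_reads N INFILE OUTFILE
# Hypothetical helper (not part of pregap or gap4): copy the first N
# lines of a quality-sorted file of filenames into a new file.
# Assumes INFILE holds one reading name per line, best quality first.
top_reads() {
    count="$1"
    infile="$2"
    outfile="$3"
    # "head -n N" is the POSIX form of the older "head -N".
    head -n "$count" "$infile" > "$outfile"
}
```

For instance, "top_reads 10 fofn.passed fofn10.passed" would write the first ten entries of `fofn.passed' to `fofn10.passed'.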
Once assembled into gap4, within the contig editor use the `Save Consensus Trace' option from the Commands menu to create a `cons.scf' file (see section Save Consensus Trace). The command brings up a dialogue containing controls to specify the filename, the consensus start and end positions, the strand, and whether to use matching reads.
As the trace of a reading is dependent on the direction it was read, the
consensus trace can be computed from all the reads in either the forward or
reverse directions, but not both at once. When the "Use only matching reads"
toggle is set to "Yes" only the readings of the correct strand that have the
same base call as the consensus sequence are used.
The file `cons.scf' can then be used as a reference trace. Once it has
been written, the Gap4 database is no longer needed.