
Producing a reference trace for trace_diff

Trace_diff compares individual sequence traces against a known wild type or reference trace to detect mutations. The reference trace should therefore have the correct sequence and preferably be as `typical' as possible. Note that this does not imply the wild type trace should be perfect. If the mutant sequences have a sequence-dependent problem (such as a compression), then for trace_diff to work reliably the wild type trace should have the same problem.

The reference trace can either be that of a single reading (perhaps run as a standard on the same gel as the potential mutant DNA) or a "consensus" trace produced by combining several traces. To produce a consensus trace you need to select the readings you wish to combine and then assemble them into a gap4 database. Then, from within the gap4 contig editor (see section Save Consensus Trace), the traces can be combined and written out to a new SCF file ready for use in trace_diff. For work in our laboratory we have selected the readings to use by assembling a batch into gap4 databases and then examining their traces. We found that it was best not to use the very highest quality readings but to combine several that were of slightly lower quality.

As is explained below, one way to produce a consensus trace is first to use pregap to sort a set of wild type readings by quality, and then to assemble them into gap4. Do not be put off by the UNIX commands!

Firstly, add "do_sort=Yes" to your `.pregaprc' file and run pregap once without mutation detection. Pregap sorts the sequences into quality order, with the best sequences first. If you wish to produce a consensus trace from only the better quality sequences, run the Unix head command on the pregap `passed' file before assembly (for example, "head -10 fofn.passed > fofn10.passed"). Then assemble the results into Gap4 using the normal shotgun assembly algorithm (see section Normal shotgun assembly). As it is known that the sequences will differ, but should also align at or near the same start position, a relatively high mismatch figure can be used.
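The selection step can be wrapped in a small shell function so that the number of readings is easy to vary. This is only a sketch: the function name top_readings and the output file names are hypothetical, and it assumes the pregap `passed' file lists one reading per line with the best quality first, as described above. It uses "head -n N", the POSIX spelling of the "head -N" form.

```shell
# Hypothetical helper (not part of pregap or gap4): keep only the first N
# entries of a pregap 'passed' file of filenames.
# Assumption: the file has one reading name per line, best quality first.
top_readings() {
    n="$1"        # number of readings to keep, e.g. 10
    passed="$2"   # sorted file of filenames, e.g. fofn.passed
    out="$3"      # output file of filenames, e.g. fofn10.passed

    # Refuse to continue if the input file is missing.
    if [ ! -f "$passed" ]; then
        echo "top_readings: cannot find $passed" >&2
        return 1
    fi

    # Copy the first N lines; the order of the readings is preserved.
    head -n "$n" "$passed" > "$out"
}
```

For example, "top_readings 10 fofn.passed fofn10.passed" has the same effect as the head command given above, and the resulting file can then be used for the assembly.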

Once assembled into gap4, within the contig editor use the `Save Consensus Trace' option from the Commands menu to create a `cons.scf' file (see section Save Consensus Trace). The command brings up a dialogue containing controls to specify the filename, the consensus start and end positions, the strand, and whether to use matching reads.

As the trace of a reading depends on the direction in which it was read, the consensus trace can be computed from all the reads in either the forward or the reverse direction, but not both at once. When the "Use only matching reads" toggle is set to "Yes", only the readings of the correct strand that have the same base call as the consensus sequence are used. The file `cons.scf' can then be used as a reference trace. The Gap4 database is then no longer needed.


This page is maintained by James Bonfield. Last generated on 2 February 1999.
URL: http://www.mrc-lmb.cam.ac.uk/pubseq/manual/mutations_3.html