
Use of the "hidden" poor quality data

In general, sequences obtained from machines contain segments, such as vector sequence and poor quality data, that need to be either removed or ignored during assembly and editing. In our package we do not remove such segments; instead we mark them so that the programs can deal with them appropriately. In gap such data is referred to as "hidden". The positions to hide are determined initially by preprocessing programs such as vector_clip (see section Screening Against Vector Sequences), trace_clip (see section trace_clip) and clip (see section Clip).
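As an illustration of marking rather than removing, the following is a minimal Python sketch. It is not gap code: the Reading class and its left_cutoff and right_cutoff fields are hypothetical names chosen for this example. The full sequence is kept, and the cutoffs only record which ends are hidden.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """A reading that keeps its full sequence; cutoffs mark the hidden ends.

    All names here are hypothetical and chosen only for illustration.
    """
    name: str
    bases: str
    left_cutoff: int   # index of the first used base
    right_cutoff: int  # index one past the last used base

    def used(self) -> str:
        # The part of the reading that assembly and editing would see.
        return self.bases[self.left_cutoff:self.right_cutoff]

    def hidden(self) -> tuple[str, str]:
        # The hidden ends are still stored and can be revealed later.
        return self.bases[:self.left_cutoff], self.bases[self.right_cutoff:]


reading = Reading("example", "NNACGTACGTNN", left_cutoff=2, right_cutoff=10)
print(reading.used())    # ACGTACGT
print(reading.hidden())  # ('NN', 'NN')
```

Because nothing is deleted, moving a cutoff later only means changing an integer; the underlying bases are untouched.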

The hidden data can be revealed in the Contig Editor by toggling the "Reveal Cutoffs" button (see section Adjusting the Cutoff data), can be used to search for possible joins between contigs (see section Find Internal Joins), and can be included in the consensus sequence (see section Extended consensus) for use by external screening programs. In these cases the program can distinguish data that is hidden because it is vector from data that is hidden because it is of poor quality: only the poor quality data is included.
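The distinction between the two kinds of hidden data can be sketched as follows. This is a hypothetical Python illustration, not the way gap stores the information: the HiddenReason enum and the usable_hidden function are names invented for this example.

```python
from enum import Enum


class HiddenReason(Enum):
    """Why a segment is hidden (hypothetical labels for illustration)."""
    VECTOR = "vector"
    POOR_QUALITY = "poor quality"


def usable_hidden(segments):
    """Return only the hidden segments that were hidden for poor quality.

    Vector segments are dropped, mirroring the rule that only poor
    quality data is included.
    """
    return [seq for seq, reason in segments
            if reason is HiddenReason.POOR_QUALITY]


hidden_segments = [
    ("GGATCC", HiddenReason.VECTOR),
    ("ACGNNT", HiddenReason.POOR_QUALITY),
]
print(usable_hidden(hidden_segments))  # ['ACGNNT']
```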

The position of hidden data can be changed interactively in the Contig Editor. In addition, the Double Strand function (see section Double stranding) will reduce the amount of hidden data for readings that cover single-stranded regions of contigs, provided the data aligns well with that on the other strand.


This page is maintained by James Bonfield. Last generated on 2 February 1999.
URL: http://www.mrc-lmb.cam.ac.uk/pubseq/manual/gap4_7.html