first previous next last contents

Sequence Information.

Information relating to the base interpretation of the trace is stored at byte offset from the start of the file. Stored for each base are: its character representation and a number (an index into the Samples data structure) indicating its position within the trace. The relative probabilities of each of the 4 bases occurring at the point where the base is called can be stored in prob_A , prob_C , prob_G and prob_T . The Amersham FilmReader uses Bases.spare[0] to store the confidence level of the base. The value ranges from 0 (low confidence) to 8 (high), and 9 indicating the base has been manually edited.

From version 3.00 these items are stored in the following order: all "peak indexes", i.e. the positions in the sample points to which the bases corresponds; all the accuracy estimates for base type A, all for C,G and T; the called bases; this is followed by 3 sets of empty int1 data items. These values are read into the following data structure by the routines in the io library.

/*
 * Type definition for the sequence data
 */
typedef struct {
    uint_4 peak_index;        /* Index into Samples matrix for base posn */
    uint_1 prob_A;            /* Probability of it being an A */
    uint_1 prob_C;            /* Probability of it being an C */
    uint_1 prob_G;            /* Probability of it being an G */
    uint_1 prob_T;            /* Probability of it being an T */
    char   base;              /* Called base character        */
    uint_1 spare[3];          /* Spare */
} Base;

first previous next last contents
This page is maintained by James Bonfield. Last generated on 2 Febuary 1999.
URL: http://www.mrc-lmb.cam.ac.uk/pubseq/manual/formats_5.html