TERMINATOR

Table of Contents
FUNCTION
DESCRIPTION
ACKNOWLEDGMENT
EXAMPLE
OUTPUT
INPUT FILES
RELATED PROGRAMS
RESTRICTIONS
ALGORITHM
CONSIDERATIONS
COMMAND-LINE SUMMARY
LOCAL DATA FILES
PARAMETER REFERENCE

FUNCTION

Terminator searches for prokaryotic factor-independent RNA polymerase terminators according to the method of Brendel and Trifonov.

DESCRIPTION

Terminator uses a table of dinucleotide frequencies at each position, derived from a set of known terminators, to find places in a new sequence where terminator-like sequences occur. For each alignment of the table over the sequence, the measurement is the sum of the table values for each dinucleotide in the sequence. Terminator reports every discrete position in the searched sequence where this measurement falls above a user-defined threshold value. The method can also restrict the set of terminator-like sequences shown to those that fall above a second threshold for the presence of a GC-rich dyad symmetry near the poly-U region.
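
The scoring described above can be sketched in Python as a sliding sum over overlapping dinucleotides. This is a minimal illustration, not the program's code: the table values, the four-position table size, and the names TOY_TABLE, window_score, and scan are all invented for the example (the real values are in pmatrix.dat), and the inclusive threshold comparison is an assumption.

```python
from typing import Dict, List, Tuple

# Hypothetical toy table: one dict of dinucleotide values per position.
# These numbers are made up; the real table is read from pmatrix.dat.
TOY_TABLE: List[Dict[str, float]] = [
    {"GC": 1.0, "CG": 0.5},    # position 0
    {"CT": 0.75, "GT": 0.25},  # position 1
    {"TT": 1.5},               # position 2
    {"TT": 1.5, "TA": 0.25},   # position 3
]


def window_score(window: str, table: List[Dict[str, float]]) -> float:
    """Sum the table value of each overlapping dinucleotide in the window."""
    return sum(table[i].get(window[i:i + 2], 0.0) for i in range(len(table)))


def scan(sequence: str, table: List[Dict[str, float]],
         threshold: float) -> List[Tuple[int, float]]:
    """Return (1-based start, score) for every window reaching the threshold."""
    seq = sequence.upper()
    width = len(table) + 1  # n dinucleotide positions span n + 1 bases
    hits = []
    for start in range(len(seq) - width + 1):
        score = window_score(seq[start:start + width], table)
        if score >= threshold:  # assumption: comparison is inclusive
            hits.append((start + 1, score))
    return hits


print(scan("AAGCTTTAA", TOY_TABLE, 3.0))
```

With this toy table only the window GCTTT, starting at base 3, reaches the threshold of 3.0. The windows in Terminator's own output are far longer (positions -40 through +10 in the report header), so a real table has correspondingly more positions.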

The method used by Terminator is described in detail in two papers: Brendel, V. and Trifonov, E. N., Nucl. Acids Res. 12 4411-4427 (1984) and Brendel, V. and Trifonov, E. N. in CODATA Conference Proceedings, Jerusalem, 1984. Any use of Terminator that results in publication should cite these papers.

ACKNOWLEDGMENT

The Terminator program was written by Volker Brendel and was adapted to run with the Wisconsin Package(TM) by Greg Hamm.

EXAMPLE

Here is a session using Terminator to search for terminator-like sequences in synpbr322:


% terminator

  TERMINATOR search of what sequence ?  GenBank:SynpBR322

                 Begin (* 1 *) ?
               End (*  4361 *) ?
              Reverse (* No *) ?

  Primary structure threshold value (* 3.50 *) ?

  Secondary structure threshold value (* 0 *) ?

  What should I call the output file (* synpbr322.trm *) ?

  Searching . . .

%

OUTPUT

Here is the output file:


 TERMINATOR search on: synpbr322  check: 5483  from: 1  to: 4361

J01749 Cloning vector pBR322, complete genome. 6/96
LOCUS       SYNPBR322    4361 bp    DNA   circular  SYN       07-JUN-1996
DEFINITION  Cloning vector pBR322, complete genome.
ACCESSION   J01749 K00005 L08654 M10282 M10283 M10286 M10356 M10784 M10785
            M10786 M33694 V01119
NID         g208958 . . .

 Primary structure threshold: 3.50  Secondary structure threshold: 0

                                   October 6, 1998 15:02  ..

           -40  -35  -30  -25  -20  -15  -10   -5  -1+  +5         p      s
             .    .    .    .    .    .    .    .   ..   .
     921=> CCATTATCGCCGGCATGGCGGCCGACGCGCTGGGCTACGTCTTGCTGGCGT   3.80      0
    1398=> CATCTCCAGCAGCCGCACGCGGCGCATCTCGGGCAGCGTTGGGTCCTGGCC   3.62      0
    1573=> TCTGCGACCTGAGCAACAACATGAATGGTCTTCGGTTTCCGTGTTTCGTAA   3.62      0
    1583=> GAGCAACAACATGAATGGTCTTCGGTTTCCGTGTTTCGTAAAGTCTGGAAA   4.32      0
    1881=> CATGAACAGAAATCCCCCTTACACGGAGGCATCAGTGACCAAACAGGAAAA   3.57     16
                                --- --    -- ---
                        -- --      -- --
    1914=> AGTGACCAAACAGGAAAAAACCGCCCTTAACATGGCCCGCTTTATCAGAAG   4.47      0
    2320=> GATGCGTAAGGAGAAAATACCGCATCAGGCGCTCTTCCGCTTCCTCGCTCA   3.73     48
                          -- - ---     --- - --
    2492=> GCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAG   4.35      0
    2497=> AGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCC   3.95      0
    3039=> TGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAG   6.92     95
                        -----------   -----------
                          ----    ----
             .    .    .    .    .    .    .    .   ..   .
    3101=> GCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTC   4.18     68
                      -------  - --   -- -  -------
                     ---------     ---------
    3199=> GATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTT   4.62     19
                    ---------      ---------
    3502=> GTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCCGGG   3.59      0
    4226=> TATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGT   4.49      0
    4311=> ACATTAACCTATAAAAATAGGCGTATCACGAGGCCCTTTCGTCTTCAAGAA   3.69      0
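
For downstream processing, the hit lines in a report like the one above can be pulled out with a short Python script. This is a sketch that assumes every hit line has the layout shown here (position, "=>", the sequence window, the p value, the s value); the names Hit, HIT_LINE, and parse_hits are invented for the example and are not part of the Wisconsin Package.

```python
import re
from typing import List, NamedTuple


class Hit(NamedTuple):
    position: int  # coordinate printed before "=>"
    window: str    # sequence window shown on the hit line
    p: float       # primary structure value
    s: int         # secondary structure value


# Assumed layout: "<position>=> <SEQUENCE>   <p>   <s>"
HIT_LINE = re.compile(r"^\s*(\d+)=>\s+([A-Za-z]+)\s+(\d+\.\d+)\s+(\d+)\s*$")


def parse_hits(report: str) -> List[Hit]:
    """Collect hit lines; rulers, dyad markers, and headers are skipped."""
    hits = []
    for line in report.splitlines():
        m = HIT_LINE.match(line)
        if m:
            hits.append(Hit(int(m.group(1)), m.group(2),
                            float(m.group(3)), int(m.group(4))))
    return hits
```

Calling parse_hits on the text of synpbr322.trm would yield one Hit per "=>" line. The dashed lines that mark the dyad symmetry are ignored by this sketch.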

INPUT FILES

Terminator takes a single nucleic acid sequence as input. If Terminator rejects your nucleotide sequence, see Appendix VI for information on how to change or set the type of a sequence.

RELATED PROGRAMS

None

RESTRICTIONS

The pattern recognition method used by Terminator applies only to the search for prokaryotic factor-independent terminators. As noted in the ACKNOWLEDGMENT section, Terminator is not a native GCG program; it was written by Volker Brendel and adapted to run with the Wisconsin Package by Greg Hamm. Its behavior is not completely known, and we do not assert that it follows all GCG conventions. We are very grateful to Drs. Brendel and Trifonov for generously allowing GCG to distribute their program.

ALGORITHM

The algorithm is described clearly in the CODATA paper.

CONSIDERATIONS

The default primary structure threshold is chosen so that, on primary structure alone, about 95 percent of known factor-independent prokaryotic terminators should appear among the terminator-like sequences that Terminator reports.

The program predicts terminators in those parts of the sequence composed entirely of lower- and uppercase G, A, T, and C. Parts of the sequence containing other sequence symbols are given a primary structure value of 0.0 and a secondary structure value of 0.
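
The rule above can be pictured with a small Python check. This is only a sketch: the names VALID, is_scorable, and masked_scores are invented, and applying the rule window by window is an assumption about granularity that the text does not spell out.

```python
from typing import Tuple

# Symbols the text says Terminator will score, in either case.
VALID = frozenset("GATCgatc")


def is_scorable(window: str) -> bool:
    """True when the window is non-empty and made only of G, A, T, and C."""
    return bool(window) and all(ch in VALID for ch in window)


def masked_scores(window: str, p: float, s: int) -> Tuple[float, int]:
    """Keep (p, s) for a scorable window; otherwise report (0.0, 0)."""
    return (p, s) if is_scorable(window) else (0.0, 0)
```

Under this reading, a window such as GATNC would receive a primary structure value of 0.0 and a secondary structure value of 0 regardless of its other bases.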

COMMAND-LINE SUMMARY

All parameters for this program may be added to the command line. Use -CHEck to view the summary below and to specify parameters before the program executes. In the summary below, the capitalized letters in the parameter names are the letters that you must type in order to use the parameter. Square brackets ([ and ]) enclose parameter values that are optional. For more information, see "Using Program Parameters" in Chapter 3, Using Programs in the User's Guide.


Minimal Syntax: % terminator [-INfile=]GenBank:SynpBR322 -Default

Prompted Parameters:

-BEGin=1 -END=4361        sets the range of interest
-REVerse                  uses the reverse strand
-PTHRESHold=3.50          sets the primary structure threshold value
-STHRESHold=0             sets the secondary structure threshold value
[-OUTfile=]synpbr322.trm  names the output file

Local Data Files:

-DATa1=pmatrix.dat       contains the normalized dinucleotide fractions
-DATa2=smatrix.dat       contains the significant GC-rich dyad diagonals

Optional Parameters: None
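
The capitalization rule for parameter names can be illustrated with a short Python sketch. The functions minimal_form and matches are invented for this example. The sketch assumes that typed names are case-insensitive and that any longer prefix of the full name is also accepted, neither of which is stated in this section. The numbered -DATa1 and -DATa2 names are left out because the rule for their trailing digit is not described here.

```python
def minimal_form(spec: str) -> str:
    """Return the required leading part of a spec such as '-PTHRESHold'."""
    required = ""
    for ch in spec.lstrip("-"):
        if ch.islower():  # the first lowercase letter ends the required part
            break
        required += ch
    return "-" + required


def matches(spec: str, typed: str) -> bool:
    """Check a typed name against a spec (assumed case-insensitive)."""
    full = spec.lstrip("-").upper()
    need = minimal_form(spec).lstrip("-")
    got = typed.lstrip("-").upper()
    # Must include every required letter and stay a prefix of the full name.
    return got.startswith(need) and full.startswith(got)


for spec in ("-BEGin", "-REVerse", "-PTHRESHold", "-STHRESHold"):
    print(spec, "->", minimal_form(spec))
```

For example, -PTHRESHold reduces to -PTHRESH, so under these assumptions -PTHRESH=4.0 and -PTHRESHOLD=4.0 would both be recognized while -PTH=4.0 would not.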

LOCAL DATA FILES

The files described below supply auxiliary data to this program. The program automatically reads them from a public data directory unless you either 1) have a data file with exactly the same name in your current working directory; or 2) name a file on the command line with an expression like -DATa1=myfile.dat. For more information see Chapter 4, Using Data Files in the User's Guide.

The file pmatrix.dat is taken from Figure 3 of the CODATA paper. It is similar to Figure 3 of the Nucl. Acids Res. paper. It contains the normalized fractions of each dinucleotide observed in the set thought to determine terminator structure. The file smatrix.dat is from Figure 2 of the CODATA paper. It contains the significant diagonals for the GC-rich dyad symmetry. Both pmatrix.dat and smatrix.dat must be provided to Terminator as local data files.

PARAMETER REFERENCE

You can set the parameters listed below from the command line. For more information, see "Using Program Parameters" in Chapter 3, Using Programs in the User's Guide.

-PTHRESHold=3.50

sets the primary structure threshold value for display of sequence ranges. The default value is set to find 95 percent of known, factor-independent, prokaryotic terminators.

-STHRESHold=0

sets the secondary structure threshold value. This secondary structure is GC-rich dyad symmetry near poly-U regions of a sequence.
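
The combined effect of the two thresholds can be sketched in Python by re-filtering hits that have already been reported. The function refilter and the tuple layout are invented for this example. The inclusive comparison is an assumption, although it is consistent with the example output, where hits with s = 0 are listed under a secondary structure threshold of 0. The four hits are copied from the example output for synpbr322.

```python
from typing import List, Tuple

Hit = Tuple[int, float, int]  # (position, p, s)

# Four hits copied from the example output for synpbr322.
HITS: List[Hit] = [
    (921, 3.80, 0),
    (2320, 3.73, 48),
    (3039, 6.92, 95),
    (3101, 4.18, 68),
]


def refilter(hits: List[Hit], pthreshold: float = 3.50,
             sthreshold: int = 0) -> List[Hit]:
    """Keep hits meeting both thresholds (assumed inclusive)."""
    return [h for h in hits if h[1] >= pthreshold and h[2] >= sthreshold]


print(refilter(HITS, pthreshold=4.0, sthreshold=50))
```

With a primary structure threshold of 4.0 and a secondary structure threshold of 50, only the hits at 3039 and 3101 remain. Filtering after the fact can only tighten a search; hits below the thresholds used in the original run are not in the output file and cannot be recovered this way.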

Printed: December 9, 1998 16:26 (1162)


Documentation Comments: doc-comments@gcg.com
Technical Support: help@gcg.com

Copyright (c) 1982-2001 Genetics Computer Group, Inc. A subsidiary of Pharmacopeia, Inc. All rights reserved.

Licenses and Trademarks Wisconsin Package is a trademark of Genetics Computer Group, Inc. GCG and the GCG logo are registered trademarks of Genetics Computer Group, Inc.

All other product names mentioned in this documentation may be trademarks, and if so, are trademarks or registered trademarks of their respective holders and are used in this documentation for identification purposes only.

Genetics Computer Group

www.gcg.com