Glossary

Accession Number
A unique, identifying number consisting of a letter followed by five digits (for example M13786) that is assigned to each entry in a database. Using accession numbers is the best method for specifying database entries in the Wisconsin Package.

When a sequence is first entered into EMBL, GenBank, or SWISS-PROT, it is assigned a unique primary accession number. If that sequence is ever merged with another sequence, the accession number of the original sequence becomes a secondary accession number in the merged sequence.

For more information, see the release notes for the individual databases or see "Specifying Database Sequences by Accession Number" in the "Using Database Sequences" section of Chapter 2, Using Sequences.
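
As a rough illustration of the format described above, the following Python sketch checks whether a string has the letter-plus-five-digits form. The function name is hypothetical, and the pattern covers only the layout given in this entry; individual databases may use other layouts.

```python
import re

# Pattern taken from the glossary text: one letter followed by five digits.
# Other accession layouts may exist; this covers only the form described here.
ACCESSION_RE = re.compile(r"[A-Za-z][0-9]{5}")


def looks_like_accession(text):
    """Return True if text has the letter-plus-five-digits form (e.g. M13786)."""
    return ACCESSION_RE.fullmatch(text) is not None


print(looks_like_accession("M13786"))  # True
print(looks_like_accession("M1378"))   # False: only four digits
```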

Alias
A feature of UNIX shells that enables users to define abbreviations for commands and program names (with their parameters). When an alias is used in a command line, the shell substitutes the alias definition for the abbreviation (the alias). For example, the following command creates an alias called ls: % alias ls 'ls -l'. Whenever you use the ls command, the shell interprets the command as ls -l. That is, instead of just listing the files in a directory, it lists the files with additional attributes such as owner, date, and protections. For more information, see "Using Aliases" in the "For Advanced Users" section of Chapter 3, Using Programs.
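
The substitution step can be pictured with a short Python sketch. This is only a model of the idea, not how any particular shell is implemented; the alias table and function name are hypothetical, and the single entry mirrors the ls example in this entry.

```python
import shlex

# Hypothetical alias table; the single entry mirrors the example in the text.
ALIASES = {"ls": "ls -l"}


def expand_alias(command_line, aliases=ALIASES):
    """Replace the first word of a command line with its alias definition."""
    words = shlex.split(command_line)
    if words and words[0] in aliases:
        # Substitute once only, so an alias that names itself does not loop.
        words = shlex.split(aliases[words[0]]) + words[1:]
    return " ".join(words)


print(expand_alias("ls *.seq"))  # ls -l *.seq
```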

At and Batch
UNIX operating system commands which enable users to submit scripts to the computer for running Wisconsin Package programs and UNIX commands at a later time. These commands are useful for submitting programs to run at off-peak hours.

The at queue is also known as the batch queue or script-execution queue. See -Batch.

-Batch
A command-line parameter available for some programs in the Wisconsin Package that enables you to run a program as a separate process in a batch queue. The program job may execute immediately as a separate process or may wait in a batch queue to execute at a later time. After you submit a job to batch, your terminal is free for other work. You can also log off the computer, and your job will still execute. For more information, see "Using the Batch Queue" in Chapter 3, Using Programs.

Checksum
An integer value given to each sequence file created with Wisconsin Package editors or programs. Programs check this value each time a sequence file is used as input. It is a check against the corruption of the sequence data. For more information, see Chapter 2, Using Sequences.
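
This entry does not spell out how the value is calculated. The Python sketch below assumes the position-weighted scheme commonly described for GCG-format sequence files (weights cycling from 1 to 57, with the sum taken modulo 10000); treat both the scheme and the function name as assumptions rather than a statement of what the Package does.

```python
def gcg_style_checksum(sequence):
    """Position-weighted checksum: weights cycle 1..57, result is taken mod 10000."""
    total = 0
    for position, symbol in enumerate(sequence):
        weight = position % 57 + 1
        total += weight * ord(symbol.upper())
    return total % 10000


stored = gcg_style_checksum("ACGT")
print(stored)                                # 748
print(gcg_style_checksum("ACGA") == stored)  # False: one changed symbol is detected
```

Comparing a stored value against a freshly computed one is how a check of this kind would flag corrupted sequence data.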

Command-Line Control
A feature of the computer operating system that enables you to provide more information on the command line than just the name of the program or command. For example, in the Wisconsin Package, you can include the input filename, the output filename, and program parameters on the command line. For more information, see "Using Command-Line Control" in Chapter 3, Using Programs.

Database
Repositories of sequence information established and maintained by the scientific community. Each database entry consists of a reference section and the sequence itself. Databases must be formatted specially for use with the Wisconsin Package.

A number of databases are available which allow you to search and retrieve sequence information, including GenBank, EMBL, SWISS-PROT, and PIR. For more information, see "Using Database Sequences" in Chapter 2, Using Sequences.

The Wisconsin Package also allows you to create your own personal databases with the DataSet program. For more information, see "Using Personal Databases" in the "For Advanced Users" section of Chapter 2, Using Sequences.

Data Files
Files containing information essential for running Wisconsin Package programs. For example, the Map program requires the data file Enzyme.Dat, which contains information about recognition sites of restriction enzymes.

Data files can be either default or local. Default data files are those files programs use unless you specify otherwise. That is, you are never required to supply a data file because a default data file is always available. Local data files are those data files you specify a program should use instead of the default.

GCG provides the default data files in the public directory with the logical name genrundata. In addition, GCG provides a number of alternative data files, which you can direct a program to use, in the directory with the logical name genmoredata. For more information, see Chapter 4, Using Data Files.

Directory
A unit of organization for storing information on a computer. Within a directory, you can store subdirectories and files. Directories and subdirectories are analogous to drawers in a filing cabinet.

The top directory, or home directory, refers to the directory you log into. The current directory, or working directory, refers to the directory you are working in presently. For more information, see "Working with Directories" in Chapter 1, Getting Started.

File
A basic unit of storage on a computer--for example sequence information, the output of a program, or a memo to other individuals in your lab. Most Wisconsin Package programs require one or more files as input and produce an output file of results. For more information, see "Working with Files" in Chapter 1, Getting Started.

File of Sequence Names
Replaced by the term list file. See List File.

FOSN
File of Sequence Names. This term has been replaced by the term list file. See List File.

GenMoreData
The GCG logical name for a public directory of the Wisconsin Package containing alternate data files that can be used by programs. This directory also stores other files which contain useful but auxiliary information. You can copy files from this directory into your own directory with the Fetch command. For more information, see Chapter 4, Using Data Files.

GenRunData
The GCG logical name for a public directory of the Wisconsin Package containing default data files used by programs. You can copy files in this directory into your own directory with the Fetch command. For more information, see Chapter 4, Using Data Files.

Global Parameters
Optional command-line parameters that are available to all Wisconsin Package programs that support command-line control. (To avoid repetition, these parameters are not displayed in the "Command-Line Summary" in online help or in the Program Manual.) For more information, see "Using Global Parameters" in the "Customizing Program Analyses" section of Chapter 3, Using Programs.

Global Switch
Commands you can issue after initializing the Wisconsin Package that affect every program run during that session. For example, the command nodoc suppresses the short banner that introduces each Package program. Global switches also are used to initialize graphics for a GCG session. For more information, see "Using Global Switches" in the "Customizing Program Analyses" section of Chapter 3, Using Programs.

Graphics Configuration
Specifies how graphics output from Wisconsin Package programs will display during a session. The configuration consists of the graphics language you want to use, the type of supported graphics device you want to display on (printer, plotter, or terminal screen), and the port or queue to which the device is attached. For more information, see "Initializing Your Graphics Configuration" in Chapter 5, Using Graphics. For more information on supported graphics languages and devices, see Appendix C, Graphics.

List File
A text file consisting of sequence specifications (one per line) which can include database sequences, single sequences in your own directories, multiple sequence specifications using wildcards, other list files, and/or MSF files. The items in a list must be preceded by a line ending with two periods (..).

Some Wisconsin Package programs generate list files as output (for example StringSearch). Others (for example PileUp) accept list files as input if you precede the filename with an at symbol (@), for example @hsp70.list. For more information, see "Using List Files" in Chapter 2, Using Sequences.
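
The layout described in this entry can be sketched in Python as follows. The reader below skips everything up to and including the line that ends with two periods and then collects one specification per line. The file contents, entry names, and the use of @ for a nested list file are illustrative assumptions, and the function name is hypothetical.

```python
def read_list_file(text):
    """Return the sequence specifications that follow the '..' divider line."""
    specs = []
    seen_divider = False
    for line in text.splitlines():
        stripped = line.strip()
        if not seen_divider:
            # Everything up to and including the line ending in '..' is header text.
            seen_divider = stripped.endswith("..")
            continue
        if stripped:
            specs.append(stripped)
    return specs


# Hypothetical list file contents; the entry names are illustrative only.
example = """Heat-shock sequences for alignment ..

GenEMBL:M13786
hsp70.seq
@more_hsp.list
"""
print(read_list_file(example))
# ['GenEMBL:M13786', 'hsp70.seq', '@more_hsp.list']
```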

Local Data File
See Data Files.

Logical Name
A shorthand name you can use for directories, databases, and filenames which reduces typing and is often easier to remember than full specifications. You can create your own GCG logical names using the program Name. See "Defining and Using Logical Names for Directories" in Chapter 1, Getting Started.

Login
An approved account on a computer that must be created by a system manager before you can log in and use the system. The login consists of a user name and (almost always) a password, the latter known only to the user. For more information, see "Logging On" in Chapter 1, Getting Started or see your system manager.

Metacharacter
A character that is interpreted by UNIX shells and Wisconsin Package programs in a defined manner. The most common examples of metacharacters are the wildcards * and ?. The * wildcard metacharacter is interpreted to mean "any string of characters, including none" and the ? wildcard metacharacter is interpreted to mean "any one character." For example, in the command % ls *.seq, the *.seq is interpreted as any filename ending with the extension ".seq". In the example % ls hsp?.seq, the hsp?.seq is interpreted as the name of any file beginning with "hsp", followed by exactly one character, and ending with ".seq".

You can use wildcard metacharacters to specify databases or divisions of databases within the Wisconsin Package. For example, GenEMBL:* specifies all of the entries in the GenEMBL database; Ba:* specifies all of the bacterial entries in GenEMBL; and Sw:hsp* specifies all of the sequences in SWISS-PROT that begin with "hsp".

For more information, see "UNIX Metacharacter Differences" in Appendix D, Command and Keystroke Differences Between OpenVMS and UNIX.

Metafile
A device-independent graphics file created by including -FIGure=filename on the command line when you run a Wisconsin Package graphics program. You then can use the Figure program to print, plot or display the file. For example, if you have configured your graphics for Tektronix emulation of a tek4014 terminal, the Figure program translates the metafile to Tektronix language and displays the information on the Tektronix terminal screen. If you change your graphics configuration to PostScript for a LaserWriter printer and rerun the Figure program, the metafile is translated to PostScript and prints on the LaserWriter. For more information, see "Saving Graphic Output to a File" in Chapter 5, Using Graphics.

MSF File
Files containing two or more sequences aligned together. For more information, see "Using Multiple Sequence Format (MSF) Files" in Chapter 2, Using Sequences.

Multiple Sequence Format File
See MSF File.

MyData
MyData is a logical name for a directory that you can use to store local data files. Because programs automatically search for the logical name MyData, you need not worry about what directory you are in when you specify the local data file; the program automatically finds the MyData directory. For more information, see Chapter 4, Using Data Files.

Parameter
Modifies the action of a UNIX or GCG command. Some parameters have values, which modify the parameter (for example -BEGin=100), but not all do (for example -BATch). See also unqualified parameter. For more information, see "Using Program Parameters" in Chapter 3, Using Programs.

Platen
The area used to print, plot, or display GCG graphics. The GCG platen is defined on every supported graphics device as 100 vertical units (Y) by 150 horizontal units (X). For more information, see "Using Graphic Parameters" in Chapter 5, Using Graphics.

Platen Unit
The units used to define the platen. There are 100 platen units vertically and 150 platen units horizontally in the GCG platen. For more information, see "Using Graphic Parameters" in Chapter 5, Using Graphics.
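
To make the 150 by 100 unit grid concrete, the Python sketch below scales a platen coordinate onto a device of a given size. The linear, independent scaling of each axis is an assumption made for illustration, as are the function name and the device dimensions; this entry does not say how the Package maps platen units to a device.

```python
PLATEN_WIDTH = 150   # horizontal platen units (X), from the glossary
PLATEN_HEIGHT = 100  # vertical platen units (Y), from the glossary


def platen_to_device(x, y, device_width, device_height):
    """Scale a platen coordinate linearly onto a device of the given size."""
    return (x / PLATEN_WIDTH * device_width, y / PLATEN_HEIGHT * device_height)


# Hypothetical 1500 x 1000 pixel display: the platen centre lands mid-screen.
print(platen_to_device(75, 50, 1500, 1000))  # (750.0, 500.0)
```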

Port
A port is a connection through which a separate device (such as a printer, plotter, or graphics terminal) may communicate with the computer. For more information, see "Connecting a Graphics Device to the Computer" in Appendix C, Graphics.

Primary Accession Number
The first number that appears in the accession number in a database entry. Using accession numbers is the best way to specify a sequence entry. See Accession Number.

Public Data File
A data file that resides in a public directory of the Wisconsin Package. See Data Files.

Qualifier
The first unit of a parameter, which modifies the action of a UNIX or GCG command. That is, a parameter can be defined as -qualifier=value. All qualifiers are preceded by a dash (-). Some qualifiers have values (for example -BEGin=100), but not all do (for example -BATch). For more information, see "Using Program Qualifiers" in Chapter 3, Using Programs.
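
The -qualifier=value form can be taken apart with a few lines of Python. This is a minimal sketch with a hypothetical function name; it only splits the text and does not model how Wisconsin Package programs interpret or abbreviate qualifiers.

```python
def parse_parameter(parameter):
    """Split '-qualifier=value' into (qualifier, value); value is None if absent."""
    if not parameter.startswith("-"):
        raise ValueError("a qualifier must be preceded by a dash")
    qualifier, separator, value = parameter[1:].partition("=")
    return (qualifier, value if separator else None)


print(parse_parameter("-BEGin=100"))  # ('BEGin', '100')
print(parse_parameter("-BATch"))      # ('BATch', None)
```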

Queued Device
A device, usually a printer, that has a method (the queue) for controlling the number of "jobs" it submits to the computer. Each job that is sent to a queued device is sent to the computer in the order it was received. A system manager must set up a queued device. For more information, see "Defining Your Graphics Configuration" in Chapter 5, Using Graphics.

Rich Sequence Format File
See RSF File.

RSF File
Files containing one or more sequences that may or may not be related. In addition to the sequence data, each sequence can be richly annotated with creator/author, sequence weight, creation date, a one-line description, offset, and sequence features. For more information, see "Using Rich Sequence Format (RSF) Files" in Chapter 2, Using Sequences.

Scoring Matrix
A table of pairwise relationships between nucleotide symbols or amino acid symbols.

By default for nucleotides, the pairwise value for identities (for example guanine pairing with guanine) is greater than the value given for non-identity pairs (for example guanine pairing with adenine).

For amino acids, the greater the value for an amino acid pair, the more related or substitutable those amino acids are thought to be (for example valine pairing with isoleucine is given a higher value than valine pairing with tryptophan).

Scoring matrices are used by database searching and multiple sequence alignment programs. Default scoring matrices for each Wisconsin Package program requiring one can be found in files in the directory with the logical name GenRunData. You can find alternative scoring matrices in the directory with the logical name GenMoreData. You can use the Fetch command to copy these files into your directory to customize them. For more information, see "Using a Special Kind of Data File: A Scoring Matrix" in Chapter 4, Using Data Files.
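
As a small illustration of a table of pairwise values, the Python sketch below builds a nucleotide matrix in which identities score higher than non-identities and uses it to score two aligned sequences. The values (1 and 0), the names, and the example sequences are hypothetical and are not taken from any GCG data file.

```python
# Hypothetical values: 1 for an identity, 0 for any non-identity pair.
NUCLEOTIDES = "ACGT"
MATRIX = {(a, b): (1 if a == b else 0) for a in NUCLEOTIDES for b in NUCLEOTIDES}


def score_alignment(first, second, matrix=MATRIX):
    """Sum the pairwise values for two equal-length, already aligned sequences."""
    if len(first) != len(second):
        raise ValueError("sequences must be the same length")
    return sum(matrix[pair] for pair in zip(first, second))


print(MATRIX[("G", "G")] > MATRIX[("G", "A")])  # True
print(score_alignment("GATTACA", "GACTATA"))    # 5
```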

Script-Execution Queue
The queue used by the at and batch UNIX commands. The script-execution queue is also known as the batch queue or at queue. See At and Batch.

Secondary Accession Number
Any non-primary accession number associated with a database entry. Secondary accession numbers usually indicate that the database entry has been merged or modified in some way. See Accession Number.

Shell Metacharacter
See Metacharacter.

Shell Script
A file that can be used to execute one or more UNIX or GCG commands. Shell scripts are automatically created each time you run a program with the -BATch parameter. The script contains all of the information you would communicate to the computer if you ran the program from the command line.

Other examples of files containing shell scripts are .login (csh) and .profile (ksh), which are executed automatically each time you log onto the computer. You also can create shell scripts to submit programs that do not support the -BATch parameter. For more information, see "Working with Shell Scripts" in Chapter 3, Using Programs.

Standard Error
The device to which a program or operating system normally sends error messages. Standard error is usually directed to either the terminal screen or to a file. For more information, see "Using Command-Line Redirection" in Chapter 1, Getting Started.

Standard Input
The device from which a program or operating system normally receives input. In most cases this is the terminal (that is, whatever you type at the keyboard). For more information, see "Using Command-Line Redirection" in Chapter 1, Getting Started.

Standard Output
The device to which a program or operating system normally sends its output. For UNIX commands this is usually the terminal. For Wisconsin Package programs (other than graphics programs) this is usually a file. For more information, see "Using Command-Line Redirection" in Chapter 1, Getting Started.

Symbol Comparison Table
Replaced by the term scoring matrix. See Scoring Matrix.

Unqualified Parameter
Parameters that are not preceded by a qualifier. For example, it is not necessary to specify an input file from the command line with -INfile=; you can simply type the filename. If you use unqualified parameters, they must appear in the proper order on the command line. See also parameter. For more information, see "Using Program Parameters" in Chapter 3, Using Programs.

Wildcard
A character, such as the asterisk (*) or question mark (?), that you can use in file specifications, most often to specify multiple files. The asterisk (*) wildcard stands for any string of characters, including none. The question mark (?) wildcard stands for exactly one character. See also metacharacter. For more information, see "Using Wildcards" in the "Working with Files" section of Chapter 1, Getting Started.
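
The behavior of the two wildcards can be tried out with Python's standard fnmatch module, which follows shell-style matching rules. The filenames and the helper function are hypothetical and serve only as an illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical filenames used only for illustration.
names = ["hsp1.seq", "hsp70.seq", "hsp.seq", "notes.txt"]


def matching(candidates, pattern):
    """Return the candidates that match a shell-style wildcard pattern."""
    return [name for name in candidates if fnmatchcase(name, pattern)]


print(matching(names, "*.seq"))     # ['hsp1.seq', 'hsp70.seq', 'hsp.seq']
print(matching(names, "hsp?.seq"))  # ['hsp1.seq']
print(matching(names, "hsp*.seq"))  # ['hsp1.seq', 'hsp70.seq', 'hsp.seq']
```

Note that hsp?.seq rejects both hsp.seq (no character in that position) and hsp70.seq (two characters), while hsp*.seq accepts all three.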

Copyright (c) 1982-2001 Genetics Computer Group, Inc. A subsidiary of Pharmacopeia, Inc. All rights reserved.

Licenses and Trademarks Wisconsin Package is a trademark of Genetics Computer Group, Inc. GCG and the GCG logo are registered trademarks of Genetics Computer Group, Inc.

All other product names mentioned in this documentation may be trademarks, and if so, are trademarks or registered trademarks of their respective holders and are used in this documentation for identification purposes only.
