NetFetch*

Table of Contents
FUNCTION
DESCRIPTION
EXAMPLE
OUTPUT
INPUT FILES
RELATED PROGRAMS
RESTRICTIONS
CONSIDERATIONS
NETWORK CONSIDERATIONS
COMMAND-LINE SUMMARY
ACKNOWLEDGEMENT
PARAMETER REFERENCE

FUNCTION

NetFetch retrieves from NCBI the sequences listed in a NetBLAST output file. You can also use it to retrieve individual sequences by sequence name or accession number. The output of NetFetch is an RSF (rich sequence format) file.

DESCRIPTION

NetFetch is an interface to the NetEntrez service provided by NCBI's web server at www.ncbi.nlm.nih.gov. It uses this server to perform remote retrievals. NetFetch reads the NetBLAST output file, queries the NCBI web service, and returns the sequences in an RSF output file. You can also retrieve individual sequences with NetFetch.

NetFetch can retrieve sequences only from the databases maintained at NCBI. Sometimes these databases and the databases searched with NetBLAST differ, resulting in the total or partial failure of some requests. Remote searches require almost no resources from your own computer.

EXAMPLE

Here is a session using NetFetch to retrieve sequences listed in a NetBLAST output file:


% netfetch

 NETFETCH what NCBI sequence or NetBLAST output file ?  zizm99.blastp

 What should I call the RSF output file (* zizm99.rsf *) ?

 NETFETCH complete with:

      Input: zizm99.blastp
     Output: zizm99.rsf
     Server: www.ncbi.nlm.nih.gov
  Requested: 25
   Returned: 25

%

OUTPUT

Below is part of the output from the example session:


!!RICH_SEQUENCE 1.0

NETFETCH of: zizm99.blastp  August 11, 1998 08:09

from server: www.ncbi.nlm.nih.gov

 25	Sequences Requested
 25	Sequences Returned

Sequences Requested
-----
sp|P04704|ZEA2_MAIZE	sp|P24449|ZEAC_MAIZE
gi|168691	pir||S47265
sp|P06674|ZEA3_MAIZE	gi|16073
sp|P24450|ZEAD_MAIZE	pir||S47266
gi|168693	pir||S07172
sp|P04705|ZEAB_MAIZE	gi|22523
gi|168701	sp|P04701|ZEAL_MAIZE
sp|P02859|ZEA1_MAIZE	prf||1107201C
sp|P06675|ZEA4_MAIZE	sp|P08416|ZEA5_MAIZE
prf||1107201B	sp|P04703|ZEA7_MAIZE
sp|P06676|ZEA8_MAIZE	sp|P06677|ZEA9_MAIZE
pir||S21969	prf||1107201G
sp|P04702|ZEA6_MAIZE

 ..
{
name  ZEA2_MAIZE
descrip    ZEIN-ALPHA PRECURSOR (19 KD) (CLONE ZG99).
type    PROTEIN
longname  Zea mays
sequence-ID  141598
checksum    745
creation-date  8/11/1998  8: 9:17
strand  1
comments
  LOCUS       141598        235 aa                              15-JUL-1998
  DEFINITION  ZEIN-ALPHA PRECURSOR (19 KD) (CLONE ZG99).
  ACCESSION   141598
  PID         g141598

///////////////////////////////////////////////////////////////////////////////

Because NetFetch completes successfully if any of the requested sequences is returned, the output file may not contain all of the sequences that were requested.
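
One way to detect a partial retrieval is to compare the two counts that NetFetch writes near the top of the RSF file. The Python sketch below is only an illustration: it assumes the header lines look exactly like those in the example output above and that the header ends at the ".." line. The function name read_counts is hypothetical and is not part of the Wisconsin Package.

```python
import re

# Matches header lines such as " 25<TAB>Sequences Requested".
# The layout is assumed from the example output shown above.
COUNT_LINE = re.compile(r"^\s*(\d+)\s+Sequences\s+(Requested|Returned)\s*$")


def read_counts(rsf_text):
    """Return (requested, returned) from the RSF header; None for a missing count."""
    counts = {"Requested": None, "Returned": None}
    for line in rsf_text.splitlines():
        if line.strip() == "..":
            break  # assumed end of the header; sequence records follow
        match = COUNT_LINE.match(line)
        if match:
            counts[match.group(2)] = int(match.group(1))
    return counts["Requested"], counts["Returned"]


header = """!!RICH_SEQUENCE 1.0

NETFETCH of: zizm99.blastp  August 11, 1998 08:09

 25\tSequences Requested
 25\tSequences Returned

Sequences Requested
-----
 ..
"""

requested, returned = read_counts(header)
if requested != returned:
    print(f"Only {returned} of {requested} sequences were returned")
```

For the example session both counts are 25, so the sketch prints nothing. The bare "Sequences Requested" heading is ignored because it carries no leading count.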

INPUT FILES

NetFetch accepts a NetBLAST output file or the sequence name or accession number of a sequence. You can specify several sequences by placing a comma between sequence names or accession numbers.

RELATED PROGRAMS

NetBLAST searches for sequences similar to a query sequence. The query and the database searched can be either peptide or nucleic acid in any combination. NetBLAST can search only databases maintained at the National Center for Biotechnology Information (NCBI) in Bethesda, Maryland, USA. Fetch copies GCG sequences or data files from the GCG database into your directory or displays them on your terminal screen.

RESTRICTIONS

NetFetch was designed specifically to search the NetEntrez server at NCBI. It is unlikely that it will work with other similar servers.

Searching remote databases opens up the possibility of unauthorized access to your query sequence. You should not use confidential query sequences for remote searches.

NetFetch does not accept a conventional GCG sequence specification for the input. The input file is the NetBLAST output file, not a GCG list file. Sequence specifications must be consistent with those allowed by the NCBI web server.

The NCBI databases searched by NetFetch may differ from the databases searched by NetBLAST so that not all sequence names listed in the NetBLAST output file can be retrieved by NetFetch. For example, when this document was written you could search the Alu database with NetBLAST but that database was not available to the NetEntrez server at NCBI used by NetFetch.

CONSIDERATIONS

Network bandwidth varies greatly from time to time and from site to site. You may want to retrieve sequences when the network is more likely to be quiet. However, be aware that waiting too long to fetch sequences may result in retrieval failures because sequences are sometimes replaced or deleted from the databases.

NetFetch retrieves all of the sequences into a single RSF file. Most Wisconsin Package programs can read individual sequences directly from the RSF file. If you want to export a single sequence into a GCG single sequence file, use the program Reformat.

NETWORK CONSIDERATIONS

Client/server applications that run over the Internet are subject to a number of possible problems. You should determine whether you are charged for network communications, and note that the security and integrity of your sequences are at risk. A server may also become overloaded, so that your search takes much longer than normal, or your output may be lost altogether because of a network or server computer glitch.

COMMAND-LINE SUMMARY

All parameters for this program may be added to the command line. Use -CHEck to view the summary below and to specify parameters before the program executes. In the summary below, the capitalized letters in the parameter names are the letters that you must type in order to use the parameter. Square brackets ([ and ]) enclose parameter values that are optional. For more information, see "Using Program Parameters" in Chapter 3, Using Programs in the User's Guide.


Minimal Syntax: % netfetch [-INfile1=]zizm99.blastp  -Default

Prompted Parameters:

-OUTfile1=name.rsf           specifies the output file name

Optional Parameters:

-TOP=10                      fetch only the top 10 sequences
-MONitor                     displays screen trace
-NOSUMmary                   suppresses the screen summary
-RAW                         saves the entire server response in a .raw file
-URL="www.blast.ncbi.nlm.nih.gov:80/htbin-post/Entrez/query?db=s&form=6&uid="
                             sends HTTP query to NCBI's netentrez server

ACKNOWLEDGEMENT

The NetEntrez service was created and is maintained by the National Center for Biotechnology Information (NCBI). The NetFetch program was written by Joseph King.

PARAMETER REFERENCE

You can set the parameters listed below from the command line. For more information, see "Using Program Parameters" in Chapter 3, Using Programs in the User's Guide.

-TOP=10

limits the retrieval to the top sequences. You specify how many sequences you want to retrieve, and NetFetch will request no more than that many. It always builds the request list from the sequences at the top of the list. If you specify more sequences than are listed in the input file, all of the sequences in the file will be requested. If you specify zero or omit -TOP, all of the sequences in the input file will be requested.
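
The selection rule described above can be sketched in a few lines of Python. This is an illustration of the documented behavior, not NetFetch's actual code: the function name select_top is hypothetical, and treating a negative value like zero is an assumption, since the manual does not say what a negative -TOP does.

```python
def select_top(sequence_ids, top=0):
    """Return the IDs that would be requested for a given -TOP value."""
    if top <= 0:
        # Zero (or an omitted -TOP) means: request every sequence listed.
        # Handling negative values the same way is an assumption.
        return list(sequence_ids)
    # Slicing past the end of a list is safe in Python, so asking for more
    # sequences than are listed simply yields all of them.
    return list(sequence_ids[:top])


hits = ["sp|P04704|ZEA2_MAIZE", "sp|P24449|ZEAC_MAIZE", "gi|168691"]

print(select_top(hits, top=2))   # the first two entries
print(select_top(hits, top=10))  # all three entries
print(select_top(hits))          # all three entries
```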

-MONitor

displays a screen trace of the program's progress. Messages indicate the status of the connection to NCBI, of the retrieval, and of the parsing of the result.

-SUMmary

writes a summary of the program's work to the screen when you've used -Default to suppress all program interaction. A summary typically displays at the end of a program run interactively. You can suppress the summary for a program run interactively with -NOSUMmary.

You can also use this parameter to cause a summary of the program's work to be written in the log file of a program run in batch.

-RAW

saves the response as it comes back from NCBI in a .raw file. The file will have the same basename as the RSF file. This file will contain the entire response from NCBI including any error or informational messages.
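
If you post-process NetFetch output in a script, you can derive the expected .raw file name from the RSF file name. The Python sketch below assumes that "same basename" means the RSF name with its extension replaced by .raw, in the same directory; the function name raw_file_name is hypothetical.

```python
from pathlib import PurePath


def raw_file_name(rsf_name):
    """Return the assumed name of the -RAW file for a given RSF output file."""
    # with_suffix() swaps the final extension (or appends one if there is none).
    return str(PurePath(rsf_name).with_suffix(".raw"))


print(raw_file_name("zizm99.rsf"))  # zizm99.raw
```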

-URL="www.blast.ncbi.nlm.nih.gov:80/"

specifies the host, port, and command to use when making the request. You can specify the host only, in which case the default port and command are used. You must specify the host if you need to change the port or the command. Specifying the port is never necessary.
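
The Python sketch below illustrates how a -URL value might be split into host, port, and command, with defaults filled in for the parts you omit. It is not NetFetch's own parser. The default port (80) and default command are assumptions taken from the -URL value shown in the Command-Line Summary, and the function name split_url_value is hypothetical.

```python
# Assumed defaults, copied from the -URL value in the Command-Line Summary.
DEFAULT_PORT = 80
DEFAULT_COMMAND = "htbin-post/Entrez/query?db=s&form=6&uid="


def split_url_value(value):
    """Split a -URL value into (host, port, command), filling in defaults."""
    # Everything before the first "/" is host[:port]; the rest is the command.
    host_port, _, command = value.partition("/")
    host, _, port = host_port.partition(":")
    if not host:
        raise ValueError("the host must always be specified")
    return (
        host,
        int(port) if port else DEFAULT_PORT,
        command if command else DEFAULT_COMMAND,
    )


print(split_url_value("www.ncbi.nlm.nih.gov"))
print(split_url_value("www.blast.ncbi.nlm.nih.gov:80/"))
```

Both calls fall back to the assumed default command, because neither value supplies one after the host.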

The syntax of the command assumes that a comma-separated list of sequence IDs will be concatenated to it before submission to NCBI. For example, if you specify:

% netfetch -URL="www.blast.ncbi.nih.gov/htbin/Entrez/query?db=s&uid="  drome_gpdh

The actual request made to NCBI will be equivalent to making the following request from a web browser:


http://www.blast.ncbi.nih.gov/htbin/Entrez/query?db=s&uid=drome_gpdh
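
The concatenation described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name build_request_url is hypothetical, and the sketch does no percent-encoding of the sequence IDs because this manual does not say whether NetFetch encodes them.

```python
def build_request_url(url_value, sequence_ids):
    """Append a comma-separated ID list to the -URL value and add the scheme."""
    return "http://" + url_value + ",".join(sequence_ids)


url = build_request_url(
    "www.blast.ncbi.nih.gov/htbin/Entrez/query?db=s&uid=",
    ["drome_gpdh"],
)
print(url)
# http://www.blast.ncbi.nih.gov/htbin/Entrez/query?db=s&uid=drome_gpdh
```

With several IDs the list is joined by commas, so ["a", "b"] would end the URL with uid=a,b.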

You can read the current version of the NetEntrez documentation on the World Wide Web at http://www.ncbi.nlm.nih.gov/.

Printed: December 9, 1998 16:24 (1162)

Documentation Comments: doc-comments@gcg.com
Technical Support: help@gcg.com

Copyright (c) 1982-2001 Genetics Computer Group, Inc. A subsidiary of Pharmacopeia, Inc. All rights reserved.

Licenses and Trademarks: Wisconsin Package is a trademark of Genetics Computer Group, Inc. GCG and the GCG logo are registered trademarks of Genetics Computer Group, Inc.

All other product names mentioned in this documentation may be trademarks, and if so, are trademarks or registered trademarks of their respective holders and are used in this documentation for identification purposes only.

Genetics Computer Group

www.gcg.com