CENTER for MOLECULAR MEDICINE & GENETICS
Wayne State University School of Medicine





In this section, we will use GCG to align several related protein sequences, then use the alignment for phylogenetic analysis.

START SEQLAB
Start the X Windows program on your PC.
Telnet to genetics and set DISPLAY: setenv DISPLAY ip.address:0
Change into your mbg8680/gcg folder: cd mbg8680/gcg
Type gcg, then type seqlab &
Make sure your working directory is set to mbg8680/gcg (see Introduction to GCG for a review if needed).

OPEN A DIFFERENT LIST FILE
Click File, Open List, hsp70a.list, OK.
The hsp70a.list file contains several heat shock-related proteins from various species, which we will use for the following demonstrations.

PILEUP, THE MULTIPLE SEQUENCE ALIGNMENT PROGRAM
We will use the GCG program PileUp to align the heat shock proteins. PileUp first compares every possible pair of sequences to measure their similarity, then builds the alignment progressively: it aligns the two most similar sequences, adds the next most related sequence or group of sequences, and repeats until all of the sequences are aligned.
Select all the entries in the hsp70a.list file.
Click Functions, Multiple Comparison, PileUp.
Note the options, then click Run.
Click Windows, Job Manager, Open Output Mgr.
Display the .msf multiple sequence file.
Display the .figure dendrogram (plot of sequence similarity).
Add the .msf file to the Main List.
Close the Output and Job Manager windows.
Note the .msf file added to the list, then save the list.

EXAMINE THE ALIGNMENT IN THE MSF FILE
Load the .msf file into the Editor. You can manually edit the alignment to improve it.
Return to the Main List.
Run PlotSimilarity on the .msf file to show a plot of the overall similarity of the 5 protein sequences.
Use Pretty to calculate the consensus sequence and display the matching amino acids in uppercase characters.

PHYLOGENETIC (EVOLUTIONARY) ANALYSIS
Run Distances on the .msf file. Note the 3 output files.
Display the .distances file (table of relatedness).
Display the .figure file (phylogram of relatedness).
Note the negative branch. It arises because the sequences are not similar near their ends.
To run the analysis properly, the unrelated sequence ends should be left out by first selecting only the interior sequences. In fact, the analysis can be done on several different interior segments in order to compare the results.
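The effect of leaving out the sequence ends can be illustrated outside of GCG with a short Python sketch. It is only an illustration under stated assumptions: the three aligned sequences are invented (they are not taken from hsp70a.list), the names `p_distance` and `distance_table` are hypothetical helpers, and the measure is a simple uncorrected distance (the fraction of mismatched positions), which is not necessarily the method that Distances applies.

```python
from itertools import combinations

GAPS = {".", "-", "~"}  # characters treated as alignment gaps

# Toy alignment invented for this example (NOT from hsp70a.list).
# All rows have the same length; the ends are ragged and poorly matched.
alignment = {
    "seqA": "MKT..AIGIDLGTTYSCVGV..QW",
    "seqB": "..PLLAIGIDLGTTNSCVAVHHR.",
    "seqC": "MSAAKAVGIDLGTTYSCVGVFRND",
}


def p_distance(a, b, start=0, end=None):
    """Fraction of mismatches among columns where neither sequence has a gap.

    Only alignment columns start..end-1 (0-based) are considered.
    """
    compared = 0
    mismatches = 0
    for x, y in zip(a[start:end], b[start:end]):
        if x in GAPS or y in GAPS:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if x != y:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return mismatches / compared


def distance_table(aln, start=0, end=None):
    """Pairwise distances for every pair of sequences in the alignment."""
    return {
        (n1, n2): p_distance(aln[n1], aln[n2], start, end)
        for n1, n2 in combinations(sorted(aln), 2)
    }


if __name__ == "__main__":
    full = distance_table(alignment)
    interior = distance_table(alignment, start=5, end=20)  # drop ragged ends
    for pair in full:
        print(pair, f"full={full[pair]:.3f}", f"interior={interior[pair]:.3f}")
```

Changing `start` and `end` to cover several different interior segments and comparing the resulting tables mirrors the comparison suggested above.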
Send comments to:
dwomble@genetics.wayne.edu
Copyright © 2003, David D. Womble.