Exercise 5: Finding Domains in Protein Sequences
I. INTRODUCTION
Many proteins that have been classified as "globular" (i.e., folded into a compact
globular shape) appear to be composed of several distinct folded regions joined
by more extended loops of amino acids. These globular subregions are termed
"domains" and can range in size from 20-300 amino acids. Some domains have been
associated with specific functions (e.g. catalysis of peptide bond cleavage,
ATP binding, etc.), but this association must be tentative since ligand binding
or formation of an active site often takes place at the surface where two domains
interact. Identification of domains can help us to assign a newly discovered
open reading frame to a family of proteins. Domains in a newly discovered protein
can be recognized by sequence homology with known domains in well-characterized
proteins, but this is still not a precise science. While new techniques of analysis
are being introduced, at present the most user-friendly and visual domain
identification program is the SMART domain annotation database.
You can access the SMART Protein Domain database via the
server indicated below.
Copy your sequence and paste it into the SMART sequence window. Click the Sequence
SMART button. Depending on how busy the SMART server is, it may take a few minutes
for a result to be returned. BE PATIENT!!
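If your sequence was copied from a FASTA file or a formatted viewer, it may carry a header line, spaces or position numbers. The short Python sketch below shows one way to tidy it before pasting. It is only an illustration: the function names and the example sequence are made up for this exercise and have nothing to do with the SMART server itself.

```python
# The 20 standard amino acids in one-letter code.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def clean_protein_sequence(text):
    """Return the residues from raw or FASTA-formatted text as one uppercase string."""
    lines = [line.strip() for line in text.splitlines()]
    # Drop FASTA header lines (they start with '>') and empty lines.
    body = "".join(line for line in lines if line and not line.startswith(">"))
    # Keep letters only; this removes spaces and position numbers.
    return "".join(ch for ch in body if ch.isalpha()).upper()


def unexpected_letters(sequence):
    """List any letters that are not one of the 20 standard amino acids."""
    return sorted(set(sequence) - VALID_RESIDUES)


# Made-up example input, not a real protein.
raw = """>example_protein (hypothetical)
mkvla gtwpe 10
LLRDC XQ
"""

sequence = clean_protein_sequence(raw)
print(sequence)
print(unexpected_letters(sequence))
```

Any letters reported by unexpected_letters (such as X for an unknown residue) are worth checking before you submit the sequence.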
The results will show you a live diagram with the domains within the query sequence. Each domain
has a unique color and shape and annotation.
Scroll down the window to see a table that lists each identified domain together with its putative (probable) start
and end point in your sequence and the expectation value (E-value) assigned to that identification.
The E-value estimates how many matches this good would be expected by chance alone, so
the smaller the E-value, the more likely the identification is not simply due to chance.
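To see how the table columns relate to your sequence, here is a minimal Python sketch that keeps only the hits below an E-value cutoff and pulls out the residues for each one. The domain names, positions, E-values, sequence and the cutoff of 0.001 are all invented for illustration, and the sketch assumes that start and end are counted from 1 and include both end residues.

```python
from typing import NamedTuple


class DomainHit(NamedTuple):
    name: str
    start: int      # assumed 1-based, inclusive
    end: int        # assumed 1-based, inclusive
    e_value: float


def significant_hits(hits, max_e_value=1e-3):
    """Keep hits at or below the cutoff, best (smallest E-value) first."""
    kept = [hit for hit in hits if hit.e_value <= max_e_value]
    return sorted(kept, key=lambda hit: hit.e_value)


def domain_sequence(sequence, hit):
    """Return the residues covered by a hit (convert 1-based to Python slicing)."""
    return sequence[hit.start - 1:hit.end]


# Hypothetical data, typed in by hand from an imaginary results table.
query = "MKVLAGTWPELLRDCAQ"
table = [
    DomainHit("DomA", 3, 8, 1e-20),
    DomainHit("DomB", 10, 15, 0.5),     # weak hit, likely due to chance
    DomainHit("DomC", 12, 17, 2e-6),
]

for hit in significant_hits(table):
    print(hit.name, hit.start, hit.end, hit.e_value, domain_sequence(query, hit))
```

The cutoff is a judgment call; a stricter value keeps fewer but more reliable identifications.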
Clicking the mouse over the domain on the figure or in the table will bring up the domain name
or abbreviation and the amino acid sequence assigned to this domain at the very
bottom of the browser window. With a PC, right click on the image to save it
as a PNG file. With a Macintosh, hold down the control key and the mouse button
to save the figure. Rename it with a descriptive title and the .png extension.
It can be opened in QuickTime, Photoshop or almost any other image viewer.
Clicking on the domain name will bring up more detailed information on the domain.
Pick out one domain to examine in detail.
What are the characteristics (amino acid sequences) that define that domain?
What kinds of proteins contain this domain?
What is the function of that domain?
How similar is your sequence to the defined domain?
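For the last question, one simple measure of similarity is percent identity over an alignment of your sequence with the domain sequence. The Python sketch below assumes you already have the two sequences aligned to the same length with "-" marking gaps, and it counts only the columns where neither sequence has a gap. This is just one common convention, and it is not the calculation behind the E-values that SMART reports. The aligned strings are invented for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of non-gap alignment columns where the two residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical aligned pair: your sequence versus a domain sequence.
mine = "VLAG-TWPEL"
domain = "VLSGKTWPDL"
print(round(percent_identity(mine, domain), 1))
```

Identity alone can understate similarity, since conservative substitutions (for example, one hydrophobic residue for another) count as mismatches here.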
These tutorials were developed by Dr. Ross S. Feldberg, Dept. of Biology, Tufts University, Medford, MA 02155, with the assistance of a Teaching with Technology grant from the Academic Computing Department at Tufts. Thanks to Anoop Kumar, Abhra Verma and Scott Cordeiro for help in developing this resource. Suggestions, corrections and comments should be sent to Ross.Feldberg@Tufts.edu. (last modified Aug 2005)