Searching the Scientific Literature: PubMed vs. OVID search engines, books and OMIM

Welcome to Bioinformatics Basics at Tufts!

As the sequencing of the genomes of humans and other species proceeds, vast amounts of raw data are accumulating in publicly accessible databases. Knowing what is available, how to access it, and which tools can be used to analyze these data is now a critical skill for anyone interested in understanding modern bioscience. The exercises at this site are designed to give you a very basic introduction to the databases and some of the methodologies used in bioinformatic analysis. As of July 2005, the exercises available at this site are:

Exercise 1: An introduction to searching the scientific literature. Simple web searches generally turn up both interesting information and garbage. As scientists and health professionals you will need to know how to access the refereed scientific literature (publications which have been reviewed for accuracy and completeness by other professionals). This exercise will introduce you to the MEDLINE database accessed via the Entrez Browser and to two other databases available via this browser (Online Mendelian Inheritance in Man and Books-on-line).

Exercise 2: Finding the Nucleotide Sequence for a Gene. One problem with the vast amount of data now accessible is that it is becoming increasingly difficult to sort through it to find a specific gene of interest. This exercise will introduce you to the Nucleotide database and the various ways to structure a search for a given gene.
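As a rough illustration of what "structuring a search" can mean, the sketch below combines a gene name and an organism into a fielded Entrez query and builds the corresponding NCBI E-utilities ESearch URL. It is not part of the exercise itself. The gene BRCA1, the helper names build_query and esearch_url, and the choice of the nuccore database identifier are assumptions made for this example. The code only constructs the URL; it does not contact NCBI.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities ESearch service.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_query(gene, organism=None, extra_terms=()):
    """Combine fielded search terms with AND (hypothetical helper)."""
    terms = [f"{gene}[Gene Name]"]
    if organism:
        terms.append(f"{organism}[Organism]")
    terms.extend(extra_terms)
    return " AND ".join(terms)


def esearch_url(term, db="nuccore", retmax=20):
    """Return an ESearch URL for the query; nothing is fetched here."""
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{ESEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    # BRCA1 is only an example gene chosen for this sketch.
    query = build_query("BRCA1", organism="Homo sapiens")
    print(query)
    print(esearch_url(query))
```

Restricting each term to a field (gene name, organism) usually returns far fewer records than typing the gene name alone, which is the point of structuring the search.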

Exercise 3: Determining the correct reading frame for a nucleotide sequence. Many experiments yield a short DNA sequence of unknown function. Using the available databases it is often possible to assign this short sequence to a specific gene. Typically the first step in obtaining such an assignment involves determining the reading frame that the cell uses to translate this DNA sequence into a protein sequence. This exercise will provide you with an unknown sequence and you will use web-based tools to determine a likely reading frame. Save your sequences from this exercise; in the next exercise you will use them to search the database and determine which gene gave rise to them.
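To make the idea of a reading frame concrete, here is a minimal sketch of what a translation tool does: it translates the sequence in all six frames (three on each strand) using the standard genetic code and ranks the frames by their longest stretch without a stop codon. This is an illustration under stated assumptions, not the web-based tool used in the exercise. The function names are hypothetical, and the "longest open stretch" criterion does not require a start codon, since a short fragment may come from the middle of a gene.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons; '*' marks a stop, 'X' an unknown codon."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frames(dna):
    """Translate both strands at offsets 0, 1 and 2 (frames +1..+3, -1..-3)."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


def longest_open_stretch(protein):
    """Longest run of residues not interrupted by a stop codon."""
    return max(protein.split("*"), key=len)


def rank_frames(dna):
    """Frame names ordered from longest to shortest open stretch."""
    frames = six_frames(dna)
    return sorted(
        frames,
        key=lambda name: len(longest_open_stretch(frames[name])),
        reverse=True,
    )
```

Calling rank_frames on a sequence and keeping the first two names gives the two candidate frames with the longest uninterrupted translations.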

Exercise 4: Using BLAST to identify a gene. (continued from Exercise 3) In this exercise, you will take the two best open reading frames obtained in Exercise 3 and use them to carry out a similarity search against all the protein sequences available in the database. We will use the two best open reading frames so that we can compare the results obtained from a correct translation with those from an incorrect translation. This exercise should allow us to assign our unknown sequence to a specific gene.

Exercise 5: Searching for Sequence motifs in a given protein.  Say you have found an increase in the level of an mRNA coding for an unknown gene under conditions of low oxygen pressure. You would like to know what protein this mRNA is coding for, but a similarity search of the databases reveals no obvious homologs to this protein. Another way to look for possible function is to determine if short regions in the protein correspond to sequences which have been recognized to carry out specific functions. Special search engines have been designed to look for such "motifs" and you will use one of these to examine an unknown protein.
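The following sketch shows, in simplified form, how a motif search can work: a PROSITE-style pattern is converted to a regular expression and scanned along the protein. It is not the search engine used in the exercise. The helper names are hypothetical, the converter handles only the basic pattern elements, and the protein string in the usage example is made up. The pattern N-{P}-[ST]-{P} is the PROSITE N-glycosylation site pattern (PS00001).

```python
import re


def prosite_to_regex(pattern):
    """Convert a simple PROSITE-style pattern to a Python regex string."""
    regex = pattern.rstrip(".").replace("-", "")
    regex = regex.replace("{", "[^").replace("}", "]")  # {P}: any residue but P
    regex = regex.replace("(", "{").replace(")", "}")   # x(2,4): 2 to 4 repeats
    regex = regex.replace("x", ".")                     # x: any residue
    return regex.replace("<", "^").replace(">", "$")    # sequence-end anchors


def find_motifs(protein, pattern):
    """Return (1-based position, matched text) for every hit, overlaps included."""
    # The lookahead lets matches that overlap each other all be reported.
    regex = re.compile(f"(?=({prosite_to_regex(pattern)}))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(protein.upper())]


if __name__ == "__main__":
    # Made-up sequence used only to exercise the function.
    print(find_motifs("MNGSANPTQ", "N-{P}-[ST]-{P}"))
```

A hit from such a scan is only a suggestion of function: short patterns occur by chance, so real motif databases attach additional evidence to each pattern.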

Exercise 6: (To be developed) Finding homologs of a human gene in other organisms. In recent years we have been able to determine that certain genes are associated with various human diseases. Often, however, the function of the proteins coded for by these genes is not clear. One way to better understand the normal function of these proteins is to find their homologs in simpler organisms which can be experimentally manipulated. In this exercise you will be given a gene known to be involved in susceptibility to a human disease and you will search the Drosophila or yeast databases to determine if a homologous gene has been found in these organisms.


These tutorials were developed by Dr. Ross S. Feldberg, Dept of Biology at Tufts University, Medford, MA 02155 with the assistance of a Teaching with Technology grant from the Academic Computing Department at Tufts University. Thanks to Anoop Kumar, Abha Verma and Scott Cordeiro for development of this instructional resource.

You are free to make use of this site for educational purposes, but I would appreciate your informing me of its use and would welcome any comments on its usefulness as a teaching tool or suggestions for how it might be improved. Please cite Bioinformatic Basics at Tufts as follows: Feldberg, R.S. 2005 Bioinformatic Basics at Tufts. World Wide Web electronic publication http://ase.tufts.edu/biology/bioinformatics

Corrections, comments or suggestions are greatly appreciated and should be sent to ross.feldberg@tufts.edu (last modified Aug 2005)