
Wednesday, August 18, 2010

How are the opsin genes related to each other?

Answering this question requires making a multiple sequence alignment and then using it to make a phylogenetic tree. For these tasks, we move to another database where it's a little easier to gather a bunch of sequences into a single FASTA file.
Point your browser to http://us.expasy.org. ExPASy is mirrored at several locations, including the following:
http://www.expasy.org/  http://ca.expasy.org/
If one does not work or responds slowly, try a different one.
You see the home page of ExPASy, the Expert Protein Analysis System. As I said earlier, ExPASy is a complete protein tool box. With ExPASy, you can do almost any imaginable analysis or comparison of protein sequences and structures.
Click Swiss-Prot and TrEMBL under Databases.
Read the introduction to these databases. They are high quality protein sequence databases with abundant annotation, minimal redundancy, and many connections to other databases.
Click Advanced search in Swiss-Prot and TrEMBL.
With advanced searching, you can limit your search to specific genes and organisms, and you can search on descriptive information in the entries.
Set up a search for human opsins, as follows:
  • Search Swiss-Prot only.
  • Enter Description: opsin
  • Organism: Choose "Human" from the pull-down menu
  • Check "Append and prefix * to query terms". The * is a "wild card". You are searching for all entries that contain "opsin" as a whole or partial word.
Click Submit.
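If the wild card idea is new to you, here is a minimal sketch in Python of what "append and prefix *" does to a query term. It only illustrates the matching rule; it is not how Swiss-Prot implements its search, and the function name matches_wildcard is made up for this example.

```python
from fnmatch import fnmatchcase


def matches_wildcard(term, description):
    """Return True if any word of the description contains the term.

    Wrapping the term in * on both sides means it may appear as a
    whole word or as part of a longer word.
    """
    pattern = f"*{term.lower()}*"
    return any(fnmatchcase(word, pattern) for word in description.lower().split())


# Hypothetical description lines, used only to show the matching rule.
for text in ["Rhodopsin", "Visual pigment-like receptor peropsin", "Hemoglobin subunit alpha"]:
    print(text, "->", matches_wildcard("opsin", text))
```

With the wild cards in place, "opsin" matches both "Rhodopsin" and "peropsin", which is why the search picks up more than entries with the bare word "opsin".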
The resulting page, titled Swiss-Prot description, is your search result page.
Look over the results. On 9/8/2003, this search gave 14 hits, including the rod pigment rhodopsin (OPSD) and the three cone pigments (OPSB, OPSG, OPSR). There is also a "visual pigment-like receptor peropsin", OPSX. Sounds mysterious. Let's find out more about it, and in the process, see a typical Swiss-Prot entry.
Click on the gene name, OPSX.
You see the NiceProt View of Swiss-Prot: O14718. Peruse this entry and try to find out just what this rhodopsin-like protein is thought to do. Under Comments, you'll learn that it's found in the retina (the RPE, or retinal pigment epithelium), and that it may detect light, or perhaps monitor levels of retinoids, the general class of compounds that are the actual light absorbers in opsins. Also under Comments - Similarity, you see, as mentioned earlier, that this protein is a member of the large family of G protein-coupled receptors. If you click "G protein-coupled receptors" under the Keywords, you find a list of all purported 7-transmembrane receptor proteins in Swiss-Prot. The human genome alone contains 350 of them! See if you can verify this statement, without counting. Now back up to the NiceProt view.
Under References, click the journal citation, "Proc. Natl. Acad. Sci. U.S.A. 94:9893-9898(1997)". From the resulting page, you can read the full article about this protein in the Proceedings of the National Academy of Sciences (PNAS). Like many journals, PNAS puts full articles online just 6 to 12 months after publication.
Looking further down the page, you find cross-references to the protein or its gene in other databases, predicted structural features of the protein, and last, the sequence. Note also, at the bottom of the page, links to a number of ExPASy tools for further analysis of this sequence. Try some of them. For example, I just learned in about ten seconds from Compute pI/MW that the isoelectric pH (or pI) of this protein is 8.78. And I learned in no time at all from ScanProSite that the sequence contains signatures indicating that the protein is probably a G protein-coupled receptor (no surprise, but comforting) and that it has a retinal binding site. ProSite is a tool for finding signatures of function in new sequences.
When you finish playing with these powerful tools, return to your Swiss-Prot search results by using the back button of your browser. If you're lost, go back to ExPASy and do the search again.
Now let's compare the sequences with each other. We'll use the program ClustalW to make a multiple sequence alignment.
Scroll down the result page and check the boxes at the left of these entries:
  • OPSB (blue-sensitive opsin)
  • OPSD (rhodopsin)
  • OPSG (green-sensitive opsin)
  • OPSR (red-sensitive opsin)
  • OPSX (visual pigment-like receptor peropsin)
At the top of the page, at Send selected sequences to, select Clustal W (multiple alignment) from the menu, and click Submit.
ClustalW has been implemented at many web sites. This one, at EMBnet.org, automatically receives the FASTA files from the selected entries, allows you to make some settings of the alignment criteria, and then does the alignment. We will just accept the default alignment settings. First, scroll in the Input Sequences box and verify that it contains five FASTA files, one right after the other. To make them easier to identify in subsequent outputs, edit the name of each FASTA comment line (begins with ">") as follows:
  • Change "sp|P03999|OPSB_HUMAN Blue-sensitive opsin (Blue cone photoreceptor pigment) - Homo sapiens (Human)." to "Blue".
  • Change "sp|P08100|OPSD_HUMAN Rhodopsin (Opsin 2) - Homo sapiens (Human)." to "Rhodopsin".
  • Change "sp|P04001|OPSG_HUMAN Green-sensitive opsin (Green cone photoreceptor pigment) - Homo sapiens (Human)." to "Green".
  • Change "sp|P04000|OPSR_HUMAN Red-sensitive opsin (Red cone photoreceptor pigment) - Homo sapiens (Human)." to "Red".
  • Change "sp|O14718|OPSX_HUMAN Visual pigment-like receptor peropsin - Homo sapiens (Human)." to "Peropsin".
In all cases, be sure to leave the ">" in the first line of each FASTA entry. To save some work in case something goes wrong, select the edited contents of the Input Sequences box, copy it, and paste it onto an empty word-processor page, and save the file in text format. Name it Opsins.txt.
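If you would rather not edit the comment lines by hand, the renaming can be sketched in a few lines of Python. This is only an illustration, assuming the entry names (OPSB_HUMAN and so on) appear in the comment lines exactly as shown above; the function name rename_fasta_headers and the placeholder residues in the demo are made up.

```python
# Map each Swiss-Prot entry name to the short label used in this tutorial.
SHORT_NAMES = {
    "OPSB_HUMAN": "Blue",
    "OPSD_HUMAN": "Rhodopsin",
    "OPSG_HUMAN": "Green",
    "OPSR_HUMAN": "Red",
    "OPSX_HUMAN": "Peropsin",
}


def rename_fasta_headers(fasta_text, short_names=SHORT_NAMES):
    """Replace each recognized FASTA comment line with '>' plus a short name."""
    renamed = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            for entry_name, short in short_names.items():
                if entry_name in line:
                    line = ">" + short  # keep the leading ">"
                    break
        renamed.append(line)  # sequence lines pass through unchanged
    return "\n".join(renamed) + "\n"


# Placeholder residues, not the real opsin sequences.
demo = ">sp|P08100|OPSD_HUMAN Rhodopsin (Opsin 2) - Homo sapiens (Human).\nACDEFG\n"
print(rename_fasta_headers(demo))
```

Comment lines that do not contain one of the five entry names are left alone, so nothing is lost if an unexpected sequence turns up in the box.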
Click Run ClustalW.
The resulting page is called ClustalW query receipt, and it contains links to several output files.
Click clustalw (aln).
You see the typical ClustalW alignment file, showing our five protein sequences aligned to maximize identical and similar residues. Below each line of five sequences are symbols to show the extent of similarity among the sequences. An asterisk (*) means that the same residue is always (that is, for all of these sequences) found at that location; for example, the first asterisk marks a location where only N (asparagine) is found. Colon (:) means that all residues at this location are very similar; for example, the first colon is where only F (phenylalanine), I (isoleucine), and L (leucine) -- residues with large, nonpolar sidechains -- occur. Period (.) means somewhat similar residues; for example, at the first period, serine, threonine, and glutamine occur -- all polar, but varied in size. If there is no mark, then the residues at that location display no predominant common properties.
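To make the symbol line less mysterious, here is a rough Python sketch that marks the "*" and ":" columns of a small alignment. The list of "strongly similar" residue groups is my recollection of the groups described in the ClustalW documentation, so treat it as an assumption and check it against your version. The "." symbol is left out to keep the sketch short, and the function name conservation_line is made up.

```python
# Assumed "strong" similarity groups (recalled from the ClustalW docs).
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK", "MILV", "MILF", "HY", "FYW"]


def conservation_line(aligned):
    """Return a string of '*', ':' or ' ' with one mark per alignment column."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")
    marks = []
    for i in range(length):
        column = {seq[i] for seq in aligned}
        if "-" in column:
            marks.append(" ")  # a gap in any sequence: no mark
        elif len(column) == 1:
            marks.append("*")  # identical residue in every sequence
        elif any(column <= set(group) for group in STRONG_GROUPS):
            marks.append(":")  # all residues fall within one strong group
        else:
            marks.append(" ")
    return "".join(marks)


# Toy three-column alignment: N/N/N, then F/I/L, then A/G/W.
print(repr(conservation_line(["NFA", "NIG", "NLW"])))
```

In the toy alignment, the first column gets "*" (only N), the second gets ":" (F, I, and L all sit in one group), and the third gets no mark.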
Once more, as a safety measure, copy this alignment to your clipboard, and paste it onto an empty word-processor page. Then save the file in text format. Name it OpsMSA.txt. Remember that it is still on your clipboard, for pasting at our next stop. This multiple sequence alignment is one type of input you can use to make a phylogenetic tree.
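Why is an alignment useful for building a tree? One simple idea, sketched below in Python, is to count for each pair of aligned sequences the fraction of positions where they differ (often called the p-distance); sequences with small distances end up close together in a tree. This is only an illustration under simple assumptions: tree-building programs use their own, more refined distance measures, and the function names here are made up.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of differing positions between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    compared = differing = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


def distance_matrix(aligned):
    """Map each pair of sequence names to its p-distance."""
    return {
        (n1, n2): p_distance(aligned[n1], aligned[n2])
        for n1, n2 in combinations(aligned, 2)
    }


# Toy aligned fragments, not the real opsin alignment.
toy = {"Red": "ACDE", "Green": "ACDF", "Blue": "A-GF"}
for pair, dist in distance_matrix(toy).items():
    print(pair, round(dist, 2))
```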

Tuesday, August 17, 2010

Bacteriorhodopsin

The amino-acid sequence of OPN1LW

Things look a lot like before, but this is a protein entry, containing the amino-acid sequence in one-letter abbreviations. Just as with the mRNA entry, turn this into a FASTA display, and copy it into a new word-processor document. Save it in text format as protred.txt. Return to LocusLink.
Alternatively, you can translate the FASTA-format nucleotide sequence of the gene into protein yourself. The amino acid sequence is also given on the GenBank page.
The FASTA of the amino acid sequence is at this page: http://www.ncbi.nlm.nih.gov/protein/9910526
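Translating a nucleotide sequence into protein is simple enough to sketch in Python. The version below uses the standard genetic code and assumes the sequence you give it starts at the first base of the coding region; a full mRNA entry also contains untranslated stretches before the start codon, which you would have to trim off first. The function name translate is made up for this example.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides):
    """Translate a coding sequence into one-letter amino acids.

    Translation stops at the first stop codon. Codons containing
    unrecognized letters become 'X'; a trailing partial codon is ignored.
    """
    seq = nucleotides.upper().replace("U", "T")  # accept RNA or DNA letters
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


print(translate("ATGGCCTAA"))  # ATG = M, GCC = A, TAA = stop
```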

What is the nucleotide sequence of this gene?

Remember that we are looking at the gene for the red-sensitive opsin in human vision, and it's located near the bottom tip of the X chromosome. Scroll down to NCBI Reference Sequences (RefSeq). You see that mRNA (messenger RNA) and protein sequences are available, along with a GenBank sequence.

Click the entry number beside mRNA.

This is a typical GenBank nucleotide file, and a lot of it is hard to read, but a few things are clear. First note, under references, a citation to the publication of this sequence in the scientific literature. To see an abstract of the article in which this gene was described, click the PubMed link below the reference. As you see, you've been here before. There are many ways to move from one database to another, which is both a blessing and a curse. You have to keep your eyes open for useful links, and when you find a path that you think you might use again, make a note of it and bookmark the web pages. It is frustrating to know there's an easier way to do something, and not remember how you did it.

NB to GR: point back to this abstract when you get the phylogenetic tree.
To find the sequence, go to this page: http://www.ncbi.nlm.nih.gov/nuccore/9910525?report=genbank
Scroll to the bottom of this long page. The last thing is the sequence of this messenger RNA. You are seeing the actual list of As, Ts, Gs, and Cs that make up the message for synthesis of this opsin. But wait! You know that RNA contains no T. In most nucleotide databases, U from RNA is represented as T, to make for easy comparison of DNA and RNA sequences. This sequence information is not in the form that is most useful for searching in databases, say, searching for related genes. Let's display this entry in a form more useful for searching.
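The T-for-U convention is easy to undo if you ever want to see the message as RNA. A minimal Python sketch (the function names are made up for this example):

```python
def dna_to_rna(sequence):
    """Show a database-style sequence (written with T) as RNA (written with U)."""
    return sequence.upper().replace("T", "U")


def rna_to_dna(sequence):
    """Convert an RNA sequence back to the T-based database convention."""
    return sequence.upper().replace("U", "T")


print(dna_to_rna("atggcc"))
```

Because only one letter changes, a search tool that stores everything with T can compare DNA and RNA sequences directly.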

At the top of the page, beside the Display button, pull down the menu that says default (we are looking at the default entry display), and select FASTA (note that several other display options are available). Then click the Display button. You see one descriptive or "comment" line that begins with ">", followed by the nucleotide sequence. This little file is just what you need to search nucleotide databases for similar sequences. Let's keep it for future use.
This is the FASTA format of the gene OPN1LW http://www.ncbi.nlm.nih.gov/nuccore/164419729?report=fasta
Click and drag on the web page to select everything from the ">" through the last nucleotide. Be careful not to select anything else. From your browser's Edit menu, select Copy to make a copy of this information on your clipboard, for pasting elsewhere. Now start your favorite word processor, make a new document, and paste. The FASTA comment and sequence should appear. Select all of the text and change the font to Courier or Monaco -- these "typewriter" fonts make it easy to align letters into columns, because all letters are the same width. Save this file, choosing text or plain text as the file type. Call it mrnared.txt. Save it to a convenient location for the files you'll be making later. Click your browser's Back button until you return to LocusLink.
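A FASTA file like the one you just saved is simple enough to read with a short program. Here is a minimal Python sketch that splits FASTA text into comment lines and sequences; the function name parse_fasta is made up, and the demo uses a toy sequence rather than the real mRNA.

```python
def parse_fasta(text):
    """Split FASTA text into a list of (comment, sequence) pairs.

    The comment is the '>' line without the '>'; sequence lines are
    joined into a single string.
    """
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Toy input; to read your saved file instead, you could use
# parse_fasta(open("mrnared.txt").read()).
for comment, sequence in parse_fasta(">demo sequence\nATGC\nGGTA\n"):
    print(comment, len(sequence))
```

The same function handles a file holding several FASTA entries one after another, which is the form the alignment step expects.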

Find and Characterize the gene using Bioinformatics and its tools and databases

Using bioinformatics tools and databases, we first have to find where a specific gene is located in the genome.
Here we go:
Our gene of interest is opsin:
Where are the opsin genes in the human genome? 
First go to this site http://www.ncbi.nlm.nih.gov/mapview/. 
Read the instructions. Note that you can look at a genome by clicking on the NAME of the species, not the B beside it. The species name takes you to a viewer for the genome of that organism. The B takes you to a BLAST search tool (later).

Click Homo sapiens (human).

You see a diagram of the human chromosomes, and a search box at the top. Enter "opsin" in the box next to Search for.

Click Find.
 You see the diagram again, with red marks at your "hits", the locations of genes whose entries contain "opsin" as a whole or partial word. Below the diagram is a list of the indicated genes. Among them are the rhodopsin gene (RHO), and three cone pigments, short-, medium-, and long-wavelength sensitive opsins (for blue, green, and red light detection). Four hits look like visual pigments, which probably does not surprise you. To the left of each entry is the chromosome number, allowing you to tell which red mark corresponds to each entry. Note that two opsins are on the X chromosome, one of the sex-determining chromosomes. You can pursue multiple hits on the same chromosome with the all matches link for that chromosome.

Click all matches next to X.

You see a very complicated display (don't sweat -- we're going to use only a part of this now). On the left is a diagram of the X chromosome, with red marks at the positions of the gene(s) you've followed to this page -- in our case, the two opsins, medium- and long-wave, which are located near the bottom tip of the X chromosome. To the right are various representations of the X chromosome, with listings of annotated areas. The two opsin genes are highlighted in pink. If you pass your cursor over this page without clicking, you will find that some symbols provide brief information, most about regions that are not yet characterized well enough to have a full entry.

As you can see, there is a tremendous amount of information on this page, with links to much more. If you want full information about the meanings of abbreviations and symbols on this page, as well as the kinds of information linked to the page, you can use Map Viewer Help at the top of the page. You will find abundant information about the Map Viewer, explanations of all symbols and links, and even tutorials about how to ask and answer all kinds of questions about the genome.

For now, note the information provided for the first of the two highlighted opsin genes, OPN1LW (this is called the gene symbol). You see that this is the long-wavelength-sensitive (red) opsin, and that it's a gene involved in color blindness (a sex-linked trait -- no surprise).
