[maker-devel] maker-devel Digest, Vol 18, Issue 4

Shane Brubaker SBrubaker at Aurorabiofuels.com
Wed Nov 18 15:48:52 MST 2009


Hi Barry,

I would like to use maker_functional_gff, but I noticed it refers to WU-BLAST. I would like to use it with NCBI BLAST instead (I couldn't afford WU-BLAST!). Do you know if that could work, and what the equivalent of the -m 2 format would be?

I would also like to try this with a BLAST search against nr in addition to Swissprot. Do you think that would be possible?

On a related note, are there fields in Chado that I can populate so that they show up in GBrowse? If so, I could write my own custom scripts to transfer annotations and enhance my MAKER-annotated genome.


Thanks much.


-----Original Message-----
From: maker-devel-bounces at yandell-lab.org [mailto:maker-devel-bounces at yandell-lab.org] On Behalf Of maker-devel-request at yandell-lab.org
Sent: Monday, November 16, 2009 3:49 PM
To: maker-devel at yandell-lab.org
Subject: maker-devel Digest, Vol 18, Issue 4

Send maker-devel mailing list submissions to
        maker-devel at yandell-lab.org

To subscribe or unsubscribe via the World Wide Web, visit
        http://yandell-lab.org/mailman/listinfo/maker-devel_yandell-lab.org
or, via email, send a message with subject or body 'help' to
        maker-devel-request at yandell-lab.org

You can reach the person managing the list at
        maker-devel-owner at yandell-lab.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of maker-devel digest..."


Today's Topics:

   1. Question on ESTs and functional annotations (Shane Brubaker)
   2. Re: Question on ESTs and functional annotations (Barry Moore)
   3. Re: Question on ESTs and functional annotations (Shane Brubaker)
   4. Re: Question on ESTs and functional annotations (Carson Holt)
   5. Re: Question on ESTs and functional annotations (Shane Brubaker)


----------------------------------------------------------------------

Message: 1
Date: Mon, 16 Nov 2009 12:45:29 -0800
From: Shane Brubaker <SBrubaker at Aurorabiofuels.com>
Subject: [maker-devel] Question on ESTs and functional annotations
To: "maker-devel at yandell-lab.org" <maker-devel at yandell-lab.org>
Message-ID:
        <4F28B35C91C1B040B7183D28726A8663264CEF1524 at Exchange.aurora.local>
Content-Type: text/plain; charset="us-ascii"

Hi, I had a couple of general questions.

1.  What are the EST blastn results found in the MAKER gff3 output?  I thought these might be a track of my ESTs mapped onto the genome, but they seem to contain pieces of each EST rather than the entire thing.
2.  What is a good way to add functional annotation to the gene results?  I have a track of protein matches, using Swissprot as my protein set, so I can see the proteins and look up a Swissprot accession number to find out more about that gene.  But what I really want is to click on my actual gene models (from augustus or snap) and see functional annotation on that page that has been transferred onto the gene, along with links out to the appropriate external sites.  Is there a good general approach for doing that?


Thanks,
Shane

This email and any attachments thereto may contain private, confidential, and privileged material for the sole use of the intended recipient.  Any review, copying, or distribution of this email (or any attachments thereto) by others is strictly prohibited.  If you are not the intended recipient, please contact the sender immediately and permanently delete the original and any copies of this email and any attachments thereto.



------------------------------

Message: 2
Date: Mon, 16 Nov 2009 15:35:25 -0700
From: Barry Moore <barry.moore at genetics.utah.edu>
Subject: Re: [maker-devel] Question on ESTs and functional annotations
To: Shane Brubaker <SBrubaker at Aurorabiofuels.com>
Cc: "maker-devel at yandell-lab.org" <maker-devel at yandell-lab.org>
Message-ID: <6D22E2BC-73A1-4570-BA2F-307B15EFD434 at genetics.utah.edu>
Content-Type: text/plain; charset="us-ascii"; format=flowed; delsp=yes

Shane,

With regard to your second question: we've got a few scripts in the
bin directory that we have used to add functional annotations to MAKER
output.  They are:

maker_functional_fasta
maker_functional_gff
ipr_update_gff

The first two will take a blastp output from your MAKER proteins
against swissprot and add this sort of thing:

name:"Serine-aspartate repeat-containing protein I (Staphylococcus
saprophyticus)"
Note="Serine-aspartate repeat-containing protein I (Staphylococcus
saprophyticus)"

to your fasta and GFF files respectively.  It's not currently adding
the swissprot ID to that output, but it would be trivial to alter
either or both to do so.

Also, a great (but CPU-intensive) way to get deeper functional annotations
is to run InterProScan (IPRscan) on your MAKER proteins and then use the
last script listed above (ipr_update_gff) to add Dbxref and Ontology_term
attributes to your GFF.
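
As a rough illustration of the kind of change described above (this is not
the actual maker_functional_gff code), a Python sketch that appends a Note
attribute to mRNA features in a GFF3 file could look like the following. The
function name add_note and the descriptions mapping (feature ID to
description text) are hypothetical, and the sketch assumes the description
for each protein has already been pulled out of the BLAST report.

```python
from urllib.parse import quote


def add_note(gff_lines, descriptions):
    """Append Note=<description> to mRNA features whose ID is in descriptions.

    gff_lines    -- iterable of GFF3 lines (strings)
    descriptions -- hypothetical dict mapping a feature ID to description text
    """
    out = []
    for line in gff_lines:
        line = line.rstrip("\n")
        cols = line.split("\t")
        # Pass through comments, directives, and anything that is not
        # a nine-column mRNA feature line.
        if line.startswith("#") or len(cols) != 9 or cols[2] != "mRNA":
            out.append(line)
            continue
        # Column 9 holds semicolon-separated key=value attribute pairs.
        attrs = dict(
            kv.split("=", 1) for kv in cols[8].strip(";").split(";") if "=" in kv
        )
        desc = descriptions.get(attrs.get("ID"))
        if desc is not None and "Note" not in attrs:
            # GFF3 reserves ; = & , and tab inside attribute values,
            # so percent-encode them while leaving common punctuation alone.
            cols[8] = cols[8].rstrip(";") + ";Note=" + quote(desc, safe=" ()-:/.")
        out.append("\t".join(cols))
    return out
```

Only features whose ID appears in the mapping are touched, and an existing
Note is left alone, so the function can be re-run on its own output.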

Have a look at those scripts and then holler back if you have any
questions or want more details.

Barry





------------------------------

Message: 3
Date: Mon, 16 Nov 2009 15:41:19 -0800
From: Shane Brubaker <SBrubaker at Aurorabiofuels.com>
Subject: Re: [maker-devel] Question on ESTs and functional annotations
To: Barry Moore <barry.moore at genetics.utah.edu>
Cc: "maker-devel at yandell-lab.org" <maker-devel at yandell-lab.org>
Message-ID:
        <4F28B35C91C1B040B7183D28726A8663264CEF1545 at Exchange.aurora.local>
Content-Type: text/plain; charset="us-ascii"

That is very helpful, thanks Barry!




------------------------------

Message: 4
Date: Mon, 16 Nov 2009 16:42:25 -0700
From: Carson Holt <carson.holt at genetics.utah.edu>
Subject: Re: [maker-devel] Question on ESTs and functional annotations
To: Shane Brubaker <SBrubaker at Aurorabiofuels.com>,
        "maker-devel at yandell-lab.org" <maker-devel at yandell-lab.org>
Message-ID: <C7273171.112B%carson.holt at genetics.utah.edu>
Content-Type: text/plain; charset="iso-8859-1"

1. The BLASTN ESTs are the BLASTN results of aligning the EST library you provided during a MAKER run.  These are filtered using the thresholds you set in the maker_bopts.ctl file for coverage, identity, etc.  I think the default coverage is 80% and the default identity is 85%, so every reported alignment should cover at least 80% of the EST at 85% identity or better.  You can tighten or loosen these thresholds as needed, since you do expect a certain amount of sequencing error in the ESTs.
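
To make the filtering step concrete, here is a minimal Python sketch of a coverage/identity filter. It is not MAKER's implementation: the Hit class, its field names, and the passes function are hypothetical, and the 80%/85% defaults are simply the values recalled above, not checked against maker_bopts.ctl.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One EST-to-genome alignment (hypothetical, simplified record)."""
    est_id: str
    est_length: int       # full length of the EST, in bases
    aligned_bases: int    # EST bases covered by the alignment
    identical_bases: int  # aligned bases that match the genome


def passes(hit, min_coverage=0.80, min_identity=0.85):
    """Return True if the hit meets both the coverage and identity cutoffs."""
    if hit.est_length <= 0 or hit.aligned_bases <= 0:
        return False
    coverage = hit.aligned_bases / hit.est_length
    identity = hit.identical_bases / hit.aligned_bases
    return coverage >= min_coverage and identity >= min_identity
```

Under these assumptions, an alignment covering 900 of 1000 EST bases with 870 identical positions is kept, while one covering only 700 bases is dropped. That is consistent with seeing most, but not necessarily all, of each EST in the track.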

Positive BLASTN results are then realigned using Exonerate, which gives a higher-quality, splice-site-aware alignment.  The BLASTN alignments are recorded mostly for informational purposes; the important alignments are the Exonerate est2genome alignments derived from them.  Note that you will sometimes see a BLASTN alignment on one strand while the Exonerate alignment is on the other.  This is because splice sites let you determine the correct strand, which is not inherent in the EST sequence itself (except for Sanger sequence).
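
The strand point can be illustrated with a small Python sketch. It assumes canonical GT-AG introns only and is far simpler than Exonerate's est2genome model; the function name infer_strand and the coordinate convention are hypothetical. On the forward genomic sequence, an intron of a plus-strand transcript starts with GT and ends with AG, whereas an intron of a minus-strand transcript appears as the reverse complement, starting with CT and ending with AC.

```python
def infer_strand(genome, introns):
    """Guess the transcript strand from intron boundary dinucleotides.

    genome  -- forward-strand genomic sequence as a string
    introns -- list of (start, end) intron coordinates, 0-based, end-exclusive
    Returns "+", "-", or "." when the evidence is absent or tied.
    """
    plus = minus = 0
    for start, end in introns:
        first_two = genome[start:start + 2].upper()
        last_two = genome[end - 2:end].upper()
        if (first_two, last_two) == ("GT", "AG"):
            plus += 1   # canonical intron read on the forward strand
        elif (first_two, last_two) == ("CT", "AC"):
            minus += 1  # reverse complement of GT...AG
    if plus > minus:
        return "+"
    if minus > plus:
        return "-"
    return "."
```

An alignment with no introns gives no splice-site evidence at all, which is why the function falls back to "." in that case.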

2. For functional annotation, I like to use InterProScan from the EBI.  It can identify protein domains and the related GO functional categories for a gene.  InterProScan takes a while to run, though, as it is a very extensive analysis.  The current version of MAKER includes a few accessory scripts for adding InterProScan results to the final annotation set.  The main script is ipr_update_gff, which adds functional tags to the gene models in accordance with the GFF3 format.  You can then load these GFF3 files into a Chado database and run tools like GBrowse off of it.

If you are using GBrowse for viewing your data, you can set it up to link out to the appropriate external sites.  It is not entirely simple to do, but the GBrowse documentation covers it.  You will also need to be familiar with SQL for some of the more advanced options.

Hope that helps,
Carson




------------------------------

Message: 5
Date: Mon, 16 Nov 2009 15:48:22 -0800
From: Shane Brubaker <SBrubaker at Aurorabiofuels.com>
Subject: Re: [maker-devel] Question on ESTs and functional annotations
To: Carson Holt <carson.holt at genetics.utah.edu>,
        "maker-devel at yandell-lab.org"   <maker-devel at yandell-lab.org>
Message-ID:
        <4F28B35C91C1B040B7183D28726A8663264CEF1547 at Exchange.aurora.local>
Content-Type: text/plain; charset="us-ascii"

Thanks very much, Carson, that is quite helpful.  I will try using the est2genome results instead of the blastn results.

I am indeed using Chado and GBrowse, so that is what I would like to do.  I will try using InterProScan.



------------------------------

_______________________________________________
maker-devel mailing list
maker-devel at yandell-lab.org
http://yandell-lab.org/mailman/listinfo/maker-devel_yandell-lab.org


End of maker-devel Digest, Vol 18, Issue 4
******************************************


