[maker-devel] Maker question

Carson Holt carson.holt at genetics.utah.edu
Tue Oct 20 22:51:57 MDT 2009


If you're using proteins from a closely related organism, just add the data into the mix with all the other protein data.  You really don't need to differentiate protein data from the same or another organism here, because it all gets scored by the same BLOSUM matrix anyway.

However, if you're using ESTs or transcript data from a closely related organism, you want to give this as the altest option in the maker_opts.ctl file.  EST/cDNA sequence from the same organism goes with the est option in the maker_opts.ctl file.  While differentiating protein sequence really is not important, differentiating nucleotide sequence like ESTs and transcripts is.  This is because nucleotide sequence tends to diverge rapidly, so you have to use a different alignment strategy when the data comes from another organism.  Just for clarification, raw genomic sequence from another organism is not used by MAKER.

I hope that helps,
Carson

Examples of a hypothetical maker setup:

EST file -> ESTs and cDNAs from the same organism
AltEST file -> All ESTs from closely related organisms in dbEST combined with transcripts from select closely related genomes
Protein file -> Proteins from all closely related organisms, all of UniProt/Swiss-Prot, and proteins from the same organism (if available)
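
As a rough sketch only, the three entries above might look something like this in maker_opts.ctl. The est and altest option names are the ones mentioned earlier in this message; protein is assumed here to be the matching option name for the protein file, and all of the FASTA file names are made-up placeholders, so check them against the control file generated by your own MAKER install.

```
#-----hypothetical maker_opts.ctl fragment (file names are placeholders)
est=same_organism_ests_cdnas.fasta #ESTs and cDNAs from the same organism
altest=related_organism_ests_transcripts.fasta #ESTs/transcripts from closely related organisms
protein=combined_proteins.fasta #related organisms + UniProt/Swiss-Prot + same organism (if available)
```

Each option here points at a single combined FASTA file, which keeps the split simple: same-organism nucleotide data in one place, other-organism nucleotide data in another, and all protein data together.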



On 10/20/09 3:20 PM, "Daniel Standage" <byuhobbes at gmail.com> wrote:

Carson,

I was going to send this to the mailing list, but Chris Conley said it would probably be fine if I sent it directly to you. Thanks for your help so far.

We're getting ready to use maker for annotating a fungal genome. I feel like I have a pretty good understanding of what goes into and comes out of maker (thanks to the summer school session and some time on my own playing with it). However, it's not clear to me how to integrate available data from a model organism. I feel like if we just toss everything into the mix, maker won't differentiate between the organism we're studying and the organism we're using as a reference.

Is it possible to use data from a closely related organism to assist maker? If so, do we just toss the data into the mix or do we need to indicate that the data is from a reference organism? There is genomic, transcriptomic, and proteomic data available, and it would probably go a long way to improve any gene models maker predicts.

Thanks.

Daniel Standage
Plant Genetics Lab
Brigham Young University
