[maker-devel] add_utr_gff.pl and maker2zff.pl issues when trying to convert to GBrowse friendly format

projectcortana at gmail.com projectcortana at gmail.com
Tue Nov 4 09:34:32 MST 2008


Allo,

I am having problems running the two external scripts provided with
MAKER on my GFF output file, which was built from the default FASTA
files in MAKER's data directory.

In order to add UTRs for GBrowse, the README states that users must
run both the add_utr_gff.pl and maker2zff.pl scripts on the GFF, yet the
resulting wutr.gff file is identical to my original GFF. After checking
the source, it seems the reason is that the source_tag values in my GFF
don't pass the first, or any other, conditional in the script. For
instance, the first if statement in add_utr_gff.pl checks whether the
source_tag equals "maker", yet the code that writes the word 'maker'
has been commented out in the gene_data subroutine, located in
../maker/lib/Dumper/GFF/GFFV3.pm. After a quick look at the remaining
conditionals in both scripts, it is clear that the GFF file fails the
rest of them as well.
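
A quick way to confirm which source tags the file actually contains is
to tally column 2 of every feature line. The following is only a sketch
in Python, not part of MAKER or its Perl scripts; the function name
count_sources and the sample lines are my own illustrative assumptions,
and it expects each feature on a single tab-separated line.

```python
from collections import Counter


def count_sources(gff_lines):
    """Tally column 2 (the source tag) of each feature line in GFF3 text."""
    counts = Counter()
    for line in gff_lines:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # sequence data follows; no more feature lines
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        fields = line.split("\t")
        if len(fields) < 9:
            continue  # skip malformed or mail-wrapped lines
        counts[fields[1]] += 1
    return counts


# Hypothetical sample: two abbreviated feature lines in the style of my GFF.
sample = [
    "##gff-version 3",
    "contig-dpp-500-500\trepeatmasker\tmatch\t903\t928\t.\t+\t.\tID=contig-dpp-500-500:hit:0",
    "contig-dpp-500-500\tblastn\tmatch_part\t31379\t31429\t1.07552e-18\t+\t.\tID=contig-dpp-500-500:hsp:19",
    "##FASTA",
    ">contig-dpp-500-500",
]
counts = count_sources(sample)
print(dict(counts))
print("maker features present:", counts["maker"] > 0)
```

If no line carries "maker" in column 2, a script that filters on that
source tag would have nothing to act on, which matches the unchanged
output I am seeing.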

Do these scripts need updating, or is my GFF file to blame? If it is
my GFF, what is wrong with it compared to what MAKER should normally
output?

Thank you,

Sam

##gff-version 3
##sequence-region contig-dpp-500-500 1 32156
contig-dpp-500-500	.	contig	1	32156	.	.	.	ID=contig-dpp-500-500;Name=contig-dpp-500-500
contig-dpp-500-500	repeatmasker	match	903	928	.	+	.	ID=contig-dpp-500-500:hit:0;Name=species:(CGAAT)n-genus:Simple_repeat;Target=species:(CGAAT)n-genus:Simple_repeat 5 29 +
contig-dpp-500-500	repeatmasker	match_part	903	928	185	+	.	ID=contig-dpp-500-500:hsp:0;Parent=contig-dpp-500-500:hit:0;Name=species:(CGAAT)n-genus:Simple_repeat;Target=species:(CGAAT)n-genus:Simple_repeat 5 29 +
contig-dpp-500-500	repeatmasker	match	5809	5897	.	+	.	ID=contig-dpp-500-500:hit:1;Name=species:(CAA)n-genus:Simple_repeat;Target=species:(CAA)n-genus:Simple_repeat 2 88 +
contig-dpp-500-500	repeatmasker	match_part	5809	5897	244	+	.	ID=contig-dpp-500-500:hsp:1;Parent=contig-dpp-500-500:hit:1;Name=species:(CAA)n-genus:Simple_repeat;Target=species:(CAA)n-genus:Simple_repeat 2 88 +
contig-dpp-500-500	repeatmasker	match	5170	5198	.	+	.	ID=contig-dpp-500-500:hit:2;Name=species:AT_rich-genus:Low_complexity;Target=species:AT_rich-genus:Low_complexity 1 29 +
contig-dpp-500-500	repeatmasker	match_part	5170	5198	29	+	.	ID=contig-dpp-500-500:hsp:2;Parent=contig-dpp-500-500:hit:2;Name=species:AT_rich-genus:Low_complexity;Target=species:AT_rich-genus:Low_complexity 1 29 +
contig-dpp-500-500	repeatmasker	match	12416	12440	.	+	.	ID=contig-dpp-500-500:hit:3;Name=species:AT_rich-genus:Low_complexity;Target=species:AT_rich-genus:Low_complexity 1 25 +
contig-dpp-500-500	repeatmasker	match_part	12416	12440	25	+	.	ID=contig-dpp-500-500:hsp:3;Parent=contig-dpp-500-500:hit:3;Name=species:AT_rich-genus:Low_complexity;Target=species:AT_rich-genus:Low_complexity 1 25 +
contig-dpp-500-500	repeatmasker	match	15478	15502	.	+	.	ID=contig-dpp-500-500:hit:4;Name=species:AT_rich-genus:Low_complexity;Target=species:AT_rich-genus:Low_complexity 1 25 +
contig-dpp-500-500	repeatmasker	match_part	15478	15502	25	+	.	ID=contig-dpp-500-500:hsp:4;Parent=contig-dpp-500-500:hit:4;Name=species:AT_rich-genus:Low_complexity;Target=species:AT_rich-genus:Low_complexity 1 25 +
contig-dpp-500-500	repeatmasker	match	17472	17494	.	+	.	ID=contig-dpp-500-500:hit:5;Name=species:AT_rich-genus:Low_complexity;Target=species:AT_rich-genus:Low_complexity 1 23 +
contig-dpp-500-500	repeatmasker	match_part	17472	17494	23	+	.	ID=contig-dpp-500-500:hsp:5;Parent=contig-dpp-500-500:hit:5;Name=species:AT_rich-genus:Low_complexity;Target=species:AT_rich-genus:Low_complexity 1 23 +
contig-dpp-500-500	repeatmasker	match	31755	31785	.	+	.	ID=contig-dpp-500-500:hit:6;Name=species:AT_rich-genus:Low_complexity;Target=species:AT_rich-genus:Low_complexity 1 31 +
contig-dpp-500-500	repeatmasker	match_part	31755	31785	24	+	.	ID=contig-dpp-500-500:hsp:6;Parent=contig-dpp-500-500:hit:6;Name=species:AT_rich-genus:Low_complexity;Target=species:AT_rich-genus:Low_complexity 1 31 +
contig-dpp-500-500	repeatmasker	match	31845	31888	.	+	.	ID=contig-dpp-500-500:hit:7;Name=species:AT_rich-genus:Low_complexity;Target=species:AT_rich-genus:Low_complexity 1 44 +
contig-dpp-500-500	repeatmasker	match_part	31845	31888	30	+	.	ID=contig-dpp-500-500:hsp:7;Parent=contig-dpp-500-500:hit:7;Name=species:AT_rich-genus:Low_complexity;Target=species:AT_rich-genus:Low_complexity 1 44 +
contig-dpp-500-500	repeatmasker	match	26624	26647	.	+	.	ID=contig-dpp-500-500:hit:8;Name=species:(GTCTG)n-genus:Simple_repeat;Target=species:(GTCTG)n-genus:Simple_repeat 1 24 +
contig-dpp-500-500	repeatmasker	match_part	26624	26647	195	+	.	ID=contig-dpp-500-500:hsp:8;Parent=contig-dpp-500-500:hit:8;Name=species:(GTCTG)n-genus:Simple_repeat;Target=species:(GTCTG)n-genus:Simple_repeat 1 24 +
contig-dpp-500-500	repeatmasker	match	105	129	.	+	.	ID=contig-dpp-500-500:hit:9;Name=species:(CA)n-genus:Simple_repeat;Target=species:(CA)n-genus:Simple_repeat 1 25 +
contig-dpp-500-500	repeatmasker	match_part	105	129	204	+	.	ID=contig-dpp-500-500:hsp:9;Parent=contig-dpp-500-500:hit:9;Name=species:(CA)n-genus:Simple_repeat;Target=species:(CA)n-genus:Simple_repeat 1 25 +
contig-dpp-500-500	repeatmasker	match	26695	26756	.	+	.	ID=contig-dpp-500-500:hit:10;Name=species:(CCG)n-genus:Simple_repeat;Target=species:(CCG)n-genus:Simple_repeat 3 65 +
contig-dpp-500-500	repeatmasker	match_part	26695	26756	247	+	.	ID=contig-dpp-500-500:hsp:10;Parent=contig-dpp-500-500:hit:10;Name=species:(CCG)n-genus:Simple_repeat;Target=species:(CCG)n-genus:Simple_repeat 3 65 +
contig-dpp-500-500	repeatmasker	match	2163	2192	.	+	.	ID=contig-dpp-500-500:hit:11;Name=species:(CAG)n-genus:Simple_repeat;Target=species:(CAG)n-genus:Simple_repeat 2 31 +
contig-dpp-500-500	repeatmasker	match_part	2163	2192	183	+	.	ID=contig-dpp-500-500:hsp:11;Parent=contig-dpp-500-500:hit:11;Name=species:(CAG)n-genus:Simple_repeat;Target=species:(CAG)n-genus:Simple_repeat 2 31 +
contig-dpp-500-500	repeatmasker	match	1849	1881	.	+	.	ID=contig-dpp-500-500:hit:12;Name=species:(CTTTG)n-genus:Simple_repeat;Target=species:(CTTTG)n-genus:Simple_repeat 1 33 +
contig-dpp-500-500	repeatmasker	match_part	1849	1881	206	+	.	ID=contig-dpp-500-500:hsp:12;Parent=contig-dpp-500-500:hit:12;Name=species:(CTTTG)n-genus:Simple_repeat;Target=species:(CTTTG)n-genus:Simple_repeat 1 33 +
contig-dpp-500-500	blastx:repeatmask	protein_match	28310	28507	0.00421711	-	.	ID=contig-dpp-500-500:hit:13;Name=gi|18254413|gb|AAL66754.1|AF464738_5;Target=gi|18254413|gb|AAL66754.1|AF464738_5 784 855 +
contig-dpp-500-500	blastx:repeatmask	match_part	28310	28507	0.00421711	-	.	ID=contig-dpp-500-500:hsp:13;Parent=contig-dpp-500-500:hit:13;Name=gnl|BL_ORD_ID|15987;Target=gnl|BL_ORD_ID|15987 784 855 +
contig-dpp-500-500	blastx:repeatmask	protein_match	30776	30931	0.016025	-	.	ID=contig-dpp-500-500:hit:14;Name=gi|7670973|gb|AAF66306.1|;Target=gi|7670973|gb|AAF66306.1| 135 185 +
contig-dpp-500-500	blastx:repeatmask	match_part	30776	30931	0.016025	-	.	ID=contig-dpp-500-500:hsp:14;Parent=contig-dpp-500-500:hit:14;Name=gnl|BL_ORD_ID|4439;Target=gnl|BL_ORD_ID|4439 135 185 +
contig-dpp-500-500	blastx:repeatmask	protein_match	31190	31270	9.72204	+	.	ID=contig-dpp-500-500:hit:15;Name=gi|4521269|dbj|BAA76304.1|;Target=gi|4521269|dbj|BAA76304.1| 661 687 +
contig-dpp-500-500	blastx:repeatmask	match_part	31190	31270	9.72204	+	.	ID=contig-dpp-500-500:hsp:15;Parent=contig-dpp-500-500:hit:15;Name=gnl|BL_ORD_ID|22384;Target=gnl|BL_ORD_ID|22384 661 687 +
contig-dpp-500-500	blastx:repeatmask	protein_match	31558	31587	4.36403	-	.	ID=contig-dpp-500-500:hit:16;Name=gi|27670321|ref|XP_229474.1|;Target=gi|27670321|ref|XP_229474.1| 451 460 +
contig-dpp-500-500	blastx:repeatmask	match_part	31558	31587	4.36403	-	.	ID=contig-dpp-500-500:hsp:16;Parent=contig-dpp-500-500:hit:16;Name=gnl|BL_ORD_ID|20333;Target=gnl|BL_ORD_ID|20333 451 460 +
contig-dpp-500-500	blastx:repeatmask	protein_match	31717	31818	0.231401	+	.	ID=contig-dpp-500-500:hit:17;Name=gi|327819|gb|AAB03749.1|;Target=gi|327819|gb|AAB03749.1| 18 51 +
contig-dpp-500-500	blastx:repeatmask	match_part	31717	31818	0.231401	+	.	ID=contig-dpp-500-500:hsp:17;Parent=contig-dpp-500-500:hit:17;Name=gnl|BL_ORD_ID|29022;Target=gnl|BL_ORD_ID|29022 18 51 +
contig-dpp-500-500	blastx:repeatmask	protein_match	32026	32109	2.55843	-	.	ID=contig-dpp-500-500:hit:18;Name=gi|6015506|emb|CAB57796.1|;Target=gi|6015506|emb|CAB57796.1| 138 165 +
contig-dpp-500-500	blastx:repeatmask	match_part	32026	32109	2.55843	-	.	ID=contig-dpp-500-500:hsp:18;Parent=contig-dpp-500-500:hit:18;Name=gnl|BL_ORD_ID|30389;Target=gnl|BL_ORD_ID|30389 138 165 +
contig-dpp-500-500	blastn	expressed_sequence_match	31379	31507	1.07552e-18	+	.	ID=contig-dpp-500-500:hit:19;Name=dpp-mRNA-3;Target=dpp-mRNA-3 3961 4089 +
contig-dpp-500-500	blastn	match_part	31379	31429	1.07552e-18	+	.	ID=contig-dpp-500-500:hsp:19;Parent=contig-dpp-500-500:hit:19;Name=gnl|BL_ORD_ID|2;Target=gnl|BL_ORD_ID|2 3961 4011 +
contig-dpp-500-500	blastn	match_part	31449	31507	1.80977e-23	+	.	ID=contig-dpp-500-500:hsp:20;Parent=contig-dpp-500-500:hit:19;Name=gnl|BL_ORD_ID|2;Target=gnl|BL_ORD_ID|2 4031 4089 +
##FASTA
>contig-dpp-500-500
TGAGAGAGCTGAAATATTGTAATTGTGAGTCTGGCTTGTTTGTTATTGTTGCCTTAGCGG
TTGCTTGTTGTTTTTTTGGCTTGATTAATAATTAATCGCACTCGCACACACACACACACA
...cut for brevity.



