Researchers at the Georgia Institute of Technology have developed a computer program that trains itself to predict genes in the DNA sequences of fungi. Understanding the recently sequenced fungal genomes can help in developing and producing critical pharmaceuticals. Gene prediction can also help to identify potential targets for therapeutic intervention and vaccination against pathogenic fungi.
Mark Borodovsky, director of Georgia Tech's Center for Bioinformatics and Computational Genomics, and his colleagues expanded the self-training eukaryotic gene-prediction program they developed in 2005 to address the fact that fungal genes are more complex than those of other eukaryotes. Unlike other programs, which require a predetermined training set in addition to the genome sequence, GeneMark.hmm-ES (BP) requires only the genome sequence. The program iteratively estimates its algorithm parameters directly from the anonymous (that is, unannotated) sequence.
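The general idea of self-training can be illustrated with a toy sketch. This is not the GeneMark.hmm-ES (BP) algorithm, which is built on a hidden Markov model of gene structure; it is a deliberately simplified stand-in that assumes two classes of sequence windows distinguishable by GC content alone. The function names `gc_fraction` and `self_train`, the window size, and the labeling rule are all hypothetical choices made for illustration. The loop starts from a crude guess, labels the sequence with the current parameters, re-estimates the parameters from those labels, and repeats until the labels stop changing, with no pre-labeled training set involved.

```python
def gc_fraction(window):
    """Fraction of G and C bases in a window of DNA."""
    return sum(base in "GC" for base in window) / len(window)


def self_train(sequence, window=6, max_iter=50):
    """Toy self-training loop (hypothetical, for illustration only).

    Splits the sequence into non-overlapping windows and learns two
    class means of GC content without any labeled examples.
    Returns (labels, (low_mean, high_mean)).
    """
    windows = [
        sequence[i:i + window]
        for i in range(0, len(sequence) - window + 1, window)
    ]
    values = [gc_fraction(w) for w in windows]

    # Crude initial parameters taken from the unlabeled data itself.
    low, high = min(values), max(values)
    labels = None

    for _ in range(max_iter):
        # Prediction step: label each window with the closer class mean.
        new_labels = [1 if abs(v - high) < abs(v - low) else 0 for v in values]
        if new_labels == labels:
            break  # labels are stable, so the parameters have converged
        labels = new_labels

        # Re-estimation step: update each mean from its assigned windows.
        ones = [v for v, lab in zip(values, labels) if lab == 1]
        zeros = [v for v, lab in zip(values, labels) if lab == 0]
        if ones:
            high = sum(ones) / len(ones)
        if zeros:
            low = sum(zeros) / len(zeros)

    return labels, (low, high)


if __name__ == "__main__":
    toy_sequence = "ATATAT" * 3 + "GCGCGC" * 3 + "ATATAT" * 2
    print(self_train(toy_sequence))
```

A real gene finder replaces the single GC-content statistic with a full probabilistic model of exons, introns, and intergenic regions, but the alternation between predicting with the current parameters and re-estimating them from the predictions follows the same pattern.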
"The enhanced program predicted fungal genes with higher accuracy than either the original self-training algorithm or known algorithms with supervised training," noted Borodovsky. "And because we didn't need any additional training information for our program, the sequencing teams could immediately proceed with gene annotation right after the genomic sequence was in hand, without spending time and effort to extract a set of validated genes necessary for estimating parameters of traditional algorithms."

