STITCH is a computer program that processes raw nucleotide-sequence data to automatically remove unwanted vector information, perform reverse-complement comparison, stitch shorter sequences together into the longer ones to which they presumably belong, and search against the user's choice of private and Internet-accessible public 16S rRNA databases. ["16S rRNA" denotes the small-subunit ribosomal ribonucleic acid (rRNA) sequence that is common to all bacteria and archaea.] In STITCH, a template 16S rRNA sequence is used to position forward and reverse reads. STITCH then automatically searches the known 16S rRNA sequences in the user's chosen database(s) to find the sequence most similar to each spliced sequence, that is, the sequence that lies at the smallest edit distance from it.
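
The operations described above can be illustrated with a minimal sketch. This is not STITCH's code: the function names, the toy sequences, and the database layout (a dictionary of name-to-sequence pairs) are all hypothetical. Reads are joined here at their longest exact overlap, a simplification standing in for the template-based positioning that STITCH uses, and similarity is measured as plain Levenshtein edit distance, which is one common reading of "edit distance."

```python
# Illustrative sketch only; names and data are hypothetical, not from STITCH.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def stitch(forward: str, reverse_read: str, min_overlap: int = 4) -> str:
    """Join a forward read to the reverse complement of a reverse read.

    Assumption: the reads share an exact suffix/prefix overlap. STITCH
    itself positions reads against a template sequence instead.
    """
    rc = reverse_complement(reverse_read)
    # Try the longest possible overlap first, then shrink.
    for k in range(min(len(forward), len(rc)), min_overlap - 1, -1):
        if forward[-k:] == rc[:k]:
            return forward + rc[k:]
    raise ValueError("no overlap of the required length found")


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def closest_match(query: str, database: dict[str, str]) -> tuple[str, int]:
    """Return the name and distance of the database entry nearest to query."""
    distances = {name: edit_distance(query, seq) for name, seq in database.items()}
    best = min(distances, key=distances.get)
    return best, distances[best]


if __name__ == "__main__":
    spliced = stitch("ACGTACGGA", "GAAATCCG")
    toy_db = {"ref_a": "ACGTACGGATTTC", "ref_b": "ACGTTTTTTTTTC"}
    print(spliced, closest_match(spliced, toy_db))
```

An exhaustive edit-distance scan like this is quadratic in sequence length for every database entry, so it is practical only for small reference sets; searching large public databases would call for an indexed or heuristic method.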

The result of processing by STITCH is the identification of the most similar well-described bacterium. Whereas previously available commercial software for analyzing genetic sequences operates on one sequence at a time, STITCH can perform the aforementioned operations on multiple sequences simultaneously. A typical STITCH analysis of several dozen sequences (each of the order of 10³ base pairs long) is completed in a few minutes, whereas the same analysis takes hours or days with prior software.

This program was written by Shariff Osman and Kasthuri Venkateswaran of Caltech; George Fox of Dept. of Biology and Biochemistry, University of Texas, Houston; and Dianhui Zhu of Dept. of Computer Sciences, University of Texas, Houston for NASA's Jet Propulsion Laboratory.

In accordance with Public Law 96-517, the contractor has elected to retain title to this invention. Inquiries concerning rights for its commercial use should be addressed to:

Innovative Technology Assets Management
Mail Stop 202-233
4800 Oak Grove Drive
Pasadena, CA 91109-8099

Refer to NPO-44785, volume and number of this NASA Tech Briefs issue, and the page number.

This Brief includes a Technical Support Package (TSP).
Automated Identification of Nucleotide Sequences (reference NPO-44785) is currently available for download from the TSP library.