    FASTA Format

    A sequence in FASTA format begins with a single-line description, followed by one or more lines of sequence data. The description line is distinguished from the sequence data by a greater-than symbol (">") in the first column. DNA letters may be in upper or lower case. Spaces, tabs, and line breaks within the sequence data are ignored. Any character other than the standard bases (A, T, C, and G), e.g., Z or 3, is treated as an unknown base; X can be used to mark an unknown base explicitly. An example:

    >ORFA00005
    ATGTATGAAGTTTTAGTGGTTGTTTACTTGTTGGTTGCATTAGGGGTCAATTTACA
    GGTTTAATTGGCCTGATCTTAATCCAGCAGGGTAAAGGAGCTCTAGCCATTGCGTT
    GACATGGGGGCCTCATTTGGCGCCGGTGCATCAGGTACCTTAACATTAGAACATTA
    TTTGGTTCAAGCGGTTCAGGTAACTTCCTGACACGCACAACGCTAGCCATTGCGTT
    GCAATCCTAGCCATTGCGTTTTTTACCTTAAGTTTGCTAATTGGGGTCAATTTACA
    GCAATTTAAGTGCAAACCACGCAAAAAATGAAGATGCATGGACTAGCCATTGCGTT
    AAATTTAGGTTCAGACACTGAACAGGTTACCCAACCTGTTGAACATTAGAACATTA
    GGAACCGAAAAGTCAGAAACAAAAATTCCTGAC

    >ORFA00006
    TTGTTTTTTCGGGGGTCAATTTTGGCAACATTAGAATCCAGACTGAAGCATTAGGC
    GGCAGACATGCTCAAAGTGCCTGTGGAAGCATTAGGCTTTCAACCAGCATTTGGCA
    TTTGGGGTATTGAATATGTACAAGCCGGTAAACATTCCATACTGCAGCGACTAGCA
    CGCGTGTTCATTGATGGTGAGAATGGCATCAATATCGAAGATTGGAAGCATTAGGC
    GCCAACGTAAGTCGCCAAGTCAGTGCTGTGCTAGATGTTGAAGAGAAGCATTAGGC
    CCCTATTTCTACTGAATATACCTTAGAGGTTTCTTCGCCTGGTGCAGCATTTGGCA
    AGATAGACCGCTGTTCACTGCTGAACAATACGCGGCCTATGTCGCAGCGACTAGCA
    CGAGGATGTCAAACTTCAACTGACTATGCCTGTCGCGGGCAGTCGAAGCATTAGGC
    >ORFA00007
    GTGGAACGGCCTTTTCATTTGAAACCGCTTCAGCGACTAGCAGAGCGTGTTCATTG
    TCTCTTTATTCATTTTGTCTTGCCTCGTTCACCTTGAAACCATCGTTTCTTCGCCT
    AACTTTGCGATGATGTTGCCTTTACGGATATTATCCAAGGCGACGCGTGTTCATTG
    CCAGCTCTTTACCATTCACATTCACCGACAGCATTTGGCCGTCAGTTTCTTCGCCT
    AACCTGAGTAATGGCGCCTTT
    >ORF00007 acyl carrier protein (acpP)
    ttgacggaagcgggcggctcgttgcaccccgttcagccttgcgcccccgcttct
    cgtgtacactgcgggcacttcagtttcaggaggaatttggtaatggcgactttt
    gtgaaagatgtgattgtggacaagctcggtgtggacgaaggcaaggtgaccccc
    cgcttcgtggaagacctcggcgccgacagcctggaaaccgtggaactgatcatg
    gaagacaaattcggcgtgaccattcccgacgaagccgccgaaaccatccgcacc
    gccgcggtcgactacatcgacaacaaccag
    >ORF00058 conserved hypothetical protein
    atgtcagatatgaatgacgttgcccccccgaccttctgtcccgtgtaccgcgcc
    gtgttgcaggaaaaatgggtgctgcacatcgtccgcgccctgctggggagcgaa
    ttcaacgagctggcccgcgccgtgggcggctgcaacagcgccaccctgacgcag
    gagagcctggaagacctgggcatcatcgtcaagcgcaccgaagacggcggcggc
    gcccgcagcgtgtactcgctgacccctgccggacaggaactccagaccgtgatt
    atcgacgcctgggcgcgcgcgcacctcagcgaatccgagccgacgcgctgcgtg
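
The reading rules described above can be sketched in a few lines of Python. This is a minimal illustration, not the program's actual parser: the function name `parse_fasta`, the choice to key records by the first word of the description line, and the choice to report every unknown base as X are all assumptions made for this example. It also tolerates leading whitespace before ">", which is an assumption that goes beyond the "first column" rule.

```python
from typing import Dict, Iterable, List

STANDARD_BASES = set("ATCG")  # the standard coding named in the text
UNKNOWN_BASE = "X"            # assumption: unknown bases are reported as X


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Read FASTA records into {identifier: normalized sequence}.

    Hypothetical helper written for illustration only.
    A repeated identifier replaces the earlier record.
    """
    records: Dict[str, List[str]] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # blank lines carry no sequence data
        if line.startswith(">"):
            # Description line: keep the first word as the identifier.
            parts = line[1:].split(None, 1)
            name = parts[0] if parts else ""
            records[name] = []
            continue
        if name is None:
            raise ValueError("sequence data found before any description line")
        for ch in line:
            if ch.isspace():
                continue  # spaces and tabs inside the sequence are ignored
            base = ch.upper()  # upper and lower case are equivalent
            records[name].append(base if base in STANDARD_BASES else UNKNOWN_BASE)
    return {key: "".join(bases) for key, bases in records.items()}


if __name__ == "__main__":
    demo = [">demo1 hypothetical record", "ttg acg", "Z3"]
    print(parse_fasta(demo))  # {'demo1': 'TTGACGXX'}
```

To read a file rather than a list, pass the open file object directly, e.g. `parse_fasta(open("sequences.fasta"))`, where the file name is a placeholder.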

     

     

 

Designed by Gyan Prakash Srivastava, Digital Biology Laboratory, Computer Science Department, University of Missouri - Columbia USA