Gala: Genome Alignment and Annotation database

Example query 2: Find possible single-exon genes, or genes with large exons, that are currently unknown. This search also pulls out long conserved noncoding sequences (CNS).

Query:

Step 1
Select all alignments that are at least 500 bp long and at least 80% identical, with no more than 2 bp of gap and an identity step of no more than 5%. Start with the query form. Enter 500 for minimum length, 80 for minimum percent identity, 2 for maximum gap size, and 5 for maximum identity step. Under Range, select chr22. Type "alignments" in the query description box, then click Submit query. (*Found 99 ranges)
Step 2
Select all genes. First click the Query form button at the top of the page to get a new query form. Then click the checkbox labeled Query all genes. Under Range, select chr22. Type "genes" in the query description box and click Submit query. (*Found 776 ranges)
Step 3
Remove alignments that overlap known genes by using subtraction on the history page. Click the checkboxes next to the queries "alignments" and "genes". Select the subtract radio button that will subtract the genes from the alignments. If the steps are done in the order given here, that is "Subtraction earlier - later query". Leave "whole regions" selected in the box under subtraction to remove any alignment that overlaps a gene. For the format of output, select Genome Browser user track. Type "sample query 2" in the query description box and click Submit compound query. (*Found 10 ranges)
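
The "whole regions" subtraction in Step 3 can be illustrated outside the web interface. The following is a minimal Python sketch, not GALA's actual implementation: the Range class, the function names, and the coordinates are all hypothetical, and ranges are assumed to be inclusive at both ends. An alignment is dropped entirely if it overlaps any gene on the same chromosome.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    chrom: str
    start: int  # first base, inclusive (assumption)
    end: int    # last base, inclusive (assumption)


def overlaps(a: Range, b: Range) -> bool:
    """True if the two ranges share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def subtract_whole_regions(earlier, later):
    """Keep only the ranges in `earlier` that overlap nothing in `later`."""
    return [r for r in earlier if not any(overlaps(r, g) for g in later)]


# Toy coordinates for illustration only; not taken from the database.
alignments = [
    Range("chr22", 100, 700),
    Range("chr22", 1000, 1600),
    Range("chr22", 3000, 3600),
]
genes = [
    Range("chr22", 650, 900),    # overlaps the first alignment
    Range("chr21", 1000, 1600),  # different chromosome, so no overlap
]

remaining = subtract_whole_regions(alignments, genes)
print(remaining)
```

Here the first alignment is removed as a whole even though only part of it overlaps a gene, which mirrors the "whole regions" setting described above.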

Results:

  1. chr22 13381719 to 13382314
    The human genome browser currently shows no predicted exons in this range. The alignment falls upstream of AK001324, annotated as Homo sapiens weakly similar to MALE STERILITY PROTEIN 2. Part of this conserved noncoding sequence (CNS) contains a repetitive element. A BLAST search shows that it falls within the Cat Eye Syndrome Region.
  2. chr22 14460724 to 14461366
    The human genome browser currently shows no predicted exons in this range. The alignment falls at the 5' end of Homo sapiens Cat Eye Syndrome critical region 7 mRNA sequence.
  3. chr22 15896297 to 15896943
    The human genome browser currently shows no predicted exons in this range. The aligning segment is a CNS at the 5' end of DiGeorge Critical region gene 5.
  4. chr22 17967333 to 17967886
    The human genome browser currently shows no predicted exons in this range. The aligning segment is a CNS at the 5' end of the CRKL gene.
  5. chr22 23232064 to 23232705
    The human genome browser currently shows no predicted exons in this range. The aligning segment is a CNS 20 kb upstream of KIAA0927.
  6. chr22 23864165 to 23865294
    The human genome browser currently shows no predicted exons overlapping this region. The aligning segment is a CNS that falls within the intron of an Acembly prediction.
  7. chr22 24050310 to 24050935
    The human genome browser currently shows no predicted exons overlapping this region. The aligning segment is a CNS within an intron of a Genscan prediction.
  8. chr22 36254154 to 36255226
    Overlaps the 5' end of PDGFB, using RefSeq gene coordinates. The database uses Sanger's gene coordinates, which do not overlap this region.
  9. chr22 36696247 to 36697728
    Overlaps the 5' end of an Acembly gene prediction.
  10. chr22 41527677 to 41528672
    Overlaps an Acembly single-exon gene prediction, near the 5' end of the DKFZp761O17121 gene.
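
Ranges like those listed above can be written out by hand as a Genome Browser custom track. The sketch below is illustrative only and does not reproduce GALA's own output: the function name is hypothetical, and it assumes the listed coordinates are 1-based and inclusive, so each start is shifted down by one to match the 0-based, half-open convention of BED lines.

```python
def to_browser_track(ranges, name, description):
    """Format (chrom, start, end) tuples as a custom track with BED lines.

    Assumes the input coordinates are 1-based and inclusive; BED starts
    are 0-based, so 1 is subtracted from each start.
    """
    lines = [f'track name="{name}" description="{description}"']
    for chrom, start, end in ranges:
        lines.append(f"{chrom}\t{start - 1}\t{end}")
    return "\n".join(lines)


# The first two ranges from the results list above.
track = to_browser_track(
    [("chr22", 13381719, 13382314), ("chr22", 14460724, 14461366)],
    name="sample_query_2",
    description="sample query 2",
)
print(track)
```

If the coordinates turn out to be 0-based already, the subtraction in the loop should be removed.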

Summary:
This query was intended to highlight potentially unknown genes. However, since the annotations on chromosome 22 are very complete, no new genes were identified. Instead, the query returns several regions of high sequence similarity with no predicted coding potential. These conserved noncoding regions are located at the 5' ends of genes as well as in introns of some predicted but uncharacterized genes.

*NOTE: the numbers in each category and the resulting ranges may change as we continue to update our information tables.