Gala: Genome Alignment and Annotation Database

Example query: A chromosome 22 query to investigate the presence of SNPs in conserved noncoding regions near genes.

Rationale: If SNPs can contribute to changes in gene expression, then SNPs in conserved regions should be stronger candidates for affecting gene expression than those in nonconserved regions.

Methods: This query was formulated as a series of requests, each returning all entries within an individual field. After that information was collected, the history tool was used to find the intersection of some fields and to subtract the overlap between others.

Query:

Step 1
Start with the query form. Choose all genes by selecting all the choices under "Type".
Note: you could also select "Query all genes" at the top of the Genes section.
Under Range, select chr22. Type "genes" in the Query description box above the Submit Query button. Click Submit Query. (*Found 780 ranges)
Step 2
Click on the Query form button at the top of the page to begin a new query. Limit the alignments to gap-free segments at least 100 bp in length and 70% or more in identity by entering 100 for Minimum length and 70 for Minimum identity. Under Range, select chr22. Type "alignments" in the Query description box. Click Submit Query. (*Found 4763 ranges)
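The filter in Step 2 can be pictured with a short Python sketch. This is only an illustration of the two thresholds, not the database's own code: the Segment record, its field names, the 1-based inclusive coordinates, and the toy values are all assumptions made for the example.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    """One gap-free alignment segment (hypothetical record layout)."""
    chrom: str
    start: int       # assumed 1-based, inclusive
    end: int         # assumed 1-based, inclusive
    identity: float  # percent identity, 0-100


def filter_segments(segments, chrom="chr22", min_length=100, min_identity=70.0):
    """Keep segments on `chrom` that meet both the length and identity thresholds."""
    return [
        s for s in segments
        if s.chrom == chrom
        and (s.end - s.start + 1) >= min_length
        and s.identity >= min_identity
    ]


# Toy data, invented for illustration only.
segments = [
    Segment("chr22", 1000, 1099, 82.0),  # 100 bp, 82% identity: kept
    Segment("chr22", 2000, 2050, 95.0),  # 51 bp: too short
    Segment("chr22", 3000, 3200, 65.0),  # 201 bp, 65% identity: identity too low
    Segment("chr21", 4000, 4200, 90.0),  # wrong chromosome
]
kept = filter_segments(segments)
```

Only the first toy segment passes, because it is the only one on chr22 that satisfies both the Minimum length and Minimum identity values entered in the form.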
Step 3
Click on the Query form button at the top of the page to begin a new query. Under the SNPs section, choose "Query all SNPs". Type "SNPs" in the Query description box. Under Range, select chr22. Click Submit Query. (*Found 27035 ranges)
Step 4
Use the history page to limit alignments by proximity to genes: request alignments within 2000 bp of genes. Click on the "History page" button at the top of the page. In the first text box under Proximity, enter the query number shown in the top section next to the alignments query. Click the "lie within" button and enter 2000 in the text box following the button. Enter the query number for the genes query in the text box following "an region in subquery number". Type "alignments near genes" in the description of compound query box. Click "Submit compound query". (*Found 3854 ranges)
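As a rough picture of what the proximity request in Step 4 asks for, here is a minimal Python sketch. It assumes ranges are (start, end) pairs with 1-based inclusive coordinates on a single chromosome, and that "lie within 2000" means the gap between an alignment and some gene is at most 2000 bp; how the server actually measures distance is not stated here, so treat both as assumptions. The function names and toy coordinates are hypothetical.

```python
def distance(a, b):
    """Gap in bp between two (start, end) ranges; 0 if they overlap."""
    return max(0, b[0] - a[1], a[0] - b[1])


def within_proximity(targets, references, max_dist=2000):
    """Keep each target range that lies within max_dist bp of any reference range."""
    return [
        t for t in targets
        if any(distance(t, r) <= max_dist for r in references)
    ]


# Toy data, invented for illustration only.
genes = [(10000, 20000)]
alignments = [
    (8500, 8600),    # 1400 bp upstream of the gene: kept
    (7000, 7100),    # 2900 bp upstream: dropped
    (15000, 15100),  # inside the gene (distance 0): kept at this step
    (21900, 22100),  # 1900 bp downstream: kept
    (23000, 23100),  # 3000 bp downstream: dropped
]
near_genes = within_proximity(alignments, genes, max_dist=2000)
```

Under these assumptions, alignments that fall inside a gene also pass the proximity test, which is why a separate subtraction is needed to remove them afterwards.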
Step 5
Use the history page to remove, by subtraction, alignments that fall within genes. Click on the History page button. Next, choose the "alignments near genes" and "genes" queries by clicking on their check boxes. Under Compound queries, select the radio button "SUBTRACTION later - earlier query". Type "alignments near but not in genes" in the description of compound query box. Click "Submit compound query". (*Found 141 ranges)
Step 6
Use the history page to find SNPs that overlap the remaining alignments. Click on the History page button. Next, choose the "SNPs" and "alignments near but not in genes" queries. Click on the INTERSECTION radio button. Select "Genome Browser user track" under Format of output. Type "example query" in the description of compound query box. Click "Submit compound query". (*Found 11 ranges)
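The subtraction and intersection in Steps 5 and 6 can be sketched in Python as follows. This is an illustration under stated assumptions, not the server's implementation: ranges are (start, end) pairs with 1-based inclusive coordinates on one chromosome, subtraction is assumed to drop any range that overlaps a range in the other set (rather than clipping it), and intersection is assumed to return the SNP ranges that overlap a remaining alignment. The function names and toy coordinates are hypothetical.

```python
def overlaps(a, b):
    """True if two inclusive (start, end) ranges share at least one base."""
    return a[0] <= b[1] and b[0] <= a[1]


def subtract(ranges, to_remove):
    """Drop every range that overlaps any range in `to_remove` (assumed semantics)."""
    return [r for r in ranges if not any(overlaps(r, x) for x in to_remove)]


def intersect(ranges, others):
    """Keep every range that overlaps at least one range in `others` (assumed semantics)."""
    return [r for r in ranges if any(overlaps(r, x) for x in others)]


# Toy data, invented for illustration only.
genes = [(10000, 20000)]
alignments_near_genes = [(8500, 8600), (15000, 15100), (21900, 22100)]
snps = [(8550, 8550), (15050, 15050), (30000, 30000)]

# Step 5: alignments near but not in genes.
near_not_in_genes = subtract(alignments_near_genes, genes)

# Step 6: SNPs that overlap the alignments that are left.
snp_hits = intersect(snps, near_not_in_genes)
```

In this toy case the SNP at 15050 sits in a conserved segment but is discarded because that segment lies inside the gene, and the SNP at 30000 overlaps no alignment at all, so a single SNP remains.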

RESULTS: To interpret the results, click the UCSC Golden Path button and zoom out to identify the closest gene identified by RefSeq or the Sanger Centre. The blastz alignment track and SNPs line up with the region covered by the query, which is highlighted in a custom track.

  1. falls near a Sanger prediction (orientation ambiguous)
  2. falls within an intron of a homologue of glucuronidase, beta
  3. downstream of a Sanger gene
  4. upstream of a RefSeq hypothetical protein
  5. upstream of CST, a sulfotransferase overexpressed in some renal carcinoma cell lines
  6. upstream of a gene predicted by Sanger
  7. a Sanger gene prediction
  8. upstream of TEF, a bZIP transcription factor that transactivates the TSHB promoter and whose expression corresponds temporally and spatially to the onset of TSHB gene expression
  9. upstream of a Sanger prediction
  10. downstream of SEP3, a GTPase upregulated in neuronal differentiation
  11. downstream of a Sanger prediction
*NOTE: The numbers in each category may change as we continue to update our information tables.