ssSNPTarget: Genome-Wide Splice-Site Single Nucleotide Polymorphism Database


This figure is a flowchart of the ssSNPTarget database construction. First, we obtained gene and transcript data from Ensembl and UCSC (Human: hg18, Mouse: mm9). Combining the splice site information taken from these data with dbSNP (Human: build 129, Mouse: build 128), we obtained ssSNP candidates. To obtain more reliable ssSNPs, the candidates were filtered by six conditions: GT-AG splice sites only, intron length >= 100 bp, EST alignment coverage >= 4, trueSNPs only, single-base SNPs only, and removal of pseudogenes. The filtered ssSNPs were stored in the ssSNPdb together with exon junction strength, Pfam domain, mRNA, EST, HGNC, MGI, and disease information. The disease information was obtained from two popular resources: GAD and OMIM. To show ssSNP conservation, we extracted the sequences flanking the splice sites from various species. ssSNPs can greatly change the amino acid and domain composition of proteins by causing mRNA truncation. Hence, this database reports ssSNP effects as exon extension, exon skipping, alteration of splice site strength, and domain breaking.
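The six filtering conditions can be sketched as a single predicate. This is a minimal illustration only, not the pipeline actually used to build ssSNPTarget: the `Candidate` record and all of its field names are hypothetical, and the "trueSNP" status is assumed to be available as a precomputed flag.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Candidate:
    """Hypothetical record for one ssSNP candidate (field names are assumptions)."""
    snp_id: str
    donor: str                 # first two intron bases (5' splice site)
    acceptor: str              # last two intron bases (3' splice site)
    intron_length: int         # intron length in bp
    est_coverage: int          # number of ESTs aligned over the junction
    is_true_snp: bool          # assumed precomputed "trueSNP" flag
    alleles: Tuple[str, ...]   # observed alleles, e.g. ("A", "G")
    is_pseudogene: bool


def passes_filters(c: Candidate) -> bool:
    """Return True only if the candidate satisfies all six conditions."""
    return (
        c.donor.upper() == "GT" and c.acceptor.upper() == "AG"  # GT-AG splice site only
        and c.intron_length >= 100                              # intron >= 100 bp
        and c.est_coverage >= 4                                 # >= 4 aligned ESTs
        and c.is_true_snp                                       # trueSNPs only
        and all(len(a) == 1 for a in c.alleles)                 # single-base SNPs only
        and not c.is_pseudogene                                 # pseudogenes removed
    )


def filter_candidates(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that pass every filter."""
    return [c for c in candidates if passes_filters(c)]
```

Because the conditions are combined with a logical AND, a candidate that fails any single condition is discarded.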

1) Search interfaces

This web site provides a search interface with two categories: keyword and chromosome. Keyword searches return results directly for an SNP ID (rs number), Ensembl transcript ID, UCSC transcript ID, gene symbol, or chromosomal position. The “Chromosomal search” category displays all ssSNP entries on the queried chromosome. Both search categories support Mouse as well as Human.
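As an illustration of how the five keyword types could be told apart, the sketch below classifies a query string with regular expressions. It is not the site's actual implementation. The function name `classify_query` and the labels it returns are hypothetical, and the accepted chromosomal position format (`chr1:1000-2000`) is an assumption, since this page does not specify one.

```python
import re

# Ordered (label, pattern) pairs; the first matching pattern wins.
_PATTERNS = [
    # dbSNP reference SNP ID, e.g. rs12345
    ("snp_id", re.compile(r"^rs\d+$", re.IGNORECASE)),
    # Ensembl transcript ID for Human (ENST...) or Mouse (ENSMUST...)
    ("ensembl_transcript", re.compile(r"^ENS(MUS)?T\d{11}$", re.IGNORECASE)),
    # UCSC known gene transcript ID, e.g. uc001aaa.2
    ("ucsc_transcript", re.compile(r"^uc\d{3}[a-z]{3}\.\d+$", re.IGNORECASE)),
    # Assumed position format: chr<name>:<start>[-<end>]
    ("chromosomal_position",
     re.compile(r"^chr(\d{1,2}|X|Y|M):\d+(-\d+)?$", re.IGNORECASE)),
]


def classify_query(query: str) -> str:
    """Return a label for the keyword type; fall back to gene symbol."""
    text = query.strip()
    for label, pattern in _PATTERNS:
        if pattern.match(text):
            return label
    # Anything that matches no ID pattern is treated as a gene symbol.
    return "gene_symbol"
```

For example, `classify_query("ENST00000369522")` falls into the Ensembl transcript branch, while a string that matches none of the ID patterns is treated as a gene symbol.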

2) Results by "Keyword search"

The figure below shows an example for the Ensembl ID “ENST00000369522”. All entries located within the genomic position of the query are displayed. The entry matching your query is highlighted.

3) Results by "Chromosomal search"

The figure below shows an example for chromosome 1 in Human. All entries on the chromosome are displayed. Each displayed entry shows a summary with SNP ID, transcript ID, exon number, splice site, genotype, splice site position, splice site variation, domain, and disease.

4) Search results

The information below is output by both “Keyword search” and “Chromosomal search”.
1. “SNP information” - Annotation information obtained from dbSNP.
2. “Gene information” - Gene annotation information and disease mapping information from GAD and OMIM.
3. “ssSNP sequence” - Nucleotide sequence flanking the ssSNP, with sequence conservation. The sequences showing conservation status are taken from the exon side only, because of intron instability.
4. “ssSNP effect”
   1) Alteration of splice site strength by the ssSNP.
   2) Breaking of a Pfam domain by the ssSNP.
   3) Evidence for exon extension and exon skipping events caused by the ssSNP.

5) Domain information

The domains below may be broken by an ssSNP. These domains were predicted using HMMER.

6) Disease information

To provide disease information, we integrated two popular databases, GAD and OMIM.

7) Detailed information on exon extension

To provide clear evidence of exon extension, ssSNPTarget shows nucleotide-level sequence alignments generated with ClustalW.

8) Browsing using GBrowse

The ssSNPTarget browser includes various annotation tracks: assembly contigs, Ensembl genes, UCSC genes, splice sites, exon junction strength values, Pfam domains, diseases, and conservation scores.

Copyright © 2009 by Korean Bioinformation Center. All rights reserved.