Sequences flanking Ds insertions are retrieved from the database and compared against reference sequence databases using BLASTN, run by automated processes written in Java. The BLAST searches are conducted with an Expect (E-value) threshold of 0.0001.


Two databases are used as references: the TIGR chromosome assemblies and NCBI.

Choice of Matches to Put in the Database

The top-scoring match from the TIGR chromosome assemblies is selected and the annotation from the closest annotated gene is retrieved from an SQL table of TIGR's annotations of their sequences.

The top-scoring experimentally verified match is taken from the NCBI results. This allows the data to stay current with empirical knowledge. In the absence of a top experimentally verified match, the top match exclusive of TIGR's depositions is selected.
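
The NCBI selection rule above can be sketched in Java roughly as follows. This is an illustration only, not the pipeline's actual code: the `NcbiHit` and `MatchSelector` names and all field names are hypothetical. It also assumes that the "experimentally verified" status and the "deposited by TIGR" status have already been reduced to boolean flags, and that hits with an Expect value above 0.0001 are discarded before ranking.

```java
import java.util.List;

/** Hypothetical holder for one BLASTN hit; all field names are assumptions. */
final class NcbiHit {
    final String accession;
    final double bitScore;
    final double eValue;
    final boolean experimentallyVerified; // assumed to be derived from the GenBank annotation
    final boolean tigrDeposition;         // assumed flag marking TIGR's own depositions

    NcbiHit(String accession, double bitScore, double eValue,
            boolean experimentallyVerified, boolean tigrDeposition) {
        this.accession = accession;
        this.bitScore = bitScore;
        this.eValue = eValue;
        this.experimentallyVerified = experimentallyVerified;
        this.tigrDeposition = tigrDeposition;
    }
}

final class MatchSelector {
    static final double EXPECT_THRESHOLD = 0.0001;

    /**
     * Returns the top-scoring experimentally verified hit if one exists,
     * otherwise the top-scoring hit that is not a TIGR deposition,
     * otherwise null.
     */
    static NcbiHit selectNcbiMatch(List<NcbiHit> hits) {
        NcbiHit bestVerified = null;
        NcbiHit bestNonTigr = null;
        for (NcbiHit h : hits) {
            if (h.eValue > EXPECT_THRESHOLD) {
                continue; // assumed: hits above the Expect threshold are ignored
            }
            if (h.experimentallyVerified
                    && (bestVerified == null || h.bitScore > bestVerified.bitScore)) {
                bestVerified = h;
            }
            if (!h.tigrDeposition
                    && (bestNonTigr == null || h.bitScore > bestNonTigr.bitScore)) {
                bestNonTigr = h;
            }
        }
        return bestVerified != null ? bestVerified : bestNonTigr;
    }
}
```

Tracking both candidates in one pass means the fallback to the best non-TIGR hit needs no second scan of the results.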

In both cases the closest feature is defined as the one whose 5' or 3' end is the least distance from the start of the match with the flanking sequence.
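
As a minimal sketch of this definition, the closest feature could be found by taking, for each feature, the smaller of the distances from its 5' end and its 3' end to the start of the match. The `FeatureEnds` and `ClosestFeatureFinder` classes and their field names are hypothetical, and a single linear coordinate system shared by the features and the match is assumed.

```java
import java.util.List;

/** Hypothetical feature record holding the coordinates of its two ends. */
final class FeatureEnds {
    final String name;
    final long end5; // coordinate of the 5' end
    final long end3; // coordinate of the 3' end

    FeatureEnds(String name, long end5, long end3) {
        this.name = name;
        this.end5 = end5;
        this.end3 = end3;
    }
}

final class ClosestFeatureFinder {
    /** Distance from the match start to the nearer end of the feature. */
    static long distance(FeatureEnds f, long matchStart) {
        return Math.min(Math.abs(f.end5 - matchStart), Math.abs(f.end3 - matchStart));
    }

    /** Returns the feature with the smallest distance, or null for an empty list. */
    static FeatureEnds closest(List<FeatureEnds> features, long matchStart) {
        FeatureEnds best = null;
        for (FeatureEnds f : features) {
            if (best == null || distance(f, matchStart) < distance(best, matchStart)) {
                best = f;
            }
        }
        return best;
    }
}
```

Because the distance uses whichever end is nearer, the calculation does not depend on the strand of the feature.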

Fields from the Annotations that are Put in the Database

The feat_name, locus, pub_locus, com_name, pub_comment, and coords are taken from the TIGR annotation of a feature. The GI, Accession, gene, product, note, and evidence are taken from the GenBank annotation.

Determination of Location and Orientation of an Insertion

If the start of the sequence match falls within the annotated coordinates of a feature, then the insertion is considered to be within that feature. Otherwise a determination is made, based on the strand of the annotated feature, as to whether the insertion is upstream or downstream of the feature.
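
The location rule can be illustrated with the short Java sketch below. The `InsertionLocator` class and its parameter names are hypothetical. The sketch assumes that the feature is described by its lower and higher coordinates plus a strand flag, and that the boundary coordinates themselves count as inside the feature.

```java
final class InsertionLocator {
    enum Location { WITHIN, UPSTREAM, DOWNSTREAM }

    /**
     * low and high are the smaller and larger coordinates of the feature,
     * forwardStrand is true when the feature is transcribed toward
     * increasing coordinates, and matchStart is the start of the match
     * with the flanking sequence.
     */
    static Location locate(long low, long high, boolean forwardStrand, long matchStart) {
        if (matchStart >= low && matchStart <= high) {
            return Location.WITHIN; // assumed: boundaries are inclusive
        }
        boolean beforeFeature = matchStart < low;
        // On the forward strand, lower coordinates lie upstream;
        // on the reverse strand, the relationship is flipped.
        return (beforeFeature == forwardStrand) ? Location.UPSTREAM : Location.DOWNSTREAM;
    }
}
```

For example, a match starting at 900 next to a feature spanning 1000 to 2000 is upstream if the feature is on the forward strand and downstream if it is on the reverse strand.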

By noting whether the PCR primers from the 5' end or 3' end of Ds were used to obtain the flanking sequence, it is possible to determine whether the GUS gene within the Ds element is oriented in the same direction as the annotated feature or in the opposite direction. This is particularly important for GT (gene-trap) type elements, since their expression depends on the transcription of the gene in which they reside. If oriented opposite to the direction of transcription, they are less likely to be expressed and to truly reflect the expression pattern of the gene.

Note that sequences generated from the Ds5 primer are oriented in the same direction as transcription of the GUS gene and sequences from the Ds3 primer run opposite.
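
The orientation logic can be sketched in Java as below. The `GusOrientation` class, the `Primer` enum, and the boolean strand flags are hypothetical names chosen for illustration. The sketch assumes that the BLAST result reports which strand of the reference the flanking sequence matched, and that the strand of the annotated feature is known.

```java
final class GusOrientation {
    enum Primer { DS5, DS3 }

    /**
     * Ds5-derived sequences run in the same direction as GUS transcription,
     * so GUS lies on the strand the flanking sequence matched;
     * Ds3-derived sequences run opposite, so GUS lies on the other strand.
     */
    static boolean gusOnForwardStrand(Primer primer, boolean matchOnForwardStrand) {
        return primer == Primer.DS5 ? matchOnForwardStrand : !matchOnForwardStrand;
    }

    /** True when GUS is oriented in the same direction as the annotated feature. */
    static boolean sameDirectionAsFeature(Primer primer,
                                          boolean matchOnForwardStrand,
                                          boolean featureOnForwardStrand) {
        return gusOnForwardStrand(primer, matchOnForwardStrand) == featureOnForwardStrand;
    }
}
```

Under these assumptions, a Ds3-derived sequence that matches the reverse strand places GUS on the forward strand, so it would be in the same direction as a forward-strand feature.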