Date of Thesis

Spring 2018

Thesis Type

Honors Thesis (Bucknell Access Only)

Degree Type

Bachelor of Science

Major

Mathematics

Second Major

Computer Science

Minor, Emphasis, or Concentration

Women's and Gender Studies

First Advisor

Abby Flynt

Second Advisor

Brian King

Keywords

Dynamic Programming, Dynamic Time Warping, Sequence Comparison, Algorithms, Clustering, Classification

Abstract

Sequence comparison, the process of determining the similarity between two sequences, is an important topic in mathematics, statistics, and computer science, and has applications in an even wider range of fields, including biology and medicine. Traditionally, sequence comparison is applied to the discovery of similar genomes; however, other sequence data involve events in time. Most sequence comparison calculations do not account for the time between successive events. A method is proposed that extends the Smith-Waterman and Needleman-Wunsch algorithms, two quintessential sequence comparison algorithms, to take advantage of the measures of time between elements organized in a finite sequence. This allows for a better measure of similarity between two sequences in time. A simulation study shows that the extended algorithms can effectively distinguish between sequences with different variations in their time gaps. A new time parameter is introduced to tune the importance of time in the calculation relative to the importance of the sequence variation. The algorithms are then applied to eye-tracking data of autism patients, and the inclusion of time is shown to improve the algorithms' ability to distinguish between autistic and non-autistic patients. Ultimately, the thesis concludes that the approach effectively incorporates variations in time into the calculation of the Smith-Waterman and Needleman-Wunsch algorithms.
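
The abstract does not state how time enters the calculation, so the following is only an illustrative sketch of the general idea, not the thesis's method. It assumes a standard Needleman-Wunsch global alignment score and subtracts a hypothetical penalty, `time_weight * |gap_a - gap_b|`, whenever two events are aligned. The function name `nw_time_score`, the `time_weight` parameter, the scoring constants, and the form of the penalty are all assumptions made for illustration.

```python
def nw_time_score(a, b, gaps_a, gaps_b,
                  match=1.0, mismatch=-1.0, gap=-1.0, time_weight=0.0):
    """Needleman-Wunsch global alignment score with a hypothetical time term.

    a, b           : sequences of symbols (e.g. strings)
    gaps_a, gaps_b : gaps_a[i] is the time elapsed before event a[i]
    time_weight    : assumed tuning parameter; 0 gives the standard score
    """
    n, m = len(a), len(b)
    # score[i][j] = best score aligning a[:i] with b[:j]
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            # Assumed penalty: aligned events with unequal time gaps cost more.
            time_penalty = time_weight * abs(gaps_a[i - 1] - gaps_b[j - 1])
            score[i][j] = max(
                score[i - 1][j - 1] + substitution - time_penalty,  # align
                score[i - 1][j] + gap,                              # skip a[i-1]
                score[i][j - 1] + gap,                              # skip b[j-1]
            )
    return score[n][m]


# Identical symbols, but the second sequence has a longer wait before "B".
same_symbols = nw_time_score("ABC", "ABC", [1, 1, 1], [1, 3, 1])
with_time = nw_time_score("ABC", "ABC", [1, 1, 1], [1, 3, 1], time_weight=0.5)
```

With `time_weight=0` the two sequences are indistinguishable, while a positive weight lowers the score for the pair whose time gaps differ. A local (Smith-Waterman) variant under the same assumptions would additionally floor each cell at zero and return the maximum cell rather than the final one.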
