Biotechnology & Biotechnological Equipment 26 (1) Special edition, 209 - 217 (2012)
http://dx.doi.org/10.5504/50YRTIMB.2011.0038
Correspondence to: R.A. Dimitrov E-mail: roumen.dimitrov@gmail.com
Recurring difficulties associated with diverged sequence data include alternative alignment possibilities for insertions and deletions, regions of length variation in which homology assessment is questionable or impossible, and localized excessive mutation to the point of saturation and loss of phylogenetic signal. Therefore, for diverged sequences, optimizing similarity will not necessarily improve assessments of structure, function and evolutionary history. Our aim here is to present an overview of the sequence analysis methods that are critical for current theoretical and applied development; we do not follow the historical order in which they appeared. For sequence comparison we focus on methods based on exhaustive schemes, which are classically formulated as dynamic programming algorithms. These consist either of optimization schemes, which find the best alignment for a given model, or of probabilistic schemes based on partition functions, in which all alignments are evaluated with their respective weights.
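To make the distinction between the two exhaustive schemes concrete, the following minimal sketch fills one dynamic programming table with the optimal global alignment score and a second table with the partition function over all global alignments. It is an illustration under stated assumptions, not the formulation used in any particular method reviewed here: the function name `align_dp`, the linear gap penalty, the score values and the `temperature` parameter are all hypothetical choices, and each alignment is weighted by exp(score / temperature).

```python
import math


def align_dp(x, y, match=1.0, mismatch=-1.0, gap=-2.0, temperature=1.0):
    """Return (best_score, Z) for global alignment of sequences x and y.

    best_score: maximum alignment score (optimization scheme).
    Z: sum of exp(score / temperature) over all alignments
       (partition function scheme).
    Assumes a linear gap penalty; alignments that differ only in the
    order of adjacent gaps in x and in y are counted as distinct.
    """
    n, m = len(x), len(y)
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    z = [[0.0] * (m + 1) for _ in range(n + 1)]
    z[0][0] = 1.0  # the empty alignment has weight 1
    w_gap = math.exp(gap / temperature)

    # Borders: a prefix aligned entirely against gaps.
    for i in range(1, n + 1):
        best[i][0] = i * gap
        z[i][0] = z[i - 1][0] * w_gap
    for j in range(1, m + 1):
        best[0][j] = j * gap
        z[0][j] = z[0][j - 1] * w_gap

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if x[i - 1] == y[j - 1] else mismatch
            # Optimization: keep only the best of the three moves.
            best[i][j] = max(
                best[i - 1][j - 1] + s,  # align x[i-1] with y[j-1]
                best[i - 1][j] + gap,    # x[i-1] against a gap
                best[i][j - 1] + gap,    # y[j-1] against a gap
            )
            # Partition function: sum the weights of all three moves.
            z[i][j] = (
                z[i - 1][j - 1] * math.exp(s / temperature)
                + (z[i - 1][j] + z[i][j - 1]) * w_gap
            )
    return best[n][m], z[n][m]


if __name__ == "__main__":
    score, Z = align_dp("GATTACA", "GCATGCU")
    # Boltzmann-style probability of the optimal score's weight within Z.
    print(score, Z, math.exp(score) / Z)
```

The two recursions differ only in replacing `max` and addition of scores with a sum and multiplication of weights. As a sanity check, setting all scores to zero makes every alignment weigh 1, so Z simply counts alignments: `align_dp("AC", "AC", 0.0, 0.0, 0.0)` gives a partition function of 13, the number of global alignments of two sequences of length 2 under this model.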