Study Guide for Bioinformatics I
Define
computational problem,
instance of a computational problem, and
algorithm for a computational problem.
Informally, what is a greedy method, and what is a
branch-and-bound method?
Consider the following informal problem: find a motif that appears in each
member of a given set of DNA sequences.
Give two distinct formalizations of the problem in terms of optimizing a
specified score over a set of specified objects.
Compare the computational efficiencies of a brute-force optimization for the
two formalizations.
Sketch a greedy method for the same problem.
Informally, what is meant by an NP-complete problem?
What is the problem of sorting by reversals?
Give a simple greedy algorithm for sorting by reversals and show that
it has no (instance-independent) performance guarantee.
(Advanced.) Sketch a method for sorting by reversals that has a (constant)
performance guarantee, and verify that guarantee.
What is an alignment between two sequences?
What is the difference between a global and a local alignment?
What is the general principle behind dynamic programming algorithms?
Explain a dynamic progamming algororithm for aligning two DNA sequences.
Be able to apply the dynamic programming algorithm to two sequences (say, of
lengths around 4), given particular scores for all possible alignment columns.
Be able to sketch how blast works. Under what conditions will
psi-blast outperform blastp? Explain
Explain how NP-completeness results and graph algorithms lead to a
formalization of the Sequencing by Hybridization Problem (i.e., reconstructing
a sequence from the set of all its k-mers for some k)
that can be solved efficiently.
What is a Hidden Markov Model, and what is the
decoding problem for HMMs. Be able to give precise definitions.
Discuss two examples of bioinformatics problems where HMMs provide a good
approach.