Part of Exam for Bioinformatics I

Answer precisely 4 of the following (10 pts. each)
  • Define computational problem, instance of a computational problem, and algorithm for a computational problem. Informally, what is meant by an NP-complete problem? Informally, what is a greedy method, what is a branch-and-bound method, and what is a dynamic programming method?
  • Consider the following informal problem: find a motif that appears in each member of a given set of DNA sequences. Give two distinct formalizations of the problem in terms of optimizing a specified score over a set of specified objects. Compare the computational efficiencies of a brute-force optimization for the two formalizations. Sketch a greedy method for the same problem.
  • What is an alignment between two sequences? Apply (by hand) the dyamic programming algorithm for determing an optimal sequence alignment to the sequences ACCT and AGCTT with the scores: match = 1, mismatch = -1, gap = -1.5.
  • What is the problem of sorting by reversals? Give a simple greedy algorithm for sorting by reversals and show that it has no instance-independent performance guarantee. Outline a more complex algorithm that has an instance-independent performance guarantee.
  • Sketch how blast works. Under what conditions will psi-blast outperform blastp? Explain.
  • Explain how NP-completeness results and graph algorithms lead to a formalization of the Sequencing by Hybridization Problem (i.e., reconstructing a sequence from the set of all its k-mers for some k) that can be solved efficiently.
  • What is a Hidden Markov Model, and what is the decoding problem for HMMs? Discuss two examples of bioinformatics problems where HMMs provide a good approach.