Our material on phylogenetics in bioinformatics was roughly divided into five "units". The topics you should especially focus on are listed under each unit, followed by one or a few goals or questions (marked with *).

1. concepts of trees and inferences based on trees

- trees as hypotheses of evolutionary history and shared ancestry

- homoplasy: convergence, parallelism, reversal

- gene trees I: orthology, paralogy

- monophyly

- inference of ancestral states using ACCTRAN (accelerated transformation)

* be able to "read" a phylogenetic tree, and draw correct inferences about the monophyly of groups of organisms or sequences

* given a tree and a set of data for a given character, be able to infer the ancestral states of the character using ACCTRAN
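As a study aid for the ACCTRAN goal above, here is a minimal Python sketch for a single unordered character on a rooted, fully bifurcating tree. It assumes the tree is written as nested pairs of tip names. It first runs a Fitch parsimony down-pass to get preliminary state sets, then works from the root outward, keeping the parent's state whenever the preliminary set allows it, which pushes changes toward the root. The function names, the example tree and states, and the tie-break (smallest state when the choice is arbitrary) are illustrative assumptions, not the notation used in class.

```python
def downpass(node, tip_states, prelim):
    """Fitch post-order pass: fill preliminary state sets, return step count."""
    if isinstance(node, str):                      # a tip
        prelim[node] = {tip_states[node]}
        return 0
    left, right = node
    steps = downpass(left, tip_states, prelim) + downpass(right, tip_states, prelim)
    shared = prelim[left] & prelim[right]
    if shared:
        prelim[node] = shared                      # children agree: no extra step
    else:
        prelim[node] = prelim[left] | prelim[right]
        steps += 1                                 # children disagree: one step
    return steps


def uppass(node, parent_state, prelim, final):
    """Pre-order pass: keep the parent's state if the preliminary set allows it."""
    if parent_state in prelim[node]:
        final[node] = parent_state
    else:
        final[node] = min(prelim[node])            # assumed tie-break: smallest state
    if not isinstance(node, str):
        for child in node:
            uppass(child, final[node], prelim, final)


def acctran(tree, tip_states):
    """Return (parsimony steps, dict of assigned state per node)."""
    prelim, final = {}, {}
    steps = downpass(tree, tip_states, prelim)
    uppass(tree, None, prelim, final)
    return steps, final


# Made-up example: a five-taxon "ladder" tree with a binary character.
tree = (((("E", "D"), "C"), "B"), "A")
tips = {"A": 0, "B": 0, "C": 1, "D": 0, "E": 1}
steps, states = acctran(tree, tips)
print(steps, states[(("E", "D"), "C")], states[("E", "D")])
```

On this example the character costs two steps. ACCTRAN places a gain on the branch leading to the (E, D, C) group and a reversal on the branch to D, rather than two independent gains in C and E.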

2. methods of building phylogenetic trees

- parsimony, distance, likelihood compared and contrasted

- the basic approaches, similarities and differences

- standard (nonparametric) bootstrap in phylogenies: use and interpretation

- strengths and weaknesses of each of the major methods

- PHYLIP as an intro to computer programs for phylogeny

* be able to perform and interpret a small parsimony analysis by hand, as we did in class, or an analysis using any of the main approaches, including the bootstrap, with PHYLIP
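To make the nonparametric bootstrap concrete, the sketch below shows its two bookkeeping steps in Python: resampling alignment columns with replacement to make pseudo-replicate datasets, and reporting support for a group as the fraction of replicate trees that contain it. In PHYLIP these steps are handled by seqboot and consense; the code here is only an illustration. The toy alignment, the function names, and the hand-written list of replicate clades are all made up for the example, and the tree-building step between resampling and counting is left out.

```python
import random


def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement to build one pseudo-replicate."""
    n_sites = len(next(iter(alignment.values())))
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in picks) for name, seq in alignment.items()}


def clade_support(replicate_clades, clade):
    """Fraction of replicate trees that contain the given group of taxa."""
    clade = frozenset(clade)
    hits = sum(1 for clades in replicate_clades if clade in clades)
    return hits / len(replicate_clades)


# Made-up toy alignment: four taxa, ten sites.
alignment = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTTC",
    "C": "ACGAACGTTC",
    "D": "TCGAACGTTG",
}
rng = random.Random(1)                      # fixed seed so the run is repeatable
replicates = [bootstrap_replicate(alignment, rng) for _ in range(100)]

# Made-up clade sets standing in for trees built from four replicates.
example_clades = [
    {frozenset({"A", "B"})},
    {frozenset({"A", "B"})},
    {frozenset({"A", "C"})},
    {frozenset({"A", "B"})},
]
print(clade_support(example_clades, {"A", "B"}))
```

Each replicate has the same taxa and the same number of sites as the original alignment; only the mix of columns changes. A support value describes how consistently the data recover a group under resampling, not the probability that the group is true.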

3. distance models of sequence evolution

* contrast the different distance models of nucleotide (or protein) sequence evolution. What are some advantages and disadvantages of each?
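As one way to contrast distance models, the sketch below computes three distances for the same pair of DNA sequences: the uncorrected proportion of differences p, the Jukes-Cantor distance d = -(3/4) ln(1 - 4p/3), and the Kimura two-parameter distance d = -(1/2) ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transitions and transversions. It assumes aligned sequences of equal length containing only A, C, G and T (no gaps or ambiguity codes); the function names and the two example sequences are made up for illustration.

```python
import math

PURINES = {"A", "G"}


def ts_tv_proportions(seq1, seq2):
    """Return (P, Q): proportions of transition and transversion differences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for a, b in zip(seq1, seq2):
        if a == b:
            continue
        if (a in PURINES) == (b in PURINES):   # purine<->purine or pyrimidine<->pyrimidine
            transitions += 1
        else:
            transversions += 1
    n = len(seq1)
    return transitions / n, transversions / n


def p_distance(seq1, seq2):
    """Uncorrected proportion of differing sites."""
    P, Q = ts_tv_proportions(seq1, seq2)
    return P + Q


def jc69_distance(seq1, seq2):
    """Jukes-Cantor: one rate for every kind of substitution."""
    p = p_distance(seq1, seq2)
    return -0.75 * math.log(1 - 4 * p / 3)


def k2p_distance(seq1, seq2):
    """Kimura two-parameter: separate rates for transitions and transversions."""
    P, Q = ts_tv_proportions(seq1, seq2)
    return -0.5 * math.log((1 - 2 * P - Q) * math.sqrt(1 - 2 * Q))


# Made-up pair: 20 sites, one transition (A->G) and one transversion (C->A).
s1 = "ACGTACGTACGTACGTACGT"
s2 = "GAGTACGTACGTACGTACGT"
print(p_distance(s1, s2), jc69_distance(s1, s2), k2p_distance(s1, s2))
```

Both corrected distances are larger than p because they estimate multiple substitutions at the same site that the raw count cannot see. The corrections grow quickly as sequences diverge and become undefined once the argument of the logarithm reaches zero, which is one practical disadvantage to keep in mind.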

4. maximum likelihood as a general tool for hypothesis testing

* what is the likelihood ratio test, and how is it used to test a wide variety of hypotheses about sequence evolution, such as rates of evolution, monophyly of groups or sequences, similarity of the branching history of two trees, etc.?

* be able to outline or diagram the goals and basic steps of a parametric bootstrap analysis and its use in hypothesis testing in sequence studies.
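The two goals above can be tied together with a deliberately small Python sketch. It uses a toy problem chosen for brevity, not the example from class: for one pair of sequences, test whether a model with separate transition and transversion rates (Kimura two-parameter, the alternative) fits better than a single-rate model (Jukes-Cantor, the null). The likelihood ratio statistic is 2(lnL_alt - lnL_null). Its p-value is obtained twice: from the chi-square approximation with one degree of freedom, and from a parametric bootstrap that simulates datasets under the fitted null model and recomputes the statistic for each. The site counts and function names are made up, and the sketch assumes low divergence so that the two-parameter fit equals the observed proportions.

```python
import math
import random


def log_lik(counts, probs):
    """Multinomial log-likelihood (the constant term cancels in the ratio)."""
    return sum(c * math.log(p) for c, p in zip(counts, probs) if c > 0)


def lrt_statistic(counts):
    """counts = (identical, transition, transversion) sites for one sequence pair.

    Returns the statistic 2*(lnL_alt - lnL_null) and the fitted null probabilities.
    """
    n = sum(counts)
    p = (counts[1] + counts[2]) / n               # single-rate fit: overall difference
    null_probs = (1 - p, p / 3, 2 * p / 3)        # 1 transition vs 2 transversion partners
    alt_probs = tuple(c / n for c in counts)      # two-rate fit (assumed attainable)
    stat = 2 * (log_lik(counts, alt_probs) - log_lik(counts, null_probs))
    return max(stat, 0.0), null_probs             # guard against rounding below zero


def chi2_pvalue_df1(stat):
    """Upper-tail chi-square probability for one degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))


def parametric_bootstrap_pvalue(counts, n_reps, rng):
    """Simulate under the fitted null, refit, and compare with the observed statistic."""
    observed, null_probs = lrt_statistic(counts)
    n = sum(counts)
    as_extreme = 0
    for _ in range(n_reps):
        draws = rng.choices((0, 1, 2), weights=null_probs, k=n)
        simulated = tuple(draws.count(k) for k in (0, 1, 2))
        sim_stat, _ = lrt_statistic(simulated)
        if sim_stat >= observed:
            as_extreme += 1
    return (as_extreme + 1) / (n_reps + 1)        # common small-sample convention


# Made-up data: 200 sites, 20 transitions, 10 transversions.
counts = (170, 20, 10)
stat, _ = lrt_statistic(counts)
p_chi2 = chi2_pvalue_df1(stat)
p_boot = parametric_bootstrap_pvalue(counts, 1000, random.Random(0))
print(stat, p_chi2, p_boot)
```

The chi-square shortcut applies when the null model is a special case of the alternative, as it is here. The parametric bootstrap follows the same logic without relying on that approximation, which is why it is used for comparisons such as alternative tree topologies, where the hypotheses are not nested. In a real analysis the simulation step would generate whole sequence alignments on the null tree rather than site-class counts.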

5. further concepts and their application

- gene families II, reconciled gene trees

- long-branch attraction: conditions, causes

* The phylogenetic analyses of invertebrate animals presented in class were a good example of a dataset for which different methods gave different results, but exploring those differences led to a better understanding of the history of the sequences. What were some "take home lessons" from this example?