Preparing the dataset for the next tools

Now we will reformat the dataset and filter the SNPs in preparation for running the PCA and Ancestry tools. The Prepare Input tool in the Genome Diversity section converts gd_snp datasets into the format used by the tools in the Population Structure subsection. The gd_snp dataset should already be selected. This time we will filter the SNPs by requiring only five reads covering each SNP. The PCA and Ancestry tools need each SNP to be present in all of the individuals, rather than in pairs of individuals as the tree does. The more lenient filter lets us start with more SNPs while still keeping the more reliable ones. Leave the remaining options at their defaults and click the Execute button.

[screen shot]
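
To see why a more lenient read threshold helps when every individual must pass, here is a minimal Python sketch of the two filtering rules. It is only an illustration and not how the Prepare Input tool is implemented: the `coverage` table, the individual names, the function names, and the stricter threshold of 8 are all hypothetical, and the table does not follow the real gd_snp column layout.

```python
# Toy read-coverage table: SNP id -> number of reads covering that SNP
# in each individual. Hypothetical data, not the real gd_snp layout.
coverage = {
    "snp1": {"ind_A": 12, "ind_B": 9, "ind_C": 8},
    "snp2": {"ind_A": 6, "ind_B": 5, "ind_C": 2},
    "snp3": {"ind_A": 5, "ind_B": 5, "ind_C": 5},
    "snp4": {"ind_A": 20, "ind_B": 1, "ind_C": 15},
}


def snps_covered_in_all(coverage, min_reads):
    """Keep SNPs with at least min_reads reads in every individual."""
    return [
        snp
        for snp, reads in coverage.items()
        if all(count >= min_reads for count in reads.values())
    ]


def snps_covered_in_pair(coverage, first, second, min_reads):
    """Keep SNPs with at least min_reads reads in both individuals of a pair."""
    return [
        snp
        for snp, reads in coverage.items()
        if reads[first] >= min_reads and reads[second] >= min_reads
    ]


# All-individuals rule with the five-read threshold used in this step.
all_at_5 = snps_covered_in_all(coverage, 5)

# The same rule with a hypothetical stricter threshold keeps fewer SNPs.
all_at_8 = snps_covered_in_all(coverage, 8)

# Pairwise rule: only the two individuals being compared must pass.
pair_ab_at_5 = snps_covered_in_pair(coverage, "ind_A", "ind_B", 5)

print("all individuals, >= 5 reads:", all_at_5)
print("all individuals, >= 8 reads:", all_at_8)
print("pair A/B, >= 5 reads:", pair_ab_at_5)
```

In this toy table, snp2 passes for the pair ind_A and ind_B but is dropped once ind_C must also be covered, which is the kind of loss the lower threshold is meant to offset.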