Part 3 Questions: Single cell genome assembly
Questions:
Q3.1: How many reads were discarded during pre-processing? Do the numbers differ between the MiSeq and HiSeq datasets? You can use the group Google spreadsheet to see the results for all the datasets.
Q3.2: Did you notice any differences between the HiSeq and MiSeq data, or between the different assemblers?
Q3.3: Did you notice the impact of trimming? Is it the same for all assemblers?
Q3.4: What do you think is the best way to assess the ‘quality’ of an assembly? (e.g. total size, N50, number of predicted ORFs, completeness)
Q3.5: What do you think is the best way to assemble this particular SC dataset? Why?
Q3.6: What is the identity of the organism based on the analyses you have performed? What phylum does it belong to, and are there any closely related organisms in the databases?
Q3.7: Try to find out in what type of environment you might find similar organisms.
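
As a reminder for Q3.4, two of the metrics listed there (total size and N50) can be computed directly from a list of contig lengths. The sketch below is only an illustration: the function name `assembly_stats` and the example lengths are hypothetical and do not come from the course datasets. It uses the common definition of N50 as the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly size.

```python
def assembly_stats(contig_lengths):
    """Return (total_size, n50) for a list of contig lengths.

    N50 here is the length of the contig at which the running sum of
    lengths, taken from longest to shortest, first reaches at least
    half of the total assembly size.
    """
    lengths = sorted(contig_lengths, reverse=True)
    total = sum(lengths)
    running = 0
    n50 = 0
    for length in lengths:
        running += length
        if running * 2 >= total:
            n50 = length
            break
    return total, n50


# Hypothetical contig lengths, for illustration only
example_lengths = [100, 80, 50, 30, 20, 10]
total_size, n50 = assembly_stats(example_lengths)
print(total_size, n50)
```

Note that neither number says anything about correctness or completeness, which is why Q3.4 also lists predicted ORFs and completeness as candidates.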