Part 3: Single cell genome assembly


Info: If you get disconnected from Uppmax, go back to Part 1 to see how to log in to your compute node again.

3.1 Organizing working folder

Type the following commands on your compute node (for example mXX; look it up with the jobinfo -u username command). Make sure you are working on the compute node and not on the login node. Go back to Part 1 to check how to log in to your compute node.
Before starting the exercises, make a folder named single_cell_exercises in your home directory, where the exercises will be run. Then create two folders, dataset1 and dataset2, where the raw data will be linked.

mkdir ~/single_cell_exercises
cd ~/single_cell_exercises/
mkdir dataset1 dataset2

Next, create symbolic links to the sequence files in those folders:

ln -s /proj/g2015028/nobackup/single_cell_exercises/sequences/dataset1/* dataset1/
ln -s /proj/g2015028/nobackup/single_cell_exercises/sequences/dataset2/* dataset2/

Please do not edit those files. They are symbolic links, so any change would modify the shared original data.

Check that the data is present in both dataset folders:

ls dataset1
ls dataset2

You should now see two files per dataset: a forward-read fastq file ending in _R1_001.fastq and its reverse-read counterpart ending in _R2_001.fastq.
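
Because the files are symbolic links, ls lists them even when a link points to a target that does not exist. A minimal sketch of a stricter check follows; the helper name check_links is hypothetical and not part of the course material.

```shell
# check_links DIR: report every entry in DIR whose target does not exist.
# check_links is an illustrative helper, not a course or Uppmax command.
check_links() {
    dir=$1
    broken=0
    for f in "$dir"/*; do
        # -e follows symbolic links, so it fails for a dangling link.
        # An empty folder is also reported, because the unexpanded
        # pattern "$dir/*" does not exist as a file.
        if [ ! -e "$f" ]; then
            echo "missing or broken: $f"
            broken=$((broken + 1))
        fi
    done
    echo "$dir: $broken broken"
    # Return success only when nothing was broken
    [ "$broken" -eq 0 ]
}
```

For example, check_links ~/single_cell_exercises/dataset1 should end with a line reporting 0 broken if the links were created correctly.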

Some later commands use the variables sample and trim; the commands below set them. If you lose your connection, you will need to repeat this step. Use the one block that matches the data you are assembling.

If you assemble HiSeq data without trimming:

sample=Hiseq
trim=''
cd ~/single_cell_exercises/dataset1

If you assemble HiSeq data with trimming:

sample=Hiseq
trim=_Trimmomatic
cd ~/single_cell_exercises/dataset1

If you assemble MiSeq data without trimming:

sample=Miseq
trim=''
cd ~/single_cell_exercises/dataset2

If you assemble MiSeq data with trimming:

sample=Miseq
trim=_Trimmomatic
cd ~/single_cell_exercises/dataset2
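
Since the variables are lost whenever the connection drops, one option is to write them to a small file once and source that file after reconnecting. This is only a sketch: the helper save_settings and the file name settings.sh are hypothetical and not part of the course material.

```shell
# save_settings FILE SAMPLE TRIM WORKDIR
# Write the sample and trim assignments plus the cd command to FILE,
# so they can be restored later with:  . FILE
# (save_settings is an illustrative helper, not a course command.)
save_settings() {
    {
        printf 'sample=%s\n' "$2"
        # Quote the value so an empty trim is written as trim=''
        printf "trim='%s'\n" "$3"
        printf "cd '%s'\n" "$4"
    } > "$1"
}
```

For example, save_settings ~/single_cell_exercises/settings.sh Hiseq _Trimmomatic ~/single_cell_exercises/dataset1 writes the file once. After reconnecting, typing . ~/single_cell_exercises/settings.sh sets both variables again and changes to the dataset folder.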