
Getting Started

Introduction

The three key programs that come with the Novoalign package are:

novoindex: A utility to construct an index for the reference sequences. Typically creates a k-mer index that can be loaded into shared memory for access by multiple search processes.
novoalign: An alignment tool for aligning short sequences against an indexed set of reference sequences. Typically used for aligning Illumina single end and paired end reads.
novoalignCS: An alignment tool for aligning color space sequences against an indexed set of reference sequences. Typically used for aligning SOLiD(TM) single end and mate-pair reads.

Requirements


A computer with an Intel/AMD x86-64 CPU running 64-bit Linux with a 2.6 kernel, or Mac OS X.

RAM requirements depend on the size of the genome you are aligning against. As a general rule, allow about 3 times the size of the reference genome. A minimum of 8 GB of RAM is recommended for alignments against the human genome.
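As a rough illustration of the rule of thumb above, the shell sketch below estimates the RAM needed from the size of a reference file. It assumes the file size in bytes approximates the genome size in bases (one byte per base, ignoring header lines and newlines). The function name `estimate_ram_bytes` is hypothetical and not part of the Novoalign package.

```shell
# Hypothetical helper, not part of the Novoalign package.
# Estimates RAM as 3x the reference file size (the rule of thumb on this page).
# Assumption: file size in bytes is close to genome size in bases.
estimate_ram_bytes() {
    ref_file="$1"
    # wc -c gives the byte count; tr strips padding that some wc versions add.
    size_bytes=$(wc -c < "$ref_file" | tr -d '[:space:]')
    echo $((size_bytes * 3))
}
```

For example, `estimate_ram_bytes ./sampledata/S_suis.dna` prints the estimate in bytes for the sample reference used later on this page.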

Getting Started

  1. Download the Novoalign tar file from www.novocraft.com. Click the "I agree" button, then look for the latest release, named like "Novo Package V2.xx.xx for X86-64 Linux (Static Link)".
  2. Untar with command:
    tar -xzf novo*tar.gz
    This will create a folder ./novocraft containing the Novoalign programs, some documentation files, and Perl scripts.
  3. Download the Sample Data tar file that is attached to this page.
  4. Untar with command:
    tar -xzf sampledata.tar.gz
    This will create a folder ./sampledata with some files that we'll use to run Novoalign.
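After the steps above, it can help to confirm that both archives unpacked where the later commands expect them. The sketch below checks for three paths that appear in the commands on this page. The function name `check_setup` is hypothetical, and the list of files is only a sample, not a full inventory of either archive.

```shell
# Hypothetical check, not part of the Novoalign package.
# Usage: check_setup [BASE_DIR]   (BASE_DIR defaults to the current directory)
# Returns 0 if the paths used later on this page exist, 1 otherwise.
check_setup() {
    base_dir="${1:-.}"
    missing=0
    for f in novocraft/novoindex novocraft/novoalign sampledata/S_suis.dna; do
        if [ ! -e "$base_dir/$f" ]; then
            echo "missing: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Run `check_setup` from the directory where you untarred both files; any missing path is reported on standard error.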

Running Novoalign

Build the indexed genome

./novocraft/novoindex ssuis.nix ./sampledata/S_suis.dna

Run Novoalign for Single End Reads

./novocraft/novoalign -d ssuis.nix -f ./sampledata/s_1_sequence.txt

Run Novoalign for Paired End Reads

./novocraft/novoalign -d ssuis.nix -f ./sampledata/s_1_0000.1.fastq ./sampledata/s_1_0000.2.fastq
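The single end and paired end commands above differ only in the number of read files given after -f. The sketch below wraps both cases in one function and saves the result to a file, assuming the alignment report is written to standard output. The function name `run_alignment` and the `NOVOALIGN` variable are hypothetical conveniences, not part of the Novoalign package.

```shell
# Hypothetical wrapper, not part of the Novoalign package.
# Usage: run_alignment INDEX OUTPUT READS1 [READS2]
# Assumption: novoalign writes its alignment report to standard output.
NOVOALIGN="${NOVOALIGN:-./novocraft/novoalign}"

run_alignment() {
    if [ "$#" -lt 3 ] || [ "$#" -gt 4 ]; then
        echo "usage: run_alignment INDEX OUTPUT READS1 [READS2]" >&2
        return 2
    fi
    index="$1"
    output="$2"
    shift 2
    # "$@" now holds one read file (single end) or two (paired end),
    # passed after -f exactly as in the commands above.
    "$NOVOALIGN" -d "$index" -f "$@" > "$output"
}
```

For example, `run_alignment ssuis.nix paired.out ./sampledata/s_1_0000.1.fastq ./sampledata/s_1_0000.2.fastq` runs the paired end alignment and stores the report in paired.out (an arbitrary file name).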

Using Novoalign on Bisulphite treated DNA

Build the indexed genome

./novocraft/novoindex -b ssuis.nbx ./sampledata/S_suis.dna

Run Novoalign for Single End Bi-Seq Reads

./novocraft/novoalign -d ssuis.nbx -f ./sampledata/sim_biseq.1.fastq

Run Novoalign for Paired End Bi-Seq Reads

./novocraft/novoalign -d ssuis.nbx -f ./sampledata/sim_biseq.1.fastq ./sampledata/sim_biseq.2.fastq

Using Novoalign on SOLiD(TM) Colorspace reads

Build the indexed genome for SOLiD colorspace alignment

./novocraft/novoindex -c ssuis.ncx ./sampledata/S_suis.dna
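This page builds three kinds of index from the same reference: a normal index (no option), a bisulphite index (-b), and a colorspace index (-c). The sketch below collects the three novoindex invocations in one helper. The function name `build_index`, the mode names, and the `NOVOINDEX` variable are hypothetical; the .nix, .nbx, and .ncx suffixes simply follow the file names used in the examples on this page.

```shell
# Hypothetical helper, not part of the Novoalign package.
# Usage: build_index MODE INDEX_BASENAME REFERENCE
#   MODE is one of: normal, bisulphite, colorspace
NOVOINDEX="${NOVOINDEX:-./novocraft/novoindex}"

build_index() {
    mode="$1"
    name="$2"
    ref="$3"
    case "$mode" in
        normal)     "$NOVOINDEX" "$name.nix" "$ref" ;;
        bisulphite) "$NOVOINDEX" -b "$name.nbx" "$ref" ;;
        colorspace) "$NOVOINDEX" -c "$name.ncx" "$ref" ;;
        *)
            echo "unknown mode: $mode" >&2
            return 2
            ;;
    esac
}
```

For example, `build_index colorspace ssuis ./sampledata/S_suis.dna` runs the same command as the colorspace index build shown above.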

Run Novoalign for Single End SOLiD Reads

./novocraft/novoalignCS -d ssuis.ncx -f reads.csfastq

Run Novoalign for mate-pair SOLiD Reads

./novocraft/novoalignCS -d ssuis.ncx -f file_F3.csfastq file_R3.csfastq

Specify the library insert size and standard deviation when working with mate-pair libraries

./novocraft/novoalignCS -d ssuis.ncx -f file_F3.csfastq file_R3.csfastq -i 3000 200


Note that novoalignCS accepts reads in .csfasta and .csfastq formats.
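Since novoalignCS expects colorspace reads, a script can check the file extension before starting a run. The sketch below tests only the file name, not the file contents, and the function name `is_colorspace_reads` is hypothetical.

```shell
# Hypothetical check, not part of the Novoalign package.
# Succeeds (exit status 0) if the file name ends in .csfasta or .csfastq.
# Only the name is inspected; the file contents are not validated.
is_colorspace_reads() {
    case "$1" in
        *.csfasta|*.csfastq) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `is_colorspace_reads reads.csfastq && ./novocraft/novoalignCS -d ssuis.ncx -f reads.csfastq` runs the alignment only when the name matches.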


That's it!



Created by system. Last Modification: Friday 18 of June, 2010 12:32:59 MYT by colin.
List of attached files
ID  Name               Description  Uploaded                                Size     Downloads
28  sampledata.tar.gz  Sample Data  Thu 16 of May, 2013 08:46 MYT by colin  1.75 Mb  47