Alignment of small RNA or microRNA (miRNA) reads from the Illumina platform is straightforward with Novoalign. Use the adaptor stripping option to remove terminal adaptor sequences left by the Illumina GAII/HiSeq protocol.
Align short reads with microRNA adaptors present on the 3′ end. The -a option with no argument uses the default Illumina adaptor sequence:
novoalign -d arabidopsis.nix -f miRNA_reads.fastq -a -l 15 -t 30 > alignment_output.txt
In some cases we may want to limit the aligner to using the first 22 bases of the read:
novoalign -d arabidopsis.nix -f miRNA_reads.fastq -n 22 -l 15 -t 30 > alignment_output.l22.txt
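To get a feel for what restricting alignment to the first 22 bases means for the input, the sketch below truncates FASTQ records with awk. It is only an illustration and not a description of how Novoalign applies -n internally; the helper name truncate_fastq is hypothetical, and it assumes plain four-line FASTQ records with no wrapped sequence lines.

```shell
# Hypothetical helper: keep only the first N characters of the sequence
# and quality lines of each FASTQ record read from standard input.
# Assumes exactly four lines per record (header, sequence, '+', quality).
truncate_fastq() {
    n="$1"
    awk -v n="$n" '
        NR % 4 == 2 || NR % 4 == 0 { print substr($0, 1, n); next }  # sequence / quality
        { print }                                                     # header and "+" lines
    '
}
```

For example, truncate_fastq 22 < miRNA_reads.fastq | head prints the first few records as they would look when cut to 22 bases.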
The example below turns on microRNA mode to report the hairpin score of the reverse complement alignment. Note that lower microRNA scores in the output report are more significant:
# Strip off the adaptor and set microRNA mode
novoalign -d arabidopsis.nix -f miRNA_reads.fastq -a -m -l 15 -t 30 > mirna_output.l22.txt
In the case of SOLiD colorspace small RNA/miRNA reads, the adaptor needs to be removed prior to alignment against the reference genome. The cutadapt program can be used to trim off adaptor sequence introduced during small RNA library preparation. Below we use cutadapt to remove the adaptor “330201030313112312” of the SOLiD™ Small RNA Expression Kit (SREK) protocol.
Note. cutadapt expects BFAST style .csfastq files without a quality for the first nucleotide. The current version will abort with an error message if given a BWA format csfastq file.
# Matches the colorspace adaptor and removes it
# Adds a "TRIM:" prefix to each read name
cutadapt -c -e 0.12 -a 330201030313112312 -x TRIM: solid.csfasta solid.qual > trimmed.cfastq
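Before aligning, it can be useful to check how long the reads are after trimming, since very short reads fall below the minimum length set with -l. The following is a minimal sketch rather than part of the Novoalign or cutadapt tooling: the helper name length_histogram is hypothetical, it assumes four-line records, and for colorspace files the counted length includes the leading primer base on the sequence line.

```shell
# Hypothetical helper: tabulate sequence-line lengths of a four-line
# FASTQ/csfastq stream read from standard input.
# Output: one "length count" pair per line, sorted by length.
length_histogram() {
    awk '
        NR % 4 == 2 { count[length($0)]++ }           # second line of each record
        END { for (len in count) print len, count[len] }
    ' | sort -n
}
```

Running length_histogram < trimmed.cfastq gives a quick view of the trimmed read length distribution.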
We are now able to align these trimmed reads to the reference genome with NovoalignCS:
# Align the trimmed miRNA reads
# Remember to build the arabidopsis.ncx index using "novoindex -c"
novoalignCS -d arabidopsis.ncx -f trimmed.cfastq -o SAM -l 15 -t 40 -r All > output.sam
Alternatively, align the first 22 bases of the colorspace read without trimming the adaptor:
# Align first 22bp miRNA reads
novoalignCS -d arabidopsis.ncx -f trimmed.cfastq -n 22 -o SAM -t 40 -r All > output.sam