MicroRNA Mapping

Alignment of small RNAs or microRNAs (miRNA) from the Illumina platform is easily accomplished with Novoalign. Use the adaptor stripping option to remove terminal adaptor sequences left by the microRNA protocol.

Illumina GAII/HiSeq protocol

Align short reads with microRNA adaptors present on the 3′ end. Passing -a without a sequence uses the default Illumina adaptor sequence:

novoalign -d arabidopsis.nix -f miRNA_reads.fastq -a -l 15 -t 30 > alignment_output.txt

In some cases we may want to limit the aligner to using the first 22 bases of the read:

novoalign -d arabidopsis.nix -f miRNA_reads.fastq -n 22 -l 15 -t 30 > alignment_output.l22.txt
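
To make the effect of limiting reads to their first 22 bases concrete, the sketch below performs a comparable truncation as a separate pre-processing step with awk. It is only an illustration, not how Novoalign works internally: the function name truncate_fastq and the output file name are hypothetical, and the input is assumed to be a FASTQ file with exactly four lines per record.

```shell
#!/bin/sh
# Hypothetical helper (not part of Novoalign): keep only the first N bases
# of each read in a 4-line-per-record FASTQ file, and trim the quality
# string to the same length so each record stays valid.
truncate_fastq() {
    n="$1"      # number of leading bases to keep
    fastq="$2"  # input FASTQ file
    awk -v n="$n" '
        # sequence (line 2) and quality (line 4) of each record
        NR % 4 == 2 || NR % 4 == 0 { print substr($0, 1, n); next }
        # header and "+" separator lines pass through unchanged
        { print }
    ' "$fastq"
}

# Example (output file name is illustrative):
# truncate_fastq 22 miRNA_reads.fastq > miRNA_reads.first22.fastq
```

Reads shorter than the requested length are passed through unchanged.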

The example below turns on microRNA mode to report the hairpin score of the reverse complement alignment. Note that lower microRNA scores in the output report are more significant:

# Strip off the adaptor and set microRNA mode
novoalign -d arabidopsis.nix -f miRNA_reads.fastq -a -m -l 15 -t 30 > mirna_output.l22.txt

SOLiD protocol

In the case of SOLiD colorspace small RNA/miRNA reads, the adaptor needs to be removed prior to alignment against the reference genome. The cutadapt program can be used to trim off adaptor sequence introduced during laboratory preparation of the small RNA. Below we use cutadapt to remove the adaptor “330201030313112312” from the SOLiD™ Small RNA Expression Kit (SREK) protocol.

Note: cutadapt expects BFAST-style .csfastq files, which have no quality value for the first nucleotide. The current version will abort with an error message if given a BWA-format csfastq file.

# Matches the colorspace adaptor and removes it
# Adds a "TRIM:" prefix to each read name
cutadapt -c -e 0.12 -a 330201030313112312 -x TRIM: solid.csfasta solid.qual > trimmed.cfastq
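
A quick sanity check after trimming is to look at the read-length distribution, since small RNA reads should cluster around the expected miRNA length once the adaptor is gone. The sketch below is one way to do that with awk. The function name read_length_histogram is hypothetical, and it assumes the trimmed file has four lines per record. If the sequence line still carries the leading primer base, as in csfastq, the reported lengths are one greater than the number of colors.

```shell
#!/bin/sh
# Hypothetical helper: print a "length<TAB>count" table for the sequence
# lines of a 4-line-per-record FASTQ/csfastq file, sorted by length.
read_length_histogram() {
    awk '
        NR % 4 == 2 { count[length($0)]++ }   # line 2 of each record is the read
        END { for (len in count) print len "\t" count[len] }
    ' "$1" | sort -n
}

# Example:
# read_length_histogram trimmed.cfastq
```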

We are now able to align these trimmed reads to the reference genome with NovoalignCS:

# Align the trimmed miRNA reads
# Remember to build the arabidopsis.ncx index using "novoindex -c"
novoalignCS -d arabidopsis.ncx -f trimmed.cfastq -o SAM -l 15 -t 40 -r All > output.sam
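
Once the SAM file exists, a rough tally of aligned and unaligned records gives a first impression of how the run went. The sketch below uses only awk and the SAM FLAG field (bit 0x4 marks an unmapped read); the function name count_sam_records is hypothetical. Because "-r All" reports every alignment location, a multi-mapping read contributes several records, so the "mapped" figure counts alignment records rather than distinct reads.

```shell
#!/bin/sh
# Hypothetical helper: count mapped and unmapped alignment records in a
# SAM file by testing bit 0x4 of the FLAG column (field 2).
count_sam_records() {
    awk -F '\t' '
        /^@/ { next }                               # skip SAM header lines
        int($2 / 4) % 2 == 1 { unmapped++; next }   # FLAG bit 0x4 set: unmapped
        { mapped++ }
        END { printf "mapped\t%d\nunmapped\t%d\n", mapped, unmapped }
    ' "$1"
}

# Example:
# count_sam_records output.sam
```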

Alternatively, align the first 22 bases of the colorspace read without trimming the adaptor:

# Align first 22bp miRNA reads
novoalignCS -d arabidopsis.ncx -f trimmed.cfastq -n 22 -o SAM -t 40 -r All > output.sam

