MPI results comparison

Hi all, could someone help explain why I'm seeing significantly different output with novoalignMPI vs standard novoalign? See below - thanks!

  1. MPI command and results

novoalignMPI --hdrhd off -c 4 -v 120 -i PE 425,80 -x 5 -r Random -F ILMFQ
-d /indexes/human/ncbi/37.1/indexed/allchr.nix
-f /reads/s_1_1_sequence.txt reads/s_1_2_sequence.txt
-o SAM

Starting at Tue May 29 09:29:34 2012
Interpreting input files as Illumina FASTQ, Casava Pipeline 1.3 to 1.7.
Index Build Version: 2.7
Hash length: 14
Step size: 2
samopen SAM header is present: 84 sequences.
Read Sequences: 82391350
Aligned: 80634157
Unique Alignment: 79832999
Gapped Alignment: 895978
Quality Filter: 27908
Homopolymer Filter: 9386
Elapsed Time: 3798.351 (sec.)
CPU Time: 827.4 (min.)
Fragment Length Distribution
From To Count
No pairs found
Done at Tue May 29 10:32:53 2012

  2. Std Novoalign and results

novoalign --hdrhd off -c 4 -v 120 -i PE 425,80 -x 5 -r Random -F ILMFQ
-d /indexes/allchr.nix
-f /reads/s_1_1_sequence.txt /reads/s_1_2_sequence.txt -o SAM

Starting at Tue May 29 13:49:47 2012
Interpreting input files as Illumina FASTQ, Casava Pipeline 1.3 to 1.7.
Index Build Version: 2.7
Hash length: 14
Step size: 2
samopen SAM header is present: 84 sequences.
Paired Reads: 82391350
Pairs Aligned: 80393967
Read Sequences: 164782700
Aligned: 161837302
Unique Alignment: 160753885
Gapped Alignment: 2152908
Quality Filter: 565523
Homopolymer Filter: 5872
Elapsed Time: 43403.363 (sec.)
CPU Time: 2818.1 (min.)
Fragment Length Distribution
From To Count
56 63 7
64 71 238
72 79 2274
80 87 17655
88 95 38315
96 103 87367
104 111 202694
112 119 447758
......


Hi Lebowski,

The MPI run ran in single-end mode, not paired-end mode. I can't be sure why unless you post the log file from the run.
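For anyone wanting to check this from the summaries alone, here is a rough Python sketch. It is not part of Novoalign; the function names are made up for illustration, and it assumes only the "Name: number" layout visible in the two summaries posted above. The paired run reports a "Paired Reads" counter and the single-end run does not. The MPI run's "Read Sequences" count (82391350) is also exactly half of the standard run's (164782700), which is consistent with only one of the two files being read.

```python
import re

# Matches summary lines of the form "Name: 12345" (integer counters only).
COUNTER_LINE = re.compile(r"^\s*([A-Za-z ]+):\s+(\d+)\s*$")

def parse_counters(log_text):
    """Collect the integer counters from a run summary into a dict."""
    counters = {}
    for line in log_text.splitlines():
        match = COUNTER_LINE.match(line)
        if match:
            counters[match.group(1).strip()] = int(match.group(2))
    return counters

def ran_in_paired_mode(log_text):
    """Heuristic only: the paired summary in this thread has a 'Paired Reads' line."""
    return "Paired Reads" in parse_counters(log_text)

# Excerpts copied from the two summaries in this thread.
mpi_log = """Read Sequences: 82391350
Aligned: 80634157
No pairs found"""
std_log = """Paired Reads: 82391350
Pairs Aligned: 80393967
Read Sequences: 164782700"""

print(ran_in_paired_mode(mpi_log), ran_in_paired_mode(std_log))  # prints: False True
```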

One case where this can happen is if you specify -f twice, like this:

novoalignMPI --hdrhd off -c 4 -v 120 -i PE 425,80 -x 5 -r Random -F ILMFQ
-d /indexes/human/ncbi/37.1/indexed/allchr.nix
-f /reads/s_1_1_sequence.txt -f reads/s_1_2_sequence.txt
-o SAM

Then, as each -f option specifies only one file, the input is treated as single end and only the last -f option is processed.

Kind Regards, Colin


Hi Lebowski,

It looks like the --hdrhd off check has somehow upset the MPI version. Try --hdrhd 99 and see if it helps.

Colin


Hi Colin,

It ran in paired-end mode after changing the hdrhd argument. From the little I can see right off the bat, this is a threshold setting for paired-end reads. What I still need to understand is the impact of 99 vs. off. Thank you again, Colin.


Hi,

--hdrhd 99 allows up to 99 letters of difference between the two headers. --hdrhd off should turn the test off, but there was an issue in the MPI version of the file reader where off wasn't working: all headers were considered different, and novoalignMPI then treated the reads as single end rather than reporting an error.
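To make the idea of a "letters difference" threshold concrete, here is a minimal Python sketch of a header comparison of this kind. It is an illustration only, not Novoalign's actual code: the function names are hypothetical, the example headers are made up, counting a length difference as extra mismatches is an assumption, and passing None is just this sketch's stand-in for "off".

```python
def header_hamming_distance(header_1, header_2):
    """Count the positions at which two read headers differ.

    Assumption for this sketch: any difference in length is added
    to the count as extra mismatches.
    """
    mismatches = sum(a != b for a, b in zip(header_1, header_2))
    return mismatches + abs(len(header_1) - len(header_2))

def headers_look_paired(header_1, header_2, max_distance):
    """Accept the two headers as a pair if they differ in at most max_distance letters.

    max_distance=None stands in for "off": the test is skipped entirely.
    """
    if max_distance is None:
        return True
    return header_hamming_distance(header_1, header_2) <= max_distance

# Made-up headers that differ only in the trailing mate number.
mate_1 = "@HWI-EAS100:1:1:6:73#0/1"
mate_2 = "@HWI-EAS100:1:1:6:73#0/2"

print(header_hamming_distance(mate_1, mate_2))   # prints: 1
print(headers_look_paired(mate_1, mate_2, 99))   # prints: True
```

With a threshold as large as 99, almost any two headers of ordinary length pass the test, so in practice it behaves much like switching the check off.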

Colin

