CloudBurst Wiki
Brought to you by:
mcschatz
Usage: CloudBurst refpath qrypath outpath readlen k allowdifferences filteralignments #mappers #reduces #fmappers #freducers blocksize redundancy

refpath: path in hdfs to the reference file
qrypath: path in hdfs to the query file
outpath: path to a directory to store the results (old results are automatically deleted)
readlen: minimum length of the reads
k: number of mismatches / differences to allow (higher number requires more time)
allowdifferences: 0: mismatches only, 1: indels as well
filteralignments: 0: all alignments, 1: only report unambiguous best alignment (results identical to RMAP)
#mappers: number of mappers to use. suggested: #processor-cores * 10
#reduces: number of reducers to use. suggested: #processor-cores * 2
#fmappers: number of mappers for filtration alg. suggested: #processor-cores
#freducers: number of reducers for filtration alg. suggested: #processor-cores
blocksize: number of qry and ref tuples to consider at a time in the reduce phase. suggested: 128
redundancy: number of copies of low complexity seeds to use. suggested: #processor-cores
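The suggested values above all follow from the number of processor cores, so they can be worked out mechanically. The following is a minimal Python sketch of that arithmetic, not part of CloudBurst itself: the helper names (suggested_settings, build_args) and the example paths are hypothetical, and the sketch only assembles the thirteen positional arguments in the order given by the usage line. How the program is actually launched (for example, through a Hadoop jar invocation) is not covered here and depends on your installation.

```python
def suggested_settings(cores):
    """Return the suggested tuning values for a cluster with `cores` processor cores.

    The multipliers are taken from the usage text; the function itself is
    an illustrative helper, not part of CloudBurst.
    """
    return {
        "mappers": cores * 10,   # suggested: #processor-cores * 10
        "reduces": cores * 2,    # suggested: #processor-cores * 2
        "fmappers": cores,       # suggested: #processor-cores
        "freducers": cores,      # suggested: #processor-cores
        "blocksize": 128,        # suggested: 128
        "redundancy": cores,     # suggested: #processor-cores
    }


def build_args(refpath, qrypath, outpath, readlen, k, cores,
               allow_differences=False, filter_alignments=False):
    """Assemble the 13 positional arguments in the order of the usage line."""
    s = suggested_settings(cores)
    values = [
        refpath, qrypath, outpath,
        readlen, k,
        int(allow_differences),   # 0: mismatches only, 1: indels as well
        int(filter_alignments),   # 0: all alignments, 1: unambiguous best only
        s["mappers"], s["reduces"],
        s["fmappers"], s["freducers"],
        s["blocksize"], s["redundancy"],
    ]
    return [str(v) for v in values]


if __name__ == "__main__":
    # Hypothetical paths and read settings, chosen only for illustration.
    args = build_args("/data/ref.br", "/data/qry.br", "/data/results",
                      readlen=36, k=3, cores=24)
    print("CloudBurst " + " ".join(args))
```

Treat the computed numbers as starting points: they are the suggestions from the usage text, and the best values for a given cluster may differ.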