bamm.bamExtractor.BamExtractor

__init__(self, contigs, bamFiles, prefix='', groupNames=[], outFolder='.', mixBams=False, mixGroups=False, mixReads=False, interleaved=False, bigFile=False, headersOnly=False, minMapQual=0, maxMisMatches=1000, useSuppAlignments=False, useSecondaryAlignments=False)
(Constructor)


Default constructor.

Set all the instance variables, make ReadSets, organise output files

Inputs:
 contigs - [[string]], list of lists of contig IDs (used as a filter)
 bamFiles - [string], list of bamfiles to extract reads from
 prefix - string, prepend this string to the names of all output files
 groupNames - [string], list of names of the groups in the contigs list
 outFolder - path, write output to this folder
 mixBams - == True -> use one file for all bams
 mixGroups - == True -> use one file for all groups
 mixReads - == True -> use one file for paired / unpaired reads
 interleaved - == True -> use interleaved format for paired reads
 bigFile - == True -> do NOT gzip outputs
 headersOnly - == True -> write read headers only
 minMapQual - int, skip all reads with a lower mapping quality score
 maxMisMatches - int, skip all reads with more mismatches (NM aux field)
 useSuppAlignments - == True -> DON'T skip supplementary alignments
 useSecondaryAlignments - == True -> DON'T skip secondary alignments

Outputs:
 None
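
The read-filtering parameters above can be illustrated with a minimal sketch. This is not BamM's implementation: keep_read and its argument names are hypothetical, and the only outside facts used are the SAM flag bits for secondary (0x100) and supplementary (0x800) alignments. The comparisons follow the wording of the parameter descriptions ("lower" quality and "more" mismatches are skipped).

```python
# SAM flag bits as defined by the SAM specification.
FLAG_SECONDARY = 0x100
FLAG_SUPPLEMENTARY = 0x800


def keep_read(flag, mapq, nm, min_map_qual=0, max_mismatches=1000,
              use_supp=False, use_secondary=False):
    """Hypothetical helper: return True if a read passes all filters.

    flag - int, SAM bitwise flag of the alignment
    mapq - int, mapping quality of the alignment
    nm   - int, number of mismatches (value of the NM aux field)
    """
    if mapq < min_map_qual:
        # Lower mapping quality than the threshold: skip.
        return False
    if nm > max_mismatches:
        # More mismatches than allowed: skip.
        return False
    if (flag & FLAG_SUPPLEMENTARY) and not use_supp:
        # Supplementary alignments are skipped unless explicitly enabled.
        return False
    if (flag & FLAG_SECONDARY) and not use_secondary:
        # Secondary alignments are skipped unless explicitly enabled.
        return False
    return True
```

With the defaults (minMapQual=0, maxMisMatches=1000), only supplementary and secondary alignments would be dropped by a filter of this shape.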

extract(self, threads=1, verbose=False)

Start extracting reads from the BAM files

This function is responsible for starting and stopping all threads and
processes used in bamm extract. Because Python's multiprocessing module
must pickle everything it passes between processes, the actual work of
extraction is carried out in the top-level (module-level) function
externalExtractWrapper. See that function for the extraction details.
This function is primarily concerned with thread and process management.

Inputs:
 threads - int, the number of threads / processes to use
 verbose - bool, True if lots of stuff should be printed to screen

Outputs:
 None
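
The split between a managing method and a top-level worker can be sketched as follows. This is a minimal illustration of the pattern, not BamM's code: ExtractManager, external_extract_wrapper and the (bam_file, group_name) task tuples are hypothetical names chosen for the example.

```python
import multiprocessing


def external_extract_wrapper(task):
    """Top-level worker, picklable because it lives at module scope.

    task is a hypothetical (bam_file, group_name) pair. A real worker
    would open the BAM file and extract reads here.
    """
    bam_file, group_name = task
    return "%s:%s" % (bam_file, group_name)


class ExtractManager(object):
    """Hypothetical stand-in for a class that only manages processes."""

    def __init__(self, bam_files, group_names):
        # Plain tuples pickle cleanly, unlike objects that hold open
        # file handles or other process-local state.
        self.tasks = [(b, g) for b in bam_files for g in group_names]

    def extract(self, threads=1):
        # Only the module-level function and the task tuples cross the
        # process boundary; this method just starts and stops the pool.
        pool = multiprocessing.Pool(processes=threads)
        try:
            return pool.map(external_extract_wrapper, self.tasks)
        finally:
            pool.close()
            pool.join()
```

On platforms that spawn rather than fork worker processes, a call such as ExtractManager(["a.bam"], ["g1"]).extract(threads=2) should sit under an `if __name__ == "__main__":` guard.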

managePrintQueue(self)

Write all the print requests to stdout / stderr

This function is run as a process and so can be terminated.
Place a None on the printQueue to terminate the process.

Change self.outputStream to determine where text will be written to.

Inputs:
 None

Outputs:
 None
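
The None-terminated print queue described above can be sketched as a simple consumer loop. This is an assumption-laden illustration rather than the library's code: manage_print_queue is a hypothetical free function, the queue is an in-process queue.Queue instead of a multiprocessing queue, and the trailing newline per message is a choice made for the example.

```python
import io
import queue


def manage_print_queue(print_queue, output_stream):
    """Write queued messages to output_stream until None is received."""
    while True:
        message = print_queue.get()
        if message is None:
            # Sentinel value: stop consuming and return.
            break
        output_stream.write("%s\n" % message)


# Fill the queue, ending with the None sentinel that stops the loop.
q = queue.Queue()
for msg in ["Extracting reads", "Done", None]:
    q.put(msg)

# An in-memory stream stands in for stdout / stderr here.
stream = io.StringIO()
manage_print_queue(q, stream)
```

Swapping the stream argument plays the role that changing self.outputStream plays in the class.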

makeSurePathExists(self, path)

Make sure that a path exists, make it if necessary

Inputs:
 path - string, full or relative path to create

Outputs:
 None
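
One common way to get this behaviour in Python 3 is os.makedirs with exist_ok=True. The sketch below is not taken from BamM's source: make_sure_path_exists is a hypothetical free function, and the temporary directory is only there to make the example self-contained.

```python
import os
import tempfile


def make_sure_path_exists(path):
    """Create path (and any missing parents) if it does not exist yet."""
    # exist_ok=True makes the call a no-op when the folder is already there.
    os.makedirs(path, exist_ok=True)


# Create a nested folder inside a scratch directory, then repeat the call
# to show that an existing path does not raise an error.
base = tempfile.mkdtemp()
target = os.path.join(base, "out", "nested")
make_sure_path_exists(target)
make_sure_path_exists(target)
```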

outFilePrefixes


        for bid in range(len(self.bamFiles)):
            for gid in range(len(self.groupNames)):
                for rpi in [RPI.FIR, RPI.SEC, RPI.SNGL, RPI.SNGL_FIR, RPI.SNGL_SEC]:
                    sys.stderr.write("%s %s %s %s\n" % (self.prettyBamFileNames[bid], self.groupNames[gid], RPI2Str(rpi), str(self.outFilePrefixes[bid][gid][rpi])))

Class BamExtractor

__init__(self, contigs, bamFiles, prefix='', groupNames=[], outFolder='.', mixBams=False, mixGroups=False, mixReads=False, interleaved=False, bigFile=False, headersOnly=False, minMapQual=0, maxMisMatches=1000, useSuppAlignments=False, useSecondaryAlignments=False) (Constructor)

extract(self, threads=1, verbose=False)

managePrintQueue(self)

makeSurePathExists(self, path)

outFilePrefixes
