Package bamm :: Module bamExtractor :: Class BamExtractor

Class BamExtractor

Class used to manage extracting reads from multiple BAM files

Instance Methods
 
__init__(self, contigs, bamFiles, prefix='', groupNames=[], outFolder='.', mixBams=False, mixGroups=False, mixReads=False, interleaved=False, bigFile=False, headersOnly=False, minMapQual=0, maxMisMatches=1000, useSuppAlignments=False, useSecondaryAlignments=False)
Default constructor.
 
extract(self, threads=1, verbose=False)
Start extracting reads from the BAM files
 
managePrintQueue(self)
Write all the print requests to stdout / stderr
 
makeSurePathExists(self, path)
Make sure that a path exists, make it if necessary
Instance Variables
  outFilePrefixes
for bid in range(len(self.bamFiles)):...
Method Details

__init__(self, contigs, bamFiles, prefix='', groupNames=[], outFolder='.', mixBams=False, mixGroups=False, mixReads=False, interleaved=False, bigFile=False, headersOnly=False, minMapQual=0, maxMisMatches=1000, useSuppAlignments=False, useSecondaryAlignments=False)
(Constructor)

Default constructor.

Set all the instance variables, make ReadSets, organise output files

Inputs:
 contigs - [[string]], list of lists of contig IDs (used as a filter)
 bamFiles - [string], list of BAM files to extract reads from
 prefix - string, prepend this string to the names of all output files
 groupNames - [string], list of names of the groups in the contigs list
 outFolder - path, write output to this folder
 mixBams - == True -> use one file for all bams
 mixGroups - == True -> use one file for all groups
 mixReads - == True -> use one file for paired / unpaired reads
 interleaved - == True -> use interleaved format for paired reads
 bigFile - == True -> do NOT gzip outputs
 headersOnly - == True -> write read headers only
 minMapQual - int, skip all reads with a lower mapping quality score
 maxMisMatches - int, skip all reads with more mismatches (NM aux field)
 useSuppAlignments - == True -> DON'T skip supplementary alignments
 useSecondaryAlignments - == True -> DON'T skip secondary alignments

Outputs:
 None

extract(self, threads=1, verbose=False)

Start extracting reads from the BAM files

This function is responsible for starting and stopping all threads and
processes used in bamm extract. Because Python's multiprocessing must
pickle everything it passes to worker processes, the actual work of
extraction is carried out in the module-level function
externalExtractWrapper; see that function for the extraction details.
This function is primarily concerned with thread and process management.

Inputs:
 threads - int, the number of threads / processes to use
 verbose - bool, True if lots of stuff should be printed to screen

Outputs:
 None
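
The pickling constraint described above can be seen without starting any processes. The following is a minimal sketch, not taken from the bamm source: `can_pickle`, `Manager` and `job_args` are hypothetical names used only for illustration. It shows that plain job data and a function defined at module level in an importable module can be pickled, while a bound method of an object holding a lock cannot, which is the usual reason for moving worker code into a module-level wrapper.

```python
import os.path
import pickle
import threading


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        # pickle raises different exception types depending on what failed
        return False


class Manager(object):
    """Hypothetical stand-in for an object that owns process-management state."""

    def __init__(self):
        self.lock = threading.Lock()  # lock objects cannot be pickled

    def work(self, job):
        return job


# Plain data describing a job (made-up values) pickles without trouble.
job_args = ("sample.bam", ["contig_1", "contig_2"], 0)
plain_data_ok = can_pickle(job_args)

# A module-level function from an importable module is pickled by name.
module_function_ok = can_pickle(os.path.basename)

# A bound method drags its whole instance, lock included, into the pickle.
bound_method_ok = can_pickle(Manager().work)

print(plain_data_ok, module_function_ok, bound_method_ok)
```

Under these assumptions the first two checks succeed and the third fails, so a worker is best written as a module-level function that receives only plain, picklable arguments.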

managePrintQueue(self)

Write all the print requests to stdout / stderr

This function is run as a process and so can be terminated.
Place a None on the printQueue to terminate the process.

Change self.outputStream to determine where text will be written to.

Inputs:
 None

Outputs:
 None
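
The None-terminated queue described above is a common sentinel pattern. The sketch below is an illustration only and is not the bamm implementation: it uses a thread and `queue.Queue` rather than a separate process so that it stays self-contained, and `manage_print_queue`, `print_queue` and `output_stream` are assumed names standing in for the method and attributes mentioned in the text.

```python
import io
import queue
import threading


def manage_print_queue(print_queue, output_stream):
    """Write queued messages to output_stream until a None sentinel arrives."""
    while True:
        message = print_queue.get()
        if message is None:
            # None is the agreed signal to stop consuming
            break
        output_stream.write(message + "\n")


print_queue = queue.Queue()
# Swapping this object changes where the text goes (e.g. sys.stdout or sys.stderr).
output_stream = io.StringIO()

writer = threading.Thread(target=manage_print_queue,
                          args=(print_queue, output_stream))
writer.start()

for line in ["opening BAM", "extracting reads", "done"]:
    print_queue.put(line)

print_queue.put(None)  # ask the writer to terminate
writer.join()

captured = output_stream.getvalue()
```

Routing all messages through a single consumer keeps output from concurrent workers from interleaving, and the sentinel gives the producer a clean way to shut the consumer down.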

makeSurePathExists(self, path)

Make sure that a path exists, make it if necessary

Inputs:
 path - string, full or relative path to create

Outputs:
 None
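
One common way to get this behaviour in Python 3 is `os.makedirs` with `exist_ok=True`. The sketch below assumes that approach and is not necessarily how bamm implements the method; `make_sure_path_exists` and the directory names are hypothetical.

```python
import os
import tempfile


def make_sure_path_exists(path):
    """Create path (and any missing parents); do nothing if it already exists."""
    os.makedirs(path, exist_ok=True)


# Work inside a throwaway directory so the example leaves the real tree alone.
base = tempfile.mkdtemp()
target = os.path.join(base, "out", "group_1")

make_sure_path_exists(target)
make_sure_path_exists(target)  # calling again on an existing path is harmless

created = os.path.isdir(target)
```

Note that `exist_ok=True` still raises an error if the path exists but is a regular file rather than a directory.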


Instance Variable Details

outFilePrefixes


        for bid in range(len(self.bamFiles)):
            for gid in range(len(self.groupNames)):
                for rpi in [RPI.FIR, RPI.SEC, RPI.SNGL, RPI.SNGL_FIR, RPI.SNGL_SEC]:
                    sys.stderr.write("%s %s %s %s\n" % (self.prettyBamFileNames[bid], self.groupNames[gid], RPI2Str(rpi), str(self.outFilePrefixes[bid][gid][rpi])))