Since publication we've been receiving many feature requests and bug reports. So far we've been concentrating on the latter, and it's now time to start concentrating on the former.
The first major round of fixes is now in the main distribution, version 0.3.0! One major change is a switch away from using pytables to parse BAM files and instead using a new tool called BamM. This makes the parse and extraction steps much faster.
We've had a few very dedicated beta testers try the new version out but as always we can't test all systems. If you find a bug, please let me know.
Of course, to take advantage of this, you'll need to reinstall GroopM :(
I use and love Linux and GroopM has been developed to work on a Linux system. I'm not saying it won't work elsewhere, but I haven't tried. YMMV. People have successfully used GroopM on many different flavours of Linux as well as on Mavericks 10.9. If you try it somewhere else then let me know. I'd like to keep this list up to date.
Using pip is the recommended method, as it will automatically install many, but not all, of GroopM's dependencies.
This guide assumes you're starting from a completely blank system, so it seems like there are a lot of packages to install. If you're installing this on a running bioinformatics system then many of these will already be installed. The following is what I'd type on a fresh Ubuntu install:
$ sudo apt-get -y install git build-essential zlib1g-dev python-numpy python-pip python-dev cython libhdf5-dev libfreetype6-dev libpng-dev python-pillow python-matplotlib libblas-dev liblapack-dev gfortran
Next you need to install numexpr and PyTables (the `tables` package). This is straightforward if the above dependencies have been met.
$ sudo pip install git+https://github.com/PyTables/PyTables.email@example.com#egg=tables
Finally, install GroopM
$ sudo pip install GroopM
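If you want to confirm that the Python dependencies actually landed, a quick import check along these lines should do it. This is only a sketch: the module names (numpy, numexpr, tables, matplotlib, PIL) are my assumption of what the packages above provide, and it uses whatever `python` is first on your PATH.

```shell
#!/bin/sh
# Sketch only: the module names are assumptions inferred from the packages
# installed above (tables for PyTables, PIL for Pillow).

# Report whether a Python module can be imported by the default interpreter.
check_module() {
    if python -c "import $1" >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "MISSING: $1"
    fi
}

for module in numpy numexpr tables matplotlib PIL; do
    check_module "$module"
done
```

Anything reported as MISSING needs to be installed before GroopM will run.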
If you prefer this type of thing you can always try installing from source directly. You will need the following dependencies:
Clone the repo from github:
$ git clone https://github.com/minillinim/GroopM.git
Then change into the GroopM directory and type:
$ sudo python setup.py install
GroopM was developed to be used in conjunction with a specific experimental design pattern. Before you try GroopM please ensure:
Still with me? Great!
Before you can use GroopM you'll need to assemble and map your reads. The general recipe is to make a co-assembly of ALL of your data using Velvet or similar. Take these contigs and map each of your read sets to them using BWA or similar. If you have N sampling points then your aim is to produce N sorted, indexed BAM files; samtools can help with this.
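As a rough illustration of the mapping recipe, the sketch below prints the commands for a co-assembly and three sampling points. Everything named here is an assumption for the example: the file names (contigs.fa, sample1_R1.fq and so on) are hypothetical, the reads are assumed to be paired-end, and it assumes BWA-MEM plus samtools 1.3 or later. Your own mapper and file layout may differ.

```shell
#!/bin/sh
# Sketch only. File names (contigs.fa, <sample>_R1.fq, <sample>_R2.fq) are
# hypothetical placeholders; BWA-MEM and samtools >= 1.3 are assumed.

# Print the commands that map each sample's reads to the co-assembly and
# produce one sorted, indexed BAM file per sample.
print_mapping_commands() {
    contigs="$1"
    shift
    echo "bwa index $contigs"
    for sample in "$@"; do
        echo "bwa mem $contigs ${sample}_R1.fq ${sample}_R2.fq | samtools sort -o ${sample}.bam -"
        echo "samtools index ${sample}.bam"
    done
}

# Three sampling points -> three sorted, indexed BAM files.
print_mapping_commands contigs.fa sample1 sample2 sample3
```

The script only prints the commands so you can check them first; once they look right you can pipe the output to `sh` to run them.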
The typical workflow for GroopM is as follows:
GroopM was designed to be as parameter-free as possible. For more information on these steps type:
$ groopm OPTION -h
After you've finished binning your contigs you will need to assess their quality (completeness + contamination). We suggest using our other tool, CheckM, to do this.
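A minimal sketch of what that assessment might look like, assuming CheckM's `lineage_wf` workflow and a folder of extracted bins. The folder names and the file extension are hypothetical, so adjust them to match your own output and check `checkm -h` for the options your version supports.

```shell
#!/bin/sh
# Sketch only: BIN_DIR, OUT_DIR and EXT are hypothetical; set them to match
# wherever your bins were written and the extension they use.
BIN_DIR="bins"
OUT_DIR="checkm_out"
EXT="fna"

# Build the CheckM command line for a folder of bins.
# Arguments: <bin folder> <bin file extension> <output folder>
checkm_command() {
    echo "checkm lineage_wf -x $2 $1 $3"
}

cmd=$(checkm_command "$BIN_DIR" "$EXT" "$OUT_DIR")
if command -v checkm >/dev/null 2>&1 && [ -d "$BIN_DIR" ]; then
    # CheckM is installed and the bin folder exists, so run it.
    $cmd
else
    echo "Would run: $cmd"
fi
```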
If you use this software then we'd love you to cite us. Our paper is now published at PeerJ. You can get it here. Please cite as: "Imelfort M, Parks D, Woodcroft BJ, Dennis P, Hugenholtz P, Tyson GW. (2014) GroopM: an automated tool for the recovery of population genomes from related metagenomes. PeerJ 2:e603 http://dx.doi.org/10.7717/peerj.603".
All GroopM related suggestions, criticisms or abuse should be directed to Mike Imelfort. m_dot_imelfort_at_uq_dot_edu_dot_au
GroopM is licensed using the GNU General Public License version 3 as published by the Free Software Foundation.
This site and the GroopM logo are copyright Mike Imelfort.
This site was created using a template by the wonderful people at bootswatch.