Getting Started
This is a package for the julia programming language, designed for working with the bioBakery family of tools for metagenomic analysis of microbial communities. Currently, we support MetaPhlAn and HUMAnN.
Read on to learn how to install the package and begin using it to uncover insights about your microbial community data! If you run into problems, you can open an issue on this repository, or start a discussion over on Microbiome.jl.
Installing julia
If this is your first time using julia, you'll need to install it by going to the julia downloads page and following the instructions for your platform. BiobakeryUtils.jl should work on any julia version >= 1.6.0.
Alternatively, you can use jill.py, which is an easy-to-use python utility for installing julia.
Launching julia from the terminal
If you download the "app" versions of julia from the downloads page above, you may also want to add julia to your shell's $PATH so that you can launch it from your terminal. Windows users can look here for instructions. Mac users, see here for instructions.
Making a project
In julia, it's typically a good idea to use "projects" to organize your package dependencies (this is similar to the "environments" that conda uses).
To do this, make a directory and "activate" it in the julia Pkg REPL.
$ mkdir my_project
$ cd my_project
$ julia
_
_ _ _(_)_ | Documentation: https://docs.julialang.org
(_) | (_) (_) |
_ _ _| |_ __ _ | Type "?" for help, "]?" for Pkg help.
| | | | | | |/ _` | |
| | |_| | | | (_| | | Version 1.6.1 (2021-04-23)
_/ |\__'_|_|_|\__'_| | Official https://julialang.org/ release
|__/ |
julia> # press ] to enter the Pkg REPL
(@v1.6) pkg> activate .
Activating new environment at `~/my_project/Project.toml`
(my_project) pkg> # press backspace to get back to julia REPL
julia>
So far, this is still just an empty directory, but you can also use the Pkg REPL to install packages, like BiobakeryUtils.jl.
(my_project) pkg> add BiobakeryUtils
Once this process completes, the directory will contain a Project.toml file that lists BiobakeryUtils.jl as a dependency, and a Manifest.toml file that records the exact versions of all dependencies installed for this environment.
In the future, you can launch julia with the environment already activated using julia --project if your working directory is my_project/, or julia --project=<path to project> if you're in a different working directory (e.g. julia --project=~/my_project if my_project/ is in the home directory).
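The same activation can also be done from a script instead of the Pkg REPL. The following is a minimal sketch, assuming the standard-library Pkg API; the helper name activate_project is hypothetical, and a temporary directory stands in for my_project/.

```julia
using Pkg

# Hypothetical helper (not part of BiobakeryUtils.jl): activate the project in
# `dir` and return the path of the Project.toml that julia will use.
function activate_project(dir::AbstractString)
    Pkg.activate(dir)
    return Base.active_project()
end

# A temporary directory stands in for my_project/ here.
project_file = activate_project(mktempdir())
println("Active project file: ", project_file)

# With the project active, the package could then be added non-interactively:
# Pkg.add("BiobakeryUtils")
```

Printing the active project file is a quick way to confirm that julia picked up the environment you intended before installing anything into it.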
Using bioBakery command line tools
Some functions provided by this package (e.g. humann_regroup and humann_rename) require the appropriate bioBakery tools to be installed and accessible from the julia shell environment. The easiest way to do this is to use Conda.jl, though other installation methods are possible as well.
Using a previous installation
If you have a previous installation of metaphlan and/or humann, you can tell julia to use them by modifying the $PATH environment variable.
Environment variables in julia are stored in a dictionary-like object called ENV. For example, the $PATH variable in Unix tells the shell where to look for executable programs, and is available in julia using ENV["PATH"]:
julia> ENV["PATH"]
"/home/runner/.local/bin:/opt/pipx_bin:/home/runner/.cargo/bin:/home/runner/.config/composer/vendor/bin:/usr/local/.ghcup/bin:/home/runner/.dotnet/tools:/snap/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin"
If you launch julia from the shell, this variable is automatically populated with the same $PATH, so if you can access humann or metaphlan from your shell and then launch julia, you should be all set (e.g. if you've installed them with miniconda and you run conda activate envname, then launch julia from the same shell, they should already be available).
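To confirm from inside julia that the tools are visible, something like the following sketch may help. It relies on Sys.which from Base, which returns the full path of an executable found on the search path, or nothing otherwise; the helper name tool_locations is hypothetical.

```julia
# Hypothetical helper: report where (or whether) julia can find each tool.
function tool_locations(tools = ("humann", "metaphlan"))
    # Sys.which returns the full path of an executable on PATH, or `nothing`.
    return Dict(tool => Sys.which(tool) for tool in tools)
end

for (tool, location) in tool_locations()
    if location === nothing
        println(tool, ": not found in ENV[\"PATH\"]")
    else
        println(tool, ": ", location)
    end
end
```

If either tool is reported as not found, the steps that follow for editing ENV["PATH"] apply.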
If not, you need to identify where the humann or metaphlan executables are located, then add that location to ENV["PATH"] (delimited by :). For example, if the humann executable is found in /home/kevin/.local/bin, you would run:
julia> ENV["PATH"] = ENV["PATH"] * ":" * "/home/kevin/.local/bin"
"/home/runner/.local/bin:/opt/pipx_bin:/home/runner/.cargo/bin:/home/runner/.config/composer/vendor/bin:/usr/local/.ghcup/bin:/home/runner/.dotnet/tools:/snap/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/home/kevin/.local/bin"
If you don't know where your installation is located, you can use the which command from the terminal:
$ which humann
/home/kevin/.local/bin/humann
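Appending the same directory repeatedly makes ENV["PATH"] grow with duplicates. A minimal sketch of an append that skips directories already present is shown below; the helper add_to_path! is hypothetical, and the ':' delimiter assumes a Unix-like system (Windows uses ';').

```julia
# Hypothetical helper: append `dir` to the PATH entry of `env` unless it is
# already present. `env` defaults to the real ENV, but any dictionary works.
function add_to_path!(dir::AbstractString, env = ENV; delim = ':')
    entries = split(get(env, "PATH", ""), delim; keepempty = false)
    if !(dir in entries)
        env["PATH"] = join([entries; dir], delim)
    end
    return env["PATH"]
end

# Demonstrated on a plain Dict so the real ENV is left untouched.
demo = Dict("PATH" => "/usr/local/bin:/usr/bin")
add_to_path!("/home/kevin/.local/bin", demo)
add_to_path!("/home/kevin/.local/bin", demo)  # second call changes nothing
println(demo["PATH"])
```

Calling add_to_path!("/home/kevin/.local/bin") with no second argument would modify the real ENV for the current julia session only.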
Using Conda.jl
If you don't have a previous installation, you can use Conda.jl to install the necessary tools.
This can be done automatically for you using BiobakeryUtils.install_deps().
julia> BiobakeryUtils.install_deps()
[ Info: Running conda create -y -p /home/kevin/.julia/conda/3/envs/BiobakeryUtils in root environment
Collecting package metadata (current_repodata.json): done
Solving environment: done
## Package Plan ##
environment location: /home/kevin/.julia/conda/3/envs/BiobakeryUtils
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
# ... etc
BiobakeryUtils.install_deps - Method
install_deps([env]; [force=false])
Uses Conda.jl to install HUMAnN and MetaPhlAn. In order to use the commandline tools, you must have the conda environment bin directory in ENV["PATH"]. See "Using Conda" for more information.
Or you can do it manually. First install Conda.jl in your environment using the Pkg REPL (accessible by typing ] in the julia REPL - press <backspace> to get back to the regular REPL).
$ cd my_project/
$ julia --project
_
_ _ _(_)_ | Documentation: https://docs.julialang.org
(_) | (_) (_) |
_ _ _| |_ __ _ | Type "?" for help, "]?" for Pkg help.
| | | | | | |/ _` | |
| | |_| | | | (_| | | Version 1.6.1 (2021-04-23)
_/ |\__'_|_|_|\__'_| | Official https://julialang.org/ release
|__/ |
julia> # press ']'
(my_project) pkg> add Conda
Updating registry at `~/.julia/registries/General`
Resolving package versions...
Updating `~/.julia/dev/BiobakeryUtils/my_project/Project.toml`
[8f4d0f93] + Conda v1.5.2
Updating `~/.julia/dev/BiobakeryUtils/my_project/Manifest.toml`
[8f4d0f93] + Conda v1.5.2
[682c06a0] + JSON v0.21.2
[69de0a69] + Parsers v2.0.4
[81def892] + VersionParsing v1.2.0
[ade2ca70] + Dates
[a63ad114] + Mmap
[de0858da] + Printf
[4ec0a83e] + Unicode
julia> using Conda
Next, you'll need to add "channels" to a new Conda environment. The order here is important. Assuming you want your environment to be called biobakery:
julia> Conda.add_channel("bioconda", :biobakery)
┌ Info: Running conda config --add channels bioconda --file /home/kevin/.julia/conda/3/envs/biobakery/condarc-julia.yml --force
└ in biobakery environment
# ...
julia> Conda.add_channel("conda-forge", :biobakery)
┌ Info: Running conda config --add channels conda-forge --file /home/kevin/.julia/conda/3/envs/biobakery/condarc-julia.yml
└ --force in biobakery environment
# ...
julia> Conda.add("humann", :biobakery)
[ Info: Running conda install -y -c bioconda humann in biobakery environment
Collecting package metadata (current_repodata.json): done
Solving environment: done
# ...
julia> Conda.add("metaphlan", :biobakery; channel="bioconda")
[ Info: Running conda install -y -c bioconda metaphlan in biobakery environment
Collecting package metadata (current_repodata.json): done
Solving environment: done
# ...
By default, Conda.jl puts the executables for an environment into ~/.julia/conda/3/envs/<env name>/bin, a path you can get with Conda.bin_dir(), so in this case, you'd next want to run
julia> ENV["PATH"] = ENV["PATH"] * ":" * Conda.bin_dir(:biobakery)
ERROR: UndefVarError: Conda not defined
Note 1: if you need to manually edit ENV["PATH"] like this, you'll need to do so each time you load julia. To get around this, you can modify your shell's $PATH variable, or use direnv to set it on a per-directory basis.
Note 2: If you get ERROR: UndefVarError: Conda not defined while following these docs, try installing and loading Conda.jl.
Using MetaPhlAn and HUMAnN
You should now be ready to start using MetaPhlAn and HUMAnN from julia! Take a look at the MetaPhlAn tutorial or HUMAnN tutorial for next steps.
Troubleshooting
So, you followed all the steps above, and you're still having problems? There are a couple of common things that can go wrong.
Cannot find {program}
If you get an error that looks like this:
┌ Error: Can not find metaphlan! If you think it should be
│ installed, try running:
│
│ ENV["PATH"] = ENV["PATH"] * ":" * Conda.bin_dir(env)
│
│
│ Where env is something like :BiobakeryUtils.
└ @ BiobakeryUtils /home/kevin/.julia/dev/BiobakeryUtils/src/utils.jl:55
ERROR: failed process: Process(`which metaphlan`, ProcessExited(1)) [1]
Then the relevant program is not being found in your ENV["PATH"].
1. Check that the path to the bioBakery executables is in ENV["PATH"]
At the julia REPL, just enter ENV["PATH"] and press Enter.
julia> ENV["PATH"]
"/home/kevin/.local/bin:/home/kevin/.cargo/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:
/bin"
Somewhere in the string should be your conda installation.
If not, follow the advice in the error message; that is, run ENV["PATH"] = ENV["PATH"] * ":" * Conda.bin_dir(env), where env in the call to Conda.bin_dir() is the name of your environment (if you installed things with BiobakeryUtils.install_deps(), the default is :BiobakeryUtils).
Alternatively, if you didn't use Conda.jl or BiobakeryUtils.install_deps(), and have a different conda installation you're using, replace Conda.bin_dir(env) with the path, e.g. ENV["PATH"] = ENV["PATH"] * ":" * "/Users/yourname/miniconda3/envs/biobakery/bin".
2. Check if a different installation is interfering
To find a program to run, the shell looks through each directory listed in your PATH variable, in order, for a matching program. For example, if your ENV["PATH"] looks like:
"/Users/yourname/miniconda3/envs/other_project/bin:/usr/bin/:/Users/yourname/.julia/conda/3/envs/BiobakeryUtils/bin"
The shell will look in this order:
1. /Users/yourname/miniconda3/envs/other_project/bin
2. /usr/bin/
3. /Users/yourname/.julia/conda/3/envs/BiobakeryUtils/bin
If, for example, you have metaphlan installed in (1) other_project/bin, but intend to use (3) BiobakeryUtils/bin, you might run into issues. In this case, you can add (3) to the front of your path so that it's reached first, e.g.:
ENV["PATH"] = "/Users/yourname/.julia/conda/3/envs/BiobakeryUtils/bin" * ":" * ENV["PATH"]
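To see whether more than one copy of a tool is on the search path, and which one wins, a sketch along these lines may be useful. The helper find_all_on_path is hypothetical; it assumes a Unix-style ':' delimiter and only checks that a file with the given name exists in each directory, not that it is executable.

```julia
# Hypothetical helper: list every copy of `tool` on the search path, in the
# order the shell would find them. Assumes a Unix-style ':' delimiter.
function find_all_on_path(tool::AbstractString,
                          path::AbstractString = get(ENV, "PATH", ""))
    hits = String[]
    for dir in split(path, ':'; keepempty = false)
        candidate = joinpath(dir, tool)
        # Only existence is checked here, not whether the file is executable.
        isfile(candidate) && push!(hits, candidate)
    end
    return hits
end

hits = find_all_on_path("metaphlan")
if length(hits) > 1
    println("Several copies found; the first one listed is the one used:")
end
foreach(println, hits)
```

If the first entry is not the installation you intended, moving the intended directory to the front of ENV["PATH"] resolves the conflict.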
Error involving bowtie2
If you run metaphlan() or humann(), things start out looking ok, then you get a long error message that includes something like
subprocess.CalledProcessError: Command '['bowtie2-build', '--usage']' returned non-zero exit status 250.
ERROR: failed process: Process(`metaphlan samplename.fasta samplename_profile.tsv --input_type fasta`, ProcessExited(1)) [1]
buried in the stack trace, you probably have a problem with tbb.
The default bowtie2 installation installs a version of tbb that doesn't work properly, so you need to pin it to an earlier version to make it work (see here).
If you installed using BiobakeryUtils.install_deps(), this should have been done already, and you should not be having this problem.
If you used a different installation method, you can try
$ conda install tbb=2020.2
or
julia> Conda.add("tbb=2020.2", env; channel="conda-forge")
(replace env with the name of your conda environment)
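After pinning tbb, one way to check whether the fix worked without rerunning a full analysis is to call the same command that failed in the stack trace. This is a minimal sketch, assuming bowtie2-build is expected on ENV["PATH"]; the helper name bowtie2_build_ok is hypothetical.

```julia
# Hypothetical helper: returns true only if bowtie2-build is on the PATH and
# `bowtie2-build --usage` exits with status 0.
function bowtie2_build_ok()
    exe = Sys.which("bowtie2-build")
    exe === nothing && return false
    # Discard the command's output; only the exit status matters here.
    cmd = pipeline(`$exe --usage`; stdout = devnull, stderr = devnull)
    return success(cmd)
end

println(bowtie2_build_ok() ? "bowtie2-build runs" : "bowtie2-build is missing or failing")
```

A false result does not distinguish between a missing executable and a failing one, so checking ENV["PATH"] first is still worthwhile.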
Still having issues?
If your issue isn't addressed here, or you're still having problems, please open an issue or start a discussion over on Microbiome.jl.