Getting Started

This is a package for the julia programming language, designed for working with the bioBakery family of tools for metagenomic analysis of microbial communities. Currently, we support MetaPhlAn and HUMAnN.

Read on to learn how to install the package and begin using it to uncover insights about your microbial community data! If you run into problems, you can open an issue on this repository, or start a discussion over on Microbiome.jl.

Installing julia

If this is your first time using julia, you'll need to install it by going to the julia downloads page and following the instructions for your platform. BiobakeryUtils.jl should work on any julia version >= 1.6.0.

Alternatively, you can use jill.py, an easy-to-use python utility for installing julia.

Launching julia from the terminal

If you download the "app" version of julia from the downloads page above, you may also want to add julia to your shell's $PATH so that you can launch it from your terminal. Windows users can look here for instructions. Mac users, see here for instructions.

Making a project

In julia, it's typically a good idea to use "projects" to organize your package dependencies (this is similar to "environments" that conda uses).

To do this, make a directory and "activate" it in the julia Pkg REPL.

$ mkdir my_project

$ cd my_project

$ julia

               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.6.1 (2021-04-23)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> # press ] to enter the Pkg REPL

(@v1.6) pkg> activate .
  Activating new environment at `~/my_project/Project.toml`

(my_project) pkg> # press backspace to get back to julia REPL

julia>

So far, this is still just an empty directory, but you can also use the Pkg REPL to install packages, like BiobakeryUtils.jl.

(my_project) pkg> add BiobakeryUtils

Once this process completes, the directory will now contain a Project.toml file that contains BiobakeryUtils.jl as a dependency, and a Manifest.toml file that contains all of the exact info about dependencies installed for this environment.
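The same steps can also be scripted with julia's built-in Pkg API instead of the Pkg REPL. The following is a minimal sketch rather than a required part of the walkthrough: the scratch directory created by mktempdir() is only a stand-in for my_project/, and the install step is left commented out because it needs network access.

```julia
using Pkg

# Create a scratch directory (a stand-in for my_project/) and activate it,
# which is what `activate .` does in the Pkg REPL.
project_dir = mktempdir()
Pkg.activate(project_dir)

# The active project now points at a Project.toml inside that directory.
println(Base.active_project())

# Equivalent of `add BiobakeryUtils` in the Pkg REPL (requires network access):
# Pkg.add("BiobakeryUtils")
```

Scripting the setup this way can be handy in batch jobs, where there is no interactive REPL to press ] in.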

In the future, you can launch julia with the environment already activated using julia --project if your working directory is my_project/, or julia --project=<path to project> if you're in a different working directory (eg. julia --project=~/my_project if my_project/ is in the home directory).

Using bioBakery command line tools

Some functions provided by this package (eg humann_regroup and humann_rename) require the appropriate bioBakery tools to be installed and accessible from the julia shell environment. The easiest way to do this is to use Conda.jl, though other installation methods are possible as well.

Using a previous installation

If you have a previous installation of metaphlan and/or humann, you can tell julia to use them by modifying the $PATH environment variable.

Environment variables in julia are stored in a Dict-like object called ENV. For example, the $PATH variable in Unix tells the shell where to look for executable programs, and is available in julia using ENV["PATH"]:

julia> ENV["PATH"]
"/home/runner/.local/bin:/opt/pipx_bin:/home/runner/.cargo/bin:/home/runner/.config/composer/vendor/bin:/usr/local/.ghcup/bin:/home/runner/.dotnet/tools:/snap/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin"

If you launch julia from the shell, ENV["PATH"] is automatically populated from the shell's $PATH. So if you can run humann or metaphlan from your shell and then launch julia from that same shell, you should be all set. For example, if you've installed them with miniconda, run conda activate envname, then launch julia from the same shell, and they should already be available.

If not, you need to identify where the humann or metaphlan executables are located, then add that location to ENV["PATH"] (delimited by :). For example, if the humann executable is found at /home/kevin/.local/bin, you would run:

julia> ENV["PATH"] = ENV["PATH"] * ":" * "/home/kevin/.local/bin"
"/home/runner/.local/bin:/opt/pipx_bin:/home/runner/.cargo/bin:/home/runner/.config/composer/vendor/bin:/usr/local/.ghcup/bin:/home/runner/.dotnet/tools:/snap/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/home/kevin/.local/bin"
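If you run that line more than once in a session, the same directory gets appended repeatedly. One way to avoid this is a small wrapper like the sketch below. Note that add_to_path! is a hypothetical helper written for illustration, not a function provided by BiobakeryUtils.jl, and the env keyword exists only so that the helper can be tried on a plain Dict.

```julia
# Hypothetical helper (not part of BiobakeryUtils.jl): append `dir` to the
# PATH entry of `env`, but only if it is not already listed.
function add_to_path!(dir::AbstractString; env=ENV)
    # PATH entries are separated by ':' on Unix and ';' on Windows.
    sep = Sys.iswindows() ? ';' : ':'
    entries = String.(split(get(env, "PATH", ""), sep; keepempty=false))
    if dir ∉ entries
        push!(entries, dir)
        env["PATH"] = join(entries, sep)
    end
    return env["PATH"]
end

# Same effect as the manual concatenation above, but safe to call repeatedly.
add_to_path!("/home/kevin/.local/bin")
```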

If you don't know where your installation is located, from the terminal, you can use the which command:

$ which humann
/home/kevin/.local/bin/humann
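You can run the same kind of check from inside julia with Sys.which, which searches ENV["PATH"] the way the shell does and returns nothing when no match is found. The loop below is a small sketch; what it prints depends entirely on your own system.

```julia
# Sys.which returns the full path of an executable found on ENV["PATH"],
# or `nothing` if there is no match.
for tool in ("humann", "metaphlan")
    path = Sys.which(tool)
    if path === nothing
        println(tool, ": not found on PATH")
    else
        println(tool, ": ", path)
    end
end
```

Because this looks at julia's own ENV["PATH"], it reflects any changes you made during the session, which the which command in a separate terminal would not.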

Using Conda.jl

If you don't have a previous installation, you can use Conda.jl to install the necessary tools.

This can be done automatically for you using BiobakeryUtils.install_deps().

julia> BiobakeryUtils.install_deps()
[ Info: Running conda create -y -p /home/kevin/.julia/conda/3/envs/BiobakeryUtils in root environment
Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /home/kevin/.julia/conda/3/envs/BiobakeryUtils



Preparing transaction: done
Verifying transaction: done
Executing transaction: done
# ... etc
BiobakeryUtils.install_deps - Method
install_deps([env]; [force=false])

Uses Conda.jl to install HUMAnN and MetaPhlAn. In order to use the commandline tools, you must have the conda environment bin directory in ENV["PATH"]. See "Using Conda" for more information.

Or you can do it manually. First install Conda.jl in your environment using the Pkg REPL (accessible by typing ] in the julia REPL - press <backspace> to get back to the regular REPL).

$ cd my_project/

$ julia --project
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.6.1 (2021-04-23)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> # press ']'

(my_project) pkg> add Conda
    Updating registry at `~/.julia/registries/General`
   Resolving package versions...
    Updating `~/.julia/dev/BiobakeryUtils/my_project/Project.toml`
  [8f4d0f93] + Conda v1.5.2
    Updating `~/.julia/dev/BiobakeryUtils/my_project/Manifest.toml`
  [8f4d0f93] + Conda v1.5.2
  [682c06a0] + JSON v0.21.2
  [69de0a69] + Parsers v2.0.4
  [81def892] + VersionParsing v1.2.0
  [ade2ca70] + Dates
  [a63ad114] + Mmap
  [de0858da] + Printf
  [4ec0a83e] + Unicode

julia> using Conda

Next, you'll need to add "channels" to a new Conda environment. The order here is important. Assuming you want your environment to be called biobakery:

julia> Conda.add_channel("bioconda", :biobakery)
┌ Info: Running conda config --add channels bioconda --file /home/kevin/.julia/conda/3/envs/biobakery/condarc-julia.yml --force
└ in biobakery environment
# ...

julia> Conda.add_channel("conda-forge", :biobakery)
┌ Info: Running conda config --add channels conda-forge --file /home/kevin/.julia/conda/3/envs/biobakery/condarc-julia.yml
└ --force in biobakery environment
# ...

julia> Conda.add("humann", :biobakery; channel="bioconda")
[ Info: Running conda install -y -c bioconda humann in biobakery environment
Collecting package metadata (current_repodata.json): done
Solving environment: done
# ...

julia> Conda.add("metaphlan", :biobakery; channel="bioconda")
[ Info: Running conda install -y -c bioconda metaphlan in biobakery environment
Collecting package metadata (current_repodata.json): done
Solving environment: done
# ...

By default, Conda.jl puts environments into ~/.julia/conda/3/envs/<env name>, with executables in the bin subdirectory, which you can get with Conda.bin_dir(env). For the biobakery environment created above, you'd next want to run

julia> ENV["PATH"] = ENV["PATH"] * ":" * Conda.bin_dir(:biobakery)

(If you used BiobakeryUtils.install_deps() instead, the default environment name is :BiobakeryUtils.)

Note 1: If you need to manually edit ENV["PATH"] like this, you'll need to do so each time you start julia. To get around this, you can modify your shell's $PATH variable, or use direnv to set it on a per-directory basis.

Note 2: If, while following these docs, you get ERROR: UndefVarError: Conda not defined, try installing and loading Conda.jl (add Conda in the Pkg REPL, then using Conda).

Using MetaPhlAn and HUMAnN

You should now be ready to start using MetaPhlAn and HUMAnN from julia! Take a look at the MetaPhlAn tutorial or HUMAnN tutorial for next steps.

Troubleshooting

So, you followed all the steps above, and you're still having problems? There are a couple of common things that can go wrong.

Cannot find {program}

If you get an error that looks like this:

┌ Error: Can not find metaphlan! If you think it should be
│ installed, try running:
│
│ ENV["PATH"] = ENV["PATH"] * ":" * Conda.bin_dir(env)
│
│
│ Where env is something like :BiobakeryUtils.
└ @ BiobakeryUtils /home/kevin/.julia/dev/BiobakeryUtils/src/utils.jl:55
ERROR: failed process: Process(`which metaphlan`, ProcessExited(1)) [1]

Then the relevant program is not being found in your ENV["PATH"].

1. Check that the path to the bioBakery executables is in ENV["PATH"]

At the julia REPL, just enter ENV["PATH"] and press Enter.

julia> ENV["PATH"]
"/home/kevin/.local/bin:/home/kevin/.cargo/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"

Somewhere in the string should be the bin directory of your conda environment.

If not, follow the advice in the error message; that is, run ENV["PATH"] = ENV["PATH"] * ":" * Conda.bin_dir(env), where env in the call to Conda.bin_dir() is the name of your environment (if you installed things with BiobakeryUtils.install_deps(), the default is :BiobakeryUtils).

Alternatively, if you didn't use Conda.jl or BiobakeryUtils.install_deps(), and have a different conda installation you're using, replace Conda.bin_dir(env) with the path, eg ENV["PATH"] = ENV["PATH"] * ":" * "/Users/yourname/miniconda3/envs/biobakery/bin".

2. Check if a different installation is interfering

To find a program to run, the shell looks through your PATH variable at each directory for a matching program. For example, if your ENV["PATH"] looks like:

"/Users/yourname/miniconda3/envs/other_project/bin:/usr/bin/:/Users/yourname/.julia/conda/3/envs/BiobakeryUtils/bin"

The shell will look in this order:

  1. /Users/yourname/miniconda3/envs/other_project/bin
  2. /usr/bin/
  3. /Users/yourname/.julia/conda/3/envs/BiobakeryUtils/bin

If, for example, you have metaphlan installed in (1) other_project/bin, but intend to use (3) BiobakeryUtils/bin, you might run into issues. In this case, you can add (3) to the front of your path so that it's reached first, eg:

ENV["PATH"] = "/Users/yourname/.julia/conda/3/envs/BiobakeryUtils/bin" * ":" * ENV["PATH"]
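To see whether more than one installation is on the path, you can walk the directories in the same order the shell does and report every match. The sketch below does this; find_all_on_path is a hypothetical helper written for illustration, not part of BiobakeryUtils.jl. The first element of the result is the copy that would actually be run.

```julia
# Hypothetical helper (not part of BiobakeryUtils.jl): return the full path of
# every file named `program` found in the PATH directories, in search order.
function find_all_on_path(program::AbstractString; path=get(ENV, "PATH", ""))
    # PATH entries are separated by ':' on Unix and ';' on Windows.
    sep = Sys.iswindows() ? ';' : ':'
    dirs = split(path, sep; keepempty=false)
    return [joinpath(d, program) for d in dirs if isfile(joinpath(d, program))]
end

# More than one entry here means several installations are visible.
for hit in find_all_on_path("metaphlan")
    println(hit)
end
```

This only checks that a file with that name exists in each directory. It does not check that the file is executable, so treat the output as a hint rather than a guarantee.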

Error involving bowtie2

If you run metaphlan() or humann(), things start out looking ok, then you get a long error message that includes something like

subprocess.CalledProcessError: Command '['bowtie2-build', '--usage']' returned non-zero exit status 250.
ERROR: failed process: Process(`metaphlan samplename.fasta samplename_profile.tsv --input_type fasta`, ProcessExited(1)) [1]

buried in the stack trace, you probably have a problem with tbb.

The default bowtie2 installation installs a version of tbb that doesn't work properly, so you need to pin it to an earlier version to make it work (see here).

If you installed using BiobakeryUtils.install_deps(), this should have been done already, so you should not normally be having this problem.

If you used a different installation method, you can try

$ conda install tbb=2020.2

or

julia> Conda.add("tbb=2020.2", env; channel="conda-forge")

(replace env with the name of your conda environment)

Still having issues?

If your issue isn't addressed here, or you're still having problems, please open an issue or start a discussion over on Microbiome.jl.