Unmessing Python packages with virtual environments

When I teach programming to beginners, I often recommend Anaconda or Canopy as the default environment for coding in Python. Such programs are convenient for someone who is just starting to learn how to code, because they simplify the process of installing and managing Python packages. They also alleviate the cognitive overload of trying to understand what’s happening “under the hood”, which is desirable for someone who has just learned what a variable or a for loop is. However, once users start to manage their own packages and an understanding of things “under the hood” becomes relevant, things can easily get tricky. This post explains a simple and consistent way to manage Python packages.

50 shades of headaches

Python package management can become a mess as soon as some advanced functionality is needed, for example if you are a grad student who has just joined a lab that speaks Python instead of the Matlab/R/something else you’ve been used to. Here are a few examples of situations you might encounter:

A package is not available in an Anaconda/Canopy/pip/OS-specific package repository
A version of a package is needed that is different from the version currently installed on a computer
Two different versions of the same package are needed (e.g., version 1.1 and version 1.2) for two different projects
Two versions of the same package are needed, but one for Python 2.X and one for Python 3.X
You do not have sudo privileges on the machine you are using
You spent a whole night (or two) trying to understand why some import does not work even though the package is installed and it really should be working. You are reconsidering your choice to go to grad school and wondering what the meaning of life is.

One way to deal with these kinds of issues is to use Python virtual environments. A virtual environment (VE) is a self-contained directory, with its own Python interpreter and package folder, that isolates a group of Python packages and keeps them separate from all other Python packages. When a group of packages is needed, the virtual environment containing those packages can be activated, and the packages can be imported as usual with import package. Activation happens in a terminal, so no changes to the Python code are needed. The nice thing about this system is that there can be many VEs, and one VE can be active in one shell while another is active in a different shell. On my computer, I have one general VE with a Python 2.X installation that I use frequently, a separate one for a Python 3.X installation, and sometimes other VEs for individual research projects.

Packages installed within a VE are also kept separate from the system-wide, global Python packages that are needed to run the operating system. Messing with the global Python packages that the OS relies on is exactly what it sounds like: not a great idea. This kind of mess can happen when sudo pip install is used to install Python packages (which, unfortunately, often shows up as a “solution” when googling for problems related to installing Python packages).
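Since activation changes nothing in the code itself, it can be handy to ask the interpreter whether it is running inside a VE. The snippet below is a minimal sketch of one common way to check; the function name in_virtual_environment is made up for this post, and the check relies on the attributes that virtualenv and the standard-library venv module are known to set.

```python
import sys


def in_virtual_environment():
    """Return True if the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix inside an environment.
    if hasattr(sys, "real_prefix"):
        return True
    # Newer virtualenv releases and the standard-library venv module
    # make sys.prefix differ from sys.base_prefix instead.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return sys.prefix != base_prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("prefix:", sys.prefix)
    print("inside a virtual environment:", in_virtual_environment())
```

Running this once in a plain terminal and once after activating an environment should show a different interpreter path and prefix in each case.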

Setting up Python Virtual Environments

In the next few paragraphs I will explain how to get VEs up and running. I will assume a *nix system and some basic familiarity with shell commands (e.g., navigation, creating and removing directories). The tool used to create Python VEs is called virtualenv, and another tool called virtualenvwrapper is a set of extensions to virtualenv that provides shortcuts for setting up and switching between VEs more quickly. Together, they make the management of Python packages convenient and fairly simple.

Both of these packages can be installed with pip:

$ pip install --user virtualenv virtualenvwrapper

The --user flag ensures installation to a local user folder, so no sudo privileges are required. virtualenv and virtualenvwrapper should now be installed somewhere in a local folder; in my case they can be found in:

/home/ivana/.local/lib/python2.7/site-packages

but different versions of pip might put them in a different place. I use locate virtualenv.py or locate virtualenvwrapper to double-check where they are. Before we can create a new VE, we need to do two things: (1) specify a directory where the different virtual environments will be stored and (2) tell bash about the new commands we want to use (the commands for maintaining VEs). We can do both by adding a few lines to .bashrc. At this point, it is good to check where virtualenvwrapper.sh is located, as we will need to add that path to our .bashrc. For me, it is in /home/ivana/.local/bin/virtualenvwrapper.sh. Then, we can append the following lines to .bashrc:

# Python VE stuff
export WORKON_HOME=~/.virtualenvs
source /home/ivana/.local/bin/virtualenvwrapper.sh

virtualenvwrapper will create the directory specified in WORKON_HOME if it does not exist yet, and it will also take care of managing the contents of that directory.
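If locate is not available, the same locations can be looked up from Python. The following is a rough sketch, assuming Python 3 on a *nix system where user-level scripts end up in the bin folder under the user base directory; the helper names user_install_locations and find_wrapper_script are made up for illustration.

```python
import os
import shutil
import site


def user_install_locations():
    """Return the directories that `pip install --user` is expected to use."""
    user_base = site.getuserbase()  # typically ~/.local on Linux
    return {
        "site_packages": site.getusersitepackages(),
        # Assumption: on *nix, user-level scripts live in <user base>/bin.
        "scripts": os.path.join(user_base, "bin"),
    }


def find_wrapper_script():
    """Look for virtualenvwrapper.sh on PATH, then in the user scripts folder."""
    found = shutil.which("virtualenvwrapper.sh")
    if found:
        return found
    candidate = os.path.join(site.getuserbase(), "bin", "virtualenvwrapper.sh")
    return candidate if os.path.isfile(candidate) else None


if __name__ == "__main__":
    for label, path in user_install_locations().items():
        print(label, "->", path)
    print("virtualenvwrapper.sh ->", find_wrapper_script())
```

The path printed for virtualenvwrapper.sh, if one is found, is the one to put after source in .bashrc.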

In theory, this should be the end of the setup. You can test if it works by opening a new terminal and creating a new environment:

$ mkvirtualenv first_env

The output should look similar to this one:

ivana@cassiopeia:~$ mkvirtualenv first_env
New python executable in /home/ivana/.virtualenvs/first_env/bin/python
Installing setuptools, pip, wheel...done.
virtualenvwrapper.user_scripts creating /home/ivana/.virtualenvs/first_env/bin/predeactivate
virtualenvwrapper.user_scripts creating /home/ivana/.virtualenvs/first_env/bin/postdeactivate
virtualenvwrapper.user_scripts creating /home/ivana/.virtualenvs/first_env/bin/preactivate
virtualenvwrapper.user_scripts creating /home/ivana/.virtualenvs/first_env/bin/postactivate
virtualenvwrapper.user_scripts creating /home/ivana/.virtualenvs/first_env/bin/get_env_details
(first_env) ivana@cassiopeia:~$ 

The name of the virtual environment now appears in parentheses at the start of the prompt, which means that everything has been set up successfully and the new environment is active. In practice, something might go wrong here. The short troubleshooting section below gives some guidelines (feel free to e-mail me if you get different errors).
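The output above also shows how things are laid out on disk: each environment is simply a folder inside WORKON_HOME with its own bin directory. As a small sketch of that layout (not how virtualenvwrapper itself is implemented), the hypothetical helper list_virtualenvs below treats every subfolder that contains bin/activate as an environment.

```python
import os


def list_virtualenvs(workon_home=None):
    """Return the names of the environments found in a WORKON_HOME-style folder."""
    if workon_home is None:
        # Fall back to the location used in this post if the variable is unset.
        workon_home = os.environ.get("WORKON_HOME", "~/.virtualenvs")
    workon_home = os.path.expanduser(workon_home)
    if not os.path.isdir(workon_home):
        return []
    names = []
    for name in sorted(os.listdir(workon_home)):
        # An environment is recognised by its activation script.
        activate = os.path.join(workon_home, name, "bin", "activate")
        if os.path.isfile(activate):
            names.append(name)
    return names


if __name__ == "__main__":
    for env_name in list_virtualenvs():
        print(env_name)
```

After the mkvirtualenv call above, this should print first_env along with any other environments stored in the same folder.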

Troubleshooting

When I first tried creating an environment, I had an issue with paths:

ivana@cassiopeia:~$ mkvirtualenv first_env
ERROR: virtualenvwrapper could not find virtualenv in your path

The sudo-less way to resolve this is to add /home/ivana/.local/bin to the PATH in .bashrc:

export PATH=$PATH:/home/ivana/.local/bin
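To confirm that the change took effect in a new terminal, the PATH can be inspected from Python. This is a minimal sketch; path_contains is a hypothetical helper, and the directory in the example is the one from my machine, so adjust it to yours.

```python
import os


def path_contains(directory, path_value=None):
    """Return True if `directory` is one of the entries of a PATH-style string."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    # Normalise both sides so that "~" and trailing slashes do not matter.
    target = os.path.normpath(os.path.expanduser(directory))
    entries = [
        os.path.normpath(os.path.expanduser(entry))
        for entry in path_value.split(os.pathsep)
        if entry
    ]
    return target in entries


if __name__ == "__main__":
    print(path_contains("/home/ivana/.local/bin"))
```

If this prints False in a freshly opened terminal, the export line has most likely not been picked up from .bashrc.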

Using Virtual Environments

Within a virtual environment, packages can be installed as usual:

(first_env) ivana@cassiopeia:~$ pip install xkcd
Collecting xkcd
Installing collected packages: xkcd
Successfully installed xkcd-2.4.2

To get a list of installed packages use pip freeze:

(first_env) ivana@cassiopeia:~$ pip freeze
appdirs==1.4.3
packaging==16.8
pyparsing==2.2.0
six==1.10.0
xkcd==2.4.2
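A similar list can be produced from inside Python, which is a quick way to see what the active interpreter can actually import. The sketch below assumes Python 3.8 or newer (for importlib.metadata) and only approximates pip freeze: it does not apply pip's filtering or its handling of special installs, and installed_packages is a name made up for this example.

```python
from importlib import metadata


def installed_packages():
    """Return sorted 'name==version' strings for the running interpreter."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip entries whose metadata is missing a name
            lines.add("{}=={}".format(name, dist.version))
    return sorted(lines, key=str.lower)


if __name__ == "__main__":
    for line in installed_packages():
        print(line)
```

Run with the environment active, the result should include the packages shown above, possibly together with pip, setuptools and wheel.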

deactivate deactivates the current environment, and the environment can be activated again with workon first_env. Use rmvirtualenv first_env to remove a virtual environment (this can be done only if that environment is not currently active). If there is an environment that is used frequently, it can be useful to activate it from .bashrc, so that it is active by default every time a new terminal is opened. To do so, just add workon first_env (or whatever the name of that environment is) to .bashrc after source /home/ivana/.local/bin/virtualenvwrapper.sh.

This was a lightweight introduction to the world of virtual environments, and there are plenty of other features that I have not mentioned here. This nice tutorial provides more information on the commands that exist and on related useful tools.