The Ultimate Python Machine Learning Environment

If, like me, you are beginning your next data analytics project, you may be wondering what tools to use, where you can cut corners, and how to up your productivity and save time. I recently learnt Python properly and fell in love with the language, but encountered many issues getting Python running smoothly on my Mac. Thankfully, at the time I came across the JetBrains PyCharm IDE, which simplified the installation and let me get up and running with Python.

I soon ran into PyCharm’s limitations, however. Its Jupyter Notebook editor was riddled with bugs, making it prohibitive for any extensive data analysis, not to mention the learning curve required to get around its UI and set it all up fully. In the end, I returned to a more fundamental Python setup, using a GitHub project, Jupyter Lab, as my primary editor.

Avoiding a Python IDE such as PyCharm means going back to the command line to install and manage Python projects and packages. Don’t worry: once this is set up, it provides a fast, fully flexible Python infrastructure ready for any future project. Let me talk you through how to get there. (Please note this tutorial is for macOS; however, most of it is directly transferable to Linux.)

Key Components

  • Homebrew – for Mac package management
  • Python 3
  • pip – the Python package manager
  • Virtualenv
  • Virtualenvwrapper
  • iTerm
  • Jupyter Lab

Skip any section whose tool you already have installed.

Python Installation

First, let’s get Python installed. After experimenting with Anaconda in the past, I ran into many issues with Python version management and ended up removing the application from my system entirely. Instead, like all other applications on my Mac (watch this space for a future post), I opted to use Homebrew to install and control my Python installation cleanly.

For those who don’t have Homebrew installed, fire up the command line and type:

 /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

Next, we can install Python (3.6 at the time of writing) with the command:

 brew install python

This installs Python to /usr/local/bin/python.
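To confirm which interpreter the python command actually starts, a short check from inside Python can help. This is only a sketch: the function name describe_interpreter is made up for illustration, and the /usr/local prefix is an assumption based on the Homebrew location mentioned above.

```python
import sys


def describe_interpreter():
    """Return the interpreter path and its version as 'major.minor.micro'."""
    version = ".".join(str(part) for part in sys.version_info[:3])
    return sys.executable, version


if __name__ == "__main__":
    path, version = describe_interpreter()
    print("Interpreter: {}".format(path))
    print("Version:     {}".format(version))
    # Assumption: a Homebrew install as described above lives under /usr/local
    if not path.startswith("/usr/local"):
        print("Note: this interpreter is not under /usr/local")
```

Save it as a file of your choosing and run it with python; if the path or version is not what you expect, your shell is probably picking up a different Python first.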

Python Package installation

Install the Python package manager pip, which manages the installation of a project’s Python dependencies. Homebrew’s Python normally comes with pip already, so this step is only needed if the pip command is missing:

curl -O https://bootstrap.pypa.io/get-pip.py
python get-pip.py

Install Virtualenv

Next, let’s add Virtualenv. Virtualenv allows independent Python virtual environments to be built and run, each with its own Python version and packages:

pip install virtualenv

Virtualenvwrapper adds some useful extra functionality on top of Virtualenv. Install it via:

pip install virtualenvwrapper

Getting this package fully up and running requires some slightly more technical tweaking of the .bashrc file and, if you are using zsh, of .zshrc as well.

Setup bashrc for Virtualenv:

If, like me, you do not have a .bashrc file on your system, go to your home directory (type: cd) and type:

echo 'export PATH=/usr/local/lib:$PATH' >> ~/.bashrc

You can check that the file was created with:

 tail ~/.bashrc

Then, using nano (nano ~/.bashrc) on the command line or a text editor like Atom, edit the .bashrc file by pasting:

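export WORKON_HOME=$HOME/.virtualenvs
export PROJECT_HOME=$HOME/
export VIRTUALENVWRAPPER_PYTHON=/usr/local/bin/python
export VIRTUALENVWRAPPER_VIRTUALENV=/usr/local/bin/virtualenv
# --no-site-packages has been the default since virtualenv 1.7 and was removed in virtualenv 20; omit the next line on newer versions
export VIRTUALENVWRAPPER_VIRTUALENV_ARGS='--no-site-packages'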

Double check that the Python path is correct (line 3); this is the path for a Homebrew Python setup. After saving the file, reload it in the current shell:

source ~/.bashrc

Finally, virtualenvwrapper needs to be loaded via the command:

source /usr/local/bin/virtualenvwrapper.sh
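If something later misbehaves, it can be useful to check that the exported settings are really visible to programs started from the shell. The following is a minimal sketch, assuming the variable names from the .bashrc snippet above; the helper names missing_settings and broken_paths are hypothetical.

```python
import os

# Variable names taken from the .bashrc snippet in this post
REQUIRED_VARS = (
    "WORKON_HOME",
    "PROJECT_HOME",
    "VIRTUALENVWRAPPER_PYTHON",
    "VIRTUALENVWRAPPER_VIRTUALENV",
)


def missing_settings(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


def broken_paths(env=None):
    """Return path variables whose value points at a file that does not exist."""
    env = os.environ if env is None else env
    path_vars = ("VIRTUALENVWRAPPER_PYTHON", "VIRTUALENVWRAPPER_VIRTUALENV")
    return [n for n in path_vars if env.get(n) and not os.path.exists(env[n])]


if __name__ == "__main__":
    print("Missing:", missing_settings() or "none")
    print("Broken paths:", broken_paths() or "none")
```

Run it from a fresh terminal window: a variable reported as missing usually means .bashrc was not sourced, while a broken path suggests Python or virtualenv lives somewhere other than /usr/local/bin.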

You should be able to test that this is working by going to your home folder and typing workon, which lists all installed Python virtual environments (as there are none yet, this should return nothing). If you see no errors, virtualenvwrapper has been set up correctly; otherwise refer to the troubleshooting tips. Feeling lazy? Here is a shell script you could run instead:

#!/bin/bash
read -r -p "Install Python Setup? [y/N] " response
if [[ "$response" =~ ^([yY][eE][sS]|[yY])$ ]]; then
  cd ~
  pip install virtualenv
  pip install virtualenvwrapper
  # The quoted 'EOF' keeps $PATH and $HOME literal, so they expand when .bashrc is sourced
  cat >> ~/.bashrc <<'EOF'
export PATH=/usr/local/lib:$PATH
export WORKON_HOME=$HOME/.virtualenvs
export PROJECT_HOME=$HOME/
export VIRTUALENVWRAPPER_PYTHON=/usr/local/bin/python
export VIRTUALENVWRAPPER_VIRTUALENV=/usr/local/bin/virtualenv
# Omit the next line on virtualenv 20+, where --no-site-packages was removed
export VIRTUALENVWRAPPER_VIRTUALENV_ARGS='--no-site-packages'
EOF
  source ~/.bashrc
  source /usr/local/bin/virtualenvwrapper.sh
  echo 'source /usr/local/bin/virtualenvwrapper.sh' >> ~/.zshrc
fi
echo "Complete"

Just save this as a .sh file and make sure it is executable (chmod 755 pythonInstall.sh).
 

Keep Sessions Consistent

An additional step for zsh is required to allow iTerm sessions to be kept consistent. Simply run the command:

echo 'source /usr/local/bin/virtualenvwrapper.sh' >>~/.zshrc

When a terminal window is closed and reopened, virtualenvwrapper should now load automatically.

Creating Virtual Python Environments:

Now that we have got all that out of the way, it is time for the more interesting bit. Let’s create a new Python virtual environment; swap env_name for the name of your desired virtual environment.

# make py3
mkvirtualenv env_name --python=python3

# make py2
mkvirtualenv env_name --python=python2

This creates a brand new environment with a few basic packages and activates it (my virtual environment is called ML-Lab). To check that this works correctly, type workon and you should see your new environment listed.

 

[Image: iTerm console showing the workon output for the Python setup]
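The same check can be made from inside Python, which is handy later on when a notebook seems to be using the wrong interpreter. This is a small sketch rather than an official virtualenvwrapper feature; the function name in_virtualenv is made up for illustration.

```python
import sys


def in_virtualenv():
    """True when the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; newer ones (and venv)
    # make sys.prefix differ from sys.base_prefix instead.
    if hasattr(sys, "real_prefix"):
        return True
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


if __name__ == "__main__":
    print("Environment root:", sys.prefix)
    print("Inside a virtual environment:", in_virtualenv())
```

After workon env_name, the environment root should point into your .virtualenvs folder.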

 

Some other useful virtualenvwrapper commands for your notebook:

workon env_name  # loads the environment
python -V  # checks the Python version; if the project was set up with Python 3, this is an easy way to see whether the virtual environment is active
deactivate  # closes the active environment (takes no argument)
rmvirtualenv env_name  # removes the environment
virtualenvwrapper  # lists all commands with a short description
setvirtualenvproject $WORKON_HOME/env_name path_to_project  # links the environment to a project directory

Tips:

Keep all your Python venvs in one place (I keep them in my home directory so they are easy to find), then use virtualenvwrapper to link them to new project directories. This makes it simple to share a single Python virtual environment across projects and keeps your work clean and organised.

Simply type:

setvirtualenvproject $WORKON_HOME/my_venv path_to_project

Replace my_venv with the name of your virtual environment and path_to_project with the location of your project (my virtual environment is ML-Lab). Both arguments are paths; if you omit them, the currently active environment and the current directory are used.

Package Setup and Jupyter Kernel

Now let’s set up the virtual environment for machine learning. You are going to need a few packages to get this all working correctly.

Pip Packages

pip install jupyter  # batch install of all Jupyter packages
pip install jupyterlab  # Jupyter Lab environment (the IDE to be used)
pip install seaborn  # also installs pandas and matplotlib
pip install scikit-learn  # machine learning toolkit (imported as sklearn)
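Once the installs finish, it may be worth confirming that the packages are importable from the environment's own interpreter. The sketch below assumes the import names that normally go with these packages (ipykernel arrives with jupyter, and scikit-learn is imported as sklearn); the helper name missing_modules is hypothetical.

```python
import importlib.util

# Assumed import names for the packages installed above
EXPECTED_MODULES = [
    "ipykernel",
    "jupyterlab",
    "seaborn",
    "pandas",
    "matplotlib",
    "sklearn",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found by this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(EXPECTED_MODULES)
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("All expected packages are importable")
```

Run it with the virtual environment active; anything listed as not importable was most likely installed into a different Python.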

Kernels

A kernel is required to run a Jupyter Notebook. As standard, a Python 2 kernel may be installed. As we are using Python 3 for development, it is important to add a Python 3 kernel and switch to it. To see a list of the installed kernels use: jupyter kernelspec list. With the virtual environment active, you can install a kernel for it with:

ipython kernel install --user --name myenv --display-name "Python (myenv)"  # installs a kernel for the current virtualenv's Python version

Should you ever need to remove a kernel again, use:

jupyter kernelspec uninstall myenv  # uninstalls the kernel

Check that a Python 3 kernel has been installed correctly with: jupyter kernelspec list.
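The kernel list can also be read programmatically, for instance to confirm that the display name you chose was registered. The following is a sketch that assumes the --json option of jupyter kernelspec list and the "kernelspecs" layout of its output; the function name kernel_display_names is made up for illustration.

```python
import json
import subprocess


def kernel_display_names(kernelspec_json):
    """Map kernel name to display name from `jupyter kernelspec list --json` output."""
    data = json.loads(kernelspec_json)
    specs = data.get("kernelspecs", {})
    return {
        name: info.get("spec", {}).get("display_name", name)
        for name, info in specs.items()
    }


if __name__ == "__main__":
    try:
        raw = subprocess.check_output(
            ["jupyter", "kernelspec", "list", "--json"],
            universal_newlines=True,
        )
        kernels = kernel_display_names(raw)
    except (OSError, subprocess.CalledProcessError, ValueError) as exc:
        # Jupyter is missing from PATH, the command failed, or the output was not JSON
        print("Could not query Jupyter:", exc)
    else:
        for name, label in sorted(kernels.items()):
            print("{} -> {}".format(name, label))
```

If the kernel installed above is present, its name should appear alongside the display name shown in the Jupyter Lab launcher.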

Let’s Code

We have installed the Jupyter Lab IDE, which extends Jupyter Notebook and provides a more functional IDE for coding within a browser. Running jupyter lab from the virtual environment boots the IDE in a browser and connects it to a Jupyter kernel.

You are now ready to start developing. With the setup as it stands, at the start of each new session you run:

workon env_name
jupyter lab

These commands activate the virtual environment and launch the Jupyter Lab IDE.

 

[Image: snapshot of the Jupyter Lab IDE for Python]

 

Check that you are working in a python 3 kernel (top right corner). You are now good to go!

Automate Automate Automate

It’s great to make those repetitive tasks simpler. Create a shell script with an embedded AppleScript in your project:

#!/bin/bash
osascript <<EOD
  if application "iTerm" is running then
      tell application "iTerm"
      activate
        tell current session of current window to write text "cd && workon venv_name"
        tell current window to set tb to create tab with default profile
        tell current session of current window to write text " workon venv_name && cd project_location && jupyter lab"
        tell first tab of current window
          select
        end tell
      end tell
  else
      activate application "iTerm"
  end if
EOD

Save it as run.sh (or anything you like), and make sure the new file is executable by setting permissions on the script (replace with your script name and path):

chmod +x path/run.sh

One step further

Why not go one step further and boot up your project via Spotlight? Create an Automator app:

  • Go to Automator
  • Create a new application
  • Add Run shell script to the application (select from the list)
  • Paste shell script
  • Test the script and save as my_project

My final revised script with some tweaks is:

#!/bin/bash
osascript <<EOD
   if application "iTerm" is running then
      tell application "iTerm"
      activate
        tell current session of current window to write text "cd && workon ML-Lab"
        tell current window to set tb to create tab with default profile
        tell current session of current window to write text "workon ML-Lab && jupyter lab"
        tell first tab of current window
          select
        end tell
      end tell
  else
      activate application "iTerm"
      tell application "iTerm"
      activate
        tell current session of current window to write text "cd && workon ML-Lab"
        tell current window to set tb to create tab with default profile
        tell current session of current window to write text "workon ML-Lab && jupyter lab"
        tell first tab of current window
          select
        end tell
      end tell
  end if
EOD

Then you can run this script via Spotlight:

cmd+space my_project

Summary

Ta-dah, your machine learning lab is up and running. To summarise what we have achieved:

  • Created a clean installation of Python 3 on our Mac which won’t interfere with the Python 2 version the system requires
  • Made it simple to create and manage all our different Python virtual environments using virtualenv and virtualenvwrapper
  • Learnt the command line tips for using virtualenvwrapper
  • Installed the basic required packages in our machine learning virtual environment
  • Installed a great Jupyter Notebook IDE, Jupyter Lab
  • Automated starting the Python virtual environment by using a shell script with Automator

I hope you have enjoyed this post. Please leave a comment if you have any questions or issues, or if any part of this tutorial was unclear. If you want to read more, please check out some of my other blog posts.

Troubleshooting:

Check the virtualenv path:

sudo find / -name "virtualenv"

If you see "ERROR: virtualenvwrapper could not find virtualenv in your path", reinstall virtualenv:

pip uninstall virtualenv
pip install virtualenv

For an unknown reason, my pip packages were not installing into my virtualenv. The simplest way to fix this is to remove the environment and create a new one, which should resolve the issue.
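Before deleting the environment, it may help to see where packages are actually being installed. The sketch below compares the interpreter's site-packages directory with the VIRTUAL_ENV variable that the activate script sets; the helper names install_report and installs_into_active_env are hypothetical, and this is a diagnostic aid rather than a guaranteed explanation of the problem.

```python
import os
import sys
import sysconfig


def install_report():
    """Collect the paths that decide where pip puts packages."""
    return {
        "interpreter": sys.executable,
        "site_packages": sysconfig.get_paths()["purelib"],
        "virtual_env": os.environ.get("VIRTUAL_ENV"),  # set by the activate script
    }


def installs_into_active_env(report):
    """True when site-packages sits inside the directory named by VIRTUAL_ENV."""
    env_root = report["virtual_env"]
    if not env_root:
        return False
    env_root = os.path.realpath(env_root)
    target = os.path.realpath(report["site_packages"])
    return target.startswith(env_root + os.sep)


if __name__ == "__main__":
    report = install_report()
    for key, value in report.items():
        print("{}: {}".format(key, value))
    print("Installs into the active environment:", installs_into_active_env(report))
```

If the last line prints False while an environment is active, the python on your PATH is probably not the environment's own, which would explain packages landing elsewhere.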
