If, like me, you are beginning your next data analytics project, you may be wondering what tools to use, where you can cut corners, and how to up your productivity and save time. I recently learnt Python properly and fell in love with the language, but ran into many issues getting Python running smoothly on my Mac. Thankfully, at the time I came across the JetBrains PyCharm IDE, which simplified the installation and let me get up and running with Python.
I soon ran into PyCharm’s limitations, however. Its Jupyter Notebook editor was riddled with bugs, making it impractical for any extensive data analysis, not to mention the learning curve required to get around its UI and set it all up fully. In the end, I returned to a more fundamental Python setup, using Jupyter Lab, a project hosted on GitHub, as my primary editor.
Avoiding a Python IDE such as PyCharm means going back to the command line to install and manage Python projects and packages. Don’t worry: once this is set up, it provides a fast, fully flexible Python infrastructure ready for any future project. Let me talk you through how to get there. (Please note this tutorial is for macOS; however, most of it is directly transferable to Linux.)
- Homebrew – for Mac package management
- Python 3
- pip, the Python package manager
- Jupyter Lab
Skip any section for a tool you already have installed.
First, let’s get Python installed. When I experimented with Anaconda in the past, I ran into many issues with Python version management and ended up removing the application from my system entirely. Instead, as with all other applications on my Mac (watch this space for a future post), I opted to use Homebrew to install and manage Python cleanly.
For those who don’t have Homebrew installed, fire up the command line and type:
ruby -e "$(curl -fsSL https://raw.github.com/mxcl/homebrew/go)"
Next, we can install Python (3.6 at the time of writing) with the command:
brew install python
This installs python to
Python Package installation
Install the Python package manager pip, which manages the installation of a project’s Python dependencies:
curl -O http://python-distribute.org/distribute_setup.py
python distribute_setup.py
curl -O https://raw.github.com/pypa/pip/master/contrib/get-pip.py
python get-pip.py
Next, let’s add Virtualenv. Virtualenv allows independent Python virtual environments to be built and run, each with its own Python version and packages:
pip install virtualenv
To extend Virtualenv with some useful additional functionality, we use virtualenvwrapper, installed via:
pip install virtualenvwrapper
Getting this package fully up and running requires some slightly more technical tweaking of the
.bashrc file and if you are using zsh, also tweak
Setup bashrc for Virtualenv:
If, like me, you do not have a .bashrc file on your system, go to your home directory (type:
cd) and type:
echo 'export PATH=/usr/local/lib:$PATH' >> ~/.bashrc
You can check the file’s creation with:
Then, using nano (
cd && nano .bashrc) on the command line or a text editor like Atom, edit the .bashrc file by pasting in:
export WORKON_HOME=$HOME/.virtualenvs
export PROJECT_HOME=$HOME/
export VIRTUALENVWRAPPER_PYTHON=/usr/local/bin/python
export VIRTUALENVWRAPPER_VIRTUALENV=/usr/local/bin/virtualenv
export VIRTUALENVWRAPPER_VIRTUALENV_ARGS='--no-site-packages'
Double-check that the Python path is correct (line 3); this is the path for a Homebrew Python setup. After saving this file, the shell needs to reload it:
Finally, virtualenvwrapper needs to be loaded via the command:
You should be able to test that this is working by going to your home folder and typing
workon, which lists all installed Python virtual environments (as there are none yet, it should return nothing). If you see no errors, virtualenvwrapper has been set up correctly; otherwise, refer to the troubleshooting tips. Feeling lazy? Here is a shell script you could run instead:
read -r -p "Install Python Setup? [y/N] " response
if [[ "$response" =~ ^([yY][eE][sS]|[yY])$ ]]; then
  cd ~
  pip install virtualenv
  pip install virtualenvwrapper
  # Quoted heredoc delimiter: $PATH and $HOME are written literally
  # and only expanded when .bashrc is sourced
  cat >> ~/.bashrc <<'EOF'
export PATH=/usr/local/lib:$PATH
export WORKON_HOME=$HOME/.virtualenvs
export PROJECT_HOME=$HOME/
export VIRTUALENVWRAPPER_PYTHON=/usr/local/bin/python
export VIRTUALENVWRAPPER_VIRTUALENV=/usr/local/bin/virtualenv
export VIRTUALENVWRAPPER_VIRTUALENV_ARGS='--no-site-packages'
EOF
  source ~/.bashrc
  source /usr/local/bin/virtualenvwrapper.sh
  echo 'source /usr/local/bin/virtualenvwrapper.sh' >> ~/.zshrc
fi
echo "Complete"
Just save this as a
.sh file and make sure it is executable (
chmod 755 pythonInstall.sh).
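The script hard-codes several /usr/local/bin paths, and Homebrew or pip may place files elsewhere on your machine, so it can be worth checking them first. Here is a minimal sketch of such a check; the helper name check_paths is my own invention and not part of virtualenvwrapper:

```shell
# check_paths PATH...: report whether each given path exists.
# Hypothetical helper; returns non-zero if any path is missing.
check_paths() {
  missing=0
  for p in "$@"; do
    if [ -e "$p" ]; then
      echo "ok:      $p"
    else
      echo "missing: $p"
      missing=1
    fi
  done
  return "$missing"
}

# Example, using the paths assumed in this post:
# check_paths /usr/local/bin/python /usr/local/bin/virtualenv /usr/local/bin/virtualenvwrapper.sh
```

If anything is reported missing, adjust the matching export line in .bashrc before sourcing it.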
Keep Sessions Consistent
An additional step for zsh is required to allow iTerm sessions to be kept consistent. Simply run the command:
echo 'source /usr/local/bin/virtualenvwrapper.sh' >>~/.zshrc
When a terminal window is closed and reopened, virtualenvwrapper should now still be available.
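One thing to watch: the echo command appends a new line every time it runs, so running it twice (or running the lazy script as well) leaves duplicate source lines in .zshrc. A small sketch of a guard against that, assuming a helper I have called append_once:

```shell
# append_once LINE FILE: append LINE to FILE only if an identical line
# is not already present. Hypothetical helper name.
append_once() {
  line=$1
  file=$2
  touch "$file"
  # -q quiet, -x match the whole line, -F treat LINE as a fixed string
  if ! grep -qxF -- "$line" "$file"; then
    printf '%s\n' "$line" >> "$file"
  fi
}

# Example:
# append_once 'source /usr/local/bin/virtualenvwrapper.sh' ~/.zshrc
```

Running it repeatedly leaves a single copy of the line in the file.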
Creating Virtual Python Environments:
Now that we have got all that out of the way, it is time for the more interesting bit. Let’s create a new Python virtual environment; swap
env_name with your desired virtual environment name.
# make py3
mkvirtualenv env_name --python=python3
# make py2
mkvirtualenv env_name --python=python2
This creates a brand new environment with a few basic packages and activates it (my virtual environment is called ML-Lab). To check that this works correctly, type
workon and you should see your new environment listed.
Some other useful Virtualenvwrapper Commands for your notebook:
workon env_name # loads environment
python -V # checks the Python version; if the project was set up with Python 3, this is an easy way to see whether the virtual environment is active
deactivate # closes the active environment (takes no argument)
rmvirtualenv env_name # removes environment
virtualenvwrapper --list # lists all commands
setvirtualenvproject my_venv path_to_project # links an environment to a project directory
Keep all your Python venvs in one place (I keep them in my home directory so they are easy to find), then use virtualenvwrapper to link them to new project directories. This makes it simple to share a single Python virtual environment across projects and keeps your work clean and organised.
setvirtualenvproject my_venv path_to_project
Replace my_venv with the name of your virtual environment and
path_to_project with the location of your project from your home directory (my virtual environment is ML-Lab).
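As far as I can tell, virtualenvwrapper records this link in a small text file called .project inside the environment’s folder under WORKON_HOME. Assuming that default file name, a sketch of a helper (project_for_env is my own name) to see which project an environment points at could look like this:

```shell
# project_for_env ENV_NAME: print the project directory linked to ENV_NAME.
# Assumes the link is stored in $WORKON_HOME/ENV_NAME/.project
# (the default file name, as far as I know).
project_for_env() {
  link_file="${WORKON_HOME:-$HOME/.virtualenvs}/$1/.project"
  if [ -f "$link_file" ]; then
    cat "$link_file"
  else
    echo "no project linked to $1" >&2
    return 1
  fi
}

# Example:
# project_for_env ML-Lab
```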
Package Setup and Jupyter Kernel
Now let’s set up the virtual environment for machine learning. You are going to need a few packages to get this all working correctly.
pip install jupyter # batch install all Jupyter packages
pip install jupyterlab # Jupyter Lab environment (IDE to be used)
pip install seaborn # also installs pandas and matplotlib
pip install scikit-learn # machine learning toolkit (imported as sklearn)
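A quick way to confirm the packages landed in the active environment is to try importing each one. This is only a sketch; check_imports is a name I made up, and it simply relies on whichever python is first on your PATH:

```shell
# check_imports MODULE...: try to import each module with the current python.
# Hypothetical helper; returns non-zero if any import fails.
check_imports() {
  missing=0
  for mod in "$@"; do
    if python -c "import $mod" 2>/dev/null; then
      echo "ok:      $mod"
    else
      echo "missing: $mod"
      missing=1
    fi
  done
  return "$missing"
}

# Example (run inside the virtual environment):
# check_imports jupyterlab seaborn pandas matplotlib sklearn
```

Note that the import names can differ from the pip names: the scikit-learn package is imported as sklearn.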
A kernel is required to run a Jupyter Notebook. As standard, a Python 2 kernel will be installed. As we are using Python 3 for development, it is important to add a Python 3 kernel and switch to it. To see a list of the installed kernels, use:
jupyter kernelspec list. You can install a Python 3 kernel with:
ipython kernel install --user --name myenv --display-name "Python (myenv)" # installs a kernel for the current virtualenv's Python version
sudo jupyter kernelspec uninstall mykernel # uninstalls a kernel
Check that a python 3 kernel has been installed correctly with:
jupyter kernelspec list.
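If you would rather check for the kernel in a script than by eye, you can filter the listing by name. A minimal sketch, assuming the usual output layout of one kernel per line with the name in the first column (has_kernel is a hypothetical helper):

```shell
# has_kernel NAME: succeed if NAME appears in the first column of the
# kernel listing read from standard input.
has_kernel() {
  awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Example:
# jupyter kernelspec list | has_kernel myenv && echo "kernel found"
```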
We have installed the Jupyter Lab IDE, which extends Jupyter Notebook into a more functional IDE for coding within a browser. Running
jupyter lab from the virtual environment boots the IDE in a browser and connects it to a Jupyter kernel.
You are now ready to start developing. As things stand, at the start of each new session you run:
workon env_name
jupyter lab
This code boots the virtual environment and runs a great Jupyter notebook IDE.
Check that you are working in a python 3 kernel (top right corner). You are now good to go!
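If typing both commands every session gets tedious, they can be wrapped in a small shell function kept in your .bashrc or .zshrc. This is just a sketch; the name lab is my own, and it assumes virtualenvwrapper has already been sourced so that workon is available:

```shell
# lab ENV_NAME: activate the environment, then start Jupyter Lab.
# Hypothetical helper; Jupyter Lab only starts if workon succeeds.
lab() {
  if [ -z "$1" ]; then
    echo "usage: lab env_name" >&2
    return 2
  fi
  workon "$1" && jupyter lab
}

# Example:
# lab ML-Lab
```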
Automate Automate Automate
It is great to make those repetitive tasks simpler. Create a shell script with an embedded AppleScript in your project:
#!/bin/bash
osascript <<EOD
if application "iTerm" is running then
    tell application "iTerm"
        activate
        tell current session of current window to write text "cd && workon venv_name"
        tell current window to set tb to create tab with default profile
        tell current session of current window to write text "workon venv_name && cd project_location && jupyter lab"
        tell first tab of current window
            select
        end tell
    end tell
else
    activate application "iTerm"
end if
EOD
Save this as run.sh (or anything you like), and make sure the new file is executable by setting permissions on the script (replace path/run.sh with your script’s path and name):
chmod +x path/run.sh
One step further
Why not go one step further and boot up your project via Spotlight? Create an Automator app:
- Go to Automator
- Create a new application
- Add Run shell script to the application (select from the list)
- Paste shell script
- Test the script and save as
My final revised script with some tweaks is:
#!/bin/bash
osascript <<EOD
if application "iTerm" is running then
    tell application "iTerm"
        activate
        tell current session of current window to write text "cd && workon ML-Lab"
        tell current window to set tb to create tab with default profile
        tell current session of current window to write text "workon ML-Lab && jupyter lab"
        tell first tab of current window
            select
        end tell
    end tell
else
    activate application "iTerm"
    tell application "iTerm"
        activate
        tell current session of current window to write text "cd && workon ML-Lab"
        tell current window to set tb to create tab with default profile
        tell current session of current window to write text "workon ML-Lab && jupyter lab"
        tell first tab of current window
            select
        end tell
    end tell
end if
EOD
Then you can run this script via spotlight:
Ta-dah, your machine learning lab is up and running. To summarise what we have achieved:
- Created a clean installation of Python 3 on our Mac which won’t interfere with the Python 2 version the system requires
- Made it simple to create and manage all our different Python virtual environments using virtualenv and virtualenvwrapper
- Learnt the command line tips for using virtualenvwrapper
- Installed the basic required packages on our machine learning virtual environment
- Installed a great Jupyter Notebooks IDE, Jupyter-Lab
- Automated the Python Virtual Environment by using a shell script with Automator
I hope you have enjoyed this post. Please leave a comment if you have any questions or issues, or if any part of this tutorial has been unclear. If you want to read more, please check out some of my other blog posts.
Check virtualenv path
sudo find / -name "virtualenv"
Reinstall virtualenv: ERROR: virtualenvwrapper could not find virtualenv in your path
pip uninstall virtualenv
pip install virtualenv
For an unknown reason, my pip packages were not installing into my virtualenv. The simplest way to fix this is to remove the environment and create a new one. This should fix the issue.
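Before deleting anything, it may help to confirm that the pip being run really belongs to the active environment, since a pip from elsewhere on your PATH would install packages outside it. A rough sketch of such a check, relying on the VIRTUAL_ENV variable that an activated virtualenv sets (pip_in_venv is a made-up helper name):

```shell
# pip_in_venv: succeed if the pip found on PATH lives inside the
# active virtual environment. Hypothetical helper.
pip_in_venv() {
  if [ -z "$VIRTUAL_ENV" ]; then
    echo "no virtual environment is active" >&2
    return 1
  fi
  pip_path=$(command -v pip) || return 1
  case "$pip_path" in
    "$VIRTUAL_ENV"/*) echo "pip is inside the environment: $pip_path" ;;
    *) echo "pip is outside the environment: $pip_path" >&2; return 1 ;;
  esac
}

# Example (after workon env_name):
# pip_in_venv
```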