Setting up an Amazon Ubuntu EC2 instance and configuring it with Python 2.7.9 and the SciPy stack

by Eric Bunch


Posted on 15 Nov 2015


Here, I will go through the steps to set up an Amazon EC2 instance that runs Ubuntu 14.04 LTS, install Python 2.7.9, and install important modules like numpy and SciPy. So let’s get started.

First, sign in to Amazon Web Services and go to the EC2 dashboard. Click on the ‘Running instances’ option, and then click ‘Launch instance’. We will use the Ubuntu AMI (currently 14.04 LTS) for this tutorial, along with the t2.micro instance type, because it is free during the one-year free trial. Select this option and click ‘Review and launch’. On the next screen, click ‘Launch’. There are security options that we could change, but for simplicity we will leave them at their defaults and change them later if we wish. A popup will appear asking about a key pair. If you do not have a key pair, follow the instructions to obtain one. Once you have one, select ‘Launch instance’.

Next, select ‘View instances’ to view the instance you just launched. It will probably need a minute or so to start. Wait until the entry in the ‘Instance state’ column says ‘running’. Once the instance is running, select the box on the left side of the row where the instance information is displayed. Then click the ‘Connect’ button above the list of instances. A popup box will appear with instructions on how to connect to your instance. We will be connecting through the command line. The command needed to connect to your instance will look like this:


ssh -i "whatever_your_key_pair_is.pem" ubuntu@xx.xx.xxx.x

This command must be run from the directory that contains your .pem file, or else you must prepend the full directory path to "whatever_your_key_pair_is.pem". The .pem file holds the key pair we just obtained, and the x’s stand for the digits of your EC2 instance's IP address.
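One thing that can trip up this step is file permissions: ssh ignores a private key file that other users are able to read, and prints an "UNPROTECTED PRIVATE KEY FILE" warning. Below is a minimal sketch of a shell helper that checks that the key file exists, restricts it to owner read-only, and then connects. The function name connect_ec2 is my own invention for illustration; the ubuntu user name is the one used in the command above.

```shell
# Hypothetical helper: connect_ec2 KEY_FILE HOST
connect_ec2() {
    key_file="$1"
    host="$2"

    # Fail early with a clear message if the key path is wrong.
    if [ ! -f "$key_file" ]; then
        echo "key file not found: $key_file" >&2
        return 1
    fi

    # ssh ignores private keys that other users can read,
    # so make the file readable by its owner only.
    chmod 400 "$key_file" || return 1

    ssh -i "$key_file" "ubuntu@$host"
}
```

It would be called as connect_ec2 followed by the full path to your .pem file and the IP address of your instance.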

This next part is optional. We can create an alias for the ssh command that connects to the EC2 instance, so that we don't have to remember or look up the command each time we wish to connect. To do so, in the command line type nano ~/.bash_aliases. This will open up a file containing the aliases you already have. If you haven't created aliases before, this will be blank. To this file, add the line


alias ec2-connect='ssh -i "/full/path/to/whatever_your_key_pair_is.pem" ubuntu@xx.xx.xxx.x'

Use the full path to your .pem file here so that the alias works from any directory. Of course, you can name your alias something other than ec2-connect if you like. Type Ctrl+x to exit, then type 'y' and Enter to save these changes. To activate the alias, type source ~/.bashrc in the command line and hit Enter.

Now connect to the EC2 instance by typing either the ssh -i... command or the alias, if you created one. You will be asked whether it is okay to connect to the IP address of your EC2 instance; agree to proceed. You will then be connected to the EC2 instance. To disconnect, hit Ctrl+d.

If we type python and hit Enter, we open a Python interpreter. On the images currently hosted, the default Python version is 2.7.6. For the project that I am using my EC2 instance for, I attempted to run a script that used parts of the urllib3 module. When I did this, I received an InsecurePlatformWarning. The documentation about this warning strongly recommends upgrading to Python 2.7.9 or greater. In the next steps, we will detail how to get Python 2.7.9 set up in a virtual environment and install all the relevant modules.
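To check whether a given interpreter is already new enough, we can compare the version it reports against 2.7.9. The following is a rough sketch and not part of the original setup. The function names version_at_least and check_python are made up for this example, and it assumes GNU sort, whose -V option sorts version strings.

```shell
# version_at_least HAVE WANT: succeeds when dotted version HAVE is >= WANT.
# Assumes GNU sort, which provides the -V (version sort) option.
version_at_least() {
    lowest="$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)"
    # If WANT is the smaller (or equal) of the two, HAVE is new enough.
    [ "$lowest" = "$2" ]
}

# check_python INTERPRETER: report whether INTERPRETER is at least 2.7.9.
check_python() {
    # Python 2 writes its version to stderr, so capture both streams.
    installed="$("$1" --version 2>&1 | awk '{print $2}')"
    if version_at_least "$installed" "2.7.9"; then
        echo "$1 reports $installed: new enough"
    else
        echo "$1 reports $installed: older than 2.7.9"
        return 1
    fi
}
```

One way to use it would be to run check_python python on a fresh instance, and check_python python2.7 after the installation steps below.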

First, we will install Python 2.7.9:

  1. Update and upgrade:
    
    sudo apt-get update && sudo apt-get upgrade
    
  2. Install some dependencies:
    
    sudo apt-get install build-essential
    
    
    sudo apt-get install libreadline-gplv2-dev libncursesw5-dev libssl-dev libsqlite3-dev tk-dev libgdbm-dev libc6-dev libbz2-dev
    
  3. Create a Downloads directory, then change to that directory and download the files for Python 2.7.9
    
    mkdir Downloads
    cd Downloads
    wget https://www.python.org/ftp/python/2.7.9/Python-2.7.9.tgz
    
  4. Extract the files, and go into the directory Python-2.7.9 that was created
    
    tar -xvf Python-2.7.9.tgz
    cd Python-2.7.9
    
  5. Install Python
    
    ./configure
    make
    sudo make install
    
  6. Return to the home directory and update and upgrade again
    
    cd
    sudo apt-get update && sudo apt-get upgrade
    
    At some point during the upgrade process, a menu will appear, asking which version of a file to use. Select the option that says something to the effect of 'use package maintainer's version'. Now that Python 2.7.9 is installed, it can be called by typing python2.7 into the command line.
  7. Before we install virtualenv, we will install some prerequisites.
    
    sudo apt-get install build-essential python2.7-dev python-dev python-pip liblapack-dev libblas-dev libatlas-base-dev gfortran libpng-dev libjpeg8-dev libfreetype6-dev libqt4-core libqt4-gui libqt4-dev libzmq-dev
    
    For this part of the tutorial, I am taking the above command from this article.
  8. Install virtualenv
    
    sudo pip install virtualenv
    
  9. Now that we have virtualenv installed, we will create a virtual environment to keep our Python 2.7.9 stuff in, and to keep that stuff separate from the Python 2.7.6 stuff.
    1. First, make a folder to hold all of our virtual environments that we wish to create in the future. This step is not essential, but is common practice. We will call the folder virtualenvs.
      
      mkdir virtualenvs
      
    2. Make a virtual environment where the default is Python 2.7.9. The information for a virtual environment is stored in its own directory, which we will keep inside the virtualenvs directory. We will call the directory for the virtual environment we are about to create py-2-7-9_scipy_stack, so that the name identifies which versions and packages this virtualenv holds.
      virtualenv -p python2.7 virtualenvs/py-2-7-9_scipy_stack
    3. Move into the directory where the virtualenv information we just created is stored, and then activate the virtualenv
      
      cd virtualenvs/py-2-7-9_scipy_stack
      source bin/activate
      
      If we wish to deactivate the virtualenv, use the command deactivate.
    4. Now that we are in our virtual environment, we can begin to install the Python packages we want. The main one we want is SciPy, and since that one is trickier to install than the others, we will detail how to install it. First, SciPy requires numpy, so install numpy; this will take some time.
      
      pip install numpy
      
      Next, we wish to install the SciPy module. But because this module is so large, and we are using a micro EC2 instance, we run into problems with memory if we simply try pip install scipy. To overcome this, we will add some swap space. We will add a 2 GB swap file: the dd command below writes 2048k (2,097,152) blocks of 1024 bytes each. The following instructions come from here. First, create the swap file
      
      sudo dd if=/dev/zero of=/swapfile bs=1024 count=2048k
      
      Next prepare the swap file by creating a linux swap area
      
      sudo mkswap /swapfile
       
      Then activate the swap file
      
      sudo swapon /swapfile
      
      The swap file will stay active until the instance is rebooted. In order to make the swap permanent, we need to add an entry to the fstab file. Open the fstab file
      
      sudo nano /etc/fstab
      
      Paste in the following line
      
      /swapfile       none    swap    sw      0       0
      
      Next we will set the swappiness to 10
      
      echo 10 | sudo tee /proc/sys/vm/swappiness
      echo vm.swappiness = 10 | sudo tee -a /etc/sysctl.conf
      
      Finally, we correct the permissions on the swap file by using the following commands
      
      sudo chown root:root /swapfile
      sudo chmod 0600 /swapfile
      
    5. Now we can finally proceed with installing SciPy and other modules. Below, I have code to install SciPy and a number of other modules. It should be noted that SciPy takes quite some time to install, as do the pandas and scikit-learn modules.
      
      pip install scipy
      pip install matplotlib
      pip install pandas
      pip install scikit-learn
      pip install ipython
      pip install pyzmq
      pip install pygments
      pip install patsy
      pip install statsmodels
      
That's it! Whenever you want to use Python 2.7.9 on your EC2 instance, activate the virtual environment as we did above, and you will have Python 2.7.9 available, as well as any modules you installed while that virtual environment was active. If you would like to use a different version of Python, say Python 3.x, on your EC2 instance, you can follow largely the same steps. Instead of installing Python 2.7.9, you will have to install Python 3.x. In step 7, you must change the package python2.7-dev to the package python3.4-dev (or python3.3-dev, if you prefer). Finally, when you create your virtualenv, instead of passing the argument -p python2.7, you will pass -p python3.4.
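A quick way to confirm that the virtual environment ended up with everything is to try importing each module. This is a minimal sketch under my own assumptions: the function name check_modules is hypothetical, and the import names in the example line (sklearn for scikit-learn, IPython for ipython, zmq for pyzmq) differ from the pip package names.

```shell
# check_modules INTERPRETER MODULE...
# Try to import each module with the given interpreter and report the result.
check_modules() {
    interpreter="$1"
    shift
    missing=0
    for module in "$@"; do
        if "$interpreter" -c "import $module" >/dev/null 2>&1; then
            echo "ok       $module"
        else
            echo "MISSING  $module"
            missing=$((missing + 1))
        fi
    done
    # Succeed only when every module could be imported.
    [ "$missing" -eq 0 ]
}

# Example (run with the virtualenv active):
#   check_modules python numpy scipy matplotlib pandas sklearn IPython zmq pygments patsy statsmodels
```

Because the interpreter is passed in as an argument, the same function can be pointed at the system python outside the virtualenv to see the difference between the two environments.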