1. Linux basics and software installation

Although departmental computers with the software you need for this course may be available, you will probably also want to run calculations on your own laptop or workstation. This page will help you with:

  • Basic introduction to using the Linux command line.
  • Installing the programs needed for the work in this course.

These instructions are also largely compatible with macOS, and should be similar to what is required in the Windows Subsystem for Linux (WSL). However, WSL has not been thoroughly tested, so some things might not work correctly and WSL users will have to check this for themselves. If you do not have Linux installed on your own machine and do not want to permanently change operating system, another option is to use a virtual machine with Linux installed in it. A popular choice for this is VirtualBox.

Warning

All the software that we will use here is freely available for educational and academic research purposes.

If you want to use this software for commercial purposes (e.g. a research collaboration with an industry partner) it is up to you to determine whether the license(s) for the relevant software permit you to do this or not.

1.1. Linux

Simple tutorials for learning basic Linux shell commands can be found online.

Open a new terminal. You can do this either from the GUI or, on many desktops, with the keyboard shortcut Ctrl+Alt+T. When the terminal opens it will be running the Linux ‘shell’, which is an environment that allows you to perform numerous tasks from the keyboard without the need to open menus over and over again. Although a number of shells exist, the default on most Linux distributions is bash.

When the shell opens you will be in your home directory and here you can create the files and directories that will be needed in these practicals. Let’s practice some useful commands:

  • pwd – the output will be your ‘present working directory’. This is very useful as it is easy to lose your place once you start creating more directories and moving around them.
  • ls – list the files and directories that exist in the directory that you are in. On most systems the output is helpfully colour-coded, e.g. directories are blue and executable files are green. If you want to see what is in a directory type:
    ls directory
    (replacing directory with the name of the directory you want to look in).
  • mkdir – create a new directory e.g.
    mkdir practical_1
    will create a directory named ‘practical_1’.

Note

It is good to get into the habit of using underscores ( _ ) instead of spaces in file and directory names as empty spaces can lead to problems like the name being treated as two separate strings instead of one.

  • cd – change directory. So if you created the directory ‘practical_1’ then
    cd practical_1
    will move you to that directory. (Do pwd after cd and you will see that you are in a different directory).
    Some simple shortcuts exist that help when moving around the Linux file system:
    • ~ is a shorthand for your home directory e.g. cd ~ will take you back to your home regardless of where you started from.
    • .. is a shorthand for the directory above the one you are in, so cd .. will move you one directory up.
    • You can also use these with other commands e.g. ls ~ will show you what is in your home directory even if you are in a different one.
  • mv – this moves a file or directory. Typing
    mv practical_1 first_practical
    will move/rename the directory ‘practical_1’ to ‘first_practical’. ls will now show the new name that you gave it.
  • cp – copies a file e.g.
    cp file_1 file_2
    will make a copy of ‘file_1’ and call it ‘file_2’.
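
The commands above can be tried together in one short session. The following is only a sketch: it works inside a throwaway directory created with mktemp -d, and it uses touch (not covered above) simply to create an empty file to copy. The last command illustrates the earlier note about spaces in names.

```shell
# Work in a throwaway directory so nothing in your home directory is touched.
scratch=$(mktemp -d)
cd "$scratch"

mkdir practical_1                # create a directory
cd practical_1                   # move into it
pwd                              # prints a path ending in /practical_1
cd ..                            # go back up one level

mv practical_1 first_practical   # rename the directory
touch first_practical/file_1     # create an empty file so there is something to copy
cp first_practical/file_1 first_practical/file_2
ls first_practical               # lists file_1 and file_2

# An unquoted space separates arguments, so this creates TWO directories,
# one called 'my' and one called 'practical'.
mkdir my practical
ls
```

When you have finished experimenting, cd ~ will take you back to your home directory.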

Use the following commands carefully. Linux will not ask you if you are sure you want to delete things and there is no ‘undo’.

  • rm – allows you to remove/delete files.
  • rmdir – removes directories (only if they are empty).
  • rm -r – can be used to remove a non-empty directory. In addition to removing any files in that directory it will also recursively (the -r modifier) remove any subdirectories in that directory along with their contents. Be very careful with this command as you could potentially remove all the files in your home directory with it – or worse…
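
A safe way to get used to these commands is to practise in a scratch directory first. The sketch below assumes nothing about your own files: every name in it (old_results, run_1, notes.txt, output.txt) is made up for the illustration, and everything happens inside a temporary directory created with mktemp -d.

```shell
# Practise deleting inside a throwaway directory, never in your home directory.
scratch=$(mktemp -d)
cd "$scratch"

mkdir -p old_results/run_1       # -p also creates the parent directory
touch notes.txt old_results/run_1/output.txt

rm notes.txt                     # removes a single file

# rmdir only works on empty directories, so this attempt is refused.
rmdir old_results 2>/dev/null || echo "rmdir refused: old_results is not empty"

ls -R old_results                # look at what is inside before deleting recursively
rm -r old_results                # removes the directory and everything in it
```

If you would like to be asked for confirmation, rm -i prompts before each removal; this is something you have to request, it is not the default behaviour.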

As with rm, all of these commands can do more complicated things if you supply modifiers/flags to them but these are beyond the scope of this basic introduction. If you want to know more about these commands you can type:

man commandname

to display the manual page for the command. The up and down arrows will let you scroll through the page. When done viewing a man page, typing q allows you to exit back to the command line.

1.2. Python installation

Although you may already have Python on your computer we will use a separate Python installation to avoid any potential conflicts with the system version. The full installation will exist in a folder in your home directory meaning that if you encounter problems later it will be possible to simply delete the entire folder and start again without harming your system.

We will use the Miniconda Python distribution. This is a minimal installation that, in addition to a separate version of Python, will give you access to the conda package manager which makes installation of Python programs simpler by taking care of any dependencies that they might have.

Note

The Miniconda installer can be found at this page.

Follow the link and download the installer you need from the section Latest Miniconda Installer Links. Your browser will normally save the file in your Downloads folder.

For Linux choose the Miniconda3 Linux 64-bit installer.

Type

ls ~/Downloads/

and you should see the Miniconda installer in the Downloads directory.
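
Optionally, you can check that the download is intact before running it. The sketch below assumes that the download page lists a SHA-256 hash for your installer and that the sha256sum program is available (on macOS, shasum -a 256 plays the same role). The function name verify_sha256 and the value of expected are placeholders introduced for this illustration only.

```shell
# verify_sha256 FILE EXPECTED_HASH
# Succeeds (exit status 0) only if the SHA-256 hash of FILE equals EXPECTED_HASH.
verify_sha256() {
    # sha256sum prints "<hash>  <filename>"; keep only the hash.
    actual=$(sha256sum "$1" | cut -d ' ' -f 1)
    [ "$actual" = "$2" ]
}

installer="$HOME/Downloads/Miniconda3-latest-Linux-x86_64.sh"
expected="paste-the-hash-from-the-download-page-here"   # placeholder, not a real hash

if [ -f "$installer" ]; then
    if verify_sha256 "$installer" "$expected"; then
        echo "Checksum matches."
    else
        echo "Checksum does NOT match: download the installer again."
    fi
else
    echo "Installer not found at $installer"
fi
```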

You can run the installer in the shell by typing:

bash ~/Downloads/Miniconda3-latest-Linux-x86_64.sh

Select the default answers to all of the questions that it asks and it will install Miniconda in a directory called miniconda3 in your home directory.

Now you are ready to install the Python software needed for these classes.

1.3. AutoDock Vina

Now that you have your miniconda Python installation you can install the protein-ligand docking software AutoDock Vina [1] (we will simply refer to it as vina from now on).

For this we will use the conda package manager.

First, we will create a separate virtual Python environment for vina. This is good practice: when several packages, each with its own dependencies, are installed in a single environment, there is always a risk that the library versions required by one package clash with those required by another. Separate environments avoid this problem.

To create a new environment for vina do the following:

conda create --name vina python=3

This will create the environment and install the most recent version of Python 3 in it.

At the start of the command prompt you will see ‘(base)’ which tells you that you are in the miniconda base environment. Now type:

conda activate vina

You will see that ‘(base)’ has changed to ‘(vina)’ and this means that you are now in the new virtual environment that you created. From here you can now install the vina software package. The command to do this is simply:

conda install -c conda-forge vina

The -c conda-forge part tells conda to look for the vina package in a software channel called ‘conda-forge’.

Once you have installed vina you can check if it is there by typing:

vina

which will run the program and produce a usage menu and an error that ends with the line:

ERROR: The receptor or affinity maps must be specified.

Once vina is successfully installed you can exit the virtual environment by typing:

conda deactivate

which will drop you back into the base environment (check the beginning of the command prompt to make sure).
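
If you lose track of which environment you are in, or whether a program is available there, a quick check can help. This is a minimal sketch: have_command is a helper name made up for this example, built on the standard shell builtin command -v, and conda env list is the conda command that lists your environments.

```shell
# have_command NAME: succeeds if NAME can be found on the current PATH.
have_command() {
    command -v "$1" >/dev/null 2>&1
}

if have_command conda; then
    conda env list      # the active environment is marked with an asterisk
else
    echo "conda not found: open a new terminal or check the Miniconda installation"
fi

if have_command vina; then
    echo "vina is available in this environment"
else
    echo "vina not found: try 'conda activate vina' first"
fi
```

Run in the base environment, the second check is expected to report that vina is not found, since it was installed only in the vina environment.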

1.4. OpenBabel

openbabel [2] is a powerful command line code for the manipulation of chemical structure files. In addition to being able to convert between a huge number of file formats, openbabel can perform a wide range of cheminformatics and molecular modelling tasks.

Native versions for Windows and Mac exist and can be downloaded from the openbabel site. On most Linux distributions it is possible to install via the system package managers. The method we will choose here though is to install using conda as we did for the codes above.

Warning

It is important to install openbabel version 3.x. In earlier versions there was an error in the creation of pdbqt files needed for docking with vina.

As openbabel is a very useful piece of software we will install it in the base Python environment so that it is available as soon as you start a terminal, and you do not need to activate and deactivate environments every time you want to use it. Note, however, that this means that you may not have access to it if you are in an environment other than base – if you find you need to use openbabel a lot in a different environment for your work, you can simply install it there as well.

To install with conda do the following:

conda install -c conda-forge openbabel

Once it is installed run

obabel -L formats

for a full list of the different chemical file formats that openbabel can convert. Simply running

obabel -L

results in a list of help topics that are available.
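
Since version 3.x is required, it is worth confirming which version was installed. The sketch below uses obabel -V, which prints the version, and assumes that the text it prints contains the version number as its first number (for example ‘Open Babel 3.x.y’). The helper major_version and the file name ligand.sdf are hypothetical names used only for illustration.

```shell
# major_version TEXT: print the first number found in TEXT,
# e.g. the leading 3 of a version number such as 3.1.0.
major_version() {
    printf '%s\n' "$1" | sed -n 's/^[^0-9]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

if command -v obabel >/dev/null 2>&1; then
    version_text=$(obabel -V)
    echo "$version_text"
    major=$(major_version "$version_text")
    if [ -n "$major" ] && [ "$major" -ge 3 ]; then
        echo "Version 3 or later: suitable for preparing pdbqt files."
    else
        echo "Could not confirm version 3 or later: check your installation."
    fi
    # Hypothetical conversion; ligand.sdf stands for one of your own files:
    # obabel ligand.sdf -O ligand.pdbqt
else
    echo "obabel not found on PATH"
fi
```

The commented-out line shows the general pattern for a conversion: the input file comes first and the output file follows -O, with the formats taken from the file extensions.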

1.5. Avogadro

Although many structures for drug-like molecules can be downloaded from online databases, it is always useful to have the ability to graphically edit them or even to create usable structures from scratch for your own research. A very useful piece of software for this is avogadro [3] which can be found here.

In addition to building/editing molecules, avogadro allows you to optimise their geometry and even perform conformational searches via an interface to openbabel (a version of openbabel is included with the avogadro package so it will work independently of your command line openbabel installation).

The avogadro site has download instructions for installers for Windows and Mac but those fortunate enough to be using Linux can most probably install it directly from their package manager.

Note

In the most recent Ubuntu Linux distributions the default avogadro is a new version with a slightly different interface layout: avogadro2. With some small differences (mainly in the menu layout) this newer version is compatible with the material in this course.

As different users of this course may not all be running the latest Ubuntu (or even Linux) we will stick with the instructions for what is still the most common version of avogadro.

For example, in Ubuntu you can install avogadro with apt as follows:

sudo apt install avogadro

sudo lets you run things with super-user (administrator) privileges and is required for installation using apt.

To use sudo you need administrator permissions, and it will ask for your password. If you installed Linux on your own computer you will be the super-user, so your login password is the one required. On a departmental Linux machine, however, you probably won’t have these permissions and will have to ask for avogadro to be installed for you if it is not already there.

Some instructions for other Linux distributions are available on the avogadro site. Note, however, that these do not appear to have been updated in some time.

1.6. UCSF Chimera

chimera [4] is a graphical program for the display and analysis of (bio)molecules and has many capabilities that go far beyond what we need for this course. Fortunately, it also makes graphical setup, running and subsequent analysis of vina protein-ligand docking calculations much easier. It also has the ability to produce publication-quality graphics easily which is always handy.

Installers and installation instructions for your operating system can be found here. Choose the Current Production Releases as these are the most stable.


References

[1] Trott, O. & Olson, A. J.; J. Comput. Chem. 31, 455 (2009).

[2] O’Boyle, N. M. et al.; J. Cheminform. 3, 33 (2011).

[3] Hanwell, M. D. et al.; J. Cheminform. 4, 17 (2012).

[4] Pettersen, E. F. et al.; J. Comput. Chem. 25, 1605 (2004).