Software Environments: Difference between revisions

Latest revision as of 12:58, 23 June 2023

Python Environments


Python

If the Python environment on the HPC machine is not suitable for running certain programs, for example because modules are missing or their versions are too old or too new, you can install modules or create your own Python environment in your home directory.

This allows you to install extra modules or different versions thereof, or even an entirely different Python version.

Please note that Python 2 is now obsolete and end-of-life. Everybody should consider using Python 3 or migrating to it.

There are several ways to create an environment:


pip/pip3

Pip and its Python 3 equivalent pip3 are installation tools for the Python Package Index (PyPI). They allow you to install modules or programs that are not yet installed on the system. You can search the package index at https://pypi.org/

For instance, if you want to install tensorflow, do:

 module load devtoolset/8
 pip3 install --user tensorflow

Pip3 will then download tensorflow and install it together with its dependencies, compiling them where necessary. When it has finished, you can check tensorflow's version with:

 [feverdij@hpc12:~]$ python3
 Python 3.6.8 (default, Apr  2 2020, 13:34:55) 
 [GCC 4.8.5 20150623 (Red Hat 4.8.5-39)] on linux
 Type "help", "copyright", "credits" or "license" for more information.
 >>> import tensorflow as tf
 >>> tf.__version__
 '1.14.0'

pip3 list gives a list of locally installed modules.

Uninstalling pip modules can be done with:

 pip3 uninstall <name of module>

Note that pip will sometimes install different versions of system modules such as numpy or scipy. Since locally pip-installed modules take precedence over the system ones, code developed against the native system modules may then run into problems.

Also, if you need to install multiple programs and modules, their dependencies can conflict with each other. Sometimes these conflicts are not easily resolvable, which means you need to up- or downgrade your modules.
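
To see which copy of a module Python would actually load, a minimal sketch such as the following may help. It assumes Python 3 and uses only the standard library; the helper names module_origin and is_user_install are hypothetical and not part of pip. It relies on 'pip3 install --user' placing modules in the per-user site-packages directory reported by site.getusersitepackages().

```python
import importlib.util
import os
import site


def module_origin(name):
    """Return the file a top-level module would be imported from, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


def is_user_install(name):
    """True if the module would be loaded from the per-user site-packages directory."""
    origin = module_origin(name)
    user_site = site.getusersitepackages()
    return origin is not None and origin.startswith(user_site + os.sep)


if __name__ == "__main__":
    # numpy and scipy are only examples; any module name can be checked.
    for mod in ("numpy", "scipy"):
        print(mod, module_origin(mod), is_user_install(mod))
```

If is_user_install reports True for a module that the system also provides, the pip-installed copy is the one shadowing the system version.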


virtualenv/venv

Virtualenv and venv (for python3) solve pip dependency problems by creating a separate environment for a python program. Each virtual environment is installed in its own directory. To use it, you activate that environment.

Let's try to install pytorch. First create a virtual environment:

 virtualenv pytorch

Then activate it:

 source pytorch/bin/activate

When activated, you see the environment in brackets:

 (pytorch) [feverdij@hpc12:~]$ 

Inside the environment, you can use pip to install pytorch:

 pip install future torch torchvision

and check it with:

 (pytorch) [feverdij@hpc12:pytorch]$ python
 Python 2.7.5 (default, Aug  7 2019, 00:51:29) 
 [GCC 4.8.5 20150623 (Red Hat 4.8.5-39)] on linux2
 Type "help", "copyright", "credits" or "license" for more information.
 >>> import torch
 >>> print torch.__version__
 1.4.0

If you need to return to your normal python environment, do:

 deactivate
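
Whether a script is running inside an activated virtual environment can also be checked from Python itself. The following is a minimal sketch; the function name in_virtualenv is hypothetical. It assumes that venv and recent virtualenv releases make sys.prefix differ from sys.base_prefix, while older virtualenv releases set sys.real_prefix instead.

```python
import sys


def in_virtualenv():
    """True when the interpreter runs inside a venv/virtualenv environment."""
    # venv and recent virtualenv: sys.prefix points at the environment,
    # sys.base_prefix at the interpreter it was created from.
    base = getattr(sys, "base_prefix", sys.prefix)
    # Older virtualenv releases mark the environment with sys.real_prefix.
    return hasattr(sys, "real_prefix") or sys.prefix != base


if __name__ == "__main__":
    print("inside a virtual environment:", in_virtualenv())
    print("sys.prefix:", sys.prefix)
```

Run with the environment activated, this should report True; after 'deactivate' it should report False.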

conda and miniconda

Another way to create virtual environments is conda. Conda is the package and environment manager that ships with the Anaconda distribution, which bundles a complete Python environment with a large set of preinstalled packages. A minimal/bare version is miniconda, which installs only conda, Python and their required dependencies. For installing and using specific python packages, miniconda is preferred because it uses less disk space than the full Anaconda distribution.

To install miniconda, download the installer from the anaconda website:

 wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh

Then execute it:

 sh Miniconda3-latest-Linux-x86_64.sh

Accept the license agreement, then choose an installation directory:

 Miniconda3 will now be installed into this location:
 /home/feverdij/miniconda3
 - Press ENTER to confirm the location
 - Press CTRL-C to abort the installation
 - Or specify a different location below

The default should be fine. When the installer is finished, it asks:

 Do you wish the installer to initialize Miniconda3
 by running conda init? [yes|no] yes

If you answer 'yes', the installer will modify your shell startup script so that the conda environment is started every time you log into the cluster. It is recommended to say 'yes'. After a logout/login the conda command should be available. If this is not the case and you are using bash as your shell, there may be a problem sourcing ~/.bashrc. As a workaround, you can copy '~/.bashrc' to '~/.profile' and logout/login again.

If you see '(base)' in your prompt, then conda is installed and ready to use.

To create a new conda environment with, for instance, the pylops package, do:

 conda create -n pylops -c conda-forge pylops

and activate the environment with:

 conda activate pylops

Once the installation has finished, verify that your package is installed and your environment is ready to use:

 (base) [feverdij@hpc06:~]$ conda activate pylops
 (pylops) [feverdij@hpc06:~]$ python3
 Python 3.8.10 | packaged by conda-forge | (default, May 11 2021, 07:01:05) 
 [GCC 9.3.0] on linux
 Type "help", "copyright", "credits" or "license" for more information.
 >>> import pylops
 >>> pylops.__version__
 '1.13.0'
 >>> 

and leave the environment with:

 conda deactivate
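
Besides looking at the prompt, the active environment can be read from within Python. The sketch below assumes that 'conda activate' exports the environment variables CONDA_DEFAULT_ENV (the environment name) and CONDA_PREFIX (its directory); the function name active_conda_env is hypothetical.

```python
import os


def active_conda_env(environ=None):
    """Return (name, prefix) of the active conda environment, or None if none is active."""
    env = os.environ if environ is None else environ
    name = env.get("CONDA_DEFAULT_ENV")
    prefix = env.get("CONDA_PREFIX")
    if not name or not prefix:
        return None
    return name, prefix


if __name__ == "__main__":
    print(active_conda_env())
```

After 'conda activate pylops' this should return the name 'pylops' together with the directory of that environment; when no environment is active it returns None.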

A specific python version can also be installed using conda:

 conda create -n mypython python=3.9
 conda activate mypython
 (mypython) [feverdij@hpc25:~]$ python3
 Python 3.9.16 (main, May 15 2023, 23:46:34) 
 [GCC 11.2.0] :: Anaconda, Inc. on linux
 Type "help", "copyright", "credits" or "license" for more information.
 >>> 

Using conda in a batch script

When submitting batch jobs which require the use of conda environments, you need to tell the node(s) where the conda program can be found before you can activate the desired environment.

The easiest way to do this is to not rely on 'conda init'. Instead, put the following line in your batch script:

 source $HOME/miniconda3/etc/profile.d/conda.sh

This path assumes that you installed miniconda in the default location. After that line, you can activate your environment with 'conda activate'.
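
A batch job that forgets to activate its environment often fails late, at the first missing import. As a minimal sketch, a Python program started from a batch script could verify early on that its interpreter belongs to the activated conda environment. The function names interpreter_in_env and job_uses_active_env are hypothetical, and the check assumes that 'conda activate' sets CONDA_PREFIX to the directory of the environment.

```python
import os
import sys


def interpreter_in_env(executable, prefix):
    """True if the interpreter path lies inside the given environment directory."""
    if not prefix:
        return False
    # Resolve symlinks on both sides so that a symlinked home directory
    # does not cause a false mismatch.
    exe = os.path.realpath(executable)
    root = os.path.realpath(prefix)
    return os.path.commonpath([exe, root]) == root


def job_uses_active_env():
    """Check the running interpreter against CONDA_PREFIX from the job's environment."""
    return interpreter_in_env(sys.executable, os.environ.get("CONDA_PREFIX"))


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("inside active conda environment:", job_uses_active_env())
```

In a job's entry point, calling sys.exit with a short message when job_uses_active_env() returns False makes a missing 'conda activate' fail right at the start of the job.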