Saturday, July 2, 2022

Apache + mod_wsgi + Python virtual environments within Plesk Obsidian on Ubuntu 18.04 LTS

This post describes a way to set up Plesk for serving Python web applications.

Plesk is an excellent tool for managing web servers. It works perfectly for PHP, and it integrates with many tools and plugins. It is a good option for delegating administration and management roles to website and web application managers, building all the needed isolation between the different contexts (domains). Plesk is a commercial tool, available for Windows and for Linux.

I am using it on Ubuntu Server 18.04 LTS.
With Plesk, a single machine can run hundreds of websites, each with its own PHP version and with independent, per-site package and dependency management.

Most of Plesk's power comes from a very tight and careful integration between Plesk itself, its management interfaces, and the underlying installation of Apache and Nginx, two of the main components of its tool stack. On the back end, Plesk is tightly integrated with MySQL/MariaDB.

As a professional tool, Plesk does its job well, and it excels at managing LAMP stack apps such as WordPress, for which it can detect plugins and manage the related patches and updates.

Typical layers
In its typical layering, here are the main components sitting on the information flow between the client and the data.

Client, internet-firewall, nginx/apache, PHP interpreter & user-code, database


Our plesk installation
In our use case, Plesk serves customer websites (mostly built with WordPress), some custom applications (built with proprietary PHP frameworks), and some old traditional static HTML websites.
We also have some contexts where old Perl cgi-bin applications are still running.
We run two Plesk servers (actually virtual machines on a VMware infrastructure), which we regularly update and periodically reinstall on the most up-to-date system platforms. One of our servers is currently running on an aging and unsupported CentOS 7. The newer one runs on Ubuntu Linux 18.04 LTS. The Plesk version is the same on both servers.

Apache and python using CGI interface 
The easiest way to run Python code to serve a web application is CGI. Apache handles CGI through its own CGI module (mod_cgi or mod_cgid); the old mod_python module, which embeds the interpreter in Apache, is a separate mechanism. CGI is old and traditional. Apache receives the request and spawns a process running the interpreted code to generate the reply. Parameters are passed to the interpreted code through the environment. This process is simple, but it is not efficient for frequent exchanges between client and server, because a new process has to be created and then destroyed to build each reply.
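
As a minimal sketch of the CGI model described above (the `name` query parameter and the function name `build_response` are hypothetical, chosen only for illustration), a Python CGI script reads the request data from environment variables and writes the headers, a blank line, and the body to standard output:

```python
#!/usr/bin/env python3
"""Minimal CGI script: Apache starts one new process like this per request."""
import html
import os
import sys
from urllib.parse import parse_qs


def build_response(environ):
    """Build the full CGI reply (headers, blank line, body) from the environment."""
    # Apache passes the part of the URL after '?' in the QUERY_STRING variable.
    params = parse_qs(environ.get("QUERY_STRING", ""))
    name = params.get("name", ["world"])[0]
    # Escape user input before placing it in HTML.
    body = "<html><body><p>Hello, {}!</p></body></html>".format(html.escape(name))
    return "Content-Type: text/html; charset=utf-8\r\n\r\n" + body


if __name__ == "__main__":
    # The process ends right after writing the reply; the next request starts a new one.
    sys.stdout.write(build_response(os.environ))
```

Every request pays the cost of starting the interpreter and importing the modules again, which is the overhead that WSGI removes.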

Apache and python using WSGI interface
To avoid the burden of re-spawning a process for every request, the WSGI standard was developed. In this case, the interpreter is launched once and stays in memory. Apache invokes a specific function every time a request comes from the client, and that function returns the dynamic HTML to be sent back to the client. This requires the Apache module mod_wsgi, which has to be compiled specifically for the Python interpreter it interacts with (the module links against that interpreter's library and calls into it in memory). In daemon mode, mod_wsgi also watches the WSGI script file and automatically reloads the interpreter and the code when that file changes.
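
The "specific function" mentioned above is, by mod_wsgi's default convention, a callable named `application`. A minimal sketch of such a callable follows; the HTML it returns is only illustrative:

```python
import html


def application(environ, start_response):
    """WSGI entry point: called once per request by the already running interpreter."""
    # 'environ' is a dict carrying the request data (method, path, query string, ...).
    method = environ.get("REQUEST_METHOD", "GET")
    path = environ.get("PATH_INFO", "")
    text = "<html><body><p>{} {}</p></body></html>".format(method, html.escape(path))
    body = text.encode("utf-8")

    # 'start_response' receives the status line and the list of response headers.
    start_response("200 OK", [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    # The return value is an iterable of byte strings forming the response body.
    return [body]
```

Outside Apache, the same callable can be tried with the standard library's wsgiref.simple_server, which is handy for a quick check before touching the server configuration.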

Python virtual environments
Python projects tend to use libraries, and projects that rely on many libraries become complex to manage because of intricate dependency problems. For this reason, Python virtual environments were created. Python 3 ships a specific module, venv, as part of its standard library, and it is very well integrated. Another tool, virtualenv, works with both Python 2 and Python 3 and is more common where Python 2 is still in use. These tools have to be installed for the main Python at the operating-system level, by the root user.
Once this is done, each non-privileged user can create her own python virtual environments, for each project, keeping multiple library versions as needed, without dependency conflicts, and without impacting the root python installation. 
Python virtual environments are not to be confused with Apache virtual servers.
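
To see which environment a given interpreter is running from, a small check can help. This is a sketch; the function name `in_virtual_environment` is hypothetical, while `sys.prefix` and `sys.base_prefix` are standard attributes (older virtualenv releases set `sys.real_prefix` instead):

```python
import sys


def in_virtual_environment():
    """Return True when the interpreter runs from a venv/virtualenv environment."""
    # Older virtualenv releases record the original installation in sys.real_prefix.
    # venv (and newer virtualenv) keep it in sys.base_prefix.
    base = getattr(sys, "real_prefix", None) or getattr(sys, "base_prefix", sys.prefix)
    # Inside an environment, sys.prefix points at the environment folder instead.
    return sys.prefix != base


if __name__ == "__main__":
    print("prefix:", sys.prefix)
    print("inside a virtual environment:", in_virtual_environment())
```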

How Plesk isolates web production environments
Plesk relies on Linux users and file permissions, and on Apache virtual hosts, to create compartments that isolate the websites from each other.
Specifically, each user who owns or manages a website has a home folder, and there are Apache virtual host settings specifying the different web folders for code and media files.
Each user can manage the Apache settings within his virtual host, and can control the PHP/HTML/CSS/JavaScript (etc.) code of his web applications, having full access only to his home folder and its subfolders.

Apache+wsgi+user-level-python virtual environments for python web code, within plesk
The Apache modules mod_python and mod_wsgi are incompatible, so in order to enable mod_wsgi you need to disable, and possibly uninstall, mod_python. The modules are to be installed as root, via apt, from the official repositories of your distribution. This takes care of matching the Python version with the compile options used to build mod_wsgi.

# apt install libapache2-mod-wsgi-py3

Check that mod_wsgi is among the enabled modules. No configuration is required at this stage.
As root, I installed the main Python environment, together with the basic pip tools and the venv virtual environment component for Python 3:

# apt install python3
# apt install python3-pip
# apt install python3-venv
# python3 -m pip install pip --upgrade

After these actions, I enabled interactive shell access for my user in the Plesk administration (users normally do not need shell access).
From the Plesk administration, check that mod_wsgi is selected (Tools & Settings / Apache settings).

I switched to the unprivileged user and created a new Python virtual environment from the home folder.

$ python3 -m venv macs
$ source ./macs/bin/activate

(macs) $ python -m pip install pip --upgrade
(macs) $ python -m pip install qrcode
(macs) $ python -m pip install pillow

In the new environment python -m pip is used to perform library and package installations, from python repositories. Here I just installed qrcode and pillow libraries. These installed components end up in ~/macs/lib/python3.6/site-packages

In compliance with Plesk conventions, the full path of the virtual environment macs is:

/var/www/vhosts/<mydomain>/macs

Files in this user folder have <user> as owner and psacln as group. The Python virtual environment folder is separate from the folder where the Python web code lives.
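
Since the isolation depends on ownership, it can be worth verifying that nothing in the environment folder ended up with the wrong owner (for example after running a command as root by mistake). The following is a sketch under that assumption; the helper names are hypothetical, and the owner and group to expect are the site user and psacln:

```python
import grp
import os
import pwd


def owner_and_group(path):
    """Return (owner name, group name) of a path, without following symlinks."""
    info = os.lstat(path)
    try:
        owner = pwd.getpwuid(info.st_uid).pw_name
    except KeyError:
        owner = str(info.st_uid)  # uid without a passwd entry
    try:
        group = grp.getgrgid(info.st_gid).gr_name
    except KeyError:
        group = str(info.st_gid)  # gid without a group entry
    return owner, group


def wrong_ownership(root, expected_owner, expected_group):
    """Yield every path below root whose owner or group differs from the expected pair."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if owner_and_group(path) != (expected_owner, expected_group):
                yield path
```

For the environment of this post, a call could look like `list(wrong_ownership("/var/www/vhosts/<mydomain>/macs", "<user>", "psacln"))`, where an empty list means everything is owned as expected.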


Creating Python webcode folder:
This is the place where the Python application code will go:
(macs) $ mkdir /var/www/vhosts/<mydomain>/httpdocs/python
Here is a basic WSGI-compliant Python application, generating a QR code, that I saved in qr.py.
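
A minimal sketch of what such a qr.py could look like follows. It is an illustration built on the qrcode and pillow packages installed above, not necessarily the original file. The `text` query parameter, the default text, and the helper names are assumptions introduced for this example.

```python
import io
from urllib.parse import parse_qs

# Hypothetical default, used when the URL carries no ?text=... parameter.
DEFAULT_TEXT = "https://www.python.org"


def text_from_environ(environ):
    """Extract the text to encode from the request's query string."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    values = query.get("text")
    return values[0] if values else DEFAULT_TEXT


def render_qr_png(text):
    """Return the PNG bytes of a QR code encoding the given text."""
    # Imported here because qrcode lives in the virtual environment,
    # not in the system-wide Python installation.
    import qrcode

    buffer = io.BytesIO()
    qrcode.make(text).save(buffer, format="PNG")
    return buffer.getvalue()


def application(environ, start_response):
    """WSGI entry point looked up by mod_wsgi in this file."""
    png = render_qr_png(text_from_environ(environ))
    start_response("200 OK", [
        ("Content-Type", "image/png"),
        ("Content-Length", str(len(png))),
    ])
    return [png]
```

With this sketch, a request such as /python/qr.py?text=hello would return a PNG image encoding the word hello.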




Adjusting virtual-host settings for apache, within plesk administration
These settings allow Apache to serve the Python-generated data appropriately and to connect to the running instances of Python.
They are to be placed in the additional Apache directives, under Domains/<mydomain>/Hosting&DNS/Apache&Nginx settings.
I am listing just the HTTPS section, because this is the protocol that I am using.
The first Location block restricts web access to internal addresses only.
The ScriptAlias for cgi-bin allows some Perl components to keep working in CGI mode alongside the Python code (as explained, mod_python and mod_wsgi cannot be enabled together, so the Python code here is served only through WSGI).

<Location "/">
Order Deny,Allow
Deny from all
Allow from 172.16.0.0/12
</Location>
ScriptAlias "/cgi-bin/" "/var/www/vhosts/mydomain/httpdocs/cgi-bin/"

<IfModule mod_wsgi.c>
WSGIScriptAlias /python/ /var/www/vhosts/mydomain/httpdocs/python/
WSGIDaemonProcess macs user=u_macs group=psacln threads=5 python-home=/var/www/vhosts/mydomain/macs
WSGIProcessGroup macs
WSGIApplicationGroup %{GLOBAL}
<Directory /var/www/vhosts/mydomain/httpdocs/python>
Require all granted
</Directory>
</IfModule>

These settings allow URLs like https://mydomain/python/appcode.py to be processed via WSGI, loading and executing /var/www/vhosts/mydomain/httpdocs/python/appcode.py

Once done, I can access my new wsgi application, served by plesk, accessing its URL:
https://mydomain/python/qr.py
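
To confirm that the daemon process really runs inside the virtual environment selected by python-home, a small diagnostic WSGI script can be dropped next to qr.py. This is a sketch: the file name (say, whereami.py) is hypothetical, and the two `mod_wsgi.*` keys are read with a fallback in case they are not present in the request environment.

```python
import sys


def application(environ, start_response):
    """Report which Python installation and mod_wsgi groups serve this request."""
    lines = [
        "python version: " + sys.version.split()[0],
        # With python-home set, this should be the virtual environment folder.
        "sys.prefix: " + sys.prefix,
        "process group: " + str(environ.get("mod_wsgi.process_group", "(not set)")),
        "application group: " + str(environ.get("mod_wsgi.application_group", "(not set)")),
    ]
    body = "\n".join(lines).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

If the configuration above is in effect, the reported prefix should be /var/www/vhosts/mydomain/macs and the process group should be macs. A script like this is best removed once the check is done.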






Conclusions
I am happy to have been able to use Plesk for serving Python web code, including the use of Python virtual environments.

I hope that Plesk will certify and better support this possibility in the future. I consider it potentially very interesting for data science.

In the coming months, I will build a new Plesk server based on Ubuntu Server 20.04 LTS, and I will update this document if I find anything relevant.