on the creation of compute-heavy scientific microservices
My research over the past few years has focused on scientific microservices. To be concrete, in this post I am using the following definitions:
Many of the most useful scientific tools are only usable as heavyweight, monolithic native applications. Some examples include ParaView, VisIt, and Tableau. Although these tools have improved and now offer a degree of “scriptability” and control, they are still designed for a single user working on their own computer. Because the tools are so heavy, everyone who wants to use them needs a powerful computer of their own. In an organization (whether a business or a school), it would be better to buy one really powerful server and let users borrow its compute resources.
Web services support this kind of resource sharing exceptionally well. In fact, ParaView has adopted this approach in its ParaViewWeb tool. Although that is very exciting for embedding visualization in many applications, it still falls short in an important respect: it is still intended for only one user per machine. One reason is that, although ParaView now communicates over HTTP, it is still monolithic under the hood and must be treated as such. Hence, it is not sufficient to have a “service”, because the service itself may still be too large.
Microservices have taken off across many companies and organizations. They differ from traditional services in that each microservice is responsible for a very small domain. For example, a traditional service may be responsible for users, payment processing, and the domain logic of an application, whereas a microservice solution would have at least three separate services: one for users, one for payments, and one for domain logic.
Exposing these scientific tools through a web server is nontrivial. They are often written in C/C++ with high-performance libraries that require specific environments to function. For example, a tool might use Open MPI, so its executables need to be run with mpirun(1) instead of just being exposed as a shared library.
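To make the mpirun(1) constraint concrete, here is a minimal sketch of how a Python process might launch such a tool as a child process rather than load it as a library. The helper names `build_mpirun_command` and `run_tool` and the executable path `./integrate` are hypothetical, chosen only for illustration; the sketch assumes `mpirun` is on the `PATH` and that the tool prints its result to standard output.

```python
import subprocess


def build_mpirun_command(executable, nprocs, args=()):
    """Return the argv list that launches `executable` under mpirun.

    `executable` and `args` are placeholders for whatever the real tool
    expects; nothing here is specific to a particular scientific code.
    """
    if nprocs < 1:
        raise ValueError("nprocs must be at least 1")
    # "-n" tells mpirun how many processes to start.
    return ["mpirun", "-n", str(nprocs), executable, *map(str, args)]


def run_tool(executable, nprocs, args=(), timeout=60):
    """Run the tool under mpirun and return whatever it wrote to stdout.

    Passing a list (not a shell string) avoids shell injection when the
    arguments originate from a web request.
    """
    cmd = build_mpirun_command(executable, nprocs, args)
    result = subprocess.run(
        cmd,
        capture_output=True,  # collect stdout/stderr instead of inheriting them
        text=True,            # decode bytes to str
        timeout=timeout,      # do not let a stuck job block the caller forever
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


if __name__ == "__main__":
    # Only builds and prints the command; nothing is executed here.
    print(build_mpirun_command("./integrate", 4, ["0", "1"]))
```

Keeping command construction separate from execution makes the first part easy to check without an MPI installation, which is useful when the web server and the compute environment are set up separately.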
This post primarily showcases different methods of operating scientific tools from Python behind a web server. For simplicity, the code samples target the Flask web framework and a quadratic integration method. Where possible, we try to support different functions, and in some cases we can even pass in a Python function that the tool calls directly, instead of pre-compiling a fixed set of functions.
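As a rough illustration of the two ways of choosing a function mentioned above, the sketch below uses a pure-Python midpoint rule as a stand-in for the real integration tool. It is an assumption made for illustration, not the method used later in the post. The names `FUNCTIONS`, `integrate`, and `integrate_named` are hypothetical: `integrate_named` models selecting from a fixed set of functions by name (as a web request would), while `integrate` accepts any Python callable directly.

```python
import math

# A fixed menu of integrands, selectable by name (e.g. from a URL parameter).
FUNCTIONS = {
    "square": lambda x: x * x,
    "sin": math.sin,
}


def integrate(f, a, b, n=1000):
    """Approximate the integral of f over [a, b] with the midpoint rule.

    `f` can be any Python callable taking and returning a float.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    h = (b - a) / n
    # Sample f at the midpoint of each of the n sub-intervals.
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def integrate_named(name, a, b, n=1000):
    """Look up an integrand by name; raises KeyError for unknown names."""
    return integrate(FUNCTIONS[name], a, b, n)


if __name__ == "__main__":
    # Named function, as a request handler might select it.
    print(integrate_named("square", 0.0, 1.0))
    # Arbitrary callable passed in directly.
    print(integrate(lambda x: 2.0 * x, 0.0, 3.0, n=10))
```

A web handler would typically only need to parse `name`, `a`, `b`, and `n` from the request and call `integrate_named`; passing an arbitrary callable is only possible when the caller runs in the same Python process as the tool.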
The methods showcased range from least to most effort and likewise from least to most performance: