         Distributed logging with systemd, journald, and eliot
________________________________________________________________________

It's hard to debug distributed systems.  One possible solution is to use
the Python library [eliot] to trace control flow across the system.  The
problem is that eliot is set up primarily to write to local files, which
is fine for local systems, but not for systems spread across multiple
machines.

This guide covers how I consolidated logs to process on a local machine.


                            Terms and scope
________________________________________________________________________

For the purpose of this document, I need to define a few terms.

The [sink] is the machine that logs will be forwarded to.  It collects
all of the logs that are pushed to it.  There should be only one sink.

A [source] is one machine that is forwarding logs.  There may be
multiple sources.

This solution also has a limited scope.

The sink must be reachable from every source (i.e. it has an IP address
that the sources can connect to).

The sink and sources will need to be configured to communicate via HTTP.
HTTPS is possible but is not something I bothered with.

--------8<--------------------------------------------------------------
sink$ uname -a
Linux sink 4.4.0-62-generic #83-Ubuntu SMP Wed Jan 18 14:10:15 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
sink$ systemd --version
systemd 229
+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ -LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN
sink$ python3.7 --version
Python 3.7.4
sink$ python3.7 -m pip freeze | grep eliot
eliot==1.11.0
eliot-tree==19.0.0

source$ uname -a
Linux source 4.14.154-128.181.amzn2.x86_64 #1 SMP Sat Nov 16 21:49:00 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
source$ systemctl --version
systemd 219
+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 -SECCOMP +BLKID +ELFUTILS +KMOD +IDN
-------->8--------------------------------------------------------------


                              How it works
________________________________________________________________________

On the sink, [systemd-journal-remote] starts an HTTP server on port
19532.  This server listens for POST requests to the [/upload] endpoint
with a Content-Type of [application/vnd.fdo.journal] .  When it gets a
request, it adds it to a log file located in [/var/log/journal/remote/]
with a filename corresponding to the hostname of the remote host like
[remote-1.2.3.4.journal] .

On the source, [systemd-journal-upload] listens for changes to the jour-
nal and for every new log entry, a POST request is made to the sink.
Separately, a Python script is run that produces eliot logs which are
first written to the journal, and then mirrored to the sink.

Back on the sink, these log files are read and filtered by [journalctl]
.  These are then piped to [eliot-prettyprint] or [eliot-tree] .
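
As a rough illustration of the upload request described above, here is a
minimal sketch that builds (but does not send) such a request in Python.
The serialization is a simplified take on the Journal Export Format, and
the names export_entry and build_upload_request, as well as the sink
URL, are made up for this example.  The real work is done by
systemd-journal-upload.

```python
import struct
import urllib.request


def export_entry(fields):
    """Serialize one journal entry (simplified Journal Export Format)."""
    out = bytearray()
    for name, value in fields.items():
        data = value if isinstance(value, bytes) else str(value).encode("utf-8")
        if b"\n" in data:
            # Binary-safe form: name, newline, 64-bit little-endian
            # length, the data itself, then a newline.
            out += name.encode("ascii") + b"\n"
            out += struct.pack("<Q", len(data)) + data + b"\n"
        else:
            # Plain form: NAME=value
            out += name.encode("ascii") + b"=" + data + b"\n"
    # A blank line marks the end of the entry.
    return bytes(out) + b"\n"


def build_upload_request(sink_url, fields):
    """Build (but do not send) a POST request to the /upload endpoint."""
    return urllib.request.Request(
        sink_url.rstrip("/") + "/upload",
        data=export_entry(fields),
        headers={"Content-Type": "application/vnd.fdo.journal"},
        method="POST",
    )


# Hypothetical sink address, using the default port mentioned above.
request = build_upload_request(
    "http://sink.example.com:19532",
    {"MESSAGE": "hello", "SYSLOG_IDENTIFIER": "test.py"},
)
print(request.get_method(), request.full_url)
```

Nothing here touches the network, so it is safe to run anywhere.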


                              Setup: Sink
________________________________________________________________________

On the sink, install [systemd-journal-remote] or
[systemd-journal-gateway] for [apt] -based or [yum] -based systems,
respectively.  We need to edit the file
[/lib/systemd/system/systemd-journal-remote.service] and change the
[--listen-https=-3] flag to [--listen-http=-3] .

--------8<--------------------------------------------------------------
#  This file is part of systemd.
#
#  systemd is free software; you can redistribute it and/or modify it
#  under the terms of the GNU Lesser General Public License as published by
#  the Free Software Foundation; either version 2.1 of the License, or
#  (at your option) any later version.

[Unit]
Description=Journal Remote Sink Service
Documentation=man:systemd-journal-remote(8) man:journal-remote.conf(5)
Requires=systemd-journal-remote.socket

[Service]
ExecStart=/lib/systemd/systemd-journal-remote \
        --listen-http=-3 \
        --output=/var/log/journal/remote/
User=systemd-journal-remote
Group=systemd-journal-remote
PrivateTmp=yes
PrivateDevices=yes
PrivateNetwork=yes
WatchdogSec=3min

[Install]
Also=systemd-journal-remote.socket
-------->8--------------------------------------------------------------

We then need to enable and start the services.

--------8<--------------------------------------------------------------
sink$ sudo systemctl enable systemd-journal-remote
sink$ sudo systemctl start systemd-journal-remote
-------->8--------------------------------------------------------------

If you get an exit code of 1 and no log message, there could be two
problems.  One is that the [--listen-https=-3] flag wasn't changed to
http, and it's failing when trying to access the HTTPS certificate.  An-
other is that the [/var/log/journal/remote] folder didn't exist, or
didn't have the right permissions.  The error message you get might look
like this.

--------8<--------------------------------------------------------------
sink$ sudo systemctl status systemd-journal-remote.service
* systemd-journal-remote.service - Journal Remote Sink Service
   Loaded: loaded (/lib/systemd/system/systemd-journal-remote.service; indirect; vendor preset: enabled)
   Active: failed (Result: exit-code) since Tue 2020-01-14 12:33:02 CST; 2s ago
     Docs: man:systemd-journal-remote(8)
           man:journal-remote.conf(5)
  Process: 120042 ExecStart=/lib/systemd/systemd-journal-remote --listen-https=-3 --output=/var/log/journal/remote/ (code=exited, status=1/FAILURE)
 Main PID: 120042 (code=exited, status=1/FAILURE)

Jan 14 12:33:02 sink systemd[1]: Started Journal Remote Sink Service.
Jan 14 12:33:02 sink systemd[1]: systemd-journal-remote.service: Main process exited, code=exited, status=1/FAILURE
Jan 14 12:33:02 sink systemd[1]: systemd-journal-remote.service: Unit entered failed state.
Jan 14 12:33:02 sink systemd[1]: systemd-journal-remote.service: Failed with result 'exit-code'.
-------->8--------------------------------------------------------------

If the folder didn't exist, it can be created with these commands.

--------8<--------------------------------------------------------------
sink$ sudo mkdir /var/log/journal/remote
sink$ sudo chown systemd-journal-remote:systemd-journal-remote /var/log/journal/remote
-------->8--------------------------------------------------------------

This might not persist across reboots.  You could add an entry for the
folder to a tmpfiles.d configuration file.  I haven't tested this yet so
I don't know if it's needed.


                             Setup: Source
________________________________________________________________________

On the source, install [systemd-journal-remote] or
[systemd-journal-gateway] for [apt] -based or [yum] -based systems,
respectively.

We need to configure the [/etc/systemd/journal-upload.conf] file to
point towards our sink.

--------8<--------------------------------------------------------------
[Upload]
URL=http://sink.example.com
# URL=
# ServerKeyFile=/etc/ssl/private/journal-upload.pem
# ServerCertificateFile=/etc/ssl/certs/journal-upload.pem
# TrustedCertificateFile=/etc/ssl/ca/trusted.pem
-------->8--------------------------------------------------------------

After this, you need to enable and start the service.

--------8<--------------------------------------------------------------
source$ sudo systemctl enable systemd-journal-upload
source$ sudo systemctl restart systemd-journal-upload
-------->8--------------------------------------------------------------

You can then test that the forwarding is working by using [systemd-cat]
like this.

--------8<--------------------------------------------------------------
source$ echo hello | systemd-cat
-------->8--------------------------------------------------------------

Then on the sink, check that a file was created in [/var/log/journal/re-
mote/] .
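
If you would rather check from Python, a minimal sketch could look like
this.  The function name remote_journals is made up for this example,
and it assumes the journal files end in .journal as described earlier.
Reading the real folder may require root.

```python
from pathlib import Path


def remote_journals(directory="/var/log/journal/remote"):
    """Return the *.journal files in the directory, sorted by name."""
    root = Path(directory)
    if not root.is_dir():
        # The folder has not been created (or is somewhere else).
        return []
    return sorted(p for p in root.glob("*.journal") if p.is_file())


if __name__ == "__main__":
    for path in remote_journals():
        print(path.name, path.stat().st_size, "bytes")
```

An empty result means that either the folder is missing or no source has
uploaded anything yet.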


                          Producing eliot logs
________________________________________________________________________

First we need to install Python and then install eliot with the extra
[journald] dependencies.

--------8<--------------------------------------------------------------
source$ sudo yum install python37
source$ python3.7 -m pip install --user 'eliot[journald]'
-------->8--------------------------------------------------------------

Then we can create a [test.py] file that showcases some eliot logs.

--------8<--------------------------------------------------------------
"""
Write some logs to journald.
"""

from __future__ import print_function

from eliot import log_message, start_action, add_destinations
from eliot.journald import JournaldDestination

add_destinations(JournaldDestination())


def divide(a, b):
        with start_action(action_type="divide", a=a, b=b):
                return a / b

print(divide(10, 2))
log_message(message_type="inbetween")
print(divide(10, 0))
-------->8--------------------------------------------------------------

Then we run that file on the source like normal.

--------8<--------------------------------------------------------------
source$ python3.7 test.py
-------->8--------------------------------------------------------------


                           Showing eliot logs
________________________________________________________________________

On the sink, we can see all journal entries from the remote hosts using
[journalctl] with a flag to specify the root of our journal entries
[-D /var/log/journal/remote/] .

--------8<--------------------------------------------------------------
sink$ sudo journalctl -D /var/log/journal/remote/
-------->8--------------------------------------------------------------

We can narrow this down to just the eliot logs from our script using the
filter [SYSLOG_IDENTIFIER=test.py] .  To render them with eliot-tree, we
specify the output format [--output cat] so that journalctl prints only
the raw message text, and then pipe into eliot-tree like normal.

--------8<--------------------------------------------------------------
sink$ sudo journalctl -D /var/log/journal/remote/ --output cat SYSLOG_IDENTIFIER=test.py | eliot-tree --ascii --color never
f6cdd52b-babb-4a66-bcab-52a67bedee2b
+-- divide/1 > started 2020-01-14 19:45:49Z x 0.001s
    |-- a: 10
    |-- b: 2
    +-- divide/2 > succeeded 2020-01-14 19:45:49Z

eef20b8b-b1e9-4987-a840-4d53e9f0612f
+-- inbetween/1 2020-01-14 19:45:49Z

9092d9a9-11e8-4e63-bd9b-297915d5314a
+-- divide/1 > started 2020-01-14 19:45:49Z x 0.000s
    |-- a: 10
    |-- b: 0
    +-- divide/2 > failed 2020-01-14 19:45:49Z
        |-- exception: builtins.ZeroDivisionError
        +-- reason: division by zero
-------->8--------------------------------------------------------------
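
If eliot-tree isn't available, the raw messages can also be grouped by
hand.  The following is only a sketch: it assumes that each line printed
by [--output cat] is one eliot JSON message carrying the usual
task_uuid, task_level, action_type, and action_status fields.  The
sample lines are hand-written for illustration and are not real output.

```python
import json
from collections import defaultdict

# Hand-written sample input, shaped like eliot messages.
SAMPLE_LINES = [
    '{"task_uuid": "f6cdd52b", "task_level": [2],'
    ' "action_type": "divide", "action_status": "succeeded"}',
    '{"task_uuid": "f6cdd52b", "task_level": [1],'
    ' "action_type": "divide", "action_status": "started", "a": 10, "b": 2}',
    "this line is not JSON and gets skipped",
]


def group_by_task(lines):
    """Group eliot JSON messages by task_uuid, ordered by task_level."""
    tasks = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            message = json.loads(line)
        except ValueError:
            continue  # not JSON, so not an eliot message
        if isinstance(message, dict) and "task_uuid" in message:
            tasks[message["task_uuid"]].append(message)
    for messages in tasks.values():
        messages.sort(key=lambda m: m.get("task_level", []))
    return dict(tasks)


def summarize(message):
    """Return a one-line label such as 'divide (started)'."""
    name = message.get("action_type") or message.get("message_type", "?")
    status = message.get("action_status")
    return name if status is None else "{} ({})".format(name, status)


for task_uuid, messages in group_by_task(SAMPLE_LINES).items():
    print(task_uuid)
    for message in messages:
        print("   ", summarize(message))
```

To use it on real data, pass the lines printed by the journalctl command
above to group_by_task instead of SAMPLE_LINES.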


                               References
________________________________________________________________________

[0]: https://eliot.readthedocs.io/en/stable/outputting/journald.html
The main source of documentation for using eliot with journald.  Shows
how to use [JournaldDestination] .

[1]: https://serverfault.com/a/758559
Walks through how to configure and set up [systemd-journal-upload] and
[systemd-journal-remote] .

[2]: https://manpages.debian.org/testing/systemd-journal-remote/systemd-journal-remote.service.8.en.html
Man page for [systemd-journal-remote(8)] .

[3]: https://manpages.debian.org/testing/systemd-journal-remote/systemd-journal-upload.service.8.en.html
Man page for [systemd-journal-upload(8)] .

[4]: https://serverfault.com/a/573951
Shows how to use [systemd-cat(1)] .

[5]: https://unix.stackexchange.com/a/200107
Shows how to use the [-D /path/to/dir] flag.

[6]: https://github.com/jonathanj/eliottree
The main source of documentation and code for [eliot-tree] .
________________________________________________________________________

I'm a big proponent of using all the built-in tools in Python.  This
manifests in weird ways: I like the http.server module; I abuse multi-
processing.Manager; and I like using pure distutils.

The last one gives me problems sometimes because no one seems to use it
like I do.  I often find myself getting lost in Python's lackluster doc-
umentation for this module trying to just find the name of the argument
I need.  It's starting to get a little silly how often and how long I
spend on this problem.

Of course, one solution to this problem is "just use an IDE with auto-
completion."  Another is "just use setuptools which has better documen-
tation."  For various reasons, I prefer to use simpler editors and also
to use standard library functions.

To that end, this document is a consolidation of the things I'm usually
looking for.  It mostly serves as a reference for myself, but maybe it
will be helpful to someone else.

Before the document gets into the details, here are some links that are
helpful.

API documentation for distutils.core.setup
[0]: https://docs.python.org/3/distutils/apiref.html

Official examples for distutils.core.setup
[1]: https://docs.python.org/3/distutils/setupscript.html

List of valid classifier values
[2]: https://pypi.org/classifiers/

API documentation for pkgutil.get_data
[3]: https://docs.python.org/3/library/pkgutil.html#pkgutil.get_data


                          The Simplest Script
________________________________________________________________________

If you are creating a package called MYPACKAGE, your script would look
like this.

--------8<--------------------------------------------------------------
from distutils.core import setup

setup(
        name='MYPACKAGE',
        version='0.1.0',
        packages=[
                'MYPACKAGE',
        ],
)
-------->8--------------------------------------------------------------

You should have a directory called MYPACKAGE with at least an
__init__.py file inside.

--------8<--------------------------------------------------------------
/
  /setup.py
  /MYPACKAGE/
    /MYPACKAGE/__init__.py
-------->8--------------------------------------------------------------


                          Adding Dependencies
________________________________________________________________________

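If your package depends on other packages, you can declare them with the
requires keyword.  In this example, we require at least version 2.20.0
of requests.  Note that distutils expects the version qualifier in
parentheses, and that it only records the requirement as metadata: it
does not install anything.  Installing dependencies automatically needs
setuptools and its install_requires keyword instead.

--------8<--------------------------------------------------------------
from distutils.core import setup

setup(
        name='MYPACKAGE',
        version='0.1.0',
        packages=[
                'MYPACKAGE',
        ],
        requires=[
                'requests (>=2.20.0)',
        ],
)
-------->8--------------------------------------------------------------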


                          Adding an Executable
________________________________________________________________________

Edit: Oops.  This one does actually depend on setuptools.  If you
install the package with "pip install ." then pip will automatically use
setuptools for you, hence my confusion.

Sometimes you aren't trying to just create a library but also need an
actual executable.  Somewhat confusingly, in setuptools land, this is
called an "entry point."  If you want an executable called MYEXEC for
the package MYPACKAGE, then you'll want this.

--------8<--------------------------------------------------------------
# File: setup.py

from distutils.core import setup

setup(
        name='MYPACKAGE',
        version='0.1.0',
        packages=[
                'MYPACKAGE',
        ],
        entry_points={
                'console_scripts': [
                        'MYEXEC=MYPACKAGE.__main__:cli',
                ],
        },
)
-------->8--------------------------------------------------------------

You should have a __main__.py script with a cli function inside.  A sim-
ple __main__.py script looks like:

--------8<--------------------------------------------------------------
# File: __main__.py

from . import hello

def main(name):
        print(f'The output of hello({name!r}) is {hello(name)!r}')

def cli():
        import argparse

        parser = argparse.ArgumentParser()
        parser.add_argument('name')
        args = vars(parser.parse_args())

        main(**args)

if __name__ == '__main__':
        cli()
-------->8--------------------------------------------------------------

The corresponding __init__.py might look like:

--------8<--------------------------------------------------------------
# File: __init__.py

def hello(name):
        return name.upper()
-------->8--------------------------------------------------------------

Now after you install the package, you can use MYEXEC as a normal script
and pass it arguments like: MYEXEC George.

The directory structure here is:

--------8<--------------------------------------------------------------
/
  /setup.py
  /MYPACKAGE/
    /MYPACKAGE/__init__.py
    /MYPACKAGE/__main__.py
-------->8--------------------------------------------------------------
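
To get some intuition for the 'MYEXEC=MYPACKAGE.__main__:cli' string,
here is a rough sketch of how such a string can be resolved to a
function.  This is not the code that setuptools or pip actually
generate; resolve_entry_point is a made-up helper, and it uses
os.path:join as the target only because that is importable everywhere.

```python
import importlib


def resolve_entry_point(spec):
    """Split 'NAME=MODULE:FUNCTION' and import the function it names."""
    name, _, target = spec.partition("=")
    module_name, _, attr = target.partition(":")
    module = importlib.import_module(module_name.strip())
    func = module
    # Walk dotted attributes, e.g. 'SomeClass.method'.
    for part in attr.strip().split("."):
        func = getattr(func, part)
    return name.strip(), func


# Stand-in entry point; with the package installed you could pass
# 'MYEXEC=MYPACKAGE.__main__:cli' instead.
name, func = resolve_entry_point("joinpath=os.path:join")
print(name, func("a", "b"))
```

An installed MYEXEC script does something similar and then calls the
function it found with no arguments.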


                     Changing the Package Directory
________________________________________________________________________

Sometimes you want to put your code in a different directory than Python
expects.  By default, Python wants MYPACKAGE to be located at ./MYPACK-
AGE, but you can change this to look at ./src/MYPACKAGE or any other di-
rectory instead.

--------8<--------------------------------------------------------------
# File: setup.py

from distutils.core import setup

setup(
        name='MYPACKAGE',
        version='0.1.0',
        packages=[
                'MYPACKAGE',
        ],
        package_dir={
                'MYPACKAGE': 'src/MYPACKAGE',
        },
)
-------->8--------------------------------------------------------------

The directory structure now looks like:

--------8<--------------------------------------------------------------
/
  /setup.py
  /src/
    /src/MYPACKAGE/
      /src/MYPACKAGE/__init__.py
-------->8--------------------------------------------------------------


                       Including Extra Data Files
________________________________________________________________________

You may want to include some extra files with your package.  For in-
stance, I like to include an index.html page with my web server pack-
ages.

--------8<--------------------------------------------------------------
# File: setup.py

from distutils.core import setup

setup(
        name='MYPACKAGE',
        version='0.1.0',
        packages=[
                'MYPACKAGE',
        ],
        package_data={
                'MYPACKAGE': [
                        'static/*',
                ],
        },
)
-------->8--------------------------------------------------------------

Now any file in the static directory in your package will be included.

--------8<--------------------------------------------------------------
/
  /setup.py
  /MYPACKAGE/
    /MYPACKAGE/__init__.py
    /MYPACKAGE/static/
      /MYPACKAGE/static/index.html
-------->8--------------------------------------------------------------

To retrieve this file at runtime, you can use pkgutil.get_data:

--------8<--------------------------------------------------------------
# File: __init__.py

import pkgutil

index_html = pkgutil.get_data('MYPACKAGE', 'static/index.html')

print(type(index_html))
#  => bytes
-------->8--------------------------------------------------------------
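
Since pkgutil.get_data returns bytes (or None if the package can't be
located), a small wrapper can be handy when you want text.  This is just
a sketch; load_text is a name I made up, and UTF-8 is an assumption
about how the file was saved.

```python
import pkgutil


def load_text(package, resource, encoding="utf-8"):
    """Return a packaged data file as str instead of bytes."""
    data = pkgutil.get_data(package, resource)
    if data is None:
        # get_data returns None when the package cannot be located.
        raise FileNotFoundError(
            "{}: cannot load {!r}".format(package, resource)
        )
    return data.decode(encoding)


# Usage, assuming MYPACKAGE is installed:
# index_html = load_text('MYPACKAGE', 'static/index.html')
```

The resource path always uses forward slashes, whatever the operating
system.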


                         A More Complete Script
________________________________________________________________________

There are lots of parameters, but I suspect that the ones that are most
useful for actually distributing a package are as follows.

--------8<--------------------------------------------------------------
# File: setup.py

from distutils.core import setup

long_description = """\
This is the long description for my package.

It can be pretty long.

You can either use reStructuredText or GitHub Flavored Markdown.
"""

setup(
        name='MYPACKAGE',
        version='0.1.0',
        description="One line description",
        long_description=long_description,
        author='John Smith',
        author_email='johnsmith@example.com',
        url='https://github.com/example/MYPACKAGE',
        license='MIT',
        keywords=[
                'cool',
                'useful',
                'whatever',
        ],
        classifiers=[
                'Development Status :: 1 - Planning',
                'Programming Language :: Python :: 3.8',
        ],
        packages=[
                'MYPACKAGE',
        ],
        requires=[
                # distutils wants the version qualifier in parentheses
                'requests (>=2.20.0)',
        ],
)
-------->8--------------------------------------------------------------