Erlang style processes in Common Lisp

Discussion:

s***@sonic.net

2015-08-03 15:12:36 UTC

Creators of Erlang have a Lisp background, and one feature of the Erlang
VM (BEAM) that I'd like back-ported into Common Lisp is their process.

An Erlang "process" is cheap to create, cheap to destroy, cheap when
blocked, and upon exit performs bulk gc of its allocated memory; e.g.,
munmap().

Handling tens of thousands of requests per second per node isn't
uncommon, and these often have *several* workers per request or
connection: hundreds of thousands of processes. Under such scenarios,
anything less than this approach to lightweight processes might suffer
from stalls during long gc runs that would be avoided or significantly
reduced under Erlang's model.

How might we get equivalent cheap ephemeral processes into a
contemporary Common Lisp implementation?

For those unfamiliar with it, what Erlang means by "process" stems from
an implementation of Actor Model with caveats. Each process is more
co-routine than thread, and many may run within the same OS thread
managed by the Erlang scheduler. The BEAM VM pre-empts processes via
"reduction" counting, which may be understood as unit of VM time. Their
famed tag line, "Let it crash," may be loosely understood in CL terms as
an implicit HANDLER-CASE.

The open question here is to address a stated non-goal of CL-MUPROC, "we
rely on the CL implementation's MP system" and "considerably heavier
than Erlang processes". [See presentation link from
https://common-lisp.net/project/cl-muproc/ ]

Some Erlang-on-CL packages use OS threads or in the case of
Erlang-in-Lisp, fork(). While easier for a library author to implement,
these contradict the definition of cheap here. Other such packages punt
the issue altogether and instead focus only on pattern-matching aspects
of Erlang the language.

Then there's Lisp-Flavoured-Erlang, and Elixir offers hygienic macros
that manipulate the full AST properly-- all play nice on same Erlang VM
at runtime-- for people simply looking to avoid Erlang syntax or
semantics. But let's focus on cheap Erlang style processes in Common
Lisp here, please.

1. Semantics of library elements are straight-forward enough (e.g.,
SPAWN, EXIT) and may be borrowed wholesale and safely named within their
own package.

2. Memory allocation & garbage collection:

Erlang BEAM VM doesn't begin to reclaim memory until very late, so an
initial implementation here might assume ephemeral worker processes and
omit gc until exit of a worker. However, some processes are long-lived
in practice.

One compromise might be acceptable: one nursery per process, but
anything promoted to higher generations gets handled however your gc
does it now.

This states nothing about use of shared versus multiple heaps across OS
threads, so such matters may continue to be implementation-dependent.

3. Co-routines:

For something similar to Erlang's reductions, there would need to be a
measure of runtime complexity per process.

However, I've used single thread co-routines for near realtime systems
in C and CL with a loop of function pointers and ensuring that each
referenced function executes under some threshold determined through
experimentation and confirmed via test cases. No pre-empting needed.
While fragile and not easily portable across different hardware
(including same architecture), this may be acceptable for a preliminary
draft.

Using CL-MUPROC routines as an example and extending MUPROCN: perhaps
its body becomes this top-level list of functions from which
interleaving across processes may occur. Then add variations for playing
well with DO or LOOP semantics.

4. Message-passing:

SBCL's sb-concurrency extension with Queue and Mailbox (implementation
of "Optimistic FIFO Queue") can be the solution here too. More aligned
with CL principles, allowing for multiple mailboxes-- and therefore
priority messaging-- would be a welcome advancement beyond Erlang.
(Erlang allows only one mailbox per process: sequential but with pattern
matching, nothing out-of-band...)

An important implementation detail that simplifies gc and increases
cache locality across NUMA nodes: messages get duplicated when
delivered-- each process only references its own copy!

5. (Almost) nothing shared:

Erlang enforces strict prohibition against shared memory, and common
practice is to use an in-memory database (ETS) as key/value store in
lieu of globals. Scala allows but discourages shared memory.

A CL-inspired take on this might be to use SBCL's approach with thread
creation: upon creating a new process, you get: A) global special
values, but modify at own risk... B) LET bindings are local to each
process; C) threads don't inherit dynamic bindings from parent. i.e.,
http://sbcl.org/manual/index.html#Special-Variables

6. Scheduling processes on OS threads:

This is delicate in Erlang, and I've experienced specialized use cases
interfering with their scheduler when under heavy load. Instead for our
purposes, let the programmer handle the mapping of processes to OS
threads. Less is more here.

7. Finally, gen_server & OTP fiends may be implemented as their own
packages... or skipped entirely!

Thoughts on feasibility?

Thanks,
-Daniel

Scott McKay

2015-08-03 15:18:40 UTC