Discussion:
Erlang style processes in Common Lisp
s***@sonic.net
2015-08-03 15:12:36 UTC
Permalink
Creators of Erlang have a Lisp background, and one feature of the Erlang
VM (BEAM) that I'd like back-ported into Common Lisp is their process.

An Erlang "process" is cheap to create, cheap to destroy, cheap when
blocked, and upon exit performs bulk gc of its allocated memory; e.g.,
munmap().

Handling tens of thousands of requests per second per node isn't
uncommon, and these often have *several* workers per request or
connection: hundreds of thousands of processes. Under such scenarios,
anything less than this approach to lightweight processes might suffer
from stalls during long gc runs that would be avoided or significantly
reduced under Erlang's model.


How might we get equivalent cheap ephemeral processes into a
contemporary Common Lisp implementation?


For those unfamiliar with it, what Erlang means by "process" stems from
an implementation of Actor Model with caveats. Each process is more
co-routine than thread, and many may run within the same OS thread
managed by the Erlang scheduler. The BEAM VM pre-empts processes via
"reduction" counting, which may be understood as unit of VM time. Their
famed tag line, "Let it crash," may be loosely understood in CL terms as
an implicit HANDLER-CASE.

The open question here is to address a stated non-goal of CL-MUPROC, "we
rely on the CL implementation's MP system" and "considerably heavier
than Erlang processes". [See presentation link from
https://common-lisp.net/project/cl-muproc/ ]

Some Erlang-on-CL packages use OS threads or in the case of
Erlang-in-Lisp, fork(). While easier for a library author to implement,
these contradict the definition of cheap here. Other such packages punt
the issue altogether and instead focus only on pattern-matching aspects
of Erlang the language.

Then there's Lisp-Flavoured-Erlang, and Elixir offers hygienic macros
that manipulate the full AST properly-- all play nice on same Erlang VM
at runtime-- for people simply looking to avoid Erlang syntax or
semantics. But let's focus on cheap Erlang style processes in Common
Lisp here, please.


1. Semantics of library elements are straight-forward enough (e.g.,
SPAWN, EXIT) and may be borrowed wholesale and safely named within their
own package.

2. Memory allocation & garbage collection:

Erlang BEAM VM doesn't begin to reclaim memory until very late, so an
initial implementation here might assume ephemeral worker processes and
omit gc until exit of a worker. However, some processes are long-lived
in practice.

One compromise might be acceptable: one nursery per process, but
anything promoted to higher generations gets handled however your gc
does it now.

This states nothing about use of shared versus multiple heaps across OS
threads, so such matters may continue to be implementation-dependent.

3. Co-routines:

For something similar to Erlang's reductions, there would need to be a
measure of runtime complexity per process.

However, I've used single thread co-routines for near realtime systems
in C and CL with a loop of function pointers and ensuring that each
referenced function executes under some threshold determined through
experimentation and confirmed via test cases. No pre-empting needed.
While fragile and not easily portable across different hardware
(including same architecture), this may be acceptable for a preliminary
draft.

Using CL-MUPROC routines as an example and extending MUPROCN: perhaps
its body becomes this top-level list of functions from which
interleaving across processes may occur. Then add variations for playing
well with DO or LOOP semantics.

4. Message-passing:

SBCL's sb-concurrency extension with Queue and Mailbox (implementation
of "Optimistic FIFO Queue") can be the solution here too. More aligned
with CL principles, allowing for multiple mailboxes-- and therefore
priority messaging-- would be a welcome advancement beyond Erlang.
(Erlang allows only one mailbox per process: sequential but with pattern
matching, nothing out-of-band...)

An important implementation detail that simplifies gc and increases
cache locality across NUMA nodes: messages get duplicated when
delivered-- each process only references its own copy!

5. (Almost) nothing shared:

Erlang enforces strict prohibition against shared memory, and common
practice is to use an in-memory database (ETS) as key/value store in
lieu of globals. Scala allows but discourages shared memory.

A CL-inspired take on this might be to use SBCL's approach with thread
creation: upon creating a new process, you get: A) global special
values, but modify at own risk... B) LET bindings are local to each
process; C) threads don't inherit dynamic bindings from parent. i.e.,
http://sbcl.org/manual/index.html#Special-Variables

6. Scheduling processes on OS threads:

This is delicate in Erlang, and I've experienced specialized use cases
interfering with their scheduler when under heavy load. Instead for our
purposes, let the programmer handle the mapping of processes to OS
threads. Less is more here.

7. Finally, gen_server & OTP fiends may be implemented as their own
packages... or skipped entirely!


Thoughts on feasibility?

Thanks,
-Daniel
Scott McKay
2015-08-03 15:18:40 UTC
Permalink
Have you looks at Lisp-Flavored Erlang ("LFE")?
- http://lfe.io

It's really quite interesting, IMO.

--Scott
Post by s***@sonic.net
Creators of Erlang have a Lisp background, and one feature of the Erlang
VM (BEAM) that I'd like back-ported into Common Lisp is their process.
An Erlang "process" is cheap to create, cheap to destroy, cheap when
blocked, and upon exit performs bulk gc of its allocated memory; e.g.,
munmap().
Handling tens of thousands of requests per second per node isn't uncommon,
and these often have *several* workers per request or connection: hundreds
of thousands of processes. Under such scenarios, anything less than this
approach to lightweight processes might suffer from stalls during long gc
runs that would be avoided or significantly reduced under Erlang's model.
How might we get equivalent cheap ephemeral processes into a contemporary
Common Lisp implementation?
For those unfamiliar with it, what Erlang means by "process" stems from an
implementation of Actor Model with caveats. Each process is more
co-routine than thread, and many may run within the same OS thread managed
by the Erlang scheduler. The BEAM VM pre-empts processes via "reduction"
counting, which may be understood as unit of VM time. Their famed tag
line, "Let it crash," may be loosely understood in CL terms as an implicit
HANDLER-CASE.
The open question here is to address a stated non-goal of CL-MUPROC, "we
rely on the CL implementation's MP system" and "considerably heavier than
Erlang processes". [See presentation link from
https://common-lisp.net/project/cl-muproc/ ]
Some Erlang-on-CL packages use OS threads or in the case of
Erlang-in-Lisp, fork(). While easier for a library author to implement,
these contradict the definition of cheap here. Other such packages punt
the issue altogether and instead focus only on pattern-matching aspects of
Erlang the language.
Then there's Lisp-Flavoured-Erlang, and Elixir offers hygienic macros that
manipulate the full AST properly-- all play nice on same Erlang VM at
runtime-- for people simply looking to avoid Erlang syntax or semantics.
But let's focus on cheap Erlang style processes in Common Lisp here, please.
1. Semantics of library elements are straight-forward enough (e.g., SPAWN,
EXIT) and may be borrowed wholesale and safely named within their own
package.
Erlang BEAM VM doesn't begin to reclaim memory until very late, so an
initial implementation here might assume ephemeral worker processes and
omit gc until exit of a worker. However, some processes are long-lived in
practice.
One compromise might be acceptable: one nursery per process, but anything
promoted to higher generations gets handled however your gc does it now.
This states nothing about use of shared versus multiple heaps across OS
threads, so such matters may continue to be implementation-dependent.
For something similar to Erlang's reductions, there would need to be a
measure of runtime complexity per process.
However, I've used single thread co-routines for near realtime systems in
C and CL with a loop of function pointers and ensuring that each referenced
function executes under some threshold determined through experimentation
and confirmed via test cases. No pre-empting needed. While fragile and
not easily portable across different hardware (including same
architecture), this may be acceptable for a preliminary draft.
Using CL-MUPROC routines as an example and extending MUPROCN: perhaps its
body becomes this top-level list of functions from which interleaving
across processes may occur. Then add variations for playing well with DO or
LOOP semantics.
SBCL's sb-concurrency extension with Queue and Mailbox (implementation of
"Optimistic FIFO Queue") can be the solution here too. More aligned with
CL principles, allowing for multiple mailboxes-- and therefore priority
messaging-- would be a welcome advancement beyond Erlang. (Erlang allows
only one mailbox per process: sequential but with pattern matching, nothing
out-of-band...)
An important implementation detail that simplifies gc and increases cache
locality across NUMA nodes: messages get duplicated when delivered-- each
process only references its own copy!
Erlang enforces strict prohibition against shared memory, and common
practice is to use an in-memory database (ETS) as key/value store in lieu
of globals. Scala allows but discourages shared memory.
A CL-inspired take on this might be to use SBCL's approach with thread
creation: upon creating a new process, you get: A) global special values,
but modify at own risk... B) LET bindings are local to each process; C)
threads don't inherit dynamic bindings from parent. i.e.,
http://sbcl.org/manual/index.html#Special-Variables
This is delicate in Erlang, and I've experienced specialized use cases
interfering with their scheduler when under heavy load. Instead for our
purposes, let the programmer handle the mapping of processes to OS
threads. Less is more here.
7. Finally, gen_server & OTP fiends may be implemented as their own
packages... or skipped entirely!
Thoughts on feasibility?
Thanks,
-Daniel
Stelian Ionescu
2015-08-03 15:34:42 UTC
Permalink
Post by s***@sonic.net
Creators of Erlang have a Lisp background, and one feature of the Erlang
VM (BEAM) that I'd like back-ported into Common Lisp is their process.
An Erlang "process" is cheap to create, cheap to destroy, cheap when
blocked, and upon exit performs bulk gc of its allocated memory; e.g.,
munmap().
Handling tens of thousands of requests per second per node isn't
uncommon, and these often have *several* workers per request or
connection: hundreds of thousands of processes. Under such scenarios,
anything less than this approach to lightweight processes might suffer
from stalls during long gc runs that would be avoided or significantly
reduced under Erlang's model.
How might we get equivalent cheap ephemeral processes into a
contemporary Common Lisp implementation?
In short, you need to write from scratch a new CL implementation. Current ones are not designed with the Erlang constraints in mind.
--
Stelian Ionescu a.k.a. fe[nl]ix
Quidquid latine dictum sit, altum videtur.
http://common-lisp.net/project/iolib
David McClain
2015-08-03 16:01:22 UTC
Permalink
It would be great to have such rapid fire threads. In addition to Erlang style, it could also support Concurrent ML style and Reppy Channels. Threads would be able to form spaghetti stacks and be quickly pared when dead. Could do a lot more speculative threading.

But that really does sound like a redesign of CL.

(I’m one of the authors of an Erlang-like system in Lisp, called Butterfly. The Scheme guys have one called Termite.)

- DM
Post by Stelian Ionescu
Post by s***@sonic.net
Creators of Erlang have a Lisp background, and one feature of the Erlang
VM (BEAM) that I'd like back-ported into Common Lisp is their process.
An Erlang "process" is cheap to create, cheap to destroy, cheap when
blocked, and upon exit performs bulk gc of its allocated memory; e.g.,
munmap().
Handling tens of thousands of requests per second per node isn't
uncommon, and these often have *several* workers per request or
connection: hundreds of thousands of processes. Under such scenarios,
anything less than this approach to lightweight processes might suffer
from stalls during long gc runs that would be avoided or significantly
reduced under Erlang's model.
How might we get equivalent cheap ephemeral processes into a
contemporary Common Lisp implementation?
In short, you need to write from scratch a new CL implementation. Current ones are not designed with the Erlang constraints in mind.
--
Stelian Ionescu a.k.a. fe[nl]ix
Quidquid latine dictum sit, altum videtur.
http://common-lisp.net/project/iolib <http://common-lisp.net/project/iolib>
Attila Lendvai
2015-08-03 19:36:05 UTC
Permalink
Post by Stelian Ionescu
Post by s***@sonic.net
How might we get equivalent cheap ephemeral processes into a
contemporary Common Lisp implementation?
In short, you need to write from scratch a new CL implementation. Current ones are not designed with the Erlang constraints in mind.
well, Nikodemus had some plans for green threads for SBCL and it
didn't sound like a rewrite.

and adding first class heaps would be a very useful addition with or
without any of the other erlang stuff. i don't know how hard that is,
but i assume it should be rather simple if the responsibility is
pushed over to the user to make sure that there are no dangling
pointers after destroying a heap.

or am i missing something?
--
• attila lendvai
• PGP: 963F 5D5F 45C7 DFCD 0A39
--
“The problem with socialism is that eventually you run out of other
people's money.”
— Margaret Thatcher
Faré
2015-08-03 19:46:50 UTC
Permalink
Post by Attila Lendvai
Post by Stelian Ionescu
Post by s***@sonic.net
How might we get equivalent cheap ephemeral processes into a
contemporary Common Lisp implementation?
In short, you need to write from scratch a new CL implementation. Current ones are not designed with the Erlang constraints in mind.
well, Nikodemus had some plans for green threads for SBCL and it
didn't sound like a rewrite.
and adding first class heaps would be a very useful addition with or
without any of the other erlang stuff. i don't know how hard that is,
but i assume it should be rather simple if the responsibility is
pushed over to the user to make sure that there are no dangling
pointers after destroying a heap.
or am i missing something?
Not Common Lisp (yet), but Racket has custodians for first-class
resource allocation pool, and all kinds of concurrency primitives.
Since it's a programming language designed to implement other
programming languages on top of it, it would make a great basis for a
"new" common lisp implementation.

—♯ƒ • François-René ÐVB Rideau •Reflection&Cybernethics• http://fare.tunes.org
There are a thousand hacking at the branches of evil
to one who is striking at the root. — Thoreau
David McClain
2015-08-03 20:04:18 UTC
Permalink
I did something similar several years ago for Butterfly, where I pre-instantiate a pool of threads into the “stable”, and utilize a Grand Central Dispatch (real and / or simulated) to run various jobs. No sense killing the horses after they run the racetrack.

But this is still a far cry (a few dozen threads) from what Erlang and CML provide (numbering in the thousands).

- DM
Post by Faré
Post by Attila Lendvai
Post by Stelian Ionescu
Post by s***@sonic.net
How might we get equivalent cheap ephemeral processes into a
contemporary Common Lisp implementation?
In short, you need to write from scratch a new CL implementation. Current ones are not designed with the Erlang constraints in mind.
well, Nikodemus had some plans for green threads for SBCL and it
didn't sound like a rewrite.
and adding first class heaps would be a very useful addition with or
without any of the other erlang stuff. i don't know how hard that is,
but i assume it should be rather simple if the responsibility is
pushed over to the user to make sure that there are no dangling
pointers after destroying a heap.
or am i missing something?
Not Common Lisp (yet), but Racket has custodians for first-class
resource allocation pool, and all kinds of concurrency primitives.
Since it's a programming language designed to implement other
programming languages on top of it, it would make a great basis for a
"new" common lisp implementation.
—♯ƒ • François-René ÐVB Rideau •Reflection&Cybernethics• http://fare.tunes.org <http://fare.tunes.org/>
There are a thousand hacking at the branches of evil
to one who is striking at the root. — Thoreau
Stelian Ionescu
2015-08-03 20:19:03 UTC
Permalink
Post by Attila Lendvai
Post by Stelian Ionescu
Post by s***@sonic.net
How might we get equivalent cheap ephemeral processes into a
contemporary Common Lisp implementation?
In short, you need to write from scratch a new CL implementation. Current ones are not designed with the Erlang constraints in mind.
well, Nikodemus had some plans for green threads for SBCL and it
didn't sound like a rewrite.
There are two main reasons why Erlang(the VM, not so much the language IMO) is so effective in its niche:

1) Per-process(green thread) GC heap and no sharing between processes except through very narrow and explicit primitives. This makes Erlang code pretty safe for concurrency as long as you're not seeking to satisfy hard real-time constraints.

2) It has a bytecode-interpreting VM in which each instruction is a safepoint and all I/O is non-blocking, basically implementing scheduler preemption in user space. This allow Erlang watcher processes to randomly kill worker processes and restart them. Being an interpreter is crucial here, because as soon as they tried compiling to native code(HiPE) they had problems with tight loops in pure Erlang code that would not yield to the scheduler and thus starved other processes. Almost nobody uses HiPE.
Erlang code can still block if it calls foreign code or it performs some syscalls that can't be avoided to block, but that's why they try to implement everything in pure Erlang - they switched the default SSL implementation to one written by them, a few years ago, because they had issues with OpenSSL.

So again, unless you're willing to implement something along those lines, you could have Erlang-style concurrency - which is still useful I guess - but you wouldn't achieve the rock-solid robustness for which the Erlang VM is so famous(except for inter-process message queues that are unbounded by default so you can easily use all memory if you're not careful).
--
Stelian Ionescu a.k.a. fe[nl]ix
Quidquid latine dictum sit, altum videtur.
http://common-lisp.net/project/iolib
Scott McKay
2015-08-03 20:26:04 UTC
Permalink
Yeah, this is why I suggested LFE in my first reply.

--Scott
Post by Stelian Ionescu
Post by Attila Lendvai
Post by Stelian Ionescu
Post by s***@sonic.net
How might we get equivalent cheap ephemeral processes into a
contemporary Common Lisp implementation?
In short, you need to write from scratch a new CL implementation. Current ones are not designed with the Erlang constraints in mind.
well, Nikodemus had some plans for green threads for SBCL and it
didn't sound like a rewrite.
1) Per-process(green thread) GC heap and no sharing between processes except through very narrow and explicit primitives. This makes Erlang code pretty safe for concurrency as long as you're not seeking to satisfy hard real-time constraints.
2) It has a bytecode-interpreting VM in which each instruction is a safepoint and all I/O is non-blocking, basically implementing scheduler preemption in user space. This allow Erlang watcher processes to randomly kill worker processes and restart them. Being an interpreter is crucial here, because as soon as they tried compiling to native code(HiPE) they had problems with tight loops in pure Erlang code that would not yield to the scheduler and thus starved other processes. Almost nobody uses HiPE.
Erlang code can still block if it calls foreign code or it performs some syscalls that can't be avoided to block, but that's why they try to implement everything in pure Erlang - they switched the default SSL implementation to one written by them, a few years ago, because they had issues with OpenSSL.
So again, unless you're willing to implement something along those lines, you could have Erlang-style concurrency - which is still useful I guess - but you wouldn't achieve the rock-solid robustness for which the Erlang VM is so famous(except for inter-process message queues that are unbounded by default so you can easily use all memory if you're not careful).
--
Stelian Ionescu a.k.a. fe[nl]ix
Quidquid latine dictum sit, altum videtur.
http://common-lisp.net/project/iolib
Stelian Ionescu
2015-08-03 20:38:09 UTC
Permalink
Post by Scott McKay
Yeah, this is why I suggested LFE in my first reply.
Yep, forgot to reply to your email.

+1 for LFE
--
Stelian Ionescu a.k.a. fe[nl]ix
Quidquid latine dictum sit, altum videtur.
http://common-lisp.net/project/iolib
s***@sonic.net
2015-08-04 06:02:39 UTC
Permalink
Thanks to all who responded!

Here's one message replying to highlights, yet as always very informed
answers all around!
Post by Attila Lendvai
Post by Stelian Ionescu
Post by s***@sonic.net
How might we get equivalent cheap ephemeral processes into a
contemporary Common Lisp implementation?
In short, you need to write from scratch a new CL implementation. Current ones are not designed with the Erlang constraints in mind.
well, Nikodemus had some plans for green threads for SBCL and it
didn't sound like a rewrite.
and adding first class heaps would be a very useful addition with or
without any of the other erlang stuff. i don't know how hard that is,
but i assume it should be rather simple if the responsibility is
pushed over to the user to make sure that there are no dangling
pointers after destroying a heap.
or am i missing something?
This would give a lot of traction for my projects, especially for
*tasks* that generate a lot of garbage to be otherwise collected. (e.g.,
one use case: lots of temporary hash tables that grow to many GB within
a minute, queried a few dozen times, then discarded. I've previously
used 100 line Perl scripts for this, just to avoid the gc hit. Called
it via file based queue and/or pipe. Everything else in SBCL or CCL.)
Post by Attila Lendvai
Have you looks at Lisp-Flavored Erlang ("LFE")?
- http://lfe.io
It's really quite interesting, IMO.
--Scott
Yes, I have worked with LFE, Elixir and Erlang. I was hoping to get
away from their single mailbox per process limitation, among other issues.

Robert & Duncan are doing an excellent job on LFE. It's come a long way
in the past year or so. If anyone hasn't tried it lately, please do!
For instance, they added a macro for those who were put off by the colon
operator, so more conventional module:fn has been usable for over a
year. And Jose has a very extensive macro system for Elixir that's
possibly closest to Common Lisp's for a non-Lisp.
Post by Attila Lendvai
2) It has a bytecode-interpreting VM in which each instruction is a safepoint and all I/O is non-blocking, basically implementing scheduler preemption in user space. This allow Erlang watcher processes to randomly kill worker processes and restart them.
Ah, that's the insight to their details I was unaware of. Thanks!

In earlier CL approaches, I just wrapped the main loop in a handler-case
and found that sufficient for my needs, but others may find my approach
lacking... And you don't want to know about the external watchdog for
something similar but in C.

Erlang's VM has a very large set of use cases that they are supporting,
far larger than the proposed set of features from the original posting.

As you also stated (but trimmed here), yes I can make do without
Erlang's robustness guarantees. A tens of thousands of workers without a
heavy thread pool and fast/cheap workers are my main use cases.
Post by Attila Lendvai
Not Common Lisp (yet), but Racket has custodians for first-class
resource allocation pool, and all kinds of concurrency primitives.
Since it's a programming language designed to implement other
programming languages on top of it, it would make a great basis for a
"new" common lisp implementation.
Thanks for that. I've been looking for an excuse to delve deeper into
Racket, but since I started with scheme R3 in college, that pesky
explicit value for false still feels wrong to me. ;^)


Thanks again everyone,
-Daniel
Antoniotti Marco
2015-08-04 08:49:12 UTC
Permalink
Creators of Erlang have a Lisp background, and one feature of the Erlang VM (BEAM) that I'd like back-ported into Common Lisp is their process.


Wasn’t it Prolog? I believe the first version was written in Prolog.

In any case, apart from the niceties of the Erlang VM, how much of CL will be there in a ground up new implementation? And how much can be leveraged out of different implementations (e.g., LW has wonderful concurrency primitive available)?

Let me say that the traffic on CLL re LFE and Flavors seems a bit misdirected to me. I
Paul Tarvydas
2015-08-04 12:40:39 UTC
Permalink
I've been trying to convince the people over the in FBP group that this
is much easier than it sounds (esp. if you're rolled your own RTOS, as I
have done many times).

(I have to leave now, but will answer in more detail later tomorrow).

Here are some slides for a not-yet-presented talk:

https://github.com/guitarvydas/FBP-from-scratch

And here is a raw implementation of such extremely-green threads done in
C (sorry :-):

https://github.com/guitarvydas/collate-fbp-classic

Given closures, this should be even easier.

'til later

pt
Stelian Ionescu
2015-08-04 17:52:11 UTC
Permalink
Post by Paul Tarvydas
I've been trying to convince the people over the in FBP group that this
is much easier than it sounds (esp. if you're rolled your own RTOS, as I
have done many times).
(I have to leave now, but will answer in more detail later tomorrow).
https://github.com/guitarvydas/FBP-from-scratch
This is correct but useless to most people, who aren't doing signal processing and need to run on an off-the-shelf OS kernel.
--
Stelian Ionescu a.k.a. fe[nl]ix
Quidquid latine dictum sit, altum videtur.
http://common-lisp.net/project/iolib
Paul Tarvydas
2015-09-27 19:16:17 UTC
Permalink
I apologize for my absence.

@David - FBP == flow-based programming.

http://www.jpaulmorrison.com/fbp/

http://www.amazon.ca/Dataflow-Reactive-Programming-Systems-Practical/dp/1497422442/ref=pd_sim_14_1?ie=UTF8&refRID=0KS3EMGV9M1T5D66DCNG

Frame-based is interesting in its own right, though
http://www.amazon.ca/Framing-Software-Reuse-Lessons-World/dp/013327859X

Just about everyone has used a restricted form of FBP. It’s called “BASH pipe syntax”.

@Stelian, this technique works with or without an O/S.

“… Being an interpreter is crucial here, because as soon as they tried compiling to native code(HiPE) they had problems with tight loops in pure Erlang code that would not yield to the scheduler and thus starved other processes. Almost nobody uses HiPE. …”

This is the part that I have difficulties communicating to people :-).

Here’s another attempt, where I discuss long-running loops and rolling your own scheduler...

https://bittarvydas.wordpress.com/2015/09/27/long-running-loops/

pt

David McClain
2015-08-04 20:36:42 UTC
Permalink
What is FBP? (surely not Federal Bureau of Prisons). Frame Based Programming? (having looked at the slides, maybe so), Flow Based Programming?

- DM
Max Rottenkolber
2015-08-17 17:15:44 UTC
Permalink
On Mon, 03 Aug 2015 08:12:36 -0700, shenanigans-65eDfwRo+1xeoWH0uzbU5w
Post by s***@sonic.net
Creators of Erlang have a Lisp background, and one feature of the Erlang
VM (BEAM) that I'd like back-ported into Common Lisp is their process.
An Erlang "process" is cheap to create, cheap to destroy, cheap when
blocked, and upon exit performs bulk gc of its allocated memory; e.g.,
munmap().
I have to voice interest. I would love for a CL implementation to support
lightweight threads.
Post by s***@sonic.net
The open question here is to address a stated non-goal of CL-MUPROC, "we
rely on the CL implementation's MP system" and "considerably heavier
than Erlang processes". [See presentation link from
https://common-lisp.net/project/cl-muproc/ ]
How did I miss cl-muproc? My spare time hacking in the lat months has
been implementing the *exact same thing*. Oh well...
s***@sonic.net
2015-08-18 03:40:40 UTC
Permalink
Post by Max Rottenkolber
On Mon, 03 Aug 2015 08:12:36 -0700, shenanigans-65eDfwRo+1xeoWH0uzbU5w
Post by s***@sonic.net
Creators of Erlang have a Lisp background, and one feature of the Erlang
VM (BEAM) that I'd like back-ported into Common Lisp is their process.
An Erlang "process" is cheap to create, cheap to destroy, cheap when
blocked, and upon exit performs bulk gc of its allocated memory; e.g.,
munmap().
I have to voice interest. I would love for a CL implementation to support
lightweight threads.
There was a successful IndieGoGo campaign for SBCL features in the
recent past, so might there be someone interested, willing and able to
pursue implementing something like this?

This can be its own thing rather than any attempt at paying homage to
Erlang or BEAM vm: for me at least, it's about runtime performance and
avoiding the big gc hits by discarding an entire heap upon worker exit
but in a way that accommodates hundreds of thousands of concurrent
workers. Start small, and build from there. Erlang & BEAM address all
sorts of use cases for which a Lisp system wouldn't necessarily need to
be constrained-- and keeping it more Lispy anyway.


I personally would contribute a few thousand US dollars for such an
effort, as it's beyond my knowledge of SBCL internals at this time.

Extending ECL or as a implementation-specific package for it might be
another option, but again, this is beyond my knowledge of its internals.

While I respect the various languages orbiting Erlang, I'd rather use
Common Lisp.

Thanks,
-Daniel
Daniel Kochmański
2015-08-18 06:09:58 UTC
Permalink
Hello,
Post by s***@sonic.net
Post by Max Rottenkolber
On Mon, 03 Aug 2015 08:12:36 -0700, shenanigans-65eDfwRo+1xeoWH0uzbU5w
Post by s***@sonic.net
Creators of Erlang have a Lisp background, and one feature of the Erlang
VM (BEAM) that I'd like back-ported into Common Lisp is their process.
An Erlang "process" is cheap to create, cheap to destroy, cheap when
blocked, and upon exit performs bulk gc of its allocated memory; e.g.,
munmap().
I have to voice interest. I would love for a CL implementation to support
lightweight threads.
There was a successful IndieGoGo campaign for SBCL features in the
recent past, so might there be someone interested, willing and able to
pursue implementing something like this?
This can be its own thing rather than any attempt at paying homage to
Erlang or BEAM vm: for me at least, it's about runtime performance and
avoiding the big gc hits by discarding an entire heap upon worker exit
but in a way that accommodates hundreds of thousands of concurrent
workers. Start small, and build from there. Erlang & BEAM address all
sorts of use cases for which a Lisp system wouldn't necessarily need to
be constrained-- and keeping it more Lispy anyway.
I personally would contribute a few thousand US dollars for such an
effort, as it's beyond my knowledge of SBCL internals at this time.
Extending ECL or as a implementation-specific package for it might be
another option, but again, this is beyond my knowledge of its
internals.
Could you elaborate a little on this?
Post by s***@sonic.net
While I respect the various languages orbiting Erlang, I'd rather use
Common Lisp.
Thanks,
-Daniel
Regards,
Daniel
--
Daniel Kochmański | Poznań, Poland
;; aka jackdaniel

"Be the change that you wish to see in the world." - Mahatma Gandhi
s***@sonic.net
2015-08-19 03:24:01 UTC
Permalink
Post by Daniel Kochmański
Post by s***@sonic.net
Extending ECL or as a implementation-specific package for it might be
Post by s***@sonic.net
another option, but again, this is beyond my knowledge of its internals.
Could you elaborate a little on this?
As a conceptual design without having examined ECL internal docs or
source-- a very big disclaimer, indeed-- this is one approach
considering primarily the C & Unix perspective:

1. Since my criteria revolve around cheap & efficient workers and
cheating garbage collection upon worker exit, I'd build the memory
management system upon mmap() and munmap() for each worker heap. All
memory consed within context of the worker must be constrained to exist
only within the large blocks of memory from its private mmap()'d areas.

2. Regardless of ECL or other Lisp system, introduce a form such as
WITH-WORKER that could combine Erlang's spawn() parameters and BODY that
is maintained as a list for which the form at runtime would time-slice
between its elements. WITH-WORKER could vaguely resemble
WITH-OPEN-FILE, so let's leave it at that for now. More below.

3. Where Erlang/BEAM has the concept of "reductions" as the means by
which to introduce and enforce some notion of fairness across workers,
perhaps the Lispy version *doesn't* try to protect the programmer from
hurting him/herself. Iterating over the BODY as list would translate
into round-robin interleaving across all WITH-WORKER instances on the
same CPU core-- same scheduler.

4. As with Erlang/BEAM: introduce one scheduler per CPU core. Worker
migration to re-balance load per core is a pain point, but avoid
anything too fancy here. Start with random-but-even distribution and
random-but-even re-balancing.

(But beware: attempting to localize workers based upon message-passing
communication patterns adds too much complexity to get correct for the
"average" use case. Instead, similarly to thread "colours", just
specify affinity hints as optional parameters to WITH-WORKER, which
should be *ignored* on first cut.)

5. As with Erlang/BEAM: when sending messages across cores, *copy*
rather than share data for better cache locality. Of course, the Lispy
approach would probably be to allow copying *or* sharing, and a thin
package on top of something like SBCL's sb-concurrently (with an
optional parameter to override) could enforce the recommended approach.
Caveat below.

(See http://sbcl.org/manual/index.html#sb_002dconcurrency and
particularly take note of the Edya Ladan-Mozes and Nir Shavit paper
reference for minimizing locks for queues.)

6. There are obvious points that I've ignored here, such as in order to
have cheap worker cleanup upon exit, the system would need guarantees
that no other worker is sharing its data. There are sound reasons why
Erlang semantics are pure functional programming! But for the Lispy
approach, perhaps keep the global heap for such things, and anything to
be shared goes there-- along with the performance penalty for reaching
across NUMA nodes. That is, "here's enough rope..." But seriously,
just as Scala used OO early in its lifetime yet largely considers it
best to avoid that now, so too I suspect we'd feel the same about shared
read/write memory in the long-run once we have enough lines of code
written with this proposed system.


Of course, the above may have no contact with reality of ECL.

For instance, if there is an equivalent to a global lock then the above
design fails quickly.

Perhaps this gives ideas to someone who has working knowledge of such
internals.

Thanks,
-Daniel
David Holz
2015-08-19 05:07:51 UTC
Permalink
Post by s***@sonic.net
5. As with Erlang/BEAM: when sending messages across cores, *copy*
rather than share data for better cache locality. Of course, the
Lispy approach would probably be to allow copying *or* sharing, and a
thin package on top of something like SBCL's sb-concurrently (with an
optional parameter to override) could enforce the recommended
approach. Caveat below.
One of the larger issues I had with Erlang was that you couldn't sic a
bunch of threads on analyzing or searching immutable multi-GB raw data
structures (not ETS entries, nor hacking around with byte blobs).

My ideal layout would involve per-process heaps, with the option to
either cons into or copy into the immutable shared heap. I believe
collecting a shared add-only immutable-data heap in this form might not
require stopping the world. As was mentioned elsewhere, however,
designing around immutability advantages does venture further away from
Lisp's style. Splitting memory into per-process heaps, shared immutable
heap, and a shared mutable stop-the-world-to-GC heap might be going a
little overboard, especially since the allocation/copy style generally
has to be declared by the human.
--
David Holz
Director, Grindwork Corporation
http://www.grindwork.com/
Loading...