Discussion:
Heartbleed?
David McClain
2014-04-12 21:52:50 UTC
Permalink
Just curious for other opinions... but wouldn't this (Heartbleed) sort of buffer excess read-back failure have been prevented by utilizing a "safe" language like Lisp or SML?

I used to be an "unsafe" language bigot -- having mastered C/C++ for many years, and actually producing C compilers for a living at one time. I felt there should be no barriers to me as master of my machine, and not the other way around.

But today's software systems are so complex that it boggles the mind to keep track of everything needed. I found during my transition years that I could maintain code bases no larger than an absolute max of 500 KLOC, and that I actually started losing track of details around 100 KLOC. Making the transition to a higher level language like SML or Lisp enabled greater productivity within those limits for me.

Dr. David McClain
dbm-h9Lvq2qNYxl8mnqqp10EyDJtdDapTP/***@public.gmane.org
William Lederer
2014-04-12 22:03:15 UTC
Permalink
I do software security professionally these days.

While it is possible, if only barely, to cause memory corruption/buffer
overruns/stack smashing in almost any language, it is certainly far easier
to do so in C and C++. Many languages these days link to C libraries, which
increases the possibility.

However, much of my work these days is done against .NET applications,
which run on a managed, garbage-collected platform. The number and
frequency of errors in that code is no smaller than with C. It is still
possible to get remote code execution through the IIS/.NET web stack.

Application security is very difficult, and not very many of us write
error-free code.

To me the issue with OpenSSL (and there are still some that remain,
although the ones that I know about are not as severe) is that the code is
very unclear and hard to reason about. In fact, the best static code
analyzers had to be tweaked to see the issue.

Having many years of experience in both C and C++, I find that when working
in Lisp it is much easier to make assertions about a program's fine-grained
behavior, which pretty much agrees with your experience.

I would like to rephrase the question: which language makes it easier to
reason about a large code base? My vote is for the Lisp family. However,
keep in mind one of the best-written programs out there, Qmail, is written
in C. There is a lot to be said for who the author/authors are as well as
the language.

wglb
Post by David McClain
Just curious for other opinions... but wouldn't this (Heartbleed) sort of
buffer excess read-back failure have been prevented by utilizing a "safe"
language like Lisp or SML?
I used to be an "unsafe" language bigot -- having mastered C/C++ for many
years, and actually producing C compilers for a living at one time. I felt
there should be no barriers to me as master of my machine, and not the
other way around.
But today's software systems are so complex that it boggles the mind to
keep track of everything needed. I found during my transition years that I
could maintain code bases no larger than an absolute max of 500 KLOC, and
that I actually started losing track of details around 100 KLOC. Making the
transition to a higher level language like SML or Lisp enabled greater
productivity within those limits for me.
Dr. David McClain
_______________________________________________
pro mailing list
http://common-lisp.net/cgi-bin/mailman/listinfo/pro
Bob Cassels
2014-04-12 22:34:59 UTC
Permalink
That was always part of the Lisp dogma. It's probably even true.
Post by David McClain
Just curious for other opinions... but wouldn't this (Heartbleed) sort of buffer excess read-back failure have been prevented by utilizing a "safe" language like Lisp or SML?
I used to be an "unsafe" language bigot -- having mastered C/C++ for many years, and actually producing C compilers for a living at one time. I felt there should be no barriers to me as master of my machine, and not the other way around.
But today's software systems are so complex that it boggles the mind to keep track of everything needed. I found during my transition years that I could maintain code bases no larger than an absolute max of 500 KLOC, and that I actually started losing track of details around 100 KLOC. Making the transition to a higher level language like SML or Lisp enabled greater productivity within those limits for me.
Dr. David McClain
Pascal J. Bourguignon
2014-04-13 10:01:01 UTC
Permalink
Post by David McClain
Just curious for other opinions... but wouldn't this (Heartbleed)
sort of buffer excess read-back failure have been prevented by
utilizing a "safe" language like Lisp or SML?
I used to be an "unsafe" language bigot -- having mastered C/C++ for
many years, and actually producing C compilers for a living at one
time. I felt there should be no barriers to me as master of my
machine, and not the other way around.
Oh, so you are directly (if partially) responsible for the C mess!


The C standards say that:

{ char a[10]; return a[12]; }

is _undefined_.

Why, as a compiler writer, didn't you define it to raise an exception?
Yes, the C standard doesn't define exceptions, why, as a compiler
writer, didn't you add this obvious extension?




Notice how CLHS aref specifies:

subscripts---a list of valid array indices for the array.

Exceptional Situations: None.

and how:

1.4.4.3 The ``Arguments and Values'' Section of a Dictionary Entry

An English language description of what arguments the operator
accepts and what values it returns, including information about
defaults for parameters corresponding to omittable arguments (such
as optional parameters and keyword parameters). For special
operators and macros, their arguments are not evaluated unless it is
explicitly stated in their descriptions that they are evaluated.

Except as explicitly specified otherwise, the consequences are
undefined if these type restrictions are violated.




Which means that (let ((a (make-array 10))) (aref a 12)) is as undefined
in CL as in C!


However, you don't see Lisp implementers allow it, and instead they all
signal an error:

[***@kuiper :0.0 tmp]$ clall -r '(let ((a (make-array 10))) (aref a 12))'

Armed Bear Common Lisp Invalid array index 12 for #(NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL) (should be >= 0 and < 10).
Clozure Common Lisp Array index 12 out of bounds for #(0 0 0 0 0 0 0 0 0 0) .
CLISP AREF: index 12 for #(NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL) is out of range
CMU Common Lisp Error in function LISP::%ARRAY-ROW-MAJOR-INDEX: Invalid index 12 in #(0 0 0 0 0 0 0 0 0 0)
ECL In function AREF, the index into the object #(NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL). takes a value 12 out of the range (INTEGER 0 9).
SBCL Value of SB-INT:INDEX in (THE (INTEGER 0 (10)) SB-INT:INDEX) is 12, not a (MOD 10).




And even with safety 0, which should never be used (except perhaps on a
specific function that you've proven needs to be 2 cycles faster, for which
you can't find a better algorithm, and where you have proven and tested
that out-of-bounds accesses and other adverse conditions cannot occur;
that is, under so many conditions that it never happens in real life),
non-toy implementations still check bounds:

[***@kuiper :0.0 tmp]$ clall -r '(declaim (optimize (safety 0) (speed 3) (debug 0) (space 3)))' '(let ((a (make-array 10))) (aref a 12))'

Armed Bear Common Lisp --> NIL
Armed Bear Common Lisp Invalid array index 12 for #(NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL) (should be >= 0 and < 10).
CCL
/home/pjb/bin/clall: line 284: 16162 Segmentation fault "$implementation" "$@" "${user_args[@]}" > "$error" 2>&1
CLISP --> NIL
CLISP AREF: index 12 for #(NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL) is out of range
CMU Common Lisp --> NIL
CMU Common Lisp Error in function LISP::%ARRAY-ROW-MAJOR-INDEX: Invalid index 12 in #(0 0 0 0 0 0 0 0 0 0)
ECL --> NIL
ECL In function AREF, the index into the object #(NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL). takes a value 12 out of the range (INTEGER 0 9).
SBCL --> No value.
SBCL --> 0






Here is how a lisp programmer implements a C compiler:

cl-user> (ql:quickload :vacietis)
;; …
cl-user> (defun read-c-expression-from-string (source)
           (let ((*readtable* vacietis:c-readtable)
                 (vacietis:*compiler-state* (vacietis:make-compiler-state)))
             (read-from-string source)))
read-c-expression-from-string
cl-user> (read-c-expression-from-string "char f(){ char a[10]; return a[12]; }")
(vacietis::defun/1 f nil (prog* ((a (vacietis:allocate-memory 10))) (return (vacietis.c:[] a 12))))
cl-user> (eval (read-c-expression-from-string "char f(){ char a[10]; return a[12]; }"))
f
cl-user> (f)
Debug: Array index 12 out of bounds for #<vector 10, adjustable> .
While executing: (:internal swank::invoke-default-debugger), in process repl-thread(871).
Type :POP to abort, :R for a list of available restarts.
Type :? for other options.
1 > :q
; Evaluation aborted on #<simple-error #x302003010B7D>.
cl-user>




A final word: Don't use FFI! Implement the libraries you need in Lisp!
--
__Pascal Bourguignon__
http://www.informatimago.com/
"Le mercure monte ? C'est le moment d'acheter !"
David McClain
2014-04-13 22:58:59 UTC
Permalink
Hi Pascal

That was very funny! Heh!

But in case you weren't trying to be funny, then I'd hazard a guess that you were born sometime later than 1970 or so.

Everything happens in an historical context. And if the C language had raised exceptions on "invalid" memory accesses, then I can assure you that neither I nor anyone else at the time would have used such a language. It would have been too constraining. If you wanted such confining behavior then you might have considered the new language Ada.

I'd love to discuss at greater length but right now I'm attending a Chamber Concert and not near my computers.
From what I understand about the bug (I have not seen the code) it sounds like data length information arrived both directly and indirectly in the client message and that a conflict between them was not scrutinized.
More later...

Dr. David McClain

Sent from a mobile device
Max Rottenkolber
2014-04-23 08:06:44 UTC
Permalink
From what I understand about the bug (I have not seen the code) it sounds
like data length information arrived both directly and indirectly in the
client message and that a conflict between them was not scrutinized.
No. The bug was that the keep alive protocol in SSL mandates the server to
echo arbitrary data back to the client. The bounds checks were wrong too,
but at that stage it really doesn't matter. The design is just plain wrong.
David McClain
2014-04-23 13:13:03 UTC
Permalink
Post by Max Rottenkolber
. The design is just plain wrong.
Is that statement the benefit of hindsight knowledge, or do you have a more intelligent thought process behind it? (I can imagine the all-knowing smirk in the background, but I'd really like to know :-)

- DM
Post by Max Rottenkolber
From what I understand about the bug (I have not seen the code) it sounds
like data length information arrived both directly and indirectly in the
client message and that a conflict between them was not scrutinized.
No. The bug was that the keep alive protocol in SSL mandates the server to
echo arbitrary data back to the client. The bounds checks were wrong too,
but at that stage it really doesn't matter. The design is just plain wrong.
Dr. David McClain
dbm-h9Lvq2qNYxl8mnqqp10EyDJtdDapTP/***@public.gmane.org
Max Rottenkolber
2014-04-26 12:18:08 UTC
Permalink
Post by David McClain
Post by Max Rottenkolber
. The design is just plain wrong.
Is that statement the benefit of hindsight knowledge, or do you have a
more intelligent thought process behind it? (I can imagine the
all-knowing smirk in the background, but I'd really like to know :-)
The exact opposite of all-knowing ;). In my opinion the TLS standard is
too complex. Parts of it, like the keep-alive (which is also a path MTU
checking *framework*), are what I criticize here (and discuss further
down with Pascal).

Many security professionals have criticized the TLS committee for their
standards. As a side note: OpenSSL has roughly 500k lines of code; I
don't think it's feasible to assure security on a code base of this
magnitude.

If I imagine implementing a security protocol (i.e., code that "should be
kept short and really, really safe") and being confronted with something
like the Heartbeat extension, I imagine despair.

So my conclusion is that a widely used security standard should be
engineered well enough that it can be implemented correctly, even in a
four-digit-LOC ANSI C code base.
Scott L. Burson
2014-04-23 17:46:20 UTC
Permalink
From what I understand about the bug (I have not seen the code) it sounds
like data length information arrived both directly and indirectly in the
client message and that a conflict between them was not scrutinized.
No. The bug was that the keep alive protocol in SSL mandates the server to
echo arbitrary data back to the client. The bounds checks were wrong too,
but at that stage it really doesn't matter. The design is just plain wrong.
It is a bit curious that the protocol mandates this echoing, and one
could certainly debate whether this is good protocol design, but as
far as the actual vulnerability goes, David's characterization is
accurate. The heartbeat request arrives with some number of bytes of
data attached to it, and also with a length field that tells the
server how many bytes the client would like echoed back. There was no
check that the client didn't request more bytes be echoed than it had
actually sent.

-- Scott
Pascal J. Bourguignon
2014-04-23 19:33:08 UTC
Permalink
Post by Scott L. Burson
From what I understand about the bug (I have not seen the code) it sounds
like data length information arrived both directly and indirectly in the
client message and that a conflict between them was not scrutinized.
No. The bug was that the keep alive protocol in SSL mandates the server to
echo arbitrary data back to the client. The bounds checks were wrong too,
but at that stage it really doesn't matter. The design is just plain wrong.
It is a bit curious that the protocol mandates this echoing, and one
could certainly debate whether this is good protocol design, but as
far as the actual vulnerability goes, David's characterization is
accurate. The heartbeat request arrives with some number of bytes of
data attached to it, and also with a length field that tells the
server how many bytes the client would like echoed back. There was no
check that the client didn't request more bytes be echoed than it had
actually sent.
This is not what the protocol specifies.

The protocol specifies that if the length mentioned is wrong ("too
big"), then nothing should be answered, and otherwise that the exact
payload data received be sent back.

Nowhere does it say that the length field in the packet has any validity
such that the server must use it blindly.

Nowhere is it specified that the server should return data beyond the
payload data.


It is obvious that the message length should be taken into account to
determine the padding_length, since the latter is not in the message.
This is explicitly described in the protocol specification:


padding: The padding is random content that MUST be ignored by the
receiver. The length of a HeartbeatMessage is TLSPlaintext.length
for TLS and DTLSPlaintext.length for DTLS. Furthermore, the
length of the type field is 1 byte, and the length of the
payload_length is 2. Therefore, the padding_length is
TLSPlaintext.length - payload_length - 3 for TLS and
DTLSPlaintext.length - payload_length - 3 for DTLS. The
padding_length MUST be at least 16.

The sender of a HeartbeatMessage MUST use a random padding of at
least 16 bytes. The padding of a received HeartbeatMessage message
MUST be ignored.
--
__Pascal Bourguignon__
http://www.informatimago.com/
"Le mercure monte ? C'est le moment d'acheter !"
Pascal J. Bourguignon
2014-04-23 18:39:48 UTC
Permalink
From what I understand about the bug (I have not seen the code) it sounds
like data length information arrived both directly and indirectly in the
client message and that a conflict between them was not scrutinized.
No. The bug was that the keep alive protocol in SSL mandates the server to
echo arbitrary data back to the client. The bounds checks were wrong too,
but at that stage it really doesn't matter. The design is just plain wrong.
I don't think you can say that it's _just_ the design that is plain wrong.

If I give you as specification:

- client sends a string S of length s.
- client sends an offset o and a length l.
- server sends back l bytes of data taken from the address of the string
plus o.

and ask you to implement it only using the CL package, you won't be able
to implement it in any CL implementation using non-zero safety, and you
won't be able to implement it in most CL implementations using (safety 0).



But those weren't the specifications, they are obviously bogus.

But assuming they were the following (still bogus, but rather reasonable
specifications)::

- client sends a string S of length s.
- client sends an offset o and a length l.
- server sends back the substring of S starting at offset o,
containing l characters.

This you could easily implement in CL (as easily as in C), but again,
while in C this gives you the Heartbleed bug, in CL it poses absolutely
no security problem (unless you're using certain implementations with
(safety 0), which you should not do anyway; you're really asking for
problems, aren't you?).


(defun heartbeat-data (S o l)
  (subseq S o (+ o l)))

(heartbeat-data "Hello" 0 64000)
Debug: Bad interval for sequence operation on "Hello" : start = 0, end = 64000
So while the protocol apparently didn't specify what to do
when (> (+ o l) (length S)), this would have been handled as any other
generic protocol or server error, and no private data would have bled away.


http://cacm.acm.org/blogs/blog-cacm/173827-those-who-say-code-does-not-matter/fulltext
http://jameso.be/2012/02/11/language-matters.html



So it's not just the specifications; it's the language implementations
that are at fault here (not the ANSI C language, which clearly says that
it's undefined to read an uninitialized array or outside of allocated
memory, and therefore you could expect, as with any CL implementation,
to have exceptions signaled in such occurrences, since where something
is undefined, an implementation could define an implementation-specific
exception mechanism).


But the actual protocol specifications didn't even say that! They are
actually quite reasonable, and this is clearly an implementation bug:

https://tools.ietf.org/html/rfc6520

The specifications of the protocol explicitly say:

If the payload_length of a received HeartbeatMessage is too large,
the received HeartbeatMessage MUST be discarded silently.

and:

When a HeartbeatRequest message is received and sending a
HeartbeatResponse is not prohibited as described elsewhere in this
document, the receiver MUST send a corresponding HeartbeatResponse
message carrying AN EXACT COPY OF THE PAYLOAD of the received
HeartbeatRequest.
--
__Pascal Bourguignon__
http://www.informatimago.com/
"Le mercure monte ? C'est le moment d'acheter !"
Max Rottenkolber
2014-04-24 13:59:33 UTC
Permalink
Post by Pascal J. Bourguignon
When a HeartbeatRequest message is received and sending a
HeartbeatResponse is not prohibited as described elsewhere in this
document, the receiver MUST send a corresponding HeartbeatResponse
message carrying AN EXACT COPY OF THE PAYLOAD of the received
HeartbeatRequest.
I didn't mean to dispute that CL is a safer language. My point is that, as
an implementer, the above paragraph in an SSL protocol extension should
raise red flags.

What is the function of the described behavior? Why would I want to echo
back data in the context of a keep-alive?
A: None. You don't want to do that.

My position on this is to refuse to implement it. If that means my
implementation is useless in the context of other implementations, I need
to implement a better standard. I'd go as far as saying this is a moral
issue. When implementing a standard means building a weapon pointed at
half the Internet, the implementer is responsible for the resulting
threat.

I have had mixed experiences with this. So far I have implemented a few
standards where this approach worked just fine (an email client, a web
server). I could just omit the behavior I deemed unacceptable and refuse
to handle those messages, or send a "501 Not Implemented", respectively.
And while both the email and HTTP standards bear tons of legacy baggage
and can be tedious to implement, I consider them _good_ standards, because
they let you safely omit their questionable components.

A security guy reading the TLS standard, on the other hand, WILL think that
it was written by a malicious party, optimized for being impossible to
implement in a safe way. And while it is easier to implement the TLS
standard correctly in Lisp, I believe it should be simple and well-written
enough to be implementable safely even in C.
Pascal J. Bourguignon
2014-04-24 16:13:35 UTC
Permalink
Post by Max Rottenkolber
Post by Pascal J. Bourguignon
When a HeartbeatRequest message is received and sending a
HeartbeatResponse is not prohibited as described elsewhere in this
document, the receiver MUST send a corresponding HeartbeatResponse
message carrying AN EXACT COPY OF THE PAYLOAD of the received
HeartbeatRequest.
I didn't mean to dispute that CL is a safer language. My point is that, as
an implementer, the above paragraph in an SSL protocol extension should
raise red lights.
What is the function of the described behavior? Why would I want to echo
back data in context of a keep alive?
A: None. You don't want to do that.
You want to make sure that the answer you get corresponds to the request
you sent.

You could use a counter, but it would be too easy to simulate it on the
other end.

If you send random data and compare the returned data, you make sure
that there's something alive on the other end that can receive your
messages and respond to them, not a dead process sending fixed or
predictable packets.
--
__Pascal Bourguignon__
http://www.informatimago.com/
"Le mercure monte ? C'est le moment d'acheter !"
Max Rottenkolber
2014-04-24 17:04:23 UTC
Permalink
a dead process sending fixed or predictable packets
I didn't think of that. So basically you ensure the responding connection
isn't compromised by exercising the encryption, which is the hardest to
fake for a malicious attacker. Makes sense... Shame on me! :)

What about a fixed length input though (and maybe answering with a
digest)? It still seems to me that the specified behavior is overly
arbitrary/error prone.
Pascal J. Bourguignon
2014-04-24 17:11:34 UTC
Permalink
Post by Max Rottenkolber
a dead process sending fixed or predictable packets
I didn't think of that. So basically you ensure the responding connection
isn't compromised by exercising the encryption, which is the hardest to
fake for a malicious attacker. Makes sense... Shame on me! :)
What about a fixed length input though (and maybe answering with a
digest)? It still seems to me that the specified behavior is overly
arbitrary/error prone.
The introduction of the protocol says:

The Heartbeat Extension provides a new protocol for TLS/DTLS allowing
the usage of keep-alive functionality without performing a
renegotiation and a basis for path MTU (PMTU) discovery for DTLS.

So the variable size of the packet is used for the latter feature,
discovery of the path MTU (PMTU).
--
__Pascal Bourguignon__
http://www.informatimago.com/
"Le mercure monte ? C'est le moment d'acheter !"
Steve Haflich
2014-04-25 02:29:24 UTC
Permalink
There has been a lot of incorrect information and assumption on this
thread. I'm not picking on Pascal here (because I know he knows better).

On Wed, Apr 23, 2014 at 11:39 AM, Pascal J. Bourguignon <
Post by Pascal J. Bourguignon
and ask you to implement it only using the CL package, you won't be able
to implement it in any CL implementation using non-zero safety, and you
won't be able to implement it in most CL implementations using (safety 0).
In any case, you won't be able to implement an HTTP server in ANSI CL
because we of X3J13 didn't get around to defining any socket interface. It
was a known need, but too difficult to achieve. That's probably a good
thing, because any standard socket binding circa 1990 would likely have
been seriously incorrect and/or inadequate. (Compare the lack of Unicode
binding.)
Post by Pascal J. Bourguignon
This you could easily implement in CL, (as easily as in C), but again,
while in C this is a heartbleed bug, in CL, it poses absolutely no
security problem (unless you're using some certain implementations with
(safety 0), which you should not have done anyways, you're really asking
for problems, aren't you).
You're making unsupported assumptions about safety 0. The ANS only makes a
distinction between safety _3_ and safety anything else. safety 3 is safe
code, and in safe code certain user-code violations are required to be
signalled (usually where the ANS uses the word "should"). And there are
damn few of those places. Take for example aref, which might be used to
extract octets or characters or whatever from a buffer. aref makes no
guarantees, even in safe code, that it will signal bad array bounds.

Of course, it is unlikely that an HTTP server would use aref in this
context; more likely it would engage implementation-dependent socket
and/or stream extensions. Do those extensions guarantee that kind of
paranoid safe checking? Probably not, but even if they claim to do so, how
does one verify? Real socket protocols use large buffers and simply pass
memory pointers and lengths AT the OS. You might think that is bad
practice, but you might dislike the performance of a _real_ performance web
server that made too many guarantees.

(But I certainly agree that the Heartbleed bug results from a poor
implementation of an obscure specification. But it isn't the language.)
Post by Pascal J. Bourguignon
So it's not just the specifications, it's the language implementations
that are at fault here (not the ANSI C language, which clearly says that
it's undefined to read an uninitialized array or outside of allocated
memory, and therefore you could expect as with any CL implementation to
have exceptions signaled in such occurences (since it's undefined,
implementation could define implementation specific exception
mechanisms)).
"Consequences are undefined" is a term of art in the ANS. Behavior might
range from DWIM to destruction of the Universe. You cannot expect a CL
implementation to check situations that are not specified by the ANS to be
checked. I just checked the following form in SBCL and ACL -- both did
undefined things and did not signal errors.

(funcall (compile nil
                  '(lambda (x)
                     (declare (optimize (speed 3) (safety 0)))
                     (svref x 10)))
         (make-array 3))

Just like C, but at least the Universe didn't disappear. This time.

CL is not intrinsically safer than C. How any favorite implementation
behaves is irrelevant to this argument. It is the programmer that must
code safely.
Scott L. Burson
2014-04-25 04:31:30 UTC
Permalink
Post by Steve Haflich
Take for example aref, which might be used to
extract octets of characters or whatever from a buffer. aref makes no
guarantees even in safe code that it will signal bad array bounds.
I've long thought that was an oversight, though now that you point it
out, I realize I must have been mistaken.

Still, it surprises me. I don't know of any implementation that
doesn't bounds-check aref under normal speed/safety settings, and
clearly, users expect them to do so. It seems a little pedantic to
insist that the _language_ isn't safe in this respect even when all
known implementations are. (Am I wrong about that?)

And for the record I disagree with the committee's decision. Bounds
checking aref etc. _should_ be required at safety 3 (and along with
that, there should be a standardized bounds-error condition type).
The reasoning behind the committee's choice here eludes me.

-- Scott
Pascal J. Bourguignon
2014-04-25 05:30:12 UTC
Permalink
Post by Scott L. Burson
Post by Steve Haflich
Take for example aref, which might be used to
extract octets of characters or whatever from a buffer. aref makes no
guarantees even in safe code that it will signal bad array bounds.
I've long thought that was an oversight, though now that you point it
out, I realize I must have been mistaken.
Still, it surprises me. I don't know of any implementation that
doesn't bounds-check aref under normal speed/safety settings, and
clearly, users expect them to do so. It seems a little pedantic to
insist that the _language_ isn't safe in this respect even when all
known implementations are. (Am I wrong about that?)
The point is that ANSI Common Lisp compiler writers will have their
compilers generate run-time bounds-checking code, while ANSI C compiler
writers won't.

The point is that ANSI Common Lisp compiler writers will add extensions
to the language or "standard library" to deal with sockets and network
communications, while ANSI C compiler writers won't (relying on library
and OS API writers to do so).

The point is that ANSI Common Lisp compiler writers don't need to add
exception handling as an extension because it's already specified in the
language, while ANSI C compiler writers would have to do so to deal
non-trivially with run-time errors.
Post by Scott L. Burson
And for the record I disagree with the committee's decision. Bounds
checking aref etc. _should_ be required at safety 3 (and along with
that, there should be a standardized bounds-error condition type).
The reasoning behind the committee's choice here eludes me.
Agreed, a programming language standard should not rely on the good
sense of implementers to ensure the semantics of its programs, all the
more so in a dynamic language where code can be executed without first
being globally validated.


But again, AIUI, Common Lisp was as much documenting the commonality
among existing implementations as it was specifying a language, which may
explain why so many parts are unspecified or implementation-dependent.
--
__Pascal Bourguignon__
http://www.informatimago.com/
"Le mercure monte ? C'est le moment d'acheter !"
Jean-Claude Beaudoin
2014-04-25 05:35:57 UTC
Permalink
Post by Scott L. Burson
Post by Steve Haflich
Take for example aref, which might be used to
extract octets of characters or whatever from a buffer. aref makes no
guarantees even in safe code that it will signal bad array bounds.
I've long thought that was an oversight, though now that you point it
out, I realize I must have been mistaken.
Still, it surprises me. I don't know of any implementation that
doesn't bounds-check aref under normal speed/safety settings, and
clearly, users expect them to do so.
I am surprised too. I always understood it the way you do, Scott, but now
that I re-read the page on aref I see that it is exactly as Steve says:
no mention of any exception, and a statement that "subscripts" must be a
list of valid array indices right from the start of the call to aref.
Yet that leaves me even more curious to know which implementation has
read the spec as strictly as Steve says it can be read, even under
(safety 3). Does anyone know of any?
Steve Haflich
2014-04-25 06:44:21 UTC
Permalink
On Thu, Apr 24, 2014 at 10:35 PM, Jean-Claude Beaudoin <
Post by Jean-Claude Beaudoin
Post by Scott L. Burson
I've long thought that was an oversight, though now that you point it
out, I realize I must have been mistaken.
"Oversight" might be the wrong way of thinking about this. X3J13 started
with the language defined by CLtL1 (the work of the infamous Gang of Five)
with the purpose of turning it into a powerful, useful, real-world
industrial-strength programming language. Except for new subsystems
grabbed more-or-less intact from other sources (Waters' pretty printer, the
condition system (Pitman and others), and CLOS (a different gang inside
X3J13)) the specification started with the CLtL1 definitions to which
cleanups and accretions were made. There were a lot of inconsistencies to
remove, and a lot of language cleanups, and a lot of incompatibilities as
modern features were added. But a lot of culture from early era Lisps
(primarily MACSYMA) remain. We changed what _needed_ to be changed,
cleaned up a lot of other inelegances, but there was not time or energy to
attempt a thorough job. The subgroups for things like graphics and I18N
and networking realized the time was not yet ripe -- the world was changing
out from under them. The process took almost 6 years,
and near the end, time and funding were running out. Eventually the
committee standardized _only_ the programming language, more or less, and
we were lucky to get it done.
Post by Jean-Claude Beaudoin
Post by Scott L. Burson
Still, it surprises me. I don't know of any implementation that
doesn't bounds-check aref under normal speed/safety settings, and
clearly, users expect them to do so.
I am surprised too. I always understood it like you Scott but now that
re-read
the page on aref I see that it is exactly like Steve says, no mention of any
exception and a statement that "subscripts" must be a list of valid array
indices
right from the start of the call to aref. Yet that leaves me even more
curious
to know which implementation has read the spec as strictly as Steve says
it can be even under (safety 3)? Does anyone know any?
I don't know of any and there might not be any, at least among main-line
implementations. I don't remember X3J13 considering aref (except for the
non-interaction with a fill pointer) but I also can't remember what I had
for breakfast this morning, so investigation of X3J13 records might reveal
differently. The lack of exhaustive subtyping of cl:error was recognized
as something missing, but the condition system itself wasn't in the
original language, and no one had the time or energy to go through the
entire specification. The sense was that a quality implementation could do
so itself, and maybe agree on details in the future.

But in your paragraph above I'm bothered by its hidden assumption: It
suggests that after the ANS was available sneaky implementors studied it
kabbalistically to find places where annoying error checks could be removed.
It was exactly the opposite! Tired implementors slogged through the ANS
to find places where error checking was _required_ and found missing. (Or
customers did it for them.)

I agree it would be a good thing if the ANS required aref bounds checking
in safe code.

To return to my important point, the language of the ANS won't let you
read from or write to a socket. At some point user application code will have to
call some non-ANS functions, and in the real world those functions (just
like C) will take a pointer into some overlarge buffer array along with a
length, and that memory location will be passed further down to some system
code (likely written in C) that has access to the entire user-space memory.
Now, absent a missing length check like the one that allowed the Heartbleed
bug, such a call won't allow buffer overruns in either input or output,
but my point is that user C code and user CL code are little different in
this regard.
Scott L. Burson
2014-04-25 19:56:47 UTC
Permalink
Post by Steve Haflich
On Thu, Apr 24, 2014 at 10:35 PM, Jean-Claude Beaudoin
Post by Jean-Claude Beaudoin
Post by Scott L. Burson
I've long thought that was an oversight, though now that you point it
out, I realize I must have been mistaken.
"Oversight" might be the wrong way of thinking about this. [...]
We changed what _needed_ to be changed, cleaned
up a lot of other inelegances, but there was not time or energy to attempt a
thorough job. [...]
All I mean by "oversight" is that it was not the product of a
deliberate decision. From the tone of your previous message I thought
that it must have been deliberate, but now it sounds like I was
probably right the first time, though we don't know for sure.
Post by Steve Haflich
Post by Jean-Claude Beaudoin
Post by Scott L. Burson
Still, it surprises me. I don't know of any implementation that
doesn't bounds-check aref under normal speed/safety settings, and
clearly, users expect them to do so.
I am surprised too. I always understood it like you Scott but now that
re-read
the page on aref I see that it is exactly like Steve says, no mention of any
exception and a statement that "subscripts" must be a list of valid array
indices
right from the start of the call to aref. Yet that leaves me even more
curious
to know which implementation has read the spec as strictly as Steve says
it can be even under (safety 3)? Does anyone know any?
I don't know of any and there might not be any, at least among main-line
implementations. [...]
But in your paragraph above I'm bothered by its hidden assumption: It
suggests that after the ANS was available sneaky implementors studied it
kabalistically to find places where annoying error checks could be removed.
I don't read Jean-Claude this way. I think he was expressing surprise
at the thought that an implementor might have done that.
Post by Steve Haflich
To return to my important point, the language of the ANS wont let you read
or write from a socket. At some point user application code will have to
call some non-ANS functions, and in the real world those functions (just
like C) will take a pointer into some overlarge buffer array along with a
length, and that memory location will be passed further down to some system
code (likely written in C) that has access to the entire user-space memory.
Now, without the missing check on the length that allowed the Heartbleed
bug, such an error won't allow buffer overruns in either input or output,
but my point is that user C code and user CL code are little different in
this regard.
It certainly is _possible_ to write an unsafe socket-write function
(*) in a CL library. But I still think the _probability_ of someone
doing so is substantially smaller in CL than in C. Writing in C is
like putting

(declaim (optimize (speed 3) (safety 0))) ; damn the torpedoes!!

at the top of every source file.

When writing a safety-0 function in CL, the unsafe region is much more
restricted, and one is more likely to be careful to add explicit
bounds checks where appropriate. (I recall only one occasion in my
career where I forgot to do this. Koff koff... but the point is, it's
not an error one has the opportunity to make very often.)

-- Scott

(* Actually the missing bounds check was on a 'memcpy' call that was
being used to prepare the heartbeat reply message, but the effect is
the same as if it had been on the socket write.)
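[A minimal C sketch of the pattern Scott describes. The struct and function names here are hypothetical illustrations, not the real OpenSSL code, which is more involved; the essence of the bug was trusting an attacker-supplied payload length when memcpy'ing bytes into the reply.]

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical heartbeat record: a claimed payload length plus the
 * bytes that actually arrived on the wire. */
struct heartbeat {
    size_t claimed_len;            /* attacker-controlled */
    size_t actual_len;             /* bytes really received */
    const unsigned char *payload;
};

/* Buggy Heartbleed-like pattern: trust claimed_len, so memcpy reads
 * past the end of the received data:
 *     memcpy(reply, hb->payload, hb->claimed_len);
 *
 * Checked version: refuse to copy more than was actually received. */
int build_reply(unsigned char *reply, size_t reply_cap,
                const struct heartbeat *hb)
{
    if (hb->claimed_len > hb->actual_len) return -1; /* the missing check */
    if (hb->claimed_len > reply_cap)      return -1;
    memcpy(reply, hb->payload, hb->claimed_len);
    return (int)hb->claimed_len;
}
```

Note that the fix is a single comparison before the copy; nothing in C (or, at this level, in Lisp FFI code) performs it automatically.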
William Lederer
2014-04-25 20:20:10 UTC
Permalink
Team:

I would like to weigh in here as a security professional who uses Lisp in
daily practice. I do Application Security assessments and advise companies
on secure coding practices. I do penetration tests and have discovered a
zero day in OpenSSL (not anywhere near the severity of Heartbleed.)

I agree with the general sentiment that Lisp is a much safer language to
build anything in. While several in this thread are pointing to bounds
checking as one of the advantages that Lisp has over C and other languages,
there is something else I find that is also very strong: it is easier to
write programs whose correctness a reader can reason about. In Lisp,
the programs tend to be closer to provable and errors are more evident. As
in "obviously no deficiencies" vs "no obvious deficiencies".

But in my experience, vulnerabilities result from

- Buffer Overflows/lack of bounds checking (Heartbleed and friends)
- Configuration errors
- Logic Flaws
- Dangerous use of user input (leading to SQLi, XSS, XSRF)
- Improper use of cryptography
- Unclear protocol specification (as in OpenSSL)

So while I would recommend to anyone who will listen to use Lisp (and
likely to many who won't) as the base of their application, I would also
caution them not to take their eye off the other likely sources of
catastrophic application failure.

Finally, one of the most famous positive security stories is Qmail, which
handles a significant fraction of all internet mail. It is written in C
and has been in use for a very long time.

Thus, I feel Lisp is better but not a total panacea. For example, has the
Ironclad library been examined by a cryptographer? Does it, for example, do
constant-time comparisons to avoid timing leaks?
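[For reference, the constant-time comparison asked about here is easy to state in C. This is the standard XOR-accumulate idiom, similar in spirit to OpenSSL's CRYPTO_memcmp; a sketch, not a vetted implementation.]

```c
#include <assert.h>
#include <stddef.h>

/* Compare two equal-length byte strings in time that depends only on
 * len, never on where the first mismatch occurs.  A branchy memcmp
 * returns early at the first difference, and that timing difference
 * can leak secrets such as MAC or password-hash bytes. */
int ct_equal(const unsigned char *a, const unsigned char *b, size_t len)
{
    unsigned char diff = 0;
    for (size_t i = 0; i < len; i++)
        diff |= a[i] ^ b[i];       /* accumulate all differences */
    return diff == 0;              /* 1 if identical, 0 otherwise */
}
```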

wglb
_______________________________________________
pro mailing list
http://common-lisp.net/cgi-bin/mailman/listinfo/pro
Nathan Froyd
2014-04-25 20:42:13 UTC
Permalink
On Fri, Apr 25, 2014 at 4:20 PM, William Lederer
Post by William Lederer
Thus, I feel Lisp is better but not a total panacea. For example, has the
Ironclad library been examined by a cryptographer? Does it, for example, do
constant-time comparisons to avoid timing leaks?
The answer to these (and many other questions of cryptographic
sophistication) is no. Ironclad has many deficiencies that make it
unsuitable for serious cryptographic software.

I'm not sure that some of these constant-time checks can even be implemented
in Common Lisp without some serious assistance from and/or knowledge
of the implementation.

-Nathan
Antoni Grzymała
2014-04-25 23:24:35 UTC
Permalink
Ironclad has many deficiencies that make it unsuitable for serious
cryptographic software.
I'm curious what they would be – would you be able to outline that in
more detail?
--
[アントシカ]
Nathan Froyd
2014-04-27 16:02:44 UTC
Permalink
Post by Antoni Grzymała
Ironclad has many deficiencies that make it unsuitable for serious
cryptographic software.
I'm curious what they would be – would you be able to outline that in
more detail?
Sure. In no particular order, and with no claim of exhaustiveness:

- Many ciphers are not safe against timing attacks due to the use of
lookup tables.
- There's nothing like Go's crypto/subtle
(http://golang.org/pkg/crypto/subtle/) package for ensuring that
various checks are safe against timing attacks.
- The public key algorithms are definitely not production ready: they
will give you the correct answers, but the implementations are not
cryptographically robust. Part of this is potentially intractable,
given that they rely on bignums, and the bignum implementations in
Common Lisp implementations are probably not implemented with the
needs of public key algorithms in mind.
- The DSA signature algorithm doesn't use high-quality random numbers,
which makes it unsafe.
- I know there are a whole host of issues with implementing RSA
safely; Ironclad has not paid attention to any of these.
- There's no implementation of padding and all the subtleties that
come with it for block cipher algorithms or public key algorithms.

The hash algorithm implementations are pretty solid (assuming that you
choose cryptographically secure ones, of course); everything else
isn't suitable for security-conscious software.

I would like to fix some of these deficiencies, of course, but I
haven't sat down and taken the time to do so. Patches welcome.

-Nathan
Steve Haflich
2014-04-27 08:27:43 UTC
Permalink
I agree with essentially everything in wglb's message, but (once again)
I'll grumpily jump in to emphasize a point which I think many on this list
have missed.


On Fri, Apr 25, 2014 at 1:20 PM, William Lederer
Post by William Lederer
I agree with the general sentiment that Lisp is a much safer language to
build anything in. While several in this thread are pointing to bounds
checking as one of the advantages that Lisp has over C and other languages,
there is something else I find that is also very strong: It is easier to
write programs about which a reader can reason about correctness. In Lisp,
the programs tend to be closer to provable and errors are more evident. As
in "obviously no deficiencies" vs "no obvious deficiencies".
But in my experience, vulnerabilities result from
- Buffer Overflows/lack of bounds checking (Heartbleed and friends)
- Configuration errors
- Logic Flaws
- Dangerous use of user input (leading to SQLi, XSS, XSRF)
- Improper use of cryptography
- Unclear protocol specification (leading to OpenSSL)
This (IMO entirely worthy and correct) summary can easily be
misunderstood! Lisp may be superior because it has bounds checking. (We've
previously agreed that isn't guaranteed since it isn't in the ANS, and in
any platform likely depends on optimization qualities, including the
optimization qualities under which internal called routines were compiled.)
But on normal operating systems, bugs based on buffer overflow don't in
general involve bounds checking. At some point on any modern OS, reading
or writing to a socket stream will involve passing to the OS (generally via
a thin user-mode C API layer like *nix read() and write(), or some socket
analogue). Neither Lisp nor C will provide any automatic bounds checking
on such a call. The OS treats the application's address space as a
mostly-contiguous undifferentiated sea of bytes(*). It doesn't matter that
at the app level C also has this model of a sea of bytes, while in Lisp the
ocean is run-time tagged into small plots. That distinction disappears
once one calls write(fd,buf,len).
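[A short C sketch of that boundary. The wrapper name is hypothetical; the point is that write() itself sees only a pointer and a length, so any bounds discipline, whether the caller is C or Lisp, has to happen before the call.]

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* The kernel will happily transmit len bytes no matter how big the
 * buffer actually is: to the OS, the address space is an
 * undifferentiated sea of bytes.  A checked wrapper must carry the
 * buffer's real capacity alongside the requested length. */
ssize_t checked_write(int fd, const void *buf, size_t cap, size_t len)
{
    if (len > cap) {        /* the check write() itself never makes */
        errno = EINVAL;
        return -1;
    }
    return write(fd, buf, len);
}
```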

The Lisp Machine in its several manifestations might be the only
counterexample, since there was no C boundary over which to cross, and
because type and bounds checking was performed for free in the microcode.
But Lisp machines aren't around any more largely because of the economy of
scale. The number of x86 and x64 processors on the planet must be nearly
on the order of 10^9, while the number of Lisp machine processors never got
out of the 10^5 range, so Intel and AMD etc. could justify huge investments
making those processors 3 orders of magnitude faster in raw speed. Lisp
processors could not have kept up at bearable per-item cost. Alas!

It is certainly true that the Heartbleed bug resulted from an
insufficiently-cautious implementation of an (overly?)complex
specification. The author of the bug has essentially agreed with this
analysis. But the "bounds checking" of most Lisp implementations would
provide no protection against this failure (about which the original
posting agrees) unless the succinctness and superior clarity of CL vs C
code might help it be seen. That's a thin thread on which to hang an
entire language argument.

(*) I originally saw this beautiful metaphor, that C treats memory as an
undifferentiated sea of bytes, on some discussion list but can't remember
the originator. Google shows current use scattered over many programming
subjects, but doesn't identify the original. Anyway, it is the reason that
a small hyper-efficient C-struct-in-Lisp defining macro I wrote for a
certain huge CL software product is named "define-sea-struct" and (I used
to be a sailor) the operator for computing offsets possibly through
multiple levels of nested structs is called "following-sea".
Paradoxically, http://www.ibiblio.org/hyperwar/NHC/fairwinds.htm says
"following seas" means "SAFE journey, good fortune" [emphasis added].
Dan Cross
2014-04-27 15:20:35 UTC
Permalink
Post by Steve Haflich
I agree with essentially everything in wglb's message, but (once again)
I'll grumpily jump in to emphasize a point which I think many on this list
have missed.
On Fri, Apr 25, 2014 at 1:20 PM, William Lederer <
Post by William Lederer
I agree with the general sentiment that Lisp is a much safer language to
build anything in. While several in this thread are pointing to bounds
checking as one of the advantages that Lisp has over C and other languages,
there is something else I find that is also very strong: It is easier to
write programs about which a reader can reason about correctness. In Lisp,
the programs tend to be closer to provable and errors are more evident. As
in "obviously no deficiencies" vs "no obvious deficiencies".
But in my experience, vulnerabilities result from
- Buffer Overflows/lack of bounds checking (Heartbleed and friends)
- Configuration errors
- Logic Flaws
- Dangerous use of user input (leading to SQLi, XSS, XSRF)
- Improper use of cryptography
- Unclear protocol specification (leading to OpenSSL)
This (IMO entirely worthy and correct) summary can easily be
misunderstood! Lisp may be superior because it has bounds checking. (We've
previously agreed that isn't guaranteed since it isn't in the ANS, and in
any platform likely depends on optimization qualities, including the
optimization qualities under which internal called routines were compiled.)
But bugs based on buffer overflow don't on normal operating systems in
general involve bounds checking. At some point on any modern OS, reading
or writing to a socket stream will involve passing to the OS (generally via
a thin user-mode C API layer like *nix read() and write(), or some socket
analogue). Neither Lisp nor C will provide any automatic bounds checking
on such a call. The OS treats the application's address space as a
mostly-contiguous undifferentiated sea of bytes(*). It doesn't matter that
at the app level C also has this model of a sea of bytes, while in Lisp the
ocean is run-time tagged into small plots. That distinction disappears
once one calls write(fd,buf,len).
This is essentially the point I made in my email on April 13; an
application program these days (even one written in Lisp) necessarily
depends on a large set of libraries and support software that the
application programmer has little to no control over. Naive pronouncements
that we should simply write all our code in Lisp (or another "safer"
language) are almost guaranteed to have limited effect because many
security problems are manifest in code we depend on that is simply out of
our control. Rebuilding the entire ecosystem that our applications sit on
is economically infeasible and still leaves us open to the possibility of
security problems in the underlying hardware (which have been shown to be
real and to have been recently exploited). This in no way implies that we
should not STRIVE to do better, but illustrates that the issue is more
complicated than language A vs language B.

Further, assertions that compiler writers of language A tend to write
compilers (or, in this case, standard libraries) that aren't safe in some
way while writers of compilers for language B write systems that are is,
frankly, self-congratulatory navel-gazing.

The Lisp Machine in its several manifestations might be the only
Post by Steve Haflich
counterexample,
This, however, I disagree with. There are operating systems that deal
solely with managed-code objects. If one considers, e.g., IL to be the
"hardware" that sits on top of the underlying native instruction set acting
as microcode, then Microsoft's Singularity system could be described as
approximately equivalent to a Lisp machine in this regard.

since there was no C boundary over which to cross, and because type and
Post by Steve Haflich
bounds checking was performed for free in the microcode. But Lisp machines
aren't around any more largely because of the economy of scale. The number
of x86 and x64 processors on the planet must be nearly on the order of
10^9, while the number of Lisp machine processors never got out of the 10^5
range, so Intel and AMD etc. could justify huge investments making those
processors 3 orders of magnitude faster in raw speed. Lisp processors
could not have kept up at bearable per-item cost. Alas!
It is certainly true that the Heartbleed bug resulted from an
insufficiently-cautious implementation of an (overly?)complex
specification. The author of the bug has essentially agreed with this
analysis. But the "bounds checking" of most Lisp implementations would
provide no protection against this failure (about which the original
posting agrees) unless the succinctness and superior clarity of CL vs C
code might help it be seen. That's a thin thread on which to hang an
entire language argument.
Actually, I'm not sure about that; in this case, the boundary violation was
real and due to not taking into account the length of the input (e.g., one
memcpy'd more than had been provided, reading off the end of the source
buffer). But it was a rookie C programmer mistake, and I agree that this
is indeed scant ammunition in a language beef.

(*) I originally saw this beautiful metaphor, that C treats memory as an
Post by Steve Haflich
undifferentiated sea of bytes, on some discussion list but can't remember
the originator. Google shows current use scattered over many programming
subjects, but doesnt identify the original. Anyway, it is the reason that
a small hyper-efficient C-struct-in-Lisp defining macro I wrote for a
certain huge CL software product is named "define-sea-struct" and (I used
to be a sailor) the operator for computing offsets possibly through
multiple levels of nested structs is called "following-sea".
Paradoxically, http://www.ibiblio.org/hyperwar/NHC/fairwinds.htm says
"following seas" means "SAFE journey, good fortune" [emphasis added].
Semper Fi.

- Dan C.
William Lederer
2014-04-27 15:48:41 UTC
Permalink
Dan:

I mostly agree with what you are saying. However there is one point in
much of this discussion that may not be covered.

Further, assertions that compiler writers of language A tend to write
compilers (or, in this case, standard libraries) that aren't safe in some
way while writers of compilers for language B write systems that are is,
frankly, self-congratulatory naval gazing.

There is a fundamental practical difference between C and Lisp that is
relevant in the security world: the vast number of explicitly
undefined behaviors in the specification of C. This is pretty much
unmatched in Lisp or C# or Java. John Regehr at http://blog.regehr.org/ has
done some fascinating work not only on the undefined behaviors of C, but
also on the substantial number of bugs in compilers.
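[A concrete instance of the kind of C undefined behavior at issue (a sketch): signed integer overflow is undefined, so an overflow test written *after* the arithmetic can legally be deleted by an optimizing compiler; the well-defined version tests against the limits before adding.]

```c
#include <assert.h>
#include <limits.h>

/* Broken pattern: `if (a + b < a)` with signed ints is undefined
 * behavior whenever a + b overflows, and an optimizing compiler is
 * entitled to remove the test entirely.
 *
 * Well-defined version: check the bounds before performing the add,
 * so the overflow never happens. */
int safe_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b))
        return 0;                  /* would overflow: refuse */
    *out = a + b;
    return 1;
}
```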

And while what you say about dependencies on other libraries is true
(this is always a major item we check for when doing assessments, and it
is a risk for all systems, qmail excepted), a significant fraction of all
breaches are the result of logic errors or configuration errors. These
errors compromise all systems equally, regardless of the language underneath.

wglb
Post by Dan Cross
Post by Steve Haflich
I agree with essentially everything in wglb's message, but (once again)
I'll grumpily jump in to emphasize a point which I think many on this list
have missed.
On Fri, Apr 25, 2014 at 1:20 PM, William Lederer <
Post by William Lederer
I agree with the general sentiment that Lisp is a much safer language to
build anything in. While several in this thread are pointing to bounds
checking as one of the advantages that Lisp has over C and other languages,
there is something else I find that is also very strong: It is easier to
write programs about which a reader can reason about correctness. In Lisp,
the programs tend to be closer to provable and errors are more evident. As
in "obviously no deficiencies" vs "no obvious deficiencies".
But in my experience, vulnerabilities result from
- Buffer Overflows/lack of bounds checking (Heartbleed and friends)
- Configuration errors
- Logic Flaws
- Dangerous use of user input (leading to SQLi, XSS, XSRF)
- Improper use of cryptography
- Unclear protocol specification (leading to OpenSSL)
This (IMO entirely worthy and correct) summary can easily be
misunderstood! Lisp may be superior because it has bounds checking. (We've
previously agreed that isn't guaranteed since it isn't in the ANS, and in
any platform likely depends on optimization qualities, including the
optimization qualities under which internal called routines were compiled.)
But bugs based on buffer overflow don't on normal operating systems in
general involve bounds checking. At some point on any modern OS, reading
or writing to a socket stream will involve passing to the OS (generally via
a thin user-mode C API layer like *nix read() and write(), or some socket
analogue). Neither Lisp nor C will provide any automatic bounds checking
on such a call. The OS treats the application's address space as a
mostly-contiguous undifferentiated sea of bytes(*). It doesn't matter that
at the app level C also has this model of a sea of bytes, while in Lisp the
ocean is run-time tagged into small plots. That distinction disappears
once one calls write(fd,buf,len).
This is essentially the point I made in my email on April 13; an
application program these days (even one written in Lisp) necessarily
depends on a large set of libraries and support software that the
application programmer has little to no control over. Naive pronouncements
that we should simply write all our code in Lisp (or another "safer"
language) are almost guaranteed to have limited effect because many
security problems are manifest in code we depend on that is simply out of
our control. Rebuilding the entire ecosystem that our applications sit on
is economically infeasible and still leaves us open to the possibility of
security problems in the underlying hardware (which have been shown to be
real and to have been recently exploited). This in no way implies that we
should not STRIVE to do better, but illustrates that the issue is more
complicated than language A vs language B.
Further, assertions that compiler writers of language A tend to write
compilers (or, in this case, standard libraries) that aren't safe in some
way while writers of compilers for language B write systems that are is,
frankly, self-congratulatory naval gazing.
The Lisp Machine in its several manifestations might be the only
Post by Steve Haflich
counterexample,
This, however, I disagree with. There are operating systems that deal
solely with managed-code objects. If one considers, e.g., IL to be the
"hardware" that sits on top of the underlying native instruction set acting
as microcode, then Microsoft's Singularity system could be described as
approximately equivalent to a Lisp machine in this regard.
since there was no C boundary over which to cross, and because type and
Post by Steve Haflich
bounds checking was performed for free in the microcode. But Lisp machines
aren't around any more largely because of the economy of scale. The number
of x86 and x64 processors on the planet must be nearly on the order of
10^9, while the number of Lisp machine processors never got out of the 10^5
range, so Intel and AMD etc. could justify huge investments making those
processors 3 orders of magnitude faster in raw speed. Lisp processors
could not have kept up at bearable per-item cost. Alas!
It is certainly true that the Heartbleed bug resulted from an
insufficiently-cautious implementation of an (overly?)complex
specification. The author of the bug has essentially agreed with this
analysis. But the "bounds checking" of most Lisp implementations would
provide no protection against this failure (about which the original
posting agrees) unless the succinctness and superior clarity of CL vs C
code might help it be seen. That's a thin thread on which to hang an
entire language argument.
Actually, I'm not sure about that; in this case, the boundary violation
was real and due to not taking into account the length of the input (e.g.,
one memcpy'd more than had been provided, reading off the end of the source
buffer). But it was a rookie C programmer mistake, and I agree that this
is indeed scant ammunition in a language beef.
Post by Steve Haflich
(*) I originally saw this beautiful metaphor, that C treats memory as an
undifferentiated sea of bytes, on some discussion list but can't remember
the originator. Google shows current use scattered over many programming
subjects, but doesn't identify the original. Anyway, it is the reason that
a small hyper-efficient C-struct-in-Lisp defining macro I wrote for a
certain huge CL software product is named "define-sea-struct" and (I used
to be a sailor) the operator for computing offsets possibly through
multiple levels of nested structs is called "following-sea".
Paradoxically, http://www.ibiblio.org/hyperwar/NHC/fairwinds.htm says
"following seas" means "SAFE journey, good fortune" [emphasis added].
Semper Fi.
- Dan C.
_______________________________________________
pro mailing list
http://common-lisp.net/cgi-bin/mailman/listinfo/pro
Scott L. Burson
2014-04-27 23:55:15 UTC
Permalink
Post by Steve Haflich
At some point on any modern OS, reading or writing
to a socket stream will involve passing to the OS (generally via a thin
user-mode C API layer like *nix read() and write(), or some socket
analogue). Neither Lisp nor C will provide any automatic bounds checking on
such a call. The OS treats the application's address space as a
mostly-contiguous undifferentiated sea of bytes(*). It doesn't matter that
at the app level C also has this model of a sea of bytes, while in Lisp the
ocean is run-time tagged into small plots. That distinction disappears once
one calls write(fd,buf,len).
I think we've all understood that.

But here's the thing. If you're writing at the application level in
Lisp (or Java, or Python, or Ruby, or ...) you're probably not going
to code the foreign call to 'write' yourself. You're probably going
to invoke some stream operation that was written either by the Lisp
implementor or by the author of a portable library. This means the
person who writes the foreign call (a) is probably more experienced
and in more of a mindset to think about things like bounds checking;
(b) is therefore likelier to insert a check that the number of bytes
you want written is no greater than the length of the array you've
provided; and (c) has the information available at that point in the
code to make that check, because the array is not represented as just
a raw byte pointer. The last point is the most important: the library
writer _can_ make the check, which is not true the way things are
usually done in C.

C, in contrast, has people writing dangerous low-level calls _all the
time_, in _application_ code. The odds of it being done correctly
every time are far poorer -- in practice, approximately zero, in any
substantial program.

It certainly is possible to write better C libraries that relieve the
application programmer of some of this burden; both Microsoft and
Apple have done some of this. But the use of these libraries is not
yet routine in portable POSIX code, and I don't know that they would
have caught the Heartbleed bug anyway.

I'm not suggesting that bounds errors are the only source of security
vulnerabilities -- William's list is a good one -- nor that use of a
"safe" language is an absolute guarantee that one won't have them.
But in practice, an attacker's time is not well spent looking for
bounds errors in applications written in Lisp/Java/etc. It _is_ well
spent looking for them in C code.

-- Scott
Jean-Claude Beaudoin
2014-04-28 03:31:41 UTC
Permalink
Post by Steve Haflich
I agree with essentially everything in wglb's message, but (once again)
I'll grumpily jump in to emphasize a point which I think many on this list
have missed.
On Fri, Apr 25, 2014 at 1:20 PM, William Lederer <
Post by William Lederer
I agree with the general sentiment that Lisp is a much safer language to
build anything in. While several in this thread are pointing to bounds
checking as one of the advantages that Lisp has over C and other languages,
there is something else I find that is also very strong: It is easier to
write programs about which a reader can reason about correctness. In Lisp,
the programs tend to be closer to provable and errors are more evident. As
in "obviously no deficiencies" vs "no obvious deficiencies".
But in my experience, vulnerabilities result from
- Buffer Overflows/lack of bounds checking (Heartbleed and friends)
- Configuration errors
- Logic Flaws
- Dangerous use of user input (leading to SQLi, XSS, XSRF)
- Improper use of cryptography
- Unclear protocol specification (leading to OpenSSL)
This (IMO entirely worthy and correct) summary can easily be
misunderstood! Lisp may be superior because it has bounds checking. (We've
previously agreed that isn't guaranteed since it isn't in the ANS, and in
any platform likely depends on optimization qualities, including the
optimization qualities under which internal called routines were compiled.)
But bugs based on buffer overflow don't on normal operating systems in
general involve bounds checking. At some point on any modern OS, reading
or writing to a socket stream will involve passing to the OS (generally via
a thin user-mode C API layer like *nix read() and write(), or some socket
analogue). Neither Lisp nor C will provide any automatic bounds checking
on such a call. The OS treats the application's address space as a
mostly-contiguous undifferentiated sea of bytes(*). It doesn't matter that
at the app level C also has this model of a sea of bytes, while in Lisp the
ocean is run-time tagged into small plots. That distinction disappears
once one calls write(fd,buf,len).
The Lisp Machine in its several manifestations might be the only
counterexample, since there was no C boundary over which to cross, and
because type and bounds checking was performed for free in the microcode.
But Lisp machines aren't around any more largely because of the economy of
scale. The number of x86 and x64 processors on the planet must be nearly
on the order of 10^9, while the number of Lisp machine processors never got
out of the 10^5 range, so Intel and AMD etc. could justify huge investments
making those processors 3 orders of magnitude faster in raw speed. Lisp
processors could not have kept up at bearable per-item cost. Alas!
I think it is not only a question of level of investment, nor a question of
Lisp in hardware versus some other higher-level language in hardware, for
that matter. There seems to be some physico-technical optimality point in
question here at the hardware/software interface. From my (fading?)
memories of a past era I can somewhat recall that the last (I think) major
CPU architecture that took security support seriously in hardware was the
Intel iAPX 432, with multiple nested security rings in hardware and
descriptor-supported gates/instructions. (BTW, the 432 was meant to support
Ada of all languages, not C or Lisp, but it was general-purpose enough.)
And history has recorded how well this iAPX 432 architecture flew.

And while I am using the word "fly" I have that urge to ask you guys this
question: What would you personally fly, software written in C or software
written in Common Lisp? And I mean it quite literally, with you sitting in
the plane. I think that the fact that one can seriously ask that question
is one of the most significant evolutions in the demands the general context
presents to any programming language standard. In 1994 fly-by-wire was
cutting edge and still quite experimental; now in 2014 it is the everyday
reality of routine commercial flight.

I see (a somewhat revised?) Common Lisp as a very good starting point to
address this new reality.
Better than C, that is for sure. (How one can even hope to make C more safe
and secure and yet still be C is beyond my understanding, FWIW.)
William Lederer
2014-04-28 14:09:21 UTC
Permalink
Regarding the question

What would you personally fly, software written in C or software written in
Common Lisp?

In the reality of today's fly-by-wire, the modern planes you fly in are
likely to have C in some critical component. Ada is likely there as well.

But let's just examine a few software related disasters to see if they are
attributable to programming language:

- Ariane 5 rocket explosion: from the official report, "This loss of
information was due to specification and design errors in the software of
the inertial reference system."
- Mars Climate Orbiter: one system used metric units, another used
English units.
- Therac-25: improper understanding of multi-tasking code.
- Heartbleed: overly complex protocol combined with being able to read
beyond allocated memory.

Of these, only Heartbleed can count language as a contributing factor.

And I again point out a software non-disaster, qmail, whose author offered a
bug bounty. Secure programs can be written in C.

And if the flight safety of an aircraft depended upon the current Lisp
version of Ironclad's impenetrability, we would be in trouble.

I do prefer Lisp, and as I have said before, I think it is easier to write
correct and thus secure and safe programs in Lisp, but that is only a small
part of the story. Other critical parts to the story are:

- How well is the software specified?
- Who is the team writing the software? Are they CMM level 5?
- Is the software tuned to the user situation at hand? When the engine
exploded on Qantas Flight 32, the pilots had to deal with an almost
overwhelming number of alerts.

You do ask a good question, but in my opinion, choice of language is not at
the top of the list.

wglb
(P. S. I am not a Lisp expert, but I have been programming for 48 years,
including real-time medical software, compilers, and financial feed software.
For the medical system we used assembly language; C had not yet been
invented, but it turns out that doing coroutines in assembler was better than
the threads that showed up later in C.)



Antoni Grzymała
2014-04-28 14:19:24 UTC
Permalink
Post by William Lederer
And I again point out a software non-disaster qmail, whose author
offered a bug bounty. Secure programs can be written in C.
I think you should stop glorifying qmail: it has known bugs, violates
some RFCs, and the author (who turns out to be rather arrogant here)
wouldn't pay out the bounty:

http://www.dt.e-technik.uni-dortmund.de/~ma/qmail-bugs.html
--
[アントシカ]
William Lederer
2014-04-28 15:52:03 UTC
Permalink
Sorry, I am familiar with the controversy regarding his personality, his
arguments about the denial-of-service issues, and the claimed security bug
that occurs if the memory allocated to qmail exceeds the number of bytes
countable in 32 bits. Yes, he is arrogant, but he does work of the first
order.

I stand by my recommendation, and stand by the assertion that secure coding
can and has been done in C.

What is lost in this controversy is the sheer magnitude of vulnerabilities
in sendmail historically.

wglb
William Lederer
2014-04-29 11:48:22 UTC
Permalink
Also, DJB wrote a replacement for the bug-infested BIND called djbdns.
That too had a security guarantee. Someone found a bug, and DJB paid out
$1000.

wglb
Pascal J. Bourguignon
2014-04-28 22:40:09 UTC
Permalink
Post by William Lederer
Regarding the question
What would you personally fly, software written in C or software
written in Common Lisp? 
In the reality of today's fly-by-wire, the modern planes you fly in
are likely to have C in some critical component. Ada is likely there
as well.
But let's just examine a few software related disasters to see if
they are attributable to programming language:
- Ariane 5 rocket explosion: from the official report, "This loss of
information was due to specification and design errors in the
software of the inertial reference system."
- Mars Climate Orbiter: one system used metric units, another used
English units.
- Therac-25: improper understanding of multi-tasking code.
- Heartbleed: overly complex protocol combined with being able to
read beyond allocated memory.
Of these, only Heartbleed can count language as a contributing factor.
Not at all.


* Programmed in Common Lisp, either the fixnum in the Ariane 5 would have
  been converted into a bignum, or a condition would have been
  signaled, which could have been handled. This would have taken
  time, which could perhaps have "exploded" the real-time constraints,
  but it is better to control your rocket sluggishly than not to
  control it at all.

* Programmed in Common Lisp, instead of using raw numbers of physical
magnitudes, you'd use objects such as:

(+ #<kilometer/hour 5.42> #<foot/fortnight 12857953.0> )
--> #<meter/second 4.7455556>

and Mars Climate Orbiter wouldn't have crashed.

* Programmed in Common Lisp, the Therac-25 bug wouldn't have occurred:

"The defect was as follows: a one-byte counter in a testing
routine frequently overflowed; if an operator provided manual
input to the machine at the precise moment that this counter
overflowed, the interlock would fail."

since again, incrementing a counter doesn't fucking overflow in
lisp!

* Programmed in Common Lisp, Heartbleed wouldn't have occurred, because
lisp implementors provide array bound checks, and lisp programmers
are conscious enough to run always with (safety 3), as previously
discussed in this thread.


What I'm saying is that there's a mindset out there, of using modular
arithmetic to approximate arithmetic blindly. Until you are able to
pay $1.29 for 3 kg of apples @ $2.99, people should not program with
modular arithmetic!
Post by William Lederer
And I again point out a software non-disaster qmail, whose author
offered a bug bounty. Secure programs can be written in C.
Postfix, too, is architected to deal with security.

You can also write secure software on a Turing Machine.
Post by William Lederer
And if the flight safety of an aircraft depended upon the current
Lisp version of Ironclad's impenetrability, we would be in trouble.
This is another question, that of the resources invested in a software
ecosystem, and that of programming language mind share. Why don't the
cryptographers write their libraries in Common Lisp, instead of choosing to
produce piles of C?
--
__Pascal Bourguignon__
http://www.informatimago.com/
"Le mercure monte ? C'est le moment d'acheter !"
William Lederer
2014-04-29 02:23:03 UTC
Permalink
The reason programmers don't write these libraries in Common Lisp (ignoring
what seems to be a screaming terror of the parenthetical, functional world)
is that cryptography must be fast, and it must not leak timing information.

A final word here--I spend my days auditing and pen testing programs
written in managed languages: C# and Java. None of the errors that bring
down systems and lead to breaches in these languages result from bounds
checking or buffer overflow issues. None of them are subject to the same
kinds of flaws C exposes as evidenced by heartbleed. Nonetheless, there are
vulnerabilities.

And I am sure that all remember the vulnerability exposed in Y Combinator's
Hacker News, which is written in Lisp. Simply writing your stuff in Lisp is not enough.

wglb


Alexander Schreiber
2014-04-29 07:18:12 UTC
Permalink
Post by William Lederer
The reason programmers don't write these libraries in Common Lisp (ignoring
what seems to be a screaming terror of the parenthetical, functional world)
is that cryptography must be fast, and it must not leak timing information.
A final word here--I spend my days auditing and pen testing programs
written in managed languages: C# and Java. None of the errors that bring
down systems and lead to breaches in these languages result from bounds
checking or buffer overflow issues. None of them are subject to the same
kinds of flaws C exposes as evidenced by heartbleed. Nonetheless, there are
vulnerabilities.
And I am sure that all remember the vulnerability exposed in Y Combinator's
Hacker News, which is written in Lisp. Simply writing your stuff in Lisp is not enough.
And that is a point that bears repeating: Whatever programming language
you end up using, it will not magically protect you from all errors or
mistakes. Depending on its design and other details, it might protect you
from _some_ classes of errors (such as shooting yourself in the foot with
pointers), but no matter what language, there _will_ still be plenty of
bear traps patiently waiting for the unwary. Heck, even something as heavily
discipline-and-bondage as SPARK Ada leaves opportunities to screw up
big time - just get your design assumptions wrong and you can be toast.

Kind regards,
Alex.
--
"Opportunity is missed by most people because it is dressed in overalls and
looks like work." -- Thomas A. Edison
Alexander Schreiber
2014-04-29 07:12:57 UTC
Permalink
Post by Pascal J. Bourguignon
Post by William Lederer
Regarding the question
What would you personally fly, software written in C or software
written in Common Lisp? 
In the reality of today's fly-by-wire, the modern planes you fly in
are likely to have C in some critical component. Ada is likely there
as well.
But let's just examine a few software related disasters to see if
they are attributable to programming language:
- Ariane 5 rocket explosion: from the official report, "This loss of
information was due to specification and design errors in the
software of the inertial reference system."
- Mars Climate Orbiter: one system used metric units, another used
English units.
- Therac-25: improper understanding of multi-tasking code.
- Heartbleed: overly complex protocol combined with being able to
read beyond allocated memory.
Of these, only Heartbleed can count language as a contributing factor.
Not at all.
Any programming language will have a hard time protecting you from
design/specification errors. And providing bandaids that paper over
design problems doesn't really help.
Post by Pascal J. Bourguignon
* Programmed in Common Lisp, either the fixnum in the Ariane 5 would have
been converted into a bignum, or an condition would have been
signaled, which could have been handled. This would have taken
time, which could perhaps have "exploded" the real time constraints,
but it is better to control your rocket slughishly than not to
control it at all.
That was not the real problem. The root cause was the design assumption that
the overflowing value was _physically_ limited, i.e. during normal operation
it would have been impossible to overflow, and an overflow would in fact have
signaled problems serious enough to abort. While this held true in
Ariane 4, it no longer was true in the more powerful Ariane 5.

Your "solution" would have papered over the flawed design assumptions, which
is _not_ the same as fixing them.
Post by Pascal J. Bourguignon
* Programmed in Common Lisp, instead of using raw numbers of physical
(+ #<kilometer/hour 5.42> #<foot/fortnight 12857953.0> )
--> #<meter/second 4.7455556>
and Mars Climate Orbiter wouldn't have crashed.
This is ridiculous. If you end up mixing measurement systems (such as metric
and imperial) in the same project, you are _already_ doing it horribly wrong.
The design fault was mixing measurement systems, which one should _never_ do
on pain of embarrassing failure. Papering over this design screwup with a
language environment that _supports_ this (instead of screaming bloody
murder at such nonsense) doesn't really help here.
Post by Pascal J. Bourguignon
"The defect was as follows: a one-byte counter in a testing
routine frequently overflowed; if an operator provided manual
input to the machine at the precise moment that this counter
overflowed, the interlock would fail."
But why did the counter overflow in the first place? Was it simply programmer
oversight that too small a datatype was used, or was this actually an error
that just didn't have noticeable consequences most of the time? If the
latter, then again, papering over it with a never-overflowing counter is
not a fix.
Post by Pascal J. Bourguignon
since again, incrementing a counter doesn't fucking overflow in
lisp!
* Programmed in Common Lisp, heartbleed wouldn't have occured, because
lisp implementors provide array bound checks, and lisp programmers
are conscious enough to run always with (safety 3), as previously
discussed in this thread.
Hehe, "conscious enough to run always with (safety 3)". Riiiiight. And nobody
was ever tempted to trade a little runtime safety for speed, correct?

As for heartbleed: arguably, the RFC that the broken code implemented
shouldn't have existed in the first place.
Post by Pascal J. Bourguignon
What I'm saying is that there's a mindset out there, of using modular
arithmetic to approximate arithmetic blindly. Until you are able to
pay $1.29 for 3 kg of apples @ $2.99, people should not program with
modular arithmetic!
Well, modular arithmetic doesn't go away because one wishes it so. As a
developer doing non time critical high level work one might be able to
cheerfully ignore it, but the moment one writes sufficiently time critical
or low level code one will have to deal with it. Because modular arithmetic
is what your CPU is doing - unless you happen to have a CPU at hand that
does bignums natively at the register level? No? Funny that.
Post by Pascal J. Bourguignon
Post by William Lederer
And I again point out a software non-disaster qmail, whose author
offered a bug bounty. Secure programs can be written in C.
postfix too is architectured to deal with security.
You can also write secure software on a Turing Machine.
Software running on _actual_ Turing machines tends to be of mostly limited
use, though.
Post by Pascal J. Bourguignon
Post by William Lederer
And if the flight safety of an aircraft depended upon the current
Lisp version of Ironclad's impenetrability, we would be in trouble.
This is another question, that of the resources invested in a software
ecosystem, and that of programming language mind share. Why don't the
cryptographers write their libraries in Common Lisp, instead of choosing to
produce piles of C?
Usefulness. If I write a library in C, pretty much everything that runs on
Unix can link to it (if need be, via FFI and friends) and use it. If I write
a library in Common Lisp, then only code written in Common Lisp can use it,
unless people are willing to do some interesting contortions (such as
wrapping it in an RPC server).

Exercise for the interested: write a library in Common Lisp that does, say,
some random data frobnication and try to use it from: C, Python, Perl, C++
_without_ writing new interface infrastructure.

Kind regards,
Alex.
--
"Opportunity is missed by most people because it is dressed in overalls and
looks like work." -- Thomas A. Edison
Pascal J. Bourguignon
2014-04-29 07:45:55 UTC
Permalink
Post by Alexander Schreiber
Post by Pascal J. Bourguignon
* Programmed in Common Lisp, either the fixnum in the Ariane 5 would have
been converted into a bignum, or a condition would have been
signaled, which could have been handled. This would have taken
time, which could perhaps have "exploded" the real time constraints,
but it is better to control your rocket sluggishly than not to
control it at all.
That was not the real problem. The root cause was the design assumption that
the overflowing value was _physically_ limited, i.e. during normal operation
it would have been impossible to overflow, and an overflow would in fact have
signaled problems serious enough to abort. While this held true for
Ariane 4, it no longer was true in the more powerful Ariane 5.
Your "solution" would have papered over the flawed design assumptions, which
is _not_ the same as fixing them.
You’re forgetting we’re talking about embedded programs with real-time processes.
You don’t have the time to stop everything and “debug” the design.
You have to control a rocket and avoid it crashing!

That’s the reason I’ve not mentioned RAX yet: the situation was quite different,
since they had the time to perform remote debugging, over several days.
Post by Alexander Schreiber
* Programmed in Common Lisp, instead of using raw numbers of physical
Post by Pascal J. Bourguignon
(+ #<kilometer/hour 5.42> #<foot/fortnight 12857953.0> )
--> #<meter/second 4.7455556>
and Mars Climate Orbiter wouldn't have crashed.
This is ridiculous. If you end up mixing measurement systems (such as metric
and imperial) in the same project, you are _already_ doing it horribly wrong.
It wasn’t in the same project. The data was actually sent from a remote Earth station.
So this is even worse than not using magnitudes with units inside the process: it was a
serialization/deserialization error. But notice how Lisp prints out the speeds above!
It writes the units along with the values!

Now, of course it’s not a programming language question. We already determined that,
when noting that neither the ANSI Common Lisp nor the ANSI C standard imposes
bounds checking, but that C programmers don’t write bounds checks, and C implementers,
being C programmers, implement compilers that don’t do bounds checking, while the
inverse is true of Common Lisp programmers.

This is always the same thing: “statically typed” proponents want to separate the checks
from the code, performing (or not) the checks during design/proof/compilation, while
“dynamically typed” proponents keep the checks inside the code, making the compiler
and system generate and perform all the typing, bounds, etc. checks at run-time.
So when a C guy (any statically typed guy) sends data, he expects that the type and
bounds of the data are known (beforehand, by both parties). But when a Lisp guy (any
dynamically typed guy) sends data, he sends it in a syntactic form that explicitly
types it, and the data is parsed, validated, bounds-checked and typed according to
the transmitted syntax on the receiving end.


Of course, generating C code doesn’t mean that you can’t design your system in a
“dynamically typed” spirit. But this is not the natural noosphere of the C ecosystem.
Post by Alexander Schreiber
The design fault was mixing measurement systems, which one should _never_ do
on pain of embarrassing failure. Papering over this design screwup with a
language environment that _supports_ this (instead of screaming bloody
murder at such nonsense) doesn't really help here.
Again, we are talking about an embedded program, in a real time system, where you
have only seconds of burn stage on re-entry, and where you DON’T HAVE THE TIME
to detect, debug, go back to the drawing board, compile and upload a new version!

The software that uploaded the untagged, unit-less bit field *data*, instead of
some meaningful *information*, hadn’t even been completed before the orbiter was
in space! It wasn’t developed by the same team, and wasn’t compiled into the same
executable.

Nonetheless, here a lisper would have sent *information* in a sexp, and dynamic
checks and conversions would have been done.

If you will, the design would have been different in the first place!
Post by Alexander Schreiber
Post by Pascal J. Bourguignon
"The defect was as follows: a one-byte counter in a testing
routine frequently overflowed; if an operator provided manual
input to the machine at the precise moment that this counter
overflowed, the interlock would fail."
But why did the counter overflow in the first place? Was it simply programmer
oversight that too small a datatype was used or was this actually an error
that just didn't have noticeable consequences most of the time? If the
latter, then again, papering over it with a never overflowing counter is
not a fix.
But if it was a problem, it *would* eventually reach a bounds check, and signal
a condition, thus stopping the process of irradiating and killing people.

Remember: a Lisp program (any “dynamically typed” program) is FULL of checks!
Post by Alexander Schreiber
Post by Pascal J. Bourguignon
since again, incrementing a counter doesn't fucking overflow in
lisp!
* Programmed in Common Lisp, heartbleed wouldn't have occurred, because
lisp implementors provide array bound checks, and lisp programmers
are conscious enough to run always with (safety 3), as previously
discussed in this thread.
Hehe, "conscious enough to run always with (safety 3)". Riiiiight. And nobody
was ever tempted to trade a little runtime safety for speed, correct?
Those are C programmers. You won’t find any safety other than 3 in my code.
You should not find any safety other than 3 in mission-critical code, much less
in life-critical code.
Post by Alexander Schreiber
As for heartbleed: arguably, the RFC that the broken code implemented
shouldn't have existed in the first place.
Post by Pascal J. Bourguignon
What I'm saying is that there's a mind set out there, of using modular
arithmetic to approximate arithmetic blindly. Until you will be able to
[…], do not program with modular arithmetic!
Well, modular arithmetic doesn't go away because one wishes it so. As a
developer doing non time critical high level work one might be able to
cheerfully ignore it, but the moment one writes sufficiently time critical
or low level code one will have to deal with it. Because modular arithmetic
is what your CPU is doing - unless you happen to have a CPU at hand that
does bignums natively at the register level? No? Funny that.
This might have been true in 1968, when adding a bit of memory added 50 g of payload!

Nowadays, there’s no excuse.
Post by Alexander Schreiber
Post by Pascal J. Bourguignon
Post by William Lederer
And if the flight safety of an aircraft depended upon the current
Lisp version of Ironclad's impenetrability, we would be in trouble.
This is another question, that of the resources invested in a software
ecosystem, and that of programming language mind share. Why the
cryptographists don't write their libraries in Common Lisp and choose to
produce piles of C instead?
Usefulness. If I write a library in C, pretty much everything that runs on
Unix can link to it (if need be, via FFI and friends) and use it. If I write
a library in Common Lisp, then only code written in Common Lisp can use it,
unless people are willing to do some interesting contortions (such as
wrapping it in an RPC server).
Anything running on Unix can link to libecl.so (which is ironically a CL
implementation using gcc, but we can assume it’s a temporary solution).
Post by Alexander Schreiber
Exercise for the interested: write a library in Common Lisp that does, say,
some random data frobnication and try to use it from: C, Python, Perl, C++
_without_ writing new interface infrastructure.
But the point is to eliminate code written in C, Perl, C++! So your exercise is academic.

—
__Pascal Bourguignon__
Hans Hübner
2014-04-29 08:31:05 UTC
Permalink
For your amusement:

https://github.com/search?q=exec($_GET&ref=cmdform&type=Code
Alexander Schreiber
2014-04-29 19:04:32 UTC
Permalink
Post by Hans Hübner
https://github.com/search?q=exec($_GET&ref=cmdform&type=Code
Execing straight from the network? What could possibly go wrong ...

Kind regards,
Alex.
--
"Opportunity is missed by most people because it is dressed in overalls and
looks like work." -- Thomas A. Edison
Alexander Schreiber
2014-04-29 19:01:19 UTC
Permalink
Post by Alexander Schreiber
Post by Pascal J. Bourguignon
* Programmed in Common Lisp, either the fixnum in the Ariane 5 would have
been converted into a bignum, or a condition would have been
signaled, which could have been handled. This would have taken
time, which could perhaps have "exploded" the real time constraints,
but it is better to control your rocket sluggishly than not to
control it at all.
That was not the real problem. The root cause was the design assumption that
the overflowing value was _physically_ limited, i.e. during normal operation
it would have been impossible to overflow, and an overflow would in fact have
signaled problems serious enough to abort. While this held true for
Ariane 4, it no longer was true in the more powerful Ariane 5.
Your "solution" would have papered over the flawed design assumptions, which
is _not_ the same as fixing them.
You’re forgetting we’re talking about embedded programs with real-time processes.
You don’t have the time to stop everything and “debug” the design.
You have to control a rocket and avoid it crashing!
Who spoke about debugging a live rocket?
Post by Alexander Schreiber
* Programmed in Common Lisp, instead of using raw numbers of physical
Post by Pascal J. Bourguignon
(+ #<kilometer/hour 5.42> #<foot/fortnight 12857953.0> )
--> #<meter/second 4.7455556>
and Mars Climate Orbiter wouldn't have crashed.
This is ridiculous. If you end up mixing measurement systems (such as metric
and imperial) in the same project, you are _already_ doing it horribly wrong.
It wasn’t in the same project. The data was actually sent from a
remote Earth station. So this is even worse than not using magnitudes
with units inside the process: it was a serialization/deserialization
error. But notice how Lisp prints out the speeds above! It writes
the units along with the values!
Now, of course it’s not a programming language question. We already
determined that, when noting that neither the ANSI Common Lisp nor the
ANSI C standard imposes bounds checking, but that C programmers don’t
write bounds checks, and C implementers, being C programmers,
implement compilers that don’t do bounds checking, while the inverse is
true of Common Lisp programmers.
This is always the same thing: “statically typed” proponents want to
separate the checks from the code, performing (or not) the checks
during design/proof/compilation, while “dynamically typed” proponents
keep the checks inside the code, making the compiler and system
generate and perform all the typing, bounds, etc. checks at run-time.
So when a C guy (any statically typed guy) sends data, he expects that
the type and bounds of the data are known (beforehand, by both
parties). But when a Lisp guy (any dynamically typed guy) sends data,
he sends it in a syntactic form that explicitly types it, and the
data is parsed, validated, bounds-checked and typed according to the
transmitted syntax on the receiving end.
Of course, generating C code doesn’t mean that you can’t design your
system in a “dynamically typed” spirit. But this is not the natural
noosphere of the C ecosystem.
Post by Alexander Schreiber
The design fault was mixing measurement systems, which one should
_never_ do on pain of embarrassing failure. Papering over this design
screwup with a language environment that _supports_ this (instead of
screaming bloody murder at such nonsense) doesn't really help here.
Again, we are talking about an embedded program, in a real time
system, where you have only seconds of burn stage on re-entry, and
where you DON’T HAVE THE TIME to detect, debug, go back to the
drawing board, compile and upload a new version!
Again, what is this about live debugging a flying rocket? If you propose
writing your realtime control code and deploying it straight to your
production environment (in that case, the rocket about to lift off) you
have no business writing this kind of code.

You design your system, review the design, implement, test and only
deploy it live if you are confident that it will work correctly (and
the tests agree).

The above problems are things that - at the latest - should have been
caught by the test setups. Preferably in the design stage. Actually,
IIRC for the Ariane issue there was a test that would have revealed the
problem, but it was cancelled as being too costly. Which in retrospect
was of course penny-wise, pound-foolish.
The software that uploaded the untagged, unit-less bit field
*data*, instead of some meaningful *information*, hadn’t even been
completed before the orbiter was in space! It wasn’t developed by the
same team, and wasn’t compiled into the same executable.
Nonetheless, here a lisper would have sent *information* in a sexp,
and dynamic checks and conversions would have been done.
If you will, the design would have been different in the first place!
Still, supporting multiple concurrent measurement systems means adding
complexity. Which is rarely a good idea. So again, the better approach
would have been to make sure to only use _one_ measurement system
(imperial _or_ metric (preferably metric)), which means you don't need
the measurement system awareness and conversion code in the first place.

To borrow a saying from the car industry: "The cheapest and most
reliable part is the one that isn't there in the first place."
Post by Alexander Schreiber
Post by Pascal J. Bourguignon
"The defect was as follows: a one-byte counter in a testing
routine frequently overflowed; if an operator provided manual
input to the machine at the precise moment that this counter
overflowed, the interlock would fail."
But why did the counter overflow in the first place? Was it simply
programmer oversight that too small a datatype was used or was this
actually an error that just didn't have noticeable consequences most
of the time? If the latter, then again, papering over it with a
never overflowing counter is not a fix.
But if it was a problem, it *would* eventually reach a bounds check,
and signal a condition, thus stopping the process of irradiating and
killing people.
Remember: a Lisp program (any “dynamically typed” program) is FULL of
checks!
Post by Alexander Schreiber
Post by Pascal J. Bourguignon
since again, incrementing a counter doesn't fucking overflow in
lisp!
* Programmed in Common Lisp, heartbleed wouldn't have occurred,
because lisp implementors provide array bound checks, and lisp
programmers are conscious enough to run always with (safety 3), as
previously discussed in this thread.
Hehe, "conscious enough to run always with (safety 3)". Riiiiight.
And nobody was ever tempted to trade a little runtime safety for
speed, correct?
Those are C programmers. You won’t find any safety other than 3 in my
code. You should not find any safety other than 3 in mission-critical
code, much less in life-critical code.
There is a _mountain_ of misson critical and/or life threatening code
where "safety 3" is meaningless because it is not written in Lisp.
Post by Alexander Schreiber
As for heartbleed: arguably, the RFC that the broken code
implemented shouldn't have existed in the first place.
Post by Pascal J. Bourguignon
What I'm saying is that there's a mind set out there, of using
modular arithmetic to approximate arithmetic blindly. Until you
will be able to […], do not program with modular arithmetic!
Well, modular arithmetic doesn't go away because one wishes it so.
As a developer doing non time critical high level work one might be
able to cheerfully ignore it, but the moment one writes sufficiently
time critical or low level code one will have to deal with it.
Because modular arithmetic is what your CPU is doing - unless you
happen to have a CPU at hand that does bignums natively at the
register level? No? Funny that.
This might have been true in 1968, when adding a bit of memory added 50 g of payload!
Nowadays, there’s no excuse.
Wrong.

If your code is sufficiently time critical, stuff like that begins to
matter. At the leisurely end we have an ECU: your engine runs at, say
6000 rpm, so you have an ignition coming up every 20 ms. Your code _has_
to fire the sparkplug at the correct time, with sub-millisecond
precision or you'll eventually wreck the engine. While processing a
realtime sensor data stream (engine intake (air, fuel), combustion
(temperature, pressure), exhaust (temperature, pressure, gas mix) and
others). This is routinely done using CPUs that aren't all that super
powerful; in fact, using the cheapest (and that usually means slowest)
CPUs (or rather: microcontrollers) that are still just fast enough.
For example, the Freescale S12XS engine control chip (injector and
ignition) has 8/12 KB RAM and 128/256 KB flash. You are not going to
muck around with bignums in a constrained environment like that ... ;-)

And there are many, many more of those kind of embedded control systems
around than PCs, tablets and phones (all of them pretty powerful
platforms these days) combined.

At the faster end: networking code handling 10 GBit line speeds. With
latencies in the single to double digit microsecond range, you don't have
the luxury of playing with nice abstract, far-away-from-the-metal code
if you want your packet handling code to run at useful speeds.
Post by Alexander Schreiber
Post by Pascal J. Bourguignon
Post by William Lederer
And if the flight safety of an aircraft depended upon the current
Lisp version of Ironclad's impenetrability, we would be in
trouble.
This is another question, that of the resources invested in a
software ecosystem, and that of programming language mind share.
Why the cryptographists don't write their libraries in Common Lisp
and choose to produce piles of C instead?
Usefulness. If I write a library in C, pretty much everything that
runs on Unix can link to it (if need be, via FFI and friends) and
use it. If I write a library in Common Lisp, then only code written in
Common Lisp can use it, unless people are willing to do some
interesting contortions (such as wrapping it in an RPC server).
Anything running on Unix can link to libecl.so (which is ironically a
CL implementation using gcc, but we can assume it’s a temporary
solution).
Post by Alexander Schreiber
Exercise for the interested: write a library in Common Lisp that
does, say, some random data frobnication and try to use it from: C,
Python, Perl, C++ _without_ writing new interface infrastructure.
But the point is to eliminate code written in C, Perl, C++! So your exercise is academic.
I can very confidently say: This will never happen. Just look at the
amount of _COBOL_ code that is still in use. In fact, people are _still_
writing COBOL (a dude I know does just that for a bank).

Kind regards,
Alex.
--
"Opportunity is missed by most people because it is dressed in overalls and
looks like work." -- Thomas A. Edison
Jean-Claude Beaudoin
2014-04-29 08:59:26 UTC
Permalink
Post by Alexander Schreiber
Usefulness. If I write a library in C, pretty much everything that runs on
Unix can link to it (if need be, via FFI and friends) and use it. If I write
a library in Common Lisp, then only code written in Common Lisp can use it,
unless people are willing to do some interesting contortions (such as
wrapping it in an RPC server).
I just checked http://www.cliki.net/Common%20Lisp%20implementation and I
see that nearly all currently active free implementations of CL have a FFI
with "callback" support and the commercial ones do too. Granted there may
be a "completeness" issue in most of those FFI though. But there is hardly
any serious need for a RPC server solution anymore. And on that
completeness issue (that may not be of much interest but anyway) I happen
to be currently hard at work. So you will soon have at least one free CL
that will do full C99 interfacing, with plenty of C type inferencing and
checking at runtime as Pascal seems to appreciate. The only drag with it
will be that, as in ECL, you will have to make a call to initialize the CL
world/context before using the rest of the interface and probably call a
shutdown of the CL world/context at the end. I hope it is not too much
overhead.
Post by Alexander Schreiber
Exercise for the interested: write a library in Common Lisp that does, say,
some random data frobnication and try to use it from: C, Python, Perl, C++
_without_ writing new interface infrastructure.
What "new interface infrastructure"? What is that infrastructure supposed
to do?
Dan Cross
2014-04-13 16:38:51 UTC
Permalink
Post by David McClain
Just curious for other opinions... but wouldn't this (Heartbleed) sort of
buffer excess read-back failure have been prevented by utilizing a "safe"
language like Lisp or SML?
I used to be an "unsafe" language bigot -- having mastered C/C++ for many
years, and actually producing C compilers for a living at one time. I felt
there should be no barriers to me as master of my machine, and not the
other way around.
But today's software systems are so complex that it boggles the mind to
keep track of everything needed. I found during my transition years that I
could maintain code bases no larger than an absolute max of 500 KLOC, and
that I actually started losing track of details around 100 KLOC. Making the
transition to a higher level language like SML or Lisp enabled greater
productivity within those limits for me.
Part of the issue vis-a-vis security is that for many applications, much of
the complexity is abstracted away into some library that the programmer may
only be dimly aware of. While it used to be that many largish programs
were more or less self-contained, often depending only on the system
libraries, nowadays they tend to have a very broad set of dependencies on a
large set of libraries that themselves have a similarly large set of
dependencies. Indeed, the applications themselves are often little more
than glue tying together a (to me, anyway) surprisingly large number of
disparate libraries: transitively, the dependency graph is many times
larger than it was a decade or two ago and hides an astonishing amount of
complexity, even for applications that themselves appear trivial. Thus,
you may "introduce" a security hole into your application simply by using a
library that provides some useful bit of functionality but is implemented
terribly in a way that is not easily visible to you: that seems to be the
case with services that are affected by heartbleed.

Could using a safe language (or even one that implemented array bounds
checking) have prevented this particular bug? Absolutely. But in the
general case, modern applications have a huge surface area for attack
because of the layering of dependencies on large, complicated bits of
software that the application program has little to no control over.
Further, building all the requisite dependencies oneself in a safer
language is such a daunting task as to be generally infeasible, even if it
makes sense for specific applications. And even if I did that, eventually
I am going to pop down into some layer in the operating system or a system
library or the language runtime that is out of my control, and those things
seem to have also increased in size and complexity by a few orders of
magnitude over the last 20 or so years. And even if I write my own
compiler, operating systems, libraries, etc, I still have to wonder whether
the hardware itself is truly secure ("DEITYBOUNCE", anyone? Let alone
actual, you know, errors in the hardware). And this is completely ignoring
the value-loss proposition of targeting safer but lesser-used languages.
For better or for worse, things like heartbleed just aren't going to sway
many library writers to give up on a huge, existing target audience (even
if they should).

So even if I as a programmer am extremely careful and diligent, I may still
be burned by something rather distant from the work I myself am doing, and
I have finite capacity to influence that.

Of course, that doesn't mean that one should not oneself be careful and
diligent, or even reimplement things safely where one can! Only that the
problem is rather more difficult than can be solved by simply switching
implementation languages.

- Dan C.