style question about types, coercion, expectations for function parameters

Discussion:

Ryan Davis

2012-05-31 18:50:36 UTC

Archived-At: <http://permalink.gmane.org/gmane.lisp.cl-pro/692>

Pro-cl,

I'd like to sanity check some of the lisp idioms my shop has (re)invented.

Some background: We do a lot of web programming, shuffling data to and
fro, occasionally doing interesting calculations, but mostly pushing
bits around and displaying information in ways humans can use. We have
very few CPU bound operations, and most of those are complicated SQL
queries, not lisp operations. Most of the applications we create have
few users. As such, we haven't put much time into optimizing our lisp
code, and rely on dynamism to support rapid development. We have very
few type declarations, liberal use of CLOS, not too concerned with
consing, etc. Basically using the simplest code we can get away with.
It works well for our purposes, and we complicate code for speed when we
can't get away with it. We came to lisp from C#, so have a distaste for
that style of static typing.

I'd like some more opinions on a pattern that has cropped up. One of the
problems we were having was quickly determining what a function expected
for its arguments. As a somewhat contrived example, SLIME helpfully
would tell me that #'send-email wanted (to from subject body), but then
it was left to me to guess what values I should pass in. In real code
this was frequently non-trivial, and we'd be hand-tracing to figure out
where the parameter was used to figure out what it should be. Should
"to" be a string, a CLOS Client object, or the database ID of a Client?
The answer we arrived at was "yes":

(defun send-email (to from subject body)
  (let ((to (etypecase to
              (string to)
              ((integer 0) (email (fetch-client to)))
              (client (email to)))))
    ;; ... more code
    ))

The "to" parameter can be anything that can be mapped to an email
address. It is send-email's job to send email, and it will figure it out
based on whatever you provide. If it can't do it, it'll tell you.
Usually the etypecase is pulled into its own function, and we have
something like:

(defun send-email (to from subject body)
  (let ((to (coerce-email to)))
    ;; ... more code
    ))

That is in turn usually wrapped in a setf-generating macro, so we have:

(defun send-email (to from subject body)
  (coerce-email! to)
  ;; ... more code
  )

This is very easy to follow, and when figuring out what to pass for
"to", it's trivial to M-. a few times and see what are the allowed values.
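For readers who want to try the idiom, here is a minimal sketch of how the three variants above could fit together. The CLIENT class, the in-memory *CLIENTS* table, and the body of FETCH-CLIENT are hypothetical stand-ins for the real model layer, and the macro is only one plausible way to write the "setf-generating" wrapper described above.

```lisp
;; Hypothetical stand-ins for the real model layer.
(defclass client ()
  ((email :initarg :email :reader email)))

(defvar *clients* (make-hash-table)
  "Toy in-memory table mapping integer IDs to CLIENT objects.")

(defun fetch-client (id)
  (or (gethash id *clients*)
      (error "No client with ID ~D" id)))

(defun coerce-email (to)
  "Resolve TO (a string, a client ID, or a CLIENT) to an address string."
  (etypecase to
    (string to)
    ((integer 0) (email (fetch-client to)))
    (client (email to))))

(defmacro coerce-email! (place)
  "Overwrite PLACE with its coerced email address.
Intended for simple variables; subforms of PLACE are evaluated twice."
  `(setf ,place (coerce-email ,place)))

(defun send-email (to from subject body)
  (coerce-email! to)
  ;; A real version would hand off to a mail library here.
  (list :to to :from from :subject subject :body body))
```

Anything outside the three listed types falls through ETYPECASE and signals a TYPE-ERROR, which is the "it'll tell you" behaviour.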

In rare cases we want this typecase to be extensible, for example to
allow another package to add new mappings. When this happens we end up
with a generic method:

(defmethod coerce-email (to)
  ;; same typecase as before
  )
;; elsewhere
(defmethod coerce-email ((to server))
  (administrator-email to))

Method dispatch takes care of the rest.
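A self-contained sketch of that extensible variant follows. The CLIENT and SERVER classes and their slots are assumptions made up for the example, and the database-ID branch is left out to keep it short.

```lisp
;; Hypothetical classes; only the readers matter for the example.
(defclass client ()
  ((email :initarg :email :reader email)))

(defclass server ()
  ((administrator-email :initarg :administrator-email
                        :reader administrator-email)))

(defgeneric coerce-email (to)
  (:documentation "Resolve TO, an email designator, to an address string."))

;; Default method: the closed set of cases the library itself knows about.
(defmethod coerce-email (to)
  (etypecase to
    (string to)
    (client (email to))))

;; Elsewhere, e.g. in another package: a new mapping added from outside.
(defmethod coerce-email ((to server))
  (administrator-email to))
```

The method specialized on SERVER is more specific than the unspecialized default, so it is chosen for SERVER instances while everything else still goes through the ETYPECASE.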

Sometimes the coerce-* function is a cond or typecase that strays
outside the flexibility offered by standard method combination, doing
things like using Access's [1] generic interface.

In practice, this approach is mostly used for converting database
CLOS objects to integer IDs and vice versa, depending on whether the
function wants an integer ID or an instantiated object.

Some alternatives we tried:
* using defmethods, but it didn't seem to offer any advantage over a
  simple typecase; it was just more typing, more code to read later, and
  less flexibility if the business wanted something weird.
* using check-type or assert to constrain the input options, like
  C#/Java-style static typing, but that just resulted in duplicate code at
  call sites to convert from whatever you had to whatever send-email wanted.
* having multiple similarly named functions where the function name
  indicated what type was expected (e.g. send-email, send-email-to-client,
  send-email-to-client-id); this also didn't seem to offer any advantage
  over a simple typecase.
* naming parameters after what can be accepted (client-id,
  email-string, client-or-id, etc.), which ended up hard to keep up to date.
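To make the check-type alternative concrete, here is a rough sketch of what it looks like. SEND-EMAIL-STRICT, NOTIFY-CLIENT, and the CLIENT structure are hypothetical names invented for the illustration.

```lisp
;; Hypothetical stand-in for the real CLIENT class.
(defstruct client address)

(defun send-email-strict (to from subject body)
  "Accepts address strings only; anything else signals a TYPE-ERROR."
  (check-type to string)
  (check-type from string)
  (list :to to :from from :subject subject :body body))

;; Every caller that holds a CLIENT now has to do the conversion itself,
;; which is the duplicated call-site code mentioned above.
(defun notify-client (client)
  (send-email-strict (client-address client)
                     "noreply@example.com"
                     "Notice"
                     "Your report is ready."))
```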

There are a lot of tradeoffs with this approach, and obviously this
wouldn't work for anything with high performance requirements. Has
anyone else run into a similar problem and come up with a
different/same/better solution? Any problems I'm missing?

[1] https://github.com/AccelerationNet/access

Thanks,

--
Ryan Davis
Acceleration.net
Director of Programming Services
2831 NW 41st street, suite B
Gainesville, FL 32606

Office: 352-335-6500 x 124
Fax: 352-335-6506

Pascal J. Bourguignon

2012-05-31 19:22:01 UTC

Archived-At: <http://permalink.gmane.org/gmane.lisp.cl-pro/693>

Ryan Davis writes:
> (defun send-email (to from subject body)
>   (let ((to (etypecase to
>               (string to)
>               ((integer 0) (email (fetch-client to)))
>               (client (email to)))))
>     ;; ... more code
>     ))
>
> The "to" parameter can be anything that can be mapped to an email
> address.

Yes, it's what's usually called a "designator", specifically, an email
designator. CL itself defines and uses a few designator types (string
designators, package designators, pathname designators, list
designators, etc).
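A few of the standard designators can be seen at the REPL. The forms below use only standard functions; the file name is an arbitrary example and no file needs to exist.

```lisp
;; String designators: a string, a symbol (denoting its name), or a character.
(defparameter *from-string* (string "foo"))   ; the string itself
(defparameter *from-symbol* (string 'foo))    ; the symbol's name
(defparameter *from-char*   (string #\f))     ; a one-character string

;; Functions that take string designators accept any of these directly.
(defparameter *same-name-p* (string= 'foo "FOO"))

;; Package designators: a package, or a string designator naming one.
(defparameter *same-package-p*
  (eq (find-package "COMMON-LISP") (find-package :common-lisp)))

;; Pathname designators: a namestring is coerced to a pathname object.
(defparameter *path* (pathname "notes.txt"))

;; Function designators: a symbol or a function object.
(defparameter *via-symbol*   (funcall 'car '(1 2)))
(defparameter *via-function* (funcall #'car '(1 2)))
```

With the standard readtable case, (string 'foo) is "FOO", since the reader upcases the symbol's name.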

--
__Pascal Bourguignon__ http://www.informatimago.com/
A bad day in () is better than a good day in {}.

Matthew Mondor

2012-06-01 00:33:09 UTC

Archived-At: <http://permalink.gmane.org/gmane.lisp.cl-pro/695>

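On Thu, 31 May 2012 21:22:01 +0200, "Pascal J. Bourguignon" wrote:
> Yes, it's what's usually called a "designator", specifically, an email
> designator. CL itself defines and uses a few designator types (string
> designators, package designators, pathname designators, list
> designators, etc).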

Designators are indeed an interesting and useful concept.

When creating new ones, some common sense might be required to decide how
to "resolve" them to the final object, and this might need to be documented
as well.

For instance, I noticed that an implementation given a symbol as a
function designator (i.e. 'foo rather than #'foo) might use SYMBOL-FUNCTION
at run time, while its optimizing compiler might generate a direct call to
the function for #'foo if it considers this safe in a whole-file
compile. In the latter case, dynamically redefining FOO after loading
the module might leave existing non-recompiled callers calling the
older version of the function through #'foo, while callers using 'foo
correctly reach the new definition, because the function object is
resolved from the symbol at run time. That is, the symbol function
designator was not resolved at compile time.

The HyperSpec seems unclear about whether designators should always be
resolved at run time, though, and it may be tempting to resolve some at
compile time... But it's clear that some shouldn't be.

--
Matt

Stelian Ionescu

2012-05-31 19:45:28 UTC

On Thu, 2012-05-31 at 14:50 -0400, Ryan Davis wrote:
[...]

> I'd like some more opinions on a pattern that has cropped up. One of the
> problems we were having was quickly determining what a function expected
> for its arguments. As a somewhat contrived example, SLIME helpfully
> would tell me that #'send-email wanted (to from subject body), but then
> it was left to me to guess what values I should pass in. In real code
> this was frequently non-trivial, and we'd be hand-tracing to figure out
> where the parameter was used to figure out what it should be. Should
> "to" be a string, a CLOS Client object, or the database ID of a Client?
>
> (defun send-email (to from subject body)
>   (let ((to (etypecase to
>               (string to)
>               ((integer 0) (email (fetch-client to)))
>               (client (email to)))))
>     ;; ... more code
>     ))
>
> The "to" parameter can be anything that can be mapped to an email
> address. It is send-email's job to send email, and it will figure it out
> based on whatever you provide. If it can't do it, it'll tell you.
> Usually the etypecase is pulled into its own function, and we have
>
> (defun send-email (to from subject body)
>   (let ((to (coerce-email to)))
>     ;; ... more code
>     ))

I've started using this idiom too: for a type FOO, have, in addition to
the assembling constructor that is make-instance, a coercing constructor
with the same name as the type itself, whose type dispatch can be elided
with a clever use of inlining and type declarations. The CL standard has
some instances of this, e.g. with pathnames: cl:pathname coerces a
pathname designator (a pathname, string, or file stream) and
cl:make-pathname assembles a pathname from components.

Example:

(declaim (inline email))
(defun email (email-designator)
  (etypecase email-designator
    (string
     email-designator)
    (unsigned-byte
     (email-of (fetch-client email-designator)))
    (client
     (email-of email-designator))))

(declaim (inline send-email))
(defun send-email (to from subject body)
  (let ((to (email to))
        (from (email from)))
    (%send-email to from subject body)))

(defun send-site-warning (to)
  (declare (type string to))
  (send-email to "admin-1SZh+***@public.gmane.org" "Warning" "Bandwidth quota reached"))

The creation of the above wrapper - send-email that checks type and
coerces then calls %send-email - can also be easily automated with some
macrology.
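One possible shape for such a macro is sketched below. DEFINE-COERCING-WRAPPER is a hypothetical name, it handles required parameters only, and the CLIENT structure and the body of %SEND-EMAIL are stand-ins so the example runs on its own.

```lisp
;; Hypothetical stand-ins so the sketch is self-contained.
(defstruct client address)

(defun email (email-designator)
  (etypecase email-designator
    (string email-designator)
    (client (client-address email-designator))))

(defun %send-email (to from subject body)
  ;; A real worker would talk to a mail library here.
  (list to from subject body))

(defmacro define-coercing-wrapper (name worker lambda-list &rest coercions)
  "Define an inline function NAME that coerces some arguments, then calls WORKER.
LAMBDA-LIST may contain required parameters only.
COERCIONS is a list of (PARAMETER COERCER) pairs."
  `(progn
     (declaim (inline ,name))
     (defun ,name ,lambda-list
       (let ,(loop for (parameter coercer) in coercions
                   collect `(,parameter (,coercer ,parameter)))
         (,worker ,@lambda-list)))))

(define-coercing-wrapper send-email %send-email (to from subject body)
  (to email)
  (from email))
```

The expansion is the same LET-and-call wrapper as the hand-written SEND-EMAIL above, so a caller that declares its argument types still gives the compiler the chance to drop the dispatch after inlining.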

--
Stelian Ionescu a.k.a. fe[nl]ix
Quidquid latine dictum sit, altum videtur.
http://common-lisp.net/project/iolib

Tobias C Rittweiler

2012-06-01 10:22:08 UTC

Ryan Davis wrote:
> I'd like some more opinions on a pattern that has cropped up. One of the
> problems we were having was quickly determining what a function expected
> for its arguments. As a somewhat contrived example, SLIME helpfully
> would tell me that #'send-email wanted (to from subject body), but then
> it was left to me to guess what values I should pass in. In real code
> this was frequently non-trivial, and we'd be hand-tracing to figure out
> where the parameter was used to figure out what it should be. Should
> "to" be a string, a CLOS Client object, or the database ID of a Client?
>
> (defun send-email (to from subject body)
>   (let ((to (etypecase to
>               (string to)
>               ((integer 0) (email (fetch-client to)))
>               (client (email to)))))
>     ;; ... more code
>     ))

Hi Ryan!

As others have pointed out, designators are widely used both in the
Common Lisp language itself and in real-world code.

Here's my experience, which matches yours:

* As Stelian has pointed out, lifting the ETYPECASE into
  a separate function, e.g. EMAIL or EMAIL-DESIGNATOR, is
  usually a good idea.

* If users of the code can meaningfully extend the notion of
  what an email is, then it's a good idea to also export
  a generic function COERCE-EMAIL-DESIGNATOR, which is
  supposed to take a user's object and turn it into something
  that the above function EMAIL-DESIGNATOR can understand.

  If CLIENT is a class, users can already subclass it
  for extension. However:

  Often you have a thing that simply contains a CLIENT (e.g.
  a SESSION), and you might want to be able to just pass the
  SESSION object to send-email to stand for its CLIENT slot.
  In that case COERCE-EMAIL-DESIGNATOR would be the right thing.

* You may also want to define a type EMAIL in your case,
  and declaim the ftype of your functions to take those.

  Or use something like DEFINE-API, which does exactly that
  in a more succinct way. You can find it in named-readtables
  or sequence-iterators.

  E.g. in your example:

  (deftype email () `(or string unsigned-byte client))

  (define-api send-email (to from subject body)
      (email email string string => (values R1 R2)) ; R1, R2: return types
    (let ((to   (email-designator to))
          (from (email-designator from)))
      ...))

  This has three advantages:

  1. C-c C-d d will show you the function's argument types
     (unfortunately fully expanded in the case of SBCL. It would be
     nice if it stored the original types and gave access to them.)

  2. Passing a wrong type can be caught at compile time if that
     type is known at compile time.

     E.g. writing (send-email :A :B ...) will result in a warning
     at compile time.

  3. Combined with good docstrings, you can produce nice documentation
     of the code automatically. The documentation for named-readtables
     and sequence-iterators was entirely auto-generated. Combined
     with hyperdoc and slime-hyperdoc, you can get access to
     documentation and argument types directly from within Slime very
     conveniently.
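For comparison, the same idea can be written with nothing but the standard DEFTYPE and an FTYPE proclamation. This is only a sketch and does not use DEFINE-API; the CLIENT structure, the *CLIENTS* table, FETCH-CLIENT, and the list returned by SEND-EMAIL are hypothetical stand-ins.

```lisp
;; Hypothetical stand-ins for the real model layer.
(defstruct client address)

(defvar *clients* (make-hash-table)
  "Toy lookup table standing in for the database.")

(defun fetch-client (id)
  (or (gethash id *clients*)
      (error "No client with ID ~D" id)))

(deftype email ()
  "Anything EMAIL-DESIGNATOR knows how to resolve."
  '(or string unsigned-byte client))

(defun email-designator (designator)
  "Resolve DESIGNATOR, an object of type EMAIL, to an address string."
  (etypecase designator
    (string designator)
    (unsigned-byte (client-address (fetch-client designator)))
    (client (client-address designator))))

;; Advertise the accepted argument types to the compiler and to tooling.
(declaim (ftype (function (email email string string) list) send-email))
(defun send-email (to from subject body)
  (let ((to   (email-designator to))
        (from (email-designator from)))
    (list to from subject body)))
```

How much checking the proclamation buys depends on the implementation and its optimization settings; whether a call such as (send-email :a :b ...) is flagged at compile time is up to the compiler.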

T