Do symbols need to be EQ?


Edi Weitz

2015-07-03 07:09:26 UTC

Just out of curiosity and without any relevance in practice:

Is there one place in the standard where it is explicitly said that
two symbols which are the "same" symbol must be "identical"? I know
that there are a couple of examples where this is implied, but
formally the examples aren't part of the standard, right?

The EQ dictionary entry for example shows this example:

(eq 'a 'a) => true

and then it continues with this note (emphasis mine): "Symbols that
print the same USUALLY are EQ to each other because of the use of the
INTERN function."

And the entry for INTERN is actually the closest I could find in terms
of clarification because it says that if a symbol of a specified name
is already accessible, _IT_ is returned -- which sounds like object
identity to me.

But how does this fit into the picture?

CL-USER 1 > (defparameter *s* 'foo)
*S*
CL-USER 2 > (unintern 'foo)
T
CL-USER 3 > (defparameter *s2* 'foo)
*S2*
CL-USER 4 > (eq *s* *s2*)
NIL

*S* has lost its home package and is thus not EQ to *S2*, sure, but
how do we explain this in terms of object identity? Has the UNINTERN
operation changed the identity of *S* which once was the one and only
CL-USER::FOO but can't be anymore because this role is now occupied by
*S2*?

Did I miss some clarifying words in the standard? Did I just manage
to confuse myself?

Thanks,
Edi.

PS: The UNINTERN entry warns about side effects which could harm
consistency, so maybe this is what they meant?

Anton Vodonosov

2015-07-03 07:16:57 UTC

I think the most confusing part is what you mean by "same" symbols.


Anton Vodonosov

2015-07-03 07:30:38 UTC

EQ just checks object identity.

Symbol names like CL-USER::FOO are a way to refer to symbol objects
using the package machinery. If we manipulate packages, then dereferencing
the name CL-USER::FOO may return a different object, and the two would not be EQ.

Yes, INTERN gives us the ability to use CL-USER::FOO as a reference to
exactly the same symbol object, unless someone has destroyed the mapping from symbol name to object.

That's what I rely on, and I don't expect the standard to provide
any more guarantees.

Best regards,
- Anton


Edi Weitz

2015-07-03 07:53:19 UTC

Let me repeat: I'm not concerned about whether this could impede my
ability to write CL programs nor am I concerned that some future
implementor might not do the right thing. I just can't see the
internal logic (and the CLHS seems otherwise mostly very clear and
logical to me).

The standard actually defines the word "same" and says that two
objects are the same if they can't be distinguished by EQL (unless
another predicate is explicitly mentioned). But let's forget about
this definition (although it is hard to talk about such concepts if
you can't use certain words). I'm more concerned with object
identity:

1. I guess we all agree that there's one and only one mathematical
object which is the number 536870912.

2. We also all know that on some 32-bit implementations (EQ 536870912
536870912) can yield NIL while (EQL 536870912 536870912) must yield T.

3. So EQL is the preferred predicate in the standard and is intended
to mean that two things are _semantically_ identical although they
might _technically_ be different (like above).

4. EQ on the other hand tests whether its arguments are (according to
the CLHS) "the same, identical object." I've always understood this
as a test for identity at the implementation level I shouldn't be
concerned with. (Leaving the question open why EQ is in the standard
at all...)

5. Now, and I think this is the crucial part, by using EQ to compare
symbols in various parts of the standard, I take this as a suggestion
that there is for example one and only one symbol CL-USER::FOO like
there is one and only one number 536870912. Even more so, because
they use EQ and not EQL they also suggest - it seems to me - that this
one and only one symbol must have one and only one internal
representation.

6. But if you agree with #5 and then look at my UNINTERN example how
do you explain the results? Has the symbol which once was
CL-USER::FOO and is still stored in *S* lost its identity? There are
plenty of operations which modify objects - like (SETF GETHASH) - but
none of them causes the object to lose its identity.

I guess I could rephrase my question like this: Wouldn't it be clearer
if "sameness" of symbols would be defined via EQL with something like:
"Two symbols are EQL if their names are the same under STRING= and
their home packages are the same under EQL." (And maybe some more
sentences if necessary.)
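
To make the proposed definition concrete, here is a minimal sketch of a name-based comparison. SAME-NAME-AND-HOME-P is a hypothetical helper written for this illustration; nothing like it is defined by the standard. It also shows one place where such a definition would diverge from EQ, namely uninterned symbols, which is presumably where the "more sentences" would be needed.

```lisp
;; Hypothetical predicate for the proposed definition: two symbols count
;; as "the same" if their names are STRING= and their home packages are
;; EQL. Not part of the standard; for illustration only.
(defun same-name-and-home-p (a b)
  (and (symbolp a)
       (symbolp b)
       (string= (symbol-name a) (symbol-name b))
       (eql (symbol-package a) (symbol-package b))))

;; Two fresh uninterned symbols: same name, home package NIL for both.
(defparameter *u1* (make-symbol "FOO"))
(defparameter *u2* (make-symbol "FOO"))

(eq *u1* *u2*)                    ; false: two distinct objects
(same-name-and-home-p *u1* *u2*)  ; true under the proposed definition

;; For numbers, EQL is required to compare by value; EQ is not.
(eql 536870912 536870912)         ; always true
```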


Anton Vodonosov

2015-07-03 08:14:52 UTC

I personally don't think that the name CL-USER::FOO in any way represents
the "nature" of the symbol.

The same number may be referenced as #x20000000 and as 536870912.
It's just a way to refer to the object, not the object itself.

Let's consider an example of symbol use:

(defun print-value (value mode)
  (if (eq mode 'mypkg:lowcase)
      (format nil "~(~A~)" value)
      (format nil "~A" value)))

So, (print-value "HelLo" 'mypkg:lowcase) returns "hello".

Let's suppose someone manipulated packages: uninterned and re-interned MYPKG:LOWCASE.

This doesn't break my PRINT-VALUE function, because the contract
of my function is not to print a lower-case value when MODE is a symbol
named "MYPKG:LOWCASE", but when MODE is exactly the symbol referred to
in PRINT-VALUE.

I provided a constant which allows specifying a different mode, and
I provided a way to refer to it via the package system as 'MYPKG:LOWCASE.

If someone destroyed the mapping, well, then he can't use the name to refer to my constant.
He should have stored a reference to it, or something. But PRINT-VALUE remains correct.

How about this treatment?

Best regards,
- Anton


Edi Weitz

2015-07-03 08:36:00 UTC

Post by Anton Vodonosov
This doesn't break my PRINT-VALUE function, because the contract
of my function is not to print a lower-case value when MODE is a symbol
named "MYPKG:LOWCASE", but when MODE is exactly the symbol referred to
in PRINT-VALUE.

I think this is where we agree to disagree. Suppose you had written
your function like so:

(defun print-value (value mode)
  (if (eql mode 42)
      (format nil "~(~A~)" value)
      (format nil "~A" value)))

Would you expect someone to be able to change the identity of the
constant 42 in your function in such a way that it would no longer
work if called as (PRINT-VALUE ... 42)?

Yes, there are different ways to represent 42 (as in binary, octal,
and so on), but unless you totally mess up the readtable, there's no
simple way to make it impossible to refer to 42 with a literal
anymore.

Edi Weitz

2015-07-03 08:48:42 UTC

Perhaps the excerpt below (from a fresh LW image) makes more obvious
what my "philosophical problem" is. I have redacted the output of
DISASSEMBLE to only show the relevant parts. It shows that EQ is
essentially just one simple comparison with a machine word (which is
what I expected). It also shows that I get the same machine word
again as long as I don't mess around with UNINTERN or something. But
once I've done that, I get _another_ machine word and so in terms of
simple-minded EQ I get a different object.

CL-USER 1 > (defun foo-1 (x)
              (eq x 'bar))
FOO-1
CL-USER 2 > (disassemble 'foo-1)
;; ...
  21: 3DF771F921 cmp eax, 21F971F7 ; BAR
  26: 750D jne L3
;; ...
NIL
CL-USER 3 > (defun foo-2 (x)
              (eq x 'bar))
FOO-2
CL-USER 4 > (disassemble 'foo-2)
;; ...
  21: 3DF771F921 cmp eax, 21F971F7 ; BAR
  26: 750D jne L3
;; ...
NIL
CL-USER 5 > (unintern 'bar)
T
CL-USER 6 > (defun foo-3 (x)
              (eq x 'bar))
FOO-3
CL-USER 7 > (disassemble 'foo-3)
;; ...
  21: 3DAB71F921 cmp eax, 21F971AB ; BAR
  26: 750D jne L3
;; ...
NIL

Kenneth Tilton

2015-07-03 08:56:48 UTC


Sorry, where is the problem? The spec is clear that a new object (with a
new pointer) will be created given the unintern hijinx, so all is
consistent: different pointer, EQ->nil.

I.e., it is not just "in terms of EQ" that you have a different object: you
have created two distinct pointer objects (and EQ dutifully says so).

And at a higher level of abstraction, you have created two different
symbols, one interned and one not.

-kt

Alessio Stalla

2015-07-03 09:02:38 UTC

Package = map from symbol name to symbol object.
INTERN ~= (or (gethash ...) (setf (gethash ...)))
UNINTERN ~= remhash

There's nothing special about symbols. You'd get the same effect with a map
of constants and operations to add/remove them from the map.
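
The analogy above can be sketched as a toy model, assuming a "package" is nothing more than an EQUAL hash table from name strings to objects. TOY-INTERN and TOY-UNINTERN are illustrative names, not standard functions, and real packages do considerably more (inheritance, shadowing, home packages).

```lisp
;; A toy "package": a hash table from name strings to objects.
(defparameter *toy-package* (make-hash-table :test 'equal))

(defun toy-intern (name &optional (package *toy-package*))
  "Return the object stored under NAME, creating a fresh one if absent."
  (or (gethash name package)
      (setf (gethash name package) (make-symbol name))))

(defun toy-unintern (name &optional (package *toy-package*))
  "Remove the entry for NAME; return true if there was one."
  (remhash name package))

(defparameter *first* (toy-intern "FOO"))
(eq *first* (toy-intern "FOO"))   ; true: the mapping is intact
(toy-unintern "FOO")
(eq *first* (toy-intern "FOO"))   ; false: a fresh object took over the name
```

The object held in *FIRST* is unchanged throughout; only the table entry under "FOO" now points somewhere else.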


Edi Weitz

2015-07-03 09:31:55 UTC

Post by Alessio Stalla
Package = map from symbol name to symbol object.
INTERN ~= (or (gethash ...) (setf (gethash ...)))
UNINTERN ~= remhash

I would consider that to be an implementation detail. As Anton said,
this is mostly about saving space and time. It would not be
inconceivable to have an "implementation" that worked like so:

(defparameter *my-package* (make-hash-table :test 'equal))

(defun my-intern (symbol-name &optional (package *my-package*))
  (or (gethash symbol-name package)
      (setf (gethash symbol-name package)
            ;; <-- imagine some clever hashing technique
            (parse-integer symbol-name))))

(defun my-unintern (symbol-name &optional (package *my-package*))
  (remhash symbol-name package))

CL-USER > (defparameter *s* (my-intern "42"))
*S*
CL-USER > (my-unintern "42")
T
CL-USER > (eql (my-intern "42") *s*)
T

(Meaning you'd somehow enforce the same "pointer" once the symbol is
"re-created".)

Sam Steingold

2015-07-06 22:28:25 UTC


This behavior is non-compliant.

http://www.lispworks.com/documentation/HyperSpec/Body/f_intern.htm

   If no such symbol is accessible in package, a new symbol with the
   given name is created

I.e., after UNINTERN, there is no symbol with this name, thus INTERN
creates a _NEW_ symbol which cannot be EQ to any other existing object
(this is the definition of the word "new" or "fresh").
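
The "new symbol" case can be observed through the second return value of INTERN, which is NIL when a symbol had to be created. The following is a small sketch under a couple of assumptions: it uses a scratch package whose name, "EQ-DEMO", is an arbitrary choice, and INTERN-STATUS is a helper invented for this example.

```lisp
;; Scratch package so the demo does not touch CL-USER.
(defparameter *demo-package*
  (or (find-package "EQ-DEMO") (make-package "EQ-DEMO" :use '())))

;; Start from a clean slate in case the demo is run twice.
(let ((old (find-symbol "FOO" *demo-package*)))
  (when old (unintern old *demo-package*)))

(defun intern-status (name package)
  "Return the second value of INTERN: NIL means a new symbol was created."
  (nth-value 1 (intern name package)))

(intern-status "FOO" *demo-package*)   ; => NIL, symbol newly created
(intern-status "FOO" *demo-package*)   ; => :INTERNAL, existing symbol returned
(unintern (find-symbol "FOO" *demo-package*) *demo-package*)
(intern-status "FOO" *demo-package*)   ; => NIL again, a new symbol was created
```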


Anton Vodonosov

2015-07-03 09:15:32 UTC

A note about "philosophical problems": if one wants to build
a compact mental model, reasonable and consistent with all
the Common Lisp properties, there may well be more than
one way to do so, and none of the possible models can be proven incorrect.

Post by Anton Vodonosov
I personally don't think that the name CL-USER::FOO in any way represents
the "nature" of the symbol.
The same number may be referenced as #x20000000 and as 536870912.
It's just a way to refer to the object, not the object itself.

I want to correct myself. Unlike numbers or any other objects,
symbols _are_ about names, so we can say that the name CL-USER::FOO
represents the "nature" of the symbol.

I think Common Lisp wants to save memory and speed up comparison,
so when we use the same name we get the same object, as implemented
by INTERN (this trick even has a name: the Flyweight pattern).

So, this is just an optimization trick, and UNINTERN is a maintenance
and system tool, not designed to express programs. We are encouraged to
operate as if the symbol name means the same object.

Anton Vodonosov

2015-07-03 09:22:32 UTC

BTW, you may be interested to know how Clojure handles symbols.

Unlike CL, where a symbol is both a textual name and a slot where
a symbol value may be stored, Clojure separates these concepts.
The slot-holding object is called a Var; it is similar to a CL symbol.
And symbols returned by the Clojure reader are essentially strings,
qualified with a namespace (another string).

Symbols are not reused; the Clojure reader creates new instances of them freely.

Martin Simmons

2015-07-03 12:44:40 UTC

Post by Anton Vodonosov
I think Common Lisp wants to save memory and speed up comparison,
so when we use the same name we get the same object, as implemented
by INTERN (this trick even has a name: the Flyweight pattern).
So, this is just an optimization trick, and UNINTERN is a maintenance
and system tool, not designed to express programs. We are encouraged to
operate as if the symbol name means the same object.

I disagree about it being to save memory -- a CL symbol is an object with
mutable attributes, so identity is important. Also, the identity of
uninterned symbols is just as important (e.g. for macros) as interned ones, so
finding symbols via packages (and the reader) is not fundamental to their
common use.

Packages are just a way to convert strings to symbols, which is useful when
they are obtained from files outside a running CL (e.g. via the reader/fasl
loader).
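
The point about uninterned symbols and macros can be illustrated with a short sketch. SWAP-VALUES and SWAP-DEMO are made-up names for this example; the only standard operators relied on are MAKE-SYMBOL, GENSYM, and DEFMACRO.

```lisp
;; Two uninterned symbols with the same name are still distinct objects.
(defparameter *g1* (make-symbol "TMP"))
(defparameter *g2* (make-symbol "TMP"))
(eq *g1* *g2*)                  ; false
(symbol-package *g1*)           ; NIL: no home package

;; A macro relies on that: the GENSYM'd variable cannot clash with a
;; user variable, whatever the user calls it.
(defmacro swap-values (a b)
  (let ((tmp (gensym "TMP")))
    `(let ((,tmp ,a))
       (setf ,a ,b
             ,b ,tmp))))

(defun swap-demo ()
  (let ((tmp 1) (other 2))      ; user variable happens to be named TMP
    (swap-values tmp other)
    (list tmp other)))          ; => (2 1)
```

Had the macro used the interned symbol TMP for its temporary, the inner binding would have shadowed the caller's TMP and the swap would silently fail.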

--
Martin Simmons
LispWorks Ltd
http://www.lispworks.com/

Scott McKay

2015-07-03 12:52:13 UTC

Post by Martin Simmons
Packages are just a way to convert strings to symbols, which is useful when
they are obtained from files outside a running CL (e.g. via the reader/fasl
loader).

Agreed. Isn't it the case that {package x string} -> symbol is
a 1-to-1 relationship? In which case, two symbols having the
same name in the same package implies that the two symbols
are in fact EQ?

Sorry if I'm late to the party, I haven't been thinking about this
for a few years.

--S

Alessio Stalla

2015-07-03 13:10:22 UTC

In general it is an n-to-1 relationship, n >= 0. A symbol always has a name,
but it can have either no home package or one home package, and
additionally there can be any number of packages in which it is accessible.
You have to think three-dimensionally ;) Yes, two symbols with the same
name in the same package are EQ. However, you can destructively alter
packages so as to replace a symbol with another with the same name *at a
later time*. Those won't be EQ, but a package will always contain at most
one symbol with a given name at a given time.
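
As a rough illustration of the n-to-1 relationship, the sketch below makes one symbol accessible in two packages. The package names ACCESS-DEMO-A and ACCESS-DEMO-B and the symbol name WIDGET are arbitrary choices for this example.

```lisp
;; Scratch packages that inherit nothing.
(defparameter *pkg-a* (or (find-package "ACCESS-DEMO-A")
                          (make-package "ACCESS-DEMO-A" :use '())))
(defparameter *pkg-b* (or (find-package "ACCESS-DEMO-B")
                          (make-package "ACCESS-DEMO-B" :use '())))

(defparameter *sym* (intern "WIDGET" *pkg-a*))
(import *sym* *pkg-b*)            ; now accessible in both packages

;; One symbol object, reachable through two (package, name) pairs.
(eq (find-symbol "WIDGET" *pkg-a*)
    (find-symbol "WIDGET" *pkg-b*))      ; true
(eq (symbol-package *sym*) *pkg-a*)      ; true: only one home package
```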


Anton Vodonosov

2015-07-03 21:49:30 UTC

Post by Martin Simmons

I disagree about it being to save memory -- a CL symbol is an object with
mutable attributes, so identity is important.

This is part of the optimization. Functions like SYMBOL-VALUE, GET could
be, for example, backed by hash maps from symbol name, thus returning
the same value for equally named symbols.

I mean that on the level of abstraction mathematicians use, when they say
"let X = 10" it means that the textual name X is bound to 10.
In Common Lisp it means that the symbol object with name X is bound to 10.

So, in a general, abstract sense, symbols need not be EQ.
But Common Lisp distinguishes symbols up to their object instance identity.
I still suppose this choice is an optimization.

Post by Martin Simmons
Also, the identity of uninterned symbols is just as important (e.g. for macros)
as interned ones

I think if symbols were compared by their names instead of by EQ, there
would be ways to satisfy the needs of macros. But that would be another language,
not CL.

Best regards,
- Anton

Steve Haflich

2015-07-03 23:37:07 UTC

Symbols must behave under EQ as they traditionally always have if symbols
are to be useful as property list indicators.

There is a semantic issue that is not explicit in the ANS but which
underlies language semantics and the concepts of "same", "identical", and
"equivalent". It concerns object mutability.

The only objects for which the EQ/EQL distinction is unspecified are
characters and numbers. But these objects are immutable (at least in the
portable language). Most other kinds of objects, including symbols, are
mutable. The fundamental principle of "identical" is that if two objects
are EQ, mutating one of them [sic] will necessarily mutate the other. If
two objects are _not_ EQ, then mutating one will _not_ mutate the other.

I would propose that since symbols have several mutable properties,
mutating a property of one reference to a symbol will mutate that property
of another reference iff those two references are EQ. The "only if" part
of "iff" is here crucial. This is exactly the same as for conses, arrays,
structure slots, hashtables, readtable dispatches, etc. etc. etc.
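
A minimal sketch of the mutation criterion, using the property list as the mutable attribute. The variable names and the :COLOR indicator are invented for the example; uninterned symbols are used so that no package is involved.

```lisp
;; Two references to one symbol, plus a distinct symbol with the same name.
(defparameter *ref-1* (make-symbol "NODE"))
(defparameter *ref-2* *ref-1*)               ; EQ to *REF-1*
(defparameter *other* (make-symbol "NODE"))  ; same name, not EQ

;; Mutate the property list through one reference.
(setf (get *ref-1* :color) :red)

(get *ref-2* :color)   ; => :RED, the change is visible through the EQ reference
(get *other* :color)   ; => NIL, the non-EQ symbol is untouched
```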

It still isn't clear whether this obvious semantic property can be proven
from the ANS.


Kenneth Tilton

2015-07-03 08:21:54 UTC

Post by Edi Weitz
Did I miss some clarifying words in the standard? Did I just manage
to confuse myself?

I think you managed to confuse yourself. unintern of course did not change
the identity of *s* (by which we are meaning the symbol bound to *S*) --
identity is identity is identity. Unintern did, however, change the package
of *s*, so (as one side-effect) a new symbol of the same name in the same
package is a new object (identical to nothing at birth).

Perhaps the problem is confusing the levels of abstraction offered by (a)
EQ and (b) object identity. The latter is a very simple idea. EQ, as you
adroitly demonstrated, worries about all sorts of things, including a
symbol's package.

my2 anyway.

-kt

Post by Edi Weitz
Thanks,
Edi.
PS: The UNINTERN entry warns about side effects which could harm
consistency, so maybe this is what they meant?

--
Kenneth Tilton
54 Isle of Venice Dr
Fort Lauderdale, FL 33301

***@tiltontec.com
http://tiltontec.com
@tiltonsalgebra

646-269-1077

"In a class by itself." *-Macworld*

Edi Weitz

2015-07-03 08:29:20 UTC

EQ, as you adroitly demonstrated, worries about all sorts of things,
including a symbol's package.

Which is part of what has me confused. Up until now I would have said
that the "problem" of EQ is that it doesn't worry about _enough_
things. (EQ 3/4 3/4) is NIL because EQ doesn't bother to look "into"
the numbers (as EQL does) but just superficially checks their "pointer
identity". And for symbols that's not the case? Hmmm...

Kenneth Tilton

2015-07-03 08:46:58 UTC

Post by Edi Weitz

EQ, as you adroitly demonstrated, worries about all sorts of things,
including a symbol's package.

Which is part of what has me confused. Up until now I would have said
that the "problem" of EQ is that it doesn't worry about _enough_
things. (EQ 3/4 3/4) is NIL because EQ doesn't bother to look "into"
the numbers (as EQL does) but just superficially checks their "pointer
identity". And for symbols that's not the case? Hmmm...

<cough> OK, I myself was at the wrong level of abstraction: EQ is not
worrying about anything other than pointer identity. It is the behavior of
intern and unintern that arranges for two symbols with the same name to be
distinct objects if their packages vary.

-kt

Jason Cornez

2015-07-04 09:47:01 UTC

Sorry this is arriving late - I had some trouble posting to the group yesterday.

-Jason

Post by Kenneth Tilton
Perhaps the problem is confusing the levels of abstraction offered
by (a) EQ and (b) object identity. The latter is a very simple
idea. EQ, as you adroitly demonstrated, worries about all sorts of
things, including a symbol's package.

I don't think that anything has demonstrated that EQ is worried about
anything other than object identity. And 5.3.33 is pretty clear that
this is all that EQ does:

"Returns true if its arguments are the same, identical object;
otherwise, returns false."

As for symbols, I agree that unintern does NOT affect identity of a
symbol. At the repl...

(defparameter *a* 'foo)
(defparameter *b* 'foo)
(eq *a* *b*) ==> T
(unintern 'foo)
(eq *a* *b*) ==> T
(defparameter *c* 'foo)
(eq *a* *c*) ==> NIL

If there is some doubt about why the last form is NIL, it is because
when the (defparameter *c* 'foo) form is _read_, the reader creates a
new symbol (via intern) because there is no current symbol named "FOO"
in the current package - obviously, we just uninterned the previous
symbol which is still the value of *a* and *b*.
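
The same sequence can be sketched without relying on when the reader runs, by calling INTERN and UNINTERN directly. This is only an illustration; the package EQ-DEMO and the function REINTERN-DEMO are hypothetical names, not part of the thread:

```lisp
;; Hypothetical scratch package that uses no other packages.
(defpackage :eq-demo (:use))

(defun reintern-demo ()
  "Intern FOO, unintern it, intern FOO again, and compare the objects."
  (let* ((pkg  (find-package :eq-demo))
         (old  (intern "FOO" pkg))
         (same (intern "FOO" pkg)))   ; finds the existing symbol
    (assert (eq old same))
    (unintern old pkg)                ; OLD still exists, but has no home package
    (let ((new (intern "FOO" pkg)))   ; nothing named "FOO" is left, so a fresh symbol is made
      (list (eq old new)
            (symbol-package old)
            (string= (symbol-name old) (symbol-name new))))))

(reintern-demo) ; => (NIL NIL T)
```

The old object is unchanged apart from losing its home package; the second INTERN simply has nothing to find, so it creates a new object with the same name.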

The same thing is going on in the case of a function that refers to a
symbol. The symbol won't change, unless the function text is _read_
again.

(defun func-foo (sym)
  (when (eq sym 'foo)
    ...))

If you pass in the same symbol object, EQ will always return T. But
sure, if you unintern 'foo and then at the repl call (func-foo 'foo),
you are now passing in a brand-new symbol, and so of course EQ will
return NIL in that existing function.

Hope this helps,
-Jason

Thomas Burdick

2015-07-03 11:04:56 UTC

Post by Edi Weitz
And the entry for INTERN is actually the closest I could find in terms
of clarification because it says that if a symbol of a specified name
is already accessible, _IT_ is returned -- which sounds like object
identity to me.

[...]

Post by Edi Weitz
Did I miss some clarifying words in the standard? Did I just manage
to confuse myself?

I think the confusion here is because you're confounding the reader path to a symbol with the idea of the symbol itself. And honestly, most of the "Packages" chapter is written in a way that doesn't help (e.g., UNINTERN calling a symbol with no package "pathological").

First of all, just ignore packages completely. They are but glorified hash tables, as the first sentence of the Packages chapter tries to make clear: "A package establishes a mapping from names to symbols."

Instead, consider just the symbol type. Symbols are structured objects, just like any other. They have slots for their name, for their value, for their function value, for their properties list, and for their (optional) home package. You make them with MAKE-SYMBOL, and you copy them with COPY-SYMBOL. AFAIK, the examples given for COPY-SYMBOL are the clearest attempt the spec makes to establish this as the conceptual model.

The reason that EQL doesn't do anything special for symbols the way it does for numbers is that symbols have structure. Symbols have names, home packages, property lists ... all sorts of things that you can destructively modify. The trick with EQL and numbers only works if you can't change anything about an object. Symbols being proper objects, object identity is the only thing you can use. INTERN and UNINTERN are just GETHASH and REMHASH under the hood.
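
A small sketch of that model, using only MAKE-SYMBOL and COPY-SYMBOL so that no package is involved. The function name COPY-DEMO and the COLOR property are made up for illustration:

```lisp
(defun copy-demo ()
  "Show that a copied symbol is a separate object with its own mutable slots."
  (let ((original (make-symbol "WIDGET")))
    (setf (symbol-value original) 1)
    (setf (get original 'color) 'red)
    ;; A true second argument asks COPY-SYMBOL to copy the value
    ;; and the property list as well as the name.
    (let ((copy (copy-symbol original t)))
      (setf (get copy 'color) 'blue)   ; mutates only the copy's property list
      (list (eq original copy)
            (string= (symbol-name original) (symbol-name copy))
            (symbol-value copy)
            (get original 'color)
            (get copy 'color)))))

(copy-demo) ; => (NIL T 1 RED BLUE)
```

Both symbols are named "WIDGET", yet changing a property of one leaves the other alone, which is why nothing short of object identity can serve as the equality test for symbols.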

Of course, the above is just the model; implementations can do anything they want that doesn't break the model, and the spec tries to use lots of formulations and careful, confusing wording to leave implementations the most freedom.

Cheers,
Thomas

Pascal Costanza

2015-07-04 08:39:49 UTC

Most languages don’t specify object identity in sufficient detail, so I’m not surprised Common Lisp doesn’t do this either.

Pascal

--
Pascal Costanza

Ben Hyde

2015-08-16 21:17:38 UTC

"Symbols that print the same USUALLY are EQ to each other because of the use of the INTERN function."

Unusual cases, eh? Are there examples that don't involve uninterned symbols?

(eq '#:zzz '#:zzz)

nil

(flet ((f (&aux (*gensym-counter* 123)) (print (gensym)))) (eq (f) (f)))

#:g123
#:g123
nil
