Discussion:
Hashing "object identities"
Antoniotti Marco
2017-12-03 11:02:37 UTC
Permalink
Hi

I am fooling around with a problem that eventually will have to use a hash table on “triples” of “integers” and “structures". Triples I can portably pass to the EQUAL hash table. I cannot use an EQUALP hash table, because I would end up wasting to much time.

Here is the rub: my triples <N, O1, O2> need to use the “object identity” of O1 and O2 (two structs). I could switch to an EQL hash table keyed on (HASH-TRIPLE-KEY N O1 O2). How would you proceed (or write HASH-TRIPLE), while staying as much portable as possible? As you well know SXHASH is practically useless.

Thanks

Marco
--
Marco Antoniotti, Associate Professor tel. +39 - 02 64 48 79 01
DISCo, Università Milano Bicocca U14 2043 http://bimib.disco.unimib.it
Viale Sarca 336
I-20126 Milan (MI) ITALY

Please check: http://cdac2018.lakecomoschool.org
Please check: http://troncopackage.org

Please note that I am not checking my Spam-box anymore.
Please do not forward this email without asking me fi
Jason Cornez
2017-12-03 11:27:32 UTC
Permalink
Add an integer slot on the structs; hash on that? -Jason

Sent from my iPhone
Post by Antoniotti Marco
Hi
I am fooling around with a problem that eventually will have to use a hash table on “triples” of “integers” and “structures". Triples I can portably pass to the EQUAL hash table. I cannot use an EQUALP hash table, because I would end up wasting to much time.
Here is the rub: my triples <N, O1, O2> need to use the “object identity” of O1 and O2 (two structs). I could switch to an EQL hash table keyed on (HASH-TRIPLE-KEY N O1 O2). How would you proceed (or write HASH-TRIPLE), while staying as much portable as possible? As you well know SXHASH is practically useless.
Thanks

Marco
--
Marco Antoniotti, Associate Professor tel. +39 - 02 64 48 79 01
DISCo, Università Milano Bicocca U14 2043 http://bimib.disco.unimib.it
Viale Sarca 336
I-20126 Milan (MI) ITALY
Please check: http://cdac2018.lakecomoschool.org
Please check: http://troncopackage.org
Please note that I am not checking my Spam-box anymore.
Please do not forward this email without asking me first (cum grano salis).
Antoniotti Marco
2017-12-03 11:38:15 UTC
Permalink
Post by Jason Cornez
Add an integer slot on the structs; hash on that? -Jason
I forgot to mention it. THAT is another thing I want to avoid because I need to share subgraphs (DAGs). I need to hash on the “object identity”.

Cheers

MA
Post by Jason Cornez
Sent from my iPhone
Post by Antoniotti Marco
Hi
I am fooling around with a problem that eventually will have to use a hash table on “triples” of “integers” and “structures". Triples I can portably pass to the EQUAL hash table. I cannot use an EQUALP hash table, because I would end up wasting to much time.
Here is the rub: my triples <N, O1, O2> need to use the “object identity” of O1 and O2 (two structs). I could switch to an EQL hash table keyed on (HASH-TRIPLE-KEY N O1 O2). How would you proceed (or write HASH-TRIPLE), while staying as much portable as possible? As you well know SXHASH is practically useless.
Thanks

Marco
--
Marco Antoniotti, Associate Professor tel. +39 - 02 64 48 79 01
DISCo, Università Milano Bicocca U14 2043 http://bimib.disco.unimib.it
Viale Sarca 336
I-20126 Milan (MI) ITALY
Please check: http://cdac2018.lakecomoschool.org
Please check: http://troncopackage.org
Please note that I am not checking my Spam-box anymore.
Please do not forward this email without asking me first (cum grano salis).
--
Marco Antoniotti, Associate Professor tel. +39 - 02 64 48 79 01
DISCo, Università Milano Bicocca U14 2043 http://bimib.disco.unimib.it
Viale Sarca 336
I-20126 Milan (MI) ITALY

Please check: http://cdac2018.lakecomoschool.org
Please check: http://troncopackage.org

Please note that I am not checking my Spam-box anymore.
Please do not forward this email without asking me first (cum grano salis).
Antoniotti Marco
2017-12-03 11:41:31 UTC
Permalink
Let me also add that I am being deliberately withholding information on what exactly I am working on/fooling around with :) There is an obvious solution based on keeping a table of such objects; alas, it would require manually garbage collecting it. I’d like to let the GC do the work.

Cheers

MA
Post by Antoniotti Marco
Post by Jason Cornez
Add an integer slot on the structs; hash on that? -Jason
I forgot to mention it. THAT is another thing I want to avoid because I need to share subgraphs (DAGs). I need to hash on the “object identity”.
Cheers

MA
Post by Jason Cornez
Sent from my iPhone
Post by Antoniotti Marco
Hi
I am fooling around with a problem that eventually will have to use a hash table on “triples” of “integers” and “structures". Triples I can portably pass to the EQUAL hash table. I cannot use an EQUALP hash table, because I would end up wasting to much time.
Here is the rub: my triples <N, O1, O2> need to use the “object identity” of O1 and O2 (two structs). I could switch to an EQL hash table keyed on (HASH-TRIPLE-KEY N O1 O2). How would you proceed (or write HASH-TRIPLE), while staying as much portable as possible? As you well know SXHASH is practically useless.
Thanks

Marco
--
Marco Antoniotti, Associate Professor tel. +39 - 02 64 48 79 01
DISCo, Università Milano Bicocca U14 2043 http://bimib.disco.unimib.it
Viale Sarca 336
I-20126 Milan (MI) ITALY
Please check: http://cdac2018.lakecomoschool.org
Please check: http://troncopackage.org
Please note that I am not checking my Spam-box anymore.
Please do not forward this email without asking me first (cum grano salis).
--
Marco Antoniotti, Associate Professor tel. +39 - 02 64 48 79 01
DISCo, Università Milano Bicocca U14 2043 http://bimib.disco.unimib.it
Viale Sarca 336
I-20126 Milan (MI) ITALY
Please check: http://cdac2018.lakecomoschool.org
Please check: http://troncopackage.org
Please note that I am not checking my Spam-box anymore.
Please do not forward this email without asking me first (cum grano salis).
--
Marco Antoniotti, Associate Professor tel. +39 - 02 64 48 79 01
DISCo, Università Milano Bicocca U14 2043 http://bimib.disco.unimib.it
Viale Sarca 336
I-20126 Milan (MI) ITALY

Please check: http://cdac2018.lakecomoschool.org
Please check: http://troncopackage.org

Please note that I am not checking my Spam-box anymore.
Please do not forward this email without asking me first (cum grano salis).
Daniel Kochmański
2017-12-03 11:48:03 UTC
Permalink
Post by Antoniotti Marco
Let me also add that I am being deliberately withholding information on what exactly I am working on/fooling around with :) There is an obvious solution based on keeping a table of such objects; alas, it would require manually garbage collecting it. I’d like to let the GC do the work.
What about weak hash tables? They are supported by most implementations
(and have portability layer in form of a library trivial-garbage).
Post by Antoniotti Marco
Cheers

MA
Regards,
Daniel
Post by Antoniotti Marco
Post by Antoniotti Marco
Post by Jason Cornez
Add an integer slot on the structs; hash on that? -Jason
I forgot to mention it. THAT is another thing I want to avoid because I need to share subgraphs (DAGs). I need to hash on the “object identity”.
Cheers

MA
Post by Jason Cornez
Sent from my iPhone
Post by Antoniotti Marco
Hi
I am fooling around with a problem that eventually will have to use a hash table on “triples” of “integers” and “structures". Triples I can portably pass to the EQUAL hash table. I cannot use an EQUALP hash table, because I would end up wasting to much time.
Here is the rub: my triples <N, O1, O2> need to use the “object identity” of O1 and O2 (two structs). I could switch to an EQL hash table keyed on (HASH-TRIPLE-KEY N O1 O2). How would you proceed (or write HASH-TRIPLE), while staying as much portable as possible? As you well know SXHASH is practically useless.
Thanks

Marco
--
Marco Antoniotti, Associate Professor tel. +39 - 02 64 48 79 01
DISCo, Università Milano Bicocca U14 2043 http://bimib.disco.unimib.it
Viale Sarca 336
I-20126 Milan (MI) ITALY
Please check: http://cdac2018.lakecomoschool.org
Please check: http://troncopackage.org
Please note that I am not checking my Spam-box anymore.
Please do not forward this email without asking me first (cum grano salis).
--
Marco Antoniotti, Associate Professor tel. +39 - 02 64 48 79 01
DISCo, Università Milano Bicocca U14 2043 http://bimib.disco.unimib.it
Viale Sarca 336
I-20126 Milan (MI) ITALY
Please check: http://cdac2018.lakecomoschool.org
Please check: http://troncopackage.org
Please note that I am not checking my Spam-box anymore.
Please do not forward this email without asking me first (cum grano salis).
--
Marco Antoniotti, Associate Professor tel. +39 - 02 64 48 79 01
DISCo, Università Milano Bicocca U14 2043 http://bimib.disco.unimib.it
Viale Sarca 336
I-20126 Milan (MI) ITALY
Please check: http://cdac2018.lakecomoschool.org
Please check: http://troncopackage.org
Please note that I am not checking my Spam-box anymore.
Please do not forward this email without asking me first (cum grano salis).
Antoniotti Marco
2017-12-03 11:50:13 UTC
Permalink
Weak hash tables must be used; no questions about this. The issue is the key of such hash tables.

I may be missing something obvious, of course...

Cheers

MA
Post by Antoniotti Marco
Let me also add that I am being deliberately withholding information on what exactly I am working on/fooling around with :) There is an obvious solution based on keeping a table of such objects; alas, it would require manually garbage collecting it. I’d like to let the GC do the work.
What about weak hash tables? They are supported by most implementations (and have portability layer in form of a library trivial-garbage).
Post by Antoniotti Marco
Cheers

MA
Regards,
Daniel
Post by Antoniotti Marco
Post by Antoniotti Marco
Post by Jason Cornez
Add an integer slot on the structs; hash on that? -Jason
I forgot to mention it. THAT is another thing I want to avoid because I need to share subgraphs (DAGs). I need to hash on the “object identity”.
Cheers

MA
Post by Jason Cornez
Sent from my iPhone
Post by Antoniotti Marco
Hi
I am fooling around with a problem that eventually will have to use a hash table on “triples” of “integers” and “structures". Triples I can portably pass to the EQUAL hash table. I cannot use an EQUALP hash table, because I would end up wasting to much time.
Here is the rub: my triples <N, O1, O2> need to use the “object identity” of O1 and O2 (two structs). I could switch to an EQL hash table keyed on (HASH-TRIPLE-KEY N O1 O2). How would you proceed (or write HASH-TRIPLE), while staying as much portable as possible? As you well know SXHASH is practically useless.
Thanks

Marco
--
Marco Antoniotti, Associate Professor tel. +39 - 02 64 48 79 01
DISCo, Università Milano Bicocca U14 2043 http://bimib.disco.unimib.it
Viale Sarca 336
I-20126 Milan (MI) ITALY
Please check: http://cdac2018.lakecomoschool.org
Please check: http://troncopackage.org
Please note that I am not checking my Spam-box anymore.
Please do not forward this email without asking me first (cum grano salis).
--
Marco Antoniotti, Associate Professor tel. +39 - 02 64 48 79 01
DISCo, Università Milano Bicocca U14 2043 http://bimib.disco.unimib.it
Viale Sarca 336
I-20126 Milan (MI) ITALY
Please check: http://cdac2018.lakecomoschool.org
Please check: http://troncopackage.org
Please note that I am not checking my Spam-box anymore.
Please do not forward this email without asking me first (cum grano salis).
--
Marco Antoniotti, Associate Professor tel. +39 - 02 64 48 79 01
DISCo, Università Milano Bicocca U14 2043 http://bimib.disco.unimib.it
Viale Sarca 336
I-20126 Milan (MI) ITALY
Please check: http://cdac2018.lakecomoschool.org
Please check: http://troncopackage.org
Please note that I am not checking my Spam-box anymore.
Please do not forward this email without asking me first (cum grano salis).
--
Marco Antoniotti, Associate Professor tel. +39 - 02 64 48 79 01
DISCo, Università Milano Bicocca U14 2043 http://bimib.disco.unimib.it
Viale Sarca 336
I-20126 Milan (MI) ITALY

Please check: http://cdac2018.lakecomoschool.org
Please check: http://troncopackage.org

Please note that I am not checking my Spam-box anymore.
Please do not forward this email without asking me first (cum grano s
Jason Cornez
2017-12-03 12:53:09 UTC
Permalink
If you *need* object identity, then CLOS objects give you that with hashing (as I am sure you know). But a slot in the struct containing a unique integer (or a gensym if you prefer) can also give you identity. I think I am perhaps not understanding exactly what you want.

-Jason
Post by Antoniotti Marco
Post by Jason Cornez
Add an integer slot on the structs; hash on that? -Jason
I forgot to mention it. THAT is another thing I want to avoid because I need to share subgraphs (DAGs). I need to hash on the “object identity”.
Cheers

MA
Post by Jason Cornez
Sent from my iPhone
Post by Antoniotti Marco
Hi
I am fooling around with a problem that eventually will have to use a hash table on “triples” of “integers” and “structures". Triples I can portably pass to the EQUAL hash table. I cannot use an EQUALP hash table, because I would end up wasting to much time.
Here is the rub: my triples <N, O1, O2> need to use the “object identity” of O1 and O2 (two structs). I could switch to an EQL hash table keyed on (HASH-TRIPLE-KEY N O1 O2). How would you proceed (or write HASH-TRIPLE), while staying as much portable as possible? As you well know SXHASH is practically useless.
Thanks

Marco
--
Marco Antoniotti, Associate Professor tel. +39 - 02 64 48 79 01
DISCo, Università Milano Bicocca U14 2043 http://bimib.disco.unimib.it
Viale Sarca 336
I-20126 Milan (MI) ITALY
Please check: http://cdac2018.lakecomoschool.org
Please check: http://troncopackage.org
Please note that I am not checking my Spam-box anymore.
Please do not forward this email without asking me first (cum grano salis).
--
Marco Antoniotti, Associate Professor tel. +39 - 02 64 48 79 01
DISCo, Università Milano Bicocca U14 2043 http://bimib.disco.unimib.it
Viale Sarca 336
I-20126 Milan (MI) ITALY
Please check: http://cdac2018.lakecomoschool.org
Please check: http://troncopackage.org
Please note that I am not checking my Spam-box anymore.
Please do not forward this email without asking me first (cum grano salis).
Pascal Bourguignon
2017-12-03 15:31:09 UTC
Permalink
Hi
I am fooling around with a problem that eventually will have to use a hash table on “triples” of “integers” and “structures". Triples I can portably pass to the EQUAL hash table. I cannot use an EQUALP hash table, because I would end up wasting to much time.
Here is the rub: my triples <N, O1, O2> need to use the “object identity” of O1 and O2 (two structs). I could switch to an EQL hash table keyed on (HASH-TRIPLE-KEY N O1 O2). How would you proceed (or write HASH-TRIPLE), while staying as much portable as possible? As you well know SXHASH is practically useless.
Post by Jason Cornez
Add an integer slot on the structs; hash on that? -Jason
I forgot to mention it. THAT is another thing I want to avoid because I need to share subgraphs (DAGs). I need to hash on the “object identity”.
The thing is that there’s no conforming way to obtain the “object identity” other than the object itself, and no other conforming way to use it than to use EQ/EQL or other operators using EQL, such as EQL hash-tables.

If you want to rely on implementation specific behavior, you can extract (a representation of) the object identity using print-unreadable-object.
If identity is true <http://www.lispworks.com/documentation/HyperSpec/Body/26_glo_t.htm#true>, the output from forms is followed by a space character and a representation of the object's identity, typically a storage address.
cf. eg. https://gitlab.com/com-informatimago/com-informatimago/blob/master/common-lisp/cesarum/utility.lisp#L833 <https://gitlab.com/com-informatimago/com-informatimago/blob/master/common-lisp/cesarum/utility.lisp#L833>

So a good conforming way to do it, is to add an integer slot to the structure, with a unique integer in it.
Another way to do it, is to use the structure as a key in a EQL hash-table mapping to the unique integer (with amortized O(1) time to get the object identity).
If you’re short on memory, you may store the structures in a vector, and use its index as object identity (but if you want to collect structures, it’ll leave holes, and it’s now O(n) to find the object identity).

So the cheapest way is indeed to add an integer slot to the structures.

Note that using sxhash doesn’t give the object identity. An implementation could very well return 42 for all the structures. So you could use it only to index buckets of structures, but you would still have to build the identifying integers in one of the above ways.

Obviously, the integer “object identity” slot of your structures would be unique amongst all your identifiable structures, and would not depend on the hash-table you would store them in.
--
__Pascal J. Bourguignon__
Antoniotti Marco
2017-12-03 16:11:00 UTC
Permalink
Post by Antoniotti Marco
Hi
I am fooling around with a problem that eventually will have to use a hash table on “triples” of “integers” and “structures". Triples I can portably pass to the EQUAL hash table. I cannot use an EQUALP hash table, because I would end up wasting to much time.
Here is the rub: my triples <N, O1, O2> need to use the “object identity” of O1 and O2 (two structs). I could switch to an EQL hash table keyed on (HASH-TRIPLE-KEY N O1 O2). How would you proceed (or write HASH-TRIPLE), while staying as much portable as possible? As you well know SXHASH is practically useless.
Post by Jason Cornez
Add an integer slot on the structs; hash on that? -Jason
I forgot to mention it. THAT is another thing I want to avoid because I need to share subgraphs (DAGs). I need to hash on the “object identity”.
The thing is that there’s no conforming way to obtain the “object identity” other than the object itself, and no other conforming way to use it than to use EQ/EQL or other operators using EQL, such as EQL hash-tables.
Yes. Hence the question. Is there any way to get them in a “semi-portable” way? I remember reading about “locatives”, would they help?
If you want to rely on implementation specific behavior, you can extract (a representation of) the object identity using print-unreadable-object.
If identity is true, the output from forms is followed by a space character and a representation of the object's identity, typically a storage address.
cf. eg. https://gitlab.com/com-informatimago/com-informatimago/blob/master/common-lisp/cesarum/utility.lisp#L833
I wouldn’t really want to use that.
So a good conforming way to do it, is to add an integer slot to the structure, with a unique integer in it.
Another way to do it, is to use the structure as a key in a EQL hash-table mapping to the unique integer (with amortized O(1) time to get the object identity).
If you’re short on memory, you may store the structures in a vector, and use its index as object identity (but if you want to collect structures, it’ll leave holes, and it’s now O(n) to find the object identity).
So the cheapest way is indeed to add an integer slot to the structures.
After reading through the email of the day and a number of sites, it looks like I may have to resort to that, using a second EQL hash table as an indirection.
Note that using sxhash doesn’t give the object identity. An implementation could very well return 42 for all the structures. So you could use it only to index buckets of structures, but you would still have to build the identifying integers in one of the above ways.
As a matter of fact that is what actually happens (modulo 42) in many implementations.

Thanks

MA
--
Marco Antoniotti, Associate Professor tel. +39 - 02 64 48 79 01
DISCo, Università Milano Bicocca U14 2043 http://bimib.disco.unimib.it
Viale Sarca 336
I-20126 Milan (MI) ITALY

Please check: http://cdac2018.lakecomoschool.org
Please check: http://troncopackage.org

Please note that I am not checking my Spam-box anymore.
Please do not forward this email without askin
Faré
2017-12-04 02:39:53 UTC
Permalink
1- Case where objects are mutable. Add an integer slot "id"
initialized from (generate-id) which increments a per-thread counter
(until it reaches end of block, then get a new block).
2- Case where objects are immutable. Add an integer slot "hash"
computed from the hashes of the slot hashes, using BLAKE2.

—♯ƒ • François-René ÐVB Rideau •Reflection&Cybernethics• http://fare.tunes.org
The difference between a claim and a real thing is about 5 million lines
of code. — Rainer Joswig

Nick Levine
2017-12-03 11:25:35 UTC
Permalink
For the structures you'll have to use EQUALP. So your question as I understand it is how to incorporate that into lookup of the triples.

You might have an EQUALP table which maps each structure into a number, and then the main lookup is keyed on a triple of numbers.

Or maybe, if speed is more important than space, each structure contains an ID slot -- a unique number -- and you key off that.

- nick
Post by Antoniotti Marco
Hi
I am fooling around with a problem that eventually will have to use a hash table on “triples” of “integers” and “structures". Triples I can portably pass to the EQUAL hash table. I cannot use an EQUALP hash table, because I would end up wasting to much time.
Here is the rub: my triples <N, O1, O2> need to use the “object identity” of O1 and O2 (two structs). I could switch to an EQL hash table keyed on (HASH-TRIPLE-KEY N O1 O2). How would you proceed (or write HASH-TRIPLE), while staying as much portable as possible? As you well know SXHASH is practically useless.
Thanks

Marco
--
Marco Antoniotti, Associate Professor tel. +39 - 02 64 48 79 01
DISCo, Università Milano Bicocca U14 2043 http://bimib.disco.unimib.it
Viale Sarca 336
I-20126 Milan (MI) ITALY
Please check: http://cdac2018.lakecomoschool.org
Please check: http://troncopackage.org
Please note that I am not checking my Spam-box anymore.
Please do not forward this email without asking me first (cum grano salis).
Antoniotti Marco
2017-12-03 11:55:32 UTC
Permalink
Hi Nick.

I am actually hashing DAGs. I used EQUALP hash tables, every lookup may end up checking entire sub-dags, which would defeat the purpose.

Marco
Post by Nick Levine
For the structures you'll have to use EQUALP. So your question as I understand it is how to incorporate that into lookup of the triples.
You might have an EQUALP table which maps each structure into a number, and then the main lookup is keyed on a triple of numbers.
Or maybe, if speed is more important than space, each structure contains an ID slot -- a unique number -- and you key off that.
- nick
Post by Antoniotti Marco
Hi
I am fooling around with a problem that eventually will have to use a hash table on “triples” of “integers” and “structures". Triples I can portably pass to the EQUAL hash table. I cannot use an EQUALP hash table, because I would end up wasting to much time.
Here is the rub: my triples <N, O1, O2> need to use the “object identity” of O1 and O2 (two structs). I could switch to an EQL hash table keyed on (HASH-TRIPLE-KEY N O1 O2). How would you proceed (or write HASH-TRIPLE), while staying as much portable as possible? As you well know SXHASH is practically useless.
Thanks

Marco
--
Marco Antoniotti, Associate Professor tel. +39 - 02 64 48 79 01
DISCo, Università Milano Bicocca U14 2043 http://bimib.disco.unimib.it
Viale Sarca 336
I-20126 Milan (MI) ITALY
Please check: http://cdac2018.lakecomoschool.org
Please check: http://troncopackage.org
Please note that I am not checking my Spam-box anymore.
Please do not forward this email without asking me first (cum grano salis).
--
Marco Antoniotti, Associate Professor tel. +39 - 02 64 48 79 01
DISCo, Università Milano Bicocca U14 2043 http://bimib.disco.unimib.it
Viale Sarca 336
I-20126 Milan (MI) ITALY

Please check: http://cdac2018.lakecomoschool.org
Please check: http://troncopackage.org

Please note that I am not checking my Spam-box anymore.
Please do not forward this email without asking
Bob Cassels
2017-12-03 11:56:04 UTC
Permalink
Steve Haflich
2017-12-03 15:41:43 UTC
Permalink
Your problem description omitted one essential design requirement:
Are these triples mutable?
Antoniotti Marco
2017-12-03 16:07:26 UTC
Permalink
Post by Steve Haflich
Are these triples mutable?
No. Each triple is, in essence, the full “object identity” (they are used in a memoization scheme).

Cheers
--
Marco Antoniotti, Associate Professor tel. +39 - 02 64 48 79 01
DISCo, Università Milano Bicocca U14 2043 http://bimib.disco.unimib.it
Viale Sarca 336
I-20126 Milan (MI) ITALY

Please check: http://cdac2018.lakecomoschool.org
Please check: http://troncopackage.org

Please note that I am not checking my Spam-box anymore.
Please do not forward this email without askin
Loading...