From: David E. Young
Subject: Caching of list-based objects
Date: 
Message-ID: <307d765b.0204050647.d9bd4c8@posting.google.com>
Greetings. I've an application that needs to manage a cache of objects
whose principal attribute is a "body" composed of lists; I'm looking
for an efficient method of retrieving these objects based on some type
of hashing algorithm applied to the list-based bodies. An example at
this point would help. This is a typical object body; note that
symbols preceded by '?' are variables:

((?x (hobbit (name frodo) (age ?age)))
 (?y (wizard (name ?name) (wise yes))))

More importantly, the following body is semantically equivalent,
despite different variable names:

((?hobbit (hobbit (name frodo) (age ?age)))
 (?wizard (wizard (name ?wizard-name) (wise yes))))

Both of these bodies should hash to the same instance. So, I've
devised a scheme where the above examples generate the same hash code;
in essence I "normalize" the body by flattening the list and replacing
each variable with a new variable composed from a monotonically
increasing value, like this:

(?_1 hobbit name frodo age ?_2 ?_3 wizard name ?_4 wise yes)

On ACL 6.1 at least, applying SXHASH to the above yields the same hash
code for each new instance of the sample body. Great.

However, the following bodies are also semantically equivalent:

((?x (hobbit (age ?age) (name frodo)))
 (?y (wizard (wise yes) (name ?name))))

((?hobbit (hobbit (name frodo) (age ?age)))
 (?wizard (wizard (name ?wizard-name) (wise yes))))

Note here the different variable names *and* the order of internal
components (age, name, wise). Normalizing each of these bodies and
applying SXHASH does *not* yield the same hash code.

So. Suggestions? I'm not sure how to approach this, and I would guess
someone here has tackled this problem before, successfully. Thanks
much for the help.

Cheers,

--
------------------------------------------
David E. Young
······@bloodhoundinc.com
http://www.bloodhoundinc.com

"Those who expect to reap the blessings of liberty
 must undergo the fatigues of supporting it."
  -- Thomas Paine

"But all the world understands my language."
  -- Franz Joseph Haydn (1732-1809)

From: Rahul Jain
Subject: Re: Caching of list-based objects
Date: 
Message-ID: <874ripizz3.fsf@photino.sid.rice.edu>
······@bloodhoundinc.com (David E. Young) writes:

> ((?x (hobbit (age ?age) (name frodo)))
>  (?y (wizard (wise yes) (name ?name))))
> 
> ((?hobbit (hobbit (name frodo) (age ?age)))
>  (?wizard (wizard (name ?wizard-name) (wise yes))))

It looks to me that you want to cannonicalize the lists in some
way... probably sorting by symbol-name of the name of the relation?

That combined with renaming the variables in a cannonicalized way
might work, or come close.

Hmm, actually, 

((?hobbit (hobbit (name frodo) (age ?age)))
 (?wizard (wizard (name ?wizard-name) (wise yes))))

should be equivalent to

((?wizard (wizard (name ?wizard-name) (wise yes)))
 (?hobbit (hobbit (name frodo) (age ?age))))

So the sorting would have to be a bit more complex...

If the head is a variable, sort based on the sorted relation names
within?

-- 
-> -/                        - Rahul Jain -                        \- <-
-> -\  http://linux.rice.edu/~rahul -=-  ············@techie.com   /- <-
-> -/ "Structure is nothing if it is all you got. Skeletons spook  \- <-
-> -\  people if [they] try to walk around on their own. I really  /- <-
-> -/  wonder why XML does not." -- Erik Naggum, comp.lang.lisp    \- <-
|--|--------|--------------|----|-------------|------|---------|-----|-|
   (c)1996-2002, All rights reserved. Disclaimer available upon request.
From: David E. Young
Subject: Re: Caching of list-based objects
Date: 
Message-ID: <20020405.215042.1605908235.1772@nc.rr.com>
In article <··············@photino.sid.rice.edu>, "Rahul Jain"
<·····@sid-1129.sid.rice.edu> wrote:

> It looks to me that you want to cannonicalize the lists in some way...
> probably sorting by symbol-name of the name of the relation?
> 
> That combined with renaming the variables in a cannonicalized way might
> work, or come close.
> 
> Hmm, actually,
> 
> ((?hobbit (hobbit (name frodo) (age ?age)))
>  (?wizard (wizard (name ?wizard-name) (wise yes))))
> 
> should be equivalent to
> 
> ((?wizard (wizard (name ?wizard-name) (wise yes)))
>  (?hobbit (hobbit (name frodo) (age ?age))))
> 
> So the sorting would have to be a bit more complex...

Yup, you're right. Hmm; but not always. Perhaps I can place a restriction
on when objects are considered semantically equivalent. I'll consider
this more carefully. Thanks much.

Cheers,
--
------------------------------------------
David E. Young
········@computer.org
http://lisa.sourceforge.net

"Those who expect to reap the blessings of liberty
 must undergo the fatigues of supporting it."
  -- Thomas Paine
  
"But all the world understands my language."
  -- Franz Joseph Haydn (1732-1809)
From: Jochen Schmidt
Subject: Re: Caching of list-based objects
Date: 
Message-ID: <a8kghm$1c4$1@rznews2.rrze.uni-erlangen.de>
David E. Young wrote:
> However, the following bodies are also semantically equivalent:
> 
> ((?x (hobbit (age ?age) (name frodo)))
>  (?y (wizard (wise yes) (name ?name))))
> 
> ((?hobbit (hobbit (name frodo) (age ?age)))
>  (?wizard (wizard (name ?wizard-name) (wise yes))))
> 
> Note here the different variable names *and* the order of internal
> components (age, name, wise). Normalizing each of these bodies and
> applying SXHASH does *not* yield the same hash code.
> 
> So. Suggestions? I'm not sure how to approach this, and I would guess
> someone here has tackled this problem before, successfully. Thanks
> much for the help.

You could for example sort the inner components alphabetically (based on 
their symbol-names.
I have not looked if it will really fit, but there is a hash-algorithm 
called "DOMHash" which is used to hash XML Documents (hierarchical 
structures...). You could take a look at that too.

ciao,
Jochen

--
http://www.dataheaven.de
From: David E. Young
Subject: Re: Caching of list-based objects
Date: 
Message-ID: <307d765b.0204051250.1d88a10c@posting.google.com>
Jochen Schmidt <···@dataheaven.de> wrote in message news:<············@rznews2.rrze.uni-erlangen.de>...
 
> You could for example sort the inner components alphabetically (based on 
> their symbol-names.
> I have not looked if it will really fit, but there is a hash-algorithm 
> called "DOMHash" which is used to hash XML Documents (hierarchical 
> structures...). You could take a look at that too.
> 
> ciao,
> Jochen

Yes indeed; sorting is a possibility I've considered. Might even be
the right solution. I'll investigate DOMHash also. Thanks much...

Cheers,

--
------------------------------------------
David E. Young
······@bloodhoundinc.com
http://www.bloodhoundinc.com

"Those who expect to reap the blessings of liberty
 must undergo the fatigues of supporting it."
  -- Thomas Paine

"But all the world understands my language."
  -- Franz Joseph Haydn (1732-1809)