From: Chris Miner
Subject: Re: Performance increase needed any ideas
Date: 
Message-ID: <Feb26.154831.96182@yuma.ACNS.ColoState.EDU>
In reference to my post for performance increase I thought of a more
specific question to ask.  

I make heavy use of member and union with #'equal as the test.  On
seing a post regarding sxhash, I had a thought.  I have a lot of code
like:
(unless (member item set :test #'equal)
	(setf set (union set (list item) :test #'equal)))

The item could be a list of lists and character strings and 
subsublists.  Suppose I wrote something like member-hash union-hash
with the same interface as member and union, but supported the set
stuff with hashtables.  So instead of looking up an item in a list I
looked it up in a hash table.  Would this give me a 10x performance
boost if the set was large and each item large as well, and I did
alot of lookups?  

Thanks,
	Chris

From: Ralph Freese
Subject: Re: Performance increase needed any ideas
Date: 
Message-ID: <C32t0w.F4v@news.Hawaii.Edu>
> In reference to my post for performance increase I thought of a more
> specific question to ask.  

> I make heavy use of member and union with #'equal as the test.  On
> seing a post regarding sxhash, I had a thought.  I have a lot of code
> like:
>    (unless (member item set :test #'equal)
> 	 (setf set (union set (list item) :test #'equal)))

Your code could be greatly improved by replacing the second line with

         (setf set (cons item set)) )

or better yet, replace both lines with

      (pushnew item set :test #'equal)

Your code tests member twice and, depending on how union is implemented 
(common lisp does not specify the order of the elements of a union), it 
may be doing a great deal of unnecessary cons'ing.  

Your right though that using a hash table would help. Member takes time 
proportional to the size of  set.  A good idea is to keep  set  as a list but 
also maintain a hash table to test if  item  is in  set. 

		Ralph Freese
		·····@math.hawaii.edu
From: Bruce R. Miller
Subject: Re: Performance increase needed any ideas
Date: 
Message-ID: <2939760556@ARTEMIS.cam.nist.gov>
In article <··················@yuma.ACNS.ColoState.EDU>, Chris Miner writes:
> In reference to my post for performance increase I thought of a more
> specific question to ask.  
> 
> I make heavy use of member and union with #'equal as the test.  On
> seing a post regarding sxhash, I had a thought.  I have a lot of code
> like:
> (unless (member item set :test #'equal)
> 	(setf set (union set (list item) :test #'equal)))

First comment:
  Either/Or, but not both!

You test whether item is in the list, and if not, you form the union
which makes the same test again.

You could use simply:
 	(setf set (union set (list item) :test #'equal))
which would only add ITEM to SET if it weren't already present.

Even then, union is a bit heavy handed for this.  It is meant to deal
with two lists, each of which may be longer than 1.  Depending on how
many special cases your implementation's UNION handles, it could be a
rather inefficient way to add a single element.

A more direct, and probably more efficient way would be simply
 (unless (member item set :test #'equal)
    (push item set))

This change will half the time spent in these code fragments.  

If the lists are long enough, you may benefit from using hash tables.
I'd suggest checking the code for redundant tests first, and see how
well it does, before changing to hash tables.

Now I may be taking a small comment too far, but you say "I have a lot
of code like..."
May I suggest that repetitions of code like the above should be
replaced by a call, say 
  (setf set (add-item item set))
[or whatever you like].  
Then add-item is defined whichever way you like.  You then localize the
code that deals with sets and make it easier to experiment with
different implementations of sets, be they lists, hash-tables, and with
different implementations of the accessor function ADD-ITEM.

bruce
······@cam.nist.gov