Re: Another newbie philosophical question...

From: John P. Flanagan
Subject: Re: Another newbie philosophical question...
Date: Mon, 01 Nov 1993 22:23:06 +0000
Message-ID: <2b428a$5fs@wstar.mil.wi.us>

····@festival.ed.ac.uk (J W Dalton) writes:

>···@sef-pmax.slisp.cs.cmu.edu writes:

>>    From: ····@festival.ed.ac.uk (J W Dalton)
>>    
>>    I'd like some feedback from implementors about an issue very
>>    near to this one.  Suppose I have the option of allocating a
>>    new hash-table or clearing an old one.  How do these options
>>    compare?  Is one best most of the time, or does it vary too
>>    much from implementation to implementation or from application
>>    to application?
>>    
>>This will vary from Lisp to Lisp and will also depend somewhat upon
>>conditions (how much of your application fits into memory, how much other
>>consing is going on, etc.).  But my guess is that most of the time you'd be
>>much better off getting a new hash table than completely clearing out an
>>old one.

>I know it varies from Lisp to Lisp, but I'd also like to know how it
>is in various particular Lisps.

>Now, a new table is initially empty.  Is making an empty table
>faster than clearing one?  How important was fast clearing to
>your hash-table design?

[This reply was posted 3 days ago; a broken server prevented delivery ... ]

A quick glance at some ancient KCL code might help answer this question:

Lclrhash()
{
        int i;

        check_arg(1);
        check_type_hash_table(&vs_base[0]);
        for(i = 0; i < vs_base[0]->ht.ht_size; i++) {
                vs_base[0]->ht.ht_self[i].hte_key = OBJNULL;
                vs_base[0]->ht.ht_self[i].hte_value = OBJNULL;
        }
        vs_base[0]->ht.ht_nent = 0;
}

Although this is hand-coded C, it's obviously not written to be fast.
Using (*location++ = NULL) or calling a fast memset() would be better.
Most clrhash implementations are written in Lisp and it's unlikely that
they would run much faster than this (even with good optimization).

For most modern Lisp implementations that use generation scavenging
(ephemeral) GC, you're probably better off allocating a new hash table
each time.  The resident GC will no doubt clear the associated hash
table space much faster than clrhash can.  Also, clearing hash table
space is a destructive operation and as such, for those that use write
barrier schemes in their GC, a destructive operation dirties a page
thereby forcing all other pointer spaces located on that same page to be
scavenged as well during the next GC cycle.  For such systems it's
probably more accurate to say that making a new hash table is more
efficient (not necessarily faster) than clearing an old one.  -- jpf.
-- 
       _/   _/_/_/   _/_/_/                                   John P. Flanagan
      _/   _/_ _/   _/_/                              WaveStar Technology, USA
 _/_/_/   _/       _/                                      ···@wstar.mil.wi.us