Maphash into a new hashtable

From: Will McCutchen
Subject: Maphash into a new hashtable
Date: Tue, 14 Jun 2005 17:08:32 +0000
Message-ID: <1118768912.064958.167190@z14g2000cwz.googlegroups.com>

Hi,

I am probably going to really butcher this explanation, but here goes.
I wrote a function that takes a function and a hashtable.  It returns a
new hashtable whose keys are identical to those in the original
hashtable and whose values are the result of applying the given
function to the values in the original hashtable.  Here it is:

(defun maphash->hash (function table)
  "Maps FUNCTION across hashtable TABLE, returning a new
   hashtable whose keys are identical to those in TABLE but
   whose values are the result of applying FUNCTION to each
   key and value in TABLE"
  (let ((result-table (make-hash-table :test (hash-table-test table))))
    (maphash #'(lambda (key value) (setf (gethash key result-table)
(funcall function key value))) table)
    result-table))

This post might be absolutely pointless, since I don't actually have a
problem.  I needed this functionality, so I wrote it, and it works.
I'm posting this because I'm very new to Lisp and I have this feeling
that I'm missing something here or that I'm approaching this problem
from the wrong side.  Is there a better way to do this?  What should I
call this function?

Thanks,


Will.

Re: Maphash into a new hashtable Thomas A. Russ
- Re: Maphash into a new hashtable Will McCutchen
  - Re: Maphash into a new hashtable Andreas Thiele
    - Re: Maphash into a new hashtable nepheles
      - Re: Maphash into a new hashtable Pascal Bourguignon
- Re: Maphash into a new hashtable Zach Beane
- Re: Maphash into a new hashtable Barry Margolin
  - Re: Maphash into a new hashtable Will McCutchen
    - Re: Maphash into a new hashtable Tron3k
      - Re: Maphash into a new hashtable Pascal Bourguignon
        Re: Maphash into a new hashtable Tron3k
Re: Maphash into a new hashtable Andreas Thiele
- Re: Maphash into a new hashtable Will McCutchen
Re: Maphash into a new hashtable Pascal Bourguignon
- Re: Maphash into a new hashtable Edi Weitz
  - Re: Maphash into a new hashtable Andreas Thiele
    - Re: Maphash into a new hashtable Rob Warnock
      - Re: Maphash into a new hashtable Andreas Thiele

From: Thomas A. Russ
Subject: Re: Maphash into a new hashtable
Date: Tue, 14 Jun 2005 17:50:03 +0000
Message-ID: <ymimzpsu8tw.fsf@sevak.isi.edu>

"Will McCutchen" <·········@gmail.com> writes:

> I am probably going to really butcher this explanation, but here goes.
> I wrote a function that takes a function and a hashtable.  It returns a
> new hashtable whose keys are identical to those in the original
> hashtable and whose values are the result of applying the given
> function to the values in the original hashtable.  Here it is:
> 
> (defun maphash->hash (function table)
>   "Maps FUNCTION across hashtable TABLE, returning a new
>    hashtable whose keys are identical to those in TABLE but
>    whose values are the result of applying FUNCTION to each
>    key and value in TABLE"
>   (let ((result-table (make-hash-table :test (hash-table-test table))))
>     (maphash #'(lambda (key value) (setf (gethash key result-table)
> (funcall function key value))) table)
>     result-table))
> 
> This post might be absolutely pointless, since I don't actually have a
> problem.  I needed this functionality, so I wrote it, and it works.
> I'm posting this because I'm very new to Lisp and I have this feeling
> that I'm missing something here or that I'm approaching this problem
> from the wrong side.  Is there a better way to do this?  What should I
> call this function?

This seems to be quite reasonable way of doing this.  The only supported
ways to iterate over hash tables is via the MAPHASH function or using
the LOOP macro.  About the only change I would make in your function
would be to specify an initial size for the result hash table.  That is
purely an efficiency measure, since it will eliminate the need to grow
the result hash table (which requires rehashing all current contents).

-- 
Thomas A. Russ,  USC/Information Sciences Institute

From: Will McCutchen
Subject: Re: Maphash into a new hashtable
Date: Tue, 14 Jun 2005 20:27:15 +0000
Message-ID: <1118780835.244765.304880@g44g2000cwa.googlegroups.com>

Thomas A. Russ wrote:
> This seems to be quite reasonable way of doing this.  The only supported
> ways to iterate over hash tables is via the MAPHASH function or using
> the LOOP macro.  About the only change I would make in your function
> would be to specify an initial size for the result hash table.  That is
> purely an efficiency measure, since it will eliminate the need to grow
> the result hash table (which requires rehashing all current contents).

Thanks for the input.  I figured I'd get around to using all of the
available keyword arguments to make-hash-table the way I've used the
:test argument, since there are accessor functions for each of these.

That way, the returned hashtable would match the given hashtable with
regards to the size, rehash-size and rehash-threshold as well.  At this
point, though, this application doesn't need to be particularly
efficient, so I've been making my hashtables with the default values
for these arguments.

Will.

From: Andreas Thiele
Subject: Re: Maphash into a new hashtable
Date: Tue, 14 Jun 2005 21:50:27 +0000
Message-ID: <d8njf6$a8i$02$1@news.t-online.com>

"Will McCutchen" <·········@gmail.com> schrieb im Newsbeitrag
·····························@g44g2000cwa.googlegroups.com...
> Thomas A. Russ wrote:
> > This seems to be quite reasonable way of doing this.  The only supported
> > ways to iterate over hash tables is via the MAPHASH function or using
> > the LOOP macro.  About the only change I would make in your function
> > would be to specify an initial size for the result hash table.  That is
> > purely an efficiency measure, since it will eliminate the need to grow
> > the result hash table (which requires rehashing all current contents).
>
> Thanks for the input.  I figured I'd get around to using all of the
> available keyword arguments to make-hash-table the way I've used the
> :test argument, since there are accessor functions for each of these.
>
> That way, the returned hashtable would match the given hashtable with
> regards to the size, rehash-size and rehash-threshold as well.  At this
> point, though, this application doesn't need to be particularly
> efficient, so I've been making my hashtables with the default values
> for these arguments.
>
>
> Will.
>

If your application doesn't need to be efficient you might use property list
or association list.

Hashtables are for improving performance. You can easily replace a property
list by a hashtable later, when you are sure you need performance.

I think it is not the lispy way to start with considering performance. If
you only have 20 entries in your list, a hashtable may even be a performance
penalty.

Andreas

From: nepheles
Subject: Re: Maphash into a new hashtable
Date: Wed, 15 Jun 2005 21:39:05 +0000
Message-ID: <1118871545.898727.308740@g49g2000cwa.googlegroups.com>

> If your application doesn't need to be efficient you might use property list
> or association list.
>
> Hashtables are for improving performance. You can easily replace a property
> list by a hashtable later, when you are sure you need performance.
>
> I think it is not the lispy way to start with considering performance. If
> you only have 20 entries in your list, a hashtable may even be a performance
> penalty.
>
> Andreas

This is something I've come up against a few times. While hash-tables
are the "general-purpose" dictionaries in most languages, Lisp offers
p-lists and a-lists as alternatives.

I almost never use the former. What benefits to you get in return for
the efficiency trade-off of p-lists?

I think I can see the attraction of a-lists: you can (read) and (print)
them without issue, use all standard list functions (e.g. to copy it),
and any :test (instead of some implementation-dependent custom
predicate/hash-function pair). Also, I suppose they don't waste space
as a hash-table can/does, and never need rehashing, but that hardly
makes them efficient :). Am I missing anything else major?

Having said that, I generally find I use hash-tables anyway. What about
other Lispers?

From: Pascal Bourguignon
Subject: Re: Maphash into a new hashtable
Date: Wed, 15 Jun 2005 22:32:18 +0000
Message-ID: <87slzjs13h.fsf@thalassa.informatimago.com>

"nepheles" <········@myrealbox.com> writes:

>> If your application doesn't need to be efficient you might use property list
>> or association list.
>>
>> Hashtables are for improving performance. You can easily replace a property
>> list by a hashtable later, when you are sure you need performance.
>>
>> I think it is not the lispy way to start with considering performance. If
>> you only have 20 entries in your list, a hashtable may even be a performance
>> penalty.
>>
>> Andreas
>
> This is something I've come up against a few times. While hash-tables
> are the "general-purpose" dictionaries in most languages, Lisp offers
> p-lists and a-lists as alternatives.
>
> I almost never use the former. What benefits to you get in return for
> the efficiency trade-off of p-lists?
>
> I think I can see the attraction of a-lists: you can (read) and (print)
> them without issue, use all standard list functions (e.g. to copy it),
> and any :test (instead of some implementation-dependent custom
> predicate/hash-function pair). Also, I suppose they don't waste space
> as a hash-table can/does, and never need rehashing, but that hardly
> makes them efficient :). Am I missing anything else major?

Yes.  All what your computer science teachers ever told you was wrong.
About this big O stuff.


Well, not exactly wrong.  They just forgot to tell (or you forgot to
listen to) the trivial bit that there are constants involved.

When you have a algorithm that allows you to fetch data in O(1), and
another that allows you to fetch datain O(N) N being the size of the
data set, you'd think that it would cost 10 time units of times to
fetch a data item in a set of size 10, with the later, and only one
time unit in the former.  Far from it.  What O(1) and O(N) tell, is
that O(1) will get you data in at most C1 time units, while the O(N)
will get you the data in at most C2�N time units.  

Now, if C1=100 and C2=5, when will it be interesting to use the O(N)
algorithm instead of the O(1) one?

  5*N<100 <=> N<20

So while N is less than 20, it's better to use the O(N) algorithm.


The actual break points are:

for interpreted clisp: 500 approx.
for compiled clisp:     50 approx.
for compiled sbcl:       5 approx.


(defun plist-or-hash (&optional (f 1000))
  (mapc (lambda (n)
          (format t "~2%list[~D]" n)
          (time (let ((m '())
                      (k '())
                      (r '()))
                  (dotimes (i n m)
                    (push (gensym) k)
                    (push i m)
                    (push (car k) m))
                  (dotimes (j f)
                    (dolist (key k)
                      (push (getf m key) r)))))
          (format t "~2%hash[~D]" n)
          (time (let ((m (make-hash-table)) ;;; :test (function equalp)))
                      (k '())
                      (r '()))
                  (dotimes (i n m)
                    (push (gensym) k)
                    (setf (gethash (car k) m) i))
                  (dotimes (j f)
                    (dolist (key k)
                      (push (gethash key m) r))))))
        '(1 5 10 50 100 500 1000))
  );;plist-or-hash

(compile 'plist-or-hash)

(prog1 (values) (plist-or-hash #+clisp 1000 #-clisp 1000000))



Another good property of a-list and p-list, is that they can be used
as stacks, where new keys hide older ones, so different part of the
program can see different "bindings":

(setf plist '(:value 1))
(let ((p1 plist))
  (push 2      plist)
  (push :value plist)
  (let ((p2 plist))
     (print (getf p1 :value))
     (print (getf p2 :value))))



> Having said that, I generally find I use hash-tables anyway. What about
> other Lispers?


-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
Our enemies are innovative and resourceful, and so are we. They never
stop thinking about new ways to harm our country and our people, and
neither do we. -- Georges W. Bush

From: Zach Beane
Subject: Re: Maphash into a new hashtable
Date: Tue, 14 Jun 2005 19:49:05 +0000
Message-ID: <m3is0g3ej2.fsf@unnamed.xach.com>

···@sevak.isi.edu (Thomas A. Russ) writes:

> This seems to be quite reasonable way of doing this.  The only supported
> ways to iterate over hash tables is via the MAPHASH function or using
> the LOOP macro.

See also WITH-HASH-TABLE-ITERATOR.

Zach

From: Barry Margolin
Subject: Re: Maphash into a new hashtable
Date: Wed, 15 Jun 2005 03:32:23 +0000
Message-ID: <barmar-D065DB.23322314062005@comcast.dca.giganews.com>

In article <···············@sevak.isi.edu>,
 ···@sevak.isi.edu (Thomas A. Russ) wrote:

> "Will McCutchen" <·········@gmail.com> writes:
> 
> > I am probably going to really butcher this explanation, but here goes.
> > I wrote a function that takes a function and a hashtable.  It returns a
> > new hashtable whose keys are identical to those in the original
> > hashtable and whose values are the result of applying the given
> > function to the values in the original hashtable.  Here it is:
> > 
> > (defun maphash->hash (function table)
> >   "Maps FUNCTION across hashtable TABLE, returning a new
> >    hashtable whose keys are identical to those in TABLE but
> >    whose values are the result of applying FUNCTION to each
> >    key and value in TABLE"
> >   (let ((result-table (make-hash-table :test (hash-table-test table))))
> >     (maphash #'(lambda (key value) (setf (gethash key result-table)
> > (funcall function key value))) table)
> >     result-table))
> > 
> > This post might be absolutely pointless, since I don't actually have a
> > problem.  I needed this functionality, so I wrote it, and it works.
> > I'm posting this because I'm very new to Lisp and I have this feeling
> > that I'm missing something here or that I'm approaching this problem
> > from the wrong side.  Is there a better way to do this?  What should I
> > call this function?
> 
> This seems to be quite reasonable way of doing this.  The only supported
> ways to iterate over hash tables is via the MAPHASH function or using
> the LOOP macro.  About the only change I would make in your function
> would be to specify an initial size for the result hash table.  That is
> purely an efficiency measure, since it will eliminate the need to grow
> the result hash table (which requires rehashing all current contents).

Or at least set the size of the new hash table from the size of the 
original one, since you know that the new table will immediately need to 
grow to that size.

-- 
Barry Margolin, ······@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***

From: Will McCutchen
Subject: Re: Maphash into a new hashtable
Date: Wed, 15 Jun 2005 17:01:58 +0000
Message-ID: <1118854918.641105.231040@g14g2000cwa.googlegroups.com>

Barry Margolin wrote:
> Or at least set the size of the new hash table from the size of the
> original one, since you know that the new table will immediately need to
> grow to that size.

Here's the function I'm going with, for now:

(defun maphash->hash (function table)
  (let ((result-table (make-hash-table
		       :test (hash-table-test table)
		       :size (hash-table-size table)
		       :rehash-size (hash-table-rehash-size table)
		       :rehash-threshold (hash-table-rehash-threshold table))))
    (maphash #'(lambda (key value) (setf (gethash key result-table)
(funcall function key value))) table)
    result-table))

The output hashtable now matches the input hashtable with respect to
test, size, rehash-size and rehash-threshold.  I've called it
MAPHASH->HASH for two reasons:

1)  I couldn't think of a better name, and I thought I would end up
confusing myself if I called it HASHMAP

2)  I'm new to Lisp and I like the fact that I can draw arrows in my
function names.

Thank you all for your input, I really appreciate it.

Will.

From: Tron3k
Subject: Re: Maphash into a new hashtable
Date: Wed, 22 Jun 2005 14:43:02 +0000
Message-ID: <1119451382.004353.287530@f14g2000cwb.googlegroups.com>

I love putting arrows in my function names too. That alone is
justification enough to use Lisp instead of C++. :)

From: Pascal Bourguignon
Subject: Re: Maphash into a new hashtable
Date: Wed, 22 Jun 2005 16:56:04 +0000
Message-ID: <87vf465o0r.fsf@thalassa.informatimago.com>

"Tron3k" <······@gmail.com> writes:

> I love putting arrows in my function names too. That alone is
> justification enough to use Lisp instead of C++. :)

Well, you can put enough right arrows in C++.  Use more left arrows! 


-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
Litter box not here.
You must have moved it again.
I'll poop in the sink.

From: Tron3k
Subject: Re: Maphash into a new hashtable
Date: Thu, 23 Jun 2005 15:54:01 +0000
Message-ID: <1119542041.524099.47060@g49g2000cwa.googlegroups.com>

Pascal Bourguignon wrote:
> "Tron3k" <······@gmail.com> writes:
>
> > I love putting arrows in my function names too. That alone is
> > justification enough to use Lisp instead of C++. :)
>
> Well, you can put enough right arrows in C++.  Use more left arrows!
>
>
> --
> __Pascal Bourguignon__                     http://www.informatimago.com/
> Litter box not here.
> You must have moved it again.
> I'll poop in the sink.

Haha, that's true! You know, once I used a left-right arrow, like this:
<->

In fact, it was in my function convert-rgb<->bgr, because, of course,
that function just switches between them. Lisp is so fun :)

And what in God's name is wrong with Google Groups? It looks like my
previous post is showing up below your post, and none of the quoting
came out right. Moreover, that messed-up version of my post appeared,
to my eyes, a whole day after I posted it.

From: Andreas Thiele
Subject: Re: Maphash into a new hashtable
Date: Tue, 14 Jun 2005 17:53:28 +0000
Message-ID: <d8n5ir$1t1$00$1@news.t-online.com>

"Will McCutchen" <·········@gmail.com> schrieb im Newsbeitrag
·····························@z14g2000cwz.googlegroups.com...
> Hi,
>
> I am probably going to really butcher this explanation, but here goes.
> I wrote a function that takes a function and a hashtable.  It returns a
> new hashtable whose keys are identical to those in the original
> hashtable and whose values are the result of applying the given
> function to the values in the original hashtable.  Here it is:
>
> (defun maphash->hash (function table)
>   "Maps FUNCTION across hashtable TABLE, returning a new
>    hashtable whose keys are identical to those in TABLE but
>    whose values are the result of applying FUNCTION to each
>    key and value in TABLE"
>   (let ((result-table (make-hash-table :test (hash-table-test table))))
>     (maphash #'(lambda (key value) (setf (gethash key result-table)
> (funcall function key value))) table)
>     result-table))
>
> This post might be absolutely pointless, since I don't actually have a
> problem.  I needed this functionality, so I wrote it, and it works.
> I'm posting this because I'm very new to Lisp and I have this feeling
> that I'm missing something here or that I'm approaching this problem
> from the wrong side.  Is there a better way to do this?  What should I
> call this function?
>
> Thanks,
>
>
> Will.
>

I think it is absolutely OK especially if you need this more than once.

You encapsulated the non functional operation on the hash table. Now you
have a functional description of your desired operation. I'd consider this
as advantage (same input -> same output).


Andreas

From: Will McCutchen
Subject: Re: Maphash into a new hashtable
Date: Tue, 14 Jun 2005 20:28:35 +0000
Message-ID: <1118780915.920160.70490@f14g2000cwb.googlegroups.com>

Andreas Thiele wrote:
> I think it is absolutely OK especially if you need this more than once.
>
> You encapsulated the non functional operation on the hash table. Now you
> have a functional description of your desired operation. I'd consider this
> as advantage (same input -> same output).

Thanks... I'm glad to see that it wasn't an absolutely dumb idea!

From: Pascal Bourguignon
Subject: Re: Maphash into a new hashtable
Date: Tue, 14 Jun 2005 18:15:02 +0000
Message-ID: <874qc0vm8p.fsf@thalassa.informatimago.com>

"Will McCutchen" <·········@gmail.com> writes:

> Hi,
>
> I am probably going to really butcher this explanation, but here goes.
> I wrote a function that takes a function and a hashtable.  It returns a
> new hashtable whose keys are identical to those in the original
> hashtable and whose values are the result of applying the given
> function to the values in the original hashtable.  Here it is:
>
> (defun maphash->hash (function table)
>   "Maps FUNCTION across hashtable TABLE, returning a new
>    hashtable whose keys are identical to those in TABLE but
>    whose values are the result of applying FUNCTION to each
>    key and value in TABLE"
>   (let ((result-table (make-hash-table :test (hash-table-test table))))
>     (maphash #'(lambda (key value) (setf (gethash key result-table)
> (funcall function key value))) table)
>     result-table))
>
> This post might be absolutely pointless, since I don't actually have a
> problem.  I needed this functionality, so I wrote it, and it works.
> I'm posting this because I'm very new to Lisp and I have this feeling
> that I'm missing something here or that I'm approaching this problem
> from the wrong side.  Is there a better way to do this?  

> What should I call this function?

You could name it HASHMAP.

Too bad that all MAP??? are functions, but MAPHASH is a procedure.

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
Until real software engineering is developed, the next best practice
is to develop with a dynamic system that has extreme late binding in
all aspects. The first system to really do this in an important way
is Lisp. -- Alan Kay

From: Edi Weitz
Subject: Re: Maphash into a new hashtable
Date: Tue, 14 Jun 2005 18:24:24 +0000
Message-ID: <uacls94pz.fsf@agharta.de>

On Tue, 14 Jun 2005 20:15:02 +0200, Pascal Bourguignon <···@informatimago.com> wrote:

> Too bad that all MAP??? are functions, but MAPHASH is a procedure.

I can't find the term "procedure" in the CLHS glossary.  Wrong
newsgroup?

-- 

Lisp is not dead, it just smells funny.

Real email: (replace (subseq ·········@agharta.de" 5) "edi")

From: Andreas Thiele
Subject: Re: Maphash into a new hashtable
Date: Tue, 14 Jun 2005 21:21:11 +0000
Message-ID: <d8nhoa$dm$03$1@news.t-online.com>

"Edi Weitz" <········@agharta.de> schrieb im Newsbeitrag
··················@agharta.de...
> On Tue, 14 Jun 2005 20:15:02 +0200, Pascal Bourguignon
<···@informatimago.com> wrote:
>
> > Too bad that all MAP??? are functions, but MAPHASH is a procedure.
>
> I can't find the term "procedure" in the CLHS glossary.  Wrong
> newsgroup?
> ...

A thorough explanation of lisp procedures can be found at

http://www.mactech.com/articles/mactech/Vol.01/01.08/FunctionsinLisp/

:))

To be serious: I think he meant it's imperative - you can only use it by
creating side effects respectively using assignments.

Andreas

From: Rob Warnock
Subject: Re: Maphash into a new hashtable
Date: Wed, 22 Jun 2005 00:42:19 +0000
Message-ID: <wP-dnVq3WNJ2LiXfRVn-3w@speakeasy.net>

Andreas Thiele <······@nospam.com> wrote:
+---------------
| To be serious: I think he meant it's imperative - you can only use it by
| creating side effects respectively using assignments.
+---------------

Actually, while MAPHASH->HASH may use imperative features *internally*,
externally it's behaviour is fully functional [or at least as functional
as the function over which it's mapping], and therefore I would call it
"as functional as MAPCAR" and let it go at that.

Now if it *mutated* the existing input hash table's values...  ;-}
[But it doesn't.]


-Rob

-----
Rob Warnock			<····@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607

From: Andreas Thiele
Subject: Re: Maphash into a new hashtable
Date: Wed, 22 Jun 2005 13:32:46 +0000
Message-ID: <d9bpa3$akc$03$1@news.t-online.com>

"Rob Warnock" <····@rpw3.org> schrieb im Newsbeitrag
···························@speakeasy.net...
> ...
> Actually, while MAPHASH->HASH may use imperative features *internally*,
> externally it's behaviour is fully functional
> ...

Yes, that is what I said in my second post to this thread.

Commenting Edi's question only should make things clear for the newbie OP.

;)


Andreas