From: Xah Lee
Subject: Emacs lisp Lesson: Hash Table
Date: 
Message-ID: <1945a942-d0a8-4978-b7dc-1649be347828@e6g2000prf.googlegroups.com>
Emacs lisp Lesson: Hash Table

Xah Lee, 2008-01

This page shows you how to use hash table in emacs lisp. If you don't
know elisp, first take a gander at Emacs Lisp Basics.

(for HTML version with colors and links, see:
 http://xahlee.org/emacs/elisp_hash_table.html
)

---------------------------------------------
What Are Hash Tables

Often, you need to have a list with elements being key-value pairs.
For example, the key may be a person's name, and the value can be
their age. Or, in a more complex case, the key can be the person's ID
number, and the value can be a data of name, age, sex, phone number,
address, etc. You might have thousands of such entries, and you want
to be able to know that, given a ID, find out whether it exists in
your data. You also want to be able to add or delete, modify entries
in the data. And, you need this operation to be fast, even if you have
thousands or million entries. (If your data needs more entries than
that, the proper solution is to use a Relational Database¨J.)

From a interface point of view, such a keyed list looks like this:
"((key1 value1) (key2 value2) (key3 value3) ...)". If you want to know
a particular key exist, you can write a loop thru the list to test if
any first element of a element matches your given key. Similarly, you
can add a entry by appending to the list, or delete and modify the
list using your language's list-manipulation operators or functions.
All this functionality is provided in elisp, called Association List.

Reference: Elisp Manual: Association-Lists.

However, this method will be too slow to the point of unusable if your
have thousands of entries and frequently need to check, add, delete,
or modify the data. This is because most languages are not smart
enough to analyze your use of list to determine how best to compile it
in the language. The solution provided in most high-level languages of
today is to provide a particular data type (called a hash table), so
that when you want to use a list of huge number of pairs with fast
lookup, you use that data type.

For example, here's how hash table looks like in some high-level
languages:

# perl
%mydata = ("mary"=> 4, "jane"=> 5, "vicky"=>7);

# PHP
$mydata = array("mary"=> 4, "jane"=> 5, "vicky"=> 7);

# Python
mydata = {"mary":4, "jane":5, "vicky":7}

(see Keyed List in Python and Perl, PHP: Keyed List) Similarly, elisp
provides a hash table datatype.

Note: Hash table is also known as hash, keyed-list, dictionary, map.
It's called hash table for historical reasons. Basically, a technique
to create fast-lookup of a data, is by using a so-called hash-
function, that converts the key into a unique (randomish) number. So
that, instead of going thru the huge list to check every item, you
just check one particular memory address to see it exists. Such a
function is called "hash-function" because its key-conversion
operation is analogous to processing a type of food dish made by
disorderly and irreversibly chopping up and mixing up veggies or meat.
(See Hash function¨J, Hash (food)¨J, Hash browns¨J. )


---------------------------------------------
Hash Table In Elisp

To create a hash table, use "(make-hash-table :test 'equal)".

Here's a complete code that shows how to creat a hash table, then add,
change, remove, entries:

(let (myhash val)

  ;; create a hash table
  (setq myhash (make-hash-table :test 'equal))

  ;; add entries
  (puthash "mary" "19" myhash)
  (puthash "jane" "20" myhash)
  (puthash "carrie" "17" myhash)
  (puthash "liz" "21" myhash)

  ;; modify a entry's value
  (puthash "jane" "16" myhash)

  ;; remove a entry
  (remhash "liz" myhash)

  ;; get a entry's value
  (setq val (gethash "jane" myhash))

  (message val) ; print it
)

In the above example, both the entry's keys are datatype "strings".
However, elisp's hash table can have any lisp object as the key or
value. When we create a hash table, we used ":test 'equal" to let
elisp know that the function "equal" should be used when testing
whether a key exists. If your keys are other lisp data type, you need
to specify which equality functon emacs should use to test if a key
exist. The available value for ":test" are "'eql", "'eq", "'equal".
(You can also define your own equality test. See elisp manual for
detail. (linked at bottom of this page))

To clear all entries in a hash table, use "(clrhash <<myhash>>)".

-------------------
Check If A Key Exists

To check if a key exists, use the "(gethash <<mykey>> <<myhash>>)". If it
exists, its value is returned, else "nil" is returned. Example:

(let (myhash)
  (setq myhash (make-hash-table :test 'equal))
  (puthash 'mary "19" myhash)

  (gethash 'vicky myhash) ; returns nil
)

"gethash" has a optional third parameter "default". If the key does
not exist, "default" is returned.

-------------------
Apply A Function To All Key-Value Pairs

To apply a function to all entries in a hash table, use "(maphash
<<myfun>> <<myhashtable>>)". The function "myfun" must take 2 arguments,
key and value. For example, if you want a list of all keys, you can
write a function that puts the key into a list, then print out the
list. We show several examples below.

-------------------
Get All Keys

To get all keys into a list, we can use a maphash with a function that
pulls the keys into a list. Here's a example:

;; example of getting all keys from a hash table
(let (myhash allkeys)

  ;; add items to the hash
  (setq myhash (make-hash-table :test 'equal))
  (puthash "mary" "19" myhash)
  (puthash "jane" "20" myhash)
  (puthash "carrie" "17" myhash)
  (puthash "liz" "21" myhash)

  ;; set to a empty list
  (setq allkeys '())

  (defun pullkeys (kk vv)
    "prepend the key "kk" to the list "allkeys""
    (setq allkeys (cons kk allkeys))
  )

  ;; get all the keys
  (maphash 'pullkeys myhash)

  ;; sort and print it out
  (sort allkeys 'string<)
)

In the above example, "pullkeys" is used only once. Since it is used
only once, we don't need to define it separately, but just use lambda
construct. So, the maphash line can be just like this: "(maphash
(lambda (kk vv) (setq allkeys (cons kk allkeys))) myhash)".

You can define a "keys" function that returns all keys of a given hash
table. Like this:

(defun keys (hashtable)
  "Return all keys in hashtable."
  (let (allkeys)
    (maphash (lambda (kk vv) (setq allkeys (cons kk allkeys)))
hashtable)
    allkeys
  )
)

With this function, you can use it like this: "(keys myhash)".

-------------------
Get All Values

Getting all values is just like getting all keys above. Here's a
function that does it.

(defun hash-values (hashtable)
  "Return all keys in hashtable."
  (let (allvals)
    (maphash (lambda (kk vv) (setq allvals (cons vv allvals)))
hashtable)
    allvals
  )
)

-------------------
Print All Entries Sorted By Key

Here's a example how you can print out all entries in a hash table by
sorted key.

First, we define a function that turns a hash table into a list.

(defun hash-to-list (hashtable)
  "Return a list that represent the hashtable."
  (let (mylist)
    (maphash (lambda (kk vv) (setq mylist (cons (list kk vv) mylist)))
hashtable)
    mylist
  )
)

;; example of turning a hash table into list then sort it
(let (myhash mylist)

  ;; add items to the hash
  (setq myhash (make-hash-table :test 'equal))
  (puthash "mary" "19" myhash)
  (puthash "jane" "20" myhash)
  (puthash "carrie" "17" myhash)
  (puthash "liz" "21" myhash)

  ;; get the hash table into a list
  (setq mylist (hash-to-list myhash))

  ;; sort and print it out
  (sort mylist (lambda (a b) (string< (car a) (car b))))
)

;; prints (("carrie" "17") ("jane" "20") ("liz" "21") ("mary" "19"))


-------------------
Hash Table With Lisp Objects

Elisp's hash table's key or value can be any lisp object. Also, you
can define your own equality test, as well as providing few other
parameters when creating a hash table. For detail, see elisp manual.

Reference: Elisp Manual: Hash-Tables.

Emacs is beautiful!

  Xah
  ···@xahlee.org
¡Æ http://xahlee.org/

From: Xah Lee
Subject: Re: Emacs lisp Lesson: Hash Table
Date: 
Message-ID: <65024d3d-2823-44e9-b040-8e54d3bfb909@i12g2000prf.googlegroups.com>
in emacs lisp, when creating a hash table, there's the ¡°:weakness¡±
thing.

See
http://xahlee.org/elisp/Creating-Hash.html

what does that mean? Suppose if i do

  (setq myhash (make-hash-table :test 'equal :weakness 'key))
  (puthash "mary" "19" myhash)
  ...

Then, the "mary" would disappear if i don't access it??

  Xah
  ···@xahlee.org
¡Æ http://xahlee.org/
From: John Thingstad
Subject: Re: Emacs lisp Lesson: Hash Table
Date: 
Message-ID: <op.t4c9rkpqut4oq5@pandora.alfanett.no>
På Thu, 03 Jan 2008 17:44:01 +0100, skrev Xah Lee <···@xahlee.org>:

> in emacs lisp, when creating a hash table, there's the “:weakness”
> thing.
>
> See
> http://xahlee.org/elisp/Creating-Hash.html
>
> what does that mean? Suppose if i do
>
>   (setq myhash (make-hash-table :test 'equal :weakness 'key))
>   (puthash "mary" "19" myhash)
>   ...
>
> Then, the "mary" would disappear if i don't access it??
>
>   Xah
>   ···@xahlee.org
> ∑ http://xahlee.org/
>

No. key isn't owned by the hash. So if the owner of the object removes the  
reference, access to the assosiation returns nil. Without weakness: 'key  
the object couldn't be removed before all references (in this case the  
ones held by hash also) were lost. This is usefull for caching among other  
things where you want quick access to a value if it is still in memory,  
but do not want control over the objects.
In the case where you only have the key object in the hash table it seems  
to retain it, but then :weakness doesn't make much sense and should be nil  
(default)

type <ctrl>-h i m elisp <enter> => gets the elisp manual
i make-hash<tab><enter> => gets the documentation for make-hash-table

see under :weakness

--------------
John Thingstad
From: Rainer Joswig
Subject: Re: Emacs lisp Lesson: Hash Table
Date: 
Message-ID: <joswig-4761B0.19435403012008@news-europe.giganews.com>
In article 
<····································@i12g2000prf.googlegroups.com>,
 Xah Lee <···@xahlee.org> wrote:

> in emacs lisp, when creating a hash table, there's the ��:weakness��
> thing.
> 
> See
> http://xahlee.org/elisp/Creating-Hash.html
> 
> what does that mean? Suppose if i do
> 
>   (setq myhash (make-hash-table :test 'equal :weakness 'key))
>   (puthash "mary" "19" myhash)
>   ...
> 
> Then, the "mary" would disappear if i don't access it??
> 
>   Xah
>   ···@xahlee.org
> �� http://xahlee.org/

As long as objects are referenced they cannot be garbage collected.
Objects can be referenced for example by variables, being
part of a list, array or hash-table. As long as there is
one reference to the object, it cannot be garbage collected.
If objects are referenced by a weak hash-table, this reference
does not count. So, if an object is only referenced by
one or more weak hash-tables, it can be garbage collected.

So, the purpose of a weak hash-table is to provide access
to the object while the object 'exists'. The programmer does
not have to remove the object from the hash-table 'manually' when
he/she no longer uses the object.

Some Lisps provide other 'weak' datastructures (weak pointers,
weak arrays, ...).

See also:

  http://www.haible.de/bruno/papers/cs/weak/WeakDatastructures-writeup.html
  http://en.wikipedia.org/wiki/Weak_reference

-- 
http://lispm.dyndns.org/
From: Alex Mizrahi
Subject: Re: Emacs lisp Lesson: Hash Table
Date: 
Message-ID: <477d261e$0$90270$14726298@news.sunsite.dk>
 XL> what does that mean? Suppose if i do

 XL>   (setq myhash (make-hash-table :test 'equal :weakness 'key))
 XL>   (puthash "mary" "19" myhash)
 XL>   ...
 XL> Then, the "mary" would disappear if i don't access it??

----
Weak hash tables are useful for keeping track of information in a 
non-obtrusive way, for example to implement caching. If the cache contains 
objects such as buffers, markers, image instances, etc. that will eventually 
disappear and get garbage-collected, using a weak hash table ensures that 
these objects are collected normally rather than remaining around forever, 
long past their actual period of use. (Otherwise, you'd have to explicitly 
map over the hash table every so often and remove unnecessary elements.)
----

as you see it make more sense to use it for some objects rather than for 
strings, when these objects are referenced not only in your hash but in some 
other places too.