From: Dmitrii Manin
Subject: assoc or binary search?
Date: 
Message-ID: <MANIN.95Sep19170953@pendragon.rockefeller.edu>
Hi,

I have a big alist, consisting of thousands of elements, with cars
being strings and cdrs further lists (not so big). It is sorted
alphabetically according to cars. My question is: in emacs lisp,
what is more efficient -- built-in `assoc' or a binary search
implemented in elisp? 

Basically, this amounts to whether assoc implements any smart
strategies or fast assembler/C code, or is it equivalent to a simple
consequential lookup written in lisp?

Thanks for any hint


--
- M

This is a custom comp.lang.lisp-specific signature

From: Erik Naggum
Subject: Re: assoc or binary search?
Date: 
Message-ID: <19950920T005924Z@naggum.no>
[Dmitrii Manin]

|   I have a big alist, consisting of thousands of elements, with cars
|   being strings and cdrs further lists (not so big).  It is sorted
|   alphabetically according to cars.  My question is: in emacs lisp, what
|   is more efficient -- built-in `assoc' or a binary search implemented in
|   elisp?

the builtin `assoc' will be faster, because a binary search on a list is a
horribly inefficient proposal.

|   Basically, this amounts to whether assoc implements any smart
|   strategies or fast assembler/C code, or is it equivalent to a simple
|   consequential lookup written in lisp?

Emacs Lisp actually has a sort of hash table, but it only works on symbols,
and is not really space-efficient; it's the obarray.  since you have
strings for keys, this is probably going to win big.

here's a tiny hash-table package (fresh from my keyboard):

;;; hash-table.el --- hash-table support functions

;; Copyright 1995 Free Software Foundation, Inc.

;; Author: Erik Naggum <····@naggum.no>

;; This code is not part of GNU Emacs (yet).

;; This code is free software; you can redistribute it and/or modify it
;; under the terms of the GNU General Public License as published by the
;; Free Software Foundation; either version 2, or (at your option) any
;; later version.

;; This code is distributed in the hope that it will be useful, but WITHOUT
;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
;; FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
;; for more details.

;; You should have received a copy of the GNU General Public License along
;; with GNU Emacs, for which this code is intended; see the file COPYING.
;; If not, write to the Free Software Foundation, 59 Temple Place, Suite
;; 330, Boston, MA 02111-1307, USA.

;;; Commentary:

;; Using obarrays, provide a rudimentary hash-table mechanism.
;; Requires Emacs 19.29 or higher.

;;; Code:

(defun make-hash-table (&optional size)
  "Make a hash table of with SIZE slots.
SIZE should be prime for best effect."
  (make-vector (or size 101) 0))

(defun gethash (key hash-table &optional default)
  "Return value associated with KEY in HASH-TABLE."
  (let* ((hash-key (if (stringp key) key (prin1-to-string key)))
	 (symbol (intern-soft hash-key hash-table)))
    (or (symbol-value symbol) default)))

(defun sethash (key hash-table value)
  "Set the value associated with KEY in HASH-TABLE to VALUE."
  (let* ((hash-key (if (stringp key) key (prin1-to-string key)))
	 (symbol (intern hash-key hash-table)))
    (set symbol value)))

(defun remhash (key hash-table)
  "Remove the entry associated with KEY in HASH-TABLE."
  (let* ((hash-key (if (stringp key) key (prin1-to-string key))))
    (unintern hash-key hash-table)))

(defun maphash (function hash-table)
  "Call FUNCTION on every (key value) pair in HASH-TABLE.
Note: all keys are strings in this implementation."
  (mapatoms `(lambda (sym)
	       (funcall ,function (symbol-name sym) (symbol-value sym)))
	    hash-table))

(defun clrhash (hash-table)
  "Removes all entries in HASH-TABLE."
  (mapatoms `(lambda (sym)
	       (unintern sym ,hash-table))
	    hash-table)
  hash-table)

(defsetf gethash sethash)

;;; hash-table.el ends here

hash-tables afford nearly constant-time access, while an alist gives access
times proportional to the length of the list.

(you may wish to prefix the function names above with some unique prefix
since Emacs Lisp doesn't have packages, and others might want to write
their own hash-table support functions, too.  unless I get this into 19.30,
that is, for which the above comments and copyright notice.)

#<Erik 3020547563>
-- 
there's a car that has more computing power than it took to get man
to the moon: the BMW 750.  but _you_ are stuck here on earth.  haha.
From: Marty Hall
Subject: Re: assoc or binary search?
Date: 
Message-ID: <DF7Fo8.57o@aplcenmp.apl.jhu.edu>
In article <·················@scarp.ed.ac.uk> ···@ed.ac.uk (Tim Bradshaw) writes:
>* Erik Naggum wrote:
>> Emacs Lisp actually has a sort of hash table, but it only works on symbols,
>> and is not really space-efficient; it's the obarray.  since you have
>> strings for keys, this is probably going to win big.
>
>XEmacs has native hashtables which hash on eqlness (I think), and even
>weak hashtables.

And I think that the CL-style hashing functions in cl.el (really
cl-extra.el) check to see if these tables are available and use them
if they are.
						- Marty
(proclaim '(inline skates))
From: David Neves
Subject: Re: assoc or binary search?
Date: 
Message-ID: <neves-2009950940180001@neves.ils.nwu.edu>
In article <···················@pendragon.rockefeller.edu>,
·····@camelot.rockefeller.edu wrote:

:  Hi,
:  
:  I have a big alist, consisting of thousands of elements, with cars
:  being strings and cdrs further lists (not so big). It is sorted
:  alphabetically according to cars. My question is: in emacs lisp,
:  what is more efficient -- built-in `assoc' or a binary search
:  implemented in elisp? 
:  
:  Basically, this amounts to whether assoc implements any smart
:  strategies or fast assembler/C code, or is it equivalent to a simple
:  consequential lookup written in lisp?
Assoc would have to be very much faster than Gnu Emacs byte code to make a
linear search be faster than a binary search (on a balanced tree) for
thousands of elements.  I doubt assoc item processing is dramatically
faster (by a factor of a thousand or more) to be able to overcome its
linear nature.  However, as others have pointed out you probably want to
go with hash tables rather than binary trees in this case.
-David