looking up entries in a table of ranges

From: ······@je1.jsc.nasa.gov
Subject: looking up entries in a table of ranges
Date: Mon, 07 Feb 2000 00:00:00 +0000
Message-ID: <823dr5knm0.fsf@je1.jsc.nasa.gov>

I need to make a table that returns an entry for a value in a given
range.  And I'm wondering if a more experienced programmer can comment
on whether this is a common data structure with a common name.

For example say you have weight x = 50.0

and your *table* looks something like this ...

0.0  <= x <= 10.0       this stuff
10.0 <  x <= 40.0	this other stuff
40.0 <  x <= 100.0	yet some other stuff
... and so on

and you would call (get-val 50.0 *table*) to get the data for that
range.  

Is there a common name for this? 
Is this something that should just be thrown into a list?

--Jason Kantz

Re: looking up entries in a table of ranges Gareth McCaughan
- Re: looking up entries in a table of ranges ······@je1.jsc.nasa.gov
  - Re: looking up entries in a table of ranges Gareth McCaughan
  - Re: looking up entries in a table of ranges Marco Antoniotti
    - Re: looking up entries in a table of ranges Jason Kantz
    - Re: looking up entries in a table of ranges Tom Breton
      - Re: looking up entries in a table of ranges Marco Antoniotti
- Re: looking up entries in a table of ranges Robert Munyer
  - Re: looking up entries in a table of ranges Jason Kantz
Re: looking up entries in a table of ranges Andrew Cooke

From: Gareth McCaughan
Subject: Re: looking up entries in a table of ranges
Date: Tue, 08 Feb 2000 00:00:00 +0000
Message-ID: <86g0v4bjdw.fsf@g.local>

······@je1.jsc.nasa.gov writes:

> I need to make a table that returns an entry for a value in a given
> range.  And I'm wondering if a more experienced programmer can comment
> on whether this is a common data structure with a common name.
> 
> For example say you have weight x = 50.0
> 
> and your *table* looks something like this ...
> 
> 0.0  <= x <= 10.0       this stuff
> 10.0 <  x <= 40.0	this other stuff
> 40.0 <  x <= 100.0	yet some other stuff
> ... and so on
> 
> and you would call (get-val 50.0 *table*) to get the data for that
> range.  
> 
> Is there a common name for this? 

Not that I know of, but there's a lot I don't know.

> Is this something that should just be thrown into a list?

Depends on how many intervals you have, what you know
about the likely distribution of the x-values, and so
on. If the number of intervals isn't too small, you
will get faster answers by using a tree structure;
so if the divisions between your ranges are 10,40,100,180,260,800
then you could do, in effect:

  if x <= 100:
    if x <= 40:    we're in the 10..40 range
    else:          we're in the 40..100 range
  else:
    if x <= 260:
      if x <= 180: we're in the 100..180 range
      else:        we're in the 180..260 range
    else:
      if x <= 800: we're in the 260..800 range
      else:        we're in the 800..infinity range

There are plenty of ways to get this kind of calculation
done at runtime, of course. The most elegant, and probably
the least efficient, goes something like this:

    (defclass range-discriminator () ())

    (defclass trivial-range-discriminator (range-discriminator)
      result)

    (defclass split-range-discriminator (range-discriminator)
      (dividing-point :reader dividing-point)
      (when-below     :reader when-below     :type range-discriminator)
      (when-not-below :reader when-not-below :type range-discriminator))

    (defmethod discriminate ((r trivial-range-discriminator) x)
      (trivial-range-discriminator-result r))

    (defmethod discriminate ((r split-range-discriminator) x)
      (if (< x (dividing-point r))
        (discriminate (when-below r) x)
        (discriminate (when-not-below r) x)))

I'm too lazy to bother defining the constructors.

It might be a win to "truncate" below some minimum level
of complexity and switch over to a straightforward list
traversal then. In terms of the implementation sketched
above, that would mean having another subclass called
something like POLYSPLIT-RANGE-DISCRIMINATOR that holds
a larger number of division points. Sort of like the way
you truncate a quicksort routine before it recurses all
the way down to subarrays of 1 or 2 elements.

                           *

If the division points are fairly evenly spaced, you might
be able to do a lot better than this, as follows. Divide up
the range of possible values into a number of equal portions,
with the property that most of those portions lie entirely
within single ranges. Now have an array of that size, each
entry of which says (somehow -- the precise representation
is up to you) either "if x is in this portion then the result
we want is definitely so-and-so" or else "ix s is in this
portion then we need to subdivide further", How you handle
those subdivisions is again up to you.

If you do this, and if you can choose a good divisor, then
most of the time all you have to do is one division (plus
float->integer conversion), followed by an array lookup.
Actually, you probably want to replace the division by a
multiplication, which will (1) be faster and (2) help you
to realise that the thing you divide by doesn't have to
be an integer. :-)

                           *

If you spend a *lot* of time searching these tables,
it might be worth turning the data structure into actual
code: build it as s-expressions and call COMPILE.

                           *

Having said all that: the *first* thing you should do is
to implement it in the simplest way possible. Make sure
that all your accesses to the data structure go through
functions with sensible names -- don't use CAR or AREF
or whatever directly. If your program isn't fast enough,
then profile it and see whether this stuff is using up
much of the execution time. Only then should you think
about clever tree structures and table lookups and
dynamic code generation and things.

-- 
Gareth McCaughan  ················@pobox.com
sig under construction

From: ······@je1.jsc.nasa.gov
Subject: Re: looking up entries in a table of ranges
Date: Tue, 08 Feb 2000 00:00:00 +0000
Message-ID: <82n1pbtywy.fsf@je1.jsc.nasa.gov>

Thank you for your comments.  I'm going to use a sorted assoc list
with find-if.  Right now there are few intervals and the data is
loaded from a run-time file containing other lists.

(defun interval-limit (interval-assoc)
  (car interval-assoc))

(defun interval-data (interval-assoc)
  (cdr interval-assoc))

(defun find-interval (value interval-list &key (predicate #'<=) (key #'identity))
  (funcall key
	   (interval-data
	    (find-if (lambda (interval-assoc)
		       (funcall predicate value (interval-limit interval-assoc)))
		     interval-list))))


Jason Kantz
http://kantz.com/jason/

From: Gareth McCaughan
Subject: Re: looking up entries in a table of ranges
Date: Tue, 08 Feb 2000 00:00:00 +0000
Message-ID: <86ya8vz21a.fsf@g.local>

······@je1.jsc.nasa.gov writes:

> Thank you for your comments.  I'm going to use a sorted assoc list
> with find-if.  Right now there are few intervals and the data is
> loaded from a run-time file containing other lists.

Good call. Keep the hairier stuff in mind in case it turns out
that this is a performance bottleneck, but it probably won't be.

-- 
Gareth McCaughan  ················@pobox.com
sig under construction

From: Marco Antoniotti
Subject: Re: looking up entries in a table of ranges
Date: Wed, 09 Feb 2000 00:00:00 +0000
Message-ID: <lwbt5qsqgp.fsf@parades.rm.cnr.it>

······@je1.jsc.nasa.gov writes:

> Thank you for your comments.  I'm going to use a sorted assoc list
> with find-if.  Right now there are few intervals and the data is
> loaded from a run-time file containing other lists.
> 
> (defun interval-limit (interval-assoc)
>   (car interval-assoc))
> 
> (defun interval-data (interval-assoc)
>   (cdr interval-assoc))

What's wrong with 

	(defstruct interval
           limit
           data)

?

Cheers


-- 
Marco Antoniotti ===========================================
PARADES, Via San Pantaleo 66, I-00186 Rome, ITALY
tel. +39 - 06 68 10 03 17, fax. +39 - 06 68 80 79 26
http://www.parades.rm.cnr.it/~marcoxa

From: Jason Kantz
Subject: Re: looking up entries in a table of ranges
Date: Wed, 09 Feb 2000 00:00:00 +0000
Message-ID: <82vh3yny8f.fsf@je1.jsc.nasa.gov>

Marco Antoniotti <·······@parades.rm.cnr.it> writes:

> What's wrong with 
> 
> 	(defstruct interval
>            limit
>            data)
> 
> ?

I need to read the data from a text file and the program already has a
facility to do this for data in lists.

I guess it didn't dawn on me that I can substitute a struct for a
list, by using...

#S(INTERVAL limit 40 data '((30 20) (50 20) (50 40) (30 40)))

instead of 

(40 ((30 20) (50 20) (50 40) (30 40)))

From: Tom Breton
Subject: Re: looking up entries in a table of ranges
Date: Thu, 10 Feb 2000 00:00:00 +0000
Message-ID: <m3hffhg3tc.fsf@world.std.com>

Marco Antoniotti <·······@parades.rm.cnr.it> writes:

> ······@je1.jsc.nasa.gov writes:
> 
> > Thank you for your comments.  I'm going to use a sorted assoc list
> > with find-if.  Right now there are few intervals and the data is
> > loaded from a run-time file containing other lists.
> > 
> > (defun interval-limit (interval-assoc)
> >   (car interval-assoc))
> > 
> > (defun interval-data (interval-assoc)
> >   (cdr interval-assoc))
> 
> What's wrong with 
> 
> 	(defstruct interval
>            limit
>            data)

Good suggestion.  But J, since you're apparently getting data from a
file containing lists, instead write:

        (defstruct (interval (:type list))
        ...)

-- 
Tom Breton, http://world.std.com/~tob
Not using "gh" since 1997. http://world.std.com/~tob/ugh-free.html

From: Marco Antoniotti
Subject: Re: looking up entries in a table of ranges
Date: Thu, 10 Feb 2000 00:00:00 +0000
Message-ID: <lw3dr1fmqv.fsf@parades.rm.cnr.it>

Tom Breton <···@world.std.com> writes:

> Marco Antoniotti <·······@parades.rm.cnr.it> writes:
> 
> > ······@je1.jsc.nasa.gov writes:
> > 
> > > Thank you for your comments.  I'm going to use a sorted assoc list
> > > with find-if.  Right now there are few intervals and the data is
> > > loaded from a run-time file containing other lists.
> > > 
> > > (defun interval-limit (interval-assoc)
> > >   (car interval-assoc))
> > > 
> > > (defun interval-data (interval-assoc)
> > >   (cdr interval-assoc))
> > 
> > What's wrong with 
> > 
> > 	(defstruct interval
> >            limit
> >            data)
> 
> Good suggestion.  But J, since you're apparently getting data from a
> file containing lists, instead write:
> 
>         (defstruct (interval (:type list))
>         ...)

Of course.  The point is really to have the system do things for you
(i.e. build constructors and accessors).

Cheers

-- 
Marco Antoniotti ===========================================
PARADES, Via San Pantaleo 66, I-00186 Rome, ITALY
tel. +39 - 06 68 10 03 17, fax. +39 - 06 68 80 79 26
http://www.parades.rm.cnr.it/~marcoxa

From: Robert Munyer
Subject: Re: looking up entries in a table of ranges
Date: Wed, 09 Feb 2000 00:00:00 +0000
Message-ID: <87qp30$g5v$1@eve.enteract.com>

In article <··············@g.local>,
Gareth McCaughan <················@pobox.com> wrote:
> ······@je1.jsc.nasa.gov writes:

> > 0.0  <= x <= 10.0       this stuff
> > 10.0 <  x <= 40.0	this other stuff
> > 40.0 <  x <= 100.0	yet some other stuff
> > ... and so on
> > 
> > and you would call (get-val 50.0 *table*) to get the data for
> > that range.

I had to do something like that, a number of years ago.  First I
did something quick-and-dirty using an association list, like this:

  (cdr (assoc weight table :test #'>))

When I needed more speed, I wrote a macro that took the association
list as an argument, and generated a nest of IFs like the following:

>   if x <= 100:
>     if x <= 40:    we're in the 10..40 range
>     else:          we're in the 40..100 range
>   else:
>     if x <= 260:
>       if x <= 180: we're in the 100..180 range
>       else:        we're in the 180..260 range
>     else:
>       if x <= 800: we're in the 260..800 range
>       else:        we're in the 800..infinity range

The macro was easy to write, and the code it generated was very
fast.  Of course, that approach only works if you know the boundaries
at compile time...

-- Robert Munyer <······@mcs.com>

From: Jason Kantz
Subject: Re: looking up entries in a table of ranges
Date: Wed, 09 Feb 2000 00:00:00 +0000
Message-ID: <82wvoenz4q.fsf@je1.jsc.nasa.gov>

Cool!

This ...

(defun find-interval (value interval-list &key (predicate #'<=) (key #'identity))
  (funcall key
	   (interval-data
	    (find-if (lambda (interval-assoc)
		       (funcall predicate value (interval-limit interval-assoc)))
		     interval-list))))

becomes ...

(defun find-interval (value interval-list &key (predicate #'<=) (key #'identity))
  (funcall key (cdr (assoc value interval-list :test predicate))))


Jason Kantz
http://kantz.com/jason/

From: Andrew Cooke
Subject: Re: looking up entries in a table of ranges
Date: Tue, 08 Feb 2000 00:00:00 +0000
Message-ID: <87okk6$42t$1@nnrp1.deja.com>

(I'm assuming no "gaps", no overlapping ranges, and arguments always in
range - you may need to worry about those and modify the following).

The usual way of finding something by value is to use sorted
values then, to find the entry for a certain index, start at the middle.
If your index is less than the middle value, try halfway between the
start and the middle, otherwise try halfway between the middle and the
end.

So, for example, you could store (cons range-start stuff) in an array,
sorted by the range start (if necessary, you can sort the array by
calling sort and using a comparison function that compares the car of
the entries).  Then follow the algorithm sketched above to find the
appropriate range-start and stuff.

Modifying and resorting an array is expensive, so if the ranges change
you might want to use a tree instead.  For best performance with many
changes to the range information consider a tree that keeps itself
balanced (eg red-black tree).

Alternatively, if there's only a few ranges, just stick them in a list
and run through them!

Finally, if there's a simple way of reducing any value in the range to a
particular value, you can use a hash table.  For example, if the ranges
were 0-99, 100-199, 200-299 etc, then you can use (floor (/ value 100))
to reduce a value to 0, 1, 2 etc.  Because these start at 0 and
increment uniformly you don't even need a hashtable - an array will do
fine.

Feel free to ask for more info on any of the above.

Andrew
http://www.andrewcooke.free-online.co.uk/index.html

In article <··············@je1.jsc.nasa.gov>,
  ······@je1.jsc.nasa.gov wrote:
>
> I need to make a table that returns an entry for a value in a given
> range.  And I'm wondering if a more experienced programmer can comment
> on whether this is a common data structure with a common name.
>
> For example say you have weight x = 50.0
>
> and your *table* looks something like this ...
>
> 0.0  <= x <= 10.0       this stuff
> 10.0 <  x <= 40.0	this other stuff
> 40.0 <  x <= 100.0	yet some other stuff
> ... and so on
>
> and you would call (get-val 50.0 *table*) to get the data for that
> range.
>
> Is there a common name for this?
> Is this something that should just be thrown into a list?
>
> --Jason Kantz
>

Sent via Deja.com http://www.deja.com/
Before you buy.