From: Bruce L. Lambert
Subject: declaring contents of a list?
Date: 
Message-ID: <s99r3ubp5io42@corp.supernews.com>
Hi everyone,

In my curent project, I use a data structure I call a document profile. Each
profile is a list of dotted pairs. The car of the pair is a word index. I
know the word index will always be an (unsigned-byte 16). The cdr of the
pair is a weight for the specified word, and I know it will always be a
short-float between 0.0 and 1.0.

My question is how can I take advantage of what I know about the contents of
these lists to write more efficient code? Below is an example of a simple
function to trim a document profile. What sort of declaration would help me
take advantage of what I know about the types that will appear in a profile?
One of my attempts is commented out. It did not appear to make much
difference. Sort was still not inlined.

Thanks.


(defun trim-profile (profile n)

  "Trims a doc profile to include only the n terms
   with the highest weights."

  (declare (list profile)
           (type (integer 0 500) n)
           (optimize (speed 3) (safety 1))
        ;; (:explain :calls :types :boxing)
            )

  (let ((sorted-profile '()))

    #|
    (setf sorted-profile
      (sort profile #'(lambda (weight1 weight2)
                        (declare (type (short-float 0.0s0 1.0s0) weight1
weight2))
                        (if (> weight1 weight2) weight1 weight2))
            :key #'rest))
     |#

    (setf sorted-profile (sort profile #'> :key #'rest))

    (values (subseq sorted-profile 0 n))))


--
Bruce L. Lambert
········@uic.edu

From: Fernando D. Mato Mira
Subject: Re: declaring contents of a list?
Date: 
Message-ID: <3895669C.B1198E1C@iname.com>
"Bruce L. Lambert" wrote:

> My question is how can I take advantage of what I know about the contents of
> these lists to write more efficient code? Below is an example of a simple
> function to trim a document profile. What sort of declaration would help me
> take advantage of what I know about the types that will appear in a profile?

If you really need the elements to be pairs, CL would need `chunks' [what was
the name for m-ary conses?],
and DEFSTRUCT would have to be accept that as base type.

The closest one can get today is:

(defstruct (foo (:type vector)))
  (index 0 :type (unsigned-byte 16))
  (weight 0.0 type (short-float 0.0s0 1.0s0)))

or, eventually, replacing `vector' with `list'.


So I think the best is probably to be straightforward:

(defstruct foo
  (index 0 :type (unsigned-byte 16))
  (weight 0.0 :type (short-float 0.0s0 1.0s0)))

Just make sure you have arranged for struct accessors to get inlined.

Still, there's no specializing type specifiers for lists, so you'll have to use
THE when accessing
your compounds. Or maybe you could use vectors instead..

Regards,

--
Fernando D. Mato Mira
Real-Time SW Eng & Networking
Advanced Systems Engineering Division
CSEM
Jaquet-Droz 1                   email: matomira AT acm DOT org
CH-2007 Neuchatel                 tel:       +41 (32) 720-5157
Switzerland                       FAX:       +41 (32) 720-5720

www.csem.ch      www.vrai.com     ligwww.epfl.ch/matomira.html
From: Robert Monfera
Subject: Re: declaring contents of a list?
Date: 
Message-ID: <3895BC88.6907375E@fisec.com>
"Fernando D. Mato Mira" wrote:

> So I think the best is probably to be straightforward:
>
> (defstruct foo
>   (index 0 :type (unsigned-byte 16))
>   (weight 0.0 :type (short-float 0.0s0 1.0s0)))

Conceptually this is solid and if there is no problem with performance,
it is the way to go.

If performance or space matters, then there is an opportunity to
improve.  The reason is that current implementations do not handle
arrays of unboxed structs (or CLOS instances), resulting cache misses
and multiple times the space occupied.  It is feasible to implement it,
because the width of the instances is known as long as its slots are
declared so - here the problem again is that this involves the
reimplementation of much of the structure or CLOS functionality.

This is why I'd recommend plain vectors if performance or space matters
- implementations do a very good job with handling of arrays containing
immediate values.

Robert
From: Pierre R. Mai
Subject: Re: declaring contents of a list?
Date: 
Message-ID: <87ln562h9n.fsf@orion.dent.isdn.cs.tu-berlin.de>
Robert Monfera <·······@fisec.com> writes:

> If performance or space matters, then there is an opportunity to
> improve.  The reason is that current implementations do not handle
> arrays of unboxed structs (or CLOS instances), resulting cache misses
> and multiple times the space occupied.  It is feasible to implement it,
> because the width of the instances is known as long as its slots are
> declared so - here the problem again is that this involves the
> reimplementation of much of the structure or CLOS functionality.

I don't think storing unboxed structs or CLOS instances is possible
without either introducing something like "final" for class
hierarchies, or some form of subtype restrictions on array
elements, or you only do partial unboxing or some hack like this.

As CL stands now, the following is legal code:

(defclass simple-class ()
  ((foo :type single-float :initform 0.0f0)
   (bar :type fixnum :initform 0)))

(defvar *my-array* (make-array 256 :element-type 'simple-class))

(defclass bigger-class (simple-class)
  ((more-stuff ...)))

(setf (aref *my-array* 0) (make-instance 'bigger-class))

The same goes for defstruct and structure types (via :include)...

Regs, Pierre.

-- 
Pierre Mai <····@acm.org>         PGP and GPG keys at your nearest Keyserver
  "One smaller motivation which, in part, stems from altruism is Microsoft-
   bashing." [Microsoft memo, see http://www.opensource.org/halloween1.html]
From: Fernando D. Mato Mira
Subject: Re: declaring contents of a list?
Date: 
Message-ID: <38970519.976AFECE@iname.com>
"Pierre R. Mai" wrote:

> or some form of subtype restrictions on array
> elements

Yes, yes! And don't forget about an :embedded option (note that :unboxed =>
embedded,
but :embedded =/=> unboxed), and that the same goes for CLOS slots!

--
Fernando D. Mato Mira
Real-Time SW Eng & Networking
Advanced Systems Engineering Division
CSEM
Jaquet-Droz 1                   email: matomira AT acm DOT org
CH-2007 Neuchatel                 tel:       +41 (32) 720-5157
Switzerland                       FAX:       +41 (32) 720-5720

www.csem.ch      www.vrai.com     ligwww.epfl.ch/matomira.html
From: Fernando D. Mato Mira
Subject: Embedded slot fun (was: declaring contents of a list?)
Date: 
Message-ID: <38985062.4FCBF7E7@iname.com>
"Fernando D. Mato Mira" wrote:

> "Pierre R. Mai" wrote:
>
> > or some form of subtype restrictions on array
> > elements
>
> Yes, yes! And don't forget about an :embedded option (note that :unboxed =>
> embedded,
> but :embedded =/=> unboxed), and that the same goes for CLOS slots!

BTW, for Bruno's pleasure, this opens up a whole Pandora's box regarding
CHANGE-CLASS and
UPDATE-INSTANCE-FOR-REDEFINED-CLASS:

1. You can't CHANGE-CLASS of an unboxed component (obvious)
2. Embedded components may only have their class changed if the new instance
footprint is not larger (of course, plainly prohibiting to ease up
implementation would be possible).
3. If you redefine a class C, you should update all instances of any classes
embedding C (ie, those classes are `redefined'. Apply recursively).
4. CHANGE-CLASS on an object with :embedded (_not_ :unboxed) slots must
`recurse' on those (i.e., you have to replace pointers everywhere (by NIL, if
the slot was deleted)). Given that external type restrictions may apply, the
new type for the slot must be compatible with the older one. So, if you want
to wildly change the type, first you do a CHANGE-CLASS to a class removing the
:embeded qualifiers for the problem slots, and then you do another one that
changes their types. An alternative, simplifying view would be that while
changes to an embedded object component's is shared, changing the reference is
not, but that has something of a NJ school flavor, IMHO.


--
((( DANGER )) LISP BIGOT (( DANGER )) LISP BIGOT (( DANGER )))

Fernando D. Mato Mira
Real-Time SW Eng & Networking
Advanced Systems Engineering Division
CSEM
Jaquet-Droz 1                   email: matomira AT acm DOT org
CH-2007 Neuchatel                 tel:       +41 (32) 720-5157
Switzerland                       FAX:       +41 (32) 720-5720

www.csem.ch      www.vrai.com     ligwww.epfl.ch/matomira.html
From: Fernando D. Mato Mira
Subject: Re: declaring contents of a list?
Date: 
Message-ID: <38957052.91C460ED@iname.com>
Caveat: of course with structs you'll almost certainly will get bitten in space by type headers, and this seems to be a very important issue in your application, so you'll just have to keep your conses and use THE.

--
Fernando D. Mato Mira
Real-Time SW Eng & Networking
Advanced Systems Engineering Division
CSEM
Jaquet-Droz 1                   email: matomira AT acm DOT org
CH-2007 Neuchatel                 tel:       +41 (32) 720-5157
Switzerland                       FAX:       +41 (32) 720-5720

www.csem.ch      www.vrai.com     ligwww.epfl.ch/matomira.html
From: Robert Monfera
Subject: Re: declaring contents of a list?
Date: 
Message-ID: <38959001.BEED95A6@fisec.com>
"Bruce L. Lambert" wrote:
>
> Hi everyone,
>
> In my curent project, I use a data structure I call a document profile. Each
> profile is a list of dotted pairs. The car of the pair is a word index. I
> know the word index will always be an (unsigned-byte 16). The cdr of the
> pair is a weight for the specified word, and I know it will always be a
> short-float between 0.0 and 1.0.
>
> My question is how can I take advantage of what I know about the contents of
> these lists to write more efficient code? Below is an example of a simple
> function to trim a document profile. What sort of declaration would help me
> take advantage of what I know about the types that will appear in a profile?
> One of my attempts is commented out. It did not appear to make much
> difference. Sort was still not inlined.

- Try declaring (optimize speed) in your lambda (maybe the declaration
in defun already does this)

- Declaring things like lists are not making much (if any) difference

- Instead of a list, do sort on a simple-vector of type t:

    (sort (the (simple-array t (*)) sorted-profile)
          #'(lambda ... ))

- Another thing you can do is creating two identical-length vectors (one
for the word number (simple-array fixnum (*)), the other for the weight
(simple-array short-float (*))), and a third identical-sized index
vector (simple-array t (*)), which contains numbers 0,1,2,..,n.  Then
sort the index vector with the appropriate :key function (declare fixnum
and float arguments to both key and comparison functions), and at the
end access the word number and weight vectors in the order determined by
the sorted index-vector.

- Your commented invocation of SORT is a little lengthy; you could have
written

#'(lambda (weigh1 weigh2)
    (> (the short-float weight1) (the short-float weight2)))

    there is no need for returning the number, so do away with the IF.

- See what you gain by doing away with :key - put your AREF into the
comparator:

#'(lambda (i1 i2)
    (> (the short-float (aref weights (the fixnum i1)))
       (the short-float (aref weights (the fixnum i2)))))

- ACL does not do in-lining except primitive built-in functions.  It's
not a problem for SORT, because sorting is usually not in a tight inner
loop where function call overhead matters - the only possible problem is
that your comparison and key functions may not be inlined.

A benchmark: SORT in ACL can sort a vector of 1,000,000 numbers in a few
seconds.

Hope this helps.

Robert

>
> Thanks.
>
> (defun trim-profile (profile n)
>
>   "Trims a doc profile to include only the n terms
>    with the highest weights."
>
>   (declare (list profile)
>            (type (integer 0 500) n)
>            (optimize (speed 3) (safety 1))
>         ;; (:explain :calls :types :boxing)
>             )
>
>   (let ((sorted-profile '()))
>
>     #|
>     (setf sorted-profile
>       (sort profile #'(lambda (weight1 weight2)
>                         (declare (type (short-float 0.0s0 1.0s0) weight1
> weight2))
>                         (if (> weight1 weight2) weight1 weight2))
>             :key #'rest))
>      |#
>
>     (setf sorted-profile (sort profile #'> :key #'rest))
>
>     (values (subseq sorted-profile 0 n))))
>
> --
> Bruce L. Lambert
> ········@uic.edu