From: Dave Roberts
Subject: UFFI does massive consing?
Date: 
Message-ID: <dvv5c.18787$1p.341330@attbi_s54>
So I'm trying to use UFFI for a simple DNS decoding routine using Linux's
resolver library. I have one function to simply get a byte from a foreign
object buffer:

(defun get-byte (buffer size index)
  "Get a byte from the foreign object array at the specified index."
  (assert (< index size))
  (uffi:deref-array buffer :unsigned-byte index))

Using this under SBCL seems to result in MASSIVE consing:

* (time (get-byte buffer 181 12))

Evaluation took:
                 0.006 seconds of real time
                 0.0 seconds of user run time
                 0.0 seconds of system run time
                 0 page faults and
                 89952 bytes consed.

So, reading a single byte from a foreign buffer results in 90K of
consing?!?!?!

Am I doing something wrong? Needless to say, when I use this primitive in my
code, just decoding a simple DNS message results in multi-megabytes of
consing.

So my question is: Is this an issue with UFFI or with SBCL (0.8.6 right now,
just retrieved 0.8.8 this morning but haven't installed it yet as I don't
want to shut down my emacs session... ;-)

-- 
Dave Roberts
·············@re-move.droberts.com

From: Tim Bradshaw
Subject: Re: UFFI does massive consing?
Date: 
Message-ID: <fbc0f5d1.0403160302.726c5e80@posting.google.com>
Dave Roberts <·············@re-move.droberts.com> wrote in message news:<·····················@attbi_s54>...
> So I'm trying to use UFFI for a simple DNS decoding routine using Linux's
> resolver library. I have one function to simply get a byte from a foreign
> object buffer:
> 
> (defun get-byte (buffer size index)
>   "Get a byte from the foreign object array at the specified index."
>   (assert (< index size))
>   (uffi:deref-array buffer :unsigned-byte index))
> 
> Using this under SBCL seems to result in MASSIVE consing:
> 
Stupid question: is it compiled?
From: Christophe Rhodes
Subject: Re: UFFI does massive consing?
Date: 
Message-ID: <sq3c89xaxe.fsf@lambda.dyndns.org>
··········@tfeb.org (Tim Bradshaw) writes:

> Dave Roberts <·············@re-move.droberts.com> wrote in message news:<·····················@attbi_s54>...
>> Using this under SBCL seems to result in MASSIVE consing:
>> 
> Stupid question: is it compiled?

Yes and no. :-)

SBCL doesn't have an interpreter to speak of; everything is compiled.
Given that, Dave's code is indeed compiled.

However, the generic alien routines (used when the alien access is to
an uncertain type) invoke the compiler to compute an accessing
function.  So in that sense, Dave's code is not as compiled as it
should be.

Christophe
-- 
http://www-jcsu.jesus.cam.ac.uk/~csr21/       +44 1223 510 299/+44 7729 383 757
(set-pprint-dispatch 'number (lambda (s o) (declare (special b)) (format s b)))
(defvar b "~&Just another Lisp hacker~%")    (pprint #36rJesusCollegeCambridge)
From: Kenny Tilton
Subject: Re: UFFI does massive consing?
Date: 
Message-ID: <fWF5c.52062$Wo2.44343@twister.nyc.rr.com>
Christophe Rhodes wrote:

> ··········@tfeb.org (Tim Bradshaw) writes:
> 
> 
>>Dave Roberts <·············@re-move.droberts.com> wrote in message news:<·····················@attbi_s54>...
>>
>>>Using this under SBCL seems to result in MASSIVE consing:
>>>
>>
>>Stupid question: is it compiled?
> 
> 
> Yes and no. :-)
> 
> SBCL doesn't have an interpreter to speak of; everything is compiled.
> Given that, Dave's code is indeed compiled.

Hmmm, come to think of it, something like (time (deref...)) should all 
get compiled and loaded before the time function ever gets kicked off, 
so time would /not/ be seeing that effort as I feared.

kt

-- 
http://tilton-technology.com

Why Lisp? http://alu.cliki.net/RtL%20Highlight%20Film

Your Project Here! http://alu.cliki.net/Industry%20Application
From: Dave Roberts
Subject: Re: UFFI does massive consing?
Date: 
Message-ID: <3Bv5c.18820$1p.342382@attbi_s54>
Dave Roberts wrote:

> Evaluation took:
>                  0.006 seconds of real time

WHOA! Worse, just look at the run time. 6 ms just to read a single byte??
When this primitive gets used up through my top-level code, it takes me
over 250 ms to decode a single DNS resource record and the program conses
over 4.2 MB!!!

Help! ;-)

What am I doing wrong?

-- 
Dave Roberts
·············@re-move.droberts.com
From: Kenny Tilton
Subject: Re: UFFI does massive consing?
Date: 
Message-ID: <lbw5c.50963$Wo2.9409@twister.nyc.rr.com>
Dave Roberts wrote:

> So I'm trying to use UFFI for a simple DNS decoding routine using Linux's
> resolver library. I have one function to simply get a byte from a foreign
> object buffer:
> 
> (defun get-byte (buffer size index)
>   "Get a byte from the foreign object array at the specified index."
>   (assert (< index size))
>   (uffi:deref-array buffer :unsigned-byte index))
> 
> Using this under SBCL seems to result in MASSIVE consing:
> 
> * (time (get-byte buffer 181 12))
> 
> Evaluation took:
>                  0.006 seconds of real time
>                  0.0 seconds of user run time
>                  0.0 seconds of system run time
>                  0 page faults and
>                  89952 bytes consed.
> 
> So, reading a single byte from a foreign buffer results in 90K of
> consing?!?!?!
> 
> Am I doing something wrong? Needless to say, when I use this primitive in my
> code, just decoding a simple DNS message results in multi-megabytes of
> consing.
> 
> So my question is: Is this an issue with UFFI or with SBCL (0.8.6 right now,
> just retrieved 0.8.8 this morning but haven't installed it yet as I don't
> want to shut down my emacs session... ;-)

 From your UFFI source, pruned for SBCL expansion:

(defmacro deref-array (obj type i)
   "Returns a field from a row"
   #+(or lispworks cmu sbcl scl) (declare (ignore type))
   #+sbcl  `(sb-alien:deref ,obj ,i)

I am using UFFI for Cello and it's all CPU all the time over here, 
things seem quite zippy and I have not yet begun to tune for speed. But 
my main point above is that in this case UFFI is gone once the macro has 
expanded, so UFFI is not doing anything at run time. I think most of the 
UFFI wrappers are like that.

I'll time some derefs in a few, let you know what I find.



kt

-- 
http://tilton-technology.com

Why Lisp? http://alu.cliki.net/RtL%20Highlight%20Film

Your Project Here! http://alu.cliki.net/Industry%20Application
From: Kenny Tilton
Subject: Re: UFFI does massive consing?
Date: 
Message-ID: <srw5c.51100$Wo2.23540@twister.nyc.rr.com>
Kenny Tilton wrote:

> 
> 
> Dave Roberts wrote:
> 
>> So I'm trying to use UFFI for a simple DNS decoding routine using Linux's
>> resolver library. I have one function to simply get a byte from a foreign
>> object buffer:
>>
>> (defun get-byte (buffer size index)
>>   "Get a byte from the foreign object array at the specified index."
>>   (assert (< index size))
>>   (uffi:deref-array buffer :unsigned-byte index))
>>
>> Using this under SBCL seems to result in MASSIVE consing:
>>
>> * (time (get-byte buffer 181 12))
>>
>> Evaluation took:
>>                  0.006 seconds of real time
>>                  0.0 seconds of user run time
>>                  0.0 seconds of system run time
>>                  0 page faults and
>>                  89952 bytes consed.
>>
>> So, reading a single byte from a foreign buffer results in 90K of
>> consing?!?!?!

I show:

; cpu time (non-gc) 0 msec user, 0 msec system
; cpu time (gc)     0 msec user, 0 msec system
; cpu time (total)  0 msec user, 0 msec system
; real time  0 msec
; space allocation:
;  45 cons cells, 1,168 other bytes, 0 static bytes

But then I put the time directly around the deref.

Did you repeat your timing twice? Sometimes you can pick up some heavy 
lifting done by lisp implementations the first time they see a function.

If you want to time the deref, put the timing around the deref.

kt

-- 
http://tilton-technology.com

Why Lisp? http://alu.cliki.net/RtL%20Highlight%20Film

Your Project Here! http://alu.cliki.net/Industry%20Application
From: Dave Roberts
Subject: Re: UFFI does massive consing?
Date: 
Message-ID: <V2x5c.21557$JL2.258941@attbi_s03>
Kenny Tilton wrote:

> I show:
> 
> ; cpu time (non-gc) 0 msec user, 0 msec system
> ; cpu time (gc)     0 msec user, 0 msec system
> ; cpu time (total)  0 msec user, 0 msec system
> ; real time  0 msec
> ; space allocation:
> ;  45 cons cells, 1,168 other bytes, 0 static bytes
> 
> But then I put the time directly around the deref.

Here's my try:

* (time (deref buffer 12))

Evaluation took:
                 0.005 seconds of real time
                 0.0 seconds of user run time
                 0.0 seconds of system run time
                 0 page faults and
                 98352 bytes consed.


Hmmm.... okay, this looks like an SBCL thing, not a UFFI thing. Sorry for
the trouble. I need to get better about remembering to expand macros and
such. I hadn't ever realized that this would have mapped so cleanly into
the SBCL alien interface.

-- 
Dave Roberts
·············@re-move.droberts.com
From: Kenny Tilton
Subject: Re: UFFI does massive consing?
Date: 
Message-ID: <l1z5c.51709$Wo2.31105@twister.nyc.rr.com>
Dave Roberts wrote:

> Kenny Tilton wrote:
> 
> 
>>I show:
>>
>>; cpu time (non-gc) 0 msec user, 0 msec system
>>; cpu time (gc)     0 msec user, 0 msec system
>>; cpu time (total)  0 msec user, 0 msec system
>>; real time  0 msec
>>; space allocation:
>>;  45 cons cells, 1,168 other bytes, 0 static bytes
>>
>>But then I put the time directly around the deref.
> 
> 
> Here's my try:
> 
> * (time (deref buffer 12))

I /think/ SBCL is silently compiling that form and then executing the 
compiled result, so you are timing a lot besides the deref. Not sure, 
but this is how you could tell:

(defun time-deref (buffer)
    (time (deref buffer 12)))

Put that in a source file and compile it (I don't know about these 
things, but I think you can just compile it via "(compile-something...") 
and then see what you get from:

(time-deref <uffi buffer allocation>)

  Don't forget to execute a few times to see if results bounce around.

kt

-- 
http://tilton-technology.com

Why Lisp? http://alu.cliki.net/RtL%20Highlight%20Film

Your Project Here! http://alu.cliki.net/Industry%20Application
From: Christophe Rhodes
Subject: Re: UFFI does massive consing?
Date: 
Message-ID: <sq7jxlxb1n.fsf@lambda.dyndns.org>
Dave Roberts <·············@re-move.droberts.com> writes:

> (defun get-byte (buffer size index)
>   "Get a byte from the foreign object array at the specified index."
>   (assert (< index size))
>   (uffi:deref-array buffer :unsigned-byte index))
>
> So, reading a single byte from a foreign buffer results in 90K of
> consing?!?!?!
>
> Am I doing something wrong? Needless to say, when I use this primitive in my
> code, just decoding a simple DNS message results in multi-megabytes of
> consing.

For efficient access to foreign variables, SBCL's compiler and foreign
function interface need to know their types at compile-time.  If they
don't, they will call out to routines that, as you've observed, are
woefully inefficient.  Fixing the generic routines' woeful
inefficiency isn't high-priority, because generally foriegn types
_are_ known at compile-time.  (Does anyone have a real-world example
where this isn't the case, incidentally?)

When you compiled your code, you will probably have received some
compiler notes from SBCL telling you where it has failed to optimize
"alien" calls.  (If not, then when you compiled UFFI you will have
received some compiler notes...).  Following those notes and making
sure that all alien types are completely identified will probably
remove the problem.

Christophe
-- 
http://www-jcsu.jesus.cam.ac.uk/~csr21/       +44 1223 510 299/+44 7729 383 757
(set-pprint-dispatch 'number (lambda (s o) (declare (special b)) (format s b)))
(defvar b "~&Just another Lisp hacker~%")    (pprint #36rJesusCollegeCambridge)
From: Dave Roberts
Subject: Re: UFFI does massive consing?
Date: 
Message-ID: <n7D5c.20946$_w.483517@attbi_s53>
Christophe Rhodes wrote:
 
> For efficient access to foreign variables, SBCL's compiler and foreign
> function interface need to know their types at compile-time.  If they
> don't, they will call out to routines that, as you've observed, are
> woefully inefficient.  Fixing the generic routines' woeful
> inefficiency isn't high-priority, because generally foriegn types
> _are_ known at compile-time.  (Does anyone have a real-world example
> where this isn't the case, incidentally?)
> 
> When you compiled your code, you will probably have received some
> compiler notes from SBCL telling you where it has failed to optimize
> "alien" calls.  (If not, then when you compiled UFFI you will have
> received some compiler notes...).  Following those notes and making
> sure that all alien types are completely identified will probably
> remove the problem.

Indeed. I didn't pay too much attention to them first time around (SBCL
spits out all sorts of warnings, normally). Okay, here's where buffer is
defined:

(defvar buffer (uffi:allocate-foreign-object :unsigned-byte len))

SBCL compiler then says:

; file: /home/dave/lispstuff/resolver/resolver.lisp
; in: DEFVAR BUFFER
;     (UFFI:ALLOCATE-FOREIGN-OBJECT :UNSIGNED-BYTE LEN)
; --> MAKE-ALIEN 
; ==>
;   (SB-ALIEN-INTERNALS:%SAP-ALIEN (SB-ALIEN::%MAKE-ALIEN (* 8 LEN))
;                                  '#<SB-ALIEN-INTERNALS:ALIEN-POINTER-TYPE
;                                     (* (UNSIGNED 8))>)
; 
; note: unable to
;         optimize
;       because:
;         could not optimize away %SAP-ALIEN: forced to do runtime 
; allocation of alien-value structure

I'm not really sure how to handle that.

Hints?

-- 
Dave Roberts
·············@re-move.droberts.com
From: Kevin M. Rosenberg
Subject: Re: UFFI does massive consing?
Date: 
Message-ID: <slrnc5h6ro.k5u.kevin@tiger.med-info.com>
On 2004-03-16, Dave Roberts <·············@re-move.droberts.com> wrote:
> So, reading a single byte from a foreign buffer results in 90K of
> consing?!?!?!

As Christophe as mentioned, sbcl and cmucl have such massive consing
when dereferencing a foreign pointer when the type is not known. I ran
across this in CLSQL with a function that was attempting to
dereference a character pointer. William Newman helped me identify the
cause. Form the #lisp IRC logs at http://meme.b9.com/ for 2003/03/29:

  "IIRC the compiler does issue optimizer notes for this, it's just that
  they don't particularly stand out from the other notes so you might
  not realize that these particular optimization notes are OPTIMIZATION
  PROBLEMS WHICH WILL BURN ALL YOUR CPU CYCLES IN THE DEEPEST PITS OF
  COMPUTATIONAL HELL."

The solution was to give that type information to the Lisp
implementation. From CLSQL's uffi/clsql-uffi.lisp:

(uffi:def-type char-ptr-def (* :unsigned-char))

(defun convert-raw-field (char-ptr types index &optional length)
  (declare (optimize (speed 3) (safety 0) (space 0))
 	   (type char-ptr-def char-ptr))
 ...


-- 
Kevin Rosenberg
·····@rosenberg.net
From: Dave Roberts
Subject: Re: UFFI does massive consing?
Date: 
Message-ID: <Jqx6c.38719$KO3.101060@attbi_s02>
Kevin M. Rosenberg wrote:

> The solution was to give that type information to the Lisp
> implementation. From CLSQL's uffi/clsql-uffi.lisp:
> 
> (uffi:def-type char-ptr-def (* :unsigned-char))
> 
> (defun convert-raw-field (char-ptr types index &optional length)
>   (declare (optimize (speed 3) (safety 0) (space 0))
>  (type char-ptr-def char-ptr))
>  ...
> 
> 

Sorry, was traveling for work for a couple of days. I had been reading this
on Google groups from a hotel room but didn't have my home machine with me
and didn't want to answer until I could try it out.

So, that did it. I added a similar declaration to my lowest-level GET-BYTE
function and now it's lightening fast. "0 bytes consed," says TIME.

Now, can somebody explain to me how UFFI's DEF-TYPE is interacting with the
type declaration? I don't think I understand that yet. Also, given that the
call to UFFI:DEREF-ARRAY contains a type declaration, why can't that be
used? I was using: (uffi:deref-array buffer :unsigned-byte index). It looks
like the :UNSIGNED-BYTE piece is totally ignored when the macro expands.

-- 
Dave Roberts
·············@re-move.droberts.com
From: Kevin M. Rosenberg
Subject: Re: UFFI does massive consing?
Date: 
Message-ID: <slrnc5n2h9.e42.kevin@tiger.med-info.com>
On 2004-03-19, Dave Roberts <·············@re-move.droberts.com> wrote:
> So, that did it. I added a similar declaration to my lowest-level GET-BYTE
> function and now it's lightening fast. "0 bytes consed," says TIME.

Great!

> Now, can somebody explain to me how UFFI's DEF-TYPE is interacting with the
> type declaration? I don't think I understand that yet. Also, given that the
> call to UFFI:DEREF-ARRAY contains a type declaration, why can't that be
> used? I was using: (uffi:deref-array buffer :unsigned-byte index). It looks
> like the :UNSIGNED-BYTE piece is totally ignored when the macro expands.

UFFI is a rather thin macro layer. The best place to check how a UFFI
macro is interacting with the CL implementation is to look at the UFFI
source code. From src/primitives.lisp:
(defmacro def-type (name type)
  ...
  #+sbcl
  `(deftype ,name () '(sb-alien:alien ,(convert-from-uffi-type type
  :declare)))

So uffi:def-type simply adds a deftype form to SBCL compiler. SBCL
uses this to optimize the derefernce.

Yes, you are right about the type being ignored in deref-array on
SBCL. SBCL, unlike some Lisp implementations, does not use the type
when derefering. From src/aggregates.lisp:

(defmacro deref-array (obj type i)
  #+(or lispworks cmu sbcl scl) (declare (ignore type))
  #+sbcl `(sb-alien:deref ,obj ,i)

If you have further UFFI questions, probably the best place to ask
them is on the UFFI mail list.

-- 
Kevin Rosenberg
·····@rosenberg.net