I am using the foreign function interface in CMU CL to do some low
level I/O. However, it seems like calls to a foreign function are
*horribly* costly. I'm hoping I'm doing something wrong. The following
is a simple experiment:
C module:
void
ipu_nop()
{
}
Lisp:
(def-alien-routine ("ipu_nop" ipu-nop) void)
and then
(defun test-ff ()
  (loop repeat 1024 do (ipu-nop)))
This little piece of code takes 60 seconds to run on an Ultra-1!
* (time (test-ff))
Compiling LAMBDA NIL:
Compiling Top-Level Form:
Evaluation took:
60.03 seconds of real time
0.0 seconds of user run time
0.0 seconds of system run time
0 page faults and
23209144 bytes consed.
NIL
That is almost 0.06 seconds per call!
Am I doing something wrong or is it this bad?
Johan Parin
Ericsson AXE Research & Development, Stockholm, SWEDEN
From: Raymond Toy
Subject: Re: FF interface in CMU CL - horribly inefficient?
Date:
Message-ID: <4nafg47fbr.fsf@rtp.ericsson.se>
>>>>> "Johan" == Johan Parin <······@uab.ericsson.se> writes:
Johan> C module:
Johan> void ipu_nop() { }
Johan> Lisp:
Johan> (def-alien-routine ("ipu_nop" ipu-nop) void)
Johan> and then
Johan> (defun test-ff ()
Johan> (loop repeat 1024 do (ipu-nop)))
Johan> This little piece of code takes 60 seconds to run on an
Johan> Ultra-1!
Johan> * (time (test-ff))
Johan> Compiling LAMBDA NIL: Compiling Top-Level Form: Evaluation
Johan> took:
Johan> 60.03 seconds of real time
Johan> 0.0 seconds of user run time
Johan> 0.0 seconds of system run time
Johan> 0 page faults and 23209144 bytes consed.
Johan> NIL
On an Ultra-1 I have here, it only takes 11 sec to run. What version
of CMUCL are you running? This is 18a plus some stuff.
However, you should get MUCH better times if you compile the Lisp
code. When I just compile the def-alien-routine, the time goes from
11 sec to 0.05 sec and only 24000 bytes consed. If I also compile
test-ff, the time drops to 0.001 sec and only 48 bytes consed. This
doesn't seem too slow to me.
Ray
Raymond Toy <···@rtp.ericsson.se> writes:
>
> >>>>> "Johan" == Johan Parin <······@uab.ericsson.se> writes:
>
>
>
> On an Ultra-1 I have here, it only takes 11 sec to run. What version
> of CMUCL are you running? This is 18a plus some stuff.
>
> However, you should get MUCH better times if you compile the Lisp
> code. When I just compile the def-alien-routine, the time goes from
> 11 sec to 0.05 sec and only 24000 bytes consed. If I also compile
> test-ff, the time drops to 0.001 sec and only 48 bytes consed. This
> doesn't seem too slow to me.
>
On top of that, I fail to see the point of using the FFI to do "low
level i/o" in CMUCL. The code you will come up with will be
unportable anyway, so why not just use the UNIX interface in CMUCL?
Moreover, CMUCL will have READ- and WRITE-SEQUENCE very soon.
--
Marco Antoniotti
==============================================================================
California Path Program - UC Berkeley
Richmond Field Station
tel. +1 - 510 - 231 9472
From: Johan Parin
Subject: Re: FF interface in CMU CL - horribly inefficient?
Date:
Message-ID: <kviuuo355m.fsf@uab.ericsson.se>
Raymond Toy writes:
<snip>
Raymond> On an Ultra-1 I have here, it only takes 11 sec to run.
Raymond> What version of CMUCL are you running? This is 18a plus
Raymond> some stuff.
I'm using an 18a version that I picked up at www.cons.org a few months
ago. This time it took 22 seconds... The test-ff routine is
compiled. I had not compiled the def-alien-routine form. When I did
that, I got down to 0.0 seconds and 48 bytes consed, so this is
apparently the trick.
Raymond> However, you should get MUCH better times if you compile
Raymond> the Lisp code. When I just compile the def-alien-routine,
Raymond> the time goes from 11 sec to 0.05 sec and only 24000 bytes
Raymond> consed. If I also compile test-ff, the time drops to 0.001
Raymond> sec and only 48 bytes consed. This doesn't seem too slow
Raymond> to me.
Raymond> Ray
So the specific problem I described is solved by compiling the
def-alien-routine form. However there is another (more worrying)
problem (everything is compiled in the examples below):
(eval-when (compile load eval)
  (defconstant *bufsiz* 512))
(defvar arr (make-alien (array (unsigned 8) #.*bufsiz*)))
(defvar write-ptr 0)
(defun put-byte (byte)
  (setf (deref (deref arr) write-ptr) byte)
  (incf write-ptr)
  ;; Buffer full: valid indices are 0 .. *bufsiz* - 1
  (when (>= write-ptr *bufsiz*)
    ;; Write out data
    (setf write-ptr 0)))
(defun test-ff ()
  (loop repeat 1024 do (put-byte #xf)))
* (time (test-ff))
Compiling LAMBDA NIL:
Compiling Top-Level Form:
< gc >
Evaluation took:
57.7 seconds of real time
0.0 seconds of user run time
0.0 seconds of system run time
0 page faults and
59105200 bytes consed.
I also don't understand the following compilation warning:
In: DEFVAR ARR
(MAKE-ALIEN (ARRAY (UNSIGNED 8) 512))
==>
(ALIEN::%SAP-ALIEN (ALIEN::%MAKE-ALIEN (* 8 #))
'#<ALIEN::ALIEN-POINTER-TYPE (* (ARRAY # 512))>)
Note: Unable to optimize because:
Could not optimize away %SAP-ALIEN: forced to do runtime
allocation of alien-value structure.
If, on the other hand, I declare arr as a regular array and redefine
the put-byte function as:
(defun put-byte (byte)
  (setf (aref arr write-ptr) byte)
  (incf write-ptr)
  (when (= write-ptr *bufsiz*)
    (with-alien ((al-arr (array (unsigned 8) #.*bufsiz*)))
      (dotimes (i *bufsiz*)
        (setf (deref al-arr i) (aref arr i)))
      ;; Write out data
      )
    (setf write-ptr 0)))
Then I get 0 seconds and 48 bytes consed. Is this the way these things
have to be done? It takes some of the power out of being able to
manipulate foreign data.
- Johan
In article <··············@uab.ericsson.se>,
Johan Parin <······@uab.ericsson.se> uses the alien interface in CMUCL:
JP> (defvar *arr* (make-alien (array (unsigned 8) #.*bufsiz*)))
JP> (defvar *write-ptr* 0)
JP> (defun put-byte (byte)
JP> (setf (deref (deref *arr*) *write-ptr*) byte)
JP> (incf *write-ptr*)
JP> (when (> *write-ptr* *bufsiz*)
JP> ;; Write out data
JP> (setf *write-ptr* 0)))
[ performance is horrible: ]
JP> 59105200 bytes consed.
Just declare the type of arr in the `put-byte' function. This will
keep CMUCL from allocating a new alien at each deref.
(defun put-byte (byte)
  (declare (type (alien (* (array (unsigned 8) #.*bufsiz*))) *arr*)
           (fixnum *write-ptr*))
  (setf (deref (deref *arr*) *write-ptr*) byte)
  (incf *write-ptr*)
  ;; Buffer full: valid indices are 0 .. *bufsiz* - 1
  (when (>= *write-ptr* *bufsiz*)
    ;; Write out data
    (setf *write-ptr* 0)))
and then I get:
* (time (test-ff))
Evaluation took:
0.01 seconds of real time
0.01 seconds of user run time
0.0 seconds of system run time
0 page faults and
48 bytes consed.
(By the way: `48 bytes consed' really means that there was no consing
at all.)
In general, I have found CMUCL to be very inefficient when your code
is dynamically typed. When it is statically typed, and the compiler
knows it (i.e. there are enough declarations), it is pretty good.
You can get the compiler to tell you about this sort of problem by
asking it to give efficiency notes; check the user's manual (I don't
have it handy).
Still, I am puzzled by the 60 MB consed in the first case. Could any
of the CMUCL developers explain exactly what was allocated?
J.
From: Raymond Toy
Subject: Re: FF interface in CMU CL - horribly inefficient?
Date:
Message-ID: <4nyb3kqngb.fsf@rtp.ericsson.se>
Johan Parin <······@uab.ericsson.se> writes:
>
> So the specific problem I described is solved by compiling the
> def-alien-routine form. However there is another (more worrying)
> problem (everything is compiled in the examples below):
>
> (eval-when (compile load eval)
> (defconstant *bufsiz* 512))
>
Add a declaim here for arr:
(declaim (type (alien (* (array (unsigned 8) 512))) arr))
> (defvar arr (make-alien (array (unsigned 8) #.*bufsiz*)))
> (defvar write-ptr 0)
I then get 0 sec and 48 bytes consed.
>> I also don't understand the following compilation warning:
>> In: DEFVAR ARR
>> (MAKE-ALIEN (ARRAY (UNSIGNED 8) 512))
>> ==>
>> (ALIEN::%SAP-ALIEN (ALIEN::%MAKE-ALIEN (* 8 #))
>> '#<ALIEN::ALIEN-POINTER-TYPE (* (ARRAY # 512))>)
>> Note: Unable to optimize because:
>> Could not optimize away %SAP-ALIEN: forced to do runtime
>> allocation of alien-value structure.
This shouldn't really matter since arr is only initialized once.
At this point, I have to agree with what Martin (Marco?) said: Why
not use unix:unix-read and friends to do your I/O? It's already
unportable, so use something nicer.
Perhaps if you gave an explanation of exactly what you want to do and
why it must use foreign data to do it, someone could give a suggestion
on how to do it without foreign data. Your examples don't really show
why it must use foreign data.
Ray
From: Johan Parin
Subject: Re: FF interface in CMU CL - horribly inefficient?
Date:
Message-ID: <kvhga73173.fsf@uab.ericsson.se>
Raymond Toy writes:
<snip>
> Add a declaim here for arr:
> (declaim (type (alien (* (array (unsigned 8) 512))) arr))
>> (defvar arr (make-alien (array (unsigned 8) #.*bufsiz*)))
>> (defvar write-ptr 0)
> I then get 0 sec and 48 bytes consed.
Thank you, that helped a lot!
<snip>
> At this point, I have to agree with what Martin (Marco?) said: Why
> not use unix:unix-read and friends to do your I/O? It's already
> unportable, so use something nicer.
I don't need foreign types when using the UNIX system calls?
> Perhaps if you gave an explanation of exactly what you want to do
> and why it must use foreign data to do it, someone could give a
> suggestion on how to do it without foreign data. Your examples
> don't really show why it must use foreign data.
What I'm trying to do is implement buffered I/O that will work with
sockets and pipes (the data flow is bidirectional) in both CMU CL and
Allegro. I realize that I will have to use some #+ conditionals. I
tried to use streams and make-fd-stream, but couldn't get that to work
with sockets.
I thought a good way to do this would be to use a common C file, which
declares wrappers for the system functions, and use #+ conditionals in
the Lisp part, but maybe that's the wrong approach. As I understand
it, Allegro does not provide direct access to system calls, so I would
have to declare C wrappers for them.
> Ray
- Johan Parin
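[ Illustration: a minimal sketch of the kind of common C file described
above, assuming a single output buffer of 512 bytes as in the put-byte
example. The names ipu_put_byte, ipu_flush and IPU_BUFSIZ are
hypothetical and do not come from the thread; only ipu_nop does.
Non-blocking descriptors (EAGAIN) are not handled. Each Lisp would then
bind these functions through its own foreign function interface, for
example with def-alien-routine in CMU CL, compiled as discussed
earlier in the thread. ]

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

#define IPU_BUFSIZ 512              /* hypothetical: same size as the Lisp example */

static unsigned char ipu_buf[IPU_BUFSIZ];
static size_t ipu_len = 0;          /* number of bytes currently buffered */

/* Write every buffered byte to fd. Returns 0 on success, -1 on error.
   On error, the bytes not yet written stay at the front of the buffer. */
int ipu_flush(int fd)
{
    size_t off = 0;

    while (off < ipu_len) {
        ssize_t n = write(fd, ipu_buf + off, ipu_len - off);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            memmove(ipu_buf, ipu_buf + off, ipu_len - off);
            ipu_len -= off;
            return -1;
        }
        off += (size_t)n;           /* write() may be partial on pipes and sockets */
    }
    ipu_len = 0;
    return 0;
}

/* Append one byte, flushing first if the buffer is already full.
   Returns 0 on success, -1 if the flush failed (the byte is not stored). */
int ipu_put_byte(int fd, unsigned char byte)
{
    if (ipu_len == IPU_BUFSIZ && ipu_flush(fd) < 0)
        return -1;
    ipu_buf[ipu_len++] = byte;
    return 0;
}
```

[ With this split, only ipu_put_byte and ipu_flush cross the
Lisp/C boundary, so no foreign array has to be dereferenced from Lisp
for each byte. The cost is one foreign call per byte, which the timings
earlier in the thread suggest is small once the alien definitions are
compiled. ]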
Johan Parin <······@uab.ericsson.se> writes:
>
>
> What I'm trying to do is implement buffered I/O that will work with
> sockets and pipes (the data flow is bidirectional) in both CMU CL and
> Allegro. I realize that I will have to use some #+ conditionals. I
> tried to use streams and make-fd-stream, but couldn't get that to work
> with sockets.
>
I believe that it'd be more worthwhile to try to fix 'make-fd-stream'
or to add extra stream operations in CMUCL than to provide one more
extra layer.
But maybe we should take this off line to the CMUCL mailing lists.
--
Marco Antoniotti
From: Raymond Toy
Subject: Re: FF interface in CMU CL - horribly inefficient?
Date:
Message-ID: <4niuunqlor.fsf@rtp.ericsson.se>
Johan Parin <······@uab.ericsson.se> writes:
>
> > At this point, I have to agree with what Martin (Marco?) said: Why
> > not use unix:unix-read and friends to do your I/O? It's already
> > unportable, so use something nicer.
>
> I don't need foreign types when using the UNIX system calls?
Maybe. However the system calls probably hide the foreign stuff and
the results are returned as Lisp objects. (I think.)
> What I'm trying to do is implement buffered I/O that will work with
> sockets and pipes (the data flow is bidirectional) in both CMU CL and
> Allegro. I realize that I will have to use some #+ conditionals. I
> tried to use streams and make-fd-stream, but couldn't get that to work
> with sockets.
I've never tried any of these so I can't help.
Ray
Johan Parin <······@uab.ericsson.se> writes:
>
> What I'm trying to do is implement buffered I/O that will work with
> sockets and pipes (the data flow is bidirectional) in both CMU CL and
> Allegro. I realize that I will have to use some #+ conditionals. I
> tried to use streams and make-fd-stream, but couldn't get that to work
> with sockets.
>
> I thought a good way to do this would be to use a common C file, which
> declares wrappers for the system functions, and use #+ conditionals in
> the Lisp part, but maybe that's the wrong approach. As I understand
> it, Allegro does not provide direct access to system calls, so I would
> have to declare C wrappers for them.
>
> > Ray
>
> - Johan Parin
You might want to have a look at the lisp-tk interface that I ported
from gcl to work under both Allegro and CMU. It uses socket
connections, and sigusr (CMU) or sigio (Allegro) handlers for flow
control. This might give you an idea of how to use system functions
under CMU and Allegro.
It's available under http://www.arch.usyd.edu.au/~thorsten/lisp/index.html
thorsten
-----------------------------------------------------------------------
Thorsten Schnier
Key Centre of Design Computing
University of Sydney
NSW 2006
········@arch.usyd.edu.au
http://www.arch.usyd.edu.au/~thorsten/
Johan Parin <······@uab.ericsson.se> writes:
>I am using the foreign function interface in CMU CL to do some low
>level I/O. However, it seems like calls to a foreign function are
>*horribly* costly. I'm hoping I'm doing something wrong. The following
>is a simple experiment:
[...]
>Am I doing something wrong or is it this bad?
Did you compile the file your alien definition is in?
The only thing that is slow is returning strings from aliens, since
they've got to be consed. Everything else shouldn't be more than
normal function calls.
Martin
--
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
···············@wavehh.hanse.de http://www.cons.org/cracauer/
BSD User Group Hamburg/Germany http://www.bsdhh.org/