From: Andreas Thiele
Subject: implementation dependent performance
Date: 
Message-ID: <fnjng2$ft0$03$1@news.t-online.com>
Hi,

I wonder a bit about differences in implementation dependend performance. I wrote a batch job which runs accross a database via Uffi-Calls on unixODBC and found the following times for part A of my batch

CMUCL 692 seconds
sbcl 24 seconds
LispWorks 13.9 seconds

Perhaps a bit unfair because on CMUCL there have been a lot of warnings that optimizations aren't  possible.

Is this a know issue?

Andreas

From: Maciej Katafiasz
Subject: Re: implementation dependent performance
Date: 
Message-ID: <fnl26n$ou9$1@news.net.uni-c.dk>
Den Mon, 28 Jan 2008 05:59:33 +0100 skrev Andreas Thiele:

> I wonder a bit about differences in implementation dependend
> performance. I wrote a batch job which runs accross a database via
> Uffi-Calls on unixODBC and found the following times for part A of my
> batch
> 
> CMUCL 692 seconds
> sbcl 24 seconds
> LispWorks 13.9 seconds
> 
> Perhaps a bit unfair because on CMUCL there have been a lot of warnings
> that optimizations aren't  possible.
> 
> Is this a know issue?

No, in fact, almost nothing is known about your issue, you gave info 
spare enough to be completely useless.

Cheers,
Maciej
From: Raymond Wiker
Subject: Re: implementation dependent performance
Date: 
Message-ID: <m24pcxamyf.fsf@Macintosh-2.local>
Maciej Katafiasz <········@gmail.com> writes:

> Den Mon, 28 Jan 2008 05:59:33 +0100 skrev Andreas Thiele:
>
>> I wonder a bit about differences in implementation dependend
>> performance. I wrote a batch job which runs accross a database via
>> Uffi-Calls on unixODBC and found the following times for part A of my
>> batch
>> 
>> CMUCL 692 seconds
>> sbcl 24 seconds
>> LispWorks 13.9 seconds
>> 
>> Perhaps a bit unfair because on CMUCL there have been a lot of warnings
>> that optimizations aren't  possible.
>> 
>> Is this a know issue?
>
> No, in fact, almost nothing is known about your issue, you gave info 
> spare enough to be completely useless.

	Actually, given the difference in running time between CMUCL
and SBCL, one might guess that Andreas did not (explicitly) compile
the code. Alternatively, ISTR that SBCL has some optimizations for
alien calls that CMUCL foes not have, which means that parts of the
"alien" machinery will be invoked on every call, instead of just once,
at compilation time.

	More information would definitely help.
From: Juho Snellman
Subject: Re: implementation dependent performance
Date: 
Message-ID: <87lk697sln.fsf@vasara.proghammer.com>
Raymond Wiker <···@RawMBP.local> writes:
> Alternatively, ISTR that SBCL has some optimizations for
> alien calls that CMUCL foes not have, which means that parts of the
> "alien" machinery will be invoked on every call, instead of just once,
> at compilation time.

I wouldn't neccessarily talk about optimizations. Assuming that the
code does not have sufficient type declarations for aliens, both SBCL
and CMUCL will go through the process of invoking that machinery. It's
just that SBCL invokes it with the interpreter, while CMUCL (lacking a
real interpreter) has to call the compiler.

An order of magnitude difference due to this sounds completely
plausible. But note that SBCL would also benefit greatly from the
proper type declarations, just not as much as CMUCL.

-- 
Juho Snellman
From: Andreas Thiele
Subject: Re: implementation dependent performance
Date: 
Message-ID: <fnn7o0$af0$02$1@news.t-online.com>
Juho Snellman wrote:
> Raymond Wiker <···@RawMBP.local> writes:
>> Alternatively, ISTR that SBCL has some optimizations for
>> alien calls that CMUCL foes not have, which means that parts of the
>> "alien" machinery will be invoked on every call, instead of just once,
>> at compilation time.
> 
> I wouldn't neccessarily talk about optimizations. Assuming that the
> code does not have sufficient type declarations for aliens, both SBCL
> and CMUCL will go through the process of invoking that machinery. It's
> just that SBCL invokes it with the interpreter, while CMUCL (lacking a
> real interpreter) has to call the compiler.
> 
> An order of magnitude difference due to this sounds completely
> plausible. But note that SBCL would also benefit greatly from the
> proper type declarations, just not as much as CMUCL.

Juho,

I'd like to thank you and all previous posters for their answers. Sounds
like people feel a bit offended by my post. Sorry, I didn't want to do
this, I was under pressure to find a working solution.

Because there seems to be some interest, I'll give some more information.
I don't use CLSQL but my own ODBC Lisp program. This has been
developed using LispWorks for Windows and LispWorks FLI library.
Now I had the need, to write a batch job to run on Linux. I decided
to adapt my code to Uffi and developed the port using CMUCL. Within
two days, I had a running solution but, utterly slow. So I gave sbcl a try
and astonishingly it was close in speed to the LispWorks solution.

Of course I compiled all code (funny idea, I didn't).

CMUCL throws warnings on me during compilation every time a foreign object
is allocated. Now, I think these warnings are not problematic, because allocation
of foreign objects is seldom done. The test program reads a lot of
data (approx 100 000 rows) predominantely calls fetch-cursor which does not
allocate foreign objects. Here a typical message during compilation

; File: /devel/lib/atp/odbc-3/sql-interface.lisp

; In: DEFUN GET-DIAG-REC

;   (UFFI:WITH-FOREIGN-OBJECTS (# #)
;                              (LET* #
;                                #))
; --> UFFI:WITH-FOREIGN-OBJECT ALIEN:WITH-ALIEN EXTENSIONS:COMPILER-LET LET
; --> MULTIPLE-VALUE-PROG1 SYMBOL-MACROLET LET* UFFI:WITH-FOREIGN-OBJECTS
; --> UFFI:WITH-FOREIGN-OBJECT ALIEN:WITH-ALIEN EXTENSIONS:COMPILER-LET LET
; --> MULTIPLE-VALUE-PROG1 SYMBOL-MACROLET LET* ALIEN:ADDR
; --> ALIEN::%LOCAL-ALIEN-ADDR
; ==>
;   (ALIEN::%SAP-ALIEN C::VAR '#<ALIEN::ALIEN-POINTER-TYPE #>)
; Note: Unable to optimize because:
;     Could not optimize away %SAP-ALIEN: forced to do runtime
; allocation of alien-value structure.
;
;   (UFFI:ALLOCATE-FOREIGN-STRING 9)
; --> ALIEN:MAKE-ALIEN
; ==>
;   (ALIEN::%SAP-ALIEN (ALIEN::%MAKE-ALIEN #) '#<ALIEN::ALIEN-POINTER-TYPE #>)
; Note: Unable to optimize because:
;     Could not optimize away %SAP-ALIEN: forced to do runtime
; allocation of alien-value structure.
;
;   (UFFI:ALLOCATE-FOREIGN-STRING 257)
; --> ALIEN:MAKE-ALIEN
; ==>
;   (ALIEN::%SAP-ALIEN (ALIEN::%MAKE-ALIEN #) '#<ALIEN::ALIEN-POINTER-TYPE #>)
; Note: Unable to optimize because:
;     Could not optimize away %SAP-ALIEN: forced to do runtime
; allocation of alien-value structure.
;
;   (UFFI:WITH-FOREIGN-OBJECTS (# #)
;                              (LET* #
;                                #))
; --> UFFI:WITH-FOREIGN-OBJECT ALIEN:WITH-ALIEN EXTENSIONS:COMPILER-LET LET
; --> MULTIPLE-VALUE-PROG1 SYMBOL-MACROLET LET* ALIEN:ADDR
; --> ALIEN::%LOCAL-ALIEN-ADDR
; ==>
;   (ALIEN::%SAP-ALIEN C::VAR '#<ALIEN::ALIEN-POINTER-TYPE #>)
; Note: Unable to optimize because:
;     Could not optimize away %SAP-ALIEN: forced to do runtime
; allocation of alien-value structure.

And here the function of interest as example:

(defun get-diag-rec (handle type record-number)
  "Get DiagRec record-number in lispy format."
  (let (ret)
    (uffi:with-foreign-objects ((sqlnative :long) 
    (txtlen 'SQLSMALLINT))
      (let* ((sqlstate (uffi:allocate-foreign-string 9))
             (sqlmsg   (uffi:allocate-foreign-string 257))
             (r (SQLGetDiagRec type handle record-number 
          sqlstate
          sqlnative
          sqlmsg
          256 txtlen)))
 (unless (= r SQL_NO_DATA)
   (unwind-protect 
        (progn
   (unless (odbc-success r) 
     (odbc-error "SQLGetDiagRec failed."))
   (setf ret (list :sqlstate
     (uffi:convert-from-foreign-string sqlstate)
     :sqlnative
     (uffi:deref-pointer sqlnative :long)
     :sqlmsg
     (string-trim '(#\Newline)
           (uffi:convert-from-foreign-string sqlmsg))
     :record record-number)))
     (uffi:free-foreign-object sqlstate)
     (uffi:free-foreign-object sqlmsg)))))
    ret))

Again, I now think these warning are not the problem. The test program
executes 6 SQL statements and reads the results. The warnings only play
a role when preparing the statements but not when reading the bulk of data.

I guess what happens is heavy allocation of strings. Each data field is
read as string and on each column for each row I call

(setf field (buffer-field cursor i))
(setf field (uffi:convert-from-foreign-string field))
(setf field (string-trim '(#\Space) field))

before returning field to the caller.

I guess sbcl possibly has a better garbage collection when it comes to
heavy allocation of strings.

I think I have to profile to further investigate this matter. I try to find the
time :)


Thanks for your interest.

Andreas
From: Andreas Thiele
Subject: Re: implementation dependent performance
Date: 
Message-ID: <fnu52r$of0$01$1@news.t-online.com>
Andreas Thiele wrote:
> Hi,
> 
> I wonder a bit about differences in implementation dependend performance. I wrote a batch job which runs accross a database via
> Uffi-Calls on unixODBC and found the following times for part A of my batch 
> 
> CMUCL 692 seconds
> sbcl 24 seconds
> LispWorks 13.9 seconds
> 
> Perhaps a bit unfair because on CMUCL there have been a lot of warnings that optimizations aren't  possible.
> 
> Is this a know issue?
> 
> Andreas

After checking and discussing I did not yet find the true cause for this difference. I ported my lib to CFFI 
and found the following times:

CMUCL
; Evaluation took:
;   9.36 seconds of real time
;   7.924795 seconds of user run time
;   0.516921 seconds of system run time
;   23,720,226,048 CPU cycles
;   [Run times include 1.04 seconds GC run time]
;   0 page faults and
;   212,625,800 bytes consed.

LispWorks
User time    =        8.407
System time  =        0.086
Elapsed time =        9.307
Allocation   = 42099764 bytes
0 Page faults

sbcl
Evaluation took:
  7.927 seconds of real time
  6.214055 seconds of user run time
  0.304953 seconds of system run time
  [Run times include 0.938 seconds GC run time.]
  0 calls to %EVAL
  0 page faults and
  520,450,008 bytes consed.

Happy Lisping

Andreas