From: M Jared Finder
Subject: What's up with DISASSEMBLE and Slime?
Date: 
Message-ID: <421a9add@x-privat.org>
I've been experimenting with SBCL to see how good the optimizer is, but
I've hit a pretty big snag.  DISASSEMBLE behaves differently under Slime
than it does when I run SBCL at the command line!  For example, the function:

(defun binary-add (x y)
   (declare (fixnum x y))
   (the fixnum (+ x y)))

Disassembles as follows if I run with Slime:

; 099DDC36:       01FA             ADD EDX, EDI               ; no-arg-parsing entry point
;       38:       8B4DF8           MOV ECX, [EBP-8]
;       3B:       8B45FC           MOV EAX, [EBP-4]
;       3E:       83C102           ADD ECX, 2
;       41:       8BE5             MOV ESP, EBP
;       43:       8BE8             MOV EBP, EAX
;       45:       FFE1             JMP ECX
;       47:       90               NOP
;

But disassembles as follows when using raw SBCL:

; 09A59462:       64               BYTE #X64                  ; no-arg-parsing entry point
;      463:       C6054800000000   MOV BYTE PTR [#x48], 0
;      46A:       64               BYTE #X64
;      46B:       C6054400000001   MOV BYTE PTR [#x44], 1
;      472:       B920000000       MOV ECX, 32
;      477:       E8E0E75FFE       CALL #x8057C5C             ; alloc_to_ecx
;      47C:       8D4907           LEA ECX, [ECX+7]
;      47F:       C741F93E060000   MOV DWORD PTR [ECX-7], 1598
;      486:       64               BYTE #X64
;      487:       C6054400000000   MOV BYTE PTR [#x44], 0
;      48E:       64               BYTE #X64
;      48F:       803D4800000000   CMP BYTE PTR [#x48], 0
;      496:       7402             JEQ L0
;      498:       CC09             BREAK 9                    ; pending interrupt trap
;      49A: L0:   B84A000000       MOV EAX, 74
;      49F:       8941FD           MOV [ECX-3], EAX
;      4A2:       C741050B000005   MOV DWORD PTR [ECX+5], 83886091
;      4A9:       8B050894A509     MOV EAX, [#x9A59408]       ; "SB-DEBUG-CATCH-TAG"
;      4AF:       894109           MOV [ECX+9], EAX
;      4B2:       C7410D0B000005   MOV DWORD PTR [ECX+13], 83886091
;      4B9:       8B0518030005     MOV EAX, [#x5000318]
;      4BF:       64               BYTE #X64
;      4C0:       8B040500000000   MOV EAX, [EAX]
;      4C7:       8945C4           MOV [EBP-60], EAX
;      4CA:       8B05D8010005     MOV EAX, [#x50001D8]
;      4D0:       64               BYTE #X64
;      4D1:       8B040500000000   MOV EAX, [EAX]
;      4D8:       8B1518020005     MOV EDX, [#x5000218]
;      4DE:       64               BYTE #X64
;      4DF:       8B141500000000   MOV EDX, [EDX]
;      4E6:       8945C8           MOV [EBP-56], EAX
;      4E9:       8955CC           MOV [EBP-52], EDX
;      4EC:       8965C0           MOV [EBP-64], ESP
;      4EF:       8D55D0           LEA EDX, [EBP-48]
;      4F2:       8B05F8010005     MOV EAX, [#x50001F8]
;      4F8:       64               BYTE #X64
;      4F9:       8B040500000000   MOV EAX, [EAX]
;      500:       8902             MOV [EDX], EAX
;      502:       896A04           MOV [EDX+4], EBP
;      505:       C74208C595A509   MOV DWORD PTR [EDX+8], 161846725
;      50C:       894A0C           MOV [EDX+12], ECX
;      50F:       8B05D8010005     MOV EAX, [#x50001D8]
;      515:       64               BYTE #X64
;      516:       8B040500000000   MOV EAX, [EAX]
;      51D:       894210           MOV [EDX+16], EAX
;      520:       8B05D8010005     MOV EAX, [#x50001D8]
;      526:       64               BYTE #X64
;      527:       89140500000000   MOV [EAX], EDX
;      52E:       8B050C94A509     MOV EAX, [#x9A5940C]       ; '*COMPILE-FILE-PATHNAME*
;      534:       8B7011           MOV ESI, [EAX+17]
;      537:       64               BYTE #X64
;      538:       8B343500000000   MOV ESI, [ESI]
;      53F:       83FE4A           CMP ESI, 74
;      542:       750C             JNE L1
;      544:       8B70FD           MOV ESI, [EAX-3]
;      547:       83FE4A           CMP ESI, 74
;      54A:       0F8431010000     JEQ L8
;      550: L1:   8BDC             MOV EBX, ESP
;      552:       83EC0C           SUB ESP, 12
;      555:       8B151094A509     MOV EDX, [#x9A59410]       ; "(+ X Y)"
;      55B:       8B3D1494A509     MOV EDI, [#x9A59414]       ; '(2 2 4 2 ...)
;      561:       8B051894A509     MOV EAX, [#x9A59418]       ; #<FDEFINITION object for SB-C::STEP-FORM>
;      567:       B90C000000       MOV ECX, 12
;      56C:       896BFC           MOV [EBX-4], EBP
;      56F:       8BEB             MOV EBP, EBX
;      571:       FF5005           CALL DWORD PTR [EAX+5]
;      574:       8BE3             MOV ESP, EBX
;      576:       8B55EC           MOV EDX, [EBP-20]
;      579:       8B7DE8           MOV EDI, [EBP-24]
;      57C:       E80F6C5AF7       CALL #x1000190             ; GENERIC-+
;      581:       8BE3             MOV ESP, EBX
;      583:       F6C203           TEST DL, 3
;      586:       7531             JNE L2
;      588:       8BF4             MOV ESI, ESP
;      58A:       52               PUSH EDX
;      58B:       8975F4           MOV [EBP-12], ESI
;      58E:       C745F004000000   MOV DWORD PTR [EBP-16], 4
;      595:       8B0DD8010005     MOV ECX, [#x50001D8]
;      59B:       64               BYTE #X64
;      59C:       8B0C0D00000000   MOV ECX, [ECX]
;      5A3:       8B4910           MOV ECX, [ECX+16]
;      5A6:       8B05D8010005     MOV EAX, [#x50001D8]
;      5AC:       64               BYTE #X64
;      5AD:       890C0500000000   MOV [EAX], ECX
;      5B4:       E990000000       JMP L6
;      5B9: L2:   8B051C94A509     MOV EAX, [#x9A5941C]       ; 'FIXNUM
;      5BF:       CC0A             BREAK 10                   ; error trap
;      5C1:       03               BYTE #X03
;      5C2:       1F               BYTE #X1F                  ; OBJECT-NOT-TYPE-ERROR
;      5C3:       8E               BYTE #X8E                  ; EDX
;      5C4:       0E               BYTE #X0E                  ; EAX
;      5C5:       8D73FC           LEA ESI, [EBX-4]
;      5C8:       8B7DC0           MOV EDI, [EBP-64]
;      5CB:       8BC7             MOV EAX, EDI
;      5CD:       83EF04           SUB EDI, 4
;      5D0:       894DF0           MOV [EBP-16], ECX
;      5D3:       C1E902           SHR ECX, 2
;      5D6:       E303             JECXZ L3
;      5D8:       FD               STD
;      5D9:       F2               REPNE
;      5DA:       A5               MOVSD
;      5DB: L3:   8D6704           LEA ESP, [EDI+4]
;      5DE:       8945F4           MOV [EBP-12], EAX
;      5E1:       8B4DC8           MOV ECX, [EBP-56]
;      5E4:       8B55CC           MOV EDX, [EBP-52]
;      5E7:       8B05D8010005     MOV EAX, [#x50001D8]
;      5ED:       64               BYTE #X64
;      5EE:       890C0500000000   MOV [EAX], ECX
;      5F5:       8B0518020005     MOV EAX, [#x5000218]
;      5FB:       64               BYTE #X64
;      5FC:       89140500000000   MOV [EAX], EDX
;      603:       8B75C4           MOV ESI, [EBP-60]
;      606:       8B1518030005     MOV EDX, [#x5000318]
;      60C:       64               BYTE #X64
;      60D:       8B141500000000   MOV EDX, [EDX]
;      614:       39D6             CMP ESI, EDX
;      616:       7431             JEQ L6
;      618: L4:   8B42FC           MOV EAX, [EDX-4]
;      61B:       09C0             OR EAX, EAX
;      61D:       7415             JEQ L5
;      61F:       8B4AF8           MOV ECX, [EDX-8]
;      622:       8B5811           MOV EBX, [EAX+17]
;      625:       64               BYTE #X64
;      626:       890C1D00000000   MOV [EBX], ECX
;      62D:       C742FC00000000   MOV DWORD PTR [EDX-4], 0
;      634: L5:   83EA08           SUB EDX, 8
;      637:       39D6             CMP ESI, EDX
;      639:       75DD             JNE L4
;      63B:       8B0D18030005     MOV ECX, [#x5000318]
;      641:       64               BYTE #X64
;      642:       89140D00000000   MOV [ECX], EDX
;      649: L6:   8B75F4           MOV ESI, [EBP-12]
;      64C:       8B4DF0           MOV ECX, [EBP-16]
;      64F:       8B45F8           MOV EAX, [EBP-8]
;      652:       83F904           CMP ECX, 4
;      655:       750F             JNE L7
;      657:       8B56FC           MOV EDX, [ESI-4]
;      65A:       8B7DFC           MOV EDI, [EBP-4]
;      65D:       8BE5             MOV ESP, EBP
;      65F:       8BEF             MOV EBP, EDI
;      661:       83C002           ADD EAX, 2
;      664:       FFE0             JMP EAX
;      666: L7:   8BDD             MOV EBX, EBP
;      668:       8B6DFC           MOV EBP, [EBP-4]
;      66B:       E9A8695AF7       JMP #x1000018              ; RETURN-MULTIPLE
;      670:       CC0A             BREAK 10                   ; error trap
;      672:       02               BYTE #X02
;      673:       18               BYTE #X18                  ; INVALID-ARG-COUNT-ERROR
;      674:       4D               BYTE #X4D                  ; ECX
;      675:       CC0A             BREAK 10                   ; error trap
;      677:       02               BYTE #X02
;      678:       08               BYTE #X08                  ; OBJECT-NOT-FIXNUM-ERROR
;      679:       8E               BYTE #X8E                  ; EDX
;      67A:       CC0A             BREAK 10                   ; error trap
;      67C:       04               BYTE #X04
;      67D:       08               BYTE #X08                  ; OBJECT-NOT-FIXNUM-ERROR
;      67E:       FECE01           BYTE #XFE, #XCE, #X01      ; EDI
;      681: L8:   CC0A             BREAK 10                   ; error trap
;      683:       02               BYTE #X02
;      684:       1A               BYTE #X1A                  ; UNBOUND-SYMBOL-ERROR
;      685:       0E               BYTE #X0E                  ; EAX
;


YUCK!

What's going on that makes the Slime version *so much simpler* than the
raw SBCL version?

   -- MJF

From: Barry Margolin
Subject: Re: What's up with DISASSEMBLE and Slime?
Date: 
Message-ID: <barmar-F2C5F8.23341421022005@comcast.dca.giganews.com>
In article <········@x-privat.org>, M Jared Finder <·····@hpalace.com> 
wrote:

> I've been experimenting with SBCL, seeing how good the optimizer is, but
> I've hit a pretty big snag.  DISASSEMBLE is acting different under Slime
> then it is if I run SBCL at the command line!  For example the function:

...
> 
> What's going on that makes the Slime version *so much simpler* then the
> raw SBCL version?

Are you also using Slime to compile the function in the first place?  It 
looks like the function was compiled with high optimization in one case, 
and with no optimization in the other.
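
One way to take the global policy out of the picture is to state the
optimization qualities inside the function itself.  This is only a
sketch: the particular quality values are an assumption chosen for
illustration, and nothing in this thread establishes whether a local
declaration changes what a given SBCL version does with forms typed at
the REPL.

```lisp
(defun binary-add (x y)
  ;; Local OPTIMIZE declaration: applies to this function only, so the
  ;; result does not depend on an earlier DECLAIM or PROCLAIM.
  ;; The values below are illustrative, not a recommendation.
  (declare (optimize (speed 3) (safety 0) (debug 0))
           (fixnum x y))
  (the fixnum (+ x y)))

;; Compare this listing between the two environments.
(disassemble 'binary-add)
```

If the two listings still differ with the policy pinned like this, the
difference is coming from somewhere other than the declared qualities.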

-- 
Barry Margolin, ······@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
From: Christophe Rhodes
Subject: Re: What's up with DISASSEMBLE and Slime?
Date: 
Message-ID: <sqmztxdmcu.fsf@cam.ac.uk>
M Jared Finder <·····@hpalace.com> writes:

> What's going on that makes the Slime version *so much simpler* then
> the raw SBCL version?

Different compilation options.
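
To check that, it may help to print the policy in each session before
compiling anything.  Standard Common Lisp has no portable way to read
the global policy back; the sketch below assumes a reasonably recent
SBCL, where SB-EXT:DESCRIBE-COMPILER-POLICY is available (it may not
exist in the version discussed here).

```lisp
;; SBCL-specific: print the current global optimization qualities.
;; Run this in both the Slime REPL and the command-line REPL and
;; compare the output by eye.
#+sbcl (sb-ext:describe-compiler-policy)
```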

Christophe
From: Juho Snellman
Subject: Re: What's up with DISASSEMBLE and Slime?
Date: 
Message-ID: <slrnd1m3s7.ifj.jsnell@sbz-31.cs.Helsinki.FI>
<·····@hpalace.com> wrote:
> I've been experimenting with SBCL, seeing how good the optimizer is, but
> I've hit a pretty big snag.  DISASSEMBLE is acting different under Slime
> then it is if I run SBCL at the command line!  For example the function:

[ Assuming that you actually have the same compiler policy in both 
  lisp instances. ]

I think I've seen this behaviour before, but forgot to investigate
whether it was by design or by accident. As a workaround you can wrap
the defun in an eval:

* (declaim (optimize (speed 3) (safety 0)))
* (eval '(defun binary-add (x y)
           (declare (fixnum x y))
           (the fixnum (+ x y))))

BINARY-ADD
* (disassemble 'binary-add)

; 00028F48:       4801FA           ADD RDX, RDI               ; no-arg-parsing entry point
;       4B:       488B4DF0         MOV RCX, [RBP-16]
;       4F:       488B45F8         MOV RAX, [RBP-8]
;       53:       4883C103         ADD RCX, 3
;       57:       488BE5           MOV RSP, RBP
;       5A:       488BE8           MOV RBP, RAX
;       5D:       48               BYTE #X48
;       5E:       FFE1             JMP ECX
;                                                                          
NIL
* (defun binary-add (x y)
    (declare (fixnum x y))
    (the fixnum (+ x y)))
STYLE-WARNING: redefining BINARY-ADD in DEFUN

BINARY-ADD
* (disassemble 'binary-add)
;
; 000B8164:       40               BYTE #X40                  ; no-arg-parsing entry point
;      165:       C604251803004000 MOV BYTE PTR [#x40000318], 0
;      16D:       40               BYTE #X40
;      16E:       C60425E802004008 MOV BYTE PTR [#x400002E8], 8
[... about 170 lines removed ...]
;      3C5: L7:   CC0A             BREAK 10                   ; error trap
;      3C7:       02               BYTE #X02
;      3C8:       1A               BYTE #X1A                  ; UNBOUND-SYMBOL-ERROR
;      3C9:       0F               BYTE #X0F                  ; RAX

-- 
Juho Snellman
"Premature profiling is the root of all evil."
From: M Jared Finder
Subject: Re: What's up with DISASSEMBLE and Slime?
Date: 
Message-ID: <421c0fdf$1@x-privat.org>
Juho Snellman wrote:
> <·····@hpalace.com> wrote:
> 
>>I've been experimenting with SBCL, seeing how good the optimizer is, but
>>I've hit a pretty big snag.  DISASSEMBLE is acting different under Slime
>>then it is if I run SBCL at the command line!  For example the function:
> 
> [ Assuming that you actually have the same compiler policy in both 
>   lisp instances. ]
> 
> I think I've seem this behaviour before, but forgot to investigate
> whether it was by design or by accident. As a workaround you can wrap
> the defun in an eval:
> 
> * (declaim (optimize (speed 3) (safety 0)))
> * (eval '(defun binary-add (x y)
>            (declare (fixnum x y))
>            (the fixnum (+ x y))))

Yup, that's the difference.  I noticed that putting the code in a file
and using LOAD also causes it to be optimized.  But that makes no sense;
why is top level code left unoptimized?  I assume that it's something
about the way DECLAIM and PROCLAIM work, but I thought they affect all
code evaluated after the declamation/proclamation.

   -- MJF
From: Rahul Jain
Subject: Re: What's up with DISASSEMBLE and Slime?
Date: 
Message-ID: <87hdjxnptw.fsf@nyct.net>
M Jared Finder <·····@hpalace.com> writes:

> Yup that's the difference.  I noticed that putting the code in a file
> and using LOAD also causes it to be optimized.  But that makes no sense;
> why is top level code left unoptimized?  I assume that its something
> about the way DECLAIM and PROCLAIM work, but I thought they affect all
> code evaluated after the declamation/proclamation.

No, they affect all code COMPILEd after the declamation/proclamation. 
SBCL's interactive "compile" isn't a COMPILE. It tries to get the job
done quickly. If you want properly optimized compilation, COMPILE the
function.

-- 
Rahul Jain
·····@nyct.net
Professional Software Developer, Amateur Quantum Mechanicist
From: M Jared Finder
Subject: Re: What's up with DISASSEMBLE and Slime?
Date: 
Message-ID: <42233877_3@x-privat.org>
Rahul Jain wrote:
> M Jared Finder <·····@hpalace.com> writes:
> 
>>Yup that's the difference.  I noticed that putting the code in a file
>>and using LOAD also causes it to be optimized.  But that makes no sense;
>>why is top level code left unoptimized?  I assume that its something
>>about the way DECLAIM and PROCLAIM work, but I thought they affect all
>>code evaluated after the declamation/proclamation.
> 
> No, they affect all code COMPILEd after the declamation/proclamation. 
> SBCL's interactive "compile" isn't a COMPILE. It tries to get the job
> done quickly. If you want properly optimized compilation, COMPILE the
> function.

Ah, that makes sense.  It's an unfortunate side effect of SBCL's way of 
compiling everything without doing a true compile that you can't just 
write:

* (compile 'binary-add)

because BINARY-ADD is already compiled, and instead have to write:

* (compile 'binary-add '(lambda (x y)
                           (declare (fixnum x y))
                           (the fixnum (+ x y))))
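
To avoid typing the lambda form by hand each time, the same call can be
wrapped in a small macro.  DEFUN-COMPILED is a hypothetical name made up
for this sketch, not something SBCL or Slime provides; it relies only on
the standard behaviour of COMPILE when given both a name and a
definition.

```lisp
(defmacro defun-compiled (name lambda-list &body body)
  "Hypothetical helper: define NAME by passing a lambda expression to COMPILE."
  `(progn
     ;; With a non-NIL name and a definition, COMPILE installs the
     ;; compiled function as the global definition of NAME.
     (compile ',name '(lambda ,lambda-list ,@body))
     ',name))

(defun-compiled binary-add (x y)
  (declare (fixnum x y))
  (the fixnum (+ x y)))

(disassemble 'binary-add)
```

Unlike DEFUN, this does not wrap the body in an implicit BLOCK named
after the function, so RETURN-FROM with the function name won't work
inside it.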

   -- MJF