I'm trying to understand how disassemble works. Here is a test in 2
implementations:
* (defun add (x y) (+ x y))
ADD
* (disassemble 'add)
in SBCL
; 23C06A25: 8B55F4 MOV EDX, [EBP-12] ; no-arg-parsing entry point
; 28: 8B7DF0 MOV EDI, [EBP-16]
; 2B: E890973FFE CALL #x220001C0 ; GENERIC-+
; 30: 7302 JNB L0
; 32: 8BE3 MOV ESP, EBX
; 34: L0: 8D65F8 LEA ESP, [EBP-8]
; 37: F8 CLC
; 38: 8B6DFC MOV EBP, [EBP-4]
; 3B: C20400 RET 4
; 3E: CC0A BREAK 10 ; error trap
; 40: 02 BYTE #X02
; 41: 18 BYTE #X18 ; INVALID-ARG-COUNT-ERROR
; 42: 4D BYTE #X4D ; ECX
;
in Corman
;Disassembling from address #xD72A70:
;#x0: 55 push ebp
;#x1: 8BEC mov ebp,esp
;#x3: 57 push edi
;#x4: 83F902 cmp ecx,00002h
;#x7: 7406 je 000F
;#x9: FF96C4100000 call near dword ptr [esi+0000010C4h] ; Call COMMON-LISP::%WRONG-NUMBER-OF-ARGS
;#xF: FF750C push dword ptr [ebp+0000Ch]
;#x12: 8B4508 mov eax,[ebp+00008h]
;#x15: 5A pop edx
;#x16: A807 test al,007h
;#x18: 750B jne 0025
;#x1A: F6C207 test dl,007h
;#x1D: 7506 jne 0025
;#x1F: 03C2 add eax,edx
;#x21: 7108 jno 002B
;#x23: 2BC2 sub eax,edx
;#x25: FF9604180000 call near dword ptr [esi+000001804h] ; Call COMMON-LISP::%PLUS_EAX_EDX
;#x2B: B901000000 mov ecx,0001
;#x30: 8BE5 mov esp,ebp
;#x32: 5D pop ebp
;#x33: C3 ret
Does this mean that if I take any of the code above and use a
microsoft assembler I would be able to make a standalone program from
this function (or anyone in CL)?
On Jan 23, 3:41 pm, Francogrex <······@grex.org> wrote:
> Does this mean that if I take any of the code above and use a
> microsoft assembler I would be able to make a standalone program from
> this function (or anyone in CL)?
No. The disassemble function is provided to let the user see how the
compiler is compiling their code.
I can't speak for every implementation. In principle you could, but in
practice, in 99.9999% of cases, you can't.
Normally a C compiler comes with an ABI (Application Binary Interface)
standard. It's basically a specification of how you call a function,
how you pass the arguments (on the stack, in registers, or both, and in
which order), etc. It also says which registers need to be preserved
across calls and which can be used freely.
For example take a look at the ARM ABI:
http://www.arm.com/products/DevTools/ABI.html
There is no ABI for Lisps. For example, having LispWorks as a DLL and
Allegro as a DLL won't do any good: they can't call each other directly,
as no low-level protocol (such as an ABI) has been established between
them.
But granted, this is what gives Lisp compilers the freedom to optimize
in areas that are not even possible in standard C, for example how
closures are stored, how non-local gotos are done, exception handling,
multiple values, optional arguments, garbage collection, boxing,
unboxing, etc., and most importantly deep and shallow binding.
As "C" is more or less a machine-level language, it does not differ much
from how most CPUs out there work, and that's why such interoperability
is possible, e.g. a library or DLL compiled with one "C" implementation
talking to another. Even if each compiler has its own slightly different
way of doing things, there is almost always a compatibility mode (e.g.
in gcc: do it the way msvc does under mingw, or do it the way the ARM
ABI says).
But you know, a "C" ABI only goes as far as the hardware and the
compilers being used are similar. For example, an ABI won't help you on
the PlayStation 3, where you have PPU code calling SPU code, and it
won't help you call from one machine to another over a network, etc.
In such cases, solutions like CFFI, an IDL, or anything else that packs
up the arguments, the environment (if needed) and the function call,
and then invokes the other side, would work.
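To make the "pack the arguments and the call" idea concrete, here is a
minimal sketch in plain Common Lisp. It is not how CFFI or any IDL tool
works internally; the wire format (a printed list) and the names
REMOTE-ADD, *CALLABLE-NAMES*, PACK-CALL and UNPACK-AND-CALL are made up
for illustration.

```lisp
;; Hypothetical example: a call is shipped as text and applied on arrival.

(defun remote-add (x y)
  "Stand-in for a function that lives on the receiving side."
  (+ x y))

(defparameter *callable-names* '(remote-add)
  "Names the receiving side agrees to run (an assumption of this sketch).")

(defun pack-call (name &rest args)
  "Turn a call into text that can cross a process or machine boundary."
  (with-standard-io-syntax
    (prin1-to-string (cons name args))))

(defun unpack-and-call (message)
  "Read a packed call back and apply it, refusing unknown names."
  (destructuring-bind (name &rest args)
      (with-standard-io-syntax
        ;; WITH-STANDARD-IO-SYNTAX sets *READ-EVAL* to T, so turn it
        ;; off again: received text must never be evaluated via #.
        (let ((*read-eval* nil))
          (read-from-string message)))
    (unless (member name *callable-names*)
      (error "Refusing to call ~S" name))
    (apply (symbol-function name) args)))

;; The call travels as a string and is only applied on the other side.
(unpack-and-call (pack-call 'remote-add 2 3))
```

Neither side needs to know the other's register or stack conventions,
only the agreed text format, which is the point of such solutions.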
So in short, the disassembly is useful for checking whether the Lisp
compiler is doing all right. I mostly use it to cross-check certain
calculations against "C" code. I'm usually not worried about code size,
but about consing. And any time I see a GENERIC-+ call in a function
that always operates on one and the same type across large data, there
is definitely room for improvement.
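If you want to do that kind of check without reading the listing by
eye, a rough sketch is to capture what DISASSEMBLE prints and search it.
The helper names below are made up, and the marker string is only
meaningful for a compiler that names its generic addition routine that
way (the SBCL listing in this thread does); treat it as an assumption.

```lisp
;; Sketch: capture DISASSEMBLE output as a string and search it.
;; GENERIC-ADD, DISASSEMBLY-STRING and DISASSEMBLY-MENTIONS-P are
;; illustrative names, not part of any library.

(defun generic-add (x y)
  (+ x y))

(defun disassembly-string (function)
  "Return whatever DISASSEMBLE prints for FUNCTION, as a string."
  (with-output-to-string (*standard-output*)
    (disassemble function)))

(defun disassembly-mentions-p (marker function)
  "True if MARKER occurs (ignoring case) in the disassembly of FUNCTION."
  (and (search marker (disassembly-string function) :test #'char-equal)
       t))

;; Whether this returns T depends entirely on the implementation and
;; on what it calls its generic arithmetic routines.
(disassembly-mentions-p "GENERIC-+" 'generic-add)
```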
Just try this:
* (defun add (x y)
    (declare (fixnum x y)
             (optimize (speed 3) (safety 0) (space 0) (debug 0)))
    (the fixnum (+ x y)))
STYLE-WARNING: redefining ADD in DEFUN
ADD
* (disassemble 'add)
; 23C1598A: 01FA ADD EDX, EDI ; no-arg-parsing entry point
; 8C: 8D65F8 LEA ESP, [EBP-8]
; 8F: F8 CLC
; 90: 8B6DFC MOV EBP, [EBP-4]
; 93: C20400 RET 4
;
NIL
*
As you can see, SBCL assumes that the first argument is in EDX (or EDI),
the second in EDI (or EDX), and leaves the result in EDX. Also, the
function itself restores the caller's state and cleans up the stack
(RET 4), kind of like the "Pascal" calling convention, instead of
leaving that to the caller (as in "C").
But you would get totally different results in other compilers.
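One caveat about the (safety 0) version: it tells the compiler to trust
the declarations, so passing a non-fixnum, or a sum that no longer fits
in a fixnum, has undefined consequences. A more conservative sketch
keeps the argument declarations but drops (safety 0) and the
(the fixnum ...) promise about the result. The name ADD-CHECKED is made
up, and how much slower or longer the generated code gets is up to your
compiler.

```lisp
;; Sketch: declare the argument types but make no promise about the result.
;; ADD-CHECKED is an illustrative name.

(defun add-checked (x y)
  (declare (fixnum x y)
           (optimize (speed 3) (safety 1)))
  ;; No (the fixnum ...) here: the sum of two fixnums may be a bignum,
  ;; and the compiler has to allow for that.
  (+ x y))

(add-checked 2 3)
(add-checked most-positive-fixnum 1)   ; still the mathematically right answer
```

Comparing (disassemble 'add-checked) with the listing above shows what
the extra safety costs on your implementation.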
Francogrex wrote:
> I'm trying to understand how disassemble works. Here is a test in 2
> implementations:
...
> Does this mean that if I take any of the code above and use a
> microsoft assembler I would be able to make a standalone program from
> this function (or anyone in CL)?
Not directly
- disassemble doesn't have to respect any external assembler's syntax
- it doesn't show the environment (libraries in memory, stack,
registers, etc.) required to successfully call the code
But the general premise (that you see the raw assembly) should be true.
- Daniel
D Herring <········@at.tentpost.dot.com> writes:
> Francogrex wrote:
> > I'm trying to understand how disassemble works. Here is a test in 2
> > implementations:
> ...
> > Does this mean that if I take any of the code above and use a
> > microsoft assembler I would be able to make a standalone program from
> > this function (or anyone in CL)?
>
> Not directly
> - disassemble doesn't have to respect any external assembler's syntax
> - it doesn't show the environment (libraries in memory, stack,
> registers, etc.) required to successfully call the code
>
> But the general premise (that you see the raw assembly) should be
> true.
But there is the additional complication of Lisp implementations that
don't compile to native code, but rather to an intermediate byte-code
for a virtual machine.
In such an implementation I would expect DISASSEMBLE to show the byte
code and not assembler.
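A small sketch of why it helps to note which implementation you are
looking at: COMPILED-FUNCTION-P only says that a function has been
compiled, not whether the result is native code or byte code. The names
ADD-ONE and DESCRIBE-COMPILATION are made up for illustration.

```lisp
;; Sketch: report the implementation alongside the compiled status.
;; ADD-ONE and DESCRIBE-COMPILATION are illustrative names.

(defun add-one (x)
  (+ x 1))

(defun describe-compilation (name)
  "Compile NAME if needed and say which implementation produced the code."
  (unless (compiled-function-p (fdefinition name))
    (compile name))
  (format nil "~A ~A: ~S is compiled: ~A"
          (lisp-implementation-type)
          (lisp-implementation-version)
          name
          (if (compiled-function-p (fdefinition name)) "yes" "no")))

(describe-compilation 'add-one)
```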
--
Thomas A. Russ, USC/Information Sciences Institute
Thomas A. Russ wrote:
> D Herring <········@at.tentpost.dot.com> writes:
>
>> Francogrex wrote:
>>> I'm trying to understand how disassemble works. Here is a test in 2
>>> implementations:
>> ...
>>> Does this mean that if I take any of the code above and use a
>>> microsoft assembler I would be able to make a standalone program from
>>> this function (or anyone in CL)?
>> Not directly
>> - disassemble doesn't have to respect any external assembler's syntax
>> - it doesn't show the environment (libraries in memory, stack,
>> registers, etc.) required to successfully call the code
>>
>> But the general premise (that you see the raw assembly) should be
>> true.
>
> But there is the additional complication of Lisp implementations that
> don't compile to native code, but rather to an intermediate byte-code
> for a virtual machine.
>
> In such an implementation I would expect DISASSEMBLE to show the byte
> code and not assembler.
>
Eh. For a dead language, there sure are a bunch of options.
# clisp
[1]> (defun f (x) (+ x 1))
F
[2]> (disassemble #'f)
Disassembly of function F
1 required argument
0 optional arguments
No rest parameter
No keyword parameters
3 byte-code instructions:
0 (LOAD&PUSH 1)
1 (CALLS2 177) ; 1+
3 (SKIP&RET 2)
NIL
[3]>
Details... You win just because nobody built the hardware yet. ;)
- Daniel