From: Greg Menke
Subject: Assembler in Common Lisp
Date: 
Message-ID: <m37jsma4wg.fsf@europa.pienet>
Sometime last Fall I asked a question about how CL might be used to
create a macro assembler for a microcontroller.  I finally got around
to giving it a try and I think it's working.  It's not entirely
complete, there are more instructions & directives to implement and
things related to generating binary images to do, but I wanted to
sanity check some of the basic design.  The assembler is working now
with the instructions I've implemented so far.  Hopefully this is
enough detail to suggest if I'm doing things right or not, which I
hope you might comment on.

- Assembler implementation is in package 'asm51.  The assembler
  creates a "private" package & imports 'common-lisp and the assembler
  compile-time symbols.  Code to be assembled is presented to the
  Assembler via a stream.

- Assembly occurs with *package* set to the private package, so all of
  common-lisp and the assembler directives are directly available to
  the user's sourcecode.

- Two directives are implemented as macros, "defcode" and "defdata".
  They accept a lambda-ish arrangement of sexp-formatted assembly, as
  follows:

        (defcode temp
           (nop)
           (mov  A 1)
           (call 'xyzpdq)
           (ret))

  Which causes the assembler to create a symbol named "TEMP" in the
  private package & save off all sexps after the symbol name as the
  contents of the symbol, to be assembled later.  Note the quoted
  symbol; it refers to a symbol introduced by an invocation of defcode
  elsewhere in the source.

- Each instruction is set up as a defmethod with specializations for
  all the legal combinations of parameters.  In the (nop) example,
  there is only one specialization.  For the (mov ..) example, a number
  of specializations exist, including one that serves this case, where
  a literal value is moved to the accumulator register.  All the cpu
  registers exist as variously typed symbols to support the
  specializations.

- First, (read) is used to get each sexp.  (eval) is used to execute
  it.  This causes the defcode/defdata macros to expand & execute,
  introducing each symbol in the sourcecode.

- After all symbols are introduced, the contents of each symbol are
  individually (eval)'ed, causing the specialized defmethods to be
  invoked which themselves yield Lisp "object" macros that will be
  used last to generate the binary output.  Additionally, this step
  provides the runtime size of the code or data represented by the
  symbol, so now each symbol knows how long it is & runtime addresses
  are then assigned.

- Symbol references like 'xyzpdq above are resolved last.  Once each
  symbol has a runtime address, the address is stored to the symbol's
  symbol-value, so the presence of a symbol in an expression will
  yield its runtime address in the last step.

- Lastly the "object" macros in each symbol are (eval)ed, yielding the
  output binary.
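
A minimal sketch of what the instruction methods described above might
look like.  This is not the asm51 source: the names INSN, ACCUMULATOR,
GP-REGISTER and CHECK-BYTE are assumptions made for illustration, and
closures stand in for the "object" macros.  The opcodes are the
standard 8051 encodings for NOP, MOV A,#data and MOV A,Rn.

```lisp
;;; Sketch only: INSN, ACCUMULATOR, GP-REGISTER and CHECK-BYTE are
;;; hypothetical names, not taken from asm51.

(defstruct insn
  size      ; number of bytes the instruction occupies
  emitter)  ; function of no arguments returning the bytes as a list

;; Registers get distinct classes so methods can specialize on them.
(defclass accumulator () ())
(defclass gp-register ()
  ((index :initarg :index :reader reg-index)))

(defvar a  (make-instance 'accumulator))
(defvar r0 (make-instance 'gp-register :index 0))

(defun check-byte (value)
  "Signal an error unless VALUE fits in eight bits; otherwise return it."
  (unless (typep value '(unsigned-byte 8))
    (error "~S does not fit in one byte" value))
  value)

(defgeneric nop ())
(defgeneric mov (dst src))

;; NOP is opcode #x00.
(defmethod nop ()
  (make-insn :size 1 :emitter (lambda () (list #x00))))

;; MOV A,#immediate is opcode #x74 followed by the literal.
(defmethod mov ((dst accumulator) (src integer))
  (check-byte src)
  (make-insn :size 2 :emitter (lambda () (list #x74 src))))

;; MOV A,Rn is the single byte #xE8 + n.
(defmethod mov ((dst accumulator) (src gp-register))
  (make-insn :size 1
             :emitter (lambda () (list (+ #xE8 (reg-index src))))))
```

With these definitions, (mov a 1) yields a two-byte INSN, while an
illegal combination such as (mov 1 a) has no applicable method and so
signals an error without any hand-written operand parsing.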

The upshot of (read) and (eval) here is that the user can employ all
of Common Lisp as an assembly-time "metalanguage".  I think this means
I can use defmacros, etc. in the source code to build a fancy macro
assembler.  By introducing each symbol given in the source as a
full-blown Lisp symbol, I don't have to parse anything by hand or keep
anything anywhere outside an instance of a defstruct in each symbol's
plist which the assembler references as necessary.  The assembler
internals sanity-check incoming values as code is emitted, so
incorrect types/magnitudes will be caught.
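
A rough sketch of the (read)/(eval) loop and the private package, under
stated assumptions: LOAD-ASM-SOURCE and the SOURCE plist indicator are
hypothetical names, and the body is stored as a plain list rather than
in a defstruct instance.

```lisp
;;; Sketch only: LOAD-ASM-SOURCE and the SOURCE indicator are
;;; hypothetical; asm51 keeps a defstruct instance on the plist instead.

(defmacro defcode (name &body body)
  "Store BODY, unevaluated, on NAME's property list for later assembly."
  `(progn (setf (get ',name 'source) ',body)
          ',name))

(defun load-asm-source (stream)
  "READ and EVAL every form on STREAM inside a fresh private package.
Returns that package so the caller can inspect the symbols it holds."
  (let ((private (make-package (gensym "ASM-USER-")
                               :use '(:common-lisp))))
    ;; Make the directive visible to user source without a prefix.
    (import '(defcode) private)
    (let ((*package* private))
      ;; The stream itself serves as a unique end-of-file marker.
      (loop for form = (read stream nil stream)
            until (eq form stream)
            do (eval form)))
    private))

;; Usage: assemble-time source arrives on any character stream.
(defun load-demo ()
  (with-input-from-string (in "(defcode temp (nop) (mov a 1) (ret))")
    (let* ((package (load-asm-source in))
           (symbol  (find-symbol "TEMP" package)))
      (get symbol 'source))))
```

Because every top-level form goes through EVAL, the same stream could
just as well contain defmacro or defun forms alongside the defcode
blocks.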

The assembler source remains nice and simple, the bulk of the content
is definitions of the cpu registers and the instruction defmethods.

Does this sound like the right sort of approach? 

Thanks,

Gregm

From: Jeff
Subject: Re: Assembler in Common Lisp
Date: 
Message-ID: <RtiOc.212881$XM6.52468@attbi_s53>
Greg Menke wrote:

> 
> Sometime last Fall I asked a question about how CL might be used to
> create a macro assembler for a microcontroller.  I finally got around
> to giving it a try and I think its working.  Its not entirely
> complete, there are more instructions & directives to implement and
> things related to generating binary images to do, but I wanted to
> sanity check some of the basic design.  The assembler is working now
> with the instructions I've implemented so far.  Hopefully this is
> enough detail to suggest if I'm doing things right or not, which I
> hope you might comment on.
> 
>         (defcode temp
>            (nop)
>            (mov  A 1)
>            (call 'xyzpdq)
>            (ret))

I've actually been working on something similar for the ARM7TDMI.
Coming from a Forth background, I was curious how something
like this might be done. I'm using Corman Lisp, and it is nice to see
how Roger implemented the x86 assembler in it. You may want to download
it and give it a look. For example:

  (defasm some-code (x)
    {
      mov eax,ecx
      jne :t1
    :t1
      ret
    })

He basically went through and made read macros for { , : [ etc. and
just expands them at read time into actual instructions.

My version (for the ARM) is much more macro based and acts very much
like your version (and Forth I might add). If you'd like to compare
notes sometime, feel free to email me. I'd like to know how you
handled forward branching.

Jeff
From: Greg Menke
Subject: Re: Assembler in Common Lisp
Date: 
Message-ID: <m31xiuf9ar.fsf@europa.pienet>
"Jeff" <···@nospam.insightbb.com> writes:

> Greg Menke wrote:
> 
> I've actually been working on something similar for the ARM7TDMI.
> Coming from a Forth background, I was curiously wondering how something
> like this might be done. I'm using Corman Lisp, and it is nice to see
> how Roger implemented the x86 assembler in it. You may want to download
> it and give it a look. For example:
> 
>   (defasm some-code (x)
>     {
>       mov eax,ecx
>       jne :t1
>     :t1
>       ret
>     })
> 
> He basically went through and made read macros for { , : [ etc. and
> just executes them at read time into actually instructions.
> 
> My version (for the ARM) is much more macro based and acts very much
> like your version (and Forth I might add). If you'd like to compare
> notes sometimes, feel free to email me. I'd like to know how you
> handled forward branching.

I handle it by accumulating symbols first, computing the size of the
data represented by each symbol, then assigning addresses.  Once
that's done, each instruction is generated.  Since the runtime
addresses are known at compile-time, forward references are no
different than any other kind.  Instances of code or data symbols as
operands are evaluated the normal way at compile-time, so arithmetic
upon them and vector tables using them act "normally".  References
within a symbol like :t1 in your example are offsets from the start of
the symbol.
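
A small sketch of the address-assignment pass, assuming the sizes are
already known.  ASSIGN-ADDRESSES is a hypothetical name, and the sizes
used in the usage note are illustrative.

```lisp
;;; Sketch only: ASSIGN-ADDRESSES is a hypothetical helper.

(defun assign-addresses (sized-symbols &key (origin 0))
  "SIZED-SYMBOLS is a list of (symbol . size-in-bytes) in layout order.
Stores each symbol's address in its symbol-value and returns the first
free address after the last symbol."
  (let ((address origin))
    (loop for (symbol . size) in sized-symbols
          ;; Proclaiming the symbol special lets later EVAL calls treat
          ;; it as an ordinary global variable.
          do (proclaim `(special ,symbol))
             (setf (symbol-value symbol) address)
             (incf address size))
    address))
```

After (assign-addresses '((temp . 7) (xyzpdq . 4))), an operand
expression such as (+ xyzpdq 1) evaluates to 8 even though XYZPDQ is
laid out after the code that refers to it.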

The key to making this work seems to be folding each instruction's
operands into a macro that is evaluated later on, once the addresses
are known.  That makes intuitive sense from a "call" standpoint, but
it also works with the db,dw concepts.
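
A sketch of the deferred-operand idea, using closures where the text
above says macros.  DEFERRED, SPLIT-WORD and RESOLVE are hypothetical
names; the LCALL opcode (#x12, then high byte, then low byte) is the
standard 8051 encoding, and the high-byte-first order for DW is an
assumption.

```lisp
;;; Sketch only: closures stand in for the macros described above.

(defstruct deferred
  size    ; bytes reserved for this entry
  thunk)  ; function of no arguments, called once addresses are known

(defun split-word (value)
  "Return the high and low bytes of a 16-bit VALUE as a list."
  (check-type value (unsigned-byte 16))
  (list (ldb (byte 8 8) value) (ldb (byte 8 0) value)))

(defun resolve (operand)
  "Return OPERAND's address if it is a symbol, else OPERAND itself."
  (if (symbolp operand) (symbol-value operand) operand))

;; LCALL addr16: the target symbol is captured now, looked up later.
(defun call (target)
  (make-deferred
   :size 3
   :thunk (lambda () (cons #x12 (split-word (resolve target))))))

;; DW: one 16-bit word per operand, high byte first (assumed order).
(defun dw (&rest operands)
  (make-deferred
   :size (* 2 (length operands))
   :thunk (lambda ()
            (loop for operand in operands
                  append (split-word (resolve operand))))))
```

The size is available as soon as (call 'far-routine) returns, before
FAR-ROUTINE has an address; the thunk is only run after the address has
been stored in the symbol.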

I opted for the sexp based syntax because I think I want to
symbolically compose assembly just like I would Lisp.  By maintaining
a consistent syntax, I think building up a macro language on top may
be a bit easier.

btw, my target is an 8052A, and the ultimate goal is to build a
"Lisp-ish" compiled language to use on it instead of the built-in
Basic.

Gregm
From: Jeff
Subject: Re: Assembler in Common Lisp
Date: 
Message-ID: <1nkOc.195644$JR4.92383@attbi_s54>
Greg Menke wrote:

> "Jeff" <···@nospam.insightbb.com> writes:
> 
> > My version (for the ARM) is much more macro based and acts very much
> > like your version (and Forth I might add). If you'd like to compare
> > notes sometimes, feel free to email me. I'd like to know how you
> > handled forward branching.
> 
> I handle it by accumulating symbols first, computing the size of the
> data represented by each symbol, then assigning addresses.  Once
> that's done, each instruction is generated.  Since the runtime
> addresses are known at compile-time, forward references are no
> different than any other kind. 

Something you may find useful from Forth land is the use of a
stack for forward references. This makes life so much easier, and
actually makes the assembly code much easier to read as well. For
example, here is one from my assembler:

(defasm graphics
  (mov r2 4)
  (cmp r0 2)
  (if le
    (lsl r1 r2 8)
    (orr r0 r1))
  (lsl r1 r2 24)
  (strh r0 r1)
  (bx lr))

The IF macro, in this case, would save the temporary address of its
location, reserve the bytes necessary for the branch operation, and
then compile all the remaining instructions passed in via &rest,
returning the number of bytes compiled inside. I take the same
approach for looping constructs:

(defasm count-to-10
  (loop-until eq
    (sub r0 r0 1)
    (cmp r0 0)))
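
A toy sketch of the Forth-style fixup stack, not the ARM assembler
itself.  The names ASM-IF, MARK-FORWARD and RESOLVE-FORWARD, the
one-byte branch offset, and the opcode #xB0 are all made up for
illustration, and this version patches the placeholder in place rather
than returning a byte count.

```lisp
;;; Sketch only: a made-up one-byte encoding, not ARM.

(defvar *code*)    ; adjustable byte vector being assembled
(defvar *fixups*)  ; stack of indices of unresolved branch offsets

(defun emit (byte)
  (vector-push-extend byte *code*))

(defun mark-forward ()
  "Emit a placeholder offset byte and push its index on the stack."
  (push (fill-pointer *code*) *fixups*)
  (emit 0))

(defun resolve-forward ()
  "Pop the newest placeholder and patch in the distance to here."
  (let ((slot (pop *fixups*)))
    (setf (aref *code* slot) (- (fill-pointer *code*) slot 1))))

(defmacro asm-if (branch-opcode &body body)
  "Emit a forward branch around BODY, patched once BODY is assembled."
  `(progn (emit ,branch-opcode)
          (mark-forward)
          ,@body
          (resolve-forward)))

(defun fixup-demo ()
  (let ((*code* (make-array 0 :element-type '(unsigned-byte 8)
                              :adjustable t :fill-pointer 0))
        (*fixups* '()))
    (asm-if #xB0          ; hypothetical "skip unless condition" opcode
      (emit #x01)
      (emit #x02))
    (emit #xFF)
    (coerce *code* 'list)))
```

Because the pending offsets live on a stack, nested ASM-IF forms
resolve in the right order without any named labels.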

Jeff
From: Greg Menke
Subject: Re: Assembler in Common Lisp
Date: 
Message-ID: <m3vfg5ejtu.fsf@europa.pienet>
"Jeff" <···@nospam.insightbb.com> writes:

> Greg Menke wrote:
> 
> > "Jeff" <···@nospam.insightbb.com> writes:
> > 
> > > My version (for the ARM) is much more macro based and acts very much
> > > like your version (and Forth I might add). If you'd like to compare
> > > notes sometimes, feel free to email me. I'd like to know how you
> > > handled forward branching.
> > 
> > I handle it by accumulating symbols first, computing the size of the
> > data represented by each symbol, then assigning addresses.  Once
> > that's done, each instruction is generated.  Since the runtime
> > addresses are known at compile-time, forward references are no
> > different than any other kind. 
> 
> Something you made find useful from the Forth land is the use of a
> stack for forward references. This makes life so much easier, and
> actually makes the assembly code much easier to read as well. For
> example, an example from my assembler would be:
> 
> (defasm graphics
>   (mov r2 4)
>   (cmp r0 2)
>   (if le
>     (lsl r1 r2 8)
>     (orr r0 r1))
>   (lsl r1 r2 24)
>   (strh r0 r1)
>   (bx lr))

But I don't think I need a stack.  When I run thru a symbol's assembly
source, the first evaluation of each entry yields an instance of a
struct that contains the # of bytes to be generated and a macro that
records the operands and generates the output data when it's
executed later on.  The # of bytes is used to compute the sizes and
addresses of the symbols so the macros can know the runtime addresses
when they're run.  This way forward references are the same as any
other kind; they're all just addresses.

Put another way, I reserve the space each instruction will consume
because I know how many bytes each one needs.  That gives enough info
to set up the addresses, and then I can generate the binary.  To take
your example above, I would implement (if) as a Lisp macro that yields
a bunch of assembly to implement the runtime behavior.

Gregm
From: Kalle Olavi Niemitalo
Subject: Re: Assembler in Common Lisp
Date: 
Message-ID: <87u0vpnbkz.fsf@Astalo.kon.iki.fi>
"Jeff" <···@nospam.insightbb.com> writes:

> The IF macro, in this case, would save the temporary address of its
> location and reserve the bytes necessary for branch operation and then
> compile all the instructions left with &rest, which returns the number
> of bytes compiles inside.

Wouldn't it then be more appropriate to call the macro WHEN?
From: Christophe Turle
Subject: Re: Assembler in Common Lisp
Date: 
Message-ID: <ced6ik$f4u$1@amma.irisa.fr>
Greg Menke wrote:
> Sometime last Fall I asked a question about how CL might be used to
> create a macro assembler for a microcontroller.  I finally got around
> to giving it a try and I think its working.  Its not entirely
> complete, there are more instructions & directives to implement and
> things related to generating binary images to do, but I wanted to
> sanity check some of the basic design.  The assembler is working now
> with the instructions I've implemented so far.  Hopefully this is
> enough detail to suggest if I'm doing things right or not, which I
> hope you might comment on.

I think that defining sexp syntax for another programming language is
the right way to go.  Sort of defining the meta-language to program in
a target language.

Re-using CL facilities like (read), (eval) and macros is very cool too.

Can you walk through a very simple example, to show all the steps?

Can you post your code somewhere?


ctu.

cturle @ irisa fr
From: Greg Menke
Subject: Re: Assembler in Common Lisp
Date: 
Message-ID: <m34qnpjtgo.fsf@europa.pienet>
Christophe Turle <······@nospam.fr> writes:
> 
> I think that defining sexp syntax for an other programming language
> is the right way to go. Sort of defining the meta-language to
> program in a target language.
> 
> re-using CL facilities like (read) (eval) and macros is very cool
> too.
> 
> Can you decompose a very simple example ? to see all steps.
> 
> Can you post somewhere your code ?
> 

When it's a bit more complete, I will do so.  I have an object dump
working, but I need to implement the rest of the instructions, more
directives to control linking and the output image structure, and also
routines to emit the binary image in a useful form.  Shouldn't take
much longer; Lisp does the difficult stuff for me.

Gregm