I'm trying to write my first non-trivial macro, and it's had me stumped
for about 3 weeks. Can anyone help?
I contacted the list a while back about implementing operators to do array
arithmetic. I got that working (see www.cpom.org/nlisp/index.html), but I
want to re-do them as macros instead of functions and I'm struggling.
I want to write expressions like the following. (Note: (.iseq a b) is a
function that returns an array of the numbers between a and b. I'm just
using .iseq as an example of something that will return an array; it's not
important to the problem.)
(.+ (.iseq 1 10) 3d0 (.iseq 11 20))
And I want .+ to be a macro that will expand the above to something like:
(let ((r (make-array 10 :element-type '(complex double-float)))
      (a (.iseq 1 10))
      (b 3d0)
      (c (.iseq 11 20)))
  (dotimes (i 10)
    (setf (aref r i) (+ (aref a i) b (aref c i))))
  r)
(Of course, for the real code, I should use gensyms, and I'll have a load
of type declarations etc. for efficiency.)
One advantage of using a macro is that I want to get the array size at
compile time to pass to make-array and for the dotimes loop. This will
save doing them at run time and will speed it up no end. I can't think of
any reason why this shouldn't be possible.
The macro has to look at each argument that I pass and, in the last line,
insert either (aref x i) or just x, depending on the 'type' of the
argument (whether it's an array or a scalar).
Here's the problem. If I have a function called .+ and call it like this:
(.+ (.iseq 1 10) 3d0)
lisp will evaluate the call to .iseq and pass to .+ an array and 3d0. With
a macro call, the .iseq will not be evaluated, so the macro will get the
expression (.iseq 1 10) and not the value.
So, if I don't have the values of the arguments, how can I find the type
and array size? Or to put it another way, I need to evaluate the arguments
to the macro to expand it.
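To make the failure mode concrete, here is a toy macro (SHOW-ARGS is a
made-up name, not part of nlisp) demonstrating that a macro only ever sees
the argument forms, never their values:

```lisp
;; At macroexpansion time, ARGS is a list of unevaluated forms.
(defmacro show-args (&rest args)
  `(quote ,args))

;; (show-args (+ 1 2) 5) evaluates to the list ((+ 1 2) 5):
;; the macro received the form (+ 1 2), not the value 3.
```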
My final macro will probably take a function as the first argument, so I
can define functions like .+ at compile time with any array dimensions and
combination of array and scalar arguments. Or even pass a lambda
expression and some arguments. Any function that operates on scalar
arguments will do, and the macro writes the appropriate loop. This is all
easy if I can just solve the current problem.
If anyone can suggest a solution or offer any help, I would be very
grateful. Maybe my whole approach is wrong?
> So, if I don't have the values of the arguments, how can I find the
> type and array size? Or to put it another way, I need to evaluate
> the arguments to the macro to expand it.
You've hit on some key issues here. The macro will have no way of
knowing the array size, because the array doesn't yet exist at compile
time. Can you pass the size in as an argument to the macro? Or can it
be declared with defconstant?
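For instance, here is a minimal sketch of a .+ variant that takes the
length as a literal argument and a marker for which arguments are arrays
(.+/n and the :array tag are made-up names for illustration, not part of
nlisp):

```lisp
;; Hypothetical sketch: the caller supplies the length N as a literal;
;; arguments wrapped in (:array ...) are indexed with AREF, everything
;; else is treated as a scalar.
(defmacro .+/n (n &rest args)
  (let ((i (gensym "I"))
        (r (gensym "R"))
        (bindings '())
        (terms '()))
    (dolist (arg args)
      (let ((var (gensym "A")))
        (cond ((and (consp arg) (eq (first arg) :array))
               (push (list var (second arg)) bindings)
               (push `(aref ,var ,i) terms))
              (t
               (push (list var arg) bindings)
               (push var terms)))))
    `(let (,@(nreverse bindings))
       (let ((,r (make-array ,n)))
         (dotimes (,i ,n ,r)
           (setf (aref ,r ,i) (+ ,@(nreverse terms))))))))

;; (.+/n 3 (:array #(1 2 3)) 10) => #(11 12 13)
```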
Can the scalar appear anywhere in the arg list, or only in a
predictable position? And can there be more than one scalar? I'm
imagining a private language like
(.+ seq0 'scalar 3d0 seq1 seq2 'scalar (complex 3 4) seq3)
I enjoy this kind of question--how much speed can you buy with tricks
at compile time. But wiser heads would warn against premature
optimization. "The first rule of optimization is, don't"--meaning one
should use profiling to find the *real* bottlenecks in speed and space,
and optimize only them. For instance, I'm not sure how much time it
would save to know the array size at compile time. Twenty calls to +
of complex double-floats probably dominates one call to length for
arrays.
> You've hit on some key issues here. The macro will have no way of
> knowing the array size, because the array doesn't yet exist at compile
> time. Can you pass the size in as an argument to the macro? Or can it
> be declared with defconstant?
Yes. I see your point, but...
The array doesn't exist at compile time, but the code that will create
the array exists at compile time (for any program you could write I
think). I think the information exists at compile time, but I'm not sure
how to get to it.
I'm not sure whether this holds for interpreted expressions typed in at
the top level, though.
> Can the scalar appear anywhere in the arg list, or only in a
> predictable position? And can there be more than one scalar? I'm
> imagining a private language like
> (.+ seq0 'scalar 3d0 seq1 seq2 'scalar (complex 3 4) seq3)
Hmmm. The idea is to be able to write numerical expressions simply and
concisely. For example:
(.sum (.exp (.* i 2d0 pi f0 (./ X N))))
where x here is an array. My nlisp project is to use CL as a basis for an
IDL/Matlab type language by implementing array operators and functions.
I'm using gnuplot for plotting and a FFI to GSL for the numerical
library.
You may be right about the optimization. I'll do some timing tests. The
CMUCL compiler does give a warning, though, about being unable to optimize
because it doesn't know the dimensions of the array. If it's not a
problem, maybe I can keep it as a function, but I don't see an obvious
way of doing this because I still need to generate a loop with the correct
positions of x and (aref x i). (I do need specific positions so I can
write any expression). At the moment, array operators in nlisp only take 2
arguments, which is a pain.
David Wallis <···@dwws-computer.local> writes:
[...]
> The array doesn't exist at compile time, but the code that will
> create the array exists at compile time (for any program you could
> write I think). I think the information exists at compile time, but
> I'm not sure how to get to it.
It exists, but I suspect only in the sense that if you evaluate the
code you can get the result.
How about having the macros evaluate the expression twice: once to get
size and type information, and again to construct code to evaluate the
expression? (Alternatively you could think of the macro constructing
an expression tree, annotating it with size and type information, and
then walking it to produce code.)
I think that would work for arrays of constant size (constant known at
compile time). I don't think you'd be able to use it conveniently
with functions (so I suspect users would need to use macros a lot).
I think it ought to work, though?
Bruce Stephens <············@cenderis.demon.co.uk> writes:
[...]
> I think it ought to work, though?
No, now I think about it, that doesn't make sense.
I think the basic idea would work: if things like .+, .sum, etc., were
functions which constructed some representation of the expression,
then you could build a macro that would produce code that would
evaluate the expression (having calculated array sizes and types).
That might be inconvenient to use, though. Maybe it wouldn't be worth
the effort.
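The representation-building idea can be sketched like this; AEXPR, .+* and
REALIZE are hypothetical names for illustration, not nlisp's actual API:

```lisp
;; Operators are ordinary functions that return a small "expression"
;; object carrying the length, so sizes are known before any element
;; is computed.  At least one argument must be an array in this sketch.
(defstruct aexpr len fn)   ; FN maps an index to an element

(defun as-aexpr (x)
  (if (vectorp x)
      (make-aexpr :len (length x) :fn (lambda (i) (aref x i)))
      (make-aexpr :len nil :fn (lambda (i) (declare (ignore i)) x))))

(defun .+* (&rest args)
  (let* ((exprs (mapcar #'as-aexpr args))
         (len (some #'aexpr-len exprs)))  ; first known array length
    (make-aexpr :len len
                :fn (lambda (i)
                      (reduce #'+ exprs
                              :key (lambda (e) (funcall (aexpr-fn e) i)))))))

(defun realize (e)
  ;; Only here, once sizes are known, is any array allocated.
  (let ((r (make-array (aexpr-len e))))
    (dotimes (i (aexpr-len e) r)
      (setf (aref r i) (funcall (aexpr-fn e) i)))))

;; (realize (.+* #(1 2 3) 10 #(100 200 300))) => #(111 212 313)
```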
David Wallis <···@dwws-computer.local> writes:
> > You've hit on some key issues here. The macro will have no way of
> > knowing the array size, because the array doesn't yet exist at compile
> > file. Can you pass the size in as an argument to the macro? Or can it
> > be declared with defconstant?
>
> Yes. I see your point, but...
>
> The array doesn't exist at compile time, but the code that will
> create the array exists at compile time (for any program you could
> write I think). I think the information exists at compile time, but
> I'm not sure how to get to it.
In general, the fastest way to get the result given the code is to
execute the code. Oops, it's not compilation time anymore!
In practice, you can get good results with type inference or type
declarations.
> I'm not sure if this holds for interpreted expressions typed in at the
> top level?
Well, do you know many C++ implementations with any kind of "top
level" (REPL)?
> Hmmm. The idea is to be able to write numerical expressions simply
> and concisely. For example:
>
> (.sum (.exp (.* i 2d0 pi f0 (./ X N))))
>
> where x here is an array. My nlisp project is to use CL as a basis for
> an IDL/Matlab type language by implementing array operators and
> functions. I'm using gnuplot for plotting and a FFI to GSL for the
> numerical library.
Perhaps you could have a look at Maxima...
--
__Pascal Bourguignon__ http://www.informatimago.com/
Nobody can fix the economy. Nobody can be trusted with their finger
on the button. Nobody's perfect. VOTE FOR NOBODY.
"David Wallis" <···@dwws-computer.local> schrieb im Newsbeitrag > You may be
right about the optimization. I'll do some timing tests. The
> CMUCL compiler does give a warning 'though about being unable to optimize
> because it doesn't know the dimensions of the array. If it's not a
> problem, maybe I can keep it as a function, but I don't see an obvious
Unfortunately I can't remember the name of the paper describing how to do
this. Something like "type specialisation", I think.
Basically, the idea is that you generate separate code for each possible
number of dimensions, i.e. code for 1-D, 2-D, 3-D, etc.
At run time you branch to the correct version, but you do this as early as
possible (outside of any loops, for example). That is, you do type
dispatching at run time to dispatch to the correct code. Something like:
  if array1 is 3-D and array2 is 2-D,
  then call array-2d-3d-multiply
where array-2d-3d-multiply is compiled to be especially fast for this case.
Because the dispatching is very quick compared to the cost of the
calculations, you get a good speed-up.
You can use macros to generate the many different versions of the routines
that you need.
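A sketch of that dispatch-then-specialize pattern (all names here are made
up, and I've specialized on element type rather than rank to keep it
short):

```lisp
;; A macro stamps out one tight, declared loop per element type...
(defmacro define-add2 (name elt-type)
  `(defun ,name (a b)
     (declare (type (simple-array ,elt-type (*)) a b))
     (let ((r (make-array (length a) :element-type ',elt-type)))
       (dotimes (i (length a) r)
         (setf (aref r i) (+ (aref a i) (aref b i)))))))

(define-add2 add2/double double-float)
(define-add2 add2/single single-float)

;; ...and a front-end function branches once, outside the loop.
(defun add2 (a b)
  (typecase a
    ((simple-array double-float (*)) (add2/double a b))
    ((simple-array single-float (*)) (add2/single a b))
    (t (map 'vector #'+ a b))))     ; generic fallback
```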
Rene.
David Wallis <···@dwws-computer.local> writes:
> Hmmm. The idea is to be able to write numerical expressions simply and
> concisely. For example:
>
> (.sum (.exp (.* i 2d0 pi f0 (./ X N))))
>
> where x here is an array. My nlisp project is to use CL as a basis for an
> IDL/Matlab type language by implementing array operators and functions. I'm
> using gnuplot for plotting and a FFI to GSL for the numerical library.
I suggest allowing these operators to be generic functions. Have a look at
Matlisp (a CL interface to LAPACK), and at how Femlisp (a toolbox for
solving partial differential equations) does runtime compilation to get
reasonable performance for long vectors.
Nicolas.
David Wallis <···@dwws-computer.local> wrote:
> Maybe my whole approach is wrong?
Perhaps yes. Macro arguments are not evaluated, because this is in general
not possible at compile time, and macros are expanded at compile time (or
read time? I don't know). For example:
(defun get-array-or-integer-depending-on-user-input ()
  ...)
(my-macro (get-array-or-integer-depending-on-user-input)) ?
--
Frank Buß, ··@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de
David Wallis <···@dwws-computer.local> writes:
> Maybe my whole approach is wrong?
Yes. Ponder macro-expansion time vs. run-time, and think about the
difference between static typing and dynamic typing.
You'd need a compiler for a sublanguage with type inference or type
declarations (and lose the dynamic typing) to be able to know at
compilation time the size of the intermediate and final results.
How would you compile:
(defun f (x) (make-array (list x x) :initial-element 1))
(.+ (f (random (progn (princ "Please enter a positive integer:") (read)))) 2)
If you accept types such as (matrix * *) in this language, you'll be
back to the starting case: lisp.
Once you've limited your language to types of predefined size, you'll
still have to define the time when allocation will be done. At the
beginning of a function? Of the program? Within the loops? Which
means that you'll need to analyse the program _globally_. So you
won't be able even to just analyse one expression (what if it's inside
a loop?), but you'll have to analyse whole files.
--
__Pascal Bourguignon__ http://www.informatimago.com/
Wanna go outside.
Oh, no! Help! I got outside!
Let me back inside!
David Wallis strains to move things to compile time
> And I want .+ to be a macro that will expand the above to something like:
>
> (let ((r (make-array 10 :element-type '(complex double-float)))
>       (a (.iseq 1 10))
>       (b 3d0)
>       (c (.iseq 11 20)))
>   (dotimes (i 10)
>     (setf (aref r i) (+ (aref a i) b (aref c i))))
>   r)
>
> (Of course, for the real code, I should use gensyms, and I'll have a load
> of type declarations etc. for efficiency.)
>
> One advantage of using a macro is that I want to get the array size at
> compile time to pass to make-array and for the dotimes loop. This will
> save doing them at run time and will speed it up no end. I can't think of
> any reason why this shouldn't be possible.
Thinking back to my assembler writing days I cannot see that
knowing the array size at compile time is going to make a
difference.
The core of the code will still end up something like this
  label loop
    load  reg1 indirect addr-reg-x
    load  reg2 indirect addr-reg-y with post-increment
    call  subroutine foo                  ; f(reg1,reg2) -> reg1
    store reg1 indirect addr-reg-x with post-increment
    compare reg3 with addr-reg-x
    branch-greater loop                   ; not there yet
Earlier, reg3 was initialised, perhaps by
    add immediate compile-time-length reg1 -> reg3
or by
    add indirect addr-reg-x offset -1 reg1 reg3
depending on whether the size is known at compile time.
There is no great speed gain to be won here.
I also think that the problem of the macro not being able
to find the size at compile time is not soluble in CL.
I think the two issues are related: the language lacks a
feature that doesn't actually help.
Alan Crowe
Edinburgh
Scotland
David Wallis <···@dwws-computer.local> writes:
> One advantage of using a macro is that I want to get the array size at
> compile time to pass to make-array and for the dotimes loop.
So you want .+ to know the array size at compile time.
> (.+ (.iseq 1 10) 3d0)
Then you need a way to detect at compile time that the form
(.iseq 1 10) would return a vector of length 10. You could
define a function for this purpose:
(defun form-vector-length (form &optional environment)
  "Find the length of the vector that FORM would return.
Return a number if FORM is known to return a vector of that length.
Return :SCALAR if FORM is known not to return an array,
or :UNKNOWN if the type is not known or might vary between calls."
  ...)
Then the .+ macro can call
(form-vector-length '(.iseq 1 10) #<environment>).
(Actually, the semantics might be better if .+ were a function with a
compiler macro.)
I think the easiest way to make FORM-VECTOR-LENGTH work is to put
an analyzer function on the property list of the .ISEQ symbol.
FORM-VECTOR-LENGTH would look this function up and call it,
passing it the whole form (.iseq 1 10) and the environment as
arguments. The function would then analyze the form in .ISEQ
specific ways and return the appropriate value.
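One possible fleshing-out of this, as a sketch only (the environment
argument is ignored here, and the VECTOR-LENGTH-ANALYZER property name is
made up):

```lisp
;; Dispatch on an analyzer function stored on the operator symbol's
;; property list; fall back to :UNKNOWN when nothing is registered.
(defun form-vector-length (form &optional environment)
  (declare (ignore environment))
  (cond ((numberp form) :scalar)
        ((and (consp form) (symbolp (car form)))
         (let ((analyzer (get (car form) 'vector-length-analyzer)))
           (if analyzer (funcall analyzer form) :unknown)))
        (t :unknown)))

;; Register an analyzer for .ISEQ: (.iseq a b) has (1+ (- b a))
;; elements when both bounds are literal integers.
(setf (get '.iseq 'vector-length-analyzer)
      (lambda (form)
        (destructuring-bind (op a b) form
          (declare (ignore op))
          (if (and (integerp a) (integerp b))
              (1+ (- b a))
              :unknown))))

;; (form-vector-length '(.iseq 1 10)) => 10
;; (form-vector-length 3)             => :SCALAR
```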
There are two flaws in this approach:
* Because property lists are global, you cannot have analyzers
for functions that were bound with FLET or LABELS. This can be
partially fixed by collecting lexical bindings of analyzers in
a separate data structure, publishing it with MACROLET, and
accessing it with either MACROEXPAND-1 or MACRO-FUNCTION.
However, this still doesn't make FLET and LABELS automatically
shadow the analyzers of outer bindings; for that, you'd need
non-standard environment access functions, I think.
* It is not possible to write portable analyzers for binding
forms such as LET. In (.+ (let ((x 0)) (.+ (incf x)))), the
analyzer for LET would have to analyze the form (.+ (incf x))
in an environment that includes a variable binding for X.
However, there is no portable way to construct such an
environment, so you'd have to use your own object in its
place... but then the inner .+ would get in trouble, because
it would potentially need to expand the INCF macro, and
MACROEXPAND wouldn't understand your custom environment object.
The environment access functions ought to help here, too.
Fortunately, FORM-VECTOR-LENGTH can always return :UNKNOWN in
such situations, so the problem is not fatal.
It may also be possible to avoid all this trouble and instead
expand .+ to something that gives a sufficiently smart compiler
enough information to let it do the job on its own. I don't know
how this would be done in practice.
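For example, the expansion could simply declare the literal size and
element type and let the compiler do the rest. Here is a hand-written
sketch of what such an expansion might look like (SUM10 is a made-up
name):

```lisp
;; With the length and element type declared as literals, a compiler
;; like CMUCL's can open-code the arithmetic and skip bounds checks.
(defun sum10 (a b)
  (declare (type (simple-array double-float (10)) a b)
           (optimize (speed 3)))
  (let ((r (make-array 10 :element-type 'double-float)))
    (dotimes (i 10 r)
      (setf (aref r i) (+ (aref a i) (aref b i))))))
```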