I'm trying to write my first non-trivial macro, and it's had me stumped
for about 3 weeks. Can anyone help?
I contacted the list a while back about implementing operators to do array
arithmetic. I got that working (see www.cpom.org/nlisp/index.html), but I
want to re-do them as macros instead of functions and I'm struggling.
I want to write expressions like the following. (Note: (.iseq a b) is a
function that returns an array of the numbers between a and b. I'm just
using .iseq as an example of something that will return an array; it's not
important to the problem.)
(.+ (.iseq 1 10) 3d0 (.iseq 11 20))
And I want .+ to be a macro that will expand the above to something like:
(let ((r (make-array 10 :element-type '(complex double-float)))
      (a (.iseq 1 10))
      (b 3d0)
      (c (.iseq 11 20)))
  (dotimes (i 10)
    (setf (aref r i) (+ (aref a i) b (aref c i))))
  r)
(Of course, for the real code, I should use gensyms, and I'll have a load
of type declarations etc. for efficiency.)
One advantage of using a macro is that I want to get the array size at
compile time to pass to make-array and for the dotimes loop. This will
save doing them at run time and will speed it up no end. I can't think of
any reason why this shouldn't be possible.
The macro has to look at each argument that I pass and, in the last line,
insert either (aref x i) or just x, depending on the 'type' of the
argument (whether it's an array or a scalar).
Here's the problem. If I have a function called .+ and call it like this:
(.+ (.iseq 1 10) 3d0)
lisp will evaluate the call to .iseq and pass to .+ an array and 3d0. With
a macro call, the .iseq will not be evaluated, so the macro will get the
expression (.iseq 1 10) and not the value.
So, if I don't have the values of the arguments, how can I find the type
and array size? Or to put it another way, I need to evaluate the arguments
to the macro to expand it.
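To make the failure mode concrete, here is a toy macro (SHOW-ARGS is a
made-up name, not part of nlisp) demonstrating that a macro only ever sees
the argument forms, never their values:

```lisp
;; At macroexpansion time, ARGS is a list of unevaluated forms.
(defmacro show-args (&rest args)
  `(quote ,args))

;; (show-args (+ 1 2) 5) evaluates to the list ((+ 1 2) 5):
;; the macro received the form (+ 1 2), not the value 3.
```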
My final macro will probably take a function as the first argument, so I
can define functions like .+ at compile time with any array dimensions and
combination of array and scalar arguments. Or even pass a lambda
expression and some arguments. Any function that operates on scalar
arguments will do, and the macro writes the appropriate loop. This is all
easy if I can just solve the current problem.
If anyone can suggest a solution or offer any help, I would be very
grateful. Maybe my whole approach is wrong?
> So, if I don't have the values of the arguments, how can I find the
> type and array size? Or to put it another way, I need to evaluate
> the arguments to the macro to expand it.
You've hit on some key issues here. The macro will have no way of
knowing the array size, because the array doesn't yet exist at compile
time. Can you pass the size in as an argument to the macro? Or can it
be declared with defconstant?
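For instance, here is a minimal sketch of a .+ variant that takes the
length as a literal argument and a marker for which arguments are arrays
(.+/n and the :array tag are made-up names for illustration, not part of
nlisp):

```lisp
;; Hypothetical sketch: the caller supplies the length N as a literal;
;; arguments wrapped in (:array ...) are indexed with AREF, everything
;; else is treated as a scalar.
(defmacro .+/n (n &rest args)
  (let ((i (gensym "I"))
        (r (gensym "R"))
        (bindings '())
        (terms '()))
    (dolist (arg args)
      (let ((var (gensym "A")))
        (cond ((and (consp arg) (eq (first arg) :array))
               (push (list var (second arg)) bindings)
               (push `(aref ,var ,i) terms))
              (t
               (push (list var arg) bindings)
               (push var terms)))))
    `(let (,@(nreverse bindings))
       (let ((,r (make-array ,n)))
         (dotimes (,i ,n ,r)
           (setf (aref ,r ,i) (+ ,@(nreverse terms))))))))

;; (.+/n 3 (:array #(1 2 3)) 10) => #(11 12 13)
```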
Can the scalar appear anywhere in the arg list, or only in a
predictable position? And can there be more than one scalar? I'm
imagining a private language like
(.+ seq0 'scalar 3d0 seq1 seq2 'scalar (complex 3 4) seq3)
I enjoy this kind of question--how much speed can you buy with tricks
at compile time. But wiser heads would warn against premature
optimization. "The first rule of optimization is, don't"--meaning one
should use profiling to find the *real* bottlenecks in speed and space,
and optimize only them. For instance, I'm not sure how much time it
would save to know the array size at compile time. Twenty calls to +
of complex double-floats probably dominates one call to length for
arrays.
> You've hit on some key issues here. The macro will have no way of
> knowing the array size, because the array doesn't yet exist at compile
> time. Can you pass the size in as an argument to the macro? Or can it
> be declared with defconstant?
Yes. I see your point, but...
The array doesn't exist at compile time, but the code that will create
the array exists at compile time (for any program you could write I
think). I think the information exists at compile time, but I'm not sure
how to get to it.
I'm not sure whether this holds for interpreted expressions typed in at
the top level, though.
> Can the scalar appear anywhere in the arg list, or only in a
> predictable position? And can there be more than one scalar? I'm
> imagining a private language like
> (.+ seq0 'scalar 3d0 seq1 seq2 'scalar (complex 3 4) seq3)
Hmmm. The idea is to be able to write numerical expressions simply and
concisely. For example:
(.sum (.exp (.* i 2d0 pi f0 (./ X N))))
where x here is an array. My nlisp project is to use CL as a basis for an
IDL/Matlab type language by implementing array operators and functions.
I'm using gnuplot for plotting and a FFI to GSL for the numerical
library.
You may be right about the optimization. I'll do some timing tests. The
CMUCL compiler does give a warning, though, about being unable to optimize
because it doesn't know the dimensions of the array. If it's not a
problem, maybe I can keep it as a function, but I don't see an obvious
way of doing this because I still need to generate a loop with the correct
positions of x and (aref x i). (I do need specific positions so I can
write any expression). At the moment, array operators in nlisp only take 2
arguments, which is a pain.
David Wallis <···@dwws-computer.local> writes:
[...]
> The array doesn't exist at compile time, but the code that will
> create the array exists at compile time (for any program you could
> write I think). I think the information exists at compile time, but
> I'm not sure how to get to it.
It exists, but I suspect only in the sense that if you evaluate the
code you can get the result.
How about having the macros evaluate the expression twice: once to get
size and type information, and again to construct code to evaluate the
expression? (Alternatively you could think of the macro constructing
an expression tree, annotating it with size and type information, and
then walking it to produce code.)
I think that would work for arrays of constant size (constant known at
compile time). I don't think you'd be able to use it conveniently
with functions (so I suspect users would need to use macros a lot).
I think it ought to work, though?
Bruce Stephens <············@cenderis.demon.co.uk> writes:
[...]
> I think it ought to work, though?
No, now I think about it, that doesn't make sense.
I think the basic idea would work: if things like .+, .sum, etc., were
functions which constructed some representation of the expression,
then you could build a macro that would produce code that would
evaluate the expression (having calculated array sizes and types).
That might be inconvenient to use, though. Maybe it wouldn't be worth
the effort.
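The representation-building idea can be sketched like this; AEXPR, .+* and
REALIZE are hypothetical names for illustration, not nlisp's actual API:

```lisp
;; Operators are ordinary functions that return a small "expression"
;; object carrying the length, so sizes are known before any element
;; is computed.  At least one argument must be an array in this sketch.
(defstruct aexpr len fn)   ; FN maps an index to an element

(defun as-aexpr (x)
  (if (vectorp x)
      (make-aexpr :len (length x) :fn (lambda (i) (aref x i)))
      (make-aexpr :len nil :fn (lambda (i) (declare (ignore i)) x))))

(defun .+* (&rest args)
  (let* ((exprs (mapcar #'as-aexpr args))
         (len (some #'aexpr-len exprs)))  ; first known array length
    (make-aexpr :len len
                :fn (lambda (i)
                      (reduce #'+ exprs
                              :key (lambda (e) (funcall (aexpr-fn e) i)))))))

(defun realize (e)
  ;; Only here, once sizes are known, is any array allocated.
  (let ((r (make-array (aexpr-len e))))
    (dotimes (i (aexpr-len e) r)
      (setf (aref r i) (funcall (aexpr-fn e) i)))))

;; (realize (.+* #(1 2 3) 10 #(100 200 300))) => #(111 212 313)
```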
David Wallis <···@dwws-computer.local> writes:
> > You've hit on some key issues here. The macro will have no way of
> > knowing the array size, because the array doesn't yet exist at compile
> > file. Can you pass the size in as an argument to the macro? Or can it
> > be declared with defconstant?
>
> Yes. I see your point, but...
>
> The array doesn't exist at compile time, but the code that will
> create the array exists at compile time (for any program you could
> write I think). I think the information exists at compile time, but
> I'm not sure how to get to it.
In general, the fastest way to get the result given the code is to
execute the code. Oops, it's not compilation time anymore!
In practice, you can get good results with type inference or type
declarations.
> I'm not sure if this holds for interpreted expressions typed in at the
> top level?
Well, do you know many C++ implementations with any kind of "top
level" (REPL)?
> Hmmm. The idea is to be able to write numerical expressions simply
> and concisely. For example:
>
> (.sum (.exp (.* i 2d0 pi f0 (./ X N))))
>
> where x here is an array. My nlisp project is to use CL as a basis for
> an IDL/Matlab type language by implementing array operators and
> functions. I'm using gnuplot for plotting and a FFI to GSL for the
> numerical library.
Perhaps you could have a look at Maxima...
--
__Pascal Bourguignon__ http://www.informatimago.com/
Nobody can fix the economy. Nobody can be trusted with their finger
on the button. Nobody's perfect. VOTE FOR NOBODY.
"David Wallis" <···@dwws-computer.local> schrieb im Newsbeitrag > You may be
right about the optimization. I'll do some timing tests. The
> CMUCL compiler does give a warning 'though about being unable to optimize
> because it doesn't know the dimensions of the array. If it's not a
> problem, maybe I can keep it as a function, but I don't see an obvious
Unfortunately I can't remember the name of the paper describing how to do
this. Something like "type specialisation", I think.
Basically, the idea is that you generate separate code for each possible
number of dimensions, i.e. code for 1-D, 2-D, 3-D, etc.
At run time you branch to the correct version, but you do this as early as
possible (outside of any loops, for example). That is, you do type
dispatching at run time to dispatch to the correct code. Something like:
  if array1 is 3-D and array2 is 2-D,
  then call array-2d-3d-multiply
where array-2d-3d-multiply is compiled to be especially fast for this case.
Because the dispatching is very quick compared to the cost of the
calculations, you get a good speed-up.
You can use macros to generate the many different versions of the routines
that you need.
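A sketch of that dispatch-then-specialize pattern (all names here are made
up, and I've specialized on element type rather than rank to keep it
short):

```lisp
;; A macro stamps out one tight, declared loop per element type...
(defmacro define-add2 (name elt-type)
  `(defun ,name (a b)
     (declare (type (simple-array ,elt-type (*)) a b))
     (let ((r (make-array (length a) :element-type ',elt-type)))
       (dotimes (i (length a) r)
         (setf (aref r i) (+ (aref a i) (aref b i)))))))

(define-add2 add2/double double-float)
(define-add2 add2/single single-float)

;; ...and a front-end function branches once, outside the loop.
(defun add2 (a b)
  (typecase a
    ((simple-array double-float (*)) (add2/double a b))
    ((simple-array single-float (*)) (add2/single a b))
    (t (map 'vector #'+ a b))))     ; generic fallback
```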
Rene.
David Wallis <···@dwws-computer.local> writes:
> Hmmm. The idea is to be able to write numerical expressions simply and
> concisely. For example:
>
> (.sum (.exp (.* i 2d0 pi f0 (./ X N))))
>
> where x here is an array. My nlisp project is to use CL as a basis for an
> IDL/Matlab type language by implementing array operators and functions. I'm
> using gnuplot for plotting and a FFI to GSL for the numerical library.
I suggest allowing these operators to be generic functions. Have a look at
Matlisp (a CL interface to LAPACK), and at how Femlisp (a toolbox for
solving partial differential equations) does runtime compilation to get
reasonable performance for long vectors.
Nicolas.
David Wallis <···@dwws-computer.local> wrote:
> Maybe my whole approach is wrong?
Perhaps yes. Macro arguments are not evaluated, because this is in general
not possible at compile time, and macros are expanded at compile time (or
read time? I don't know). For example:
(defun get-array-or-integer-depending-on-user-input ()
  ...)
(my-macro (get-array-or-integer-depending-on-user-input)) ?
--
Frank Buß, ··@frank-buss.de
http://www.frank-buss.de, http://www.it4-systems.de
David Wallis <···@dwws-computer.local> writes:
> Maybe my whole approach is wrong?
Yes. Ponder macro-expansion time vs. run-time, and think about the
difference between static typing and dynamic typing.
You'd need a compiler for a sublanguage with type inference or type
declarations (and lose the dynamic typing) to be able to know at
compilation time the size of the intermediate and final results.
How would you compile:
(defun f (x) (make-array (list x x) :initial-element 1))
(.+ (f (random (progn (princ "Please enter a positive integer:") (read)))) 2)
If you accept types such as (matrix * *) in this language, you'll be
back to the starting case: lisp.
Once you've limited your language to types of predefined size, you'll
still have to define the time when allocation will be done. At the
beginning of a function? Of the program? Within the loops? Which
means that you'll need to analyse the program _globally_. So you
won't be able even to just analyse one expression (what if it's inside
a loop?), but you'll have to analyse whole files.
--
__Pascal Bourguignon__ http://www.informatimago.com/
Wanna go outside.
Oh, no! Help! I got outside!
Let me back inside!
David Wallis strains to move things to compile time
> And I want .+ to be a macro that will expand the above to something like:
>
> (let ((r (make-array 10 :element-type '(complex double-float)))
>       (a (.iseq 1 10))
>       (b 3d0)
>       (c (.iseq 11 20)))
>   (dotimes (i 10)
>     (setf (aref r i) (+ (aref a i) b (aref c i))))
>   r)
>
> (Of course, for the real code, I should use gensyms, and I'll have a load
> of type declarations etc. for efficiency.)
>
> One advantage of using a macro is that I want to get the array size at
> compile time to pass to make-array and for the dotimes loop. This will
> save doing them at run time and will speed it up no end. I can't think of
> any reason why this shouldn't be possible.
Thinking back to my assembler writing days I cannot see that
knowing the array size at compile time is going to make a
difference.
The core of the code will still end up something like this
  label loop
    load  reg1 indirect addr-reg-x
    load  reg2 indirect addr-reg-y with post-increment
    call  subroutine foo                  ; f(reg1,reg2) -> reg1
    store reg1 indirect addr-reg-x with post-increment
    compare reg3 with addr-reg-x
    branch-greater loop                   ; not there yet
Earlier, reg3 was initialised, perhaps by
    add immediate compile-time-length reg1 -> reg3
or by
    add indirect addr-reg-x offset -1 reg1 reg3
depending on whether the size is known at compile time.
There is no great speed gain to be won here.
I also think that the problem of the macro not being able
to find the size at compile time is not soluble in CL.
I think the two issues are related: the language lacks a
feature that doesn't actually help.
Alan Crowe
Edinburgh
Scotland
David Wallis <···@dwws-computer.local> writes:
> One advantage of using a macro is that I want to get the array size at
> compile time to pass to make-array and for the dotimes loop.
So you want .+ to know the array size at compile time.
> (.+ (.iseq 1 10) 3d0)
Then you need a way to detect at compile time that the form
(.iseq 1 10) would return a vector of length 10. You could
define a function for this purpose:
(defun form-vector-length (form &optional environment)
  "Find the length of the vector that FORM would return.
Return a number if FORM is known to return a vector of that length.
Return :SCALAR if FORM is known not to return an array,
or :UNKNOWN if the type is not known or might vary between calls."
  ...)
Then the .+ macro can call
(form-vector-length '(.iseq 1 10) #<environment>).
(Actually, the semantics might be better if .+ were a function with a
compiler macro.)
I think the easiest way to make FORM-VECTOR-LENGTH work is to put
an analyzer function on the property list of the .ISEQ symbol.
FORM-VECTOR-LENGTH would look this function up and call it,
passing it the whole form (.iseq 1 10) and the environment as
arguments. The function would then analyze the form in .ISEQ
specific ways and return the appropriate value.
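One possible fleshing-out of this, as a sketch only (the environment
argument is ignored here, and the VECTOR-LENGTH-ANALYZER property name is
made up):

```lisp
;; Dispatch on an analyzer function stored on the operator symbol's
;; property list; fall back to :UNKNOWN when nothing is registered.
(defun form-vector-length (form &optional environment)
  (declare (ignore environment))
  (cond ((numberp form) :scalar)
        ((and (consp form) (symbolp (car form)))
         (let ((analyzer (get (car form) 'vector-length-analyzer)))
           (if analyzer (funcall analyzer form) :unknown)))
        (t :unknown)))

;; Register an analyzer for .ISEQ: (.iseq a b) has (1+ (- b a))
;; elements when both bounds are literal integers.
(setf (get '.iseq 'vector-length-analyzer)
      (lambda (form)
        (destructuring-bind (op a b) form
          (declare (ignore op))
          (if (and (integerp a) (integerp b))
              (1+ (- b a))
              :unknown))))

;; (form-vector-length '(.iseq 1 10)) => 10
;; (form-vector-length 3)             => :SCALAR
```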
There are two flaws in this approach:
* Because property lists are global, you cannot have analyzers
for functions that were bound with FLET or LABELS. This can be
partially fixed by collecting lexical bindings of analyzers in
a separate data structure, publishing it with MACROLET, and
accessing it with either MACROEXPAND-1 or MACRO-FUNCTION.
However, this still doesn't make FLET and LABELS automatically
shadow the analyzers of outer bindings; for that, you'd need
non-standard environment access functions, I think.
* It is not possible to write portable analyzers for binding
forms such as LET. In (.+ (let ((x 0)) (.+ (incf x)))), the
analyzer for LET would have to analyze the form (.+ (incf x))
in an environment that includes a variable binding for X.
However, there is no portable way to construct such an
environment, so you'd have to use your own object in its
place... but then the inner .+ would get in trouble, because
it would potentially need to expand the INCF macro, and
MACROEXPAND wouldn't understand your custom environment object.
The environment access functions ought to help here, too.
Fortunately, FORM-VECTOR-LENGTH can always return :UNKNOWN in
such situations, so the problem is not fatal.
It may also be possible to avoid all this trouble and instead
expand .+ to something that gives a sufficiently smart compiler
enough information to let it do the job on its own. I don't know
how this would be done in practice.
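For example, the expansion could simply declare the literal size and
element type and let the compiler do the rest. Here is a hand-written
sketch of what such an expansion might look like (SUM10 is a made-up
name):

```lisp
;; With the length and element type declared as literals, a compiler
;; like CMUCL's can open-code the arithmetic and skip bounds checks.
(defun sum10 (a b)
  (declare (type (simple-array double-float (10)) a b)
           (optimize (speed 3)))
  (let ((r (make-array 10 :element-type 'double-float)))
    (dotimes (i 10 r)
      (setf (aref r i) (+ (aref a i) (aref b i))))))
```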