In a tight Lispworks loop which involves array references, disassembling the
code shows that aref is being called, and each call is being set up, such
that each aref takes many times longer than it could take.
Is there any easy way to get the speed effect in Lisp of doing simple array
referencing in C/C++ such as x+=y[i] etc.? E.g., is there an inline version
of aref, or any way to make it inline?
By the way, the answers I have been getting in this forum have been very
helpful, and I hope a lot of other people reading them are finding them
helpful too. Someone should go through these discussions sometime and write
a reference book from them, or maybe put them in some kind of specialized
search engine to find and browse discussions related to a particular
question.
>>>>> "lispnewbie" == lispnewbie <··········@questions.com> writes:
lispnewbie> In a tight Lispworks loop which involves array references, disassembling the
lispnewbie> code shows that aref is being called, and each call is being set up, such
lispnewbie> that each aref takes many times longer than it could take.
lispnewbie> Is there any easy way to get the speed effect in Lisp of doing simple array
lispnewbie> referencing in C/C++ such as x+=y[i] etc.? E.g., is there an inline version
lispnewbie> of aref, or any way to make it inline?
Can't say specifically for Lispworks, but you should at least say what
type of array you're working on. CMUCL can inline aref if the array
is a 1-dimensional simple-array of "good" types like fixnum,
(signed-byte 32), single-float, double-float, etc. Otherwise, aref
can't be inlined, because the compiler doesn't know enough about the
array to inline the call.
Ray
On 03 Jun 2001 15:13:04 -0400, Raymond Toy <···@rtp.ericsson.se> wrote:
>Can't say specifically for Lispworks, but you should at least say what
>type of array you're working on. CMUCL can inline aref if the array
Ok, I posted some example code a few minutes ago, but my real question is
more general. In general, is there a way to explicitly tell the Lispworks
compiler to make a built-in function inline? Or can it only be done by
telling to you want speed, etc., and hoping it can figure out that making the
function inline would be helpful? And if that is the case, does it actually
do it for any specific functions? If so, what are they? And what is the
best way to find this general kind of information? Are there any specific
manuals which go into such details?
In article <··································@4ax.com>,
<··········@questions.com> wrote:
>Ok, I posted some example code a few minutes ago, but my real question is
>more general. In general, is there a way to explicitly tell the Lispworks
>compiler to make a built-in function inline? Or can it only be done by
>telling to you want speed, etc., and hoping it can figure out that making the
>function inline would be helpful?
No, there's no way to request that a particular built-in function be
open-coded; what if it doesn't have the information available to open-code
that function? In general, if the implementation knows how to open-code a
function call, it will do it automatically unless you've told it not to do
so (with a NOTINLINE declaration).
> And if that is the case, does it actually
>do it for any specific functions? If so, what are they? And what is the
>best way to find this general kind of information? Are there any specific
>manuals which go into such details?
I think you'll probably find that just about every implementation
open-codes low-level functions like CONS, CAR, and CDR. I don't think most
implementation document this level of detail; if you want to know whether a
particular function is inlined, disassemble a function that calls it.
--
Barry Margolin, ······@genuity.net
Genuity, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.
Barry Margolin <······@genuity.net> writes:
> No, there's no way to request that a particular built-in function be
> open-coded; what if it doesn't have the information available to open-code
> that function? In general, if the implementation knows how to open-code a
> function call, it will do it automatically unless you've told it not to do
> so (with a NOTINLINE declaration).
I think I would have said:
There's no way to request that a particular
function be "open-codable", since that is done by placing
an INLINE declaration BEFORE the time of definition, and since built-ins
are, by definition, things whose definition time has passed. There IS a way
to request that a particular function be "open-coded", in case the system
definers made the right declaration (or equivalent) earlier in order to
follow up on such a thing.
> I think you'll probably find that just about every implementation
> open-codes low-level functions like CONS, CAR, and CDR. I don't think most
> implementation document this level of detail; if you want to know whether a
> particular function is inlined,
[in a particular circumstance]
> disassemble a function that calls it.
Some compilers provisionally open-code things, so it won't tell you reliably
for the general case. Ask the implementor if you really want to be sure.
In article <···············@world.std.com>,
Kent M Pitman <······@world.std.com> wrote:
>Barry Margolin <······@genuity.net> writes:
>
>> No, there's no way to request that a particular built-in function be
>> open-coded; what if it doesn't have the information available to open-code
>> that function? In general, if the implementation knows how to open-code a
>> function call, it will do it automatically unless you've told it not to do
>> so (with a NOTINLINE declaration).
>
>I think I would have said:
>
>There's no way to request that a particular
>function be "open-codable", since that is done by placing
>an INLINE declaration BEFORE the time of definition, and since built-ins
>are, by definition, things whose definition time has passed. There IS a way
>to request that a particular function be "open-coded", in case the system
>definers made the right declaration (or equivalent) earlier in order to
>follow up on such a thing.
How do you request that a particular function be open-coded? I know you
can request that it *not* be open-coded, by using NOTINLINE. Are you
suggesting that he use INLINE explicitly on a built-in function?
My expectation (I don't think the standard actually says so, but in
practice I suspect I'm right) is that if the implementor made a function
open-codABLE, they default to making it open-codED, so such a declaration
is unnecessary (except to undo an earlier NOTINLINE declaration). Does
anyone actually suggest that people put the following in their code, just
in case the implementor chose the wrong default?
(declaim (inline car cdr cons + - * /))
The standard may not require implementations to open-code these functions,
but market forces generally do.
--
Barry Margolin, ······@genuity.net
Genuity, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.
Barry Margolin <······@genuity.net> writes:
> In article <···············@world.std.com>,
> Kent M Pitman <······@world.std.com> wrote:
> >Barry Margolin <······@genuity.net> writes:
> >
> >> No, there's no way to request that a particular built-in function be
> >> open-coded; what if it doesn't have the information available to open-code
> >> that function? In general, if the implementation knows how to open-code a
> >> function call, it will do it automatically unless you've told it not to do
> >> so (with a NOTINLINE declaration).
> >
> >I think I would have said:
> >
> >There's no way to request that a particular
> >function be "open-codable", since that is done by placing
> >an INLINE declaration BEFORE the time of definition, and since built-ins
> >are, by definition, things whose definition time has passed. There IS a way
> >to request that a particular function be "open-coded", in case the system
> >definers made the right declaration (or equivalent) earlier in order to
> >follow up on such a thing.
>
> How do you request that a particular function be open-coded? I know you
> can request that it *not* be open-coded, by using NOTINLINE. Are you
> suggesting that he use INLINE explicitly on a built-in function?
I think I'm suggesting he could.. unless someone finds a passage saying
you can't do that. I don't know that making an INLINE declarationw ould
do much good, for reasons you cite, but it doesn't seem to me to cause lots
of harm. It probably already is INLINE, as you note.
> My expectation (I don't think the standard actually says so, but in
> practice I suspect I'm right) is that if the implementor made a function
> open-codABLE, they default to making it open-codED, so such a declaration
> is unnecessary (except to undo an earlier NOTINLINE declaration).
Yes, but it might have been disabled subsequent to that. Not only by the
system folks, but by some "helpful" other user-package, including your own.
> Does
> anyone actually suggest that people put the following in their code, just
> in case the implementor chose the wrong default?
>
> (declaim (inline car cdr cons + - * /))
Well, I personally don't recommend EVER proclaiming or declaiming a system
function to be NOTINLINE, but it could happen. I would only do it by DECLARE
within a known lexical region, so it doesn't overflow into someone else's
code, forcing a need for a defensive INLINE. Normally I'd expect such an
INLINE to be superfluous, but ...
> The standard may not require implementations to open-code these functions,
> but market forces generally do.
A more interesting case isn't + or - but MAPCAR. It's a toss-up whether
to open-code MAPCAR, I bet, because of all the code bloat. I can imagine
an implementation not doing so in general. But there may be cases where
it's a good idea, and doing
(locally (declare (inline mapcar))
(mapcar ...))
might be helpful in some implementations, I would think. Then again, the
system might already do it based on speed > space or some other such thing.
But it's beyond the definition of the spec to know how the system actually
decides. Seems to me the INLINE declaration is the most perspicuous, direct
way to say what you want.
Kent M Pitman <······@world.std.com> writes:
> Barry Margolin <······@genuity.net> writes:
[...]
> > How do you request that a particular function be open-coded? I know you
> > can request that it *not* be open-coded, by using NOTINLINE. Are you
> > suggesting that he use INLINE explicitly on a built-in function?
>
> I think I'm suggesting he could.. unless someone finds a passage saying
> you can't do that. I don't know that making an INLINE declarationw ould
> do much good, for reasons you cite, but it doesn't seem to me to cause lots
> of harm. It probably already is INLINE, as you note.
CMU CL has an ext:maybe-inline declaration, which tells the compiler
to record the expansion of a function, but doesn't cause actual inline
expansion to occur, unless the space optimization quality in effect at
the expansion site is 0, or the function is subsequently/locally
declared inline. Various built-in functions are declared
maybe-inline, and hence the user can cause inline expansion of those
to happen by locally declaring them inline.
Things like digit-char-p, maphash, read-char, unread-char, read-byte
are declared ext:maybe-inline, and there have been requests to declare
more functions ext:maybe-inline, though one has to be careful not do
overdo something like this.
> A more interesting case isn't + or - but MAPCAR. It's a toss-up whether
> to open-code MAPCAR, I bet, because of all the code bloat. I can imagine
> an implementation not doing so in general. But there may be cases where
> it's a good idea, and doing
> (locally (declare (inline mapcar))
> (mapcar ...))
> might be helpful in some implementations, I would think. Then again, the
> system might already do it based on speed > space or some other such thing.
> But it's beyond the definition of the spec to know how the system actually
> decides. Seems to me the INLINE declaration is the most perspicuous, direct
> way to say what you want.
maphash is an example of CMU CL enabling such a thing. mapcar isn't
declared ext:maybe-inline, because mapcar is always open-coded in CMU
CL via a source-transform.
Regs, Pierre.
--
Pierre R. Mai <····@acm.org> http://www.pmsf.de/pmai/
The most likely way for the world to be destroyed, most experts agree,
is by accident. That's where we come in; we're computer professionals.
We cause accidents. -- Nathaniel Borenstein
In article <··············@orion.bln.pmsf.de>,
Pierre R. Mai <····@acm.org> wrote:
>CMU CL has an ext:maybe-inline declaration, which tells the compiler
>to record the expansion of a function, but doesn't cause actual inline
>expansion to occur, unless the space optimization quality in effect at
>the expansion site is 0, or the function is subsequently/locally
>declared inline. Various built-in functions are declared
>maybe-inline, and hence the user can cause inline expansion of those
>to happen by locally declaring them inline.
Except for the fact that it conditionalizes on the space optimization
quality, this is essentially equivalent to:
(declaim (inline <func-name>))
(defun <func-name> ...)
(declaim (notinline <func-name>))
There's no portable way to get open coding conditionalized on optimization
settings.
--
Barry Margolin, ······@genuity.net
Genuity, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.
Barry Margolin <······@genuity.net> writes:
> In article <··············@orion.bln.pmsf.de>,
> Pierre R. Mai <····@acm.org> wrote:
> >CMU CL has an ext:maybe-inline declaration, which tells the compiler
> >to record the expansion of a function, but doesn't cause actual inline
> >expansion to occur, unless the space optimization quality in effect at
> >the expansion site is 0, or the function is subsequently/locally
> >declared inline. Various built-in functions are declared
> >maybe-inline, and hence the user can cause inline expansion of those
> >to happen by locally declaring them inline.
>
> Except for the fact that it conditionalizes on the space optimization
> quality, this is essentially equivalent to:
>
> (declaim (inline <func-name>))
> (defun <func-name> ...)
> (declaim (notinline <func-name>))
Yes, but it is more convenient, too, because you don't need two
declarations in different places.
> There's no portable way to get open coding conditionalized on optimization
> settings.
Indeed there isn't even portable access to the current optimization
settings, which would be nice to have, so that users could tailor
their compiler-macros accordingly, etc.
Regs, Pierre.
--
Pierre R. Mai <····@acm.org> http://www.pmsf.de/pmai/
The most likely way for the world to be destroyed, most experts agree,
is by accident. That's where we come in; we're computer professionals.
We cause accidents. -- Nathaniel Borenstein
In article <··············@orion.bln.pmsf.de>,
Pierre R. Mai <····@acm.org> wrote:
>Barry Margolin <······@genuity.net> writes:
>
>> In article <··············@orion.bln.pmsf.de>,
>> Pierre R. Mai <····@acm.org> wrote:
>> >CMU CL has an ext:maybe-inline declaration, which tells the compiler
>> >to record the expansion of a function, but doesn't cause actual inline
>> >expansion to occur, unless the space optimization quality in effect at
>> >the expansion site is 0, or the function is subsequently/locally
>> >declared inline. Various built-in functions are declared
>> >maybe-inline, and hence the user can cause inline expansion of those
>> >to happen by locally declaring them inline.
>>
>> Except for the fact that it conditionalizes on the space optimization
>> quality, this is essentially equivalent to:
>>
>> (declaim (inline <func-name>))
>> (defun <func-name> ...)
>> (declaim (notinline <func-name>))
>
>Yes, but it is more convenient, too, because you don't need two
>declarations in different places.
That can be fixed portably quite easily:
(defmacro defun-inlinable (name args &body body)
`(progn
(declaim (inline ,name))
(defun ,name ,args ,@body)
(declaim (notinline ,name))))
In fact, you could even do:
(defmacro defun-inlinable (name args &body body)
`(progn
(declaim (#+cmucl ext:maybe-inline #-cmucl inline ,name))
(defun ,name ,args ,@body)
#-cmucl (declaim (notinline ,name))))
Now, not only does it do what you want without having to have multiple
declarations, but it does CMUCL's cool thing automatically as well.
--
Barry Margolin, ······@genuity.net
Genuity, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.
Barry Margolin <······@genuity.net> writes:
> (defmacro defun-inlinable (name args &body body)
> `(progn
> (declaim (#+cmucl ext:maybe-inline #-cmucl inline ,name))
> (defun ,name ,args ,@body)
> #-cmucl (declaim (notinline ,name))))
I would have spelled this defun-inlineable. But because either form
is so visually painful, I recommend instead just avoiding the whole
issue and going with:
(defmacro defun-inline (name args &body body)
`(progn
(declaim (inline ,name))
(defun ,name ,args ,@body)))
(defmacro defun-maybe-inline (name args &body body)
`(progn
#+cmucl (declaim (ext:maybe-inline ,name))
#+cmucl (declaim (inline ,name))
(defun ,name ,args ,@body)
#-cmucl (declaim (notinline ,name))
',name))
(defmacro defun-notinline (name args &body body)
`(progn
(declaim (inline ,name))
(defun ,name ,args ,@body)
(declaim (notinline ,name))
',name))
Note also that purely as a style issue, I prefer to put the #+/#- in a
different position than Barry put them in order to make grep and/or
tags-search more likely to succeed with pattern substrings that
juxtapose the paren with the conditionalized item.
Barry Margolin <······@genuity.net> writes:
> That can be fixed portably quite easily:
>
> (defmacro defun-inlinable (name args &body body)
> `(progn
> (declaim (inline ,name))
> (defun ,name ,args ,@body)
> (declaim (notinline ,name))))
Yes, that's another way to achieve similar results, although it has
associated costs of function definitions not being as recognizable
anymore (it might also fool tools that work on the source code). Not
a high cost to pay, and probably one preferable to inventing
non-portable declarations.
Regs, Pierre.
--
Pierre R. Mai <····@acm.org> http://www.pmsf.de/pmai/
The most likely way for the world to be destroyed, most experts agree,
is by accident. That's where we come in; we're computer professionals.
We cause accidents. -- Nathaniel Borenstein
* ··········@questions.com
> By the way, the answers I have been getting in this forum have been very
> helpful, and I hope a lot of other people reading them are finding them
> helpful too. Someone should go through these discussions sometime and
> write a reference book from them, or maybe put them in some kind of
> specialized search engine to find and browse discussions related to a
> particular question.
This is actually a _very_ bad idea. First off, the answers still belong
to the authors, who have implicitly given their permission to be used and
distributed in the context of a questions-and-answers style newsgroup,
and not for any other purpose. If you look around at the intellectual
property discussions in the news media, you will find that journalists
and other contributors demand extra payment for any "reuse" of their
material. Since contributors are _not_ paid on the Net, abusing their
goodwill by "collecting" their contributions is very difficult legal
territory. Search engines are difficult enough to defend, but if anyone
made money specifically on such collections, expect expensive.lawsuits.
Second, the reason technical answers to technical questions work so well
in this particular context is that anyone can, and usually does, correct
any and all mistakes and problems in the answers. In other words, it
pays off to be imprecise and incorrect in order to produce the most lucid
answers. The most precise and correct an answer is, the less discussion
it leads to. The reason you see such high quality articles in this
newsgroup in particular is that those people who care the most about
technical precision and accuracy care enough to respond and clarify.
Quite often on USENET, eager newbies respond with answers that are not
actually _wrong_, but still misleading and uninformed, because the more
experienced users have seen the same question thousands of times, and
only step in to correct a misleading response from an eager newbie. Some
eager newbies are not quite ready for this, and insist that their answers
are correct. The higher quality the answers in the newsgroup, the fewer
eager newbies, but also the fewer general number of posters, because few
people have the guts to stand up in a crowd of people they believe know
the answer better than them and speak.
The answer to your implicit request is just to hang around here even
after you tire of the repeated simple questions you know the answer to
and cannot imagine that somebody else failed to pick up from the many
available and high-quality sources. Having people ask questions is one
thing -- but a newsgroup thrives only when enough fearless newbies post
wrong answers -- that is the only way that the question gets any _real_
answers that lead to _understanding_.
This necessarily means that USENET is _not_ an archivable medium, but a
forum that lives in the precarious balance between experts and newbies.
So instead of automated tools that violate people's willingness to share,
just collect your own insights and bring them to the next "generation" of
people who discover Common Lisp by answering the questions and correcting
the mistakes made by other newbies who got it wrong in their eagerness.
#:Erik
--
Travel is a meat thing.
··········@questions.com writes:
> In a tight Lispworks loop which involves array references, disassembling the
> code shows that aref is being called, and each call is being set up, such
> that each aref takes many times longer than it could take.
It would help to see the loop and generated code to see what, if any,
hints you gave the compiler regarding the type of the array and array
index and whether you were explicitly compiling for speed or for
safety.
>>>>> "lispnewbie" == lispnewbie <··········@questions.com> writes:
lispnewbie> On Sun, 03 Jun 2001 21:12:05 GMT, ·············@yahoo.com (Stephen J. Bevan)
lispnewbie> wrote:
>> It would help to see the loop and generated code to see what, if any,
>> hints you gave the compiler regarding the type of the array and array
>> index and whether you were explicitly compiling for speed or for
>> safety.
lispnewbie> (defun test1 (n)
lispnewbie> (declare (optimize speed (safety 0)
lispnewbie> (fixnum-safety 0) (debug 0)))
lispnewbie> (let ((v (make-sequence '(vector fixnum) 500000)))
lispnewbie> (dotimes (i n)
lispnewbie> (loop as e across v sum e))))
My guess is that Lispworks isn't smart enough to see that v is really
a (simple-array fixnum (5000000)), and thus can't inline aref. For
this example, CMUCL does inline the aref. However, CMUCL has other
deficiencies such as not knowing the types of the loop counters and
using generic arithmetic on them.
Ray
··········@questions.com writes:
> On Sun, 03 Jun 2001 21:12:05 GMT, ·············@yahoo.com (Stephen J. Bevan)
> wrote:
>
> >It would help to see the loop and generated code to see what, if any,
> >hints you gave the compiler regarding the type of the array and array
> >index and whether you were explicitly compiling for speed or for
> >safety.
>
> (defun test1 (n)
> (declare (optimize speed (safety 0)
> (fixnum-safety 0) (debug 0)))
> (let ((v (make-sequence '(vector fixnum) 500000)))
> (dotimes (i n)
> (loop as e across v sum e))))
There are obvious (to the initiated ;) ways in which you can improve
the type declarations in this function to allow LW to generate better
code:
- Although a sufficiently smart compiler might be able to deduce the
type of v by itself, most compilers aren't that clever, and hence it
pays to declare it specifically. Since you didn't do so, LW assumes
that v can be any kind of object, or at least any kind of array,
which includes non-simple arrays (like displaced arrays, arrays with
fill-pointers, adjustable arrays, etc.), arrays with other element
types (and hence other memory layouts), multi-dimensional arrays,
etc. To take care of all of these situations, aref will have to do
much work, and hence it doesn't pay of to inline that large amount
of code, so LW doesn't.
By adding a declaration of the form
(declare (type (simple-array fixnum (*)) v))
Or even more specific
(declare (type (simple-array fixnum (500000)) v))
You will allow LW to inline aref profitably, because it now knows
exactly what kind of array to expect, and can therefore open-code
just the relevant stuff (a couple of instructions instead of a
couple of hundred instructions).
The important speed-up here doesn't come from inlining/open-coding
itself, but rather from eliding quite a number of checks,
calculations, etc.
- Adding declarations for e and the sum of it (the sum of two fixnums
isn't necessarily a fixnum itself), you can speed up the summation
part quite a bit as well, if you can guarantee that the sum of all
elements still fits inside a fixnum.
- While we are at it, we might want to add declarations for n and i as
well.
This gives us the following function:
(defun test1 (n)
(declare (optimize speed (safety 0) (fixnum-safety 0) (debug 0))
(fixnum n))
(let ((v (make-sequence '(vector fixnum) 500000)))
(declare (type (simple-array fixnum (500000)) v))
(dotimes (i n)
(declare (fixnum i))
(loop as e of-type fixnum across v sum e of-type fixnum))))
Which sums 1,000,000,000 fixnums (invoked with n=2000) in 24.5s on an
AMD K6-2/550, or 24.5ns per fixnum/iteration, which comes out at
around 13.5 cycles per iteration.
On current CMU CL you will get down to 11.4ns or 6.25 cycles per
iteration (and you might safe yourself a couple of the declarations,
because CMU CL's compiler does a slightly better job at infering type
declarations from other known types).
The inner-most loop looks like this in CMU CL:
89: L1: MOV ECX, [EDX+EBX+1]
8D: ADD EBX, 4
90: ADD ESI, ECX
92: L2: CMP EBX, EDI
94: JL L1
Regs, Pierre.
--
Pierre R. Mai <····@acm.org> http://www.pmsf.de/pmai/
The most likely way for the world to be destroyed, most experts agree,
is by accident. That's where we come in; we're computer professionals.
We cause accidents. -- Nathaniel Borenstein
On 04 Jun 2001 15:14:56 +0200, "Pierre R. Mai" <····@acm.org> wrote:
> (declare (type (simple-array fixnum (500000)) v))
Yes, declaring the array to be of fixnums turns out to be the key to making
this kind of code an order of magnitude faster in Lispworks and approximately
the same speed as C/C++.
Thanks for all of your very helpful answers to my "newbie" questions. I hope
a thousand lurking newbies are reading them and learning as much as I am from
them.
··········@questions.com writes:
> In a tight Lispworks loop which involves array references, disassembling the
> code shows that aref is being called, and each call is being set up, such
> that each aref takes many times longer than it could take.
>
In the case of your code, use svref instead of aref and Lispworks will
inline it.
--
Lieven Marchand <···@wyrd.be>
Making laws appears to win votes. Enforcing them doesn't.
See Rule One. Roger Burton West in the monastery.