This is an idle-curiosity question, and if there's a standard answer
someone can perhaps point me to it:
What is the overhead for having multiple-value-return in Lisp? Which
aspects of the current design are the most expensive?
The second part of the question is meant to elicit answers about aspects
such as the ability of a procedure to return different numbers of
arguments on different calls; the fact that there need be no
correspondence between the number of values the caller expects and the
number returned; and so forth.
The reason I ask is that there is a lot of traffic on this newsgroup
about new dialects of Lisp, and one thing I wish Lisp were better at was
the ability to compile into really efficient code.
--
-- Drew McDermott
Yale Computer Science Department
Drew McDermott <··················@at.yale.dot.edu> writes:
> This is an idle-curiosity question, and if there's a standard answer
> someone can perhaps point me to it:
>
> What is the overhead for having multiple-value-return in Lisp? Which
> aspects of the current design are the most expensive?
I don't know about a `standard' answer, but the design usually assumes
that expecting and returning single values is the `fast path'. The
return value has a canonical place to go (top of stack, distinguished
register, whatever).
If you want to return multiple values, though, there are issues. If
you have more than one value, you have to store the values while
computing the remaining ones. Once you have all the values, you have
to figure out if the caller wants them all (or just some of them),
then you have to move them to where the caller wants them.
All this requires either a bunch of stack shuffling or consing.
But as far as overhead... there is none in the normal case, and
only a little in the multiple value case.
Drew McDermott <··················@at.yale.dot.edu> writes:
> This is an idle-curiosity question, and if there's a standard answer
> someone can perhaps point me to it:
As others have said, this is an implementational issue, so the answer
is to ask your vendor.
> What is the overhead for having multiple-value-return in Lisp? Which
> aspects of the current design are the most expensive?
If code is lexically apparent, it should be able to be rewritten as
if it were simpler. Thus, for example, in Allegro CL, the following
lexically visible values become associated with the arguments to bar
and thus disappear:
CL-USER(1): (compile (defun foo ()
(declare (optimize speed (safety 0) (debug 0)))
(multiple-value-bind (a b)
(values 1000 2000)
(bar a b))))
Warning: While compiling these undefined functions were referenced: BAR.
FOO
NIL
NIL
CL-USER(2): (disassemble *)
;; disassembly of #<Function FOO>
;; formals:
;; constant vector:
0: BAR
;; code start: #x71645cbc:
0: b8 a0 0f 00 movl eax,$4000 ; 1000
00
5: 8b 5e 12 movl ebx,[esi+18] ; BAR
8: ba 40 1f 00 movl edx,$8000 ; 2000
00
13: ff 67 27 jmp *[edi+39] ; SYS::TRAMP-TWO
CL-USER(3):
When the values are returned from the function, Allegro CL actually
returns the first values in the same registers as are generally
used for argument passing (the exception to this is the Sparc, which
due to its register-windows architecture can only return one value
in a register). On the x86, we assign eax and edx for the first two
arguments, which means also that they are used for the first two
values returned. Note also that the same count register (ecx, in
this architecture) is used to return the count of values returned,
and note also (for later) that the carry bit is forced using the
stc instruction:
CL-USER(3): (compile (defun foo ()
(declare (optimize speed (safety 0) (debug 0)))
(values 1000 2000)))
FOO
NIL
NIL
CL-USER(4): (disassemble *)
;; disassembly of #<Function FOO>
;; formals:
;; code start: #x7164b88c:
0: b8 a0 0f 00 movl eax,$4000 ; 1000
00
5: ba 40 1f 00 movl edx,$8000 ; 2000
00
10: 33 c9 xorl ecx,ecx
12: b1 02 movb cl,$2
14: f9 stc
15: 8b 75 fc movl esi,[ebp-4]
18: c3 ret
19: 90 nop
CL-USER(5):
The reason why we set the carry bit is for optimization; by far the
largest percentage of returns yield exactly one value, so it is not
necessary to set the ecx register to the value 1 - instead we simply
clear the carry bit (clc), and this serves to alert any callers or
their callers that are in fact interested that the values count is 1:
CL-USER(5): (compile (defun foo ()
(declare (optimize speed (safety 0) (debug 0)))
(values 1000)))
FOO
NIL
NIL
CL-USER(6): (disassemble *)
;; disassembly of #<Function FOO>
;; formals:
;; code start: #x7164d924:
0: b8 a0 0f 00 movl eax,$4000 ; 1000
00
5: f8 clc
6: 8b 75 fc movl esi,[ebp-4]
9: c3 ret
CL-USER(7):
Note that most callers of foo are not even going to care how many
values were passed back - CL semantics implies that the eax register
will get nil for no values returned, and the first value for all other
cases, and so if a function calls foo in other than multiple-value-*,
it need only grab eax to get the value it is after.
> The second part of the question is meant to elicit answers about
> aspects such as the ability of a procedure to return different numbers
> of arguments on different calls; the fact that there need be no
> correspondence between the number of values the caller expects and the
> number returned; and so forth.
It is actually those callers which are looking for multiple values
to be returned (multiple-value-bind, etc) which end up bearing the
burden of testing for number of values returned. But there are even
optimizations for that; for example, once it is determined that K
values were not returned, variable K, K+1, ... N (where N is the
number of the last variable to be assigned or bound) can be
automatically set to nil.
Next, if more values are returned than can be put somewhere, those
extra values must be placed somewhere. We have a global pseudo-resource
for storing values temporarily, and because values returned tend to be
highly temporal, that resource can generally be global and need not be
constantly allocated.
Finally, the presence of multiple-value-call constitutes a special case;
which also is an exception to the previous paragraph; since for the second
and subsequent forms there are values that have already been collected,
those collected values must be stored away so that each current form can
be run with no need to know where to place its values (except in the same
standard place) and the m-v-c will take care of the appending of the
values appropriately.
> The reason I ask is that there is a lot of traffic on this newsgroup
> about new dialects of Lisp, and one thing I wish Lisp were better at
> was the ability to compile into really efficient code.
I think you need not worry about CL in the area of
multiple-values-returned. You should experiment with implementations,
and complain to your vendor if dissatisfied.
--
Duane Rettig ·····@franz.com Franz Inc. http://www.franz.com/
555 12th St., Suite 1450 http://www.555citycenter.com/
Oakland, Ca. 94607 Phone: (510) 452-2000; Fax: (510) 452-0182
Drew McDermott wrote:
>
> What is the overhead for having multiple-value-return in Lisp? Which
> aspects of the current design are the most expensive?
>
Thanks for the detailed answers to my question.
It's interesting that the answerers felt compelled to say that this was
an "implementation issue." What else would it be? Obviously, different
implementations handle the problem differently, and the overhead for a
language feature may depend on the architecture of the machine the
compiler is targeting. It is still possible to discuss the typical
bottom-line cost of a feature, and to speculate on how much more
efficient compiled Lisp would be if, say, it didn't have generic
arithmetic and bignums.
--
-- Drew McDermott
Yale Computer Science Department
[...]
DMcD> It's interesting that the answerers felt compelled to say
DMcD> that this was an "implementation issue." [...]
That's probably because you posed the question (AFAIR) as a language
question. But yes, we do tend to be defensive about that.
DMcD> ... It is still possible to discuss the typical
DMcD> bottom-line cost of a feature, and to speculate on how much
DMcD> more efficient compiled Lisp would be if, say, it didn't
DMcD> have generic arithmetic and bignums.
Indeed, it is possible, but this is or should be in a FAQ. Are you
just returning to actively coding in Lisp? The language does have
enough facilities to turn those features off (though not all of those
are standardized). We've had quite a few threads here over the years
where people got tips on wringing efficiency out of their
implementation by means of declarations etc. Maybe we should compile
some of those threads and stick it as a document somewhere?
cheers,
BM
Bulent Murtezaoglu wrote:
> Are you
> just returning to actively coding in Lisp?
Yes, the way Rush Limbaugh has just returned to actively popping pills.
I get the shakes if I don't write Lisp code several times a day.
I can't even end this with a smiley because the unbalanced paren would
freak me out.
-- Drew McDermott
Hi Drew McDermott,
> It's interesting that the answerers felt compelled to say that this was
> an "implementation issue." What else would it be? Obviously, different
> implementations handle the problem differently, and the overhead for a
> language feature may depend on the architecture of the machine the
> compiler is targeting.
Actually Drew I felt compelled to delete most of my reply because I just
couldn't produce a saccharine tone.
Note that: (a) There is not a lot of traffic on this newsgroup about new
dialects of Lisp. The vast majority of the traffic discusses writing and
efficiently implementing code in ANSI Common Lisp; and (b) "one thing I
wish Lisp were better at was the ability to compile into really efficient
code" is inflammatory. It implies that you want to throw out language
features before even considering that some /implementations/ may be able
to adequately and efficiently handle multiple values.
So I replied in the hope that if you followed the links then maybe, just
maybe, you could accept Lisp for what it is before wanting a new less
expressive dialect.
> It is still possible to discuss the typical bottom-line cost of a
> feature, and to speculate on how much more efficient compiled Lisp would
> be if, say, it didn't have generic arithmetic and bignums.
Here you go wanting to throw out tonnes more of Lisp. Does this look
like generic arithmetic to you:
* (disassemble (compile nil (lambda (a b) (+ (the (integer 0 10) a)
(the (integer 50 10000) b)))))
; Compiling lambda (a b):
; Compiling Top-Level Form:
481241A0: .entry "lambda (a b)"(a b) ; (function (t t)
; (integer 50 10010))
B8: pop dword ptr [ebp-8]
BB: lea esp, [ebp-32]
BE: add edx, edi ; No-arg-parsing entry point
C0: mov ecx, [ebp-8]
C3: mov eax, [ebp-4]
C6: add ecx, 2
C9: mov esp, ebp
CB: mov ebp, eax
CD: jmp ecx
CF: nop
To me it looks like an x86 machine level integer add. The compiler is
able to infer that the addition of A and B will be an integer between 50
and 10010 and at a zero level of safety the compiler will inline fixnum
arithmetic without even checking for some wrong input types. So something
can go horribly wrong:
* (compile nil (lambda (a b) (+ (the (integer 0 10) a) (the (integer 50 10000) b))))
; Compiling lambda (a b):
; Compiling Top-Level Form:
#<Function "lambda (a b)" {48188889}>
nil
nil
* (funcall * 10000000000000 1000000000000000000)
#<Unknown Immediate Object, lowtag=#b110, type=#xB6 {9031AFB6}>
[This is not the right answer!]
Whereas by default Lisp will do the right thing:[0]
* (compile nil (lambda (a b) (+ a b)))
; Compiling lambda (a b):
; Compiling Top-Level Form:
#<Function "lambda (a b)" {4819A411}>
nil
nil
* (funcall * 10000000000000 1000000000000000000)
1000010000000000000
If you have little concern about writing correct programs first instead
of incorrect programs fast Lisp is going to drive you crazy.
Regards,
Adam
0. This still works at a zero level of safety because of the generic
arithmetic:
* (disassemble (compile nil (lambda (a b) (+ a b))))
; Compiling lambda (a b):
; Compiling Top-Level Form:
481AB9D8: .entry "lambda (a b)"(a b) ; (function (t t) number)
9F0: pop dword ptr [ebp-8]
9F3: lea esp, [ebp-32]
9F6: mov [ebp-12], edx
9F9: mov [ebp-16], edi
9FC: mov edx, [ebp-12] ; No-arg-parsing entry point
9FF: mov edi, [ebp-16]
A02: call #x100001C8 ; #x100001C8: generic-+
A07: mov esp, ebx
A09: mov ecx, [ebp-8]
A0C: mov eax, [ebp-4]
A0F: add ecx, 2
A12: mov esp, ebp
A14: mov ebp, eax
A16: jmp ecx
Adam Warner <······@consulting.net.nz> writes:
> * (funcall * 10000000000000 1000000000000000000)
>
> 1000010000000000000
Now that's a misleading use for a variable with such a name.
Hi Kalle Olavi Niemitalo,
> Adam Warner writes:
>
>> * (funcall * 10000000000000 1000000000000000000)
>>
>> 1000010000000000000
>
> Now that's a misleading use for a variable with such a name.
:-) For others a little perplexed, * is a variable maintained for
interactive use. It stores the last evaluation from the
read-eval-print-loop and should not appear in this form in a standard
program. (funcall * 2 3) is not equivalent to (funcall #'* 2 3) nor
(funcall '* 2 3). Even the last two cases are not necessarily equivalent
as FUNCALL coerces a symbol to a function from the _global_ environment:
(flet ((* (&rest args) (apply #'+ args)))
(format t "~S returns ~A and ~S returns ~A.~%"
#1='(funcall #'* 2 3) #.#1# #2='(funcall '* 2 3) #.#2#))
=> (funcall #'* 2 3) returns 5 and (funcall '* 2 3) returns 6.
Regards,
Adam
PS: While it might not look like it, the form being evaluated is indeed an
s-expression:
'(flet ((* (&rest args) (apply #'+ args)))
(format t "~S returns ~A and ~S returns ~A.~%"
#1='(funcall #'* 2 3) #.#1# #2='(funcall '* 2 3) #.#2#))
=> (flet ((* (&rest args) (apply #'+ args)))
(format t "~S returns ~A and ~S returns ~A.~%"
'(funcall #'* 2 3) (funcall #'* 2 3) '(funcall '* 2 3) (funcall '* 2 3)))
On Mon, Oct 27, 2003 at 10:03:42PM +1300, Adam Warner wrote:
> (flet ((* (&rest args) (apply #'+ args)))
This is not permitted by the standard, btw, for symbols from the CL
package such as CL:*.
--
; Matthew Danish <·······@andrew.cmu.edu>
; OpenPGP public key: C24B6010 on keyring.debian.org
; Signed or encrypted mail welcome.
; "There is no dark side of the moon really; matter of fact, it's all dark."
Hi Matthew Danish,
> On Mon, Oct 27, 2003 at 10:03:42PM +1300, Adam Warner wrote:
>> (flet ((* (&rest args) (apply #'+ args)))
>
> This is not permitted by the standard, btw, for symbols from the CL
> package such as CL:*.
I'd be very surprised if lexical bindings were not permitted. I did
nothing to replace the global function definition. In fact I demonstrated
how the global function definition was preserved. Kent M Pitman does
the same thing in a FUNCALL example, lexically binding CONS:
(flet ((cons (x y) `(kons ,x ,y)))
(let ((cons (symbol-function '+)))
(funcall #'cons
(funcall 'cons 1 2)
(funcall cons 1 2))))
=> (KONS (1 . 2) 3)
I suspect we have helped each other today Matthew :-)
Regards,
Adam
On Tue, Oct 28, 2003 at 02:03:38AM +1300, Adam Warner wrote:
> Hi Matthew Danish,
>
> > On Mon, Oct 27, 2003 at 10:03:42PM +1300, Adam Warner wrote:
> >> (flet ((* (&rest args) (apply #'+ args)))
> >
> > This is not permitted by the standard, btw, for symbols from the CL
> > package such as CL:*.
>
> I'd be very surprised if lexical bindings were not permitted. I did
11.1.2.1.2.1 Some Exceptions to Constraints on the COMMON-LISP Package
for Conforming Programs
``If an external symbol of the COMMON-LISP package is not defined as a
standardized function, macro, or special operator, it is allowed to
lexically bind it as a function (e.g., with flet), to declare the ftype
of that binding, and (in implementations which provide the ability to do
so) to trace that binding.''
Basically, the rationale for the restrictions is this:
* Macros trying to use functions or variables from the CL packages
should not have to worry about name capture.
* Implementations should feel free to specially handle functions and
variables from the CL package.
* CL symbols are shared by all CL programs, therefore no change to
non-sharable [1] global properties of CL symbols is allowed.
[1] I think GET is okay, since the plist can be shared by many programs.
I didn't spot any restrictions on it, at least.
--
; Matthew Danish <·······@andrew.cmu.edu>
; OpenPGP public key: C24B6010 on keyring.debian.org
; Signed or encrypted mail welcome.
; "There is no dark side of the moon really; matter of fact, it's all dark."
Hi Matthew Danish,
>> > On Mon, Oct 27, 2003 at 10:03:42PM +1300, Adam Warner wrote:
>> >> (flet ((* (&rest args) (apply #'+ args)))
>> >
>> > This is not permitted by the standard, btw, for symbols from the CL
>> > package such as CL:*.
>>
>> I'd be very surprised if lexical bindings were not permitted. I did
>
> 11.1.2.1.2.1 Some Exceptions to Constraints on the COMMON-LISP Package
> for Conforming Programs
>
> ``If an external symbol of the COMMON-LISP package is not defined as a
> standardized function, macro, or special operator, it is allowed to
> lexically bind it as a function (e.g., with flet), to declare the ftype
> of that binding, and (in implementations which provide the ability to do
> so) to trace that binding.''
>
>
> Basically, the rationale for the restrictions is this:
>
> * Macros trying to use functions or variables from the CL packages
> should not have to worry about name capture.
> * Implementations should feel free to specially handle functions and
> variables from the CL package.
> * CL symbols are shared by all CL programs, therefore no change to
> non-sharable [1] global properties of CL symbols is allowed.
OK, thanks again Matthew and wrong again Adam. I can most identify with
macros not having to worry about function name capture. FUNCALL
contains a very naughty example.
> [1] I think GET is okay, since the plist can be shared by many programs.
> I didn't spot any restrictions on it, at least.
Well at least GET is an accessor, not a "standardized function, macro, or
special operator." Luckily all the loop keywords will be OK as well.
Regards,
Adam
From: Kalle Olavi Niemitalo
Subject: what is an accessor (Re: Efficiency of multiple values)
Date:
Message-ID: <8765iazf5u.fsf_-_@Astalo.kon.iki.fi>
Adam Warner <······@consulting.net.nz> writes:
> Well at least GET is an accessor, not a "standardized function, macro, or
> special operator."
The title "Accessor GET" means that GET is globally bound as a
function (which is of course a standardized function) and there
is either another function (SETF GET) or a setf expander for GET.
See 1.4.4.14 and 5.1.1.2, and "accessor" and "operator" in the
Glossary.
On Tue, 28 Oct 2003 03:01:13 +1300, Adam Warner wrote:
>> [1] I think GET is okay, since the plist can be shared by many programs.
>> I didn't spot any restrictions on it, at least.
> Well at least GET is an accessor, not a "standardized function, macro, or
> special operator." Luckily all the loop keywords will be OK as well.
GET is a "standardized function"; but I think you misinterpreted: I
expect Matthew meant it was OK to use GET (and, more to the point,
SETF thereof) on symbols in the COMMON-LISP package -- i.e., to modify
their property lists, not to lexically fbind GET.
[BTW, the loop keywords AND, DO, IF, WHEN, UNLESS, APPEND, NCONC,
COUNT, and = are all standard functions/macros/special operators, too]
--
Cogito ergo I'm right and you're wrong. -- Blair Houghton
(setq reply-to
(concatenate 'string "Paul Foley " "<mycroft" '(··@) "actrix.gen.nz>"))
Adam Warner <······@consulting.net.nz> writes:
> I'd be very surprised if lexical bindings [of predefined symbols in
> package common-lisp] were not permitted.
Check out CLHS section 11.1.2.1.2: Constraints on the COMMON-LISP
Package for Conforming Programs.
--
Frode Vatvedt Fjeld
Hi Kalle Olavi Niemitalo,
> Adam Warner writes:
>
>> * (funcall * 10000000000000 1000000000000000000)
>>
>> 1000010000000000000
>
> Now that's a misleading use for a variable with such a name.
:-) For others a little perplexed, * is a variable maintained for
interactive use. It stores the last evaluation from the
read-eval-print-loop and should not appear in this form in a standard
program. (funcall * 2 3) is not equivalent to (funcall #'* 2 3) nor
(funcall '* 2 3). Even the last two cases are not necessarily equivalent
as FUNCALL coerces a symbol to a function from the _global_ environment:
(flet ((* (&rest args) (apply #'+ args)))
(format t "~S returns ~A and ~S returns ~A.~%"
#1='(funcall #'* 2 3) #.#1# #2='(funcall '* 2 3) #.#2#))
=> (funcall #'* 2 3) returns 5 and (funcall '* 2 3) returns 6.
Regards,
Adam
PS: While it might not look like it, the form being evaluated is indeed
composed of s-expressions:
'(flet ((* (&rest args) (apply #'+ args)))
(format t "~S returns ~A and ~S returns ~A.~%"
#1='(funcall #'* 2 3) #.#1# #2='(funcall '* 2 3) #.#2#))
=> (flet ((* (&rest args) (apply #'+ args)))
(format t "~S returns ~A and ~S returns ~A.~%"
'(funcall #'* 2 3) (funcall #'* 2 3) '(funcall '* 2 3) (funcall '* 2 3)))
Adam Warner <······@consulting.net.nz> writes:
> (funcall * 2 3) is not equivalent to (funcall #'* 2 3) nor
> (funcall '* 2 3). Even the last two cases are not necessarily equivalent
> as FUNCALL coerces a symbol to a function from the _global_ environment:
However, the choice of environment does not matter to conforming
programs, because CLHS sections 1.5.2 and 11.1.2.1.2 together
forbid them from binding * as a function.
Hi Drew McDermott,
> What is the overhead for having multiple-value-return in Lisp? Which
> aspects of the current design are the most expensive?
This is an implementation issue. Here's how CMUCL approaches it:
<http://cvs2.cons.org/ftp-area/cmucl/doc/cmu-user/compiler-hint.html#toc150>
<http://cvs2.cons.org/ftp-area/cmucl/doc/cmu-user/compiler-hint.html#local-call-return>
Regards,
Adam
In article <············@news.wss.yale.edu>,
Drew McDermott <··················@at.yale.dot.edu> wrote:
> This is an idle-curiosity question, and if there's a standard answer
> someone can perhaps point me to it:
>
> What is the overhead for having multiple-value-return in Lisp? Which
> aspects of the current design are the most expensive?
>
> The second part of the question is meant to elicit answers about aspects
> such as the ability of a procedure to return different numbers of
> arguments on different calls; the fact that there need be no
> correspondence between the number of values the caller expects and the
> number returned; and so forth.
I'm reasonably familiar with the implementations of two quite different
"Lisp" compilers: Gwydion Dylan and Chicken Scheme. Both compile via C,
but do it quite differently.
Gwydion Dylan compiles to "direct form" C, that is: each Dylan function
compiles into one C function, which returns via a normal C return. It
has several strategies, depending on what is known. If there is kown to
be a single return result and the type is declared then d2c returns a
simple raw C value (int, float, char, pointer as appropriate). If there
are an exactly known number of return values (>1) then a C struct type
is declared to hold those values (using an appropriate raw C type, if
known) and the struct is returned by value (the C compiler may well
convert this into caller allocates space and passes in a pointer). If
the number of return values is unknown then they are boxed and pushed
onto a special "values" stack and the caller figures out how many
results there are by comparing the stack pointer before and after the
call. In no case is there any consing.
Chicken Scheme compiles to CPS C, that is: each Scheme basic block
compiles into one C function, and the C functions never return but
instead call their continuation. This means that argument passing and
result returning are implemented exactly the same -- each argument or
result value is passed using the C function call mechanism. If there
are an unknown number of arguments/results then C's stdargs mechanism is
used for the variable ones.
Hope this helps...
-- Bruce
Drew McDermott wrote:
> This is an idle-curiosity question, and if there's a standard answer
> someone can perhaps point me to it:
>
> What is the overhead for having multiple-value-return in Lisp? Which
> aspects of the current design are the most expensive?
I know that multiple-values in Scheme and CommonLisp aren't
the same, but maybe you can get some ideas from the following
paper:
Michael Ashley and R. Kent Dybvig.
"An Efficient Implementation of Multiple Return Values in Scheme".
ACM Conference on LISP and Functional Programming. June 1994.
<http://www.cs.indiana.edu/~dyb/papers/mrvs.ps.gz>
The technique used is quite clever.
Found at the Compiler Technology/Implementation Techniques and
Optimization of Bender's excellent bibliography:
<http://library.readscheme.org/page8.html>
--
Jens Axel S�gaard