From: ········@irtnog.org
Subject: Using type declarations in safety checks
Date: 
Message-ID: <w4oy9l6l5wh.fsf@eco-fs1.irtnog.org>
Would static type checking (as opposed to optimization) be a
legitimate use of (DECLARE (TYPE ...)) forms?

Take, for example, the following definitions:

        (DECLAIM (SAFETY 3))

        (DEFUN FOO (BAR BAZ)
          (DECLARE (TYPE STRING BAR)
                   (TYPE LIST BAZ))
          ...)

The following form would be compiled/executed without error:

        (FOO "TEST1" (LIST 1 2 3))

Whereas our hypothetical compiler (or evaluator) would signal a
TYPE-ERROR (or maybe two type errors) on the following form:

        (FOO :TEST1 #(1 2 3))

Type declarations in Lisp seem to only influence the optimizer, and in
all the code examples I've seen, type checking is explicit (e.g. the
programmer wraps their function body with the appropriate tests).

I haven't read anything in the CLHS that would lead me to believe that
this use of type declaration forms would cause my implementation to be
non-conforming (all else being equal).

-- 
"We know for certain only when we know little.  With knowlege, doubt
increases." - Goethe

From: Barry Margolin
Subject: Re: Using type declarations in safety checks
Date: 
Message-ID: <1EgJ7.74$I25.6705@burlma1-snr2>
In article <···············@eco-fs1.irtnog.org>,  <········@irtnog.org> wrote:
>Would static type checking (as opposed to optimization) be a
>legitimate use of (DECLARE (TYPE ...)) forms?

There's nothing prohibiting it.  You shouldn't even need the high safety
setting.

>
>Take, for example, the following definitions:
>
>        (DECLAIM (SAFETY 3))
>
>        (DEFUN FOO (BAR BAZ)
>          (DECLARE (TYPE STRING BAR)
>                   (TYPE LIST BAZ))
>          ...)
>
>The following form would be compiled/executed without error:
>
>        (FOO "TEST1" (LIST 1 2 3))
>
>Whereas our hypothetical compiler (or evaluator) would signal a
>TYPE-ERROR (or maybe two type errors) on the following form:
>
>        (FOO :TEST1 #(1 2 3))

I think it should be a compiler warning, not an error.  After all, you
might redefine FOO later, before you actually invoke the function
containing this erroneous call.

Many compilers already warn if you call a function with the wrong number of
arguments; this is essentially the same kind of thing.  In general,
compilers can warn about anything they want, as long as the execution of
the code obeys the standard.

-- 
Barry Margolin, ······@genuity.net
Genuity, Woburn, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.
From: Pierre R. Mai
Subject: Re: Using type declarations in safety checks
Date: 
Message-ID: <87bsi2b5c6.fsf@orion.bln.pmsf.de>
········@irtnog.org writes:

> Would static type checking (as opposed to optimization) be a
> legitimate use of (DECLARE (TYPE ...)) forms?

Indeed it is, and CMU CL's new (in 1990) compiler (named Python) was
written more or less in order to prove that the required amount of
type inferencing was possible, and that this kind of checking was
practicable.

> Take, for example, the following definitions:
> 
>         (DECLAIM (SAFETY 3))
> 
>         (DEFUN FOO (BAR BAZ)
>           (DECLARE (TYPE STRING BAR)
>                    (TYPE LIST BAZ))
>           ...)
> 
> The following form would be compiled/executed without error:
> 
>         (FOO "TEST1" (LIST 1 2 3))
> 
> Whereas our hypothetical compiler (or evaluator) would signal a
> TYPE-ERROR (or maybe two type errors) on the following form:
> 
>         (FOO :TEST1 #(1 2 3))

The compiler is supposed to issue warnings for things that will
(likely) lead to errors at run-time, not errors.

Even without the optimize safety declaration, CMU CL will issue the
warnings you request (and as long as safety is not 0, it will also
emit corresponding type-checks, so that errors will be signalled at
run-time).  Taking your example file (without the erroneous declaim
form):

* (compile-file "/tmp/test.cl")

Python version 1.0, VM version Intel x86 on 17 NOV 01 01:22:20 am.
Compiling: /tmp/test.cl 17 NOV 01 01:22:14 am

Converted FOO.
Compiling DEFUN FOO: 
Converted GOOD.
Compiling DEFUN GOOD: 
Converted BAD.
Compiling DEFUN BAD: 

File: /tmp/test.cl

In: DEFUN BAD
  (FOO :TEST1 #(1 2 3))
Warning: This is not a (VALUES &OPTIONAL BASE-STRING &REST T):
  :TEST1
Warning: This is not a (VALUES &OPTIONAL LIST &REST T):
  #(1 2 3)

Byte Compiling Top-Level Form: 

Compilation unit finished.
  2 warnings


/tmp/test.x86f written.
Compilation finished in 0:00:01.

#p"/tmp/test.x86f"
T
T
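
(For reference, the test file consisted of your FOO definition plus
trivial GOOD and BAD callers, roughly along these lines; the function
bodies here are only a sketch:)

(defun foo (bar baz)
  (declare (type string bar)
           (type list baz))
  (cons bar baz))

(defun good ()
  (foo "TEST1" (list 1 2 3)))

(defun bad ()
  (foo :test1 #(1 2 3)))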

CMU CL does this kind of thing quite extensively, even for more
complicated code.  There are however a couple of places where CMU CL
isn't as eager to check things, and one explicit goal of the initiator
of the SBCL project has been to eliminate those places, so that the
original idea of Python (use type declarations for checking _and_
optimization) is carried out to its fullest extent.

> Type declarations in Lisp seem to only influence the optimizer, and in
> all the code examples I've seen, type checking is explicit (e.g. the
> programmer wraps their function body with the appropriate tests).

The problem with using type declarations to do this kind of thing is
that the ANSI standard doesn't _require_ the compiler to check them at
compile-time, and it especially doesn't require the compiler to emit
code that checks them at run-time, e.g.

(defun foo (x)
  (declare (type string x))
  ...)

would have to raise an error when _called at run-time_ (e.g. by the
user at the repl) with a symbol, in order to be equivalent to

(defun foo (x)
  (check-type x string)
  ...)

CMU CL does this kind of stuff, but again it isn't guaranteed to
happen in other implementations.

So if you want portably robust code, you will want to write explicit
type-checks.  Furthermore, with check-type and friends you have more
control over the text of the error that is signalled, and check-type
will also establish a store-value restart.
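
E.g. a small sketch of both points (the type-string and the sample
interaction are only illustrative):

(defun foo (x)
  ;; The optional third argument to CHECK-TYPE customizes the error
  ;; text; CHECK-TYPE also wraps the check in a STORE-VALUE restart.
  (check-type x string "a string to be frobbed")
  (length x))

;; (foo 42) signals a correctable TYPE-ERROR; invoking the STORE-VALUE
;; restart (from the debugger, or via (store-value "abc" condition) in
;; a handler) stores a new value into X and re-runs the check.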

> I haven't read anything in the CLHS that would lead me to believe that
> this use of type declaration forms would cause my implementation to be
> non-conforming (all else being equal).

Indeed nothing in the CLHS prohibits compilers from doing extensive
type-checking at compile-time, based on type declarations, and other
factors.

Regs, Pierre.

-- 
Pierre R. Mai <····@acm.org>                    http://www.pmsf.de/pmai/
 The most likely way for the world to be destroyed, most experts agree,
 is by accident. That's where we come in; we're computer professionals.
 We cause accidents.                           -- Nathaniel Borenstein
From: ········@acm.org
Subject: Re: Using type declarations in safety checks
Date: 
Message-ID: <XouJ7.43735$wd1.3574330@news20.bellglobal.com>
"Pierre R. Mai" <····@acm.org> writes:
> The problem with using type declarations to do this kind of thing is
> that the ANSI standard doesn't _require_ the compiler to check them at
> compile-time, and it especially doesn't require the compiler to emit
> code that checks them at run-time, e.g.
> 
> (defun foo (x)
>   (declare (type string x))
>   ...)

A question:  How does one combine types in such declarations?

For instance, I've got a function which starts out thus:

(defun db-put (db key datum
		  &key (start 0) end db-start db-end
		  (txn *transaction*) (flags 0))
  (declare (type db db) (string key datum))

I'd _like_ to put in the check:
  (declare (type transaction txn))
except that txn is also allowed to be NIL.

I'm not sure how to express that; could it be something like the following?
  (declare (or (type transaction txn) (nil txn)))

(It's running on CMU/CL, FYI...)
-- 
(concatenate 'string "cbbrowne" ·@cbbrowne.com")
http://www.cbbrowne.com/info/oses.html
Rules of the Evil Overlord #71. "If I decide to test a lieutenant's
loyalty and see if he/she should  be made a trusted lieutenant, I will
have a crack squad of marksmen  standing by in case the answer is no."
<http://www.eviloverlord.com/>
From: Pierre R. Mai
Subject: Re: Using type declarations in safety checks
Date: 
Message-ID: <87zo5la0h7.fsf@orion.bln.pmsf.de>
········@acm.org writes:

> A question:  How does one combine types in such declarations?
> 
> For instance, I've got a function which starts out thus:
> 
> (defun db-put (db key datum
> 		  &key (start 0) end db-start db-end
> 		  (txn *transaction*) (flags 0))
>   (declare (type db db) (string key datum))
> 
> I'd _like_ to put in the check:
>   (declare (type transaction txn))
> except that txn is also allowed to be NIL.

You can use all of the valid Common Lisp type specifiers for the
declarations, so OR is your friend here:

(defun db-put (db key datum &key (start 0) end db-start db-end
                                 (txn *transaction*) (flags 0))
  (declare (type db db) (type string key datum)
           (type (or transaction null) txn)
           (type fixnum start)
           (type (or fixnum null) end)
           ...)
  ...)

> I'm not sure how to express that; could it be something like the following?
>   (declare (or (type transaction txn) (nil txn)))

Nearly correct:  You don't OR the declarations, you OR the types ;)

> (It's running on CMU/CL, FYI...)

With the given declarations, CMU CL might even derive that e.g. end is
a fixnum at a given point in the code.  If it isn't, and it really
matters for performance reasons, you might want to rebind it to a
value that is guaranteed to be a fixnum, and redeclare it accordingly.
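
Something along these lines, say (the function and argument names are
purely illustrative):

(defun frob-region (datum &key (start 0) end)
  (declare (type string datum)
           (type fixnum start)
           (type (or fixnum null) end))
  ;; Rebind END to a value that is definitely a fixnum and redeclare
  ;; it; code in the inner scope is then compiled with the tighter
  ;; type.
  (let ((end (or end (length datum))))
    (declare (type fixnum end))
    (subseq datum start end)))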

Regs, Pierre.

-- 
Pierre R. Mai <····@acm.org>                    http://www.pmsf.de/pmai/
 The most likely way for the world to be destroyed, most experts agree,
 is by accident. That's where we come in; we're computer professionals.
 We cause accidents.                           -- Nathaniel Borenstein
From: Hannu Koivisto
Subject: Re: Using type declarations in safety checks
Date: 
Message-ID: <87k7wptow9.fsf@lynx.ionific.com>
········@acm.org writes:

| "Pierre R. Mai" <····@acm.org> writes:
| > The problem with using type declarations to do this kind of thing is
| > that the ANSI standard doesn't _require_ the compiler to check them at
| > compile-time, and it especially doesn't require the compiler to emit
| > code that checks them at run-time, e.g.
| > 
| > (defun foo (x)
| >   (declare (type string x))
| >   ...)
| 
| A question:  How does one combine types in such declarations?

Use compound type specifiers (CLHS -> Types and Classes -> Types ->
Type Specifiers).

| For instance, I've got a function which starts out thus:
| 
| (defun db-put (db key datum
| 		  &key (start 0) end db-start db-end
| 		  (txn *transaction*) (flags 0))
|   (declare (type db db) (string key datum))
| 
| I'd _like_ to put in the check:

I think it would be better if you called it a declaration and not a
check even though it may cause a check to be generated in the
implementation that you are using.

|   (declare (type transaction txn))
| except that txn is also allowed to be NIL.

(declare (type (or transaction null) txn))

-- 
Hannu
From: Kent M Pitman
Subject: Re: Using type declarations in safety checks
Date: 
Message-ID: <sfw7ksptmy0.fsf@shell01.TheWorld.com>
Hannu Koivisto <·····@iki.fi> writes:

> | For instance, I've got a function which starts out thus:
> | 
> | (defun db-put (db key datum
> | 		  &key (start 0) end db-start db-end
> | 		  (txn *transaction*) (flags 0))
> |   (declare (type db db) (string key datum))
> | 
> | I'd _like_ to put in the check:
> 
> I think it would be better if you called it a declaration and not a
> check even though it may cause a check to be generated in the
> implementation that you are using.

Yes, I completely agree with this.

An implementation is permitted to simply believe the declaration without
checking it, and this can have exactly the opposite effect from the one
you want.  If you're uncertain that the data comes from a reliable
source, you should typically NOT declare it, since leaving it undeclared
makes it less likely that the implementation will make unwarranted
assumptions.

For example, if you do
 (defun f (x y) (declare (fixnum x y)) (+ x y))
don't be surprised if some implementation that gets
 (f 'a 'b)
just adds the two pointers to A and B without checking and gives you
back some large integer whose value is garbage.  Nothing forbids an
individual _implementation_ from promising you more (as with CMU CL
or the Symbolics CLTL and CLOE Developer facilities), but the _language_
definitely does not promise you more.
From: ········@acm.org
Subject: Re: Using type declarations in safety checks
Date: 
Message-ID: <__yK7.5910$op.1510473@news20.bellglobal.com>
Hannu Koivisto <·····@iki.fi> writes:
> ········@acm.org writes:
> | For instance, I've got a function which starts out thus:
> | 
> | (defun db-put (db key datum
> | 		  &key (start 0) end db-start db-end
> | 		  (txn *transaction*) (flags 0))
> |   (declare (type db db) (string key datum))
> | 
> | I'd _like_ to put in the check:
> 
> I think it would be better if you called it a declaration and not a
> check even though it may cause a check to be generated in the
> implementation that you are using.
> 
> |   (declare (type transaction txn))
> | except that txn is also allowed to be NIL.
> 
> (declare (type (or transaction null) txn))

Thank you all for your comments; I certainly agree that it is more
properly called a "declaration"; that is consistent with the name of
the DECLARE form itself.

To be sure, the intent is to make it possible for the implementation
to "check" that values conform with that declaration.

Kent's comments that declaring such could lead to the possibility that
"the implementation will make unwarranted assumptions" are a _little_
distressing, although only a little.  The implementations I'd be most
concerned about, at present, either:
  a) Largely ignore the declaration, or
  b) Use it to help do further inference, likely leading to improving
     type checking throughout.  (CMUCL :-).)
-- 
(concatenate 'string "cbbrowne" ·@ntlug.org")
http://www.cbbrowne.com/info/x.html
"The idea that Bill Gates has appeared like a knight in shining armour
to  lead all customers  out of  a mire  of technological  chaos neatly
ignores  the  fact  that  it  was  he  who,  by  peddling  second-rate
technology, led them  into it in the first place."  - Douglas Adams in
Guardian, 25-Aug-95
From: Kent M Pitman
Subject: Re: Using type declarations in safety checks
Date: 
Message-ID: <sfwy9l1cfok.fsf@shell01.TheWorld.com>
········@acm.org writes:

> Kent's comments that declaring such could lead to the possibility that
> "the implementation will make unwarranted assumptions" are a _little_
> distressing, although only a little.

But this is the real purpose of declarations, historically, at least in Lisp.

The notion is that there are things the compiler can't know and can't
figure out.  Not just due to "halting problem" effects in doing
general flow analysis but because some truths rely on knowledge about
what external data will be supplied to a program (the kind of data,
its format, how well error-checked it is, etc).  Absent such global
knowledge, efficient code can't be produced.  Declarations exist for
the express purpose of making the compiler a promise that the compiler
cannot infer on its own.  But if you violate the promise, you're on
your own.  This is, at its heart, the reason that Lisp programmers are
not overzealous about type declaring stuff that doesn't need to be
type declared.  It often just reduces the robustness without a
corresponding increase in speed, since a lot of cases simply aren't going
to compile differently regardless of the declared type--a pointer is a
pointer in many cases, and there may be no advantage to knowing it's a
pointer to a user-defined FROB rather than a pointer to a STRING.

> The implementations I'd be most
> concerned about, at present, either:
>   a) Largely ignore the declaration, or
>   b) Use it to help do further inference, likely leading to improving
>      type checking throughout.  (CMUCL :-).)

What CMU CL does is VERY interesting, btw, and it's a pity more
implementations haven't had time/energy/money enough to follow its lead.
We certainly designed CL to allow this kind of thing, but we knew it
was work to implement so we didn't require it.

But I don't think you're right to say there are only these two
categories.  I think there are implementations where, if you type
declare something to be FIXNUM or some FLOAT type, you'd better really
give a fixnum or a float, because otherwise you may just scramble some
other kind of pointer.  Other than the number types, I agree with you
that lots of implementations ignore type declarations.  Though I
wouldn't expect any such incidental fact to be unchanging forever.
It's often a resource issue.  There are good optimizations that can be
done with a variety of other types, and I'd expect more implementations
eventually to do them.
From: Michael Hudson
Subject: Re: Using type declarations in safety checks
Date: 
Message-ID: <upu6bt9db.fsf@python.net>
Kent M Pitman <······@world.std.com> writes:

> ········@acm.org writes:
> 
> > Kent's comments that declaring such could lead to the possibility that
> > "the implementation will make unwarranted assumptions" are a _little_
> > distressing, although only a little.
> 
> But this is the real purpose of declarations, historically, at least in Lisp.
> 
> The notion is that there are things the compiler can't know and can't
> figure out.  Not just due to "halting problem" effects in doing
> general flow analysis but because some truths rely on knowledge about
> what external data will be supplied to a program (the kind of data,
> its format, how well error-checked it is, etc).  Absent such global
> knowledge, efficient code can't be produced.  Declarations exist for
> the express purpose of making the compiler a promise that the compiler
> cannot infer on its own.

This question doesn't really fit here, but I thought I'd wedge it in
anyway.

Has there been any work done on what might be called adaptive
optimization?  

I'm envisaging a system which, for example, "notices" that a function
is "usually" called with small integer arguments, and so compiles a
version of the function specialized on such arguments and calls that
version whenever the original function is indeed called with small
integer arguments.  Then it might notice that some of the call sites
can only call the function with small integer arguments, and insert
direct calls to the specialized function.

The idea would be that as you fed the system more and more data, it
would gradually get faster on "its own".  I guess this is a bit like
an enormous version of the trace cache found in P4s.

Did the Self compiler do any of these things?

Just curious.

Cheers,
M.

-- 
  LINTILLA:  You could take some evening classes.
    ARTHUR:  What, here?
  LINTILLA:  Yes, I've got a bottle of them.  Little pink ones.
                   -- The Hitch-Hikers Guide to the Galaxy, Episode 12
From: Bruce Hoult
Subject: Re: Using type declarations in safety checks
Date: 
Message-ID: <bruce-6DEC23.16211417112001@news.paradise.net.nz>
In article <··············@orion.bln.pmsf.de>, "Pierre R. Mai" 
<····@acm.org> wrote:

> ········@irtnog.org writes:
> 
> > Would static type checking (as opposed to optimization) be a
> > legitimate use of (DECLARE (TYPE ...)) forms?
> 
> Indeed it is, and CMU CL's new (in 1990) compiler (named Python) was
> written more or less in order to prove that the required amount of
> type inferencing was possible, and that this kind of checking was
> practicable.

And Dylan was designed to enhance the amount of such checking possible.  
A lot of things in CMU CL were incorporated into the CMU Gwydion 
project's Dylan compiler, "d2c".


> > Take, for example, the following definitions:
> > 
> >         (DECLAIM (SAFETY 3))
> > 
> >         (DEFUN FOO (BAR BAZ)
> >           (DECLARE (TYPE STRING BAR)
> >                    (TYPE LIST BAZ))
> >           ...)
> > 
> > The following form would be compiled/executed without error:
> > 
> >         (FOO "TEST1" (LIST 1 2 3))
> > 
> > Whereas our hypothetical compiler (or evaluator) would signal a
> > TYPE-ERROR (or maybe two type errors) on the following form:
> > 
> >         (FOO :TEST1 #(1 2 3))
> 
> The compiler is supposed to issue warnings for things that will
> (likely) lead to errors at run-time, not errors.

Dylan has the same philosophy, but sometimes you're *certain* that the 
errors will happen...

---------------------
define method foo(a :: <string>, b :: <list>)
  pair(a, b)
end;

foo(test:, #[1, 2, 3])
---------------------
In Top level form.:
"noapplicable.dylan", line 7, characters 1 through 22:
    foo(test:, #[1, 2, 3])
    ^^^^^^^^^^^^^^^^^^^^^^
  Error: No applicable methods for argument types
    (singleton(#"test"), direct-instance(<simple-object-vector>))
  in call of foo
---------------------

If you use a simple function rather than a generic function then you get 
two errors:

---------------------
define function foo(a :: <string>, b :: <list>)
  pair(a, b)
end;

foo(test:, #[1, 2, 3])
---------------------
In Top level form.:
"noapplicable.dylan", line 7, characters 1 through 22:
    foo(test:, #[1, 2, 3])
    ^^^^^^^^^^^^^^^^^^^^^^
  Error: Wrong type for the first argument in call of foo.
  Wanted <string>, but got singleton(#"test")
"noapplicable.dylan", line 7, characters 1 through 22:
    foo(test:, #[1, 2, 3])
    ^^^^^^^^^^^^^^^^^^^^^^
  Error: Wrong type for the second argument in call of foo.
  Wanted <list>, but got direct-instance(<simple-object-vector>)
---------------------


There are other cases in which d2c will not issue an error but will 
optimize a function, or one branch of an IF, down to basically "(print 
"Type error") (abort)".  How to get it to issue warnings for these when 
they will help the user and not issue too many of them is still an open 
question.

-- Bruce
From: Pierre R. Mai
Subject: Re: Using type declarations in safety checks
Date: 
Message-ID: <873d3cyuyd.fsf@orion.bln.pmsf.de>
Bruce Hoult <·····@hoult.org> writes:

> > The compiler is supposed to issue warnings for things that will
> > (likely) lead to errors at run-time, not errors.
> 
> Dylan has the same philosophy, but sometimes you're *certain* that the 
> errors will happen...

In the context of the CL standard, the compiler only signals errors
when it is unable to process the source code.  When it detects a
situation that might (even with certainty) lead to an error at
run-time, it signals a warning.  Ignoring compiler warnings happens at
your own peril in CL, because a warning is the compiler's way of
telling you that something is very likely wrong.  If you know better,
there are ways of telling the compiler so, and thereby silencing the
warning.

What is called a warning in many other languages is called a
style-warning in CL parlance.  Only those warnings are "ignorable".

There is a very important reason for CL to delay the signalling of an
error to run-time, even in the cases where it is dead certain that
such an error will occur:  The application might have established
condition-handlers and restarts that allow it to deal with that
situation, based on knowledge available at runtime.  If the compiler
refused to compile such code (that is what an error at compile-time
means, i.e. no loadable output is produced), this realistic scenario
would not be possible to implement.
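
To make that concrete, a caller could do something along these lines
(assuming an implementation that, like CMU CL at non-zero safety,
signals a TYPE-ERROR at run-time for the violated declaration; the
function names are illustrative):

(defun foo (bar baz)
  (declare (type string bar) (type list baz))
  (cons bar baz))

(defun careful-call (x y)
  ;; The compiler may have warned about a mismatched call elsewhere,
  ;; but the code still compiles and loads; at run-time a handler can
  ;; deal with the resulting error.
  (handler-case (foo x y)
    (type-error (c)
      (format t "~&Recovered from: ~A~%" c)
      nil)))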

From another POV in a truly dynamic language, the compiler can be
certain about not so many things, e.g. in your example:

> ---------------------
> define method foo(a :: <string>, b :: <list>)
>   pair(a, b)
> end;
> 
> foo(test:, #[1, 2, 3])
> ---------------------
> In Top level form.:
> "noapplicable.dylan", line 7, characters 1 through 22:
>     foo(test:, #[1, 2, 3])
>     ^^^^^^^^^^^^^^^^^^^^^^
>   Error: No applicable methods for argument types
>     (singleton(#"test"), direct-instance(<simple-object-vector>))
>   in call of foo
> ---------------------

Unless you have some guarantee that no one will be able to define
different methods on foo, how is the compiler to decide that there
will be no applicable method on foo at run-time?  Another file which
has the following content

define method foo(a :: <string>, b :: <vector>)
  pair(a, b)
end;

might have been loaded prior to the file in question, making the call
to foo perfectly legal.  Or I might have defined a method on
no-applicable-method (or established a handler for the corresponding
condition[1]), that dynamically defined additional methods from
information present in the run-time environment, or otherwise handled
the situation so that the call to foo is again perfectly valid.

Note that I haven't looked at any Dylan language standard, so Dylan's
model of compilation or execution might prohibit all such scenarios,
and in that case the compiler is perfectly in its rights, but CL's
model does most certainly allow this scenario, and it offers
flexibility that a lot of people depend on.  Not only should a CL
compiler not emit an error, I'd be seriously cross with it if it
insisted on issuing a warning, or even a style-warning, since the
above scenario is perfectly valid, and requiring a certain compilation
order would be onerous.

> If you use a simple function rather than a generic function then you get 
> two errors:
> 
> ---------------------
> define function foo(a :: <string>, b :: <list>)
>   pair(a, b)
> end;
> 
> foo(test:, #[1, 2, 3])
> ---------------------
> In Top level form.:
> "noapplicable.dylan", line 7, characters 1 through 22:
>     foo(test:, #[1, 2, 3])
>     ^^^^^^^^^^^^^^^^^^^^^^
>   Error: Wrong type for the first argument in call of foo.
>   Wanted <string>, but got singleton(#"test")
> "noapplicable.dylan", line 7, characters 1 through 22:
>     foo(test:, #[1, 2, 3])
>     ^^^^^^^^^^^^^^^^^^^^^^
>   Error: Wrong type for the second argument in call of foo.
>   Wanted <list>, but got direct-instance(<simple-object-vector>)
> ---------------------

Again, CL allows you to redefine foo to handle additional types.
And user-code can get between the definition and the call with an
appropriate handler.  I'd accept a warning here, though, since the
code will very, very likely raise an error at load-time, even if that
error is subsequently handled.  The difference here is caused by the
fact that functions are more closed-world/static than GFs, but that is
only a difference in degree, since CL allows redefinition of functions
at will.

> There are other cases in which d2c will not issue an error but will 
> optimize a function, or one branch of an IF, down to basically "(print 
> "Type error") (abort)".  How to get it to issue warnings for these when 
> they will help the user and not issue too many of them is still an open 
> question.

A CL compiler isn't even allowed to issue an error at compile-time for
a file containing only the following top-level form:

(error "End the world!")

Again, the error is to be signalled at run-time, since only there the
corresponding handlers can enter the picture.

Turn it as you want, CL is completely dynamic by default, and people
take full advantage of that fact, as Kent has already pointed out.
Hence the rights of the compiler to refuse certain code have to be
clearly restricted.  CL doesn't consider a program that will cause
run-time errors (regardless of their origin) as a "faulty program" to
be rejected by the compiler.  It considers it to be a perfectly valid
program, with clearly defined semantics.  Hence the compiler can't be
allowed to arbitrarily reject programs.

Regs, Pierre.

Footnotes: 
[1]  ANSI CL doesn't mandate the type of the error raised by the
     default method on no-applicable-method, but many implementations
     provide a specific condition for this error.

-- 
Pierre R. Mai <····@acm.org>                    http://www.pmsf.de/pmai/
 The most likely way for the world to be destroyed, most experts agree,
 is by accident. That's where we come in; we're computer professionals.
 We cause accidents.                           -- Nathaniel Borenstein
From: ········@irtnog.org
Subject: Re: Using type declarations in safety checks
Date: 
Message-ID: <w4ou1vra2su.fsf@eco-fs1.irtnog.org>
I obviously don't understand the compiler's semantics as well as I
should.  Thank you all for your illuminating replies.

-- 
"We know for certain only when we know little.  With knowlege, doubt
increases." - Goethe
From: Matthew X. Economou
Subject: Re: Using type declarations in safety checks
Date: 
Message-ID: <w4od72ea8oy.fsf@eco-fs1.irtnog.org>
>>>>> "XF" == "Matthew X. Economou" <········@irtnog.org> writes:

    XF> I obviously don't understand the compiler's semantics as well
    XF> as I should.

Wait a second!  If the compiler only emits warnings for stuff like
invalid arguments to a function, that means that function calls must
be handled at run-time, not at compile-time or link/load-time.  That
means that callees must check the number of arguments, and that
callers must look up the function binding every time they jump.

Is that indeed what "dynamic" implies?  Isn't that really inefficient?
How in the world would the linker/loader work?

-- 
"We know for certain only when we know little.  With knowlege, doubt
increases." - Goethe
From: Tim Bradshaw
Subject: Re: Using type declarations in safety checks
Date: 
Message-ID: <fbc0f5d1.0111200334.6fcda441@posting.google.com>
"Matthew X. Economou" <········@irtnog.org> wrote in message news:<···············@eco-fs1.irtnog.org>...
> 
> Wait a second!  If the compiler only emits warnings for stuff like
> invalid arguments to a function, that means that function calls must
> be handled at run-time, not at compile-time or link/load-time.  That
> means that callees must check the number of arguments, and that
> callers must look up the function binding every time they jump.

In general, yes.  However with suitable declarations and compilation
settings, most or all of this may be optimised away.  For instance the
compiler may signal a warning but generate code which assumes the
given number of arguments in any case, or it may compile a `slow' call
when it is worried about, or does not know, the number of arguments
(for instance in the case of APPLY) but a fast call otherwise.
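
For example, something along these lines (how much this actually buys
you is of course implementation-dependent, and the names here are made
up):

(declaim (ftype (function (fixnum fixnum) fixnum) add2))

(defun add2 (x y)
  (declare (fixnum x y)
           (optimize (speed 3) (safety 1)))
  (the fixnum (+ x y)))

(defun sum-pairs (pairs)
  ;; Calls to ADD2 compiled after the proclamation can assume its
  ;; argument count and types instead of making a fully general call.
  (loop for (a . b) in pairs sum (add2 a b)))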

> Is that indeed what "dynamic" implies?  Isn't that really inefficient?

In general yes, this is what dynamic implies: various decisions may
get delayed until run time that are made at compile time in static
languages.

> How in the world would the linker/loader work?

Ah, well, we rely on post-50s technology for that.  We have an
extremely clever dynamic, incremental, link-loader, the first example
of which dates from around 1958.  It's really a shame that these
systems aren't more widely known, and most people are stuck with early
50s link-load technology.

--tim
From: Thomas F. Burdick
Subject: Re: Using type declarations in safety checks
Date: 
Message-ID: <xcvlmh1p87x.fsf@conquest.OCF.Berkeley.EDU>
"Matthew X. Economou" <········@irtnog.org> writes:

> >>>>> "XF" == "Matthew X. Economou" <········@irtnog.org> writes:
> 
>     XF> I obviously don't understand the compiler's semantics as well
>     XF> as I should.
> 
> Wait a second!  If the compiler only emits warnings for stuff like
> invalid arguments to a function, that means that function calls must
> be handled at run-time, not at compile-time or link/load-time.  That
> means that callees must check the number of arguments, and that
> callers must look up the function binding every time they jump.
> 
> Is that indeed what "dynamic" implies?  Isn't that really inefficient?
> How in the world would the linker/loader work?

As Tim Bradshaw already pointed out, this is only expensive in the
general case.  If you get lots of flexibility, and the common case is
pretty efficient, it's good software engineering to go with that, even
if the uncommonly-used general case is slow.  As an example, CMUCL has
the notion of compilation blocks, calls internal to which can be
heavily optimized, because they won't get redefinitions.  This is a
lousy idea for software in heavy development, but as a delivery
mechanism, it's not too onerous to have to recompile at the
granularity of an entire file.  General function calls are then only
done across module lines.

Also, consider the case of a function fixnum-average, which takes two
fixnums as arguments, and returns a fixnum.  It's obvious how the
compiler can use this knowledge to generate fast function calls.  But
there's no reason why it need generate only one entry point to the
function.  If fixnum-average gets redefined, this efficient entry
point can be overwritten with code to make a general case call to the
new definition.  Now, you only pay for dynamicity if you use it.  I
think there's an entire language designed around pay-as-you-go ... oh,
except it won't even let you pay for dynamic things :-)
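
For concreteness, fixnum-average might look something like this (the
body is just a sketch):

(defun fixnum-average (a b)
  (declare (fixnum a b)
           (optimize (speed 3) (safety 1)))
  ;; The truncated mean of two fixnums is itself a fixnum.
  (the fixnum (values (floor (+ a b) 2))))

The declarations are what give the compiler the knowledge it needs to
build the efficient, known-signature entry point described above.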

-- 
           /|_     .-----------------------.                        
         ,'  .\  / | No to Imperialist war |                        
     ,--'    _,'   | Wage class war!       |                        
    /       /      `-----------------------'                        
   (   -.  |                               
   |     ) |                               
  (`-.  '--.)                              
   `. )----'                               
From: Bruce Hoult
Subject: Re: Using type declarations in safety checks
Date: 
Message-ID: <bruce-BADF6E.13313519112001@news.paradise.net.nz>
In article <··············@orion.bln.pmsf.de>, "Pierre R. Mai" 
<····@acm.org> wrote:

> Bruce Hoult <·····@hoult.org> writes:
> 
> > > The compiler is supposed to issue warnings for things that will
> > > (likely) lead to errors at run-time, not errors.
> > 
> > Dylan has the same philosophy, but sometimes you're *certain* that the 
> > errors will happen...
> 
> There is a very important reason for CL to delay the signalling of an
> error to run-time, even in the cases where it is dead certain that
> such an error will occur:  The application might have established
> condition-handlers and restarts that allow it to deal with that
> situation, based on knowledge available at runtime.  If the compiler
> refused to compile such code (that is what an error at compile-time
> means, i.e. no loadable output is produced), this realistic scenario
> would not be possible to implement.

As I said, Dylan has the same philosophy in general.


> From another POV in a truly dynamic language, the compiler can be
> certain about not so many things, e.g. in your example:
> 
> > ---------------------
> > define method foo(a :: <string>, b :: <list>)
> >   pair(a, b)
> > end;
> > 
> > foo(test:, #[1, 2, 3])
> > ---------------------
> > In Top level form.:
> > "noapplicable.dylan", line 7, characters 1 through 22:
> >     foo(test:, #[1, 2, 3])
> >     ^^^^^^^^^^^^^^^^^^^^^^
> >   Error: No applicable methods for argument types
> >     (singleton(#"test"), direct-instance(<simple-object-vector>))
> >   in call of foo
> > ---------------------
> 
> Unless you have some guarantee that no one will be able to define 
> different methods on foo, how is the compiler to decide that there
> will be no applicable method on foo at run-time?

That's just it.  You *do* have such a guarantee in this case, because 
the "foo" GF is sealed by default.  If you wanted it to be open then you 
would declare:

  define open generic foo(a, b);

Possibly you would also restrict the argument and/or return types to 
something more specific than <object>, which is the default.


> Another file which has the following content
> 
> define method foo(a :: <string>, b :: <vector>)
>   pair(a, b)
> end;
> 
> might have been loaded prior to the file in question, making the call
> to foo perfectly legal.

If the GF is open then it will be as you say.  If the GF is sealed then 
the compiler will give an error for the call to foo, and the runtime 
will give an error when the library containing the additional definition 
is loaded.


> Or I might have defined a method on
> no-applicable-method (or established a handler for the corresponding
> condition[1]), that dynamically defined additional methods from
> information present in the run-time environment, or otherwise handled
> the situation so that the call to foo is again perfectly valid.

The Dylan Reference Manual doesn't list a standard no-applicable-method 
condition.  I'm not sure why.  You can certainly catch it as an <error>, 
but anything more is non-portable.  Gwydion Dylan, for example, throws a 
<simple-error>, which is an error with a format string and some 
arguments and no restarts.

If you want to catch the error and recover somehow then it seems to me 
that the better method in any case is to simply define a default 
method...

  define method foo(a :: <object>, b :: <object>) ...  end

... and that method can either directly do what is appropriate or else 
can add a new method to the GF (if it's open).


> Note that I haven't looked at any Dylan language standard, so Dylan's
> model of compilation or execution might prohibit all such scenarios,
> and in that case the compiler is perfectly in its rights, but CL's
> model does most certainly allow this scenario, and it offers
> flexibility that a lot of people depend on.

Dylan offers quite a lot of flexibility in how static or how dynamic 
various aspects of your program are.  Not quite as much on the dynamic 
end as CL, for sure, since Dylan doesn't have "eval", but more than most 
compiled languages.


> > There are other cases in which d2c will not issue an error but will 
> > optimize a function, or one branch of an IF, down to basically "(print 
> > "Type error") (abort)".  How to get it to issue warnings for these when 
> > they will help the user and not issue too many of them is still an open 
> > question.
> 
> A CL compiler isn't even allowed to issue an error at compile-time for
> a file containing only the following top-level form:
> 
> (error "End the world!")
> 
> Again, the error is to be signalled at run-time, since only there the
> corresponding handlers can enter the picture.

Same in Dylan.


> Turn it as you want, CL is completely dynamic by default

Not completely.  We have seen that CL doesn't allow you to redefine < 
system-wide to accommodate new types while Dylan does.  CL lets you 
shadow it in your own code, but you can't propagate that out into other 
people's libraries.


> Footnotes: 
> [1]  ANSI CL doesn't mandate the type of the error raised by the
>      default method on no-applicable-method, but many implementations
>      provide a specific condition for this error.

Ah!  So in fact it's exactly the same situation as in Dylan.  You can't 
portably do anything useful.  CMU Gwydion Dylan raises a <simple-error> 
(format string and args) while Functional Developer (nee Harlequin 
Dylan) raises a <dispatch-error> which you can presumably do something 
with.

-- Bruce
From: Pierre R. Mai
Subject: Re: Using type declarations in safety checks
Date: 
Message-ID: <87oflt3uh8.fsf@orion.bln.pmsf.de>
Bruce Hoult <·····@hoult.org> writes:

[ Delaying errors until run-time ]

> As I said, Dylan has the same philosophy in general.

I fail to see how that is the case:

> > From another POV in a truly dynamic language, the compiler can be
> > certain about not so many things, e.g. in your example:
> > 
> > > ---------------------
> > > define method foo(a :: <string>, b :: <list>)
> > >   pair(a, b)
> > > end;
> > > 
> > > foo(test:, #[1, 2, 3])
> > > ---------------------
> > > In Top level form.:
> > > "noapplicable.dylan", line 7, characters 1 through 22:
> > >     foo(test:, #[1, 2, 3])
> > >     ^^^^^^^^^^^^^^^^^^^^^^
> > >   Error: No applicable methods for argument types
> > >     (singleton(#"test"), direct-instance(<simple-object-vector>))
> > >   in call of foo
> > > ---------------------
> > 
> > Unless you have some guarantee that no one will be able to define
> > different methods on foo, how is the compiler to decide that there
> > will be no applicable method on foo at run-time?
> 
> That's just it.  You *do* have such a guarantee in this case, because 
> the "foo" GF is sealed by default.  If you wanted it to be open then you 
> would declare:

But even if you know that that will raise an error at run-time, why
does the Dylan compiler find it acceptable to issue an error at
compile-time?  Or does it still produce runnable code when errors are
encountered?  And if so, how does it distinguish that situation from
"real" compilation errors, i.e. errors that prevent the compiler from
emitting runnable code?

In my eyes, and that's a view that seems to be shared by many language
communities, an error at compile-time says to me:  You can't rely on
the emitted code (if any) to do anything well defined.  That is why
the things we discussed are warnings in CL, because the semantics of
the compiled code _are_ well defined, even if they are often not what
the programmer intended.

> > Or I might have defined a method on
> > no-applicable-method (or established a handler for the corresponding
> > condition[1]), that dynamically defined additional methods from
> > information present in the run-time environment, or otherwise handled
> > the situation so that the call to foo is again perfectly valid.
> 
> The Dylan Reference Manual doesn't list a standard no-applicable-method 
> condtion.  I'm not sure why.  You can certainly catch it as as <error>, 
> but anything more is non-portable.  Gwydion Dylan, for example, throws a 
> <simple-error>, which is an error with a format string and some 
> arguments and no restarts.

Well, since we are in c.l.l, and discussing CL, reading the CL
standard might shed more light on things.  no-applicable-method is a
GF that is invoked when there is no applicable method on a GF.  It is
the default method on this GF that signals an error.  Users are
allowed to override this behaviour.
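
A minimal sketch (FROB and its fallback behaviour are only
illustrative):

(defgeneric frob (x))

(defmethod frob ((x string))
  (length x))

;; Override the default error-signalling behaviour for this one GF by
;; specializing on the generic-function object itself.
(defmethod no-applicable-method ((gf (eql #'frob)) &rest args)
  (format t "~&No method on FROB for ~S; using a fallback.~%" args)
  :fallback)

;; (frob 42) now prints the notice and returns :FALLBACK instead of
;; signalling an error.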

> If you want to catch the error and recover somehow then it seems to me 
> that the better method in any case is to simply define a default 
> method...
> 
>   define method foo(a :: <object>, b :: <object>) ...  end
> 
> ... and that method can either directly do what is appropriate or else 
> can add a new methed to the GF (if it's open).

While this is one possible solution, it isn't the only useful
solution.  There are situations where a specific method on N-A-M is
going to be much more perspicuous, especially if there are lots of GFs
involved.

In any case this is immaterial to the problem at hand.

Regs, Pierre.

-- 
Pierre R. Mai <····@acm.org>                    http://www.pmsf.de/pmai/
 The most likely way for the world to be destroyed, most experts agree,
 is by accident. That's where we come in; we're computer professionals.
 We cause accidents.                           -- Nathaniel Borenstein