From: Xah Lee
Subject: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1149829487.457604.17510@i40g2000cwc.googlegroups.com>
in March, i posted an essay “What is Expressiveness in a Computer
Language”, archived at:
http://xahlee.org/perl-python/what_is_expresiveness.html

I was informed then that there is an academic paper written on this
subject.

On the Expressive Power of Programming Languages, by Matthias
Felleisen, 1990.
http://www.ccs.neu.edu/home/cobbe/pl-seminar-jr/notes/2003-sep-26/expressive-slides.pdf

Has anyone read this paper? And, would anyone be interested in giving a
summary?

thanks.

   Xah
   ···@xahlee.org
 ∑ http://xahlee.org/

From: Joe Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1149863687.298352.45980@h76g2000cwa.googlegroups.com>
Xah Lee wrote:
> in March, i posted a essay "What is Expressiveness in a Computer
> Language", archived at:
> http://xahlee.org/perl-python/what_is_expresiveness.html
>
> I was informed then that there is a academic paper written on this
> subject.
>
> On the Expressive Power of Programming Languages, by Matthias
> Felleisen, 1990.
> http://www.ccs.neu.edu/home/cobbe/pl-seminar-jr/notes/2003-sep-26/expressive-slides.pdf
>
> Has anyone read this paper? And, would anyone be interested in giving a
> summary?

The gist of the paper is this:  Some computer languages seem to be
`more expressive' than others.  But anything that can be computed in
one Turing complete language can be computed in any other Turing
complete language.  Clearly the notion of expressiveness isn't
concerned with ultimately computing the answer.

Felleisen's paper puts forth a formal definition of expressiveness in
terms of semantic equivalences of small, local constructs.  In his
definition, wholesale program transformation is disallowed, so you
cannot appeal to Turing completeness to claim program
equivalence.
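To make the "small, local construct" idea concrete, here is a toy
sketch in Python (my illustration only; Felleisen's definition is
formal and language-independent): a construct is expressible in a
smaller language if a purely local rewrite eliminates it while leaving
the rest of the program untouched.

```python
import ast

# Toy example of a local, macro-style rewrite: Python's augmented
# assignment `x += e` expands in place to `x = x + e`.  No global
# restructuring of the program is needed, which is what makes this a
# "local" transformation in Felleisen's sense.

class ExpandAugAssign(ast.NodeTransformer):
    def visit_AugAssign(self, node):
        # rewrite `target op= value` as `target = target op value`
        load_target = ast.Name(id=node.target.id, ctx=ast.Load())
        new = ast.Assign(
            targets=[node.target],
            value=ast.BinOp(left=load_target, op=node.op, right=node.value),
        )
        return ast.copy_location(new, node)

src = "total = 1\nfor i in range(5):\n    total += i\n"
tree = ast.fix_missing_locations(ExpandAugAssign().visit(ast.parse(src)))

env1, env2 = {}, {}
exec(src, env1)                                   # original program
exec(compile(tree, "<expanded>", "exec"), env2)   # locally rewritten program
assert env1["total"] == env2["total"] == 11       # same observable behavior
```

The transformation touches only the construct itself; by contrast, a
feature that forces you to rewrite the whole program to eliminate it
(Turing-completeness-style encoding) does not count.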

Expressiveness isn't necessarily a good thing.  For instance, in C, you
can express the
addresses of variables by using pointers.  You cannot express the same
thing in Java, and
most people consider this to be a good idea.
From: Simon Richard Clarkstone
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e6c5e2$3cn$1@heffalump.dur.ac.uk>
Joe Marshall wrote:
> Xah Lee wrote:
>>On the Expressive Power of Programming Languages, by Matthias
>>Felleisen, 1990.
>>http://www.ccs.neu.edu/home/cobbe/pl-seminar-jr/notes/2003-sep-26/expressive-slides.pdf
> 
> The gist of the paper is this: Some computer languages seem to be 
> `more expressive' than others. But anything that can be computed in
> one Turing complete language can be computed in any other Turing
> complete language. Clearly the notion of expressiveness isn't
> concerned with ultimately computing the answer.
> 
> Felleisen's paper puts forth a formal definition of expressiveness in
> terms of semantic equivilances of small, local constructs. In his
> definition, wholescale program transformation is disallowed so you
> cannot appeal to Turing completeness to claim program equivalence.

I suspect that the emphasis on small, local transformations, as
opposed to global ones, also has to do with the practice of not saying
the same thing twice.  Everything from subroutines to LISP macros
helps here, increasing language expressiveness.

> Expressiveness isn't necessarily a good thing. For instance, in C,
> you can express the addresses of variables by using pointers. You
> cannot express the same thing in Java, and most people consider this
> to be a good idea.

Assuming the more-expressive feature does not preclude the 
less-expressive one, good/bad depends on the programmer.  I know *I* 
can't be trusted with pointers ;-) , but I know many programmers benefit 
greatly from them.  Of course, knowing that the programmer cannot do 
something does help the compiler stop you from shooting yourself in the foot.

-- 
Simon Richard Clarkstone: ··············@·····························@
hotmail.com ### "I have a spelling chequer / it came with my PC /
it plainly marks for my revue / Mistake's I cannot sea" ...
by: John Brophy (at: http://www.cfwf.ca/farmj/fjjun96/)
From: Ken Tilton
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <bJlig.1852$E9.1201@fe11.lga>
Joe Marshall wrote:
> Xah Lee wrote:
> 
>>in March, i posted a essay "What is Expressiveness in a Computer
>>Language", archived at:
>>http://xahlee.org/perl-python/what_is_expresiveness.html
>>
>>I was informed then that there is a academic paper written on this
>>subject.
>>
>>On the Expressive Power of Programming Languages, by Matthias
>>Felleisen, 1990.
>>http://www.ccs.neu.edu/home/cobbe/pl-seminar-jr/notes/2003-sep-26/expressive-slides.pdf
>>
>>Has anyone read this paper? And, would anyone be interested in giving a
>>summary?
> 
> 
> The gist of the paper is this:  Some computer languages seem to be
> `more expressive' than
> others.  But anything that can be computed in one Turing complete
> language can be computed in any other Turing complete language.
> Clearly the notion of
> expressiveness isn't concerned with ultimately computing the answer.
> 
> Felleisen's paper puts forth a formal definition of expressiveness in
> terms of semantic
> equivilances of small, local constructs.  In his definition, wholescale
> program transformation is
> disallowed so you cannot appeal to Turing completeness to claim program
> equivalence.
> 
> Expressiveness isn't necessarily a good thing.  For instance, in C, you
> can express the
> addresses of variables by using pointers.  You cannot express the same
> thing in Java, and
> most people consider this to be a good idea.
> 

Thanks for the summary.

Me, I would like to see a definition of expressiveness that would 
exclude a programming mechanism from "things to be expressed".

If the subject is programmer productivity, well, I write programs to get 
some behavior out of them, such as operating an ATM cash dispenser. If I 
need to keep a list of transactions, I need to express the abstraction 
"list" in some data structure or other, but below that level of 
abstraction I am just hacking code, not expressing myself -- well, that 
is the distinction for which I am arguing.

heck, in this case I will even give you as "thing to express" getting 
back multiple values from a function. That comes up all the time, and it 
can be an aggravation or a breeze. But then I would score C down because 
it does not really return multiple values. One still has some heavy 
lifting to do to fake the expressed thing. But I would still give it an 
edge over java because Java's fakery would have to be a composite object 
-- one could not have a primary return value as the function result and 
ancillary values "somewhere else".
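The three styles contrasted above can be sketched in Python (purely
illustrative; the function and class names are made up):

```python
# Three ways to get "multiple values" out of a function, sketched in
# Python for illustration (the thread's examples are C, Java, and Lisp).

# 1. Native multiple values (like CL's VALUES): a primary result plus
#    ancillary ones the caller may ignore.
def divmod_native(a, b):
    return a // b, a % b

# 2. C-style faking: primary value as the function result, ancillary
#    value smuggled out through a writable out-parameter.
def divmod_outparam(a, b, out):
    out[0] = a % b            # the "heavy lifting"
    return a // b

# 3. Java-style faking: one composite object; there is no longer a
#    distinguished primary return value.
class DivModResult:
    def __init__(self, quotient, remainder):
        self.quotient = quotient
        self.remainder = remainder

def divmod_composite(a, b):
    return DivModResult(a // b, a % b)

q, r = divmod_native(17, 5)
cell = [None]
q2 = divmod_outparam(17, 5, cell)
res = divmod_composite(17, 5)
assert (q, r) == (3, 2)
assert (q2, cell[0]) == (3, 2)
assert (res.quotient, res.remainder) == (3, 2)
```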

kt

-- 
Cells: http://common-lisp.net/project/cells/

"I'll say I'm losing my grip, and it feels terrific."
    -- Smiling husband to scowling wife, New Yorker cartoon
From: Xah Lee
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150275826.112536.280940@f6g2000cwb.googlegroups.com>
hi Joe,

Joe Marshall wrote:
 « Expressiveness isn't necessarily a good thing.  For instance, in C,
you can express the addresses ...»

we gotta be careful here, because soon we gonna say binaries are the
most expressive. For instance, in assembly, you can express the
registers and stuff.

Expressiveness, with respect to — for lack of refined terms —
semantics, is a good thing, period. When discussing a language's
semantic expressiveness, it goes without saying that a “domain”
is understood, or needs to be defined. This is almost never mentioned
because it is very difficult. Put it in driveler's chant for better
understanding: we can't “compare apples with oranges”.

Let me give an example. Let's say i invented a language where there's
no addition of numbers, but phaserfy and realify with respective
operators ph and re. So, in my language, to do 1+2, you write “ph 1
re ph 2”, which means, to phaserfy 1, and phaserfy 2, then realify
their results, which results in 3. Now, this language is the most
expressive, because it can deal with concepts of phaserfy and realify
that no other lang can.

This may seem ridiculous, but it is in fact what a lot of imperative
languages do. I won't go long here, but for instance, the addresses or
references of C and Perl are such. And in Java and a few other OOP
langs, the “iterator” and “enumerator” things are likewise immaterial.

As to OOP's iterator and enumerator things, and the general perspective
of extraneous concepts in languages, i'll have to write an essay in
detail some other day.

----

Thanks for the summary.

Is there no one else who is able to read that paper?

  Xah
  ···@xahlee.org
∑ http://xahlee.org/

> Xah Lee wrote:
> > in March, i posted a essay "What is Expressiveness in a Computer
> > Language", archived at:
> > http://xahlee.org/perl-python/what_is_expresiveness.html
> > ...
> > On the Expressive Power of Programming Languages, by Matthias
> > Felleisen, 1990.

> > http://www.ccs.neu.edu/home/cobbe/pl-seminar-jr/notes/2003-sep-26/expressive-slides.pdf

Joe Marshall wrote:
> The gist of the paper is this:  Some computer languages seem to be
> `more expressive' than
> others.  But anything that can be computed in one Turing complete
> language can be computed in any other Turing complete language.
> Clearly the notion of
> expressiveness isn't concerned with ultimately computing the answer.
>
> Felleisen's paper puts forth a formal definition of expressiveness in
> terms of semantic
> equivilances of small, local constructs.  In his definition, wholescale
> program transformation is
> disallowed so you cannot appeal to Turing completeness to claim program
> equivalence.
>
> Expressiveness isn't necessarily a good thing.  For instance, in C, you
> can express the
> addresses of variables by using pointers.  You cannot express the same
> thing in Java, and
> most people consider this to be a good idea.
From: Torben Ægidius Mogensen
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <7zpshbsvjy.fsf@app-1.diku.dk>
"Joe Marshall" <··········@gmail.com> writes:


> > On the Expressive Power of Programming Languages, by Matthias
> > Felleisen, 1990.
> > http://www.ccs.neu.edu/home/cobbe/pl-seminar-jr/notes/2003-sep-26/expressive-slides.pdf
> 
> The gist of the paper is this: Some computer languages seem to be
> `more expressive' than others.  But anything that can be computed in
> one Turing complete language can be computed in any other Turing
> complete language.  Clearly the notion of expressiveness isn't
> concerned with ultimately computing the answer.
>
> Felleisen's paper puts forth a formal definition of expressiveness
> in terms of semantic equivilances of small, local constructs.  In
> his definition, wholescale program transformation is disallowed so
> you cannot appeal to Turing completeness to claim program
> equivalence.

I think expressiveness is more subtle than this.  Basically, it boils
down to: "How quickly can I write a program to solve my problem?".

There are several aspects relevant to this issue, some of which are:

 - Compactness: How much do I have to type to do what I want?

 - Naturality: How much effort does it take to convert the concepts of
   my problem into the concepts of the language?

 - Feedback: Will the language provide sensible feedback when I write
   nonsensical things?

 - Reuse: How much effort does it take to reuse/change code to solve a
   similar problem?

Compactness is hard to measure.  It isn't really about the number of
characters needed in a program, as I don't think one-character symbols
instead of longer keywords make a language more expressive.  It is
better to count lexical units, but if there are too many different
predefined keywords and operators, this isn't reasonable either.
Also, the presence of opaque one-liners doesn't make a language
expressive.  Additionally, as mentioned above, Turing-completeness
(TC) allows you to implement any TC language in any other, so above a
certain size, the choice of language doesn't affect size.  But
something like (number of symbols in program)/log(number of different
symbols) is not too bad.  If programs are allowed to use standard
libraries, the identifiers in the libraries should be counted in the
number of different symbols.
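As a toy illustration of that size measure (the tokenization
granularity chosen here is my own, not Torben's):

```python
import math

# Rough sketch of the suggested metric:
#   (number of symbols in program) / log(number of different symbols)
# "Symbol" means lexical unit (keyword, identifier, operator); library
# identifiers would count among the distinct symbols.

def compactness(tokens):
    distinct = len(set(tokens))
    return len(tokens) / math.log(distinct)

# A small program, already split into lexical units:
program = ["def", "f", "(", "x", ")", ":", "return", "x", "+", "1"]

# 10 tokens, 9 distinct ("x" repeats): 10 / ln(9)
score = compactness(program)
assert 4.5 < score < 4.6
```

Note how the logarithm tempers the penalty for a large vocabulary:
doubling the number of distinct predefined keywords does not halve the
score.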

Naturality is very difficult to get a grip on, and it strongly depends
on the type of problem you want to solve.  So it only makes sense to
talk about expressiveness relative to a set of problem domains.  If
this set is small, domain-specific languages win hands down, so if you
want to compare expressiveness of general-purpose languages, you need
a large set of very different problems.  And even with a single
problem, it is hard to get an objective measure of how difficult it is
to map the problem's concepts to those of the language.  But you can
normally observe whether you need to overspecify the concept (i.e.,
you are required to make arbitrary decisions when mapping from concept
to data), if the mapping is onto (i.e., can you construct data that
isn't sensible in the problem domain) and how much redundancy your
representation has.

Feedback is a mixture of several things.  Partly, it is related to
naturality, as a close match between problem concepts and language
concepts makes it less likely that you will express nonsense (relative
to the problem domain) that makes sense in the language.  For example,
if you have to code everything as natural numbers, untyped pure lambda
calculus or S-expressions, there is a good chance that you can get
nonsense past the compiler.  Additionally, it is about how difficult
it is to tie an observation about a computed result to a point in the
program.

Measuring reuse depends partly on what is meant by problems being
similar and also on whether you at the time you write the original
code can predict what types of problems you might later want to solve,
i.e., if you can prepare the code for reuse.  Some languages provide
strong mechanisms for reuse (templates, object hierarchies, etc.), but
many of those require that you can predict how the code is going to be
reused.  So, maybe, you should measure how difficult it is to reuse a
piece of code that is _not_ written with reuse in mind.

This reminds me a bit of last year's ICFP contest, where part of the
problem was adapting to a change in specification after the code was
written.

> Expressiveness isn't necessarily a good thing.  For instance, in C,
> you can express the addresses of variables by using pointers.  You
> cannot express the same thing in Java, and most people consider this
> to be a good idea.

I think this is pretty much covered by the above points on naturality
and feedback: Knowing the address of a value or object is an
overspecification unless the address maps back into something in the
problem domain.

On a similar note, is a statically typed language more or less
expressive than a dynamically typed language?  Some would say less, as
you can write programs in a dynamically typed language that you can't
compile in a statically typed language (without a lot of encoding),
whereas the converse isn't true.  However, I think this is misleading,
as it ignores the feedback issue: It takes longer for the average
programmer to get the program working in the dynamically typed
language.

        Torben
From: Rob Thorpe
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150297932.458202.227360@c74g2000cwc.googlegroups.com>
Torben Ægidius Mogensen wrote:
> On a similar note, is a statically typed langauge more or less
> expressive than a dynamically typed language?  Some would say less, as
> you can write programs in a dynamically typed language that you can't
> compile in a statically typed language (without a lot of encoding),
> whereas the converse isn't true.  However, I think this is misleading,
> as it ignores the feedback issue: It takes longer for the average
> programmer to get the program working in the dynamically typed
> language.

From the point of view purely of expressiveness I'd say it's rather
different.

If a language can express constraints of one kind, that is an increase
in expressiveness.
If a language requires constraints to be expressed in one particular
way, that's a decrease in expressiveness.

So I would say languages that can be statically typed and can be
dynamically typed are the most expressive.  Languages that require
static typing or are dynamic but cannot express static typing are less
expressive.

This matches my experience of what is useful in practice, too: static
typing for everything is painful for writing simple code.  Pure
dynamic typing is painful when writing complex code because it rules
out a layer of error checking that could otherwise be useful.
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e6pms6$59s$1@online.de>
Rob Thorpe schrieb:
> 
> If a language can express constraints of one kind, that is an increase
> in expressiveness.

Agreed.

> If a language requires constraints to be expressed in one particular
> way, that's a decrease in expressiveness.

Unless alternatives would be redundant.
Having redundant ways to express the same thing doesn't make a language 
more or less expressive (but programs written in it become more 
difficult to maintain).

> So I would say languages that can be statically typed and can be
> dynamically typed are the most expressive.  Languages that require
> static typing or are dynamic but cannot express static typing are less
> expressive.

Note that this is a different definition of expressiveness.
(The term is very diffuse...)

I think Felleisen's paper defines something that should be termed 
"conciseness".
Whether there's a way to express constraints or other static properties 
of the software is something different. I don't have a good word for it, 
but "expressiveness" covers too much for my taste to really fit.

Regards,
Jo
From: Raffael Cavallaro
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <2006061410510511272-raffaelcavallaro@pasdespamsilvousplaitmaccom>
On 2006-06-14 09:42:25 -0400, ·······@app-1.diku.dk (Torben Ægidius 
Mogensen) said:

> It takes longer for the average
> programmer to get the program working in the dynamically typed
> language.

Though I agree with much of your post, I would say that many here find 
the opposite to be true - it takes us longer to get a program working 
in a statically typed language because we have to keep adding/changing 
things to get the compiler to stop complaining and actually compile and 
run a program which would be perfectly permissible in a dynamically 
typed language such as common lisp - for example, heterogeneous lists 
and forward references to as yet non-existent functions.
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e6pmjl$4j5$1@online.de>
Raffael Cavallaro schrieb:
> On 2006-06-14 09:42:25 -0400, ·······@app-1.diku.dk (Torben Ægidius 
> Mogensen) said:
> 
>> It takes longer for the average
>> programmer to get the program working in the dynamically typed
>> language.
> 
> Though I agree with much of your post I would say that many here find 
> the opposite to be true - it takes us longer to get a program working in 
> a statically typed language because we have to keep adding/changing 
> things to get the compiler to stop complaining and actually compile and 
> run

I think Torben was assuming a language with type inference. You write 
only those type annotations that really carry meaning (and some people 
let the compiler infer even these).

 > a program which would be perfectly permissible in a dynamically
> typed language such as common lisp - for example - heterogeneous lists 
> and forward references to as yet non-existent functions.

Um... heterogenous lists are not necessarily a sign of expressiveness. 
The vast majority of cases can be transformed to homogenous lists 
(though these might then contain closures or OO objects).

As to references to nonexistent functions - heck, I never missed these, 
not even in languages without type inference :-)

I don't hold that they are a sign of *in*expressiveness either. They are 
just typical of highly dynamic programming environments such as Lisp or 
Smalltalk.

Regards,
Jo
From: Pascal Bourguignon
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <878xnzv5i3.fsf@thalassa.informatimago.com>
Joachim Durchholz <··@durchholz.org> writes:
> Raffael Cavallaro schrieb:
>> a program which would be perfectly permissible in a dynamically
>> typed language such as common lisp - for example - heterogeneous
>> lists and forward references to as yet non-existent functions.
>
> Um... heterogenous lists are not necessarily a sign of
> expressiveness. The vast majority of cases can be transformed to
> homogenous lists (though these might then contain closures or OO
> objects).

In lisp, all lists are homogenous: lists of T.


-- 
__Pascal Bourguignon__                     http://www.informatimago.com/

ADVISORY: There is an extremely small but nonzero chance that,
through a process known as "tunneling," this product may
spontaneously disappear from its present location and reappear at
any random place in the universe, including your neighbor's
domicile. The manufacturer will not be responsible for any damages
or inconveniences that may result.
From: Raffael Cavallaro
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <2006061501582450878-raffaelcavallaro@pasdespamsilvousplaitmaccom>
On 2006-06-14 16:36:52 -0400, Pascal Bourguignon <···@informatimago.com> said:

> In lisp, all lists are homogenous: lists of T.

CL-USER 123 > (loop for elt in (list #\c 1 2.0d0 (/ 2 3)) collect 
(type-of elt))
(CHARACTER FIXNUM DOUBLE-FLOAT RATIO)

i.e., "heterogenous" in the common lisp sense: having different dynamic 
types, not in the H-M sense in which all lisp values are of the single 
union type T.
From: Torben Ægidius Mogensen
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <7zy7vxqxni.fsf@app-1.diku.dk>
Raffael Cavallaro <················@pas-d'espam-s'il-vous-plait-mac.com> writes:

> On 2006-06-14 16:36:52 -0400, Pascal Bourguignon <···@informatimago.com> said:
> 
> > In lisp, all lists are homogenous: lists of T.
> 
> CL-USER 123 > (loop for elt in (list #\c 1 2.0d0 (/ 2 3)) collect
> (type-of elt))
> (CHARACTER FIXNUM DOUBLE-FLOAT RATIO)
> 
> i.e., "heterogenous" in the common lisp sense: having different
> dynamic types, not in the H-M sense in which all lisp values are of
> the single union type T.

What's the difference?  Dynamically typed values _are_ all members of
a single tagged union type.  The main difference is that the tags
aren't always visible and that there are only a fixed, predefined
number of them.
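A minimal model of that view, sketched in Python (illustrative only;
the tag names are made up and this mirrors no actual Lisp
implementation):

```python
# Torben's point modeled directly: a dynamically typed runtime is one
# big tagged union.  The tags are fixed and predefined; every value
# carries its tag at run time, and a primitive like TYPE-OF reveals it.

TAGS = ("fixnum", "character", "cons", "symbol")   # fixed, predefined set

def mk(tag, payload):
    assert tag in TAGS, "unknown tag"
    return (tag, payload)          # every value is a member of the union

def type_of(value):
    return value[0]                # the normally-hidden tag, made visible

lst = [mk("character", "c"), mk("fixnum", 1), mk("symbol", "FOO")]
assert [type_of(v) for v in lst] == ["character", "fixnum", "symbol"]
```

Under this model the "heterogeneous list" of the earlier posts is a
perfectly homogeneous list: every element has the same (union) type.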

        Torben
From: Pascal Costanza
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <4ffd0tF1iprnnU1@individual.net>
Torben Ægidius Mogensen wrote:
> Raffael Cavallaro <················@pas-d'espam-s'il-vous-plait-mac.com> writes:
> 
>> On 2006-06-14 16:36:52 -0400, Pascal Bourguignon <···@informatimago.com> said:
>>
>>> In lisp, all lists are homogenous: lists of T.
>> CL-USER 123 > (loop for elt in (list #\c 1 2.0d0 (/ 2 3)) collect
>> (type-of elt))
>> (CHARACTER FIXNUM DOUBLE-FLOAT RATIO)
>>
>> i.e., "heterogenous" in the common lisp sense: having different
>> dynamic types, not in the H-M sense in which all lisp values are of
>> the single union type T.
> 
> What's the difference?  Dynamically typed values _are_ all members of
> a single tagged union type.

Yes, but that's mostly a meaningless statement in a dynamically typed 
language. In a dynamically typed language, you typically don't care 
about the static types.

> The main difference is that the tags
> aren't always visible and that there are only a fixed, predefined
> number of them.

Depending on the language, the number of "tags" is not fixed.


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: Raffael Cavallaro
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <2006061501423327544-raffaelcavallaro@pasdespamsilvousplaitmaccom>
On 2006-06-14 15:04:34 -0400, Joachim Durchholz <··@durchholz.org> said:

> Um... heterogenous lists are not necessarily a sign of expressiveness. 
> The vast majority of cases can be transformed to homogenous lists 
> (though these might then contain closures or OO objects).
> 
> As to references to nonexistent functions - heck, I never missed these, 
> not even in languages without type inference :-)
> 
> I don't hold that they are a sign of *in*expressiveness either. They 
> are just typical of highly dynamic programming environments such as 
> Lisp or Smalltalk.

This is a typical static type advocate's response when told that users 
of dynamically typed languages don't want their hands tied by a type 
checking compiler:

"*I* don't find those features expressive so *you* shouldn't want them."

You'll have to excuse us poor dynamically typed language rubes - we 
find these features expressive and we don't want to give them up just 
to silence a compiler whose static type checks are of dubious value in 
a world where user inputs of an often unpredictable nature can come at 
a program from across a potentially malicious internet, making 
run-time checks a practical necessity.
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e6tt7j$b41$1@online.de>
Raffael Cavallaro schrieb:
> On 2006-06-14 15:04:34 -0400, Joachim Durchholz <··@durchholz.org> said:
> 
>> Um... heterogenous lists are not necessarily a sign of expressiveness. 
>> The vast majority of cases can be transformed to homogenous lists 
>> (though these might then contain closures or OO objects).
>>
>> As to references to nonexistent functions - heck, I never missed 
>> these, not even in languages without type inference :-)
>>
>> [[snipped - doesn't seem to relate to your answer]]
> 
> This is a typical static type advocate's response when told that users 
> of dynamically typed languages don't want their hands tied by a type 
> checking compiler:
> 
> "*I* don't find those features expressive so *you* shouldn't want them."

And this is a typical dynamic type advocate's response when told that 
static typing has different needs:

"*I* don't see the usefulness of static typing so *you* shouldn't want 
it, either."

No ad hominem arguments, please. If you find my position indefensible, 
give counterexamples.
Give a heterogenous list that would be too awkward to live in a 
statically-typed language.
Give a case of calling nonexistent functions that's useful.
You'll get your point across far better that way.

Regards,
Jo
From: Sacha
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <dsvkg.484881$w52.12555370@phobos.telenet-ops.be>
"Joachim Durchholz" <··@durchholz.org> wrote in message 
·················@online.de...
> Raffael Cavallaro schrieb:
>> On 2006-06-14 15:04:34 -0400, Joachim Durchholz <··@durchholz.org> said:
>>
>>> Um... heterogenous lists are not necessarily a sign of expressiveness. 
>>> The vast majority of cases can be transformed to homogenous lists 
>>> (though these might then contain closures or OO objects).
>>>
>>> As to references to nonexistent functions - heck, I never missed these, 
>>> not even in languages without type inference :-)
>>>
>>> [[snipped - doesn't seem to relate to your answer]]
>>
> Give a heterogenous list that would be too awkward to live in a 
> statically-typed language.

Many lists are heterogenous, even in statically typed languages.
For instance lisp code is made of lists, with several kinds of atoms
and sub-lists.
A car dealer will sell cars, trucks and equipment.
In a statically typed language you would need to type the list on a
common ancestor...
What would then be the point of static typing, as you still need to
type check each element in order to process that list?  Sure you can
do this in a statically-typed language, you just need to make sure
some relevant ancestor exists.  In my experience you'll end up with
the base object-class more often than not, and that's what i call
dynamic typing.
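The car-dealer scenario can be sketched in Python (class names are
mine): even with a common ancestor, processing still has to check each
element's concrete type, which is the "dynamic typing in disguise"
being described.

```python
# Sacha's car-dealer list with a common ancestor.  Typing the list on
# Vehicle does not remove the per-element dispatch on concrete types.

class Vehicle: pass
class Car(Vehicle): pass
class Truck(Vehicle): pass
class Equipment(Vehicle): pass     # the dealer sells these too

inventory = [Car(), Truck(), Equipment(), Car()]   # "list of Vehicle"

def describe(item):
    # the per-element type check the post objects to
    if isinstance(item, Car):
        return "car"
    elif isinstance(item, Truck):
        return "truck"
    else:
        return "equipment"

assert [describe(v) for v in inventory] == ["car", "truck", "equipment", "car"]
```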

> Give a case of calling nonexistent functions that's useful.

I might want to test some other parts of my program before writing this 
function.
Or maybe my program will compile that function depending on user input.
As long as i get a warning for calling a non-existing function, everything 
is fine.

Sacha 
From: Hendrik Maryns
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e6u156$3dd$1@newsserv.zdv.uni-tuebingen.de>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sacha schreef:
> "Joachim Durchholz" <··@durchholz.org> wrote in message 
> ·················@online.de...
>> Raffael Cavallaro schrieb:
>>> On 2006-06-14 15:04:34 -0400, Joachim Durchholz <··@durchholz.org> said:
>>>
>>>> Um... heterogenous lists are not necessarily a sign of expressiveness. 
>>>> The vast majority of cases can be transformed to homogenous lists 
>>>> (though these might then contain closures or OO objects).
>>>>
>>>> As to references to nonexistent functions - heck, I never missed these, 
>>>> not even in languages without type inference :-)
>>>>
>>>> [[snipped - doesn't seem to relate to your answer]]
>> Give a heterogenous list that would be too awkward to live in a 
>> statically-typed language.
> 
> Many lists are heterogenous, even in statically typed languages.
> For instance lisp code are lists, with several kinds of atoms and 
> sub-lists..
> A car dealer will sell cars, trucks and equipment..
> In a statically typed language you would need to type the list on a common 
> ancestor...
> What would then be the point of statical typing , as you stilll need to type 
> check
> each element in order to process that list ? Sure you can do this in a 
> statically-typed
> language, you just need to make sure some relevant ancestor exists. In my 
> experience
> you'll end up with the base object-class more often than not, and that's 
> what i call
> dynamic typing.

In my experience you won’t.  I almost never have a List<Object> (Java),
and when I have one, I start thinking on how I can improve the code to
get rid of it.

H.
- --
Hendrik Maryns

==================
http://aouw.org
Ask smart questions, get good answers:
http://www.catb.org/~esr/faqs/smart-questions.html
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (GNU/Linux)

iD8DBQFEkofme+7xMGD3itQRAo0XAJ9DiG228ZKMLX8JH+u9X+6YGEHwgQCdH/Jn
kC/F/b5rbmUvUzKYIv8agis=
=iuV0
-----END PGP SIGNATURE-----
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e6v9fa$pgc$1@online.de>
Sacha schrieb:
> 
> Many lists are heterogenous, even in statically typed languages.
> For instance lisp code are lists, with several kinds of atoms and 
> sub-lists..

Lisp isn't exactly a statically-typed language :-)

> A car dealer will sell cars, trucks and equipment..
> In a statically typed language you would need to type the list on a common 
> ancestor...

Where's the problem with that?

BTW the OO way isn't the only way to set up a list from heterogenous data.
In statically-typed FPL land, lists require homogenous data types all 
right, but the list elements aren't restricted to data - they can be 
functions as well.
Now the other specialty of FPLs is that you can construct functions at 
run-time - you take a function, fill some of its parameters and leave 
others open - the result is another function. And since you'll iterate 
over the list and will do homogenous processing over it, you construct 
the function so that it will do all the processing that you'll later need.

The advantage of the FPL way over the OO way is that you can use ad-hoc 
functions. You don't need precognition to know which kinds of data 
should be lumped under a common supertype - you simply write and/or 
construct functions of a common type that will go into the list.
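The construction described here can be sketched in a few lines. This is a hypothetical TypeScript illustration (the car/truck data and all names are invented for the example, not from the thread):

```typescript
// A homogeneous list of functions built from heterogeneous data.
// Each closure captures its own data; the list type is just () => number.
type PriceFn = () => number;

// Hypothetical data: no common "Vehicle" supertype is needed.
const car = { model: "sedan", basePrice: 20000 };
const truck = { model: "hauler", basePrice: 55000, towingFee: 500 };

// Partially apply: capture the data now, leave the computation for later.
const inventory: PriceFn[] = [
  () => car.basePrice,
  () => truck.basePrice + truck.towingFee,
];

// Homogeneous processing over the list, fully statically checked.
const total = inventory.reduce((sum, price) => sum + price(), 0);
```

No supertype was designed up front; the only shared type is the ad-hoc function type of the list elements.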

> What would then be the point of static typing, as you still need to type
> check each element in order to process that list?

Both OO and FPL construction allow static type checks.

> Sure you can do this in a statically-typed language, you just need to
> make sure some relevant ancestor exists. In my experience you'll end up
> with the base object-class more often than not, and that's what I call
> dynamic typing.

Not quite - the common supertype is more often than not actually useful.

However, getting the type hierarchy right requires a *lot* of 
experimentation and fine-tuning. You can easily spend a year or more 
(sometimes *much* more) with that (been there, done that). Even worse, 
once the better hierarchy is available, you typically have to adapt all 
the client code that uses it (been there, done that, too).

Those are the problems in OO land. FPL land doesn't have them - if the
list type is just a function mapping two integers to another integer,
reworking the data types that went into the functions of the list doesn't
require those global changes.

>> Give a case of calling nonexistent functions that's useful.
> 
> I might want to test some other parts of my program before writing this 
> function.

That's unrelated to dynamic typing. All that's needed is an environment 
that throws an exception once such an undefined function is called, 
instead of aborting the compilation.

I'll readily admit that very few static languages offer such an 
environment. (Though, actually, C interpreters do exist.)
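Such an environment can be approximated even in an ordinary statically-typed language. Here is a hypothetical TypeScript sketch (the names are invented): a `never`-returning placeholder lets unwritten functions type-check, deferring the failure to the moment of the call:

```typescript
// A placeholder that satisfies any return type (it never returns normally),
// so unwritten functions compile but only throw if actually called.
function notYetWritten(name: string): never {
  throw new Error(`function not yet written: ${name}`);
}

// Declared with its intended type, body deferred:
const parsePrice = (line: string): number => notYetWritten("parsePrice");

// Other parts of the program can be written and tested immediately:
const withFee = (price: number): number => price + 50;
```

Calling `withFee` works right away; calling `parsePrice` throws at run time instead of aborting compilation.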

> Or maybe my program will compile that function depending on user input.

Hmm... do I really want this kind of power at the user's hand in the age 
of malware?

> As long as i get a warning for calling a non-existing function, everything 
> is fine.

That depends.
For software that's written to run once (or very few times), and where 
somebody who's able to correct problems is always nearby, that's a 
perfectly viable strategy.
For safety-critical software where problems must be handled within 
seconds (or an even shorter period of time), you want to statically 
ensure as many properties as you can. You'll take not just static 
typing, you also want to ascertain value ranges and dozens of other 
properties. (In Spark, an Ada subset, this is indeed done.)

Between those extremes, there's a broad spectrum.
From: Raffael Cavallaro
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <2006061611291216807-raffaelcavallaro@pasdespamsilvousplaitmaccom>
On 2006-06-16 05:22:08 -0400, Joachim Durchholz <··@durchholz.org> said:

> And this is a typical dynamic type advocate's response when told that 
> static typing has different needs:
> 
> "*I* don't see the usefulness of static typing so *you* shouldn't want 
> it, either."

But I haven't made this sort of argument. I never said you shouldn't 
use static typing if you want to. There are indeed types of software 
where one wants the guarantees provided by static type checks. For 
example, software that controls irreplaceable or very expensive 
equipment such as spacecraft, or software that can kill people if it 
fails such as software for aircraft or medical devices. The problem for 
static typing advocates is that most software is not of this type.

There is a very large class of software where user inputs are 
unpredictable and/or where input data comes from an untrusted source. 
In these cases run-time checks are going to be needed anyway so the 
advantages of static type checking are greatly reduced - you end up 
doing run-time checks anyway, precisely the thing you were trying to 
avoid by doing static analysis. In software like this it isn't worth 
satisfying a static type checker because you don't get much of the 
benefit anyway.
From: Raffael Cavallaro
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <2006061611492350073-raffaelcavallaro@pasdespamsilvousplaitmaccom>
On 2006-06-16 05:22:08 -0400, Joachim Durchholz <··@durchholz.org> said:

> And this is a typical dynamic type advocate's response when told that 
> static typing has different needs:
> 
> "*I* don't see the usefulness of static typing so *you* shouldn't want 
> it, either."

But I haven't made this sort of argument. I never said you shouldn't 
use static typing if you want to. There are indeed types of software 
where one wants the guarantees provided by static type checks. For 
example, software that controls irreplaceable or very expensive 
equipment such as spacecraft, or software that can kill people if it 
fails such as software for aircraft or medical devices. The problem for 
static typing advocates is that most software is not of this type.

There is a very large class of software where user inputs are 
unpredictable and/or where input data comes from an untrusted source. 
In these cases run-time checks are going to be needed anyway so the 
advantages of static type checking are greatly reduced - you end up 
doing run-time checks anyway, precisely the thing you were trying to 
avoid by doing static analysis. In software like this it isn't worth 
satisfying a static type checker because you don't get much of the 
benefit anyway and it means forgoing such advantages of dynamic typing 
as being able to run and test portions of a program before other parts 
are written (forward references to as yet nonexistent functions).

Ideally one wants a language with switchable typing - static where 
possible and necessary, dynamic elsewhere. To a certain extent this is 
what common lisp does but it requires programmer declarations. Some 
implementations try to move beyond this by doing type inference and 
alerting the programmer to potential static guarantees that the 
programmer could make that would allow the compiler to do a better job.

In effect the argument comes down to which kind of typing one thinks 
should be the default. Dynamic typing advocates think that static 
typing is the wrong default. The notion that static typing can prove 
program correctness is flawed - it can only prove that type constraints 
are not violated but not necessarily that program logic is correct. It 
seems to me that if we set aside that class of software where safety is 
paramount - mostly embedded software such as aircraft and medical 
devices - we are left mostly with efficiency concerns. The 80-20 rule 
suggests that most code doesn't really need the efficiency provided by 
static guarantees. So static typing should be invoked for that small 
portion of a program where efficiency is really needed, and dynamic
typing should be the default elsewhere. This is how common lisp works - 
dynamic typing by default with static guarantees available where one 
needs them.
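A rough illustration of switchable typing, sketched here with TypeScript's gradual typing rather than Common Lisp's declarations (the function names and data are invented):

```typescript
// Exploratory phase: `any` opts this function out of static checking,
// much like dynamic-by-default typing - misuse surfaces only at run time.
function totalDynamic(items: any): number {
  let sum = 0;
  for (const it of items) sum += it.price;
  return sum;
}

// Tightened later: the same code with a declared element type,
// so misuse is caught at compile time instead.
interface Priced { price: number }
function totalStatic(items: Priced[]): number {
  let sum = 0;
  for (const it of items) sum += it.price;
  return sum;
}
```

The body is identical in both; only the amount of static guarantee is switched, which is the spirit of the proposal above.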
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e6v9iv$pgc$2@online.de>
Raffael Cavallaro schrieb:
> There is a very large class of software where user inputs are 
> unpredictable and/or where input data comes from an untrusted source. In 
> these cases run-time checks are going to be needed anyway so the 
> advantages of static type checking are greatly reduced - you end up 
> doing run-time checks anyway, precisely the thing you were trying to 
> avoid by doing static analysis.

There's still a large class of errors that *can* be excluded via type 
checking.

> Ideally one wants a language with switchable typing - static where 
> possible and necessary, dynamic elsewhere.

That has been my position for a long time now.

 > To a certain extent this is
> what common lisp does but it requires programmer declarations. Some 
> implementations try to move beyond this by doing type inference and 
> alerting the programmer to potential static guarantees that the 
> programmer could make that would allow the compiler to do a better job.

I think it's easier to start with a good (!) statically-typed language 
and relax the checking, than to start with a dynamically-typed one and 
add static checks.
With the right restrictions, a language can make all kinds of strong 
guarantees, and it can make it easy to construct software where static 
guarantees abound. If the mechanisms are cleverly chosen, they interfere 
just minimally with the programming process. (A classical example is the
Hindley-Milner type inference system. Typical reports from languages 
with HM systems say that you can have it verify thousand-line programs 
without a single type annotation in the code. That's actually far better 
than you'd need - you'd *want* to document the types at least on the 
major internal interfaces after all *grin*.)
With a dynamically-typed language, programming style tends to evolve in 
directions that make it harder to give static guarantees.
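TypeScript's inference is local and much weaker than Hindley-Milner, but a small sketch shows the flavor of annotation-free static checking (the values are invented for illustration):

```typescript
// No annotations anywhere below, yet everything is statically checked:
const nums = [1, 2, 3];                    // inferred as number[]
const doubled = nums.map((n) => n * 2);    // inferred as number[]
const shown = doubled.map((n) => `#${n}`); // inferred as string[]

// A type error such as `doubled.map((n) => n.length)` would be
// rejected at compile time, with no annotation ever written.
```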

> It seems to 
> me that if we set aside that class of software where safety is paramount 
> - mostly embedded software such as aircraft and medical devices - we are 
> left mostly with efficiency concerns.

Nope. Efficiency has taken a back seat. Software is getting slower 
(barely offset by increasing machine speed), and newer languages don't
even statically typecheck everything (C++, Java). (Note that the 
impossibility to statically typecheck everything in OO languages doesn't 
mean that it's impossible to do rigorous static checking in general. 
FPLs have been quite rigorous about static checks; the only cases when 
an FPL needs to dynamically typecheck its data structures is after 
unmarshalling one from an untyped data source such as a network stream, 
a file, or an IPC interface.)

The prime factor nowadays seems to be maintainability.

And the difference here is this:
With dynamic typing, I have to rely on the discipline of the programmers 
to document interfaces.
With static typing, the compiler will infer (and possibly document) at 
least part of their semantics (namely the types).

> So static typing should be invoked for that small portion of a program 
> where efficiency is really needed and that dynamic typing should be the 
> default elswhere. This is how common lisp works - dynamic typing by 
> default with static guarantees available where one needs them.

Actually static typing seems to become more powerful at finding errors 
as the program size increases.
(Yes, that's a maintainability argument. Efficiency isn't *that* 
important; since maintenance is usually the most important single 
factor, squelching bugs even before testing is definitely helpful.)

Regards,
Jo
From: Raffael Cavallaro
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <2006061618250975249-raffaelcavallaro@pasdespamsilvousplaitmaccom>
On 2006-06-16 17:59:07 -0400, Joachim Durchholz <··@durchholz.org> said:

> I think it's easier to start with a good (!) statically-typed language 
> and relax the checking, than to start with a dynamically-typed one and 
> add static checks.
> With the right restrictions, a language can make all kinds of strong 
> guarantees, and it can make it easy to construct software where static 
> guarantees abound. If the mechanisms are cleverly chosen, they 
> interfere just minimally with the programming process. (A classical 
> example it Hindley-Milner type inference systems. Typical reports from 
> languages with HM systems say that you can have it verify thousand-line 
> programs without a single type annotation in the code. That's actually 
> far better than you'd need - you'd *want* to document the types at 
> least on the major internal interfaces after all *grin*.)
> With a dynamically-typed language, programming style tends to evolve in 
> directions that make it harder to give static guarantees.

This is purely a matter of programming style. For explorative 
programming it is easier to start with dynamic typing and add static 
guarantees later rather than having to make decisions about 
representation and have stubs for everything right from the start. The 
lisp programming style is arguably all about using heterogeneous lists 
and forward references in the repl for everything until you know what 
it is that you are doing, then choosing a more appropriate 
representation and filling in forward references once the program gels. 
Having to choose representation right from the start and needing 
working versions (even if only stubs) of *every* function you call may 
ensure type correctness, but many programmers find that it also ensures 
that you never find the right program to code in the first place. This 
is because you don't have the freedom to explore possible solutions 
without having to break your concentration to satisfy the nagging of a 
static type checker.
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e70nha$q1o$1@online.de>
Raffael Cavallaro schrieb:
> On 2006-06-16 17:59:07 -0400, Joachim Durchholz <··@durchholz.org> said:
> 
>> I think it's easier to start with a good (!) statically-typed language 
>> and relax the checking, than to start with a dynamically-typed one and 
>> add static checks.
>
> This is purely a matter of programming style. For explorative
> programming it is easier to start with dynamic typing and add static
> guarantees later rather than having to make decisions about
> representation and have stubs for everything right from the start.

Sorry for being ambiguous - I meant to talk about language evolution.

I agree that static checking could (and probably should) be slightly 
relaxed: compilers should still do all the diagnostics that current-day 
technology allows, but any problems shouldn't abort the compilation. 
It's always possible to generate code that will throw an exception as 
soon as a problematic piece of code becomes actually relevant; depending 
on the kind of run-time support, this might abort the program, abort 
just the computation, or open an interactive facility to correct and/or 
modify the program on the spot (the latter is the norm in highly dynamic 
systems like those for Lisp and Smalltalk, and I consider this actually 
useful).

I don't see static checking and explorative programming as opposites.
Of course, in practice, environments that combine these don't seem to 
exist (except maybe in experimental or little-known state).

Regards,
Jo
From: Raffael Cavallaro
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <2006061709085950073-raffaelcavallaro@pasdespamsilvousplaitmaccom>
On 2006-06-17 07:03:19 -0400, Joachim Durchholz <··@durchholz.org> said:

> I don't see static checking and explorative programming as opposites.
> Of course, in practice, environments that combine these don't seem to 
> exist (except maybe in experimental or little-known state).

Right. Unfortunately the philosophical leanings of those who design 
these two types of languages tend to show themselves as different 
tastes in development style - for example, static type advocates don't 
often want a very dynamic development environment that would allow a 
program to run for testing even when parts of it aren't defined yet, and 
dynamic type advocates don't want a compiler preventing them from doing 
so because the program can't yet be proven statically correct. Dynamic 
typing advocates don't generally want a compiler error for ambiguous 
typing - for example, adding a float and an int - but static typing 
advocates generally do. Of course there's little reason one couldn't 
have a language that allowed the full range to be switchable so that 
programmers could tighten up compiler warnings and errors as the 
program becomes more fully formed. Unfortunately we're not quite there 
yet. For my tastes something like sbcl*, with its type inference and 
very detailed warnings and notes is as good as it gets for now. I can 
basically ignore warnings and notes early on, but use them to allow the 
compiler to improve the code it generates once the program is doing 
what I want correctly.

[*] I don't mean to exclude other common lisp implementations that do 
type inference here - I just happen to use sbcl.
From: Torben Ægidius Mogensen
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <7zhd2himme.fsf@app-3.diku.dk>
Raffael Cavallaro <················@pas-d'espam-s'il-vous-plait-mac.com> writes:

> This is purely a matter of programming style. For explorative
> programming it is easier to start with dynamic typing and add static
> guarantees later rather than having to make decisions about
> representation and have stubs for everything right from the start.

I think you are confusing static typing with having to write types
everywhere.  With type inference, you only have to write a minimum of
type information (such as datatype declarations), so you can easily do
explorative programming in such languages -- I don't see any advantage
of dynamic typing in this respect.

> The
> lisp programming style is arguably all about using heterogeneous lists
> and forward references in the repl for everything until you know what
> it is that you are doing, then choosing a more appropriate
> representation and filling in forward references once the program
> gels. Having to choose representation right from the start and needing
> working versions (even if only stubs) of *every* function you call may
> ensure type correctness, but many programmers find that it also
> ensures that you never find the right program to code in the first
> place.

If you don't have definitions (stubs or complete) of the functions you
use in your code, you can only run it up to the point where you call
an undefined function.  So you can't really do much exploration until
you have some definitions.

I expect a lot of the exploration you do with incomplete programs
amounts to the feedback you get from type inference.

> This is because you don't have the freedom to explore possible
> solutions without having to break your concentration to satisfy the
> nagging of a static type checker.

I tend to disagree.  I have programmed a great deal in Lisp, Scheme,
Prolog (all dynamically typed) and SML and Haskell (both statically
typed).  And I don't find that I need more stubs etc. in the latter.
In fact, I do a lot of explorative programming when writing programs
in ML and Haskell.  And I find type inference very helpful in this, as
it guides the direction of the exploration, so it is more like a
safari with a local guide than a blindfolded walk in the jungle.

        Torben
From: George Neuner
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <ljsc92dss5t5k43oampeoc6efabu0alsmg@4ax.com>
On 19 Jun 2006 10:19:05 +0200, ·······@app-3.diku.dk (Torben Ægidius
Mogensen) wrote:


>If you don't have definitions (stubs or complete) of the functions you
>use in your code, you can only run it up to the point where you call
>an undefined function.  So you can't really do much exploration until
>you have some definitions.

Well in Lisp that just drops you into the debugger where you can
supply the needed return data and continue.  I agree that it is not
something often needed.


>I expect a lot of the exploration you do with incomplete programs
>amounts to the feedback you get from type inference.

The ability to write functions and test them immediately without
writing a lot of supporting code is _far_ more useful to me than type
inference.  

I'm not going to weigh in on the static v dynamic argument ... I think
both approaches have their place.  I am, however, going to ask what
information you think type inference can provide that substitutes for
algorithm or data structure exploration.

George
--
for email reply remove "/" from address
From: Matthias Blume
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <m1zmg9ggpz.fsf@hana.uchicago.edu>
George Neuner <·········@comcast.net> writes:

>  I am, however, going to ask what
> information you think type inference can provide that substitutes for
> algorithm or data structure exploration.

Nobody wants to do such a substitution, of course.  In /my/
experience, however, I find that doing algorithm and data structure
exploration is greatly aided by a language with static types and type
inference.  YMMV.
From: Torben Ægidius Mogensen
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <7z8xnt8iqq.fsf@app-1.diku.dk>
George Neuner <·········@comcast.net> writes:

> On 19 Jun 2006 10:19:05 +0200, ·······@app-3.diku.dk (Torben Ægidius
> Mogensen) wrote:

> >I expect a lot of the exploration you do with incomplete programs
> >amounts to the feedback you get from type inference.
> 
> The ability to write functions and test them immediately without
> writing a lot of supporting code is _far_ more useful to me than type
> inference.  

I can't see what this has to do with static/dynamic typing.  You can
test individual functions in isolation in statically typed languages
too.
 
> I'm not going to weigh in on the static v dynamic argument ... I think
> both approaches have their place.  I am, however, going to ask what
> information you think type inference can provide that substitutes for
> algorithm or data structure exploration.

None.  But it can provide a framework for both and catch some types of
mistakes early.

        Torben
From: George Neuner
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <jfnd92dim5qg3eq4fh10rvdncpdida29c5@4ax.com>
On 19 Jun 2006 13:53:01 +0200, ·······@app-1.diku.dk (Torben Ægidius
Mogensen) wrote:

>George Neuner <·········@comcast.net> writes:
>
>> On 19 Jun 2006 10:19:05 +0200, ·······@app-3.diku.dk (Torben Ægidius
>> Mogensen) wrote:
>
>> >I expect a lot of the exploration you do with incomplete programs
>> >amounts to the feedback you get from type inference.
>> 
>> The ability to write functions and test them immediately without
>> writing a lot of supporting code is _far_ more useful to me than type
>> inference.  
>
>I can't see what this has to do with static/dynamic typing.  You can
>test individual functions in isolation in statically typed languages
>too.

It has nothing to do with static/dynamic typing and that was the point
... that support for exploratory programming is orthogonal to the
language's typing scheme.

George
--
for email reply remove "/" from address
From: Darren New
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <_eBkg.14895$uy3.4988@tornado.socal.rr.com>
Joachim Durchholz wrote:
> Give a heterogeneous list that would be too awkward to live in a 
> statically-typed language.

Write a function that takes an arbitrary set of arguments and stores 
them into a structure allocated on the heap.

> Give a case of calling nonexistent functions that's useful.

See the Tcl "unknown" proc, used for interactive command expansion, 
dynamic loading of code on demand, etc.
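For comparison, a statically-typed setting can approximate Tcl's "unknown" proc with an explicit fallback. A hypothetical JavaScript/TypeScript Proxy sketch (the command table and names are invented):

```typescript
// A Tcl-"unknown"-style fallback: lookups of command names not in the
// table are routed to a handler instead of failing.
type Command = (...args: string[]) => string;

const commands: Record<string, Command> = {
  greet: (name) => `hello ${name}`,
};

const interp: Record<string, Command> = new Proxy(commands, {
  get(target, name) {
    const key = String(name);
    // Fall back to an "unknown" handler, as Tcl does before dynamic loading.
    return target[key] ?? ((...args: string[]) =>
      `unknown: ${key} ${args.join(" ")}`.trim());
  },
});
```

Here the fallback just reports the call, but it could equally load code on demand; the point is that the mechanism is a library feature, not dynamic typing.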

-- 
   Darren New / San Diego, CA, USA (PST)
     My Bath Fu is strong, as I have
     studied under the Showerin' Monks.
From: Darren New
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1sBkg.14897$uy3.4884@tornado.socal.rr.com>
Joachim Durchholz wrote:
> Give a heterogeneous list that would be too awkward to live in a 
> statically-typed language.

Printf()?

-- 
   Darren New / San Diego, CA, USA (PST)
     My Bath Fu is strong, as I have
     studied under the Showerin' Monks.
From: Matthias Blume
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <m1bqsthsyt.fsf@hana.uchicago.edu>
Darren New <····@san.rr.com> writes:

> Joachim Durchholz wrote:
>> Give a heterogeneous list that would be too awkward to live in a
>> statically-typed language.
>
> Printf()?

Very good statically typed versions of printf exist.  See, e.g.,
Danvy's unparsing combinators.
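The idea behind such combinators can be sketched in TypeScript (a toy rendition for illustration, not Danvy's actual construction): the format is a composed value rather than a string, so the checker derives the number and types of the arguments from the format itself.

```typescript
// Each directive transforms a continuation expecting the accumulated string.
type Cont<R> = (acc: string) => R;

// lit embeds a literal; str/int each demand one argument of the right type.
const lit = (s: string) => <R>(k: Cont<R>): Cont<R> =>
  (acc) => k(acc + s);
const str = <R>(k: Cont<R>): Cont<(x: string) => R> =>
  (acc) => (x) => k(acc + x);
const int = <R>(k: Cont<R>): Cont<(n: number) => R> =>
  (acc) => (n) => k(acc + String(n));

// sprintf runs a format built by composing directives.
const sprintf = <T>(fmt: (k: Cont<string>) => Cont<T>): T =>
  fmt((acc) => acc)("");

// The format "%s, you are %d" as a composition -- statically typed:
const greet = sprintf((k) => str(lit(", you are ")(int(k))));
// greet("Joe")(42) === "Joe, you are 42"
```

Passing a number where the format demands a string is a compile-time error, which is exactly the heterogeneity of printf's arguments captured statically.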
From: Darren New
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <ajDkg.14294$Z67.2650@tornado.socal.rr.com>
Matthias Blume wrote:
> Very good statically typed versions of printf exist.  See, e.g.,
> Danvy's unparsing combinators.

That seems to ignore the fact that the pattern is a string, which means 
that printf's first argument in Danvy's mechanism has to be a literal. 
You can't read the printf format from a configuration file (for example) 
to support separate languages. It doesn't look like the version of 
printf that can print its arguments in an order different from the order 
provided in the argument list is supported either; something like "%3$d" 
or some such.

Second, what's the type of the argument that printf, sprintf, fprintf, 
kprintf, etc all pass to the subroutine that actually does the 
formatting? (Called vprintf, I think?)

-- 
   Darren New / San Diego, CA, USA (PST)
     My Bath Fu is strong, as I have
     studied under the Showerin' Monks.
From: Matthias Blume
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <m17j3gj0d5.fsf@hana.uchicago.edu>
Darren New <····@san.rr.com> writes:

> Matthias Blume wrote:
>> Very good statically typed versions of printf exist.  See, e.g.,
>> Danvy's unparsing combinators.
>
> That seems to ignore the fact that the pattern is a string, which
> means that printf's first argument in Danvy's mechanism has to be a
> literal.

In Danvy's solution, the format argument is not a string.

> You can't read the printf format from a configuration file
> (for example) to support separate languages.

You don't need to do that if you want to support separate languages.
Moreover, reading the format string from external input is a good way
of opening your program to security attacks, since ill-formed data on
external media are then able to crash your program.

> It doesn't look like the
> version of printf that can print its arguments in an order different
> from the order provided in the argument list is supported either;
> something like "%3$d" or some such.

I am not familiar with the version of printf you are referring to, but
I am sure one could adapt Danvy's solution to support such a thing.

> Second, what's the type of the argument that printf, sprintf, fprintf,
> kprintf, etc all pass to the subroutine that actually does the
> formatting? (Called vprintf, I think?)

Obviously, a Danvy-style solution (see, e.g., the one in SML/NJ's
library) is not necessarily structured that way.  I don't see the
problem with typing, though.

Matthias
From: Darren New
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <u7Fkg.15096$uy3.2589@tornado.socal.rr.com>
Matthias Blume wrote:
> In Danvy's solution, the format argument is not a string.

That's what I said, yes.

>>You can't read the printf format from a configuration file
>>(for example) to support separate languages.

> You don't need to do that if you want to support separate languages.

That's kind of irrelevant to the discussion. We're talking about 
collections of dynamically-typed objects, not the best mechanisms for 
supporting I18N.

> Moreover, reading the format string from external input is a good way
> of opening your program to security attacks, since ill-formed data on
> external media are then able to crash you program.

Still irrelevant to the point.

> I am sure one could adapt Danvy's solution to support such a thing.

I'm not. It's consuming arguments as it goes, from what I understood of 
the paper. It's translating, essentially, into a series of function 
calls in argument order.

> Obviously, a Danvy-style solution (see, e.g., the one in SML/NJ's
> library) is not necessarily structured that way.  I don't see the
> problem with typing, though.

You asked for an example of a heterogeneous list that would be awkward in 
a statically strongly-typed language. The arguments to printf() count, 
methinks. What would the second argument to apply be if the first 
argument is printf (since I'm reading this in the LISP group)?

-- 
   Darren New / San Diego, CA, USA (PST)
     My Bath Fu is strong, as I have
     studied under the Showerin' Monks.
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e6va6a$ql0$1@online.de>
Darren New schrieb:
> Joachim Durchholz wrote:
>> Give a heterogeneous list that would be too awkward to live in a 
>> statically-typed language.
> 
> Write a function that takes an arbitrary set of arguments and stores 
> them into a structure allocated on the heap.

If the set of arguments is really arbitrary, then the software can't do 
anything with it. In that case, the type is simply "opaque data block", 
and storing it in the heap requires nothing more specific than that of 
"opaque data block".
There's more to this. If we see a function with a parameter type of 
"opaque data block", and there's no function available except copying 
that data and comparing it for equality, then from simply looking at the 
function's signature, we'll know that it won't inspect the data. More 
interestingly, we'll know that funny stuff in the data won't trigger 
bugs in the code - in the context of a security audit, that's actually a 
pretty strong guarantee, since the analysis can stop at the function's 
interface and doesn't have to dig into the function's implementation.
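That guarantee can be sketched concretely; here is a hypothetical Rust illustration (the thread names no particular language for it, and the function name is invented): a fully generic parameter is exactly an "opaque data block", and the signature alone rules out inspection.

```rust
// Hypothetical sketch of the "opaque data block" guarantee: with no trait
// bounds on T, the body cannot call any method on the data -- all it can
// legally do is move, copy, or drop it.
fn store_opaque<T>(x: T) -> Box<T> {
    // The only thing this function can do with x is move it to the heap;
    // no inspection of the contents is possible.
    Box::new(x)
}

fn main() {
    let b = store_opaque(42);
    assert_eq!(*b, 42); // the value round-trips unchanged
}
```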

>> Give a case of calling nonexistent functions that's useful.
> 
> See the Tcl "unknown" proc, used for interactive command expansion, 
> dynamic loading of code on demand, etc.

Not related to dynamic typing, I fear - I can easily envision 
alternatives to that in a statically-typed context.

Of course, you can't eliminate *all* run-time type checking. I already 
mentioned unmarshalling data from an untyped source; another possibility 
is run-time code compilation (highly dubious in a production system but 
of value in a development system).

However, those are some very specialized applications, easily catered for 
by doing a dynamic type check plus a thrown exception in case the types 
don't match. I still don't see a convincing argument for making dynamic 
typing the standard policy.
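The "dynamic type check plus a thrown exception" at an untyped boundary might look like this (a minimal Rust sketch with invented names, not anything from the thread): data from outside arrives untyped, and one check at the boundary either recovers the static type or reports a mismatch.

```rust
use std::any::Any;

// Sketch of a dynamic check at an unmarshalling boundary: the raw input is
// an untyped value (Box<dyn Any>), and a single downcast either yields the
// statically typed result or an error.
fn unmarshal_int(raw: Box<dyn Any>) -> Result<i32, String> {
    raw.downcast::<i32>()
        .map(|boxed| *boxed)
        .map_err(|_| "unmarshalling error: expected an integer".to_string())
}

fn main() {
    assert_eq!(unmarshal_int(Box::new(7)).unwrap(), 7);
    assert!(unmarshal_int(Box::new("seven")).is_err());
}
```

Past that one check, the rest of the program works with plain `i32` and needs no further run-time type tests.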

Regards,
Jo
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e6pm2u$2pb$1@online.de>
Torben Ægidius Mogensen schrieb:
> For example,
> if you have to code everything as natural numbers, untyped pure lambda
> calculus or S-expressions, there is a good chance that you can get
> nonsense past the compiler.

Also past peer review and/or debugging runs. And, most importantly, past 
your own mental review processes.

Regards,
Jo
From: Pascal Costanza
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <4fb97sF1if8l6U1@individual.net>
Torben Ægidius Mogensen wrote:

> On a similar note, is a statically typed language more or less
> expressive than a dynamically typed language?  Some would say less, as
> you can write programs in a dynamically typed language that you can't
> compile in a statically typed language (without a lot of encoding),
> whereas the converse isn't true.

It's important to get the levels right here: A programming language with 
a rich static type system is more expressive at the type level, but less 
expressive at the base level (for some useful notion of expressiveness ;).

> However, I think this is misleading,
> as it ignores the feedback issue: It takes longer for the average
> programmer to get the program working in the dynamically typed
> language.

This doesn't seem to capture what I hear from Haskell programmers who 
say that it typically takes quite a while to convince the Haskell 
compiler to accept their programs. (They perceive this to be worthwhile 
because of some benefits wrt correctness they claim to get in return.)


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: Neelakantan Krishnaswami
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <slrne9153m.fgj.neelk@gs3106.sp.cs.cmu.edu>
In article <···············@individual.net>, Pascal Costanza wrote:
> Torben Ægidius Mogensen wrote:
> 
>> On a similar note, is a statically typed language more or less
>> expressive than a dynamically typed language?  Some would say less, as
>> you can write programs in a dynamically typed language that you can't
>> compile in a statically typed language (without a lot of encoding),
>> whereas the converse isn't true.
> 
> It's important to get the levels right here: A programming language
> with a rich static type system is more expressive at the type level,
> but less expressive at the base level (for some useful notion of
> expressiveness ;).

This doesn't seem obviously the case to me. If you have static
information about your program, the compiler can use this information
to automate away a lot of grunt work.

Haskell's system of typeclasses works this way. If you tell the
compiler how to print integers, and how to print lists, then when you
call a print function on a list of lists of integers, the compiler
will automatically figure out the right print function using your base
definitions. This yields an increase in Felleisen-expressiveness over
a dynamically typed language, because you would need to globally
restructure your program to achieve a similar effect.
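Transposed into Rust traits (an analogy to typeclasses, not a claim about how Haskell implements them; the trait name is invented), the same resolution looks roughly like this:

```rust
// A printer for the base type plus a printer for vectors: the compiler
// composes the printer for nested vectors automatically, with no per-type
// glue code.
trait Show {
    fn show(&self) -> String;
}

impl Show for i32 {
    fn show(&self) -> String { self.to_string() }
}

impl<T: Show> Show for Vec<T> {
    fn show(&self) -> String {
        let items: Vec<String> = self.iter().map(|x| x.show()).collect();
        format!("[{}]", items.join(", "))
    }
}

fn main() {
    // The instance for Vec<Vec<i32>> is derived from the two definitions
    // above; nothing extra is written for the nested case.
    let nested = vec![vec![1, 2], vec![3]];
    assert_eq!(nested.show(), "[[1, 2], [3]]");
}
```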

More dramatic are the "polytypic" programming languages, which let you
automate even more by letting you write generic map, fold, and print
functions which work at every type.

> This doesn't seem to capture what I hear from Haskell programmers
> who say that it typically takes quite a while to convince the
> Haskell compiler to accept their programs. (They perceive this to be
> worthwhile because of some benefits wrt correctness they claim to
> get in return.)

This is true, if you are a novice learning the language, or if you are
an expert programming with good style.

If you encode your invariants in the types, then type errors will
signal broken invariants. But: learning how to use the type system to
encode significantly complex invariants (eg, that an abstract syntax
tree representing an HTML document actually satisfies all of the
constraints on valid HTML) takes experience to do well.

-- 
Neel Krishnaswami
·····@cs.cmu.edu
From: Pascal Costanza
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <4fcgd5F1hv9ilU1@individual.net>
Neelakantan Krishnaswami wrote:
> In article <···············@individual.net>, Pascal Costanza wrote:
>> Torben Ægidius Mogensen wrote:
>>
>>> On a similar note, is a statically typed language more or less
>>> expressive than a dynamically typed language?  Some would say less, as
>>> you can write programs in a dynamically typed language that you can't
>>> compile in a statically typed language (without a lot of encoding),
>>> whereas the converse isn't true.
>> It's important to get the levels right here: A programming language
>> with a rich static type system is more expressive at the type level,
>> but less expressive at the base level (for some useful notion of
>> expressiveness ;).
> 
> This doesn't seem obviously the case to me. If you have static
> information about your program, the compiler can use this information
> to automate a lot of grunt work away.
> 
> Haskell's system of typeclasses works this way. If you tell the
> compiler how to print integers, and how to print lists, then when you
> call a print function on a list of lists of integers, the compiler
> will automatically figure out the right print function using your base
> definitions. This yields an increase in Felleisen-expressiveness over
> a dynamically typed language, because you would need to globally
> restructure your program to achieve a similar effect.
> 
> More dramatic are the "polytypic" programming languages, which let you
> automate even more by letting you write generic map, fold, and print
> functions which work at every type.

Yes, but these decisions are taken at compile time, without running the 
program.


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: Torben Ægidius Mogensen
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <7zu06lqxhu.fsf@app-1.diku.dk>
Pascal Costanza <··@p-cos.net> writes:

> Torben Ægidius Mogensen wrote:
> 
> > On a similar note, is a statically typed language more or less
> > expressive than a dynamically typed language?  Some would say less, as
> > you can write programs in a dynamically typed language that you can't
> > compile in a statically typed language (without a lot of encoding),
> > whereas the converse isn't true.
> 
> It's important to get the levels right here: A programming language
> with a rich static type system is more expressive at the type level,
> but less expressive at the base level (for some useful notion of
> expressiveness ;).
> 
> > However, I think this is misleading,
> > as it ignores the feedback issue: It takes longer for the average
> > programmer to get the program working in the dynamically typed
> > language.
> 
> This doesn't seem to capture what I hear from Haskell programmers who
> say that it typically takes quite a while to convince the Haskell
> compiler to accept their programs. (They perceive this to be
> worthwhile because of some benefits wrt correctness they claim to get
> in return.)

That's the point: Bugs that in dynamically typed languages would
require testing to find are found by the compiler in a statically
typed language.  So while it may take longer to get a program that gets
past the compiler, it takes less time to get a program that works.

        Torben
From: Rob Thorpe
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150454445.151167.283520@r2g2000cwb.googlegroups.com>
Torben Ægidius Mogensen wrote:
> Pascal Costanza <··@p-cos.net> writes:
>
> > Torben Ægidius Mogensen wrote:
> >
> > > On a similar note, is a statically typed language more or less
> > > expressive than a dynamically typed language?  Some would say less, as
> > > you can write programs in a dynamically typed language that you can't
> > > compile in a statically typed language (without a lot of encoding),
> > > whereas the converse isn't true.
> >
> > It's important to get the levels right here: A programming language
> > with a rich static type system is more expressive at the type level,
> > but less expressive at the base level (for some useful notion of
> > expressiveness ;).
> >
> > > However, I think this is misleading,
> > > as it ignores the feedback issue: It takes longer for the average
> > > programmer to get the program working in the dynamically typed
> > > language.
> >
> > This doesn't seem to capture what I hear from Haskell programmers who
> > say that it typically takes quite a while to convince the Haskell
> > compiler to accept their programs. (They perceive this to be
> > worthwhile because of some benefits wrt correctness they claim to get
> > in return.)
>
> That's the point: Bugs that in dynamically typed languages would
> require testing to find are found by the compiler in a statically
> typed language.  So while it may take longer to get a program that gets
> past the compiler, it takes less time to get a program that works.

In my experience the opposite is true for many programs.
Having to actually know the precise type of every variable while
writing the program is not necessary; it's a detail often not relevant
to the core problem. The language should be able to take care of
itself.

In complex routines it can be useful for the programmer to give types
and for the compiler to issue errors when they are contradicted.  But
for most routines it's just an unnecessary chore that the compiler
forces on the programmer.
From: Torben Ægidius Mogensen
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <7zslm58ggp.fsf@app-1.diku.dk>
"Rob Thorpe" <·············@antenova.com> writes:

> Torben Ægidius Mogensen wrote:

> > That's the point: Bugs that in dynamically typed languages would
> > require testing to find are found by the compiler in a statically
> > typed language.  So while it may take longer to get a program that gets
> > past the compiler, it takes less time to get a program that works.
> 
> In my experience the opposite is true for many programs.
> Having to actually know the precise type of every variable while
> writing the program is not necessary, it's a detail often not relevant
> to the core problem. The language should be able to take care of
> itself.
> 
> In complex routines it can be useful for the programmer to give types
> and for the compiler to issue errors when they are contradicted.  But
> for most routines it's just an unnecessary chore that the compiler
> forces on the programmer.

Indeed.  So use a language with type inference.

        Torben
From: Rob Thorpe
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150706890.543243.284430@h76g2000cwa.googlegroups.com>
Torben Ægidius Mogensen wrote:
> "Rob Thorpe" <·············@antenova.com> writes:
>
> > Torben Ægidius Mogensen wrote:
>
> > > That's the point: Bugs that in dynamically typed languages would
> > > require testing to find are found by the compiler in a statically
> > > typed language.  So while it may take longer to get a program that gets
> > > past the compiler, it takes less time to get a program that works.
> >
> > In my experience the opposite is true for many programs.
> > Having to actually know the precise type of every variable while
> > writing the program is not necessary, it's a detail often not relevant
> > to the core problem. The language should be able to take care of
> > itself.
> >
> > In complex routines it can be useful for the programmer to give types
> > and for the compiler to issue errors when they are contradicted.  But
> > for most routines it's just an unnecessary chore that the compiler
> > forces on the programmer.
>
> Indeed.  So use a language with type inference.

Well, for most purposes that's the same as dynamic typing since the
compiler doesn't require you to label the type of your variables.  I
occasionally use CMUCL and SBCL, which do type inference; it is
useful for improving generated code quality.  They can also warn the
programmer if they reuse a variable in a context implying that
it's a different type, which is useful.

I see type inference as an optimization of dynamic typing rather than a
generalization of static typing.  But I suppose you can see it that way
around.
From: Torben Ægidius Mogensen
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <7zy7vt1mz2.fsf@app-3.diku.dk>
"Rob Thorpe" <·············@antenova.com> writes:

> Torben Ægidius Mogensen wrote:
> > "Rob Thorpe" <·············@antenova.com> writes:
> >
> > > Torben Ægidius Mogensen wrote:
> >
> > Indeed.  So use a language with type inference.
> 
> Well, for most purposes that's the same as dynamic typing since the
> compiler doesn't require you to label the type of your variables.

That's not really the difference between static and dynamic typing.
Static typing means that there exists a typing at compile time that
guarantees against run-time type violations.  Dynamic typing means
that such violations are detected at run-time.  This is orthogonal to
strong versus weak typing, which is about whether such violations are
detected at all.  The archetypal weakly typed language is machine code
-- you can happily load a floating point value from memory, add it to
a string pointer and jump to the resulting value.  ML and Scheme are
both strongly typed, but one is statically typed and the other
dynamically typed.
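Strong dynamic typing in this sense can be sketched as tagged values (a hypothetical, Scheme-flavored illustration in Rust; the names are invented): every value carries a tag, the tag is checked before use, and a violation is detected, but only when the operation actually runs.

```rust
// A run-time tag on every value, as a dynamically typed runtime might
// represent it.
#[derive(Debug, PartialEq)]
enum Value {
    Int(i64),
    Str(String),
}

// Addition checks the tags first: a mismatch is caught (strong typing),
// but only at run time (dynamic typing).
fn add(a: &Value, b: &Value) -> Result<Value, String> {
    match (a, b) {
        (Value::Int(x), Value::Int(y)) => Ok(Value::Int(x + y)),
        _ => Err("run-time type violation: + expects integers".to_string()),
    }
}

fn main() {
    assert_eq!(add(&Value::Int(1), &Value::Int(2)), Ok(Value::Int(3)));
    assert!(add(&Value::Int(1), &Value::Str("one".into())).is_err());
}
```

A weakly typed system, by contrast, would perform the addition on the raw bits without ever consulting a tag.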

Anyway, type inference for statically typed languages doesn't make them
any more dynamically typed.  It just moves the burden of assigning the
types from the programmer to the compiler.  And (for HM type systems)
the compiler doesn't "guess" at a type -- it finds the unique most
general type from which all other legal types (within the type system)
can be found by instantiation.
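A small illustration of the burden moving to the compiler (in Rust, whose local inference is more limited than full HM inference, but the flavor is the same; the function name is invented): the element type of the vector is never written, and the compiler derives it from use.

```rust
// The type of `v` is nowhere annotated; the compiler assigns it from the
// uses below.
fn sum_of_squares(xs: &[i32]) -> i32 {
    let mut v = Vec::new(); // element type left unstated here
    for &x in xs {
        v.push(x * x); // this use fixes the type to Vec<i32>
    }
    v.iter().sum()
}

fn main() {
    assert_eq!(sum_of_squares(&[1, 2, 3]), 14);
}
```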

>  I
> occasionally use CMUCL and SBCL which do type inference, which is
> useful at improving generated code quality.  It also can warn the
> programmer if they if they reuse a variable in a context implying that
> it's a different type which is useful.
> 
> I see type inference as an optimization of dynamic typing rather than a
> generalization of static typing.  But I suppose you can see it that way
> around.

Some compilers for dynamically typed languages will do a type analysis
similar to type inference, but they will happily compile a program
even if they can't guarantee static type safety.

Such "type inference" can be seen as an optimisation of dynamic
typing, as it allows the compiler to omit _some_ of the runtime type
checks.  I prefer the term "soft typing" for this, though, so as not
to confuse with static type inference.

Soft typing can give feedback similar to that of type inference in
terms of identifying potential problem spots, so in that respect it is
similar to static type inference, and you might get similar fast code
development.  You miss some of the other benefits of static typing,
though, such as a richer type system -- soft typing often lacks
features like polymorphism (it will find a set of monomorphic
instances rather than the most general type) and type classes.

        Torben
From: Chris Smith
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <MPG.1f006ea363b2e2ba9896c4@news.altopia.net>
Torben Ægidius Mogensen <·······@app-3.diku.dk> wrote:
> That's not really the difference between static and dynamic typing.
> Static typing means that there exists a typing at compile time that
> guarantees against run-time type violations.  Dynamic typing means
> that such violations are detected at run-time.  This is orthogonal to
> strong versus weak typing, which is about whether such violations are
> detected at all.  The archetypal weakly typed language is machine code
> -- you can happily load a floating point value from memory, add it to
> a string pointer and jump to the resulting value.  ML and Scheme are
> both strongly typed, but one is statically typed and the other
> dynamically typed.

Knowing that it'll cause a lot of strenuous objection, I'll nevertheless 
interject my plea not to abuse the word "type" with a phrase like 
"dynamically typed".  If anyone considers "untyped" to be pejorative, 
as some people apparently do, then I'll note that another common term is 
"type-free," which is marketing-approved but doesn't carry the 
misleading connotations of "dynamically typed."  We are quickly losing 
any rational meaning whatsoever to the word "type," and that's quite a 
shame.

By way of extending the point, let me mention that there is no such 
thing as a universal class of things that are called "run-time type 
violations".  At runtime, there is merely correct code and incorrect 
code.  To the extent that anything is called a "type" at runtime, this 
is a different usage of the word from the usage by which we may define 
languages as being statically typed (which means just "typed").  In 
typed OO languages, this runtime usage is often called the "class", for 
example, to distinguish it from type.

This cleaner terminology eliminates a lot of confusion.  For example, it 
clarifies that there is no binary division between strongly typed 
languages and weakly typed languages, since the division between a "type 
error" and any other kind of error is arbitrary, depending only on 
whether the type system in a particular language happens to catch that 
error.  For example, type systems have been defined to try to catch unit 
errors in scientific programming, or to catch out-of-bounds array 
indices... yet these are not commonly called "type errors" only because 
such systems are not in common use.

This also leads us to define something like "language safety" to 
encapsulate what we previously would have meant by the phrase "strongly 
dynamically typed language".  This also is a more general concept than 
we had before.  Language safety refers to a language having well-defined 
behavior for as many operations as feasible, so that it's less likely 
that someone will do something spectacularly bad.  Language safety may 
be achieved either by a type system or by runtime checks.  Again, it's 
not absolute... I'm aware of no language that is completely determinate, 
at least if it supports any kind of concurrency.

This isn't just a matter of preference in terminology.  The definitions 
above (which are, in my experience, used widely by most non-academic 
language design discussions) actually limit our understanding of 
language design by pretending that certain delicate trade-offs such as 
the extent of the type system, or which language behavior is allowed to 
be non-deterministic or undefined, are etched in stone.  This is simply 
not so.  If types DON'T mean a compile-time method for proving the 
absence of certain program behaviors, then they don't mean anything at 
all.  Pretending that there's a distinction at runtime between "type 
errors" and "other errors" serves only to confuse things and 
artificially limit which problems we are willing to conceive as being 
solvable by types.

> Anyway, type inference for statically typed languages doesn't make them
> any more dynamically typed.

Indeed it does not.  Unless it weakens the ability of a compiler to 
prove the absence of certain program behaviors (which type inference 
does not), it doesn't move anything on the scale toward a type-free 
language.

That being said, though, it is considered a feature of some programming 
languages that the programmer is asked to repeat type information in a 
few places.  The compiler may not need the information, but for 
precisely the reason that the information is redundant, the compiler is 
then able to check the consistency of the programmer in applying the 
type.  I won't get into precisely how useful this is, but it is 
nevertheless present as an advantage to outweigh the wordiness.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Rob Thorpe
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150731889.008745.12180@i40g2000cwc.googlegroups.com>
Chris Smith wrote:
> Torben Ægidius Mogensen <·······@app-3.diku.dk> wrote:
> > That's not really the difference between static and dynamic typing.
> > Static typing means that there exists a typing at compile time that
> > guarantees against run-time type violations.  Dynamic typing means
> > that such violations are detected at run-time.  This is orthogonal to
> > strong versus weak typing, which is about whether such violations are
> > detected at all.  The archetypal weakly typed language is machine code
> > -- you can happily load a floating point value from memory, add it to
> > a string pointer and jump to the resulting value.  ML and Scheme are
> > both strongly typed, but one is statically typed and the other
> > dynamically typed.
>
> Knowing that it'll cause a lot of strenuous objection, I'll nevertheless
> interject my plea not to abuse the word "type" with a phrase like
> "dynamically typed".  If anyone considers "untyped" to be pejorative,
> as some people apparently do, then I'll note that another common term is
> "type-free," which is marketing-approved but doesn't carry the
> misleading connotations of "dynamically typed."  We are quickly losing
> any rational meaning whatsoever to the word "type," and that's quite a
> shame.

I don't think dynamic typing is that nebulous.  I remember this being
discussed elsewhere some time ago, I'll post the same reply I did then
..


A language is statically typed if a variable has a property - called
its type - attached to it, and given its type it can only represent
values defined by a certain class.

A language is latently typed if a value has a property - called its
type - attached to it, and given its type it can only represent values
defined by a certain class.

Some people use dynamic typing as a word for latent typing, others use
it to mean something slightly different.  But for most purposes the
definition above works for dynamic typing also.

Untyped and type-free mean something else: they mean no type checking
is done.
From: Chris Smith
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <MPG.1f007f6a483b5e7d9896c6@news.altopia.net>
Rob Thorpe <·············@antenova.com> wrote:
> A language is latently typed if a value has a property - called its
> type - attached to it, and given its type it can only represent values
> defined by a certain class.

I'm assuming you mean "class" in the general sense, rather than in the 
sense of a specific construct of some subset of OO programming 
languages.

Now I define a class of values called "correct" values.  I define these 
to be those values for which my program will produce acceptable results.  
Clearly there is a defined class of such values: (1) they are 
immediately defined by the program's specification for those lines of 
code that produce output; (2) if they are defined for the values that 
result from any expression, then they are defined for the values that 
are used by that expression; and (3) for any value for which correctness 
is not defined by (1) or (2), we may define its "correct" values as the 
class of all possible values.  Now, by your definition, any language 
which provides checking of that property of correctness for values is 
latently typed.  Of course, there are no languages that assign this 
specific class of values; but ANY kind of correctness checking on values 
that a language does (if it's useful at all) is a subset of the perfect 
correctness checking system above.  Apparently, we should call all such 
systems "latent type systems".  Can you point out a language that is not 
latently typed?

I'm not trying to poke holes in your definition for fun.  I am proposing 
that there is no fundamental distinction between the kinds of problems 
that are "type problems" and those that are not.  Types are not a class 
of problems; they are a class of solutions.  Languages that solve 
problems in ways that don't assign types to variables are not typed 
languages, even if those same problems may have been originally solved 
by type systems.

> Untyped and type-free mean something else: they mean no type checking
> is done.

Hence, they don't exist, and the definitions being used here are rather 
pointless.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Yet Another Dan
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <Xns97E78835F267BakaHackerX@192.17.3.8>
Chris Smith <·······@twu.net> wrote in
·······························@news.altopia.net: 

> Rob Thorpe <·············@antenova.com> wrote:
>> A language is latently typed if a value has a property - called its
>> type - attached to it, and given its type it can only represent
>> values defined by a certain class.
> 
> Now I define a class of values called "correct" values.  I define
> these to be those values for which my program will produce acceptable
> results.  Clearly there is a defined class of such values: (1) they
> are immediately defined by the program's specification for those lines
> of code that produce output; ...

> I'm not trying to poke holes in your definition for fun.  I am
> proposing that there is no fundamental distinction between the kinds
> of problems that are "type problems" and those that are not.

That sounds like a lot to demand of a type system. It almost sounds like 
it's supposed to test and debug the whole program. In general, defining the 
exact set of values for a given variable that generate acceptable output 
from your program will require detailed knowledge of the program and all 
its possible inputs. That goes beyond simple typing. It sounds more like 
contracts. Requiring an array index to be an integer is considered a typing 
problem because it can be checked based on only the variable itself, 
whereas checking whether it's in bounds requires knowledge about the array.

-- 
YAD
From: Chris Smith
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <MPG.1f00a3cb314707cc9896c8@news.altopia.net>
Yet Another Dan <········@vapornet.com> wrote:
> That sounds like a lot to demand of a type system. It almost sounds like 
> it's supposed to test and debug the whole program. In general, defining the 
> exact set of values for a given variable that generate acceptable output 
> from your program will require detailed knowledge of the program and all 
> its possible inputs. That goes beyond simple typing. It sounds more like 
> contracts. Requiring an array index to be an integer is considered a typing 
> problem because it can be checked based on only the variable itself, 
> whereas checking whether it's in bounds requires knowledge about the array.

Thanks for proving my point.  As a matter of fact, as I just mentioned 
in another response, array bounds checking is widely known to be 
solvable via type systems, and doing so in a way that yields a usable 
language is an active area of work in type theory for programming 
languages.  My point has been that confusion over the meaning of a type 
leads to limitations in what problems we think of as solvable by types.  
Apparently, that is accurate at least in this case.

Once again: there is no such thing as a bug that is not a type error.  
There are only bugs that aren't caught by type systems.  Because of 
this, defining a general class of "typed languages" as those that catch 
"type errors", regardless of mechanism, is meaningless.  All practical 
languages (even machine code) catch some errors (for machine code, e.g., 
use of an invalid opcode or access to a virtual address that's not 
mapped by the MMU), and therefore they catch some type errors, and 
therefore there would be no languages that are not in that class of 
typed languages (with trivial exceptions, such as the pure untyped 
lambda calculus in which any normal form is considered to be correct 
output.)

The only reasonable possibility for defining types is based on the 
*mechanism* for catching these errors.  Several different groups of 
people use the word "type" for several different such mechanisms.  While 
acknowledging that this is valid, it remains confusing to talk about 
"statically typed" versus "dynamically typed" languages, because the 
terminology implies that "typed" has the same meaning in both cases, 
which it does not.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Patricia Shanahan
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <sZBlg.8164$lp.881@newsread3.news.pas.earthlink.net>
Yet Another Dan wrote:
> Chris Smith <·······@twu.net> wrote in
> ·······························@news.altopia.net: 
> 
>> Rob Thorpe <·············@antenova.com> wrote:
>>> A language is latently typed if a value has a property - called its
>>> type - attached to it, and given its type it can only represent
>>> values defined by a certain class.
>> Now I define a class of values called "correct" values.  I define
>> these to be those values for which my program will produce acceptable
>> results.  Clearly there is a defined class of such values: (1) they
>> are immediately defined by the program's specification for those lines
>> of code that produce output; ...
> 
>> I'm not trying to poke holes in your definition for fun.  I am
>> proposing that there is no fundamental distinction between the kinds
>> of problems that are "type problems" and those that are not.
> 
> That sounds like a lot to demand of a type system. It almost sounds like 
> it's supposed to test and debug the whole program. In general, defining the 
> exact set of values for a given variable that generate acceptable output 
> from your program will require detailed knowledge of the program and all 
> its possible inputs. That goes beyond simple typing. It sounds more like 
> contracts. Requiring an array index to be an integer is considered a typing 
> problem because it can be checked based on only the variable itself, 
> whereas checking whether it's in bounds requires knowledge about the array.
> 

It's worse than that. The general question of whether a program will
terminate on a given input is undecidable for any language that is
Turing-machine equivalent, including Java.

Patricia
From: Dimitri Maziuk
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <slrne9e7of.po9.dima@yellowtail.bmrb.wisc.edu>
Yet Another Dan sez:

... Requiring an array index to be an integer is considered a typing 
> problem because it can be checked based on only the variable itself, 
> whereas checking whether it's in bounds requires knowledge about the array.

You mean like
 subtype MyArrayIndexType is INTEGER range 7 .. 11;
 type MyArrayType is array (MyArrayIndexType) of MyElementType;

Dima
-- 
We're sysadmins. Sanity happens to other people.                  -- Chris King
From: George Neuner
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <oh9g92djdj8hpekdrmont4rg4jtnighoq1@4ax.com>
On Mon, 19 Jun 2006 22:02:55 +0000 (UTC), Dimitri Maziuk
<····@127.0.0.1> wrote:

>Yet Another Dan sez:
>
>... Requiring an array index to be an integer is considered a typing 
>> problem because it can be checked based on only the variable itself, 
>> whereas checking whether it's in bounds requires knowledge about the array.
>
>You mean like
> subtype MyArrayIndexType is INTEGER range 7 .. 11;
> type MyArrayType is array (MyArrayIndexType) of MyElementType;
>

If the index computation involves wider types it can still produce
illegal index values.  The runtime computation of an illegal index
value is not prevented by narrowing subtypes and cannot be statically
checked.
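
George's point translates directly into Java, where the same situation plays out as a runtime bounds check rather than an Ada range check (a minimal illustrative sketch; the names are invented):

```java
public class IndexCheck {
    public static void main(String[] args) {
        int[] table = new int[5];   // legal indices 0..4
        int lo = 3, hi = 4;         // each in range individually
        int m = lo + hi;            // the wider index computation yields 7
        try {
            table[m] = 1;           // statically well-typed, yet fails here
        } catch (ArrayIndexOutOfBoundsException e) {
            // The check that catches this necessarily runs at runtime.
            System.out.println("illegal index: " + m);
        }
    }
}
```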

George
--
for email reply remove "/" from address
From: Dimitri Maziuk
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <slrne9is00.fr.dima@yellowtail.bmrb.wisc.edu>
George Neuner sez:
> On Mon, 19 Jun 2006 22:02:55 +0000 (UTC), Dimitri Maziuk
><····@127.0.0.1> wrote:
>
>>Yet Another Dan sez:
>>
>>... Requiring an array index to be an integer is considered a typing 
>>> problem because it can be checked based on only the variable itself, 
>>> whereas checking whether it's in bounds requires knowledge about the array.
>>
>>You mean like
>> subtype MyArrayIndexType is INTEGER range 7 .. 11;
>> type MyArrayType is array (MyArrayIndexType) of MyElementType;
>>
>
> If the index computation involves wider types it can still produce
> illegal index values.  The runtime computation of an illegal index
> value is not prevented by narrowing subtypes and cannot be statically
> checked.

My vague recollection is that no, it won't unless _you_ explicitly code an
unchecked type conversion. But it's been a while.

HTH
Dima
-- 
I have not been able to think of any way of describing Perl to [person]
"Hello, blind man?  This is color."                                      -- DPM
From: Dimitri Maziuk
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <slrne9lhb6.3sv.dima@yellowtail.bmrb.wisc.edu>
George Neuner sez:
> On Wed, 21 Jun 2006 16:12:48 +0000 (UTC), Dimitri Maziuk
><····@127.0.0.1> wrote:
>
>>George Neuner sez:
>>> On Mon, 19 Jun 2006 22:02:55 +0000 (UTC), Dimitri Maziuk
>>><····@127.0.0.1> wrote:
>>>
>>>>Yet Another Dan sez:
>>>>
>>>>... Requiring an array index to be an integer is considered a typing 
>>>>> problem because it can be checked based on only the variable itself, 
>>>>> whereas checking whether it's in bounds requires knowledge about the array.
>>>>
>>>>You mean like
>>>> subtype MyArrayIndexType is INTEGER range 7 .. 11;
>>>> type MyArrayType is array (MyArrayIndexType) of MyElementType;
>>>>
>>>
>>> If the index computation involves wider types it can still produce
>>> illegal index values.  The runtime computation of an illegal index
>>> value is not prevented by narrowing subtypes and cannot be statically
>>> checked.
>>
>>My vague recollection is that no, it won't unless _you_ explicitly code an
>>unchecked type conversion. But it's been a while.
>
>
> You can't totally prevent it ... if the index computation involves
> types having a wider range
...
> The point is really that the checks that prevent these things must be
> performed at runtime and can't be prevented by any practical type
> analysis performed at compile time.  I'm not a type theorist but my
> opinion is that a static type system that could, a priori, prevent the
> problem is impossible.

Right, but if you look carefully at the paragraph I originally fup'ed
to, you'll notice that it doesn't say "runtime" or "compile-time".
I was commenting on the "array bounds are not a typing problem" bit -- it is
if you define the index as a distinct type, and there is at least one
language that supports that. In this context I don't think it matters
when the check is performed.

It does matter in the larger context of this thread, though. IMO the
important part here is that the crash report clearly points to the
caller code that generated the illegal index, not to my library code.

That is the basic argument in favour of compile time error checking,
extended to runtime errors. I don't really care if it's the compiler
or runtime that tells the luser "your code is broken", as long as it
makes it clear it's *his* code that's broken, not mine.

Dima
-- 
I like the US government, makes the Aussie one look less dumb and THAT is a
pretty big effort.                                               -- Craig Small
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7lt55$kff$1@online.de>
Dimitri Maziuk schrieb:
> That is the basic argument in favour of compile time error checking,
> extended to runtime errors. I don't really care if it's the compiler
> or runtime that tells the luser "your code is broken", as long as it
> makes it clear it's *his* code that's broken, not mine.

You can make runtime errors point to the appropriate code. Just apply 
"programming by contract": explicitly state what preconditions a 
function is requiring of the caller, have it check the preconditions on 
entry (and, ideally, throw the exception in a way that the error is 
reported in the caller's code - not a complicated thing, but it would 
require changes to the exception machinery of most languages).
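
A hedged Java sketch of the contract idea (the class and method names are invented for illustration): the precondition is checked on entry, and throwing IllegalArgumentException conventionally signals "the caller passed bad arguments", which approximates reporting the error in the caller's code.

```java
final class BoundedLookup {
    /** Precondition (documented contract): 0 <= index < data.length. */
    static int fetch(int[] data, int index) {
        if (index < 0 || index >= data.length) {
            // Blame falls on the caller, not on this library code.
            throw new IllegalArgumentException(
                "precondition violated: index " + index
                + " not in 0.." + (data.length - 1));
        }
        return data[index];
    }
}
```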

Any errors past that point are either a too liberal precondition (i.e. a 
bug in the precondition - but documenting wrong preconditions is still a 
massive bug, even if it's "just" a documentation bug), or a bug in the 
function's code.

Regards,
Jo
From: Benjamin Franksen
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7f35a$drq$1@ulysses.news.tiscali.de>
George Neuner wrote:
> The point is really that the checks that prevent these things

[such as 'array index out of bounds']

> must be 
> performed at runtime and can't be prevented by any practical type
> analysis performed at compile time.  I'm not a type theorist but my
> opinion is that a static type system that could, a priori, prevent the
> problem is impossible.

It is not impossible. Dependent type systems can -- in principle -- do all
such things. Take a look at Epigram (http://www.e-pig.org/).

Systems based on dependent types such as Epigram are not yet mature enough
to be used for serious programming. However, they clearly show that these
things /are/ possible.

Cheers
Ben
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7lsr6$iqm$1@online.de>
George Neuner schrieb:
> The point is really that the checks that prevent these things must be
> performed at runtime and can't be prevented by any practical type
> analysis performed at compile time.  I'm not a type theorist but my
> opinion is that a static type system that could, a priori, prevent the
> problem is impossible.

No type theory is needed.
Assume that the wide index type goes into a function and the result is 
assigned to a variable of the narrow type, and it's instantly clear that 
the problem is undecidable.
Undecidability can always be avoided by adding annotations, but of 
course that would be gross overkill in the case of index type widening.

Regards,
Jo
From: George Neuner
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <q7jt92tovq5qci2ei16h437tglltvlssvh@4ax.com>
On Sun, 25 Jun 2006 13:42:45 +0200, Joachim Durchholz
<··@durchholz.org> wrote:

>George Neuner schrieb:
>> The point is really that the checks that prevent these things must be
>> performed at runtime and can't be prevented by any practical type
>> analysis performed at compile time.  I'm not a type theorist but my
>> opinion is that a static type system that could, a priori, prevent the
>> problem is impossible.
>
>No type theory is needed.
>Assume that the wide index type goes into a function and the result is 
>assigned to a variable of the narrow type, and it's instantly clear that 
>the problem is undecidable.

Yes ... the problem is undecidable and that can be statically checked.
But the result is that your program won't compile even if it can be
proved at runtime that an illegal value would never be possible.


>Undecidability can always be avoided by adding annotations, but of 
>course that would be gross overkill in the case of index type widening.

Just what sort of type annotation will convince a compiler that a
narrowing conversion which could produce an illegal value will, in
fact, never produce an illegal value?

[Other than "don't check this" of course.]

George
--
for email reply remove "/" from address
From: Chris Smith
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <MPG.1f089fd357e69ece989700@news.altopia.net>
George Neuner <·········@comcast.net> wrote:
> >Undecidability can always be avoided by adding annotations, but of 
> >course that would be gross overkill in the case of index type widening.
> 
> Just what sort of type annotation will convince a compiler that a
> narrowing conversion which could produce an illegal value will, in
> fact, never produce an illegal value?

The annotation doesn't make the narrowing conversion safe; it prevents 
the narrowing conversion from happening.  If, for example, I need to 
subtract two numbers and all I know is that they are both between 2 and 
40, then I only know that the result is between -38 and 38, which may 
contain invalid array indices.  However, if the numbers were part of a 
pair, and I knew that the type of the pair was <pair of numbers, 2 
through 40, where the first number is greater than the second>, then I 
would know that the difference is between 0 and 38, and that may be a 
valid index.
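
Java has no dependent types, so Chris's pair type can only be approximated with a "smart constructor" that establishes the invariant once, at runtime; after that point, the difference is known to lie in 0..38 with no further narrowing check. (The class below is an illustrative sketch, not anything from the thread; it admits a >= b so the difference can reach 0, matching the stated range.)

```java
final class OrderedPair {
    final int a, b;  // invariant: 2 <= b <= a <= 40, established once here
    OrderedPair(int a, int b) {
        if (a < 2 || a > 40 || b < 2 || b > 40 || a < b)
            throw new IllegalArgumentException("invariant violated");
        this.a = a;
        this.b = b;
    }
    // Guaranteed to be in 0..38, so usable as an index without a
    // narrowing conversion at the use site.
    int difference() { return a - b; }
}
```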

Of course, the restrictions on code that would allow me to retain 
knowledge of the form [pair of numbers, 2 through 40, a > b] may be 
prohibitive.  That is an open question in type theory, as a matter of 
fact; whether types of this level of power may be inferred by any 
tractable procedure so that safe code like this may be written without 
giving the development process undue difficulty by requiring ten times 
as much type annotation as actual code.  There are attempts that have 
been made, and they don't look too awfully bad.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: George Neuner
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <bq60a2hjm5dp22qta59m5lfa7ebuft16og@4ax.com>
On Sun, 25 Jun 2006 14:28:22 -0600, Chris Smith <·······@twu.net>
wrote:

>George Neuner <·········@comcast.net> wrote:
>> >Undecidability can always be avoided by adding annotations, but of 
>> >course that would be gross overkill in the case of index type widening.
>> 
>> Just what sort of type annotation will convince a compiler that a
>> narrowing conversion which could produce an illegal value will, in
>> fact, never produce an illegal value?
>
>The annotation doesn't make the narrowing conversion safe; it prevents 
>the narrowing conversion from happening. 

That was my point ... you get a program that won't compile.


>If, for example, I need to 
>subtract two numbers and all I know is that they are both between 2 and 
>40, then I only know that the result is between -38 and 38, which may 
>contain invalid array indices.  However, if the numbers were part of a 
>pair, and I knew that the type of the pair was <pair of numbers, 2 
>through 40, where the first number is greater than the second>, then I 
>would know that the difference is between 0 and 38, and that may be a 
>valid index.
>
>Of course, the restrictions on code that would allow me to retain 
>knowledge of the form [pair of numbers, 2 through 40, a > b] may be 
>prohibitive.  That is an open question in type theory, as a matter of 
>fact; whether types of this level of power may be inferred by any 
>tractable procedure so that safe code like this may be written without 
>giving the development process undue difficulty by requiring ten times 
>as much type annotation as actual code.  There are attempts that have 
>been made, and they don't look too awfully bad.

I worked in signal and image processing for many years and those are
places where narrowing conversions are used all the time - in the form
of floating point calculations reduced to integer to value samples or
pixels, or to value or index lookup tables.  Often the same
calculation needs to be done for several different purposes.

I can know that my conversion of floating point to integer is going to
produce a value within a certain range ... but, in general, the
compiler can't determine what that range will be.  All it knows is
that a floating point value is being truncated and the stupid
programmer wants to stick the result into some type too narrow to
represent the range of possible values.

Like I said to Ben, I haven't seen any _practical_ static type system
that can deal with things like this.  Writing a generic function is
impossible under the circumstances, and writing a separate function
for each narrow type is ridiculous and a maintenance nightmare even if
they can share the bulk of the code.

George
--
for email reply remove "/" from address
From: Chris Smith
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <MPG.1f09dd34676ab719989711@news.altopia.net>
George Neuner <·········@comcast.net> wrote:
> On Sun, 25 Jun 2006 14:28:22 -0600, Chris Smith <·······@twu.net>
> wrote:
> 
> >George Neuner <·········@comcast.net> wrote:
> >> >Undecidability can always be avoided by adding annotations, but of 
> >> >course that would be gross overkill in the case of index type widening.
> >> 
> >> Just what sort of type annotation will convince a compiler that a
> >> narrowing conversion which could produce an illegal value will, in
> >> fact, never produce an illegal value?
> >
> >The annotation doesn't make the narrowing conversion safe; it prevents 
> >the narrowing conversion from happening. 
> 
> That was my point ... you get a program that won't compile.

That's not actually the case here.  If we're talking only about type 
conversions and not value conversions (see below), then narrowing 
conversions are only necessary because you've forgotten some important 
bit of type information.  By adding annotations, you can preserve that 
piece of information and thus avoid the conversion and get a program 
that runs fine.

> I worked in signal and image processing for many years and those are
> places where narrowing conversions are used all the time - in the form
> of floating point calculations reduced to integer to value samples or
> pixels, or to value or index lookup tables.  Often the same
> calculation needs to be done for several different purposes.

These are value conversions, not type conversions.  Basically, when you 
convert a floating point number to an integer, you are not simply 
promising the compiler something about the type; you are actually asking 
the compiler to convert one value to another -- i.e., see to it that 
whatever this is now, it /becomes/ an integer by the time we're done.  
This also results in a type conversion, but you've just converted the 
value to the appropriate form.  There is a narrowing value conversion, 
but the type conversion is perfectly safe.
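
In Java terms (a sketch of the distinction being drawn, not anything from George's code), the cast below is a value conversion: it actually computes a new value, and only then is the static type honestly int.

```java
public class ValueConversion {
    public static void main(String[] args) {
        double sample = 3.9;       // result of some floating point calculation
        int pixel = (int) sample;  // value conversion: truncates toward zero
        // pixel is now 3.  The narrowing happened to the *value*, so the
        // type assignment that follows is safe.  What the compiler cannot
        // know, and what George objects to, is whether the result also fits
        // some still narrower intended range (say, a byte-sized pixel).
        System.out.println(pixel);
    }
}
```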

> I can know that my conversion of floating point to integer is going to
> produce a value within a certain range ... but, in general, the
> compiler can't determine what that range will be.

If you mean "my compiler can't", then this is probably the case.  If you 
mean "no possible compiler could", then I'm not sure this is really very 
likely at all.

> Like I said to Ben, I haven't seen any _practical_ static type system
> that can deal with things like this.

I agree.  Such a thing doesn't currently exist for general-purpose 
programming languages, although it does exist in limited languages for 
some specific domains.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: George Neuner
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <k0i2a2hhibfn7oadjdd753bklcv24darhd@4ax.com>
On Mon, 26 Jun 2006 13:02:33 -0600, Chris Smith <·······@twu.net>
wrote:

>George Neuner <·········@comcast.net> wrote:
>
>> I worked in signal and image processing for many years and those are
>> places where narrowing conversions are used all the time - in the form
>> of floating point calculations reduced to integer to value samples or
>> pixels, or to value or index lookup tables.  Often the same
>> calculation needs to be done for several different purposes.
>
>These are value conversions, not type conversions.  Basically, when you 
>convert a floating point number to an integer, you are not simply 
>promising the compiler something about the type; you are actually asking 
>the compiler to convert one value to another -- i.e., see to it that 
>whatever this is now, it /becomes/ an integer by the time we're done.  
>This also results in a type conversion, but you've just converted the 
>value to the appropriate form.  There is a narrowing value conversion, 
>but the type conversion is perfectly safe.

We're talking at cross purposes.  I'm questioning whether a strong
type system can be completely static as some people here seem to
think.  I maintain that it is simply not possible to make compile time
guarantees about *all* runtime behavior and that, in particular,
narrowing conversions will _always_ require runtime checking.


>> I can know that my conversion of floating point to integer is going to
>> produce a value within a certain range ... but, in general, the
>> compiler can't determine what that range will be.
>
>If you mean "my compiler can't", then this is probably the case.  If you 
>mean "no possible compiler could", then I'm not sure this is really very 
>likely at all.

Again, the discussion is about narrowing the result.  It doesn't
matter how much the compiler knows about the ranges.  When the
computation mixes types, the range of the result can only widen as the
compiler determines the types involved.

George
--
for email reply remove "/" from address
From: Chris Smith
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <MPG.1f0b1d6c361e87ed989720@news.altopia.net>
George Neuner <·········@comcast.net> wrote:
> We're talking at cross purposes.  I'm questioning whether a strong
> type system can be completely static as some people here seem to
> think. I maintain that it is simply not possible to make compile time
> guarantees about *all* runtime behavior and that, in particular,
> narrowing conversions will _always_ require runtime checking.

Nah, we're not at cross-purposes.  I'm discussing the same thing you 
are.  My answer is that while narrowing conversions are not (by 
definition) type-safe, a sufficiently strong type system can remove 
these type-unsafe narrowing conversions from a program.

> Again, the discussion is about narrowing the result.  It doesn't
> matter how much the compiler knows about the ranges.  When the
> computation mixes types, the range of the result can only widen as the
> compiler determines the types involved.

Well, there are certain operations that can reduce the range.  For 
example, dividing a number that's known to be in the range 6-10 by the 
exact constant 2 yields a result that's guaranteed to be in the range
3-5.  That's a smaller range.

That said, though, there is a more fundamental point here.  When you 
perform a narrowing type conversion, one of two things must be true: 
(a) you know more about the types than the compiler, and thus know that 
the conversion is safe, or (b) your code is broken.  Excluding the 
possibility that you've written buggy code, you must possess some 
knowledge that convinces you the conversion is safe.  In that case, we 
need only allow the type system to have that same knowledge, and the 
problem is solved.

(One thing worth pointing out here is that it's quite possible that you 
want to do something different depending on the kind of data.  In that 
case, the sufficiently powerful type system would need to have rules so 
that an if statement creates a modified type environment to take that 
into account.  This is different from a runtime type check, in that you 
are writing explicit code that provides correct program behavior in 
either case.)
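
A small Java illustration of an if statement refining what is known inside each branch (reference types rather than numeric ranges, but the mechanism is the same; in Java 16+ the pattern-matching form of instanceof makes the narrowing part of the language itself):

```java
final class Narrowing {
    static String describe(Object o) {
        if (o instanceof String) {
            // In this branch we know more than the declared type Object,
            // so the guarded cast below cannot fail.
            String s = (String) o;
            return "string of length " + s.length();
        } else {
            // Explicit handling of the other case: correct behavior in
            // either branch, rather than an unchecked runtime failure.
            return "not a string";
        }
    }
}
```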

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151374801.003139.68680@u72g2000cwu.googlegroups.com>
George Neuner wrote:
>
> I can know that my conversion of floating point to integer is going to
> produce a value within a certain range ... but, in general, the
> compiler can't determine what that range will be.  All it knows is
> that a floating point value is being truncated and the stupid
> programmer wants to stick the result into some type too narrow to
> represent the range of possible values.
>
> Like I said to Ben, I haven't seen any _practical_ static type system
> that can deal with things like this.  Writing a generic function is
> impossible under the circumstances, and writing a separate function
> for each narrow type is ridiculous and a maintenance nightmare even if
> they can share the bulk of the code.

This is the partial function question again, in a different guise.


Marshall
From: Greg Buchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151445597.574064.319180@j72g2000cwa.googlegroups.com>
George Neuner wrote:
> That was interesting, but the authors' method still involves runtime
> checking of the array bounds.  IMO, all they really succeeded in doing
> was turning the original recursion into CPS and making the code a
> little bit clearer.

    Hmm.  There is a comparison in both the DML and Haskell code, but
that's just there to make the binary search terminate correctly.  The
point of the exercise is the line in the DML code "val m = (hi + lo)
div 2", or the "bmiddle" function in the Haskell code.  If you change
that line to something like "val m = (hi + lo)" you'll get a compiler
error complaining about an unsolved constraint.  The point of the
Haskell code is to use the type system to ensure that array indices
have a known provenance.  They're derived from and statically associated
with each individual array, so you can't use an index from one array
with a different array.  You'll probably also enjoy the paper...

    "Eliminating Array Bound Checking Through Dependent Types"
    http://citeseer.ist.psu.edu/xi98eliminating.html

...and DML itself...

    http://www.cs.bu.edu/~hwxi/DML/DML.html

Greg Buchholz
From: Rob Thorpe
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150794745.148997.245840@b68g2000cwa.googlegroups.com>
Chris Smith wrote:
> Rob Thorpe <·············@antenova.com> wrote:
> > A language is latently typed if a value has a property - called it's
> > type - attached to it, and given it's type it can only represent values
> > defined by a certain class.
>
> I'm assuming you mean "class" in the general sense, rather than in the
> sense of a specific construct of some subset of OO programming
> languages.

Yes.

> Now I define a class of values called "correct" values.  I define these
> to be those values for which my program will produce acceptable results.
> Clearly there is a defined class of such values: (1) they are
> immediately defined by the program's specification for those lines of
> code that produce output; (2) if they are defined for the values that
> result from any expression, then they are defined for the values that
> are used by that expression; and (3) for any value for which correctness
> is not defined by (1) or (2), we may define its "correct" values as the
> class of all possible values.

I'm not talking about correctness, I'm talking about typing.

>  Now, by your definition, any language
> which provides checking of that property of correctness for values is
> latently typed.

No, that isn't what I said.  What I said was:
"A language is latently typed if a value has a property - called it's
type - attached to it, and given it's type it can only represent values
defined by a certain class."

I said nothing about the values producing correct outputs, or being
correct inputs.  I only said that they have types.

What I mean is that each value in the language is defined by two pieces
of information: its contents X and its type T.

>  Of course, there are no languages that assign this
> specific class of values; but ANY kind of correctness checking on values
> that a language does (if it's useful at all) is a subset of the perfect
> correctness checking system above.  Apparently, we should call all such
> systems "latent type systems".

No, I'm speaking about type checking not correctness.

>  Can you point out a language that is not
> latently typed?

Easy, any statically typed language is not latently typed.  Values have
no type associated with them; instead, variables have types.

If I tell a C program to print out a string but give it a number, it
will give an error telling me that the types mismatch where the
variable the number is held in is used to assign to a variable that
must hold a string.  Similarly, if I have a lisp function that prints
out a string, it will also fail when given a number, but it will fail at
a different point: when the type of the value is examined and found to
be incorrect.

> I'm not trying to poke holes in your definition for fun.  I am proposing
> that there is no fundamental distinction between the kinds of problems
> that are "type problems" and those that are not.  Types are not a class
> of problems; they are a class of solutions.

Exactly.  Which is why they are only tangentially associated with
correctness.
Typing is a set of rules put in place to aid correctness, but it is not
a system that attempts to create correctness itself.

>  Languages that solve
> problems in ways that don't assign types to variables are not typed
> languages, even if those same problems may have been originally solved
> by type systems.

Well, you can think of things that way.  But to the rest of the
computing world languages that don't assign types to variables but do
assign them to values are latently typed.

> > Untyped and type-free mean something else: they mean no type checking
> > is done.
>
> Hence, they don't exist, and the definitions being used here are rather
> pointless.

No they aren't, types of data exist even if there is no system in place
to check them.  Ask an assembly programmer whether his programs have
string and integers in them and he will probably tell you that they do.
From: Chris Smith
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <MPG.1f01b1f6ac0e45ec9896cf@news.altopia.net>
Rob Thorpe <·············@antenova.com> wrote:
> I'm not talking about correctness, I'm talking about typing.
> 

Since you wrote that, I've come to understand that you meant something 
specific by "property" which I didn't understand at first.  From my 
earlier perspective, it was obvious that correctness was a property of a 
value.  I now realize that you meant a property that's explicitly 
associated with the value and plays a role in determining the behavior 
of the language.  Sorry for the confusion.

> No, that isn't what I said.  What I said was:
> "A language is latently typed if a value has a property - called it's
> type - attached to it, and given it's type it can only represent values
> defined by a certain class."

No, to answer Andreas' concern, you would only need to say:

    ... if a value has a property - called its type - attached to it,
    and the language semantics guarantees that only values defined by a
    certain class may have that same property attached.

> Easy, any statically typed language is not latently typed.

I'm actually not sure I agree with this at all.  I believe that 
reference values in Java may be said to be latently typed.  This is the 
case because each reference value (except null, which may be tested 
separately) has an explicit property (called its "class", but surely the 
word doesn't make any difference), such that the language semantics 
guarantees that only a certain class of values may have that same 
property, and the property is used to determine behavior of the language 
in many cases (for example, in the case of type-based polymorphism, or 
use of Java's instanceof operator).  Practically all class-based OO 
languages are subject to similar consideration, as it turns out.
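
A short Java sketch of the point: the class of a reference value is carried by the value itself and consulted at runtime, independent of the static type of the variable holding it.

```java
public class LatentDemo {
    public static void main(String[] args) {
        Object v = "hello";   // static type of the variable: Object
        // The value itself carries its class -- Chris's "explicit property":
        System.out.println(v.getClass().getName());    // java.lang.String
        // ...and the language semantics consult that property at runtime:
        System.out.println(v instanceof Number);       // false
        System.out.println(v instanceof CharSequence); // true
    }
}
```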

I'm unsure whether to consider explicitly stored array lengths, which 
are present in most statically typed languages, to be part of a "type" 
in this sense or not.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Chris Uppal
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <4498356b$0$659$bed64819@news.gradwell.net>
Chris Smith wrote:

> > Easy, any statically typed language is not latently typed.
>
> I'm actually not sure I agree with this at all.  I believe that
> reference values in Java may be said to be latently typed. Practically
> all class-based OO
> languages are subject to similar consideration, as it turns out.

Quite probably true of GC-ed statically typed languages in general, at least up
to a point (and provided you are not using something like a tagless ML
implementation).  I think Rob is assuming a rather too specific implementation
of statically typed languages.


> I'm unsure whether to consider explicitly stored array lengths, which
> are present in most statically typed languages, to be part of a "type"
> in this sense or not.

If I understand your position correctly, wouldn't you be pretty much forced to
reject the idea of the length of a Java array being part of its type ?  If you
want to keep the word "type" bound to the idea of static analysis, then -- 
since Java doesn't perform any size-related static analysis -- the size of a
Java array cannot be part of its type.

That's assuming that you would want to keep the "type" connected to the actual
type analysis performed by the language in question.  Perhaps you would prefer
to loosen that and consider a different (hypothetical) language (perhaps
producing identical bytecode) which does do compile time size analysis.

But then you get into an area where you cannot talk of the type of a value (or
variable) without relating it to the specific type system under discussion.
Personally, I would be quite happy to go there -- I dislike the idea that a
value has a specific inherent type.

It would be interesting to see what a language designed specifically to support
user-defined, pluggable, and perhaps composable, type systems would look like.
Presumably the syntax and "base" semantics would be very simple, clean, and
unrestricted (like Lisp, Smalltalk, or Forth -- not that I'm convinced that any
of those would be ideal for this), with a defined result for any possible
sequence of operations.  The type-system(s) used for a particular run of the
interpreter (or compiler) would effectively reduce the space of possible
sequences.   For instance, one could have a type system which /only/ forbade
dereferencing null, or another with the job of ensuring that mutability
restrictions were respected, or a third which implemented access control...

But then, I don't see a semantically critical distinction between such space
reduction being done at compile time vs. runtime.  Doing it at compile time
could be seen as an optimisation of sorts (with downsides to do with early
binding etc).  That's particularly clear if the static analysis is /almost/
able to prove that <some sequence> is legal (by its own rules) but has to make
certain assumptions in order to construct the proof.  In such a case the
compiler might insert a few runtime checks to ensure that its assumptions were
valid, but do most of its checking statically.
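Java's covariant arrays are a concrete instance of that mix: the static checker
accepts the assignment below but cannot prove every later store safe, so the
runtime checks each store and signals when the compiler's assumption fails.

```java
public class CovariantArrays {
    /** Attempt the store; report whether the runtime check rejected it. */
    static boolean storeRejected(Object[] objs, Object value) {
        try {
            objs[0] = value;
            return false;               // the store was allowed
        } catch (ArrayStoreException e) {
            return true;                // the runtime caught the bad store
        }
    }

    public static void main(String[] args) {
        Object[] objs = new String[2];  // statically legal: String[] is an Object[]
        System.out.println(storeRejected(objs, "fine"));              // prints "false"
        System.out.println(storeRejected(objs, Integer.valueOf(1)));  // prints "true"
    }
}
```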

There would /be/ a distinction between static and dynamic checks in such a
system, and it would be an important distinction, but not nearly as important
as the distinctions between the different type systems.  Indeed I can imagine
categorising type systems by /whether/ (or to what extent) a tractable static
implementation exists.

    -- chris

P.S. Apologies, Chris, btw, for dropping out of a conversation we were having on
this subject a little while ago -- I've now said everything that I /would/ have
said in reply to your last post if I'd got around to it in time...
From: Chris Smith
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <MPG.1f020af6425675b19896d0@news.altopia.net>
Chris Uppal <···········@metagnostic.REMOVE-THIS.org> wrote:
> > I'm unsure whether to consider explicitly stored array lengths, which
> > are present in most statically typed languages, to be part of a "type"
> > in this sense or not.
> 
> If I understand your position correctly, wouldn't you be pretty much forced to
> reject the idea of the length of a Java array being part of its type ?

I've since abandoned any attempt to be picky about use of the word 
"type".  That was a mistake on my part.  I still think it's legitimate 
to object to statements of the form "statically typed languages X, but 
dynamically typed languages Y", in which it is implied that Y is 
distinct from X.  When I used the word "type" above, I was adopting the 
working definition of a type from the dynamic sense.  That is, I'm 
considering whether statically typed languages may be considered to also 
have dynamic types, and it's pretty clear to me that they do.

> If you
> want to keep the word "type" bound to the idea of static analysis, then -- 
> since Java doesn't perform any size-related static analysis -- the size of a
> Java array cannot be part of its type.

Yes, I agree.  My terminology has been shifting constantly throughout 
this thread as I learn more about how others are using terms.  I realize 
that probably confuses people, but my hope is that this is still 
superior to stubbornly insisting that I'm right. :)

> That's assuming that you would want to keep the "type" connected to the actual
> type analysis performed by the language in question.  Perhaps you would prefer
> to loosen that and consider a different (hypothetical) language (perhaps
> producing identical bytecode) which does do compile time size analysis.

In the static sense, I think it's absolutely critical that "type" is 
defined in terms of the analysis done by the type system.  Otherwise, 
you miss the definition entirely.  In the dynamic sense, I'm unsure; I 
don't have any kind of deep understanding of what's meant by "type" in 
this sense.  Certainly it could be said that there are somewhat common 
cross-language definitions of "type" that get used.

> But then you get into an area where you cannot talk of the type of a value (or
> variable) without relating it to the specific type system under discussion.

Which is entirely what I'd expect in a static type system.

> Personally, I would be quite happy to go there -- I dislike the idea that a
> value has a specific inherent type.

Good! :)

> It would be interesting to see what a language designed specifically to support
> user-defined, pluggable, and perhaps composable, type systems would look like.
> Presumably the syntax and "base" semantics would be very simple, clean, and
> unrestricted (like Lisp, Smalltalk, or Forth -- not that I'm convinced that any
> of those would be ideal for this), with a defined result for any possible
> sequence of operations.  The type-system(s) used for a particular run of the
> interpreter (or compiler) would effectively reduce the space of possible
> sequences.   For instance, one could have a type system which /only/ forbade
> dereferencing null, or another with the job of ensuring that mutability
> restrictions were respected, or a third which implemented access control...

You mean in terms of a practical programming language?  If not, then 
lambda calculus is used in precisely this way for the static sense of 
types.

> But then, I don't see a semantically critical distinction between such space
> reduction being done at compile time vs. runtime.  Doing it at compile time
> could be seen as an optimisation of sorts (with downsides to do with early
> binding etc).  That's particularly clear if the static analysis is /almost/
> able to prove that <some sequence> is legal (by its own rules) but has to make
> certain assumptions in order to construct the proof.  In such a case the
> compiler might insert a few runtime checks to ensure that its assumptions were
> valid, but do most of its checking statically.

I think Marshall got this one right.  The two are accomplishing 
different things.  In one case (the dynamic case) I am safeguarding 
against negative consequences of the program behaving in certain non-
sensical ways.  In the other (the static case) I am proving theorems 
about the impossibility of this non-sensical behavior ever happening.  
You mention static typing above as an optimization to dynamic typing; 
that is certainly one possible application of these theorems.

In some sense, though, it is interesting in its own right to know that 
these theorems have been proven.  Of course, there are important doubts 
to be had whether these are the theorems we wanted to prove in the first 
place, and whether the effort of proving them was worth the additional 
confidence I have in my software systems.

I acknowledge those questions.  I believe they are valid.  I don't know 
the answers.  As an intuitive judgement call, I tend to think that 
knowing the correctness of these things is of considerable benefit to 
software development, because it means that I don't have as much to 
think about at any one point in time.  I can validly make more 
assumptions about my code and KNOW that they are correct.  I don't have 
to trace as many things back to their original source in a different 
module of code, or hunt down as much documentation.  I also, as a 
practical matter, get development tools that are more powerful.  
(Whether it's possible to create the same for a dynamically typed 
language is a potentially interesting discussion; but as a practical 
matter, no matter what's possible, I still have better development tools 
for Java than for JavaScript when I do my job.)

In the end, though, I'm just not very interested in them at the moment.  
For me, as a practical matter, choices of programming language are 
generally dictated by more practical considerations.  I do basically all 
my professional "work" development in Java, because we have a gigantic 
existing software system written in Java and no time to rewrite it.  On 
the other hand, I do like proving theorems, which means I am interested 
in type theory; if that type theory relates to programming, then that's 
great!  That's probably not the thing to say to ensure that my thoughts 
are relevant to the software development "industry", but it's 
nevertheless the truth.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: David Hopwood
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <7i3mg.204167$8W1.155367@fe1.news.blueyonder.co.uk>
Chris Smith wrote:
> Chris Uppal <···········@metagnostic.REMOVE-THIS.org> wrote:
> 
>>>I'm unsure whether to consider explicitly stored array lengths, which
>>>are present in most statically typed languages, to be part of a "type"
>>>in this sense or not.
>>
>>If I understand your position correctly, wouldn't you be pretty much forced to
>>reject the idea of the length of a Java array being part of its type ?
> 
> I've since abandoned any attempt to be picky about use of the word "type".

I think you should stick to your guns on that point. When people talk about
"types" being associated with values in a "latently typed" or "dynamically typed"
language, they really mean *tag*, not type.

It is remarkable how much of the fuzzy thinking that often occurs in the
discussion of type systems can be dispelled by insistence on this point (although
much of the benefit can be obtained just by using this terminology in your own
mind and translating what other people are saying to it). It's a good example of
the weak Sapir-Whorf hypothesis, I think.

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Chris Uppal
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <44992e6c$2$664$bed64819@news.gradwell.net>
David Hopwood wrote:

> When people talk
> about "types" being associated with values in a "latently typed" or
> "dynamically typed" language, they really mean *tag*, not type.

I don't think that's true.  Maybe /some/ people do confuse the two, but I am
certainly a counter-example ;-)

The tag (if any) is part of the runtime machinery (or, if not, then I don't
understand what you mean by the word), and while that is certainly a reasonable
approximation to the type of the object/value, it is only an approximation,
and -- what's more -- is only an approximation to the type as yielded by one
specific (albeit abstract, maybe even hypothetical) type system.

If I send #someMessage to a proxy object which has not had its referent set
(and assuming the default value, presumably some variant of nil, does not
understand #someMessage), then that's just as much a type error as sending
#someMessage to a variable holding a nil value.  If I then assign the referent
of the proxy to some object which does understand #someMessage, then it is not
a type error to send #someMessage to the proxy.  So the type has changed, but
nothing in the tag system of the language implementation has changed.

    -- chris
From: Chris Smith
Subject: Re: What is a type error?
Date: 
Message-ID: <MPG.1f045282b0c5d5469896de@news.altopia.net>
Pascal Costanza <··@p-cos.net> wrote:
> What about this: You get a type error when the program attempts to 
> invoke an operation on values that are not appropriate for this operation.
> 
> Examples: adding numbers to strings; determining the string-length of a 
> number; applying a function on the wrong number of parameters; applying 
> a non-function; accessing an array with out-of-bound indexes; etc.

Hmm.  I'm afraid I'm going to be picky here.  I think you need to 
clarify what is meant by "appropriate".  If you mean "the operation will 
not complete successfully" as I suspect you do, then we're closer... but 
this little snippet of Java (HORRIBLE, DO NOT USE!) confuses the matter 
for me:

    int i = 0;

    try
    {
        while (true) process(myArray[i++]);
    }
    catch (IndexOutOfBoundsException e) { }

That's an array index from out of bounds that not only fails to be a 
type error, but also fails to be an error at all!  (Don't get confused 
by Java's having a static type system for other purposes... we are 
looking at array indexing here, which Java checks dynamically.  I would 
have used a dynamically typed language, if I could have written this as 
quickly.)
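For the record, the fragment runs once process and myArray are filled in (the
stand-ins below are mine, not from the snippet above):

```java
public class LoopByException {
    /** Sum an array by running off its end.  HORRIBLE, DO NOT USE! */
    static int sumByException(int[] myArray) {
        int sum = 0;
        int i = 0;
        try {
            while (true) sum += myArray[i++];  // stand-in for process(): accumulate
        } catch (IndexOutOfBoundsException e) {
            // the out-of-bounds read is the intended loop exit, not an error
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumByException(new int[] {1, 2, 3}));  // prints "6"
    }
}
```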

I'm also unsure how your definition above would apply to languages that 
do normal order evaluation, in which (at least in my limited brain) it's 
nearly impossible to break down a program into sequences of operations 
on actual values.  I suppose, though, that they do eventually happen 
with primitives at the leaves of the derivation tree, so the definition 
would still apply.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Pascal Costanza
Subject: Re: What is a type error?
Date: 
Message-ID: <4g2289F1kv8htU1@individual.net>
Chris Smith wrote:
> Pascal Costanza <··@p-cos.net> wrote:
>> What about this: You get a type error when the program attempts to 
>> invoke an operation on values that are not appropriate for this operation.
>>
>> Examples: adding numbers to strings; determining the string-length of a 
>> number; applying a function on the wrong number of parameters; applying 
>> a non-function; accessing an array with out-of-bound indexes; etc.
> 
> Hmm.  I'm afraid I'm going to be picky here.  I think you need to 
> clarify what is meant by "appropriate".

No, I cannot be a lot clearer here. What operations are appropriate for 
what values largely depends on the intentions of a programmer. Adding a 
number to a string is inappropriate, no matter how a program behaves 
when this actually occurs (whether it continues to execute the operation 
blindly, throws a continuable exception, or just gets stuck).

> If you mean "the operation will 
> not complete successfully" as I suspect you do, then we're closer...

No, we're not. You're giving a purely technical definition here, that 
may or may not relate to the programmer's (or "designer's") 
understanding of the domain.


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: Chris Uppal
Subject: Re: What is a type error?
Date: 
Message-ID: <449aaea0$2$656$bed64819@news.gradwell.net>
Chris Smith wrote:

>  Some people here seem to be
> saying that there is a universal concept of "type error" in dynamic
> typing, but I've still yet to see a good precise definition (nor a good
> precise definition of dynamic typing at all).

How about this, at least as a strawman:

I think we're agreed (you and I anyway, if not everyone in this thread) that we
don't want to talk of "the" type system for a given language.  We want to allow
a variety of verification logics.  So a static type system is a logic which can
be implemented based purely on the program text without making assumptions about
runtime events (or making maximally pessimistic assumptions -- which comes to
the same thing really).  I suggest that a "dynamic type system" is a
verification logic which (in principle) has available as input not only the
program text, but also the entire history of the program execution up to the
moment when the to-be-checked operation is invoked.

I don't mean to imply that an operation /must/ not be checked until it is
invoked (although a particular logic/implementation might not do so).  For
instance an out-of-bound array access might be rejected:
    When the attempt was made to read that slot.
    When, in the surrounding code, it first became
      unavoidable that the above read /would/ be reached.
    When the array was first passed to a function which
      /might/ read that slot.
    ...and so on...
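One way to make that concrete: a check that judges an operation against the
entire execution history, not just the operation itself.  A tiny sketch (all
names mine, purely illustrative) which rejects a read until an open has
occurred, at whatever point in the trace the judgement is made:

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: a "dynamic type system" as a judgement over the
    operation plus the execution history so far. */
class HistoryChecker {
    private final List<String> history = new ArrayList<>();

    /** Record an event; the full trace stays available to later judgements. */
    void record(String event) { history.add(event); }

    /** The judgement may consult the entire history, not only the current
        operation -- here, forbid "read" unless some earlier "open" occurred. */
    boolean allowed(String operation) {
        return !operation.equals("read") || history.contains("open");
    }
}
```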

Note that not all errors that I would want to call type errors are necessarily
caught by the runtime -- it might go happily ahead never realising that it had
just allowed one of the constraints of one of the logics I use to reason about
the program to be broken.  That's known as an undetected bug -- but just because the runtime
doesn't see it, doesn't mean that I wouldn't say I'd made a type error.  (The
same applies to any specific static type system too, of course.)

But the checks the runtime does perform (whatever they are, and whenever they
happen), do between them constitute /a/ logic of correctness.  In many highly
dynamic languages that logic is very close to being maximally optimistic, but
it doesn't have to be (e.g. the runtime type checking in the JVM is pretty
pessimistic in many cases).

Anyway, that's more or less what I mean when I talk of dynamically typed
language and their dynamic type systems.


> I suspect you'll see the Smalltalk version of the objections raised in
> response to my post earlier.  In other words, whatever terminology you
> think is consistent, you'll probably have a tough time convincing
> Smalltalkers to stop saying "type" if they did before.  If you exclude
> "message not understood" as a type error, then I think you're excluding
> type errors from Smalltalk entirely, which contradicts the psychological
> understanding again.

Taking Smalltalk /specifically/, there is a definite sense in which it is
typeless -- or trivially typed -- in that, in that language, there are no[*]
operations which are forbidden[**], and none which might not be invoked
deliberately (e.g. I have code which deliberately reads off the end of a
container object -- just to make sure I raise the "right" error for that
container, rather than raising my own error).  But, on the other hand, I do
still want to talk of type, and type system, and type errors even when I
program Smalltalk, and when I do I'm thinking about "type" in something like
the above sense.

    -- chris

[*] I can't think of any offhand -- there may be a few.

[**] Although there are operations which are not possible, reading another
object's instvars directly for instance, which I suppose could be taken to
induce a non-trivial (and static) type logic.
From: Chris Smith
Subject: Re: What is a type error?
Date: 
Message-ID: <MPG.1f0490946f3e8a789896e0@news.altopia.net>
Chris Uppal <···········@metagnostic.REMOVE-THIS.org> wrote:
> I think we're agreed (you and I anyway, if not everyone in this thread) that we
> don't want to talk of "the" type system for a given language.  We want to allow
> a variety of verification logics.  So a static type system is a logic which can
> be implemented based purely on the program text without making assumptions
> about runtime events (or making maximally pessimistic assumptions -- which comes
> to the same thing really).  I suggest that a "dynamic type system" is a
> verification logic which (in principle) has available as input not only the
> program text, but also the entire history of the program execution up to the
> moment when the to-be-checked operation is invoked.

I am trying to understand how the above statement about dynamic types 
actually says anything at all.  So a dynamic type system is a system of 
logic by which, given a program and a path of program execution up to 
this point, verifies something.  We still haven't defined "something", 
though.  We also haven't defined what happens if that verification 
fails.  One or the other or (more likely) some combination of the two 
must be critical to the definition in order to exclude silly 
applications of it.  Presumably you want to exclude from your definition 
of a dynamic "type system" one which verifies that a value is non-negative, 
and if so executes the block of code following "then"; and otherwise, 
executes the block of code following "else".  Yet I imagine you don't 
want to exclude ALL systems that allow the programmer to execute 
different code when the verification fails (think exception handlers) 
versus succeeds, nor exclude ALL systems where the condition is that a 
value is non-negative.

In other words, I think that everything so far is essentially just 
defining a dynamic type system as equivalent to a formal semantics for a 
programming language, in different words that connote some bias toward 
certain ways of looking at possibilities that are likely to lead to 
incorrect program behavior.  I doubt that will be an attractive 
definition to very many people.

> Note that not all errors that I would want to call type errors are necessarily
> caught by the runtime -- it might go happily ahead never realising that it had
> just allowed one of the constraints of one of the logics I use to reason about
> the program to be broken.  That's known as an undetected bug -- but just because the runtime
> doesn't see it, doesn't mean that I wouldn't say I'd made a type error.  (The
> same applies to any specific static type system too, of course.)

In static type system terminology, this quite emphatically does NOT 
apply.  There may, of course, be undetected bugs, but they are not type 
errors.  If they were type errors, then they would have been detected, 
unless the compiler is broken.

If you are trying to identify a set of dynamic type errors, in a way 
that also applies to statically typed languages, then I will read on.

> But the checks the runtime does perform (whatever they are, and whenever they
> happen), do between them constitute /a/ logic of correctness.  In many highly
> dynamic languages that logic is very close to being maximally optimistic, but
> it doesn't have to be (e.g. the runtime type checking in the JVM is pretty
> pessimistic in many cases).
> 
> Anyway, that's more or less what I mean when I talk of dynamically typed
> language and their dynamic type systems.

So my objections, then, are in the first paragraph.

> [**] Although there are operations which are not possible, reading another
> object's instvars directly for instance, which I suppose could be taken to
> induce a non-trivial (and static) type logic.

In general, I wouldn't consider a syntactically incorrect program to 
have a static type error.  Type systems are, in fact, essentially a tool 
to separate concerns; specifically, to remove type-correctness concerns 
from the grammars of programming languages.  By doing so, we are able at 
least to considerably simplify the grammar of the language, and perhaps 
also to increase the "tightness" of the verification without risking 
making the language grammar context-sensitive.  (I'm unsure about the 
second part of that statement, but I can think of no obvious theoretical 
reason to assume that combining a type system with a regular context-
free grammar would yield another context-free grammar.  Then again, 
formal languages are not my strong point.)

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Eliot Miranda
Subject: Re: What is a type error?
Date: 
Message-ID: <zADmg.27129$VE1.10493@newssvr14.news.prodigy.com>
> Chris Smith wrote:
>>I suspect you'll see the Smalltalk version of the objections raised in
>>response to my post earlier.  In other words, whatever terminology you
>>think is consistent, you'll probably have a tough time convincing
>>Smalltalkers to stop saying "type" if they did before.  If you exclude
>>"message not understood" as a type error, then I think you're excluding
>>type errors from Smalltalk entirely, which contradicts the psychological
>>understanding again.
> 
Chris Uppal wrote:

> 
> Taking Smalltalk /specifically/, there is a definite sense in which it is
> typeless -- or trivially typed -- in that, in that language, there are no[*]
> operations which are forbidden[**],

Come on, Chris U.   One has to distinguish an attempt to invoke an 
operation from its being carried out.  There is nothing in Smalltalk to 
stop one attempting to invoke any "operation" on any object.  But one 
can only actually carry out operations on objects that implement them. 
So, apart from the argument about inadvertent operation name overloading 
(which is important, but avoidable), Smalltalk is in fact 
strongly-typed, but not statically strongly-typed.

-- 
_______________,,,^..^,,,____________________________
Eliot Miranda              Smalltalk - Scene not herd
From: Darren New
Subject: Re: What is a type error?
Date: 
Message-ID: <INDmg.3801$MF6.813@tornado.socal.rr.com>
Eliot Miranda wrote:
> can only actually carry-out operations on objects that implement them. 

Except that every operation is implemented by every object in Smalltalk. 
Unless you specify otherwise, the implementation of every method is to 
call the receiver with doesNotUnderstand.  (I don't recall whether the 
class of nil has a special rule for this or whether it implements 
doesNotUnderstand and invokes the appropriate "don't send messages to 
nil" method.)

There are a number of Smalltalk extensions, such as 
multiple-inheritance, that rely on implementing doesNotUnderstand.

-- 
   Darren New / San Diego, CA, USA (PST)
     Native Americans used every part
     of the buffalo, including the wings.
From: Eliot Miranda
Subject: Re: What is a type error?
Date: 
Message-ID: <0tXmg.28329$VE1.19407@newssvr14.news.prodigy.com>
Darren New wrote:

> Eliot Miranda wrote:
> 
>> can only actually carry-out operations on objects that implement them. 
> 
> 
> Except that every operation is implemented by every object in Smalltalk. 

No they are not.  Objects implement the methods defined by their class 
and inherit those implemented by the class's superclasses.  Note that 
classes do _not_ have to inherit from Object, and so one can create 
classes that implement no methods at all.  It is a common idiom to 
create a class that implements two methods, an initialization method and 
a doesNotUnderstand: method, to create a transparent proxy that 
intercepts any and all messages sent to it other than the initialization 
method (*).

[(*) With suitable hackery (since Smalltalk gives access to the 
execution stack through thisContext) one can also invoke 
doesNotUnderstand: if the initialization message is sent from any object 
other than the class.]
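For the Java readers: java.lang.reflect.Proxy gives a rough analogue of that
idiom, in that a single handler intercepts every call made through the
interface, much as a class implementing only doesNotUnderstand: intercepts
every send (names below are mine):

```java
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

public class TransparentProxy {
    static final List<String> sent = new ArrayList<>();

    /** Build a proxy whose one handler sees every "message" sent through Runnable. */
    static Runnable make() {
        return (Runnable) Proxy.newProxyInstance(
            TransparentProxy.class.getClassLoader(),
            new Class<?>[] { Runnable.class },
            (proxy, method, args) -> {
                sent.add(method.getName());  // every call lands here first...
                return null;                 // ...and we decide what to do with it
            });
    }

    public static void main(String[] args) {
        make().run();                        // intercepted rather than "understood"
        System.out.println(sent);            // prints "[run]"
    }
}
```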

> Unless you specify otherwise, the implementation of every method is to 
> call the receiver with doesNotUnderstand.  (I don't recall whether the 
> class of nil has a special rule for this or whether it implements 
> doesNotUnderstand and invokes the appropriate "don't send messages to 
> nil" method.)

No.  The run-time response to trying to invoke an operation that isn't 
implemented by an object is to send the doesNotUnderstand: message. 
This is another attempt to invoke an operation, i.e. whatever the 
object's doesNotUnderstand: method is, if any.  If the object doesn't 
implement doesNotUnderstand:, which is quite possible, then the system 
will likely crash (with an out-of-memory error as it goes into 
infinite recursion) or hang (looping while attempting to find an 
implementation of doesNotUnderstand:).

> There are a number of Smalltalk extensions, such as 
> multiple-inheritance, that rely on implementing doesNotUnderstand.

Which has nothing to do with the separation between message send and 
method invocation, or the fact that doesNotUnderstand: is a message 
send, not a hard call of a given doesNotUnderstand: method.
-- 
_______________,,,^..^,,,____________________________
Eliot Miranda              Smalltalk - Scene not herd
From: Darren New
Subject: Re: What is a type error?
Date: 
Message-ID: <hLXmg.3901$MF6.609@tornado.socal.rr.com>
Eliot Miranda wrote:
> classes do _not_ have to inherit from Object, 

I was unaware of this subtlety. Thanks!

-- 
   Darren New / San Diego, CA, USA (PST)
     Native Americans used every part
     of the buffalo, including the wings.
From: Chris Uppal
Subject: Re: What is a type error?
Date: 
Message-ID: <449bde5f$0$663$bed64819@news.gradwell.net>
Eliot Miranda wrote:

[me:]
> > Taking Smalltalk /specifically/, there is a definite sense in which it
> > is typeless -- or trivially typed -- in that, in that language, there are
> > no[*] operations which are forbidden[**],
>
> Come on, Chris U.   One has to distinguish an attempt to invoke an
> operation from its being carried out.  There is nothing in Smalltalk to
> stop one attempting to invoke any "operation" on any object.  But one
> can only actually carry out operations on objects that implement them.
> So, apart from the argument about inadvertent operation name overloading
> (which is important, but avoidable), Smalltalk is in fact
> strongly-typed, but not statically strongly-typed.

What are you doing /here/, Eliot?  This is Javaland.  Smalltalk is thatta
way ->

 ;-)


But this discussion has been all about /whether/ it is ok to apply the notion
of (strong) typing to what runtime-checked languages do.   We all agree that
the checks happen, but the question is whether it is
reasonable/helpful/legitimate to extend the language of static checking to the
dynamic case.  (I'm on the side which says yes, but good points have been made
against it).

The paragraph you quoted doesn't represent most of what I have been saying -- 
it was just a side-note looking at one small aspect of the issue from a
different angle.

    -- chris
From: Chris Uppal
Subject: Re: What is a type error?
Date: 
Message-ID: <449bde60$0$663$bed64819@news.gradwell.net>
Chris Smith wrote:

[me:]
> > I think we're agreed (you and I anyway, if not everyone in this thread)
> > that we don't want to talk of "the" type system for a given language.
> > We want to allow a variety of verification logics.  So a static type
> > system is a logic which can be implemented based purely on the program
> > text without making assumptions about runtime events (or making
> > maximally pessimistic assumptions -- which comes to the same thing
> > really).  I suggest that a "dynamic type system" is a verification
> > logic which (in principle) has available as input not only the program
> > text, but also the entire history of the program execution up to the
> > moment when the to-be-checked operation is invoked.
>
> I am trying to understand how the above statement about dynamic types
> actually says anything at all.  So a dynamic type system is a system of
> logic by which, given a program and a path of program execution up to
> this point, verifies something.  We still haven't defined "something",
> though.

That was the point of my first sentence (quoted above).  I take it, and I
assumed that you shared my view, that there is no single "the" type system -- 
that /a/ type system will yield judgements on matters which it has been
designed to judge.  So unless we either nominate a specific type system or
choose what judgements we want to make (today) any discussion of types is
necessarily parameterised on the class(es) of <Judgements>.  So, I don't -- 
can't -- say /which/ judgements my "dynamic type systems" will make.  They may
be about nullablity, they may be about traditional "type", they may be about
mutability...

When we look at a specific language (and its implementation), then we can
induce the logic(s) defined by whatever dynamic checks it applies.
Alternatively we can consider other "dynamic type systems" which we would like
to formalise and mechanise, but which are not built into our language of
choice.


> We also haven't defined what happens if that verification
> fails.

True, and a good point.  But note that it only applies to systems which are
actually implemented in code (or which are intended to be so).

As a first thought, I suggest that a "dynamic type system" should specify a
replacement action (which includes, but is not limited to, terminating
execution).  That action is taken /instead/ of the rejected one.  E.g. we don't
actually read off the end of the array, but instead a signal is raised. (An
implementation might, in some cases, find it easier to implement the checks by
allowing the operation to fail, and then backtracking to "pretend" that it had
never happened, but that's irrelevant here).  The replacement action must -- of
course -- be valid according to the rules of the type system.
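A small illustration of the replacement-action idea (using Optional as the
replacement is my choice here, not part of the proposal):

```java
import java.util.Optional;

/** The rejected read never happens; a valid action -- returning empty --
    is taken instead, and that action is itself well-typed. */
class CheckedArray {
    private final int[] data;
    CheckedArray(int[] data) { this.data = data; }

    Optional<Integer> get(int i) {
        if (i < 0 || i >= data.length)
            return Optional.empty();   // replacement action instead of the bad read
        return Optional.of(data[i]);   // the checked operation proceeds
    }
}
```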

Incidentally, using that idea, we get a fairly close analogy to the difference
between strong and weak static type systems.  If the "dynamic type system"
doesn't specify a valid replacement action, or if that action is not guaranteed
to be taken, then it seems that the type system or the language implementation
is "weak" in very much the same sort of way as -- say -- the 'C' type system is
weak and/or weakly enforced.

I wonder whether that way of looking at it -- the "error" never happens since
it is replaced by a valid operation -- puts what I want to call dynamic type
systems onto a closer footing with static type systems.  Neither allows the
error to occur.

(Of course, /I/ -- the programmer -- have made a type error, but that's a
different thing.)


> In other words, I think that everything so far is essentially just
> defining a dynamic type system as equivalent to a formal semantics for a
> programming language, in different words that connote some bias toward
> certain ways of looking at possibilities that are likely to lead to
> incorrect program behavior.  I doubt that will be an attractive
> definition to very many people.

My only objections to this are:

a) I /want/ to use the word "type" to describe the kind of reasoning I do (and
some of the mistakes I make)

b) I want to separate the systems of reasoning (whether formal or informal,
static or dynamic, implemented or merely conceptual, and /whatever/ we call 'em
;-) from the language semantics.  I have no objection to <some type system>
being used as part of a language specification, but I don't want to restrict
types to that.


> > Note that not all errors that I would want to call type errors are
> > necessarily caught by the runtime -- it might go happily ahead never
> > realising that it had just allowed a violation of one of the constraints of
> > one of the logics I use to reason about the program.  What's known as an
> > undetected bug -- but just because the runtime doesn't see it, doesn't
> > mean that I wouldn't say I'd made a type error.  (The same applies to
> > any specific static type system too, of course.)
>
> In static type system terminology, this quite emphatically does NOT
> apply.  There may, of course, be undetected bugs, but they are not type
> errors.  If they were type errors, then they would have been detected,
> unless the compiler is broken.

Sorry, I wasn't clear.  I was thinking here of my internal analysis (which I
want to call a type system too).  Most of what I was trying to say is that I
don't expect a "dynamic type system" to be complete, any more than a static
one.   I also wanted to emphasise that I am happy to talk about type systems
(dynamic or not) which have not been implemented as code (so they yield
judgements, and provide a framework for understanding the program, but nothing
in the computer actually checks them for me).


> If you are trying to identify a set of dynamic type errors, in a way
> that also applies to statically typed languages, then I will read on.

Have I answered that now?  /Given/ a set of operations which I want to forbid,
I would like to say that a "dynamic type system" which prevents them executing
is doing very much the same job as a static type system which stops the code
compiling.

Of course, we can talk about what kinds of operations we want to forbid, but
that seems (to me) to be orthogonal to this discussion.   Indeed, the question
of dynamic/static is largely irrelevant to a discussion of what operations we
want to police, except insofar as certain checks might require information
which isn't available to a (feasible) static theorem prover.

    -- chris
From: Chris Smith
Subject: Re: What is a type error?
Date: 
Message-ID: <MPG.1f05caefbb39e8189896ea@news.altopia.net>
Chris Uppal <···········@metagnostic.REMOVE-THIS.org> wrote:
> That was the point of my first sentence (quoted above).  I take it, and I
> assumed that you shared my view, that there is no single "the" type system -- 
> that /a/ type system will yield judgements on matters which it has been
> designed to judge.

Yes, I definitely agree.

My point was that if you leave BOTH the "something" and the response to 
verification failure undefined, then your definition of a dynamic type 
system is a generalization of the definition of a conditional 
expression.  That is (using Java),

    if (x != 0) y = 1 / x;
    else y = 999999999;

is not all that much different from (now restricting to Java):

    try { y = 1 / x; }
    catch (ArithmeticException e) { y = 999999999; }

So is one of them a use of a dynamic type system, where the other is 
not?

> Incidentally, using that idea, we get a fairly close analogy to the difference
> between strong and weak static type systems.

I think, actually, that the analogy of the strong/weak distinction 
merely has to do with how much verification is done.  But then, I 
dislike discussion of strong/weak type systems in the first place.  It 
doesn't make any sense to me to say that we verify something and then 
don't do anything if the verification fails.  In those cases, I'd just 
say that verification doesn't really exist or is incorrectly 
implemented.  Of course, this would make the type system weaker.

> I wonder whether that way of looking at it -- the "error" never happens since
> it is replaced by a valid operation -- puts what I want to call dynamic type
> systems onto a closer footing with static type systems.

Perhaps.  I'm thinking some things over before I respond to Anton.  I'll 
do that first, and some of my thoughts there may end up being relevant 
to this question.

> b) I want to separate the systems of reasoning (whether formal or informal,
> static or dynamic, implemented or merely conceptual, and /whatever/ we call 'em
> ;-) from the language semantics.  I have no objection to <some type system>
> being used as part of a language specification, but I don't want to restrict
> types to that.

In the pragmatic sense of this desire, I'd suggest that a way to 
characterize this is that your type system is a set of changes to the 
language semantics.  By applying them to a language L, you obtain a 
different language L' which you can then use.  If your type system has 
any impact on the set of programs accepted by the language (static 
types) or on any observable behavior which they exhibit (dynamic types), 
then I can't see a justification for claiming that the result is the 
same language; and if it does not, then the whole exercise is silly.

> Of course, we can talk about what kinds of operations we want to forbid, but
> that seems (to me) to be orthogonal to this discussion.   Indeed, the question
> of dynamic/static is largely irrelevant to a discussion of what operations we
> want to police, except insofar as certain checks might require information
> which isn't available to a (feasible) static theorem prover.

Indeed, that is orthogonal to the discussion from the perspective of 
static types.  If you are proposing that it is also orthogonal with 
respect to dynamic types, that will be a welcome step toward our goal of 
a grand unified type theory, I suppose.  I have heard from others in 
this thread that they don't believe so.  I am also interested in your 
response to the specific example involving an if statement at the 
beginning of this reply.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Chris Uppal
Subject: Re: What is a type error?
Date: 
Message-ID: <449c4826$1$657$bed64819@news.gradwell.net>
Chris Smith wrote:

> But then, I
> dislike discussion of strong/weak type systems in the first place.  It
> doesn't make any sense to me to say that we verify something and then
> don't do anything if the verification fails.  In those cases, I'd just
> say that verification doesn't really exist or is incorrectly
> implemented.

Hmm, that seems to make what we are doing when we run some tests dependent on
what we do when we get the results.  If I run a type inference algorithm on
some of my Smalltalk code, and it tells me that something is type-incorrect,
then I think I'd like to be able to use the same words for the checking
regardless of whether, after due consideration, I decide to change my code, or
that it is OK as-is.  Minor point, though...

Anyway:

>     if (x != 0) y = 1 / x;
>     else y = 999999999;
>
> is not all that much different from (now restricting to Java):
>
>     try { y = 1 / x; }
>     catch (ArithmeticException e) { y = 999999999; }
>
> So is one of them a use of a dynamic type system, where the other is
> not?

My immediate thought is that the same question is equally applicable (and
equally interesting -- i.e. no simple answer ;-) for static typing.  Assuming
Java's current runtime semantics, but positing different type checkers,  we can
get several answers.

A] Trivial: neither involves the type system at all.  What we are seeing
is a runtime check (not type check) correctly applied, or made unnecessary by
conditional code.  And, obviously we can take the same line for the "dynamic
type checking" -- i.e. that no type checking is involved.

B] More interesting, we could extend Java's type system with a NotZero
numeric type (or rather, about half a dozen such), and decree that 1 / x was
only legal if
    x : NotZero
Under that type system, neither of the above would be type legal, assuming x is
declared int.  Or both would be legal if it was declared NotZero -- but in both
cases there'd be a provably unnecessary runtime check.  The nearest "dynamic
type system" equivalent would be quite happy with x : int.  For the first
example, the runtime checks would never fire (though they would execute).  For
the second the runtime "type" might come into play, and if it did, then the
assignment of 999999999 would take place /because/ the illegal operation (which
can be thought of as casting 0 to NotZero) had been replaced by a specific
legal one (throwing an ArithmeticException in this case).
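A sketch of how the [B] idea might be approximated as a library class in ordinary Java (hypothetical code; Java's type system cannot express [B] directly, so the zero check moves into the one construction point):

```java
public final class NotZero {
    private final int value;

    private NotZero(int value) { this.value = value; }

    // The only way to obtain a NotZero is through this check, so any
    // expression of type NotZero is guaranteed non-zero thereafter.
    public static NotZero of(int value) {
        if (value == 0)
            throw new IllegalArgumentException("0 cannot be cast to NotZero");
        return new NotZero(value);
    }

    // Division by a NotZero can never raise ArithmeticException.
    public int dividedInto(int numerator) {
        return numerator / value;
    }
}
```

The "provably unnecessary runtime check" above corresponds to the `value == 0` test in `of`, which a checker implementing [B] or [D] could elide.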

C] A different approach would be to type the execution, instead of the value,
so we extend Java's type system so that:
    1 / x
had type (in part) <computation which can throw ArithmeticException>.  The
first example as a whole then would also have (in part) that type, and the
static type system would deem that illegal.  The second example, however, would
be type-legal.  I hope I don't seem to be fudging the issue, but I don't really
think this version has a natural "dynamic type system" equivalent -- what could
the checker check?  It would see an attempt to throw an exception from a place
where that's not legal, but it wouldn't make a lot of sense to replace it with
an IllegalExceptionException ;-)

D] Another possibility would be a static type checker after the style of [B]
but extended so that it recognised the x != 0 guard as implicitly giving x :
NotZero.  In that case, again, the first example is static type-correct but the
second is not.  In this case the corresponding "dynamic type system" could, in
principle, avoid testing x for zero the second time.  I don't know whether that
would be worth the effort to implement.  In the second example, just as in
[B], the reason the program doesn't crash when x == 0 is that the "dynamic type
system" has required that the division be replaced with a throw.

One could go further (merge [B] and [C], or [C] and [D] perhaps), but I don't
think it adds much, and anyway I'm getting a bit bored....   (And also getting
sick of putting scare quotes around "dynamic type system" every time ;-)

Personally, of the above, I find [B] most appealing.

    -- chris
From: Chris Smith
Subject: Re: What is a type error?
Date: 
Message-ID: <MPG.1f060fc6e91b10f89896ef@news.altopia.net>
Chris Uppal <···········@metagnostic.REMOVE-THIS.org> wrote:
> >     if (x != 0) y = 1 / x;
> >     else y = 999999999;
> >
> > is not all that much different from:
> >
> >     try { y = 1 / x; }
> >     catch (ArithmeticException e) { y = 999999999; }

> My immediate thought is that the same question is equally applicable (and
> equally interesting -- i.e. no simple answer ;-) for static typing.

I don't see that.  Perhaps I'm missing something.  If I were to define a 
type system for such a language that is vaguely Java-based but has 
NotZero as a type, it would basically implement your option [D], and 
contain the following rules:

1. NotZero is a subtype of Number
2. Zero is a subtype of Number

3. If x : Number and k : Zero, then "if (x != k) a else b" is well-typed 
iff "a" is well-typed when the type-environment is augmented with the 
proposition "x : NotZero" and "b" is well-typed when the type-
environment is augmented with the new proposition "x : Zero".

Yes, I know that Java doesn't do this right now in similar places (e.g., 
instanceof), and I don't know why not.  This kind of thing isn't even 
theoretically interesting at all, except that it makes the proof of 
type-soundness look different.  As far as I can see, it ought to be 
second-nature in developing a type system for a language.
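Rule 3 is exactly the kind of flow-sensitive narrowing that pattern-matching `instanceof` performs (assuming a Java version with that feature, 16 or later); a small sketch by analogy:

```java
public class Narrowing {
    // Inside the true branch the type environment is augmented with s : String,
    // just as rule 3 augments it with x : NotZero after the x != 0 test.
    static int describe(Object o) {
        if (o instanceof String s) {
            return s.length();   // well-typed only because of the narrowing
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(describe("hello")); // 5
        System.out.println(describe(42));      // -1
    }
}
```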

In that case, the first example is well-typed, and also runs as 
expected.  The second bit of code is not well-typed, and thus 
constitutes a type error.  The way we recognize it as a type error, 
though, is in the fact that the compiler aborts while checking the 
typing relation, rather than because of the kind of problem prevented.  
Of course, it would be possible to build a static type system in which 
different things are true.  For example, Java itself is a system in 
which both terms are well-typed.

From a dynamic type perspective, the litany of possibilities doesn't 
really do much for defining a dynamic type system, which I thought was 
our goal in this bit of the thread.  I may not have been clear enough 
that I was looking for a priori grounds to classify these two cases by 
the definition.  I was assuming the existing behavior of Java, which 
says that both accomplish the same thing and react in the same way, yet 
intuitively we seem to think of one as a type error being "solved", 
whereas the other is not.

Anyway, you just wrote another message that is far closer to what I feel 
we ought to be thinking about than my question here... so I will get to 
reading that one, and perhaps we can take up this point later when and 
if it is appropriate.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1151094574.908557.86040@p79g2000cwp.googlegroups.com>
Chris Uppal wrote:
>  /Given/ a set of operations which I want to forbid,
> I would like to say that a "dynamic type system" which prevents them executing
> is doing very much the same job as a static type system which stops the code
> compiling.

There are similarities (classifying values into categories, and checking
whether functions are sensibly invoked according to these categories),
and there are differences (in the difference between existential
quantification of success vs. universal quantification of success).
They are certainly not identical. (I'm not sure if by "very much the
same" you meant identical or not.)


> Of course, we can talk about what kinds of operations we want to forbid, but
> that seems (to me) to be orthogonal to this discussion.   Indeed, the question
> of dynamic/static is largely irrelevant to a discussion of what operations we
> want to police, except insofar as certain checks might require information
> which isn't available to a (feasible) static theorem prover.

Sometimes the things you want to check are emergent properties
of a program rather than local properties. I don't see how runtime
checks can do anything with these properties, at least not without
a formal framework. An example is deadlock freedom. There
exist type systems that can prove code free of deadlock or
race conditions. How would you write a runtime check for that?


Marshall
From: Pascal Costanza
Subject: Re: What is a type error?
Date: 
Message-ID: <4g21q7F1lf5lsU1@individual.net>
Matthias Blume wrote:
> Pascal Costanza <··@p-cos.net> writes:
> 
>> Chris Smith wrote:
>>
>>> While this effort to salvage the term "type error" in dynamic
>>> languages is interesting, I fear it will fail.  Either we'll all
>>> have to admit that "type" in the dynamic sense is a psychological
>>> concept with no precise technical definition (as was at least hinted
>>> by Anton's post earlier, whether intentionally or not) or someone is
>>> going to have to propose a technical meaning that makes sense,
>>> independently of what is meant by "type" in a static system.
>> What about this: You get a type error when the program attempts to
>> invoke an operation on values that are not appropriate for this
>> operation.
>>
>> Examples: adding numbers to strings; determining the string-length of
>> a number; applying a function on the wrong number of parameters;
>> applying a non-function; accessing an array with out-of-bound indexes;
>> etc.
> 
> Yes, the phrase "runtime type error" is actually a misnomer.  What one
> usually means by that is a situation where the operational semantics
> is "stuck", i.e., where the program, while not yet arrived at what's
> considered a "result", cannot make any progress because the current
> configuration does not match any of the rules of the dynamic
> semantics.
> 
> The reason why we call this a "type error" is that such situations are
> precisely the ones we want to statically rule out using sound static
> type systems.  So it is a "type error" in the sense that the static
> semantics was not strong enough to rule it out.
> 
>> Sending a message to an object that does not understand that message
>> is a type error. The "message not understood" machinery can be seen
>> either as a way to escape from this type error in case it occurs and
>> allow the program to still do something useful, or to actually remove
>> (some) potential type errors.
> 
> I disagree with this.  If the program keeps running in a defined way,
> then it is not what I would call a type error.  It definitely is not
> an error in the sense I described above.

If your view of a running program is that it is a "closed" system, then 
you're right. However, maybe there are several layers involved, so what 
appears to be a well-defined behavior from the outside may still be 
regarded as a type error internally.

A very obvious example of this is when you run a program in a debugger. 
There are two levels involved here: the program signals a type error, 
but that doesn't mean that the system as a whole is stuck. Instead, the 
debugger takes over and offers ways to deal with the type error. The 
very same program run without debugging support would indeed simply be 
stuck in the same situation.

So to rephrase: It depends on whether you use the "message not 
understood" machinery as a way to provide well-defined behavior for the 
base level, or rather as a means to deal with an otherwise unanticipated 
situation. In the former case it extends the language to remove certain 
type errors, in the latter case it provides a kind of debugging facility 
(and it indeed may be a good idea to deploy programs with debugging 
facilities, and not only use debugging tools at development time).
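A rough Java analogy of the two uses (a hypothetical map-based dispatcher; the selector names are invented): with no handler installed, an unknown message is a stuck state surfaced as an exception for some outer level to deal with; installing a handler instead extends the base level with well-defined behaviour, removing that type error.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class Receiver {
    private final Map<String, Function<Object, Object>> methods = new HashMap<>();

    // Debugging-facility use: surface the stuck state to a higher level.
    private Function<String, Object> notUnderstood = selector -> {
        throw new UnsupportedOperationException("message not understood: " + selector);
    };

    public void define(String selector, Function<Object, Object> body) {
        methods.put(selector, body);
    }

    // Base-level use: installing a handler removes the type error entirely.
    public void onNotUnderstood(Function<String, Object> handler) {
        notUnderstood = handler;
    }

    public Object send(String selector, Object arg) {
        Function<Object, Object> m = methods.get(selector);
        return (m != null) ? m.apply(arg) : notUnderstood.apply(selector);
    }
}
```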

This is actually related to the notion of reflection, as coined by Brian 
C. Smith. In a reflective architecture, you distinguish between various 
interpreters, each of which interprets the program at the next level. A 
debugger is a program that runs at a different level than a base program 
that it debugs. However, the reflective system as a whole is "just" a 
single program seen from the outside (with one interpreter that runs the 
whole reflective tower). This distinction between the internal and the 
external view of a reflective system was already made by Brian Smith.


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: David Hopwood
Subject: Re: What is a type error?
Date: 
Message-ID: <M40ng.475401$xt.231558@fe3.news.blueyonder.co.uk>
Pascal Costanza wrote:
> Chris Smith wrote:
> 
>> While this effort to salvage the term "type error" in dynamic
>> languages is interesting, I fear it will fail.  Either we'll all have
>> to admit that "type" in the dynamic sense is a psychological concept
>> with no precise technical definition (as was at least hinted by
>> Anton's post earlier, whether intentionally or not) or someone is
>> going to have to propose a technical meaning that makes sense,
>> independently of what is meant by "type" in a static system.
> 
> What about this: You get a type error when the program attempts to
> invoke an operation on values that are not appropriate for this operation.
> 
> Examples: adding numbers to strings; determining the string-length of a
> number; applying a function on the wrong number of parameters; applying
> a non-function; accessing an array with out-of-bound indexes; etc.

This makes essentially all run-time errors (including assertion failures,
etc.) "type errors". It is neither consistent with, nor any improvement
on, the existing vaguely defined usage.

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Pascal Costanza
Subject: Re: What is a type error?
Date: 
Message-ID: <4gakhiF1m698cU1@individual.net>
David Hopwood wrote:
> Pascal Costanza wrote:
>> Chris Smith wrote:
>>
>>> While this effort to salvage the term "type error" in dynamic
>>> languages is interesting, I fear it will fail.  Either we'll all have
>>> to admit that "type" in the dynamic sense is a psychological concept
>>> with no precise technical definition (as was at least hinted by
>>> Anton's post earlier, whether intentionally or not) or someone is
>>> going to have to propose a technical meaning that makes sense,
>>> independently of what is meant by "type" in a static system.
>> What about this: You get a type error when the program attempts to
>> invoke an operation on values that are not appropriate for this operation.
>>
>> Examples: adding numbers to strings; determining the string-length of a
>> number; applying a function on the wrong number of parameters; applying
>> a non-function; accessing an array with out-of-bound indexes; etc.
> 
> This makes essentially all run-time errors (including assertion failures,
> etc.) "type errors". It is neither consistent with, nor any improvement
> on, the existing vaguely defined usage.

Nope. This is again a matter of getting the levels right.

Consider division by zero: appropriate arguments for division are 
numbers, including the zero. The dynamic type check will typically not 
check whether the second argument is zero, but will count on the fact 
that the processor will raise an exception one level deeper.
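A minimal Java illustration of that level distinction (nothing here beyond standard Java behaviour): the division is well-typed on ints and passes every check the type system makes, and the failure is raised one level deeper, by the VM.

```java
public class DivByZero {
    static String divide(int x, int y) {
        try {
            return "result: " + (x / y);   // passes every type check on int
        } catch (ArithmeticException e) {
            return "raised one level deeper: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(divide(6, 3));  // result: 2
        System.out.println(divide(6, 0));  // raised one level deeper
    }
}
```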

This is maybe better understandable in user-level code. Consider the 
following class definition:

class Person {
   String name;
   int age;

   void buyPorn() {
    if (this.age < 18) throw new AgeRestrictionException();
    ...
   }
}

The message p.buyPorn() is a perfectly valid message send and will pass 
both static and dynamic type tests (given p is of type Person in the 
static case). However, you will still get a runtime error.


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: Chris Smith
Subject: Re: What is a type error?
Date: 
Message-ID: <MPG.1f09cc2facf5591b98970f@news.altopia.net>
Pascal Costanza <··@p-cos.net> wrote:
> Consider division by zero: appropriate arguments for division are 
> numbers, including the zero. The dynamic type check will typically not 
> check whether the second argument is zero, but will count on the fact 
> that the processor will raise an exception one level deeper.

Of course zero is not appropriate as a second argument to the division 
operator!  I can't possibly see how you could claim that it is.  The 
only reasonable statement worth making is that there doesn't exist a 
type system in widespread use that is capable of checking this.

> This is maybe better understandable in user-level code. Consider the 
> following class definition:
> 
> class Person {
>    String name;
>    int age;
> 
>    void buyPorn() {
>     if (this.age < 18) throw new AgeRestrictionException();
>     ...
>    }
> }
> 
> The message p.buyPorn() is a perfectly valid message send and will pass 
> both static and dynamic type tests (given p is of type Person in the 
> static case).

It appears you've written the code above to assume that the type system 
can't certify that age >= 18... otherwise, the if statement would not 
make sense.  It also looks like Java, in which the type system is indeed 
not powerful enough to do that check statically.  However, it sounds as 
if you're claiming that it wouldn't be possible for the type system to 
do this?  If so, that's not correct.  If such a thing were checked at 
compile-time by a static type check, then failing to actually provide 
that guarantee would be a type error, and the compiler would tell you 
so.

Whether you'd choose to call an equivalent runtime check a "dynamic type 
check" is a different matter, and highlights the fact that again there's 
something fundamentally different about the definition of a "type" in 
the static and dynamic sense.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Pascal Costanza
Subject: Re: What is a type error?
Date: 
Message-ID: <4galqlF1mae4qU1@individual.net>
Chris Smith wrote:
> Pascal Costanza <··@p-cos.net> wrote:
>> Consider division by zero: appropriate arguments for division are 
>> numbers, including the zero. The dynamic type check will typically not 
>> check whether the second argument is zero, but will count on the fact 
>> that the processor will raise an exception one level deeper.
> 
> Of course zero is not appropriate as a second argument to the division 
> operator!  I can't possibly see how you could claim that it is.  The 
> only reasonable statement worth making is that there doesn't exist a 
> type system in widespread use that is capable of checking this.

...and this is typically not even checked at the stage where type tags 
are checked in dynamically-typed languages. Hence, it is not a type 
error. (A type error is always what you define to be a type error within 
a given type system, right?)

Note, this example was in response to David's reply that my definition 
turns every runtime error into a type error. That's not the case, and 
that's all I want to say.

>> This is maybe better understandable in user-level code. Consider the 
>> following class definition:
>>
>> class Person {
>>    String name;
>>    int age;
>>
>>    void buyPorn() {
>>     if (this.age < 18) throw new AgeRestrictionException();
>>     ...
>>    }
>> }
>>
>> The message p.buyPorn() is a perfectly valid message send and will pass 
>> both static and dynamic type tests (given p is of type Person in the 
>> static case).
> 
> It appears you've written the code above to assume that the type system 
> can't certify that age >= 18... otherwise, the if statement would not 
> make sense.

Right.

> It also looks like Java, in which the type system is indeed 
> not powerful enough to do that check statically.

Right.

> However, it sounds as 
> if you're claiming that it wouldn't be possible for the type system to 
> do this?

No. I only need an example where a certain error is not a type error in 
_some_ language. I don't need to make a universal claim here.

> If so, that's not correct.  If such a thing were checked at 
> compile-time by a static type check, then failing to actually provide 
> that guarantee would be a type error, and the compiler would tell you 
> so.

That's fine, but not relevant in this specific subplot.


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: Chris Smith
Subject: Re: What is a type error?
Date: 
Message-ID: <MPG.1f09db1c1b611507989710@news.altopia.net>
Pascal Costanza <··@p-cos.net> wrote:
> Chris Smith wrote:
> > Of course zero is not appropriate as a second argument to the division 
> > operator!  I can't possibly see how you could claim that it is.  The 
> > only reasonable statement worth making is that there doesn't exist a 
> > type system in widespread use that is capable of checking this.
> 
> ...and this is typically not even checked at the stage where type tags 
> are checked in dynamically-typed languages. Hence, it is not a type 
> error. (A type error is always what you define to be a type error within 
>   a given type system, right?)

Sure, it's not a type error for that language.

> Note, this example was in response to David's reply that my definition 
> turns every runtime error into a type error. That's not the case, and 
> that's all I want to say.

But your definition does do that.  Your definition of a type error was 
when a program attempts to invoke an operation on values that are not 
appropriate for this operation.  Clearly, in this example, the program 
is invoking an operation (division) on values that are not appropriate 
(zero for the second argument).  Hence, if your definition really is a 
definition, then this must qualify.

> > However, it sounds as 
> > if you're claiming that it wouldn't be possible for the type system to 
> > do this?
> 
> No. I only need an example where a certain error is not a type error in 
> _some_ language. I don't need to make a universal claim here.

Definitions are understood to be statements of the form "if and only 
if".  They assert that /everything/ that fits the definition is 
describable by the word, and /everything/ that doesn't is not 
describable by the word.  If that's not what you meant, then we are 
still in search of a definition.

If you want to make a statement instead of the sort you've implied 
above... namely that a type error is *any* error that's raised by a type 
system, then that's fine.  It certainly brings us closer to static 
types.  Now, though, the task is to define a type system without making 
a circular reference to types.  You've already rejected the statement 
that all runtime errors are type errors, because you specifically reject 
the division by zero case as a type error for most languages.  What is 
that distinction?

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Pascal Costanza
Subject: Re: What is a type error?
Date: 
Message-ID: <4gb8tiF1e75njU1@individual.net>
Chris Smith wrote:
> Pascal Costanza <··@p-cos.net> wrote:
>> Chris Smith wrote:
>>> Of course zero is not appropriate as a second argument to the division 
>>> operator!  I can't possibly see how you could claim that it is.  The 
>>> only reasonable statement worth making is that there doesn't exist a 
>>> type system in widespread use that is capable of checking this.
>> ...and this is typically not even checked at the stage where type tags 
>> are checked in dynamically-typed languages. Hence, it is not a type 
>> error. (A type error is always what you define to be a type error within 
>> a given type system, right?)
> 
> Sure, it's not a type error for that language.
> 
>> Note, this example was in response to David's reply that my definition 
>> turns every runtime error into a type error. That's not the case, and 
>> that's all I want to say.
> 
> But your definition does do that.  Your definition of a type error was 
> when a program attempts to invoke an operation on values that are not 
> appropriate for this operation. 

In my definition, I didn't say who or what decides what values are 
appropriate for operations.

> Clearly, in this example, the program 
> is invoking an operation (division) on values that are not appropriate 
> (zero for the second argument).  Hence, if your definition really is a 
> definition, then this must qualify.

Here, you are asking more from dynamic type systems than any existing 
type system provides. The definition of what is considered to be a type 
error and what not is always part of the specification of a type system.

>>> However, it sounds as 
>>> if you're claiming that it wouldn't be possible for the type system to 
>>> do this?
>> No. I only need an example where a certain error is not a type error in 
>> _some_ language. I don't need to make a universal claim here.
> 
> Definitions are understood to be statements of the form "if and only 
> if".  They assert that /everything/ that fits the definition is 
> describable by the word, and /everything/ that doesn't is not 
> describable by the word.  If that's not what you meant, then we are 
> still in search of a definition.
> 
> If you want to make a statement instead of the sort you've implied 
> above... namely that a type error is *any* error that's raised by a type 
> system, then that's fine.  It certainly brings us closer to static 
> types.  Now, though, the task is to define a type system without making 
> a circular reference to types.  You've already rejected the statement 
> that all runtime errors are type errors, because you specifically reject 
> the division by zero case as a type error for most languages.  What is 
> that distinction?

I don't need such a distinction. I can just define what is covered by a 
type system and what is not. All type systems work that way.


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: Chris Smith
Subject: Re: What is a type error?
Date: 
Message-ID: <MPG.1f0a200a59a26bd9989718@news.altopia.net>
Pascal Costanza <··@p-cos.net> wrote:
> > Clearly, in this example, the program 
> > is invoking an operation (division) on values that are not appropriate 
> > (zero for the second argument).  Hence, if your definition really is a 
> > definition, then this must qualify.
> 
> Here, you are asking more from dynamic type systems than any existing 
> type system provides. The definition of what is considered to be a type 
> error and what not is always part of the specification of a type system.

No, I'm not.  Were we to follow this path of definition, it would not 
require that dynamic type systems catch ALL type errors; only that they 
catch some.  Not that it matters, though.  I am analyzing the logical 
consequences of your own definition.  If those logical consequences were 
impossible to fulfill, that would be an even more convincing reason to 
find a new definition.  Concepts of fairness don't enter into the 
equation.

> > If you want to make a statement instead of the sort you've implied 
> > above... namely that a type error is *any* error that's raised by a type 
> > system, then that's fine.  It certainly brings us closer to static 
> > types.  Now, though, the task is to define a type system without making 
> > a circular reference to types.  You've already rejected the statement 
> > that all runtime errors are type errors, because you specifically reject 
> > the division by zero case as a type error for most languages.  What is 
> > that distinction?
> 
> I don't need such a distinction. I can just define what is covered by a 
> type system and what is not. All type systems work that way.

You've got to define something... or, I suppose, just go on using words 
without definitions.  You claimed to give a definition, though.

I have been led by others to believe that the right way to go in the 
dynamic type sense is to first define type errors, and then just call 
anything that finds type errors a type system.  That would work, but it 
would first require a definition of a type error.  You seem to want to 
push that work off of 
the word "type error" and back onto "type system", but that requires 
that you define what a type system is.  It doesn't work for you to say 
BOTH that a type system is anything that catches type errors, AND that a 
type error is anything that's caught by a type system.  That's not a 
definition; it's just circular reasoning.  If I can plug in: type error 
= fly, type system = frog, and get a valid instance of your definitions, 
then you aren't giving much of a definition.  (Unless you think that 
frogs are perfectly good examples of type systems.)

Incidentally, in the case of static type systems, we define the system 
(for example, using the definition given from Pierce many times this 
thread), and then infer the definition of types and type errors from 
there.  Because a solid definition has been given first for a type 
system without invoking the concepts of types or type errors, this 
avoids being circular.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Pascal Costanza
Subject: Re: What is a type error?
Date: 
Message-ID: <4gdkduF1ldalhU1@individual.net>
Chris Smith wrote:
> Pascal Costanza <··@p-cos.net> wrote:
>>> Clearly, in this example, the program 
>>> is invoking an operation (division) on values that are not appropriate 
>>> (zero for the second argument).  Hence, if your definition really is a 
>>> definition, then this must qualify.
>> Here, you are asking more from dynamic type systems than any existing 
>> type system provides. The definition of what is considered to be a type 
>> error and what not is always part of the specification of a type system.
> 
> No, I'm not.  Were we to follow this path of definition, it would not 
> require that dynamic type systems catch ALL type errors; only that they 
> catch some.  Not that it matters, though.  I am analyzing the logical 
> consequences of your own definition.  If those logical consequences were 
> impossible to fulfill, that would be an even more convincing reason to 
> find a new definition.  Concepts of fairness don't enter into the 
> equation.

Yes it does. All static type systems define only a strict subset of all 
possible errors to be covered, and leave others as runtime errors. The 
same holds for dynamic type systems.

>>> If you want to make a statement instead of the sort you've implied 
>>> above... namely that a type error is *any* error that's raised by a type 
>>> system, then that's fine.  It certainly brings us closer to static 
>>> types.  Now, though, the task is to define a type system without making 
>>> a circular reference to types.  You've already rejected the statement 
>>> that all runtime errors are type errors, because you specifically reject 
>>> the division by zero case as a type error for most languages.  What is 
>>> that distinction?
>> I don't need such a distinction. I can just define what is covered by a 
>> type system and what is not. All type systems work that way.
> 
> You've got to define something... or, I suppose, just go on using words 
> without definitions.  You claimed to give a definition, though.
> 
> I have been led by others to believe that the right way to go in the 
> dynamic type sense is to first define type errors, and then just call 
> anything that finds type errors a type system.  That would work, but it 
> would first require a definition of a type error.  You seem to want to 
> push that work off of 
> the word "type error" and back onto "type system", but that requires 
> that you define what a type system is.

I didn't know I was supposed to give a definition.

> Incidentally, in the case of static type systems, we define the system 
> (for example, using the definition given from Pierce many times this 
> thread), and then infer the definition of types and type errors from 
> there.  Because a solid definition has been given first for a type 
> system without invoking the concepts of types or type errors, this 
> avoids being circular.

Do you mean this one? "A type system is a tractable syntactic method for 
proving the absence of certain program behaviors by classifying phrases 
according to the kinds of values they compute." This isn't really such a 
strong definition because it shifts the problem of what exactly a type 
system is to finding a definition for the word "kind".

But if this makes you happy, here is an attempt:

"A dynamic type system is a tractable syntactic method for proving the 
absence of certain program behaviors by classifying values according to 
their kinds."

Basically, this correlates with the typical approach of using tags to 
indicate the type/kind/class of values. And indeed, this is what dynamic 
type systems typically do: they check tags associated with values. Other 
kinds of runtime errors are not raised because of such checks, but 
because of other reasons. Therefore, there is naturally a distinction 
between runtime errors that are type errors and those that are not.
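
Pascal's distinction can be made concrete with a short illustration. This is my own sketch in Python rather than anything from the thread; the point is only that a dynamically-typed runtime signals tag-check failures with a dedicated error class, while division by zero comes out of a different check entirely:

```python
# A small illustration (Python, not from the thread) of the
# distinction drawn above: a failed tag check raises TypeError,
# but division by zero raises a different error class entirely,
# because it is not a tag check.

def classify(thunk):
    """Run thunk and report which class of runtime error it raises."""
    try:
        thunk()
        return "no error"
    except TypeError:
        return "type error"          # raised by a tag check
    except ZeroDivisionError:
        return "other runtime error"

print(classify(lambda: 1 + "a"))     # type error
print(classify(lambda: 1 / 0))       # other runtime error
print(classify(lambda: 1 / 2))       # no error
```
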

I am not convinced that this definition of mine is a lot clearer than 
what I have said before, but I am also not convinced that Pierce's 
definition is any clearer either. It is merely a characterization of 
what static type systems do.

The preciseness comes into play when actually defining a type system: 
then you have to give definitions what the errors at runtime are that 
you want to prove absent by way of the static type system, give the 
typing rules for the type system, and then prove soundness by showing 
that the typing rules correlate precisely to the runtime errors in the 
first stage. Since you have to map two different views of the same thing 
to each other you have to be precise in giving definitions that you can 
then successfully use in your proofs.

In a dynamic type system, this level of precision is not necessary, 
because proving that the dynamic type system catches exactly the dynamic 
type errors is trivially true, and most programmers don't care that there 
is a distinction between runtime type errors and "other" runtime errors. 
But this doesn't mean that the distinction doesn't exist.


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: George Neuner
Subject: Re: What is a type error?
Date: 
Message-ID: <8vl3b2tu42if639eaak4i79cbqrqig7tch@4ax.com>
On Mon, 26 Jun 2006 11:49:51 -0600, Chris Smith <·······@twu.net>
wrote:

>Pascal Costanza <··@p-cos.net> wrote:
>
>> This is maybe better understandable in user-level code. Consider the 
>> following class definition:
>> 
>> class Person {
>>    String name;
>>    int age;
>> 
>>    void buyPorn() {
>>    if (this.age < 18) throw new AgeRestrictionException();
>>     ...
>>    }
>> }
>> 
>> The message p.buyPorn() is a perfectly valid message send and will pass 
>> both static and dynamic type tests (given p is of type Person in the 
>> static case).
>
>It appears you've written the code above to assume that the type system 
>can't certify that age >= 18... otherwise, the if statement would not 
>make sense.  It also looks like Java, in which the type system is indeed 
>not powerful enough to do that check statically.  However, it sounds as 
>if you're claiming that it wouldn't be possible for the type system to 
>do this?  If so, that's not correct.  If such a thing were checked at 
>compile-time by a static type check, then failing to actually provide 
>that guarantee would be a type error, and the compiler would tell you 
>so.

Now this is getting ridiculous.  Regardless of implementation
language, Pascal's example is of a data dependent, runtime constraint.
Is the compiler to forbid a person object from aging?  If not, then
how does this person object suddenly become a different type when it
becomes 18?  Is this conversion to be automatic or manual?  

The programmer could forget to make a manual conversion, in which case
the program's behavior is wrong.  If your marvelous static type
checker does the conversion automatically, then obviously types are
not static and can be changed at runtime.  

Either way you've failed to prevent a runtime problem using a purely
static analysis.


George
--
for email reply remove "/" from address
From: Chris Smith
Subject: Re: What is a type error?
Date: 
Message-ID: <MPG.1f1b973841b328b998976b@news.altopia.net>
George Neuner <·········@comcast.net> wrote:
> Now this is getting ridiculous.

It's not at all ridiculous.  The fact that it seems ridiculous is a good 
reason to educate people about what static typing does (and doesn't) 
mean, and specifically that it doesn't imply any constraints on the kind 
of behavioral property that's being checked; but only on the way that 
the check occurs.

(As a subtle point here, we're assuming that this really is an error; 
i.e., it shouldn't happen in a correct program.  If this is validation 
against a user mistake, then that's a different matter; obviously, that 
would typically be done with a conditional statement like if, and 
wouldn't count as a type error in any context.)

> Regardless of implementation
> language, Pascal's example is of a data dependent, runtime constraint.

99% of program errors (excluding only syntax errors and the like) are 
data-dependent runtime errors.  Types are a way of classifying that data 
so that such errors become obvious at compile time.

> Is the compiler to forbid a person object from aging?  If not, then
> how does this person object suddenly become a different type when it
> becomes 18?  Is this conversion to be automatic or manual?  

The object doesn't have a type (this is static typing, remember?), but 
expressions do.  The expressions have types, and those types are 
probably inferred in such a circumstance (at least, this one would be 
even more of a pain without type inference).

> If your marvelous static type
> checker does the conversion automatically, then obviously types are
> not static and can be changed at runtime.  

There really is no conversion, but it does infer the correct types 
automatically.  It does so by having more complex rules regarding how to 
determine the type of an expression.  Java's current type system, of 
course, is considerably less powerful.  In particular, Java mostly 
assures that a variable name has a certain type when used as an 
expression, regardless of the context of that expression.  Only with 
some of the Java 5 features (capture conversion, generic method type 
inference, etc) did this cease to be the case.  It would be necessary to 
drop this entirely to make the feature above work.  For example, 
consider the following in a hypothetical Java language augmented with 
integer type ranges:

    int{17..26} a;
    int{14..22} b;

    ...

    if (a < b)
    {
        // Inside this block, a has type int{17..21} and b has type
        // int{18..22}

        signContract(a); // error, because a might be 17
        signContract(b); // no error, because even though the declared
                         // type of b is int{14..22}, it has a local
                         // type of int{18..22}.
    }

(By the way, I hate this example... hence I've changed buyPorn to 
signContract, which at least in the U.S. also requires a minimum age of 
18.  Hopefully, you'll agree this doesn't change the substantive 
content. :)
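
The narrowing in the block above can be sketched as a tiny interval analysis. This is an illustration in Python rather than the hypothetical Java, and the function names are mine, not the thread's:

```python
# A sketch of the flow-sensitive range narrowing described above,
# phrased as a tiny static analysis over integer intervals.
# (Illustration only; names are not from the thread.)

def narrow_lt(a, b):
    """Return the intervals that hold inside the branch where a < b.
    a < b implies a <= max(b) - 1 and b >= min(a) + 1."""
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    return (a_lo, min(a_hi, b_hi - 1)), (max(b_lo, a_lo + 1), b_hi)

def may_sign_contract(interval):
    """Static check: the interval must prove the value is >= 18."""
    lo, _hi = interval
    return lo >= 18

a = (17, 26)                          # int{17..26}
b = (14, 22)                          # int{14..22}
a2, b2 = narrow_lt(a, b)
print(a2, b2)                         # (17, 21) (18, 22)
print(may_sign_contract(a2))          # False: a might still be 17
print(may_sign_contract(b2))          # True
```
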

Note that when a programmer is validating input from a user, they will 
probably write precisely the kind of conditional expression that allows 
the compiler to deduce a sufficiently restrictive type and thus allows 
the type checker to succeed.  Or if they don't do so -- say, for 
example, the programmer produces a fencepost error by accidentally 
entering "if (age >= 17)" instead of "if (age > 17)" -- then the 
compiler will produce a type error as a result.  If the programmer 
stores the value into a variable of general type int (assuming such a 
thing still exists in our augmented Java-esque language), then the 
additional type information is forgotten, just like type information is 
forgotten when a reference of type String is assigned to a reference of 
type Object.  In both cases, a runtime-checked cast will be required to 
restore the lost type information.

Incidentally, I'm not saying that such a feature would be a good idea.  
It generally isn't provided in languages specifically because it gets to 
be a big pain to maintain all of the type specifications for this kind 
of stuff.  However, it is possible, and it is a static type system, 
because expressions are assigned types syntactically, rather than values 
being checked for correctness at runtime.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1152512416.981345.232330@h48g2000cwc.googlegroups.com>
Chris Smith wrote:
>
> [...] static typing does ... doesn't imply any constraints on the kind
> of behavioral property that's being checked; but only on the way that
> the check occurs.

Nice post.


Marshall
From: Chris Smith
Subject: Re: What is a type error?
Date: 
Message-ID: <MPG.1f1bab64c1cb09d498976f@news.altopia.net>
I <·······@twu.net> wrote:
> Incidentally, I'm not saying that such a feature would be a good idea.  
> It generally isn't provided in languages specifically because it gets to 
> be a big pain to maintain all of the type specifications for this kind 
> of stuff.

There are other good reasons, too, as it turns out.  I don't want to 
overstate the "possible" until it starts to sound like "easy, even if 
it's a pain".  This kind of stuff is rarely done in mainstream 
programming languages because it has serious negative consequences.

For example, I wrote that example using variables of type int.  If we 
were to suppose that we were actually working with variables of type 
Person, then things get a little more complicated.  We would need a few 
(infinite classes of) derived subtypes of Person that further constrain 
the possible values for state.  For example, we'd need types like:

    Person{age:{18..29}}

But this starts to look bad, because we used to have this nice property 
called encapsulation.  To work around that, we'd need to make one of a 
few choices: (a) give up encapsulation, which isn't too happy; (b) rely 
on type inference for this kind of stuff, and consider it okay if the 
type inference system breaks encapsulation; or (c) invent some kind of 
generic constraint language so that constraints like this could be 
expressed without exposing field names.  Choice (b) is incomplete, as 
there will often be times when I need to ascribe a type to a parameter 
or some such thing, and the lack of ability to express the complete type 
system will be rather limiting.  Choice (c), though, looks a little 
daunting.
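
As a rough illustration of what choice (c) might look like, here is a sketch (Python, with entirely hypothetical names) in which the constraint is phrased against a predicate the class exports, rather than against its fields, so the internal representation stays hidden:

```python
# A sketch of choice (c): the constraint is stated against an
# abstract predicate exported by the class, never against its
# fields, so the representation can change freely.  All names
# here are hypothetical.

class Person:
    def __init__(self, birth_year, current_year):
        # Internal representation: note there is no 'age' field
        # for a constraint language to leak.
        self._birth_year = birth_year
        self._current_year = current_year

    def adult(self):
        """The exported predicate; a constraint like
        Person{adult} would refer to this, not to fields."""
        return self._current_year - self._birth_year >= 18

def sign_contract(p):
    # Runtime stand-in for the static constraint "p : Person{adult}".
    assert p.adult(), "constraint violated: adult() must hold"
    return "signed"

print(sign_contract(Person(1980, 2006)))   # signed
```
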

So I'll stop there.  The point is that while it is emphatically true 
that this kind of stuff is possible, it is also very hard in Java.  
Partly, that's because Java is an imperative language, but it's also 
because there are fundamental design trade-offs involved between 
verbosity, complexity, expressive power, locality of knowledge, etc. 
that are bound to be there in all programming languages, and which make 
it harder to take one design principle to its extreme and produce a 
usable language as a result.  I don't know that it's impossible for this 
sort of thing to be done in a usable Java-like language, but in any 
case, the way to accomplish it is not obvious.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1152516083.019202.215160@p79g2000cwp.googlegroups.com>
Chris Smith wrote:
>
> But this starts to look bad, because we used to have this nice property
> called encapsulation.  To work around that, we'd need to make one of a
> few choices: (a) give up encapsulation, which isn't too happy; (b) rely
> on type inference for this kind of stuff, and consider it okay if the
> type inference system breaks encapsulation; or (c) invent some kind of
> generic constraint language so that constraints like this could be
> expressed without exposing field names.  Choice (b) is incomplete, as
> there will often be times when I need to ascribe a type to a parameter
> or some such thing, and the lack of ability to express the complete type
> system will be rather limiting.  Choice (c), though, looks a little
> daunting.

Damn the torpedoes, give me choice c!

I've been saying for a few years now that encapsulation is only
a hack to get around the lack of a decent declarative constraint
language.


Marshall
From: Chris Smith
Subject: Re: What is a type error?
Date: 
Message-ID: <MPG.1f1bb6b890c74cd2989770@news.altopia.net>
Marshall <···············@gmail.com> wrote:
> Chris Smith wrote:
> >
> > But this starts to look bad, because we used to have this nice property
> > called encapsulation.  To work around that, we'd need to make one of a
> > few choices: (a) give up encapsulation, which isn't too happy; (b) rely
> > on type inference for this kind of stuff, and consider it okay if the
> > type inference system breaks encapsulation; or (c) invent some kind of
> > generic constraint language so that constraints like this could be
> > expressed without exposing field names.  Choice (b) is incomplete, as
> > there will often be times when I need to ascribe a type to a parameter
> > or some such thing, and the lack of ability to express the complete type
> > system will be rather limiting.  Choice (c), though, looks a little
> > daunting.
> 
> Damn the torpedoes, give me choice c!

You and I both need to figure out when to go to sleep.  :)  Work's gonna 
suck tomorrow.

> I've been saying for a few years now that encapsulation is only
> a hack to get around the lack of a decent declarative constraint
> language.

Choice (c) was meant to preserve encapsulation, actually.  I think 
there's something fundamentally important about information hiding that 
can't be given up.  Hypothetically, say I'm writing an accounting 
package and I've decided to encapsulate details of the tax code into one 
module of the application.  Now, it may be that the compiler can perform 
sufficient type inference on my program to know that it's impossible for 
my taxes to be greater than 75% of my annual income.  So if my income is 
stored in a variable of type decimal{0..100000}, then the return type of 
getTotalTax may be of type decimal{0..75000}.  Type inference could do 
that.

But the point here is that I don't WANT the compiler to be able to infer 
that, because it's a transient consequence of this year's tax code.  I 
want the compiler to make sure my code works no matter what the tax code 
is.  The last thing I need is to go fixing a bunch of bugs during the 
time between the release of next year's tax code and the release 
deadline for my tax software.  At the same time, though, maybe I do want 
the compiler to infer that tax cannot be negative (or maybe it can; I'm 
not an accountant; I know my tax has never been negative), and that it 
can't be a complex number (I'm pretty sure about that one).  I call that 
encapsulation, and I don't think that it's necessary for lack of 
anything; but rather because that's how the problem breaks down.
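
For what it's worth, the inference being described here is ordinary interval arithmetic. A minimal sketch (Python; the 75% rate and the helper name are assumptions for illustration):

```python
# Ordinary interval arithmetic, as a stand-in for the type
# inference described above.  The 75% rate and the helper name
# are assumptions for illustration.

def scale_interval(interval, factor):
    """[lo, hi] scaled by a nonnegative constant factor."""
    lo, hi = interval
    return (lo * factor, hi * factor)

income = (0, 100000)                   # decimal{0..100000}
tax = scale_interval(income, 0.75)     # this year's top rate
print(tax)                             # (0.0, 75000.0)
```

The point of the passage, then, is that even though a compiler could derive decimal{0..75000} this way, the module should only be held to its declared interface type, not to whatever this year's rate happens to imply.
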

----

Note that even without encapsulation, the kind of typing information 
we're looking at can be very non-trivial in an imperative language.  For 
example, I may need to express a method signature that is kind of like 
this:

1. The first parameter is an int, which is either between 4 and 8, or 
between 11 and 17.

2. The second parameter is a pointer to an object, whose 'foo' field is 
an int between 0 and 5, and whose 'bar' field is a pointer to another 
abject with three fields 'a', 'b', and 'c', each of which has the full 
range of an unconstrained IEEE double precision floating point number.

3. After the method returns, it will be known that if this object 
previously had its 'baz' field in the range m .. n, it is now in the 
range (m - 5) .. (n + 1).

4. After the method returns, it will be known that the object reached by 
following the 'bar' field of the second parameter will be modified so 
that the first two of its floating point numbers are guaranteed to be of 
the opposite sign as they were before, and that if they were infinity, 
they are now finite.

5. After the method returns, the object referred to by the global 
variable 'zab' has 0 as the value of its 'c' field.

Just expressing all of that in a method signature looks interesting 
enough.  If we start adding abstraction to the type constraints on 
objects to support encapsulation (as I think you'd have to do), then 
things get even more interesting.
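
To give a rough sense of what constraints 1 and 3 from the list above amount to, here is a sketch that checks them at runtime in Python (the body and field names are hypothetical; the point of the passage is that a sufficiently rich static type system would discharge these checks without running the program):

```python
# Constraints 1 and 3 from the list above, checked at runtime.
# The body and the 'baz' update are hypothetical; a rich static
# type system would discharge these checks without running anything.

def constrained_method(x, obj):
    # Constraint 1: x is an int, either in 4..8 or in 11..17.
    assert isinstance(x, int) and (4 <= x <= 8 or 11 <= x <= 17)
    old_baz = obj["baz"]

    obj["baz"] = old_baz - 3           # hypothetical body

    # Constraint 3: baz is now in (old - 5) .. (old + 1).
    assert old_baz - 5 <= obj["baz"] <= old_baz + 1
    return obj

print(constrained_method(5, {"baz": 10})["baz"])   # 7
```
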

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1152543751.186771.113230@h48g2000cwc.googlegroups.com>
Chris Smith wrote:
> Marshall <···············@gmail.com> wrote:
> > Chris Smith wrote:
> > >
> > > But this starts to look bad, because we used to have this nice property
> > > called encapsulation.  To work around that, we'd need to make one of a
> > > few choices: (a) give up encapsulation, which isn't too happy; (b) rely
> > > on type inference for this kind of stuff, and consider it okay if the
> > > type inference system breaks encapsulation; or (c) invent some kind of
> > > generic constraint language so that constraints like this could be
> > > expressed without exposing field names.  Choice (b) is incomplete, as
> > > there will often be times when I need to ascribe a type to a parameter
> > > or some such thing, and the lack of ability to express the complete type
> > > system will be rather limiting.  Choice (c), though, looks a little
> > > daunting.
> >
> > Damn the torpedoes, give me choice c!
>
> You and I both need to figure out when to go to sleep.  :)  Work's gonna
> suck tomorrow.

It's never been a strong point. Made worse now that my daughter
is one of those up-at-the-crack-of-dawn types, and not old enough
to understand why it's not nice to jump on mommy and daddy's
bed while they're still asleep. But aren't you actually a time zone
or two east of me?


> > I've been saying for a few years now that encapsulation is only
> > a hack to get around the lack of a decent declarative constraint
> > language.
>
> Choice (c) was meant to preserve encapsulation, actually.  I think
> there's something fundamentally important about information hiding that
> can't be given up.  Hypothetically, say I'm writing an accounting
> package and I've decided to encapsulate details of the tax code into one
> module of the application.  Now, it may be that the compiler can perform
> sufficient type inference on my program to know that it's impossible for
> my taxes to be greater than 75% of my annual income.  So if my income is
> stored in a variable of type decimal{0..100000}, then the return type of
> getTotalTax may be of type decimal{0..75000}.  Type inference could do
> that.

The fields of an object/struct/what have you are often hidden behind
a method-based interface (sometimes called "encapsulated") only
because we can't control their values otherwise. (The "exposing
the interface" issue is a non-issue, because we're exposing some
interface or another no matter what.) The issue is controlling the
values, and that is better handled with a declarative constraint
language. The specific values in the fields aren't known until
runtime.

However for a function, the "fields" are the in and out parameters.
The specific values in the relation that the function is aren't known
until runtime either, (and then only the subset for which we actually
perform computation.)

Did that make sense?


> But the point here is that I don't WANT the compiler to be able to infer
> that, because it's a transient consequence of this year's tax code.  I
> want the compiler to make sure my code works no matter what the tax code
> is.  The last thing I need is to go fixing a bunch of bugs during the
> time between the release of next year's tax code and the release
> deadline for my tax software.  At the same time, though, maybe I do want
> the compiler to infer that tax cannot be negative (or maybe it can; I'm
> not an accountant; I know my tax has never been negative), and that it
> can't be a complex number (I'm pretty sure about that one).  I call that
> encapsulation, and I don't think that it's necessary for lack of
> anything; but rather because that's how the problem breaks down.

There's some significant questions in my mind about how much of
a constraint language would be static and how much would be
runtime checks. Over time, I'm starting to feel like it should be
mostly runtime, and only occasionally moving into compile time
at specific programmer request. The decidability issue comes up.

Anyone else?


> Just expressing all of that in a method signature looks interesting
> enough.  If we start adding abstraction to the type constraints on
> objects to support encapsulation (as I think you'd have to do), then
> things get even more interesting.

There are certainly syntactic issues, but I believe these are amenable
to the usual approaches. The runtime/compile time question, and
decidability seem bigger issues to me.


Marshall
From: Chris Smith
Subject: Re: What is a type error?
Date: 
Message-ID: <MPG.1f1c2fbdc841bd58989774@news.altopia.net>
Marshall <···············@gmail.com> wrote:
> It's never been a strong point. Made worse now that my daughter
> is one of those up-at-the-crack-of-dawn types, and not old enough
> to understand why it's not nice to jump on mommy and daddy's
> bed while they're still asleep. But aren't you actually a time zone
> or two east of me?

Yes, I confess I'm one time zone to your east, and I was posting later 
than you.  So perhaps it wasn't really past your bedtime.

> The fields of an object/struct/what have you are often hidden behind
> a method-based interface (sometimes called "encapsulated") only
> because we can't control their values otherwise.

I believe there are actually two kinds of encapsulation.  The kind of 
encapsulation that best fits your statement there is the getter/setter 
sort, which says: "logically, I want an object with some set of fields, 
but I can't make them fields because I lose control over their values".  
That part can definitely be replaced, in a suitably powerful language, 
with static constraints.

The other half of encapsulation, though, is of the sort that I mentioned 
in my post.  I am intentionally choosing to encapsulate something 
because I don't actually know how it should end up being implemented 
yet, or because it changes often, or something like that.  I may 
encapsulate the implementation of current tax code specifically because 
I know that tax code changes on a year-to-year basis, and I want to 
ensure that the rest of my program works no matter how the tax code is 
modified.  There may be huge structural changes in the tax code, and I 
only want to commit to leaving a minimal interface.

In practice, the two purposes are not cleanly separated from each other.  
Most people, if asked why they write getters and setters, would respond 
not only that they want to validate against assignments to the field, 
but also that it helps isolate changes should they change the internal 
representation of the class.  A publicly visible static constraint 
language that allows the programmer to change the internal 
representation of a class obviously can't make reference to any field, 
since it may cease to exist with a change in representation.

> However for a function, the "fields" are the in and out parameters.
> The specific values in the relation that the function is aren't known
> until runtime either, (and then only the subset for which we actually
> perform computation.)
> 
> Did that make sense?

I didn't understand that last bit.

> There are certainly syntactic issues, but I believe these are amenable
> to the usual approaches. The runtime/compile time question, and
> decidability seem bigger issues to me.

Well, the point of static typing is to do what's possible without 
reaching the point of undecidability.  Runtime support for checking the 
correctness of type ascriptions certainly comes in handy when you run 
into those limits.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Joachim Durchholz
Subject: Re: What is a type error?
Date: 
Message-ID: <e8t3hi$eh2$1@online.de>
Chris Smith schrieb:
> I think 
> there's something fundamentally important about information hiding that 
> can't be given up.

Indeed.
Without information hiding, with N entities, you have O(N^2) possible 
interactions between them. This quickly outgrows the human capacity for 
managing the interactions.
With information hiding, you can set up a layered approach, and the 
interactions are usually down to something between O(log N) and O(N log 
N). Now that's far more manageable.

Regards,
Jo
From: Pascal Bourguignon
Subject: Re: What is a type error?
Date: 
Message-ID: <87ejwt4zn7.fsf@thalassa.informatimago.com>
Chris Smith <·······@twu.net> writes:

> But the point here is that I don't WANT the compiler to be able to infer 
> that, because it's a transient consequence of this year's tax code.  I 
> want the compiler to make sure my code works no matter what the tax code 
> is.  The last thing I need is to go fixing a bunch of bugs during the 
> time between the release of next year's tax code and the release 
> deadline for my tax software.  At the same time, though, maybe I do want 
> the compiler to infer that tax cannot be negative (or maybe it can; I'm 
> not an accountant; I know my tax has never been negative), 

Yes, it can.  For example in Spain.  Theoretically, in France VAT can
also come out negative, and you have the right to ask for
reimbursement, but I've never seen a check from the French Tax
Administration...

> and that it 
> can't be a complex number (I'm pretty sure about that one).  

I wouldn't bet on it.

For example, French taxes consider "advantages in nature", so your
income has at least two dimensions: Euros and "advantages in
nature".  Thankfully, these advantages are converted into Euros, but
you could consider it a product by (complex 0 (- some-factor))...

> I call that 
> encapsulation, and I don't think that it's necessary for lack of 
> anything; but rather because that's how the problem breaks down.

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
You never feed me.
Perhaps I'll sleep on your face.
That will sure show you.
From: David Hopwood
Subject: Re: What is a type error?
Date: 
Message-ID: <%5Dsg.74232$7Z6.25253@fe2.news.blueyonder.co.uk>
Chris Smith wrote:
> At the same time, though, maybe I do want 
> the compiler to infer that tax cannot be negative (or maybe it can; I'm 
> not an accountant; I know my tax has never been negative), [...]

Tax can be negative, e.g. when a business is claiming back VAT (sales tax)
on its purchases, but the things it is selling have lower-rated or zero
VAT (or it is making a loss).

> Note that even without encapsulation, the kind of typing information 
> we're looking at can be very non-trivial in an imperative language.  For 
> example, I may need to express a method signature that is kind of like 
> this:
> 
> 1. The first parameter is an int, which is either between 4 and 8, or 
> between 11 and 17.
> 
> 2. The second parameter is a pointer to an object, whose 'foo' field is 
> an int between 0 and 5, and whose 'bar' field is a pointer to another 
> object with three fields 'a', 'b', and 'c', each of which has the full 
> range of an unconstrained IEEE double precision floating point number.
> 
> 3. After the method returns, it will be known that if this object 
> previously had its 'baz' field in the range m .. n, it is now in the 
> range (m - 5) .. (n + 1).
> 
> 4. After the method returns, it will be known that the object reached by 
> following the 'bar' field of the second parameter will be modified so 
> that the first two of its floating point numbers are guaranteed to be of 
> the opposite sign as they were before, and that if they were infinity, 
> they are now finite.
> 
> 5. After the method returns, the object referred to by the global 
> variable 'zab' has 0 as the value of its 'c' field.
> 
> Just expressing all of that in a method signature looks interesting 
> enough.  If we start adding abstraction to the type constraints on 
> objects to support encapsulation (as I think you'd have to do), then 
> things get even more interesting.

1 and 2 are easy enough. 3 to 5 are best expressed as assertions rather
than types.

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Chris Smith
Subject: Re: What is a type error?
Date: 
Message-ID: <MPG.1f1cb1655d051e6f98977f@news.altopia.net>
David Hopwood <····················@blueyonder.co.uk> wrote:
> 1 and 2 are easy enough. 3 to 5 are best expressed as assertions rather
> than types.

One of us is missing the other's meaning here.  If 3 to 5 were expressed 
as assertions rather than types, then the type system would become 
incomplete, requiring frequent runtime-checked type ascriptions to 
prevent it from becoming impossible to write software.  That's not my 
idea of a usable language.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: David Hopwood
Subject: Re: What is a type error?
Date: 
Message-ID: <AMDsg.307637$8W1.193192@fe1.news.blueyonder.co.uk>
Chris Smith wrote:
> David Hopwood <····················@blueyonder.co.uk> wrote:
> 
>>1 and 2 are easy enough. 3 to 5 are best expressed as assertions rather
>>than types.
> 
> One of us is missing the other's meaning here.  If 3 to 5 were expressed 
> as assertions rather than types, then the type system would become 
> incomplete, requiring frequent runtime-checked type ascriptions to 
> prevent it from becoming impossible to write software.  That's not my 
> idea of a usable language.

Maybe I'm not understanding what you mean by "complete". Of course any
type system of this expressive power will be incomplete (whether or not
it can express conditions 3 to 5), in the sense that it cannot prove all
true assertions about the types of expressions.

So yes, some assertions would need to be checked at runtime (others could be
proven to always hold by a theorem prover).

Why is this a problem? The goal is not to check everything at compile time;
the goal is just to support writing correct programs.

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Chris Smith
Subject: Re: What is a type error?
Date: 
Message-ID: <MPG.1f1cd16c26f92e6d989781@news.altopia.net>
David Hopwood <····················@blueyonder.co.uk> wrote:
> Maybe I'm not understanding what you mean by "complete". Of course any
> type system of this expressive power will be incomplete (whether or not
> it can express conditions 3 to 5), in the sense that it cannot prove all
> true assertions about the types of expressions.

Ah.  I meant complete enough to accomplish the goal in this subthread, 
which was to ensure that the compiler knows when a particular int 
variable is guaranteed to be greater than 18, and when it is not.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: David Hopwood
Subject: Re: What is a type error?
Date: 
Message-ID: <C1Psg.56632$181.30396@fe3.news.blueyonder.co.uk>
Chris Smith wrote:
> David Hopwood <····················@blueyonder.co.uk> wrote:
> 
>>Maybe I'm not understanding what you mean by "complete". Of course any
>>type system of this expressive power will be incomplete (whether or not
>>it can express conditions 3 to 5), in the sense that it cannot prove all
>>true assertions about the types of expressions.
> 
> Ah.  I meant complete enough to accomplish the goal in this subthread, 
> which was to ensure that the compiler knows when a particular int 
> variable is guaranteed to be greater than 18, and when it is not.

I don't think that placing too much emphasis on any individual example is
the right way of thinking about this. What matters is that, over the range
of typical programs written in the language, the value of the increased
confidence in program correctness outweighs the effort involved in both
adding annotations, and understanding whether any remaining run-time checks
are guaranteed to succeed.

Look at it this way: suppose that I *need* to verify that a program has
no range errors. Doing that entirely manually would be extremely tedious.
If the compiler can do, say, 90% of the work, and point out the places that
need to be verified manually, then that would be both less tedious, and
less error-prone.

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Chris Smith
Subject: Re: What is a type error?
Date: 
Message-ID: <MPG.1f1d751dd472b86d989783@news.altopia.net>
David Hopwood <····················@blueyonder.co.uk> wrote:
> I don't think that placing too much emphasis on any individual example is
> the right way of thinking about this. What matters is that, over the range
> of typical programs written in the language, the value of the increased
> confidence in program correctness outweighs the effort involved in both
> adding annotations, and understanding whether any remaining run-time checks
> are guaranteed to succeed.

Are you really that short on people to disagree with?

In this particular branch of this thread, we were discussing George's 
objection that it would be ridiculous for a type system to check that a 
variable should be greater than 18.  There is no such thing as placing 
too much emphasis on what we were actually discussing.  If you want to 
discuss something else, feel free to post about it.  Why does it bother 
you that I'm considering George's point?

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: George Neuner
Subject: Re: What is a type error?
Date: 
Message-ID: <tog7b2te7ck12cv35l4b3dm2visfrtevgp@4ax.com>
On Tue, 11 Jul 2006 14:59:46 GMT, David Hopwood
<····················@blueyonder.co.uk> wrote:

>What matters is that, over the range
>of typical programs written in the language, the value of the increased
>confidence in program correctness outweighs the effort involved in both
>adding annotations, and understanding whether any remaining run-time checks
>are guaranteed to succeed.

Agreed, but ...


>Look at it this way: suppose that I *need* to verify that a program has
>no range errors. Doing that entirely manually would be extremely tedious.
>If the compiler can do, say, 90% of the work, and point out the places that
>need to be verified manually, then that would be both less tedious, and
>less error-prone.

All of this presupposes that you have a high level of confidence in
the compiler.  I've been in software development for going on 20 years
now and worked 10 years on high performance, high availability
systems.  In all that time I have yet to meet a compiler ... or
significant program of any kind ... that is without bugs, noticeable
or not.

I'm a fan of static typing but the major problem I have with complex
inferencing (in general) is the black box aspect of it.  That is, when
the compiler rejects my code, is it really because a) I was stupid, b)
the types are too complex, or c) the compiler itself has a bug.  It's
certainly true that the vast majority of my problems are because I'm
stupid, but I've run into actual compiler bugs far too often for my
liking (high performance coding has a tendency to uncover them).

I think I understand how to implement HM inferencing ... I haven't
actually done it yet, but I've studied it and I'm working on a toy
language that will eventually use it.  But HM itself is a toy compared
to an inferencing system that could realistically handle some of the
problems that were discussed in this and Xah's "expressiveness" thread
(my own beef is with *static* checking of range narrowing assignments
which I still don't believe can be done regardless of Chris Smith's
assertions to the contrary).

It seems to me that the code complexity of such a super-duper
inferencing system would make its bug free implementation quite
difficult and I personally would be less inclined to trust a compiler
that used it than one having a less capable (but easier to implement)
system.

George
--
for email reply remove "/" from address
From: David Hopwood
Subject: Re: What is a type error?
Date: 
Message-ID: <KSSsg.307978$8W1.277237@fe1.news.blueyonder.co.uk>
George Neuner wrote:
> On Tue, 11 Jul 2006 14:59:46 GMT, David Hopwood
> <····················@blueyonder.co.uk> wrote:
> 
>>What matters is that, over the range
>>of typical programs written in the language, the value of the increased
>>confidence in program correctness outweighs the effort involved in both
>>adding annotations, and understanding whether any remaining run-time checks
>>are guaranteed to succeed.
> 
> Agreed, but ...
> 
>>Look at it this way: suppose that I *need* to verify that a program has
>>no range errors. Doing that entirely manually would be extremely tedious.
>>If the compiler can do, say, 90% of the work, and point out the places that
>>need to be verified manually, then that would be both less tedious, and
>>less error-prone.
> 
> All of this presupposes that you have a high level of confidence in
> the compiler.  I've been in software development for going in 20 years
> now and worked 10 years on high performance, high availability
> systems.  In all that time I have yet to meet a compiler ... or
> significant program of any kind ... that is without bugs, noticeable
> or not.

That's a good point. I don't think it would be an exaggeration to say
that some of the most commonly used compilers are riddled with bugs.
gcc is a particularly bad offender, and at the moment seems to be introducing
bugs with each new version at a faster rate than they can be fixed. Sun's
javac also used to be poor in this respect, although I haven't used recent
versions of it.

One of the main reasons for this, IMHO, is that many compilers place too
much emphasis on low-level optimizations of dubious merit. For C and
Java, I've taken to compiling all non-performance-critical code without
optimizations, and that has significantly reduced the number of compiler
bugs that I see. It has very little effect on overall execution performance
(and compile times are quicker).

I don't think that over-complex type systems are the cause of more than a
small part of the compiler bug problem. In my estimation, the frequency of
bugs in different compilers *for the same language* can vary by an order of
magnitude.

> I'm a fan of static typing but the major problem I have with complex
> inferencing (in general) is the black box aspect of it.  That is, when
> the compiler rejects my code, is it really because a) I was stupid, b)
> the types are too complex, or c) the compiler itself has a bug.

Rejecting code incorrectly is much less of a problem than accepting it
incorrectly. *If* all compiler bugs were false rejections, I would say that
this would not be too much of an issue.

Unfortunately there are plenty of bugs that result in silently generating
bad code. But I don't think you can avoid that by using a language with a
simpler type system. The quality of a compiler depends much more on the
competence and attitudes of the language implementation developers, than
it does on the language itself.

[Although, choosing a language with a more quality-conscious compiler
development team doesn't necessarily help if it is compiling to C, since
then you still have to deal with C compiler bugs, but on autogenerated
code :-(.]

> It's certainly true that the vast majority of my problems are because
> I'm stupid, but I've run into actual compiler bugs far too often for my
> liking (high performance coding has a tendency to uncover them).

Yes. Cryptographic algorithms, and autogenerated code also tend to do that.

> I think I understand how to implement HM inferencing ... I haven't
> actually done it yet, but I've studied it and I'm working on a toy
> language that will eventually use it.  But HM itself is a toy compared
> to an inferencing system that could realistically handle some of the
> problems that were discussed in this and Xah's "expressiveness" thread
> (my own beef is with *static* checking of range narrowing assignments
> which I still don't believe can be done regardless of Chris Smith's
> assertions to the contrary).
> 
> It seems to me that the code complexity of such a super-duper
> inferencing system would make its bug free implementation quite
> difficult and I personally would be less inclined to trust a compiler
> that used it than one having a less capable (but easier to implement)
> system.

I don't think that the complexity of such a system would need to be
any greater than that of the optimization frameworks used by many
compilers. IIUC, what Chris Smith is suggesting is essentially equivalent
to doing type inference on the SSA form of the program.
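
As a rough illustration (my own sketch, not necessarily what Chris Smith
has in mind): on SSA form every variable is assigned exactly once, so a
range fact can be attached to each name, refined by branch conditions,
and merged at phi nodes:

```python
# SSA form: each variable is assigned once, so each name carries a range.
#
#   source                 SSA                     inferred interval
#   x = input()            x1 = input()            x1 : (-inf, +inf)
#   if x > 18:             if x1 > 18:
#       y = x                  y1 = x1             y1 : [19, +inf)
#   else:
#       y = 19                 y2 = 19             y2 : [19, 19]
#   use(y)                 y3 = phi(y1, y2)        y3 : [19, +inf)

INF = float("inf")

def refine_gt(interval, bound):
    # Entering the branch taken when v > bound narrows v's lower end
    # (integer semantics: v > 18 implies v >= 19).
    lo, hi = interval
    return (max(lo, bound + 1), hi)

def phi(a, b):
    # Join point: the smallest interval covering both incoming ranges.
    return (min(a[0], b[0]), max(a[1], b[1]))

x1 = (-INF, INF)
y1 = refine_gt(x1, 18)
y2 = (19, 19)
y3 = phi(y1, y2)   # the checker now knows y > 18 on every path
```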

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Andreas Rossberg
Subject: Re: What is a type error?
Date: 
Message-ID: <e92evn$a11al$1@hades.rz.uni-saarland.de>
David Hopwood wrote:
> George Neuner wrote:
>>
>>All of this presupposes that you have a high level of confidence in
>>the compiler.  I've been in software development for going in 20 years
>>now and worked 10 years on high performance, high availability
>>systems.  In all that time I have yet to meet a compiler ... or
>>significant program of any kind ... that is without bugs, noticeable
>>or not.
> 
> [...]
> 
> One of the main reasons for this, IMHO, is that many compilers place too
> much emphasis on low-level optimizations of dubious merit. For C and
> Java, I've taken to compiling all non-performance-critical code without
> optimizations, and that has significantly reduced the number of compiler
> bugs that I see. It has very little effect on overall execution performance
> (and compile times are quicker).
> 
> I don't think that over-complex type systems are the cause of more than a
> small part of the compiler bug problem. In my estimation, the frequency of
> bugs in different compilers *for the same language* can vary by an order of
> magnitude.

Also, the use of typed intermediate languages within the compiler might 
actually help drastically cut down on the more severe problem of 
code transformation bugs, notwithstanding the relative complexity of 
suitable internal type systems.

   - Andreas
From: Joachim Durchholz
Subject: Re: What is a type error?
Date: 
Message-ID: <e8t0ai$9ad$1@online.de>
Chris Smith schrieb:
> For example, I wrote that example using variables of type int.  If we 
> were to suppose that we were actually working with variables of type 
> Person, then things get a little more complicated.  We would need a few 
> (infinite classes of) derived subtypes of Person that further constrain 
> the possible values for state.  For example, we'd need types like:
> 
>     Person{age:{18..29}}
> 
> But this starts to look bad, because we used to have this nice
> property called encapsulation.  To work around that, we'd need to
> make one of a few choices: [...] (c) invent some kind of generic
> constraint language so that constraints like this could be expressed
> without exposing field names. [...] Choice (c), though, looks a
> little daunting.

That's not too difficult.
Start with boolean expressions.
If you need to check everything statically, add enough constraints that 
they become decidable.
For the type language, you also need to add primitives for type 
checking, and if the language is stateful, you'll also want primitives 
for accessing earlier states (most notably at function entry).
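
To make choice (c) concrete, here's a minimal Python sketch of such a
constraint language: a type is a base check plus a boolean predicate,
and the predicate is phrased against an accessor rather than a field,
so the internal representation stays hidden. (All names here are
illustrative, not from any real library.)

```python
# Sketch of a "constraint type": a base type plus a boolean predicate.
# Richer constraints are built by composing predicates with and/or.
class Constrained:
    def __init__(self, base, predicate, name):
        self.base = base            # underlying type
        self.predicate = predicate  # boolean expression over values
        self.name = name

    def check(self, value):
        # Runtime fallback for whatever a static checker can't prove.
        if not isinstance(value, self.base) or not self.predicate(value):
            raise TypeError(f"{value!r} is not a {self.name}")
        return value

# Person{age:{18..29}}, but phrased against an accessor method rather
# than a field name, so a change of representation doesn't break it.
Adult18To29 = Constrained(
    object,
    lambda p: 18 <= p.age() <= 29,
    "Person{age:{18..29}}",
)
```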

> So I'll stop there.  The point is that while it is emphatically true 
> that this kind of stuff is possible, it is also very hard in Java.

No surprise: It's always very hard to retrofit an inference system to a 
language that wasn't designed for it.

This doesn't mean it can't be done. Adding genericity to Java was a 
pretty amazing feat.
(But I won't hold my breath for a constraint-style type system in Java 
anyway... *gg*)

Regards,
Jo
From: Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1152542831.623833.25880@m79g2000cwm.googlegroups.com>
Joachim Durchholz wrote:
> Chris Smith schrieb:
> > For example, I wrote that example using variables of type int.  If we
> > were to suppose that we were actually working with variables of type
> > Person, then things get a little more complicated.  We would need a few
> > (infinite classes of) derived subtypes of Person that further constrain
> > the possible values for state.  For example, we'd need types like:
> >
> >     Person{age:{18..29}}
> >
> > But this starts to look bad, because we used to have this nice
> > property called encapsulation.  To work around that, we'd need to
> > make one of a few choices: [...] (c) invent some kind of generic
> > constraint language so that constraints like this could be expressed
> > without exposing field names. [...] Choice (c), though, looks a
> > little daunting.
>
> That's not too difficult.
> Start with boolean expressions.
> If you need to check everything statically, add enough constraints that
> they become decidable.

I'm not sure I understand. Could you elaborate?


> For the type language, you also need to add primitives for type
> checking, and if the language is stateful, you'll also want primitives
> for accessing earlier states (most notably at function entry).

Again I'm not entirely clear what this means. Are you talking
about pre/post conditions, or are you talking about having the
constraint language itself be something other than functional?


Marshall
From: Joachim Durchholz
Subject: Re: What is a type error?
Date: 
Message-ID: <e8traj$p57$1@online.de>
Marshall schrieb:
> Joachim Durchholz wrote:
>> Chris Smith schrieb:
>>> For example, I wrote that example using variables of type int.  If we
>>> were to suppose that we were actually working with variables of type
>>> Person, then things get a little more complicated.  We would need a few
>>> (infinite classes of) derived subtypes of Person that further constrain
>>> the possible values for state.  For example, we'd need types like:
>>>
>>>     Person{age:{18..29}}
>>>
>>> But this starts to look bad, because we used to have this nice
>>> property called encapsulation.  To work around that, we'd need to
>>> make one of a few choices: [...] (c) invent some kind of generic
>>> constraint language so that constraints like this could be expressed
>>> without exposing field names. [...] Choice (c), though, looks a
>>> little daunting.
>> That's not too difficult.
>> Start with boolean expressions.
>> If you need to check everything statically, add enough constraints that
>> they become decidable.
> 
> I'm not sure I understand. Could you elaborate?

Preconditions/postconditions can express anything you want, and they are 
an absolutely natural extension of what's commonly called a type 
(actually the more powerful type systems have quite a broad overlap with 
assertions).
I'd essentially want to have an assertion language, with primitives for 
type expressions.

>> For the type language, you also need to add primitives for type
>> checking, and if the language is stateful, you'll also want primitives
>> for accessing earlier states (most notably at function entry).
> 
> Again I'm not entirely clear what this means. Are you talking
> about pre/post conditions,

Yes.

Regards,
Jo
From: Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1152597340.855103.153510@b28g2000cwb.googlegroups.com>
Joachim Durchholz wrote:
> Marshall schrieb:
> > Joachim Durchholz wrote:
> >> Chris Smith schrieb:
> >>> For example, I wrote that example using variables of type int.  If we
> >>> were to suppose that we were actually working with variables of type
> >>> Person, then things get a little more complicated.  We would need a few
> >>> (infinite classes of) derived subtypes of Person that further constrain
> >>> the possible values for state.  For example, we'd need types like:
> >>>
> >>>     Person{age:{18..29}}
> >>>
> >>> But this starts to look bad, because we used to have this nice
> >>> property called encapsulation.  To work around that, we'd need to
> >>> make one of a few choices: [...] (c) invent some kind of generic
> >>> constraint language so that constraints like this could be expressed
> >>> without exposing field names. [...] Choice (c), though, looks a
> >>> little daunting.
> >> That's not too difficult.
> >> Start with boolean expressions.
> >> If you need to check everything statically, add enough constraints that
> >> they become decidable.
> >
> > I'm not sure I understand. Could you elaborate?
>
> Preconditions/postconditions can express anything you want, and they are
> an absolutely natural extensions of what's commonly called a type
> (actually the more powerful type systems have quite a broad overlap with
> assertions).
> I'd essentially want to have an assertion language, with primitives for
> type expressions.

Now, I'm not fully up to speed on DBC. The contract specifications,
these are specified statically, but checked dynamically, is that
right? In other words, we can consider contracts in light of
inheritance, but the actual verification and checking happens
at runtime, yes?

Wouldn't it be possible to do them at compile time? (Although
this raises decidability issues.) Mightn't it also be possible to
leave it up to the programmer whether a given contract
was compile-time or runtime?

I've been wondering about this for a while.


Marshall
From: Darren New
Subject: Re: What is a type error?
Date: 
Message-ID: <d8Qsg.14127$MF6.7513@tornado.socal.rr.com>
Marshall wrote:
> Now, I'm not fully up to speed on DBC. The contract specifications,
> these are specified statically, but checked dynamically, is that
> right? 

Yes, but there's a bunch more to it than that. The handling of 
exceptions, and in particular exceptions caused by failed pre/post 
conditions, is an integral part of the process.

Plus, of course, pre/post condition checking is turned off while 
checking pre/post conditions. This does make sense if you think about it 
long enough, but it took me several months before I realized why it's 
necessary theoretically rather than just practically.
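
A minimal Python sketch of why the switks-off is needed: if a pre/post
condition itself calls a contracted routine, checking would recurse
forever, so a flag suppresses checks while a check is running. (The
decorator and all names are illustrative only, not any real DBC
library.)

```python
import functools

_checking = False   # true while a pre/postcondition is being evaluated

def contract(pre=None, post=None):
    def wrap(f):
        @functools.wraps(f)
        def guarded(*args):
            global _checking
            if _checking:
                # Checks are off while checking: a condition that calls
                # another contracted routine must not recurse into checks.
                return f(*args)
            if pre is not None:
                _checking = True
                try:
                    assert pre(*args), f"precondition of {f.__name__} failed"
                finally:
                    _checking = False
            result = f(*args)
            if post is not None:
                _checking = True
                try:
                    assert post(result, *args), \
                        f"postcondition of {f.__name__} failed"
                finally:
                    _checking = False
            return result
        return guarded
    return wrap
```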

> Wouldn't it be possible to do them at compile time? 

For some particularly simple ones, yes. For others, like "the chess 
pieces are in a position it is legal to get to from a standard opening 
set-up", it would be difficult. Anything to do with I/O is going to be 
almost impossible to write a checkable postcondition for, even at 
runtime. "After this call, the reader at the other end of the socket 
will receive the bytes in argument 2 without change."

As far as I understand it, Eiffel compilers don't even make use of 
postconditions to optimize code or eliminate run-time checks (like null 
pointer testing).

-- 
   Darren New / San Diego, CA, USA (PST)
     This octopus isn't tasty. Too many
     tentacles, not enough chops.
From: Joachim Durchholz
Subject: Re: What is a type error?
Date: 
Message-ID: <e92vqc$etv$1@online.de>
Darren New schrieb:
> As far as I understand it, Eiffel compilers don't even make use of 
> postconditions to optimize code or eliminate run-time checks (like null 
> pointer testing).

That's correct.

I think a large part of the reason why this isn't done is that Eiffel's 
semantics is (a) too complicated (it's an imperative language after 
all), and (b) not formalized, which makes it difficult to assess what 
optimizations are safe and what aren't.

(Reason (c) is that Eiffel compiler vendors don't have the manpower for 
this kind of work, mostly in quantity but also, to some degree, in 
quality: people with a solid language semantics background tend to be 
repelled by the language inventor's know-it-all deny-the-problems 
don't-bother-me-with-formalisms attitude. He still has moved the state 
of the art ahead - mostly by pointing out a certain class of problems in 
OO designs and explaining them lucidly, and proposing solutions that 
hold up better than average even if still fundamentally flawed.)

Regards,
Jo
From: Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1152721450.433219.269270@s13g2000cwa.googlegroups.com>
Joachim Durchholz wrote:
> Darren New schrieb:
> > As far as I understand it, Eiffel compilers don't even make use of
> > postconditions to optimize code or eliminate run-time checks (like null
> > pointer testing).
>
> That's correct.
>
> I think a large part of the reasons why this isn't done is that Eiffel's
> semantics is (a) too complicated (it's an imperative language after
> all), and (b) not formalized, which makes it difficult to assess what
> optimizations are safe and what aren't.

I can see the lack of a formal model being an issue, but is the
imperative bit really all that much of an obstacle? How hard
is it really to deal with assignment? Or does the issue have
more to do with pointers, aliasing, etc.?


Marshall
From: Joachim Durchholz
Subject: Re: What is a type error?
Date: 
Message-ID: <44B536EB.6020308@durchholz.org>
Marshall schrieb:
> I can see the lack of a formal model being an issue, but is the
> imperative bit really all that much of an obstacle? How hard
> is it really to deal with assignment? Or does the issue have
> more to do with pointers, aliasing, etc.?

Actually aliasing is *the* hard issue.
Just one data point: C compiler designers spend substantial effort to 
determine which data structures cannot have aliasing to be able to apply 
optimizations.

This can bite even with runtime assertion checking.
For example, ordinarily one would think that if there's only a fixed set 
of routines that may modify some data structure A, checking the 
invariants of that structure need only be done at the exit of these 
routines.
Now assume that A has a reference to structure B, and that the 
invariants involve B in some way. (E.g. A.count = A->B.count.)
There's still no problem with that - until you have a routine that 
exports a reference to B, which gets accessed from elsewhere, say via C.
Now if I do
   C->B.count = 27
I will most likely break A's invariant. If that assignment is in some 
code that's far away from the code for A, debugging can become 
exceedingly hard: the inconsistency (i.e. A violating its invariant) can 
live for the entire runtime of the program.

So that innocent optimization "don't check all the program's invariants 
after every assignment" immediately breaks with assignment (unless you 
do some aliasing analysis).
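
The A/B/C scenario above, as a runnable Python sketch (class names as in
the example; the assertion plays the role of the invariant check at
routine exit):

```python
class B:
    def __init__(self):
        self.count = 0

class A:
    # Invariant: self.count == self.b.count
    def __init__(self):
        self.count = 0
        self.b = B()

    def add_item(self):
        self.count += 1
        self.b.count += 1
        self._invariant()          # checked only at exits of A's routines

    def export_b(self):
        return self.b              # hands out an alias to B

    def _invariant(self):
        assert self.count == self.b.count, "A's invariant violated"

a = A()
a.add_item()          # fine: invariant checked and holds
c = a.export_b()      # alias, possibly used far away from A's code
c.count = 27          # silently breaks A's invariant: no check runs here
# The inconsistency only surfaces on the *next* call into A.
```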

In an OO language, it's even more difficult to establish that some data 
structure cannot be aliased: even if the code for A doesn't hand out a 
reference to B, there's no guarantee that some subclass of A doesn't. 
I.e. the aliasing analysis has to be repeated whenever there's a new 
subclass, which may happen at link time, when you ordinarily don't want 
to do any serious semantic analysis anymore.

There's another way around getting destroyed invariants reported at the 
time the breakage occurs: lock any data field that goes into an 
invariant (or a postcondition, too). The rationale behind this is that 
from the perspective of assertions, an alias walks, smells and quacks 
just like concurrent access, so locking would be the appropriate answer. 
The problem here is that this means that updates can induce *very* 
expensive machinery - imagine an invariant that says "A->balance is the 
sum of all the transaction->amount fields in the transaction list of A", 
it would cause all these ->amount fields to be locked as soon as a 
transaction list is hooked up with the amount. OTOH one could argue that 
such a locking simply makes cost in terms of debugging time visible as 
runtime overhead, which isn't entirely a Bad Thing.


There are all other kinds of things that break in the presence of 
aliasing; this is just the one that I happen to know best :-)

Without mutation, such breakage cannot occur. Invariants are just the 
common postconditions of all the routines that may construct a structure 
of a given type.

Regards,
Jo
From: Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1152728150.911515.263640@s13g2000cwa.googlegroups.com>
Joachim Durchholz wrote:
> Marshall schrieb:
> > I can see the lack of a formal model being an issue, but is the
> > imperative bit really all that much of an obstacle? How hard
> > is it really to deal with assignment? Or does the issue have
> > more to do with pointers, aliasing, etc.?
>
> Actually aliasing is *the* hard issue.

Okay, sure. Nice explanation.

But one minor point: you describe this as an issue with "imperative"
languages. But aliasing is a problem associated with pointers,
not with assignment. One can have assignment, or other forms
of destructive update, without pointers; they are not part of the
definition of "imperative." (Likewise, one can have pointers without
assignment, although I'm less clear if the aliasing issue is as
severe.)


Marshall
From: Joachim Durchholz
Subject: Re: What is a type error?
Date: 
Message-ID: <e93jb6$ll7$1@online.de>
Marshall schrieb:
> Joachim Durchholz wrote:
>> Marshall schrieb:
>>> I can see the lack of a formal model being an issue, but is the
>>> imperative bit really all that much of an obstacle? How hard
>>> is it really to deal with assignment? Or does the issue have
>>> more to do with pointers, aliasing, etc.?
>> Actually aliasing is *the* hard issue.
> 
> Okay, sure. Nice explanation.
> 
> But one minor point: you describe this as an issue with "imperative"
> languages. But aliasing is a problem associated with pointers,
> not with assignment.

Aliasing is not a problem if the aliased data is immutable.

 > One can have assignment, or other forms
> of destructive update, without pointers; they are not part of the
> definition of "imperative."

Sure.
You can have either of destructive updates and pointers without 
incurring aliasing problems. As soon as they are combined, there's trouble.

Functional programming languages often drop assignment entirely. (This 
is less inefficient than one would think. If everything is immutable, 
you can freely share data structures and avoid some copying, and you can 
share across abstraction barriers. In programs with mutable values, 
programmers are forced to choose the lesser evil of either copying 
entire data structures or doing a cross-abstraction analysis of who 
updates what elements of what data structure. A concrete example: the 
first thing that Windows does when accepting userland data structures 
is... to copy them; this would be unnecessary if the structures were 
immutable.)
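A small Java illustration of the sharing point (assuming a recent Java with records): because the list is immutable, two consumers on different sides of an abstraction barrier can hold the very same structure with no defensive copy.

```java
import java.util.List;

public class ShareImmutable {
    // Two "abstraction barriers" that both hold on to the caller's list.
    record Report(List<String> rows) {}
    record Audit(List<String> rows) {}

    public static boolean demo() {
        List<String> data = List.of("a", "b", "c");  // immutable
        Report r = new Report(data);  // shares data, no copy
        Audit  x = new Audit(data);   // shares data, no copy
        // No code anywhere can make r and x disagree about the contents,
        // so sharing the identical structure is safe.
        return r.rows() == x.rows() && r.rows().get(0).equals("a");
    }
}
```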

Some functional languages restrict assignment so that there can exist at 
most a single reference to any mutable data structure. That way, there's 
still no aliasing problems, but you can still update in place where it's 
really, really necessary.

I know of no professional language that doesn't have references of some 
kind.

Regards,
Jo
From: Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1152757188.426777.216560@h48g2000cwc.googlegroups.com>
Joachim Durchholz wrote:
> Marshall schrieb:
> > Joachim Durchholz wrote:
> >> Marshall schrieb:
> >>> I can see the lack of a formal model being an issue, but is the
> >>> imperative bit really all that much of an obstacle? How hard
> >>> is it really to deal with assignment? Or does the issue have
> >>> more to do with pointers, aliasing, etc.?
> >> Actually aliasing is *the* hard issue.
> >
> > Okay, sure. Nice explanation.
> >
> > But one minor point: you describe this as an issue with "imperative"
> > languages. But aliasing is a problem associated with pointers,
> > not with assignment.
>
> Aliasing is not a problem if the aliased data is immutable.

Okay, sure. But for the problem you describe, imperativeness and
the presence of pointers are each necessary but not sufficient;
it is the two together that causes the problem. So it strikes
me (again, a very minor point) as inaccurate to describe this as
a problem with imperative languages per se.


>  > One can have assignment, or other forms
> > of destructive update, without pointers; they are not part of the
> > definition of "imperative."
>
> Sure.
> You can have either of destructive updates and pointers without
> incurring aliasing problems. As soon as they are combined, there's trouble.

Right. To me the response to this is clear: give up pointers. Imperative
operations are too useful to give up; indeed they are a requirement
for certain problems. Pointers on the other hand add nothing except
efficiency and a lot of confusion. They should be considered an
implementation technique only, hidden behind some pointerless
computational model.

I recognize that this is likely to be a controversial opinion.



> Functional programming languages often drop assignment entirely. (This
> is less inefficient than one would think. If everything is immutable,
> you can freely share data structures and avoid some copying, and you can
> share across abstraction barriers. In programs with mutable values,
> programmers are forced to choose the lesser evil of either copying
> entire data structures or doing a cross-abstraction analysis of who
> updates what elements of what data structure. A concrete example: the
> first thing that Windows does when accepting userland data structures
> is... to copy them; this were unnecessary if the structures were immutable.)

I heartily support immutability as the default, for these and other
reasons.


> Some functional languages restrict assignment so that there can exist at
> most a single reference to any mutable data structure. That way, there's
> still no aliasing problems, but you can still update in place where it's
> really, really necessary.

Are we speaking of uniqueness types now? I haven't read much about
them, but it certainly seems like an intriguing concept.


> I know of no professional language that doesn't have references of some
> kind.

Ah, well. I suppose I could mention prolog or mercury, but then
you used that troublesome word "professional." So I will instead
mention a language which, if one goes by number of search results
on hotjobs.com for "xxx programmer" for different values of xxx, is
more popular than Java and twice as popular as C++. It lacks
pointers (although I understand they are beginning to creep in
in the latest version of the standard.) It also possesses a quite
sophisticated set of facilities for declarative integrity constraints.
Yet for some reason it is typically ignored by language designers.

http://hotjobs.yahoo.com/jobseeker/jobsearch/search_results.html?keywords_all=sql+programmer


Marshall
From: Andreas Rossberg
Subject: Re: What is a type error?
Date: 
Message-ID: <e950oa$a11al$2@hades.rz.uni-saarland.de>
Marshall wrote:
> 
> Okay, sure. But for the problem you describe, imperativeness and
> the presence of pointers are each necessary but not sufficient;
> it is the two together that causes the problem. So it strikes
> me (again, a very minor point) as inaccurate to describe this as
> a problem with imperative languages per se.
> 
> [...]
> 
> Right. To me the response to this is clear: give up pointers. Imperative
> operations are too useful to give up; indeed they are a requirement
> for certain problems. Pointers on the other hand add nothing except
> efficiency and a lot of confusion. They should be considered an
> implementation technique only, hidden behind some pointerless
> computational model.

Don't get yourself distracted by the low-level notion of "pointer". The 
problem *really* is mutability and the associated notion of identity, 
which explicit pointers just exhibit on a very low level.

When you have a language with mutable types (e.g. mutable arrays) then 
objects of these types have identity, which is observable through 
assignment. This is regardless of whether identity is an explicit 
concept (like it becomes with pointers and comparison of pointer values, 
i.e. addresses).

Consequently, you cannot possibly get rid of aliasing issues without 
getting rid of (unrestricted) mutability. Mutability implies object 
identity implies aliasing problems.

On the other hand, pointers are totally a futile concept without 
mutability: if everything is immutable, it is useless to distinguish 
between an object and a pointer to it.

In other words, pointers are essentially just an *aspect* of mutability 
in lower-level languages. On a sufficiently high level of abstraction, 
it does not make much sense to differentiate between both concepts - 
pointers are just mutable objects holding other mutable objects 
(immutable pointer types exist, but are only interesting if you also 
have pointer arithmetic - which, however, is largely equivalent to 
arrays, i.e. not particularly relevant either).

- Andreas
From: Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1152803102.013470.164540@m73g2000cwd.googlegroups.com>
Andreas Rossberg wrote:
> Marshall wrote:
> >
> > Okay, sure. But for the problem you describe, imperativeness and
> > the presence of pointers are each necessary but not sufficient;
> > it is the two together that causes the problem. So it strikes
> > me (again, a very minor point) as inaccurate to describe this as
> > a problem with imperative languages per se.
> >
> > [...]
> >
> > Right. To me the response to this is clear: give up pointers. Imperative
> > operations are too useful to give up; indeed they are a requirement
> > for certain problems. Pointers on the other hand add nothing except
> > efficiency and a lot of confusion. They should be considered an
> > implementation technique only, hidden behind some pointerless
> > computational model.
>
> Don't get yourself distracted by the low-level notion of "pointer". The
> problem *really* is mutability and the associated notion of identity,
> which explicit pointers just exhibit on a very low level.
>
> When you have a language with mutable types (e.g. mutable arrays) then
> objects of these types have identity, which is observable through
> assignment. This is regardless of whether identity is an explicit
> concept (like it becomes with pointers and comparison of pointer values,
> i.e. addresses).

Hmmm, well, I cannot agree. You've defined away the pointers
but then slipped them back in again by assumption ("objects
of these types have identity".)

First let me say that the terminology is somewhat problematic.
For the specific issue being discussed here, pointers, identity,
and objects are all the same concept. (I agree that "pointer"
connotes a low-level construct, however.) Sometimes I think
of this issue as being one with first class variables. An object
with mutable fields is a variable, and if we have pointers or
references or any way to have two different pathways to
that object/those variables, then we run in to the aliasing problem.

However if the mutable types are not first class, then there
is no way to have the aliasing. Thus, if we do not have pointers
or objects or identity but retain mutability, there is no aliasing
problem.
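The "two different pathways" case can be made concrete with a minimal Java sketch, using an array cell as the mutable object:

```java
public class TwoPathways {
    public static int demo() {
        int[] cell = {0};
        int[] alias = cell;  // a second pathway to the same variable
        alias[0] = 7;        // update via one name...
        return cell[0];      // ...observed via the other: returns 7
    }
}
```

Contrast this with the two `int` locals discussed below: those admit only one pathway each, so no such interference is possible.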


> Consequently, you cannot possibly get rid of aliasing issues without
> getting rid of (unrestricted) mutability. Mutability implies object
> identity implies aliasing problems.

Mutability by itself does not imply identity. I agree that mutability
plus
identity implies aliasing problems, however.


> On the other hand, pointers are totally a futile concept without
> mutability: if everything is immutable, it is useless to distinguish
> between an object and a pointer to it.

Agreed.


> In other words, pointers are essentially just an *aspect* of mutability
> in lower-level languages.

Again, I disagree: it is possible to have mutability without
pointers/identity/objects.


Marshall
From: Joe Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1152814522.080290.267010@m73g2000cwd.googlegroups.com>
Marshall wrote:
>
> Again, I disagree: it is possible to have mutability without
> pointers/identity/objects.

I think you are wrong, but before I make a complete ass out of myself,
I have to ask what you mean by `mutability'.  (And
pointers/identity/objects, for that matter.)

Alan Bawden discusses the phenomenon of `state' in his Ph.D.
dissertation "Implementing Distributed Systems Using Linear Naming",
MIT AI Lab Technical Report AITR-1627, March 1993. He makes a
persuasive argument that `state' is associated with cycles in naming.
From: Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1152820770.519295.323890@i42g2000cwa.googlegroups.com>
Joe Marshall wrote:
> Marshall wrote:
> >
> > Again, I disagree: it is possible to have mutability without
> > pointers/identity/objects.
>
> I think you are wrong, but before I make a complete ass out of myself,
> I have to ask what you mean by `mutability'.  (And
> pointers/identity/objects, for that matter.)

Responding to requests for examples from Joachim, Joe, and Chris....

The very simple example is the one Darren New already mentioned.

Consider the following Java fragment:

void foo() {
  int i = 0;
  int j = 0;

  // put any code here you want

  j = 1;
  i = 2;
  // Check the value of j here. It is still 1, no matter what you
  // filled in above. The assignment to i cannot be made to affect
  // the value of j.
}


Those two local primitive variables cannot be made to have the same
identity. But you can update them, so this is an example of mutability
without the possibility of identity.

Earlier I also mentioned SQL tables as an example, although SQL
supports *explicit* aliasing via views.


> Alan Bawden discusses the phenomenon of `state' in his Ph.D.
> dissertation "Implementing Distributed Systems Using Linear Naming",
> MIT AI Lab Technical Report AITR-1627, March 1993. He makes a
> persuasive argument that `state' is associated with cycles in naming.

I would like to read that, but my brain generally runs out of gas at
about 21 pages, so it's about an order of magnitude bigger than I can
currently handle. :-( As to "cycles in naming", that's certainly an
issue. But is it a requirement for state? Back to Java locals, it seems
to me they meet the standard definition of state, despite the lack of
cycles.

As to pointers/references, I earlier mentioned the existence of the
reference/dereference operations as being definitional. Note that
one can go to some lengths to obscure them, but they're still there.
For example, Java has the reference and dereference operators;
Java's "." operator is actually C's "->" operator.

I am not so bold/foolish as to attempt a definition of "object",
however. :-)


Marshall
From: Joe Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1152828428.281715.299050@p79g2000cwp.googlegroups.com>
Marshall wrote:
>
> Consider the following Java fragment:
>
> void foo() {
>   int i = 0;
>   int j = 0;
>
>   // put any code here you want
>
>   j = 1;
>   i = 2;
>   // check value of j here. It is still 1, no matter what you filled in
> above.
>   // The assignment to i cannot be made to affect the value of j.
>
> }

True, but you have hidden the pointers.  Semantically, the identifiers
i and j refer not to integers but to locations that hold integers.  The
assignment modifies the location.

> Those two local primitive variables cannot be made to have the same
> identity. But you can update them, so this is an example of mutability
> without the possibility of identity.

The identity is temporal:  You use the same variable name at two
different times.  Do you intend for the second `i' to mean the same
variable as the first `i'?
From: Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1152833316.272422.257250@h48g2000cwc.googlegroups.com>
Joe Marshall wrote:
> Marshall wrote:
> >
> > Consider the following Java fragment:
> >
> > void foo() {
> >   int i = 0;
> >   int j = 0;
> >
> >   // put any code here you want
> >
> >   j = 1;
> >   i = 2;
> >   // check value of j here. It is still 1, no matter what you filled in
> > above.
> >   // The assignment to i cannot be made to affect the value of j.
> >
> > }
>
> True, but you have hidden the pointers.  Semantically, the identifiers
> i and j refer not to integers but to locations that hold integers.  The
> assignment modifies the location.

What the implementation looks like shouldn't affect how we speak
of the logical model. In the above code, there are no pointers.

By your definition, "pointer" and "variable" are synonyms. That doesn't
seem like a good idea to me. (What if i and j end up in registers?
I have not heard it said that registers have addresses.)


> > Those two local primitive variables cannot be made to have the same
> > identity. But you can update them, so this is an example of mutability
> > without the possibility of identity.
>
> The identity is temporal:  You use the same variable name at two
> different times.  Do you intend for the second `i' to mean the same
> variable as the first `i'?

Okay, so "i" and "i" have the same identity. I suppose you could argue
that the language's namespace is an address-space, and variable names
are addresses. At this point, though, you've made the concept of
identity so broad that it is now necessarily a property of all
languages that use named values, whether updatable or not. I think you
would also have to call "3" a void pointer by this scheme.

Clearly there is *some* difference between a language which allows
explicit pointers and thus aliasing and one that doesn't. What term
would you use? First-class variables?


Marshall
From: Joachim Durchholz
Subject: Re: What is a type error?
Date: 
Message-ID: <e97qrh$cou$1@online.de>
Marshall schrieb:
> By your definition, "pointer" and "variable" are synonyms. That doesn't
> seem like a good idea to me. (What if i and j end up in registers?
> I have not heard it said that registers have addresses.)

There are no registers in the JVM ;-P

More specifically, where Joe said "pointer" he really meant "reference". 
i refers to a variable, and you really want to make sure that the first 
i refers to the same variable as the second i, so whatever is referred 
to by i is really an identity.

>>> Those two local primitive variables cannot be made to have the same
>>> identity. But you can update them, so this is an example of mutability
>>> without the possibility of identity.
>> The identity is temporal:  You use the same variable name at two
>> different times.  Do you intend for the second `i' to mean the same
>> variable as the first `i'?
> 
> Okay, so "i" and "i" have the same identity.

Vacuous but true.

> I suppose you could argue that the language's namespace is an
> address-space, and variable names are addresses.

Correct.

> At this point, though, you've made the concept of identity so broad
> that it is now necessarily a property of all languages that use named
> values, whether updatable or not. I think you would also have to call
> "3" a void pointer by this scheme.

It's not a void pointer, it's a pointer to an integer, but you're 
correct: "3", when written in a program text, is a reference to the 
successor of the successor of the successor of zero.

If you find that silly and would like to stick with equality, then 
you're entirely correct. Natural numbers are entirely immutable by 
definition, and I'm definitely not the first one to observe that 
identity and equality are indistinguishable for immutable objects.

> Clearly there is *some* difference between a language which allows
> explicit pointers and thus aliasing and one that doesn't.

You can have aliasing without pointers; e.g. arrays are fully sufficient.
If i = j, then a [i] and a [j] are aliases of the same object.
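In Java, for instance, this array aliasing needs no pointers at all; a minimal sketch:

```java
public class IndexAlias {
    // With i == j, a[i] and a[j] name the same cell: aliasing
    // without any explicit pointer in sight.
    public static int demo() {
        int[] a = {10, 20, 30};
        int i = 1, j = 1;
        a[i] = 99;
        return a[j];  // 99: the write through a[i] is visible via a[j]
    }
}
```

A compiler that cannot prove i != j must assume the write through a[i] may change a[j], which is exactly the burden pointer aliasing imposes in C.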

(After I observed that, I found it no longer a surprise that array 
optimizations are what Fortran compiler teams sink most time into. The 
aliasing problems are *exactly* the same as those with pointers in C - 
though in practice, the Fortranistas have the advantage that the 
compiler will usually know that a [i] and b [j] cannot be aliases of 
each other, so while they have the same problems, the solutions give the 
Fortranistas more leverage.)

 > What term would you use? First-class variables?

I think it's more a quasi-continuous spectrum.

On one side, we have alias-free toy languages (no arrays, no pointers, 
no call-by-reference, just to name the most common sources of aliasing).

SQL is slightly more dangerous: it does have aliases, but they are 
rarely a problem (mostly because views aren't a very common thing, and 
even if they are used, they aren't usually so abstract that they really 
hide the underlying dependencies).

Next are languages with arrays and call-by-reference. "Pascal without 
pointers" would be a candidate.
Here, aliasing occurs, and it can and does hide behind abstraction 
barriers, but it's not a serious problem unless the software becomes 
*really* large.

The end of the line are languages that use references to mutable data 
everywhere. All OO languages are a typical example of that.

Regards,
Jo
From: Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1152891920.228515.188180@m73g2000cwd.googlegroups.com>
Joachim Durchholz wrote:
>
> You can have aliasing without pointers; e.g. arrays are fully sufficient.
> If i = j, then a [i] and a [j] are aliases of the same object.

I am having a hard time with this very broad definition of aliasing.
Would we also say that a[1+1] and a[2] are aliases? It seems
to me, above, that we have only a, and with only one variable
there can be no aliasing.

A further question:

given a 32 bit integer variable x, and offsets i and j (equal as in
the above example) would you say that

   x &= (1 << i)
and
   x &= (1 << j)

are aliased expressions for setting a particular bit in x?

I am not being facetious; I am trying to understand the limits
of your definition for aliasing.


> (After I observed that, I found it no longer a surprise that array
> optimizations are what Fortran compiler teams sink most time into. The
> aliasing problems are *exactly* the same as those with pointers in C -
> though in practice, the Fortranistas have the advantage that the
> compiler will usually know that a [i] and b [j] cannot be aliases of
> each other, so while they have the same problems, the solutions give the
> Fortranistas more leverage.)

I don't understand this paragraph. On the one hand, it seems you
are saying that C and Fortran are identically burdened with the
troubles caused by aliasing, and then a moment later it seems
you are saying the situation is distinctly better with Fortran.


>  > What term would you use? First-class variables?
>
> I think it's more a quasi-continuous spectrum.
>
> On one side, we have alias-free toy languages (no arrays, no pointers,
> no call-by-reference, just to name the most common sources of aliasing).
>
> SQL is slightly more dangerous: it does have aliases, but they are
> rarely a problem (mostly because views aren't a very common thing, and
> even if they are used, they aren't usually so abstract that they really
> hide the underlying dependencies).
>
> Next are languages with arrays and call-by-reference. "Pascal without
> pointers" would be a candidate.
> Here, aliasing occurs, and it can and does hide behind abstraction
> barriers, but it's not a serious problem unless the software becomes
> *really* large.
>
> The end of the line are languages that use references to mutable data
> everywhere. All OO languages are a typical example of that.

Now with this, it appears you are agreeing that SQL has an advantage
vis-a-vis aliasing compared to OO languages. Yes? If so, we are
agreeing on the part I care about, and the specifics of just what
we call aliasing are not so important to me.


Marshall
From: Chris Smith
Subject: Re: What is a type error?
Date: 
Message-ID: <MPG.1f21ea9e2a3f19ee989687@news.altopia.net>
Marshall wrote...
> I am having a hard time with this very broad definition of aliasing.
> Would we also say that a[1+1] and a[2] are aliases? It seems
> to me, above, that we have only a, and with only one variable
> there can be no aliasing.

The problem with this (and with the relational one as well) is that the 
practical problems don't go away by redefining "variable" to mean larger 
collections of values.

> given a 32 bit integer variable x, and offsets i and j (equal as in
> the above example) would you say that
> 
>    x &= (1 << i)
> and
>    x &= (1 << j)
> 
> are aliased expressions for setting a particular bit in x?

I don't know about Joachim, but yes, I would.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Joachim Durchholz
Subject: Re: What is a type error?
Date: 
Message-ID: <e994ev$lj6$1@online.de>
Marshall schrieb:
> Joachim Durchholz wrote:
>> You can have aliasing without pointers; e.g. arrays are fully sufficient.
>> If i = j, then a [i] and a [j] are aliases of the same object.
> 
> I am having a hard time with this very broad definition of aliasing.
> Would we also say that a[1+1] and a[2] are aliases? It seems
> to me, above, that we have only a, and with only one variable
> there can be no aliasing.

a[1] is a variable, too. You can assign values to it.

Inside function foo when called as foo(a[1]), it may even be referenced 
with yet another name and updated in isolation; foo may not even know 
it's an array element. (This is sort-of true in C, and definitely true 
in languages that don't have pointer arithmetic. If you pass a[1] to a 
call-by-reference parameter in Pascal, the callee has no way to access 
the parameter as if it were an array element.)

> A further question:
> 
> given a 32 bit integer variable x, and offsets i and j (equal as in
> the above example) would you say that
> 
>    x &= (1 << i)
> and
>    x &= (1 << j)
> 
> are aliased expressions for setting a particular bit in x?

Yes.

> I am not being facetious; I am trying to understand the limits
> of your definition for aliasing.

Understood and appreciated.

>> (After I observed that, I found it no longer a surprise that array
>> optimizations are what Fortran compiler teams sink most time into. The
>> aliasing problems are *exactly* the same as those with pointers in C -
>> though in practice, the Fortranistas have the advantage that the
>> compiler will usually know that a [i] and b [j] cannot be aliases of
>> each other, so while they have the same problems, the solutions give the
>> Fortranistas more leverage.)
> 
> I don't understand this paragraph. On the one hand, it seems you
> are saying that C and Fortran are identically burdened with the
> troubles caused by aliasing, and then a moment later it seems
> you are saying the situation is distinctly better with Fortran.

That's exactly the situation.
Aliasing is present in both languages, but in Fortran, guarantees that 
two names aren't aliased are easier to come by.
Even more non-aliasing guarantees exist if the language has a more 
expressive type system. In most cases, if you know that two expressions 
have different types, you can infer that they cannot be aliases.

Of course, the life of compiler writers is of less importance to us; I 
added them only because the reports of Fortran compilers having aliasing 
problems due to array indexes made me aware that pointers aren't the 
only source of aliasing. That was quite a surprise to me then.

> Now with this, it appears you are agreeing that SQL has an advantage
> vis-a-vis aliasing compared to OO languages. Yes?

Sure.
I think that making an aliasing bug in SQL is just as easy as in any 
pointer-ridden language, but transactions and constraints (plus the 
considerably smaller typical size of SQL code) make it far easier to 
diagnose any such bugs.

> If so, we are agreeing on the part I care about, and the specifics of
> just what we call aliasing are not so important to me.

I'm not sure whether I'm getting the same mileage out of this, but the 
relevance of a problem is obviously a function on what one is doing on a 
day-to-day basis, so I can readily agree with that :-)

Regards,
Jo
From: Chris F Clark
Subject: Re: What is a type error?
Date: 
Message-ID: <sdd8xmuzlvl.fsf@shell01.TheWorld.com>
Joachim Durchholz wrote:
 >
 > You can have aliasing without pointers; e.g. arrays are fully sufficient.
 > If i = j, then a [i] and a [j] are aliases of the same object.

"Marshall" <···············@gmail.com> writes:

> I am having a hard time with this very broad definition of aliasing.
> Would we also say that a[1+1] and a[2] are aliases? It seems
> to me, above, that we have only a, and with only one variable
> there can be no aliasing.

As you can see from the replies, all these things which you do not
consider aliasing are considered aliasing (by most everyone else, in
fact these are exactly what is meant by aliasing).  There is good
reason for this.  Let us explain this by looking again at the SQL
example that has been kicked around some.

>>    SELECT * FROM persons WHERE name = "John"
>> and
>>    SELECT * FROM persons WHERE surname = "Doe"

Some set of records are shared by both select clauses.  Those records
are the aliased records.  If the set is empty there is no problem.  If
we do not update the records from one of the select clauses, we also
have no problem.  However, if we update the records for one of the
select clauses and the set of aliased records is not empty, the
records returned by the other select clause have changed.  Therefore,
everything we knew to be true about the "other" select clause may or
may not still be true.  It's the may not case we are worried about.

For example, in the "imperative Mi-5" Q asks Moneypenny to run the
first query (name = John) to get a list of agents for infiltrating
Spectre.  In another department, they're doing sexual reassignments
and updating the database with the second query (surname = Doe) and
making the name become Jane (along with other changes to make the
transformation correct).  So, when Q selects "John Doe" from the first
list and sends that agent on a mission, Q is really sending "Jane Doe",
and Spectre realizes that she is an agent and kills her.  Not a happy
result.

In the "functional Mi-5", the sexual reassignment group makes clones
of the agents reassigned, so that there is now both a "John Doe" and a
"Jane Doe".  Of course, the payroll is a little more bloated, so that
when Q says "John Doe" isn't needed for any further assignments, the
"Garbage Collection" department does a reduction in force and gets rid
of "John Doe".  The good news is that they didn't send Jane Doe to her
death.

The problem with aliasing is that we want the things we knew true in
one part of our program, i.e. the Johns on Q's list are still Johns and
not Janes, to remain true after we've run another part of our program.  If the
second part of our program can change what was true after the first
part of our program, we can't depend on the results from the first
part of our program, i.e. Q can send Jane Doe into certain death.

The problem is compounded by the complexities of our programs.  When Q
selected his list, he didn't know of the department doing the
reassignments (and couldn't know because they were part of a
top-secret project).  So, since bureaucracies grow to meet the
increasing needs of the bureaucracy, often the solution is increased
complexity, more regulations and paperwork to fill out, semaphores and
locks to hold on critical data, read-only data, status requests, etc.
All to keep Q from killing Jane by accident.  Sometimes they work: the
reassignment department has to wait 30 days for the clearance to
perform their operation, in which time John Doe completes the
infiltration of Spectre, saves the world from destruction, and is ready
for his next assignment.

The key point is that each record (and even each field in each record),
if it can be known by two names, is an alias.  It is not sufficient to
talk about "whole" variables as not being aliased if there is some way
to refer to some part of the variable and change that part of the
variable.  Thus, a[1+1] is an alias for a[2] if you can have one part
of the code talking about one and another part of the code talking
about the other.

To put it one final way, consider the following code:
        assert(sorted(array a));
        a[1] = 2;
        assert(sorted(array a)); // is this still true?

Because a[1] is an alias into array a, you cannot be certain that the
second assert will not fire (fail) without additional analysis.

        assert(sorted(array a));
        assert(a[0] <= 2);
        assert(2 <= a[2]);
        a[1] = 2;
        assert(sorted(array a)); // this is still true!
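For concreteness, here is a runnable Java version of the sketch above; the `sorted` helper is filled in by me, since the pseudocode leaves it undefined:

```java
// Runnable version of the assertion example.  The `sorted` helper is
// our own; it simply checks that the array is in ascending order.
public class SortedAssert {
    static boolean sorted(int[] a) {
        for (int i = 1; i < a.length; i++) {
            if (a[i - 1] > a[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        int[] a = {1, 3, 4};
        assert sorted(a);
        assert a[0] <= 2;     // the two extra preconditions from above
        assert 2 <= a[2];
        a[1] = 2;             // a write through an alias into `a`
        assert sorted(a);     // now guaranteed by the preconditions
    }
}
```

Without the two extra asserts, the final `sorted(a)` would depend on values the analysis cannot see.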

-Chris
From: Joe Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1153159224.849151.88170@m79g2000cwm.googlegroups.com>
Marshall wrote:
>
> I am having a hard time with this very broad definition of aliasing.

How about this definition:  Consider three variables, i, j, and k, and
a functional equivalence predicate (EQUIVALENT(i, j) returns true if
for every pure function F, F(i) = F(j)).  Now suppose i and j are
EQUIVALENT at some point, then a side effecting function G is invoked
on k, after which i and j are no longer equivalent.  Then there is
aliasing.

This is still a little awkward, but there are three main points:
  1.  Aliasing occurs between variables (named objects).
  2.  It is tied to the notion of equivalence.
  3.  You can detect it when a procedure that has no access to a value
can nonetheless modify the value.

In a call-by-value language, you cannot alias values directly, but if
the values are aggregate data structures (like in Java), you may be
able to modify a shared subcomponent.
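A small Java sketch of this definition (my own illustration, not from the thread): `f` stands in for one representative pure function F, and the values are arrays so that a side effect through `k` is possible:

```java
import java.util.Arrays;

// Sketch of the aliasing definition: i and j are "equivalent" when
// every pure function F gives F(i) == F(j).  A side effect through k,
// which shares structure with i, breaks that equivalence.
public class AliasDetect {
    // One representative pure function F over int arrays.
    static int f(int[] v) { return Arrays.stream(v).sum(); }

    public static void main(String[] args) {
        int[] i = {1, 2, 3};
        int[] j = {1, 2, 3};   // a distinct copy with the same contents
        int[] k = i;           // k aliases i

        assert f(i) == f(j);   // i and j are observationally equivalent

        k[0] = 99;             // side effect through k only

        assert f(i) != f(j);   // equivalence broken: k aliased i
    }
}
```

One counterexample function suffices to show the equivalence no longer holds.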
From: Joe Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1152906397.860152.312910@m73g2000cwd.googlegroups.com>
Marshall wrote:
> Joe Marshall wrote:
> > Marshall wrote:
> > >
> > > Consider the following Java fragment:
> > >
> > > void foo() {
> > >   int i = 0;
> > >   int j = 0;
> > >
> > >   // put any code here you want
> > >
> > >   j = 1;
> > >   i = 2;
> > >   // check value of j here. It is still 1, no matter what you filled in
> > > above.
> > >   // The assignment to i cannot be made to affect the value of j.
> > >
> > > }
> >
> > True, but you have hidden the pointers.  Semantically, the identifiers
> > i and j refer not to integers but to locations that hold integers.  The
> > assignment modifies the location.
>
> What the implementation looks like shouldn't affect how we speak
> of the logical model. In the above code, there are no pointers.
>
> By your definition, "pointer" and "variable" are synonyms. That doesn't
> seem like a good idea to me. (What if i and j end up in registers?
> I have not heard it said that registers have addresses.)

You probably won't like this, but....

The example you give above is a bit ambiguous as to whether or not it
shows `true' mutation.  So what do I mean by `true' mutation?  The code
above could be transformed by alpha-renaming into an equivalent version
that does no mutation:

void foo() {
   int i = 0;
   int j = 0;

   // put any code here you want

   int jj = 1;
   int ii = 2;
   // rename all uses of i to ii and j to jj after this point.

 }

Once I've done the renaming, we're left with a pure function.  I claim
that this sort of `mutation' is uninteresting because it can be
trivially eliminated through renaming.  Since we can eliminate it
through renaming, we can use a simplified semantics that does not
involve a `store', and use a substitution model of evaluation.

The more interesting kind of mutation cannot be eliminated through
renaming.  The semantic model requires the notion of a stateful
`store', and the model is not referentially transparent: substitution
does not work.  These qualities are what make mutation problematic.  My
claim is that this kind of mutation cannot be had without some model of
pointers and references.

There is a further distinguishing characteristic:  Can I write a
program that detects the mutation (and acts differently depending on
whether there is mutation or not)?  It is unclear from your example
above whether you intended this to be possible, but certainly there is
no way for a caller of foo to tell that foo reassigns an internal
variable.  If a procedure is extrinsically a pure function, then
there is no real meaning to any `mutation' that goes on inside; there
is always a way to turn the internal mutation into a pure function
through alpha-renaming.
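To illustrate the contrast (my own sketch, with illustrative names): here the mutation escapes through shared state, so a caller can detect it, and no alpha-renaming inside `foo` can eliminate it:

```java
// `True' mutation in the sense above: the caller can observe the
// change, so renaming inside the callee cannot remove it.
public class TrueMutation {
    static int[] cell = {0};   // shared, globally reachable state

    static void foo() {
        cell[0] = cell[0] + 1; // the write escapes through `cell`;
                               // no local renaming makes foo pure
    }

    public static void main(String[] args) {
        int before = cell[0];
        foo();
        assert cell[0] == before + 1;  // the caller detects the mutation
    }
}
```

Modeling this requires a store; a substitution model of evaluation no longer suffices.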

>
> Clearly there is *some* difference between a language which allows
> explicit pointers and thus aliasing and one that doesn't. What term
> would you use? First-class variables?

My argument to you in re this statement:
> Again, I disagree: it is posible to have mutability without
> pointers/identity/objects.

is that mutability without a notion of pointers, identity, or objects
is trivially eliminated through alpha renaming and thus isn't the
`mutability' of interest.
From: Joachim Durchholz
Subject: Re: What is a type error?
Date: 
Message-ID: <e97pp2$9p0$1@online.de>
Marshall schrieb:
> void foo() {
>   int i = 0;
>   int j = 0;
>   j = 1;
>   i = 2;
>   // check value of j here. It is still 1, no matter what you filled
 >   // in above.
>   // The assignment to i cannot be made to affect the value of j.
> }
> 
> Those two local primitive variables cannot be made to have the same
> identity.

Sure. To state it more clearly, they cannot be aliases.

 > But you can update them, so this is an example of mutability
> without the possibility of identity.

You're being a bit sloppy with terminology here. "Identity" in the 
phrase above refers to two entities, in the sense of "i and j cannot be 
identical".
Identity is actually a property of a single entity, namely that which 
remains constant regardless of what you do with the entity.

Regards,
Jo
From: Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1152891092.888466.79510@35g2000cwc.googlegroups.com>
Joachim Durchholz wrote:
> Marshall schrieb:
> > void foo() {
> >   int i = 0;
> >   int j = 0;
> >   j = 1;
> >   i = 2;
> >   // check value of j here. It is still 1, no matter what you filled
>  >   // in above.
> >   // The assignment to i cannot be made to affect the value of j.
> > }
> >
> > Those two local primitive variables cannot be made to have the same
> > identity.
>
> Sure. To state it more clearly, they cannot be aliases.

Yes.


>  > But you can update them, so this is an example of mutability
> > without the possibility of identity.
>
> You're being a bit sloppy with terminology here. "Identity" in the
> phrase above refers to two entities, in the sense of "i and j cannot be
> identical".
> Identity is actually a property of a single entity, namely that what
> remains constant regardless of what you do with the entity.

It is a fair critque. I'll try to be more precise.


Marshall
From: Chris Smith
Subject: Re: What is a type error?
Date: 
Message-ID: <MPG.1f20306a2cb46ebb989792@news.altopia.net>
Marshall <···············@gmail.com> wrote:
> Hmmm, well, I cannot agree. You've defined away the pointers
> but then slipped them back in again by assumption ("objects
> of these types have identity".)
> 
> First let me say that the terminology is somewhat problematic.
> For the specific issue being discussed here, pointers, identity,
> and objects are all the same concept. (I agree that "pointer"
> connotes a low-level construct, however.)

Unless I'm missing your point, I disagree with your disagreement.  
Mutability only makes sense because of object identity (in the generic 
sense; no OO going on here).  Without object identities, mutability is 
useless.  What's the use of changing something if you're not sure you'll 
ever be able to find it again?

You may limit the scope of object identity arbitrarily, even to the 
point that aliasing is impossible (though with lexical closure, that 
gets even more limiting than it may first appear)... but you're just 
trading off power for simplicity, and the really interesting uses of 
mutations are those that allow access to specific objects from any 
number of different bits of code, on a program-wide or at least module-wide 
scope.  Most mediocre programmers could replace assignment with 
recursion if that assignment is limited to local variables of a single 
subroutine.  I don't necessarily agree that the result will be a better 
program despite others' conviction on the matter; however, the 
difference certainly isn't worth complicating the language with mutation 
unless you're willing to allow the interesting uses of mutation as well.

> Mutability by itself does not imply identity. I agree that mutability
> plus identity implies aliasing problems, however.

We might have a terminological issue, then.  I'd tend to say that 
mutability definitely does imply identity, but identity doesn't imply 
aliasing.  Same difference.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Darren New
Subject: Re: What is a type error?
Date: 
Message-ID: <4_vtg.14828$MF6.8524@tornado.socal.rr.com>
Chris Smith wrote:
> Unless I'm missing your point, I disagree with your disagreement.  
> Mutability only makes sense because of object identity (in the generic 
> sense; no OO going on here). 

Depends what you mean by "object".

int x = 6; int y = 5; x = y;

I'd say x was mutable, with no "identity" problems involved?

Why is it problematic that variables have identity and are mutable? 
Certainly I can later "find" whatever value I put into x.

-- 
   Darren New / San Diego, CA, USA (PST)
     This octopus isn't tasty. Too many
     tentacles, not enough chops.
From: Chris Smith
Subject: Re: What is a type error?
Date: 
Message-ID: <MPG.1f207586ed65f730989796@news.altopia.net>
Darren New <····@san.rr.com> wrote:
> Chris Smith wrote:
> > Unless I'm missing your point, I disagree with your disagreement.  
> > Mutability only makes sense because of object identity (in the generic 
> > sense; no OO going on here). 
> 
> Depends what you mean by "object".
> 
> int x = 6; int y = 5; x = y;
> 
> I'd say x was mutable, with no "identity" problems involved?

The variable x definitely has identity that's independent of its value.  
Some might call that a problem in and of itself, as it complicates the 
formal model of the language and makes it difficult to predict what 
result will be produced by normal order evaluation.

On the other hand, this thread seems to be using "identity" to mean 
"identity with potential for aliasing", in which case it is vacuously 
true that eliminating identity also prevents the problems that arise 
from aliasing.  It is true, and I agree on this with Marshall, that 
eliminating the potential for aliasing solves a lot of problems with 
checking invariants.  I also see, though, that the majority (so far, I'd 
say all) of the potential uses for which it's worth introducing mutation 
into an otherwise mutation-free language allow the possibility of 
aliasing, which sorta makes me wonder whether this problem is worth 
solving.  I'd like to see an example of code that would be harder to 
write without mutation, but which can obey any restriction that's 
sufficient to prevent aliasing.

> Why is it problematic that variables have identity and are mutable? 
> Certainly I can later "find" whatever value I put into x.

I simply found the language confusing.  I said it would be nonsensical 
for a language to have mutation without identity.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1152833948.111493.133950@i42g2000cwa.googlegroups.com>
Chris Smith wrote:
> Darren New <····@san.rr.com> wrote:
> > Chris Smith wrote:
> > > Unless I'm missing your point, I disagree with your disagreement.
> > > Mutability only makes sense because of object identity (in the generic
> > > sense; no OO going on here).
> >
> > Depends what you mean by "object".
> >
> > int x = 6; int y = 5; x = y;
> >
> > I'd say x was mutable, with no "identity" problems involved?
>
> The variable x definitely has identity that's independent of its value.

I'm not sure what you mean by that.

>  I also see, though, that the majority (so far, I'd
> say all) of the potential uses for which it's worth introducing mutation
> into an otherwise mutation-free language allow the possibility of
> aliasing, which sorta makes me wonder whether this problem is worth
> solving.

What about my example of SQL? Mutation, no pointers, no aliasing.
Yet: useful.


Marshall
From: Chris Smith
Subject: Re: What is a type error?
Date: 
Message-ID: <MPG.1f20945225d45ea5989799@news.altopia.net>
Marshall <···············@gmail.com> wrote:
> Chris Smith wrote:
> > Darren New <····@san.rr.com> wrote:
> > > Chris Smith wrote:
> > > > Unless I'm missing your point, I disagree with your disagreement.
> > > > Mutability only makes sense because of object identity (in the generic
> > > > sense; no OO going on here).
> > >
> > > Depends what you mean by "object".
> > >
> > > int x = 6; int y = 5; x = y;
> > >
> > > I'd say x was mutable, with no "identity" problems involved?
> >
> > The variable x definitely has identity that's independent of its value.
> 
> I'm not sure what you mean by that.

I mean, simply, that when you can assign a value to a variable, then you 
care that it is that variable and not a different one.  That's identity 
in the normal sense of the word.  The code elsewhere in the procedure is 
able to access the value of 'x', and that has meaning even though you 
don't know what value 'x' has.  This has definite implications, and is a 
useful concept; for example, it means that the pure lambda calculus is 
no longer sufficient to express the semantics of the programming 
language, but instead something else is required.

What you are asking for is some subset of identity, and I've not yet 
succeeded in understanding exactly what it is or what its limits are... 
except that so far, it seems to have everything to do with pointers or 
aliasing.

> >  I also see, though, that the majority (so far, I'd
> > say all) of the potential uses for which it's worth introducing mutation
> > into an otherwise mutation-free language allow the possibility of
> > aliasing, which sorta makes me wonder whether this problem is worth
> > solving.
> 
> What about my example of SQL? Mutation, no pointers, no aliasing.
> Yet: useful.

I'm not yet convinced that this is any different from a language with 
standard pointer aliasing.  If I have two tables A and B, and a foreign 
key from A into B, then I run into the same problems with enforcing 
constraints that I would see with a pointer model... when I update a 
relation, I need to potentially check every other relation that contains 
a foreign key into it, in order to ensure that its constraints are not 
violated by that update.  That's the same thing that is being 
pointed out as a negative consequence of aliasing in other languages.  
For example, executing:

    UPDATE P SET x = 5 WHERE y = 43;

may result in the database having to re-evaluate the constraint that 
says that for all P(x, y, z), x must be less than 4 when z = 17.  One 
difference is that while in general purpose programming languages this 
appears to be a daunting task, databases are set up to do these kinds of 
things all the time and contain optimizations for it... but the problem 
is still the same, and it would still present the same difficulties in 
doing formal proofs that running the above UPDATE statement doesn't 
violate any invariants.

(If I'm wrong about that, please let me know; I'd be very interested if 
that's so.)

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1152842087.797288.132490@75g2000cwc.googlegroups.com>
Chris Smith wrote:
> Marshall <···············@gmail.com> wrote:
> > Chris Smith wrote:
> > > Darren New <····@san.rr.com> wrote:
> > > > Chris Smith wrote:
> > > > > Unless I'm missing your point, I disagree with your disagreement.
> > > > > Mutability only makes sense because of object identity (in the generic
> > > > > sense; no OO going on here).
> > > >
> > > > Depends what you mean by "object".
> > > >
> > > > int x = 6; int y = 5; x = y;
> > > >
> > > > I'd say x was mutable, with no "identity" problems involved?
> > >
> > > The variable x definitely has identity that's independent of its value.
> >
> > I'm not sure what you mean by that.
>
> I mean, simply, that when you can assign a value to a variable, then you
> care that it is that variable and not a different one.  That's identity
> in the normal sense of the word.

I guess it is, now that you mention it.


> The code elsewhere in the procedure is
> able to access the value of 'x', and that has meaning even though you
> don't know what value 'x' has.  This has definite implications, and is a
> useful concept; for example, it means that the pure lambda calculus no
> longer sufficient to express the semantics of the programming language,
> but instead something else is required.
>
> What you are asking for is some subset of identity, and I've not yet
> succeeded in understanding exactly what it is or what its limits are...
> except that so far, it seems to have everything to do with pointers or
> aliasing.

Perhaps it is specifically first-class identity, rather than identity
per se.


> > >  I also see, though, that the majority (so far, I'd
> > > say all) of the potential uses for which it's worth introducing mutation
> > > into an otherwise mutation-free language allow the possibility of
> > > aliasing, which sorta makes me wonder whether this problem is worth
> > > solving.
> >
> > What about my example of SQL? Mutation, no pointers, no aliasing.
> > Yet: useful.
>
> I'm not yet convinced that this is any different from a language with
> standard pointer aliasing.  If I have two tables A and B, and a foreign
> key from A into B, then I run into the same problems with enforcing
> constraints that I would see with a pointer model... when I update a
> relation, I need to potentially check every other relation that contains
> a foreign key into it, in order to ensure that its constraints are not
> violated by that constraint.  That's the same thing that is being
> pointed out as a negative consequence of aliasing in other languages.

No, that's not the same thing. What you are describing here is
not an aliasing issue, but simply the consequences of allowing
constraints to mention more than one variable.

In our i-and-j example above, suppose there was a constraint
such that i < j. We have to re-check this constraint if we
update either i or j. That's not the same thing as saying
that i and j are aliased.

A foreign key constraint is a multi-variable constraint.
Specifically, a foreign key from table A, attribute a
to table B, attribute b is the constraint:

forall a in A, exists b in B such that a = b.

Note that two variables, A and B, are referenced in
the constraint. In general, any constraint on two
variables will have to be rechecked upon update
to either.


> For example, executing:
>
>     UPDATE P SET x = 5 WHERE y = 43;
>
> may result in the database having to re-evaluate the constraint that
> says that for all P(x, y, z), x must be less than 4 when z = 17.

I don't see any aliasing in this example either.

But consider how much worse this problem is if real aliasing
is possible. We have some pointer variable, and initialize
it with the return value from some function. We don't know
anything about what variable is involved. We update through
the pointer. What constraints must we recheck? Apparently
all of them; unless we have perfect alias analysis, we can't
tell what variables are affected by our update.
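A sketch of that contrast (illustrative code, not from the thread): a constraint over named variables need only be rechecked when one of those variables is assigned, whereas a write through an opaque reference gives no such information:

```java
// Multi-variable constraints vs. aliasing.  The constraint i < j must
// be rechecked on any assignment to i or j -- but the write site tells
// us which those are.  A write through an opaque reference does not.
public class ConstraintRecheck {
    static int[] i = {1};
    static int[] j = {5};

    static boolean invariant() { return i[0] < j[0]; }

    static void updateThrough(int[] p, int v) {
        // This write mentions only p.  Without alias analysis we
        // cannot tell whether it touches i, j, or neither, so in
        // principle every constraint must be rechecked.
        p[0] = v;
    }

    public static void main(String[] args) {
        i[0] = 3;            // write to a known variable: recheck only
        assert invariant();  // the constraints that mention i

        updateThrough(j, 2); // an opaque write; i < j is now violated,
        assert !invariant(); // but nothing at the call site said so
    }
}
```

The multi-variable recheck is local and predictable; the aliased write is not.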


> One
> difference is that while in general purpose programming languages this
> appears to be a daunting task, databases are set up to do these kinds of
> things all the time and contain optimizations for it... but the problem
> is still the same, and it would still present the same difficulties in
> doing formal proofs that running the above UPDATE statement doesn't
> violate any invariants.
>
> (If I'm wrong about that, please let me know; I'd very interested if
> that's so.)

I'm interested to hear your reply.


Marshall
From: Chris Smith
Subject: Re: What is a type error?
Date: 
Message-ID: <MPG.1f20b12c658f511098979c@news.altopia.net>
Marshall <···············@gmail.com> wrote:
> > What you are asking for is some subset of identity, and I've not yet
> > succeeded in understanding exactly what it is or what its limits are...
> > except that so far, it seems to have everything to do with pointers or
> > aliasing.
> 
> Perhaps it is specifically first-class identity, rather than identity
> per se.

As in: "the value of one variable can (be/refer to/depend on) the 
identity of another variable"?  I can certainly see this as a 
reasonable concept to consider.

> > I'm not yet convinced that this is any different from a language with
> > standard pointer aliasing.  If I have two tables A and B, and a foreign
> > key from A into B, then I run into the same problems with enforcing
> > constraints that I would see with a pointer model... when I update a
> > relation, I need to potentially check every other relation that contains
> > a foreign key into it, in order to ensure that its constraints are not
> > violated by that constraint.  That's the same thing that is being
> > pointed out as a negative consequence of aliasing in other languages.
> 
> No, that's not the same thing. What you are describing here is
> not an aliasing issue, but simply the consequences of allowing
> constraints to mention more than one variable.

> A foreign key constraint is a multi-variable constraint.
> Specifically, a foreign key from table A, attribute a
> to table B, attribute b is the constraint:
> 
> forall a in A, exists b in B such that a = b.
> 
> Note that two variables, A and B, are referenced in
> the constraint.

There's confusion here coming from different usages of the word 
variable.  Let us talk instead of values, and of the abstract structures 
that give them meaning.  In both cases (invariants in a hypothetical 
imperative language, and in a relational database), the constraints make 
reference to these structures of values (relations, for example, or 
various kinds of data structures), and not to the individual values or 
objects that they contain.  In both cases, the problem is not that we 
don't know what structures to check to verify the invariant; rather, 
it's that we have to check ALL of the values in that structure.

As someone pointed out, this is to be expected in a world of mutable 
things with identity that are globally locatable.  It is simple fact 
that if I tell you "I spoke to Barbara's husband", you may need to trace 
down who Barbara's husband is before you could discover that, for 
example, maybe I actually spoke to your boss, or to your nephew's best-
friend's father.  If databases are capable of modeling these kinds of 
relationships (and of course they are), then they are as susceptible to 
"aliasing" -- in a logical sense that avoids mention of pointer -- as 
anyone else.

> I don't see any aliasing in this example either.
> 

Actually, this was probably a bad example.  Let's stick to the others 
involving relationships between tuples.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1152866078.567124.135170@i42g2000cwa.googlegroups.com>
Chris Smith wrote:
> Marshall <···············@gmail.com> wrote:
> > > What you are asking for is some subset of identity, and I've not yet
> > > succeeded in understanding exactly what it is or what its limits are...
> > > except that so far, it seems to have everything to do with pointers or
> > > aliasing.
> >
> > Perhaps it is specifically first-class identity, rather than identity
> > per se.
>
> As in: "the value of one variable can (be/refer to/depend on) the
> identity of another variable"?  I can certainly see this as as
> reasonable concept to consider.

Yes, I think it's important. We've become so accustomed, in
modern languages, to considering variables as first-class,
that we cannot see any more that many of the problems
attributed to variables (such as aliasing) are actually
problems only with *first-class* variables. The historical
reality that non-first-class variables are associated with
more antique languages (Fortran/pre-OO) makes us shy
away from even considering removing first-class status
from variables. But I think it's an important point in the
design space (although I certainly respect Andreas'
reluctance; to paraphrase, "first class or bust." :-) )

After all, what are the alternatives? Purely-functional
languages remove themselves from a large class of
problems that I consider important: data management.
I have explored the OO path to its bitter end and am
convinced it is not the way. So what is left? Uniqueness
types and logic programming, I suppose. I enjoy logic
programming but it doesn't seem quite right. But notice:
no pointers there! And it doesn't seem to suffer from the
lack. I really don't know enough about uniqueness types
to evaluate them. But again, I already have my proposed
solution: variables but no first-class variables. (And yes,
I'm acutely aware of how problematic that makes
closures. :-()


> > > I'm not yet convinced that this is any different from a language with
> > > standard pointer aliasing.  If I have two tables A and B, and a foreign
> > > key from A into B, then I run into the same problems with enforcing
> > > constraints that I would see with a pointer model... when I update a
> > > relation, I need to potentially check every other relation that contains
> > > a foreign key into it, in order to ensure that its constraints are not
> > > violated by that constraint.  That's the same thing that is being
> > > pointed out as a negative consequence of aliasing in other languages.
> >
> > No, that's not the same thing. What you are describing here is
> > not an aliasing issue, but simply the consequences of allowing
> > constraints to mention more than one variable.
>
> > A foreign key constraint is a multi-variable constraint.
> > Specifically, a foreign key from table A, attribute a
> > to table B, attribute b is the constraint:
> >
> > forall a in A, exists b in B such that a = b.
> >
> > Note that two variables, A and B, are referenced in
> > the constraint.
>
> There's confusion here coming from different usages of the word
> variable.  Let us talk instead of values, and of the abstract structures
> that gives them meaning.

I don't think that will work. We cannot discuss constraints, nor
aliasing, without bringing variables in to the picture. (Aliasing
may exist without variables but it is a non-problem then.)


> In both cases (invariants in a hypothetical
> imperative language, and in a relational database), the constraints make
> reference to these structures of values (relations, for example, or
> various kinds of data structures), and not to the individual values or
> objects that they contain.

I am not certain what this means.


> In both cases, the problem is not that we
> don't know what structures to check to verify the invariant; rather,
> it's that we have to check ALL of the values in that structure.

But not all values in all structures, as you do in the presence
of aliasing.


> As someone pointed out, this is to be expected in a world of mutable
> things with identity that are globally locatable.  It is simple fact
> that if I tell you "I spoke to Barbara's husband", you may need to trace
> down who Barbara's husband is before you could discover that, for
> example, maybe I actually spoke to your boss, or to your nephew's best-
> friend's father.  If databases are capable of modeling these kinds of
> relationships (and of course they are), then they are as susceptible to
> "aliasing" -- in a logical sense that avoids mention of pointer -- as
> anyone else.

I don't see that they're the same thing. Similar in some ways, yes,
but not the same.

As I said much earlier, the terminology is problematic, and it may
be that we (or just I) simply lack the precision of terms necessary
for the conversation.

 
Marshall
From: Andreas Rossberg
Subject: Re: What is a type error?
Date: 
Message-ID: <e97nst$a45id$4@hades.rz.uni-saarland.de>
Marshall wrote:
> 
> After all, what are the alternatives? Purely-functional
> languages remove themselves from a large class of
> problems that I consider important: data management.

Maybe, but I have yet to see how second-class variables are really more 
adequate in dealing with it.

And note that even with second-class state you can still have aliasing 
issues - you just need mutable arrays and pass around indices. Keys in 
databases are a more general form of the same problem.

> I have explored the OO path to its bitter end and am
> convinced it is not the way. So what is left? Uniqueness
> types and logic programming, I suppose. I enjoy logic
> programming but it doesn't seem quite right. But notice:
> no pointers there!  And it doesn't seem to suffer from the
> lack.

Uh, aliasing all over the place! Actually, I think that logic 
programming, usually based on deep unification, brings by far the worst 
incarnation of aliasing issues to the table.

- Andreas

-- 
Andreas Rossberg, ········@ps.uni-sb.de
From: Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1152890972.511862.231590@m79g2000cwm.googlegroups.com>
Andreas Rossberg wrote:
> Marshall wrote:
> >
> > After all, what are the alternatives? Purely-functional
> > languages remove themselves from a large class of
> > problems that I consider important: data management.
>
> Maybe, but I have yet to see how second-class variables are really more
> adequate in dealing with it.
>
> And note that even with second-class state you can still have aliasing
> issues - you just need mutable arrays and pass around indices. Keys in
> databases are a more general form of the same problem.

So for array a, you would claim that "a[5]" is an alias for
(a part of) "a"? That seems to stretch the idea of aliasing
to me. With these two expressions, it is obvious enough
what is going on; with two arbitrary pointers, it is not.

It seems to me the source of the problem is the opacity
of pointers, but perhaps I am missing something.


> > I have explored the OO path to its bitter end and am
> > convinced it is not the way. So what is left? Uniqueness
> > types and logic programming, I suppose. I enjoy logic
> > programming but it doesn't seem quite right. But notice:
> > no pointers there!  And it doesn't seem to suffer from the
> > lack.
>
> Uh, aliasing all over the place! Actually, I think that logic
> programming, usually based on deep unification, brings by far the worst
> incarnation of aliasing issues to the table.

Hmmm. Can you elaborate just a bit?


Marshall
From: ········@ps.uni-sb.de
Subject: Re: What is a type error?
Date: 
Message-ID: <1152897443.675121.58750@m73g2000cwd.googlegroups.com>
Marshall wrote:
> Andreas Rossberg wrote:
> >
> > And note that even with second-class state you can still have aliasing
> > issues - you just need mutable arrays and pass around indices. Keys in
> > databases are a more general form of the same problem.
>
> So for array a, you would claim that "a[5]" is an alias for
> (a part of) "a"? That seems to stretch the idea of aliasing
> to me.

Not at all, I'd say. In my book, aliasing occurs whenever you have two
different "lvalues" that denote the same mutable cell. I think it would
be rather artificial to restrict the definition to lvalues that happen
to be variables or dereferenced pointers.

So of course, a[i] aliases a[j] when i=j, which in fact is a well-known
problem in some array or matrix routines (e.g. copying array slices
must always consider overlapping slices as special cases).
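A minimal Java illustration of the overlapping-slice hazard (my own example): a naive left-to-right copy within one array re-reads elements it has already overwritten, while System.arraycopy is specified to behave as if it copied through a temporary:

```java
import java.util.Arrays;

// Overlapping slices: copying a[0..2] onto a[1..3] with a naive loop
// reads elements the loop itself has already overwritten.
public class OverlapCopy {
    // Naive element-by-element copy; wrong when src and dst overlap
    // with dst ahead of src within the same array.
    static void naiveCopy(int[] a, int src, int dst, int len) {
        for (int n = 0; n < len; n++) a[dst + n] = a[src + n];
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4};
        naiveCopy(a, 0, 1, 3);   // intended result: {1, 1, 2, 3}
        // Instead the already-overwritten a[1] gets re-read:
        assert Arrays.equals(a, new int[]{1, 1, 1, 1});

        int[] b = {1, 2, 3, 4};
        System.arraycopy(b, 0, b, 1, 3);  // specified to handle overlap
        assert Arrays.equals(b, new int[]{1, 1, 2, 3});
    }
}
```

Here a[src+n] and a[dst+n] alias each other whenever the index ranges overlap, which is exactly the a[i]/a[j] situation above.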

> > Uh, aliasing all over the place! Actually, I think that logic
> > programming, usually based on deep unification, brings by far the worst
> > incarnation of aliasing issues to the table.
>
> Hmmm. Can you elaborate just a bit?

Logic variables immediately become aliases when you unify them.
Afterwards, binding one also binds the other - or fails, because both
were already bound, incompatibly, the other way round. Unfortunately,
unification also is a deep operation, that is, aliasing can be induced
between variables deeply nested inside terms that happen to get
unified, possibly without you being aware of either the unification or
the variable containment. To avoid accidental aliasing you essentially
must keep track of all potential unifications performed by any given
predicate (including, transitively, by its subgoals), which runs
squarely counter to all concerns of modularity and abstraction.
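To make the mechanism concrete, here is a toy model of logic variables (my own sketch, not how any real Prolog engine is implemented): unifying two unbound variables links them into an alias chain, so a later binding through either name is visible through the other.

```java
// Toy logic variable: either unbound, bound to a value, or linked to another
// variable by a previous unification. Hypothetical names, for illustration only.
class LVar {
    private LVar link;       // alias chain established by unification
    private Integer value;   // null = unbound

    LVar deref() { LVar v = this; while (v.link != null) v = v.link; return v; }

    static void unify(LVar a, LVar b) {
        a = a.deref(); b = b.deref();
        if (a == b) return;
        if (a.value == null) a.link = b;        // a becomes an alias of b
        else if (b.value == null) b.link = a;   // b becomes an alias of a
        else if (!a.value.equals(b.value))
            throw new IllegalStateException("unification fails");
    }

    static void bind(LVar v, int x) {
        v = v.deref();
        if (v.value == null) v.value = x;
        else if (v.value != x) throw new IllegalStateException("unification fails");
    }

    Integer value() { return deref().value; }
}

public class UnifyDemo {
    public static void main(String[] args) {
        LVar x = new LVar(), y = new LVar();
        LVar.unify(x, y);   // X = Y: the two variables are now aliases
        LVar.bind(x, 42);   // binding X through one name...
        System.out.println(y.value()); // ...binds Y too: prints 42
    }
}
```

The "deep" part of Andreas's point is that in a real system the `unify` call walks entire terms, so the aliased variables may be buried inside structures neither caller mentions by name.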

I've never been much of a logic programmer myself, but our language
group has a dedicated LP and CP background, and people here have
developed very strong opinions about the adequacy of unrestricted
logic variables, and particularly unification, in practice. I remember
somebody calling Prolog "the worst imperative language ever invented".

- Andreas
From: Zoltan Somogyi
Subject: Re: What is a type error?
Date: 
Message-ID: <44bb15f7$1@news.unimelb.edu.au>
Andreas Rossberg <········@ps.uni-sb.de> writes:
>Uh, aliasing all over the place! Actually, I think that logic 
>programming, usually based on deep unification, brings by far the worst 
>incarnation of aliasing issues to the table.

I agree that deep unification, as implemented in Prolog, brings with it
lots of aliasing problems. However, these problems have been recognized
since the seventies, and people have tried to solve them. A huge variety
of mode systems have been added to various dialects of Prolog over the
years, each attempting to tackle (various parts of) the aliasing
problem, none of them fully successfully in my view, since their designers
usually actually *wanted* to retain at least *some* of the expressive power
of the logic variable.

Some non-Prolog logic languages have departed from this approach, the earliest
being the Relational Language. To tie this message to another thread: the
Relational Language had a strict mode system that is about as close to the
notion of typestate in NIL and Hermes as the limitations of comparing
declarative apples with imperative oranges allow, but it came significantly
earlier: 1979 vs. 1986, IIRC.

Marshall wrote:
>I have explored the OO path to its bitter end and am
>convinced it is not the way. So what is left? Uniqueness
>types and logic programming, I suppose. I enjoy logic
>programming but it doesn't seem quite right. But notice:
>no pointers there!  And it doesn't seem to suffer from the
>lack.

Of course, the main logic programming language today that disallows the use
of unification for arbitrary aliasing is Mercury. It enforces this through
a strong mode system. Our motivation for the design of this mode system
was precisely to eliminate the problems Andreas identified above. However
it has also turned out to be an excellent foundation for Mercury's notion of
unique modes, its equivalent of uniqueness types, which Mercury uses to
express I/O.

Zoltan Somogyi <··@cs.mu.OZ.AU> http://www.cs.mu.oz.au/~zs/
Department of Computer Science and Software Engineering, Univ. of Melbourne
From: Joachim Durchholz
Subject: Re: What is a type error?
Date: 
Message-ID: <e97nn9$6g0$1@online.de>
Marshall schrieb:
> What about my example of SQL? Mutation, no pointers, no aliasing.
> Yet: useful.

Sorry, but SQL does have aliasing.

E.g. if you have records that have name="John", surname="Doe", the 
statements
   SELECT * FROM persons WHERE name = "John"
and
   SELECT * FROM persons WHERE name = "Doe"
are aliases of each other.

The alias is actually in the WHERE clause. And this *can* get you into 
trouble if you have something that does
   UPDATE ... WHERE name = "John"
and
   UPDATE ... WHERE surname = "Doe"
e.g. doing something with the Johns, then updating the names of all 
Does, and finally restoring the Johns (but not recognizing that changing 
the names of all Does might have changed your set of Johns).

Conceptually, this is just the same as having two different access paths 
to the same memory cell. Or accessing the same global variable through a 
call-by-reference parameter and via its global name.

BTW with views, you get not just aliasing but what makes aliasing really 
dangerous. Without views, you can simply survey all the queries that you 
are working with and lexically compare table and field names to see 
whether there's aliasing. With views, the names that you see in a 
lexical scope are not enough to determine aliasing.
E.g. if you use a view that builds upon the set of Johns but aren't 
aware of that (possibly due to abstraction barriers), and you change the 
name field of all Does, then you're altering the view without a chance 
to locally catch the bug. That's just as bad as with any pointer 
aliasing problem.

Regards,
Jo
From: Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1152890632.892109.61930@i42g2000cwa.googlegroups.com>
Joachim Durchholz wrote:
> Marshall schrieb:
> > What about my example of SQL? Mutation, no pointers, no aliasing.
> > Yet: useful.
>
> Sorry, but SQL does have aliasing.

Well. I suppose we do not have an agreed upon definition
of aliasing, so it is hard to evaluate either way. I would
propose using the same one you used for identity:

if there are two variables and modifying one also modifies
the other, then there is aliasing between them.

I avoided mentioning equality to include, for example,
having an array i that is an alias to a subset of array j.
Please feel free to critique this definition, and/or supply
an alternative.


> E.g. if you have records that have name="John", surname="Doe", the
> statements
>    SELECT * FROM persons WHERE name = "John"
> and
>    SELECT * FROM persons WHERE name = "Doe"
> are aliases of each other.
>
> The alias is actually in the WHERE clause.

Not by my definition, because there is only one variable here.


> And this *can* get you into
> trouble if you have something that does
>    UPDATE ... WHERE name = "John"
> and
>    UPDATE ... WHERE surname = "Doe"
> e.g. doing something with the Johns, then updating the names of all
> Does, and finally restoring the Johns (but not recognizing that changing
> the names of all Does might have changed your set of Johns).

The fact that some person might get confused about the
semantics of what they are doing does not indicate aliasing.
It is easy enough to do an analysis of your updates and
understand what will happen; this same analysis is impossible
with two arbitrary pointers, unless one has a whole-program
trace. That strikes me as a significant difference.


> Conceptually, this is just the same as having two different access paths
> to the same memory cell. Or accessing the same global variable through a
> call-by-reference parameter and via its global name.

There are similarities, but they are not the same.


> BTW with views, you get not just aliasing but what makes aliasing really
> dangerous. Without views, you can simply survey all the queries that you
> are working with and lexically compare table and field names to see
> whether there's aliasing. With views, the names that you see in a
> lexical scope are not enough to determine aliasing.
> E.g. if you use a view that builds upon the set of Johns but aren't
> aware of that (possibly due to abstraction barriers), and you change the
> name field of all Does, then you're altering the view without a chance
> to locally catch the bug. That's just as bad as with any pointer
> aliasing problem.

It is certainly aliasing, but I would not call it "just as bad." There are
elaborate declarative constraint mechanisms in place, for example.
And the definition of the view itself is a declarative, semantic
entity, whereas a pointer is an opaque, semantics-free address.


Marshall
From: Joachim Durchholz
Subject: Re: What is a type error?
Date: 
Message-ID: <e993et$jvr$1@online.de>
Marshall schrieb:
> Joachim Durchholz wrote:
>> Marshall schrieb:
>>> What about my example of SQL? Mutation, no pointers, no aliasing.
>>> Yet: useful.
>> Sorry, but SQL does have aliasing.
> 
> Well. I suppose we do not have an agreed upon definition
> of aliasing, so it is hard to evaluate either way. I would
> propose using the same one you used for identity:
> 
> if there are two variables and modifying one also modifies
> the other, then there is aliasing between them.

I think that's an adequate example.
For a definition, I'd say it's a bit too narrow unless we use a fairly 
broad definition for "variable". I.e. in a C context, I'd say that a->b 
is a variable, too, as would be foo(blah)->x.

> I avoided mentioning equality to include, for example,
> having an array i that is an alias to a subset of array j.

This would mean that there's aliasing between, say, a list of 
transaction records and the balance of the account (since modifying the 
list of transactions will change the balance, unless the object isn't 
properly encapsulated).
For purposes of this discussion, it's probably fair to say "that's a 
form of aliasing, too", even though it's quite indirect.

>> E.g. if you have records that have name="John", surname="Doe", the
>> statements
>>    SELECT * FROM persons WHERE name = "John"
>> and
>>    SELECT * FROM persons WHERE name = "Doe"
>> are aliases of each other.

Arrrrrgh... I made a most braindamaged, stupid mistake here. The second 
SELECT should have used the *surname* field, so it would be

 >>    SELECT * FROM persons WHERE surname = "Doe"

Then, if there's a record that has name = "John", surname = "Doe", the 
two WHERE clauses have aliasing in the form of overlapping result sets.

>> The alias is actually in the WHERE clause.
> 
> Not by my definition, because there is only one variable here.

Sorry, my wording was sloppy.

I meant to say that 'in SQL, you identify records via clauses like WHERE 
name = "John", i.e. WHERE clauses are a kind of identity'.

This is still not precise - the identity of an SQL record isn't 
explicitly accessible (except via the nonstandard OID facility that some SQL 
engines offer). Nevertheless, they do have an identity, and there's a 
possibility of aliasing - if you change all Johns, you may also change a 
Doe.

>> And this *can* get you into
>> trouble if you have something that does
>>    UPDATE ... WHERE name = "John"
>> and
>>    UPDATE ... WHERE surname = "Doe"
>> e.g. doing something with the Johns, then updating the names of all
>> Does, and finally restoring the Johns (but not recognizing that changing
>> the names of all Does might have changed your set of Johns).
> 
> The fact that some person might get confused about the
> semantics of what they are doing does not indicate aliasing.
> It is easy enough to do an analysis of your updates and
> understand what will happen; this same analysis is impossible
> with two arbitrary pointers, unless one has a whole-program
> trace. That strikes me as a significant difference.

Sure. I said that aliases in SQL aren't as bad as in other programs.

Once you get abstraction mixed in, the analysis becomes less 
straightforward. In the case of SQL, views are such an abstraction 
facility, and they indeed can obscure what you're doing and make the 
analysis more difficult. If it's just SQL we're talking about, you 
indeed have to look at the whole SQL to check whether there's a view 
that may be involved in the queries you're analysing, so the situation 
isn't *that* different from pointers - it's just not a problem because 
the amount of code is so tiny!

>> Conceptually, this is just the same as having two different access paths
>> to the same memory cell. Or accessing the same global variable through a
>> call-by-reference parameter and via its global name.
> 
> There are similarities, but they are not the same.

What are the relevant differences? How does the semantics of a WHERE 
clause differ from that of a pointer, in terms of potential aliasing?

My line of argument would be this:
Pointers can be simulated using arrays, so it's no surprise that arrays 
can emulate all the aliasing of pointers.
Arrays can be simulated using WHERE (with SELECT and UPDATE), so I'd say 
that SQL can emulate all the aliasing of arrays, and hence that of pointers.
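The first step of that argument, pointers simulated by arrays, is easy to make concrete (my own illustration, with hypothetical names): treat one array as the "memory" and indices as "addresses", and pointer-style aliasing reappears immediately.

```java
public class HeapSim {
    static int[] heap = new int[8];   // the "memory"; indices play the role of addresses

    static void storeThrough(int ptr, int val) { heap[ptr] = val; }
    static int loadThrough(int ptr) { return heap[ptr]; }

    public static void main(String[] args) {
        int p = 3;
        int q = p;                  // two "pointers" holding the same "address"
        storeThrough(p, 10);
        storeThrough(q, 20);        // a write through q...
        System.out.println(loadThrough(p)); // ...is visible through p: prints 20
    }
}
```

The second step is the same move one level up: a table with a key column is the array, and `WHERE key = ...` plays the role of the index, which is how WHERE-clause updates can interfere with each other just as indexed stores can.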

>> BTW with views, you get not just aliasing but what makes aliasing really
>> dangerous. Without views, you can simply survey all the queries that you
>> are working with and lexically compare table and field names to see
>> whether there's aliasing. With views, the names that you see in a
>> lexical scope are not enough to determine aliasing.
>> E.g. if you use a view that builds upon the set of Johns but aren't
>> aware of that (possibly due to abstraction barriers), and you change the
>> name field of all Does, then you're altering the view without a chance
>> to locally catch the bug. That's just as bad as with any pointer
>> aliasing problem.
> 
> It is certainly aliasing, but I would not call it "just as bad." There are
> elaborate declarative constraint mechanisms in place, for example.

Yes, but they can give you unexpected results.
It smells a bit like closing the gates after the horses are out.

On the other hand, *if* there is identity and updates, no matter how we 
handle them, there must be *some* way to deal with the problems that 
ensue. Having declarative constraints doesn't sound like the worst way 
to do that.

> And the definition of the view itself is a declarative, semantic
> entity. Whereas a pointer is an opaque, semantics-free address.

Oh, that's a very BCPL view. It was still somewhat true for K&R C, but 
it's not really applicable to ANSI C, not at all to C++, and even less 
to languages that associate well-encapsulated types with pointers (such 
as Ada, Eiffel, or Java).
Unless, of course, you mean "address" if you say "pointer" ;-)

Regards,
Jo
From: Andreas Rossberg
Subject: Re: What is a type error?
Date: 
Message-ID: <e95p87$a45id$2@hades.rz.uni-saarland.de>
Marshall wrote:
> 
> However if the mutable types are not first class, then there
> is no way to have the aliasing. Thus, if we do not have pointers
> or objects or identity but retain mutability, there is no aliasing
> problem.

Yes, technically you are right. But this makes a pretty weak notion of 
mutability. All stateful data structures had to stay within their 
lexical scope, and could never be passed to a function. For example, 
this essentially precludes object-oriented programming, because you 
could not have objects with state (the alternative, second class 
objects, would be even less "objective").

Generally, second-classness is an ad-hoc restriction that can work 
around all kinds of problems, but rarely with satisfactory results. So I 
would tend to say that this is not an overly interesting point in the 
design space. But YMMV.

>>In other words, pointers are essentially just an *aspect* of mutability
>>in lower-level languages.
> 
> Again, I disagree: it is posible to have mutability without
> pointers/identity/objects.

OK, if you prefer: it is an aspect of first-class mutability - which is 
present in almost all imperative languages I know. :-)

- Andreas

-- 
Andreas Rossberg, ········@ps.uni-sb.de
From: Darren New
Subject: Re: What is a type error?
Date: 
Message-ID: <g%utg.27635$uy3.27536@tornado.socal.rr.com>
Andreas Rossberg wrote:
> Yes, technically you are right. But this makes a pretty weak notion of 
> mutability. All stateful data structures had to stay within their 
> lexical scope, and could never be passed to a function. 

Not really. The way Hermes handles this is with destructive assignment. 
Each variable holds a value, and you (almost) cannot have multiple 
variables referring to the same value.

If you want to assign Y to X, you use
    X := Y
after which Y is considered to be uninitialized. If you want X and Y to 
have the same value, you use
    X := copy of Y
after which X and Y have the same value but are otherwise unrelated, and 
changes to one don't affect the other.

(As almost an aside, the non-scalar data structures were very similar to 
SQL data tables.)

If you declare (the equivalent of) a function, you can indicate whether 
the parameters matching the arguments are passed destructively, or are 
read-only, or are copy-in-copy-out. So you could declare a function, for 
example, that you pass a table into, and if it's marked as a read-only 
parameter, the compiler ensures the callee does not modify the table and 
the compiler generates code to pass a pointer. One could also mark a 
variable (for example) as uninitialized on entry, initialized on return, 
and uninitialized on the throw of an exception, and this could be used 
(for example) for the read-a-line-from-a-socket routine.

The only value that came close to being shared is an outgoing connection 
to a procedure; the equivalent of the client side of a socket. For 
these, you could make copies, and each copy would point to the same 
receiver. The receiving process could change over time, by passing its 
end of the socket in a message to some other process (live code 
upgrading, for example).

Since everything could be passed as part of a message, including code, 
procedures, tables, and "inports" and "outports" (the ends of sockets), 
I don't see that it had any problems with first classness.

> OK, if you prefer: it is an aspect of first-class mutability - which is 
> present in almost all imperative languages I know. :-)

I disagree. It's entirely possible to make sophisticated imperative 
languages with assignment and without aliasing.

-- 
   Darren New / San Diego, CA, USA (PST)
     This octopus isn't tasty. Too many
     tentacles, not enough chops.
From: Andreas Rossberg
Subject: Re: What is a type error?
Date: 
Message-ID: <e97n2t$a45id$3@hades.rz.uni-saarland.de>
Darren New wrote:
> Andreas Rossberg wrote:
> 
>> Yes, technically you are right. But this makes a pretty weak notion of 
>> mutability. All stateful data structures had to stay within their 
>> lexical scope, and could never be passed to a function. 
> 
> Not really. The way Hermes handles this is with destructive assignment. 
> Each variable holds a value, and you (almost) cannot have multiple 
> variables referring to the same value.

OK, this is interesting. I don't know Hermes, is this sort of like a 
dynamically checked equivalent of linear or uniqueness typing?

> If you want to assign Y to X, you use
>    X := Y
> after which Y is considered to be uninitialized. If you want X and Y to 
> have the same value, you use
>    X := copy of Y
> after which X and Y have the same value but are otherwise unrelated, and 
> changes to one don't affect the other.

Mh, but if I understand correctly, this seems to require performing a 
deep copy - which is well-known to be problematic, and particularly 
breaks all kinds of abstractions. Not to mention the issue with 
uninitialized variables that I would expect to occur all over the place. 
So unless I'm misunderstanding something, this feels like trading one 
evil for an even greater one.

- Andreas
From: Darren New
Subject: Re: What is a type error?
Date: 
Message-ID: <HoQtg.28241$uy3.27425@tornado.socal.rr.com>
Andreas Rossberg wrote:
> OK, this is interesting. I don't know Hermes, is this sort of like a 
> dynamically checked equivalent of linear or uniqueness typing?

I'm not sure what linear or uniqueness typing is. It's typestate, and if 
I remember correctly the papers I read 10 years ago, the folks at 
TJWatson that invented Hermes also invented the concept of typestate. 
They at least claim to have coined the term.

It's essentially a dataflow analysis that allows you to enforce the same 
sorts of things as "don't read variables that may not yet have been 
assigned to", except that you can annotate that variables change to 
the state "uninitialized" after they've already been initialized.

> Mh, but if I understand correctly, this seems to require performing a 
> deep copy - which is well-known to be problematic, and particularly 
> breaks all kinds of abstractions.

Um, no, because there are no aliases. There's only one name for any 
given value, so there's no "deep copy" problems. A deep copy and a 
shallow copy are the same thing. And there are types of values you can 
assign but not copy, such as callmessages (which are problematic to copy 
for the same reason a stack frame would be problematic to copy).

I believe, internally, that there were cases where the copy was "deep" 
and cases where it was "shallow", depending on the surrounding code. 
Making a copy of a table and passing it to another process had to be a 
"deep" copy (given that a column could contain another table, for 
example). Making a copy of a table and using it for read-only purposes 
in the same process would likely make a shallow copy of the table. 
Iterating over a table and making changes during the iteration made a 
copy-on-write subtable, then merged it back into the original table when 
the loop was done, since the high-level semantic definition of 
looping over a table is that you iterate over a copy of the table.

The only thing close to aliases are references to some other process's 
input ports (i.e., multiple client-side sockets connected to a 
server-side socket). If you want to share data (such as a file system or 
program library), you put the data in a table in a process, and you hand 
out client-side connections to the process. Mostly, you'd define such 
connections as accepting a data value (for the file contents) with the 
parameter being undefined upon return from the call, and the file name 
as being read-only, for example. If you wanted to store the file, you 
could just pass a pointer to its data (in the implementation). If you 
wanted a copy of it, you would either copy it and pass the pointer, or 
you'd pass the pointer with a flag indicating it's copy-on-write, or you 
could pass the pointer and have the caller copy it at some point before 
returning, depending on what the caller did with it. The semantics were 
high-level with the intent to allow the compiler lots of leeway in 
implementation, not unlike SQL.

> Not to mention the issue with 
> uninitialized variables that I would expect to occur all over the place. 

The typestate tracks this, and prevents you from using uninitialized 
variables. If you do a read (say, from a socket) and it throws an "end 
of data" exception, the compiler prevents you from using the buffer you 
just tried but failed to read.

Indeed, that's the very point of it all. By doing this, you can run 
"untrusted" code in the same address space as trusted code, and be 
assured that the compiler will prevent the untrusted code from messing 
up the trusted code. The predecessor of Hermes (NIL) was designed to let 
IBM's customers write efficient networking code and emulations and such 
that ran in IBM's routers, without the need for expensive (in 
performance or money) hardware yet with the safety that they couldn't 
screw up IBM's code and hence cause customer service problems.

> So unless I'm misunderstanding something, this feels like trading one 
> evil for an even greater one.

In truth, it was pretty annoying. But more because you wound up having 
to write extensive declarations and compile the declarations before 
compiling the code that implements them and such. That you didn't get to 
use uninitialized variables was a relatively minor thing, especially 
given that many languages nowadays complain about uninitialized 
variables, dead code, etc. But for lots of types of programs, it let you 
do all kinds of things with a good assurance that they'd work safely and 
efficiently. It was really a language for writing operating systems in, 
when you get right down to it.

-- 
   Darren New / San Diego, CA, USA (PST)
     This octopus isn't tasty. Too many
     tentacles, not enough chops.
From: Chris Smith
Subject: Re: What is a type error?
Date: 
Message-ID: <MPG.1f24867143d04542989694@news.altopia.net>
Darren New <····@san.rr.com> wrote:
> I'm not sure what linear or uniqueness typing is. It's typestate, and if 
> I remember correctly the papers I read 10 years ago, the folks at 
> TJWatson that invented Hermes also invented the concept of typestate. 
> They at least claim to have coined the term.

Coining the term is one thing, but I feel pretty confident that the idea 
was not invented in 1986 with the Hermes language, but rather far 
earlier.  Perhaps they may have invented the concept of considering it 
any different from other applications of types, though.  I still have 
trouble delineating how to consider "typestate" different from ordinary 
types in formal terms, unless I make the formal model complex enough to 
include some kind of programmer-defined identifiers such as variables.  
The distinction is not at all relevant to common type system models 
based on the lambda calculus.

While acknowledging, on the one hand, that the word "typestate" is used 
to describe this, I also object that types have *always* been assigned 
to expressions in differing type environments.  Otherwise, it would be 
impossible to assign types to lambda abstractions in the simply typed 
lambda calculus, the simplest of all generally studied type systems.  
What is being named here is the overcoming of a limitation that 
programming language designers imposed upon themselves, whether from not 
understanding the theoretical research or not believing it important, I 
don't know.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Darren New
Subject: Re: What is a type error?
Date: 
Message-ID: <vcPug.24229$Z67.18188@tornado.socal.rr.com>
Chris Smith wrote:

> Darren New <····@san.rr.com> wrote:
> 
>>I'm not sure what linear or uniqueness typing is. It's typestate, and if 
>>I remember correctly the papers I read 10 years ago, the folks at 
>>TJWatson that invented Hermes also invented the concept of typestate. 
>>They at least claim to have coined the term.
> 
> Coining the term is one thing, but I feel pretty confident that the idea 
> was not invented in 1986 with the Hermes language, but rather far 
> earlier.

Yes. However, the guys who invented Hermes didn't come up with it out of 
the blue. It was around (in NIL - Network Implementation Language) for 
years beforehand. I read papers about these things in graduate school, 
but I don't know where my photocopies are.

NIL was apparently quite successful, but a niche language, designed by 
IBM for programming IBM routers. Hermes was an attempt years later to 
take the same successful formula and turn it into a general-purpose 
programming system, which failed (I believe) for the same reason that a 
general purpose operating system that can't run C programs will fail.

> Perhaps they may have invented the concept of considering it 
> any different from other applications of types, though.  

From what I can determine, the authors seem to imply that typestate is 
dataflow analysis modified in (at least) two ways:

1) When control flow joins, the new typestate is the intersection of 
typestates coming into the join, whereas dataflow analysis doesn't 
guarantee that. (They imply they think dataflow analysis is allowed to 
say "the variable might or might not be initialized here", while 
typestate would ensure the variable is uninitialized.)

2) The user has control over the typestate, so the user can (for 
example) assert a variable is uninitialized at some point, and by doing 
so, make it so.

How this differs from theoretical lambda types and all I couldn't say.

> What is being named here is the overcoming of a limitation that 
> programming language designers imposed upon themselves, whether from not 
> understanding the theoretical research or not believing it important, I 
> don't know.

I believe there's also a certain level of common-senseness needed to 
make a language even marginally popular. :-)  While it's possible that 
there's really no difference between type and typestate at the 
theoretical level, I think most practical programmers would have trouble 
wrapping their head around that, just as programming in an entirely 
recursive pattern when one is used to looping can be disorienting.

-- 
   Darren New / San Diego, CA, USA (PST)
     This octopus isn't tasty. Too many
     tentacles, not enough chops.
From: David Hopwood
Subject: Re: What is a type error?
Date: 
Message-ID: <EnRug.7549$5B3.3215@fe2.news.blueyonder.co.uk>
Darren New wrote:
> From what I can determine, the authors seem to imply that typestate is
> dataflow analysis modified in (at least) two ways:
> 
> 1) When control flow joins, the new typestate is the intersection of
> typestates coming into the join, whereas dataflow analysis doesn't
> guarantee that. (They imply they think dataflow analysis is allowed to
> say "the variable might or might not be initialized here", while
> typestate would ensure the variable is uninitialized.)

Right, but this is a disadvantage of their typestate algorithm. It is why
the example in Figure 2 of
<http://www.cs.ubc.ca/local/reading/proceedings/spe91-95/spe/vol25/issue4/spe950wk.pdf>
fails to check, even though it "obviously" initializes all variables.

Consider the equivalent Java program:

public class LoopInitTest {
    public static void main(String[] args) {
        boolean b;
        int v;

        while (true) {
            b = true;
            if (b) {
                v = 1;
            }
            v = v + 1;
        }
    }
}

As it happens, javac rejects this:

LoopInitTest.java:12: variable v might not have been initialized
            v = v + 1;
                ^

but for a different and more trivial reason than the Hermes algorithm.
Change "if (b) { v = 1; }" to just "v = 1;", and the Java version will be
accepted by its definite assignment analysis (which is a dataflow analysis),
but the equivalent Hermes program still would not.

-- 
David Hopwood <····················@blueyonder.co.uk>
From: David Hopwood
Subject: Re: What is a type error? [correction]
Date: 
Message-ID: <4KRug.7756$5B3.7551@fe2.news.blueyonder.co.uk>
David Hopwood wrote:
> Darren New wrote:
> 
>>From what I can determine, the authors seem to imply that typestate is
>>dataflow analysis modified in (at least) two ways:
>>
>>1) When control flow joins, the new typestate is the intersection of
>>typestates coming into the join, whereas dataflow analysis doesn't
>>guarantee that. (They imply they think dataflow analysis is allowed to
>>say "the variable might or might not be initialized here", while
>>typestate would ensure the variable is uninitialized.)
> 
> Right, but this is a disadvantage of their typestate algorithm. It is why
> the example in Figure 2 of
> <http://www.cs.ubc.ca/local/reading/proceedings/spe91-95/spe/vol25/issue4/spe950wk.pdf>
> fails to check, even though it "obviously" initializes all variables.
> 
> Consider the equivalent Java program:

I mixed up Figures 1 and 2. Here is the Java program that Figure 2 should
be compared to:

public class LoopInitTest {
    public static String getString() { return "foo"; }

    public static void main(String[] args) {
        String line = getString();
        boolean is_last = false;

        while (!is_last) {
            if (line.charAt(0) == 'q') {
                is_last = true;
            }

            // insert line into inputs (not important for analysis)

            if (!is_last) {
                line = getString();
            }
        }
    }
}

which compiles without error, because is_last is definitely initialized.

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Darren New
Subject: Re: What is a type error? [correction]
Date: 
Message-ID: <QWRug.29144$uy3.1541@tornado.socal.rr.com>
David Hopwood wrote:
> 
> public class LoopInitTest {
>     public static String getString() { return "foo"; }
> 
>     public static void main(String[] args) {
>         String line = getString();
>         boolean is_last = false;
> 
>         while (!is_last) {
>             if (line.charAt(0) == 'q') {
>                 is_last = true;
>             }
> 
>             // insert line into inputs (not important for analysis)
> 
>             if (!is_last) {
>                 line = getString();
>             }
>         }
>     }
> }
> 
> which compiles without error, because is_last is definitely initialized.

At what point do you think is_last or line would seem to not be 
initialized? They're both set at the start of the function, and (given 
that it's Java) nothing can unset them.

At the start of the while loop, it's initialized. At the end of the 
while loop, it's initialized. So the merge point of the while loop has 
it marked as initialized.

Now, if the "insert line into inputs" actually unset "line", then yes, 
you're right, Hermes would complain about this.

Alternately, if you say
    if (x) v = 1;
    if (x) v += 1;
then Hermes would complain when it wouldn't need to. However, that's 
more a limitation of the typestate checking algorithms than the concept 
itself; that is to say, clearly the typestate checker could be made 
sufficiently intelligent to track most simple versions of this problem 
and not complain, by carrying around conditionals in the typestate 
description.
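Java's definite-assignment analysis has the same limitation as the Hermes checker here: it does not track that both ifs test the same condition, so javac rejects the two-ifs pattern. A minimal sketch (class and method names are illustrative) of the rejected pattern and a restructuring that satisfies the checker:

```java
public class DefiniteAssignment {
    static int demo(boolean x) {
        int v;
        // Rejected by javac for the same reason Hermes would complain:
        //   if (x) v = 1;
        //   if (x) v += 1;  // error: variable v might not have been initialized
        // The analysis does not correlate the two tests of x.
        // Merging them makes the flow explicit and satisfies the checker:
        if (x) {
            v = 1;
            v += 1;
        } else {
            v = 0;
        }
        return v;
    }
}
```

As with typestate, a checker that carried the conditional in its analysis state could accept both versions.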

-- 
   Darren New / San Diego, CA, USA (PST)
     This octopus isn't tasty. Too many
     tentacles, not enough chops.
From: Darren New
Subject: Re: What is a type error? [correction]
Date: 
Message-ID: <ggSug.29146$uy3.2489@tornado.socal.rr.com>
Darren New wrote:
> Now, if the "insert line into inputs" actually unset "line", then yes, 
> you're right, Hermes would complain about this.

Oh, I see. You translated from Hermes into Java, and Java doesn't have 
the "insert into" statement. Indeed, the line you commented out is 
*exactly* what's important for analysis, as it unsets line.

Had it been
   insert copy of line into inputs
then you would not have gotten any complaint from Hermes, as it would 
not have unset line there.

In this case, it's equivalent to
if (!is_last) line = getString();
if (!is_last) use line for something...
except the second test is at the top of the loop instead of the bottom.

-- 
   Darren New / San Diego, CA, USA (PST)
     This octopus isn't tasty. Too many
     tentacles, not enough chops.
From: David Hopwood
Subject: Re: What is a type error? [correction]
Date: 
Message-ID: <aLXug.11420$5B3.4643@fe2.news.blueyonder.co.uk>
Darren New wrote:
> David Hopwood wrote:
> 
>> public class LoopInitTest {
>>     public static String getString() { return "foo"; }
>>
>>     public static void main(String[] args) {
>>         String line = getString();
>>         boolean is_last = false;
>>
>>         while (!is_last) {
>>             if (line.charAt(0) == 'q') {
>>                 is_last = true;
>>             }
>>
>>             // insert line into inputs (not important for analysis)
>>
>>             if (!is_last) {
>>                 line = getString();
>>             }
>>         }
>>     }
>> }
>>
>> which compiles without error, because is_last is definitely initialized.
> 
> At what point do you think is_last or line would seem to not be
> initialized? They're both set at the start of the function, and (given
> that it's Java) nothing can unset them.
> 
> At the start of the while loop, it's initialized. At the end of the
> while loop, it's initialized. So the merge point of the while loop has
> it marked as initialized.

Apparently, Hermes (at least the version of it described in that paper)
essentially forgets that is_last has been initialized at the top of the
loop, and so when it does the merge, it is merging 'not necessarily initialized'
with 'initialized'.

This sounds like a pretty easy thing to fix to me (and maybe it was fixed
later, since there are other papers on Hermes' typestate checking that I
haven't read yet).

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Darren New
Subject: Re: What is a type error? [correction]
Date: 
Message-ID: <OM7vg.29227$uy3.13575@tornado.socal.rr.com>
David Hopwood wrote:
> Darren New wrote:
> 
>>David Hopwood wrote:
>>
>>
>>>public class LoopInitTest {
>>>    public static String getString() { return "foo"; }
>>>
>>>    public static void main(String[] args) {
>>>        String line = getString();
>>>        boolean is_last = false;
>>>
>>>        while (!is_last) {
>>>            if (line.charAt(0) == 'q') {
>>>                is_last = true;
>>>            }
>>>
>>>            // insert line into inputs (not important for analysis)
>>>
>>>            if (!is_last) {
>>>                line = getString();
>>>            }
>>>        }
>>>    }
>>>}
>>>
>>>which compiles without error, because is_last is definitely initialized.
>>
>>At what point do you think is_last or line would seem to not be
>>initialized? They're both set at the start of the function, and (given
>>that it's Java) nothing can unset them.
>>
>>At the start of the while loop, it's initialized. At the end of the
>>while loop, it's initialized. So the merge point of the while loop has
>>it marked as initialized.
> 
> 
> Apparently, Hermes (at least the version of it described in that paper)
> essentially forgets that is_last has been initialized at the top of the
> loop, and so when it does the merge, it is merging 'not necessarily initialized'
> with 'initialized'.


No, it's not that it "forgets". It's that the "insert line into inputs" 
uninitializes "line". Hence, "line" is only conditionally set at the 
bottom of the loop, so it's only conditionally set at the top of the loop.

> This sounds like a pretty easy thing to fix to me (and maybe it was fixed
> later, since there are other papers on Hermes' typestate checking that I
> haven't read yet).

You simply misunderstand the "insert line into inputs" semantics. Had 
that line actually been commented out in the Hermes code, the loop would 
have compiled without a problem.

That said, in my experience, finding this sort of typestate error 
invariably led me to writing clearer code.

boolean found_ending = false;
while (!found_ending) {
   string line = getString();
   if (line.charAt(0) == 'q')
     found_ending = true;
   else
     insert line into inputs;
}

It seems that's much clearer to me than the contorted version presented 
as the example. If you want it to work like the Java code, where you can 
still access the "line" variable after the loop, the rearrangement is 
trivial and transparent as well.
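For comparison, a hedged Java sketch of the rearranged loop: getString() is simulated with a fixed array, and a List stands in for the Hermes "inputs" relation, so the names and data here are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class LoopRewrite {
    // Simulated input source standing in for the original getString().
    private static final String[] DATA = { "alpha", "beta", "quit" };
    private static int pos;

    static String getString() { return DATA[pos++]; }

    // The rearranged loop: "line" is unconditionally assigned at the top
    // of every iteration, so no path ever reads it unset.
    static List<String> collect() {
        pos = 0;
        List<String> inputs = new ArrayList<>();
        boolean found_ending = false;
        while (!found_ending) {
            String line = getString();
            if (line.charAt(0) == 'q')
                found_ending = true;
            else
                inputs.add(line);  // stands in for "insert copy of line into inputs"
        }
        return inputs;
    }
}
```

Hoisting the declaration of "line" out of the loop gives the variant where it remains accessible after the loop, without changing what the checker sees inside it.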

-- 
   Darren New / San Diego, CA, USA (PST)
     This octopus isn't tasty. Too many
     tentacles, not enough chops.
From: David Hopwood
Subject: Re: What is a type error? [correction]
Date: 
Message-ID: <kJ8vg.2656$q97.1718@fe3.news.blueyonder.co.uk>
Darren New wrote:
> David Hopwood wrote:
> 
[...]
>> Apparently, Hermes (at least the version of it described in that paper)
>> essentially forgets that is_last has been initialized at the top of the
>> loop, and so when it does the merge, it is merging 'not necessarily
>> initialized' with 'initialized'.
> 
> No, it's not that it "forgets". It's that the "insert line into inputs"
> uninitializes "line". Hence, "line" is only conditionally set at the
> bottom of the loop, so it's only conditionally set at the top of the loop.
> 
>> This sounds like a pretty easy thing to fix to me (and maybe it was fixed
>> later, since there are other papers on Hermes' typestate checking that I
>> haven't read yet).
> 
> You simply misunderstand the "insert line into inputs" semantics.

Yes, you're right, I did misunderstand this.

> Had that line actually been commented out in the Hermes code, the loop would
> have compiled without a problem.
> 
> That said, in my experience, finding this sort of typestate error
> invariably led me to writing clearer code.
> 
> boolean found_ending = false;
> while (!found_ending) {
>   string line = getString();
>   if (line.charAt(0) == 'q')
>     found_ending = true;
>   else
>     insert line into inputs;
> }
> 
> It seems that's much clearer to me than the contorted version presented
> as the example. If you want it to work like the Java code, where you can
> still access the "line" variable after the loop, the rearrangement is
> trivial and transparent as well.

I agree.

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Joachim Durchholz
Subject: Re: What is a type error?
Date: 
Message-ID: <e95vp9$hln$1@online.de>
Marshall schrieb:
> Mutability by itself does not imply identity.

Well, the implication certainly holds from identity to mutability.
The only definition of identity that I found to hold up for all kinds of 
references (pointers, shared-memory identifiers, URLs etc.) is this:

Two pieces of data are identical if and only if:
a) they are equal
b) they stay equal after applying an arbitrary operation to one of them.

This means that for immutable objects, there's no observable difference 
between equality and identity (which I think is just fine).
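Part (b) of this definition is what separates identity from mere equality. A minimal Java sketch (the array contents are arbitrary):

```java
import java.util.Arrays;

public class IdentityVsEquality {
    // Returns true iff b is still equal to a after mutating a:
    // by the definition above, that is what identity means.
    static boolean identicalUnderMutation(int[] a, int[] b) {
        a[0] = 99;                    // an arbitrary operation on one of them
        return Arrays.equals(a, b);
    }

    public static void main(String[] args) {
        int[] x = { 1, 2, 3 };
        System.out.println(identicalUnderMutation(x, x));         // alias: true
        int[] y = { 1, 2, 3 };
        System.out.println(identicalUnderMutation(y, y.clone())); // copy: false
    }
}
```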


For the implicaton from mutability to identity, I'm not sure whether 
talking about mutation still makes sense without some kind of identity. 
For example, you need to establish that the object after the mutation is 
still "the same" in some sense, and this "the same" concept is exactly 
identity.

 > I agree that mutability
> plus identity implies aliasing problems, however.

Then we're agreeing about the most important point anyway.

>> In other words, pointers are essentially just an *aspect* of mutability
>> in lower-level languages.
> 
> Again, I disagree: it is possible to have mutability without
> pointers/identity/objects.

I'm sceptical.
Any examples?

Regards,
Jo
From: Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1152820077.044047.228350@p79g2000cwp.googlegroups.com>
Joachim Durchholz wrote:
> Marshall schrieb:
> > Mutability by itself does not imply identity.
>
> Well, the implication certainly holds from identity to mutability.
> The only definition of identity that I found to hold up for all kinds of
> references (pointers, shared-memory identifiers, URLs etc.) is this:
>
> Two pieces of data are identical if and only if:
> a) they are equal
> b) they stay equal after applying an arbitrary operation to one of them.
>
> This means that for immutable objects, there's no observable difference
> between equality and identity (which I think is just fine).

Agreed on all counts.


> For the implicaton from mutability to identity, I'm not sure whether
> talking about mutation still makes sense without some kind of identity.
> For example, you need to establish that the object after the mutation is
> still "the same" in some sense, and this "the same" concept is exactly
> identity.

Unless we have some specific mechanism to make two named variables
have the same identity (distinct from having the same value), there
is no aliasing. Pointers or references are one such mechanism; lexical
closures over variables are another. (I don't know of any others.)
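A hedged Java illustration of the closure case: Java lambdas cannot capture mutable locals directly, so an AtomicInteger holder stands in for the closed-over variable (the names are illustrative).

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntSupplier;

public class ClosureAlias {
    // The returned lambda closes over 'cell': afterwards two names
    // refer to the same mutable state, i.e. they alias it.
    static IntSupplier capture(AtomicInteger cell) {
        return cell::get;
    }

    public static void main(String[] args) {
        AtomicInteger counter = new AtomicInteger(0);
        IntSupplier read = capture(counter);
        counter.set(42);                      // mutate through one name...
        System.out.println(read.getAsInt());  // ...observe through the other: 42
    }
}
```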


>  > I agree that mutability
> > plus identity implies aliasing problems, however.
>
> Then we're agreeing about the most important point anyway.

Yes.


> >> In other words, pointers are essentially just an *aspect* of mutability
> >> in lower-level languages.
> >
> > Again, I disagree: it is possible to have mutability without
> > pointers/identity/objects.
> 
> I'm sceptical.
> Any examples?

See next post.


Marshall
From: Joachim Durchholz
Subject: Re: What is a type error?
Date: 
Message-ID: <e9581q$6od$1@online.de>
Marshall schrieb:
> Joachim Durchholz wrote:
>> Marshall schrieb:
>>> Joachim Durchholz wrote:
>>>> Marshall schrieb:
>>>>> I can see the lack of a formal model being an issue, but is the
>>>>> imperative bit really all that much of an obstacle? How hard
>>>>> is it really to deal with assignment? Or does the issue have
>>>>> more to do with pointers, aliasing, etc.?
>>>> Actually aliasing is *the* hard issue.
>>> Okay, sure. Nice explanation.
>>>
>>> But one minor point: you describe this as an issue with "imperative"
>>> languages. But aliasing is a problem associated with pointers,
>>> not with assignment.
>> Aliasing is not a problem if the aliased data is immutable.
> 
> Okay, sure. But for the problem you describe, both imperativeness
> and the presence of pointers is each necessary but not sufficient;
> it is the two together that causes the problem. So it strikes
> me (again, a very minor point) as inaccurate to describe this as
> a problem with imperative languages per se.

Sure.
It's just that I know that it's viable to give up destructive updates. 
Giving up pointers is a far more massive restriction.

> Right. To me the response to this clear: give up pointers. Imperative
> operations are too useful to give up; indeed they are a requirement
> for certain problems.

I don't know any.
In some cases, you need an additional level of conceptual indirection - 
instead of *doing* the updates, you write a function that *describes* them.

 > Pointers on the other hand add nothing except
> efficiency and a lot of confusion. They should be considered an
> implementation technique only, hidden behind some pointerless
> computational model.
> 
> I recognize that this is likely to be a controversial opinion.

Indeed.

In fact "modes" are a way to restrict pointer aliasing.

> I heartily support immutability as the default, for these and other
> reasons.

OK, then we're in agreement here.

>> Some functional languages restrict assignment so that there can exist at
>> most a single reference to any mutable data structure. That way, there's
>> still no aliasing problems, but you can still update in place where it's
>> really, really necessary.
> 
> Are we speaking of uniqueness types now? I haven't read much about
> them, but it certainly seems like an intriguing concept.

Yup.
It's called "modes" in some other languages (Mercury or Clean IIRC).

>> I know of no professional language that doesn't have references of some
>> kind.
> 
> Ah, well. I suppose I could mention prolog or mercury, but then
> you used that troublesome word "professional." So I will instead
> mention a language which, if one goes by number of search results
> on hotjobs.com for "xxx programmer" for different values of xxx, is
> more popular than Java and twice as popular as C++. It lacks
> pointers (although I understand they are beginning to creep in
> in the latest version of the standard.) It also possesses a quite
> sophisticated set of facilities for declarative integrity constraints.
> Yet for some reason it is typically ignored by language designers.
> 
> http://hotjobs.yahoo.com/jobseeker/jobsearch/search_results.html?keywords_all=sql+programmer

Oh, right. SQL is an interesting case of getting all the problems of 
pointers without having them ;-)

Actually SQL has references - they are called "primary keys", but they 
are references nevertheless. (Some SQL dialects also offer synthetic 
"ID" fields that are guaranteed to remain stable over the lifetime of a 
record. Seems like SQL is imperative enough that programmers want this, 
else the SQL vendors wouldn't have added the feature...)
SQL also has updates.
The result: updates with undefined semantics. E.g. if you have a numeric 
key field, UPDATE commands that increment the key by 1 will fail or work 
depending on the (unspecified) order in which UPDATE touches the 
records. You can have even more fun with updatable views.
With a "repeatable read" isolation level, you actually return to a 
declarative view of the database: whatever you do with it, you won't see 
it until you commit the transaction. (As soon as you commit, the 
declarative peace is over and you better watch out that your data 
doesn't become inconsistent due to aliasing.)


Aliasing isn't really related to specific programming practices. If two 
accountants chat, and one talks about the hot blonde secretary and the 
other about his adorable wife, you can imagine the awkwardness that 
ensues as soon as they find out they're talking about the same person!
The only thing that can really be done about it is not adding it 
artificially into a program. In those cases where aliasing is part of 
the modelled domain, you really have to carefully inspect all 
interactions and never, never, never dream about abstracting it away.


Regards,
Jo
From: Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1152805549.184625.117520@s13g2000cwa.googlegroups.com>
Joachim Durchholz wrote:
> Marshall schrieb:
> > Joachim Durchholz wrote:
> >> Marshall schrieb:
> >>> Joachim Durchholz wrote:
> >>>> Marshall schrieb:
> >>>>> I can see the lack of a formal model being an issue, but is the
> >>>>> imperative bit really all that much of an obstacle? How hard
> >>>>> is it really to deal with assignment? Or does the issue have
> >>>>> more to do with pointers, aliasing, etc.?
> >>>> Actually aliasing is *the* hard issue.
> >>> Okay, sure. Nice explanation.
> >>>
> >>> But one minor point: you describe this as an issue with "imperative"
> >>> languages. But aliasing is a problem associated with pointers,
> >>> not with assignment.
> >> Aliasing is not a problem if the aliased data is immutable.
> >
> > Okay, sure. But for the problem you describe, both imperativeness
> > and the presence of pointers is each necessary but not sufficient;
> > it is the two together that causes the problem. So it strikes
> > me (again, a very minor point) as inaccurate to describe this as
> > a problem with imperative languages per se.
>
> Sure.
> It's just that I know that it's viable to give up destructive updates.
> Giving up pointers is a far more massive restriction.

Oddly, I feel the opposite. While it's true there are many domains
for which purely functional programming is a fine choice, there
are some domains for which it is insufficient. Any kind of data
management, for example, requires that you be able to update
the information.

On the other hand, there is no problem domain for which pointers
are a requirement. I agree they are deucedly convenient, though.


> > Right. To me the response to this clear: give up pointers. Imperative
> > operations are too useful to give up; indeed they are a requirement
> > for certain problems.
>
> I don't know any.
> In some cases, you need an additional level of conceptual indirection -
> instead of *doing* the updates, you write a function that *describes* them.

But then what do you do with that function? Let's say I have an
employee database. John Smith just got hired on 1/1/2006 with
a salary of $10,000. I need to record this fact somewhere. How
do I do that without variables? Current-employees is a variable.
Even if I have the space to keep all historical data, so I'm not
deleting anything, I still have to have a variable for the latest
version of the accumulated data. I can solve this without
pointers, but I can't solve it without variables.


>  > Pointers on the other hand add nothing except
> > efficiency and a lot of confusion. They should be considered an
> > implementation technique only, hidden behind some pointerless
> > computational model.
> >
> > I recognize that this is likely to be a controversial opinion.
>
> Indeed.
>
> In fact "modes" are a way to restrict pointer aliasing.

I should like to learn more about these. I have some vague
perception of the existence of linear logic, but not much
else. However, I also already have an excellent solution
to the pointer aliasing problem, so I'm less motivated.


> > I heartily support immutability as the default, for these and other
> > reasons.
>
> OK, then we're in agreement here.
>
> >> Some functional languages restrict assignment so that there can exist at
> >> most a single reference to any mutable data structure. That way, there's
> >> still no aliasing problems, but you can still update in place where it's
> >> really, really necessary.
> >
> > Are we speaking of uniqueness types now? I haven't read much about
> > them, but it certainly seems like an intriguing concept.
>
> Yup.
> It's called "modes" in some other languages (Mercury or Clean IIRC).

Cool.


> >> I know of no professional language that doesn't have references of some
> >> kind.
> >
> > Ah, well. I suppose I could mention prolog or mercury, but then
> > you used that troublesome word "professional." So I will instead
> > mention a language which, if one goes by number of search results
> > on hotjobs.com for "xxx programmer" for different values of xxx, is
> > more popular than Java and twice as popular as C++. It lacks
> > pointers (although I understand they are beginning to creep in
> > in the latest version of the standard.) It also possesses a quite
> > sophisticated set of facilities for declarative integrity constraints.
> > Yet for some reason it is typically ignored by language designers.
> >
> > http://hotjobs.yahoo.com/jobseeker/jobsearch/search_results.html?keywords_all=sql+programmer
>
> Oh, right. SQL is an interesting case of getting all the problems of
> pointers without having them ;-)

Oh, pooh. SQL has plenty of problems, sure, but the problems
of pointers are not among its faults.


> Actually SQL has references - they are called "primary keys", but they
> are references nevertheless.

I strongly object; this is quite incorrect. I grant you that from the
50,000 foot level they appear identical, but they are not. To
qualify as a reference, there need to be reference and dereference
operations on the reference datatype; there are no such operations
in SQL.

Would you say the relational algebra has references?

Or, consider the classic prolog ancestor query. Let's say we're
setting up as follows

father(bob, joe).
father(joe, john).

Is "joe" a reference, here? After all, when we do the ancestor
query for john, we'll see his father is joe and then use that to
find joe's father. Keys in SQL are isomorphic to joe in the
above prolog.


> (Some SQL dialects also offer synthetic
> "ID" fields that are guaranteed to remain stable over the lifetime of a
> record.

Primary keys are updatable; there is nothing special about them.


> Seems like SQL is imperative enough that programmers want this,
> else the SQL vendors wouldn't have added the feature...)

I believe you are making a statement about the general level of
education among the user base of data base management systems,
and not a statement about the nature of the relational algebra.


> SQL also has updates.

Yes; SQL is imperative. But no pointers and thus no aliasing.
Plenty of *other* problems, though; no argument there!


> The result: updates with undefined semantics. E.g. if you have a numeric
> key field, UPDATE commands that increment the key by 1 will fail or work
> depending on the (unspecified) order in which UPDATE touches the
> records.

This does not sound correct to me, and in any event does not
appear to illustrate anything about aliasing.


> You can have even more fun with updatable views.

I suppose. Views are something for which the practice has
rushed ahead of the theoretical foundation out of need. The
"right" way to do views is not yet known. Again: plenty of
problems with SQL, but no aliasing. (Actually, there probably
is aliasing with SQL99, since IIUC they've gone ahead and
introduced reference types. (Cue Charlton Heston on the beach
saying "You Maniacs! You blew it up! Ah, damn you!"))


> With a "repeatable read" isolation level, you actually return to a
> declarative view of the database: whatever you do with it, you won't see
> it until you commit the transaction. (As soon as you commit, the
> declarative peace is over and you better watch out that your data
> doesn't become inconsistent due to aliasing.)

Alas, transaction isolation levels are a performance hack.
I cannot defend them on any logical basis. (Aside: did you mean
serializable, rather than repeatable read?)


> Aliasing isn't really related to specific programming practices. If two
> accountants chat, and one talks about the hot blonde secretary and the
> other about his adorable wife, you can imagine the awkwardness that
> ensues as soon as they find out they're talking about the same person!

Heh, that's not aliasing. That's the undecidability of intensional
function equivalence. <joke>


> The only thing that can really be done about it is not adding it
> artificially into a program. In those cases where aliasing is part of
> the modelled domain, you really have to carefully inspect all
> interactions and never, never, never dream about abstracting it away.

Yes, aliasing introduces a lot of problems. This is one reason
why closures make me nervous.


Marshall
From: Rob Warnock
Subject: Re: What is a type error?
Date: 
Message-ID: <BJ-dnZPb69C_SivZnZ2dnUVZ_oydnZ2d@speakeasy.net>
Marshall <···············@gmail.com> wrote:
+---------------
| Joachim Durchholz wrote:
| > Actually SQL has references - they are called "primary keys", but they
| > are references nevertheless.
| 
| I strongly object; this is quite incorrect. I grant you that from the
| 50,000 foot level they appear identical, but they are not.
+---------------

Agreed. The only thing different about "primary" keys from any other
key is uniqueness -- a selection by primary key will return only one
record. Other than that constraint, many databases treat them exactly
the same as non-primary keys [e.g., can form indexes on them, etc.].

+---------------
| To qualify as a reference, there need to be reference and dereference
| operations on the reference datatype; there are no such operations in SQL.
+---------------

Not in "ANSI SQL92", say, but there might be in most SQL databases!
[See below re OIDs. Also, SQL:1999 had a "REF" type that was essentially
an OID.]

+---------------
| Would you say the relational algebra has references?
+---------------

Don't confuse "SQL" & "relational algebra"!! You'll get real
relational algebraists *way* bent out of shape if you do that!

+---------------
| > (Some SQL dialects also offer synthetic "ID" fields that are
| > guaranteed to remain stable over the lifetime of a record.
| 
| Primary keys are updatable; there is nothing special about them.
+---------------

I think he's probably talking about "OIDs" (object IDs). Most
current SQL-based databases provide them, usually as a normally-
invisible "system column" that doesn't show up when you say
"SELECT * FROM", but that *does* appear if you say "SELECT oid, *",
and may be used as a "primary" key even on tables with no actual
primary key:

    rpw3=# select * from toy limit 4;
       c1   |  c2   |               c3               | upd 
    --------+-------+--------------------------------+-----
     fall   | tape  | My Favorite Thanksgiving       |  16
     xmas   | book  | My Favorite Christmas          |   2
     xmas   | video | The Grinch who Stole Christmas |   4
     summer | book  | Unusual 4ths of July           |  17
    (4 rows)

    rpw3=# select oid, * from toy limit 4;
      oid  |   c1   |  c2   |               c3               | upd 
    -------+--------+-------+--------------------------------+-----
     19997 | fall   | tape  | My Favorite Thanksgiving       |  16
     19998 | xmas   | book  | My Favorite Christmas          |   2
     19999 | xmas   | video | The Grinch who Stole Christmas |   4
     20000 | summer | book  | Unusual 4ths of July           |  17
    (4 rows)

    rpw3=# select * from toy where oid = 19998;
      c1  |  c2  |          c3           | upd 
    ------+------+-----------------------+-----
     xmas | book | My Favorite Christmas |   2
    (1 row)

    rpw3=# insert into toy values ('fall','book','Glory Road');
    INSERT 32785 1

    rpw3=# select oid, * from toy where oid = 32785;
      oid  |  c1  |  c2  |     c3     | upd 
    -------+------+------+------------+-----
     32785 | fall | book | Glory Road |  21
     (1 row)

    rpw3=# 

See <http://www.postgresql.org/docs/8.1/static/datatype-oid.html>
for how PostgreSQL treats OIDs [including some critical limitations].


-Rob

-----
Rob Warnock			<····@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607
From: George Neuner
Subject: Re: What is a type error?
Date: 
Message-ID: <0ideb2pm854tbmaqf30fnnr7e8jtd9hi0k@4ax.com>
On 13 Jul 2006 08:45:49 -0700, "Marshall" <···············@gmail.com>
wrote:
>
>On the other hand, there is no problem domain for which pointers
>are a requirement. I agree they are deucedly convenient, though.
>

I would argue that pointers/references _are_ a requirement for I/O.  I
know of no workable method for interpreting raw bits as meaningful
data other than to overlay a typed template upon them.
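In a type-safe setting, that "typed template" can be a checked view rather than raw address arithmetic. A hedged Java sketch using ByteBuffer (the wire format here is made up for illustration):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class RawBits {
    // Interpret raw bytes by overlaying a typed view on them - the kind
    // of operation I/O ultimately requires - without exposing addresses.
    static int readLittleEndianInt(byte[] raw) {
        return ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN).getInt();
    }

    public static void main(String[] args) {
        byte[] wire = { 0x2A, 0x00, 0x00, 0x00 };      // 42, little-endian
        System.out.println(readLittleEndianInt(wire)); // 42
    }
}
```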

Categorically disallowing address manipulation functionally cripples
the language because an important class of programs (system programs)
cannot be written.

Of course, languages can go overboard the other way too.  IMO, C did
not need to provide address arithmetic at the language level,
reinterpretable references and array indexing would have sufficed for
any use.  Modula 3's type safe view is an example of getting it right.

It is quite reasonable to say "I don't write _____ so I don't need
[whatever language feature enables writing it]".  It is important,
however, to be aware of the limitation and make your choice
deliberately.


George
--
for email reply remove "/" from address
From: Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1152864196.285749.113580@m73g2000cwd.googlegroups.com>
George Neuner wrote:
> On 13 Jul 2006 08:45:49 -0700, "Marshall" <···············@gmail.com>
> wrote:
> >
> >On the other hand, there is no problem domain for which pointers
> >are a requirement. I agree they are deucedly convenient, though.
> >
>
> I would argue that pointers/references _are_ a requirement for I/O.  I
> know of no workable method for interpreting raw bits as meaningful
> data other than to overlay a typed template upon them.

I think I know what you mean. I agree that pointers are necessary
for, e.g., device drivers. So I have to weaken my earlier statement.


> Categorically disallowing address manipulation functionally cripples
> the language because an important class of programs (system programs)
> cannot be written.

That's fair, although I could argue how important systems programming
is these days. (And C/C++ are cock-of-the-walk there anyway.)


> Of course, languages can go overboard the other way too.  IMO, C did
> not need to provide address arithmetic at the language level,
> reinterpretable references and array indexing would have sufficed for
> any use.  Modula 3's type safe view is an example of getting it right.
>
> It is quite reasonable to say "I don't write _____ so I don't need
> [whatever language feature enables writing it]".  It is important,
> however, to be aware of the limitation and make your choice
> deliberately.

Agreed.


Marshall
From: Joachim Durchholz
Subject: Re: What is a type error?
Date: 
Message-ID: <e97usd$ipd$1@online.de>
Marshall schrieb:
> Joachim Durchholz wrote:
>> It's just that I know that it's viable to give up destructive updates.
>> Giving up pointers is a far more massive restriction.
> 
> Oddly, I feel the opposite. While it's true there are many domains
> for which purely functional programming is a fine choice, there
> are some domains for which it is insufficient. Any kind of data
> managament, for example, requires that you be able to update
> the information.

Sure, the data itself cannot be managed non-destructively.

However, the programs that operate on the data need not do destructive 
updates internally. That way, I can have pointers inside the programs 
without giving up aliasing.
(Aliasing is really, really, really important. It's the reason why 
mathematicians can handle infinite objects - they copy just the alias, 
i.e. the name or description, instead of the object itself. It's one of 
the things that make abstraction useful.)

>>> Right. To me the response to this clear: give up pointers. Imperative
>>> operations are too useful to give up; indeed they are a requirement
>>> for certain problems.
>> I don't know any.
>> In some cases, you need an additional level of conceptual indirection -
>> instead of *doing* the updates, you write a function that *describes* them.
> 
> But then what do you do with that function?

I pass it to an engine that's imperative.

However, that engine doesn't need to do aliasing anymore.

In other words, I'm separating mutability and aliasing, so that they 
can't combine and explode.
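
Jo's "separate mutability from aliasing" point can be illustrated with a
minimal Python sketch (mine, not from the thread): aliasing a mutable
object lets an update leak across names, while aliasing immutable data
is harmless.

```python
# Aliasing + mutability: two names for one mutable object, so an
# update through one name is observable through the other.
xs = [1, 2, 3]
ys = xs                      # ys aliases xs
ys.append(4)
assert xs == [1, 2, 3, 4]    # xs changed through the alias

# Aliasing without mutability: sharing the name/description of an
# immutable value (the mathematician's trick) is always safe.
t = (1, 2, 3)
u = t                        # u aliases t, but tuples can't be mutated
assert u == t == (1, 2, 3)
```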

 > Let's say I have an
> employee database. John Smith just got hired on 1/1/2006 with
> a salary of $10,000. I need to record this fact somewhere. How
> do I do that without variables? Current-employees is a variable.
> Even if I have the space to keep all historical data, so I'm not
> deleting anything, I still have to have a variable for the latest
> version of the accumulated data. I can solve this without
> pointers, but I can't solve it without variables.

As I said elsewhere, the record has an identity even though it isn't 
explicit in SQL.
(SQL's data manipulation language is generally not considered half as 
"clean" as the query language; I think the core of the problem is that 
DML combines mutability and identity.)

> I should like to learn more about these. I have some vague
> perception of the existence of linear logic, but not much
> else. However, I also already have an excellent solution
> to the pointer aliasing problem, so I'm less motivated.

Pointers are just the most obvious form of aliasing. As I said 
elsewhere, there are others: array indexing, by-reference parameters, or 
even WHERE clauses in SQL statements. In fact, anything that selects an 
updatable piece of data from a collection is an alias, and incurs the 
same problems as pointers, though they may come in different disguises.

>> Actually SQL has references - they are called "primary keys", but they
>> are references nevertheless.
> 
> I strongly object; this is quite incorrect. I grant you that from the
> 50,000 foot level they appear identical, but they are not. To
> qualify as a reference, there need to be reference and dereference
> operations on the reference datatype; there is no such operation
> in SQL.
> 
> Would you say the relational algebra has references?

Yes.

> Or, consider the classic prolog ancestor query. Let's say we're
> setting up as follows
> 
> father(bob, joe).
> father(joe, john).
> 
> Is "joe" a reference, here? After all, when we do the ancestor
> query for john, we'll see his father is joe and then use that to
> find joe's father. Keys in SQL are isomorphic to joe in the
> above prolog.

Agreed. They are both references ;-P
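
The Prolog example can be sketched in Python (a paraphrase, not from the
thread): the atom "joe" is plain data, yet the query uses it exactly the
way a key or reference is used, by matching on the value to reach the
next fact.

```python
# father(bob, joe). father(joe, john). as a Python relation.
father = {("bob", "joe"), ("joe", "john")}

def ancestors(person):
    """All ancestors of `person`, following father/2 transitively."""
    result = set()
    for parent, child in father:
        if child == person:
            result.add(parent)          # "joe" acts as a reference here:
            result |= ancestors(parent) # we dereference it by lookup
    return result

assert ancestors("john") == {"joe", "bob"}
```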

>> (Some SQL dialects also offer synthetic
>> "ID" fields that are guaranteed to remain stable over the lifetime of a
>> record.
> 
> Primary keys are updatable; there is nothing special about them.

Right - I was wrong with identifying keys and identities. In fact the 
identity of an SQL record cannot be directly obtained (unless via those 
ID fields).
The records still have identities. It's possible to have two WHERE 
clauses that refer to the same record, and if you update the record 
using one WHERE clause, the record returned by the other WHERE clause 
will have changed, too.
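
Jo's two-WHERE-clauses observation, sketched with SQLite (the table and
column names are made up for illustration): two unrelated predicates can
select the same physical record, so an update through one is visible
through the other.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE emp (name TEXT, dept TEXT, salary INT)")
con.execute("INSERT INTO emp VALUES ('John Smith', 'Sales', 10000)")

# Update via one WHERE clause...
con.execute("UPDATE emp SET salary = 12000 WHERE name = 'John Smith'")

# ...and the change is visible via a *different* WHERE clause that
# happens to select the same record.
row = con.execute("SELECT salary FROM emp WHERE dept = 'Sales'").fetchone()
assert row == (12000,)
```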

>> With a "repeatable read" isolation level, you actually return to a
>> declarative view of the database: whatever you do with it, you won't see
>> it until you commit the transaction. (As soon as you commit, the
>> declarative peace is over and you better watch out that your data
>> doesn't become inconsistent due to aliasing.)
> 
> Alas, transaction isolation levels are a performance hack.
> I cannot defend them on any logical basis. (Aside: did you mean
> serializable, rather than repeatable read?)

Possibly. There are so many isolation levels that I have to look them up 
whenever I want to get the terminology 100% correct.

>> The only thing that can really be done about it is not adding it
>> artificially into a program. In those cases where aliasing is part of
>> the modelled domain, you really have to carefully inspect all
>> interactions and never, never, never dream about abstracting it away.
> 
> Yes, aliasing introduces a lot of problems. This is one reason
> why closures make me nervous.

Me too.
On the other hand, they are so immensely useful.
Simply don't create closures with mutable data in them. Or, at the very 
least, make sure that no aliases outside the closure exist.

Regards,
Jo
From: Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1152996978.533372.223000@b28g2000cwb.googlegroups.com>
Joachim Durchholz wrote:
> Marshall schrieb:
>
> >> In some cases, you need an additional level of conceptual indirection -
> >> instead of *doing* the updates, you write a function that *describes* them.
> >
> > But then what do you do with that function?
>
> I pass it to an engine that's imperative.
>
> However, that engine doesn't need to do aliasing anymore.
>
> In other words, I'm separating mutability and aliasing, so that they
> can't combine and explode.

This would seem to be identical to what I am proposing.


>  > Let's say I have an
> > employee database. John Smith just got hired on 1/1/2006 with
> > a salary of $10,000. I need to record this fact somewhere. How
> > do I do that without variables? Current-employees is a variable.
> > Even if I have the space to keep all historical data, so I'm not
> > deleting anything, I still have to have a variable for the latest
> > version of the accumulated data. I can solve this without
> > pointers, but I can't solve it without variables.
>
> As I said elsewhere, the record has an identity even though it isn't
> explicit in SQL.

Hmmmm. What can this mean?

In general, I feel that "records" are not the right conceptual
level to think about.

In any event, I am not sure what you mean by non-explicit
identity. I would say, records in SQL have value, and their
identity is exactly their value. I do not see that they have
any identity outside of their value. We can uniquely identify
any particular record via a key, but a table may have more
than one key, and an update may change the values of one
key but not another. So it is not possible in general to
definitely and uniquely assign a mapping from each record
of a table after an update to each record of the table before
the update, and if you can't do that, then where
is the record identity?

(The situation is even more dire when one considers that
SQL actually uses bag semantics and not set semantics.)


[...]

> > I should like to learn more about these. I have some vague
> > perception of the existence of linear logic, but not much
> > else. However, I also already have an excellent solution
> > to the pointer aliasing problem, so I'm less motivated.
>
> Pointers are just the most obvious form of aliasing. As I said
> elsewhere, there are others: array indexing, by-reference parameters, or
> even WHERE clauses in SQL statements. In fact, anything that selects an
> updatable piece of data from a collection is an alias, and incurs the
> same problems as pointers, though they may come in different disguises.

Okay. At this point, though, the term aliasing has become extremely
general. I believe "i+1+1" is an alias for "i+2" under this definition.
That is so general that I am concerned it has lost its ability to
identify problems specific to pointers.


> >> Actually SQL has references - they are called "primary keys", but they
> >> are references nevertheless.
> >
> > I strongly object; this is quite incorrect. I grant you that from the
> > 50,000 foot level they appear identical, but they are not. To
> > qualify as a reference, there need to be reference and dereference
> > operations on the reference datatype; there is no such operation
> > in SQL.
> >
> > Would you say the relational algebra has references?
>
> Yes.
>
> > Or, consider the classic prolog ancestor query. Let's say we're
> > setting up as follows
> >
> > father(bob, joe).
> > father(joe, john).
> >
> > Is "joe" a reference, here? After all, when we do the ancestor
> > query for john, we'll see his father is joe and then use that to
> > find joe's father. Keys in SQL are isomorphic to joe in the
> > above prolog.
>
> Agreed. They are both references ;-P

Again, by generalizing the term this far, I am concerned with a
loss of precision. If "joe" in the prolog is a reference, then
"reference" is just a term for "data" that is being used in a
certain way. The connection with a specific address space
has been lost in favor of the full domain of the datatype.


> >> (Some SQL dialects also offer synthetic
> >> "ID" fields that are guaranteed to remain stable over the lifetime of a
> >> record.
> >
> > Primary keys are updatable; there is nothing special about them.
>
> Right - I was wrong with identifying keys and identities. In fact the
> identity of an SQL record cannot be directly obtained (unless via those
> ID fields).
> The records still have identities. It's possible to have two WHERE
> clauses that refer to the same record, and if you update the record
> using one WHERE clause, the record returned by the other WHERE clause
> will have changed, too.

Is this any different from saying that an expression that includes
a variable will produce a different value if the variable changes?

In other words, "i+1" will return a different value if we update i.

It seems odd to me to suggest that "i+1" has identity. I can see
that i has identity, but I would say that "i+1" has only value.

But perhaps the ultimate upshot of this thread is that my use
of terminology is nonstandard.


> >> With a "repeatable read" isolation level, you actually return to a
> >> declarative view of the database: whatever you do with it, you won't see
> >> it until you commit the transaction. (As soon as you commit, the
> >> declarative peace is over and you better watch out that your data
> >> doesn't become inconsistent due to aliasing.)
> >
> > Alas, transaction isolation levels are a performance hack.
> > I cannot defend them on any logical basis. (Aside: did you mean
> > serializable, rather than repeatable read?)
>
> Possibly. There are so many isolation levels that I have to look them up
> whenever I want to get the terminology 100% correct.

Hmmm. Is it that there are so many, or that they are simply not
part of our daily concern? It seems to me there are more different
styles of parameter passing than there are isolation levels, but
I don't usually see (competent) people (such as yourself) getting
call-by-value confused with call-by-reference.


> >> The only thing that can really be done about it is not adding it
> >> artificially into a program. In those cases where aliasing is part of
> >> the modelled domain, you really have to carefully inspect all
> >> interactions and never, never, never dream about abstracting it away.
> >
> > Yes, aliasing introduces a lot of problems. This is one reason
> > why closures make me nervous.
>
> Me too.
> On the other hand, they are so immensely useful.
> Simply don't create closures with mutable data in them. Or, at the very
> least, make sure that no aliases outside the closure exist.

Thank you for the interesting exchange.


Marshall
From: Joachim Durchholz
Subject: Re: What is a type error?
Date: 
Message-ID: <e9bqvr$jt$1@online.de>
Marshall schrieb:
> Joachim Durchholz wrote:
>> As I said elsewhere, the record has an identity even though it isn't
>> explicit in SQL.
> 
> Hmmmm. What can this mean?
> 
> In general, I feel that "records" are not the right conceptual
> level to think about.

They are, when it comes to aliasing of mutable data. I think it's 
justified by the fact that aliased mutable data has a galling tendency 
to break abstraction barriers. (More on this on request.)

> In any event, I am not sure what you mean by non-explicit
> identity.

The identity isn't visible from inside SQL. (Unless there's an OID 
facility available, which *is* an explicit identity.)

 > I would say, records in SQL have value, and their
> identity is exactly their value.

Definitely not. You can have two equal records and update just one of 
them, yielding non-equal records; by my definition (and by intuition), 
this means that the records were equal but not identical.

 > I do not see that they have
> any identity outside of their value. We can uniquely identify
> any particular record via a key, but a table may have more
> than one key, and an update may change the values of one
> key but not another. So it is not possible in general to
> definitely and uniquely assign a mapping from each record
> of a table after an update to each record of the table before
> the update, and if you can't do that, then where
> is the record identity?

Such a mapping is indeed possible. Simply extend the table with a new 
column, number the records consecutively, and identify the records via 
that column.

But even if you don't do that, there's still identity. It is irrelevant 
whether the programs can directly read the value of the identity field; 
the adverse effects happen because updates are in-place. (If every 
update generated a new record, then we'd indeed have no identity.)
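
The in-place-vs.-fresh-record distinction Jo draws can be sketched in a
few lines of Python (not from the thread): in-place updates are
observable through other aliases, which is exactly what identity means
here; copy-on-update leaves nothing to alias.

```python
# In-place update: a second reference observes the change (identity).
record = {"name": "John Smith", "salary": 10000}
alias = record                   # second reference to the same record
record["salary"] = 12000
assert alias["salary"] == 12000  # visible through the alias

# Functional update: the "update" is a new record; the old one is
# unchanged, so no alias can ever observe a change.
old = {"name": "John Smith", "salary": 10000}
new = {**old, "salary": 12000}
assert old["salary"] == 10000
assert new["salary"] == 12000
```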

> Okay. At this point, though, the term aliasing has become extremely
> general. I believe "i+1+1" is an alias for "i+2" under this definition.

No, "i+1+1" isn't an alias in itself. It's an expression - to be an 
alias, it would have to be a reference to something.

However, a[i+1+1] is an alias to a[i+2]. Not that this is particularly 
important - 1+1 is replaceable by 2 in every context, so this is 
essentially the same as saying "a[i+2] is an alias of a[i+2]", which is 
vacuously true.

There's another aspect here. If two expressions are always aliases to 
the same mutable, that's usually easy to determine; this kind of 
aliasing is usually not much of a problem.
What's more of a problem are those cases where there's occasional 
aliasing. I.e. a[i] and a[j] may or may not be aliases of each other, 
depending on the current value of i and j, and *that* is a problem 
because the number of code paths to be tested doubles. It's even more of 
a problem because testing with random data will usually not uncover the 
case where the aliasing actually happens; you have to go around and 
construct test cases specifically for the code paths that have aliasing. 
Given that references may cross abstraction barriers (actually that's 
often the purpose of constructing a reference in the first place), this 
means you have to look for your test cases at multiple levels of 
software abstraction, and *that* is really, really bad.
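
Jo's "occasional aliasing" hazard, in a minimal Python sketch (mine):
whether a[i] and a[j] alias depends on runtime values, so the function
below has two behaviours, and random testing rarely hits the aliased
path deliberately.

```python
def swap_add(a, i, j):
    """Add a[i] into a[j], then zero a[i]."""
    a[j] += a[i]
    a[i] = 0
    return a

# Non-aliased path: behaves as the doc comment suggests.
assert swap_add([1, 2], 0, 1) == [0, 3]

# Aliased path (i == j): a[j] picks up the sum, then the same cell
# is zeroed - a different behaviour from the non-aliased reading.
assert swap_add([1, 2], 0, 0) == [0, 2]
```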

> That is so general that I am concerned it has lost its ability to
> identify problems specific to pointers.

If the reference to "pointers" above means "references", then I don't 
know about any pointer problems that cannot be reproduced, in one form 
or the other, in any of the other aliasing mechanisms.

> Again, by generalizing the term this far, I am concerned with a
> loss of precision. If "joe" in the prolog is a reference, then
> "reference" is just a term for "data" that is being used in a
> certain way. The connection with a specific address space
> has been lost in favor of the full domain of the datatype.

Aliasing is indeed a more general idea that goes beyond address spaces.

However, identity and aliasing can be defined in fully abstract terms, 
so I welcome this opportunity to get rid of a too-concrete model.

>> The records still have identities. It's possible to have two WHERE
>> clauses that refer to the same record, and if you update the record
>> using one WHERE clause, the record returned by the other WHERE clause
>> will have changed, too.
> 
> Is this any different from saying that an expression that includes
> a variable will produce a different value if the variable changes?

Yes.
Note that the WHERE clause properly includes array indexing (just set up 
a table that has a contiguous numeric primary key, and a single other column).

I.e. I'm not talking about how a[i] is an alias of a[i+1] after updating 
i, I'm talking about how a[i] may be an alias of a[j].

> It seems odd to me to suggest that "i+1" has identity.

It doesn't (unless it's passed around as a closure, but that's 
irrelevant to this discussion).
"i" does have identity. "a[i]" does have identity. "a[i+1]" does have 
identity.
Let me say that for purposes of this discussion, if it can be assigned 
to (or otherwise mutated), it has identity. (We *can* assign identity to 
immutable things, but it's equivalent to equality and not interesting 
for this discussion.)

 > I can see
> that i has identity, but I would say that "i+1" has only value.

Agreed.

> But perhaps the ultimate upshot of this thread is that my use
> of terminology is nonstandard.

It's somewhat restricted, but not really nonstandard.

>> Possibly. There are so many isolation levels that I have to look them up
>> whenever I want to get the terminology 100% correct.
> 
> Hmmm. Is it that there are so many, or that they are simply not
> part of our daily concern?

I guess it's the latter. IIRC there are four or five isolation levels.

 > It seems to me there are more different
> styles of parameter passing than there are isolation levels, but
> I don't usually see (competent) people (such as yourself) getting
> call-by-value confused with call-by-reference.

Indeed.
Though the number of parameter passing mechanisms isn't that large 
anyway. Off the top of my head, I could recite just three (by value, by 
reference, by name aka lazy), the rest are just combinations with other 
concepts (in/out/through, most notably) or a mapping to implementation 
details (by reference vs. "pointer by value" in C++, for example).

Regards,
Jo
From: Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1153017376.379831.306420@35g2000cwc.googlegroups.com>
Joachim Durchholz wrote:
> Marshall schrieb:
> > Joachim Durchholz wrote:
> >> As I said elsewhere, the record has an identity even though it isn't
> >> explicit in SQL.
> >
> > Hmmmm. What can this mean?
> >
> > In general, I feel that "records" are not the right conceptual
> > level to think about.
>
> They are, when it comes to aliasing of mutable data. I think it's
> justified by the fact that aliased mutable data has a galling tendency
> to break abstraction barriers. (More on this on request.)

I can see how pointers, or other kinds of aliasing
(by my more restricted definition) break abstraction barriers;
it is hard to abstract something that can change out from
under you uncontrollably.


> > In any event, I am not sure what you mean by non-explicit
> > identity.
>
> The identity isn't visible from inside SQL. (Unless there's an OID
> facility available, which *is* an explicit identity.)

Agreed about OIDs.


>  > I would say, records in SQL have value, and their
> > identity is exactly their value.
>
> Definitely not. You can have two equal records and update just one of
> them, yielding non-equal records; by my definition (and by intuition),
> this means that the records were equal but not identical.

This sort of thing comes up when one has made the mistake of
not defining any keys on one's table and needs to rectify the
situation. It is in fact considered quite a challenge to do.
My preferred technique for such a situation is to create
a new table with the same columns and SELECT DISTINCT ...
INTO ... then recreate the original table. So that way doesn't
fit your model.

How else might you do it? I suppose there are nonstandard
extensions, such as UPDATE ... LIMIT. And in fact there
might be some yucky thing that made it in to the standard
that will work.
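
Marshall's dedup technique, sketched with SQLite (table and column names
invented; SQLite spells the copy as CREATE TABLE ... AS SELECT rather
than SELECT ... INTO): copy the distinct rows into a fresh table, then
swap it in for the keyless original.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE emp (name TEXT, salary INT)")  # no key defined
con.executemany("INSERT INTO emp VALUES (?, ?)",
                [("John Smith", 10000), ("John Smith", 10000),
                 ("Jane Doe", 20000)])

# Rebuild the table from its distinct rows.
con.execute("CREATE TABLE emp2 AS SELECT DISTINCT * FROM emp")
con.execute("DROP TABLE emp")
con.execute("ALTER TABLE emp2 RENAME TO emp")

rows = con.execute("SELECT * FROM emp ORDER BY name").fetchall()
assert rows == [("Jane Doe", 20000), ("John Smith", 10000)]
```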

But what you describe is certainly *not* possible in the
relational algebra; alas that SQL doesn't hew closer
to it. Would you agree? Would you also agree that
if a language *did* adhere to relation semantics,
then relation elements would *not* have identity?
(Under such a language, one could not have two
equal records, for example.)

[The above paragraph is the part of this post that I
really care about; feel free to ignore the rest if it's naive
or annoying or whatever, but please do respond to this.]


>  > I do not see that they have
> > any identity outside of their value. We can uniquely identify
> > any particular record via a key, but a table may have more
> > than one key, and an update may change the values of one
> > key but not another. So it is not possible in general to
> > definitely and uniquely assign a mapping from each record
> > of a table after an update to each record of the table before
> > the update, and if you can't do that, then where
> > is the record identity?
>
> Such a mapping is indeed possible. Simply extend the table with a new
> column, number the records consecutively, and identify the records via
> that column.

That doesn't work for me. It is one thing to say that for all
tables T, for all elements e in T, e has identity. It is a different
thing to say that for all tables T, there exists a table T' such
that for all elements e in T', e has identity.


> But even if you don't do that, there's still identity. It is irrelevant
> whether the programs can directly read the value of the identity field;
> the adverse effects happen because updates are in-place. (If every
> update generated a new record, then we'd indeed have no identity.)
>
> > Okay. At this point, though, the term aliasing has become extremely
> > general. I believe "i+1+1" is an alias for "i+2" under this definition.
>
> No, "i+1+1" isn't an alias in itself. It's an expression - to be an
> alias, it would have to be a reference to something.
>
> However, a[i+1+1] is an alias to a[i+2]. Not that this is particularly
> important - 1+1 is replaceable by 2 in every context, so this is
> essentially the same as saying "a[i+2] is an alias of a[i+2]", which is
> vacuously true.

To me, the SQL with the where clause is like i+2, not like a[2].
A query is simply an expression that makes mention of a variable.
However I would have to admit that if SQL table elements have
identity, it would be more like the array example.

(The model of SQL I use in my head is one with set semantics,
and I am careful to avoid running in to bag semantics by, for
example, always defining at least one key, and using SELECT
DISTINCT where necessary. But it is true that the strict set
semantics in my head are not the wobbly bag semantics in
the standard.)


> There's another aspect here. If two expressions are always aliases to
> the same mutable, that's usually easy to determine; this kind of
> aliasing is usually not much of a problem.
> What's more of a problem are those cases where there's occasional
> aliasing. I.e. a[i] and a[j] may or may not be aliases of each other,
> depending on the current value of i and j, and *that* is a problem
> because the number of code paths to be tested doubles. It's even more of
> a problem because testing with random data will usually not uncover the
> case where the aliasing actually happens; you have to go around and
> construct test cases specifically for the code paths that have aliasing.
> Given that references may cross abstraction barriers (actually that's
> often the purpose of constructing a reference in the first place), this
> means you have to look for your test cases at multiple levels of
> software abstraction, and *that* is really, really bad.
>
> > That is so general that I am concerned it has lost its ability to
> > identify problems specific to pointers.
>
> If the reference to "pointers" above means "references", then I don't
> know about any pointer problems that cannot be reproduced, in one form
> or the other, in any of the other aliasing mechanisms.

It seems we agree that, for example, raw C pointers can easily
cause aliasing problems. To me, Java references, which are much
more tightly controlled, are still subject to aliasing problems,
although in practice the situation is significantly less severe.
(It is possible to have multiple references to a single Java
mutable object and not know it.) I would assume the same is
possible with, say, SML refs.

However the situation in SQL strikes me as being different.
If one is speaking of base tables, and one has two queries,
one has everything one needs to know simply looking at
the queries. If views are involved, one must further examine
the definition of the view.

For example, we might ask in C, if we update a[i], will
the value of b[j] change? And we can't say, because
we don't know if there is aliasing. But in Fortran, we
can say that it won't, because they are definitely
different variables. But even so in Fortran, if we ask
if we update a[i], will a[j] change, and we know we
can't tell unless we know the values of i and j.

In SQL, if we update table T, will a SELECT on table
S change? We know it won't; they are different variables.
But if we update T, will a select on T change? It
certainly might, but I wouldn't call it aliasing, because
the two variables involved are the same variable, T.


> > Again, by generalizing the term this far, I am concerned with a
> > loss of precision. If "joe" in the prolog is a reference, then
> > "reference" is just a term for "data" that is being used in a
> > certain way. The connection with a specific address space
> > has been lost in favor of the full domain of the datatype.
>
> Aliasing is indeed a more general idea that goes beyond address spaces.
>
> However, identity and aliasing can be defined in fully abstract terms,
> so I welcome this opportunity to get rid of a too-concrete model.
>
> >> The records still have identities. It's possible to have two WHERE
> >> clauses that refer to the same record, and if you update the record
> >> using one WHERE clause, the record returned by the other WHERE clause
> >> will have changed, too.
> >
> > Is this any different from saying that an expression that includes
> > a variable will produce a different value if the variable changes?
>
> Yes.
> Note that the WHERE clause properly includes array indexing (just set up
> a table that has continuous numeric primary key, and a single other column).

I am still not convinced. It is not sufficient to show how sometimes
you can establish identity, or how you could establish identity based
on some related table; it must be shown how the identity can be
established in general, and I don't think it can. If it can't, then
I claim we are back to expressions that include a variable, and
not aliasing, even by the wider definition.



> I.e. I'm not talking about how a[i] is an alias of a[i+1] after updating
> i, I'm talking about how a[i] may be an alias of a[j].
>
> > It seems odd to me to suggest that "i+1" has identity.
>
> It doesn't (unless it's passed around as a closure, but that's
> irrelevant to this discussion).
> "i" does have identity. "a[i]" does have identity. "a[i+1]" does have
> identity.
> Let me say that for purposes of this discussion, if it can be assigned
> to (or otherwise mutated), it has identity. (We *can* assign identity to
> immutable things, but it's equivalent to equality and not interesting
> for this discussion.)
>
>  > I can see
> > that i has identity, but I would say that "i+1" has only value.
>
> Agreed.

Cool.


> > But perhaps the ultimate upshot of this thread is that my use
> > of terminology is nonstandard.
>
> It's somewhat restricted, but not really nonstandard.

Whew!


> >> Possibly. There are so many isolation levels that I have to look them up
> >> whenever I want to get the terminology 100% correct.
> >
> > Hmmm. Is it that there are so many, or that they are simply not
> > part of our daily concern?
>
> I guess it's the latter. IIRC there are four or five isolation levels.

Yes, four.


>  > It seems to me there are more different
> > styles of parameter passing than there are isolation levels, but
> > I don't usually see (competent) people (such as yourself) getting
> > call-by-value confused with call-by-reference.
>
> Indeed.
> Though the number of parameter passing mechanisms isn't that large
> anyway. Off the top of my head, I could recite just three (by value, by
> reference, by name aka lazy), the rest are just combinations with other
> concepts (in/out/through, most notably) or a mapping to implementation
> details (by reference vs. "pointer by value" in C++, for example).

Fair enough. I could only remember three, but I was sure someone
else could name more. :-)


Marshall
From: Joachim Durchholz
Subject: Re: What is a type error?
Date: 
Message-ID: <e9cvhg$lrv$1@online.de>
Marshall schrieb:
> Joachim Durchholz wrote:
>> Marshall schrieb:
>>> I would say, records in SQL have value, and their
>>> identity is exactly their value.
 >>
>> Definitely not. You can have two equal records and update just one of
>> them, yielding non-equal records; by my definition (and by intuition),
>> this means that the records were equal but not identical.
> 
> This sort of thing comes up when one has made the mistake of
> not defining any keys on one's table and needs to rectify the
> situation. It is in fact considered quite a challenge to do.
> My preferred technique for such a situation is to create
> a new table with the same columns and SELECT DISTINCT ...
> INTO ... then recreate the original table. So that way doesn't
> fit your model.

I was mentioning multiple equal records just as an example where the 
records have an identity that's independent of the record values.

> How else might you do it? I suppose there are nonstandard
> extensions, such as UPDATE ... LIMIT. And in fact there
> might be some yucky thing that made it in to the standard
> that will work.

Right, it might be difficult to update multiple records that are exactly 
the same. It's not what SQL was intended for, and difficult to do anyway 
- I was thinking of LIMIT (I wasn't aware that it's nonstandard), and I 
agree that there may be other ways to do it.

However, I wouldn't overvalue that case. The Jane/John Doe example 
posted elsewhere in this thread is strictly within the confines of what 
SQL was built for, yet there is aliasing.

> But what you describe is certainly *not* possible in the
> relational algebra; alas that SQL doesn't hew closer
> to it. Would you agree?

Yup, SQL (particularly its update semantics) isn't relational semantics.
Still, it's SQL we've been talking about.

 > Would you also agree that
> if a language *did* adhere to relation semantics,
> then relation elements would *not* have identity?
> (Under such a language, one could not have two
> equal records, for example.)

Any identity that one could define within relational semantics would be 
just equality, so no, it wouldn't be very helpful (unless as part of a 
greater framework).
However, that's not really a surprise. Aliasing becomes relevant (even 
observable) only when there's updates, and relational algebra doesn't 
have updates.

>>  > I do not see that they have
>>> any identity outside of their value. We can uniquely identify
>>> any particular record via a key, but a table may have more
>>> than one key, and an update may change the values of one
>>> key but not another. So it is not possible in general to
>>> definitely and uniquely assign a mapping from each record
>>> of a table after an update to each record of the table before
>>> the update, and if you can't do that, then where
>>> is the record identity?
>> Such a mapping is indeed possible. Simply extend the table with a new
>> column, number the records consecutively, and identify the records via
>> that column.
> 
> That doesn't work for me. It is one thing to say that for all
> tables T, for all elements e in T, e has identity. It is a different
> thing to say that for all tables T, there exists a table T' such
> that for all elements e in T', e has identity.

Let me continue that argument:
Records in T' have identity.
From an SQL point of view, there's no difference between the records in 
T' and records in other tables, so they must share all properties, in 
particular that of having an identity.

(I agree that's not as convincing as seeing aliasing in action. See below.)

>> But even if you don't do that, there's still identity. It is irrelevant
>> whether the programs can directly read the value of the identity field;
>> the adverse effects happen because updates are in-place. (If every
>> update generated a new record, then we'd indeed have no identity.)
>>
>>> Okay. At this point, though, the term aliasing has become extremely
>>> general. I believe "i+1+1" is an alias for "i+2" under this definition.
>> No, "i+1+1" isn't an alias in itself. It's an expression - to be an
>> alias, it would have to be a reference to something.
>>
>> However, a[i+1+1] is an alias to a[i+2]. Not that this is particularly
>> important - 1+1 is replacable by 2 in every context, so this is
>> essentially the same as saying "a[i+2] is an alias of a[i+2]", which is
>> vacuously true.
> 
> To me, the SQL with the where clause is like i+2, not like a[2].

I'd say WHERE is like [i+2]: neither is valid on its own, it's the 
"selector" part of a reference into a table.

> A query is simply an expression that makes mention of a variable.
> However I would have to admit that if SQL table elements have
> identity, it would be more like the array example.

OK.

> (The model of SQL I use in my head is one with set semantics,
> and I am careful to avoid running in to bag semantics by, for
> example, always defining at least one key, and using SELECT
> DISTINCT where necessary. But it is true that the strict set
> semantics in my head are not the wobbly bag semantics in
> the standard.)

Set-vs.-bag doesn't affect SQL's aliasing in general; it was a specific 
property of the example that I chose. So you don't have to worry whether 
that might be the cause of the misunderstanding.

>>> That is so general that I am concerned it has lost its ability to
>>> identify problems specific to pointers.
 >>
>> If the reference to "pointers" above means "references", then I don't
>> know about any pointer problems that cannot be reproduced, in one form
>> or the other, in any of the other aliasing mechanisms.
> 
> It seems we agree that, for example, raw C pointers can easily
> cause aliasing problems. To me, Java references, which are much
> more tightly controlled, are still subject to aliasing problems,
> although
> in practice the situation is significantly less severe. (It is possible
> to have multiple references to a single Java mutable object and
> not know it.)

Exactly.

 > I would assume the same is posible with, say, SML refs.

So do I.

> However the situation in SQL strikes me as being different.
> If one is speaking of base tables, and one has two queries,
> one has everything one needs to know simply looking at
> the queries. If views are involved, one must further examine
> the definition of the view.

Sure. Usually, SQL itself is abstract enough that no additional 
abstraction happens; so even if you have aliasing, you usually know 
about it.
I.e. compared to what you were saying about Java ("It is possible
to have multiple references to a single Java mutable object and
not know it"), the "and not know it" part is usually missing.

The picture changes as soon as the SQL statements get hidden behind 
layers of abstraction, say result sets and prepared query handles.

... Um... let me also restate the preconditions for aliasing to become a 
problem. You need three elements for that:
1) Identities that can be stored in multiple places.
2) Updates.
3) Abstraction (of updatable entities).

Without identities, there cannot be aliasing.
Without updates, the operation that makes aliasing problematic is excluded.
Without abstraction, you may have aliasing but can clearly identify it, 
and don't have a problem.

I think the roadblock you're hitting is that you keep looking for 
nonlocal trouble to identify spots where SQL might have aliasing; since 
SQL isn't used with abstraction often, you come up empty and conclude 
there's no aliasing.
My position is that not all aliasing is evil, and that there's a 
technical definition: two program entities (l-expressions in C parlance) 
are aliases if changing the referenced data through one also changes the 
other.
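
That definition can be demonstrated in miniature with array indices (a 
sketch; the variable names here are illustrative, not from the thread):

```python
# Two l-expressions, a[i] and a[j], are aliases exactly when i == j:
# changing the referenced data through one changes the other.
a = [10, 20, 30]
i, j = 1, 1

a[i] = 99      # update through the first l-expression
print(a[j])    # visible through the second: 99
```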

> For example, we might ask in C, if we update a[i], will
> the value of b[j] change? And we can't say, because
> we don't know if there is aliasing. But in Fortran, we
> can say that it won't, because they are definitely
> different variables.

Even that may be the case.
There's an EQUIVALENCE statement that can explicitly alias array sections.
Also, I suspect that newer versions of Fortran have reference parameters.

 > But even so in Fortran, if we ask
> if we update a[i], will a[j] change, and we know we
> can't tell unless we know the values of i and j.

In fact, that's the usual form of aliasing in Fortran.

> In SQL, if we update table T, will a SELECT on table
> S change? We know it won't; they are different variables.
> But if we update T, will a select on T change? It
> certainly might, but I wouldn't call it aliasing, because
> the two variables involved are the same variable, T.

I fail to see how this is different from the aliasing in a[i] and a[j].

It's just the same kind of aliasing that I have with the statements
   SELECT * FROM T WHERE key = $i
and
   SELECT * FROM T WHERE key = $j
(where $i and $j are placeholders for program variables).

After running
   UPDATE T SET f = 5 WHERE key = $i
not only may the first SELECT give different results; the second may, 
too (in those cases where $i and $j happen to have the same value).
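
A runnable sketch of that aliasing, using Python's stdlib sqlite3 module 
(the table and column names follow the example; the in-memory database 
and sample rows are assumptions for illustration):

```python
import sqlite3

# The $i/$j aliasing from the example above, made concrete.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE T (key INTEGER, f INTEGER)")
con.execute("INSERT INTO T VALUES (1, 0), (2, 0)")

i, j = 1, 1  # the interesting case: both program variables hold the same value

before = con.execute("SELECT f FROM T WHERE key = ?", (j,)).fetchone()
con.execute("UPDATE T SET f = 5 WHERE key = ?", (i,))
after = con.execute("SELECT f FROM T WHERE key = ?", (j,)).fetchone()

print(before, after)  # the UPDATE through $i changed the $j SELECT too: (0,) (5,)
```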


I can also construct pointer-like aliasing in SQL. If I have
   SELECT * FROM employees WHERE country = "US"
and
   SELECT * FROM employees WHERE name = "Doe"
then somebody with a grudge against all US employees running
   UPDATE employees SET salary = 0 WHERE country = "US"
will (on purpose or not) also modify the results of the second query.

Not a problem while all three queries are in plain view, but imagine 
them wrapped in some opaque object or behind a module interface.
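
The pointer-like case above can also be run directly (a sketch with 
sqlite3; the single sample row - one US employee named Doe - is an 
assumption):

```python
import sqlite3

# Two queries that look independent but refer to the same record.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE employees (name TEXT, country TEXT, salary INTEGER)")
con.execute("INSERT INTO employees VALUES ('Doe', 'US', 1000)")

by_name = "SELECT salary FROM employees WHERE name = 'Doe'"

# The grudge update mentions only the country...
con.execute("UPDATE employees SET salary = 0 WHERE country = 'US'")

# ...yet the name-based query sees the change:
result = con.execute(by_name).fetchone()
print(result)  # (0,)
```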

> I am still not convinced. It is not sufficient to show how sometimes
> you can establish identity, or how you could establish identity based
> on some related table; it must be shown how the identity can be
> established in general, and I don't think it can. If it can't, then
> I claim we are back to expressions that include a variable, and
> not aliasing, even by the wider definition.

See above.

I admit that identity cannot be reliably established in SQL. The 
identity of a record isn't available (unless via OID extensions).
However, it is there and it can have effect.

Translated to real life, "my best friend" and "my employer" may be 
referring to the same identity, even if these descriptions may change in 
the future.

>>  > It seems to me there are more different
>>> styles of parameter passing than there are isolation levels, but
>>> I don't usually see (competent) people (such as yourself) getting
>>> call-by-value confused with call-by-reference.
>> Indeed.
>> Though the number of parameter passing mechanisms isn't that large
>> anyway. Off the top of my head, I could recite just three (by value, by
>> reference, by name aka lazy), the rest are just combinations with other
>> concepts (in/out/through, most notably) or a mapping to implementation
>> details (by reference vs. "pointer by value" in C++, for example).
> 
> Fair enough. I could only remember three, but I was sure someone
> else could name more. :-)

I got a private mail naming more parameter passing mechanisms. It's a 
bit difficult to define what's a parameter passing mechanism and what's 
just an attribute of such a mechanism. If you count all combinations as 
extra mechanisms, you can easily get dozens.

Regards,
Jo
From: Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1153070922.854289.270130@75g2000cwc.googlegroups.com>
Joachim Durchholz wrote:
> Marshall schrieb:
>
> > But what you descrbe is certainly *not* possible in the
> > relational algebra; alas that SQL doesn't hew closer
> > to it. Would you agree?
>
> Yup, SQL (particularly its update semantics) aren't relational semantics.
> Still, it's SQL we've been talking about.
>
>  > Would you also agree that
> > if a language *did* adhere to relation semantics,
> > then relation elements would *not* have identity?
> > (Under such a language, one could not have two
> > equal records, for example.)
>
> Any identity that one could define within relational semantics would be
> just equality, so no, it wouldn't be very helpful (unless as part of a
> greater framework).
> However, that's not really a surprise. Aliasing becomes relevant (even
> observable) only when there's updates, and relational algebra doesn't
> have updates.

Good point. Perhaps I should have said "relational algebra +
variables with assignment." It is interesting to consider
assignment vs. the more restricted update operators: insert,
update, delete. I think I see what you mean about the restricted
update operators working on identity.


> >>> Okay. At this point, though, the term aliasing has become extremely
> >>> general. I believe "i+1+1" is an alias for "i+2" under this definition.
> >> No, "i+1+1" isn't an alias in itself. It's an expression - to be an
> >> alias, it would have to be a reference to something.
> >>
> >> However, a[i+1+1] is an alias to a[i+2]. Not that this is particularly
> >> important - 1+1 is replacable by 2 in every context, so this is
> >> essentially the same as saying "a[i+2] is an alias of a[i+2]", which is
> >> vacuously true.
> >
> > To me, the SQL with the where clause is like i+2, not like a[2].
>
> I'd say WHERE is like [i+2]: neither is valid on its own, it's the
> "selector" part of a reference into a table.

Is it possible you're being distracted by the syntax? WHERE is a
binary operation taking a relation and a filter function. I don't
think of filters as being like array indexing; do they appear
analogous to you? (Always a difficult question because often
times A and B share some qualities and not others, and
what does one call that?)


> > However the situation in SQL strikes me as being different.
> > If one is speaking of base tables, and one has two queries,
> > one has everything one needs to know simply looking at
> > the queries. If views are involved, one must further examine
> > the definition of the view.
>
> Sure. Usually, SQL itself is enough abstract that no additional
> abstraction happens; so even if you have aliasing, you usually know
> about it.
> I.e. when compared to what you were saying about Java: "It is possible
> to have multiple references to a single Java mutable object and
> not know it.", the "and not know it" part is usually missing.

Yes!


> The picture changes as soon as the SQL statements get hidden behind
> layers of abstraction, say result sets and prepared query handles.
>
> ... Um... let me also restate the preconditions for aliasing become a
> problem. You need three elements for that:
> 1) Identities that can be stored in multiple places.
> 2) Updates.
> 3) Abstraction (of updatable entities).
>
> Without identities, there cannot be aliasing.
> Without updates, the operation that makes aliasing problematic is excluded.
> Without abstraction, you may have aliasing but can clearly identify it,
> and don't have a problem.

Aha! That description works very well for me.


> I admit that identity cannot be reliably established in SQL. The
> identity of a record isn't available (unless via OID extensions).
> However, it is there and it can have effect.

Yes, I think I see what you mean now. This is in part a consequence
of the restricted update operators.

Thanks,

Marshall
From: Chris Smith
Subject: Re: What is a type error?
Date: 
Message-ID: <MPG.1f2436ff5a3b6dec98968b@news.altopia.net>
Marshall <···············@gmail.com> wrote:
> Is it possible you're being distracted by the syntax? WHERE is a
> binary operation taking a relation and a filter function. I don't
> think of filters as being like array indexing; do they appear
> analogous to you? (Always a difficult question because often
> times A and B share some qualities and not others, and
> what does one call that?)

This is definitely a good question.  Nevertheless, I think we're fooling 
ourselves if we decide that we've made progress by defining part of the 
problem out of existence.  The original problem was that when there are 
several ways of getting access to something that is identical (rather 
than just equal), this causes problems for invariant checking.  These 
several ways of accessing data are being called 'aliasing' -- I have no 
comment on whether that's standard usage or not, because I don't know.  
I suspect that aliasing is normally used for something closer to the 
physical level.  The point is that whatever "aliasing" really means, 
what we're discussing is having two paths by which one may access 
identical data.

So there are infinitely complex ways in which this can occur.  I can 
have pointers that are aliased.  I can have integers, which by 
themselves are just plain values, but can alias each other as indices 
into an array.  I can have strings that do the same thing for an 
associative array.  I can, of course, have arbitrarily more complex data 
structures with exactly the same issue.  I can also have two different 
WHERE clauses, which when applied to the same relation yield exactly the 
same set of tuples.  The problem arises when code is written to update 
some value (or set of tuples, in the SQL case, since definitions differ 
there) using one pointer/int/string/WHERE clause/etc., while an 
invariant was written using the other pointer/int/WHERE/etc.  The 
result is that either:

a) The compiler has to be able to prove that the two are not aliased

b) The compiler has to re-evaluate the invariant from scratch when this 
operation is performed.

(Yeah, there's actually a whole continuum between a and b, based on more 
limited abilities of the compiler to prove things.  I'm oversimplifying, 
and I think it's rather benign in this case.)

So in this sense, a WHERE clause is a binary operation in exactly the 
same way that an array indexing expression is a binary operation, and 
both are capable of aliasing in at least this logical sense.
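
A small sketch of that invariant-checking problem, with plain integers 
aliasing each other as array indices (all names and values here are 
illustrative):

```python
# An invariant is written against one index; an update goes through
# another index that happens to alias it.
balances = [100, 250]
watch = 0  # the invariant guards balances[watch]

def invariant():
    return balances[watch] >= 0

k = 0                # k aliases watch as an index into balances
balances[k] -= 500   # update through the "other" path

ok = invariant()
print(ok)  # False -- the invariant must be re-evaluated to catch this
```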

Now it's certainly true that languages with unrestricted pointers are 
less successful at limiting the scope of re-evaluation of invariants.  
In the array or SQL relation cases, there's a limited set of value/tuple 
modifications that might cause the invariant to need rechecking 
(specifically those that point to the same array, or to the relation or 
one of its views).  A language with unrestricted untyped pointers, if it 
doesn't get lucky with the capture analysis, may have to re-evaluate all 
invariants in the application on any given assignment.  Nevertheless, 
the re-evaluation of the invariant still needs to be done.

-- 
Chris Smith - Lead Software Developer /Technical Trainer
MindIQ Corporation
From: Joachim Durchholz
Subject: Re: What is a type error?
Date: 
Message-ID: <e9e64h$vsa$1@online.de>
Marshall schrieb:
> 
> Good point. Perhaps I should have said "relational algebra +
> variables with assignment." It is interesting to consider
> assignment vs. the more restricted update operators: insert,
> update, delete.

Actually I see it the other way round: assignment is strictly less 
powerful than DML since it doesn't allow creating or destroying 
variables, while UPDATE does cover assignment to fields.
(However, it's usually new+assignment+delete vs. INSERT+UPDATE+DELETE, 
at which point there is not much of a difference.)

>>>>> Okay. At this point, though, the term aliasing has become extremely
>>>>> general. I believe "i+1+1" is an alias for "i+2" under this definition.
>>>> No, "i+1+1" isn't an alias in itself. It's an expression - to be an
>>>> alias, it would have to be a reference to something.
>>>>
>>>> However, a[i+1+1] is an alias to a[i+2]. Not that this is particularly
>>>> important - 1+1 is replacable by 2 in every context, so this is
>>>> essentially the same as saying "a[i+2] is an alias of a[i+2]", which is
>>>> vacuously true.
 >>>
>>> To me, the SQL with the where clause is like i+2, not like a[2].
>> I'd say WHERE is like [i+2]: neither is valid on its own, it's the
>> "selector" part of a reference into a table.
> 
> Is it possible you're being distracted by the syntax?

I think that's very, very unlikely ;-)

 > WHERE is a
> binary operation taking a relation and a filter function. I don't
> think of filters as being like array indexing; do they appear
> analogous to you?

Yes.

Filters are just like array indexing: both select a subset of variables 
from a collection. In SQL, you select a subset of a table, in a 
programming language, you select a subset of an array.

(The SQL selection mechanism is far more flexible in the kinds of 
filtering you can apply, while array indexing allows filtering just by 
ordinal position. However, the relevant point is that both select things 
that can be updated.)

>> I admit that identity cannot be reliably established in SQL. The
>> identity of a record isn't available (unless via OID extensions).
>> However, it is there and it can have effect.
> 
> Yes, I think I see what you mean now. This is in part a consequence
> of the restricted update operators.

I don't think so. You can update SQL records any way you want.
The unavailability of a record identity is due to the fact that, well, 
it's unavailable in standard SQL.

Regards,
Jo
From: Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1153091295.520674.287230@i42g2000cwa.googlegroups.com>
Joachim Durchholz wrote:
> Marshall schrieb:
> >
> > Good point. Perhaps I should have said "relational algebra +
> > variables with assignment." It is interesting to consider
> > assignment vs. the more restricted update operators: insert,
> > update, delete.
>
> Actually I see it the other way round: assignment is strictly less
> powerful than DML since it doesn't allow creating or destroying
> variables, while UPDATE does cover assignment to fields.

Oh, my.

Well, for all table variables T, there exists some pair of
values v and v', such that we can transition the value of
T from v to v' via assignment, but not by any single
insert, update or delete. The reverse is not true. I
would consider that a solid demonstration of which
one is more powerful. Further, it is my understanding
that your claim of row identity *depends* on the restricted
nature of DML; if the only mutator operation is assignment,
then there is definitely no record identity.


> (However, it's usually new+assignment+delete vs. INSERT+UPDATE+DELETE,
> at which point there is not much of a difference.)

I am not sure what this means. Delete can be expressed in
terms of assignment. (As can insert and update.) (Assignment
can also be expressed in terms of insert and delete.) I
don't know what "new" would be in a value-semantics, relational
world. So I would say it's assignment vs. INSERT+UPDATE+DELETE,
but perhaps I'm not understanding what you mean.

Assignment by itself is complete. Insert and delete together
are complete.
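
The completeness claims can be sketched with a set-valued variable 
under set semantics (the data is illustrative; whole-table assignment 
stands in for SQL's ":="):

```python
# Expressing insert, delete and update in terms of whole-table assignment.
T = {(1, 'a'), (2, 'b')}

T = T | {(3, 'c')}                                  # INSERT as assignment
T = {row for row in T if row[0] != 1}               # DELETE as assignment
T = {row for row in T if row[0] != 2} | {(2, 'z')}  # UPDATE as assignment

print(sorted(T))  # [(2, 'z'), (3, 'c')]
```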


> >>>>> Okay. At this point, though, the term aliasing has become extremely
> >>>>> general. I believe "i+1+1" is an alias for "i+2" under this definition.
> >>>> No, "i+1+1" isn't an alias in itself. It's an expression - to be an
> >>>> alias, it would have to be a reference to something.
> >>>>
> >>>> However, a[i+1+1] is an alias to a[i+2]. Not that this is particularly
> >>>> important - 1+1 is replacable by 2 in every context, so this is
> >>>> essentially the same as saying "a[i+2] is an alias of a[i+2]", which is
> >>>> vacuously true.
>  >>>
> >>> To me, the SQL with the where clause is like i+2, not like a[2].
> >> I'd say WHERE is like [i+2]: neither is valid on its own, it's the
> >> "selector" part of a reference into a table.
> >
> > Is it possible you're being distracted by the syntax?
>
> I think that's very, very unlikely ;-)

Well, I just thought I'd ask. :-)


>  > WHERE is a
> > binary operation taking a relation and a filter function. I don't
> > think of filters as being like array indexing; do they appear
> > analogous to you?
>
> Yes.
>
> Filters are just like array indexing: both select a subset of variables
> from a collection.

I can't agree with this wording. A filter produces a collection
value from a collection value. I don't see how variables
enter into it. One can filter either a collection constant or
a collection variable; if one speaks of filtering a collection
variable, one is really speaking of filtering the collection value
that the variable currently contains; filtering is not an operation
on the variable as such, the way the "address of" operator is.
Note you can't update the result of a filter.


> In SQL, you select a subset of a table, in a
> programming language, you select a subset of an array.
>
> (The SQL selection mechanism is far more flexible in the kinds of
> filtering you can apply, while array indexing allows filtering just by
> ordinal position. However, the relevant point is that both select things
> that can be updated.)

When you have been saying "select things that can be updated"
I have been assuming you meant that one can derive values
from variables, and that some other operation can update that
variable, causing the expression, if re-evaluated, to produce
a different value. However the phrase also suggests that
you mean that the *result* of the select can *itself* be
updated. Which one do you mean? (Or is there a third
possibility?)


> >> I admit that identity cannot be reliably established in SQL. The
> >> identity of a record isn't available (unless via OID extensions).
> >> However, it is there and it can have effect.
> >
> > Yes, I think I see what you mean now. This is in part a consequence
> > of the restricted update operators.
>
> I don't think so. You can update SQL records any way you want.
> The unavailability of a record identity is due to the fact that, well,
> it's unavailable in standard SQL.

I disagree. The concept of record-identity is quite tenuous, and
depends on specifics of SQL semantics. (In fact I'm not entirely
convinced it's definitely there even in SQL.) It certainly does not
hold in, say, set theory. Set elements do not have identity; they
only have value. If we have a set-semantics language that
supports set variables with assignment, we still do not have
element-identity.


Marshall
From: Joachim Durchholz
Subject: Re: What is a type error?
Date: 
Message-ID: <e9fhbf$vg2$1@online.de>
Marshall schrieb:
> Joachim Durchholz wrote:
>> Marshall schrieb:
>>> Good point. Perhaps I should have said "relational algebra +
>>> variables with assignment." It is interesting to consider
>>> assignment vs. the more restricted update operators: insert,
>>> update, delete.
>> Actually I see it the other way round: assignment is strictly less
>> powerful than DML since it doesn't allow creating or destroying
>> variables, while UPDATE does cover assignment to fields.
> 
> Oh, my.
> 
> Well, for all table variables T, there exists some pair of
> values v and v', such that we can transition the value of
> T from v to v' via assignment, but not by any single
> insert, update or delete.

I fail to see an example that would support such a claim.

On the other hand, UPDATE can assign any value to any field of any 
record, so it's doing exactly what an assignment does. INSERT/DELETE can 
create resp. destroy records, which is what new and delete operators 
would do.

I must really be missing the point.

 > Further, it is my understanding
> that your claim of row identity *depends* on the restricted
> nature of DML; if the only mutator operation is assignment,
> then there is definitely no record identity.

Again, I fail to connect.

I and others have given aliasing examples that use just SELECT and UPDATE.

>> (However, it's usually new+assignment+delete vs. INSERT+UPDATE+DELETE,
>> at which point there is not much of a difference.)
> 
> I am not sure what this means. Delete can be expressed in
> terms of assignment. (As can insert and update.)

INSERT cannot be expressed in terms of assignment. INSERT creates a new 
record; there's no way that assignment in a language like C can create a 
new data structure!
The same goes for DELETE.

 > (Assignment can also be expressed in terms of insert and delete.)

Agreed.

I also realize that this makes it a bit more difficult to nail down the 
nature of identity in a database. It's certainly not storage location: 
if you DELETE a record and then INSERT it with the same values, it may 
be allocated somewhere entirely else, and our intuition would say it's 
not "the same" (i.e. not identical). (In a system with OID, it would 
even be impossible to recreate such a record, since it would have a 
different OID. I'm not sure whether this makes OID systems better or 
worse at preserving identity, but that's just a side track.)

Since intuition gives me ambivalent results here, I'll go back to my 
semiformal definition (and take the opportunity to make it a bit more 
precise):
Two path expressions (lvalues, ...) are aliases if and only if the 
referred-to values compare equal, and if they stay equal after applying 
any operation to the referred-to value through either of the path 
expressions.
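
That definition can be read operationally: probe two path expressions 
by mutating through one and observing the other (a sketch; the store 
and the accessor pairs are illustrative):

```python
store = {'x': 1}

# A "path expression" here is modeled as a (getter, setter) pair.
path_a = (lambda: store['x'], lambda v: store.__setitem__('x', v))
path_b = (lambda: store['x'], lambda v: store.__setitem__('x', v))

def aliased(p, q, probe=12345):
    get_p, set_p = p
    get_q, _ = q
    if get_p() != get_q():       # referred-to values must compare equal...
        return False
    set_p(probe)                 # ...and stay equal after an operation
    return get_q() == probe      # through either path expression

result = aliased(path_a, path_b)
print(result)  # True
```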

In the context of SQL, this means that identity isn't the location where 
the data is stored. It's also not the values stored in the record - 
these may change, including key data. SQL record identity is local, it 
can be defined from one operation to the next, but there is no such 
thing as a global identity that one can memorize and look up years 
later, without looking at the intermediate states of the store.

It's a gross concept, now that I think about it. Well, or at least 
rather alien for us programmers, who are used to taking the address of a 
variable to get a tangible identity that will stay stable over time.

On the other hand, variable addresses as tangible identities don't hold 
much water anyway.
Imagine data that's written out to disk at program end, and read back 
in. Further imagine that while the data is read into main memory, 
there's a mechanism that redirects all further reads and writes to the 
file into the read-in copy in memory, i.e. whenever any program changes 
the data, all other programs see the change, too.
Alternatively, think about software agents that move from machine to 
machine, carrying their data with them. They might be communicating with 
each other, so they need some means of establishing identity 
("addressing") the memory buffers that they use for communication.

 > I don't know what "new" would be in a value-semantics, relational
> world.

It would be INSERT.

Um, my idea of "value semantics" is associated with immutable values. 
SQL with INSERT/DELETE/UPDATE certainly doesn't match that definition.

So by my definition, SQL doesn't have value semantics; by your 
definition, it would have value semantics, but with updates that are 
enough to create aliasing problems. So I'm not sure what point you're 
making here...

>> Filters are just like array indexing: both select a subset of variables
>> from a collection.
> 
> I can't agree with this wording. A filter produces a collection
> value from a collection value. I don't see how variables
> enter in to it.

A collection can consist of values or variables.

And yes, I do think that WHERE is a selection over a bunch of variables 
- you can update records after all, so they are variables! They don't 
have a name, at least none which is guaranteed to be constant over their 
lifetime, but they can be mutated!

 > One can filter either a collection constant or
> a collection variable; if one speaks of filtering a collection
> variable, on is really speaking of filtering the collection value
> that the variable currently contains; filtering is not an operation
> on the variable as such, the way the "address of" operator is.
> Note you can't update the result of a filter.

If that's your definition of a filter, then WHERE is not a filter, 
simple as that.

>> In SQL, you select a subset of a table, in a
>> programming language, you select a subset of an array.
>>
>> (The SQL selection mechanism is far more flexible in the kinds of
>> filtering you can apply, while array indexing allows filtering just by
>> ordinal position. However, the relevant point is that both select things
>> that can be updated.)
> 
> When you have been saying "select things that can be updated"
> I have been assuming you meant that one can derive values
> from variables, and that some other operation can update that
> variable, causing the expression, if re-evaluated, to produce
> a different value.

That's what I meant.

 > However the phrase also suggests that
> you mean that the *result* of the select can *itself* be
> updated.

The "that" in "things that can be updated" refers to the selected 
things. I'm not sure how this "that" could be interpreted to refer to 
the selection as a whole (is my understanding of English really that bad?)

 > Which one do you mean? (Or is there a third
> possibility?)

I couldn't tell - I wouldn't have thought that there are even two 
possibilities.

Regards,
Jo
From: Rob Warnock
Subject: Re: What is a type error?
Date: 
Message-ID: <EJydnZRij7pN0SbZnZ2dnUVZ_sKdnZ2d@speakeasy.net>
Joachim Durchholz  <··@durchholz.org> wrote:
+---------------
| INSERT cannot be expressed in terms of assignment. INSERT creates a new 
| record; there's no way that assignment in a language like C can create a 
| new data structure!  The same goes for DELETE.
+---------------

Well, what about "malloc()" & "free()"? I mean, if your
"database" is a simple linked list, then INSERT is just:

    ROW *p = malloc(sizeof *p);
    p->field1 = value1;
    p->field2 = value2;
    ...
    p->fieldN = valueN;
    database = cons(p, database);		/* Common Lisp's CONS */

and DELETE is just:

    ROW *p = find_if(predicate_function, database); /* CL's FIND-IF */
    database = delete(p, database);		/* CL's DELETE */
    free(p);

[There are single-pass methods, of course, but...]


-Rob

-----
Rob Warnock			<····@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607
From: Joachim Durchholz
Subject: Re: What is a type error?
Date: 
Message-ID: <e9fonb$cga$1@online.de>
Rob Warnock schrieb:
> Joachim Durchholz  <··@durchholz.org> wrote:
> +---------------
> | INSERT cannot be expressed in terms of assignment. INSERT creates a new 
> | record; there's no way that assignment in a language like C can create a 
> | new data structure!  The same goes for DELETE.
> +---------------
> 
> Well, what about "malloc()" & "free()"?

These do exactly what assignment cannot do.

The correspondence between SQL and C would be:
   INSERT - malloc() (create a new data structure)
   DELETE - free() (destroy a data structure)
   UPDATE - assignment (change data within a data structure)

Regards,
Jo
From: Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1153153626.781332.263490@i42g2000cwa.googlegroups.com>
Joachim Durchholz wrote:
> Marshall schrieb:
> > Joachim Durchholz wrote:
> >> Marshall schrieb:
> >>> Good point. Perhaps I should have said "relational algebra +
> >>> variables with assignment." It is interesting to consider
> >>> assignment vs. the more restricted update operators: insert,
> >>> update, delete.
> >> Actually I see it the other way round: assignment is strictly less
> >> powerful than DML since it doesn't allow creating or destroying
> >> variables, while UPDATE does cover assignment to fields.
> >
> > Oh, my.
> >
> > Well, for all table variables T, there exists some pair of
> > values v and v', such that we can transition the value of
> > T from v to v' via assignment, but not by any single
> > insert, update or delete.
>
> I fail to see an example that would support such a claim.

variable T : unary relation of int
T = { 1, 2, 3 };  // initialization
T := { 4, 5 };   // assignment

The above transition of the value of T cannot be
done by any one single insert, update, or delete.
Two would suffice, however. (In fact, any assignment
can be modeled as a full delete followed by an insert
of the new value.)
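That delete-then-insert modelling can be sketched in Python, with a plain set standing in for the unary relation (my illustration, not part of the thread):

```python
# Minimal sketch: the unary relation modeled as a Python set, and the
# assignment T := {4, 5} modeled as a full delete followed by an
# insert of the new value.
T = {1, 2, 3}      # initialization

T = T - T          # delete every row: the empty relation
T = T | {4, 5}     # insert the rows of the new value

assert T == {4, 5}
```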


> On the other hand, UPDATE can assign any value to any field of any
> record,

Yes.

> so it's doing exactly what an assignment does.

No. The variable is the table, not the records. Relations are not
arrays.
Records are not lvalues.


> INSERT/DELETE can
> create resp. destroy records, which is what new and delete operators
> would do.
>
> I must really be missing the point.
>
>  > Further, it is my understanding
> > that your claim of row identity *depends* on the restricted
> > nature of DML; if the only mutator operation is assignment,
> > then there is definitely no record identity.
>
> Again, I fail to connect.
>
> I and others have given aliasing examples that use just SELECT and UPDATE.

Sure, but update's semantics are defined in a per-record way,
which is consistent with record identity. Assignment's are not.


> >> (However, it's usually new+assignment+delete vs. INSERT+UPDATE+DELETE,
> >> at which point there is not much of a difference.)
> >
> > I am not sure what this means. Delete can be expressed in
> > terms of assignment. (As can insert and update.)
>
> INSERT cannot be expressed in terms of assignment. INSERT creates a new
> record; there's no way that assignment in a language like C can create a
> new data structure!
> The same goes for DELETE.

I was intending to be discussing a hypothetical relation-based
language, so while I generally agree with your statement about C,
I don't see how it applies.


>  > (Assignment can also be expressed in terms of insert and delete.)
>
> Agreed.
>
> I also realize that this makes it a bit more difficult to nail down the
> nature of identity in a database.

I would propose that variables have identity, and values do not.
In part this is via the supplied definition of identity, in which, when
you change one thing, if something else changes as well, they
share identity. Since one cannot change values, they necessarily
lack identity.


> It's certainly not storage location:
> if you DELETE a record and then INSERT it with the same values, it may
> be allocated somewhere entirely else, and our intuition would say it's
> not "the same" (i.e. not identical).

Well, it would depend on how our intuition had been primed. If it
was via implementation techniques in low level languages, we
might reach a different conclusion than if our intuition was primed
via logical models and relation theory.


> (In a system with OID, it would
> even be impossible to recreate such a record, since it would have a
> different OID. I'm not sure whether this makes OID systems better or
> worse at preserving identity, but that's just a side track.)

OIDs are something of a kludge, and they break set semantics.


> Since intuition gives me ambivalent results here, I'll go back to my
> semiformal definition (and take the opportunity to make it a bit more
> precise):
> Two path expressions (lvalues, ...) are aliases if and only if the
> referred-to values compare equal, and if they stay equal after applying
> any operation to the referred-to value through either of the path
> expressions.

Alas, this leaves me confused. I don't see how a path expression
(in this case, SELECT ... WHERE) can be an l-value. You cannot
apply imperative operations to the result. (Also I think the use
of equality here is too narrow; it is only necessary to show that
two things both change, not that they change in the same way.)

I was under the impression you agreed that "i+2" was not
a "path expression". If our hypothetical language lacks record
identity, then I would say that any query is simply an expression
that returns a value, as in "i+2."


> In the context of SQL, this means that identity isn't the location where
> the data is stored. It's also not the values stored in the record -
> these may change, including key data. SQL record identity is local, it
> can be defined from one operation to the next, but there is no such
> thing as a global identity that one can memorize and look up years
> later, without looking at the intermediate states of the store.

Yes, however all of this depends on record identity.


> It's a gross concept, now that I think about it. Well, or at least
> rather alien for us programmers, who are used to taking the address of a
> variable to get a tangible identity that will stay stable over time.

It is certainly alien if one is not used to relation semantics, which
is the default case.


> On the other hand, variable addresses as tangible identities don't hold
> much water anyway.
> Imagine data that's written out to disk at program end, and read back
> in. Further imagine that while the data is read into main memory,
> there's a mechanism that redirects all further reads and writes to the
> file into the read-in copy in memory, i.e. whenever any program changes
> the data, all other programs see the change, too.
> Alternatively, think about software agents that move from machine to
> machine, carrying their data with them. They might be communicating with
> each other, so they need some means of establishing identity
> ("addressing") the memory buffers that they use for communication.

These are exactly why content-based addressing is so important.
Location addressing depends on an address space, and this
concept does not distribute well.
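A minimal sketch of content-based addressing in Python (my illustration; `put` and the in-memory `store` are invented for the example): data is addressed by a digest of its value, not by a location, so the address is the same on any machine that holds the same content.

```python
# Content-based addressing sketch: the "address" of a datum is a hash
# of its content, so identity does not depend on any address space.
import hashlib

store = {}

def put(data: bytes) -> str:
    key = hashlib.sha256(data).hexdigest()   # address derived from content
    store[key] = data
    return key

addr = put(b"hello")
assert store[addr] == b"hello"               # look up by content address
assert put(b"hello") == addr                 # same content, same address
```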


>  > I don't know what "new" would be in a value-semantics, relational
> > world.
>
> It would be INSERT.
>
> Um, my idea of "value semantics" is associated with immutable values.
> SQL with INSERT/DELETE/UPDATE certainly doesn't match that definition.

Sorry, I was vague. Compare, in OOP, the difference between a value
object and a "regular" object.


> So by my definition, SQL doesn't have value semantics, by your
> definition, it would have value semantics but updates which are enough
> to create aliasing problems, so I'm not sure what point you're making
> here...
>
> >> Filters are just like array indexing: both select a subset of variables
> >> from a collection.
> >
> > I can't agree with this wording. A filter produces a collection
> > value from a collection value. I don't see how variables
> > enter in to it.
>
> A collection can consist of values or variables.
>
> And yes, I do think that WHERE is a selection over a bunch of variables
> - you can update records after all, so they are variables! They don't
> have a name, at least none which is guaranteed to be constant over their
> lifetime, but they can be mutated!

We seem to have slipped back from the hypothetical relation language
with only assignment back to SQL.


>  > One can filter either a collection constant or
> > a collection variable; if one speaks of filtering a collection
> > variable, on is really speaking of filtering the collection value
> > that the variable currently contains; filtering is not an operation
> > on the variable as such, the way the "address of" operator is.
> > Note you can't update the result of a filter.
>
> If that's your definition of a filter, then WHERE is not a filter,
> simple as that.

Fair enough! Can you correct my definition of filter, though?
I am still unaware of the difference.


> >> In SQL, you select a subset of a table, in a
> >> programming language, you select a subset of an array.
> >>
> >> (The SQL selection mechanism is far more flexible in the kinds of
> >> filtering you can apply, while array indexing allows filtering just by
> >> ordinal position. However, the relevant point is that both select things
> >> that can be updated.)
> >
> > When you have been saying "select things that can be updated"
> > I have been assuming you meant that one can derive values
> > from variables, and that some other operation can update that
> > variable, causing the expression, if re-evaluated, to produce
> > a different value.
>
> That's what I meant.
>
>  > However the phrase also suggests that
> > you mean that the *result* of the select can *itself* be
> > updated.
>
> The "that" in "things that can be updated" refers to the selected
> things. I'm not sure how this "that" could be interpreted to refer to
> the selection as a whole (is my understanding of English really that bad?)

Your English is extraordinary. I could easily conclude that you
were born in Boston and educated at Harvard, and either have
Germanic ancestry or have simply adopted a Germanic name
out of whimsy. If English is not your native tongue, there is no
way to detect it.

Argh, late for dropping off my daughter at school now. Must run.
Sorry if I was a bit unclear due to being rushed.


Marshall
From: Chris Smith
Subject: Re: What is a type error?
Date: 
Message-ID: <MPG.1f2570c1692d9dd698969f@news.altopia.net>
Marshall <···············@gmail.com> wrote:
> We seem to have slipped back from the hypothetical relation language
> with only assignement back to SQL.

I missed the point where we started discussing such a language.  I 
suspect it was while some of us were still operating under the 
misconception that you meant assignment to attributes of tuples, rather than 
to entire relations.

I don't see how such a language (limited to assignment of entire 
relations) is particularly helpful to consider.  If the relations are to 
be considered opaque, then there's clearly no aliasing going on.  
However, such a language doesn't seem to solve any actual problems.  It 
appears to be nothing other than a toy language, with a fixed finite set 
of variables having only value semantics, no scope, etc.  I assume that 
relational databases will have the equivalent of SQL's update statement; 
and if that's not the case, then I would need someone to explain how to 
accomplish the same goals in the new relational language; i.e. it would 
need some way of expressing transformations of relations, not just 
complete replacement of them with new relations that are assumed to 
appear out of thin air.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1153161100.650922.110630@h48g2000cwc.googlegroups.com>
Chris Smith wrote:
> Marshall <···············@gmail.com> wrote:
> > We seem to have slipped back from the hypothetical relation language
> > with only assignement back to SQL.
>
> [...]
> I don't see how such a language (limited to assignment of entire
> relations) is particularly helpful to consider.

I find it useful to consider various language features together and
individually, to see how these features interact. If nothing else,
it furthers my understanding of how languages work.


> If the relations are to
> be considered opaque, then there's clearly no aliasing going on.

Not certain I understand, but I think I agree.


> However, such a language doesn't seem to solve any actual problems.  It
> appears to be nothing other than a toy language, with a fixed finite set
> of variables having only value semantics, no scope, etc.

No, such a language is entirely useful, since relational assignment
is *more* expressive than insert/update/delete. (How did you get "no
scope", by the way? And fixed variables? All I said was to change
ins/upd/del to assignment.)

Also note one could fairly easily write a macro for ins/upd/del if one
had assignment. (See below.)

Now, there are some significant issues with trying to produce a
performant implementation in a distributed environment with strictness.
In fact I will go so far as to say it's impossible. However on a single
machine I don't see any reason to think it would perform any worse,
(although I haven't thought about it deeply.)


> I assume that
> relational databases will have the equivalent of SQL's update statement;
> and if that's not the case, then I would need someone to explain how to
> accomplish the same goals in the new relational language; i.e. it would
> need some way of expressing transformations of relations, not just
> complete replacement of them with new relations that are assumed to
> appear out of thin air.

Consider:

i := i + 1;

Note that the new value of i didn't just appear out of thin air; it was
in fact based on the previous value of i.

Given:
variable T : relation of r;
update_fn : r -> r        // takes one record and returns a new, updated record
filter_fn : r -> boolean  // indicator function

insert(T, new_rows) => { T := T union new_rows; }

update(T, update_fn, filter_fn) => {
  transaction {
    let Tmp = filter(T, filter_fn);  // select only those rows to update
    T := T minus Tmp;
    T := T union update_fn(Tmp);
  }
}

delete(T,filter_fn) => { T := T minus filter(T, filter_fn); }

So we can define insert, update and delete in terms of relational
assignment, relational subtraction, and relational union. Type
checking details omitted.
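The definitions above can be rendered as a runnable sketch in Python, with frozensets of tuples standing in for relation values; it is written functionally, returning the new relation rather than assigning in place (the names and representation are my own, not from the thread):

```python
# Runnable sketch: insert, update and delete expressed with set union
# and set difference ("minus"). A relation is modeled as a frozenset
# of records (plain tuples); each function returns the new relation
# value, standing in for "T := ...".

def insert(T, new_rows):
    return T | frozenset(new_rows)                      # T := T union new_rows

def update(T, update_fn, filter_fn):
    tmp = frozenset(r for r in T if filter_fn(r))       # rows to update
    return (T - tmp) | frozenset(update_fn(r) for r in tmp)

def delete(T, filter_fn):
    return T - frozenset(r for r in T if filter_fn(r))  # T := T minus ...

# Usage: a unary relation of int
T = frozenset({(1,), (2,), (3,)})
T = insert(T, {(4,)})                                       # add row (4,)
T = update(T, lambda r: (r[0] * 10,), lambda r: r[0] == 2)  # (2,) -> (20,)
T = delete(T, lambda r: r[0] == 1)                          # drop row (1,)
assert T == frozenset({(20,), (3,), (4,)})
```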


Marshall
From: Chris Smith
Subject: Re: What is a type error?
Date: 
Message-ID: <MPG.1f259200b8db1bc09896a0@news.altopia.net>
Marshall <···············@gmail.com> wrote:
> > If the relations are to
> > be considered opaque, then there's clearly no aliasing going on.
> 
> Not certain I understand, but I think I agree.

My condition, though, was that relations be opaque.  Since you will be 
violating that condition further on down, I just felt that it's useful 
to point that out here.

> No, such a language is entirely useful, since relational assignment
> is *more* expressive than insert/update/delete.

Assignment is more powerful *as* assignment.  However, it is less 
powerful when the task at hand is deriving new relations from old ones.  
Assignment provides absolutely no tools for doing that.  I thought you 
were trying to remove those tools from the language entirely in order to 
remove the corresponding aliasing problems.  I guess I was wrong, since 
you make it clear below that you intend to keep at least basic set 
operations on relations in your hypothetical language.

> Consider:
> 
> i := i + 1;
> 
> Note that the new value of i didn't just appear out of thin air; it was
> in fact based on the previous value of i.

Right.  That's exactly the kind of thing I thought you were trying to 
avoid.

> So we can define insert, update and delete in terms of relational
> assignment, relational subtraction, and relational union. Type
> checking details omitted.

Then the problem is in the step where you assign the new relation to the 
old relational variable.  You need to check that the new relation 
conforms to the invariants that are expressed on that relational 
variable.  If you model this as assignment of relations (or relation 
values... I'm unclear on the terminology at this point) then naively 
this requires scanning through an entire set of relations in the 
constraint, to verify that the invariant holds.  You may have avoided 
"aliasing" in any conventional sense of the word by stretching the word 
itself beyond breaking... but you've only done it by proactively 
accepting its negative consequences.

It remains non-trivial to scan through a 2 GB database table to verify 
that some attribute of every tuple matches some attribute of another 
table, even if you call the entire thing one relational variable.  The 
implementation, of course, isn't at all going to make a copy of the 
entire (possibly several GB) relation and rewrite it all every time it 
makes a change, and it isn't going to give up and rescan all possible 
invariants every time every change is made.  In other words, you've 
risen to a layer of abstraction where the aliasing problem does not 
exist.  The implementation is still going to deal with the aliasing 
problem, which will resurface once you pass over to the other side of 
the abstraction boundary.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1153170053.499642.14840@b28g2000cwb.googlegroups.com>
Chris Smith wrote:
> Marshall <···············@gmail.com> wrote:
> > > If the relations are to
> > > be considered opaque, then there's clearly no aliasing going on.
> >
> > Not certain I understand, but I think I agree.
>
> My condition, though, was that relations be opaque.  Since you will be
> violating that condition further on down, I just felt that it's useful
> to point that out here.
>
> > No, such a language is entirely useful, since relational assignment
> > is *more* expressive than insert/update/delete.
>
> Assignment is more powerful *as* assignment.  However, it is less
> powerful when the task at hand is deriving new relations from old ones.

At the implementation level, it makes some things harder, however
as a logical model, it is more powerful. While this is very much a real
world issue, it is worth noting that it is a performance issue *merely*
and not a semantic issue.


> Assignment provides absolutely no tools for doing that.  I thought you
> were trying to remove those tools from the language entirely in order to
> remove the corresponding aliasing problems.  I guess I was wrong, since
> you make it clear below that you intend to keep at least basic set
> operations on relations in your hypothetical language.
>
> > Consider:
> >
> > i := i + 1;
> >
> > Note that the new value of i didn't just appear out of thin air; it was
> > in fact based on the previous value of i.
>
> Right.  That's exactly the kind of thing I thought you were trying to
> avoid.

I was under the impression that Joachim, for example, did not
consider "i+1" as an alias for i.


> > So we can define insert, update and delete in terms of relational
> > assignment, relational subtraction, and relational union. Type
> > checking details omitted.
>
> Then the problem is in the step where you assign the new relation to the
> old relational variable.  You need to check that the new relation
> conforms to the invariants that are expressed on that relational
> variable.  If you model this as assignment of relations (or relation
> values... I'm unclear on the terminology at this point) then naively
> this requires scanning through an entire set of relations in the
> constraint, to verify that the invariant holds.  You've may have avoided
> "aliasing" in any conventional sense of the word by stretching the word
> itself beyond breaking... but you've only done it by proactively
> accepting its negative consequences.
>
> It remains non-trivial to scan through a 2 GB database table to verify
> that some attribute of every tuple matches some attribute of another
> table, even if you call the entire thing one relational variable.  The
> implementation, of course, isn't at all going to make a copy of the
> entire (possibly several GB) relation and rewrite it all every time it
> makes a change, and it isn't going to give up and rescan all possible
> invariants every time every change is made.  In other words, you've
> risen to a layer of abstraction where the aliasing problem does not
> exist.  The implementation is still going to deal with the aliasing
> problem, which will resurface once you pass over to the other side of
> the abstraction boundary.

Yes, these *performance* issues make assignment prohibitive for
real-world use, at least if we are talking about data management
in the large. This is not the same thing as saying the resulting
language is a toy language, though; its semantics are quite
interesting and possibly a better choice for *defining* the semantics
of the imperative operations than directly modelling the imperative
operations. (Or maybe not.) In any event, it's worth thinking about,
even if performance considerations make it not worth implementing.


Marshall
From: Chris Smith
Subject: Re: What is a type error?
Date: 
Message-ID: <MPG.1f25b37cb34eb3729896a1@news.altopia.net>
Marshall <···············@gmail.com> wrote:
> Yes, these *performance* issues make assignment prohibitive for
> real-world use, at least if we are talking about data management
> in the large. This is not the same thing as saying the resulting
> language is a toy language, though; its semantics are quite
> interesting and possibly a better choice for *defining* the semantics
> of the imperative operations than directly modelling the imperative
> operations. (Or maybe not.) In any event, it's worth thinking about,
> even if performance considerations make it not worth implementing.

My "toy language" comment was directed at a language that I mistakenly 
thought you were proposing, but that you really weren't.  You can ignore 
it, and all the corresponding comments about assignment being less 
powerful, etc.  I was apparently not skilled at communication when I 
tried to say that in the last message.

It is, perhaps, worth thinking about.  My assertion here (which I think 
I've backed up, but there's been enough confusion that I'm not surprised 
if it was missed) is that the underlying reasons that performance might 
be poor for this language are a superset of the performance problems 
caused by aliasing.  Hence, when discussing the problems caused by 
aliasing for the performance of language implementations (which I 
believe was at some point the discussion here), this isn't a 
particularly useful example.

It does, though, have the nice property of hiding the aliasing from the 
semantic model.  That is interesting and worth considering, but is a 
different conversation; and I don't know how to start it.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Darren New
Subject: Re: What is a type error?
Date: 
Message-ID: <QEPug.24235$Z67.2146@tornado.socal.rr.com>
Marshall wrote:
> I would propose that variables have identity, and values do not.
> In part this is via the supplied definition of identity, in which, when
> you change one thing, if something else changes as well, they
> share identity.

Maybe you gave a better definition the first time, but this one doesn't 
really hold up.

> of equality here is too narrow; it is only necessary to show that
> two things both change, not that they change in the same way.)

If I change X, then Y[X] changes also. I don't think X is identical to Y 
or Y[X], nor is it an alias of either.  I think that's where the 
equality comes into it.

-- 
   Darren New / San Diego, CA, USA (PST)
     This octopus isn't tasty. Too many
     tentacles, not enough chops.
From: Joachim Durchholz
Subject: Re: What is a type error?
Date: 
Message-ID: <e9igt2$r0f$1@online.de>
Marshall schrieb:
> No. The variable is the table, not the records.

In your examples, yes.

 > Relations are not arrays.

They are, in all ways that matter for aliasing:
They are a collection of mutable data, accessible via selector values.

> Records are not lvalues.

Agreed, but they have identity. An lvalue is an identity that you can 
track in software; record identity is untrackable in standard SQL, but 
it's real enough to create aliasing problems.

> I would propose that variables have identity, and values do not.

What's a "variable", by that definition?
It can't be "something that has a name in a program", since that would 
exclude variables on the heap.
Another definition would be "something that may have a different value 
if I look at it sometime later", but then that would include SQL records 
as well. Besides, the temporal aspect is problematic, since it's unclear 
whether the "it" some time later is still the same as the "it" before 
(if you have a copying garbage collector, you'd have to somehow 
establish identity over time, and we're full circle back at defining 
identity).

Personally, I say "two things are identical / have the same identity", 
if they are (a) equal and (b) stay equal after changing just one of 
them. That's a nice local definition, and captures exactly those cases 
where aliasing is relevant in the first place (it's actually that 
"changing here also changes there" aspect that makes aliasing so dangerous).
It also covers SQL records. Well, they do have aliasing problems, or 
problems that are similar enough to them, so I'd say this is a 
surprising but actually interesting and useful result of having that 
definition.
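That local test for shared identity can be sketched in Python with mutable lists (my illustration):

```python
# Sketch of the identity test described above: two handles share
# identity iff they compare equal and stay equal after changing the
# value through just one of them.
a = [1, 2, 3]
b = a          # alias: same underlying object
c = [1, 2, 3]  # equal value, distinct object

assert a == b and a == c   # all equal before the change
a[0] = 99                  # mutate through one handle only
assert a == b              # b followed the change: shared identity
assert a != c              # c did not: no shared identity
```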

> In part this is via the supplied definition of identity, in which, when
> you change one thing, if something else changes as well, they
> share identity. Since one cannot change values, they necessarily
> lack identity.

I agree that immutable values don't have identity (or, rather, that 
identity isn't an interesting concept for them since it's already 
subsumed under equality).

>> It's certainly not storage location:
>> if you DELETE a record and then INSERT it with the same values, it may
>> be allocated somewhere entirely else, and our intuition would say it's
>> not "the same" (i.e. not identical).
> 
> Well, it would depend on how our intuition had been primed. If it
> was via implementation techniques in low level languages, we
> might reach a different conclusion than if our intuition was primed
> via logical models and relation theory.

Sure.

However, since I'm interested in practical consequences for programming, 
I'm looking for definitions that allow me to capture those effects that 
create bugs.
In the case of identity, I'd say that's aliasing.

>> Since intuition gives me ambivalent results here, I'll go back to my
>> semiformal definition (and take the opportunity to make it a bit more
>> precise):
>> Two path expressions (lvalues, ...) are aliases if and only if the
>> referred-to values compare equal, and if they stay equal after applying
>> any operation to the referred-to value through either of the path
>> expressions.
> 
> Alas, this leaves me confused. I don't see how a path expression
> (in this case, SELECT ... WHERE) can be an l-value. You cannot
> apply imperative operations to the result.

It's not SELECT...WHERE..., it's WHERE... that's an lvalue.
And you can indeed use it with UPDATE, so yes it's an lvalue by my book. 
(A given WHERE clause may become invalid or refer to a different record 
after some updates, so it's not the kind of lvalue that you have in a C 
program.)

 > (Also I think the use
> of equality here is too narrow; it is only necessary to show that
> two things both change, not that they change in the same way.)

Sorry for using misleading terminology.
An "alias" is when two names refer to an identical object (both in real 
life and IMHO in programming), and that's what I wrote.
"Aliasing" is the kind of problems that you describe.

> I was under the impression you agred that "i+2" was not
> a "path expression".

Yes, but [i+2] is one.

In the same vein, "name = 'John'" is an expression, "WHERE name = 
'John'" is a path expression.

 > If our hypothetical language lacks record
> identity, then I would say that any query is simply an expression
> that returns a value, as in "i+2."

As long as the data is immutable, yes.
The question is how equality behaves under mutation.

>> On the other hand, variable addresses as tangible identities don't hold
>> much water anyway.
>> Imagine data that's written out to disk at program end, and read back
>> in. Further imagine that while the data is read into main memory,
>> there's a mechanism that redirects all further reads and writes to the
>> file into the read-in copy in memory, i.e. whenever any program changes
>> the data, all other programs see the change, too.
>> Alternatively, think about software agents that move from machine to
>> machine, carrying their data with them. They might be communicating with
>> each other, so they need some means of establishing identity
>> ("addressing") the memory buffers that they use for communication.
> 
> These are exactly why content-based addressing is so important.
> Location addressing depends on an address space, and this
> concept does not distribute well.

I think that's an overgeneralization. URLs have distributed well in the 
past.

>>  > I don't know what "new" would be in a value-semantics, relational
>>> world.
>> It would be INSERT.
>>
>> Um, my idea of "value semantics" is associated with immutable values.
>> SQL with INSERT/DELETE/UPDATE certainly doesn't match that definition.
> 
> Sorry, I was vague. Compare, in OOP, the difference between a value
> object and a "regular" object.

Yes, one is mutable, the other is not.
I'm not sure how that relates to SQL though.

>>>> Filters are just like array indexing: both select a subset of variables
>>>> from a collection.
>>> I can't agree with this wording. A filter produces a collection
>>> value from a collection value. I don't see how variables
>>> enter in to it.
>> A collection can consist of values or variables.
>>
>> And yes, I do think that WHERE is a selection over a bunch of variables
>> - you can update records after all, so they are variables! They don't
>> have a name, at least none which is guaranteed to be constant over their
>> lifetime, but they can be mutated!
> 
> We seem to have slipped back from the hypothetical relation language
> with only assignement back to SQL.

This all started with your claim that SQL does not have identities and 
aliasing problems, which grew out of a discussion of what identities and 
aliases and aliasing really are.
Hypothetical relation languages are interesting to illustrate fine 
points, but they don't help with that.

Anyway, if a hypothetical relational language is what you wish to 
discuss, let me return to an even more general model: I'm pretty sure 
that any language that has
a) collections
b) means of addressing individual items
c) means of changing individual items
has aliasing problems, and that the individual items have identity.

(Actually all programming languages with mutation and any kind of 
reference fit that description: the "collection" is the pool of 
variables, the "means of addressing" is the references, and "means of 
changing" is the mutation.)

>>  > One can filter either a collection constant or
>>> a collection variable; if one speaks of filtering a collection
>>> variable, on is really speaking of filtering the collection value
>>> that the variable currently contains; filtering is not an operation
>>> on the variable as such, the way the "address of" operator is.
>>> Note you can't update the result of a filter.
>> If that's your definition of a filter, then WHERE is not a filter,
>> simple as that.
> 
> Fair enough! Can you correct my definition of filter, though?

I can't. You were using the term "filter", and seemed to apply it to 
WHERE but assume something that didn't match WHERE.
I wasn't rejecting your definition but that you applied the term to 
something that didn't match it.

Oh, and personally, I'd say a filter is something that selects items 
from a collection, whether the collection or its items are mutable or not. But that's 
just my personal definition, and I'll be happy to stick with anything 
that's remotely reasonable ;-)

> Your English is extraordinary. I could easily conclude that you
> were born in Boston and educated at Harvard, and either have
> Germanic ancestry or have simply adopted a Germanic name
> out of whimsy. If English is not your native tongue, there is no
> way to detect it.

Thanks :-)

But, no, my last visit to an English-speaking country was decades ago. 
Attribute my English to far too much time in technical newsgroups, and 
far too much googling ;-)
I'm pretty sure that my oral English is much, much worse, and that I've 
got a heavy German accent. I'd just try to improve on that if I were to 
relocate to GB or the US (and maybe that's just what made the difference 
for my English: constant striving for improvement).

Regards,
Jo
From: Chris Smith
Subject: Re: What is a type error?
Date: 
Message-ID: <MPG.1f255dd3a7fa689198969c@news.altopia.net>
Joachim Durchholz <··@durchholz.org> wrote:
> I fail to see an example that would support such a claim.
> 
> On the other hand, UPDATE can assign any value to any field of any 
> record, so it's doing exactly what an assignment does. INSERT/DELETE can 
> create resp. destroy records, which is what new and delete operators 
> would do.
> 
> I must really be missing the point.

I *think* I understand Marshall here.  When you are saying "assignment", 
you mean assignment to values of attributes within tuples of the cell.  
When Marshall is saying "assignment", he seems to mean assigning a 
completely new *table* value to a relation; i.e., wiping out the entire 
contents of the relation and replacing it with a whole new set of 
tuples.  Your assignment is indeed less powerful than DML, whereas 
Marshall's assignment is more powerful than DML.
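A hypothetical sketch (mine, not from the thread) of the two senses, modeling a relation as a frozenset of tuples:

```python
# Hypothetical illustration of the two senses of "assignment".
# A relation variable holds a table value: a frozenset of (id, name) rows.
table = frozenset({(1, "ann"), (2, "bob")})

# Joachim's sense: an UPDATE-style assignment changes one field of one
# record; the relation still logically holds "the same" records.
updated = frozenset(
    (id_, "bea") if id_ == 2 else (id_, name) for (id_, name) in table
)

# Marshall's sense: assign a completely new table value to the relation,
# wiping out the previous contents entirely.
replaced = frozenset({(7, "zoe")})
```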

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1153151694.674061.133510@p79g2000cwp.googlegroups.com>
Chris Smith wrote:
> Joachim Durchholz <··@durchholz.org> wrote:
> > I fail to see an example that would support such a claim.
> >
> > On the other hand, UPDATE can assign any value to any field of any
> > record, so it's doing exactly what an assignment does. INSERT/DELETE can
> > create resp. destroy records, which is what new and delete operators
> > would do.
> >
> > I must really be missing the point.
>
> I *think* I understand Marshall here.  When you are saying "assignment",
> you mean assignment to values of attributes within tuples of the cell.
> When Marshall is saying "assignment", he seems to mean assigning a
> completely new *table* value to a relation; i.e., wiping out the entire
> contents of the relation and replacing it with a whole new set of
> tuples.  Your assignment is indeed less powerful than DML, whereas
> Marshall's assignment is more powerful than DML.

Exactly.


Marshall
From: Joachim Durchholz
Subject: Re: What is a type error?
Date: 
Message-ID: <e9ieii$mn6$1@online.de>
Marshall schrieb:
> Chris Smith wrote:
>> Joachim Durchholz <··@durchholz.org> wrote:
>> I *think* I understand Marshall here.  When you are saying "assignment",
>> you mean assignment to values of attributes within tuples of the cell.
>> When Marshall is saying "assignment", he seems to mean assigning a
>> completely new *table* value to a relation; i.e., wiping out the entire
>> contents of the relation and replacing it with a whole new set of
>> tuples.  Your assignment is indeed less powerful than DML, whereas
>> Marshall's assignment is more powerful than DML.
> 
> Exactly.

Ah well. I never meant that kind of assignment.

Besides, all the aliasing possibilities already happen at the field 
level, so it never occurred to me that whole-table assignment might 
enter into the picture.

Regards,
Jo
From: Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1153290273.890438.56440@i3g2000cwc.googlegroups.com>
Joachim Durchholz wrote:
> Marshall schrieb:
> > Chris Smith wrote:
> >> Joachim Durchholz <··@durchholz.org> wrote:
> >> I *think* I understand Marshall here.  When you are saying "assignment",
> >> you mean assignment to values of attributes within tuples of the cell.
> >> When Marshall is saying "assignment", he seems to mean assigning a
> >> completely new *table* value to a relation; i.e., wiping out the entire
> >> contents of the relation and replacing it with a whole new set of
> >> tuples.  Your assignment is indeed less powerful than DML, whereas
> >> Marshall's assignment is more powerful than DML.
> >
> > Exactly.
>
> Ah well. I never meant that kind of assignment.

Sorry for the confusion.


Marshall
From: Xah Lee
Subject: On Computing and Its People
Date: 
Message-ID: <1153509405.284210.48400@m79g2000cwm.googlegroups.com>
Hi all,

in the past years, i have written a few hundred essays and tutorials
on computing. Now, i've built an index page for this collection:
http://xahlee.org/Periodic_dosage_dir/skami_prosa.html

many of these originated from online forums. The writing style is
enticing and the content astute.

also, as many of you may know, in the past month there's been some
controversy regarding me, which resulted in a small web hosting company
kicking me out, giving the legal reason of _no reason_ as per the
contract agreement (with 30 days' advance notice). I feel there's
some public obligation to give this update. More details are at
http://xahlee.org/Periodic_dosage_dir/t2/harassment.html

i have been busy in the past month so i haven't written anything new
(regarding computing). Recently i started to do Java again, and will
probably soon complete the other sections of this exposition:
What are OOP's Jargons and Complexities
http://xahlee.org/Periodic_dosage_dir/t2/oop.html

   Xah
   ···@xahlee.org
 ∑ http://xahlee.org/
From: Chris F Clark
Subject: Re: What is a type error?
Date: 
Message-ID: <sddwtadbkst.fsf@shell01.TheWorld.com>
"Marshall" <···············@gmail.com> wrote:
> In general, I feel that "records" are not the right conceptual
> level to think about.

Unfortunately, they are the right level.  Actually, the right level
might even be lower, the fields within a record, but that's moving
even farther away from the direction you wish to go.  The right level
is the lowest level at which the programmer can see and manipulate
values.  Thus, since a programmer can see and manipulate fields within
records in SQL that is the level we need to concern ourselves with
when talking about aliasing in SQL.  In other words, since a SELECT
(or UPDATE) statement can talk about specific fields of specific
records within the table (not directly, but reliably, which I'll
explain in a moment) then those are the "primitive objects" in SQL
that can be aliased.

What I meant by "not directly, but reliably":

If you have a fixed database, and you do two selects which specify the
same sets of fields to be selected and the same keys to select records
with, one expects the two selects to return the same values. You do
not necessarily expect the records to be returned in the same order,
but you do expect the same set of records to be returned and the
records to have the same values in the same fields.  Further, if you
do an update, you expect certain fields of certain records to change
(and be reflected in subsequent selects).

However, therein lies the rub: if you do a select on some records, and
then an update that changes those records, the records you have from
the select have either changed or show outdated values.  If there is
some way to refer to the records you first selected before the update,
then you have an aliasing problem, but maybe one can't do that.  One
could also have an aliasing problem, if one were allowed to do two
updates simultaneously, so that one update could change records in
the middle of the other changing the records.  However, I don't know
if that's allowed in the relational algebra either. (I think I finally
see your point, and more on that later.)

> I do not see that they have
> any identity outside of their value. We can uniquely identify
> any particular record via a key, but a table may have more
> than one key, and an update may change the values of one
> key but not another. So it is not possible in general to
> definitely and uniquely assign a mapping from each record
> of a table after an update to each record of the table before
> the update, and if you can't do that, then where
> is the record identity?

I don't know if your point of having two keys within one table makes a
difference.  If one can uniquely identify a record, then it has
identity.  The fact that there is another mapping where one may not be
able to uniquely identify the record is not necessarily relevant.

> But what you descrbe is certainly *not* possible in the
> relational algebra; alas that SQL doesn't hew closer
> to it. Would you agree? Would you also agree that
> if a language *did* adhere to relation semantics,
> then relation elements would *not* have identity?
> (Under such a language, one could not have two
> equal records, for example.)

Ok, here I think I will explain my understanding of your point.  If we
treat each select statement and each update statements as a separate
program, i.e. we can't save records from one select statement to a
later one (and I think in the relational algebra one cannot), then
each single (individual) select statement or update statement is a
functional program.  That is we can view a single select statement as
doing a fold over some incoming data, but it cannot change the data.
Similarly, we can view an update statement as creating a new copy of
the table with mutations in it, but the update statement can only
refer to the original immutable table that was input.  (Well, that's
at least how I recall the relational algebra.)  It is the second point,
about the semantics of the update statement which is important.  There
are no aliasing issues because in the relational algebra one cannot
refer to the mutated values in an update statement, every reference in
an update statement is referring to the unmutated input (again, that's
my recollection).  (Note, if my recollection is off, and one can refer
to the mutated records in an update statement, perhaps they can
explain why aliasing is not an issue in it.)
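Chris's recollection can be sketched (a hypothetical rendering in Python, not relational algebra): an update is a pure function from the old table value to a new one, so every reference inside it sees only the unmutated input:

```python
# Hypothetical sketch: an UPDATE as a pure function over an immutable
# input table. Rows are (id, balance) pairs.
old_table = frozenset({(1, 100), (2, 200)})

def update(table):
    # Every reference here reads the *unmutated* input table, so the
    # update has no way to observe its own partial effects -- which is
    # why aliasing is not an issue under these semantics.
    return frozenset((id_, bal + 10) for (id_, bal) in table)

new_table = update(old_table)
# The input value is untouched; "mutation" is just producing a new value.
```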

Now for some minor points:

> For example, we might ask in C, if we update a[i], will
> the value of b[j] change? And we can't say, because
> we don't know if there is aliasing. But in Fortran, we
> can say that it won't, because they are definitely
> different variables. 

Unfortunately, your statement about FORTRAN is not true: if a and b
are parameters to a subroutine (or members of a common block, or
equivalenced, or ...), then an update of a[i] might change b[j].

This is, in fact, an important point: it is in subroutines (and
subroutine like things), where we have local names for things
(bindings) that are actually names for things which may or may not be
distinct, that aliasing is the primary issue.  I don't recall the
relational algebra having subroutines, and even if it did they would
be unimportant, given that one could not refer within a single update
statement to the records that a subroutine changed (since one cannot
refer in an update statement to any record which has been changed, see
the caveats above).

Joachim Durchholz wrote:
> > Indeed.
> > Though the number of parameter passing mechanisms isn't that large
> > anyway. Off the top of my head, I could recite just three (by value, by
> > reference, by name aka lazy), the rest are just combinations with other
> > concepts (in/out/through, most notably) or a mapping to implementation
> > details (by reference vs. "pointer by value" in C++, for example).

Marshall again:
> Fair enough. I could only remember three, but I was sure someone
> else could name more. :-)

There actually are some more, but they are very rare. For example,
there was call-by-text-string, which is sort of like call-by-name, but
allows a
parameter to reach into the calling routine and mess with local
variables therein.  Most of the rare ones have really terrible
semantics, and are perhaps best forgotten except to keep future
implementors from choosing them.  For example, call-by-text-string is
really easy to implement in a naive interpreter that reparses each
statement at every invocation, but it exposes the implementation
properties so blatantly that writing a different interpreter is nearly
impossible.

If I recall correctly, there are also some call-by- methods that take
into account networks and addressing space issues--call-by-reference
doesn't work if one cannot see the thing being referenced.  Of course,
one must then ask whether one considers "remote-procedure-call" and/or
message-passing to be the same thing as a local call.

One last nit: call-by-value is actually call-by-copy-in-copy-out when
generalized.  

There are some principles that one can use to organize the parameter
passing methods.  However, the space is much richer than people
commonly know about (and that is actually a good thing, that most
people aren't aware of the richness, simplicity is good).

-Chris
From: Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1153071347.282931.203960@s13g2000cwa.googlegroups.com>
Chris F Clark wrote:
> "Marshall" <···············@gmail.com> wrote:
> > In general, I feel that "records" are not the right conceptual
> > level to think about.
>
> Unfortunately, they are the right level.  Actually,the right level
> might even be lower, the fields within a record, but that's moving
> even farther away from the direction you wish to go.  The right level
> is the lowest level at which the programmer can see and manipulate
> values.

But how is this not always "the bit"?



> > I do not see that they have
> > any identity outside of their value. We can uniquely identify
> > any particular record via a key, but a table may have more
> > than one key, and an update may change the values of one
> > key but not another. So it is not possible in general to
> > definitely and uniquely assign a mapping from each record
> > of a table after an update to each record of the table before
> > the update, and if you can't do that, then where
> > is the record identity?
>
> I don't know if your point of having two keys within one table amkes a
> difference.  If one can uniquely identify a record, then it has
> identity.  The fact that there is another mapping where one may not be
> able to uniquely identify the record is not necessarily relevant.

The issue with two keys is that the keys may not *agree* on the
mapping before vs. after the update.
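A hypothetical worked example: suppose both id and email are candidate keys, and an update swaps the two rows' emails. Tracing "the same record" through the update by id gives one answer; tracing it by email gives another:

```python
# Hypothetical: rows are (id, email) pairs; both id and email are keys.
before = {(1, "a@x"), (2, "b@x")}
# An update swaps the two emails:
after = {(1, "b@x"), (2, "a@x")}

by_id = {id_: email for (id_, email) in after}
by_email = {email: id_ for (id_, email) in after}

old = (1, "a@x")
# Tracking the old record by its id key vs. by its email key yields
# two *different* successor records -- the keys disagree on identity.
successor_by_id = (old[0], by_id[old[0]])
successor_by_email = (by_email[old[1]], old[1])
```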


> > For example, we might ask in C, if we update a[i], will
> > the value of b[j] change? And we can't say, because
> > we don't know if there is aliasing. But in Fortran, we
> > can say that it won't, because they are definitely
> > different variables.
>
> Unfortunately, your statement about FORTRAN is not true, if a and b
> are parameters to a subroutine (or members of a common block, or
> equivalenced, or ...), then an update of a[i] might change b[j].

Ah, common blocks. How that takes me back!

But your point is a good one; I oversimplified.


> Marshall again:
> > Fair enough. I could only remember three, but I was sure someone
> > else could name more. :-)
>
> There actual are some more, but very rare, for example there was call-
> by-text-string, which is sort of like call-by-name, but allows a
> parameter to reach into the calling routine and mess with local
> variables therein.  Most of the rare ones have really terrible
> semantics, and are perhaps best forgotten except to keep future
> implementors from choosing them.  For example, call-by-text-string is
> really easy to implement in a naive interpreter that reparses each
> statement at every invocation, but but it exposes the implementation
> properties so blatantly that writing a different interpretor is nearly
> impossible.
>
> If I recall correctly, there are also some call-by- methods that take
> into account networks and addressing space issues--call-by-reference
> doesn't work if one cannot see the thing being referenced.  Of coruse,
> one must then ask whether one considers "remote-procedure-call" and/or
> message-passing to be the same thing as a local call.
>
> One last nit, Call-by-value is actually call-by-copy-in-copy-out when
> generalized.
>
> There are some principles that one can use to organize the parameter
> passing methods.  However, the space is much richer than people
> commonly know about (and that is actually a good thing, that most
> people aren't aware of the richness, simplicity is good).


Fair enough.


Marshall
From: Chris Smith
Subject: Re: What is a type error?
Date: 
Message-ID: <MPG.1f24333f84d7a29198968a@news.altopia.net>
We Chrises stick together, as always.

Marshall <···············@gmail.com> wrote:
> > Unfortunately, they are the right level.  Actually,the right level
> > might even be lower, the fields within a record, but that's moving
> > even farther away from the direction you wish to go.  The right level
> > is the lowest level at which the programmer can see and manipulate
> > values.
> 
> But how is this not always "the bit"?

First of all, we should be consistent in speaking about the logical 
language semantics, which may not include bits anyway.

That said, if bits were the primitive concept of data in a language, 
we'd still be able to get away with talking about higher-level concepts, 
as long as we agreed to always manipulate a larger structure (a byte, 
for example) as a pure value.  If we were using bitwise operators, we 
wouldn't be able to get away with it.  Conversely, it's also quite 
possible to consider aliasing at higher levels of abstraction; it's just 
not possible to point to the lack of aliasing at one level of 
abstraction and conclude that no aliasing exists.  Aliasing is still 
possible for entities within those layers of abstraction.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Dr.Ruud
Subject: Re: What is a type error?
Date: 
Message-ID: <e9e1t9.1do.1@news.isolution.nl>
Chris F Clark schreef:

> If you have a fixed database, and you do two selects which specify the
> same sets of fields to be selected and the same keys to select records
> with, one expects the two selects to return the same values.

When your "fixed" means read-only, or (fully) locked, then yes.

Modern databases also have modes without locks, so without special
attention (like maybe your "fixed") the two selects do not necessarily
return the same set of records.


> Further, if you
> do an update, you expect certain fields of certain records to change
> (and be reflected in subsequent selects).

Doesn't your "fixed" block updates?


> However, therein lies the rub, if you do a select on some records, and
> then an update that changes those records, the records you have from
> the select have either changed or show outdated values.

Not necessarily outdated: values can be fixed in time for a purpose.
Example: calculating interest is done once per day, but the whole
process takes more than a day.

Some systems implemented a lock plus requery just before the update, to
check for unacceptable changes in the stored data; this to prevent
having to keep locks while waiting.


> If there is
> some way to refer to the records you first selected before the update,
> then you have an aliasing problem, but maybe one can't do that.  One
> could also have an aliasing problem, if one were allowed to do two
> updates simultaneously, so that one update could changed records in
> the middle of the other changing the records.

Some databases allow you to travel back in time: run this query on the
data of 1 year ago. All previous values are kept "behind" the current
value.

-- 
Affijn, Ruud

"Gewoon is een tijger."
From: David Hopwood
Subject: [Way off-topic] Re: What is a type error?
Date: 
Message-ID: <Qdutg.91033$7Z6.15255@fe2.news.blueyonder.co.uk>
Joachim Durchholz wrote:
> A concrete example: the
> first thing that Windows does when accepting userland data structures
> is... to copy them; this were unnecessary if the structures were immutable.

It is necessary also because the userland structures are in a different
privilege domain, and the memory backing them is potentially mapped by
other processes, or accessible by other threads of this process, that can
be running in parallel while the current thread is in the kernel.

Also, it tends to be easier and more secure to treat userland structures
as if they were in a different address space (even if for a particular OS,
the kernel would be able to access user mode addresses directly). The
kernel cannot trust that pointers in userland structures are valid, for
instance.

So, in order to avoid a copy when accepting a userland structure:

 - when servicing a kernel call for a given process, the OS must use
   kernel-mode page table mappings that are a superset of the user-mode
   mappings for that process at the point of the call.

 - the memory backing the structure must be immutable in all processes
   that could map it, and the kernel must trust the means by which it is
   made immutable.

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Joachim Durchholz
Subject: Re: What is a type error?
Date: 
Message-ID: <e9191f$pkb$1@online.de>
Marshall schrieb:
> Now, I'm not fully up to speed on DBC. The contract specifications,
> these are specified statically, but checked dynamically, is that
> right?

That's how it's done in Eiffel, yes.

 > In other words, we can consider contracts in light of
> inheritance, but the actual verification and checking happens
> at runtime, yes?

Sure. Though, while DbC gives rules for inheritance (actually subtypes), 
these are irrelevant to the current discussion; DbC-minus-subtyping can 
still be usefully applied.

> Wouldn't it be possible to do them at compile time? (Although
> this raises decidability issues.)

Exactly, and that's why you'd either use a restricted assertion 
language (and essentially get something that's somewhere between a type 
system and traditional assertions); or you'd use some inference system 
and try to help it along (not a simple thing either - the components of 
such a system exist, but I'm not aware of any system that was designed 
for the average programmer).

 > Mightn't it also be possible to
> leave it up to the programmer whether a given contract
> was compile-time or runtime?

I'd agree with that, but I'm not sure how well that would hold up in 
practice.

Regards,
Jo
From: Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1152677387.981169.11390@35g2000cwc.googlegroups.com>
Joachim Durchholz wrote:
> Marshall schrieb:
> > Now, I'm not fully up to speed on DBC. The contract specifications,
> > these are specified statically, but checked dynamically, is that
> > right?
>
> That's how it's done in Eiffel, yes.
>
>  > In other words, we can consider contracts in light of
> > inheritance, but the actual verification and checking happens
> > at runtime, yes?
>
> Sure. Though, while DbC gives rules for inheritance (actually subtypes),
> these are irrelevant to the current discussion; DbC-minus-subtyping can
> still be usefully applied.

Yes, subtyping. Of course I meant to say subtyping. <blush>

I can certainly see how DbC would be useful without subtyping.
But would there still be a reason to separate preconditions
from postconditions? I've never been clear on the point
of differentiating them (beyond the fact that one's covariant
and the other is contravariant.)


> > Wouldn't it be possible to do them at compile time? (Although
> > this raises decidability issues.)
>
> Exactly, and that's why you'd either uses a restricted assertion
> language (and essentially get something that's somewhere between a type
> system and traditional assertion); or you'd use some inference system
> and try to help it along (not a simple thing either - the components of
> such a system exist, but I'm not aware of any system that was designed
> for the average programmer).

As to the average programmer, I heard this recently on
comp.databases.theory:

"Don't blame me for the fact that competent programming, as I view it
as an intellectual possibility, will be too difficult for "the
average programmer"  -you must not fall into the trap of rejecting
a surgical technique because it is beyond the capabilities of the
barber in his shop around the corner."   -- EWD512


>  > Mightn't it also be possible to
> > leave it up to the programmer whether a given contract
> > was compile-time or runtime?
>
> I'd agree with that, but I'm not sure how well that would hold up in
> practice.

I want to try it and see what it's like.


Marshall
From: Joachim Durchholz
Subject: Re: What is a type error?
Date: 
Message-ID: <e92tso$bar$1@online.de>
Marshall schrieb:
> Joachim Durchholz wrote:
>> Marshall schrieb:
>>> Now, I'm not fully up to speed on DBC. The contract specifications,
>>> these are specified statically, but checked dynamically, is that
>>> right?
>> That's how it's done in Eiffel, yes.
>>
>>  > In other words, we can consider contracts in light of
>>> inheritance, but the actual verification and checking happens
>>> at runtime, yes?
>> Sure. Though, while DbC gives rules for inheritance (actually subtypes),
>> these are irrelevant to the current discussion; DbC-minus-subtyping can
>> still be usefully applied.
> 
> Yes, subtyping. Of course I meant to say subtyping.<blush>

No need to blush for that, it's perfectly OK in the Eiffel context, 
where a subclass is always assumed to be a subtype. (This assumption 
isn't always true, which is why Eiffel has some serious holes in the 
type system. The DbC ideas are still useful though, saying "subtype" 
instead of "subclass" just makes the concept applicable outside of OO 
languages.)

> I can certainly see how DbC would be useful without subtyping.
> But would there still be a reason to separate preconditions
> from postconditions? I've never been clear on the point
> of differentiating them (beyond the fact that one's covariant
> and the other is contravariant.)

There is indeed.
The rules about covariance and contravariance are just consequences of 
the notion of having a subtype (albeit important ones when it comes to 
designing concrete interfaces).

For example, if a precondition fails, something is wrong about the 
things that the subroutine assumes about its environment, so it 
shouldn't have been called. This means the error is in the caller, not 
in the subroutine that carries the precondition.

The less important consequence is that this should be reflected in the 
error messages.

The more important consequences apply when integrating software.
If you have a well-tested library, it makes sense to switch off 
postcondition checking for the library routines, but not their 
precondition checking.
This applies not just for run-time checking: Ideally, with compile-time 
inference, all postconditions can be inferred from the function's 
preconditions and their code. The results of that inference can easily 
be stored in the precompiled libraries.
Preconditions, on the other hand, can only be verified by checking all 
places where the functions are called.

>>> Wouldn't it be possible to do them at compile time? (Although
>>> this raises decidability issues.)
>> Exactly, and that's why you'd either uses a restricted assertion
>> language (and essentially get something that's somewhere between a type
>> system and traditional assertion); or you'd use some inference system
>> and try to help it along (not a simple thing either - the components of
>> such a system exist, but I'm not aware of any system that was designed
>> for the average programmer).
> 
> As to the average programmer, I heard this recently on
> comp.databases.theory:
> 
> "Don't blame me for the fact that competent programming, as I view it
> as an intellectual possibility, will be too difficult for "the
> average programmer"  -you must not fall into the trap of rejecting
> a surgical technique because it is beyond the capabilities of the
> barber in his shop around the corner."   -- EWD512

Given the fact that we have far more need for competently-written 
programs than for competent surgery, I don't think that we'll be able to 
follow that idea.

>>> Mightn't it also be possible to
>>> leave it up to the programmer whether a given contract
>>> was compile-time or runtime?
>> I'd agree with that, but I'm not sure how well that would hold up in
>> practice.
> 
> I want to try it and see what it's like.

So do I :-)

Regards,
Jo
From: Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1152722309.801960.18090@s13g2000cwa.googlegroups.com>
Joachim Durchholz wrote:
> Marshall schrieb:
>
> > I can certainly see how DbC would be useful without subtyping.
> > But would there still be a reason to separate preconditions
> > from postconditions? I've never been clear on the point
> > of differentiating them (beyond the fact that one's covariant
> > and the other is contravariant.)
>
> There is indeed.
> The rules about covariance and contravariance are just consequences of
> the notion of having a subtype (albeit important ones when it comes to
> designing concrete interfaces).
>
> For example, if a precondition fails, something is wrong about the
> things that the subroutine assumes about its environment, so it
> shouldn't have been called. This means the error is in the caller, not
> in the subroutine that carries the precondition.
>
> The less important consequence is that this should be reflected in the
> error messages.
>
> The more important consequences apply when integrating software.
> If you have a well-tested library, it makes sense to switch off
> postcondition checking for the library routines, but not their
> precondition checking.
> This applies not just for run-time checking: Ideally, with compile-time
> inference, all postconditions can be inferred from the function's
> preconditions and their code. The results of that inference can easily
> be stored in the precompiled libraries.
> Preconditions, on the other hand, can only be verified by checking all
> places where the functions are called.

Interesting!

So, what exactly separates a precondition from a postcondition
from an invariant? I have always imagined that one writes
assertions on parameters and return values, and those
assertions that only reference parameters were preconditions
and those which also referenced return values were postconditions.
Wouldn't that be easy enough to determine automatically?

And what's an invariant anyway?


Marshall
From: Darren New
Subject: Re: What is a type error?
Date: 
Message-ID: <esatg.14243$MF6.2444@tornado.socal.rr.com>
Marshall wrote:
> So, what exactly separates a precondition from a postcondition
> from an invariant?

A precondition applies to a routine/method/function and states the 
conditions under which a function might be called. For example, a 
precondition on "stack.pop" might be "not stack.empty", and 
"socket.read" might have a precondition of "socket.is_open", and 
"socket.open_a_new_connection" might have a precondition of 
"socket.is_closed".

A postcondition applies to a routine/method/function and states (at 
least part of) what that routine guarantees to be true when it returns, 
assuming it is called with true preconditions. So "not stack.empty" 
would be a postcondition of "stack.push". If your postconditions are 
complete, they tell you what the routine does.

An invariant is something that applies to an entire object instance, any 
time after the constructor has completed and no routine within that 
instance is running (i.e., you're not in the middle of a call). There 
should probably be something in there about destructors, too. So an 
invariant might be "stack.is_empty == (stack.size == 0)" or perhaps 
"socket.is_open implies (socket.buffer != null)" or some such.
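Those three kinds of contract can be sketched with plain assertions (a hypothetical rendering of the stack example, not Eiffel syntax):

```python
class Stack:
    """Toy stack with its contracts written as plain assertions."""

    def __init__(self):
        self.items = []
        self._check_invariant()

    def _check_invariant(self):
        # Invariant: holds any time no routine of the instance is running.
        assert self.is_empty() == (self.size() == 0)

    def is_empty(self):
        return not self.items

    def size(self):
        return len(self.items)

    def push(self, x):
        self.items.append(x)
        assert not self.is_empty()   # postcondition of push
        self._check_invariant()

    def pop(self):
        assert not self.is_empty()   # precondition: caller's fault if false
        x = self.items.pop()
        self._check_invariant()
        return x
```

Calling pop on an empty stack trips the precondition assertion, which (per the discussion above) identifies the bug as being in the caller.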

The first problem with many of these sorts of things is that in 
practice, there are lots of postconditions that are difficult to express 
in any machine-readable way. The postcondition of "socket.read" is that 
the buffer passed as the first argument has valid data in the first "n" 
bytes, where "n" is the return value from "socket.read", unless 
"socket.read" returned -1.  What does "valid" mean there? It has to 
match the bytes sent by some other process, possibly on another machine, 
disregarding the bytes already read.

You can see how this can be hard to formalize in any particularly useful 
way, unless you wind up modelling the whole universe, which is what most 
of the really formalized network description techniques do.

Plus, of course, without having preconditions and postconditions for OS 
calls, it's difficult to do much formally with that sort of thing.

You can imagine all sorts of I/O that would be difficult to describe, 
even for things that are all in memory: what would be the postcondition 
for "screen.draw_polygon"? Any set of postconditions on that routine 
that don't mention that a polygon is now visible on the screen would be 
difficult to take seriously.

There are also problems with the complexity of things. Imagine a 
chess-playing game trying to describe the "generate moves" routine. 
Precondition: An input board with a valid configuration of chess pieces. 
Postcondition: An array of boards with possible next moves for the 
selected team.  Heck, if you could write those as assertions, you 
wouldn't need the code.

-- 
   Darren New / San Diego, CA, USA (PST)
     This octopus isn't tasty. Too many
     tentacles, not enough chops.
From: Joachim Durchholz
Subject: Re: What is a type error?
Date: 
Message-ID: <e93joo$meo$1@online.de>
Darren New schrieb:
> There are also problems with the complexity of things. Imagine a 
> chess-playing game trying to describe the "generate moves" routine. 
> Precondition: An input board with a valid configuration of chess pieces. 
> Postcondition: An array of boards with possible next moves for the 
> selected team.  Heck, if you could write those as assertions, you 
> wouldn't need the code.

Actually, in a functional programming language (FPL), you write just the 
postconditions and let the compiler generate the code for you.

At least that's what happens for those FPL functions that you write down 
without much thinking. You can still tweak the function to make it more 
efficient. Or you can define an interface using preconditions and 
postconditions, and write a function that fulfills these assertions 
(i.e. requires no more preconditions than the interface specifies, and 
fulfills at least the postcondition that the interface specifies); here 
we'd have a postcondition that's separate from the code, too.
I.e. in such cases, the postconditions separate the accidental and 
essential properties of a function, so they still have a role to play.

Regards,
Jo
From: Darren New
Subject: Re: What is a type error?
Date: 
Message-ID: <XMctg.27137$uy3.7183@tornado.socal.rr.com>
Joachim Durchholz wrote:
> Actually, in a functional programming language (FPL), you write just the 
> postconditions and let the compiler generate the code for you.

Certainly. And my point is that the postcondition describing "all valid 
chess boards reachable from this one" is pretty much going to be as big 
as an implementation for generating it, yes? The postcondition will 
still have to contain all the rules of chess in it, for example. At best 
you've replaced loops with some sort of universal quantifier with a 
"such that" phrase.

Anyway, I expect you could prove you can't do this in the general case. 
Otherwise, you could just write a postcondition that asserts the output 
of your function is machine code that when run generates the same 
outputs as the input string would. I.e., you'd have a compiler that can 
write other compilers, generated automatically from a description of the 
semantics of the input stream and the semantics of the machine the code 
is to run on. I'm pretty sure we're not there yet, and I'm pretty sure 
you start running into the limits of computability if you do that.

-- 
   Darren New / San Diego, CA, USA (PST)
     This octopus isn't tasty. Too many
     tentacles, not enough chops.
From: Joachim Durchholz
Subject: Re: What is a type error?
Date: 
Message-ID: <e958mh$7sk$1@online.de>
Darren New schrieb:
> Joachim Durchholz wrote:
>> Actually, in a functional programming language (FPL), you write just 
>> the postconditions and let the compiler generate the code for you.
> 
> Certainly. And my point is that the postcondition describing "all valid 
> chess boards reachable from this one" is pretty much going to be as big 
> as an implementation for generating it, yes?

Yes. It's a classical case where the postcondition and the code that 
fulfils it are essentially the same.

 > The postcondition will
> still have to contain all the rules of chess in it, for example. At best 
> you've replaced loops with some sort of universal quantifier with a 
> "such that" phrase.

Correct.

OTOH, isn't that the grail that many people have been searching for: 
programming by simply declaring the results that they want to see?

> Anyway, I expect you could prove you can't do this in the general case. 
> Otherwise, you could just write a postcondition that asserts the output 
> of your function is machine code that when run generates the same 
> outputs as the input string would. I.e., you'd have a compiler that can 
> write other compilers, generated automatically from a description of the 
> semantics of the input stream and the semantics of the machine the code 
> is to run on. I'm pretty sure we're not there yet, and I'm pretty sure 
> you start running into the limits of computability if you do that.

No, FPLs are actually just that: compilable postconditions.
Computability issues aren't more or less a factor than with other kinds 
of compilers: they do limit what you can do, but these limits are loose 
enough that you can do really useful stuff within them (in particular, 
express all algorithms).

Regards,
Jo
From: Chris Smith
Subject: Re: What is a type error?
Date: 
Message-ID: <MPG.1f202ae3e730d505989791@news.altopia.net>
Joachim Durchholz <··@durchholz.org> wrote:
> OTOH, isn't that the grail that many people have been searching for: 
> programming by simply declaring the results that they want to see?

Possibly.

> No, FPLs are actually just that: compilable postconditions.

This seems to me a bit misleading.  Perhaps someone will explain why I 
should stop thinking this way; but currently I classify statements like 
this in the "well, sort of" slot of my mind.  If functional programming 
were really just compilable postconditions, then functional programmers 
would be able to skip a good bit of stuff that they really can't.  For 
example, time and space complexity of code is still entirely relevant 
for functional programming.  I can't simply write:

(define fib
  (lambda (x) (if (< x 2) 1 (+ (fib (- x 1)) (fib (- x 2))))))

and expect the compiler to create an efficient algorithm for me.  This 
is true even though the above is (the LISP transcription of) the most 
natural way to describe the Fibonacci sequence from a mathematical 
standpoint.  It still runs in exponential time, and it still matters 
that it runs in exponential time; and LISP programmers adapt their so-
called declarative code to improve time bounds all the time.  This makes 
it harder for me to call it declarative (or "compilable postconditions") 
and feel entirely honest.

(Yes, I realize that the above could be optimized in a language that 
does normal order evaluation with common subexpression elimination, and 
become linear-time.  However, that's not true of algorithms in general.  
It is not the case that all that's needed to find an efficient algorithm 
for something is to plug it into a Haskell compiler and observe what 
happens.  Or, if that is the case, there are a few CS professors I know 
who would be quite interested in hearing so.)
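To make the point concrete: here's the same definition in Python, in both its "natural" exponential form and a memoized form. Both satisfy the same postcondition (they compute the same values), yet a programmer still had to choose the efficient one; the compiler didn't find it. (This is an illustrative sketch, not from the original posts.)

```python
from functools import lru_cache

def fib_naive(x):
    # Direct transcription of the mathematical definition: exponential time,
    # because fib_naive(x - 1) and fib_naive(x - 2) recompute shared subproblems.
    return 1 if x < 2 else fib_naive(x - 1) + fib_naive(x - 2)

@lru_cache(maxsize=None)
def fib_memo(x):
    # Same "postcondition", different algorithm: linear time via memoization.
    return 1 if x < 2 else fib_memo(x - 1) + fib_memo(x - 2)
```

fib_memo(100) returns instantly; fib_naive(100) would not finish in a lifetime, despite being the same declarative specification.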

> Computability issues aren't more or less a factor than with other kinds 
> of compilers: they do limit what you can do, but these limits are loose 
> enough that you can do really useful stuff within them (in particular, 
> express all algorithms).

Is it really consistent to say that postconditions allow you to express 
algorithms?

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: David Hopwood
Subject: Re: What is a type error?
Date: 
Message-ID: <Uzxtg.93986$7Z6.48541@fe2.news.blueyonder.co.uk>
Chris Smith wrote:
> Joachim Durchholz <··@durchholz.org> wrote:
> 
>>OTOH, isn't that the grail that many people have been searching for: 
>>programming by simply declaring the results that they want to see?
> 
> Possibly.
> 
>>No, FPLs are actually just that: compilable postconditions.
> 
> This seems to me a bit misleading.  Perhaps someone will explain why I 
> should stop thinking this way; but currently I classify statements like 
> this in the "well, sort of" slot of my mind.  If functional programming 
> were really just compilable postconditions, then functional programmers 
> would be able to skip a good bit of stuff that they really can't.  For 
> example, time and space complexity of code is still entirely relevant 
> for functional programming.  I can't simply write:
> 
> (define fib
>   (lambda (x) (if (< x 2) 1 (+ (fib (- x 1)) (fib (- x 2))))))
> 
> and expect the compiler to create an efficient algorithm for me.

This is true, but note that postconditions also need to be efficient
if we are going to execute them.

That is, the difference you've pointed out is not a difference between
executable postconditions and functional programs. Both the inefficient
functional definition of 'fib' and an efficient one are executable
postconditions. In order to prove that the efficient implementation is
as correct as the inefficient one, we need to prove that, treated as
postconditions, the former implies the latter.

(In this case a single deterministic result is required, so the former
will be equivalent to the latter.)

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Chris Smith
Subject: Re: What is a type error?
Date: 
Message-ID: <MPG.1f2076205513cf2a989797@news.altopia.net>
David Hopwood <····················@blueyonder.co.uk> wrote:
> This is true, but note that postconditions also need to be efficient
> if we are going to execute them.

If checked by execution, yes.  In which case, I am trying to get my head 
around how it's any more true to say that functional languages are 
compilable postconditions than to say the same of imperative languages.  
In both cases, some statement is asserted through a description of a 
means of computing it.  There may be a distinction worth making here, 
but I'm missing it so far.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: David Hopwood
Subject: Re: What is a type error?
Date: 
Message-ID: <d1Ctg.313440$8W1.311974@fe1.news.blueyonder.co.uk>
Chris Smith wrote:
> David Hopwood <····················@blueyonder.co.uk> wrote:
> 
>>This is true, but note that postconditions also need to be efficient
>>if we are going to execute them.
> 
> If checked by execution, yes.  In which case, I am trying to get my head 
> around how it's any more true to say that functional languages are 
> compilable postconditions than to say the same of imperative languages.

A postcondition must, by definition [*], be a (pure) assertion about the
values that relevant variables will take after the execution of a subprogram.

If a subprogram is intended to have side effects, then its postcondition
can describe the results of the side effects, but must not reexecute them.


[*] E.g. see
    <http://www.spatial.maine.edu/~worboys/processes/hoare%20axiomatic.pdf>,
    although the term "postcondition" was introduced later than this paper.

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Chris Smith
Subject: Re: What is a type error?
Date: 
Message-ID: <MPG.1f247767eefec77e989690@news.altopia.net>
David Hopwood <····················@blueyonder.co.uk> wrote:
> Chris Smith wrote:
> > If checked by execution, yes.  In which case, I am trying to get my head 
> > around how it's any more true to say that functional languages are 
> > compilable postconditions than to say the same of imperative languages.
> 
> A postcondition must, by definition [*], be a (pure) assertion about the
> values that relevant variables will take after the execution of a subprogram.

Okay, my objection was misplaced, then, as I misunderstood the original 
statement.  I had understood it to mean that programs written in pure 
functional languages carry no additional information except for their 
postconditions.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Joachim Durchholz
Subject: Re: What is a type error?
Date: 
Message-ID: <e9fj2i$2mq$1@online.de>
Chris Smith schrieb:
> David Hopwood <····················@blueyonder.co.uk> wrote:
>> Chris Smith wrote:
>>> If checked by execution, yes.  In which case, I am trying to get my head 
>>> around how it's any more true to say that functional languages are 
>>> compilable postconditions than to say the same of imperative languages.
 >>
>> A postcondition must, by definition [*], be a (pure) assertion about the
>> values that relevant variables will take after the execution of a subprogram.
> 
> Okay, my objection was misplaced, then, as I misunderstood the original 
> statement.  I had understood it to mean that programs written in pure 
> functional languages carry no additional information except for their 
> postconditions.

Oh, but that's indeed correct for pure functional languages. (Well, they 
*might* carry additional information anyway, but it's not a requirement 
to make it a programming language.)

The answer that I have for your original objection may be partial, but 
here goes anyway:

Postconditions cannot easily capture all side effects.
E.g. assume there's a file-open routine but no way to test whether a 
given file handle was ever opened (callers are supposed to test the 
return code from the open() call).
In an imperative language, that's a perfectly valid (though possibly not 
very good) library design.
Now the postcondition would have to say something like "from now on, 
read() and write() are valid on the filehandle that I returned". I know 
of no assertion language that can express such temporal relationships, 
and even if there is (I'm pretty sure there is), I'm rather sceptical 
that programmers would be able to write correct assertions, or correctly 
interpret them - temporal logic offers several traps for the unwary. 
(The informal postcondition above certainly isn't complete, since it 
doesn't cover close(); I shunned the effort to get a wording that would 
correctly cover sequences like open()-close()-open(), or 
open()-close()-close()-open().)
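One pragmatic (if partial) workaround is to reify the temporal property as explicit state and check it dynamically, rather than trying to state it in an assertion language. A hypothetical sketch, with made-up names, of a wrapper enforcing "read()/write() are valid only between open() and close()":

```python
class CheckedHandle:
    """Dynamically checks the temporal contract that read() is only valid
    between open() and close(), and that close() is never called twice."""

    def __init__(self):
        self.state = "closed"

    def open(self):
        assert self.state == "closed", "open() requires a closed handle"
        self.state = "open"

    def read(self):
        assert self.state == "open", "read() before open() or after close()"
        return b""  # placeholder for real I/O

    def close(self):
        assert self.state == "open", "close() requires an open handle"
        self.state = "closed"
```

This correctly rejects sequences like open()-close()-close(), but of course it only detects violations at runtime; it doesn't give the static guarantee a temporal-logic specification would.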

Regards,
Jo
From: Darren New
Subject: Re: What is a type error?
Date: 
Message-ID: <2kPug.24230$Z67.2556@tornado.socal.rr.com>
Joachim Durchholz wrote:
> of no assertion language that can express such temporal relationships, 
> and even if there is (I'm pretty sure there is), I'm rather sceptical 
> that programmers would be able to write correct assertions, or correctly 
> interpret them - temporal logic offers several traps for the unwary. 

FWIW, this is exactly the area at which LOTOS (Language Of Temporal 
Ordering Specification) is targeted. It's essentially based on 
CSP, but somewhat extended. It's pretty straightforward to learn and 
understand, too.  Some have even added "realtime" constraints to it.

-- 
   Darren New / San Diego, CA, USA (PST)
     This octopus isn't tasty. Too many
     tentacles, not enough chops.
From: Joachim Durchholz
Subject: Re: What is a type error?
Date: 
Message-ID: <e93djg$9tl$1@online.de>
Marshall schrieb:
> So, what exactly separates a precondition from a postcondition
> from an invariant? I have always imagined that one writes
> assertions on parameters and return values, and those
> assertions that only reference parameters were preconditions
> and those which also referenced return values were postconditions.
> Wouldn't that be easy enough to determine automatically?

Sure.
Actually this can be done in an even simpler fashion; e.g. in Eiffel, it 
is (forgive me if I got the details wrong, it's been several years since 
I last coded in it):

   foo (params): result_type is
   require
     -- list of assertions; these become preconditions
   do
     -- subroutine body
   ensure
     -- list of assertions; these become postconditions
   end
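In languages without built-in support, the same require/ensure shape can be approximated with a decorator. This is a hypothetical sketch (the `contract` decorator is invented for illustration, not a standard library facility):

```python
import functools

def contract(require=None, ensure=None):
    """Hypothetical decorator mimicking Eiffel's require/ensure blocks."""
    def wrap(f):
        @functools.wraps(f)
        def wrapper(*args, **kwargs):
            if require is not None:
                assert require(*args, **kwargs), "precondition failed"
            result = f(*args, **kwargs)
            if ensure is not None:
                assert ensure(result, *args, **kwargs), "postcondition failed"
            return result
        return wrapper
    return wrap

@contract(require=lambda x: x >= 0,
          ensure=lambda r, x: r * r <= x < (r + 1) * (r + 1))
def isqrt(x):
    # Deliberately naive integer square root; the contract, not the
    # implementation, states what the routine guarantees.
    r = 0
    while (r + 1) * (r + 1) <= x:
        r += 1
    return r
```

As in Eiffel, the assertions that only mention the arguments become the precondition, and the one mentioning the result becomes the postcondition.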

> And what's an invariant anyway?

In general, it's a postcondition to all routines that create or modify a 
data structure of a given type.

Eiffel does an additional check at routine entry, but that's just a 
last-ditch line of defense against invariant destruction via aliases, 
not a conceptual thing. Keep aliases under control via modes (i.e. 
"unaliasable" marks on local data to prevent aliases from leaving the 
scope of the data structure), or via locking, or simply by disallowing 
destructive updates, and you don't need the checks at routine entry anymore.

Regards,
Jo
From: David Hopwood
Subject: Re: What is a type error?
Date: 
Message-ID: <fqftg.311080$8W1.234575@fe1.news.blueyonder.co.uk>
Marshall wrote:
> Joachim Durchholz wrote:
[...]
>>Preconditions/postconditions can express anything you want, and they are
>>an absolutely natural extensions of what's commonly called a type
>>(actually the more powerful type systems have quite a broad overlap with
>>assertions).
>>I'd essentially want to have an assertion language, with primitives for
>>type expressions.
> 
> Now, I'm not fully up to speed on DBC. The contract specifications,
> these are specified statically, but checked dynamically, is that
> right? In other words, we can consider contracts in light of
> inheritance, but the actual verification and checking happens
> at runtime, yes?

For DBC as implemented in Eiffel, yes.

> Wouldn't it be possible to do them at compile time? (Although
> this raises decidability issues.)

It is certainly possible to prove statically that some assertions cannot fail.

The ESC/Java 2 (http://secure.ucd.ie/products/opensource/ESCJava2/docs.html)
tool for JML (http://www.cs.iastate.edu/~leavens/JML/) is one system that does
this -- although some limitations and usability problems are described in
<http://secure.ucd.ie/products/opensource/ESCJava2/ESCTools/papers/CASSIS2004.pdf>.

> Mightn't it also be possible to
> leave it up to the programmer whether a given contract
> was compile-time or runtime?

That would be possible, but IMHO a better option would be for an IDE to give
an indication (by highlighting, for example), which contracts are dynamically
checked and which are static.

This property is, after all, not something that the program should depend on.
It is determined by how good the static checker currently is, and we want to be
able to improve checkers (and perhaps even allow them to regress slightly in
order to simplify them) without changing programs.

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1152758668.738197.54170@p79g2000cwp.googlegroups.com>
David Hopwood wrote:
> Marshall wrote:
>
> > Wouldn't it be possible to do them at compile time? (Although
> > this raises decidability issues.)
>
> It is certainly possible to prove statically that some assertions cannot fail.
>
> The ESC/Java 2 (http://secure.ucd.ie/products/opensource/ESCJava2/docs.html)
> tool for JML (http://www.cs.iastate.edu/~leavens/JML/) is one system that does
> this -- although some limitations and usability problems are described in
> <http://secure.ucd.ie/products/opensource/ESCJava2/ESCTools/papers/CASSIS2004.pdf>.

I look forward to reading this. I read a paper on JML a while ago and
found it quite interesting.


> > Mightn't it also be possible to
> > leave it up to the programmer whether a given contract
> > was compile-time or runtime?
>
> That would be possible, but IMHO a better option would be for an IDE to give
> an indication (by highlighting, for example), which contracts are dynamically
> checked and which are static.
>
> This property is, after all, not something that the program should depend on.
> It is determined by how good the static checker currently is, and we want to be
> able to improve checkers (and perhaps even allow them to regress slightly in
> order to simplify them) without changing programs.

Hmmm. I have heard that argument before and I'm conflicted.

I can think of more reasons than just runtime safety for which I'd
want proofs. Termination for example, in highly critical code;
not something for which a runtime check will suffice. On the
other hand the points you raise are good ones, and affect
the usability of the language.


Marshall
From: Chris Smith
Subject: Re: What is a type error?
Date: 
Message-ID: <MPG.1f1f68abb8043a7398978d@news.altopia.net>
Marshall <···············@gmail.com> wrote:
> David Hopwood wrote:
> > Marshall wrote:

> > > Mightn't it also be possible to
> > > leave it up to the programmer whether a given contract
> > > was compile-time or runtime?
> >
> > That would be possible, but IMHO a better option would be for an IDE to give
> > an indication (by highlighting, for example), which contracts are dynamically
> > checked and which are static.
> >
> > This property is, after all, not something that the program should depend on.
> > It is determined by how good the static checker currently is, and we want to be
> > able to improve checkers (and perhaps even allow them to regress slightly in
> > order to simplify them) without changing programs.
> 
> Hmmm. I have heard that argument before and I'm conflicted.
> 
> I can think of more reasons than just runtime safety for which I'd
> want proofs. Termination for example, in highly critical code;
> not something for which a runtime check will suffice. On the
> other hand the points you raise are good ones, and affect
> the usability of the language.

There doesn't seem to be a point of disagreement here.  Programmers 
often need to require certain properties to be checked at compile-time.  
Others could go either way.  There is no property that a program would 
rationally desire to *require* be checked at runtime; that would only 
occur because the compiler doesn't know how to check it at compile time.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Joachim Durchholz
Subject: Re: What is a type error?
Date: 
Message-ID: <e9593v$8g3$1@online.de>
Marshall schrieb:
> David Hopwood wrote:
>> This property is, after all, not something that the program should depend on.
>> It is determined by how good the static checker currently is, and we want to be
>> able to improve checkers (and perhaps even allow them to regress slightly in
>> order to simplify them) without changing programs.
> 
> Hmmm. I have heard that argument before and I'm conflicted.

I'd want several things.

A way for me to indicate what assertions must be proven statically.
Highlighting (be it compiler messages or flashing colors in an IDE) that 
marks assertions that *will* break.
And highlighting for assertions that *may* break.
In the language, a (possibly) simplistic inference engine definition 
that gives me minimum guarantees about the things that it will be able 
to prove; if something is out of the reach of the engine, a 
straightforward way to add intermediate assertions until the inference 
succeeds.

(Plus diagnostics that tell me where the actual error may be, whether 
it's a bug in the code or an omission in the assertions. That's probably 
the hardest part of it all.)

Regards,
Jo
From: David Hopwood
Subject: Re: What is a type error?
Date: 
Message-ID: <0cxtg.61529$181.2346@fe3.news.blueyonder.co.uk>
Marshall wrote:
> David Hopwood wrote:
>>Marshall wrote:
>>
>>>Wouldn't it be possible to do them at compile time? (Although
>>>this raises decidability issues.)
>>
>>It is certainly possible to prove statically that some assertions cannot fail.
>>
>>The ESC/Java 2 (http://secure.ucd.ie/products/opensource/ESCJava2/docs.html)
>>tool for JML (http://www.cs.iastate.edu/~leavens/JML/) is one system that does
>>this -- although some limitations and usability problems are described in
>><http://secure.ucd.ie/products/opensource/ESCJava2/ESCTools/papers/CASSIS2004.pdf>.
> 
> I look forward to reading this. I read a paper on JML a while ago and
> found it quite interesting.
> 
>>>Mightn't it also be possible to
>>>leave it up to the programmer whether a given contract
>>>was compile-time or runtime?
>>
>>That would be possible, but IMHO a better option would be for an IDE to give
>>an indication (by highlighting, for example), which contracts are dynamically
>>checked and which are static.
>>
>>This property is, after all, not something that the program should depend on.
>>It is determined by how good the static checker currently is, and we want to be
>>able to improve checkers (and perhaps even allow them to regress slightly in
>>order to simplify them

.. or improve their performance ..

> ) without changing programs.
> 
> Hmmm. I have heard that argument before and I'm conflicted.
> 
> I can think of more reasons than just runtime safety for which I'd
> want proofs. Termination for example, in highly critical code;
> not something for which a runtime check will suffice.

It is true that some properties cannot be verified directly by a runtime check,
but that does not mean that runtime checks are not indirectly useful in verifying
them.

For example, we can check at runtime that a loop variant is strictly decreasing
with each iteration. Then, given that each iteration of the loop body terminates,
it is guaranteed that the loop terminates, *either* because the runtime check
fails, or because the variant goes to zero.
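The loop-variant idea can be sketched directly in code. Here is an illustrative Python version using Euclid's algorithm, where the variant is the second operand; the runtime assert guarantees that either the loop terminates or the check fires:

```python
def gcd_with_variant(a, b):
    """Euclid's algorithm with a runtime check that the loop variant
    (here, b) strictly decreases and stays non-negative each iteration."""
    assert a >= 0 and b >= 0
    while b != 0:
        variant = b
        a, b = b, a % b
        # Termination check: if this assert never fails, the variant is a
        # strictly decreasing non-negative integer, so the loop must end.
        assert 0 <= b < variant, "loop variant failed to decrease"
    return a
```

Since each iteration either fails the assert or shrinks a non-negative integer, non-termination is impossible, which is exactly the hybrid runtime-check-plus-static-argument described above.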

In general, we can verify significantly more program properties using a
combination of runtime checks and static proof, than we can using static proof
alone. That may seem like quite an obvious statement, but the consequence is
that any particular property is, in general, not verified purely statically or
purely at runtime.

I am not opposed to being able to annotate an assertion to say that it should
be statically provable and that a runtime check should not be used. However,

 - such annotations should be very lightweight and visually undistracting,
   relative to the assertion itself;

 - a programmer should not interpret such an annotation on a particular assertion
   to mean that its static validity is not reliant on runtime checks elsewhere;

 - if the class of assertions that are statically provable changes, then a
   tool should be provided which can *automatically* add or remove these
   annotations (with programmer approval when they are removed).


I'd like to make a couple more comments about when it is sufficient to detect
errors and when it is necessary to prevent them:

 - If a language supports transactions, then this increases the proportion
   of cases in which it is sufficient to detect errors in imperative code.
   When state changes are encapsulated in a transaction, it is much easier
   to recover if an error is detected, because invariants that were true of
   objects used by the transaction when it started will be automatically
   reestablished. (Purely functional code does not need this.)

 - Almost all safety-critical systems have a recovery or safe shutdown
   behaviour which should be triggered when an error is detected in the
   rest of the program. The part of the program that implements this behaviour
   definitely needs to be statically correct, but it is usually only a small
   amount of code.

   Safety-critical systems that must either prevent errors or continue
   functioning in their presence (aircraft control systems, for example) are
   in a separate category that demand *much* greater verification effort. Even
   for these systems, though, it is still useful to detect errors in cases
   where they cannot be prevented. When multiple independent implementations
   of a subsystem are used to check each other, this error detection can be
   used as an input to the decision of which implementation is failing and
   which should take over.

-- 
David Hopwood <····················@blueyonder.co.uk>
From: David Hopwood
Subject: Re: What is a type error?
Date: 
Message-ID: <jYCsg.74231$7Z6.56831@fe2.news.blueyonder.co.uk>
Chris Smith wrote:
> I <·······@twu.net> wrote:
> 
>>Incidentally, I'm not saying that such a feature would be a good idea.

I don't think it would be a bad idea. Silently giving incorrect results
on arithmetic overflow, as C-family languages do, is certainly a bad idea.
A type system that supported range type arithmetic as you've described would
have considerable advantages, especially in areas such as safety-critical
software. It would be a possible improvement to Ada, which IIUC currently
has a more restrictive range typing system that cannot infer different
ranges for a variable at different points in the program.

I find that regardless of programming language, relatively few of my
integer variables are dimensionless -- most are associated with some
specific unit. Currently, I find variable naming conventions helpful in
documenting this, but the result is probably more verbose than it would
be to drop this information from the names, and instead use more
precise types that indicate the unit and the range.

When prototyping, you could alias all of these to bignum types (with
range [0..+infinity) or (-infinity..+infinity)) to avoid needing to deal
with any type errors, and then constrain them where necessary later.
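In a language without such a type system, the unit-plus-range discipline can at least be prototyped dynamically. A hypothetical sketch (all names invented): a wrapper that checks its range at construction and derives result ranges by interval arithmetic, with the bounds defaulting to infinity for the prototyping phase described above:

```python
import math

class RangedInt:
    """Hypothetical unit-and-range-checked integer: a runtime stand-in
    for the static range types discussed above."""

    def __init__(self, value, lo=-math.inf, hi=math.inf, unit=""):
        # Range check plays the role of the static range type.
        assert lo <= value <= hi, f"{value} outside [{lo}, {hi}]"
        self.value, self.lo, self.hi, self.unit = value, lo, hi, unit

    def __add__(self, other):
        # Unit check replaces the naming convention.
        assert self.unit == other.unit, "unit mismatch"
        # Interval arithmetic: the result's range follows from the operands'.
        return RangedInt(self.value + other.value,
                         self.lo + other.lo, self.hi + other.hi, self.unit)
```

With the default infinite bounds this behaves like a plain (big) integer, and ranges can be tightened later exactly as suggested.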

>>It generally isn't provided in languages specifically because it gets to 
>>be a big pain to maintain all of the type specifications for this kind 
>>of stuff.

It would take a little more work to write a program, but it would be no
more difficult to read (easier if you're also trying to verify its correctness).
Ease of reading programs is more important than ease of writing them.

> There are other good reasons, too, as it turns out.  I don't want to 
> overstate the "possible" until it starts to sound like "easy, even if 
> it's a pain".  This kind of stuff is rarely done in mainstream 
> programming languages because it has serious negative consequences.
> 
> For example, I wrote that example using variables of type int.  If we 
> were to suppose that we were actually working with variables of type 
> Person, then things get a little more complicated.  We would need a few 
> (infinite classes of) derived subtypes of Person that further constrain 
> the possible values for state.  For example, we'd need types like:
> 
>     Person{age:{18..29}}
> 
> But this starts to look bad, because we used to have this nice property 
> called encapsulation.

I think you're assuming that 'age' would have to refer to a concrete field.
If it refers to a type parameter, something like:

  class Person{age:Age} is
     Age getAge()
  end

then I don't see how this breaks encapsulation.

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Darren New
Subject: Re: What is a type error?
Date: 
Message-ID: <Pevsg.26648$uy3.13388@tornado.socal.rr.com>
Chris Smith wrote:
>         // Inside this block, a has type int{17..21} and b has type
>         // int{18..22}

Now what happens if right here you code
   b := 16;

Does that again change the type of "b"? Or is that an illegal 
instruction, because "b" has the "local type" of (18..22)?

>         signContract(a); // error, because a might be 17
>         signContract(b); // no error, because even though the declared
>                          // type of b is int{14..22}, it has a local
>                          // type of int{18..22}.


If the former (i.e., if reassigning to "b" changes the "static type" of 
b), then the term you're looking for is not type, but "typestate".

In other words, this is the same sort of test that disallows using an 
unassigned variable in a value-returning expression.  When
   { int a; int b; b := a; }
returns a compile-time error because "a" is uninitialized at the 
assignment, that's not the "type" of a, but the typestate. Just FYI.

> Incidentally, I'm not saying that such a feature would be a good idea.  

It actually works quite well if the language takes advantage of it 
consistently and allows you to designate your expected typestates and such.

-- 
   Darren New / San Diego, CA, USA (PST)
     This octopus isn't tasty. Too many
     tentacles, not enough chops.
From: Chris Smith
Subject: Re: What is a type error?
Date: 
Message-ID: <MPG.1f1c334429ce4ae989775@news.altopia.net>
Darren New <····@san.rr.com> wrote:
> Chris Smith wrote:
> >         // Inside this block, a has type int{17..21} and b has type
> >         // int{18..22}
> 
> Now what happens if right here you code
>    b := 16;
> 
> Does that again change the type of "b"? Or is that an illegal 
> instruction, because "b" has the "local type" of (18..22)?

It arranges that the expression "b" after that line (barring further 
changes) has type int{16..16}, which would make the later call to 
signContract illegal.

> If the former (i.e., if reassigning to "b" changes the "static type" of 
> b), then the term you're looking for is not "type" but "typestate".

We're back to discussing terminology, then.  How fun.  Yes, the word 
"typestate" is used to describe this in a good bit of literature.  
Nevertheless, a good number of authors -- including all of them that I'm 
aware of in programming language type theory -- would agree that "type" 
is a perfectly fine word.

When I said b has a type of int{18..22}, I meant that the type that will 
be inferred for the expression "b" when it occurs inside this block as 
an rvalue will be int{18..22}.  The type of the variable didn't change, 
because variables don't *have* types.  Expressions (or, depending on 
your terminology preference, terms) have types.  An expression "b" that 
occurs after your assignment is a different expression from the one that 
occurs before your assignment, so it's entirely expected that in the 
general case, it may have a different type.

It's also the case (and I didn't really acknowledge this before) that 
the expression "b" when used as an lvalue has a different type, which is 
determined according to different rules.  As such, the assignment to b 
was not at all influenced by the new type that was arranged for the 
expression "b" as an rvalue.

(I'm using lvalue and rvalue intuitively; in practice, these would be 
assigned on a case-by-case basis along the lines of actual operators or 
language syntax.)
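
This behavior is not purely hypothetical. As a hedged sketch, here is the same idea in TypeScript, a real language with control-flow narrowing; the literal-union type below merely stands in for the int{18..22} notation used in this thread:

```typescript
// Literal-union type standing in for int{16} | int{18..22}.
type B = 16 | 18 | 19 | 20 | 21 | 22;

function localTypeOfB(b: B): string {
  if (b !== 16) {
    // Here the expression `b` is narrowed to 18 | 19 | 20 | 21 | 22.
    const inRange: 18 | 19 | 20 | 21 | 22 = b; // accepted by the checker
    return `in range: ${inRange}`;
  }
  // Here `b` has been narrowed to the literal type 16.
  const sixteen: 16 = b; // accepted; `const bad: 18 = b;` would be rejected
  return `out of range: ${sixteen}`;
}

function reassign(): number {
  let b: B = 18; // the expression `b` now has type 18
  b = 16;        // ...and now type 16: a later call expecting 18..22 fails
  return b;
}
```

The declared type of the variable never changes; only the type inferred for each occurrence of the expression `b` does.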

> In other words, this is the same sort of test that disallows using an 
> unassigned variable in a value-returning expression.

Yes, it is.

> When
>    { int a; int b; b := a; }
> returns a compile-time error because "a" is uninitialized at the 
> assignment, that's not the "type" of a, but the typestate. Just FYI.

If you wish to say "typestate" to mean this, be my guest.  It is also 
correct to say "type".

> It actually works quite well if the language takes advantage of it 
> consistently and allows you to designate your expected typestates and such.

I'm not aware of a widely used language that implements stuff like this.  
Are you?

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Darren New
Subject: Re: What is a type error?
Date: 
Message-ID: <nJvsg.22632$Z67.16855@tornado.socal.rr.com>
Chris Smith wrote:
> If you wish to say "typestate" to mean this, be my guest.  It is also 
> correct to say "type".

Sure. I just wasn't sure everyone here was aware of the term, is all. It 
makes it easier to google if you have a more specific term.

> I'm not aware of a widely used language that implements stuff like this.  

Of the languages I've used, only Hermes has extensive support for this 
sort of thing. 
Hermes is process-oriented, rather than object-oriented, so it's a 
little easier to deal with the "encapsulation" part of the equation 
there. Sadly, Hermes went the way of the dodo.

-- 
   Darren New / San Diego, CA, USA (PST)
     This octopus isn't tasty. Too many
     tentacles, not enough chops.
From: Marcin 'Qrczak' Kowalczyk
Subject: Re: What is a type error?
Date: 
Message-ID: <87k66kplgt.fsf@qrnik.zagroda>
Chris Smith <·······@twu.net> writes:

>> Now what happens if right here you code
>>    b := 16;
>> 
>> Does that again change the type of "b"? Or is that an illegal 
>> instruction, because "b" has the "local type" of (18..22)?
>
> It arranges that the expression "b" after that line (barring further 
> changes) has type int{16..16}, which would make the later call to 
> signContract illegal.

The assignment might be performed in a function called there, so it's
not visible locally.

Propagating constraints from conditionals is not applicable to mutable
variables, at least not easily.

I think that constant bounds are not very useful at all. Most ranges
are not known statically, e.g. a variable can span the size of an
array.

-- 
   __("<         Marcin Kowalczyk
   \__/       ······@knm.org.pl
    ^^     http://qrnik.knm.org.pl/~qrczak/
From: Chris Smith
Subject: Re: What is a type error?
Date: 
Message-ID: <MPG.1f1d777230d4eb9b989784@news.altopia.net>
Marcin 'Qrczak' Kowalczyk <······@knm.org.pl> wrote:
> Chris Smith <·······@twu.net> writes:
> >> Now what happens if right here you code
> >>    b := 16;
> >> 
> >> Does that again change the type of "b"? Or is that an illegal 
> >> instruction, because "b" has the "local type" of (18..22)?
> >
> > It arranges that the expression "b" after that line (barring further 
> > changes) has type int{16..16}, which would make the later call to 
> > signContract illegal.
> 
> The assignment might be performed in a function called there, so it's
> not visible locally.

Indeed, I pointed that out a few messages ago.  That doesn't mean it's 
impossible, but it does mean that it's more difficult.  Eventually, the 
compiler will have to stop checking something, somewhere.  It certainly 
doesn't, though, have to stop at the first functional abstraction it 
comes to.  The ways that a function modifies the global application 
state certainly ought to be considered part of the visible API of that 
function, and if we could reasonably express that in a type system, then 
that's great!  Granted, designing such a type system for an arbitrary 
imperative language seems a little scary.

> Propagating constraints from conditionals is not applicable to mutable
> variables, at least not easily.

Certainly it worked in the code from my original response to George.  
Regardless of whether it might not work in more complex scenarios (and I 
think it could, though it would be more challenging), it still doesn't 
seem reasonable to assert that the technique is not applicable.  If the 
type system fails, then it fails conservatively as always, and some 
programmer annotation and runtime check is needed to enforce the 
condition.

> I think that constant bounds are not very useful at all. Most ranges
> are not known statically, e.g. a variable can span the size of an
> array.

I think you are overestimating the difficulties here.  Specialized 
languages already exist that reliably (as in, all the time) move array 
bounds checking to compile time; that means that there exist at least 
some languages that have already solved this problem.  Going back to my 
handy copy of Pierce's book again, he claims that range checking is a 
solved problem in theory, and the only remaining work is in how to 
integrate it into a program without prohibitive amounts of type 
annotation.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1152635490.527372.51150@h48g2000cwc.googlegroups.com>
Chris Smith wrote:
>
> Going back to my
> handy copy of Pierce's book again, he claims that range checking is a
> solved problem in theory, and the only remaining work is in how to
> integrate it into a program without prohibitive amounts of type
> annotation.

This is in TAPL? Or ATTPL? Can you cite it a bit more specifically?
I want to reread that.


Marshall
From: Chris Smith
Subject: Re: What is a type error?
Date: 
Message-ID: <MPG.1f1d8f46483dc014989786@news.altopia.net>
Marshall <···············@gmail.com> wrote:
> Chris Smith wrote:
> > Going back to my
> > handy copy of Pierce's book again, he claims that range checking is a
> > solved problem in theory, and the only remaining work is in how to
> > integrate it into a program without prohibitive amounts of type
> > annotation.
> 
> This is in TAPL? Or ATTPL? Can you cite it a bit more specifically?
> I want to reread that.

It's TAPL.  On further review, I may have done more interpretation than 
I remembered.  In particular, the discussion is limited to array bounds 
checking, though I don't see a fundamental reason why it wouldn't extend 
to other kinds of range checking against constant ranges as we've been 
discussing here.

Relevant sections are:

Page 7 footnote: Mentions other challenges such as tractability that 
seem more fundamental, but those are treated as trading off against 
complexity of annotations; in other words, if we didn't require so many 
annotations, then it might become computationally intractable.  The next 
reference is better regarding tractability.

Section 30.5, pp. 460 - 465: Gives a specific example of a language 
(theoretical) developed to use limited dependent types to eliminate 
array bounds checking.  There's a specific mention there that it can be 
made tractable because the specific case of dependent types needed for 
bounds checking happens to have efficient algorithms, although dependent 
types are intractable in the general case.

I'm afraid that's all I've got.  I have also come across a real-life 
(though not general-purpose) language that does bounds-checking 
elimination via these techniques... but I can't find it now for the 
life of me.  I'll post if I remember what it is, soon.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Darren New
Subject: Re: What is a type error?
Date: 
Message-ID: <PmQsg.23044$Z67.11504@tornado.socal.rr.com>
Chris Smith wrote:
> Specialized 
> languages already exist that reliably (as in, all the time) move array 
> bounds checking to compile time; 

It sounds like this means the programmer has to code up what it means to 
index off an array, yes? Otherwise, I can't imagine how it would work.

x := read_integer_from_stdin();
write_to_stdout(myarray[x]);

What does the programmer have to do to implement this semantic in the 
sort of language you're talking about? Surely something somewhere along 
the line has to "fail" (for some meaning of failure) at run-time, yes?

-- 
   Darren New / San Diego, CA, USA (PST)
     This octopus isn't tasty. Too many
     tentacles, not enough chops.
From: Benjamin Franksen
Subject: Re: What is a type error?
Date: 
Message-ID: <e9thg1$1gkg$1@ulysses.news.tiscali.de>
Darren New wrote:
> Chris Smith wrote:
>> Specialized
>> languages already exist that reliably (as in, all the time) move array
>> bounds checking to compile time;
> 
> It sounds like this means the programmer has to code up what it means to
> index off an array, yes? Otherwise, I can't imagine how it would work.
> 
> x := read_integer_from_stdin();
> write_to_stdout(myarray[x]);
> 
> What does the programmer have to do to implement this semantic in the
> sort of language you're talking about? Surely something somewhere along
> the line has to  "fail" (for some meaning of failure) at run-time, yes?

You should really take a look at Epigram. One of its main features is that
it makes it possible not only to statically /check/ invariants, but also
to /establish/ them.

In your example, of course the program has to check the integer at runtime.
However, in a dependent type system, the type of the value returned from
the check-function can very well indicate whether it passed the test (i.e.
being a valid index for myarray) or not.
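
The same point can be approximated outside Epigram. A hedged sketch in TypeScript, using a "branded" number type as a stand-in for a genuine dependent type (all names here are invented for the illustration):

```typescript
// The bounds check runs once at runtime; its result *type* records that
// the index passed the test, so the lookup itself needs no further check.
// (Unlike a true dependent type, this brand is not tied to one array.)
declare const inBounds: unique symbol;
type ValidIndex = number & { readonly [inBounds]: true };

function checkIndex<T>(arr: readonly T[], i: number): ValidIndex | null {
  return Number.isInteger(i) && i >= 0 && i < arr.length
    ? (i as ValidIndex)
    : null;
}

function at<T>(arr: readonly T[], i: ValidIndex): T {
  return arr[i]; // in bounds by construction of ValidIndex
}

// Darren's example: the check-function's result type carries the evidence.
const myarray = ["a", "b", "c"];
const x = 2; // stands in for read_integer_from_stdin()
const idx = checkIndex(myarray, x);
const result = idx !== null ? at(myarray, idx) : "index out of range";
```

Passing an unchecked number straight to at() is rejected at compile time.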

Ben
From: Darren New
Subject: Re: What is a type error?
Date: 
Message-ID: <4dQsg.23042$Z67.18834@tornado.socal.rr.com>
Marcin 'Qrczak' Kowalczyk wrote:
> The assignment might be performed in a function called there, so it's
> not visible locally.

In Hermes, which actually does this sort of constraint propagation, you 
don't have the ability[1] to munge some other routine's[2] local 
variables, so that becomes a non-issue. You wind up with some strange 
constructs, tho, like an assignment statement that copies and a 
different assignment statement that puts the rvalue into the variable on 
the left while simultaneously destroying whatever variable held the rvalue.


[1] You do. Just not in a way visible to the programmer. The compiler 
manages to optimize out most places that different names consistently 
refer to the same value.

[2] There aren't subroutines. Just processes, with their own address 
space, to which you send and receive messages.
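
The copy-vs.-destroy assignment pair can at least be simulated. A hedged sketch in TypeScript with invented names (Hermes enforced this statically, which a runtime check only approximates):

```typescript
// Two assignments: copyFrom leaves the source alive; moveFrom puts the
// rvalue into the target while simultaneously destroying the source.
class Slot<T> {
  private value: T | undefined;
  constructor(v?: T) { this.value = v; }
  get(): T {
    if (this.value === undefined) throw new Error("use of destroyed variable");
    return this.value;
  }
  copyFrom(src: Slot<T>): void { this.value = src.get(); }
  moveFrom(src: Slot<T>): void {
    this.value = src.get();
    src.value = undefined; // source destroyed, as in the moving assignment
  }
}

const a = new Slot(5);
const b = new Slot<number>();
b.moveFrom(a); // b now holds 5; any later use of a is an error
```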

-- 
   Darren New / San Diego, CA, USA (PST)
     This octopus isn't tasty. Too many
     tentacles, not enough chops.
From: Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1151374704.304776.60630@u72g2000cwu.googlegroups.com>
Pascal Costanza wrote:
>
> Consider division by zero: appropriate arguments for division are
> numbers, including the zero.

A bold assertion!

The general question is, what do we do about partial functions?


> The dynamic type check will typically not
> check whether the second argument is zero, but will count on the fact
> that the processor will raise an exception one level deeper.

This is an implementation artifact, and hence not relevant to our
understanding of the issue.


Marshall
From: Pascal Costanza
Subject: Re: What is a type error?
Date: 
Message-ID: <4gdki2F1ldalhU2@individual.net>
Marshall wrote:
> Pascal Costanza wrote:
>> Consider division by zero: appropriate arguments for division are
>> numbers, including the zero.
> 
> A bold assertion!
> 
> The general question is, what do we do about partial functions?
> 
> 
>> The dynamic type check will typically not
>> check whether the second argument is zero, but will count on the fact
>> that the processor will raise an exception one level deeper.
> 
> This is an implementation artifact, and hence not relevant to our
> understanding of the issue.

No, it's not. I am pretty sure that you can model this formally.


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: David Hopwood
Subject: Re: What is a type error?
Date: 
Message-ID: <enZng.220091$8W1.160512@fe1.news.blueyonder.co.uk>
Pascal Costanza wrote:
> David Hopwood wrote:
>> Pascal Costanza wrote:
>>> Chris Smith wrote:
>>>
>>>> While this effort to salvage the term "type error" in dynamic
>>>> languages is interesting, I fear it will fail.  Either we'll all have
>>>> to admit that "type" in the dynamic sense is a psychological concept
>>>> with no precise technical definition (as was at least hinted by
>>>> Anton's post earlier, whether intentionally or not) or someone is
>>>> going to have to propose a technical meaning that makes sense,
>>>> independently of what is meant by "type" in a static system.
>>>
>>> What about this: You get a type error when the program attempts to
>>> invoke an operation on values that are not appropriate for this
>>> operation.
>>>
>>> Examples: adding numbers to strings; determining the string-length of a
>>> number; applying a function on the wrong number of parameters; applying
>>> a non-function; accessing an array with out-of-bound indexes; etc.
>>
>> This makes essentially all run-time errors (including assertion failures,
>> etc.) "type errors". It is neither consistent with, nor any improvement
>> on, the existing vaguely defined usage.
> 
> Nope. This is again a matter of getting the levels right.
> 
> Consider division by zero: appropriate arguments for division are
> numbers, including the zero. The dynamic type check will typically not
> check whether the second argument is zero, but will count on the fact
> that the processor will raise an exception one level deeper.

That is an implementation detail. I don't see its relevance to the above
argument at all.

> This is maybe better understandable in user-level code. Consider the
> following class definition:
> 
> class Person {
>   String name;
>   int age;
> 
>   void buyPorn() {
>    if (this.age < 18) throw new AgeRestrictionException();
>    ...
>   }
> }
> 
> The message p.buyPorn() is a perfectly valid message send and will pass
> both static and dynamic type tests (given p is of type Person in the
> static case). However, you will still get a runtime error.

Your definition above was "You get a type error when the program attempts
to invoke an operation on values that are not appropriate for this operation."

this.age, when less than 18, is a value that is inappropriate for the buyPorn()
operation, and so this satisfies your definition of a type error. My point was
that it's difficult to see any kind of program error that would *not* satisfy
this definition.

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Kent M Pitman
Subject: Re: What is a type error?
Date: 
Message-ID: <ufyhlqo1q.fsf@nhplace.com>
[Replying only to comp.lang.lisp
 http://www.nhplace.com/kent/PFAQ/cross-posting.html ]

Matthias Blume <····@my.address.elsewhere> writes:

> Pascal Costanza <··@p-cos.net> writes:
> 
> > Chris Smith wrote:
> >
> >> While this effort to salvage the term "type error" in dynamic
> >> languages is interesting, I fear it will fail.  Either we'll all
> >> have to admit that "type" in the dynamic sense is a psychological
> >> concept with no precise technical definition (as was at least hinted
> >> by Anton's post earlier, whether intentionally or not) or someone is
> >> going to have to propose a technical meaning that makes sense,
> >> independently of what is meant by "type" in a static system.

Chris, generic terms ARE inherently without meaning in the sense that
they are unions of incompatible meanings.  Get out an unabridged
dictionary and look up type.  You seem here to justify the use of only
one meaning of type, a bit in the way it sounds to me like missionaries
traveling across an ocean might say upon observing a scene they've
insufficiently studied: "these savages in this land I've come to have
no apparent sense of order and their lives will not be complete until
they learn the Way of the One True God".  There is a difference between
saying "I understand my own sense of order." and saying "That which I
fail to understand must be, by definition, without order or purpose."

> > What about this: You get a type error when the program attempts to
> > invoke an operation on values that are not appropriate for this
> > operation.
> >
> > Examples: adding numbers to strings; determining the string-length of
> > a number; applying a function on the wrong number of parameters;
> > applying a non-function; accessing an array with out-of-bound indexes;
> > etc.
> 
> Yes, the phrase "runtime type error" is actually a misnomer.  What one
> usually means by that is a situation where the operational semantics
> is "stuck", i.e., where the program, while not yet arrived at what's
> considered a "result", cannot make any progress because the current
> configuration does not match any of the rules of the dynamic
> semantics.
> 
> The reason why we call this a "type error" is that such situations are
> precisely the ones we want to statically rule out using sound static
> type systems.  So it is a "type error" in the sense that the static
> semantics was not strong enough to rule it out.
> 
> > Sending a message to an object that does not understand that message
> > is a type error. The "message not understood" machinery can be seen
> > either as a way to escape from this type error in case it occurs and
> > allow the program to still do something useful, or to actually remove
> > (some) potential type errors.
> 
> I disagree with this.  If the program keeps running in a defined way,
> then it is not what I would call a type error.  It definitely is not
> an error in the sense I described above.

Why do the people who design static type systems get to decide what
the word "type" should mean?

It's a common thing for people to arrive in a new world and to assume
that it should behave like the world they left.

For example: While it's common for tourists from any given country to
go abroad and be annoyed that other countries don't work like their
country at home, this is not a proof that any given country is
canonical in its operation.  It turns out, for example, that the US is
a foreign country to people who live outside of it ... just as those
countries are foreign to the US.

I often find the notion of static "type" to be enormously restrictive,
which is, of course, its source of power.  If it were not restrictive,
it might be hard to infer useful things from a static type system's
notion of type.  But in our world, we don't seek to infer the same set
of things, we use types in a different way.  Our desire to do different
things is not an abdication of the use of the word "type", however.

To push my off-topic analogy web a bit further: Being pro-life in the
abortion debate uses life a particular way, but does not give the
speaker a unique claim to control the meaning of "life"; indeed, while
the decision to be pro-life or pro-choice is a personal decision and I
don't want to characterize individuals in the movement as clones (not
that they would EVER be involved in cloning) of their leaders, it
seems still apparent that organizers of the movement have chosen the
generic term "life" as a way of saying "we own this term, and by
implication of an implicit exhaustive partition, those not in our camp
do not own it".  The choice of the term, in and of itself, seems
carefully crafted to say "we who join us politically care about
anything that might be construed about Life in any way and you who
do not join us politically obviously do not care about anything that
might be reasonably construed about Life".  And yet, that's obviously
not true.  But the words certainly feel like they push you in that
direction.  One must be wary of the way words influence us.

The term "type" is a bit like this.  I understand what "static type" is
and I understand what "dynamic type" is but I don't understand anyone
being puzzled over someone just insisting they have a lock on the meaning
"type" because it would seem that, by implication, the term "static" that
modifies "type" in the phrase "static type" is therefore redundant and
offers no meaning.  Surely that's not so.

We even get into this debate on the question of what is and is not Lisp.
Which is why Lisp has been reserved as the name of a language family and
people are encouraged to speak only of "Common Lisp" or "ISLISP" or 
"Lisp Machine Lisp" or "Emacs Lisp" because those are specific ... and why
it is a horrible travesty this newsgroup is called comp.lang.lisp, mostly
for indexing purposes with the past.  It really should be comp.lang.cl
and if there were a comp.lang.lisp, it properly should not be 
dialect-specific.  Alas.

I personally think that as a meta-theory of naming, one should never
claim a generic term (i.e., a single word that you can look up in the
dictionary and find to be a noun) unless they have a very open-minded
willingness to embrace modifiers (adjectives) and to negotiate with
political opposition.  So names like "variable" or "function" need
very broad definitions that embrace the many kinds of variables
(call-by-value vs call-by reference, ones users can name vs ones that
can be gensymed, ones that are single-valued vs ones that are
multiple-valued, ones that are typed and ones that are not) and
functions (pure mathematical functions, computer procedures, etc.)

Before you say that Lisp's dynamic typing doesn't match some notion of
type and before enumerating a criticism of what it does and doesn't
do, you need to establish a framework in which you've defined type and
to some extent to motivate why you get to have your notion of type be
thought canonical.  You might, for example, say "for the purposes of
this discussion, I define type this way: ..." and then we could have a
debate about your definition, but it would be important to note that
any criticisms of the language would be scoped to that debate.  Unless
you get the community as a whole to buy your definition of type, it's
not our definition, and we need to be judged by our definition.

It occurs to me that perhaps a Lisp notion of type is less like the
way that static people use it--that is, that it extends transitively
outward-- and more like a "point notion" that you get when you think
of the world as round, and yet you reason about your motion around a
room as if you were on a flat plane.  That is, at every point, the
curvature of the earth is so slight that for most people, it appears
flat (and the horizon effect mostly just a visual anomaly).  I think
I've heard there's even math that is made to embrace this point of
view with good effect, though I don't happen to use it.  In some
sense, dynamic typing is like that.  Because a variable is not typed,
what is in that variable might change dynamically in flow through a
program such that type doesn't have the non-local flow effect through
a conduit that static type transitivity might imply; and yet locally
at any given point, an object is or is not something you can operate
on, according to its type.  Declared type or not depends on details of
the language--it's been called tagging, but we avoid that term in part
because it suggests that you cannot compile out the tag.  In fact, in
fixnums, it's common to store them in arrays or to move them into
registers where there is no tag, and operations on fixnums are
designed to avoid pinning their EQness exactly so that you can split a
fixnum or a character and have one instance of it in one register and
another in another and not get tripped up over whether they have
suddenly become non-EQ...we just say they are still EQL and leave it
at that.  The point though is that the "type" is an intentionally
ethereal thing, as to where it lives--it might be manifest or might
not--but what matters, to us, is that the set of operations on typed
objects work as described.

To me a "type" means approximately "an attribute, manifest or
otherwise, shared by a set of things, possibly inspectable and
possibly not, but which if the thing has a given type, then a set of
operations associated with the type will be valid on the type".
That's pretty generic, and seems to me to span both dynamic and static
typing, by making each of these take choices on the questions of
whether type may be, may not be, must be, etc. inspectable, manifest,
etc.  There are also issues of whether the set of types is
inspectable, whether a type can be obtained from its name, whether all
instances can be obtained from the type or type name, etc.  Each of
these choice points leads to a different "type system", all of which
seem to me valid in an appropriate context.

Saying that a "static type system" is a true "type" system and that
in a "dynamic type system" we really mean "tag", not "type" is not
correct, at least in the real world where no one owns these terms.
(It is certainly reasonable within a more restricted context where
these terms do enjoy different meaning to use one's own terms, but to
go "abroad" to our group, where they don't have those meanings and
then assert alternate meanings is going a bit afield.)

We have no notion of tag in the language.  Nor any notion of pointer,
though people often find use to refer to pointers when explaining our
behavior.  Tag and pointer are implementation terms.  Abstractly, CL
has neither.

We choose to use the word "type" as we do, and we have as much right
to use that term with our heads held high as anyone who uses the term
differently.  Not because we are right and others are wrong, but
because in a pluralistic society, there is room for more than one
point of view.

Coming from a dynamic world, I always find it fascinating to hear
discussions of what static type systems do.  I would hope that people
who are from that world find ours equally enlightening and
fascinating.  But ultimately, neither of these is right or wrong.
They are just choices.  And, as you probably guessed above, I'm
pro-choice... (but not just the limited use of the word "choice" that
might be implied by that "other" political battlefield...a more
generic notion of choice...)

This message provided as a public service for those in the Lisp
community who think they have to hang their heads in shame and think
we have some kind of poor man's type system.  We don't.  We just have
a different type system.  It's stronger in some ways than a static
one, and weaker in other ways.  That's what different means.  You
never get anything for free--benefits come at the cost of
non-benefits.  And our type system has certain warts that come from
being poorly or too quickly thought out--but one of them is not being
"dynamic".  We've thought about that a lot over the years and the
decision has been repeatedly made to keep it dynamic.  That's not an
accident or a confusion, it's a deliberate choice.  As is our use of
the word "type".
From: Greg Menke
Subject: Re: What is a type error?
Date: 
Message-ID: <m364ihb4js.fsf@athena.pienet>
Thanks Kent- the late discussion on types was really bugging me and I
couldn't explain why.  Your response helps a lot.

The discussion reminded me a lot of the wrangle that's so easy to get
into when theoreticians and engineers start trying to communicate.  Each
camp wants & uses their own terms & definitions which seem ill-defined
or uselessly abstract to the other.

Greg
From: Kent M Pitman
Subject: Re: What is a type error?
Date: 
Message-ID: <uy7vau00b.fsf@nhplace.com>
Chris Smith <·······@twu.net> writes:

> Pascal Costanza <····@p-cos.net> wrote:
> > What about this: You get a type error when the program attempts to 
> > invoke an operation on values that are not appropriate for this operation.
> > 
> > Examples: adding numbers to strings; determining the string-length of a 
> > number; applying a function on the wrong number of parameters; applying 
> > a non-function; accessing an array with out-of-bound indexes; etc.
> 
> Hmm.  I'm afraid I'm going to be picky here.  I think you need to 
> clarify what is meant by "appropriate".  If you mean "the operation will 
> not complete successfully" as I suspect you do, then we're closer... but 
> this little snippet of Java (HORRIBLE, DO NOT USE!) confuses the matter 
> for me:
> 
>     int i = 0;
> 
>     try
>     {
>         while (true) process(myArray[i++]);
>     }
>     catch (IndexOutOfBoundsException e) { }
> 
> That's an array index from out of bounds that not only fails to be a 
> type error, but also fails to be an error at all!  (Don't get confused 
> by Java's having a static type system for other purposes... we are 
> looking at array indexing here, which Java checks dynamically.  I would 
> have used a dynamically typed language, if I could have written this as 
> quickly.)
> 
> I'm also unsure how your definition above would apply to languages that 
> do normal order evaluation, in which (at least in my limited brain) it's 
> nearly impossible to break down a program into sequences of operations 
> on actual values.  I suppose, though, that they do eventually happen 
> with primitives at the leaves of the derivation tree, so the definition 
> would still apply.

See Section I and particularly Section II of my 
 "Exceptional Situations in Lisp"
 http://www.nhplace.com/kent/Papers/Exceptional-Situations-1990.html
which tries to confront this issue.

The answer I arrived at when I first tried to write about this issue
was that there is an inherently-subjective but objectively-necessary
division that you must draw in any discussion between the normal and
exceptional parts of a program.  You won't win in your attempt to
apply uniform terminology because there is no single overarching view
that is consistently going to capture all angles.  (This is the
conversational equivalent of saying that a 2-D rendering of a 3-D
scene will be necessarily deficient in expressing all angles of the
view.)

The function + is properly described as something which normally adds 
numbers and signals an error when the arguments are not numbers.  It is
simply not a normal use of the function plus to do:
 (+ 3 'A)
just for the purpose of entering the debugger.  However, it is the
normal function of the ERROR function to force entry to the debugger.
That is, when one does:
 (error "Bad argument: ~S" 'A)
[which is what the + function above does], that is a normal view of
what ERROR does.

It's fine to say that + is a universal function that works on all data
and simply adds numbers but transfers control otherwise.  That
describes its behavior, but it leads to a certain way of understanding
your program that is more complicated than it might be. It's also fine
to describe + as something that only adds numbers and deal with the
other cases as exceptional--in this regard, it is harder to explain
how the function continues to run at all when that constraint is not
satisfied.  And yet, Lisp embraces both of those paradigms at once.

Mathematicians do this, too, even with types.  They have discussions
about types and insist that types be dealt with in a certain way, and yet
they have discussions about what happens in the case of violated
constraints.  They deal formally with proofs that involve logical
inconsistencies, and they manage to simply handle the data carefully
and still reach useful conclusions, reasoning about situations that
can't happen.  To understand what mathematicians are doing when they
reach logical inconsistencies in the propositional calculus and
nevertheless survive either requires something other than the
propositional calculus to describe, or at minimum requires a lot more
symbols in the propositional calculus to represent ... just as to
understand why type errors never happen anywhere in any computer
anywhere involves an understanding that computers mostly never "do
anything", "never represent anything", "never move data from place to
place", they just sit there being finite state machines and changing
ones to zeros back and forth relentlessly... anything else above that
is just an abstraction.  And if you don't understand that abstractions
are both mutable (you can add and remove them, apply them or ignore
them, according to circumstance) and yet at the same time important to
apply AS IF immutable, you get nowhere. 

All objective analysis is within a context, but all context is
subjective.  Or is it vice versa?  But in any case, the subjective and
the objective are not enemies, nor do they compete.  They are
collaborators, but they are different.  It's not so important where
you put one and where you put the other, as that you understand when
you are within the zone of one and when you are within the zone of the
other, because the rules for navigating them are different.
From: Jürgen Exner
Subject: Re: What is a type error?
Date: 
Message-ID: <2oDsg.14592$Wh7.11783@trnddc07>
David Hopwood wrote:
[...]

There is no such error message listed in 'perldoc perldiag'.
Please quote the actual text of your error message and include the program 
that produced this error.
Then people here in CLPM may be able to assist you.

jue 
From: David Hopwood
Subject: Re: What is a type error?
Date: 
Message-ID: <CRDsg.56590$181.46496@fe3.news.blueyonder.co.uk>
Jürgen Exner wrote:
> David Hopwood wrote:
> [...]
> 
> There is no such error message listed in 'perldoc perldiag'.
> Please quote the actual text of your error message and include the program 
> that produced this error.
> Then people here in CLPM may be able to assist you.

Yes, I'm well aware that most of this thread has been off-topic for c.l.p.m,
but it is no less off-topic for the other groups (except possibly c.l.functional),
and I can't trim the Newsgroups line down to nothing.

Don't you have a newsreader that can mark whole threads that you don't want
to read?

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Marshall
Subject: Re: What is a type error?
Date: 
Message-ID: <1152590555.413435.40270@s13g2000cwa.googlegroups.com>
David Hopwood wrote:
> Jürgen Exner wrote:
> > David Hopwood wrote:
> > [...]
> >
> > There is no such error message listed in 'perldoc perldiag'.
> > Please quote the actual text of your error message and include the program
> > that produced this error.
> > Then people here in CLPM may be able to assist you.
>
> Yes, I'm well aware that most of this thread has been off-topic for c.l.p.m,
> but it is no less off-topic for the other groups (except possibly c.l.functional),
> and I can't trim the Newsgroups line down to nothing.
>
> Don't you have a newsreader that can mark whole threads that you don't want
> to read?

Sure, or he could just skip over it. Or he could make a simple
request, such as "please trim comp.lang.whatever because it's
off-topic here." But he hasn't done any of these, choosing instead
to drop in with his occasional pointless snarky comments.


Marshall
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150902019.850531.204740@g10g2000cwb.googlegroups.com>
Chris Smith wrote:
>
>  When I used the word "type" above, I was adopting the
> working definition of a type from the dynamic sense.  That is, I'm
> considering whether statically typed languages may be considered to also
> have dynamic types, and it's pretty clear to me that they do.

I suppose this statement has to be evaluated on the basis of a
definition of "dynamic types." I don't have a firm definition for
that term, but my working model is runtime type tags. In which
case, I would say that among statically typed languages,
Java does have dynamic types, but C does not. C++ is
somewhere in the middle.


Marshall
From: Dr.Ruud
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7bv9v.1f8.1@news.isolution.nl>
Marshall schreef:

> "dynamic types." I don't have a firm definition for
> that term, but my working model is runtime type tags. In which
> case, I would say that among statically typed languages,
> Java does have dynamic types, but C does not. C++ is
> somewhere in the middle.

C has union.

-- 
Affijn, Ruud

"Gewoon is een tijger."
From: Rob Thorpe
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150910976.455179.55870@r2g2000cwb.googlegroups.com>
Dr.Ruud wrote:
> Marshall schreef:
>
> > "dynamic types." I don't have a firm definition for
> > that term, but my working model is runtime type tags. In which
> > case, I would say that among statically typed languages,
> > Java does have dynamic types, but C does not. C++ is
> > somewhere in the middle.
>
> C has union.

That's not the same thing.  The value of a union in C can be any of a
set of specified types.  But the program cannot find out which, and the
language doesn't know either.

With C++ and Java dynamic types the program can test to find the type.
From: Dr.Ruud
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7c8sr.1fk.1@news.isolution.nl>
Rob Thorpe schreef:
> Dr.Ruud:
>> Marshall:

>>> "dynamic types." I don't have a firm definition for
>>> that term, but my working model is runtime type tags. In which
>>> case, I would say that among statically typed languages,
>>> Java does have dynamic types, but C does not. C++ is
>>> somewhere in the middle.
>>
>> C has union.
>
> That's not the same thing.

That is your opinion. In the context of this discussion I don't see any
problem to put C's union under "dynamic types".


> The value of a union in C can be any of a
> set of specified types.  But the program cannot find out which, and
> the language doesn't know either.
>
> With C++ and Java dynamic types the program can test to find the type.

When such a test is needed for the program with the union, it has it.

-- 
Affijn, Ruud

"Gewoon is een tijger."
From: Rob Thorpe
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150990555.368011.48350@b68g2000cwa.googlegroups.com>
Dr.Ruud wrote:
> Rob Thorpe schreef:
> > Dr.Ruud:
> >> Marshall:
>
> >>> "dynamic types." I don't have a firm definition for
> >>> that term, but my working model is runtime type tags. In which
> >>> case, I would say that among statically typed languages,
> >>> Java does have dynamic types, but C does not. C++ is
> >>> somewhere in the middle.
> >>
> >> C has union.
> >
> > That's not the same thing.
>
> That is your opinion. In the context of this discussion I don't see any
> problem to put C's union under "dynamic types".

Let's put it like this:
1. In C++ and Java it is possible to make a variable that can (A) take on
many different types and (B) be tested by the programmer to find out what
the type currently is.
2. In C it is possible to make a variable that can do 1A but not 1B.

This is a statement of fact, not opinion.

I call languages that do #1 dynamically typed, in line with common
usage.

> > The value of a union in C can be any of a
> > set of specified types.  But the program cannot find out which, and
> > the language doesn't know either.
> >
> > With C++ and Java dynamic types the program can test to find the type.
>
> When such a test is needed for the program with the union, it has it.

What do you mean?
There is no way to test the type of the value inside a union in C.
From: Chris Uppal
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <44992e6c$3$664$bed64819@news.gradwell.net>
Chris Smith wrote:

> > It would be interesting to see what a language designed specifically to
> > support user-defined, pluggable, and perhaps composable, type systems
> > would look like. [...]
>
> You mean in terms of a practical programming language?  If not, then
> lambda calculus is used in precisely this way for the static sense of
> types.

Good point.  I was actually thinking about what a practical language might look
like, but -- hell -- why not start with theory for once ? ;-)


> I think Marshall got this one right.  The two are accomplishing
> different things.  In one case (the dynamic case) I am safeguarding
> against negative consequences of the program behaving in certain non-
> sensical ways.  In the other (the static case) I am proving theorems
> about the impossibility of this non-sensical behavior ever happening.

And so conflating the two notions of type (-checking) as a kind of category
error ?  If so then I see what you mean, and it's a useful distinction, but am
unconvinced that it's /so/ helpful a perspective that I would want to exclude
other perspectives which /do/ see the two as more-or-less trivial variants on
the same underlying idea.

> I acknowledge those questions.  I believe they are valid.  I don't know
> the answers.  As an intuitive judgement call, I tend to think that
> knowing the correctness of these things is of considerable benefit to
> software development, because it means that I don't have as much to
> think about at any one point in time.  I can validly make more
> assumptions about my code and KNOW that they are correct.  I don't have
> to trace as many things back to their original source in a different
> module of code, or hunt down as much documentation.  I also, as a
> practical matter, get development tools that are more powerful.

Agreed that these are all positive benefits of static declarative (more or
less) type systems.

But then (slightly tongue-in-cheek) shouldn't you be agitating for Java's type
system to be stripped out (we hardly /need/ it since the JVM does latent typing
anyway), leaving the field free for more powerful or more specialised static
analysis ?


> (Whether it's possible to create the same for a dynamically typed
> language is a potentially interesting discussion; but as a practical
> matter, no matter what's possible, I still have better development tools
> for Java than for JavaScript when I do my job.)

Acknowledged.  Contrary-wise, I have better development tools in Smalltalk than
I ever expect to have in Java -- in part (only in part) because of the late
binding in Smalltalk and its lack of insistence on declared types from an
arbitrarily chosen type system.


> On
> the other hand, I do like proving theorems, which means I am interested
> in type theory; if that type theory relates to programming, then that's
> great!  That's probably not the thing to say to ensure that my thoughts
> are relevant to the software development "industry", but it's
> nevertheless the truth.

Saying it will probably win you more friends in comp.lang.functional than it
loses in comp.lang.java.programmer ;-)

    -- chris
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7bka4$7no$1@online.de>
Chris Uppal schrieb:
> Chris Smith wrote:
>> I think Marshall got this one right.  The two are accomplishing
>> different things.  In one case (the dynamic case) I am safeguarding
>> against negative consequences of the program behaving in certain non-
>> sensical ways.  In the other (the static case) I am proving theorems
>> about the impossibility of this non-sensical behavior ever happening.
> 
> And so conflating the two notions of type (-checking) as a kind of category
> error ?  If so then I see what you mean, and it's a useful distinction, but am
> unconvinced that it's /so/ helpful a perspective that I would want to exclude
> other perspectives which /do/ see the two as more-or-less trivial variants on
> the same underlying idea.

It is indeed helpful.
Just think of all the unit tests that you don't have to write.

Regards,
Jo
From: Ketil Malde
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <eg4py81c3r.fsf@polarvier.ii.uib.no>
Chris Smith <·······@twu.net> writes:

> I've since abandoned any attempt to be picky about use of the word 
> "type".  That was a mistake on my part.  I still think it's legitimate 
> to object to statements of the form "statically typed languages X, but 
> dynamically typed languages Y", in which it is implied that Y is 
> distinct from X.  When I used the word "type" above, I was adopting the 
> working definition of a type from the dynamic sense.  

Umm...what *is* the definition of a "dynamic type"?  In this thread
it's been argued from tags and memory representation to (almost?)
arbitrary predicates.  

> That is, I'm considering whether statically typed languages may be
> considered to also have dynamic types, and it's pretty clear to me
> that they do. 

Well, if you equate tags with dynamic types:

        data Universal = Str String | Int Integer | Real Double 
                         | List [Universal] ...

A pattern match failure can then be seen as a dynamic type error.

Another question is how to treat a system (and I think C++'s RTTI
qualifies, and probably related languages like Java) where static type
information is available at runtime.  Does this fit with the term
"dynamic typing" with "typing" in the same sense as "static typing"?

-k
-- 
If I haven't seen further, it is by standing in the footprints of giants
From: Darren New
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <0%Wlg.3521$MF6.947@tornado.socal.rr.com>
Chris Uppal wrote:
> Personally, I would be quite happy to go there -- I dislike the idea that a
> value has a specific inherent type.

Interestingly, Ada defines a type as a collection of values. It works 
quite well, when one consistently applies the definition. For example, 
it makes very clear the difference between a value of (type T) and a 
value of (type T or one of its subtypes).

-- 
   Darren New / San Diego, CA, USA (PST)
     My Bath Fu is strong, as I have
     studied under the Showerin' Monks.
From: Chris Uppal
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <44992e6c$0$664$bed64819@news.gradwell.net>
Darren New wrote:

[me:]
> > Personally, I would be quite happy to go there -- I dislike the idea
> > that a value has a specific inherent type.
>
> Interestingly, Ada defines a type as a collection of values. It works
> quite well, when one consistently applies the definition.

I have never been very happy with relating type to sets of values (objects,
whatever).  I'm not saying that it's formally wrong (but see below), but it
doesn't fit with my intuitions very well -- most noticeably in that the sets
are generally unbounded, so you have to ask where the (intensional) definitions
come from.

Two other notions of what "type" means might be interesting, both come from
attempts to create type-inference mechanisms for Smalltalk or related
languages.  Clearly one can't use the set-of-values approach for these purposes
;-)   One approach takes "type" to mean "set of classes"; the other takes a
finer-grained approach and takes it to mean "set of selectors" (where
"selector" is Smalltalk for "name of a method" -- or, more accurately, name of
a message).

But I would rather leave the question of what a type "is" open, and consider
that to be merely part of the type system.  For instance the hypothetical
nullability analysis type system I mentioned might have only three types
NULLABLE, ALWAYSNULL, and NEVERNULL.

It's worth noting, too, that (in some sense) the type of an object can change
over time[*].  That can be handled readily (if not perfectly) in the informal
internal type system(s) which programmers run in their heads (pace the very
sensible post by Anton van Straaten today in this thread -- several branches
away), but cannot be handled by a type system based on sets-of-values (and is
also a counter-example to the idea that "the" dynamic type of an object/value
can be identified with its tag).

([*] if the set of operations in which it can legitimately partake changes.
That can happen explicitly in Smalltalk (using DNU proxies for instance if the
proxied object changes, or even using #becomeA:), but can happen anyway in less
"free" languages -- the State Pattern for instance, or even (arguably) in the
difference between an empty list and a non-empty list).

    -- chris
From: Andreas Rossberg
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7bfbq$8v164$3@hades.rz.uni-saarland.de>
Chris Uppal wrote:
> 
> I have never been very happy with relating type to sets of values (objects,
> whatever).

Indeed, this view is much too narrow. In particular, it cannot explain 
abstract types, which is *the* central aspect of decent type systems. 
There were papers observing this as early as 1970. A type system should 
rather be seen as a logic, stating invariants about a program. This can 
include operations supported by values of certain types, as well as more 
advanced properties, e.g. whether something can cause a side-effect, can 
diverge, can have a deadlock, etc.

(There are also theoretic problems with the types-as-sets view, because 
sufficiently rich type systems can no longer be given direct models in 
standard set theory. For example, first-class polymorphism would run 
afoul of the axiom of foundation.)

> It's worth noting, too, that (in some sense) the type of an object can change
> over time[*].

No. Since a type expresses invariants, this is precisely what may *not* 
happen. If certain properties of an object may change then the type of 
the object has to reflect that possibility. Otherwise you cannot 
legitimately call it a type.

Taking your example of an uninitialised reference, its type is neither 
"reference to nil" nor "reference to object that understands message X", 
it is in fact the union of both (at least). And indeed, languages with 
slightly more advanced type systems make things like this very explicit 
(in ML for example you have the option type for that purpose).

- Andreas
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7botf$g92$1@online.de>
Andreas Rossberg schrieb:
> Chris Uppal wrote:
> 
>> It's worth noting, too, that (in some sense) the type of an object can 
>> change over time[*].
> 
> No. Since a type expresses invariants, this is precisely what may *not* 
> happen.

No. A type is a set of allowable values, allowable operations, and 
constraints on the operations (which are often called "invariants" but 
they are invariant only as long as the type is invariant).

I very much agree that the association of a type with a value or a name 
should be constant over their lifetime - but that doesn't follow from 
the definition of "type", it follows from general maintainability 
considerations (quite strong ones actually).

Regards,
Jo
From: Andreas Rossberg
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7bq0m$8v164$5@hades.rz.uni-saarland.de>
Joachim Durchholz wrote:
>>
>>> It's worth noting, too, that (in some sense) the type of an object 
>>> can change over time[*].
>>
>> No. Since a type expresses invariants, this is precisely what may 
>> *not* happen.
> 
> No. A type is a set of allowable values, allowable operations, and 
> constraints on the operations (which are often called "invariants" but 
> they are invariant only as long as the type is invariant).

The purpose of a type system is to derive properties that are known to 
hold in advance. A type is the encoding of these properties. A type 
varying over time is an inherent contradiction (or another abuse of the 
term "type").

- Andreas
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7et3c$2bd$1@online.de>
Andreas Rossberg schrieb:
> Joachim Durchholz wrote:
>>>
>>>> It's worth noting, too, that (in some sense) the type of an object 
>>>> can change over time[*].
>>>
>>> No. Since a type expresses invariants, this is precisely what may 
>>> *not* happen.
>>
>> No. A type is a set of allowable values, allowable operations, and 
>> constraints on the operations (which are often called "invariants" but 
>> they are invariant only as long as the type is invariant).
> 
> The purpose of a type system is to derive properties that are known to 
> hold in advance.

That's just one of many possible purposes (a noble one, and the most 
preeminent one in FPLs I'll agree any day, but it's still *not the 
definition of a type*).

 > A type is the encoding of these properties. A type
> varying over time is an inherent contradiction (or another abuse of the 
> term "type").

No. It's just a matter of definition, essentially.
E.g. in Smalltalk and Lisp, it does make sense to talk of the "type" of 
a name or a value, even if that type may change over time.
I regard it as a highly dubious practice to have things change their 
types over their lifetime, but if there are enough other constraints, 
type constancy may indeed have to take a back seat.

Regards,
Jo
From: ········@ps.uni-sb.de
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151010139.960608.122470@b68g2000cwa.googlegroups.com>
Joachim Durchholz wrote:
>
>  > A type is the encoding of these properties. A type
> > varying over time is an inherent contradiction (or another abuse of the
> > term "type").
>
> No. It's just a matter of definition, essentially.
> E.g. in Smalltalk and Lisp, it does make sense to talk of the "type" of
> a name or a value, even if that type may change over time.

OK, now we are simply back full circle to Chris Smith's original
complaint that started this whole subthread, namely (roughly) that
long-established terms like "type" or "typing" should *not* be
stretched in ways like this, because that is technically inaccurate and
prone to misinterpretation.

- Andreas
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7lujg$q2r$1@online.de>
········@ps.uni-sb.de schrieb:
> Joachim Durchholz wrote:
>>  > A type is the encoding of these properties. A type
>>> varying over time is an inherent contradiction (or another abuse of the
>>> term "type").
>> No. It's just a matter of definition, essentially.
>> E.g. in Smalltalk and Lisp, it does make sense to talk of the "type" of
>> a name or a value, even if that type may change over time.
> 
> OK, now we are simply back full circle to Chris Smith's original
> complaint that started this whole subthread, namely (roughly) that
> long-established terms like "type" or "typing" should *not* be
> stretched in ways like this, because that is technically inaccurate and
> prone to misinterpretation.

Sorry, I have to insist that it's not me who's stretching terms here.

All textbook definitions that I have seen define a type as the 
set/operations/axioms triple I mentioned above.
No mention of immutability, at least not in the definitions.

Plus, I don't see any necessity on insisting on immutability for the 
definition of "type". Otherwise, you'd have to declare that Smalltalk 
objects truly don't have a type (not even an implied one), and that 
would simply make no sense: they do in fact have a type, even though it 
may occasionally change.

Regards,
Jo
From: Chris Smith
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <MPG.1f0868a2e5fe99229896fb@news.altopia.net>
Joachim Durchholz <··@durchholz.org> wrote:
> Sorry, I have to insist that it's not me who's stretching terms here.
> 
> All textbook definitions that I have seen define a type as the 
> set/operations/axioms triple I mentioned above.
> No mention of immutability, at least not in the definitions.

The immutability comes from the fact (perhaps implicit in these 
textbooks, or perhaps they are not really texts on formal type theory) 
that types are assigned to expressions, and the program is not likely to 
change as it executes.  Even such oddities as "self-modifying code" are 
generally formally modeled as a single program, which reduces to other 
programs as it evaluates just like everything else.  The original 
program doesn't change in the formal model.

Hence the common (and, I think, meaningless) distinction that has been 
made here: that static type systems assign types to expressions 
(sometimes I hear variables, which is limiting and not really correct), 
while dynamic type systems do so to values.  However, that falls apart 
when you understand what formal type systems mean by types.  What they 
mean, roughly, is "that label which we attach to an expression 
according to fixed rules to help facilitate our proofs".  Since dynamic 
type systems don't do such proofs, I'm having trouble understanding what 
could possibly be meant by assigning types to values, unless we redefine 
type.  That's what I've been trying to do.  So far I've had only limited 
success, although there is definite potential in one post by Chris Uppal 
and another by Chris Clark (or maybe I'm just partial to other people 
named Chris).

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7mhbv$52g$1@online.de>
Chris Smith schrieb:
> Joachim Durchholz <··@durchholz.org> wrote:
>> Sorry, I have to insist that it's not me who's stretching terms here.
>>
>> All textbook definitions that I have seen define a type as the 
>> set/operations/axioms triple I mentioned above.
>> No mention of immutability, at least not in the definitions.
> 
> The immutability comes from the fact (perhaps implicit in these 
> textbooks, or perhaps they are not really texts on formal type theory) 
> that types are assigned to expressions,

That doesn't *define* what's a type or what isn't!

If it's impossible to assign types to all expressions of a program in a 
language, that does mean that there's no useful type theory for the 
program, but it most definitely does not mean that there are no types in 
the program.
I can still sensibly talk about sets of values, sets of allowable 
operations over each value, and about relationships between inputs and 
outputs of these operations.

So programs have types, even if they don't have a static type system.
Q.E.D.

Regards,
Jo
From: Chris Smith
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <MPG.1f08823f2d62b6219896fe@news.altopia.net>
Joachim Durchholz <··@durchholz.org> wrote:
> > The immutability comes from the fact (perhaps implicit in these 
> > textbooks, or perhaps they are not really texts on formal type theory) 
> > that types are assigned to expressions,
> 
> That doesn't *define* what's a type or what isn't!
> 

I'm sorry if you don't like it, but that's all there is.  That's the 
point that's being missed here.  It's not as if there's this thing 
called a type, and then we can consider how it is used by formal type 
systems.  A type, in formal type theory, is ANY label that is assigned 
to expressions of a program for the purpose of making a formal type 
system work.  If you wish to claim a different use of the word and then 
try to define what is or is not a type, then be my guest.  However, 
formal type theory will still not adopt whatever restrictions you come 
up with, and will continue to use the word type as the field has been 
using the word for a good part of a century now.

The first step toward considering the similarities and differences 
between the different uses of "type" in dynamic type systems and formal 
type theory is to avoid confusing aspects of one field with the other.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Andrew McDonagh
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7mkk7$dge$1@news.freedom2surf.net>
Joachim Durchholz wrote:
> Chris Smith schrieb:
>> Joachim Durchholz <··@durchholz.org> wrote:
>>> Sorry, I have to insist that it's not me who's stretching terms here.
>>>
>>> All textbook definitions that I have seen define a type as the 
>>> set/operations/axioms triple I mentioned above.
>>> No mention of immutability, at least not in the definitions.
>>
>> The immutability comes from the fact (perhaps implicit in these 
>> textbooks, or perhaps they are not really texts on formal type theory) 
>> that types are assigned to expressions,
> 
> That doesn't *define* what's a type or what isn't!
> 
> If it's impossible to assign types to all expressions of a program in a 
> language, that does mean that there's no useful type theory for the 
> program, but it most definitely does not mean that there are no types in 
> the program.
> I can still sensibly talk about sets of values, sets of allowable 
> operations over each value, and about relationships between inputs and 
> outputs of these operations.
> 
> So programs have types, even if they don't have a static type system.
> Q.E.D.
> 

Of course not.  Otherwise programs using dynamically typed systems 
wouldn't exist.

> Regards,
> Jo

I haven't read all of this thread, I wonder, is the problem to do with 
Class being mistaken for Type? (which is usually the issue)
From: Chris Smith
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <MPG.1f0886663b38a0d29896ff@news.altopia.net>
Andrew McDonagh <····@andmc.com> wrote:
> I haven't read all of this thread, I wonder, is the problem to do with 
> Class being mistaken for Type? (which is usually the issue)

Hi Andrew!

Not much of this thread has to do with object oriented languages... so 
the word "class" would be a little out of place.  However, it is true 
that the word "type" is being used in the dynamically typed sense to 
include classes from class-based OO languages (at least those that 
include run-time type features), as well as similar concepts in other 
languages.  Some of us are asking for, and attempting to find, a formal 
definition to justify this concept, and are so far not finding it.  
Others are saying that the definition is somehow implicitly 
psychological in nature, and therefore not amenable to formal 
definition... which others (including myself) find rather unacceptable.

I started out insisting that "type" be used with its correct formal 
definition, but I'm convinced that was a mistake.  Asking someone to 
change their entire terminology is unlikely to succeed.  I'm now 
focusing on just trying to represent the correct formal definition of 
types in the first place, and make it clear when one or the other 
meaning is being used.

Hopefully, that's a fair summary of the thread to date.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Andrew McDonagh
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7mmc4$en7$1@news.freedom2surf.net>
Chris Smith wrote:
> Andrew McDonagh <····@andmc.com> wrote:
>> I haven't read all of this thread, I wonder, is the problem to do with 
>> Class being mistaken for Type? (which is usually the issue)
> 
> Hi Andrew!

Hi Chris

> 
> Not much of this thread has to do with object oriented languages... so 
> the word "class" would be a little out of place.  

Glad to hear.

> However, it is true 
> that the word "type" is being used in the dynamically typed sense to 
> include classes from class-based OO languages (at least those that 
> include run-time type features), as well as similar concepts in other 
> languages.  Some of us are asking for, and attempting to find, a formal 
> definition to justify this concept, and are so far not finding it.  
> Others are saying that the definition is somehow implicitly 
> psychological in nature, and therefore not amenable to formal 
> definition... which others (including myself) find rather unacceptable.
> 
> I started out insisting that "type" be used with its correct formal 
> definition, but I'm convinced that was a mistake.  Asking someone to 
> change their entire terminology is unlikely to succeed.  I'm now 
> focusing on just trying to represent the correct formal definition of 
> types in the first place, and make it clear when one or the other 
> meaning is being used.
> 
> Hopefully, that's a fair summary of the thread to date.
> 

Cheers, much appreciated!

Andrew
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7phid$baf$2@online.de>
Andrew McDonagh schrieb:
> Joachim Durchholz wrote:
>> Chris Smith schrieb:
>>> Joachim Durchholz <··@durchholz.org> wrote:
>>>> Sorry, I have to insist that it's not me who's stretching terms here.
>>>>
>>>> All textbook definitions that I have seen define a type as the 
>>>> set/operations/axioms triple I mentioned above.
>>>> No mention of immutability, at least not in the definitions.
>>>
>>> The immutability comes from the fact (perhaps implicit in these 
>>> textbooks, or perhaps they are not really texts on formal type 
>>> theory) that types are assigned to expressions,
>>
>> That doesn't *define* what's a type or what isn't!
>>
>> If it's impossible to assign types to all expressions of a program in 
>> a language, that does mean that there's no useful type theory for the 
>> program, but it most definitely does not mean that there are no types 
>> in the program.
>> I can still sensibly talk about sets of values, sets of allowable 
>> operations over each value, and about relationships between inputs and 
>> outputs of these operations.
>>
>> So programs have types, even if they don't have a static type system.
>> Q.E.D.
> 
> Of course not.  Otherwise programs using dynamically typed systems 
> wouldn't exist.

I don't understand.
Do you mean dynamic typing (aka runtime types)?

> I haven't read all of this thread, I wonder, is the problem to do with 
> Class being mistaken for Type? (which is usually the issue)

No, not at all. I have seen quite a lot beyond OO ;-)

Regards,
Jo
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150987852.784870.7610@y41g2000cwy.googlegroups.com>
Andreas Rossberg wrote:
> Marshall wrote:
> >
> > What prohibits us from describing an abstract type as a set of values?
>
> If you identify an abstract type with the set of underlying values then
> it is equivalent to the underlying representation type, i.e. there is no
> abstraction.

I don't follow. Are you saying that a set cannot be described
intensionally? How is "the set of all objects that implement the
Foo interface" not sufficiently abstract? Is it possible you are
mixing in implementation concerns?
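A minimal Java sketch of the idea in the paragraph above: membership in "the set of all objects that implement the Foo interface" is decided by an intensional description (a predicate), not by enumerating members. The names Foo, A, and B are hypothetical, chosen just for illustration.

```java
// Hypothetical types for illustration only.
interface Foo { int foo(); }

class A implements Foo { public int foo() { return 1; } }
class B { public int bar() { return 2; } }

public class FooSet {
    // Membership in "the set of all objects that implement Foo"
    // is decided by the description itself, not by listing members.
    static boolean memberOfFoo(Object o) {
        return o instanceof Foo;
    }

    public static void main(String[] args) {
        System.out.println(memberOfFoo(new A())); // an A is in the set
        System.out.println(memberOfFoo(new B())); // a B is not
    }
}
```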


> >>There were papers observing this as early as 1970.
> >
> > References?
>
> This is 1973, actually, but most relevant:
>
>    James Morris
>    Types Are Not Sets.
>    Proc. 1st ACM Symposium on Principles of Programming Languages, 1973

Okay. Since this paper is in the ACM walled garden, I'll have to
wait until next week to get a look at it. But thanks for the reference.


> >>(There are also theoretic problems with the types-as-sets view, because
> >>sufficiently rich type systems can no longer be given direct models in
> >>standard set theory. For example, first-class polymorphism would run
> >>afoul the axiom of foundation.)
> >
> > There is no reason why we must limit ourselves to "standard set theory"
> > any more than we have to limit ourselves to standard type theory.
> > Both are progressing, and set theory seems to me to be a good
> > choice for a foundation. What else would you use?
>
> I'm no expert here, but Category Theory is a preferred choice in many
> areas of PLT.

Fair enough.


Marshall
From: Andreas Rossberg
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7ede4$8r8i0$5@hades.rz.uni-saarland.de>
Marshall wrote:
>>
>>>What prohibits us from describing an abstract type as a set of values?
>>
>>If you identify an abstract type with the set of underlying values then
>>it is equivalent to the underlying representation type, i.e. there is no
>>abstraction.
> 
> I don't follow. Are you saying that a set cannot be described
> intensionally? How is "the set of all objects that implement the
> Foo interface" not sufficiently abstract? Is it possible you are
> mixing in implementation concerns?

"Values" refers to the concrete values existent in the semantics of a 
programming language. This set is usually infinite, but basically fixed. 
To describe the set of "values" of an abstract type you would need 
"fresh" values that did not exist before (otherwise the abstract type 
would be equivalent to some already existent type). So you'd need at 
least a theory for name generation or something similar to describe 
abstract types in a types-as-sets metaphor.
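One way to picture the "fresh values" point above is to model them by object identity: a value generator whose representation is entirely hidden, so its values cannot be identified with any pre-existing set. This is only a sketch, and the class name Name is hypothetical.

```java
// Sketch: "fresh" values for an abstract type, modelled by object
// identity. The representation is hidden; two Names are equal only
// if they are literally the same generated value.
public final class Name {
    private Name() { }                  // no visible representation
    public static Name fresh() { return new Name(); }

    public static void main(String[] args) {
        Name a = fresh();
        Name b = fresh();
        System.out.println(a.equals(a)); // a value equals itself
        System.out.println(a.equals(b)); // distinct fresh values differ
    }
}
```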

- Andreas
From: David Hopwood
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <K4Fmg.208022$8W1.205835@fe1.news.blueyonder.co.uk>
Andreas Rossberg wrote:
> Marshall wrote:
> 
>>>> What prohibits us from describing an abstract type as a set of values?
>>>
>>> If you identify an abstract type with the set of underlying values then
>>> it is equivalent to the underlying representation type, i.e. there is no
>>> abstraction.
>>
>> I don't follow. Are you saying that a set cannot be described
>> intensionally? How is "the set of all objects that implement the
>> Foo interface" not sufficiently abstract? Is it possible you are
>> mixing in implementation concerns?
> 
> "Values" refers to the concrete values existent in the semantics of a
> programming language. This set is usually infinite, but basically fixed.
> To describe the set of "values" of an abstract type you would need
> "fresh" values that did not exist before (otherwise the abstract type
> would be equivalent to some already existent type). So you'd need at
> least a theory for name generation or something similar to describe
> abstract types in a types-as-sets metaphor.

Set theory has no difficulty with this. It's common, for example, to see
"the set of strings representing propositions" used in treatments of
formal systems.

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Andreas Rossberg
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7ge7s$90ukl$2@hades.rz.uni-saarland.de>
David Hopwood wrote:
>>
>>"Values" refers to the concrete values existent in the semantics of a
>>programming language. This set is usually infinite, but basically fixed.
>>To describe the set of "values" of an abstract type you would need
>>"fresh" values that did not exist before (otherwise the abstract type
>>would be equivalent to some already existent type). So you'd need at
>>least a theory for name generation or something similar to describe
>>abstract types in a types-as-sets metaphor.
> 
> Set theory has no difficulty with this. It's common, for example, to see
> "the set of strings representing propositions" used in treatments of
> formal systems.

Oh, I was not saying that this particular aspect cannot be described in 
set theory (that was a different argument, about different issues). Just 
that you cannot naively equate types with a set of underlying values, 
which is what is usually meant by the types-are-sets metaphor - to 
capture something like type abstraction you need to do more. (Even then 
it might be arguable if it really describes the same thing.)

- Andreas
From: Chris Uppal
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <449aaea0$0$656$bed64819@news.gradwell.net>
Andreas Rossberg wrote:

[me:]
> > It's worth noting, too, that (in some sense) the type of an object can
> > change over time[*].
>
> No. Since a type expresses invariants, this is precisely what may *not*
> happen. If certain properties of an object may change then the type of
> the object has to reflect that possibility. Otherwise you cannot
> legitimately call it a type.

Well, it seems to me that you are /assuming/ a notion of what kinds of logic
can be called type (theories), and I don't share your assumptions.  No offence
intended.

Actually I would go a little further than that.  Granted that whatever logic
one wants to apply in order to prove <whatever> about a program execution is
abstract -- and so timeless -- that does not (to my mind) imply that it must be
/static/.  However, even if we grant that additional restriction, that doesn't
imply that the analysis itself must not be cognisant of time.  I see no reason,
even in practise, why a static analysis should not be able to see that the set
of acceptable operations (for some definition of acceptable) for some
object/value/variable can be different at different times in the execution.  If
the analysis is rich enough to check that the temporal constraints are [not]
satisfied, then I don't see why you should want to use another word than "type"
to describe the results of its analysis.

    -- chris
From: Andreas Rossberg
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7ed1s$8r8i0$4@hades.rz.uni-saarland.de>
Chris Uppal wrote:
> 
>>>It's worth noting, too, that (in some sense) the type of an object can
>>>change over time[*].
>>
>>No. Since a type expresses invariants, this is precisely what may *not*
>>happen. If certain properties of an object may change then the type of
>>the object has to reflect that possibility. Otherwise you cannot
>>legitimately call it a type.
> 
> Well, it seems to me that you are /assuming/ a notion of what kinds of logic
> can be called type (theories), and I don't share your assumptions.  No offence
> intended.

OK, but can you point me to any literature on type theory that makes a 
different assumption?

> I see no reason,
> even in practise, why a static analysis should not be able to see that the set
> of acceptable operations (for some definition of acceptable) for some
> object/value/variable can be different at different times in the execution.

Neither do I. But what is wrong with a mutable reference-to-union type, 
as I suggested? It expresses this perfectly well.
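A minimal Java sketch of such a mutable reference-to-union type (the states Open and Closed are hypothetical names): the reference's type never changes, only the value it holds does, and the union covers both possibilities.

```java
// Hypothetical two-case union, encoded as an interface with two
// implementing classes.
interface DoorState { }
class Open implements DoorState { String walkThrough() { return "walked"; } }
class Closed implements DoorState { String knock() { return "knocked"; } }

public class UnionRef {
    // The reference's static type (DoorState) is invariant over time;
    // only the value stored in it changes.
    static DoorState state = new Closed();

    static String act() {
        if (state instanceof Open) return ((Open) state).walkThrough();
        else                       return ((Closed) state).knock();
    }

    public static void main(String[] args) {
        System.out.println(act());  // the door starts Closed
        state = new Open();         // mutation: value changes, type doesn't
        System.out.println(act());  // now it is Open
    }
}
```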

- Andreas
From: Chris Uppal
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <449bde5e$1$663$bed64819@news.gradwell.net>
Andreas Rossberg wrote:
> Chris Uppal wrote:
> >
> > > > It's worth noting, too, that (in some sense) the type of an object
> > > > can change over time[*].
> > >
> > > No. Since a type expresses invariants, this is precisely what may
> > > *not* happen. If certain properties of an object may change then the
> > > type of
> > > the object has to reflect that possibility. Otherwise you cannot
> > > legitimately call it a type.
> >
> > Well, it seems to me that you are /assuming/ a notion of what kinds of
> > logic can be called type (theories), and I don't share your
> > assumptions.  No offence intended.
>
> OK, but can you point me to any literature on type theory that makes a
> different assumption?

'Fraid not.  (I'm not a type theorist -- for all I know there may be lots, but
my suspicion is that they are rare at best.)

But perhaps I shouldn't have used the word theory at all.  What I mean is that
there is one or more logic of type (informal or not -- probably informal) with
respect to which the object in question has changed its categorisation.  If no
existing type /theory/ (as devised by type theorists) can handle that case,
then that is a deficiency in the set of existing theories -- we need newer and
better ones.

But, as a sort of half-way, semi-formal, example: consider the type environment
in a Java runtime.  The JVM does formal type-checking of classfiles as it loads
them.  In most ways that checking is static -- it's treating the bytecode as
program text and doing a static analysis on it before allowing it to run (and
rejecting what it can't prove to be acceptable by its criteria).  However, it
isn't /entirely/ static because the collection of classes varies at runtime in
a (potentially) highly dynamic way.  So it can't really examine the "whole"
text of the program -- indeed there is no such thing.  So it ends up with a
hybrid static/dynamic type system -- it records any assumptions it had to make
in order to find a proof of the acceptability of the new code, and if (sometime
in the future) another class is proposed which violates those assumptions, then
that second class is rejected.
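The "no whole program text" point can be seen with plain reflective loading: the class named below is never visible to any static whole-program analysis of this source, and the JVM only loads and verifies it when the call actually runs. A minimal sketch; the class name is just an example.

```java
import java.util.List;

public class LateLoad {
    // The class named by the string is unknown to any static analysis
    // of this file; loading and verification happen at this call.
    static Object loadAndInstantiate(String className) throws Exception {
        Class<?> c = Class.forName(className);
        return c.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        Object o = loadAndInstantiate("java.util.ArrayList");
        System.out.println(o instanceof List); // true
    }
}
```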


> > I see no reason,
> > even in practise, why a static analysis should not be able to see that
> > the set of acceptable operations (for some definition of acceptable)
> > for some object/value/variable can be different at different times in
> > the execution.
>
> Neither do I. But what is wrong with a mutable reference-to-union type,
> as I suggested? It expresses this perfectly well.

Maybe I misunderstood what you meant by union type.  I took it to mean that the
type analysis didn't "know" which of the two types was applicable, and so would
reject both (or maybe accept both?).  E.g. if at instant A some object, obj,
was in a state where it responds to #aMessage, but not #anotherMessage; and
at instant B it is in a state where it responds to #anotherMessage but not
#aMessage.  In my (internal and informal) type logic, I make the following
judgements:

    In code which will be executed at instant A
        obj aMessage.                "type correct"
        obj anotherMessage.       "type incorrect"

    In code which will be executed at instant B
        obj aMessage.                 "type incorrect"
        obj anotherMessage.        "type correct"

I don't see how a logic with no temporal element can arrive at all four of
those judgements, whatever it means by a union type.
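For contrast, one static encoding that does yield all four judgements without any temporal logic: give obj a different static type at instant A than at instant B. The names StateA and StateB are hypothetical; aMessage and anotherMessage are adapted from the Smalltalk selectors above.

```java
// Hypothetical per-instant types: each state offers exactly one method.
class StateA { String aMessage() { return "a"; } }        // no anotherMessage
class StateB { String anotherMessage() { return "b"; } }  // no aMessage

public class Temporal {
    public static void main(String[] args) {
        StateA objAtA = new StateA();
        System.out.println(objAtA.aMessage());        // "type correct"
        // objAtA.anotherMessage();                   // "type incorrect": rejected statically

        StateB objAtB = new StateB();
        System.out.println(objAtB.anotherMessage());  // "type correct"
        // objAtB.aMessage();                         // "type incorrect": rejected statically
    }
}
```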

    -- chris
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151082956.067238.45100@b68g2000cwa.googlegroups.com>
Chris Uppal wrote:
>
> But, as a sort of half-way, semi-formal, example: consider the type environment
> in a Java runtime.  The JVM does formal type-checking of classfiles as it loads
> them.  In most ways that checking is static -- it's treating the bytecode as
> program text and doing a static analysis on it before allowing it to run (and
> rejecting what it can't prove to be acceptable by its criteria).  However, it
> isn't /entirely/ static because the collection of classes varies at runtime in
> a (potentially) highly dynamic way.  So it can't really examine the "whole"
> text of the program -- indeed there is no such thing.  So it ends up with a
> hybrid static/dynamic type system -- it records any assumptions it had to make
> in order to find a proof of the acceptability of the new code, and if (sometime
> in the future) another class is proposed which violates those assumptions, then
> that second class is rejected.

I have to object to the term "hybrid".

Java has a static type system.
Java has runtime tags and tag checks.

The two are distinct, and neither one is less than complete, so
I don't think "hybrid" captures the situation well.
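The distinction between the two systems can be shown in a few lines: the static type system rejects ill-typed programs before they run, while the runtime tags attached to every object are checked at casts. A minimal sketch, assuming nothing beyond standard Java.

```java
public class TagsVsTypes {
    public static void main(String[] args) {
        // Static type system: rejected before the program ever runs.
        // String s = 42;            // compile-time type error

        // Runtime tags: every object carries a class tag, checked at casts.
        Object o = "hello";          // static type Object, runtime tag String
        try {
            Integer i = (Integer) o; // compiles, but the tag check fails here
            System.out.println(i);
        } catch (ClassCastException e) {
            System.out.println("tag check failed");
        }
    }
}
```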


Marshall
From: Chris Smith
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <MPG.1f05d0cabd0b607e9896ed@news.altopia.net>
Marshall <···············@gmail.com> wrote:
> Java has a static type system.
> Java has runtime tags and tag checks.

Yes.

> The two are distinct, and neither one is less than complete

How is neither one less than complete?

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151100347.786836.128450@b68g2000cwa.googlegroups.com>
Chris Smith wrote:
> Marshall <···············@gmail.com> wrote:
> > Java has a static type system.
> > Java has runtime tags and tag checks.
>
> Yes.
>
> > The two are distinct, and neither one is less than complete
>
> How is neither one less than complete?

In the same way that a mule is less than a complete horse.

It is a small point, and probably not worth much attention.
In fact, I was probably just being fussy in the first place.


Marshall
From: Chris Uppal
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <449c4826$0$657$bed64819@news.gradwell.net>
Marshall wrote:

[me:]
> > But, as a sort of half-way, semi-formal, example: consider the type
> > environment in a Java runtime.  The JVM does formal type-checking of
> > classfiles as it loads them.  In most ways that checking is static --
> > it's treating the bytecode as program text and doing a static analysis
> > on it before allowing it to run (and rejecting what it can't prove to
> > be acceptable by its criteria).  However, it isn't /entirely/ static
> > because the collection of classes varies at runtime in a (potentially)
> > highly dynamic way.  So it can't really examine the "whole" text of the
> > program -- indeed there is no such thing.  So it ends up with a hybrid
> > static/dynamic type system -- it records any assumptions it had to make
> > in order to find a proof of the acceptability of the new code, and if
> > (sometime in the future) another class is proposed which violates those
> > assumptions, then that second class is rejected.
>
> I have to object to the term "hybrid".
>
> Java has a static type system.
> Java has runtime tags and tag checks.

It has both, agreed, but that isn't the full story.  I think I explained what I
meant, and why I feel the term is justified as well as I can in the second half
of the paragraph you quoted.  I doubt if I can do better.

Maybe you are thinking that I mean that /because/ the JVM does verification,
etc, at "runtime" the system is hybrid ?

Anyway that is /not/ what I mean.  I'm (for these purposes) completely
uninterested in the static checking done by the Java to bytecode translator,
javac.  I'm interested in what happens to the high-level, statically typed, OO,
language called "java bytecode" when the JVM sees it.  That language has a
strict static type system which the JVM is required to check.  That's a
/static/ check in my book -- it happens before the purportedly correct code is
accepted, rather than while that code is running.

I am also completely uninterested (for these immediate purposes) in the run
time checking that the JVM does (the stuff that results in
NoSuchMethodException, and the like).  In the wider context of the thread, I do
want to call that kind of thing (dynamic) type checking -- but those checks are
not why I call the JVMs type system hybrid either.

Oh well, having got that far, I may as well take another stab at "hybrid".
Since the JVM is running a static type system without access to the whole text
of the program, there are some things that it is expected to check which it
can't.   So it records preconditions on classes which might subsequently be
loaded.  Those are added to the static checks on future classes, but -- as far
as the original class is concerned -- those checks happen dynamically.  So part
of the static type checking which is supposed to happen, has been postponed to
a dynamic check.  It's that, and /only/ that which I meant by "hybrid".

Of course, /if/ we grant that runtime checking of the sort done by Smalltalk or
Lisp also constitutes a "type system" in some sense that puts it on a par with
static type checking, then that would be another, very different, reason to
claim that Java had a hybrid type system (though, in fact, I'd rather claim
that it had two independent type systems).  But that's the bigger question
under discussion here and I wasn't trying to beg it by using the word
"hybrid".

    -- chris
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151100768.063223.120480@m73g2000cwd.googlegroups.com>
Chris Uppal wrote:
> Marshall wrote:
>
> [me:]
> > > But, as a sort of half-way, semi-formal, example: consider the type
> > > environment in a Java runtime.  The JVM does formal type-checking of
> > > classfiles as it loads them.  In most ways that checking is static --
> > > it's treating the bytecode as program text and doing a static analysis
> > > on it before allowing it to run (and rejecting what it can't prove to
> > > be acceptable by its criteria).  However, it isn't /entirely/ static
> > > because the collection of classes varies at runtime in a (potentially)
> > > highly dynamic way.  So it can't really examine the "whole" text of the
> > > program -- indeed there is no such thing.  So it ends up with a hybrid
> > > static/dynamic type system -- it records any assumptions it had to make
> > > in order to find a proof of the acceptability of the new code, and if
> > > (sometime in the future) another class is proposed which violates those
> > > assumptions, then that second class is rejected.
> >
> > I have to object to the term "hybrid".
> >
> > Java has a static type system.
> > Java has runtime tags and tag checks.
>
> It has both, agreed, but that isn't the full story.  I think I explained what I
> meant, and why I feel the term is justified as well as I can in the second half
> of the paragraph you quoted.  I doubt if I can do better.
>
> Maybe you are thinking that I mean that /because/ the JVM does verification,
> etc, at "runtime" the system is hybrid ?
>
> Anyway that is /not/ what I mean.  I'm (for these purposes) completely
> uninterested in the static checking done by the Java to bytecode translator,
> javac.  I'm interested in what happens to the high-level, statically typed, OO,
> language called "java bytecode" when the JVM sees it.  That language has a
> strict static type system which the JVM is required to check.  That's a
> /static/ check in my book -- it happens before the purportedly correct code is
> accepted, rather than while that code is running.
>
> I am also completely uninterested (for these immediate purposes) in the run
> time checking that the JVM does (the stuff that results in
> NoSuchMethodException, and the like).  In the wider context of the thread, I do
> want to call that kind of thing (dynamic) type checking -- but those checks are
> not why I call the JVMs type system hybrid either.
>
> Oh well, having got that far, I may as well take another stab at "hybrid".
> Since the JVM is running a static type system without access to the whole text
> of the program, there are some things that it is expected to check which it
> can't.   So it records preconditions on classes which might subsequently be
> loaded.  Those are added to the static checks on future classes, but -- as far
> as the original class is concerned -- those checks happen dynamically.  So part
> of the static type checking which is supposed to happen, has been postponed to
> a dynamic check.  It's that, and /only/ that which I meant by "hybrid".

That's fair, I guess. I wouldn't use the terms you do, but maybe that
doesn't matter very much assuming we understand each other
at this point.



Marshall
From: David Hopwood
Subject: Travails of the JVM type system (was: What is Expressiveness in a Computer Language)
Date: 
Message-ID: <6%%mg.458332$tc.115682@fe2.news.blueyonder.co.uk>
Chris Uppal wrote:
> Maybe you are thinking that I mean that /because/ the JVM does verification,
> etc, at "runtime" the system is hybrid ?
> 
> Anyway that is /not/ what I mean.  I'm (for these purposes) completely
> uninterested in the static checking done by the Java to bytecode translator,
> javac.  I'm interested in what happens to the high-level, statically typed, OO,
> language called "java bytecode" when the JVM sees it.  That language has a
> strict static type system which the JVM is required to check.  That's a
> /static/ check in my book -- it happens before the purportedly correct code is
> accepted, rather than while that code is running.
> 
> I am also completely uninterested (for these immediate purposes) in the run
> time checking that the JVM does (the stuff that results in
> NoSuchMethodException, and the like).  In the wider context of the thread, I do
> want to call that kind of thing (dynamic) type checking -- but those checks are
> not why I call the JVMs type system hybrid either.
> 
> Oh well, having got that far, I may as well take another stab at "hybrid".
> Since the JVM is running a static type system without access to the whole text
> of the program, there are some things that it is expected to check which it
> can't.   So it records preconditions on classes which might subsequently be
> loaded.

It does, but interestingly, it wasn't originally intended to. It was intended
to be possible to check each classfile based only on that class' explicit
dependencies, without having to record constraints on the loading of future
classes.

However, the designers made various mistakes with the design of ClassLoaders
which resulted in the original JVM type system being unsound, and hence insecure.
The problem is described in
<http://www.cis.upenn.edu/~bcpierce/courses/629/papers/Saraswat-javabug.html>
(which spells my name wrong, I just noticed).

The constraint-based approach, proposed by Liang and Bracha in
<http://www.cs.purdue.edu/homes/jv/smc/pubs/liang-oopsla98.pdf>, was adopted
in order to fix this. There were other, simpler, solutions (for example,
restricting ClassLoader delegation to a strict tree structure), but the
responsible people at Sun felt that they were insufficiently expressive.
I thought, and still think, that they were mistaken. It is important for the
type system to be no more complicated than necessary if there is to be any
confidence in the security of implementations.

In any case, as shown by
<http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4670071>, current JVM
implementations do *not* reliably support the patterns of ClassLoader delegation
that were intended to be enabled by Liang and Bracha's approach.

[This "bug" (it should be classed as an RFE) is, inexplicably given how few
programs are affected by it, the most voted-for bug in Sun's tracking system.
I get the impression that most of the commentators don't understand how
horribly complicated it would be to fix. Anyway, it hasn't been fixed for
4 years.]

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Andreas Rossberg
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7gpal$90ukl$3@hades.rz.uni-saarland.de>
Chris Uppal wrote:
>>>
>>>Well, it seems to me that you are /assuming/ a notion of what kinds of
>>>logic can be called type (theories), and I don't share your
>>>assumptions.  No offence intended.
>>
>>OK, but can you point me to any literature on type theory that makes a
>>different assumption?
> 
> 'Fraid not.  (I'm not a type theorist -- for all I know there may be lots, but
> my suspicion is that they are rare at best.)

I would suspect the same :-). And the reason is that "type" has a 
well-established use in theory. It is not just my "assumption"; it has been 
established practice for 80 or so years. So far, this discussion has 
not revealed the existence of any formal work that would provide a 
theory of "dynamic types" in the sense it is used to characterise 
"dynamically typed" languages.

So what you are suggesting may be an interesting notion, but it's not 
what is called "type" in a technical sense. Overloading the same term 
for something different is not a good idea if you want to avoid 
confusion and misinterpretations.

> But, as a sort of half-way, semi-formal, example: consider the type environment
> in a Java runtime.  The JVM does formal type-checking of classfiles as it loads
> them.  In most ways that checking is static -- it's treating the bytecode as
> program text and doing a static analysis on it before allowing it to run (and
> rejecting what it can't prove to be acceptable by its criteria).  However, it
> isn't /entirely/ static because the collection of classes varies at runtime in
> a (potentially) highly dynamic way.  So it can't really examine the "whole"
> text of the program -- indeed there is no such thing.  So it ends up with a
> hybrid static/dynamic type system -- it records any assumptions it had to make
> in order to find a proof of the acceptability of the new code, and if (sometime
> in the future) another class is proposed which violates those assumptions, then
> that second class is rejected.

Incidentally, I know this scenario very well, because that's what I'm 
looking at in my thesis :-). All of this can easily be handled 
coherently with well-established type machinery and terminology. No need 
to bend existing concepts and language, no need to resort to "dynamic 
typing" in the way Java does it either.

>     In code which will be executed at instant A
>         obj aMessage.                "type correct"
>         obj anotherMessage.       "type incorrect"
> 
>     In code which will be executed at instant B
>         obj aMessage.                 "type incorrect"
>         obj anotherMessage.        "type correct"
> 
> I don't see how a logic with no temporal element can arrive at all four of
> those judgements, whatever it means by a union type.

I didn't say that the type system cannot have temporal elements. I only 
said that a type itself cannot change over time. A type system states 
propositions about a program, a type assignment *is* a proposition. A 
proposition is either true or false (or undecidable), but it is so 
persistently, considered under the same context. So if you want a type 
system to capture temporal elements, then these must be part of a type 
itself. You can introduce types with implications like "in context A, 
this is T, in context B this is U". But the whole quoted part then is 
the type, and it is itself invariant over time.

- Andreas
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151097146.910247.56170@g10g2000cwb.googlegroups.com>
Andreas Rossberg wrote:
>
> ... And the reason is that "type" has a
> well-established use in theory. It is not just my "assumption"; it has
> been established practice for 80 or so years.

Wouldn't it be fair to say it goes back a least to
Principia Mathematica, 1910?


> So far, this discussion has
> not revealed the existence of any formal work that would provide a
> theory of "dynamic types" in the sense it is used to characterise
> "dynamically typed" languages.

Indeed, the idea I am starting to get is that much of what is going
on in the dynamic/latent/untyped/whatever world is simply informal,
conceptual-level static analysis. The use of the runtime system
is a formal assistant to this informal analysis. In other words,
formal existential quantification is used as a hint to help do
informal universal quantification.

Now, given that explanation, the formal analysis looks better.
But that's not the whole story. We also have the fact that
the informal analysis gets to run on the human brain, while
the static analysis has to run on a lousy microprocessor
with many orders of magnitude less processing power.
We have the fact that the formal static analysis is of
limited power. We have the fact that historically,
static languages have forbidden the use of the formal
existential method preferred by the DT crowd until
*after* the static analysis has passed, thus defeating
their purpose. And probably still more things that I
don't appreciate yet. So I'll now throw around some
terms (in an attempt to look smart :-) and say,
late binding, impredicativity.


Marshall
From: Chris Uppal
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <449c1f4d$0$662$bed64819@news.gradwell.net>
Andreas Rossberg wrote:

> So what you are suggesting may be an interesting notion, but it's not
> what is called "type" in a technical sense. Overloading the same term
> for something different is not a good idea if you want to avoid
> confusion and misinterpretations.

Frivolous response: the word "type" has been in use in the general sense in
which I wish to use it for longer than that.  If theorists didn't want ordinary
people to abuse their technical terms they should invent their own words not
borrow other people's ;-)

Just for interest, here's one item from the OED's entry on "type" (noun):

    The general form, structure, or character distinguishing
    a particular kind, group, or class of beings or objects;
    hence transf. a pattern or model after which something is made.

And it has supporting quotes, here are the first and last:

1843 Mill Logic iv. ii. §3 (1856) II. 192
    When we see a creature resembling an animal, we
    compare it with our general conception of an animal;
    and if it agrees with that general conception, we include
    it in the class. The conception becomes the type of comparison.

1880 Mem. J. Legge vi. 76
    Every creature has a type, a peculiar character of its own.

Interestingly the first one is from:
    John Stuart Mill
    A System of Logic
(but it's not mathematical logic ;-).

OK, I admitted I was being frivolous.  But I really don't see why a type system
(as understood by type theorists) /must/ have no runtime element.  Just as I
wouldn't stop calling a type system a "type system" if it were designed to work
on incomplete information (only part of the program text is available for
analysis) and so could only make judgements of the form "if X then Y".


[JVM dynamic class loading]
> Incidentally, I know this scenario very well, because that's what I'm
> looking at in my thesis :-).

<grin/>


> All of this can easily be handled
> coherently with well-established type machinery and terminology.

I'd be interested to know more, but I suspect I wouldn't understand it :-(


> A type system states
> propositions about a program, a type assignment *is* a proposition. A
> proposition is either true or false (or undecidable), but it is so
> persistently, considered under the same context. So if you want a type
> system to capture temporal elements, then these must be part of a type
> itself. You can introduce types with implications like "in context A,
> this is T, in context B this is U". But the whole quoted part then is
> the type, and it is itself invariant over time.

But since the evolving type-attribution that I'm thinking of also includes time
as a static element (at each specific time there is a specific attribution of
types), the dynamic logic can also be viewed as invariant in time, since each
judgement is tagged with the moment(s) to which it applies.

But there we go.  I don't expect you to agree, and though I'll read any reply
with interest, I think I've now said all I have to say in this particular
subthread.

Cheers.

    -- chris
From: David Hopwood
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <uBdmg.464152$xt.338218@fe3.news.blueyonder.co.uk>
Chris Uppal wrote:
> It's worth noting, too, that (in some sense) the type of an object can change
> over time[*].  That can be handled readily (if not perfectly) in the informal
> internal type system(s) which programmers run in their heads (pace the very
> sensible post by Anton van Straaten today in this thread -- several branches
> away), but cannot be handled by a type system based on sets-of-values (and is
> also a counter-example to the idea that "the" dynamic type of an object/value
> can be identified with its tag).
> 
> ([*] if the set of operations in which it can legitimately partake changes.
> That can happen explicitly in Smalltalk (using DNU proxies for instance if the
> proxied object changes, or even using #becomeA:), but can happen anyway in less
> "free" languages -- the State Pattern for instance, or even (arguably) in the
> difference between an empty list and a non-empty list).

Dynamic changes in object behaviour are not incompatible with type systems based
on sets of values (e.g. semantic subtyping). There are some tricky issues in
making such a system work, and I'm not aware of any implemented language that
does it currently, but in principle it's quite feasible.

For a type system that can handle dynamic proxying, see
<http://www.doc.ic.ac.uk/~scd/FOOL11/FCM.pdf>.

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Darren New
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <hNdmg.19752$uy3.9114@tornado.socal.rr.com>
Chris Uppal wrote:
> doesn't fit with my intuitions very well -- most noticeably in that the sets
> are generally unbounded

Errr, not in Ada.  Indeed, not in any machine I know of with a limited 
address space.

Andreas Rossberg wrote:
 > Indeed, this view is much too narrow. In particular, it cannot explain
 > abstract types, which is *the* central aspect of decent type systems.

Well, it's Ada's view. I didn't say it was right for theoretical 
languages or anything like that. As far as I know, LOTOS is the only 
language that *actually* uses abstract data types - you have to use the 
equivalent of #include to bring in the integers, for example. Everything 
else uses informal rules to say how types work.

But Ada's definition gives you a very nice way of talking about things 
like whether integers that overflow are the same type as integers that 
don't overflow, or whether an assignment of an integer to a positive is 
legal, or adding a CountOfApples to a CountOfOranges is legal, or 
whether passing a "Dog" object to an "Animal" function parameter makes 
sense in a particular context.

Indeed, the ability to declare a new type that has the exact same 
underlying representation and isomorphically identical operations but 
not be the same type is something I find myself often missing in 
languages. It's nice to be able to say "this integer represents vertical 
pixel count, and that represents horizontal pixel count, and you don't 
get to add them together."

-- 
   Darren New / San Diego, CA, USA (PST)
     My Bath Fu is strong, as I have
     studied under the Showerin' Monks.
From: Matthias Blume
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <m1vequlcyh.fsf@hana.uchicago.edu>
Darren New <····@san.rr.com> writes:

> [ ... ] As far as I know, LOTOS is the only
> language that *actually* uses abstract data types - you have to use
> the equivalent of #include to bring in the integers, for
> example. Everything else uses informal rules to say how types work.

There are *tons* of languages that "actually" facilitate abstract data
types, and some of these languages are actually used by real people.
From: Darren New
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <c1fmg.15240$Z67.41@tornado.socal.rr.com>
Matthias Blume wrote:
> There are *tons* of languages that "actually" facilitate abstract data
> types, and some of these languages are actually used by real people.

I don't know of any others in actual use. Could you name a couple?

Note that I don't consider things like usual OO languages (Eiffel, 
Smalltalk, etc) to have "abstract data types".

-- 
   Darren New / San Diego, CA, USA (PST)
     My Bath Fu is strong, as I have
     studied under the Showerin' Monks.
From: Andreas Rossberg
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7bqgp$8v164$6@hades.rz.uni-saarland.de>
Darren New wrote:
> 
> As far as I know, LOTOS is the only 
> language that *actually* uses abstract data types

Maybe I don't understand what you mean with ADT here, but all languages 
with a decent module system support ADTs in the sense it is usually 
understood, see ML for a primary example. Classes in most OOPLs are 
essentially beefed-up ADTs as well.

> Indeed, the ability to declare a new type that has the exact same 
> underlying representation and isomorphically identical operations but 
> not be the same type is something I find myself often missing in 
> languages. It's nice to be able to say "this integer represents vertical 
> pixel count, and that represents horizontal pixel count, and you don't 
> get to add them together."

Not counting C/C++, I don't know when I last worked with a typed 
language that does *not* have this ability... (which is slightly 
different from ADTs, btw)

- Andreas
From: Vesa Karvonen
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7bshq$pqk$1@oravannahka.helsinki.fi>
In comp.lang.functional Andreas Rossberg <········@ps.uni-sb.de> wrote:
> Darren New wrote:
[...]
> > Indeed, the ability to declare a new type that has the exact same 
> > underlying representation and isomorphically identical operations but 
> > not be the same type is something I find myself often missing in 
> > languages. It's nice to be able to say "this integer represents vertical 
> > pixel count, and that represents horizontal pixel count, and you don't 
> > get to add them together."

> Not counting C/C++, I don't know when I last worked with a typed 
> language that does *not* have this ability... (which is slightly 
> different from ADTs, btw)

Would Java count?

-Vesa Karvonen
From: Andreas Rossberg
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7btcr$8v164$7@hades.rz.uni-saarland.de>
Vesa Karvonen wrote:
> 
>>>Indeed, the ability to declare a new type that has the exact same 
>>>underlying representation and isomorphically identical operations but 
>>>not be the same type is something I find myself often missing in 
>>>languages. It's nice to be able to say "this integer represents vertical 
>>>pixel count, and that represents horizontal pixel count, and you don't 
>>>get to add them together."
> 
>>Not counting C/C++, I don't know when I last worked with a typed 
>>language that does *not* have this ability... (which is slightly 
>>different from ADTs, btw)
> 
> Would Java count?

Yes, you are right. And there certainly are more in the OO camp.

But honestly, I do not remember when I last had to actively work with 
one of them, including Java... :-)

- Andreas
From: Darren New
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <c6fmg.15242$Z67.7053@tornado.socal.rr.com>
Andreas Rossberg wrote:
> Maybe I don't understand what you mean with ADT here, but all languages 
> with a decent module system support ADTs in the sense it is usually 
> understood, see ML for a primary example.

OK.  Maybe some things like ML and Haskell and such that I'm not 
intimately familiar with do, now that you mention it, yes.

> Classes in most OOPLs are essentially beefed-up ADTs as well.

Err, no. There's nothing really abstract about them. And their values 
are mutable. So while one can think about them as an ADT, one actually 
has to write code to (for example) calculate or keep track of how many 
entries are on a stack, if you want that information.

> Not counting C/C++, I don't know when I last worked with a typed 
> language that does *not* have this ability... (which is slightly 
> different from ADTs, btw)

Java? C#? Icon? Perl? (Hmmm... Pascal does, IIRC.) I guess you just work 
with better languages than I do. :-)

-- 
   Darren New / San Diego, CA, USA (PST)
     My Bath Fu is strong, as I have
     studied under the Showerin' Monks.
From: Andreas Rossberg
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7bvbv$8v164$8@hades.rz.uni-saarland.de>
Darren New wrote:
> 
>> Maybe I don't understand what you mean with ADT here, but all 
>> languages with a decent module system support ADTs in the sense it is 
>> usually understood, see ML for a primary example.
> 
> OK.  Maybe some things like ML and Haskell and such that I'm not 
> intimately familiar with do, now that you mention it, yes.

Well, Modula and CLU already had this, albeit in restricted form.

>> Classes in most OOPLs are essentially beefed-up ADTs as well.
> 
> Err, no. There's nothing really abstract about them. And their values 
> are mutable. So while one can think about them as an ADT, one actually 
> has to write code to (for example) calculate or keep track of how many 
> entries are on a stack, if you want that information.

Now you lost me completely. What has mutability to do with it? And the 
stack?

AFAICT, ADT describes a type whose values can only be accessed by a 
certain fixed set of operations. Classes qualify for that, as long as 
they provide proper encapsulation.

>> Not counting C/C++, I don't know when I last worked with a typed 
>> language that does *not* have this ability... (which is slightly 
>> different from ADTs, btw)
> 
> Java? C#? Icon? Perl? (Hmmm... Pascal does, IIRC.)  I guess you just work
> with better languages than I do. :-)

OK, I admit that I exaggerated slightly. Although currently I'm indeed 
able to mostly work with the more pleasant among languages. :-)

(Btw, Pascal did not have it either, AFAIK)

- Andreas
From: Darren New
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <dzfmg.3585$MF6.270@tornado.socal.rr.com>
Andreas Rossberg wrote:
> AFAICT, ADT describes a type whose values can only be accessed by a 
> certain fixed set of operations. 

No. AFAIU, an ADT defines the type based on the operations. The stack 
holding the integers 1 and 2 is the value (push(2, push(1, empty()))). 
There's no "internal" representation. The values and operations are 
defined by preconditions and postconditions.

Both a stack and a queue could be written in most languages as "values 
that can only be accessed by a fixed set of operations" having the same 
possible internal representations and the same function signatures. 
They're far from the same type, because they're not abstract. The part 
you can't see from outside the implementation is exactly the part that 
defines how it works.

Granted, it's a common mistake to write that some piece of C++ code 
implements a stack ADT.

For example, an actual ADT for a stack (and a set) is shown on
http://www.cs.unc.edu/~stotts/danish/web/userman/umadt.html
Note that the "axioms" for the stack type completely define its operation 
and semantics. There is no other "implementation" involved. And in LOTOS 
(which uses ACT.1 or ACT.ONE (I forget which) as its type language), this 
is actually how you'd write the code for a stack, and then the compiler 
would crunch a while to figure out a corresponding implementation.

I suspect "ADT" is by now so corrupted that it's useful to use a 
different term, such as "algebraic type" or some such.

> Classes qualify for that, as long as they provide proper encapsulation.

Nope.

> OK, I admit that I exaggerated slightly. Although currently I'm indeed 
> able to mostly work with the more pleasant among languages. :-)

Yah. :-)

> (Btw, Pascal did not have it either, AFAIK)

I'm pretty sure in Pascal you could say

Type Apple = Integer; Orange = Integer;
and then vars of type apple and orange were not interchangable.

-- 
   Darren New / San Diego, CA, USA (PST)
     My Bath Fu is strong, as I have
     studied under the Showerin' Monks.
From: ········@ps.uni-sb.de
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150990929.210041.198580@m73g2000cwd.googlegroups.com>
Darren New schrieb:
> Andreas Rossberg wrote:
> > AFAICT, ADT describes a type whose values can only be accessed by a
> > certain fixed set of operations.
>
> No. AFAIU, an ADT defines the type based on the operations. The stack
> holding the integers 1 and 2 is the value (push(2, push(1, empty()))).
> There's no "internal" representation. The values and operations are
> defined by preconditions and postconditions.

Are you sure that you aren't confusing *abstract* with *algebraic* data
types? In my book, abstract types usually have an internal
representation, and that can even be stateful. I don't remember having
encountered definitions of ADT as restrictive as you describe it.

> Both a stack and a queue could be written in most languages as "values
> that can only be accessed by a fixed set of operations" having the same
> possible internal representations and the same function signatures.
> They're far from the same type, because they're not abstract.

Different abstract types can have the same signature. That does not
make them the same type. The types are distinguished by their identity.

Likewise, two classes can have the same set of methods, without being
the same class (at least in languages that have nominal typing, which
includes almost all typed OOPLs).

> I'm pretty sure in Pascal you could say
>
> Type Apple = Integer; Orange = Integer;
> and then vars of type apple and orange were not interchangable.

No, the following compiles perfectly fine (using GNU Pascal):

  program bla;
  type
    apple = integer;
    orange = integer;
  var
    a : apple;
    o : orange;
  begin
    a := o
  end.

- Andreas
From: Darren New
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <uazmg.15709$Z67.12824@tornado.socal.rr.com>
········@ps.uni-sb.de wrote:
> Are you sure that you aren't confusing *abstract* with *algebraic* data
> types? 

I've never heard of algebraic data types. It's always been "abstract 
data types". Perhaps I stopped studying it, and the terminology changed 
when many people started to consider encapsulation as abstraction. I 
have any number of old textbooks and journals that refer to these as 
"abstract data types", including texts that show an abstract data type 
and then explain how to program it as an encapsulated object class.

> No, the following compiles perfectly fine (using GNU Pascal):

That'll teach me to rely on 15-year-old memories. :-) Maybe I'm 
remembering the wishful thinking from when I used Pascal.

-- 
   Darren New / San Diego, CA, USA (PST)
     My Bath Fu is strong, as I have
     studied under the Showerin' Monks.
From: George Neuner
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1kcs92pmg156qdg89ksfuuslgv6njcvkt3@4ax.com>
On 22 Jun 2006 08:42:09 -0700, ········@ps.uni-sb.de wrote:

>Darren New schrieb:
>> I'm pretty sure in Pascal you could say
>>
>> Type Apple = Integer; Orange = Integer;
>> and then vars of type apple and orange were not interchangable.
>
>No, the following compiles perfectly fine (using GNU Pascal):
>
>  program bla;
>  type
>    apple = integer;
>    orange = integer;
>  var
>    a : apple;
>    o : orange;
>  begin
>    a := o
>  end.

You are both correct.  

The original Pascal specification failed to mention whether user
defined types should be compatible by name or by structure.  Though
Wirth's own 1974 reference implementation used name compatibility,
implementations were free to use structure compatibility instead and
many did.  There was a time when typing differences made Pascal code
completely non-portable[1].

When Pascal was finally standardized in 1983, the committees followed
C's (dubious) example and chose to use structure compatibility for
simple types and name compatibility for records.


[1] Wirth also failed to specify whether boolean expression evaluation
should be short-circuit or complete.  Again, implementations went in
both directions.  Some allowed either method by switch, but the type
compatibility issue continued to plague Pascal until standard
conforming compilers emerged in the mid 80's.

George
--
for email reply remove "/" from address
From: Stephen J. Bevan
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <87bqs95l49.fsf@dnsalias.com>
Darren New <····@san.rr.com> writes:
> Andreas Rossberg wrote:
>> AFAICT, ADT describes a type whose values can only be accessed by a
>> certain fixed set of operations.
>
> No. AFAIU, an ADT defines the type based on the operations. The stack
> holding the integers 1 and 2 is the value (push(2, push(1,
> empty()))). There's no "internal" representation. The values and
> operations are defined by preconditions and postconditions.

As a user of the ADT you get a specification consisting of a signature
(names the operations and defines their arity and type of each
argument) and axioms which define the behaviour of the operations.

The implementer has the task of producing an implementation (or
"model", to use algebraic specification terminology) that satisfies the
specification.  One model that comes for free is the term model where
the operations are treated as terms and the representation of a stack
containing 2 and 1 would just be "(push(2, push(1, empty())))".
However, while that model comes for free it isn't necessarily the most
efficient model.  Thus the non-free aspect of choosing a different
implementation is that technically it requires an accompanying proof
that the implementation satisfies the specification.
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7et83$2bd$2@online.de>
Andreas Rossberg schrieb:
> (Btw, Pascal did not have it either, AFAIK)

Indeed.
Some Pascal dialects have it.

Regards,
Jo
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e8d89p$29d$1@online.de>
Andreas Rossberg schrieb:
> AFAICT, ADT describes a type whose values can only be accessed by a 
> certain fixed set of operations. Classes qualify for that, as long as 
> they provide proper encapsulation.

The first sentence is true if you associate a semantics (i.e. axioms) 
with the operations. Most OO languages don't have a place for expressing 
axioms (except via comments and handwaving), so they still don't fully 
qualify.

Regards,
jo
From: Chris Uppal
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <449aaea0$1$656$bed64819@news.gradwell.net>
I wrote:

> It would be interesting to see what a language designed specifically to
> support user-defined, pluggable, and perhaps composable, type systems
> would look like.

Since writing that I've come across some thoughts by Gilad Bracha (a name known
to Java and Smalltalk enthusiasts alike) here:

    http://blogs.sun.com/roller/page/gbracha?entry=a_few_ideas_on_type

and a long, and occasionally interesting, related thread on LtU:

    http://lambda-the-ultimate.org/node/1311

Not much discussion of concrete language design, though.

    -- chris
From: Andreas Rossberg
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e78o2g$8usvs$2@hades.rz.uni-saarland.de>
Rob Thorpe wrote:
> 
> No, that isn't what I said.  What I said was:
> "A language is latently typed if a value has a property - called it's
> type - attached to it, and given it's type it can only represent values
> defined by a certain class."

"it [= a value] [...] can [...] represent values"?

> Easy, any statically typed language is not latently typed.  Values have
> no type associated with them, instead variables have types.

A (static) type system assigns types to (all) *expressions*. This 
includes values as well as variables.

Don't confuse type assignment with type annotation (which many 
mainstream languages enforce for, but also only allow for, variable 
declarations).

- Andreas
From: Rob Thorpe
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150809118.677124.142230@u72g2000cwu.googlegroups.com>
Andreas Rossberg wrote:
> Rob Thorpe wrote:
> >
> > No, that isn't what I said.  What I said was:
> > "A language is latently typed if a value has a property - called it's
> > type - attached to it, and given it's type it can only represent values
> > defined by a certain class."
>
> "it [= a value] [...] can [...] represent values"?

???

> > Easy, any statically typed language is not latently typed.  Values have
> > no type associated with them, instead variables have types.
>
> A (static) type system assigns types to (all) *expressions*.

That's right most of the time, yes; I probably should have said
expressions.  Though I can think of statically typed languages where the
resulting type of an expression depends on the type of the variable it
is being assigned to.

> This includes values as well as variables.

Well I haven't programmed in any statically typed language where values
have types themselves.  Normally the language only knows that a value
is of type Q because it's held in a variable of type Q, or, as you mention,
in an expression of type Q.  There are many things that could be
considered more than one type.  The integer 45 could be unsigned 45 or
signed 45 or even float 45 depending on the variable it's in, but
without context it doesn't have a precise type.

It does imply the type to some extent, though; you could imagine a
language where every value has a precise type.  So, you've found a hole
in my definition.

Maybe a better definition would be:

if (variables have types || expressions have types)
    <lang is statically typed>
else if (values have types)
    <lang is latently/dynamically typed>
else
    <lang is untyped>

That seems to fit usage quite well.

Even then there are problems.  Perl has static types for arrays, hashes
and scalars.  But scalars themselves can be integers, strings, etc.

> Don't confuse type assignment with type annotation (which many
> mainstream languages enforce for, but also only allow for, variable
> declarations).

Point taken.
From: Andreas Rossberg
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e78ukp$8usvs$3@hades.rz.uni-saarland.de>
Rob Thorpe wrote:
>>
>>>No, that isn't what I said.  What I said was:
>>>"A language is latently typed if a value has a property - called it's
>>>type - attached to it, and given it's type it can only represent values
>>>defined by a certain class."
>>
>>"it [= a value] [...] can [...] represent values"?
> 
> ???

I just quoted, in condensed form, what you said above: namely, that a 
value represents values - which I find a strange and circular definition.

>>A (static) type system assigns types to (all) *expressions*.
> 
> That's right most of the time yes, I probably should have said
> expressions.  Though I can think of static typed languages where the
> resulting type of an expression depends on the type of the variable it
> is being assigned to.

Yes, but that's no contradiction. A type system does not necessarily 
assign *unique* types to individual expressions (consider overloading, 
subtyping, polymorphism, etc).

> Well I haven't programmed in any statically typed language where values
> have types themselves.

They all have - the whole purpose of a type system is to ensure that any 
expression of type T always evaluates to a value of type T. So when you 
look at type systems formally then you certainly have to assign types to 
values, otherwise you couldn't prove any useful property about those 
systems (esp. soundness).

- Andreas
From: Rob Thorpe
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150815461.210286.53120@r2g2000cwb.googlegroups.com>
Andreas Rossberg wrote:
> Rob Thorpe wrote:
> >>
> >>>No, that isn't what I said.  What I said was:
> >>>"A language is latently typed if a value has a property - called it's
> >>>type - attached to it, and given it's type it can only represent values
> >>>defined by a certain class."
> >>
> >>"it [= a value] [...] can [...] represent values"?
> >
> > ???
>
> I just quoted, in condensed form, what you said above: namely, that a
> value represents values - which I find a strange and circular definition.

Yes, but the point is, as the other poster mentioned: values defined by
a class.
For example, in lisp:
"xyz" is a string, #(1 2 3) is an array, '(1 2 3) is a list, 45 is a
fixed-point number.
Each item that can be stored has a type, no item can be stored without
one.  The types can be tested.  Most dynamic typed languages are like
that.

Compare this to an untyped language where types cannot generally be
tested.

> >>A (static) type system assigns types to (all) *expressions*.
> >
> > That's right most of the time yes, I probably should have said
> > expressions.  Though I can think of static typed languages where the
> > resulting type of an expression depends on the type of the variable it
> > is being assigned to.
>
> Yes, but that's no contradiction. A type system does not necessarily
> assign *unique* types to individual expressions (consider overloading,
> subtyping, polymorphism, etc).

That's a fair way of looking at it.

> > Well I haven't programmed in any statically typed language where values
> > have types themselves.
>
> They all have - the whole purpose of a type system is to ensure that any
> expression of type T always evaluates to a value of type T.

But it only guarantees this because the variables themselves have a
type; the values themselves do not.  Most of the time the fact that the
variable they are held in has a type implies that the value does too.
But the value itself has no type; in a C program, for example, I can take
the value from some variable and recast it in any way I feel, and the
language cannot correct any errors I make because there is no
information in the variable to indicate what type it is.

> So when you
> look at type systems formally then you certainly have to assign types to
> values, otherwise you couldn't prove any useful property about those
> systems (esp. soundness).

Yes, but indirectly.
From: Matthias Blume
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <m1ejxjhlhs.fsf@hana.uchicago.edu>
"Rob Thorpe" <·············@antenova.com> writes:

> Andreas Rossberg wrote:
>> Rob Thorpe wrote:
>> >>
>> >>>No, that isn't what I said.  What I said was:
>> >>>"A language is latently typed if a value has a property - called it's
>> >>>type - attached to it, and given it's type it can only represent values
>> >>>defined by a certain class."
>> >>
>> >>"it [= a value] [...] can [...] represent values"?
>> >
>> > ???
>>
>> I just quoted, in condensed form, what you said above: namely, that a
>> value represents values - which I find a strange and circular definition.
>
> Yes, but the point is, as the other poster mentioned: values defined by
> a class.
> For example, in lisp:
> "xyz" is a string, #(1 2 3) is an array, '(1 2 3) is a list, 45 is a
> fixed-point number.
> Each item that can be stored has a type, no item can be stored without
> one.  The types can be tested.  Most dynamic typed languages are like
> that.

Your "types" are just predicates.

> Compare this to an untyped language where types cannot generally be
> tested.

You mean there are no predicates in untyped languages?

>> They all have - the whole purpose of a type system is to ensure that any
>> expression of type T always evaluates to a value of type T.
>
> But it only gaurantees this because the variables themselves have a
> type, the values themselves do not.

Of course they do.

>  Most of the time the fact that the
> variable they are held in has a type infers that the value does too.
> But the value itself has no type, in a C program for example I can take
> the value from some variable and recast it in any way I feel and the
> language cannot correct any errors I make because their is no
> information in the variable to indicate what type it is.

Casting in C takes values of one type to values of another type.

>> So when you
>> look at type systems formally then you certainly have to assign types to
>> values, otherwise you couldn't prove any useful property about those
>> systems (esp. soundness).
>
> Yes, but indirectly.

No, this is usually done very directly and very explicitly.
From: Rob Thorpe
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150822150.701434.296460@u72g2000cwu.googlegroups.com>
Matthias Blume wrote:
> "Rob Thorpe" <·············@antenova.com> writes:
> > Andreas Rossberg wrote:
> >> Rob Thorpe wrote:
> >> >>
> >> >>>No, that isn't what I said.  What I said was:
> >> >>>"A language is latently typed if a value has a property - called it's
> >> >>>type - attached to it, and given it's type it can only represent values
> >> >>>defined by a certain class."
> >> >>
> >> >>"it [= a value] [...] can [...] represent values"?
> >> >
> >> > ???
> >>
> >> I just quoted, in condensed form, what you said above: namely, that a
> >> value represents values - which I find a strange and circular definition.
> >
> > Yes, but the point is, as the other poster mentioned: values defined by
> > a class.
> > For example, in lisp:
> > "xyz" is a string, #(1 2 3) is an array, '(1 2 3) is a list, 45 is a
> > fixed-point number.
> > Each item that can be stored has a type, no item can be stored without
> > one.  The types can be tested.  Most dynamic typed languages are like
> > that.
>
> Your "types" are just predicates.

You can call them predicates if you like.  Most people programming in
Python, Perl, or Lisp will call them types, though.

> > Compare this to an untyped language where types cannot generally be
> > tested.
>
> You mean there are no predicates in untyped languages?

Well, no, there aren't.  That's what defines a language as untyped.

Of course you can build them with your own code, in assembler for
example, but that's outside the language.

> >> They all have - the whole purpose of a type system is to ensure that any
> >> expression of type T always evaluates to a value of type T.
> >
> > But it only gaurantees this because the variables themselves have a
> > type, the values themselves do not.
>
> Of course they do.

I think we're discussing this at cross-purposes.  In a language like C
or another statically typed language there is no information passed
with values indicating their type.  Have a look in a C compiler if you
don't believe me.  You store 56 in a location in memory, then in most
compilers all that will be stored is 56, nothing more.  The compiler
relies entirely on the types of the variables to know how to correctly
operate on the values.  The values themselves have no type information
associated with them.

> >  Most of the time the fact that the
> > variable they are held in has a type infers that the value does too.
> > But the value itself has no type, in a C program for example I can take
> > the value from some variable and recast it in any way I feel and the
> > language cannot correct any errors I make because their is no
> > information in the variable to indicate what type it is.
>
> Casting in C takes values of one type to values of another type.

No it doesn't. Casting reinterprets a value of one type as a value of
another type.
There is a difference.  If I cast an unsigned integer 2000000000 to a
signed integer in C on the machine I'm using then the result I will get
will not make any sense.

> >> So when you
> >> look at type systems formally then you certainly have to assign types to
> >> values, otherwise you couldn't prove any useful property about those
> >> systems (esp. soundness).
> >
> > Yes, but indirectly.
>
> No, this is usually done very directly and very explicitly.

No it isn't.
If I say "4 is an integer" that is a direct statement about 4.
If I say "X is an integer and X is 4, therefore 4 is an integer" that
is a slightly less direct statement about 4.

The difference in directness is only small, and much more indirectness
is needed to prove anything useful about a system formally.
From: Matthias Blume
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <m164ivhebb.fsf@hana.uchicago.edu>
"Rob Thorpe" <·············@antenova.com> writes:

> I think we're discussing this at cross-purposes.  In a language like C
> or another statically typed language there is no information passed
> with values indicating their type.

You seem to be confusing "does not have a type" with "no type
information is passed at runtime".

> Have a look in a C compiler if you don't believe me.

Believe me, I have.

> No it doesn't. Casting reinterprets a value of one type as a value of
> another type.
> There is a difference.  If I cast an unsigned integer 2000000000 to a
> signed integer in C on the machine I'm using then the result I will get
> will not make any sense.

Which result are you getting?  What does it mean to "make sense"?
From: Rob Thorpe
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150883442.498308.226730@b68g2000cwa.googlegroups.com>
Matthias Blume wrote:
> "Rob Thorpe" <·············@antenova.com> writes:
>
> > I think we're discussing this at cross-purposes.  In a language like C
> > or another statically typed language there is no information passed
> > with values indicating their type.
>
> You seem to be confusing "does not have a type" with "no type
> information is passed at runtime".
>
> > Have a look in a C compiler if you don't believe me.
>
> Believe me, I have.

In a C compiler, the compiler has no idea what the values in the
program are.
It knows only their types, in that it knows the types of the variables
they are contained within.
Would you agree with that?

> > No it doesn't. Casting reinterprets a value of one type as a value of
> > another type.
> > There is a difference.  If I cast an unsigned integer 2000000000 to a
> > signed integer in C on the machine I'm using then the result I will get
> > will not make any sense.
>
> Which result are you getting?  What does it mean to "make sense"?

Well the right one actually, bad example.

But, if I cast an unsigned int 2500000000 to signed I get -1794967296.
From: Matthias Blume
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <m2u06ehbki.fsf@hanabi.local>
"Rob Thorpe" <·············@antenova.com> writes:

> Matthias Blume wrote:
>> "Rob Thorpe" <·············@antenova.com> writes:
>>
>> > I think we're discussing this at cross-purposes.  In a language like C
>> > or another statically typed language there is no information passed
>> > with values indicating their type.
>>
>> You seem to be confusing "does not have a type" with "no type
>> information is passed at runtime".
>>
>> > Have a look in a C compiler if you don't believe me.
>>
>> Believe me, I have.
>
> In a C compiler the compiler has no idea what the values are in the
> program.

It is no different from any other compiler, really.  If the compiler
sees the literal 1 in a context that demands type int, then it knows
perfectly well what value that is.

> It knows only their type in that it knows the type of the variable they
> are contained within.
> Would you agree with that?
>
>> > No it doesn't. Casting reinterprets a value of one type as a value of
>> > another type.
>> > There is a difference.  If I cast an unsigned integer 2000000000 to a
>> > signed integer in C on the machine I'm using then the result I will get
>> > will not make any sense.
>>
>> Which result are you getting?  What does it mean to "make sense"?
>
> Well the right one actually, bad example.
>
> But, if I cast an unsigned int 2500000000 to signed I get -1794967296.

So, why do you think this "does not make sense"?  And, as this example
illustrates, casting in C maps values to values.  Depending on the
types of the source and the target, a cast might change the underlying
representation, or it might leave it the same.  But it does produce a
value, and the result value is usually not the same as the argument
value, even if the representation is the same.

Matthias
From: Rob Thorpe
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150899429.801658.16120@r2g2000cwb.googlegroups.com>
Matthias Blume wrote:
> "Rob Thorpe" <·············@antenova.com> writes:
> > Matthias Blume wrote:
> >> "Rob Thorpe" <·············@antenova.com> writes:
> >>
> >> > I think we're discussing this at cross-purposes.  In a language like C
> >> > or another statically typed language there is no information passed
> >> > with values indicating their type.
> >>
> >> You seem to be confusing "does not have a type" with "no type
> >> information is passed at runtime".
> >>
> >> > Have a look in a C compiler if you don't believe me.
> >>
> >> Believe me, I have.
> >
> > In a C compiler the compiler has no idea what the values are in the
> > program.
>
> It is no different from any other compiler, really.  If the compiler
> sees the literal 1 in a context that demands type int, then it knows
> perfectly well what value that is.

Well, with a literal, yes.  But with the value of some variable x at
runtime it only knows the type because it knows the type of the
variable.  Similarly, the types of values generated by an expression
are only known because the type the expression generates is known.

> > It knows only their type in that it knows the type of the variable they
> > are contained within.
> > Would you agree with that?

Would you?

> >> > No it doesn't. Casting reinterprets a value of one type as a value of
> >> > another type.
> >> > There is a difference.  If I cast an unsigned integer 2000000000 to a
> >> > signed integer in C on the machine I'm using then the result I will get
> >> > will not make any sense.
> >>
> >> Which result are you getting?  What does it mean to "make sense"?
> >
> > Well the right one actually, bad example.
> >
> > But, if I cast an unsigned int 2500000000 to signed I get -1794967296.
>
> So, why do you think this "does not make sense"?

Well, it makes sense in terms of the C spec certainly.
It does not make sense in that it does not emit an error.

>And, as this example
> illustrates, casting in C maps values to values.  Depending on the
> types of the source and the target, a cast might change the underlying
> representation, or it might leave it the same.  But it does produce a
> value, and the result value is usually not the same as the argument
> value, even if the representation is the same.

Yes. I'm not arguing with that.
From: Matthias Blume
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <m1wtbaft0u.fsf@hana.uchicago.edu>
"Rob Thorpe" <·············@antenova.com> writes:

>> >> > No it doesn't. Casting reinterprets a value of one type as a value of
>> >> > another type.
>> >> > There is a difference.  If I cast an unsigned integer 2000000000 to a
>> >> > signed integer in C on the machine I'm using then the result I will get
>> >> > will not make any sense.
>> >>
>> >> Which result are you getting?  What does it mean to "make sense"?
>> >
>> > Well the right one actually, bad example.
>> >
>> > But, if I cast an unsigned int 2500000000 to signed I get -1794967296.
>>
>> So, why do you think this "does not make sense"?
>
> Well, it makes sense in terms of the C spec certainly.
> It does not make sense in that it does not emit an error.

Why not?
From: Rob Thorpe
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150905566.142680.246550@u72g2000cwu.googlegroups.com>
Matthias Blume wrote:
> "Rob Thorpe" <·············@antenova.com> writes:
>
> >> >> > No it doesn't. Casting reinterprets a value of one type as a value of
> >> >> > another type.
> >> >> > There is a difference.  If I cast an unsigned integer 2000000000 to a
> >> >> > signed integer in C on the machine I'm using then the result I will get
> >> >> > will not make any sense.
> >> >>
> >> >> Which result are you getting?  What does it mean to "make sense"?
> >> >
> >> > Well the right one actually, bad example.
> >> >
> >> > But, if I cast an unsigned int 2500000000 to signed I get -1794967296.
> >>
> >> So, why do you think this "does not make sense"?
> >
> > Well, it makes sense in terms of the C spec certainly.
> > It does not make sense in that it does not emit an error.
>
> Why not?

Well, it's really a matter of perspective, isn't it?  In some cases it
may even be what the programmer intends.

From my perspective I don't expect a large positive number to turn into
a negative one just because I've requested that it be signed.  The
behaviour I would expect is that either the variable being set gets the
right number, or an error occurs.  I don't care if the reporting is at
runtime or compile time, so long as it can be seen.

--
Anyway, this is a small point.  Does the fact that you only disagreed
with me on this point indicate that you agreed with everything else?

... Only joking :) I think you and I have irreconcilable views on this
subject.
I think we can agree that values mostly have some sort of type in a
statically typed language.  You may say that they directly have types,
but I can't see how that can be; as far as I can see they have types
only indirectly.  This is only a minor point though.

The issue of typing of values vs variables is not practically
meaningless though.

The main problem I find with statically typed programs is that much of
the world outside the program is not statically typed.  Library and API
writers in particular seem to treat static typing carelessly.  As a
result it's not generally possible to write a program meaningfully
inside the static type system, as the type system's designers intended.
You're forced to break it all over the place with casts etc. anyway.
And at every place you break it the possibility of error arises and
flows from that place into the rest of the program.  So, you end up
actually checking the values of variables rather than relying on their
types anyway.  And this can be difficult because the values have no
associated information telling you their types.

It's different writing a function deep inside a large program; in that
case data can be cleaned up beforehand.  There the static typing works
as intended and becomes more useful.


Anyway, I don't think it would help either of us to continue this
sub-thread any more.  Feel free to reply to this post, but I might not
reply back.

Static typing is rather off topic on most of the newsgroups this thread
is in anyway, and I suspect the people in them are getting rather sick
of it.
From: David Hopwood
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <Ogdmg.443406$tc.367810@fe2.news.blueyonder.co.uk>
Rob Thorpe wrote:
> Matthias Blume wrote:
>>"Rob Thorpe" <·············@antenova.com> writes:
>>
>>>I think we're discussing this at cross-purposes.  In a language like C
>>>or another statically typed language there is no information passed
>>>with values indicating their type.
>>
>>You seem to be confusing "does not have a type" with "no type
>>information is passed at runtime".
>>
>>>Have a look in a C compiler if you don't believe me.
>>
>>Believe me, I have.
> 
> In a C compiler the compiler has no idea what the values are in the program.
> It knows only their type in that it knows the type of the variable they
> are contained within.
> Would you agree with that?

No. In any language, it may be possible to statically infer that the
value of an expression will belong to a set of values smaller than that
allowed by the expression's type in that language's type system. For
example, all constants have a known value, but most constants have a
type which allows more than one value.

(This is an essential point, not just nitpicking.)

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Rob Thorpe
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150905875.552388.44170@y41g2000cwy.googlegroups.com>
David Hopwood wrote:
> Rob Thorpe wrote:
> > Matthias Blume wrote:
> >>"Rob Thorpe" <·············@antenova.com> writes:
> >>
> >>>I think we're discussing this at cross-purposes.  In a language like C
> >>>or another statically typed language there is no information passed
> >>>with values indicating their type.
> >>
> >>You seem to be confusing "does not have a type" with "no type
> >>information is passed at runtime".
> >>
> >>>Have a look in a C compiler if you don't believe me.
> >>
> >>Believe me, I have.
> >
> > In a C compiler the compiler has no idea what the values are in the program.
> > It knows only their type in that it knows the type of the variable they
> > are contained within.
> > Would you agree with that?
>
> No. In any language, it may be possible to statically infer that the
> value of an expression will belong to a set of values smaller than that
> allowed by the expression's type in that language's type system. For
> example, all constants have a known value, but most constants have a
> type which allows more than one value.
>
> (This is an essential point, not just nitpicking.)

Yes, I agree.  That does not apply in general though.
In general the value of the variable could be, for example, read from a
file, in which case the compiler may know its type, but nothing more.

I was talking about the general case.
From: Darren New
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <zSVlg.19623$uy3.12503@tornado.socal.rr.com>
Rob Thorpe wrote:
> The compiler
> relys entirely on the types of the variables to know how to correctly
> operate on the values.  The values themselves have no type information
> associated with them.

int x = (int) (20.5 / 3);

What machine code operations does the "/" there invoke? Integer 
division, or floating point division? How did the variables involved in 
the expression affect that?

>>Casting in C takes values of one type to values of another type.

> No it doesn't. Casting reinterprets a value of one type as a value of
> another type.

No it doesn't.
int x = (int) 20.5;
There's no point at which bits from the floating point representation 
appear in the variable x.

int * x = (int *) 0;
There's nothing that indicates all the bits of "x" are zero, and indeed 
in some hardware configurations they aren't.

-- 
   Darren New / San Diego, CA, USA (PST)
     My Bath Fu is strong, as I have
     studied under the Showerin' Monks.
From: Rob Thorpe
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150824033.452278.45310@c74g2000cwc.googlegroups.com>
Darren New wrote:
> Rob Thorpe wrote:
> > The compiler
> > relys entirely on the types of the variables to know how to correctly
> > operate on the values.  The values themselves have no type information
> > associated with them.
>
> int x = (int) (20.5 / 3);
>
> What machine code operations does the "/" there invoke? Integer
> division, or floating point division? How did the variables involved in
> the expression affect that?

In that case it knew which to use because it could see the literals at
compile time.  In general though it doesn't.
If I divide x / y it only knows which to use because of types declared
for x and y.

> >>Casting in C takes values of one type to values of another type.
>
> > No it doesn't. Casting reinterprets a value of one type as a value of
> > another type.
>
> No it doesn't.
> int x = (int) 20.5;
> There's no point at which bits from the floating point representation
> appear in the variable x.
>
> int * x = (int *) 0;
> There's nothing that indicates all the bits of "x" are zero, and indeed
> in some hardware configurations they aren't.

I suppose some casts are conversions and some are reinterpretations.
What I should have said is that there are cases where a cast
reinterprets.
From: Darren New
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <boWlg.19627$uy3.179@tornado.socal.rr.com>
Rob Thorpe wrote:
> Darren New wrote:
>>Rob Thorpe wrote:
>>>The values themselves have no type information associated with them.
>>int x = (int) (20.5 / 3);
> In that case it knew because it could see at compile time.

Well, yes. That's the point of static typing.

>  In general though it doesn't.

Sure it does. There are all kinds of formal rules about type promotions 
and operator version selection.

Indeed, in *general* every value has a type associated with it (at 
compile time). Some values have a type associated with them due to the 
declaration of the variable supplying the value, but that's far from the 
general case.

Note that in
main() { char x = 'x'; foo(x); }
the value passed to "foo" is not even the same type as the declaration 
of "x", so it's far from the general case that variables even determine 
the values they provide to the next part of the calculation.

> If I divide x / y it only knows which to use because of types declared
> for x and y.

Yes? So? All you're saying is that the value of the expression "x" is 
based on the declared type for the variable "x" in scope at that point. 
That doesn't mean values don't have types. It just means that *some* 
values' types are determined by the type of the variable the value is 
stored in. As soon as you do anything *else* with that value, such as 
passing it to an operator, a function, or a cast, the value potentially 
takes on a type different from that of the variable from which it came.

> I suppose some are conversions and some reinterpretations.  What I
> should have said it that there are cases where cast reinterprets.

There are cases where the cast is defined to return a value that has the 
same bit pattern as its argument, to the extent the hardware can support 
it.  However, this is obviously limited to the values for which C 
actually defines the bit patterns of its values, namely the scalar 
integers.

Again, you're taking a special case and treating all the rest (which are 
a majority) as aberrations.  For the most part, casts do *not* return the 
same bit pattern as the value.  For the most part
union {T1 x; T2 y;};
can be used to do transformations that
T2 y; T1 x = (T1) y;
does not. Indeed, the only bit patterns that don't change are when 
casting from signed to unsigned versions of the same underlying scalar 
type and back. Everything else must change, except perhaps pointers, 
depending on your architecture. (I've worked on machines where char* had 
more bits than long*, for example.)

(Funny that comp.lang.c isn't on this thread. ;-)

-- 
   Darren New / San Diego, CA, USA (PST)
     My Bath Fu is strong, as I have
     studied under the Showerin' Monks.
From: Darren New
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <iRUlg.15161$Z67.14450@tornado.socal.rr.com>
Rob Thorpe wrote:
> Yes, but the point is, as the other poster mentioned: values defined by
> a class.

A value can only represent one value, right? Or can a value have 
multiple values?

> For example, in lisp:
> "xyz" is a string, 

"xyz" cannot represent values from the class of strings. It can only 
represent one value.

I think that's what the others are getting at.

>>They all have - the whole purpose of a type system is to ensure that any
>>expression of type T always evaluates to a value of type T.
> 
> But it only gaurantees this because the variables themselves have a
> type, the values themselves do not.  

Sure they do. 23.5e3 is a "real" in Pascal, and there's no variable there.

("hello" % "there") is illegal in most languages, because the modulo 
operator doesn't apply to strings. How could this be determined at 
compile time if "hello" and "there" don't have types?

-- 
   Darren New / San Diego, CA, USA (PST)
     My Bath Fu is strong, as I have
     studied under the Showerin' Monks.
From: Ketil Malde
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <eghd2f264i.fsf@polarvier.ii.uib.no>
"Rob Thorpe" <·············@antenova.com> writes:

> But it only gaurantees this because the variables themselves have a
> type, the values themselves do not. 

I think statements like this are confusing, because there are
different interpretations of what a "value" is.  I would say that the
integer '4' is a value, and that it has type Integer (for instance).
This value is different from 4 the Int16, or 4 the double-precision
floating point number.  From this viewpoint, all values in statically
typed languages have types, but I think you use 'value' to denote the
representation of a datum in memory, which is a different thing.

-k
-- 
If I haven't seen further, it is by standing in the footprints of giants
From: Rob Thorpe
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150821239.077242.236710@y41g2000cwy.googlegroups.com>
Ketil Malde wrote:
> "Rob Thorpe" <·············@antenova.com> writes:
>
> > But it only gaurantees this because the variables themselves have a
> > type, the values themselves do not.
>
> I think statements like this are confusing, because there are
> different interpretations of what a "value" is.  I would say that the
> integer '4' is a value, and that it has type Integer (for instance).
> This value is different from 4 the Int16, or 4 the double-precision
> floating point number.  From this viewpoint, all values in statically
> typed languages have types, but I think you use 'value' to denote the
> representation of a datum in memory, which is a different thing.

Well I haven't been consistent so far :)

But I mean the value as the semantics of the program itself sees it,
which mostly means the datum in memory.
From: Ketil Malde
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <eg8xnk1h43.fsf@polarvier.ii.uib.no>
"Rob Thorpe" <·············@antenova.com> writes:

>> I think statements like this are confusing, because there are
>> different interpretations of what a "value" is.

> But I mean the value as the semantics of the program itself sees it.
> Which mostly means the datum in memory.

I don't agree with that.  Generally, a language specifies a virtual
machine and doesn't need to concern itself with things like "memory"
at all.  Languages like C try to, but look at all the undefined
behavior you get when you make assumptions about memory layout etc.

Memory representation is just an artifact of a particular
implementation of the language for a particular architecture.

-k
-- 
If I haven't seen further, it is by standing in the footprints of giants
From: Andreas Rossberg
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e794rg$8usvs$5@hades.rz.uni-saarland.de>
Rob Thorpe wrote:
>Andreas Rossberg wrote:
>>Rob Thorpe wrote:
>>
>>>>>"A language is latently typed if a value has a property - called it's
>>>>>type - attached to it, and given it's type it can only represent values
>>>>>defined by a certain class."
>>>>
>>>>"it [= a value] [...] can [...] represent values"?
>>>
>>>???
>>
>>I just quoted, in condensed form, what you said above: namely, that a
>>value represents values - which I find a strange and circular definition.
> 
> Yes, but the point is, as the other poster mentioned: values defined by
> a class.

I'm sorry, but I still think that the definition makes little sense. 
Obviously, a value simply *is* a value, it does not "represent" one, or 
several ones, regardless of how you qualify that statement.

>>>Well I haven't programmed in any statically typed language where values
>>>have types themselves.
>>
>>They all have - the whole purpose of a type system is to ensure that any
>>expression of type T always evaluates to a value of type T.
> 
> But it only gaurantees this because the variables themselves have a
> type,

No, variables are insignificant in this context. You can consider a 
language without variables at all (such languages exist, and they can 
even be Turing-complete) and still have evaluation, values, and a 
non-trivial type system.

> But the value itself has no type

You mean that the type of the value is not represented at runtime? True, 
but that's simply because the type system is static. It's not the same 
as saying it has no type.

> in a C program for example I can take
> the value from some variable and recast it in any way I feel and the
> language cannot correct any errors I make because their is no
> information in the variable to indicate what type it is.

Nothing in the C spec precludes an implementation from doing just that. 
The problem with C rather is that its semantics is totally 
underspecified. In any case, C is about the worst example to use when 
discussing type systems. For starters, it is totally unsound - which is 
what your example exploits.

- Andreas
From: Rob Thorpe
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150823297.816352.75370@u72g2000cwu.googlegroups.com>
Andreas Rossberg wrote:
> Rob Thorpe wrote:
> >Andreas Rossberg wrote:
> >>Rob Thorpe wrote:
> >>
> >>>>>"A language is latently typed if a value has a property - called it's
> >>>>>type - attached to it, and given it's type it can only represent values
> >>>>>defined by a certain class."
> >>>>
> >>>>"it [= a value] [...] can [...] represent values"?
> >>>
> >>>???
> >>
> >>I just quoted, in condensed form, what you said above: namely, that a
> >>value represents values - which I find a strange and circular definition.
> >
> > Yes, but the point is, as the other poster mentioned: values defined by
> > a class.
>
> I'm sorry, but I still think that the definition makes little sense.
> Obviously, a value simply *is* a value, it does not "represent" one, or
> several ones, regardless how you qualify that statement.

You've clipped a lot of context.  I'll put some back in, I said:-

> > Yes, but the point is, as the other poster mentioned: values defined by
> > a class.
> > For example, in lisp:
> > "xyz" is a string, #(1 2 3) is an array, '(1 2 3) is a list, 45 is a
> > fixed-point number.
> > Each item that can be stored has a type, no item can be stored without
> > one.  The types can be tested.  Most dynamic typed languages are like
> > that.
> > Compare this to an untyped language where types cannot generally be
> > tested.

I think this should make it clear.  If I have a "xyz" in lisp I know it
is a string.
If I have "xyz" in an untyped language like assembler it may be
anything: two pointers packed in binary, an integer, a bitfield.  There
is no data at compile time or runtime to tell what it is; the
programmer has to remember.

(I'd point out this isn't true of all assemblers, there are some typed
assemblers)

> >>>Well I haven't programmed in any statically typed language where values
> >>>have types themselves.
> >>
> >>They all have - the whole purpose of a type system is to ensure that any
> >>expression of type T always evaluates to a value of type T.
> >
> > But it only gaurantees this because the variables themselves have a
> > type,
>
> No, variables are insignificant in this context. You can consider a
> language without variables at all (such languages exist, and they can
> even be Turing-complete) and still have evaluation, values, and a
> non-trivial type system.

Hmm.  You're right; ML is nowhere in my definition, since it has no
variables.

> > But the value itself has no type
>
> You mean that the type of the value is not represented at runtime? True,
> but that's simply because the type system is static. It's not the same
> as saying it has no type.

Well, is it even represented at compile time?
The compiler doesn't know in general what values will exist at runtime;
it knows only what types the variables have.  Sometimes it has only
partial knowledge, and sometimes the programmer deliberately overrides
it.  From that knowledge you could say it knows what types values will
have.

> > in a C program for example I can take
> > the value from some variable and recast it in any way I feel and the
> > language cannot correct any errors I make because their is no
> > information in the variable to indicate what type it is.
>
> Nothing in the C spec precludes an implementation from doing just that.

True, that would be an interesting implementation.

> The problem with C rather is that its semantics is totally
> underspecified. In any case, C is about the worst example to use when
> discussing type systems. For starters, it is totally unsound - which is
> what your example exploits.

Yes.  Unfortunately it's often necessary to break static type systems.

Regarding C, the problem is: what should we discuss instead that would
be understood in all the newsgroups this discussion is cross-posted to?
From: Torben Ægidius Mogensen
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <7zzmg6amm1.fsf@app-3.diku.dk>
"Rob Thorpe" <·············@antenova.com> writes:

> Andreas Rossberg wrote:
>
> > No, variables are insignificant in this context. You can consider a
> > language without variables at all (such languages exist, and they can
> > even be Turing-complete) and still have evaluation, values, and a
> > non-trivial type system.
> 
> Hmm.  You're right, ML is no-where in my definition since it has no
> variables.

That's not true.  ML has variables in the mathematical sense of
variables -- symbols that can be associated with different values at
different times.  What it doesn't have is mutable variables (though it
can get the effect of those by having variables be immutable
references to mutable memory locations).

What Andreas was alluding to was presumably FP-style languages where
functions or relations are built by composing functions or relations
without ever naming values.

        Torben
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150904475.363283.234730@y41g2000cwy.googlegroups.com>
Torben Ægidius Mogensen wrote:
>
> That's not true.  ML has variables in the mathematical sense of
> variables -- symbols that can be associated with different values at
> different times.  What it doesn't have is mutable variables (though it
> can get the effect of those by having variables be immutable
> references to mutable memory locations).

While we're on the topic of terminology, here's a pet peeve of
mine: "immutable variable."

immutable = can't change
vary-able = can change

Clearly a contradiction in terms.

If you have a named value that cannot be updated, it makes
no sense to call it "variable" since it isn't *able* to *vary.*
Let's call it a named constant.


Marshall
From: Matthias Blume
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <m1zmg6ld3e.fsf@hana.uchicago.edu>
"Marshall" <···············@gmail.com> writes:

> Torben Ægidius Mogensen wrote:
>>
>> That's not true.  ML has variables in the mathematical sense of
>> variables -- symbols that can be associated with different values at
>> different times.  What it doesn't have is mutable variables (though it
>> can get the effect of those by having variables be immutable
>> references to mutable memory locations).
>
> While we're on the topic of terminology, here's a pet peeve of
> mine: "immutable variable."
>
> immutable = can't change
> vary-able = can change
>
> Clearly a contradiction in terms.

No, it is not a contradiction.  See the mathematical usage of the word
"variable".  Immutable variables can stand for different values at
different times, even without mutation involved, usually because they
are bound by some construct such as lambda.

> If you have a named value that cannot be updated, it makes
> no sense to call it "variable" since it isn't *able* to *vary.*

Mutation is not the only way for an expression to evaluate to
different values over time.
From: David Hopwood
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <uhfmg.445208$tc.122076@fe2.news.blueyonder.co.uk>
Marshall wrote:
> Torben Ægidius Mogensen wrote:
> 
>>That's not true.  ML has variables in the mathematical sense of
>>variables -- symbols that can be associated with different values at
>>different times.  What it doesn't have is mutable variables (though it
>>can get the effect of those by having variables be immutable
>>references to mutable memory locations).
> 
> While we're on the topic of terminology, here's a pet peeve of
> mine: "immutable variable."
> 
> immutable = can't change
> vary-able = can change
> 
> Clearly a contradiction in terms.

But one that is at least two hundred years old [*], and so unlikely to be
fixed now.

In any case, the intent of this usage (in both mathematics and programming)
is that different *instances* of a variable can be associated with different
values.

[*] introduced by Leibniz, according to <http://members.aol.com/jeff570/v.html>,
    but that was presumably in Latin. The first use of "variable" as a noun
    recorded by the OED in written English is in 1816.

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Andreas Rossberg
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7bpie$8v164$4@hades.rz.uni-saarland.de>
Marshall wrote:
> 
> While we're on the topic of terminology, here's a pet peeve of
> mine: "immutable variable."
> 
> immutable = can't change
> vary-able = can change
> 
> Clearly a contradiction in terms.
> 
> If you have a named value that cannot be updated, it makes
> no sense to call it "variable" since it isn't *able* to *vary.*
> Let's call it a named constant.

The name of a function argument is a variable. Its denotation changes 
between calls. Still it cannot be mutated. Likewise, local "constants" 
depending on an argument are not constant.

- Andreas
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7esks$1hk$1@online.de>
Marshall schrieb:
> immutable = can't change
> vary-able = can change
> 
> Clearly a contradiction in terms.

Not in mathematics.
The understanding there is that a "variable" varies - not over time, but 
according to the whim of the usage. (E.g. if a function is displayed in 
a graph, the parameter varies along the X axis. If it's used within a 
term, the parameter varies depending on how it's used. Etc.)

Similarly for computer programs.
Of course, if values are immutable, the value associated with a 
parameter name cannot vary within the same invocation - but it can still 
vary from one invocation to the next.

Regards,
Jo
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151027800.460188.116860@i40g2000cwc.googlegroups.com>
Joachim Durchholz wrote:
> Marshall schrieb:
> > immutable = can't change
> > vary-able = can change
> >
> > Clearly a contradiction in terms.
>
> Not in mathematics.

So stipulated.

However, it *is* a contradiction in English. Now, we often redefine
or narrow terms for technical fields. However, when we find
ourselves in a situation where we've inverted the English
meaning with the technical meaning, or ended up with a
contradiction, it ought to give us pause.


> The understanding there is that a "variable" varies - not over time, but
> according to the whim of the usage. (E.g. if a function is displayed in
> a graph, the parameter varies along the X axis. If it's used within a
> term, the parameter varies depending on how it's used. Etc.)
>
> Similarly for computer programs.
> Of course, if values are immutable, the value associated with a
> parameter name cannot vary within the same invocation - but it can still
> vary from one invocation to the next.

Again, stipulated. In fact I conceded this was a good point when it
was first mentioned. However, we generally refer to parameters
to functions as "parameters"-- you did it yourself above.

What we generally (in programming) call variables are locals
and globals. If the language supports an update operation
on those variables, then calling them variables makes sense.
But "variable" has become such a catch-all term that we call

  public static final int FOO = 7;

a variable, even though it can never, ever vary.

That doesn't make any sense.

If we care about precision in our terminology, we ought to
distinguish among parameters, named values that support
destructive update, named values that don't support
destructive update, and possibly other things.


Marshall
From: Andreas Rossberg
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7gcb9$90ukl$1@hades.rz.uni-saarland.de>
Marshall wrote:
> 
> What we generally (in programming) call variables are locals
> and globals. If the languages supports an update operation
> on those variables, then calling them variables makes sense.
> But "variable" has become such a catch-all term that we call
> 
>   public static final int FOO = 7;
> 
> a variable, even though it can never, ever vary.
> 
> That doesn't make any sense.

It does, because it is only a degenerate case. In general, you can have 
something like

   void f(int x)
   {
      const int foo = x+1;
      //...
   }

Now, foo is still immutable, it is a local, but it clearly also varies.

- Andreas
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151080333.467194.286050@m73g2000cwd.googlegroups.com>
Andreas Rossberg wrote:
> Marshall wrote:
> >
> > What we generally (in programming) call variables are locals
> > and globals. If the languages supports an update operation
> > on those variables, then calling them variables makes sense.
> > But "variable" has become such a catch-all term that we call
> >
> >   public static final int FOO = 7;
> >
> > a variable, even though it can never, ever vary.
> >
> > That doesn't make any sense.
>
> It does, because it is only a degenerate case. In general, you can have
> something like
>
>    void f(int x)
>    {
>       const int foo = x+1;
>       //...
>    }
>
> Now, foo is still immutable, it is a local, but it clearly also varies.

So what then is to be the definition of "variable" if it includes
things that can never vary, because they are only a degenerate
case? Shall it be simply a named value?


Marshall
From: Piet van Oostrum
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <m2mzc4drwq.fsf@ordesa.cs.uu.nl>
>>>>> "Marshall" <···············@gmail.com> (M) wrote:

>M> Torben Ægidius Mogensen wrote:
>>> 
>>> That's not true.  ML has variables in the mathematical sense of
>>> variables -- symbols that can be associated with different values at
>>> different times.  What it doesn't have is mutable variables (though it
>>> can get the effect of those by having variables be immutable
>>> references to mutable memory locations).

>M> While we're on the topic of terminology, here's a pet peeve of
>M> mine: "immutable variable."

>M> immutable = can't change
>M> vary-able = can change

>M> Clearly a contradiction in terms.

I would say that immutable = 'can't *be* changed' rather than 'can't
change'. But I am not a native English speaker.

Compare with this: the distance of the Earth to Mars is variable (it
varies), but it is also immutable (we can't change it).
-- 
Piet van Oostrum <····@cs.uu.nl>
URL: http://www.cs.uu.nl/~piet [PGP 8DAE142BE17999C4]
Private email: ····@vanoostrum.org
From: Andreas Rossberg
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7b4u9$8v164$2@hades.rz.uni-saarland.de>
Rob Thorpe wrote:
> 
> I think this should make it clear.  If I have a "xyz" in lisp I know it
> is a string.
> If I have "xyz" in an untyped language like assembler it may be
> anything, two pointers in binary, an integer, a bitfield.  There is no
> data at compile time or runtime to tell what it is, the programmer has
> to remember.

You have to distinguish between values (at the level of language 
semantics) and their low-level representation (at the implementation 
level). In a high-level language, the latter should be completely 
immaterial to the semantics, and hence not interesting for the discussion.

>>No, variables are insignificant in this context. You can consider a
>>language without variables at all (such languages exist, and they can
>>even be Turing-complete) and still have evaluation, values, and a
>>non-trivial type system.
> 
> Hmm.  You're right, ML is no-where in my definition since it has no
> variables.

Um, it has. Mind you, it has no /mutable/ variables, but that was not 
even what I was talking about.

>>>But the value itself has no type
>>
>>You mean that the type of the value is not represented at runtime? True,
>>but that's simply because the type system is static. It's not the same
>>as saying it has no type.
> 
> Well, is it even represented at compile time?
> The compiler doesn't know in general what values will exist at runtime,
> it knows only what types variables have.  Sometimes it only has partial
> knowledge and sometimes the programmer deliberately overrides it.  From
> what knowledge it you could say it know what types values will have.

Again, variables are insignificant. From the structure of an expression 
the type system derives the type of the resulting value. An expression 
may contain variables, and then the type system generally must know (or 
be able to derive) their types too, but that's a separate issue. Most 
values are anonymous. Nevertheless their types are known.

> Unfortunately it's often necessary to break static type systems.

You're definitely using the wrong static language then. ;-)

- Andreas
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7b98n$ji0$1@online.de>
Andreas Rossberg schrieb:
> Rob Thorpe wrote:
>> Hmm.  You're right, ML is no-where in my definition since it has no
>> variables.
> 
> Um, it has. Mind you, it has no /mutable/ variables, but that was not 
> even what I was talking about.

Indeed. A (possibly nonexhaustive) list of program entities that (can) 
have a type would comprise mutable variables, immutable variables (i.e. 
constants and parameter names), and functions resp. their results.

Regards,
Jo
From: David Squire
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e790og$4t6$1@news.ox.ac.uk>
Andreas Rossberg wrote:
> Rob Thorpe wrote:
>>>
>>>> No, that isn't what I said.  What I said was:
>>>> "A language is latently typed if a value has a property - called it's
>>>> type - attached to it, and given it's type it can only represent values
>>>> defined by a certain class."
>>>
>>> "it [= a value] [...] can [...] represent values"?
>>
>> ???
> 
> I just quoted, in condensed form, what you said above: namely, that a 
> value represents values - which I find a strange and circular definition.
> 

But you left out the most significant part: "given it's type it can only 
represent values *defined by a certain class*" (my emphasis). In C-ish 
notation:

     unsigned int x;

means that x can only represent integer elements of the set (class) of 
values [0, MAX_INT]. Negative numbers and non-integer numbers are 
excluded, as are all sorts of other things.

You over-condensed.

DS

NB. This is not a comment on static, latent, derived or other typing, 
merely on summarization.
From: Matthias Blume
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <m1irmvhloc.fsf@hana.uchicago.edu>
David Squire <············@no.spam.from.here.au> writes:

> Andreas Rossberg wrote:
>> Rob Thorpe wrote:
>>>>
>>>>> No, that isn't what I said.  What I said was:
>>>>> "A language is latently typed if a value has a property - called it's
>>>>> type - attached to it, and given it's type it can only represent values
>>>>> defined by a certain class."
>>>>
>>>> "it [= a value] [...] can [...] represent values"?
>>>
>>> ???
>> I just quoted, in condensed form, what you said above: namely, that
>> a value represents values - which I find a strange and circular
>> definition.
>> 
>
> But you left out the most significant part: "given it's type it can
> only represent values *defined by a certain class*" (my emphasis). In
> C-ish notation:
>
>      unsigned int x;
>
> means that x can only represent elements that are integers elements of
> the set (class) of values [0, MAX_INT]. Negative numbers and
> non-integer numbers are excluded, as are all sorts of other things.

This x is not a value.  It is a name of a memory location.

> You over-condensed.

Andreas condensed correctly.
From: David Squire
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e795if$6hi$1@news.ox.ac.uk>
Matthias Blume wrote:
> David Squire <············@no.spam.from.here.au> writes:
> 
>> Andreas Rossberg wrote:
>>> Rob Thorpe wrote:
>>>>>> No, that isn't what I said.  What I said was:
>>>>>> "A language is latently typed if a value has a property - called it's
>>>>>> type - attached to it, and given it's type it can only represent values
>>>>>> defined by a certain class."
>>>>> "it [= a value] [...] can [...] represent values"?
>>>> ???
>>> I just quoted, in condensed form, what you said above: namely, that
>>> a value represents values - which I find a strange and circular
>>> definition.
>>>
>> But you left out the most significant part: "given it's type it can
>> only represent values *defined by a certain class*" (my emphasis). In
>> C-ish notation:
>>
>>      unsigned int x;
>>
>> means that x can only represent elements that are integers elements of
>> the set (class) of values [0, MAX_INT]. Negative numbers and
>> non-integer numbers are excluded, as are all sorts of other things.
> 
> This x is not a value.  It is a name of a memory location.
> 
>> You over-condensed.
> 
> Andreas condensed correctly.

I should have stayed out of this. I had not realised that it had 
degenerated to point-scoring off someone typing "value" when it is clear 
from context that he meant "variable".

Bye.

DS
From: Matthias Blume
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <m1ac87hlex.fsf@hana.uchicago.edu>
David Squire <············@no.spam.from.here.au> writes:

> Matthias Blume wrote:
>> David Squire <············@no.spam.from.here.au> writes:
>> 
>>> Andreas Rossberg wrote:
>>>> Rob Thorpe wrote:
>>>>>>> No, that isn't what I said.  What I said was:
>>>>>>> "A language is latently typed if a value has a property - called it's
>>>>>>> type - attached to it, and given it's type it can only represent values
>>>>>>> defined by a certain class."
>>>>>> "it [= a value] [...] can [...] represent values"?
>>>>> ???
>>>> I just quoted, in condensed form, what you said above: namely, that
>>>> a value represents values - which I find a strange and circular
>>>> definition.
>>>>
>>> But you left out the most significant part: "given it's type it can
>>> only represent values *defined by a certain class*" (my emphasis). In
>>> C-ish notation:
>>>
>>>      unsigned int x;
>>>
>>> means that x can only represent elements that are integers elements of
>>> the set (class) of values [0, MAX_INT]. Negative numbers and
>>> non-integer numbers are excluded, as are all sorts of other things.
>> This x is not a value.  It is a name of a memory location.
>> 
>>> You over-condensed.
>> Andreas condensed correctly.
>
> I should have stayed out of this. I had not realised that it had
> degenerated to point-scoring off someone typing "value" when it is
> clear from context that he meant "variable".

If he really had meant "variable" then he would have been completely wrong.

> Bye.

Bye.
From: Andreas Rossberg
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e791g6$8usvs$4@hades.rz.uni-saarland.de>
David Squire wrote:
> Andreas Rossberg wrote:
> 
>> Rob Thorpe wrote:
>>
>>>>
>>>>> No, that isn't what I said.  What I said was:
>>>>> "A language is latently typed if a value has a property - called it's
>>>>> type - attached to it, and given it's type it can only represent 
>>>>> values
>>>>> defined by a certain class."
>>>>
>>>>
>>>> "it [= a value] [...] can [...] represent values"?
>>>
>>>
>>> ???
>>
>> I just quoted, in condensed form, what you said above: namely, that a 
>> value represents values - which I find a strange and circular definition.
> 
> But you left out the most significant part: "given it's type it can only 
> represent values *defined by a certain class*" (my emphasis).

That qualification does not remove the circularity from the definition.

> In C-ish notation:
> 
>     unsigned int x;
> 
> means that x can only represent elements that are integers elements of 
> the set (class) of values [0, MAX_INT]. Negative numbers and non-integer 
> numbers are excluded, as are all sorts of other things.

I don't see how that example is relevant, since the above definition 
does not mention variables.

- Andreas
From: Ketil Malde
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <eglkrs0xxr.fsf@polarvier.ii.uib.no>
Andreas Rossberg <········@ps.uni-sb.de> writes:

>> "A language is latently typed if a value has a property - called it's
>> type - attached to it, and given it's type it can only represent values
>> defined by a certain class."

I thought the point was to separate the (bitwise) representation of a
value from its interpretation (which is its type).  In a static
system, the interpretation is derived from context, in a dynamic
system values must carry some form of tags specifying which
interpretation to use. 

I think this applies - conceptually, at least - also to expressions?

My impression is that dynamic typers tend to include more general
properties in their concept of types (division by zero, sqrt of
negatives, etc.).

-k
-- 
If I haven't seen further, it is by standing in the footprints of giants
From: Matthias Blume
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <m1veqxgg0g.fsf@hana.uchicago.edu>
"Rob Thorpe" <·············@antenova.com> writes:

> I don't think dynamic typing is that nebulous.  I remember this being
> discussed elsewhere some time ago, I'll post the same reply I did then
> ..
>
>
> A language is statically typed if a variable has a property - called
> it's type - attached to it, and given it's type it can only represent
> values defined by a certain class.

By this definition, all languages are statically typed (by making that
"certain class" the set of all values).  Moreover, this "definition",
when read the way you probably wanted it to be read, requires some
considerable stretch to accommodate existing static type systems such
as F_\omega.

Perhaps better: A language is statically typed if its definition
includes (or even better: is based on) a static type system, i.e., a
static semantics with typing judgments derivable by typing rules.
Usually typing judgments associate program phrases ("expressions") with
types given a typing environment.

> A language is latently typed if a value has a property - called it's
> type - attached to it, and given it's type it can only represent values
> defined by a certain class.

This "definition" makes little sense.  Any given value can obviously
only represent one value: itself.  "Dynamic types" are nothing more
than sets of values, often given by computable predicates.

> Untyped and type-free mean something else: they mean no type checking
> is done.

Look up "untyped lambda calculus".
From: Pascal Costanza
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <4fo91pF1jr7gpU1@individual.net>
Matthias Blume wrote:
> "Rob Thorpe" <·············@antenova.com> writes:
> 
>> I don't think dynamic typing is that nebulous.  I remember this being
>> discussed elsewhere some time ago, I'll post the same reply I did then
>> ..
>>
>>
>> A language is statically typed if a variable has a property - called
>> it's type - attached to it, and given it's type it can only represent
>> values defined by a certain class.
> 
> By this definition, all languages are statically typed (by making that
> "certain class" the set of all values).  Moreover, this "definition",
> when read the way you probably wanted it to be read, requires some
> considerable stretch to accommodate existing static type systems such
> as F_\omega.
> 
> Perhaps better: A language is statically typed if its definition
> includes (or ever better: is based on) a static type system, i.e., a
> static semantics with typing judgments derivable by typing rules.
> Usually typing judgmets associate program phrases ("expressions") with
> types given a typing environment.

How does your definition exclude the trivial type system in which the 
only typing judgment states that every expression is acceptable?


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: Chris Smith
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <MPG.1f00a629ba0b762d9896c9@news.altopia.net>
Pascal Costanza <··@p-cos.net> wrote:
> How does your definition exclude the trivial type system in which the 
> only typing judgment states that every expression is acceptable?

It is not necessary to exclude that trivial type system.  Since it is 
useless, no one will implement it.  However, if pressed, I suppose one 
would have to admit that that definition includes a type system that is 
just useless.

I do, though, prefer Pierce's definition:

    A type system is a tractable syntactic method for proving the
    absence of certain program behaviors by classifying phrases
    according to the kinds of values they compute.

(Benjamin Pierce, Types and Programming Languages, MIT Press, pg. 1)

Key words include:

 - tractable: it's not sufficient to just evaluate the program

 - syntactic: types are tied to the kinds of expressions in the language

 - certain program behaviors: while perhaps confusing out of context,
   there is nowhere in the book a specification of which program
   behaviors may be prevented by type systems and which may not.  In
   context, the word "certain" there is meant to make it clear that type
   systems should be able to specifically identify which behaviors they
   prevent, and not that there is some universal set.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Matthias Blume
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <m1r71lgde8.fsf@hana.uchicago.edu>
Pascal Costanza <··@p-cos.net> writes:

> Matthias Blume wrote:
>> "Rob Thorpe" <·············@antenova.com> writes:
>> 
>>> I don't think dynamic typing is that nebulous.  I remember this being
>>> discussed elsewhere some time ago, I'll post the same reply I did then
>>> ..
>>>
>>>
>>> A language is statically typed if a variable has a property - called
>>> it's type - attached to it, and given it's type it can only represent
>>> values defined by a certain class.
>> By this definition, all languages are statically typed (by making
>> that
>> "certain class" the set of all values).  Moreover, this "definition",
>> when read the way you probably wanted it to be read, requires some
>> considerable stretch to accommodate existing static type systems such
>> as F_\omega.
>> Perhaps better: A language is statically typed if its definition
>> includes (or ever better: is based on) a static type system, i.e., a
>> static semantics with typing judgments derivable by typing rules.
>> Usually typing judgmets associate program phrases ("expressions") with
>> types given a typing environment.
>
> How does your definition exclude the trivial type system in which the
> only typing judgment states that every expression is acceptable?

It does not.
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e79s2q$f9t$1@online.de>
Matthias Blume schrieb:
> Perhaps better: A language is statically typed if its definition
> includes (or ever better: is based on) a static type system, i.e., a
> static semantics with typing judgments derivable by typing rules.
> Usually typing judgmets associate program phrases ("expressions") with
> types given a typing environment.

This is defining a single term ("statically typed") using three 
undefined terms ("typing judgements", "typing rules", "typing environment").

Regards,
Jo
From: Matthias Blume
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <m11wtjh33u.fsf@hana.uchicago.edu>
Joachim Durchholz <··@durchholz.org> writes:

> Matthias Blume schrieb:
>> Perhaps better: A language is statically typed if its definition
>> includes (or ever better: is based on) a static type system, i.e., a
>> static semantics with typing judgments derivable by typing rules.
>> Usually typing judgmets associate program phrases ("expressions") with
>> types given a typing environment.
>
> This is defining a single term ("statically typed") using three
> undefined terms ("typing judgements", "typing rules", "typing
> environment").

This was not meant to be a rigorous definition.  Also, I'm not going
to repeat the textbook definitions for those three standard terms
here.  Next thing you are going to ask me to define the meaning of the
word "is"...
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7b9fh$k1t$1@online.de>
Matthias Blume schrieb:
> Joachim Durchholz <··@durchholz.org> writes:
> 
>> Matthias Blume schrieb:
>>> Perhaps better: A language is statically typed if its definition
>>> includes (or ever better: is based on) a static type system, i.e., a
>>> static semantics with typing judgments derivable by typing rules.
>>> Usually typing judgmets associate program phrases ("expressions") with
>>> types given a typing environment.
>> This is defining a single term ("statically typed") using three
>> undefined terms ("typing judgements", "typing rules", "typing
>> environment").
> 
> This was not meant to be a rigorous definition.

Rigorous or not, introducing additional undefined terms doesn't help 
with explaining a term.

> Also, I'm not going to repeat the textbook definitions for those
> three standard terms here.

These terms certainly aren't standard for Perl, Python, Java, or Lisp, 
and they aren't even standard for topics covered on comp.lang.functional 
(which includes dynamically-typed languages after all).

Regards,
Jo
From: Matthias Blume
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <m2psh2hbbh.fsf@hanabi.local>
Joachim Durchholz <··@durchholz.org> writes:

> Matthias Blume schrieb:
>> Joachim Durchholz <··@durchholz.org> writes:
>> 
>>> Matthias Blume schrieb:
>>>> Perhaps better: A language is statically typed if its definition
>>>> includes (or ever better: is based on) a static type system, i.e., a
>>>> static semantics with typing judgments derivable by typing rules.
>>>> Usually typing judgmets associate program phrases ("expressions") with
>>>> types given a typing environment.
>>> This is defining a single term ("statically typed") using three
>>> undefined terms ("typing judgements", "typing rules", "typing
>>> environment").
>> This was not meant to be a rigorous definition.
>
> Rigorous or not, introducing additional undefined terms doesn't help
> with explaining a term.

I think you missed my point.  My point was that a language is
statically typed IF IT IS DEFINED THAT WAY, i.e., if it has a static
type system that is PART OF THE LANGUAGE DEFINITION.  The details are
up to each individual definition.

>> Also, I'm not going to repeat the textbook definitions for those
>> three standard terms here.
>
> These terms certainly aren't standard for Perl, Python, Java, or Lisp,

Indeed.  That's because these languages are not statically typed.
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7bkcs$7no$2@online.de>
Matthias Blume schrieb:
> Joachim Durchholz <··@durchholz.org> writes:
> 
>> Matthias Blume schrieb:
>>> Joachim Durchholz <··@durchholz.org> writes:
>>>
>>>> Matthias Blume schrieb:
>>>>> Perhaps better: A language is statically typed if its definition
>>>>> includes (or even better: is based on) a static type system, i.e., a
>>>>> static semantics with typing judgments derivable by typing rules.
>>>>> Usually typing judgments associate program phrases ("expressions") with
>>>>> types given a typing environment.
>>>> This is defining a single term ("statically typed") using three
>>>> undefined terms ("typing judgements", "typing rules", "typing
>>>> environment").
>>> This was not meant to be a rigorous definition.
>> Rigorous or not, introducing additional undefined terms doesn't help
>> with explaining a term.
> 
> I think you missed my point.  My point was that a language is
> statically typed IF IT IS DEFINED THAT WAY, i.e., if it has a static
> type system that is PART OF THE LANGUAGE DEFINITION.  The details are
> up to each individual definition.

Well, that certainly makes more sense to me.

Regards,
Jo
From: Rob Thorpe
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150907242.960017.79600@r2g2000cwb.googlegroups.com>
Rob Thorpe wrote:
> Chris Smith wrote:
> > Torben Ægidius Mogensen <·······@app-3.diku.dk> wrote:
> > > That's not really the difference between static and dynamic typing.
> > > Static typing means that there exists a typing at compile-time that
> > > guarantees against run-time type violations.  Dynamic typing means
> > > that such violations are detected at run-time.  This is orthogonal to
> > > strong versus weak typing, which is about whether such violations are
> > > detected at all.  The archetypal weakly typed language is machine code
> > > -- you can happily load a floating point value from memory, add it to
> > > a string pointer and jump to the resulting value.  ML and Scheme are
> > > both strongly typed, but one is statically typed and the other
> > > dynamically typed.
> >
> > Knowing that it'll cause a lot of strenuous objection, I'll nevertheless
> > interject my plea not to abuse the word "type" with a phrase like
> > "dynamically typed".  If anyone considers "untyped" to be pejorative,
> > as some people apparently do, then I'll note that another common term is
> > "type-free," which is marketing-approved but doesn't carry the
> > misleading connotations of "dynamically typed."  We are quickly losing
> > any rational meaning whatsoever to the word "type," and that's quite a
> > shame.
>
> I don't think dynamic typing is that nebulous.  I remember this being
> discussed elsewhere some time ago, I'll post the same reply I did then
> ..
> A language is statically typed if a variable has a property - called
> its type - attached to it, and given its type it can only represent
> values defined by a certain class.
>
> A language is latently typed if a value has a property - called its
> type - attached to it, and given its type it can only represent values
> defined by a certain class.
>
> Some people use dynamic typing as a word for latent typing, others use
> it to mean something slightly different.  But for most purposes the
> definition above works for dynamic typing also.
>
> Untyped and type-free mean something else: they mean no type checking
> is done.

Since people have found some holes in this definition I'll have another
go:-

Firstly, a definition: general expressions (gexprs) are variables
(mutable or immutable), expressions, and the entities functions return.

A statically typed language has a parameter associated with each gexpr,
called its type.  The code may test the type of a gexpr.  The language
will check if the gexprs of an operator/function have types that match
what is required, to some criterion of sufficiency.  It will emit an
error/warning when they don't.  It will do so universally.

A latently typed language has a parameter associated with each value,
called its type.  The code may test the type of a value.  The language
may check if the gexprs of an operator/function have types that match
what is required, to some criterion of sufficiency.  It will not
necessarily do so universally.

An untyped language is one that does not possess either a static or
latent type system.  In an untyped language gexprs possess no type
information, and neither do values.
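
This latent-typing case can be sketched in Python (an illustrative
snippet only, not part of the definitions above): the type is a property
of the value, and the variable carries none of its own.

```python
# Latent typing: the value carries the tag; the variable x does not.
x = 42
int_tag = type(x).__name__    # the code may test the type of a value
x = "now a string"            # same variable, a differently-tagged value
str_tag = type(x).__name__
# A statically typed language would instead attach a type to x itself
# and reject the second assignment before the program runs.
```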

--

These definitions still have problems; they don't say anything about
languages that sit between the various categories, for example.  I don't
know where an HM type system would fall under them.
From: Joe Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150746989.489702.26290@i40g2000cwc.googlegroups.com>
Chris Smith wrote:
>
> Knowing that it'll cause a lot of strenuous objection, I'll nevertheless
> interject my plea not to abuse the word "type" with a phrase like
> "dynamically typed".

Allow me to strenuously object.  The static typing community has its
own set of terminology and that's fine.  However, we Lisp hackers are
not used to this terminology.  It confuses us.  *We* know what we mean
by `dynamically typed', and we suspect *you* do, too.


> This cleaner terminology eliminates a lot of confusion.

Hah!  Look at the archives.

> This isn't just a matter of preference in terminology.

No?

>  If types DON'T mean a compile-time method for proving the
> absence of certain program behaviors, then they don't mean anything at
> all.

Nonsense.
From: Chris Smith
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <MPG.1f00b5c17e12e5af9896ca@news.altopia.net>
Joe Marshall <··········@gmail.com> wrote:
> 
> Chris Smith wrote:
> >
> > Knowing that it'll cause a lot of strenuous objection, I'll nevertheless
> > interject my plea not to abuse the word "type" with a phrase like
> > "dynamically typed".
> 
> Allow me to strenuously object.  The static typing community has its
> own set of
> terminology and that's fine.  However, we Lisp hackers are not used to
> this terminology.
> It confuses us.  *We* know what we mean by `dynamically typed', and we
> suspect *you* do, too.

I know what you mean by types in LISP.  The phrase "dynamically typed," 
though, was explicitly introduced as a counterpart to "statically 
typed" in order to imply (falsely) that the word "typed" has related 
meanings in those two cases.  Nevertheless, I did not really object, 
since it's long since passed into common usage, until Torben attempted 
to give what I believe are rather meaningless definitions to those 
words, in terms of some mythical thing called "type violations" that he 
seems to believe exist apart from any specific type systems.  
(Otherwise, how could you define kinds of type systems in terms of when 
they catch type violations?)

> > This cleaner terminology eliminates a lot of confusion.
> 
> Hah!  Look at the archives.

I'm not sure what you mean here.  You would like me to look at the 
archives of which of the five groups that are part of this conversation?  
In any case, the confusion I'm referring to pertains to comparison of 
languages, and it's already been demonstrated once in the half-dozen or 
so responses to this branch of this thread.

> >  If types DON'T mean a compile-time method for proving the
> > absence of certain program behaviors, then they don't mean anything at
> > all.
> 
> Nonsense.

Please accept my apologies for not making the context clear.  I tried to 
clarify, in my response to Pascal, that I don't mean that the word 
"type" can't have any possible meaning except for the one from 
programming language type theory.  I should modify my statement as 
follows:

    An attempt to generalize the definition of "type" from programming
    language type theory to eliminate the requirement that they are
    syntactic in nature yields something meaningless.  Any concept of
    "type" that is not syntactic is a completely different thing from
    static types.

Basically, I start objecting when someone starts comparing "statically 
typed" and "dynamically typed" as if they were both varieties of some 
general concept called "typed".  They aren't.  Furthermore, these two 
phrases were invented under the misconception that they are.  If you 
mean something else by types, such as the idea that a value has a tag 
indicating its range of possible values, then I tend to think it would 
be less confusing to just say "type" and then clarify the meaning if it 
comes into doubt, rather than adopting language that implies that those 
types are somehow related to types from type theory.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150749670.053433.242210@p79g2000cwp.googlegroups.com>
Chris Smith wrote:
>
> Basically, I start objecting when someone starts comparing "statically
> typed" and "dynamically typed" as if they were both varieties of some
> general concept called "typed".  They aren't.  Furthermore, these two
> phrases were invented under the misconception that they are.  If you
> mean something else by types, such as the idea that a value has a tag
> indicating its range of possible values, then I tend to think it would
> be less confusing to just say "type" and then clarify the meaning if it
> comes into doubt, rather than adopting language that implies that those
> types are somehow related to types from type theory.

While I am quite sympathetic to this point, I have to say that
this horse left the barn quite some time ago.


Marshall

PS. Hi Chris!
From: Chris Smith
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <MPG.1f00bd6458e102849896cb@news.altopia.net>
Marshall <···············@gmail.com> wrote:
> While I am quite sympathetic to this point, I have to say that
> this horse left the barn quite some time ago.

I don't think so.  Perhaps it's futile to go scouring the world for uses 
of the phrase "dynamic type" and eliminating them.   It's not useless to 
point out when the term is used in a particularly confusing way, though, 
as when it's implied that there is some class of "type errors" that is 
strictly a subset of the class of "errors".  Terminology is often 
confused for historical reasons, but incorrect statements eventually get 
corrected.

> PS. Hi Chris!

Hi!  Where are you posting from these days?

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150754918.338770.89870@i40g2000cwc.googlegroups.com>
Chris Smith wrote:
> Marshall <···············@gmail.com> wrote:
> > While I am quite sympathetic to this point, I have to say that
> > this horse left the barn quite some time ago.
>
> I don't think so.  Perhaps it's futile to go scouring the world for uses
> of the phrase "dynamic type" and eliminating them.   It's not useless to
> point out when the term is used in a particularly confusing way, though,
> as when it's implied that there is some class of "type errors" that is
> strictly a subset of the class of "errors".  Terminology is often
> confused for historical reasons, but incorrect statements eventually get
> corrected.

That's fair.

One thing that is frustrating to me is that I really want to build
an understanding of what dynamic typing is and what its
advantages are, but it's difficult to have a productive discussion
on the static vs. dynamic topic. Late binding of every function
invocation: how does that work, what are the implications of that,
what does that buy you?

I have come to believe that the two actually represent
very different ways of thinking about programming. Which
goes a long way to explaining why there are such difficulties
communicating. Each side tries to describe to the other how
the tools and systems they use facilitate doing tasks that
don't exist in the other side's mental model.


> > PS. Hi Chris!
>
> Hi!  Where are you posting from these days?

I'm mostly on comp.databases.theory, but I also lurk
on comp.lang.functional, which is an awesome group, and
if you're reading Pierce then you might like it too.


Marshall
From: Chris F Clark
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <sddirmwd90v.fsf@shell01.TheWorld.com>
Chris Smith wrote:
> Marshall <···············@gmail.com> wrote:
> > While I am quite sympathetic to this point, I have to say that
> > this horse left the barn quite some time ago.
>
> I don't think so.  Perhaps it's futile to go scouring the world for uses
> of the phrase "dynamic type" and eliminating them.   It's not useless to
> point out when the term is used in a particularly confusing way, though,
> as when it's implied that there is some class of "type errors" that is
> strictly a subset of the class of "errors".  Terminology is often
> confused for historical reasons, but incorrect statements eventually get
> corrected.

Ok, so you (Chris Smith) object to the term "dynamic type".  From a
historical perspective it makes sense, particularly in the sense of
tags.  A dynamic type system involves associating tags with values and
expresses variant operations in terms of those tags, with certain
disallowed combinations checked (dynamically) at run-time.  A static
type system eliminates some set of tags on values by syntactic
analysis of annotations (types) written with or as part of the program
and detects some of the disallowed computations (statically) at compile
time.

Type errors are the errors caught by one's personal sense of which
annotations are expressible, computable, and useful.  Thus, each
person has a distinct sense of which errors can (and/or should) be
statically caught.

A personal example, I read news using Emacs, which as most readers in
these newsgroups will know is a dialect of lisp that includes
primitives to edit files.  Most of emacs is quite robust, perhaps due
to it being lisp.  However, some commands (reading news being one of
them) are exceptionally fragile and fail in the most undebuggable
ways, often just complaining about "nil being an invalid argument" (or
"stack overflow in regular expression matcher".)

This is particularly true, when I am composing lisp code.  If I write
some lisp code and make a typographical error, the code may appear to
run on some sample case and fail due to a path not explored in my
sample case when applied to real data.

I consider such an error to be a "type error" because I believe if I
used a language that required more type annotations, the compiler
would have caught my typographical error (and no, I'm not making a pun
on type and typographical).  Because I do a lot of this--it is easy
enough for me to conjure up a small lisp macro to make some edit that
it is a primary tool in my toolkit--I wish I could make doing so
more robust.  It would not bother me to have to type more to do so.  I
simply make too many stupid errors to use emacs lisp as effectively as
I would like.

Now, this has nothing to do with real lisp programming, and so I do
not wish to offend those who do that.  However, I personally would
like a statically typed way of writing macros (and more importantly of
annotating some of the existing emacs code) to remove some of the
fragilities that I experience.  I'm not taking advantage of the
exploratory nature of lisp, except in the sense that the exploratory
nature has created lots of packages which mostly work most of the
time.

Now, I will return this group to its much more erudite discussion of
the issue.

Thank you,
-Chris

*****************************************************************************
Chris Clark                    Internet   :  ·······@world.std.com
Compiler Resources, Inc.       Web Site   :  http://world.std.com/~compres  
23 Bailey Rd                   voice      :  (508) 435-5016
Berlin, MA  01503  USA         fax        :  (978) 838-0263  (24 hours)
------------------------------------------------------------------------------
From: Andreas Rossberg
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e78f6m$8usvs$1@hades.rz.uni-saarland.de>
Chris F Clark wrote:
> 
> A static
> type system eliminates some set of tags on values by syntactic
> analysis of annotations (types) written with or as part of the program
> and detects some of the disallowed computations (statically) at compile
> time.

Explicit annotations are not a necessary ingredient of a type system, 
nor is "eliminating tags" very relevant to its function.

   - Andreas
From: Chris F Clark
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <sddu06f3in6.fsf@shell01.TheWorld.com>
Chris F Clark wrote:
> A static
> type system eliminates some set of tags on values by syntactic
> analysis of annotations (types) written with or as part of the program
> and detects some of the disallowed computations (statically) at compile
> time.

Andreas replied:
> Explicit annotations are not a necessary ingredient of a type system,
> nor is "eliminating tags" very relevant to its function.

While this is true, I disagree at some level with the second part.  By
eliminating tags, I mean allowing one to perform "type safe"
computations without requiring the values to be tagged.  One could
argue that the tags were never there.  However, many of the
interesting polymorphic computations require either that the values
be tagged or that some other process assures that at each point one
can determine a priori what the correct variant of computation is which
applies.

To me a static type system is one which does this a priori
determination.  A dynamic type system does not do it a priori and
instead includes explicit information in the values being computed to
select the correct variant computations.

In that sense, a static type system is eliminating tags, because the
information is pre-computed and not explicitly stored as a part of the
computation.  Now, you may not view the tag as being there, but in my
mind if there exists a way of performing the computation that requires
tags, the tag was there and that tag has been eliminated.
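
This contrast can be sketched in Python (all names below are invented
for illustration): in the dynamic case the tag is stored in the value
and drives the choice of variant at run time, which is exactly the
information a static system could pre-compute and discard.

```python
# Dynamic case: the tag travels with each value and is inspected at run time.
def area(shape):
    tag, data = shape                 # the tag occupies bits in the datum
    if tag == "circle":
        return 3.14159 * data * data
    if tag == "rect":
        w, h = data
        return w * h
    raise TypeError(f"no variant of area for tag {tag!r}")

circle_area = area(("circle", 2.0))    # dispatch decided by the stored tag
rect_area = area(("rect", (3.0, 4.0)))
# A static type system makes this determination a priori, so the tag can
# be eliminated from the runtime representation.
```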

To put it another way, I consider the tags to be axiomatic.  Most
computations involve some decision logic that is driven by distinct
values that have previously been computed.  The separation of the
values which drive the computation one way versus another is a tag.
That tag can potentially be eliminated by some a priori computation.

In what I do, it is very valuable to move information from being
explicitly represented in the computed result into the tag, so that I
often have distinct "types" (i.e. tags) for an empty list, a list with
one element, a list with two elements, and a longer list.  In that
sense, I agree with Chris Smith's assertion that "static typing" is
about asserting general properties of the algorithm/data.  These
assertions are important to the way I am manipulating the data.  They
are part of my type model, but they may not be part of anyone else's,
and to me, to pass an empty list to a function requiring a list with
two elements is a "type error", because it is something I expect the
type system to detect as incorrect.  The fact that no one else may
have that expectation does not seem relevant to me.

Now, to carry this farther, since I expect the type system to validate
that certain values are of certain types and only be used in certain
contexts, I am happy when it does not require certain "tags" to be
actually present in the data.  However, because other bits of code are
polymorphic, I do expect certain values to require tags.  In the end,
this is still a win for me.  I had certain data elements that in the
abstract had to be represented explicitly.  I have encoded that
information into the type system and in some cases the type system is
not using any bits in the computed representation to hold that
information.  Whenever that happens, I win and that solves one of the
problems that I need solved.

Thus, a type system is a way for me to express certain axioms about my
algorithm.  A dynamic type system encodes those facts as part of the
computation.  A static type system pre-calculates certain "theorems"
from my axioms and uses those theorems to allow my algorithm to be
computed without all the facts being stored as part of the computation.

Hopefully, this makes my point of view clear.

-Chris

From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e79so3$ga2$1@online.de>
Chris F Clark schrieb:
> In that sense, a static type system is eliminating tags, because the
> information is pre-computed and not explicitly stored as a part of the
> computation.  Now, you may not view the tag as being there, but in my
> mind if there exists a way of performing the computation that requires
> tags, the tag was there and that tag has been eliminated.

On a semantic level, the tag is always there - it's the type (and 
definitely part of an axiomatic definition of the language).
Tag elimination is "just" an optimization.

> To put it another way, I consider the tags to be axiomatic.  Most
> computations involve some decision logic that is driven by distinct
> values that have previously been computed.  The separation of the
> values which drive the computation one way versus another is a tag.
> That tag can potentially be eliminated by some a priori computation.

Um... just as precomputing constants, I'd say.
Are the constants that went into a precomputed constant eliminated?
On the implementation level, yes. On the semantic/axiomatic level, no.
Or, well, maybe - since that's just an optimization, the compiler may 
have decided not to precompute the constant at all.

(Agreeing with the snipped parts.)

Regards,
Jo
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150902546.400454.219430@i40g2000cwc.googlegroups.com>
Joachim Durchholz wrote:
>
> On a semantic level, the tag is always there - it's the type (and
> definitely part of an axiomatic definition of the language).
> Tag elimination is "just" an optimization.

I see what you're saying, but the distinction is a bit fine for me.
If the language has no possible mechanism to observe the
it-was-only-eliminated-as-an-optimization tag, in what sense
is it "always there"?  E.g. in the 'C' programming language.


Marshall
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7m00h$tq3$1@online.de>
Chris F Clark schrieb:
> Chris F Clark schrieb:
>> In that sense, a static type system is eliminating tags, because the
>> information is pre-computed and not explicitly stored as a part of the
>> computation.  Now, you may not view the tag as being there, but in my
>> mind if there exists a way of performing the computation that requires
>> tags, the tag was there and that tag has been eliminated.
> 
> Joachim Durchholz replied:
>> On a semantic level, the tag is always there - it's the type (and
>> definitely part of an axiomatic definition of the language).
>> Tag elimination is "just" an optimization.
> 
> I agree the tag is always there in the abstract.  

Well - in the context of a discussion of dynamic and static typing, I'd 
think that the semantic ("abstract") level is usually the best level of 
discourse.
Of course, this being the Usenet, shifting the level is common (and can 
even be helpful).

> In the end, I'm trying to fit things which are "too big" and "too
> slow" into much less space and time, using types to help me reliably
> make my program smaller and faster is just one trick.  [...]
> 
> However, I also know that my way of thinking about it is fringe.

Oh, I really agree that's an important application of static typing.

Essentially, which aspect of static typing is more important depends on 
where your problems lie: if it's resource constraints, static typing 
tends to be seen as a tool to keep resource usage down; if it's bug 
counts, static typing tends to be seen as a tool to express static 
properties.
These aspects are obviously not equally important to everybody.

Regards,
Jo
From: Joe Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150761004.928251.318340@h76g2000cwa.googlegroups.com>
Chris Smith wrote:
> Joe Marshall <··········@gmail.com> wrote:
> >
> > Chris Smith wrote:
> > >
> > > Knowing that it'll cause a lot of strenuous objection, I'll nevertheless
> > > interject my plea not to abuse the word "type" with a phrase like
> > > "dynamically typed".
> >
> > Allow me to strenuously object.  The static typing community has its
> > own set of
> > terminology and that's fine.  However, we Lisp hackers are not used to
> > this terminology.
> > It confuses us.  *We* know what we mean by `dynamically typed', and we
> > suspect *you* do, too.
>
> I know what you mean by types in LISP.  The phrase "dynamically typed,"
> though, was explicitly introduced as a counterpart to "statically
> typed" in order to imply (falsely) that the word "typed" has related
> meanings in those two cases.

They *do* have a related meaning.  Consider this code fragment:
(car "a string")

Obviously this code is `wrong' in some way.  In static typing terms, we
could say that we have a type error because the primitive procedure CAR
doesn't operate on strings.  Alternatively, we could say that since Lisp
has one universal type (in static type terms) the code is correctly
statically typed - that is, CAR is defined on all input, but its
definition is to raise a runtime exception when passed a string.

But regardless of which way you want to look at it, CAR is *not* going
to perform its usual computation and it is *not* going to return a
value.  The reason behind this is that you cannot take the CAR of a
string.  A string is *not* a valid argument to CAR.  Ask anyone why and
they will tell you `It's the wrong type.'

Both `static typing' and `dynamic typing' (in the colloquial sense) are
strategies to detect this sort of error.
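
A rough Python analogue of the second reading (an illustrative sketch,
not anyone's canonical definition): car is defined on all input, and its
definition for a non-list argument is to raise at run time.

```python
# car is total: for a non-list argument it "computes" a runtime type error.
def car(x):
    if not isinstance(x, list) or not x:
        raise TypeError("car: wrong type")
    return x[0]

first = car([1, 2, 3])     # the usual computation
try:
    car("a string")        # not a valid argument to car
    caught = False
except TypeError:
    caught = True          # the error is detected, just not before running
```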

> Nevertheless, I did not really object,
> since it's long since passed into common usage,

Exactly.  And you are far more likely to encounter this sort of usage
outside of a type theorist's convention.


> until Torben attempted
> to give what I believe are rather meaningless definitions to those
> words, in terms of some mythical thing called "type violations" that he
> seems to believe exist apart from any specific type systems.

It's hardly mythical.  (car "a string") is obviously an error and you
don't need a static type system to know that.

> > > This cleaner terminology eliminates a lot of confusion.
> >
> > Hah!  Look at the archives.
>
> I'm not sure what you mean here.  You would like me to look at the
> archives of which of the five groups that are part of this conversation?
> In any case, the confusion I'm referring to pertains to comparison of
> languages, and it's already been demonstrated once in the half-dozen or
> so responses to this branch of this thread.

I mean that this has been argued time and time again in comp.lang.lisp
and probably the other groups as well.  You may not like the fact that
we say that Lisp is dynamically typed, but *we* are not confused by
this usage.  In fact, we become rather confused when you say `a
correctly typed program cannot go wrong at runtime' because we've seen
plenty of runtime errors from code that is `correctly typed'.

> > >  If types DON'T mean a compile-time method for proving the
> > > absence of certain program behaviors, then they don't mean anything at
> > > all.
> >
> > Nonsense.
>
> Please accept my apologies for not making the context clear.  I tried to
> clarify, in my response to Pascal, that I don't mean that the word
> "type" can't have any possible meaning except for the one from
> programming language type theory.  I should modify my statement as
> follows:
>
>     An attempt to generalize the definition of "type" from programming
>     language type theory to eliminate the requirement that they are
>     syntactic in nature yields something meaningless.  Any concept of
>     "type" that is not syntactic is a completely different thing from
>     static types.

Agreed.  That is why there is the qualifier `dynamic'.  This indicates
that it is a completely different thing from static types.

> Basically, I start objecting when someone starts comparing "statically
> typed" and "dynamically typed" as if they were both varieties of some
> general concept called "typed".  They aren't.

I disagree.  There is clearly a concept that there are different
varieties of data and they are not interchangeable.  In some languages,
it is up to the programmer to ensure that mistakes in data usage do not
happen.  In other languages, the computer can detect such mistakes and
prevent them.  If this detection is performed by syntactic analysis
prior to running the program, it is static typing.  Some languages like
Lisp defer the detection until the program is run.  Call it what you
want, but here in comp.lang.lisp we tend to call it `dynamic typing'.

>  Furthermore, these two
> phrases were invented under the misconception that they are.  If you
> mean something else by types, such as the idea that a value has a tag
> indicating its range of possible values, then I tend to think it would
> be less confusing to just say "type" and then clarify the meaning if it
> comes into doubt, rather than adopting language that implies that those
> types are somehow related to types from type theory.

You may think it would be less confusing, but if you look at the
archives of comp.lang.lisp you would see that it is not.

We're all rubes here, so don't try to educate us with your
high-falutin' technical terms.
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150765102.867144.12590@h76g2000cwa.googlegroups.com>
Joe Marshall wrote:
>
> They *do* have a related meaning.  Consider this code fragment:
> (car "a string")
> [...]
> Both `static typing' and `dynamic typing' (in the colloquial sense) are
> strategies to detect this sort of error.

The thing is, though, that putting it that way makes it seem as
if the two approaches are doing the exact same thing, but
just at different times: runtime vs. compile time. But they're
not the same thing. Passing the static check at compile
time is universally quantifying the absence of the class
of error; passing the dynamic check at runtime is existentially
quantifying the absence of the error. A further difference is
the fact that in the dynamically typed language, the error is
found during the evaluation of the expression; in a statically
typed language, errors are found without attempting to evaluate
the expression.
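
The quantifier difference can be made concrete with a small illustrative
Python sketch (the function is invented for the example): a dynamic
check vouches only for the evaluations actually performed, so a run can
succeed while the error survives on an untaken path.

```python
def describe(flag):
    if flag:
        return "ok"
    return len(42)        # a type error, but only on the untaken branch

# Existential: this particular evaluation never reaches the bad expression.
result = describe(True)

# The error is still there, found only when that branch is evaluated.
try:
    describe(False)
    found = False
except TypeError:
    found = True
# A static checker would reject the whole function without evaluating it,
# ruling out the error on every path at once.
```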

I find everything about the differences between static and
dynamic to be frustratingly complex and subtle.

(To be clear, I do know that Joe understands these issues
quite well.)

So I kind of agree with Chris, insofar as I think the terminology
plays a role in obscuring rather than illuminating the differences.

On the other hand I agree with Joe in that:

> I mean that this has been argued time and time again in comp.lang.lisp
> and probably the other groups as well.  You may not like the fact that
> we say that Lisp is dynamically typed, but *we* are not confused by
> this usage.  In fact, we become rather confused when you say `a
> correctly typed program cannot go wrong at runtime' because we've seen
> plenty of runtime errors from code that is `correctly typed'.

Yes; as I said earlier, the horse has left the barn on this one.

The conversation I would *really* like to have is the one where we
discuss what all the differences are, functionally, between the two,
and what the implications of those differences are, without trying
to address which approach is "right" or "better", because those are
dependent on the problem domain anyway, and because I can
make up my own mind just fine about which one I prefer.

The comp.lang.functional and comp.lang.lisp people are probably
two of the smartest groups on usenet. (I do not consider myself
a member of either group.) You wouldn't *think* that conversation
would be *so* hard to have.


Marshall
From: Ron Garret
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <rNOSPAMon-F06215.21301119062006@news.gha.chartermi.net>
In article <·······················@h76g2000cwa.googlegroups.com>,
 "Marshall" <···············@gmail.com> wrote:

> The conversation I would *really* like to have is the one where we
> discuss what all the differences are, functionally, between the two,
> and what the implications of those differences are, without trying
> to address which approach is "right" or "better", because those are
> dependent on the problem domain anyway, and because I can
> make up my own mind just fine about which one I prefer.
> 
> The comp.lang.functional and comp.lang.lisp people are probably
> two of the smartest groups on usenet. (I do not consider myself
> a member of either group.) You wouldn't *think* that conversation
> would be *so* hard to have.

It's hard to have because there's very little to say, which leaves the 
academics without enough to do to try to justify their existence.  This 
is the long and the short of it:

1.  There are mismatches between the desired behavior of code and its 
actual behavior.  Such mismatches are generally referred to as "bugs" or 
"errors" (and, occasionally, "features").

2.  Some of those mismatches can be detected before the program runs.  
Some can be detected while the program runs.  And some cannot be 
detected until after the program has finished running.

3.  The various techniques for detecting those mismatches impose varying 
degrees of burden upon the programmer and the user.
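Point 2 can be made concrete with a small Python sketch (hypothetical code, mine rather than rg's): one mismatch caught before the run, one during it, and one that survives the run entirely.

```python
import ast

# Detected before the program runs: a parse (or static type) error.
try:
    ast.parse("def f(:")
except SyntaxError:
    print("caught before running")

# Detected while the program runs: a dynamic type error.
try:
    len(42)
except TypeError:
    print("caught while running")

# Not detected until after the program has finished: it runs to
# completion, but the answer is silently wrong.
def average(xs):
    return sum(xs) / (len(xs) + 1)   # off-by-one bug, no error raised

print(average([2, 4]))  # prints 2.0, though the correct average is 3.0
```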

That's it.  Everything else, including but not limited to quibbling over 
the meaning of the word "type", is nothing but postmodernist claptrap.

IMHO of course.

rg
From: Pascal Costanza
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <4fpvbvF1keed1U1@individual.net>
Marshall wrote:

> The conversation I would *really* like to have is the one where we
> discuss what all the differences are, functionally, between the two,
> and what the implications of those differences are, without trying
> to address which approach is "right" or "better", because those are
> dependent on the problem domain anyway, and because I can
> make up my own mind just fine about which one I prefer.

My current take on this is that static typing and dynamic typing are 
incompatible, at least in their "extreme" variants.

The simplest examples I have found are this:

- In a statically typed language, you can have variables that contain 
only first-class functions at runtime that are guaranteed to have a 
specific return type. Other values are rejected, and the rejection 
happens at compile time.

In dynamically typed languages, this is impossible because you can never 
be sure about the types of return values - you cannot predict the 
future. This can at best be approximated.


- In a dynamically typed language, you can run programs successfully 
that are not acceptable by static type systems. Here is an example in 
Common Lisp:

; A class "person" with no superclasses and with the only field "name":
(defclass person ()
   (name))

; A test program:
(defun test ()
   (let ((p (make-instance 'person)))
     (eval (read))
     (slot-value p 'address)))

(slot-value p 'address) is an attempt to access the field 'address in 
the object p. In many languages, the notation for this is p.address.

Although the class definition for person doesn't mention the field 
address, the call to (eval (read)) allows the user to change the 
definition of the class person and update its existing instances. 
Therefore at runtime, the call to (slot-value p 'address) has a chance to 
succeed.

(Even without the call to (eval (read)), in Common Lisp the call to 
(slot-value p 'address) would raise an exception which gives the user a 
chance to fix things and continue from the point in the control flow 
where the exception was raised.)

I cannot imagine a static type system which has a chance to predict that 
this program can successfully run without essentially accepting all 
kinds of programs.

At least, development environments for languages like Smalltalk, Common 
Lisp, Java, etc., make use of such program updates to improve 
edit-compile-test cycles. However, it is also possible (and done in 
practice) to use such program updates to minimize downtimes when adding 
new features to deployed systems.


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: Matthias Blume
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <m1mzc7hluh.fsf@hana.uchicago.edu>
Pascal Costanza <··@p-cos.net> writes:

> - In a dynamically typed language, you can run programs successfully
>   that are not acceptable by static type systems.

This statement is false.

For every program that can run successfully to completion there exists
a static type system which accepts that program.  Moreover, there is
at least one static type system that accepts all such programs.

What you mean is that for static type systems that are restrictive
enough to be useful in practice there always exist programs which
(after type erasure in an untyped setting, i.e., by switching to a
different language) would run to completion, but which are rejected by
the static type system.

By the way, the parenthetical remark is important: If a language's
definition is based on a static type system, then there are *no*
programs in that language which are rejected by its type checker.
That's trivially so because strings that do not type-check are simply
not considered programs.


> Here is an example in Common Lisp:
>
> ; A class "person" with no superclasses and with the only field "name":
> (defclass person ()
>    (name))
>
> ; A test program:
> (defun test ()
>    (let ((p (make-instance 'person)))
>      (eval (read))
>      (slot-value p 'address)))
>
> (slot-value p 'address) is an attempt to access the field 'address in
> the object p. In many languages, the notation for this is p.address.
>
> Although the class definition for person doesn't mention the field
> address, the call to (eval (read)) allows the user to change the
> definition of the class person and update its existing
> instances. Therefore at runtime, the call to (slot-value p 'address)
> has a chance to succeed.

I am quite comfortable with the thought that this sort of evil would
get rejected by a statically typed language. :-)
From: Pascal Costanza
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <4fqonoF1kfkapU1@individual.net>
Matthias Blume wrote:
> Pascal Costanza <··@p-cos.net> writes:
> 
>> - In a dynamically typed language, you can run programs successfully
>>   that are not acceptable by static type systems.
> 
> This statement is false.

The example I have given is more important than this statement.

> For every program that can run successfully to completion there exists
> a static type system which accepts that program.  Moreover, there is
> at least one static type system that accepts all such programs.
> 
> What you mean is that for static type systems that are restrictive
> enough to be useful in practice there always exist programs which
> (after type erasure in an untyped setting, i.e., by switching to a
> different language) would run to completion, but which are rejected by
> the static type system.

No, that's not what I mean.

>> Here is an example in Common Lisp:
>>
>> ; A class "person" with no superclasses and with the only field "name":
>> (defclass person ()
>>    (name))
>>
>> ; A test program:
>> (defun test ()
>>    (let ((p (make-instance 'person)))
>>      (eval (read))
>>      (slot-value p 'address)))
>>
>> (slot-value p 'address) is an attempt to access the field 'address in
>> the object p. In many languages, the notation for this is p.address.
>>
>> Although the class definition for person doesn't mention the field
>> address, the call to (eval (read)) allows the user to change the
>> definition of the class person and update its existing
>> instances. Therefore at runtime, the call to (slot-value p 'address)
>> has a chance to succeed.
> 
> I am quite comfortable with the thought that this sort of evil would
> get rejected by a statically typed language. :-)

This sort of feature is clearly not meant for you. ;-P


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: Rob Thorpe
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150824752.799997.206210@p79g2000cwp.googlegroups.com>
Pascal Costanza wrote:
> Matthias Blume wrote:
> > Pascal Costanza <··@p-cos.net> writes:
> >> (slot-value p 'address) is an attempt to access the field 'address in
> >> the object p. In many languages, the notation for this is p.address.
> >>
> >> Although the class definition for person doesn't mention the field
> >> address, the call to (eval (read)) allows the user to change the
> >> definition of the class person and update its existing
> >> instances. Therefore at runtime, the call to (slot-value p 'address)
> >> has a chance to succeed.
> >
> > I am quite comfortable with the thought that this sort of evil would
> > get rejected by a statically typed language. :-)
>
> This sort of feature is clearly not meant for you. ;-P

To be fair, though, that kind of thing would only really be used while
debugging a program.
It's no different from adding a new member to a class while in the
debugger.

There are other places where you might add a slot to an object at
runtime, but they would be done in tidier ways.
From: Pascal Costanza
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <4fqt33F1kfd1jU1@individual.net>
Rob Thorpe wrote:
> Pascal Costanza wrote:
>> Matthias Blume wrote:
>>> Pascal Costanza <··@p-cos.net> writes:
>>>> (slot-value p 'address) is an attempt to access the field 'address in
>>>> the object p. In many languages, the notation for this is p.address.
>>>>
>>>> Although the class definition for person doesn't mention the field
>>>> address, the call to (eval (read)) allows the user to change the
>>>> definition of the class person and update its existing
>>>> instances. Therefore at runtime, the call to (slot-value p 'address)
>>>> has a chance to succeed.
>>> I am quite comfortable with the thought that this sort of evil would
>>> get rejected by a statically typed language. :-)
>> This sort of feature is clearly not meant for you. ;-P
> 
> To be fair though that kind of thing would only really be used while
> debugging a program.
> It's no different from adding a new member to a class while in the
> debugger.
> 
> There are other places where you might add a slot to an object at
> runtime, but they would be done in tidier ways.

Yes, but the question remains how a static type system can deal with 
this kind of update.


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: David Hopwood
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <2N3mg.462963$xt.331923@fe3.news.blueyonder.co.uk>
Pascal Costanza wrote:
> Rob Thorpe wrote:
>> Pascal Costanza wrote:
>>> Matthias Blume wrote:
>>>> Pascal Costanza <··@p-cos.net> writes:
>>>>
>>>>> (slot-value p 'address) is an attempt to access the field 'address in
>>>>> the object p. In many languages, the notation for this is p.address.
>>>>>
>>>>> Although the class definition for person doesn't mention the field
>>>>> address, the call to (eval (read)) allows the user to change the
>>>>> definition of the class person and update its existing
>>>>> instances. Therefore at runtime, the call to (slot-value p 'address)
>>>>> has a chance to succeed.
>>>>
>>>> I am quite comfortable with the thought that this sort of evil would
>>>> get rejected by a statically typed language. :-)
>>>
>>> This sort of feature is clearly not meant for you. ;-P
>>
>> To be fair though that kind of thing would only really be used while
>> debugging a program.
>> It's no different from adding a new member to a class while in the
>> debugger.
>>
>> There are other places where you might add a slot to an object at
>> runtime, but they would be done in tidier ways.
> 
> Yes, but the question remains how a static type system can deal with
> this kind of update.

It's not difficult in principle:

 - for each class [*], define a function which converts an 'old' value of
   that class to a 'new' value (the ability to do this is necessary anyway
   to support some kinds of upgrade). A default conversion function may be
   autogenerated if the class definition has changed only in minor ways.

 - typecheck the new program and the conversion functions, using the old
   type definitions for the argument of each conversion function, and the
   new type definitions for its result.

 - have the debugger apply the conversions to all values, and then resume
   the program.


[*] or nearest equivalent in a non-OO language.
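The conversion-function step might look like this in Python (a hypothetical sketch of the recipe; in a genuinely static setting the old and new shapes would be distinct checked types, as below):

```python
from dataclasses import dataclass

# 'Old' and 'new' versions of the class, as distinct types.
@dataclass
class PersonV1:
    name: str

@dataclass
class PersonV2:
    name: str
    address: str

# Conversion function: old value in, new value out.  A type checker can
# verify it against the old definition of its argument type and the new
# definition of its result type.
def upgrade(old: PersonV1) -> PersonV2:
    return PersonV2(name=old.name, address="<unknown>")

# The debugger (or upgrade machinery) applies the conversion to all
# live values before resuming the program.
people = [PersonV1("Alice"), PersonV1("Bob")]
people = [upgrade(p) for p in people]
print(people[0].address)  # prints <unknown>
```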

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Pascal Costanza
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <4fsbivF1jqb81U1@individual.net>
David Hopwood wrote:
> Pascal Costanza wrote:
>> Rob Thorpe wrote:
>>> Pascal Costanza wrote:
>>>> Matthias Blume wrote:
>>>>> Pascal Costanza <··@p-cos.net> writes:
>>>>>
>>>>>> (slot-value p 'address) is an attempt to access the field 'address in
>>>>>> the object p. In many languages, the notation for this is p.address.
>>>>>>
>>>>>> Although the class definition for person doesn't mention the field
>>>>>> address, the call to (eval (read)) allows the user to change the
>>>>>> definition of the class person and update its existing
>>>>>> instances. Therefore at runtime, the call to (slot-value p 'address)
>>>>>> has a chance to succeed.
>>>>> I am quite comfortable with the thought that this sort of evil would
>>>>> get rejected by a statically typed language. :-)
>>>> This sort of feature is clearly not meant for you. ;-P
>>> To be fair though that kind of thing would only really be used while
>>> debugging a program.
>>> It's no different from adding a new member to a class while in the
>>> debugger.
>>>
>>> There are other places where you might add a slot to an object at
>>> runtime, but they would be done in tidier ways.
>> Yes, but the question remains how a static type system can deal with
>> this kind of update.
> 
> It's not difficult in principle:
> 
>  - for each class [*], define a function which converts an 'old' value of
>    that class to a 'new' value (the ability to do this is necessary anyway
>    to support some kinds of upgrade). A default conversion function may be
>    autogenerated if the class definition has changed only in minor ways.

Yep, this is more or less exactly how CLOS does it. (The conversion 
function is called update-instance-for-redefined-class, and you can 
provide your own methods on it.)

>  - typecheck the new program and the conversion functions, using the old
>    type definitions for the argument of each conversion function, and the
>    new type definitions for its result.

The problem here is: The program is already executing, so this typecheck 
isn't performed at compile-time, in the strict sense of the word (i.e., 
before the program is deployed). It may still be a syntactic analysis, 
but you don't get the kind of guarantees anymore that you typically 
expect from a static type checker _before_ the program is started in the 
first place.

(It's really important to understand that the idea is to use this for 
deployed programs - albeit hopefully in a more structured fashion - and 
not only for debugging. The example I have given is an extreme one that 
you would probably not use as such in a "real-world" setting, but it 
shows that there is a boundary beyond which static type systems cannot 
be used in a meaningful way anymore, at least as far as I can tell.)

>  - have the debugger apply the conversions to all values, and then resume
>    the program.

In CLOS, this conversion is defined as part of the language proper, but 
this is mostly because Common Lisp doesn't make a sharp distinction 
between debugging capabilities and "regular" language features. (I think 
it's a good thing that there is no strong barrier against having 
debugging capabilities in a deployed program.)


> [*] or nearest equivalent in a non-OO language.


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7ba98$lbo$1@online.de>
Pascal Costanza schrieb:
> (It's really important to understand that the idea is to use this for 
> deployed programs - albeit hopefully in a more structured fashion - and 
> not only for debugging. The example I have given is an extreme one that 
> you would probably not use as such in a "real-world" setting, but it 
> shows that there is a boundary beyond which static type systems cannot 
> be used in a meaningful way anymore, at least as far as I can tell.)

As soon as the running program can be updated, the distinction between 
"static" (compile time) and "dynamic" (run time) blurs.
You can still erect a definition for such a case, but it needs to refer 
to the update process, and hence becomes language-specific. In other 
words, language-independent definitions of dynamic and static typing 
won't give any meaningful results for such languages.

I'd say it makes more sense to talk about what advantages of static vs. 
dynamic typing can be applied in such a situation.
E.g. one interesting topic would be the change in trade-offs: making 
sure that a type error cannot occur becomes much more difficult 
(particularly if the set of available types can change during an 
update), so static typing starts to lose some of its appeal; OTOH a good 
type system can give you a lot of guarantees even in such a situation, 
even if it might have to revert to the occasional run-time type check, 
so static checking still has its merits.

Regards,
Jo
From: Pascal Costanza
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <4fsu9dF1k21c9U1@individual.net>
Joachim Durchholz wrote:
> Pascal Costanza schrieb:
>> (It's really important to understand that the idea is to use this for 
>> deployed programs - albeit hopefully in a more structured fashion - 
>> and not only for debugging. The example I have given is an extreme one 
>> that you would probably not use as such in a "real-world" setting, but 
>> it shows that there is a boundary beyond which static type systems 
>> cannot be used in a meaningful way anymore, at least as far as I can 
>> tell.)
> 
> As soon as the running program can be updated, the distinction between 
> "static" (compile time) and "dynamic" (run time) blurs.
> You can still erect a definition for such a case, but it needs to refer 
> to the update process, and hence becomes language-specific. In other 
> words, language-independent definitions of dynamic and static typing 
> won't give any meaningful results for such languages.
> 
> I'd say it makes more sense to talk about what advantages of static vs. 
> dynamic typing can be applied in such a situation.
> E.g. one interesting topic would be the change in trade-offs: making 
> sure that a type error cannot occur becomes much more difficult 
> (particularly if the set of available types can change during an 
> update), so static typing starts to lose some of its appeal; OTOH a good 
> type system can give you a lot of guarantees even in such a situation, 
> even if it might have to revert to the occasional run-time type check, 
> so static checking still has its merits.

I am not opposed to this view. The two examples I have given for things 
that are impossible in static vs. dynamic type systems were 
intentionally extreme to make the point that you have to make a choice, 
that you cannot just blindly throw (instances of) both approaches 
together. Static type systems potentially change the semantics of a 
language in ways that cannot be captured by dynamically typed languages 
anymore, and vice versa.

There is, of course, room for research on performing static type checks 
in a running system, for example immediately after or before a software 
update is applied, or maybe even on separate type checking on software 
increments such that guarantees for their composition can be derived. 
However, I am not aware of a lot of work in that area, maybe because the 
static typing community is too focused on compile-time issues.

Personally, I also don't think that's the most interesting issue in that 
area, but that's of course only a subjective opinion.


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7bla2$9h1$1@online.de>
Pascal Costanza schrieb:
> Static type systems potentially change the semantics of a 
> language in ways that cannot be captured by dynamically typed languages 
> anymore, and vice versa.

Very true.

I suspect that's also why adding type inference to a 
dynamically-typed language doesn't give you all the benefits of static 
typing: the added-on type system is (usually) too weak to express really 
interesting guarantees, usually because the language's semantics isn't 
tailored towards making the inference steps easy enough.

Conversely, I suspect that adding dynamic typing to statically-typed 
languages tends to miss the most interesting applications, mostly 
because all the features that can "simply be done" in a 
dynamically-typed language have to be retrofitted to the 
statically-typed language on a case-by-case basis.

In both cases, the language designers often don't know the facilities of 
the opposing camp well enough to really assess the trade-offs they are making.

> There is, of course, room for research on performing static type checks 
> in a running system, for example immediately after or before a software 
> update is applied, or maybe even on separate type checking on software 
> increments such that guarantees for their composition can be derived. 
> However, I am not aware of a lot of work in that area, maybe because the 
> static typing community is too focused on compile-time issues.

I think it's mostly because it's intimidating.

The core semantics of an ideal language fits on a single sheet of paper, 
to facilitate proofs of language properties. Type checking 
dynamically-loaded code probably wouldn't fit on that sheet of paper.
(The non-core semantics is then usually a set of transformation rules 
that map the constructs that the programmer sees to constructs of the 
core language.)

Regards,
Jo
From: Anton van Straaten
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <DD6mg.26347$VE1.23482@newssvr14.news.prodigy.com>
Marshall wrote:
> Joe Marshall wrote:
> 
>>They *do* have a related meaning.  Consider this code fragment:
>>(car "a string")
>>[...]
>>Both `static typing' and `dynamic typing' (in the colloquial sense) are
>>strategies to detect this sort of error.
> 
> 
> The thing is, though, that putting it that way makes it seem as
> if the two approaches are doing exactly the same thing, just
> at different times: runtime vs. compile time. But they're
> not the same thing. Passing the static check at compile
> time universally quantifies the absence of the class
> of error; passing the dynamic check at runtime existentially
> quantifies the absence of the error. A further difference is
> that in the dynamically typed language, the error is
> found during the evaluation of the expression; in a statically
> typed language, errors are found without attempting to evaluate
> the expression.
> 
> I find everything about the differences between static and
> dynamic to be frustratingly complex and subtle.

Let me add another complex subtlety, then: the above description misses 
an important point, which is that *automated* type checking is not the 
whole story.  I.e. that compile time/runtime distinction is a kind of 
red herring.

In fact, automated type checking is orthogonal to the question of the 
existence of types.  It's perfectly possible to write fully typed 
programs in a (good) dynamically-checked language.
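For instance (my hypothetical illustration, not Anton's): a Python program can carry a complete type scheme that the programmer reasons about, even though nothing in the language enforces it.

```python
# The annotations below form a type scheme the programmer reasons with
# statically; the Python runtime ignores them entirely.  The latent
# types exist whether or not an automated checker ever verifies them.

def parse_pair(s: str) -> tuple[int, int]:
    a, b = s.split(",")
    return int(a), int(b)

def manhattan(p: tuple[int, int], q: tuple[int, int]) -> int:
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

# The programmer has inferred that this composition is well-typed:
# str -> (int, int), then two pairs -> int.  No checker ran, but the
# types were there all along.
print(manhattan(parse_pair("1,2"), parse_pair("4,6")))  # prints 7
```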

In a statically-checked language, people tend to confuse automated 
static checking with the existence of types, because they're thinking in 
a strictly formal sense: they're restricting their world view to what 
they see "within" the language.

Then they look at programs in a dynamically-checked language, and see 
checks happening at runtime, and they assume that this means that the 
program is "untyped".

It's certainly close enough to say that the *language* is untyped.  One 
could also say that a program, as seen by the language, is untyped.

But a program as seen by the programmer has types: the programmer 
performs (static) type inference when reasoning about the program, and 
debugs those inferences when debugging the program, finally ending up 
with a program which has a perfectly good type scheme.  It may be 
messy compared to say an HM type scheme, and it's usually not proved to 
be perfect, but that again is an orthogonal issue.

Mathematicians operated for thousands of years without automated 
checking of proofs, so you can't argue that because a 
dynamically-checked program hasn't had its type scheme proved correct, 
that it somehow doesn't have types.  That would be a bit like arguing 
that we didn't have Math until automated theorem provers came along.

These observations affect the battle over terminology in various ways. 
I'll enumerate a few.

1. "Untyped" is really quite a misleading term, unless you're talking 
about something like the untyped lambda calculus.  That, I will agree, 
can reasonably be called untyped.

2.  "Type-free" as suggested by Chris Smith is equally misleading.  It's 
only correct in a relative sense, in a narrow formal domain which 
ignores the process of reasoning about types which is inevitably 
performed by human programmers, in any language.

3.  A really natural term to refer to types which programmers reason 
about, even if they are not statically checked, is "latent types".  It 
captures the situation very well intuitively, and it has plenty of 
precedent -- e.g. it's mentioned in the Scheme reports, R5RS and its 
predecessors, going back at least a decade or so (haven't dug to check 
when it first appeared).

4.  Type theorists like to say that "universal" types can be used in a 
statically-typed language to subsume "dynamic types".  Those theorists 
are right, the term "dynamic type", with its inextricable association 
with runtime checks, definitely gets in the way here.  It might be 
enlightening to rephrase this: what's really happening is that universal 
types allow you to embed a latently-typed program in a 
statically-checked language.  The latent types don't go anywhere, 
they're still latent in the program with universal types.  The program's 
statically-checked type scheme doesn't capture the latent types. 
Describing it in these terms clarifies what's actually happening.

5.  Dynamic checks are only part of the mechanism used to verify latent 
types.  They shouldn't be focused on as being the primary equivalent to 
static checks.  The closest equivalent to the static checks is a 
combination of human reasoning and testing, in which dynamic checks play 
an important but ultimately not a fundamental part.  You could debug a 
program and get the type scheme correct without dynamic checks, it would 
just be more difficult.

So, will y'all just switch from using "dynamically typed" to "latently 
typed", and stop talking about any real programs in real programming 
languages as being "untyped" or "type-free", unless you really are 
talking about situations in which human reasoning doesn't come into 
play?  I think you'll find it'll help to reason more clearly about this 
whole issue.

Thanks for your cooperation!!

Anton
From: Rob Thorpe
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150892256.591282.43880@u72g2000cwu.googlegroups.com>
> So, will y'all just switch from using "dynamically typed" to "latently
> typed", and stop talking about any real programs in real programming
> languages as being "untyped" or "type-free", unless you really are
> talking about situations in which human reasoning doesn't come into
> play?  I think you'll find it'll help to reason more clearly about this
> whole issue.

I agree with most of what you say except regarding "untyped".

In machine language or most assembly, the type of a variable is
something held only in the mind of the programmer writing it, and
nowhere else.  In latently typed languages, though, the programmer can
ask what the type of a particular value is.  There is a vast
difference between writing code in the latter kind of language and
writing code in assembly.

I would suggest that at least assembly should be referred to as
"untyped".
From: John W. Kennedy
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <GOymg.60$6O1.44@fe11.lga>
Rob Warnock wrote:
> Another language which has *neither* latent ("dynamic") nor
> manifest ("static") types is (was?) BLISS[1], in which, like
> assembler, variables are "just" addresses[2], and values are
> "just" a machine word of bits.

360-family assembler, yes. 8086-family assembler, not so much.

-- 
John W. Kennedy
"The blind rulers of Logres
Nourished the land on a fallacy of rational virtue."
   -- Charles Williams.  "Taliessin through Logres: Prelude"
From: Darren New
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <Aezmg.15710$Z67.1986@tornado.socal.rr.com>
John W. Kennedy wrote:
> 360-family assembler, yes. 8086-family assembler, not so much.

And Burroughs B-series, not at all. There was one "ADD" instruction, and 
it looked at the data in the addresses to determine whether to add ints 
or floats. :-)

-- 
   Darren New / San Diego, CA, USA (PST)
     My Bath Fu is strong, as I have
     studied under the Showerin' Monks.
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7m07f$tq3$2@online.de>
Darren New schrieb:
> John W. Kennedy wrote:
>> 360-family assembler, yes. 8086-family assembler, not so much.
> 
> And Burroughs B-series, not at all. There was one "ADD" instruction, and 
> it looked at the data in the addresses to determine whether to add ints 
> or floats. :-)

I heard that the Telefunken TR series had a tagged architecture.
This seems roughly equivalent to what the B-series did (does?).

Regards,
Jo
From: Anton van Straaten
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <61Gmg.27175$VE1.20@newssvr14.news.prodigy.com>
Rob Thorpe wrote:
>>So, will y'all just switch from using "dynamically typed" to "latently
>>typed", and stop talking about any real programs in real programming
>>languages as being "untyped" or "type-free", unless you really are
>>talking about situations in which human reasoning doesn't come into
>>play?  I think you'll find it'll help to reason more clearly about this
>>whole issue.
> 
> 
> I agree with most of what you say except regarding "untyped".
> 
> In machine language or most assembly the type of a variable is
> something held only in the mind of the programmer writing it, and
> nowhere else.  In latently typed languages, though, the programmer can
> ask what the type of a particular value is.  There is a vast
> difference between writing code in the latter kind of language and
> writing code in assembly.

The distinction you describe is pretty much exactly what I was getting 
at.  I may have put it poorly.

> I would suggest that at least assembly should be referred to as
> "untyped".

Yes, I agree.

While we're talking about "untyped", I want to expand on something I 
wrote in a recent reply to Vesa Karvonen: I accept that "untyped" has a 
technical definition which means that a language doesn't statically 
assign types to terms.  But this doesn't capture the distinction between 
"truly" untyped languages, and languages which tag their values to 
support the programmer's ability to think in terms of latent types.

The point of avoiding the "untyped" label is not because it's not 
technically accurate (in a limited sense) -- it's because in the absence 
of other information, it misses out on important aspects of the nature 
of latently typed languages.  My reply to Vesa goes into more detail 
about this.

Anton
From: Dr.Ruud
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7hc5d.1c4.1@news.isolution.nl>
Rob Thorpe schreef:

> I would suggest that at least assembly should be referred to as
> "untyped".

There are many different languages under the umbrella of "assembly", so
your suggestion is bound to be false.

-- 
Affijn, Ruud

"Gewoon is een tijger."
From: Rob Thorpe
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151082023.504275.257570@i40g2000cwc.googlegroups.com>
Dr.Ruud wrote:
> Rob Thorpe schreef:
>
> > I would suggest that at least assembly should be referred to as
> > "untyped".
>
> There are many different languages under the umbrella of "assembly", so
> your suggestion is bound to be false.

Well yes, it is false, I know of several typed assemblers.
But I mentioned that point elsewhere so I didn't repeat it.  This
thread is huge and I expect no one will have read it all, so I will be
clearer in future.
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150904134.520639.71070@b68g2000cwa.googlegroups.com>
Nice post! One question:

Anton van Straaten wrote:
>
> 3.  A really natural term to refer to types which programmers reason
> about, even if they are not statically checked, is "latent types".  It
> captures the situation very well intuitively, and it has plenty of
> precedent -- e.g. it's mentioned in the Scheme reports, R5RS and its
> predecessors, going back at least a decade or so (haven't dug to check
> when it first appeared).

Can you be more explicit about what "latent types" means?
I'm sorry to say it's not at all natural or intuitive to me.
Are you referring to the types in the programmers head,
or the ones at runtime, or what?


Marshall
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150985905.819268.250640@c74g2000cwc.googlegroups.com>
Rob Warnock wrote:
> Marshall <···············@gmail.com> wrote:
> >
> > Can you be more explicit about what "latent types" means?
> > I'm sorry to say it's not at all natural or intuitive to me.
> > Are you referring to the types in the programmers head,
> > or the ones at runtime, or what?
>
> Here's what the Scheme Standard has to say:
>
>     http://www.schemers.org/Documents/Standards/R5RS/HTML/r5rs-Z-H-4.html
>     1.1  Semantics
>     ...
>     Scheme has latent as opposed to manifest types. Types are assoc-
>     iated with values (also called objects) rather than with variables.
>     (Some authors refer to languages with latent types as weakly typed
>     or dynamically typed languages.) Other languages with latent types
>     are APL, Snobol, and other dialects of Lisp. Languages with manifest
>     types (sometimes referred to as strongly typed or statically typed
>     languages) include Algol 60, Pascal, and C.
>
> To me, the word "latent" means that when handed a value of unknown type
> at runtime, I can look at it or perform TYPE-OF on it or TYPECASE or
> something and thereby discover its actual type at the moment[1], whereas
> "manifest" means that types[2] are lexically apparent in the code.

Hmmm. If I read the R5RS text correctly, it is simply doing the
either/or thing that often happens with "static/dynamic" only
using different terms. I don't see any difference between
"latent" and "dynamic." Also, this phrase "types associated with
values instead of variables" that I'm starting to see a lot is
beginning to freak me out: the implication is that other languages
have types associated with variables and not values, which
doesn't describe anything I can think of.

In your followup paragraph, you've contrasted runtime type
introspection with explicit type declarations, which seem
orthogonal to me. (Not that you said they weren't.)


Marshall
From: Anton van Straaten
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <ixEmg.55552$Lm5.50067@newssvr12.news.prodigy.com>
Marshall wrote:
> Can you be more explicit about what "latent types" means?
> I'm sorry to say it's not at all natural or intuitive to me.
> Are you referring to the types in the programmers head,
> or the ones at runtime, or what?

Sorry, that was a huge omission.  (What I get for posting at 3:30am.)

The short answer is that I'm most directly referring to "the types in 
the programmer's head".  A more complete informal summary is as follows:

Languages with latent type systems typically don't include type 
declarations in the source code of programs.  The "static" type scheme 
of a given program in such a language is thus latent, in the English 
dictionary sense of the word, of something that is present but 
undeveloped.  Terms in the program may be considered as having static 
types, and it is possible to infer those types, but it isn't necessarily 
easy to do so automatically, and there are usually many possible static 
type schemes that can be assigned to a given program.

Programmers infer and reason about these latent types while they're 
writing or reading programs.  Latent types become manifest when a 
programmer reasons about them, or documents them e.g. in comments.

(As has already been noted, this definition may seem at odds with the 
definition given in the Scheme report, R5RS, but I'll address that in a 
separate post.)

There's a close connection between latent types in the sense I've 
described, and the "tagged values" present at runtime.  However, as type 
theorists will tell you, the tags used to tag values at runtime, as e.g. 
a number or a string or a FooBar object, are not the same thing as the 
sort of types which statically-typed languages have.

A simple example of the distinction can be seen in the type of a 
function.  Using Javascript as a lingua franca:

   function timestwo(x) { return x * 2 }

In a statically-typed language, the type of a function like this might 
be something like "number -> number", which tells us three things: that 
timestwo is a function; that it accepts a number argument; and that it 
returns a number result.

But if we ask Javascript what it thinks the type of timestwo is, by 
evaluating "typeof timestwo", it returns "function".  That's because the 
value bound to timestwo has a tag associated with it which says, in 
effect, "this value is a function".
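
A concrete sketch of that tag-vs-type gap, runnable in any JavaScript
engine (the function is the same one given above; the comments are mine):

```javascript
function timestwo(x) { return x * 2 }

// The runtime tag records only "function" -- nothing about the
// argument or result types that "number -> number" would convey.
console.log(typeof timestwo);     // "function"
console.log(typeof timestwo(5));  // "number"
```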

But "function" is not a useful type.  Why not?  Because if all you know 
is that timestwo is a function, then you have no idea what an expression 
like "timestwo(foo)" means.  You couldn't write working programs, or 
read them, if all you knew about functions was that they were functions.
As a type, "function" is incomplete.

By my definition, though, the latent type of timestwo is "number -> 
number".  Any programmer looking at the function can figure out that 
this is its type, and programmers do exactly that when reasoning about a 
program.

(Aside: technically, you can pass timestwo something other than a 
number, but then you get NaN back, which is usually not much use except 
to generate errors.  I'll ignore that here; latent typing requires being 
less rigorous about some of these issues.)

So, where do tagged values fit into this?  Tags help to check types at 
runtime, but that doesn't mean that there's a 1:1 correspondence between 
tags and types.  For example, when an expression such as "timestwo(5) * 
3" is evaluated, three checks occur that are relevant to the type of 
timestwo:

1. Before the function call takes place, a check ensures that timestwo 
is a function.

2. Before the multiplication in "x * 2", a check ensures that x is a number.

3. When timestwo returns, before the subsequent multiplication by 3, a 
check ensures that the return type of timestwo is a number.

These three checks correspond to the three pieces of information 
contained in the function type signature "number -> number".
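
Written out explicitly, the three checks might look like the sketch below.
This is only an illustration of what the runtime effectively verifies, not
how any engine implements it, and it is stricter than JavaScript itself,
which coerces non-numbers rather than rejecting them:

```javascript
function timestwo(x) { return x * 2 }

// A hand-rolled version of the checks behind "timestwo(5) * 3".
function checkedExpression() {
  // Check 1: timestwo must carry the "function" tag before the call.
  if (typeof timestwo !== "function") throw new TypeError("not callable");
  var arg = 5;
  // Check 2: "x * 2" requires its operand to be a number.
  if (typeof arg !== "number") throw new TypeError("argument not a number");
  var result = timestwo(arg);
  // Check 3: the result must be a number before multiplying it by 3.
  if (typeof result !== "number") throw new TypeError("result not a number");
  return result * 3;
}

console.log(checkedExpression());  // 30
```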

However, these dynamic checks still don't actually tell us the type of a 
function.  All they do is check that in a particular case, the values 
involved are compatible with the type of the function.  In many cases, 
the checks may infer a signature that's either more or less specific 
than the function's type, or they may infer an incomplete signature -- 
e.g., the return type doesn't need to be checked when evaluating "arr[i] 
= timestwo(5)".

I used a function just as an example.  There are many other cases where 
a value's tag doesn't match the static (or latent) type of the terms 
through which it flows.  A simple example is an expression such as:

     (flag ? 5 : "foo")

Here, the latent type of this expression could be described as "number | 
string".  There won't be a runtime tag anywhere which represents that 
type, though, since the language implementation never deals with the 
actual type of expressions, except in those degenerate cases where the 
type is so simple that it happens to be a 1:1 match to the corresponding 
tag.  It's only the programmer that "knows" that this expression has 
that type.
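
This is easy to observe directly: the tag reported at runtime depends
entirely on which branch was taken, so no single run ever reveals the
union (the wrapper function here is just for illustration):

```javascript
function pick(flag) { return flag ? 5 : "foo"; }

// The latent type of pick's result is "number | string", but each run
// only ever sees the tag of the branch it actually took.
console.log(typeof pick(true));   // "number"
console.log(typeof pick(false));  // "string"
```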

Anton
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151030140.202720.245140@y41g2000cwy.googlegroups.com>
Anton van Straaten wrote:
> Marshall wrote:
> > Can you be more explicit about what "latent types" means?
> > I'm sorry to say it's not at all natural or intuitive to me.
> > Are you referring to the types in the programmers head,
> > or the ones at runtime, or what?
>
> Sorry, that was a huge omission.  (What I get for posting at 3:30am.)

Thanks for the excellent followup.


> The short answer is that I'm most directly referring to "the types in
> the programmer's head".

In the database theory world, we speak of three levels: conceptual,
logical, physical. In a dbms, these might roughly be compared to
business entities described in a requirements doc, (or in the
business owners' heads), the (logical) schema encoded in the
dbms, and the b-tree indices the dbms uses for performance.

So when you say "latent types", I think "conceptual types."

The thing about this is, both the Lisp and the Haskell programmers
are using conceptual types (as best I can tell.) And also, the
conceptual/latent types are not actually a property of the
program; they are a property of the programmer's mental
model of the program.

It seems we have languages:
with or without static analysis
with or without runtime type information (RTTI or "tags")
with or without (runtime) safety
with or without explicit type annotations
with or without type inference

Wow. And I don't think that's a complete list, either.

I would be happy to abandon "strong/weak" as terminology
because I can't pin those terms down. (It's not clear what
they would add anyway.)


> A more complete informal summary is as follows:
>
> Languages with latent type systems typically don't include type
> declarations in the source code of programs.  The "static" type scheme
> of a given program in such a language is thus latent, in the English
> dictionary sense of the word, of something that is present but
> undeveloped.  Terms in the program may be considered as having static
> types, and it is possible to infer those types, but it isn't necessarily
> easy to do so automatically, and there are usually many possible static
> type schemes that can be assigned to a given program.
>
> Programmers infer and reason about these latent types while they're
> writing or reading programs.  Latent types become manifest when a
> programmer reasons about them, or documents them e.g. in comments.

Uh, oh, a new term, "manifest." Should I worry about that?


> (As has already been noted, this definition may seem at odds with the
> definition given in the Scheme report, R5RS, but I'll address that in a
> separate post.)
>
> There's a close connection between latent types in the sense I've
> described, and the "tagged values" present at runtime.  However, as type
> theorists will tell you, the tags used to tag values at runtime, as e.g.
> a number or a string or a FooBar object, are not the same thing as the
> sort of types which statically-typed languages have.
>
> A simple example of the distinction can be seen in the type of a
> function.  Using Javascript as a lingua franca:
>
>    function timestwo(x) { return x * 2 }
>
> In a statically-typed language, the type of a function like this might
> be something like "number -> number", which tells us three things: that
> timestwo is a function; that it accepts a number argument; and that it
> returns a number result.
>
> But if we ask Javascript what it thinks the type of timestwo is, by
> evaluating "typeof timestwo", it returns "function".  That's because the
> value bound to timestwo has a tag associated with it which says, in
> effect, "this value is a function".

Well, darn. It strikes me that that's just a decision the language
designers made, *not* to record complete RTTI. (Is it going to be
claimed that there is an *advantage* to having only incomplete RTTI?
It is a serious question.)


> But "function" is not a useful type.  Why not?  Because if all you know
> is that timestwo is a function, then you have no idea what an expression
> like "timestwo(foo)" means.  You couldn't write working programs, or
> read them, if all you knew about functions was that they were functions.
>   As a type, "function" is incomplete.

Yes, function is a parameterized type, and they've left out the
parameter values.


> By my definition, though, the latent type of timestwo is "number ->
> number".  Any programmer looking at the function can figure out that
> this is its type, and programmers do exactly that when reasoning about a
> program.

Gotcha.


> (Aside: technically, you can pass timestwo something other than a
> number, but then you get NaN back, which is usually not much use except
> to generate errors.  I'll ignore that here; latent typing requires being
> less rigorous about some of these issues.)
>
> So, where do tagged values fit into this?  Tags help to check types at
> runtime, but that doesn't mean that there's a 1:1 correspondence between
> tags and types.  For example, when an expression such as "timestwo(5) *
> 3" is evaluated, three checks occur that are relevant to the type of
> timestwo:
>
> 1. Before the function call takes place, a check ensures that timestwo
> is a function.
>
> 2. Before the multiplication in "x * 2", a check ensures that x is a number.
>
> 3. When timestwo returns, before the subsequent multiplication by 3, a
> check ensures that the return type of timestwo is a number.
>
> These three checks correspond to the three pieces of information
> contained in the function type signature "number -> number".
>
> However, these dynamic checks still don't actually tell us the type of a
> function.  All they do is check that in a particular case, the values
> involved are compatible with the type of the function.  In many cases,
> the checks may infer a signature that's either more or less specific
> than the function's type, or they may infer an incomplete signature --
> e.g., the return type doesn't need to be checked when evaluating "arr[i]
> = timestwo(5)".
>
> I used a function just as an example.  There are many other cases where
> a value's tag doesn't match the static (or latent) type of the terms
> through which it flows.  A simple example is an expression such as:
>
>      (flag ? 5 : "foo")
>
> Here, the latent type of this expression could be described as "number |
> string".  There won't be a runtime tag anywhere which represents that
> type, though, since the language implementation never deals with the
> actual type of expressions, except in those degenerate cases where the
> type is so simple that it happens to be a 1:1 match to the corresponding
> tag.  It's only the programmer that "knows" that this expression has
> that type.

Hmmm. Another place where the static type isn't the same thing as
the runtime type occurs in languages with subtyping.

Question: if a language *does* record complete RTTI, and the
language does *not* have subtyping, could we then say that
the runtime type information *is* the same as the static type?


Marshall
From: Anton van Straaten
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <BXMmg.102764$H71.72376@newssvr13.news.prodigy.com>
Marshall wrote:
>>The short answer is that I'm most directly referring to "the types in
>>the programmer's head".
> 
> 
> In the database theory world, we speak of three levels: conceptual,
> logical, physical. In a dbms, these might roughly be compared to
> business entities described in a requirements doc, (or in the
> business owners' heads), the (logical) schema encoded in the
> dbms, and the b-tree indices the dbms uses for performance.
> 
> So when you say "latent types", I think "conceptual types."

That sounds plausible, but of course at some point we need to pick a 
term and attempt to define it.  What I'm attempting to do with "latent 
types" is to point out and emphasize their relationship to static types, 
which do have a very clear, formal definition.  Despite some people's 
skepticism, that definition gives us a lot of useful stuff that can be 
applied to what I'm calling latent types.

> The thing about this is, both the Lisp and the Haskell programmers
> are using conceptual types (as best I can tell.) 

Well, a big difference is that the Haskell programmers have language 
implementations that are clever enough to tell them, statically, a very 
precise type for every term in their program.  Lisp programmers usually 
don't have that luxury.  And in the Haskell case, the conceptual type 
and the static type match very closely.

There can be differences, e.g. a programmer might have knowledge about 
the input to the program or some programmed constraint that Haskell 
isn't capable of inferring, that allows them to figure out a more 
precise type than Haskell can.  (Whether that type can be expressed in 
Haskell's type system is a separate question.)

In that case, you could say that the conceptual type is different than 
the inferred static type.  But most of the time, the human is reasoning 
about pretty much the same types as the static types that Haskell 
infers.  Things would get a bit confusing otherwise.

> And also, the
> conceptual/latent types are not actually a property of the
> program; 

That's not exactly true in the Haskell case (or other languages with 
static type inference), assuming you accept the static/conceptual 
equivalence I've drawn above.

Although static types are often not explicitly written in the program in 
such languages, they are unambiguously defined and automatically 
inferrable.  They are a static property of the program, even if in many 
cases those properties are only implicit with respect to the source 
code.  You can ask the language to tell you what the type of any given 
term is.

> they are a property of the programmer's mental
> model of the program.

That's more accurate.  In languages with type inference, the programmer 
still has to figure out what the implicit types are (or ask the language 
to tell her).

You won't get any argument from me that this figuring out of implicit 
types in a Haskell program is quite similar to what a Lisp, Scheme, or 
Python programmer does.  That's one of the places the connection between 
static types and what I call latent types is the strongest.

(However, I think I just heard a sound as though a million type 
theorists howled in unison.[*])

[*] most obscure inadvertent pun ever.

> It seems we have languages:
> with or without static analysis
> with or without runtime type information (RTTI or "tags")
> with or without (runtime) safety
> with or without explicit type annotations
> with or without type inference
> 
> Wow. And I don't think that's a complete list, either.

Yup.

> I would be happy to abandon "strong/weak" as terminology
> because I can't pin those terms down. (It's not clear what
> they would add anyway.)

I wasn't following the discussion earlier, but I agree that strong/weak 
don't have strong and unambiguous definitions.

> Uh, oh, a new term, "manifest." Should I worry about that?

Well, people have used the term "manifest type" to refer to a type 
that's explicitly apparent in the source, but I wouldn't worry about it.
I just used the term to imply that at some point, the idea of "latent 
type" has to be converted to something less latent.  Once you explicitly 
identify a type, it's no longer latent to the entity doing the identifying.

>>But if we ask Javascript what it thinks the type of timestwo is, by
>>evaluating "typeof timestwo", it returns "function".  That's because the
>>value bound to timestwo has a tag associated with it which says, in
>>effect, "this value is a function".
> 
> 
> Well, darn. It strikes me that that's just a decision the language
> designers
> made, *not* to record complete RTTI. 

No, there's more to it.  There's no way for a dynamically-typed language 
to figure out that something like the timestwo function I gave has the 
type "number -> number" without doing type inference, by examining the 
source of the function, at which point it pretty much crosses the line 
into being statically typed.

> (Is it going to be claimed that
> there is an *advantage* to having only incomplete RTTI? It is a
> serious question.)

More than an advantage, it's difficult to do it any other way.  Tags are 
associated with values.  Types in the type theory sense are associated 
with terms in a program.  All sorts of values can flow through a given 
term, which means that types can get complicated (even in a nice clean 
statically typed language).  The only way to reason about types in that 
sense is statically - associating tags with values at runtime doesn't 
get you there.

This is the sense in which the static type folk object to the term 
"dynamic type" - because the tags/RTTI are not "types" in the type 
theory sense.

Latent types as I'm describing them are intended to more closely 
correspond to static types, in terms of the way in which they apply to 
terms in a program.

> Hmmm. Another place where the static type isn't the same thing as
> the runtime type occurs in languages with subtyping.

Yes.

> Question: if a language *does* record complete RTTI, and the
> language does *not* have subtyping, could we then say that
> the runtime type information *is* the same as the static type?

You'd still have a problem with function types, as I mentioned above, 
as well as expressions like the conditional one I gave:

   (flag ? 5 : "foo")

The problem is that as the program is running, all it can ever normally 
do is tag a value with the tags obtained from the path the program 
followed on that run.  So RTTI isn't, by itself, going to determine 
anything similar to a static type for such a term, such as "string | 
number".

One way to think about "runtime" types is as being equivalent to certain 
kinds of leaves on a statically-typed program syntax tree.  In an 
expression like "x = 3", an inferring compiler might determine that 3 is 
of type "integer".  The tagged values which RTTI uses operate at this 
level: the level of individual values.

However, as soon as you start dealing with other kinds of nodes in the 
syntax tree -- nodes that don't represent literal values, or compound 
nodes (that have children) -- the possibility arises that the type of 
the overall expression will be more complex than that of a single value.
At that point, RTTI can't do what static types do.

Even in the simple "x = 3" case, a hypothetical inferring compiler might 
notice that elsewhere in the same function, x is treated as a floating 
point number, perhaps via an expression like "x = x / 2.0".   According 
to the rules of its type system, our language might determine that x has 
type "float", as opposed to "number" or "integer".  It might then either 
treat the original "3" as a float, or supply a conversion when assigning 
it to "x".  (Of course, some languages might just give an error and 
force you to be more explicit, but bear with me for the example - it's 
after 3:30am again.)

Compare this to the dynamically-typed language: it sees "3" and, 
depending on the language, might decide it's a "number" or perhaps an 
"integer", and tag it as such.  Either way, x ends up referring to that 
tagged value.  So what type is x?  All you can really say, in the RTTI 
case, is that x is a number or an integer, depending on the tag the 
value has.  There's no way it can figure out that x should be a float at 
that point.

Of course, further down when it runs into the code "x = x / 2.0", x 
might end up holding a value that's tagged as a float.  But that only 
tells you the value of x at some point in time, it doesn't help you with 
the static type of x, i.e. a type that either encompasses all possible 
values x could have during its lifetime, or alternatively determines how 
values assigned to x should be treated (e.g. cast to float).
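
JavaScript can't illustrate the integer/float split (it has only a single
"number" tag), but the same point can be sketched with a variable whose tag
changes between assignments:

```javascript
var x = 3;
console.log(typeof x);  // "number" -- the tag describes the current value...

x = "three";
console.log(typeof x);  // "string" -- ...not the term x itself.

// A static type for x would have to cover its whole lifetime, e.g.
// something like "number | string"; no runtime tag ever says that.
```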

BTW, it's worth noting at this point, since it's implied by the last 
paragraph, that a static type is only an approximation to the values 
that a term can have during the execution of a program.  Static types 
can (often!) be too conservative, describing possible values that a 
particular term couldn't actually ever have.  This gives another hint as 
to why RTTI can't be equivalent to static types.  It's only ever dealing 
with the concrete values right in front of it, it can't see the bigger 
static picture.

Anton
From: Chris Uppal
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <449bde5f$1$663$bed64819@news.gradwell.net>
Anton van Straaten wrote:

> In that case, you could say that the conceptual type is different than
> the inferred static type.  But most of the time, the human is reasoning
> about pretty much the same types as the static types that Haskell
> infers.  Things would get a bit confusing otherwise.

Or any mechanised or formalised type system, for any language.  If a system
doesn't match pretty closely with at least part of the latent type systems (in
your sense) used by the programmers, then that type system is useless.

(I gather that it took, or maybe is still taking, theorists a while to get to
grips with the informal type logics which were obvious to working OO
programmers.)

    -- chris
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7m1f5$nf$2@online.de>
Anton van Straaten schrieb:
>> It seems we have languages:
>> with or without static analysis
>> with or without runtime type information (RTTI or "tags")
>> with or without (runtime) safety
>> with or without explicit type annotations
>> with or without type inference
>>
>> Wow. And I don't think that's a complete list, either.
> 
> Yup.

What else?
(Genuinely curious.)

Regards,
Jo
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151255749.436492.70680@m73g2000cwd.googlegroups.com>
Joachim Durchholz wrote:
> Anton van Straaten schrieb:
> >> It seems we have languages:
> >> with or without static analysis
> >> with or without runtime type information (RTTI or "tags")
> >> with or without (runtime) safety
> >> with or without explicit type annotations
> >> with or without type inference
> >>
> >> Wow. And I don't think that's a complete list, either.
> >
> > Yup.
>
> What else?
> (Genuinely curious.)

I left off a big one: nominal vs. structural

Also: has subtyping polymorphism or not, has parametric polymorphism or
not.


Marshall
From: Darren New
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <tXBng.16192$Z67.12248@tornado.socal.rr.com>
Marshall wrote:
> Also: has subtyping polymorphism or not, has parametric polymorphism or
> not.

And covariant or contravariant.

-- 
   Darren New / San Diego, CA, USA (PST)
     Native Americans used every part
     of the buffalo, including the wings.
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7pjnl$gac$1@online.de>
Darren New schrieb:
> Marshall wrote:
>> Also: has subtyping polymorphism or not, has parametric polymorphism or
>> not.
> 
> And covariant or contravariant.

That's actually not a design choice - if you wish to have a sound type 
system, all input parameters *must* be contravariant, all output 
parameters *must* be covariant, and all in/out parameters must be 
invariant. (Eiffel got this one wrong in almost all cases.)

Regards,
Jo
From: Darren New
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <FzZng.16310$Z67.15253@tornado.socal.rr.com>
Joachim Durchholz wrote:
> That's actually not a design choice

It's certainly a choice you can get wrong, as you say. ;-)

I mean, if "without runtime safety" is a choice, I expect picking the 
wrong choice here can be. :-)

-- 
   Darren New / San Diego, CA, USA (PST)
     Native Americans used every part
     of the buffalo, including the wings.
From: Dr.Ruud
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7hcer.19s.1@news.isolution.nl>
Marshall schreef:

> It seems we have languages:
> with or without static analysis
> with or without runtime type information (RTTI or "tags")
> with or without (runtime) safety
> with or without explicit type annotations
> with or without type inference
>
> Wow. And I don't think that's a complete list, either.

Right. And don't forget that some languages are hybrids in any of those
areas.

-- 
Affijn, Ruud

"Gewoon is een tijger."
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7m1b6$nf$1@online.de>
Marshall schrieb:
> It seems we have languages:
> with or without static analysis
> with or without runtime type information (RTTI or "tags")
> with or without (runtime) safety
> with or without explicit type annotations
> with or without type inference
> 
> Wow. And I don't think that's a complete list, either.
> 
> I would be happy to abandon "strong/weak" as terminology
> because I can't pin those terms down. (It's not clear what
> they would add anyway.)

Indeed.

>> Programmers infer and reason about these latent types while they're
>> writing or reading programs.  Latent types become manifest when a
>> programmer reasons about them, or documents them e.g. in comments.
> 
> Uh, oh, a new term, "manifest." Should I worry about that?

I think that was OED usage of the term.

> Well, darn. It strikes me that that's just a decision the language
> designers
> made, *not* to record complete RTTI. (Is it going to be claimed that
> there is an *advantage* to having only incomplete RTTI? It is a
> serious question.)

In most cases, it's probably "we don't have to invent or look up 
efficient algorithms that we can't think of right now".
One could consider this a sorry excuse or a wise decision to stick with 
available resources; both views have their merits IMHO ;-)

>> But "function" is not a useful type.  Why not?  Because if all you know
>> is that timestwo is a function, then you have no idea what an expression
>> like "timestwo(foo)" means.  You couldn't write working programs, or
>> read them, if all you knew about functions was that they were functions.
>>   As a type, "function" is incomplete.
> 
> Yes, function is a parameterized type, and they've left out the
> parameter values.

Well, in JavaScript, the explicit type system as laid down in the 
run-time type information is unparameterized.
You can criticize this as unsound, or inadequate, or whatever you wish, 
and I'd like to agree with that ;-), but the type of timestwo is indeed 
just "function".

*Your mental model* is far more detailed, of course.
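(The same point holds in Python, for what it's worth — a quick sketch:)

```python
def timestwo(x):
    return x * 2

# All the run-time type information records is "function"; the parameter
# and result types live only in the programmer's head (or in comments).
print(type(timestwo).__name__)   # prints "function"
```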

> Question: if a language *does* record complete RTTI, and the
> languages does *not* have subtyping, could we then say that
> the runtime type information *is* the same as the static type?

Only if the two actually use the same type system.

In practice, I'd say that this is what most designers intend but what 
implementors may not have gotten right ;-p

Regards,
Jo
From: ········@ps.uni-sb.de
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151056505.908723.162580@u72g2000cwu.googlegroups.com>
Anton van Straaten wrote:
>
> Languages with latent type systems typically don't include type
> declarations in the source code of programs.  The "static" type scheme
> of a given program in such a language is thus latent, in the English
> dictionary sense of the word, of something that is present but
> undeveloped.  Terms in the program may be considered as having static
> types, and it is possible to infer those types, but it isn't necessarily
> easy to do so automatically, and there are usually many possible static
> type schemes that can be assigned to a given program.
>
> Programmers infer and reason about these latent types while they're
> writing or reading programs.  Latent types become manifest when a
> programmer reasons about them, or documents them e.g. in comments.

I very much agree with the observation that every programmer performs
"latent typing" in his head (although Pascal Costanza seems to have
the opposite opinion).

But I also think that "latently typed language" is not a meaningful
characterisation. And for the very same reason! Since any programming
activity involves latent typing - naturally, even in assembler! - it
cannot be attributed to any language in particular, and is hence
useless to distinguish between them. (Even untyped lambda calculus
would not be a counter-example. If you really were to program in it,
you certainly would think along lines like "this function takes two
Church numerals and produces a third one".)

I hear you when you define latently typed languages as those that
support the programmer's latently typed thinking by providing dynamic
tag checks. But in the very same post (and others) you also explain at
length why these tags are far from being actual types. This seems a bit
contradictory to me.

As Chris Smith points out, these dynamic checks are basically a
necessity for a well-defined operational semantics. You need them
whenever you have different syntactic classes of values, but lack a
type system to preclude interference. They are just an encoding for
differentiating these syntactic classes. Their connection to types is
rather coincidental.
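(To illustrate the "encoding" view with a minimal Python sketch — the tag
names here are made up:)

```python
# A value is a (tag, payload) pair; the tag names its syntactic class.
NUM, STR = "number", "string"

def add(x, y):
    # The dynamic check: preclude interference between syntactic classes.
    if x[0] != NUM or y[0] != NUM:
        raise TypeError("add: expected numbers")
    return (NUM, x[1] + y[1])
```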

- Andreas
From: Anton van Straaten
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <fgRmg.155793$F_3.69979@newssvr29.news.prodigy.net>
········@ps.uni-sb.de wrote:
> I very much agree with the observation that every programmer performs
> "latent typing" in his head 

Great!

> (although Pascal Costanza seems to have
> the opposite opinion).

I'll have to catch up on that.

> But I also think that "latently typed language" is not a meaningful
> characterisation. And for the very same reason! Since any programming
> activity involves latent typing - naturally, even in assembler! - it
> cannot be attributed to any language in particular, and is hence
> useless to distinguish between them. (Even untyped lambda calculus
> would not be a counter-example. If you really were to program in it,
> you certainly would think along lines like "this function takes two
> Church numerals and produces a third one".)

Vesa raised a similar objection in his post 'Saying "latently typed 
language" is making a category mistake'.  I've made some relevant 
responses to that post.

I agree that there's a valid point in the sense that latent types are 
not a property of the semantics of the languages they're associated with.

But to take your example, I've had the pleasure of programming a little 
in untyped lambda calculus.  I can tell you that not having tags is 
quite problematic.  You certainly do perform latent typing in your head, 
but without tags, the language doesn't provide any built-in support for 
it.  You're on your own.

I'd characterize this as being similar to other situations in which a 
language has explicit support for some programming style (e.g. FP or 
OO), vs. not having such support, so that you have to take steps to fake 
it.  The style may be possible to some degree in either case, but it's 
much easier if the language has explicit support for it. 
Dynamically-typed languages provide support for latent typing.  Untyped 
lambda calculus and assembler don't provide such support, built-in.

However, I certainly don't want to add to confusion about things like 
this.  I'm open to other ways to describe these things.  Part of where 
"latently-typed language" comes from is as an alternative to 
"dynamically-typed language" -- an alternative with connotations that 
I'm suggesting should be more compatible with type theory.

> I hear you when you define latently typed languages as those that
> support the programmer's latently typed thinking by providing dynamic
> tag checks. But in the very same post (and others) you also explain at
> length why these tags are far from being actual types. This seems a bit
> contradictory to me.

I don't see it as contradictory.  Tags are a mechanism.  An individual 
tag is not a type.  But together, tags help check types.  More below:

> As Chris Smith points out, these dynamic checks are basically a
> necessity for a well-defined operational semantics. You need them
> whenever you have different syntactic classes of values, but lack a
> type system to preclude interference. They are just an encoding for
> differentiating these syntactic classes. Their connection to types is
> rather coincidental.

Oooh, I think I'd want to examine that "coincidence" very closely.  For 
example, *why* do we have different syntactic classes of values in the 
first place?  I'd say that we have them precisely to support type-like 
reasoning about programs, even in the absence of a formal static type 
system.

If you're going to talk about "syntactic classes of values" in the 
context of latent types, you need to consider how they relate to latent 
types.  Syntactic classes of values relate to latent types in a very 
similar way to that in which static types do.

One difference is that since latent types are not automatically 
statically checked, checks have to happen in some other way.  That's 
what checks of tags are for.

Chris Smith's observation was made in a kind of deliberate void: one in 
which the existence of anything like latent types is discounted, on the 
grounds that they're informal.  If you're willing to acknowledge the 
idea of latent types, then you can reinterpret the meaning of tags in 
the context of latent types.  In that context, tags are part of the 
mechanism which checks latent types.

Anton
From: David Hopwood
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <Cvcng.463162$tc.6353@fe2.news.blueyonder.co.uk>
Anton van Straaten wrote:
> ········@ps.uni-sb.de wrote:
> 
>> I very much agree with the observation that every programmer performs
>> "latent typing" in his head 
[...]
>> But I also think that "latently typed language" is not a meaningful
>> characterisation. And for the very same reason! Since any programming
>> activity involves latent typing - naturally, even in assembler! - it
>> cannot be attributed to any language in particular, and is hence
>> useless to distinguish between them. (Even untyped lambda calculus
>> would not be a counter-example. If you really were to program in it,
>> you certainly would think along lines like "this function takes two
>> Church numerals and produces a third one".)
> 
> Vesa raised a similar objection in his post 'Saying "latently typed
> language" is making a category mistake'.  I've made some relevant
> responses to that post.
> 
> I agree that there's a valid point in the sense that latent types are
> not a property of the semantics of the languages they're associated with.
> 
> But to take your example, I've had the pleasure of programming a little
> in untyped lambda calculus.  I can tell you that not having tags is
> quite problematic.  You certainly do perform latent typing in your head,
> but without tags, the language doesn't provide any built-in support for
> it.  You're on your own.

I can accept that dynamic tagging provides some support for latent typing
performed "in the programmer's head". But that still does not mean that
dynamic tagging is the same thing as latent typing, or that languages
that use dynamic tagging are "latently typed". This simply is not a
property of the language (as you've already conceded).

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Anton van Straaten
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <JYdng.156673$F_3.102006@newssvr29.news.prodigy.net>
David Hopwood wrote:
> I can accept that dynamic tagging provides some support for latent typing
> performed "in the programmer's head". But that still does not mean that
> dynamic tagging is the same thing as latent typing

No, I'm not saying it is, although I am saying that the former supports 
the latter.

> or that languages
> that use dynamic tagging are "latently typed". This simply is not a
> property of the language (as you've already conceded).

Right.  I see at least two issues here: one is that as a matter of 
shorthand, compressing "language which supports latent typing" to 
"latently-typed language" ought to be fine, as long as the term's 
meaning is understood.

But beyond that, there's an issue here about the definition of "the 
language".  When programming in a latently-typed language, a lot of 
action goes on outside the language - reasoning about static properties 
of programs that are not captured by the semantics of the language.

This means that there's a sense in which the language that the 
programmer programs in is not the same language that has a formal 
semantic definition.  As I mentioned in another post, programmers are 
essentially mentally programming in a richer language - a language which 
has informal (static) types - but the code they write down elides this 
type information, or else puts it in comments.

We have to accept, then, that the formal semantic definitions of 
dynamically-checked languages are incomplete in some important ways. 
Referring to those semantic definitions as "the language", as though 
that's all there is to the language in a broader sense, is misleading.

In this context, the term "latently-typed language" refers to the 
language that a programmer experiences, not to the subset of that 
language which is all that we're typically able to formally define.

Anton
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151171007.230666.235430@r2g2000cwb.googlegroups.com>
Anton van Straaten wrote:
>
> But beyond that, there's an issue here about the definition of "the
> language".  When programming in a latently-typed language, a lot of
> action goes on outside the language - reasoning about static properties
> of programs that are not captured by the semantics of the language.
>
> This means that there's a sense in which the language that the
> programmer programs in is not the same language that has a formal
> semantic definition.  As I mentioned in another post, programmers are
> essentially mentally programming in a richer language - a language which
> has informal (static) types - but the code they write down elides this
> type information, or else puts it in comments.
>
> We have to accept, then, that the formal semantic definitions of
> dynamically-checked languages are incomplete in some important ways.
> Referring to those semantic definitions as "the language", as though
> that's all there is to the language in a broader sense, is misleading.
>
> In this context, the term "latently-typed language" refers to the
> language that a programmer experiences, not to the subset of that
> language which is all that we're typically able to formally define.

That is starting to get a bit too mystical for my tastes.


Marshall
From: Anton van Straaten
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <rkgng.157632$F_3.137561@newssvr29.news.prodigy.net>
Marshall wrote:
> Anton van Straaten wrote:
> 
>>But beyond that, there's an issue here about the definition of "the
>>language".  When programming in a latently-typed language, a lot of
>>action goes on outside the language - reasoning about static properties
>>of programs that are not captured by the semantics of the language.
>>
>>This means that there's a sense in which the language that the
>>programmer programs in is not the same language that has a formal
>>semantic definition.  As I mentioned in another post, programmers are
>>essentially mentally programming in a richer language - a language which
>>has informal (static) types - but the code they write down elides this
>>type information, or else puts it in comments.
>>
>>We have to accept, then, that the formal semantic definitions of
>>dynamically-checked languages are incomplete in some important ways.
>>Referring to those semantic definitions as "the language", as though
>>that's all there is to the language in a broader sense, is misleading.
>>
>>In this context, the term "latently-typed language" refers to the
>>language that a programmer experiences, not to the subset of that
>>language which is all that we're typically able to formally define.
> 
> 
> That is starting to get a bit too mystical for my tastes.

It's informal, but hardly mystical.  The formal semantic definitions of 
dynamically-checked languages I'm referring to are the typical "untyped" 
definitions, in which the language has a single static type.

This means that when you as a programmer see a function that you "know" 
e.g. takes a number and returns a number, that knowledge is not 
something that's captured by the language's formal semantics or static 
type system.  All that can be expressed in the language's static type 
system is that the function takes a value and returns a value.
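(For instance, in Python — a sketch of the same elision:)

```python
def timestwo(x):   # latent type: number -> number, recorded only in this comment
    return x * 2

# The untyped formal model happily accepts any value; a latent type error
# isn't an error at all in that model:
timestwo("abc")    # -> "abcabc", not a type error
```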

When you're programming in your favorite dynamically-typed language, 
I'll bet that you're not thinking of functions, or other expressions, in 
such a restricted way.  Your experience of programming in the language 
thus relies heavily on features that are not a "property of the 
language", if "the language" is taken to mean its formal semantic 
definition.

So in referring to a language as latently-typed, the word "language" can 
be taken to mean what a programmer thinks of as the language, rather 
than a formal definition that only captures a subset of what the 
programmer routinely deals with.

If we don't acknowledge this distinction between the part of the 
language that we can capture formally and the part that we haven't 
(yet), then we'll be forever stuck without accepted terminology for the 
informal bits, because such terminology can always be objected to on the 
grounds that it isn't supported by formal semantics.

Anton
From: Pascal Costanza
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <4g5hdiF1lpuniU1@individual.net>
Marshall wrote:
> Anton van Straaten wrote:
>> But beyond that, there's an issue here about the definition of "the
>> language".  When programming in a latently-typed language, a lot of
>> action goes on outside the language - reasoning about static properties
>> of programs that are not captured by the semantics of the language.
>>
>> This means that there's a sense in which the language that the
>> programmer programs in is not the same language that has a formal
>> semantic definition.  As I mentioned in another post, programmers are
>> essentially mentally programming in a richer language - a language which
>> has informal (static) types - but the code they write down elides this
>> type information, or else puts it in comments.
>>
>> We have to accept, then, that the formal semantic definitions of
>> dynamically-checked languages are incomplete in some important ways.
>> Referring to those semantic definitions as "the language", as though
>> that's all there is to the language in a broader sense, is misleading.
>>
>> In this context, the term "latently-typed language" refers to the
>> language that a programmer experiences, not to the subset of that
>> language which is all that we're typically able to formally define.
> 
> That is starting to get a bit too mystical for my tastes.

To paraphrase Abelson & Sussman: Programs must be written for people to 
read, and only incidentally for compilers to check their types.

;)


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: ········@ps.uni-sb.de
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151224844.569347.167420@r2g2000cwb.googlegroups.com>
Marshall wrote:
> >
> > This means that there's a sense in which the language that the
> > programmer programs in is not the same language that has a formal
> > semantic definition.  As I mentioned in another post, programmers are
> > essentially mentally programming in a richer language - a language which
> > has informal (static) types - but the code they write down elides this
> > type information, or else puts it in comments.
> >
> > [...]
> >
> > In this context, the term "latently-typed language" refers to the
> > language that a programmer experiences, not to the subset of that
> > language which is all that we're typically able to formally define.

That language is not a subset, if at all, it's the other way round, but
I'd say they are rather incomparable. That is, they are different
languages.

> That is starting to get a bit too mystical for my tastes.

I have to agree.

\sarcasm One step further, and somebody starts calling C a "latently
memory-safe language", because a real programmer "knows" that his code
is in a safe subset... And where he is wrong, dynamic memory page
protection checks will guide him.

- Andreas
From: Anton van Straaten
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <elAng.56441$Lm5.36771@newssvr12.news.prodigy.com>
········@ps.uni-sb.de wrote:
>>>In this context, the term "latently-typed language" refers to the
>>>language that a programmer experiences, not to the subset of that
>>>language which is all that we're typically able to formally define.
> 
> 
> That language is not a subset, if at all, it's the other way round, but
> I'd say they are rather incomparable. That is, they are different
> languages.

The "subset" characterization is not important for what I'm saying.  The 
fact that they are different languages is what's important.  If you 
agree about that, then you can at least understand which language I'm 
referring to when I say "latently-typed language".

Besides, many dynamically-typed languages have no formal models, in 
which case the untyped formal model I've referred to is just a 
speculative construct.  The language I'm referring to with 
"latently-typed language" is the language that programmers are familiar 
with, and work with.

>>That is starting to get a bit too mystical for my tastes.
> 
> 
> I have to agree.
> 
> \sarcasm One step further, and somebody starts calling C a "latently
> memory-safe language", because a real programmer "knows" that his code
> is in a safe subset... And where he is wrong, dynamic memory page
> protection checks will guide him.

That's a pretty apt comparison, and it probably explains how it is that 
the software we all use, which relies so heavily on C, works as well as 
it does.

But the comparison critiques the practice of operating without static 
guarantees, it's not a critique of the terminology.

Anton
From: John Thingstad
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <op.tbqokem8pqzri1@pandora.upc.no>
On Sun, 25 Jun 2006 20:11:22 +0200, Anton van Straaten  
<·····@appsolutions.com> wrote:

> ········@ps.uni-sb.de wrote:
>>>> In this context, the term "latently-typed language" refers to the
>>>> language that a programmer experiences, not to the subset of that
>>>> language which is all that we're typically able to formally define.
>>   That language is not a subset, if at all, it's the other way round,  
>> but
>> I'd say they are rather incomparable. That is, they are different
>> languages.
>
> The "subset" characterization is not important for what I'm saying.  The  
> fact that they are different languages is what's important.  If you  
> agree about that, then you can at least understand which language I'm  
> referring to when I say "latently-typed language".
>
> Besides, many dynamically-typed languages have no formal models, in  
> which case the untyped formal model I've referred to is just a  
> speculative construct.  The language I'm referring to with  
> "latently-typed language" is the language that programmers are familiar  
> with, and work with.
>
>>> That is starting to get a bit too mystical for my tastes.
>>   I have to agree.
>>  \sarcasm One step further, and somebody starts calling C a "latently
>> memory-safe language", because a real programmer "knows" that his code
>> is in a safe subset... And where he is wrong, dynamic memory page
>> protection checks will guide him.
>
> That's a pretty apt comparison, and it probably explains how it is that  
> the software we all use, which relies so heavily on C, works as well as  
> it does.
>
> But the comparison critiques the practice of operating without static  
> guarantees, it's not a critique of the terminology.
>
> Anton

Actually, I have never developed a C/C++ program
without a bounds checker in the last 15 years.
It checks all memory references and, on program shutdown,
checks for memory leaks. What is it about you guys that makes you blind
to these facts? Allocation problems haven't really bugged me at all
since forever. Now, debugging flaky templates, on the other hand...
But then, debugging Lisp macros isn't much better.

-- 
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
From: Anton van Straaten
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <HdLng.159803$F_3.28531@newssvr29.news.prodigy.net>
John Thingstad wrote:
> On Sun, 25 Jun 2006 20:11:22 +0200, Anton van Straaten  
> <·····@appsolutions.com> wrote:
> 
>> ········@ps.uni-sb.de wrote:
...
>>>  \sarcasm One step further, and somebody starts calling C a "latently
>>> memory-safe language", because a real programmer "knows" that his code
>>> is in a safe subset... And where he is wrong, dynamic memory page
>>> protection checks will guide him.
>>
>>
>> That's a pretty apt comparison, and it probably explains how it is 
>> that  the software we all use, which relies so heavily on C, works as 
>> well as  it does.
>>
>> But the comparison critiques the practice of operating without static  
>> guarantees, it's not a critique of the terminology.
>>
>> Anton
> 
> 
> Actually, I have never developed a C/C++ program
> without a bounds checker in the last 15 years.
> It checks all memory references and, on program shutdown,
> checks for memory leaks. What is it about you guys that makes you blind
> to these facts? 

You misunderstand -- for the purposes of the above comparison, a bounds 
checker serves essentially the same purpose as "dynamic memory page 
protection checks".  The point is that it happens dynamically, i.e. at 
runtime, and that there's a lack of static guarantees about memory 
safety in C or C++.  That's why, as I said, the comparison to latent vs. 
static typing is an apt one.

Anton
From: David Hopwood
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <vajng.477125$xt.438414@fe3.news.blueyonder.co.uk>
Anton van Straaten wrote:
> David Hopwood wrote:
> 
>> I can accept that dynamic tagging provides some support for latent typing
>> performed "in the programmer's head". But that still does not mean that
>> dynamic tagging is the same thing as latent typing
> 
> No, I'm not saying it is, although I am saying that the former supports
> the latter.

But since the relevant feature that the languages in question possess is
dynamic tagging, it is more precise and accurate to use that term to
describe them.

Also, dynamic tagging is only a minor help in this respect, as evidenced
by the fact that explicit tag tests are quite rarely used by most programs,
if I'm not mistaken. IMHO, the support does not go far enough for it to be
considered a defining characteristic of these languages.

When tag tests are used implicitly by other language features such as
pattern matching and dynamic dispatch, they are used for purposes that are
equally applicable to statically typed and non-(statically-typed) languages.

>> or that languages
>> that use dynamic tagging are "latently typed". This simply is not a
>> property of the language (as you've already conceded).
> 
> Right.  I see at least two issues here: one is that as a matter of
> shorthand, compressing "language which supports latent typing" to
> "latently-typed language" ought to be fine, as long as the term's
> meaning is understood.

If, for the sake of argument, "language which supports latent typing" is
to be compressed to "latently-typed language", then statically typed
languages must be considered also latently typed.

After all, statically typed languages support expression and
verification of the "types in the programmer's head" at least as well
as non-(statically-typed) languages do. In particular, most recent
statically typed OO languages use dynamic tagging and are memory safe.
And they support comments ;-)

This is not, quite obviously, what most people mean when they say
that a particular *language* is "latently typed". They almost always
mean that the language is dynamically tagged, *not* statically typed,
and memory safe. That is how this term is used in R5RS, for example.

> But beyond that, there's an issue here about the definition of "the
> language".  When programming in a latently-typed language, a lot of
> action goes on outside the language - reasoning about static properties
> of programs that are not captured by the semantics of the language.

This is true of programming in any language.

> This means that there's a sense in which the language that the
> programmer programs in is not the same language that has a formal
> semantic definition.  As I mentioned in another post, programmers are
> essentially mentally programming in a richer language - a language which
> has informal (static) types - but the code they write down elides this
> type information, or else puts it in comments.

If you consider stuff that might be in the programmer's head as part
of the program, where do you stop? When I maintain a program written
by someone I've never met, I have no idea what was in that programmer's
head. All I have is comments, which may be (and frequently are,
unfortunately) inaccurate.

(Actually, it's worse than that -- if I come back to a program 5 years
later, I probably have little idea what was in my head at the time I
wrote it.)

> We have to accept, then, that the formal semantic definitions of
> dynamically-checked languages are incomplete in some important ways.
> Referring to those semantic definitions as "the language", as though
> that's all there is to the language in a broader sense, is misleading.

Bah, humbug. The language is just the language.

> In this context, the term "latently-typed language" refers to the
> language that a programmer experiences, not to the subset of that
> language which is all that we're typically able to formally define.

I'm with Marshall -- this is way too mystical for me.

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Anton van Straaten
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <h4rng.108670$H71.33586@newssvr13.news.prodigy.com>
David Hopwood wrote:
> But since the relevant feature that the languages in question possess is
> dynamic tagging, it is more precise and accurate to use that term to
> describe them.

So you're proposing to call them dynamically-tagged languages?

> Also, dynamic tagging is only a minor help in this respect, as evidenced
> by the fact that explicit tag tests are quite rarely used by most programs,
> if I'm not mistaken. 

It sounds as though you're not considering the language implementations 
themselves, where tag tests occur all the time - potentially on every 
operation.  That's how "type errors" get detected.  This is what I'm 
referring to when I say that dynamic tags support latent types.

Tags are absolutely crucial for that purpose: without them, you have a 
language similar to untyped lambda calculus, where "latent type errors" 
can result in very difficult to debug errors, since execution can 
continue past errors and produce completely uninterpretable results.
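(A minimal Python sketch of that implicit, per-operation tag test — the function name is invented for illustration:)

```python
def checked_car(v):
    # The implementation-level tag test, performed on every such operation:
    if not (isinstance(v, tuple) and len(v) == 2):
        raise TypeError("car: expected a pair, got %s" % type(v).__name__)
    return v[0]

# Without the tag test (as in untyped lambda calculus), execution would
# continue past the misuse and produce uninterpretable results downstream.
```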

> IMHO, the support does not go far enough for it to be
> considered a defining characteristic of these languages.

Since tag checking is an implicit feature of language implementations 
and the language semantics, it certainly qualifies as a defining 
characteristic.

> When tag tests are used implicitly by other language features such as
> pattern matching and dynamic dispatch, they are used for purposes that are
> equally applicable to statically typed and non-(statically-typed) languages.

A fully statically-typed language doesn't have to do tag checks to 
detect static type errors.

Latently-typed languages do tag checks to detect latent type errors.

You can take the preceding two sentences as a summary definition for 
"latently-typed language", which will come in handy below.

>>>or that languages
>>>that use dynamic tagging are "latently typed". This simply is not a
>>>property of the language (as you've already conceded).
>>
>>Right.  I see at least two issues here: one is that as a matter of
>>shorthand, compressing "language which supports latent typing" to
>>"latently-typed language" ought to be fine, as long as the term's
>>meaning is understood.
> 
> 
> If, for the sake of argument, "language which supports latent typing" is
> to be compressed to "latently-typed language", then statically typed
> languages must be considered also latently typed.

See definition above.  The phrase "language which supports latent 
typing" wasn't intended to be a complete definition.

> After all, statically typed languages support expression and
> verification of the "types in the programmer's head" at least as well
> as non-(statically-typed) languages do. In particular, most recent
> statically typed OO languages use dynamic tagging and are memory safe.
> And they support comments ;-)

But they don't use tag checks to validate their static types.

When statically-typed languages *do* use tags, in cases where the static 
type system isn't sufficient to avoid them, then indeed, those parts of 
the program use latent types, in the exact same sense as more fully 
latently-typed languages do.  There's no conflict here, it's simply the 
case that most statically-typed languages aren't fully statically typed.

> This is not, quite obviously, what most people mean when they say
> that a particular *language* is "latently typed". They almost always
> mean that the language is dynamically tagged, *not* statically typed,
> and memory safe. That is how this term is used in R5RS, for example.

The R5RS definition is compatible with what I've just described, because 
the parts of a statically-typed language that would be considered 
latently-typed are precisely those which rely on dynamic tags.

>>But beyond that, there's an issue here about the definition of "the
>>language".  When programming in a latently-typed language, a lot of
>>action goes on outside the language - reasoning about static properties
>>of programs that are not captured by the semantics of the language.
> 
> 
> This is true of programming in any language.

Right, but when you compare a statically-typed language to an untyped 
language at the formal level, a great deal more static reasoning goes on 
outside the language in the untyped case.

What I'm saying is that it makes no sense, in most realistic contexts, 
to think of untyped languages as being just that: languages in which the 
type of every term is simply a tagged value, as though no static 
knowledge about that value exists.  The formal model requires that you 
do this, but programmers can't function if that's all the static 
information they have.  This isn't true in the case of a fully 
statically-typed language.

>>This means that there's a sense in which the language that the
>>programmer programs in is not the same language that has a formal
>>semantic definition.  As I mentioned in another post, programmers are
>>essentially mentally programming in a richer language - a language which
>>has informal (static) types - but the code they write down elides this
>>type information, or else puts it in comments.
> 
> 
> If you consider stuff that might be in the programmer's head as part
> of the program, where do you stop? When I maintain a program written
> by someone I've never met, I have no idea what was in that programmer's
> head. All I have is comments, which may be (and frequently are,
> unfortunately) inaccurate.

You have to make the same kind of inferences that the original 
programmer made.  E.g. when you see a function that takes some values, 
manipulates them using numeric operators, and returns a value, you have 
to either figure out or trust the comments that the function accepts 
numbers (or integers, floats etc.) and returns a number.

This is not some mystical psychological quality: it's something that 
absolutely must be done in order to be able to reason about a program at 
all.

> (Actually, it's worse than that -- if I come back to a program 5 years
> later, I probably have little idea what was in my head at the time I
> wrote it.)

But that's simply the reality - you have to reconstruct: recover the 
latent types, IOW.  There's no way around that!

Just to be concrete, I'll resurrect my little Javascript example:

   function timestwo(x) { return x*2 }

Assume I wrote this and distributed it in source form as a library, and 
now you have to maintain it.  How do you know what values to call it 
with, or what kind of values it returns?  What stops you from calling it 
with a string?

Note that according to the formal semantics of an untyped language, the 
type of that function is "value -> value".
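As a hypothetical continuation of the example, this is what actually happens in JavaScript when the latent type is violated — no error is signalled, the bogus value just flows on:

```javascript
function timestwo(x) { return x * 2 }

timestwo(21);     // 42, as intended
timestwo("21");   // also 42: the string is silently coerced to a number
timestwo("abc");  // NaN: no error is raised, the result is uninterpretable
```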

The phrase "in the programmer's head" was supposed to help communicate 
the concept.  Don't get hung up on it.

>>We have to accept, then, that the formal semantic definitions of
>>dynamically-checked languages are incomplete in some important ways.
>>Referring to those semantic definitions as "the language", as though
>>that's all there is to the language in a broader sense, is misleading.
> 
> 
> Bah, humbug. The language is just the language.

If you want to have this discussion precisely, you have to do better 
than that.  There is an enormous difference between the formal semantics 
of an untyped language, and the (same) language which programmers work 
with and think of as "dynamically typed".

Use whatever terms you like to characterize this distinction, but you 
can't deny that the distinction exists, and that it's quite a big, 
important one.

>>In this context, the term "latently-typed language" refers to the
>>language that a programmer experiences, not to the subset of that
>>language which is all that we're typically able to formally define.
> 
> 
> I'm with Marshall -- this is way too mystical for me.

I said more in my reply to Marshall.  Perhaps I'm not communicating 
well.  At worst, what I'm talking about is informal.  It's easy to prove 
that you can't reason about a program if you don't know what types of 
values a function accepts or returns - which is what an untyped formal 
model gives you.

Anton
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7m0th$vgu$1@online.de>
Anton van Straaten schrieb:
> Marshall wrote:
>> Can you be more explicit about what "latent types" means?
> 
> Sorry, that was a huge omission.  (What I get for posting at 3:30am.)
> 
> The short answer is that I'm most directly referring to "the types in 
> the programmer's head".

Ah, finally that terminology starts to make sense to me. I have been 
wondering whether there's any useful difference between "latent" and 
"run-time" typing. (I tend to avoid the term "dynamic typing" because 
it's overloaded with too many vague ideas.)

> there are usually many possible static 
> type schemes that can be assigned to a given program.

This seems to apply to latent types as well.

Actually the set of latent types seems to be the set of possible static 
type schemes.
Um, well, a superset of these - static type schemes tend to be slightly 
less expressive than what the programmer has in his head. (Most type schemes 
cannot really express things like "the range of this index variable is 
such-and-so", and the boundary to general assertions about the code is 
quite blurry anyway.)

> There's a close connection between latent types in the sense I've 
> described, and the "tagged values" present at runtime.  However, as type 
> theorists will tell you, the tags used to tag values at runtime, as e.g. 
> a number or a string or a FooBar object, are not the same thing as the 
> sort of types which statically-typed languages have.

Would that be a profound difference, or is it just that annotating a 
value with a full type expression would cause just too much runtime 
overhead?

In your terminology:

> So, where do tagged values fit into this?  Tags help to check types at 
> runtime, but that doesn't mean that there's a 1:1 correspondence between 
> tags and types.

Would it be possible to establish such a correspondence, would it be 
common consensus that such a system should be called "tags" anymore, or 
are there other, even more profound differences?

Regards,
Jo
From: Anton van Straaten
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <0FLng.159806$F_3.1840@newssvr29.news.prodigy.net>
Joachim Durchholz wrote:
> Anton van Straaten schrieb:
> 
>> Marshall wrote:
>>
>>> Can you be more explicit about what "latent types" means?
>>
>>
>> Sorry, that was a huge omission.  (What I get for posting at 3:30am.)
>>
>> The short answer is that I'm most directly referring to "the types in 
>> the programmer's head".
> 
> 
> Ah, finally that terminology starts to make sense to me. I have been 
> wondering whether there's any useful difference between "latent" and 
> "run-time" typing. (I tend to avoid the term "dynamic typing" because 
> it's overloaded with too many vague ideas.)

Right, that's the reason for using a different term.  Runtime and 
dynamic types are both usually associated with the idea of tagged 
values, and don't address the types of program phrases.

>> there are usually many possible static type schemes that can be 
>> assigned to a given program.
> 
> 
> This seems to apply to latent types as well.

That's what I meant, yes.

> Actually the set of latent types seems to be the set of possible static 
> type schemes.
> Um, well, a superset of these - static type schemes tend to be slightly 
> less expressive than what the programmer has in his head. (Most type schemes 
> cannot really express things like "the range of this index variable is 
> such-and-so", and the boundary to general assertions about the code is 
> quite blurry anyway.)

Yes, although this raises the type theory objection of how you know 
something is a type if you haven't formalized it.  For the purposes of 
comparison to static types, I'm inclined to be conservative and stick to 
things that have close correlates in traditional static types.

>> There's a close connection between latent types in the sense I've 
>> described, and the "tagged values" present at runtime.  However, as 
>> type theorists will tell you, the tags used to tag values at runtime, 
>> as e.g. a number or a string or a FooBar object, are not the same 
>> thing as the sort of types which statically-typed languages have.
> 
> 
> Would that be a profound difference, or is it just that annotating a 
> value with a full type expression would cause just too much runtime 
> overhead?

It's a profound difference.  The issue is that it's not just the values 
that need to be annotated with types, it's also other program terms.  In 
addition, during a single run of a program, all it can ever normally do 
is record the types seen on the path followed during that run, which 
doesn't get you to static types of terms.  To figure out the static 
types, you really need to do static analysis.

Of course, static types give an approximation of the actual types of 
values that flow at runtime, and if you come at it from the other 
direction, recording the types of values flowing through terms on 
multiple runs, you can get an approximation to the static type.  Some 
systems use that kind of thing for method lookup optimization.
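A toy sketch of that "other direction" (with made-up names, purely illustrative) records the tags seen flowing through one term over several calls:

```javascript
// Toy sketch: record the runtime tags flowing through one term. The
// set only ever reflects the paths actually taken, so it approximates
// the term's static type from below.
const observedTags = new Set();

function double(x) {
  observedTags.add(typeof x);  // instrumentation on the term 'x'
  return x * 2;
}

double(3);
double(4.5);
// observedTags now holds just "number" -- these runs never revealed
// that some caller might also pass a string.
```

Profile-guided method-lookup optimizations such as polymorphic inline caches rely on exactly this kind of per-site tag history.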

> In your terminology:
> 
>> So, where do tagged values fit into this?  Tags help to check types at 
>> runtime, but that doesn't mean that there's a 1:1 correspondence 
>> between tags and types.
> 
> 
> Would it be possible to establish such a correspondence, would it be 
> common consensus that such a system should be called "tags" anymore, or 
> are there other, even more profound differences?

There's a non 1:1 correspondence which I gave an example of in the 
quoted message.  For 1:1 correspondence, you'd need to associate types 
with terms, which would need some static analysis.  It might result in 
an interesting program browsing and debugging tool...

Anton
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7pkh8$hhd$1@online.de>
Anton van Straaten schrieb:
> Joachim Durchholz wrote:
>> Anton van Straaten schrieb:
>>> There's a close connection between latent types in the sense I've 
>>> described, and the "tagged values" present at runtime.  However, as 
>>> type theorists will tell you, the tags used to tag values at runtime, 
>>> as e.g. a number or a string or a FooBar object, are not the same 
>>> thing as the sort of types which statically-typed languages have.
>>
>> Would that be a profound difference, or is it just that annotating a 
>> value with a full type expression would cause just too much runtime 
>> overhead?
> 
> It's a profound difference.  The issue is that it's not just the values 
> that need to be annotated with types, it's also other program terms.

Yes - but isn't that essentially just auxiliary data from and for the 
data-flow analysis that tracks what values with what types might reach 
which functions?

> In
> addition, during a single run of a program, all it can ever normally do 
> is record the types seen on the path followed during that run, which 
> doesn't get you to static types of terms.  To figure out the static 
> types, you really need to do static analysis.

Agreed.

Regards,
Jo
From: Anton van Straaten
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <VYGmg.27606$VE1.2542@newssvr14.news.prodigy.com>
Andreas Rossberg wrote:
> Rob Warnock wrote:
> 
>>
>> Here's what the Scheme Standard has to say:
>>
>>     http://www.schemers.org/Documents/Standards/R5RS/HTML/r5rs-Z-H-4.html
>>     1.1  Semantics
>>     ...
>>     Scheme has latent as opposed to manifest types. Types are assoc-
>>     iated with values (also called objects) rather than with variables.
>>     (Some authors refer to languages with latent types as weakly typed
>>     or dynamically typed languages.) Other languages with latent types
>>     are APL, Snobol, and other dialects of Lisp. Languages with manifest
>>     types (sometimes referred to as strongly typed or statically typed
>>     languages) include Algol 60, Pascal, and C.
> 
> 
> Maybe this is the original source of the myth that static typing is all 
> about assigning types to variables...
> 
> With all my respect to the Scheme people, I'm afraid this paragraph is 
> pretty off, no matter where you stand. Besides the issue just mentioned 
> it equates "manifest" with static types. I understand "manifest" to mean 
> "explicit in code", which of course is nonsense - static typing does not 
> require explicit types. Also, I never heard "weakly typed" used in the 
> way they suggest - in my book, C is a weakly typed language (= typed, 
> but grossly unsound).

That text goes back at least 20 years, to R3RS in 1986, and possibly 
earlier, so part of what it represents is simply changing use of 
terminology, combined with an attempt to put Scheme in context relative 
to multiple languages and terminology usages.  The fact that we're still 
discussing this now, and haven't settled on terminology acceptable to 
all sides, illustrates the problem.

The Scheme report doesn't actually say anything about how latent types 
relate to static types, in the way that I've recently characterized the 
relationship here.  However, I thought this was a good place to explain 
how I got to where I am on this subject.

There's been a lot of work done on soft type inference in Scheme, and 
other kinds of type analysis.  The Scheme report talks about latent 
types as meaning "types are associated with values".  While this might 
cause some teeth-grinding amongst type theorists, it's meaningful enough 
when you look at simple cases: cases where the type of a term is an 
exact match for the type tags of the values that flow through that term.

When you get to more complex cases, though, most type inferencers for 
Scheme assign traditional static-style types to terms.  If you think 
about this in conjunction with the term "latent types", it's an obvious 
connection to make that what the inferencer is doing is recovering types 
that are latent in the source.
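As an illustration only — a made-up toy, nothing like a real soft typer for Scheme — an inferencer can read a static-style type off a term from the primitives it uses:

```javascript
// Toy "inferencer": assign a static-style type to a tiny expression
// node purely from the operator it uses. (Invented for illustration;
// real soft type inferencers are far more sophisticated.)
function infer(expr) {
  switch (expr.kind) {
    case "num": return "number";
    case "str": return "string";
    case "mul": return "number";  // '*' forces a numeric result
    default:    return "unknown"; // no constraint discovered yet
  }
}

// For the body of timestwo, (x * 2), the inferencer reports "number":
// a type that was latent in the source but never written down.
infer({ kind: "mul" });
```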

Once that connection is made, it's obvious that the tags associated with 
values are not the whole story: that the conformance of one or more 
values to a "latent type" may be checked by a series of tag checks, in 
different parts of a program (i.e. before, during and after the 
expression in question is evaluated).  I gave a more detailed 
description of how latent types relate to tags in an earlier reply to 
Marshall (Spight, not Joe).

Anton
From: David Hopwood
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <Wz0ng.458936$tc.15681@fe2.news.blueyonder.co.uk>
Anton van Straaten wrote:
> Andreas Rossberg wrote:
>> Rob Warnock wrote:
>>
>>> http://www.schemers.org/Documents/Standards/R5RS/HTML/r5rs-Z-H-4.html
>>>     1.1  Semantics
>>>     ...
>>>     Scheme has latent as opposed to manifest types. Types are assoc-
>>>     iated with values (also called objects) rather than with variables.
>>>     (Some authors refer to languages with latent types as weakly typed
>>>     or dynamically typed languages.) Other languages with latent types
>>>     are APL, Snobol, and other dialects of Lisp. Languages with manifest
>>>     types (sometimes referred to as strongly typed or statically typed
>>>     languages) include Algol 60, Pascal, and C.
>>
[...]
> 
> That text goes back at least 20 years, to R3RS in 1986, and possibly
> earlier, so part of what it represents is simply changing use of
> terminology, combined with an attempt to put Scheme in context relative
> to multiple languages and terminology usages.  The fact that we're still
> discussing this now, and haven't settled on terminology acceptable to
> all sides, illustrates the problem.
> 
> The Scheme report doesn't actually say anything about how latent types
> relate to static types, in the way that I've recently characterized the
> relationship here.  However, I thought this was a good place to explain
> how I got to where I am on this subject.
> 
> There's been a lot of work done on soft type inference in Scheme, and
> other kinds of type analysis.  The Scheme report talks about latent
> types as meaning "types are associated with values".  While this might
> cause some teeth-grinding amongst type theorists, it's meaningful enough
> when you look at simple cases: cases where the type of a term is an
> exact match for the type tags of the values that flow through that term.
> 
> When you get to more complex cases, though, most type inferencers for
> Scheme assign traditional static-style types to terms.  If you think
> about this in conjunction with the term "latent types", it's an obvious
> connection to make that what the inferencer is doing is recovering types
> that are latent in the source.

But these types are not part of the Scheme language. If you combine Scheme
with a type inferencer, you get a new language that is not R*RS Scheme,
and *that* language is typed.

Note that different inferencers will give different type assignments.
They may be similar, but they may also be quite dissimilar in some cases.
This casts considerable doubt on the assertion that the inferencer is
"recovering types latent in the source".

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Anton van Straaten
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <oueng.157315$F_3.111244@newssvr29.news.prodigy.net>
David Hopwood wrote:
> Anton van Straaten wrote:
...
>>When you get to more complex cases, though, most type inferencers for
>>Scheme assign traditional static-style types to terms.  If you think
>>about this in conjunction with the term "latent types", it's an obvious
>>connection to make that what the inferencer is doing is recovering types
>>that are latent in the source.
> 
> 
> But these types are not part of the Scheme language. If you combine Scheme
> with a type inferencer, you get a new language that is not R*RS Scheme,
> and *that* language is typed.

Sure.  So one obvious question, as I've just observed in another reply 
to you, is which language programmers actually program in.  I'd say that 
they certainly don't program in the completely untyped language as 
defined by RnRS.

> Note that different inferencers will give different type assignments.
> They may be similar, but they may also be quite dissimilar in some cases.
> This casts considerable doubt on the assertion that the inferencer is
> "recovering types latent in the source".

I mentioned this earlier, in a reply to Marshall where I gave an 
informal definition of latent typing, which read in part: "Terms in the 
program may be considered as having static types, and it is possible to 
infer those types, but it isn't necessarily easy to do so automatically, 
and there are usually many possible static type schemes that can be 
assigned to a given program."

As far as the term "recovering" goes, that perhaps shouldn't be taken 
too literally.  It's clearly not the case that a latently-typed program 
has a single formalizable type scheme which was put there deliberately 
by the programmer.  But programmers do reason about things like the 
types of functions and expressions, and the goal of soft type 
inferencers is to find an approximation to what the programmer intended.

Anton
From: Chris Uppal
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <44992e6c$1$664$bed64819@news.gradwell.net>
Anton van Straaten wrote:

> But a program as seen by the programmer has types: the programmer
> performs (static) type inference when reasoning about the program, and
> debugs those inferences when debugging the program, finally ending up
> with a program which has a perfectly good type scheme.  It may be 
> messy compared to, say, an HM type scheme, and it's usually not proved to 
> be perfect, but that again is an orthogonal issue.

I like this way of looking at it.

    -- chris
From: Vesa Karvonen
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7bqms$pb8$1@oravannahka.helsinki.fi>
In comp.lang.functional Anton van Straaten <·····@appsolutions.com> wrote:
[...]

This static vs dynamic type thing reminds me of one article written by
Bjarne Stroustrup where he notes that "Object-Oriented" has become a
synonym for "good".  More precisely, it seems to me that both camps
(static & dynamic) think that "typed" is a synonym for having
"well-defined semantics" or being "safe" and therefore feel the need
to be able to speak of their language as "typed" whether or not it
makes sense.

> Let me add another complex subtlety, then: the above description misses 
> an important point, which is that *automated* type checking is not the 
> whole story.  I.e. that compile time/runtime distinction is a kind of 
> red herring.

I agree.  I think that instead of "statically typed" we should say
"typed" and instead of "(dynamically|latently) typed" we should say
"untyped".

> In a statically-checked language, people tend to confuse automated 
> static checking with the existence of types, because they're thinking in 
> a strictly formal sense: they're restricting their world view to what 
> they see "within" the language.

That is not unreasonable.  You see, you can't have types unless you
have a type system.  Types without a type system are like answers
without questions - it just doesn't make any sense.

> Then they look at programs in a dynamically-checked language, and see 
> checks happening at runtime, and they assume that this means that the 
> program is "untyped".

Not in my experience.  Either a *language* specifies a type system or
not.  There is little room for confusion.  Well, at least unless you
equate "typing" with being "well-defined" or "safe" and go to great
lengths to convince yourself that your program has "latent types" even
without specifying a type system.

> It's certainly close enough to say that the *language* is untyped.

Indeed.  Either a language has a type system and is typed or has no
type system and is untyped.  I see very little room for confusion
here.  In my experience, the people who confuse these things are
people from the dynamic/latent camp who wish to see types everywhere
because they confuse typing with safety or having well-defined
semantics.

> But a program as seen by the programmer has types: the programmer 
> performs (static) type inference when reasoning about the program, and 
> debugs those inferences when debugging the program, finally ending up 
> with a program which has a perfectly good type scheme.  It may be 
> messy compared to, say, an HM type scheme, and it's usually not proved to 
> be perfect, but that again is an orthogonal issue.

There is a huge hole in your argument above.  Types really do not make
sense without a type system.  To claim that a program has a type
scheme, you must first specify the type system.  Otherwise it just
doesn't make any sense.

> Mathematicians operated for thousands of years without automated 
> checking of proofs, so you can't argue that because a 
> dynamically-checked program hasn't had its type scheme proved correct, 
> that it somehow doesn't have types.  That would be a bit like arguing 
> that we didn't have Math until automated theorem provers came along.

No - not at all.  First of all, mathematics has matured quite a bit
since the early days.  I'm sure you've heard of the axiomatic method.
However, what you are missing is that to prove that your program has
types, you first need to specify a type system.  Similarly, to prove
something in math you start by specifying [fill in the rest].

> 1. "Untyped" is really quite a misleading term, unless you're talking 
> about something like the untyped lambda calculus.  That, I will agree, 
> can reasonably be called untyped.

Untyped is not misleading.  "Typed" is not a synonym for "safe" or
"having well-defined semantics".

> So, will y'all just switch from using "dynamically typed" to "latently 
> typed"

I won't (use "latently typed").  At least not without further
qualification.

-Vesa Karvonen
From: Rob Thorpe
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150908952.804184.218990@r2g2000cwb.googlegroups.com>
Vesa Karvonen wrote:
> In comp.lang.functional Anton van Straaten <·····@appsolutions.com> wrote:
> > Let me add another complex subtlety, then: the above description misses
> > an important point, which is that *automated* type checking is not the
> > whole story.  I.e. that compile time/runtime distinction is a kind of
> > red herring.
>
> I agree.  I think that instead of "statically typed" we should say
> "typed" and instead of "(dynamically|latently) typed" we should say
> "untyped".
>
> > In a statically-checked language, people tend to confuse automated
> > static checking with the existence of types, because they're thinking in
> > a strictly formal sense: they're restricting their world view to what
> > they see "within" the language.
>
> That is not unreasonable.  You see, you can't have types unless you
> have a type system.  Types without a type system are like answers
> without questions - it just doesn't make any sense.
>
> > Then they look at programs in a dynamically-checked language, and see
> > checks happening at runtime, and they assume that this means that the
> > program is "untyped".
>
> Not in my experience.  Either a *language* specifies a type system or
> not.  There is little room for confusion.  Well, at least unless you
> equate "typing" with being "well-defined" or "safe" and go to great
> lengths to convince yourself that your program has "latent types" even
> without specifying a type system.

The question is: What do you mean by "type system"?
Scheme and Lisp both define clearly in their specifications how types
work; other languages may do so too, I don't know.
Of course, you may not consider that a type system if you take "type
system" to mean a static type system.

> > It's certainly close enough to say that the *language* is untyped.
>
> Indeed.  Either a language has a type system and is typed or has no
> type system and is untyped.  I see very little room for confusion
> here.  In my experience, the people who confuse these things are
> people from the dynamic/latent camp who wish to see types everywhere
> because they confuse typing with safety or having well-defined
> semantics.

No.  It's because the things that we call latent types are used for the
same purposes that programmers of statically typed languages use static
types for.

Statically typed programmers ensure that the value of some expression
is of some type by having the compiler check it.  Programmers of
latently typed languages check, if they think it's important, by asking
what the type of the result is.
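In JavaScript terms — a minimal hypothetical example, with an invented function name — that check is just a question asked of the value itself:

```javascript
// A programmer in a latently typed language checks a type when they
// think it matters by interrogating the value, not via a compiler.
function half(x) { return x / 2; }

const result = half(84);
if (typeof result !== "number") {
  throw new TypeError("half should have returned a number");
}
```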

The objection here is that advocates of statically typed languages seem
to be claiming the word "type" as their own, and asking that others use
their definitions of typing, which are really specific to their
subjects of interest.  This doesn't help advocates of statically typed
languages, latently typed languages, or anyone else.  It doesn't help
because no-one else is likely to change their use of terms; there's no
reason why they would.  All that may happen is that users of statically
typed languages change the words they use.  This would confuse me, for
one.  I would much rather understand what ML programmers, for example,
are saying, and that's hard enough as it is.

There's also my other objection, if you consider latently typed
languages untyped, then what is assembly?
From: Rob Thorpe
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150991074.705803.211570@m73g2000cwd.googlegroups.com>
David Hopwood wrote:
> Rob Thorpe wrote:
> > Vesa Karvonen wrote:
> >
> >>In comp.lang.functional Anton van Straaten <·····@appsolutions.com> wrote:
> >>
> >>>Let me add another complex subtlety, then: the above description misses
> >>>an important point, which is that *automated* type checking is not the
> >>>whole story.  I.e. that compile time/runtime distinction is a kind of
> >>>red herring.
> >>
> >>I agree.  I think that instead of "statically typed" we should say
> >>"typed" and instead of "(dynamically|latently) typed" we should say
> >>"untyped".
> [...]
> >>>It's certainly close enough to say that the *language* is untyped.
> >>
> >>Indeed.  Either a language has a type system and is typed or has no
> >>type system and is untyped.  I see very little room for confusion
> >>here.  In my experience, the people who confuse these things are
> >>people from the dynamic/latent camp who wish to see types everywhere
> >>because they confuse typing with safety or having well-defined
> >>semantics.
> >
> > No.  It's because the things that we call latent types we use for the
> > same purpose that programmers of static typed languages use static
> > types for.
> >
> > Statically typed programmers ensure that the value of some expression
> > is of some type by having the compiler check it.  Programmers of
> > latently typed languages check, if they think it's important, by asking
> > what the type of the result is.
> >
> > The objection here is that advocates of statically typed language seem
> > to be claiming the "type" as their own word, and asking that others use
> > their definitions of typing, which are really specific to their
> > subjects of interest.
>
> As far as I can tell, the people who advocate using "typed" and "untyped"
> in this way are people who just want to be able to discuss all languages in
> a unified terminological framework, and many of them are specifically not
> advocates of statically typed languages.

It's easy to create a reasonable framework.  My earlier posts show
simple ways of looking at it that could be further refined; I'm sure
there are others who have already done this.

The real objection to this was that latently/dynamically typed
languages have a place in it.  But some advocates of statically typed
languages wish to lump these languages together with assembly language
as "untyped" in an attempt to label them as unsafe.
From: Andreas Rossberg
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7ef53$8r8i0$6@hades.rz.uni-saarland.de>
Rob Thorpe wrote:
> 
> It's easy to create a reasonable framework.

Luca Cardelli has given the most convincing one in his seminal tutorial 
"Type Systems", where he identifies "typed" and "safe" as two orthogonal 
dimensions and gives the following matrix:

           | typed | untyped
    -------+-------+----------
    safe   | ML    | Lisp
    unsafe | C     | Assembler

Now, jargon "dynamically typed" is simply untyped safe, while "weakly 
typed" is typed unsafe.

> The real objection to this was that latently/dynamically typed
> languages have a place in it.  But some of the advocates of statically
> typed languages wish to lump these languages together with assembly
> language as "untyped" in an attempt to label them as unsafe.

No, see above. And I would assume that that is how most proponents of 
the typed/untyped dichotomy understand it.

- Andreas
From: Rob Thorpe
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151083239.829296.33030@i40g2000cwc.googlegroups.com>
Andreas Rossberg wrote:
> Rob Thorpe wrote:
> >
> > It's easy to create a reasonable framework.
>
> Luca Cardelli has given the most convincing one in his seminal tutorial
> "Type Systems", where he identifies "typed" and "safe" as two orthogonal
> dimensions and gives the following matrix:
>
>            | typed | untyped
>     -------+-------+----------
>     safe   | ML    | Lisp
>     unsafe | C     | Assembler
>
> Now, jargon "dynamically typed" is simply untyped safe, while "weakly
> typed" is typed unsafe.

Consider a language something like BCPL or a fancy assembler, but with
not quite a 1:1 mapping to machine language.

It differs in one key regard: it has a variable declaration command.
This command allows the programmer to allocate a block of memory to a
variable.  If the programmer attempts to index a variable outside the
block of memory allocated to it an error will occur.  Similarly if the
programmer attempts to copy a larger block into a smaller block an
error would occur.

Such a language would be truly untyped and "safe", that is safe
according to many people's use of the word including, I think, yours.

But it differs from latently typed languages like python, perl or lisp.
 In such a language there is no information about the type the variable
stores.  The programmer cannot write code to test it, and so can't
write functions that issue errors if given arguments of the wrong type.
 The programmer must use his or her memory to substitute for that
facility.  As far as I can see this is a significant distinction and
warrants a further category for latently typed languages.


As a sidenote: the programmer may also create a tagging system to allow
the types to be tracked.  But this is not the same as saying the
language supports types, it is akin to the writing of OO programs in C
using structs.
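
The hypothetical language can be caricatured in a few lines of Python (an
illustrative sketch of my own; the names are made up).  Out-of-block
accesses and oversized copies are trapped, but no word carries any type
information:

```python
class Block:
    """A declared variable: a fixed-size run of raw, untyped words."""
    def __init__(self, size):
        self.words = [0] * size              # allocate; every word alike

    def _check(self, i):
        if not 0 <= i < len(self.words):     # bounds check => "safe"
            raise RuntimeError("index outside allocated block")

    def __getitem__(self, i):
        self._check(i)
        return self.words[i]

    def __setitem__(self, i, value):
        self._check(i)
        self.words[i] = value

    def copy_from(self, other):
        # copying a larger block into a smaller one is an error
        if len(other.words) > len(self.words):
            raise RuntimeError("source block larger than destination")
        self.words[:len(other.words)] = other.words

a = Block(4)
a[0] = 42                                    # fine
try:
    a[7]                                     # outside the block: trapped
except RuntimeError:
    pass
# Note what is *missing*: there is no way to ask what "type" a[0] holds.
```

The point of the sketch is the last comment: the language is safe in the
memory sense while offering the programmer no type information at all.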
From: ········@ps.uni-sb.de
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151083931.162325.268050@g10g2000cwb.googlegroups.com>
Rob Thorpe wrote:
>
> But it differs from latently typed languages like python, perl or lisp.
>  In such a language there is no information about the type the variable
> stores.  The programmer cannot write code to test it, and so can't
> write functions that issue errors if given arguments of the wrong type.
>  The programmer must use his or her memory to substitute for that
> facility.  As far as I can see this is a significant distinction and
> warrants a further category for latently typed languages.

Take one of these languages. You have a variable that is supposed to
store functions from int to int. Can you test that a given function
meets this requirement?

You see, IMO the difference is marginal. From my point of view, the
fact that you can do such tests *in some very trivial cases* in the
languages you mention is an artefact, nothing fundamental.
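To make the point concrete in Python (my own illustration, not from the
original exchange): the run-time tag on a function value says only that it
is callable; it says nothing about what it maps to what.

```python
def f(x):
    return x + 1          # int -> int, as it happens

def g(x):
    return str(x)         # int -> string

# The dynamic tag is the same for both: "this is a function".
assert callable(f) and callable(g)
assert type(f) is type(g)

# The only way to learn anything about f's behaviour on ints is to apply
# it -- and any finite number of applications checks only those
# arguments, never the property "maps every int to an int".
assert isinstance(f(3), int)
assert not isinstance(g(3), int)
```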

- Andreas
From: Rob Thorpe
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151086320.564736.272500@y41g2000cwy.googlegroups.com>
········@ps.uni-sb.de wrote:
> Rob Thorpe wrote:
> >
> > But it differs from latently typed languages like python, perl or lisp.
> >  In such a language there is no information about the type the variable
> > stores.  The programmer cannot write code to test it, and so can't
> > write functions that issue errors if given arguments of the wrong type.
> >  The programmer must use his or her memory to substitute for that
> > facility.  As far as I can see this is a significant distinction and
> > warrants a further category for latently typed languages.
>
> Take one of these languages. You have a variable that is supposed to
> store functions from int to int. Can you test that a given function
> meets this requirement?

The answer is no for python and perl AFAIK.  Also no for lisp _au
naturel_ (you can do it by modifying lisp though of course; I believe
you can make it entirely statically typed if you want).

But the analogous criticism could be made of statically typed
languages.
Can I make a type in C that can only have values between 1 and 10?
How about a variable that can only hold odd numbers, or, to make it
more difficult, say fibonacci numbers?

> You see, IMO the difference is marginal. From my point of view, the
> fact that you can do such tests *in some very trivial cases*

They are not very trivial for practical programming purposes, though;
they are important.

> in the
> languages you mention is an artefact, nothing fundamental.

The cases can be far from trivial.  In lisp a type specifier can mean
something like "argument must be a 10x10 matrix with top corner element
larger than 35".  You can also, for example, easily make predicates that
tell if a value is odd, or a fibonacci number.

The system simply extends in different directions.
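
Such predicates are easy to sketch in Python as well (my own example; the
Fibonacci test uses the standard characterisation that n is a Fibonacci
number iff 5n^2+4 or 5n^2-4 is a perfect square):

```python
import math

def is_odd(n):
    return isinstance(n, int) and n % 2 == 1

def is_fib(n):
    """n is a Fibonacci number iff 5n^2+4 or 5n^2-4 is a perfect square."""
    if not isinstance(n, int) or n < 0:
        return False
    for m in (5 * n * n + 4, 5 * n * n - 4):
        if m >= 0 and math.isqrt(m) ** 2 == m:
            return True
    return False

def needs_fib(n):
    """A function that issues an error if given the 'wrong kind' of value."""
    if not is_fib(n):
        raise TypeError("argument must be a Fibonacci number")
    return n

assert is_odd(35) and not is_odd(34)
assert is_fib(13) and not is_fib(14)
```
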
From: ········@ps.uni-sb.de
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151092939.711328.280740@p79g2000cwp.googlegroups.com>
Rob Thorpe wrote:
> >
> > Take one of these languages. You have a variable that is supposed to
> > store functions from int to int. Can you test that a given function
> > meets this requirement?
>
> The answer is no for python and perl AFAIK.  Also no for lisp _au
> naturel_ (you can do it by modifying lisp though of course; I believe
> you can make it entirely statically typed if you want).
>
> But the analogous criticism could be made of statically typed
> languages.
> Can I make a type in C that can only have values between 1 and 10?
> How about a variable that can only hold odd numbers, or, to make it
> more difficult, say fibonacci numbers?

Fair point. However, there are in fact static type systems powerful
enough to express all of this and more (whether they are convenient
enough in practice is a different matter). On the other hand, AFAICS,
it is principally impossible to express the above property on functions
with tags or any other purely dynamic mechanism.
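
A dynamic mechanism can still check each individual application as it
happens.  Here is a Python sketch of such a per-call check (a hypothetical
"contract" wrapper of my own, not something proposed in the thread); note
that it never establishes the quantified property "int to int", only that
no violation has been observed so far:

```python
def int_to_int(f):
    """Wrap f so that every application is checked dynamically."""
    def checked(x):
        if not isinstance(x, int):
            raise TypeError("argument is not an int")
        result = f(x)
        if not isinstance(result, int):
            raise TypeError("result is not an int")
        return result
    return checked

@int_to_int
def double(x):
    return 2 * x

assert double(3) == 6      # this one application was checked and passed

@int_to_int
def bad(x):
    return "oops"          # the violation surfaces only when called

try:
    bad(1)
except TypeError:
    pass
```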

> The cases can be far from trivial.  In lisp a type specifier can mean
> something like "argument must be a 10x10 matrix with top corner element
> larger than 35".  You can also, for example, easily make predicates that
> tell if a value is odd, or a fibonacci number.

Now, these are yet another beast. They are not tags, they are
user-defined predicates. I would not call them types either, but that's
an equally pointless battle. :-)

- Andreas
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151100483.057344.310970@g10g2000cwb.googlegroups.com>
Rob Thorpe wrote:
>
> Can I make a type in C that can only have values between 1 and 10?
> How about a variable that can only hold odd numbers, or, to make it
> more difficult, say fibonacci numbers?

Well, of course you can't in *C*; you can barely zip your pants with C.
But I believe you can do the above in C++, can't you?


Marshall
From: Dr.Ruud
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7i9q9.184.1@news.isolution.nl>
Marshall wrote:
> Rob Thorpe:

>> Can I make a type in C that can only have values between 1 and 10?
>> How about a variable that can only hold odd numbers, or, to make it
>> more difficult, say fibonacci numbers?
>
> Well, of course you can't in *C*; you can barely zip your pants with C.
> But I believe you can do the above in C++, can't you?

You can write self-modifying code in C, so I don't see how you can not
do that in C.
;)

-- 
Affijn, Ruud

"Gewoon is een tijger."
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151115244.951029.56060@g10g2000cwb.googlegroups.com>
Dr.Ruud wrote:
> Marshall wrote:
> > Rob Thorpe:
>
> >> Can I make a type in C that can only have values between 1 and 10?
> >> How about a variable that can only hold odd numbers, or, to make it
> >> more difficult, say fibonacci numbers?
> >
> > Well, of course you can't in *C*; you can barely zip your pants with C.
> > But I believe you can do the above in C++, can't you?
>
> You can write self-modifying code in C, so I don't see how you can not
> do that in C.
> ;)

I stand corrected: if one is using C and writing self-modifying
code, then one *can* zip one's pants.


Marshall
From: Chris Smith
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <MPG.1f0653fc9020495b9896f2@news.altopia.net>
Marshall <···············@gmail.com> wrote:
> I stand corrected: if one is using C and writing self-modifying
> code, then one *can* zip one's pants.

I believe you'd need quite the creative USB-based or similar peripheral 
device, as well.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Joe Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151339115.942653.59210@c74g2000cwc.googlegroups.com>
Marshall wrote:
>
> I stand corrected: if one is using C and writing self-modifying
> code, then one *can* zip one's pants.

Static proofs notwithstanding, I'd prefer a dynamic check just prior to
this operation.

I want my code to be the only self-modifying thing around here.
From: Darren New
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <Bs0ng.16003$Z67.2174@tornado.socal.rr.com>
Dr.Ruud wrote:
> You can write self-modifying code in C,

No, by violating the standards and invoking undefined behavior, you can 
write self-modifying code. I wouldn't say you're still "in C" any more, tho.

-- 
   Darren New / San Diego, CA, USA (PST)
     Native Americans used every part
     of the buffalo, including the wings.
From: David Hopwood
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <HGcng.463183$tc.147087@fe2.news.blueyonder.co.uk>
Dr.Ruud wrote:
> Marshall wrote:
>>Rob Thorpe:
> 
>>>Can I make a type in C that can only have values between 1 and 10?
>>>How about a variable that can only hold odd numbers, or, to make it
>>>more difficult, say fibonacci numbers?
>>
>>Well, of course you can't in *C*; you can barely zip your pants with C.
>>But I believe you can do the above in C++, can't you?
> 
> You can write self-modifying code in C, so I don't see how you can not
> do that in C. ;)

Strictly speaking, you can't write self-modifying code in Standard C.

-- 
David Hopwood <····················@blueyonder.co.uk>
From: David Hopwood
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <2Ecng.463176$tc.227775@fe2.news.blueyonder.co.uk>
Rob Thorpe wrote:
> Andreas Rossberg wrote:
>>Rob Thorpe wrote:
>>
>>>Its easy to create a reasonable framework.
>>
>>Luca Cardelli has given the most convincing one in his seminal tutorial
>>"Type Systems", where he identifies "typed" and "safe" as two orthogonal
>>dimensions and gives the following matrix:
>>
>>           | typed | untyped
>>    -------+-------+----------
>>    safe   | ML    | Lisp
>>    unsafe | C     | Assembler
>>
>>Now, jargon "dynamically typed" is simply untyped safe, while "weakly
>>typed" is typed unsafe.
> 
> Consider a language something like BCPL or a fancy assembler, but with
> not quite a 1:1 mapping to machine language.
> 
> It differs in one key regard: it has a variable declaration command.
> This command allows the programmer to allocate a block of memory to a
> variable.  If the programmer attempts to index a variable outside the
> block of memory allocated to it an error will occur.  Similarly if the
> programmer attempts to copy a larger block into a smaller block an
> error would occur.
> 
> Such a language would be truly untyped and "safe", that is safe
> according to many people's use of the word including, I think, yours.
> 
> But it differs from latently typed languages like python, perl or lisp.
> In such a language there is no information about the type the variable
> stores. The programmer cannot write code to test it, and so can't
> write functions that issue errors if given arguments of the wrong type.

So the hypothetical language, unlike Python, Perl and Lisp, is not
dynamically *tagged*.

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7m2l5$35g$1@online.de>
Andreas Rossberg wrote:
> 
> Luca Cardelli has given the most convincing one in his seminal tutorial 
> "Type Systems", where he identifies "typed" and "safe" as two orthogonal 
> dimensions and gives the following matrix:
> 
>           | typed | untyped
>    -------+-------+----------
>    safe   | ML    | Lisp
>    unsafe | C     | Assembler
> 
> Now, jargon "dynamically typed" is simply untyped safe, while "weakly 
> typed" is typed unsafe.

Here's how most people that I know would fill in the matrix with terminology:

             | Statically   | Not         |
             | typed        | statically  |
             |              | typed       |
    ---------+--------------+-------------+
    typesafe | "strongly    | Dynamically |
             | typed"       | typed       |
             | (ML, Pascal) | (Lisp)      |
    ---------+--------------+-------------+
    not      | (no common   | "untyped"   |
    typesafe | terminology) |             |
             | (C)          | (Assembly)  |
    ---------+--------------+-------------+

(Terms in quotes are challenged on a regular basis, or rarely if ever 
applied.)

With the above terminology, it becomes clear that the opposite of 
"(statically) typed" isn't "statically untyped", but "not statically 
typed". "Statically typed" and "dynamically typed" aren't even 
opposites, they just don't overlap.

Another observation: type safeness is more of a spectrum than a clearcut 
distinction. Even ML and Pascal have ways to circumvent the type system, 
and even C is typesafe unless you use unsafe constructs.
IOW from a type-theoretic point of view, there is no real difference 
between the typesafe and not typesafe languages in the "statically 
typed" column; the difference is in the amount of unsafe construct usage 
in practical programs.

Regards,
Jo
From: ········@ps.uni-sb.de
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151244628.566060.40500@p79g2000cwp.googlegroups.com>
Joachim Durchholz wrote:
>
> Another observation: type safeness is more of a spectrum than a clearcut
> distinction. Even ML and Pascal have ways to circumvent the type system,

No. I'm not sure about Pascal, but (Standard) ML surely has none. Same
with Haskell as defined by its spec. OCaml has a couple of clearly
marked unsafe library functions, but no unsafe feature in the language
semantics itself.

> and even C is typesafe unless you use unsafe constructs.

Tautology. Every language is "safe unless you use unsafe constructs".
(Unfortunately, you can hardly write interesting programs in any safe
subset of C.)

> IOW from a type-theoretic point of view, there is no real difference
> between the typesafe and not typesafe languages in the "statically
> typed" column; the difference is in the amount of unsafe construct usage
> in practical programs.

Huh? There is a huge, fundamental difference: namely whether a type
system is sound or not. A soundness proof is obligatory for any serious
type theory, and failure to establish it simply is a bug in the theory.

- Andreas
From: Scott David Daniels
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <449eba8d$1@nntp0.pdx.net>
········@ps.uni-sb.de wrote:
> Huh? There is a huge, fundamental difference: namely whether a type
> system is sound or not. A soundness proof is obligatory for any serious
> type theory, and failure to establish it simply is a bug in the theory.

So you claim Java and Objective C are "simply bugs in the theory."

--Scott David Daniels
·············@acm.org
From: ········@ps.uni-sb.de
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151257933.443578.245450@r2g2000cwb.googlegroups.com>
Scott David Daniels wrote:
> ········@ps.uni-sb.de wrote:
> > Huh? There is a huge, fundamental difference: namely whether a type
> > system is sound or not. A soundness proof is obligatory for any serious
> > type theory, and failure to establish it simply is a bug in the theory.
>
> So you claim Java and Objective C are "simply bugs in the theory."

You're misquoting me. What I said is that lack of soundness of a
particular type theory (i.e. a language plus type system) is a bug in
that theory.

But anyway, Java in fact is frequently characterised as having a bogus
type system (e.g. with respect to array subtyping). Of course, in a
hybrid language like Java with its dynamic checks it can be argued
either way.

Naturally, the type system of Objective-C is as broken as the one of
plain C.

- Andreas
From: Jürgen Exner
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <zAzng.8068$Wl.2943@trnddc01>
Scott David Daniels wrote:
> ········@ps.uni-sb.de wrote:
>> Huh? There is a huge, fundamental difference: namely whether a type
>> system is sound or not. A soundness proof is obligatory for any
>> serious type theory, and failure to establish it simply is a bug in
>> the theory.
>
> So you claim Java and Objective C are "simply bugs in the theory."

They are certainly off topic in CLPM.

jue 
From: David Hopwood
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <elDng.218638$8W1.61900@fe1.news.blueyonder.co.uk>
Scott David Daniels wrote:
> ········@ps.uni-sb.de wrote:
> 
>> Huh? There is a huge, fundamental difference: namely whether a type
>> system is sound or not. A soundness proof is obligatory for any serious
>> type theory, and failure to establish it simply is a bug in the theory.
> 
> So you claim Java and Objective C are "simply bugs in the theory."

Java's type system was unsound for much of its life. I think that supports
the point that it's inadequate to simply "wish and hope" for soundness,
and that a proof should be insisted on.

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Gabriel Dos Reis
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <m34py9dvmc.fsf@uniton.integrable-solutions.net>
········@ps.uni-sb.de writes:

[...]

| > and even C is typesafe unless you use unsafe constructs.
| 
| Tautology. Every language is "safe unless you use unsafe constructs".
| (Unfortunately, you can hardly write interesting programs in any safe
| subset of C.)

Fortunately, some people do, for a living.

I must confess I'm sympathetic to Bob Harper's view expressed in

  http://www.seas.upenn.edu/~sweirich/types/archive/1999-2003/msg00298.html

-- Gaby
From: ········@ps.uni-sb.de
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151252226.104247.62960@b68g2000cwa.googlegroups.com>
Gabriel Dos Reis wrote:
> |
> | (Unfortunately, you can hardly write interesting programs in any safe
> | subset of C.)
>
> Fortunately, some people do, for a living.

I don't think so. Maybe the question is what a "safe subset" consists
of. In my book, it excludes all features that are potentially unsafe.
So in particular, C-style pointers are out, because they can easily
dangle, be uninitialised, whatever. Can you write a realistic C
program without using pointers?

- Andreas
From: Gabriel Dos Reis
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <m3mzc1cekr.fsf@uniton.integrable-solutions.net>
········@ps.uni-sb.de writes:

| Gabriel Dos Reis wrote:
| > |
| > | (Unfortunately, you can hardly write interesting programs in any safe
| > | subset of C.)
| >
| > Fortunately, some people do, for a living.
| 
| I don't think so. Maybe the question is what a "safe subset" consists
| of. In my book, it excludes all features that are potentially unsafe.

if you equate "unsafe" with "potentially unsafe", then you have
changed gear and redefined things on the fly, and things that could
be given sense before cease to have meaning.  I decline to follow
such an uncertain, extreme path.

| So in particular, C-style pointers are out, because they can easily
| dangle, be uninitialised, whatever. 

sorry, the specific claim I was responding to is not whether *you* can
easily misuse C constructs.  I would refrain from generalizing your
inability to write interesting correct programs in C, to a quality of
the language itself.  Rather, the claim I was responding to is this:

  (Unfortunately, you can hardly write interesting programs in any safe
   subset of C.)

I would suggest you give more thoughts to the claims made in

  http://www.seas.upenn.edu/~sweirich/types/archive/1999-2003/msg00298.html

-- Gaby
From: ········@ps.uni-sb.de
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151257480.183097.165640@b68g2000cwa.googlegroups.com>
Gabriel Dos Reis wrote:
> ········@ps.uni-sb.de writes:
>
> | Gabriel Dos Reis wrote:
> | > |
> | > | (Unfortunately, you can hardly write interesting programs in any safe
> | > | subset of C.)
> | >
> | > Fortunately, some people do, for a living.
> |
> | I don't think so. Maybe the question is what a "safe subset" consists
> | of. In my book, it excludes all features that are potentially unsafe.
>
> if you equate "unsafe" with "potentially unsafe", then you have
> changed gear and redefined things on the fly, and things that could
> be given sense before ceases to have meaning.  I decline following
> such an uncertain, extreme, path.

An unsafe *feature* is one that can potentially exhibit unsafe
behaviour. How else would you define it, if I may ask?

A safe *program* may or may not use unsafe features, but that is not
the point when we talk about safe *language subsets*.

> I would suggest you give more thoughts to the claims made in
>
>   http://www.seas.upenn.edu/~sweirich/types/archive/1999-2003/msg00298.html

Thanks, I am aware of it. Taking into account the hypothetical nature
of the argument, and all the caveats listed with respect to C, I do not
think that it is too relevant for the discussion at hand. Moreover,
Harper talks about a relative concept of "C-safety". I assume that
everybody here understands that by "safe" in this discussion we mean
something else (in particular, memory safety).

Or are you trying to suggest that we should indeed consider C safe for
the purpose of this discussion?

- Andreas
From: Gabriel Dos Reis
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <m3bqsgdfz2.fsf@uniton.integrable-solutions.net>
········@ps.uni-sb.de writes:

| think that it is too relevant for the discussion at hand. Moreover,
| Harper talks about a relative concept of "C-safety".

Then, I believe you missed the entire point.

   First point: "safety" is a *per-language* property.  Each language
   comes with its own notion of safety.  ML is ML-safe; C is C-safe;
   etc.  I'm not being facetious; I think this is the core of the
   confusion. 

   Safety is an internal consistency check on the formal definition of
   a language.  In a sense it is not interesting that a language is
   safe, precisely because if it weren't, we'd change the language to
   make sure it is!  I regard safety as a tool for the language
   designer, rather than a criterion with which we can compare
   languages.


[...]

| Or are you trying to suggest that we should indeed consider C safe for
| the purpose of this discussion?

I'm essentially suggesting "silly arguments" (as noted by someone
in another message) be left out for the sake of productive
conversation.  Apparently, I did not communicate that point well enough.

-- Gaby
From: Matthias Blume
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <m2r71c7i2c.fsf@hanabi.local>
Gabriel Dos Reis <···@integrable-solutions.net> writes:

> ········@ps.uni-sb.de writes:
>
> | think that it is too relevant for the discussion at hand. Moreover,
> | Harper talks about a relative concept of "C-safety".
>
> Then, I believe you missed the entire point.
>
>    First point: "safety" is a *per-language* property.  Each language
>    comes with its own notion of safety.  ML is ML-safe; C is C-safe;
>    etc.  I'm not being facetious; I think this is the core of the
>    confusion. 
>
>    Safety is an internal consistency check on the formal definition of
>    a language.  In a sense it is not interesting that a language is
>    safe, precisely because if it weren't, we'd change the language to
>    make sure it is!  I regard safety as a tool for the language
>    designer, rather than a criterion with which we can compare
>    languages.

I agree with Bob Harper about safety being language-specific and all
that.  But, with all due respect, I think his characterization of C is
not accurate.  In particular, the semantics of C (as specified by the
standard) is *not* just a mapping from memories to memories.  Instead,
there are lots of situations that are quite clearly marked in C's
definition as "undefined" (or whatever the technical term may be).  In
other words, C's specified transition system (to the extent that we
can actually call it "specified") has lots of places where programs
can "get stuck", i.e., where there is no transition to take.  Most
actual implementations of C (including "Unix") are actually quite
generous by filling in some of the transitions, so that programs that
are not officially legal C will run anyway.

Example:

The following program (IIRC) is not legal C, even though I expect it
to run without problem on almost any machine/OS, printing 0 to stdout:

#include <stdio.h>

int a[] = { 0, 1, 2 };

int main (void)
{
  int *p, *q;
  p = a;
  q = p-1;  /** illegal **/
  printf("%d\n", q[1]);
  return 0;
}

Nevertheless, the line marked with the comment is illegal because it
creates a pointer that is not pointing into a memory object.  Still, a
C compiler is not required to reject this program.  It is allowed,
though, to make the program behave in unexpected ways.  In particular,
you can't really blame the compiler if the program does not print 0,
or if it even crashes.

AFAIC, C is C-unsafe by Bob's reasoning.

---

Of course, C can be made safe quite easily:

Define a state "undefined" that is considered "safe" and add a
transition to "undefined" wherever necessary.

Kind regards,
Matthias
From: David Hopwood
Subject: Re: What is Expressiveness in a Computer Language [off-topic]
Date: 
Message-ID: <ItSng.481412$xt.283730@fe3.news.blueyonder.co.uk>
Matthias Blume wrote:
> I agree with Bob Harper about safety being language-specific and all
> that.  But, with all due respect, I think his characterization of C is
> not accurate.
[...]
> AFAIC, C is C-unsafe by Bob's reasoning.

Agreed.

> Of course, C can be made safe quite easily:
> 
> Define a state "undefined" that is considered "safe" and add a
> transition to "undefined" wherever necessary.

I wouldn't say that was "quite easy" at all.

C99 4 #2:
# If a "shall" or "shall not" requirement that appears outside of a constraint
# is violated, the behavior is undefined. Undefined behavior is otherwise
# indicated in this International Standard by the words "undefined behavior"
# *or by the omission of any explicit definition of behavior*. [...]

In other words, to fix C to be a safe language (compatible with Standard C89
or C99), you first have to resolve all the ambiguities in the standard where
the behaviour is *implicitly* undefined. There are a lot of them.

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Matthias Blume
Subject: Re: What is Expressiveness in a Computer Language [off-topic]
Date: 
Message-ID: <m1d5cwkkto.fsf@hana.uchicago.edu>
David Hopwood <····················@blueyonder.co.uk> writes:

> Matthias Blume wrote:
>> I agree with Bob Harper about safety being language-specific and all
>> that.  But, with all due respect, I think his characterization of C is
>> not accurate.
> [...]
>> AFAIC, C is C-unsafe by Bob's reasoning.
>
> Agreed.
>
>> Of course, C can be made safe quite easily:
>> 
>> Define a state "undefined" that is considered "safe" and add a
>> transition to "undefined" wherever necessary.
>
> I wouldn't say that was "quite easy" at all.
>
> C99 4 #2:
> # If a "shall" or "shall not" requirement that appears outside of a constraint
> # is violated, the behavior is undefined. Undefined behavior is otherwise
> # indicated in this International Standard by the words "undefined behavior"
> # *or by the omission of any explicit definition of behavior*. [...]
>
> In other words, to fix C to be a safe language (compatible with Standard C89
> or C99), you first have to resolve all the ambiguities in the standard where
> the behaviour is *implicitly* undefined. There are a lot of them.

Yes, if you want to make the transition system completely explicit, it
won't be easy. I was thinking of a catch-all rule that says:
transition to "undefined" unless specified otherwise.  (Note that I am
not actually advocating this approach to making a language "safe".
For practical purposes, C is unsafe.  (And so is C++.))
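
The catch-all rule can be stated as a tiny toy model in Python (my own
illustration, not anything from the standard): take a partial transition
function and totalise it by sending every unspecified state to an
absorbing "undefined" state.

```python
UNDEFINED = "undefined"

# A toy partial transition function: some states have no successor,
# standing in for behaviour the specification never defines.
partial_step = {
    "start":   "running",
    "running": "halted",
}

def safe_step(state):
    """Totalised semantics: transition to UNDEFINED unless specified
    otherwise, so no program can ever get stuck."""
    if state == UNDEFINED:
        return UNDEFINED                     # UNDEFINED absorbs everything
    return partial_step.get(state, UNDEFINED)

assert safe_step("start") == "running"
assert safe_step("halted") == UNDEFINED      # was stuck; now "safe"
assert safe_step(UNDEFINED) == UNDEFINED
```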
From: Andreas Rossberg
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7ogqc$99fn4$1@hades.rz.uni-saarland.de>
Gabriel Dos Reis wrote:
> ········@ps.uni-sb.de writes:
> 
> | think that it is too relevant for the discussion at hand. Moreover,
> | Harper talks about a relative concept of "C-safety".
> 
> Then, I believe you missed the entire point.

I think the part of my reply you snipped addressed it well enough.

Anyway, I can't believe that we actually need to argue about the fact 
that - for any *useful* and *practical* notion of safety - C is *not* a 
safe language. I refrain from continuing the discussion along this line, 
because *that* is *really* silly.

- Andreas
From: Darren New
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <ZzFng.4215$MF6.3773@tornado.socal.rr.com>
Gabriel Dos Reis wrote:
> I would suggest you give more thoughts to the claims made in
>   http://www.seas.upenn.edu/~sweirich/types/archive/1999-2003/msg00298.html

I'm not sure I understand this. Looking at "Example 2", where C is 
claimed to be "C-safe", he makes two points that I disagree with.

First, he says Unix misimplements the operational semantics of C. It 
seems more like Unix simply changes the transition system such that 
programs which access unmapped memory are terminal.

Second, he says that he hasn't seen a formal definition of C which 
provides precise operational semantics. However, the language standard 
itself specified numerous "undefined" results from operations such as 
accessing unallocated memory, or even pointing pointers to unallocated 
memory. So unless he wants to modify the C standard to indicate what 
should happen in such situations, I'm not sure why he thinks it's "safe" 
by his definition. There is no one "q" that a "p" goes to based on the 
language. He's confusing language with particular implementations. He 
seems to be saying "C is safe, oh, except for all the parts of the 
language that are undefined and therefore unsafe. But if you just 
clearly defined every part of C, it would be safe."  It would also seem 
to be the case that one could not say that a program "p" is either 
terminal or not, as that would rely on input, right? gets() is "safe" as 
long as you don't read more than the buffer you allocated.
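
A sketch of my own (not Darren's) of why the input decides: gets() has no way to learn the buffer size, so whether a program using it stays within defined behavior depends entirely on what arrives on stdin. fgets() makes the bound explicit:

```c
#include <stdio.h>

/* gets(buf) would happily write past the end of buf for any
 * sufficiently long input line -- whether the program stays within
 * defined behavior is decided by the *input*, not by the program
 * text.  fgets() restores the bound: it writes at most size-1
 * characters plus a terminating NUL. */
int read_line(char *buf, size_t size, FILE *in)
{
    return fgets(buf, (int)size, in) != NULL;
}
```

With an 8-byte buffer and the input line "0123456789", fgets() stops after 7 characters; gets() on the same input would already be undefined behavior.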

What's the difference between "safe" and "well-defined semantics"? 
(Ignoring for the moment things like two threads modifying the same 
memory at the same time and other such hard-to-define situations?)

-- 
   Darren New / San Diego, CA, USA (PST)
     Native Americans used every part
     of the buffalo, including the wings.
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7mirk$7s1$1@online.de>
········@ps.uni-sb.de schrieb:
> Gabriel Dos Reis wrote:
>> |
>> | (Unfortunately, you can hardly write interesting programs in any safe
>> | subset of C.)
>>
>> Fortunately, some people do, as living job.
> 
> I don't think so. Maybe the question is what a "safe subset" consists
> of. In my book, it excludes all features that are potentially unsafe.

Unless you define "safe", *any* program is unsafe.
Somebody could read the program listing, which could trigger a traumatic 
childhood experience.
(Yes, I'm being silly. But the point is very serious. Even with less 
silly examples, whether a language or subset is "safe" entirely depends 
on what you define to be "safe", and these definitions tend to vary 
vastly across language communities.)

Regards,
Jo
From: Gabriel Dos Reis
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <m3hd28dgbd.fsf@uniton.integrable-solutions.net>
Joachim Durchholz <··@durchholz.org> writes:

[...]

| (Yes, I'm being silly. But the point is very serious. Even with less
| silly examples, whether a language or subset is "safe" entirely
| depends on what you define to be "safe", and these definitions tend to
| vary vastly across language communities.)

Yes, I know.  I hoped we would escape sillyland for the sake of
productive conversation.

-- Gaby
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7migg$7em$1@online.de>
········@ps.uni-sb.de schrieb:
> Joachim Durchholz write:
>> Another observation: type safeness is more of a spectrum than a clearcut
>> distinction. Even ML and Pascal have ways to circumvent the type system,
> 
> No. I'm not sure about Pascal,

You'd have to use an untagged union type.
It's the standard maneuver in Pascal to get unchecked bitwise 
reinterpretation.
Since it's undefined behavior, you're essentially on your own, but in 
practice, any compilers that implemented a different semantics were 
hammered with bug reports until they complied with the standard - in 
this case, an unwritten (and very unofficial) one, but a rather 
effective one.
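
For comparison, the same maneuver in C (my own sketch; the Pascal variant-record version is analogous):

```c
#include <stdint.h>

/* Unchecked bitwise reinterpretation -- the C counterpart of writing
 * one arm of an untagged Pascal variant record and reading the other.
 * Reading a union member other than the one last stored is exactly
 * the kind of hole discussed above: the standards leave it largely
 * unspecified, but in practice compilers implement the "obvious"
 * bit-copy semantics, for the same bug-report-driven reasons. */
typedef union {
    float    f;
    uint32_t bits;
} FloatBits;

uint32_t float_to_bits(float x)
{
    FloatBits u;
    u.f = x;
    return u.bits;    /* reinterpret the same 4 bytes as an integer */
}
```

On an IEEE 754 platform (an assumption), float_to_bits(1.0f) yields 0x3F800000.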

 > but (Standard) ML surely has none.

NLFFI?

 > Same with Haskell as defined by its spec.

Um... I'm not 100% sure, but I dimly (mis?)remember having read that 
unsafePerformIO also offers some ways to circumvent the type system.

(Not that this would be an important point anyway.)

 > OCaml has a couple of clearly
> marked unsafe library functions, but no unsafe feature in the language
> semantics itself.

If there's a library with not-typesafe semantics, then at least that 
language implementation is not 100% typesafe.
If all implementations of a language are not 100% typesafe, then I 
cannot consider the language itself 100% typesafe.

Still, even Pascal is quite a lot "more typesafe" than, say, C.

>> and even C is typesafe unless you use unsafe constructs.
> 
> Tautology. Every language is "safe unless you use unsafe constructs".

No tautology - the unsafe constructs aren't just typewise unsafe ;-p

That's exactly why I replaced Luca Cardelli's "safe/unsafe" with 
"typesafe/not typesafe". There was no definition attached to the 
original terms, and this discussion is about typing anyway.

> (Unfortunately, you can hardly write interesting programs in any safe
> subset of C.)

Now that's a bold claim that I'd like to see substantiated.

>> IOW from a type-theoretic point of view, there is no real difference
>> between their typesafe and not typesafe languages in the "statically
>> typed" column; the difference is in the amount of unsafe construct usage
>> in practical programs.
> 
> Huh? There is a huge, fundamental difference:  namely whether a type
> system is sound or not.

I think you're overstating the case.

In type theory, of course, there's no such thing as an "almost 
typesafe" language - it's typesafe or it isn't.

In practice, I find no implementation where type mismatches cannot 
occur, if only when interfacing with the external world (except if you 
cheat by treating everything external as a byte sequence, which is like 
saying that all languages have at least a universal ANY type and are 
hence statically-typed).
And in those languages that do have type holes, these holes may be more 
or less relevant - and it's a *very* broad spectrum here.
And from that perspective, if ML indeed has no type hole at all, then 
it's certainly an interesting extremal point, but I still have a 
relevant spectrum down to assembly language.

Regards,
Jo
From: ········@ps.uni-sb.de
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151264564.653872.151090@m73g2000cwd.googlegroups.com>
Joachim Durchholz wrote:
>
>  > but (Standard) ML surely has none.
>
> NLFFI?
>
>  > Same with Haskell as defined by its spec.
>
> Um... I'm not 100% sure, but I dimly (mis?)remember having read that
> unsafePerformIO also offers some ways to circumvent the type system.

Neither NLFFI nor unsafePerformIO are official part of the respective
languages. Besides, foreign function interfaces are outside a language
by definition, and can hardly be taken as an argument - don't blame
language A that unsafety arises when you subvert it by interfacing with
unsafe language B on a lower level.

> >> and even C is typesafe unless you use unsafe constructs.
> >
> > Tautology. Every language is "safe unless you use unsafe constructs".
>
> No tautology - the unsafe constructs aren't just typewise unsafe ;-p
>
> That's exactly why I replaced Luca Cardelli's "safe/unsafe" with
> "typesafe/not typesafe". There was no definition attached to the
> original terms, and this discussion is about typing anyway.

The Cardelli paper I was referring to discusses it in detail. And if
you look up the context of my posting: it was exactly whether safety is
to be equated with type safety.

> >> IOW from a type-theoretic point of view, there is no real difference
> >> between their typesafe and not typesafe languages in the "statically
> >> typed" column; the difference is in the amount of unsafe construct usage
> >> in practical programs.
> >
> > Huh? There is a huge, fundamental difference:  namely whether a type
> > system is sound or not.
>
> I think you're overstating the case.
>
> In type theory, of course, there's no such things as an "almost typesafe
> language" - it's typesafe or it isn't.

Right, and you were claiming to argue from a type-theoretic POV.

[snipped the rest]

- Andreas
From: Vesa Karvonen
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7m4oa$mm0$1@oravannahka.helsinki.fi>
In comp.lang.functional Joachim Durchholz <··@durchholz.org> wrote:
> [...] Even ML and Pascal have ways to circumvent the type system, [...]

Show me a Standard ML program that circumvents the type system.

-Vesa Karvonen
From: David Hopwood
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <YSxng.477416$xt.143460@fe3.news.blueyonder.co.uk>
Joachim Durchholz wrote:
> Andreas Rossberg schrieb:
> 
>> Luca Cardelli has given the most convincing one in his seminal
>> tutorial "Type Systems", where he identifies "typed" and "safe" as two
>> orthogonal dimensions and gives the following matrix:
>>
>>           | typed | untyped
>>    -------+-------+----------
>>    safe   | ML    | Lisp
>>    unsafe | C     | Assembler
>>
>> Now, jargon "dynamically typed" is simply untyped safe, while "weakly
>> typed" is typed unsafe.
> 
> Here's a matrix how most people that I know would fill in with terminology:
> 
>             | Statically   | Not         |
>             | typed        | statically  |
>             |              | typed       |
>    ---------+--------------+-------------+
>    typesafe | "strongly    | Dynamically |
>             | typed"       | typed       |
>             | (ML, Pascal) | (Lisp)      |

Pascal, in most of its implementations, isn't [memory] safe; it's in the same
box as C.

>    ---------+--------------+-------------+
>    not      | (no common   | "untyped"   |
>    typesafe | terminology) |             |
>             | (C)          | (Assembly)  |
>    ---------+--------------+-------------+

I have to say that I prefer the labels used by Cardelli.

> (Terms in quotes are challenged on a regular basis, or rarely if ever
> applied.)
> 
> With the above terminology, it becomes clear that the opposite of
> "(statically) typed" isn't "statically untyped", but "not statically
> typed". "Statically typed" and "dynamically typed" aren't even
> opposites, they just don't overlap.
> 
> Another observation: type safeness is more of a spectrum than a clearcut
> distinction. Even ML and Pascal have ways to circumvent the type system,

Standard ML doesn't (with just the standard basis; no external libraries,
and assuming that file I/O cannot be used to corrupt the language
implementation).

> and even C is typesafe unless you use unsafe constructs.
> IOW from a type-theoretic point of view, there is no real difference
> between their typesafe and not typesafe languages in the "statically
> typed" column; the difference is in the amount of unsafe construct usage
> in practical programs.

It is quite possible to design languages that have absolutely no unsafe
constructs, and are still useful.

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Pascal Costanza
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <4g7k1mF1ma4ghU1@individual.net>
Joachim Durchholz wrote:
> Andreas Rossberg schrieb:
>>
>> Luca Cardelli has given the most convincing one in his seminal 
>> tutorial "Type Systems", where he identifies "typed" and "safe" as two 
>> orthogonal dimensions and gives the following matrix:
>>
>>           | typed | untyped
>>    -------+-------+----------
>>    safe   | ML    | Lisp
>>    unsafe | C     | Assembler
>>
>> Now, jargon "dynamically typed" is simply untyped safe, while "weakly 
>> typed" is typed unsafe.
> 
> Here's a matrix how most people that I know would fill in with terminology:
> 
>             | Statically   | Not         |
>             | typed        | statically  |
>             |              | typed       |
>    ---------+--------------+-------------+
>    typesafe | "strongly    | Dynamically |
>             | typed"       | typed       |
>             | (ML, Pascal) | (Lisp)      |
>    ---------+--------------+-------------+
>    not      | (no common   | "untyped"   |
>    typesafe | terminology) |             |
>             | (C)          | (Assembly)  |
>    ---------+--------------+-------------+
> 
> (Terms in quotes are challenged on a regular basis, or rarely if ever 
> applied.)
> 
> With the above terminology, it becomes clear that the opposite of 
> "(statically) typed" isn't "statically untyped", but "not statically 
> typed". "Statically typed" and "dynamically typed" aren't even 
> opposites, they just don't overlap.
> 
> Another observation: type safeness is more of a spectrum than a clearcut 
> distinction. Even ML and Pascal have ways to circumvent the type system, 
> and even C is typesafe unless you use unsafe constructs.
> IOW from a type-theoretic point of view, there is no real difference 
> between their typesafe and not typesafe languages in the "statically 
> typed" column; the difference is in the amount of unsafe construct usage 
> in practical programs.

It's also relevant how straightforward it is to distinguish between safe 
and unsafe code, how explicit you have to be when you use unsafe code, 
how likely it is that you accidentally use unsafe code without being 
aware of it, what the generally accepted conventions in a language 
community are, etc. pp.


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7miuq$7s1$2@online.de>
Pascal Costanza schrieb:
>> Another observation: type safeness is more of a spectrum than a 
>> clearcut distinction. Even ML and Pascal have ways to circumvent the 
>> type system, and even C is typesafe unless you use unsafe constructs.
>> IOW from a type-theoretic point of view, there is no real difference 
>> between their typesafe and not typesafe languages in the "statically 
>> typed" column; the difference is in the amount of unsafe construct 
>> usage in practical programs.
> 
> It's also relevant how straightforward it is to distinguish between safe 
> and unsafe code, how explicit you have to be when you use unsafe code, 
> how likely it is that you accidentally use unsafe code without being 
> aware of it, what the generally accepted conventions in a language 
> community are, etc. pp.

Fully agreed.
From: David Hopwood
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <H2Emg.472606$xt.82611@fe3.news.blueyonder.co.uk>
Rob Thorpe wrote:
> David Hopwood wrote:
> 
>>As far as I can tell, the people who advocate using "typed" and "untyped"
>>in this way are people who just want to be able to discuss all languages in
>>a unified terminological framework, and many of them are specifically not
>>advocates of statically typed languages.
> 
> It's easy to create a reasonable framework. My earlier posts show simple
> ways of looking at it that could be further refined, I'm sure there are
> others who have already done this.
>
> The real objection to this was that latently/dynamically typed
> languages have a place in it.

You seem very keen to attribute motives to people that are not apparent
from what they have said.

> But some of the advocates of statically
> typed languages wish to lump these languages together with assembly
> language as "untyped" in an attempt to label them as unsafe.

A common term for languages which have defined behaviour at run-time is
"memory safe". For example, "Smalltalk is untyped and memory safe."
That's not too objectionable, is it?

(It is actually more common for statically typed languages to fail to be
memory safe; consider C and C++, for example.)

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Rob Thorpe
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151053730.887911.256200@u72g2000cwu.googlegroups.com>
David Hopwood wrote:
> Rob Thorpe wrote:
> > David Hopwood wrote:
> >
> >>As far as I can tell, the people who advocate using "typed" and "untyped"
> >>in this way are people who just want to be able to discuss all languages in
> >>a unified terminological framework, and many of them are specifically not
> >>advocates of statically typed languages.
> >
> > It's easy to create a reasonable framework. My earlier posts show simple
> > ways of looking at it that could be further refined, I'm sure there are
> > others who have already done this.
> >
> > The real objection to this was that latently/dynamically typed
> > languages have a place in it.
>
> You seem very keen to attribute motives to people that are not apparent
> from what they have said.

The term "dynamically typed" is well used and understood.  The term
"untyped" is generally associated with languages that, as you put it,
"have no memory safety"; it is a pejorative term.  "Latently typed" is
not well used, unfortunately, but is more descriptive.

Most of the arguments above describe a static type system, then follow
by saying that this is what "type system" should mean, and finish by
saying everything else should be considered untyped. This seems to me
to be an effort to associate dynamically typed languages with this
pejorative term.

> > But some of the advocates of statically
> > typed languages wish to lump these languages together with assembly
> > language as "untyped" in an attempt to label them as unsafe.
>
> A common term for languages which have defined behaviour at run-time is
> "memory safe". For example, "Smalltalk is untyped and memory safe."
> That's not too objectionable, is it?

Memory safety isn't the whole point; it's only half of it.  Typing
itself is the point. Regardless of memory safety, if you do a
calculation in a latently typed language, you can find the type of the
resulting object.
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151080122.252066.106740@i40g2000cwc.googlegroups.com>
Rob Thorpe wrote:
> David Hopwood wrote:
>
> The term "dynamically typed" is well used and understood.  The term
> untyped is generally associated with languages that as you put it "have
> no memory safety", it is a pejorative term.  "Latently typed" is not
> well used unfortunately, but more descriptive.
>
> Most of the arguments above describe a static type system then follow
> by saying that this is what "type system" should mean, and finishing by
> saying everything else should be considered untyped. This seems to me
> to be an effort to associate dynamically typed languages with this
> pejorative term.

I can believe that someone, somewhere out there might have done
this, but I can't recall ever having seen it. In any event, the
standard term for memory safety is "safety", not "untyped." Further,
anyone who was interested in actually understanding the issues
wouldn't be doing what you describe.

And if you did find someone who was actively doing this I would
say "never attribute to malice what can adequately be explained
by stupidity."


Marshall
From: Chris Uppal
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <449bde5f$2$663$bed64819@news.gradwell.net>
David Hopwood wrote:

> > But some of the advocates of statically
> > typed languages wish to lump these languages together with assembly
> > language as "untyped" in an attempt to label them as unsafe.
>
> A common term for languages which have defined behaviour at run-time is
> "memory safe". For example, "Smalltalk is untyped and memory safe."
> That's not too objectionable, is it?

I find it too weak, as if to say: "well, ok, it can't actually corrupt memory
as such, but the program logic is still apt to go all over the shop"...

    -- chris
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151094966.249077.284610@m73g2000cwd.googlegroups.com>
Chris Uppal wrote:
> David Hopwood wrote:
>
> > > But some of the advocates of statically
> > > typed languages wish to lump these languages together with assembly
> > > language as "untyped" in an attempt to label them as unsafe.
> >
> > A common term for languages which have defined behaviour at run-time is
> > "memory safe". For example, "Smalltalk is untyped and memory safe."
> > That's not too objectionable, is it?
>
> I find it too weak, as if to say: "well, ok, it can't actually corrupt memory
> as such, but the program logic is still apt to go all over the shop"...

It is impossible to evaluate your claim without a specific definition
of "go all over the shop." If it means subject to Method Not
Understood, then we would have to say that Smalltalk can in fact go
all over the shop. If it means subject to memory corruption, then it
can't.

In any event, the denotation is quite clear: Smalltalk does not
formally assign types to expressions during a static analysis
phase, and Smalltalk is not subject to memory corruption.

I have a problem with the whole line of reasoning that goes
"someone might think term x means something bad, so we
can't use it." It's unfalsifiable. It also optimizes for malicious
use of the terms. Both are bad properties to have as design
principles, at least in this context.


Marshall
From: David Hopwood
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <wUcng.475931$xt.353687@fe3.news.blueyonder.co.uk>
Chris Uppal wrote:
> David Hopwood wrote:
> 
>>>But some of the advocates of statically
>>>typed languages wish to lump these languages together with assembly
>>>language as "untyped" in an attempt to label them as unsafe.
>>
>>A common term for languages which have defined behaviour at run-time is
>>"memory safe". For example, "Smalltalk is untyped and memory safe."
>>That's not too objectionable, is it?
> 
> I find it too weak, as if to say: "well, ok, it can't actually corrupt memory
> as such, but the program logic is still apt to go all over the shop"...

Well, it might ;-)

(In case anyone thinks I am being pejorative toward not-statically-typed
languages here, I would say that the program logic can *also* "go all over
the shop" in a statically typed, memory safe language. To avoid this, you
need at least a language that is "secure" in the sense used in capability
systems, which is a stronger property than memory safety.)

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Anton van Straaten
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <7MFmg.27168$VE1.8188@newssvr14.news.prodigy.com>
Vesa Karvonen wrote:
> In comp.lang.functional Anton van Straaten <·····@appsolutions.com> wrote:
> [...]
> 
> This static vs dynamic type thing reminds me of one article written by
> Bjarne Stroustrup where he notes that "Object-Oriented" has become a
> synonym for "good".  More precisely, it seems to me that both camps
> (static & dynamic) think that "typed" is a synonym for having
> "well-defined semantics" or being "safe" and therefore feel the need
> to be able to speak of their language as "typed" whether or not it
> makes sense.

I reject this comparison.  There's much more to it than that.  The point 
is that the reasoning which programmers perform when working with a 
program in a latently-typed language bears many close similarities to 
the purpose and behavior of type systems.

This isn't an attempt to jump on any bandwagons, it's an attempt to 
characterize what is actually happening in real programs and with real 
programmers.  I'm relating that activity to type systems because that is 
what it most closely relates to.

Usually, formal type theory ignores such things, because of course 
what's in the programmer's head is outside the domain of the formal 
definition of an untyped language.  But all that means is that formal 
type theory can't account for the entirety of what's happening in the 
case of programs in untyped languages.

Unless you can provide some alternate theory of the subject that's 
better than what I'm offering, it's not sufficient to complain "but that 
goes beyond (static/syntactic) type theory".  Yes, it goes beyond 
traditional type theory, because it's addressing something which 
traditional type theory doesn't address.  There are reasons to connect 
it to type theory, and if you can't see those reasons, you need to be 
more explicit about why.

> I agree.  I think that instead of "statically typed" we should say
> "typed" and instead of "(dynamically|latently) typed" we should say
> "untyped".

The problem with "untyped" is that there are obvious differences in 
typing behavior between the untyped lambda calculus and, say, a language 
like Scheme (and many others).  Like all latently-typed languages, 
Scheme includes, in the language, mechanisms to tag values in a way that 
supports checks which help the programmer to ensure that the program's 
behavior matches the latent type scheme that the programmer has in mind. 
See my other recent reply to Marshall for a more detailed explanation 
of what I mean.

I'm suggesting that if a language classifies and tags values in a way 
that supports the programmer in static reasoning about the behavior of 
terms, that calling it "untyped" does not capture the entire picture, 
even if it's technically accurate in a restricted sense (i.e. in the 
sense that terms don't have static types that are known within the 
language).

Let me come at this from another direction: what do you call the 
classifications into number, string, vector etc. that a language like 
Scheme does?  And when someone writes a program which includes the 
following lines, how would you characterize the contents of the comment:

; third : integer -> integer
(define (third n) (quotient n 3))

In my experience, answering these questions without using the word 
"type" results in a lot of silliness.  And if you do use "type", then 
you're going to have to adjust the rest of your position significantly.

>>In a statically-checked language, people tend to confuse automated 
>>static checking with the existence of types, because they're thinking in 
>>a strictly formal sense: they're restricting their world view to what 
>>they see "within" the language.
> 
> 
> That is not unreasonable.  You see, you can't have types unless you
> have a type system.  Types without a type system are like answers
> without questions - it just doesn't make any sense.

The first point I was making is that *automated* checking has very 
little to do with anything, and conflating static types with automated 
checking tends to lead to a lot of confusion on both sides of the 
static/dynamic fence.

But as to your point, latently typed languages have informal type 
systems.  Show me a latently typed language or program, and I can tell 
you a lot about its type system or type scheme.

Soft type inferencers demonstrate this by actually defining a type 
system and inferring type schemes for programs.  That's a challenging 
thing for an automated tool to do, but programmers routinely perform the 
same sort of activity on an informal basis.

>>But a program as seen by the programmer has types: the programmer 
>>performs (static) type inference when reasoning about the program, and 
>>debugs those inferences when debugging the program, finally ending up 
>>with a program which has a perfectly good type scheme.  It may be 
>>messy compared to say an HM type scheme, and it's usually not proved to 
>>be perfect, but that again is an orthogonal issue.
> 
> 
> There is a huge hole in your argument above.  Types really do not make
> sense without a type system.  To claim that a program has a type
> scheme, you must first specify the type system.  Otherwise it just
> doesn't make any sense.

Again, the type system is informal.  What you're essentially saying is 
that only things that are formally defined make sense.  But you can't 
wish dynamically-checked languages out of existence.  So again, how 
would you characterize these issues in dynamically-checked languages?

Saying that it's just a matter of well-defined semantics doesn't do 
anything to address the details of what's going on.  I'm asking for a 
more specific account than that.

>>Mathematicians operated for thousands of years without automated 
>>checking of proofs, so you can't argue that because a 
>>dynamically-checked program hasn't had its type scheme proved correct, 
>>that it somehow doesn't have types.  That would be a bit like arguing 
>>that we didn't have Math until automated theorem provers came along.
> 
> 
> No - not at all.  First of all, mathematics has matured quite a bit
> since the early days.  I'm sure you've heard of the axiomatic method.
> However, what you are missing is that to prove that your program has
> types, you first need to specify a type system.  Similarly, to prove
> something in math you start by specifying [fill in the rest].

I agree, to make the comparison perfect, you'd need to define a type 
system.  But that's been done in various cases.  So is your complaint 
simply that most programmers are working with informal type systems? 
I've already stipulated that.

However, I think that you want to suggest that those programmers are not 
working with type systems at all.

This reminds me of a comedy skit which parodied the transparency of 
Superman's secret identity: Clark Kent is standing in front of Lois Lane 
and removes his glasses for some reason.  Lois looks confused and says 
"where did Clark go?"  Clark quickly puts his glasses back on, and Lois 
breathes a sigh of relief, "Oh, there you are, Clark".

The problem we're dealing with in this case is that anything that's not 
formally defined is essentially claimed to not exist.  It's lucky that 
this isn't really the case, otherwise much of the world around us would 
vanish in a puff of formalist skepticism.

We're discussing systems that operate on an informal basis: in this 
case, the reasoning about the classification of values which flow 
through terms in a dynamically-checked language.  If you can produce a 
useful formal model of those systems that doesn't omit significant 
information, that's great, and I'm all ears.

However, claiming that e.g. using a universal type is a complete model 
of what's happening misses the point: it doesn't account at all for the 
reasoning process I've just described.

>>1. "Untyped" is really quite a misleading term, unless you're talking 
>>about something like the untyped lambda calculus.  That, I will agree, 
>>can reasonably be called untyped.
> 
> 
> Untyped is not misleading.  "Typed" is not a synonym for "safe" or
> "having well-defined semantics".

Again, your two suggested replacements don't come close to capturing 
what I'm talking about.  Without better alternatives, "type" is the 
closest appropriate term.  I'm qualifying it with the term "latent", 
precisely to indicate that I'm not talking about formally-defined types.

I'm open to alternative terminology or ways of characterizing this, but 
they need to address issues that exist outside the boundaries of formal 
type systems, so simply applying terms from formal type theory is not 
usually sufficient.

>>So, will y'all just switch from using "dynamically typed" to "latently 
>>typed"
> 
> 
> I won't (use "latently typed").  At least not without further
> qualification.

This and my other recent post give a fair amount of qualification, so 
let me know if you need anything else to be convinced. :)

But to be fair, I'll start using "untyped" if you can come up with a 
satisfactory answer to the two questions I asked above, just before I 
used the word "silliness".

Anton
From: Chris Smith
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <MPG.1f050051210205e29896e4@news.altopia.net>
Anton van Straaten wrote:
> Vesa Karvonen wrote:
> > 
> > This static vs dynamic type thing reminds me of one article written by
> > Bjarne Stroustrup where he notes that "Object-Oriented" has become a
> > synonym for "good".  More precisely, it seems to me that both camps
> > (static & dynamic) think that "typed" is a synonym for having
> > "well-defined semantics" or being "safe" and therefore feel the need
> > to be able to speak of their language as "typed" whether or not it
> > makes sense.
> 
> I reject this comparison.  There's much more to it than that.

I agree that there's more to it than that.  I also agree, however, that 
Vesa's observation is true, and is a big part of the reason why it's 
difficult to discuss this topic.  I don't recall who said what at this 
point, but earlier today someone else posted -- in this same thread -- 
the idea that static type "advocates" want to classify some languages as 
untyped in order to put them in the same category as assembly language 
programming.  That's something I've never seen, and I think it's far 
from the goal of pretty much anyone; but clearly, *someone* was 
concerned about it.  I don't know if much can be done to clarify this 
rhetorical problem, but it does exist.

The *other* bit that's been brought up in this thread is that the word 
"type" is just familiar and comfortable for programmers working in 
dynamically typed languages, and that they don't want to change their 
vocabulary.

The *third* thing that's brought up is that there is a (to me, somewhat 
vague) conception going around that the two really ARE varieties of the 
same thing.  I'd like to pin this down more, and I hope we get there, 
but for the time being I believe that this impression is incorrect.  At 
the very least, I haven't seen a good way to state any kind of common 
definition that withstands scrutiny.  There is always an intuitive word 
involved somewhere which serves as an escape hatch for the author to 
retain some ability to make a judgement call, and that of course 
sabotages the definition.  So far, that word has varied through all of 
"type", "type error", "verify", and perhaps others... but I've never 
seen anything that allows someone to identify some universal concept of 
typing (or even the phrase "dynamic typing" in the first place) in a way 
that doesn't appeal to intuition.

> The point 
> is that the reasoning which programmers perform when working with a 
> program in a latently-typed language bears many close similarities to 
> the purpose and behavior of type systems.

Undoubtedly, some programmers sometimes perform reasoning about their 
programs which could also be performed by a static type system.  This is 
fairly natural, since static type systems specifically perform tractable 
analyses of programs (Pierce even uses the word "tractable" in the 
definition of a type system), and human beings are often (though not 
always) best-served by trying to solve tractable problems as well.

> There are reasons to connect 
> it to type theory, and if you can't see those reasons, you need to be 
> more explicit about why.

Let me pipe up, then, as saying that I can't see those reasons; or at 
least, if I am indeed seeing the same reasons that everyone else is, 
then I am unconvinced by them that there's any kind of rigorous 
connection at all.

> I'm suggesting that if a language classifies and tags values in a way 
> that supports the programmer in static reasoning about the behavior of 
> terms, that calling it "untyped" does not capture the entire picture, 
> even if it's technically accurate in a restricted sense (i.e. in the 
> sense that terms don't have static types that are known within the 
> language).

It is, nevertheless, quite appropriate to call the language "untyped" if 
I am considering static type systems.  I seriously doubt that this usage 
in any way misleads anyone into assuming the absence of any mental 
processes on the part of the programmer.  I hope you agree.  If not, 
then I think you significantly underestimate a large category of people.

> The first point I was making is that *automated* checking has very 
> little to do with anything, and conflating static types with automated 
> checking tends to lead to a lot of confusion on both sides of the 
> static/dynamic fence.

I couldn't disagree more.  Rather, when you're talking about static 
types (or just "types" in most research literature that I've seen), then 
the realm of discussion is specifically defined to be the very set of 
errors that are automatically caught and flagged by the language 
translator.  I suppose that it is possible to have an unimplemented type 
system, but it would be unimplemented only because someone hasn't felt 
the need nor gotten around to it.  Being implementABLE is a crucial part 
of the definition of a static type system.

I am beginning to suspect that you're making the converse of the error I 
made earlier in the thread.  That is, you may be saying things regarding 
the psychological processes of programmers and such that make sense when 
discussing dynamic types, and in any case I haven't seen any kind of 
definition of dynamic types that is more acceptable yet; but it's 
completely irrelevant to static types.  Static types are not fuzzy -- if 
they were fuzzy, they would cease to be static types -- and they are not 
a phenomenon of psychology.  To try to redefine static types in this way 
not only ignores the very widely accepted basis of an entire field of 
existing literature, but also leads to false ideas such as that there is 
some specific definable set of problems that type systems are meant to 
solve.

Skipping ahead a bit...

> I agree, to make the comparison perfect, you'd need to define a type 
> system.  But that's been done in various cases.

I don't think that has been done, in the case of dynamic types.  It has 
been done for static types, but much of what you're saying here is in 
contradiction to the definition of a type system in that sense of the 
word.

> The problem we're dealing with in this case is that anything that's not 
> formally defined is essentially claimed to not exist.

I see it as quite reasonable when there's an effort by several 
participants in this thread to either imply or say outright that static 
type systems and dynamic type systems are variations of something 
generally called a "type system", and given that static type systems are 
quite formally defined, that we'd want to see a formal definition for a 
dynamic type system before accepting the proposition that they are of a 
kind with each other.  So far, all the attempts I've seen to define a 
dynamic type system seem to reduce to just saying that there is a well-
defined semantics for the language.

I believe that's unacceptable for several reasons, but the most 
significant of them is this.  It's not reasonable to ask anyone to 
accept that static type systems gain their essential "type system-ness" 
from the idea of having well-defined semantics.  From the perspective of 
a statically typed language, this looks like a group of people getting 
together and deciding that the real "essence" of what it means to be a 
type system is... and then naming something that's so completely non-
essential that we don't generally even mention it in lists of the 
benefits of static types, because we have already assumed that it's true 
of all languages except C, C++, and assembly language.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Chris Smith
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <MPG.1f0503ddb2da16769896e7@news.altopia.net>
Chris Smith <·······@twu.net> wrote:
> I see it as quite reasonable when there's an effort by several 
> participants in this thread to either imply or say outright that static 
> type systems and dynamic type systems are variations of something 
> generally called a "type system" [...]

I didn't say that right.  Obviously, no one is making all that great an 
EFFORT to say anything.  Typing is not too difficult, after all.  What I 
meant is that there's an argument being made to that effect.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Anton van Straaten
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <lWLmg.102756$H71.27944@newssvr13.news.prodigy.com>
Chris Smith wrote:
> I don't recall who said what at this 
> point, but earlier today someone else posted -- in this same thread -- 
> the idea that static type "advocates" want to classify some languages as 
> untyped in order to put them in the same category as assembly language 
> programming.  That's something I've never seen, and I think it's far 
> from the goal of pretty much anyone; but clearly, *someone* was 
> concerned about it.  I don't know if much can be done to clarify this 
> rhetorical problem, but it does exist.

For the record, I'm not concerned about that problem as such.  However, 
I do think that characterizations of dynamically typed languages from a 
static type perspective tend to oversimplify, usually because they 
ignore the informal aspects which static type systems don't capture.

Terminology is a big part of this, so there are valid reasons to be 
careful about how static type terminology and concepts are applied to 
languages which lack formally defined static type systems.

> The *other* bit that's been brought up in this thread is that the word 
> "type" is just familiar and comfortable for programmers working in 
> dynamically typed languages, and that they don't want to change their 
> vocabulary.

What I'm suggesting is actually a kind of bridge between the two 
positions.  The dynamically typed programmer tends to think in terms of 
values having types, rather than variables.  What I'm pointing out is 
that even those programmers reason about something much more like static 
types than they might realize; and that there's a connection between 
that reasoning and static types, and also a connection to the tags 
associated with values.

If you wanted to take the word "type" and have it mean something 
reasonably consistent between the static and dynamic camps, what I'm 
suggesting at least points in that direction.  Obviously, nothing in the 
dynamic camp is perfectly isomorphic to a real static type, which is why 
I'm qualifying the term as "latent type", and attempting to connect it 
to both static types and to tags.

> The *third* thing that's brought up is that there is a (to me, somewhat 
> vague) conception going around that the two really ARE varieties of the 
> same thing.  I'd like to pin this down more, and I hope we get there, 
> but for the time being I believe that this impression is incorrect.  At 
> the very least, I haven't seen a good way to state any kind of common 
> definition that withstands scrutiny.  There is always an intuitive word 
> involved somewhere which serves as an escape hatch for the author to 
> retain some ability to make a judgement call, and that of course 
> sabotages the definition.  So far, that word has varied through all of 
> "type", "type error", "verify", and perhaps others... but I've never 
> seen anything that allows someone to identify some universal concept of 
> typing (or even the phrase "dynamic typing" in the first place) in a way 
> that doesn't appeal to intuition.

It's obviously going to be difficult to formally pin down something that 
is fundamentally informal.  It's fundamentally informal because if 
reasoning about the static properties of terms in DT languages were 
formalized, it would essentially be something like a formal type system.

However, there are some pretty concrete things we can look at.  One of 
them, which as I've mentioned elsewhere is part of what led me to my 
position, is to look at what a soft type inferencer does.  It takes a 
program in a dynamically typed language, and infers a static type scheme 
for it (obviously, it also defines an appropriate type system for the 
language.)  This has been done in both Scheme and Erlang, for example.

How do you account for such a feat in the absence of something like 
latent types?  If there's no static type-like information already 
present in the program, how is it possible to assign a static type 
scheme to a program without dramatically modifying its source?

I think it's reasonable to look at a situation like that and conclude 
that even DT programs contain information that corresponds to types. 
Sure, it's informal, and sure, it's usually messy compared to an 
explicitly defined equivalent.  But the point is that there is 
"something" there that looks so much like static types that it can be 
identified and formalized.
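To sketch that concretely (a toy illustration of my own, not any real soft typing tool; the names `infer` and `require` are made up): even a tiny untyped expression carries constraints from which a type scheme can be recovered.

```python
# Toy sketch of soft-type-style inference over a tiny untyped
# expression language.  Variables are written ('var', name); the only
# operations are '+' (integer addition) and 'len' (string length).

def require(expr, expected, env):
    """Check expr against expected, recording types for variables."""
    if isinstance(expr, tuple) and expr[0] == 'var':
        name = expr[1]
        seen = env.setdefault(name, expected)  # first use fixes the type
        if seen != expected:
            raise TypeError('%s used as both %s and %s'
                            % (name, seen, expected))
        return
    actual = infer(expr, env)
    if actual != expected:
        raise TypeError('expected %s, got %s' % (expected, actual))

def infer(expr, env):
    """Infer a type for expr, filling in variable types as a side effect."""
    if isinstance(expr, int):
        return 'int'
    if isinstance(expr, str):
        return 'str'
    op = expr[0]
    if op == 'var':
        return env.get(expr[1], 'unknown')
    if op == '+':                       # both operands must be ints
        require(expr[1], 'int', env)
        require(expr[2], 'int', env)
        return 'int'
    if op == 'len':                     # operand must be a string
        require(expr[1], 'str', env)
        return 'int'
    raise ValueError('unknown expression: %r' % (expr,))

# The untyped expression len(s) + n never declares types for s or n,
# yet its latent constraints determine them:
env = {}
print(infer(('+', ('len', ('var', 's')), ('var', 'n')), env))  # int
print(env)   # {'s': 'str', 'n': 'int'}
```

The point of the sketch is just that the type information was already there in the untyped source; the inferencer only makes it explicit.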

> Undoubtedly, some programmers sometimes perform reasoning about their 
> programs which could also be performed by a static type system.  

I think that's a severe understatement.  Programmers always reason about 
things like the types of arguments, the types of variables, the return 
types of functions, and the types of expressions.  They may not do 
whole-program inference and proof in the way that a static type system 
does, but they do it locally, all the time, every time.

BTW, I notice you didn't answer any of the significant questions which I 
posed to Vesa.  So let me pose one directly to you: how should I rewrite 
the first sentence in the preceding paragraph to avoid appealing to an 
admittedly informal notion of type?  Note, also, that I'm using the word 
in the sense of static properties, not in the sense of tagged values.

>>There are reasons to connect 
>>it to type theory, and if you can't see those reasons, you need to be 
>>more explicit about why.
> 
> 
> Let me pipe up, then, as saying that I can't see those reasons; or at 
> least, if I am indeed seeing the same reasons that everyone else is, 
> then I am unconvinced by them that there's any kind of rigorous 
> connection at all.

For now, I'll stand on what I've written above.  When I see if or how 
that doesn't convince you, I can go further.

> It is, nevertheless, quite appropriate to call the language "untyped" if 
> I am considering static type systems.  

I agree that in the narrow context of considering to what extent a 
dynamically typed language has a formal static type system, you can call 
it untyped.  However, what that essentially says is that formal type 
theory doesn't have the tools to deal with that language, and you can't 
go much further than that.  As long as that's what you mean by untyped, 
I'm OK with it.

> I seriously doubt that this usage 
> in any way misleads anyone into assuming the absence of any mental 
> processes on the part of the programmer.  I hope you agree.  

I didn't suggest otherwise (or didn't mean to).  However, the term 
"untyped" does tend to lead to confusion, to a lack of recognition of 
the significance of all the static information in a DT program that is 
outside the bounds of a formal type system, and the way that runtime tag 
checks relate to that static information.

One misconception that occurs is the assumption that all or most of the 
static type information in a statically-typed program is essentially 
nonexistent in a dynamically-typed program, or at least is no longer 
statically present.  That can easily be demonstrated to be false, of 
course, and I'm not arguing that experts usually make this mistake.

> If not, 
> then I think you significantly underestimate a large category of people.

If you think there's no issue here, I think you significantly 
overestimate a large category of people.  Let's declare that line of 
argument a draw.

>>The first point I was making is that *automated* checking has very 
>>little to do with anything, and conflating static types with automated 
>>checking tends to lead to a lot of confusion on both sides of the 
>>static/dynamic fence.
> 
> 
> I couldn't disagree more.  Rather, when you're talking about static 
> types (or just "types" in most research literature that I've seen), then 
> the realm of discussion is specifically defined to be the very set of 
> errors that are automatically caught and flagged by the language 
> translator.  I suppose that it is possible to have an unimplemented type 
> system, but it would be unimplemented only because someone hasn't felt 
> the need nor gotten around to it.  Being implementABLE is a crucial part 
> of the definition of a static type system.

I agree with the latter sentence.  However, it's nevertheless the case 
that it's common to confuse "type system" with "compile-time checking". 
  This doesn't help reasoning in debates like this, where the existence 
of type systems in languages that don't have automated static checking 
is being examined.

> I am beginning to suspect that you're making the converse of the error I 
> made earlier in the thread.  That is, you may be saying things regarding 
> the psychological processes of programmers and such that make sense when 
> discussing dynamic types, and in any case I haven't seen any kind of 
> definition of dynamic types that is more acceptable yet; but it's 
> completely irrelevant to static types.  Static types are not fuzzy -- if 
> they were fuzzy, they would cease to be static types -- and they are not 
> a phenomenon of psychology.  To try to redefine static types in this way 
> not only ignores the very widely accepted basis of an entire field of 
> existing literature, but also leads to false ideas such as that there is 
> some specific definable set of problems that type systems are meant to 
> solve.

I'm not trying to redefine static types.  I'm observing that there's a 
connection between the static properties of untyped programs, and static 
types; and attempting to characterize that connection.

You need to be careful about being overly formalist, considering that in 
real programming languages, the type system does have a purpose which 
has a connection to informal, fuzzy things in the real world.  If you 
were a pure mathematician, you might get away with claiming that type 
systems are just a self-contained symbolic game which doesn't need any 
connections beyond its formal ruleset.

Think of it like this: the more ambitious a language's type system is, 
the fewer uncaptured static properties remain in the code of programs in 
that language.  However, there are plenty of languages with rather weak 
static type systems.  In those cases, code has more static properties 
that aren't captured by the type system.  I'm pointing out that in many 
of these cases, those properties resemble types, to the point that it 
can make sense to think of them and reason about them as such, applying 
the same sort of reasoning that an automated type inferencer applies.

If you disagree, then I'd be interested to hear your answers to the two 
questions I posed to Vesa, and the related one I posed to you above, 
about what else to call these things.

>>I agree, to make the comparison perfect, you'd need to define a type 
>>system.  But that's been done in various cases.
> 
> 
> I don't think that has been done, in the case of dynamic types.  

I was thinking of the type systems designed for soft type inferencers; 
as well as those cases where e.g. a statically-typed subset of an 
untyped language is defined, as in the case of PreScheme.

But in such cases, you end up where a program in these systems, while in 
some sense statically typed, is also a valid untyped program.  There's 
also nothing to stop someone familiar with such things from programming 
in a 
type-aware style - in fact, books like Felleisen et al's "How to Design 
Programs" encourage that, recommending that functions be annotated with 
comments expressing their type. Examples:

;; product : (listof number) -> number

;; copy : N X -> (listof X)

You also see something similar in e.g. many Erlang programs.  In these 
cases, reasoning about types is done explicitly by the programmer, and 
documented.
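A rough analogue in another dynamically typed language (hypothetical code of my own, not from HtDP or any Erlang program): the latent type can live in a comment, or be made machine-checkable with optional annotations, without changing the program's behavior at all.

```python
from typing import List

# product : (listof number) -> number   <- the HtDP-style latent type,
# documented but not checked by anything
def product(numbers):
    result = 1
    for n in numbers:
        result *= n
    return result

# The same latent type, written so that a static checker (e.g. mypy)
# could verify it; the runtime behavior is unchanged:
def product_annotated(numbers: List[float]) -> float:
    result = 1.0
    for n in numbers:
        result *= n
    return result

print(product([2, 3, 4]))             # 24
print(product_annotated([2.0, 3.0]))  # 6.0
```

Either way, the programmer reasoned about the same "(listof number) -> number" property; the only question is what to call it.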

What would you call the descriptions in those comments?  Once you tell 
me what I should call them other than "type" (or some qualified variant 
such as "latent type"), then we can compare terminology and decide which 
is more appropriate.

> It has 
> been done for static types, but much of what you're saying here is in 
> contradiction to the definition of a type system in that sense of the 
> word.

Which is why I'm using a qualified version of the term.

>>The problem we're dealing with in this case is that anything that's not 
>>formally defined is essentially claimed to not exist.
> 
> 
> I see it as quite reasonable when there's an effort by several 
> participants in this thread to either imply or say outright that static 
> type systems and dynamic type systems are variations of something 
> generally called a "type system", and given that static type systems are 
> quite formally defined, that we'd want to see a formal definition for a 
> dynamic type system before accepting the proposition that they are of a 
> kind with each other.  

A complete formal definition of what I'm talking about may be impossible 
in principle, because if you could completely formally define it, you'd 
have a static type system.

If that makes you throw up your hands, then all you're saying is that 
you're unwilling to deal with a very real phenomenon that has obvious 
connections to type theory, examples of which I've given above.  That's 
your choice, but at the same time, you have to give up exclusive claim 
to any variation of the word "type".

Terms are used in a context, and it's perfectly reasonable to call 
something a "latent type" or even a "dynamic type" in a certain context 
and point out connections between those terms and their cousins (or, if 
you insist, their completely unrelated namesakes), static types.

> So far, all the attempts I've seen to define a 
> dynamic type system seem to reduce to just saying that there is a well-
> defined semantics for the language.

That's a pretty strong claim, considering you have so far ducked the 
most important questions I raised in the post you replied to.

> I believe that's unacceptable for several reasons, but the most 
> significant of them is this.  It's not reasonable to ask anyone to 
> accept that static type systems gain their essential "type system-ness" 
> from the idea of having well-defined semantics.  

The definition of static type system is not in question.  However, 
realistically, as I pointed out above, you have to acknowledge that type 
systems exist in, and are inextricably connected to, a larger, less 
formal context.  (At least, you have to acknowledge that if you're 
interested in programs that do anything related to the real world.)  And 
outside the formally defined borders of static type systems, there are 
static properties that bear a pretty clear relationship to types.

Closing your eyes to this and refusing to acknowledge any connection 
doesn't achieve anything.  In the absence of some other account of the 
phenomena in question, (an informal version of) types turn out to be a 
pretty convenient way to deal with the situation.

> From the perspective of 
> a statically typed language, this looks like a group of people getting 
> together and deciding that the real "essence" of what it means to be a 
> type system is... 

There's a sense in which one can say that yes, the informal types I'm 
referring to have an interesting degree of overlap with static types; 
and also that static types do, loosely speaking, have the "purpose" of 
formalizing the informal properties I'm talking about.

But I hardly see why such an informal characterization should bother 
you.  It doesn't affect the definition of static type.  It's not being 
held up as a foundation for type theory.  It's simply a way of dealing 
with the realities of programming in a dynamically-checked language.

There are some interesting philosophical issues there, to be sure 
(although only if you're willing to stray outside the formal), but you 
don't have to worry about those unless you want to.

> and then naming something that's so completely non-
> essential that we don't generally even mention it in lists of the 
> benefits of static types, because we have already assumed that it's true 
> of all languages except C, C++, and assembly language.

This is based on the assumption that all we're talking about is 
"well-defined semantics".  However, there's much more to it than that. 
I need to hear your characterization of the properties I've described 
before I can respond.

Anton
From: Chris Smith
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <MPG.1f06508710eb95809896f1@news.altopia.net>
I thought about this in the context of reading Anton's latest post to 
me, but I'm just throwing out an idea.  It's certainly too fuzzy in my 
mind at the moment to consider it in any way developed.  I'm sure it's 
just as problematic, if not more so, as any existing accounts of dynamic 
types.  Here it is, anyway, just because it's a different way of seeing 
things.

(I'll first point out here, because it's about to become significant, 
that contrary to what's been said several times, static type systems 
assign types to expressions rather than variables.)

First of all, it's been pointed out several times that static typing 
assumes a static program, and that it becomes difficult to describe if 
the program itself is being modified as it is evaluated.  The 
interesting thing that occurred to me earlier today is that for 
languages that are reducible to the lambda calculus at least, ALL 
programs are from some perspective self-modifying.  Specifically, no 
single expression is ever evaluated more than once.  When a statement is 
evaluated several times, that simply means that there is actually a more 
complex statement that, when it is evaluated, writes a new statement, 
which consists of a copy of itself, plus something else.  Thus it makes 
little sense, from that perspective, to worry about the possibility that 
an expression might have a different type each time it's evaluated, and 
therefore violate static typing.  It is never evaluated more than once.

What if I were to describe a dynamically typed system as one in which 
expressions have types just as they do in a static type system, but in 
which programs are seen as self-modifying?  Rather than limiting forms 
of recursion for well-typed programs to, say, the standard fixed-point 
operator and then assigning some special-purpose rule for it, we simply 
admit that the program is being modified as it is run.  Type inference 
(and therefore checking) would need to be performed incrementally.  When 
it is, the results of evaluating everything up to that point will have 
changed the type environment, providing more information than was 
previously known.  The type environment for assigning types to an 
expression could still be treated statically at the time that the 
expression first comes into existence; but it would be different then 
from what it was at the beginning of the program.
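For what it's worth, here is a toy sketch of that incremental view (entirely my own construction; `type_of` and `run` are made-up names, and integer literals and variable references are all it handles): each statement is checked statically against the type environment built up by the statements before it, rather than against a whole-program environment fixed in advance.

```python
def type_of(expr, type_env):
    """Assign a type to one expression in the current environment."""
    if isinstance(expr, int):
        return 'int'                      # integer literal
    if isinstance(expr, str) and expr in type_env:
        return type_env[expr]             # reference to an earlier binding
    if isinstance(expr, tuple) and expr[0] == '+':
        a = type_of(expr[1], type_env)
        b = type_of(expr[2], type_env)
        if a != b:
            raise TypeError('cannot add %s and %s' % (a, b))
        return a
    raise TypeError('unbound or unknown expression: %r' % (expr,))

def run(program):
    """Check each statement as the 'program' unfolds, one at a time."""
    type_env = {}
    for name, expr in program:
        # Checking is static -- it happens before this statement runs --
        # but against an environment that earlier statements have
        # already enriched.
        type_env[name] = type_of(expr, type_env)
    return type_env

# Each statement is well-typed at the moment it comes into existence,
# even though no type was ever assigned to the program as a whole:
print(run([('x', 1), ('y', ('+', 'x', 2)), ('z', ('+', 'y', 'x'))]))
# {'x': 'int', 'y': 'int', 'z': 'int'}
```

Again, this is only meant to illustrate the shape of the idea, not to be a reasonable implementation.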

Of course, this is only a formal description of the meaning, and not a 
description of any kind of a reasonable implementation.  However, this 
initially appears, on the surface, to take us some distance toward 
unifying dynamic types and static types.

At least two interesting correlations just sort of "fall out".

First, Marshall has observed (and I agree) that static and dynamic types 
form not different ways of looking at verifying correctness, but rather 
different forms of programming in the first place.  It appears to be 
encouraging, then, that this perspective classifies static and dynamic 
types based on how we define the program rather than how we define the 
type checking.

Second, type theory is sometimes concerned with the distinction between 
structural and nominal types and various hybrids.  We've concerned 
ourself a good bit in this thread with exactly what kinds of "tags" or 
other runtime type notations are acceptable for defining a dynamically 
typed language.  In this view, Smalltalk (or some simplification thereof 
wherein doesNotUnderstand is considered an error handling mechanism 
rather than a first-class part of the language semantics) would 
represent the dynamic version of structural types, whereas type 
"tagging" systems with a user-declared finite set of type tags would be 
the dynamic equivalent to nominal types.  Just like in type theory, 
dynamic type systems can also have hybrids between structural and 
nominal types.
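A rough Python analogue of that distinction (my own sketch, not drawn from Smalltalk or any tagging system in particular): duck typing accepts anything with the right structure, much like a dynamic structural type, while an isinstance check against a declared class behaves like a dynamic nominal tag.

```python
class Duck:
    def quack(self):
        return 'quack'

class Robot:
    def quack(self):          # same structure, different "tag"
        return 'beep-quack'

def structural(x):
    # Smalltalk-style: anything that responds to the message is
    # accepted; failure would surface as the dynamic analogue of
    # doesNotUnderstand (here, AttributeError).
    return x.quack()

def nominal(x):
    # Tag-style: only values carrying the declared tag are accepted,
    # regardless of what messages they respond to.
    if not isinstance(x, Duck):
        raise TypeError('expected a Duck')
    return x.quack()

print(structural(Robot()))    # beep-quack: structure suffices
try:
    nominal(Robot())
except TypeError as e:
    print(e)                  # expected a Duck: tag mismatch
```

The hybrid cases mentioned above would correspond to checking some combination of the tag and the available methods.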

Obviously, that has a lot of problems... some of the same ones that I've 
been complaining about.  However, it has some interesting 
characteristics as well.  If an expression is assumed to be different 
each time it is encountered, then the type of "one" expression is 
easy to evaluate in the initial environment in which it occurs (for some 
well-defined order of evaluation).

Now, to Anton's response, with admittedly rather liberal skimming 
(sorry!) to avoid my staying up all night thinking about this.

Anton wrote:
> It's obviously going to be difficult to formally pin down something that 
> is fundamentally informal.  It's fundamentally informal because if 
> reasoning about the static properties of terms in DT languages were 
> formalized, it would essentially be something like a formal type system.

We may be talking past each other, then, but I'm not particularly 
interested in the informal aspects of what makes up a type.  I agree 
that formalizing the notion appears quite difficult, especially if my 
conjecture above were to become part of it.

> However, there are some pretty concrete things we can look at.  One of 
> them, which as I've mentioned elsewhere is part of what led me to my 
> position, is to look at what a soft type inferencer does.  It takes a 
> program in a dynamically typed language, and infers a static type scheme 
> for it (obviously, it also defines an appropriate type system for the 
> language.)  This has been done in both Scheme and Erlang, for example.

> How do you account for such a feat in the absence of something like 
> latent types?

There are undoubtedly any number of typing relations and systems of 
rules which can be assigned to any program in a dynamically typed 
language.  I think the more interesting question is how these type 
inference systems choose between the limitless possibilities in a way 
that tends to be acceptable.  I suspect that they do it by assuming 
certain kinds of type systems, and then finding them.  I doubt they are 
going to find a type system in some language until the basics of that 
system have been fed to the software as something to look for.

I'd be delighted to discover that I'm wrong.  If nothing else, it would 
mean that applying these programs to significant amounts of Scheme and 
Erlang software would become an invaluable tool for doing research in 
type theory.

> > Undoubtedly, some programmers sometimes perform reasoning about their 
> > programs which could also be performed by a static type system.  
> 
> I think that's a severe understatement.

Yes, I agree.  I did not intend the connotations that ended up there.  
In fact, I'd say that something like 90% (and yes, I just grabbed that 
number out of thin air) of reasoning performed by programmers about 
their programs could be performed by a static type system.

> BTW, I notice you didn't answer any of the significant questions which I 
> posed to Vesa.  So let me pose one directly to you: how should I rewrite 
> the first sentence in the preceding paragraph to avoid appealing to an 
> admittedly informal notion of type?

I'll try to answer your question.  My answer is that I wouldn't try.  I 
would, instead, merely point out that the notion of type is informal, 
and that it is not necessarily even meaningful.  The two possibilities 
that concern me are that either reasoning with types in this informal 
sense is just a word, which gets arbitrarily applied to some things but 
not others such that distinctions are made that have no real value, or 
that any sufficient definition of type-based reasoning is going to be 
broad enough to cover most reasoning that programmers ever do, once we 
start applying enough rigor to try to distinguish between what counts as 
a type and what doesn't.  The fact that programmers think about types is 
not disputed; my concern is that the types we think about may not 
meaningfully exist.

> Note, also, that I'm using the word 
> in the sense of static properties, not in the sense of tagged values.

In the static sense of types, as I just wrote to Chris Uppal, I can then 
definitively assert at least the second of the two possibilities above.  
Every possible property of program execution is a type once the problem 
of how to statically check it has been solved.

> For now, I'll stand on what I've written above.  When I see if or how 
> that doesn't convince you, I can go further.

To be clear, I am certainly convinced that people talk and think about 
types and also about other things regarding their programs which they do 
not call types.  I am concerned only with how accurate it is to make 
that distinction, on something other than psychological grounds.  If we 
are addressing different problems, then so be it.  If however, you 
believe there is a type/non-type distinction that can be built on some 
logical foundation, then I'd love to hash it out.

> > It is, nevertheless, quite appropriate to call the language "untyped" if 
> > I am considering static type systems.  
> 
> I agree that in the narrow context of considering to what extent a 
> dynamically typed language has a formal static type system, you can call 
> it untyped.

OK.

> However, what that essentially says is that formal type 
> theory doesn't have the tools to deal with that language, and you can't 
> go much further than that.  As long as that's what you mean by untyped, 
> I'm OK with it.

I can see that in a sense... yes, if a specific program has certain 
properties, then it may be possible to extend the proof-generator (i.e., 
the type checker) to be able to prove those properties.  However, I 
remain unconvinced that the same is true of the LANGUAGE.  Comments in 
this thread indicate to me that even if I could write a type-checker 
that checks correctness in a general-purpose programming language, and 
that language was Scheme, there would be Scheme programmers who would 
prefer not to use my Scheme+Types language because it would pester them 
until they got everything perfect before they could run their code, and 
Scheme would remain a separate dynamically typed (i.e., from a type 
theory perspective, untyped) language despite my having achieved all of 
the wildest dreams of the field of type theory.  In that sense, I don't 
consider this a limitation of type theory.

(Two notes: first, I don't actually believe that this is possible, since 
the descriptions people would give for "correct" are nowhere near so 
formal as the type theory that would try to check them; and second, I am 
not so sure that I'd want to use this new language either, for anything 
except applications in which bugs are truly life-threatening.)

> Think of it like this: the more ambitious a language's type system is, 
> the fewer uncaptured static properties remain in the code of programs in 
> that language.  However, there are plenty of languages with rather weak 
> static type systems.  In those cases, code has more static properties 
> that aren't captured by the type system.

Yes, I agree with this.

> I'm pointing out that in many 
> of these cases, those properties resemble types, to the point that it 
> can make sense to think of them and reason about them as such, applying 
> the same sort of reasoning that an automated type inferencer applies.

It often makes sense to apply the same sort of reasoning as would be 
involved in static type inference.  I also agree that there are 
properties called types, which are frequently the same properties caught 
by commonly occurring type systems, and which enough developers already 
think of under the word "type" that it is not worth trying to change 
that usage, EXCEPT when there is an explicit discussion going on that 
touches on type theory.  In that latter case, though, we need to be very 
careful about defining some group of things as not-types across the 
scope of all possible type systems.  There are no such things a priori, 
and even when you take into account the common requirement that a type 
system be tractable, the set of things that are not-types because it's 
provably impossible to check them in polynomial time is smaller than the 
intuitive sense of the word suggests.

Essentially, this use of "type" becomes problematic when it is used to 
draw conclusions about the scope, nature, purpose, etc. of static types.

This answers a good number of the following concerns.

> > I see it as quite reasonable when there's an effort by several 
> > participants in this thread to either imply or say outright that static 
> > type systems and dynamic type systems are variations of something 
> > generally called a "type system", and given that static type systems are 
> > quite formally defined, that we'd want to see a formal definition for a 
> > dynamic type system before accepting the proposition that they are of a 
> > kind with each other.  
> 
> A complete formal definition of what I'm talking about may be impossible 
> in principle, because if you could completely formally define it, you'd 
> have a static type system.

That may not actually be the case.  Lots of things may be formally 
defined that would nevertheless be different from a static type system.  
For example, if someone did provide a rigorous definition of some 
category of type errors that could be recognized by some defined process 
as such, then the set of language features that detect and react to 
these errors could be defined as a type system.  Chris Uppal just posted 
such a thing in a different part of this thread, and I'm unsure at this 
point whether it works or not.  The problem with earlier attempts to 
define type errors in this thread is that they either tend to include 
all possible errors, or are distinguished by some nebulously defined 
condition such as being commonly thought of in some way by some set of 
human beings.

> If that makes you throw up your hands, then all you're saying is that 
> you're unwilling to deal with a very real phenomenon that has obvious 
> connections to type theory, examples of which I've given above.  That's 
> your choice, but at the same time, you have to give up exclusive claim 
> to any variation of the word "type".

I really think it's been a while since I claimed any such thing... and 
even at the beginning, I only meant to challenge that use of type in 
comparing static and dynamic type systems.

I am not so sure, though, about these obvious connections to type 
theory.  Those connections may be, for example, some combination of the 
following rather uninteresting phenomena:

1. Type theory got the word "type" from somewhere a century ago, and the 
word has continued to be used in its original sense as well, and it just 
happens that despite the divergence of these uses, there remains some 
common ground between the two different meanings.

or, 2. Many programmers approach type theory from the perspective of 
having some intuitive definition of a type in mind, and they tend to mix 
it in with a few of the ideas of type theory in their minds, and then go 
off and use the word in other contexts.

> Terms are used in a context, and it's perfectly reasonable to call 
> something a "latent type" or even a "dynamic type" in a certain context
> and point out connections between those terms and their cousins (or if 
> you insist, their completely unrelated namesakes) static types.

Yes, I agree.  I've yet to be convinced that they are completely 
unrelated namesakes.  I am actually trying to reconstruct such a thing.  
The problem is that, knowing how "type" is used in the static sense, I 
am forced to reject attempts that share only basic superficial 
similarities and yet miss the main point.

> > I believe that's unacceptable for several reasons, but the most 
> > significant of them is this.  It's not reasonable to ask anyone to 
> > accept that static type systems gain their essential "type system-ness" 
> > from the idea of having well-defined semantics.  
> 
> The definition of static type system is not in question.  However, 
> realistically, as I pointed out above, you have to acknowledge that type 
> systems exist in, and are inextricably connected to, a larger, less 
> formal context.

No, but it certainly seems that the reason behind calling them a "type 
system" is being placed very much in doubt.  Simply saying "static type 
system" is horribly wrong, and I do it only to avoid inventing words 
wholesale... but I greatly regret the implication that it means "a type 
system, which happens to be static."

In some senses, this is akin to describing a boat as "a floating 
platform".  That may be somewhat accurate as a rough definition for 
someone who is familiar with platforms but not with boats.  It's a 
terrible failure, though, in that it implies that being a platform is 
the important bit; where actually the important bit is that it floats!  
In fact, it may turn out that boats have very little to do with common 
platforms... for example, while some boats (barges) have many of the 
same properties that platforms do, there are whole other classes of 
boats, which are considerably more powerful than barges, that end up 
looking very little like a platform at all.  If you press hard enough, I 
can be forced to admit that there is some kind of connection between 
platforms and floating platforms, because after all one has to stand on 
something while using a boat, right?

Well, the important part of the phrase "static type" isn't "type" but 
"static."  Without even having to undertake the dead-end task of 
deciding who's got the right to the word "type", let's assume that I 
should really be saying some other word, like boat.  However, that word 
doesn't really exist, so I am stuck with "static type".  I just have to 
be very careful about explaining that what I mean by type is completely 
different from what others mean by type.  There is some overlap, which 
potentially arises from one of the uninteresting sources I've listed 
above, or maybe... just maybe... arises from some actual similarity that 
we haven't identified yet.  But we still haven't identified it.

This is very much the situation I see ourselves in.  I very much want to 
agree with everyone's right to use the word "type", but then people keep 
saying things that seem to me to be clear misunderstandings: the 
important part is the "static", not the "type", even when qualification 
is not needed.

Fundamentally, the important aspect of types, in the static sense, is 
that they are used in a formal, syntactic, tractable method for proving 
the absence of certain program behaviors.  Even if it turns out that, in 
practice, all type systems contain mechanisms for solving a certain 
rigorously definable class of problems, that will not be part of the 
definition of types, when the word is used in the static sense.  So we 
can search for a more formal similarity, or we can agree that it's 
incidentally true that they sometimes solve the same problems but also 
agree that it's pointless and misleading to ever talk about a "type 
system" without clarifying the usage either explicitly or by context.  
It remains incorrect to relax the degree of formality and talk about 
static types in an intuitive and undefinable way, because then they lose 
the entire property of being types at all.  At that point, it's best to 
drop "static", which is not accurate anyway, and switch over to the 
other set of terminology where it makes sense to call them types.

> There's a sense in which one can say that yes, the informal types I'm 
> referring to have an interesting degree of overlap with static types; 
> and also that static types do, loosely speaking, have the "purpose" of 
> formalizing the informal properties I'm talking about.

Static types can be used to prove ANY property of a program that may be 
proven.  Certainly, research is pragmatically restricted to 
specific problems that are (1) useful and (2) solvable.  Aside from 
that, I can't imagine any competent language designer building a type 
system, but rejecting some useful and solvable feature for the sole 
reason that he or she doesn't think of it as solving a type error.

> But I hardly see why such an informal characterization should bother 
> you.  It doesn't affect the definition of static type.

I hope I've explained how it does affect the definition of a static 
type.

> This is based on the assumption that all we're talking about is 
> "well-defined semantics".

Indeed.  My statement there will not apply if we find some formal 
definition of a dynamic type system that doesn't reduce to well-defined 
semantics once you remove the intuitive bits from it.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Chris F Clark
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <sddac82bc75.fsf@shell01.TheWorld.com>
Chris Smith <·······@twu.net> writes: 
 
> I thought about this in the context of reading Anton's latest post to
> me, but I'm just throwing out an idea. 
 
I think there is some sense of convergence here.  In particular, I 
reason about my program using "unsound types".  That is, I reason 
about my program using types which I can (at least partially) 
formalize, but for which there is no sound axiomatic system.  Now, 
these types are locally sound, i.e. I can prove properties that hold 
for regions of my programs.  However, the complexity of the programs 
prevents me from formally proving the entire program is correct (or 
even entirely type-safe)--I generally work in a statically, but 
weakly-typed language.  Sometimes, for performance reasons, the rules
need to be bent--and the goal is to bend the rules, but not break 
them.  At other times, I want to strengthen the rules, make parts of 
the program more provably correct.
 
How does this work in practice?  Well, there is a set of ideal types 
I would like to use in my program.  Those types would prove that my 
program is correct.  However, there are parts of the program that, for 
performance reasons, need me to widen the ideal types to something 
less tight: pragmatic types.  The system can validate that I 
have not broken the pragmatic types.  That is not tight enough to 
prove the program correct, but it provides some level of security.
 
To improve my confidence in the program, I borrow the tagging concept 
to supplement the static type system.  At key points in the program I 
can check the type tag to ensure that the data being manipulated 
actually matches the unprovable ideal type system.  If it doesn't, I 
have a witness to the error in my reasoning.  It is not perfect, but 
it is better than nothing.
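 
The tagging idea can be sketched as follows (a minimal illustration in 
Python with hypothetical names; the ideal type "non-negative int" 
stands in for any property the static checker cannot express):

```python
# The pragmatic (statically checkable) type of 'balance' is plain int.
# The ideal type -- "non-negative int" -- cannot be expressed in this
# type system, so a runtime tag-style check supplements it.  A failed
# check is a witness to an error in reasoning, not a proof of
# correctness.

def withdraw(balance: int, amount: int) -> int:
    new_balance = balance - amount
    # Check that the value still inhabits the ideal type.
    if new_balance < 0:
        raise ValueError("ideal-type violation: negative balance")
    return new_balance
```

Here withdraw(10, 3) passes the check, while withdraw(5, 8) raises, 
flagging a case the pragmatic int type silently allows.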
 
Why am I in this state?  Because I believe that most interesting
programs are beyond my ability to formally prove, or at least to
prove in "reasonable" time bounds.  Therefore, I prove
(universally quantify) some properties of my programs, but many other
(and most of the "interesting") properties I can only existentially
quantify--it worked in the cases that it was tested on.  It is a
corollary of this that most programs have bugs--places where the
formal reasoning was incomplete (did not reason down to the relevant
interesting property) and the informal reasoning was wrong.

Thus, the two forms of reasoning--static typing to check the formal
properties that it can, and dynamic tagging to catch the errors that
it missed--work like a belt-and-suspenders system.  I would like the
static type system to catch more errors, but I need it to express more
to do so, and many of the properties that are interesting require
n-squared operations to validate.  Equally important, the dynamic tags
help when working in an essentially untrustworthy world.  Not only do
our own programs have bugs, so do the systems we run them on, both
hardware and software.  Moreover, alpha radiation and quantum effects
can change "constants" in a program (some of my work has to do with
soft-error rates on chips, where we are trying to guarantee that the
chip will have a reasonable MTBF for certain algorithms).  That's not
to ignore the effects of malware and other ways your program can be
made less reliable.

Yes, I want a well-proven program, but I also want some guarantee that
the program is actually executing the algorithm specified.  That
requires a certain amount of "redundant" run-time checks.  Knuth had a
quote about a program only being proven and not tested that reflects
the appropriate cynicism.

I consider any case where a program gives a function outside of its
domain a "type error", because an ideal (but unachievable) type system
would have prevented it.  That does not make all errors type errors,
because if I give a program an input within its domain and it
mis-computes the result, that is not a type error.

-Chris
From: Chris F Clark
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <sddac82xs81.fsf@shell01.TheWorld.com>
Chris Smith <·······@twu.net> writes: 
 
> I thought about this in the context of reading Anton's latest post to
> me, but I'm just throwing out an idea. 
 
I wrote:
> I think there is some sense of convergence here.

Apologies for following-up to my own post, but I forgot to describe
the convergence.

The convergence is that there is something more general than what type
theorists talk about when discussing types.  Type theory is an
abstraction and systematization of reasoning about types.  This more
general reasoning doesn't need to be sound; it just needs to be based
around "types" and computations.

My own reasoning uses such an unsound system.  It has clear roots in
sound reasoning, and I wish I could make it sound, but....

In any case, dynamic tagging helps support my unsound reasoning by
providing evidence when I have made an error.

I hope this makes it less mystical for Marshall.

-Chris
From: Chris Smith
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <MPG.1f074872ce4e876e9896f7@news.altopia.net>
Chris F Clark <···@shell01.TheWorld.com> wrote:
> Chris Smith <·······@twu.net> writes: 
> > I thought about this in the context of reading Anton's latest post to
> > me, but I'm just throwing out an idea. 
>  
> I wrote:
> > I think there is some sense of convergence here.
> 
> Apologies for following-up to my own post, but I forgot to describe
> the convergence.
> 
> The convergence is that there is something more general than what type
> theorists talk about when discussing types.  Type theory is an
> abstraction and systematization of reasoning about types.  This more
> general reasoning doesn't need to be sound; it just needs to be based
> around "types" and computations.

Unfortunately, I have to again reject this idea.  There is no such 
restriction on type theory.  Rather, the word type is defined by type 
theorists to mean the things that they talk about.  Something becomes a 
type the instant someone conceives of a type system that uses it.  We 
could try to define a dynamic type system, as you seem to say, as a 
system that checks against only what can be known about some ideal type 
system that would be able to prove the program valid... but I 
immediately begin to wonder if it would ever be possible to prove that 
something is a dynamic type system for a general-purpose (Turing-
complete) language.

If you could define types for a dynamic type system in some stand-alone 
way, then this difficulty could be largely pushed out of the way, 
because we could say that anything that checks types is a type system, 
and then worry about verifying that it's a sound type system without 
worrying about whether it's a subset of the perfect type system.  So 
back to that problem again.

You also wrote:
> I consider any case where a program gives a function outside of its
> domain a "type error", because an ideal (but unachievable) type system
> would have prevented it.  That does not make all errors type errors,
> because if I give a program an input within its domain and it
> mis-computes the result, that is not a type error.

So what is the domain of a function?  (Heck, what is a function?  I'll 
neglect that by assuming that a function is just an expression; 
otherwise, this will get more complicated.)  I see three possible ways 
to look at a domain.

(I need a word here that implies something like a set, but I don't care 
to verify the axioms of set theory.  I'm just going to use set.  Hope 
that's okay.)

1. The domain is the set of inputs to that expression which are going to 
produce a correct result.

2. The domain is the set of inputs that I expected this expression to 
work with when I wrote it.

3. The domain is the set of inputs for which the expression has a 
defined result within the language semantics.

So the problem, then, more clearly stated, is that we need something 
stronger than #3 and weaker than #1, but #2 includes some psychological 
nature that is problematic to me (though, admittedly, FAR less 
problematic than the broader uses of psychology to date.)
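
A small, hypothetical Python example may make the distinction concrete; 
the three sets need not coincide:

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def month_name(m):
    # Domain #3 (defined by language semantics): any m in -11..12,
    #   since Python indexing accepts negative indices without error.
    # Domain #2 (what the author expected): m in 1..12.
    # Domain #1 (inputs producing a correct result): also m in 1..12;
    #   month_name(0) returns "Dec" -- a defined but wrong result.
    return MONTHS[m - 1]
```

month_name(0) is inside #3 but outside #1 and #2, which is exactly the 
gap at issue; month_name(13) falls outside even #3 and raises IndexError.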

Is that a fair description of where we are in defining types, then?

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Chris F Clark
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <sddk676m4jz.fsf@shell01.TheWorld.com>
Chris Smith <·······@twu.net> writes: 
 
> Unfortunately, I have to again reject this idea.  There is no such
> restriction on type theory.  Rather, the word type is defined by type
> theorists to mean the things that they talk about. 
 
Do you reject that there could be something more general than what a 
type theorist discusses?  Or do you reject calling such things a type?
 
Later you write: 
> because we could say that anything that checks types is a type system,
> and then worry about verifying that it's a sound type system without
> worrying about whether it's a subset of the perfect type system. 
 
I'm particularly interested if something unsound (and perhaps 
ambiguous) could be called a type system.  I definitely consider such 
things type systems.  However, I like my definitions very general and 
vague.  Your writing suggests the opposite preference.  To me, if 
something works in a way analogous to a known type system, I tend
to consider it a "type system".  That probably isn't going to be at 
all satisfactory to someone wanting a more rigorous definition.  Of 
course, to my mind, the rigorous definitions are just an attempt to 
capture something that is already known informally and put it on a
more rational foundation.
 
> So what is the domain of a function?  (Heck, what is a function? 
...
> (I need a word here that implies something like a set, but I don't care 
> to verify the axioms of set theory.  I'm just going to use set.  Hope 
> that's okay.) 
 
Yes, I meant the typical HS algebra definition of domain, which is a 
set, same thing for function.  More rigorous definitions can be 
substituted as necessary and appropriate.
 
> 1. The domain is the set of inputs to that expression which are going to 
> produce a correct result. 
> 
> 2. The domain is the set of inputs that I expected this expression to 
> work with when I wrote it. 
> 
> 3. The domain is the set of inputs for which the expression has a 
> defined result within the language semantics. 
> 
> So the problem, then, more clearly stated, is that we need something 
> stronger than #3 and weaker than #1, but #2 includes some psychological 
> nature that is problematic to me (though, admittedly, FAR less 
> problematic than the broader uses of psychology to date.) 
 
Actually, I like 2 quite well.  There is some set in my mind when I'm
writing a particular expression.  It is likely an ill-defined set, but
I don't notice that.  That set is exactly the "type".  I approximate
that set and its relationships to other sets in my program with the
typing machinery of the language I am using.  That is the static (and
well-founded) type.  It is, however, only an approximation to the ideal
"real" type.  If I could make that imaginary mental set concrete (and
well-defined), that would be exactly my concept of type.
 
Now, that overplays the conceptual power in my mind.  My mental image
is influenced by my knowledge of the semantics of the language and
also any implementations I may have used.  Thus, I will weaken 2 to be
closer to 3, because I don't expect a perfectly ideal type system,
just an approximation.  To the extent that I am aware of gotchas in
the language or a relevant implementation, I may write extra code to
try to limit things to model 1.  Note that in my fallible mind, 1 and
2 are identical.  In my hubris, I expect that the extra checks I have
made have restricted my inputs to where models 3 and 1 coincide.

Expanding that a little.  I expect the language to catch type errors
where I violate model 3.  To the extent model 3 differs from model 1
and my code hasn't added extra checks to catch it, I have bugs
resulting from undetected type errors or perhaps modelling errors.  To
me they are type errors, because I expect that there is a typing
system that captures 1 for the program I am writing, even though I
know that that is not generally true.

As I reflect on this more, I really do like 2, in the sense that I
believe there is some abstract Platonic set that is an ideal type
(like 1) and that is what the type system is approximating.  To the
extent that languages approximate either 1 or 2, those facilities are
the type facilities of the language.  That puts me very close to your
(rejected) definition of type as the well-defined semantics of the
language.  Except I think of types as the sets of values that the
language provides, so it isn't the entire semantics of the language,
just that part which divides values into discrete sets which are
appropriate to different operations (and detect inappropriate values).

-Chris
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151199196.870389.81960@p79g2000cwp.googlegroups.com>
Chris F Clark wrote:
>
> I'm particularly interested if something unsound (and perhaps
> ambiguous) could be called a type system.  I definitely consider such
> things type systems.

I don't understand. You are saying you prefer to investigate the
unsound over the sound?


> However, I like my definitions very general and
> vague.  Your writing suggests the opposite preference.

Again, I cannot understand this. In a technical realm, vagueness
is the opposite of understanding. To me, it sounds like you are
saying that you prefer not to understand the field you work in.


> To me, if
> something works in a way analogous to a known type system, I tend
> to consider it a "type system".  That probably isn't going to be at
> all satisfactory to someone wanting a more rigorous definition.

Analogies are one thing; definitions are another.


> Of
> course, to my mind, the rigorous definitions are just an attempt to
> capture something that is already known informally and put it on a
> more rational foundation.

If something is informal and non-rational, it cannot be said to
be "known."  At best, it could be called "suspected."  Even if you
believe something that turns out to be true, we could not say
that you "knew" it unless your reasons for your belief were
valid.
 
I flipped a coin to see who would win the election; it came
up "Bush."  Therefore I *knew* who was going to win the
election before it happened.  See the problem?


Marshall
From: Anton van Straaten
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <_xrng.108673$H71.2451@newssvr13.news.prodigy.com>
Marshall wrote:
> Chris F Clark wrote:
> 
>>I'm particularly interested if something unsound (and perhaps
>>ambiguous) could be called a type system.  I definitely consider such
>>things type systems.
> 
> 
> I don't understand. You are saying you prefer to investigate the
> unsound over the sound?

The problem is that there are no useful sound definitions for the type 
systems (in the static sense) of dynamically-typed languages.  Yet, we 
work with type-like static properties in those languages all the time, 
as I've been describing.

If you want to talk about those properties as though they were types, 
one of the options is what Chris Clark described when he wrote "I reason 
about my program using types which I can (at least partially) formalize, 
but for which there is no sound axiomatic system."

>>However, I like my definitions very general and
>>vague.  Your writing suggests the opposite preference.
> 
> 
> Again, I cannot understand this. In a technical realm, vagueness
> is the opposite of understanding. To me, it sounds like you are
> saying that you prefer not to understand the field you work in.

The issue as I see it is simply that if we're going to work with 
dynamically-typed programs at all, our choices are limited at present, 
when it comes to formal models that capture our informal static 
reasoning about those programs.  In statically-typed languages, this 
reasoning is mostly captured by the type system, but it has no formal 
description for dynamically-typed languages.

>>To me, if
>>something works in a way analogous to a known type system, I tend
>>to consider it a "type system".  That probably isn't going to be at
>>all satisfactory to someone wanting a more rigorous definition.
> 
> 
> Analogies are one thing; definitions are another.

A formal definition is problematic, precisely because we're dealing with 
something that to a large extent is deliberately unformalized.  But as 
Chris Clark pointed out, "these types are locally sound, i.e. I can 
prove properties that hold for regions of my programs."  We don't have 
to rely entirely on analogies, and this isn't something that's entirely 
fuzzy.  There are ways to relate it to formal type theory.

>>Of
>>course, to my mind, the rigorous definitions are just an attempt to
>>capture something that is already known informally and put it on a
>>more rational foundation.
> 
> 
> If something is informal and non-rational, it cannot be said to
> be "known." 

As much as I love the view down into that abyss, we're nowhere near 
being in such bad shape.

We know that we can easily take dynamically-typed program fragments and 
assign them type schemes, and we can make sure that the type schemes 
that we use in all our program fragments use the same type system.

We know that we can assign type schemes to entire dynamically-typed 
programs, and we can even automate this process, although not without 
some practical disadvantages.

So we actually have quite a bit of evidence about the presence of static 
types in dynamically-typed programs.
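
For instance (a sketch in Python, chosen only for illustration), a 
fragment written with no type declarations still admits a 
Hindley-Milner-style type scheme, which can be recorded after the fact 
with ordinary annotations:

```python
from typing import Callable, TypeVar

A = TypeVar("A")

# The dynamically-typed fragment:   twice = lambda f, x: f(f(x))
# admits the scheme (a -> a) -> a -> a, made explicit here.
def twice(f: Callable[[A], A], x: A) -> A:
    return f(f(x))

# The same scheme covers uses at unrelated types:
print(twice(lambda n: n + 1, 0))       # instantiated at int -> 2
print(twice(lambda s: s + "!", "hi"))  # instantiated at str -> "hi!!"
```

The program never mentioned types, yet a single consistent scheme 
describes every use of the fragment.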

Besides, we should be careful not to forget that our formal methods are 
incredibly weak compared to the power of our intelligence.  If that were 
not the case, there would be no advantage to possessing intelligence, or 
implementing AIs.

We are capable of reasoning outside of fully formal frameworks.  The 
only way in which we are able to develop formal systems is by starting 
with the informal.

We're currently discussing something that so far has only been captured 
fairly informally.  If we restrict ourselves to only what we can say 
about it formally, then the conversation was over before it began.

Anton
From: Chris Smith
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <MPG.1f087344223ab0ab9896fc@news.altopia.net>
Anton van Straaten <·····@appsolutions.com> wrote:
> The problem is that there are no useful sound definitions for the type 
> systems (in the static sense) of dynamically-typed languages.  Yet, we 
> work with type-like static properties in those languages all the time, 
> as I've been describing.

I honestly don't find it very compelling that there are similarities 
between the kinds of reasoning that goes on in static type systems 
versus those used by programmers in dynamically typed languages.  
Rather, of course there are similarities.  It would be shocking if there 
were not similarities.  These similarities, though, have nothing to do 
with the core nature of types.

Basically, there are ways to reason about programs being correct.  
Static type systems describe their reasoning (which is always axiomatic 
in nature) by assigning "types" to expressions.  Programmers in 
dynamically typed languages still care about the correctness of their 
programs, though, and they find ways to reason about their programs as 
well.  That reasoning is not completely axiomatic, but we spend an 
entire lifetime applying this basic logical system, and it makes sense 
that programmers would try to reason from premises to conclusions in 
ways similar to a type system.  Sometimes they may even ape other type 
systems with which they are familiar; but even a programmer who has 
never worked in a typed language would apply the same kind of logic as 
is used by type systems, because it's a form of logic with which 
basically all human beings are familiar and have practiced since the age 
of three (disclaimer: I am not a child psychologist).

On the other hand, programmers don't restrict themselves to that kind of 
pure axiomatic logic (regardless of whether the language they are 
working in is typed).  They also reason inductively -- in the informal 
sense -- and are satisfied with the resulting high probabilities.  They 
generally apply intuitions about a problem statement that is almost 
surely not written clearly enough to be understood by a machine.  And so 
forth.

What makes static type systems interesting is not the fact that these 
logical processes of reasoning exist; it is the fact that they are 
formalized with definite axioms and rules of inference, performed 
entirely on the program before execution, and designed to be entirely 
evaluatable in finite time bounds according to some procedure or set of 
rules, so that a type system may be checked and the same conclusion 
reached regardless of who (or what) is doing the checking.  All that, 
and they still reach interesting conclusions about programs.  If 
informal reasoning about types doesn't meet these criteria (and it 
doesn't), then it simply doesn't make sense to call it a type system 
(I'm using the static terminology here).  It may make sense to discuss 
some of the ways that programmer reasoning resembles types, if indeed 
there are resemblances beyond just that they use the same basic rules of 
logic.  It may even make sense to try to identify a subset of programmer 
reasoning that even more resembles... or perhaps even is... a type 
system.  It still doesn't make sense to call programmer reasoning in 
general a type system.

In the same post here, you simultaneously suppose that there's something 
inherently informal about the type reasoning in dynamic languages; and 
then talk about the type system "in the static sense" of a dynamic 
language.  How is this possibly consistent?

> We know that we can easily take dynamically-typed program fragments and 
> assign them type schemes, and we can make sure that the type schemes 
> that we use in all our program fragments use the same type system.

Again, it would be surprising if this were not true.  If programmers do, 
in fact, tend to reason correctly about their programs, then one would 
expect that there are formal proofs that could be found that they are 
right.  That doesn't make their programming in any way like a formal 
proof.  I tend to think that a large number of true statements are 
provable, and that programmers are good enough to make a large number of 
statements true.  Hence, I'd expect that it would be possible to find a 
large number of true, provable statements in any code, regardless of 
language.

> So we actually have quite a bit of evidence about the presence of static 
> types in dynamically-typed programs.

No.  What we have is quite a bit of evidence about properties remaining 
true in dynamically typed programs, which could have been verified by 
static typing.

> We're currently discussing something that so far has only been captured 
> fairly informally.  If we restrict ourselves to only what we can say 
> about it formally, then the conversation was over before it began.

I agree with that statement.  However, I think the conversation 
regarding static typing is also over when you decide that the answer is 
to weaken static typing until the word applies to informal reasoning.  
If the goal isn't to reach a formal understanding, then the word "static 
type" shouldn't be used; and when that is the goal, it still shouldn't 
be applied to processes that aren't formalized until they manage to become 
formalized.

Hopefully, this isn't perceived as too picky.  I've already conceded 
that we can use "type" in a way that's incompatible with all existing 
research literature.  I would, however, like to retain some word which 
actually has that meaning.  Otherwise, we are just going to be 
linguistically prevented from discussing it.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Anton van Straaten
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <E0Lng.159802$F_3.7156@newssvr29.news.prodigy.net>
Chris Smith wrote:
> What makes static type systems interesting is not the fact that these 
> logical processes of reasoning exist; it is the fact that they are 
> formalized with definite axioms and rules of inference, performed 
> entirely on the program before execution, and designed to be entirely 
> evaluatable in finite time bounds according to some procedure or set of 
> rules, so that a type system may be checked and the same conclusion 
> reached regardless of who (or what) is doing the checking.  All that, 
> and they still reach interesting conclusions about programs.  

There's only one issue mentioned there that needs to be negotiated 
w.r.t. latent types: the requirement that they are "performed entirely 
on the program before execution".  More below.

> If 
> informal reasoning about types doesn't meet these criteria (and it 
> doesn't), then it simply doesn't make sense to call it a type system 
> (I'm using the static terminology here).  It may make sense to discuss 
> some of the ways that programmer reasoning resembles types, if indeed 
> there are resemblances beyond just that they use the same basic rules of 
> logic.  It may even make sense to try to identify a subset of programmer 
> reasoning that even more resembles... or perhaps even is... a type 
> system.  It still doesn't make sense to call programmer reasoning in 
> general a type system.

I'm not trying to call programmer reasoning in general a type system. 
I'd certainly agree that a programmer muddling his way through the 
development of a program, perhaps using a debugger to find all his 
problems empirically, may not be reasoning about anything that's 
sufficiently close to types to call them that.  But "latent" means what 
it implies: someone who is more aware of types can do better at 
developing the latent types.

As a starting point, let's assume we're dealing with a disciplined 
programmer who (following the instructions found in books such as the 
one at htdp.org) reasons about types, writes them in his comments, and 
perhaps includes assertions (or contracts in the sense of "Contracts for 
Higher Order Functions"[1]) to help check his types at runtime (and I'm 
talking about real type-checking, not just tag checking).

When such a programmer writes a type signature as a comment associated 
with a function definition, in many cases he's making a claim that's 
provable in principle about the inputs and outputs of that function.

For example, if the type in question is "int -> int", then the provable 
claim is that given an int, the function cannot return anything other 
than an int.  Voila, assuming we can actually prove our claim, then 
we've satisfied the requirement in Pierce's definition of type system, 
i.e. "proving the absence of certain program behaviors by classifying 
phrases according to the kinds of values they compute."

The fact that the proof in question may not be automatically verified is 
no more relevant than the fact that so many proofs in mathematics have 
not been automatically verified.  Besides, if the proof turns out to be 
wrong, we at least have an automated mechanism for finding witnesses to 
the error: runtime tag checks will generate an error.  Such detection is 
not guaranteed, but again, "proof" does not imply "automation".
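To make this concrete, here's a minimal sketch in Python (chosen only as a 
familiar dynamically-checked language; the function and helper names are 
invented for illustration).  The latent type signature lives in a comment, 
and a runtime contract serves as the witness-finding mechanism for errors 
in the informal proof:

```python
def check_int(x):
    """Runtime contract: a real check on the value, not just a tag check."""
    if not isinstance(x, int) or isinstance(x, bool):
        raise TypeError("contract violated: expected int, got %r" % (x,))
    return x

# Latent type signature (the provable-in-principle claim):
#   increment : int -> int
def increment(n):
    check_int(n)               # input must satisfy the claimed type
    result = n + 1
    return check_int(result)   # an error in our proof is witnessed here
```

Here increment(41) returns 42, while increment("41") raises a TypeError -- 
the runtime check producing a witness to the violated claim, exactly as 
described above.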

What I've just described walks like a type and talks like a type, or at 
least it appears to do so according to Pierce's definition.  We can 
classify many terms in a given dynamically-typed program on this basis 
(although some languages may be better at supporting this than others).

So, on what grounds do you reject this as not being an example of a 
type, or at the very least, something which has clear and obvious 
connections to formal types, as opposed to simply being arbitrary 
programmer reasoning?

Is it the lack of a formalized type system, perhaps?  I assume you 
recognize that it's not difficult to define such a system.

Regarding errors, we haven't proved that the function in question can 
never be called with something other than an int, but we haven't claimed 
to prove that, so there's no problem there.  I've already described how 
errors in our proofs can be detected.

Another possible objection is that I'm talking about local cases, and 
not making any claims about extending it to an entire program (other 
than to observe that it can be done).  But let's take this a step at a 
time: considered in isolation, if we can assign types to terms at the 
local level in a program, do you agree that these are really types, in 
the type theory sense?

If you were to agree, then we could move on to looking at what it means 
to have many local situations in which we have fairly well-defined 
types, which may all be defined within a common type system.

As an initial next step, we could simply rely on the good old universal 
type everywhere else in the program - it ought to be possible to make 
the program well-typed in that case, it would just mean that the 
provable properties would be nowhere near as pervasive as with a 
traditional statically-typed program.  But the point is we would now 
have put our types into a formal context.
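A hypothetical sketch in Python (using typing.Any to stand in for the 
universal type; the names are invented for illustration) shows the shape 
of this: the locally-reasoned fragment gets a precise signature, and 
everything else falls back to the universal type.

```python
from typing import Any

# Locally well-typed fragment: a precise, provable signature.
def double(n: int) -> int:
    return n * 2

# Everywhere else, the good old universal type: the program remains
# well-typed, but little is provable about this part.
def process(x: Any) -> Any:
    if isinstance(x, int):
        return double(x)    # within this branch, x has a precise type
    return x
```

A checker would accept the whole program, but only the double fragment 
carries an interesting provable property.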

> In the same post here, you simultaneously suppose that there's something 
> inherently informal about the type reasoning in dynamic languages; and 
> then talk about the type system "in the static sense" of a dynamic 
> language.  How is this possibly consistent?

The point about inherent informality is just that if you fully formalize 
a static type system for a dynamic language, such that entire programs 
can be statically typed, you're likely to end up with a static type 
system with the same disadvantages that the dynamic language was trying 
to avoid.

However, that doesn't preclude a type system being defined which can be 
used to reason locally about dynamic programs.  Some people actually do 
this, albeit informally.

>>So we actually have quite a bit of evidence about the presence of static 
>>types in dynamically-typed programs.
> 
> 
> No.  What we have is quite a bit of evidence about properties remaining 
> true in dynamically typed programs, which could have been verified by 
> static typing.

Using my example above, at least some of those properties would have 
been verified by manual static typing.  So those properties, at least, 
seem to be types, at least w.r.t. the local fragment they're proved within.

>>We're currently discussing something that so far has only been captured 
>>fairly informally.  If we restrict ourselves to only what we can say 
>>about it formally, then the conversation was over before it began.
> 
> 
> I agree with that statement.  However, I think the conversation 
> regarding static typing is also over when you decide that the answer is 
> to weaken static typing until the word applies to informal reasoning.  
> If the goal isn't to reach a formal understanding, then the word "static 
> type" shouldn't be used

Well, I'm trying to use the term "latent type", as a way to avoid the 
ambiguity of "type" alone, to distinguish it from "static type", and to 
get away from the obvious problems with "dynamic type".

But I'm much less interested in the term itself than in the issues 
surrounding dealing with "real" types in dynamically-checked programs - 
if someone had a better term, I'd be all in favor of it.

> and when that is the goal, it still shouldn't 
> be applied to processes that aren't formalized until they manage to become 
> formalized.

This comes back to the phrase I highlighted at the beginning of this 
message: "performed entirely on the program before execution". 
Formalizing local reasoning about types is pretty trivial, in many 
cases.  It's just that there are other cases where that's less 
straightforward.

So when well-typed program fragments are considered as part of a larger 
untyped program, you're suggesting that this so radically changes the 
picture, that we can no longer distinguish the types we identified as 
being anything beyond programmer reasoning in general?  All the hard 
work we did figuring out the types, writing them down, writing 
assertions for them, and testing the program to look for violations 
evaporates into the epistemological black hole that exists outside the 
boundaries of formal type theory?

> Hopefully, this isn't perceived as too picky.  I've already conceded 
> that we can use "type" in a way that's incompatible with all existing 
> research literature.  

Not *all* research literature.  There's literature on various notions of 
dynamic type.

> I would, however, like to retain some word which 
> actually has that meaning.  Otherwise, we are just going to be 
> linguistically prevented from discussing it.

For now, you're probably stuck with "static type".  But I think it was 
never likely to be the case that type theory would succeed in absconding 
with the only technically valid use of the English word "type". 
Although not for want of trying!

Besides, I think it's worth considering why Bertrand Russell used the 
term "types" in the first place.  His work had deliberate connections to 
the real world.  Type theory in computer science unavoidably has similar 
connections.  You could switch to calling it Zoltar Theory, and you'd 
still have to deal with the fact that a zoltar would have numerous 
connections to the English word "type".  In CS, we don't have the luxury 
of using the classic mathematician's excuse when confronted with 
inconvenient philosophical issues, of claiming that type theory is just 
a formal symbolic game, with no meaning outside the formalism.

Anton

[1] 
http://people.cs.uchicago.edu/~robby/pubs/papers/ho-contracts-techreport.pdf
From: Chris Smith
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <MPG.1f0941a792c24922989708@news.altopia.net>
Anton van Straaten <·····@appsolutions.com> wrote:
> I'm not trying to call programmer reasoning in general a type system. 
> I'd certainly agree that a programmer muddling his way through the 
> development of a program, perhaps using a debugger to find all his 
> problems empirically, may not be reasoning about anything that's 
> sufficiently close to types to call them that.

Let me be clear, then.  I am far less agreeable than you that this 
couldn't be called a type system.  One of my goals here is to try to 
understand, if we are going to work out what can and can't be called a 
type system in human thought, how we can possibly draw a boundary 
between kinds of logic and say that one of them is a type system while 
the other is not.  I haven't seen much of a convincing way to do it.  
The only things that I can see rejecting out of hand are the empirical 
things; confidence gained through testing or observation of the program 
in a debugger.  Everything else seems to qualify, and that's the 
problem.

So let's go ahead and agree about everything up until...

[...]

> As an initial next step, we could simply rely on the good old universal 
> type everywhere else in the program - it ought to be possible to make 
> the program well-typed in that case, it would just mean that the 
> provable properties would be nowhere near as pervasive as with a 
> traditional statically-typed program.

It would, in fact, mean that interesting provable properties would only 
be provable if data can't flow outside of those individual small bits 
where the programmer reasoned with types.  Of course that's not true, or 
the programmer wouldn't have written the rest of the program in the 
first place.  If we define the universal type as a variant type 
(basically, tagged or something like it, so that the type can be tested 
at runtime) then certain statements become provable, of the form "either 
X or some error condition is raised because a value is inappropriate".  
That is indeed a very useful property; but wouldn't it be easier to 
simply prove it by appealing to the language semantics that say that "if 
not X, the operation raises an exception," or to the existence of 
assertions within the function that verify X, or some other such thing 
(which must exist, or the statement wouldn't be true)?
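As a small illustration (a Python sketch with invented names, not anyone's 
actual code), the "other such thing" can be as simple as an assertion 
inside the function:

```python
def scale(v, factor):
    # The in-function assertion that makes the statement true: either
    # the result is v * factor for a numeric v, or an AssertionError
    # is raised because the value is inappropriate.
    assert isinstance(v, (int, float)), "inappropriate value: %r" % (v,)
    return v * factor
```

The property "either X or an error condition is raised" is proved here by 
reading one line, with no variant-typed machinery in sight.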

> But the point is we would now 
> have put our types into a formal context.

Yes, but somewhat vacuously so...

> The point about inherent informality is just that if you fully formalize 
> a static type system for a dynamic language, such that entire programs 
> can be statically typed, you're likely to end up with a static type 
> system with the same disadvantages that the dynamic language was trying 
> to avoid.

Okay, so you've forced the admission that you have a static type system 
that isn't checked and doesn't prove anything interesting.  If you 
extended the static type system to cover the whole program, then you'd 
have a statically typed language that lacks an implementation of the 
type checker.

I don't see what we lose in generality by saying that the language lacks 
a static type system, since it isn't checked, and doesn't prove anything 
anyway.  The remaining statement is that the programmer may use static 
types to reason about the code.  But, once again, the problem here is 
that I don't see any meaning to that.  It would basically mean the same 
thing as that the programmer may use logic to reason about code.  That 
isn't particularly surprising to me.  I suppose this is why I'm 
searching for what is meant by saying that programmers reason about 
types, and am having that conversation in several places throughout this 
thread... because otherwise, it doesn't make sense to point out that 
programmers reason with types.

> Well, I'm trying to use the term "latent type", as a way to avoid the 
> ambiguity of "type" alone, to distinguish it from "static type", and to 
> get away from the obvious problems with "dynamic type".

I replied to your saying something to Marshall about the "type systems 
(in the static sense) of dynamically-typed languages."  That's what my 
comments are addressing.  If you somehow meant something other than the 
static sense, then I suppose this was all a big misunderstanding.

> But I'm much less interested in the term itself than in the issues 
> surrounding dealing with "real" types in dynamically-checked programs - 
> if someone had a better term, I'd be all in favor of it.

I will only repeat again that in static type systems, the "type" becomes 
a degenerate concept when it is removed from the type system.  It is 
only the type system that makes something a type.  Therefore, if you 
want "real" types, then my first intuition is that you should avoid 
looking in the static types of programming languages.  At best, they 
happen to coincide frequently with "real" types, but that's only a 
pragmatic issue if it means anything at all.

> So when well-typed program fragments are considered as part of a larger 
> untyped program, you're suggesting that this so radically changes the 
> picture, that we can no longer distinguish the types we identified as 
> being anything beyond programmer reasoning in general?

Is this so awfully surprising, when the type system itself is just 
programmer reasoning?  When there ceases to be anything that's provable 
about the program, there also ceases to be anything meaningful about the 
type system.  You still have the reasoning, and that's all you ever had.  
You haven't really lost anything, except a name for it.  Since I doubt 
the name had much meaning at all (as explained above), I don't count 
that as a great loss.  If you do, though, then this simply demonstrates 
that you didn't really mean "type systems (in the static sense)".

> > Hopefully, this isn't perceived as too picky.  I've already conceded 
> > that we can use "type" in a way that's incompatible with all existing 
> > research literature.  
> 
> Not *all* research literature.  There's literature on various notions of 
> dynamic type.

I haven't seen it, but I'll take your word that some of it exists.  I 
wouldn't tend to read research literature on dynamic types, so that 
could explain the discrepancy.

> > I would, however, like to retain some word which 
> > actually has that meaning.  Otherwise, we are just going to be 
> > linguistically prevented from discussing it.
> 
> For now, you're probably stuck with "static type".

That's fine.  I replied to your attempting to apply "static type" in a 
context where it doesn't make sense.

> In CS, we don't have the luxury 
> of using the classic mathematician's excuse when confronted with 
> inconvenient philosophical issues, of claiming that type theory is just 
> a formal symbolic game, with no meaning outside the formalism.

As it turns out, my whole purpose here is to address the idea that type 
theory for programming languages is limited to some subset of problems 
which people happen to think of as type problems.  If type theorists 
forty years ago had limited themselves to considering what people 
thought of as types back then, we'd be in a considerably worse position 
today with regard to the expressive power of types in commonly used 
programming languages.  Hence, your implication that insisting on a 
formal rather than intuitive definition is opposed to the real-world 
usefulness of the field is actually rather ironic.  The danger lies in 
assuming that unsolved problems are unsolvable, and therefore defining 
them out of the discussion until the very idea that someone would solve 
that problem with these techniques starts to seem strange.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: David Hopwood
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <lACog.230817$8W1.88545@fe1.news.blueyonder.co.uk>
Anton van Straaten wrote:
> Chris Smith wrote:
> 
>> What makes static type systems interesting is not the fact that these
>> logical processes of reasoning exist; it is the fact that they are
>> formalized with definite axioms and rules of inference, performed
>> entirely on the program before execution, and designed to be entirely
>> evaluatable in finite time bounds according to some procedure or set
>> of rules, so that a type system may be checked and the same conclusion
>> reached regardless of who (or what) is doing the checking.  All that,
>> and they still reach interesting conclusions about programs.  
> 
> There's only one issue mentioned there that needs to be negotiated
> w.r.t. latent types: the requirement that they are "performed entirely
> on the program before execution".  More below.
> 
>> If informal reasoning about types doesn't meet these criteria (and it
>> doesn't), then it simply doesn't make sense to call it a type system
>> (I'm using the static terminology here).  It may make sense to discuss
>> some of the ways that programmer reasoning resembles types, if indeed
>> there are resemblances beyond just that they use the same basic rules
>> of logic.  It may even make sense to try to identify a subset of
>> programmer reasoning that even more resembles... or perhaps even is...
>> a type system.  It still doesn't make sense to call programmer
>> reasoning in general a type system.
> 
> I'm not trying to call programmer reasoning in general a type system.
> I'd certainly agree that a programmer muddling his way through the
> development of a program, perhaps using a debugger to find all his
> problems empirically, may not be reasoning about anything that's
> sufficiently close to types to call them that.  But "latent" means what
> it implies: someone who is more aware of types can do better at
> developing the latent types.

So these "latent types" may in some cases not exist (in any sense), unless
and until someone other than the original programmer decides how the program
could have been written in a typed language?

I don't get it.

> As a starting point, let's assume we're dealing with a disciplined
> programmer who (following the instructions found in books such as the
> one at htdp.org) reasons about types, writes them in his comments, and
> perhaps includes assertions (or contracts in the sense of "Contracts for
> Higher Order Functions[1]), to help check his types at runtime (and I'm
> talking about real type-checking, not just tag checking).
> 
> When such a programmer writes a type signature as a comment associated
> with a function definition, in many cases he's making a claim that's
> provable in principle about the inputs and outputs of that function.

This is talking about a development practice, not a property of the
language. It may (for the sake of argument) make sense to assign a term
such as latent typing to this development practice, but not to call the
language "latently-typed".

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Chris F Clark
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <sdd8xnl2oiz.fsf@shell01.TheWorld.com>
Chris F Clark (I) wrote:

> I'm particularly interested if something unsound (and perhaps
> ambiguous) could be called a type system.  I definitely consider such
> things type systems.

"Marshall" <···············@gmail.com> wrote:

> I don't understand. You are saying you prefer to investigate the
> unsound over the sound?
...
> Again, I cannot understand this. In a technical realm, vagueness
> is the opposite of understanding.

At the risk of injecting too much irrelevant philosophy into the
discussion, I will with great trepidation reply.

First, in the abstract: No, understanding is approximating.  The world
is inherently vague.  We make false symbolic models of the world which
are consistent, but at some level they do not reflect reality, because
reality isn't consistent.  Only by abstracting away the inherent
infinite amount of subtlety present in the real universe can we come to
comprehensible models.  Those models can be consistent, but they are
not the universe.  The models, in their consistency, prove things which
are not true about the real universe.

Now in the concrete: In my work productivity is ultimately important.
Therefore, we live by the 80-20 rule in numerous variations.  One of
the things we do to achieve productivity is simplify things.  In fact,
we are more interested in an unsound simplification that is right 80%
of the time, but takes only 20% of the effort to compute, than a
completely sound (100% right) model which takes 100% of the time to
compute (which is 5 times as long).  We are playing the probabilities.

It's not that we don't respect the sound underpinning, the model which
is consistent and establishes certain truths.  However, what we want
is the rule of thumb which is far simpler and approximates the sound
model with reasonable accuracy.  In particular, we accept two types of
unsoundness in the model.  One, we accept a model which gives wrong
answers which are detectable.  We code tests to catch those cases, and
use a different model to get the right answer.  Two, we accept a model
which gets the right answer only when over-provisioned.  For example,
if we know a loop will occasionally not converge in the n steps that
it theoretically should, we will run the loop for n+m steps until the
approximation also converges--even if that requires allocating m more
copies of some small element than are theoretically necessary.  A
small waste of a small resource is OK if it saves a waste of a more
critical resource--the most critical resource of all being our project
timeliness.
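Both kinds of accepted unsoundness can be sketched in code (a Python 
illustration with invented names; the particular n and m bounds are 
arbitrary): a cheap model whose wrong answers are detectable and corrected 
by a sound fallback, and a loop over-provisioned to run n+m steps.

```python
def fast_floordiv(a, b):
    """Type-one unsoundness: a cheap model (float division) for
    nonnegative a and positive b.  Wrong answers are detectable,
    and we fall back to the sound model when one is caught."""
    assert a >= 0 and b > 0
    q = int(a / b)
    if q * b > a or (q + 1) * b <= a:   # detect a wrong answer
        q = a // b                      # sound model as the fallback
    return q

def converge_sqrt(a, n=10, m=5, tol=1e-9):
    """Type-two unsoundness: Newton iteration for sqrt(a).  Theory
    might say n steps suffice; we over-provision with m extra steps
    as a safety margin."""
    x = a if a > 1 else 1.0
    for _ in range(n + m):
        x = 0.5 * (x + a / x)
        if abs(x * x - a) < tol:
            return x
    raise RuntimeError("did not converge even with the margin")
```

For small inputs the float shortcut in fast_floordiv is already right; for 
huge integers it drifts, the check catches it, and the exact model takes 
over--playing the probabilities while keeping the error margin controlled.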

We do the same things in our abstract reasoning (including our type
model).  The problems we are solving are too large for us to
understand.  Therefore, we make simplifications, simplifications we
know are inaccurate and potentially unsound.  Our algorithm then
solves the simplified model, which we think covers the main portion of
the problem.  It also attempts to detect when it has encountered a
problem that is outside of the simplified region, on the edge so to
speak.  Now, in general, we can figure out how to handle certain edge
cases, and we do it.  However, some times the edge of one part of the
algorithm interacts with an edge of another part of the algorithm and
we get a "corner case".  Because our model is too simple and the
reasoning it allows is thus unsound, the algorithm will occasionally
compute the wrong results for some of those corners.  We use the same
techniques for dealing with them.  Dynamic tagging is one way of
catching when we have gotten the wrong answer in our type
simplification.

Marshall's last point:

> I flipped a coin to see who would win the election; it came
> up "Bush". Therefore I *knew* who was going to win the
> election before it happened. See the problem?

Flipping one coin to determine an election is not playing the
probabilities correctly.  You need a plausible explanation for why the
coin should predict the right answer and a track record of it being
correct.  If you had a coin that had correctly predicted the previous
42 presidencies and you had an explanation why the coin was right,
then it would be credible and I would be willing to wager that the
coin could also tell us who the 44th president would be.  One flip and
no explanation is not sufficient.  (And to the
abstract point, to me that is all knowledge is, some convincing amount
of examples and a plausible explanation--anyone who thinks they have
more is appealing to a "knowledge" of the universe that I don't
accept.)

I do not want incredible unsound reasoning, just reasoning that is
marginal, within a margin of safety that I can control.  Everyone I
have known of has, as far as I can tell, made simplifying assumptions
to make the world tractable.  Very rarely do we deal with the world
completely formally.  Look at where that got Russell and Whitehead.
I'm just trying to be "honest" about that fact and find ways to
compensate for my own failures.

-Chris
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151260503.976524.77060@b68g2000cwa.googlegroups.com>
Chris F Clark wrote:
> Chris F Clark (I) wrote:
>
> > I'm particularly interested if something unsound (and perhaps
> > ambiguous) could be called a type system.  I definitely consider such
> > things type systems.
>
> "Marshall" <···············@gmail.com> wrote:
>
> > I don't understand. You are saying you prefer to investigate the
> > unsound over the sound?
> ...
> > Again, I cannot understand this. In a technical realm, vagueness
> > is the opposite of understanding.
>
> At the risk of injecting too much irrelevant philosophy into the
> discussion, I will with great trepidation reply.

I agree this is OT, but I'm not sure about the source of trepidation.


> First, in the abstract: No, understanding is approximating.

Agreed.


> The world is inherently vague.

Our understanding of the world is vague. The world itself is
not at all vague.


> We make false symbolic models of the world which
> are consistent, but at some level they do not reflect reality,

Yes...

> because reality isn't consistent.

What?!


> Only by abstracting away the inherent
> infinite amount of subtlety present in the real universe can we come to
> comprehensible models.

Sure. (Although I object to "infinite.")


> Those models can be consistent, but they are
> not the universe.  The models in their consistency, prove things which
> are not true about the real universe.

Sure, sure, sure. None of these is a reason to prefer the unsound
over the sound.


> Now in the concrete: In my work productivity is ultimately important.
> Therefore, we live by the 80-20 rule in numerous variations.  One of
> the things we do to achieve productivity is simplify things.  In fact,
> we are more interested in an unsound simplification that is right 80%
> of the time, but takes only 20% of the effort to compute, than a
> completely sound (100% right) model which takes 100% of the time to
> compute (which is 5 times as long).  We are playing the probabilities.

What you are describing is using a precise mathematical function
to approximate a different precise mathematical function. This
argues for the value of approximation functions, which I do not
dispute. But this does not in any way support the idea of vague
trumping precise, informal trumping formal, or unsoundness as
an end in itself.


> It's not that we don't respect the sound underpinning, the model which
> is consistent and establishes certain truths.  However, what we want
> is the rule of thumb which is far simpler and approximates the sound
> model with reasonable accuracy.  In particular, we accept two types of
> unsoundness in the model.  One, we accept a model which gives wrong
> answers which are detectable.  We code tests to catch those cases, and
> use a different model to get the right answer.  Two, we accept a model
> which gets the right answer only when over-provisioned.  For example,
> if we know a loop will occasionally not converge in the n steps that
> it theoretically should, we will run the loop for n+m steps until the
> approximation also converges--even if that requires allocating m extra
> copies of some small element than are theoretically necessary.  A
> small waste of a small resource is OK if it saves a waste of a more
> critical resource--the most critical resource of all being our project
> timeliness.
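The two kinds of accepted unsoundness described above can be sketched in a
few lines.  This is a minimal illustration, not anyone's actual code; every
name in it (cheap_model, sound_model, looks_wrong, converge) is hypothetical:

```python
def cheap_model(x):
    # Fast approximation: integer square root via float math.
    # Unsound for very large x, where float rounding drifts.
    return int(x ** 0.5)

def sound_model(x):
    # Slow but exact integer square root (Newton's method on ints).
    if x < 0:
        raise ValueError("negative input")
    r = x
    while r * r > x:
        r = (r + x // r) // 2
    return r

def isqrt(x):
    # Kind 1: use the unsound model, but test the answer and fall
    # back to the sound model when the test detects a wrong result.
    r = cheap_model(x)
    if not (r * r <= x < (r + 1) * (r + 1)):
        r = sound_model(x)
    return r

def converge(step, state, n, m=10):
    # Kind 2: over-provision -- run up to n+m steps instead of the
    # theoretical n, accepting a small waste of a small resource.
    for _ in range(n + m):
        new_state = step(state)
        if new_state == state:
            return state
        state = new_state
    raise RuntimeError("did not converge even with margin")
```

For small inputs the cheap model is right and the sound model never runs;
for inputs where float rounding misfires, the test catches it.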

Yes, I'm quite familiar with modelling, abstraction, approximation,
etc.  However, nothing about those endeavours suggests to me
that unsoundness is any kind of goal.


> Marshall's last point:
>
> > I flipped a coin to see who would win the election; it came
> > up "Bush". Therefore I *knew* who was going to win the
> > election before it happened. See the problem?
>
> Flipping one coin to determine an election is not playing the
> probabilities correctly.  You need a plausible explanation for why the
> coin should predict the right answer and a track record of it being
> correct.  If you had a coin that had correctly predicted the previous
> 42 presidencies and you had an explanation why the coin was right,
> then it would be credible, and I would be willing to wager that the
> coin could tell us who the 44th president would be.  One flip and no
> explanation is not sufficient.  (And to the
> abstract point, to me that is all knowledge is, some convincing amount
> of examples and a plausible explanation--anyone who thinks they have
> more is appealing to a "knowledge" of the universe that I don't
> accept.)

I used a coin toss; I could also have used a psychic hotline. There
is an explanation for why those things work, but the explanation
is unsound.


> Look at where that got Russell and Whitehead.

Universal acclaim, such that their names will be praised for
centuries to come?


> I'm just trying to be "honest" about that fact and find ways to
> compensate for my own failures.

Limitation != failure.


Marshall
From: Chris Smith
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <MPG.1f07adba87e360d29896f8@news.altopia.net>
Chris F Clark <···@shell01.TheWorld.com> wrote:
> Do you reject that there could be something more general than what a 
> type theorist discusses?  Or do you reject calling such things a type?

I think that the correspondence runs partly in the wrong direction to 
describe it that way.  If someone were to stand up and say "every 
possible bit of reasoning about program correctness or behavior is type 
reasoning", then I'd be happy to say that formal type systems are a 
subset of that kind of reasoning.  However, that's quite the statement 
to ask of anyone.  So far, everyone has wanted to say that there are 
some kinds of reasoning that are type reasoning and some that are not.  
If that is the case, then a formal type system is in that sense more 
general than this intuitive notion of type, since formal type systems 
are not limited to verifying any specific category of statements about 
program behavior (except, perhaps, that they are intended to verify 
guarantees, not possibilities; I don't believe it would fit all 
definitions of formal types to try to verify that it is possible for a 
program to terminate, for example).

What I can't agree to is that what you propose is actually more general.  
It is more general in some ways, and more limited in others.  As such, 
the best you can say is that it is analogous to formal types in some ways, 
but certainly not that it's a generalization of them.

> Later you write: 
> > because we could say that anything that checks types is a type system,
> > and then worry about verifying that it's a sound type system without
> > worrying about whether it's a subset of the perfect type system. 
>  
> I'm particularly interested if something unsound (and perhaps 
> ambiguous) could be called a type system.

Yes, although this is sort of a pedantic question in the first place, I 
believe that an unsound type system would still generally be called a 
type system in formal type theory.  However, I have to qualify that with 
a clarification of what is meant by unsound.  We mean that it DOES 
assign types to expressions, but that those types are either not 
sufficient to prove what we want about program behavior, or they are not 
consistent so that a program that was well typed may reduce to a poorly 
typed program during its execution.  Obviously, such type systems don't 
prove anything at all, but they probably still count as type systems.

(I'll point out briefly that a typical, but not required, approach to 
type theory is to define program semantics so that the evaluator lacks 
even the rules to continue evaluating programs that are poorly typed or 
that don't have the property of interest.  Hence, you'll see statements 
like one earlier in this thread that part of type-soundness is a 
guarantee that the evaluation or semantics doesn't get stuck.  I 
actually think this is an unfortunate confusion of the model with the 
thing being modeled, and the actual requirement of interest is that the 
typing relation is defined so that all well-typed terms have the 
interesting property which we are trying to prove.  It is irrelevant 
whether the definition of the language semantics happens to contain 
rules that generalize to ill-typed programs or not.)

I suspect, though, that you mean something else by unsound.  I don't 
know what you mean, or whether it could be considered formally a type 
system.

> > 1. The domain is the set of inputs to that expression which are going to 
> > produce a correct result. 
> > 
> > 2. The domain is the set of inputs that I expected this expression to 
> > work with when I wrote it. 
> > 
> > 3. The domain is the set of inputs for which the expression has a 
> > defined result within the language semantics. 

> Actually, I like 2 quite well.  There is some set in my mind when I'm
> writing a particular expression.  It is likely an ill-defined set, but
> I don't notice that.  That set is exactly the "type".

I don't actually mind #2 when we're talking about types that exist in 
the programmer's mind.  I suspect that it may get some complaint along 
the lines that in the presence of type polymorphism, programmers don't 
need to be (perhaps rarely are) thinking of any specific set of values 
when they write code.  I would agree with that statement, but point out 
that I can define at least my informal kind of set by saying, for 
instance, "the set of all inputs on which X operations make sense".

I believe there is more of a concrete difference between #1 and #2 than 
you may realize.  If we restrict types to considering the domain of 
functions without regard to the context in which they are called, then 
there are plenty of inputs on which this function does its own job quite 
well, but a global analysis tool may be able to conclude that if that 
input has been passed in, then this program is already beyond the 
possibility of producing a correct result.  #1 would, therefore, be a 
more global form of analysis than #2, rather than being merely a 
question of whether you are fallible or not.

> Expanding that a little.  I expect the language to catch type errors
> where I violate model 3.

So would I.  (Actually, that's another pesky question; if the language 
"catches" it, did the error even occur, or did the expression just gain 
a defined result such that domain #3 is indistinguishable from the set 
of all possible inputs?  I tend to agree with Chris Uppal, I suppose, 
that the latter is really the case, so that dynamic type systems in 
languages, if they are sound, prevent concept #3 from ever being in 
question.  But, of course, you don't think about it that way, which 
distinguishes 2 from 3.)

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Chris F Clark
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <sdd1wtdymkm.fsf@shell01.TheWorld.com>
Chris F Clark <···@shell01.TheWorld.com> (I) wrote:
> Do you reject that there could be something more general than what a 
> type theorist discusses?  Or do you reject calling such things a type?

Chris Smith <·······@twu.net> wrote:

> I think that the correspondence runs partly in the wrong direction to 
> describe it that way.
...
> What I can't agree to is that what you propose is actually more general.  
> It is more general in some ways, and more limited in others.  As such, 
> the best you can say is that is analogous to formal types in some ways, 
> but certainly not that it's a generalization of them.

Yes, I see your point.

Me:
> I'm particularly interested if something unsound (and perhaps 
> ambiguous) could be called a type system.

Chris Smith:

> Yes, although this is sort of a pedantic question in the first place, I 
> believe that an unsound type system would still generally be called a 
> type system in formal type theory.  However, I have to qualify that with 
> a clarification of what is meant by unsound.  We mean that it DOES 
> assign types to expressions, but that those types are either not 
> sufficient to prove what we want about program behavior, or they are not 
> consistent so that a program that was well typed may reduce to a poorly 
> typed program during its execution.  Obviously, such type systems don't 
> prove anything at all, but they probably still count as type systems.

Yes, and you see mine.

These informal systems, which may not prove what they claim to prove,
are my concept of a "type system".  That's because at some point,
there is a person reasoning informally about the program.  (There may
also be people reasoning formally about the program, but I am going to
neglect them.)  It is to the people who are reasoning informally about
the program we wish to communicate that there is a type error.  That
is, we want someone who is dealing with the program informally to
realize that there is an error, and that this error is somehow related
to some input not being kept within proper bounds (inside the domain
set) and that as a result the program will compute an unexpected and
incorrect result.

I stressed the informal aspect, because the intended client of the
program is almost universally *NOT* someone who is thinking rigorously
about some mathematical model, but is dealing with some "real world"
phenomena which they wish to model and may not have a completely
formed concrete representation of.  Even the author of the program is
not likely to have a completely formed rigorous model in mind, only
some approximation to such a model.

I also stress the informality, because beyond a certain nearly trivial
level of complexity, people are not capable of dealing with truly
formal systems.  As a compiler person, I often have to read sections
of language standards to try to discern what the standard is requiring
the compiler to do.  Many language standards attempt to describe some
formal model.  However, most are so imprecise and ambiguous when they
attempt to do so that the result is only slightly better than not
trying at all.  Only when one understands the standard in the context
of typical practice can one begin to fathom what the standard is
trying to say and why they say it the way they do.  

A typical example of this is the grammars for most programming
languages, which are expressed in a BNF variant.  However, most of the
BNF descriptions omit important details and gloss over certain
inconsistencies.  Moreover, most BNF descriptions are totally
unsuitable to a formal translation (e.g. an LL or LR parser generator,
or even an Earley or CYK one).  Yet, when considered in context of
typical implementations, the errors in the formal specification can
easily be overlooked and resolved to produce what the standard authors
intended.

For example, a few years ago I wrote a Verilog parser by
transliterating the grammar in the standard (IEEE 1364-1995) into the
notation used by my parser generator, Yacc++.  It took a couple of
days to do so.  The resulting parser essentially worked on the first
try.  However, the reason it did so is that I had a lot of outside
knowledge when I was making the transliteration.  There were places
where I knew that this part of the grammar was for lexing and
this part was for parsing and cut the grammar correctly into two
parts, despite there being nothing in the standard which described
that dichotomy.  There were other parts, like the embedded
preprocessor, that I knew were borrowed from C, so that I could use
the resolutions of issues the way a C compiler would, to guide
resolving the ambiguities in the Verilog standard.  There were other
parts where I could tell the standard was simply written wrong, that
is, where the grammar did not specify the correct BNF, because the BNF
they had written did not specify something which would make
sense--for example, the endif on an if-then-else statement was bound
to the else, so that one would write an if-then without an endif, but
an if-then-else with the endif--and I knew that was not the correct
syntax.

The point of the above paragraph is that the very knowledgable people
writing the Verilog standard made mistakes in their formal
specification, and in fact these mistakes made it past several
layers of review and had been formally ratified.  Given the correct
outside knowledge these mistakes are relatively trivial to ignore and
correct. However, from a completely formal perspective, the standard
is simply incorrect.

More importantly, chip designers use knowledge of the standard to
guide the way they write chip descriptions.  Thus, at some level these
formal errors creep into chip designs.  Fortunately, the chip
designers also look past the inconsistencies and "program" (design) in
a language which looks very like standard Verilog, but actually says
what they mean.

Moreover, the grammar is not the only place in the standard where
Verilog is not properly formalized.  There are several parts of the
type model which suffer similar problems.  Again, we have what the
standard specifies, and what is a useful (and expected) description of
the target model.  There are very useful and informal models of how
the Verilog language works and what types are involved.  There is also
an attempt in the standard to convey some of these aspects formally.
Occasionally, the formal description is right and illuminating, at
other times it is opaque and at still other times, it is inconsistent,
saying contradictory things in different parts of the standard.

Some of the more egregious errors were corrected in the 2001 revision.
However, at the same time, new features were added which were (as far
as I can tell) just as inconsistently specified.  And, this is not the
case of incompetence or unprofessionalism.  The Verilog standard is
not alone in these problems--it's just my most recent experience, and
thus the one I am most familiar with at the moment.

I would expect the Python, Scheme, Haskell, and ML standards to be
more precise and less inconsistent, because of their communities.
However, I would be surprised if they were entirely unambiguous and
error-free.  In general, I think most non-trivial attempts at
formalization flounder on our inability to sustain full rigor.

Therefore, I do not want to exclude from type systems things which are
informal and unsound.  They are simply artifacts of human creation.

-Chris
From: Chris Smith
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <MPG.1f08d526ff47053e989703@news.altopia.net>
Chris F Clark <···@shell01.TheWorld.com> wrote:
> These informal systems, which may not prove what they claim to prove,
> are my concept of a "type system".

Okay, that works.  I'm not sure where it gets us, though, except for 
gaining you the right to use "type system" in a completely uninteresting 
sense to describe your mental processes.  You still need to be careful, 
since most statements from type theory about type systems tend to 
implicitly assume the property of being sound.

I'm not exactly convinced that it's possible to separate that set of 
reasoning people do about programs into types and not-types, though.  If 
you want to stretch the definition like this, then I'm afraid that you 
end up with the entire realm of human thought counting as a type system.  
I seem to recall that there are epistemologists who would be happy with 
that conclusion (I don't recall who, and I'm not actually interested 
enough to seek them out), but it doesn't lead to a very helpful sort of 
characterization.

> It is to the people who are reasoning informally about
> the program we wish to communicate that there is a type error.  That
> is, we want someone who is dealing with the program informally to
> realize that there is an error, and that this error is somehow related
> to some input not being kept within proper bounds (inside the domain
> set) and that as a result the program will compute an unexpected and
> incorrect result.

I've lost the context for this part of the thread.  We want this for 
what purpose and in what situation?

I don't think we necessarily want this as the goal of a dynamic type 
system, if that's what you mean.  There is always a large (infinite? I 
think so...) list of potential type systems under which the error would 
not have been a type error.  How does the dynamic type system know which 
informally defined, potentially unsound, mental type system to check 
against?  If it doesn't know, then it can't possibly know that something 
is a type error.  If it does know, then the mental type system must be 
formally defined (if nothing else, operationally by the type checker.)

> I also stress the informality, because beyond a certain nearly trivial
> level of complexity, people are not capable of dealing with truly
> formal systems.

I think (hope, anyway) that you overstate the case here.  We can deal 
with formal systems.  We simply make errors.  When those errors are 
found, they are corrected.  Nevertheless, I don't suspect that a large 
part of the established base of theorems of modern mathematics are 
false, simply because they are sometimes rather involved reasoning in a 
complex system.

Specifications are another thing, because they are often developed under 
time pressure and then extremely difficult to change.  I agree that we 
should write better specs, but I think the main challenges are political 
and pragmatic, rather than fundamental to human understanding.

> Therefore, I do not want to exclude from type systems things which are
> informal and unsound.  They are simply artifacts of human creation.

Certainly, though, when it is realized that a type system is unsound, it 
should be fixed as soon as possible.  Otherwise, the type system doesn't 
do much good.  It's also true, I suppose, that as soon as a person 
realizes that their mental thought process of types is unsound, they 
would want to fix it.  The difference, though, is that there is then no 
formal definition to fix.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Chris F Clark
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <sddk673lg5f.fsf@shell01.TheWorld.com>
I wrote:
> These informal systems, which may not prove what they claim to prove,
> are my concept of a "type system".

Chris Smith <·······@twu.net> replied:
> Okay, that works.  I'm not sure where it gets us, though....

Ok, we'll get there.  First, we need to step back in time, to when there
was roughly algol, cobol, fortran, and lisp.  Back then, at least as I
understood things, there were only a few types that people generally
understood: integer, real, and (perhaps) pointer.  Now, with algol or
fortran things were generally only of the first two types and
variables were statically declared to be one or the other.  With lisp
any cell could hold any one of the 3, and some clever implementor
added "tag bits" to distinguish which the cell held.  As I understood
it, the tag bits weren't originally for type correctness, so much as
they facilitated garbage collection.  The garbage collector didn't
know what type of data you put in a cell and had to determine which
cells contained pointers, so that the mark-and-sweep algorithms could
sweep by following the pointers (and not following the integers that
looked like pointers).  Still, these tag bits did preserve the
"dynamic" type, in the sense that we knew types from the other
languages.  As I remember it, sophistication with type really didn't
occur as a phenomenon for the general programmer until the introduction
of Pascal (and to a lesser extent PL/I).  Moreover, as I recall it
(perhaps because I didn't learn the appropriate dialect), lisp
dialects kept with the more general notion of type (lists and tuples)
and eschewed low-level bit-fiddling where we might want to talk about
a 4 bit integer representing 0 .. 15 or -8 .. 7.
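The tag-bits-for-GC point above can be illustrated with a toy mark-and-sweep
sketch.  The heap model here is hypothetical and grossly simplified (one
value per cell, string tags standing in for bits):

```python
# address -> (tag, value); tag is "int" or "ptr"
HEAP = {}

def mark(roots):
    # Mark phase: follow only cells tagged as pointers, never
    # integers that merely look like addresses.
    marked = set()
    stack = list(roots)
    while stack:
        addr = stack.pop()
        if addr in marked or addr not in HEAP:
            continue
        marked.add(addr)
        tag, value = HEAP[addr]
        if tag == "ptr":
            stack.append(value)
    return marked

def sweep(marked):
    # Sweep phase: reclaim every cell the mark phase never reached.
    for addr in list(HEAP):
        if addr not in marked:
            del HEAP[addr]
```

Note that a cell holding the integer 3 is not treated as a reference to
address 3; that discrimination is exactly what the tag bits bought.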

The important thing is that the dynamism of lisp allowed one to write
polymorphic programs, before most of us knew the term.  However, we
still had a notion of type: integers and floating point numbers were
still different and one could not portably use integer operations on
floating point values or vice versa.  However, one could check the
tag and "do the right thing" and many lisp primitives did just that,
making them polymorphic.

The way most of us conceived (or were taught to conceive) of the
situation was that there still were types, they were just associated
with values and the type laws we all knew and loved still worked, they
just worked dynamically upon the types of the operand values that were
presented at the time.
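A minimal sketch of that conception, with hypothetical names: values carry
tags, and a polymorphic primitive applies the familiar type laws
dynamically, to whatever operands show up:

```python
INT, REAL, POINTER = "int", "real", "pointer"

def cell(tag, value):
    # A tagged cell: the tag travels with the value, not the variable.
    return (tag, value)

def add(a, b):
    # A polymorphic primitive: dispatch on the dynamic tags and
    # "do the right thing" for each combination.
    ta, va = a
    tb, vb = b
    if ta == INT and tb == INT:
        return cell(INT, va + vb)                 # integer addition
    if ta in (INT, REAL) and tb in (INT, REAL):
        return cell(REAL, float(va) + float(vb))  # mixed/real addition
    raise TypeError("add: bad operand tags (a dynamic type error)")
```

The type laws still hold; they are simply enforced per call, against the
tags of the operand values presented at the time.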

Can this be made rigorous?  Perhaps.

Instead of viewing the text of the program statically, let us view it
dynamically, that is, by introducing a time (or execution) dimension.
This is easier if you take an imperative view of the dynamic program
and imagine things having an operational semantics, which is why we
stepped back in time in the first place, so that we are in a world
where imperative programming is the default model for most
programmers.  

Thus, as we traverse a list, the first element might be an integer,
the second a floating point value, the third a sub-list, the fourth
and fifth, two more integers, and so on.  If you look statically at
the head of the list, we have a very wide union of types going by.
However, perhaps there is a mapping going on that can be discerned.
For example, perhaps the list has 3 distinct types of elements
(integers, floating points, and sub-lists) and it circles through the
types in the order, first having one of each type, then two of each
type, then four of each type, then eight, and so on.  The world is now
patterned.  

However, I don't know how to write a static type annotation that
describes exactly that type.  That may just be my lack of experience
in writing annotations.  However, it's well within my grasp to imagine
the appropriate type structure, even though **I** can't describe it
formally.  More importantly, I can easily write something which checks
the tags and sees whether the data corresponds to my model.

And, this brings us to the point, if the data that my program
encounters doesn't match the model I have formulated, then something
is of the wrong "type".  Equally importantly, the dynamic tags have
helped me discover that type error.

Now, the example may have seemed arbitrary to you, and it was in some
sense arbitrary.  However, if one imagines a complete binary tree with
three kinds of elements stored in memory as rows per depth, one can get
exactly the structure I described.  And, if one were rewriting that
unusual representation to a more normal one, one might want exactly
the "type check" I proposed to validate that the input binary tree was
actually well formed.
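The dynamic "type check" described above is easy to write even though the
static annotation is elusive.  A sketch (well_patterned is a hypothetical
name, with Python classes standing in for the tags):

```python
def well_patterned(xs):
    # The list is expected to hold 2**k integers, then 2**k floats,
    # then 2**k sub-lists, for k = 0, 1, 2, ...  We check tags at
    # run time instead of writing a static annotation for the shape.
    tags = (int, float, list)
    i, run = 0, 1
    while i < len(xs):
        for t in tags:
            for _ in range(run):
                if i == len(xs):
                    return False      # pattern cut off mid-round
                if not isinstance(xs[i], t):
                    return False      # a dynamic "type error"
                i += 1
        run *= 2                      # one, two, four, eight of each...
    return True
```

So `[1, 2.0, [], 3, 4, 5.0, 6.0, [], []]` is well patterned, while
`[1, 2, []]` fails at its second element.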

Does this help explain dynamic typing as a form of typing?

-Chris
From: Chris Smith
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <MPG.1f0a1caffd8f6cbf989717@news.altopia.net>
Chris F Clark <···@shell01.TheWorld.com> wrote:
> Ok, we'll get there.  First, we need to step back in time, to when there
> was roughly algol, cobol, fortran, and lisp.  Back then, at least as I
> understood things, there were only a few types that people generally
> understood: integer, real, and (perhaps) pointer.

In order to continue my tradition of trying to be as precise as 
possible, let me point out that of course there were more "candidate 
types", as it were, and that people understood them just fine.  They 
just didn't have a type system in place to make them into real types, so 
they didn't think to call them types.

> The important thing is that the dynamism of lisp allowed one to write
> polymorphic programs, before most of us knew the term.

Sure.  In exchange for giving up the proofs of the type checker, you 
could write all kinds of programs.  To this day, we continue to write 
programs in languages with dynamic checking features because we don't 
have powerful enough type systems to express the corresponding type 
system.

> Thus, as we traverse a list, the first element might be an integer,
> the second a floating point value, the third a sub-list, the fourth
> and fifth, two more integers, and so on.  If you look statically at
> the head of the list, we have a very wide union of types going by.
> However, perhaps there is a mapping going on that can be discerned.
> For example, perhaps the list has 3 distinct types of elements
> (integers, floating points, and sub-lists) and it circles through the
> types in the order, first having one of each type, then two of each
> type, then four of each type, then eight, and so on.  The world is now
> patterned.  
> 
> However, I don't know how to write a static type annotation that
> describes exactly that type.

I believe that, in fact, it would be trivial to imagine a type system 
which is capable of expressing that type.  Okay, not trivial, since it 
appears to be necessary to conceive of an infinite family of integer 
types with only one value each, along with range types, and 
relationships between them; but it's probably not completely beyond the 
ability of a very bright 12-year-old who has someone to describe the 
problem thoroughly and help her with the notation.

The more interesting problem is whether this type system could be made: 
(a) unobtrusive enough, probably via type inference or something along 
those lines, that it would be usable in practice; and (b) general enough 
that you could define a basic type operator that applies to a wide range 
of problems rather than revising your type operator the first time you 
need a complete 3-tree of similar form.

> That may just be my lack of experience
> in writing annotations.

You would, of course, need a suitable type system first.  For example, 
it appears to me that there is simply no possible way of expressing what 
you've described in Java, even with the new generics features.  Perhaps 
it's possible in ML or Haskell (I don't know).  My point is that if you 
were allowed to design a type system to meet your needs, I bet you could 
do it.

> And, this brings us to the point, if the data that my program
> encounters doesn't match the model I have formulated, then something
> is of the wrong "type".  Equally importantly, the dynamic tags have
> helped me discover that type error.

Sure.  The important question, then, is whether there exists any program 
bug that can't be formulated as a type error.  And if so, what good does 
it do us to talk about type errors when we already have the perfectly 
good word "bug" to describe the same concept?

> Now, the example may have seemed arbitrary to you, and it was in some
> sense arbitrary.

Arbitrary is good.  One of the pesky things that keeps getting in the 
way here is the intuition attached to the word "type" that makes 
everyone think "string or integer".

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Chris F Clark
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <sddmzbzjjta.fsf@shell01.TheWorld.com>
I wrote:
> The important thing is that the dynamism of lisp allowed one to write
> polymorphic programs, before most of us knew the term.

Chris Smith <·······@twu.net> writes:

> Sure.  In exchange for giving up the proofs of the type checker, you 
> could write all kinds of programs.  To this day, we continue to write 
> programs in languages with dynamic checking features because we don't 
> have powerful enough type systems to express the corresponding type 
> system.

And to me the question is what kinds of types apply to these dynamic
programs, where in fact you may have to solve the halting problem to
know exactly when some statement is executed.  I expect that some
programs have type signatures that exceed the expressibility of any
static system (that is Turing complete).  Therefore, we need something
that "computes" the appropriate type at run-time, because we need full
Turing power to compute it.  To me such a system is a "dynamic type
system".  I think dynamic tags are a good approximation, because they
only compute what type the expression has this time.

> I believe that, in fact, it would be trivial to imagine a type system 
> which is capable of expressing that type.  Okay, not trivial, since it 
> appears to be necessary to conceive of an infinite family of integer 
> types with only one value each, along with range types, and 
> relationships between them; but it's probably not completely beyond the 
> ability of a very bright 12-year-old who has someone to describe the 
> problem thoroughly and help her with the notation.

Well, it looks like you are right, in that I see the following is a
Haskell program that looks essentially correct.  I wanted something that was
simple enough that one could see that it could be computed, but which
was complex enough that it had to be computed (and couldn't be
statically expressed with a system that did no "type computations").
Perhaps, there is no such beast.  Or, perhaps I just can't formulate
it.  Or, perhaps we have static type checkers which can do
computations of unbounded complexity.  However, I thought that one of
the characteristics of type systems was that they did not allow
unbounded complexity and weren't Turing Complete.

> You would, of course, need a suitable type system first.  For example, 
> it appears to me that there is simply no possible way of expressing what 
> you've described in Java, even with the new generics features.  Perhaps 
> it's possible in ML or Haskell (I don't know).  My point is that if you 
> were allowed to design a type system to meet your needs, I bet you could 
> do it.

Or, I could do as I think the dynamic programmers do, dispense with
trying to formulate a sufficiently general type system and just check
the tags at the appropriate points.

> Sure.  The important question, then, is whether there exists any program 
> bug that can't be formulated as a type error.

If you allow Turing Complete type systems, then I would say no--every
bug can be reformulated as a type error.  If you require your type
system to be less powerful, then some bugs must escape it.

-Chris
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151384511.223582.305760@m73g2000cwd.googlegroups.com>
Chris F Clark wrote:
>
> And to me the question is what kinds of types apply to these dynamic
> programs, where in fact you may have to solve the halting problem to
> know exactly when some statement is executed.  I expect that some
> programs have type signatures that exceed the expressibility of any
> static system (that is Turing complete).  Therefore, we need something
> that "computes" the appropriate type at run-time, because we need full
> Turing power to compute it.  To me such a system is a "dynamic type
> system".  I think dynamic tags are a good approximation, because they
> only compute what type the expression has this time.

Yes, an important question (IMHO the *more* important question
than the terminology) is what *programs* do we give up if we
wish to use static typing? I have never been able to pin this
one down at all.

Frankly, if the *only* issue between static and dynamic was
the static checking of the types, then static typing would
unquestionably be superior. But that's not the only issue.
There are also what I call "packaging" issues, such as
being able to run partly-wrong programs on purpose so
that one would have the opportunity to do runtime analysis
without having to, say, implement parts of some interface
that one isn't interested in testing yet. These could also
be solved in a statically typed language. (Although
historically it hasn't been done that way.)

The real question is, are there some programs that we
can't write *at all* in a statically typed language, because
they'll *never* be typable? I think there might be, but I've
never been able to find a solid example of one.


> Perhaps, there is no such beast.  Or, perhaps I just can't formulate
> it.  Or, perhaps we have static type checkers which can do
> computations of unbounded complexity.  However, I thought that one of
> the characteristics of type systems was that they did not allow
> unbounded complexity and weren't Turing Complete.

The C++ type system is Turing complete, although in practical terms
it limits how much processing power it will spend on types at
compile time.


> If you allow Turing Complete type systems, then I would say no--every
> bug can be reformulated as a type error.  If you require your type
> system to be less powerful, then some bugs must escape it.

I don't think so. Even with a Turing complete type system, a program's
runtime behavior is still something different from its static behavior.
(This is the other side of the "types are not tags" issue--not only
is it the case that there are things static types can do that tags
can't, it is also the case that there are things tags can do that
static types can't.)


Marshall
From: Chris Smith
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <MPG.1f0a78b132a01d5998971f@news.altopia.net>
Marshall <···············@gmail.com> wrote:
> Yes, an important question (IMHO the *more* important question
> than the terminology) is what *programs* do we give up if we
> wish to use static typing? I have never been able to pin this
> one down at all.

You'd need to clarify what you mean by "use static typing".  Clearly, if 
you use forms of static typing that currently exist for Java, there's 
considerable limitation in expressiveness.  If you mean a hypothetical 
"perfect" type system, then there would be no interesting programs you 
couldn't write, but such a system may not be possible to implement, and 
certainly isn't feasible.

So it seems to me that we have this ideal point at which it is possible 
to write all correct or interesting programs, and impossible to write 
buggy programs.  Pure static type systems try to approach that point 
from the conservative side.  Pure dynamic systems (and I actually think 
that design-by-contract languages and such apply here just as well as 
type tagging) try to approach that point via runtime verification from 
the liberal side.  (No, this has nothing to do with George Bush or Noam 
Chomsky... :)  Either side finds that the closer they get, the more 
creative they need to be to avoid creating development work that's no 
longer commensurate with the gains involved (whether in safety or 
expressiveness).

I think I am now convinced that the answer to the question of whether 
there is a class of type errors in dynamically checked languages is the 
same as the answer for static type systems: no.  In both cases, it's 
just a question of what systems may be provided for the purposes of 
verifying (more or less conservatively or selectively) certain program 
properties.  Type tags or other features that "look like" types are only 
relevant in that they encapsulate common kinds of constraints on program 
correctness without requiring the programmer to express those 
constraints in any more general purpose language.

As for what languages to use right now, we are interestingly enough back 
where Xah Lee started this whole thread.  All languages (except a few of 
the more extreme statically typed languages) are Turing complete, so in 
order to compare expressiveness, we'd need some other measure that 
considers some factor or factors beyond which operations are 
expressible.

> There are also what I call "packaging" issues, such as
> being able to run partly-wrong programs on purpose so
> that one would have the opportunity to do runtime analysis
> without having to, say, implement parts of some interface
> that one isn't interested in testing yet. These could also
> be solved in a statically typed language. (Although
> historically it hasn't been done that way.)

I'll note very briefly that the built-in compiler for the Eclipse IDE 
has the very nice feature that when a particular method fails to 
compile, it can still produce a result, but replaces the implementation 
of that method with one that just throws an Error.  I've taken advantage 
of that on occasion, though it doesn't allow the same range of options 
as a language that will just go ahead and try the buggy operations.

> The real question is, are there some programs that we
> can't write *at all* in a statically typed language, because
> they'll *never* be typable? I think there might be, but I've
> never been able to find a solid example of one.

This seems to depend on the specific concept of equivalence of programs 
that you have in mind.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Dr.Ruud
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7r5vs.19g.1@news.isolution.nl>
Chris Smith schreef:

> So it seems to me that we have this ideal point at which it is
> possible to write all correct or interesting programs, and impossible
> to write buggy programs.

I think that is a misconception. Even at the idealest point it will be
possible (and easy) to write buggy programs. Gödel!

-- 
Affijn, Ruud

"Gewoon is een tijger."
From: Chris Smith
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <MPG.1f0b1db7c8cc6aab989721@news.altopia.net>
Dr.Ruud <··········@isolution.nl> wrote:
> > So it seems to me that we have this ideal point at which it is
> > possible to write all correct or interesting programs, and impossible
> > to write buggy programs.
> 
> I think that is a misconception. Even at the idealest point it will be
> possible (and easy) to write buggy programs. Gödel!

I agree.  I never said that the ideal point is achievable... only that 
there is a convergence toward it.  (In some cases, that convergence may 
have stalled at some equilibrium point because of the costs of being 
near that point.)

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Pascal Bourguignon
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <87odwf3zuh.fsf@thalassa.informatimago.com>
"Marshall" <···············@gmail.com> writes:
> Yes, an important question (IMHO the *more* important question
> than the terminology) is what *programs* do we give up if we
> wish to use static typing? I have never been able to pin this
> one down at all.

Well, given Turing Machine equivalence...

I'd mention retrospective software.  But you can always implement the
wanted retrospective features as a layer above the statically typed
language.

So the question is how much work the programmer needs to do to
implement a given program with static typing or with dynamic typing.


> The real question is, are there some programs that we
> can't write *at all* in a statically typed language, because
> they'll *never* be typable? I think there might be, but I've
> never been able to find a solid example of one.

More than *never* typable, you want to identify some kind of software
that is not *economically* statically typable.

Was it costlier to develop the software written in non-statically-typed
programming languages than it would have been in a statically typed
programming language?

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/

"I have challenged the entire quality assurance team to a Bat-Leth
contest.  They will not concern us again."
From: Joe Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151425350.679754.278820@i40g2000cwc.googlegroups.com>
Marshall wrote:
>
> Yes, an important question (IMHO the *more* important question
> than the terminology) is what *programs* do we give up if we
> wish to use static typing? I have never been able to pin this
> one down at all.

It would depend on the type system, naturally.

It isn't clear to me which programs we would have to give up, either.
I don't have much experience in sophisticated typed languages.  It is
rather easy to find programs that baffle an unsophisticated typed
language (C, C++, Java, etc.).

Looking back in comp.lang.lisp, I see these examples:

(defun noisy-apply (f arglist)
  (format t "I am now about to apply ~s to ~s" f arglist)
  (apply f arglist))

(defun blackhole (argument)
  (declare (ignore argument))
  #'blackhole)

But wait a sec.  It seems that these were examples I invented in
response to the same question from you!


>
> The real question is, are there some programs that we
> can't write *at all* in a statically typed language, because
> they'll *never* be typable?

Certainly!  As soon as you can reflect on the type system you can
construct programs that type-check iff they fail to type-check.

> > Perhaps, there is no such beast.  Or, perhaps I just can't formulate
> > it.  Or, perhaps we have static type checkers which can do
> > computations of unbounded complexity.  However, I thought that one of
> > the characteristics of type systems was that they did not allow
> > unbounded complexity and weren't Turing Complete.
>
> The C++ type system is Turing complete, although in practical terms
> it limits how much processing power it will spend on types at
> compile time.

I think templates only have to expand to seven levels, so you are
severely limited here.

> > If you allow Turing Complete type systems, then I would say no--every
> > bug can be reformulated as a type error.  If you require your type
> > system to be less powerful, then some bugs must escape it.
>
> I don't think so. Even with a Turing complete type system, a program's
> runtime behavior is still something different from its static behavior.
> (This is the other side of the "types are not tags" issue--not only
> is it the case that there are things static types can do that tags
> can't, it is also the case that there are things tags can do that
> static types can't.)

I agree.  The point of static checking is to *not* run the program.  If
the type system gets too complicated, it may be a de-facto interpreter.
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151432002.905190.152470@y41g2000cwy.googlegroups.com>
Joe Marshall wrote:
> Marshall wrote:
> >
> > Yes, an important question (IMHO the *more* important question
> > than the terminology) is what *programs* do we give up if we
> > wish to use static typing? I have never been able to pin this
> > one down at all.
>
> It would depend on the type system, naturally.

Naturally!


> It isn't clear to me which programs we would have to give up, either.
> I don't have much experience in sophisticated typed languages.  It is
> rather easy to find programs that baffle an unsophisticated typed
> language (C, C++, Java, etc.).

C and Java, certainly, but I'm wary these days about making
any statement about limitations on C++'s type system, for it is
subtle and quick to anger.


> Looking back in comp.lang.lisp, I see these examples:
>
> (defun noisy-apply (f arglist)
>   (format t "I am now about to apply ~s to ~s" f arglist)
>   (apply f arglist))
>
> (defun blackhole (argument)
>   (declare (ignore argument))
>   #'blackhole)
>
> But wait a sec.  It seems that these were examples I invented in
> response to the same question from you!

Ah, how well I remember that thread, and how little I got from it.

My memories of that thread are
1) the troll who claimed that Java was unable to read two numbers from
standard input and add them together.
2) The fact that all the smart people agreed the blackhole
function indicated something profound.
3) The fact that I didn't understand the black hole function.

The noisy-apply function I think I understand; it's generic on the
entire arglist. In fact, if I read it correctly, it's even generic
on the *arity* of the function, which is actually pretty impressive.
True? This is an issue I've been wrestling with in my own type
system investigations: how to address genericity across arity.

Does noisy-apply get invoked the same way as other functions?
That would be cool.

As to the black hole function, could you explain it a bit? I apologize
for my lisp-ignorance. I am sure there is a world of significance
in the # ' on the third line, but I have no idea what it is.


> > > If you allow Turing Complete type systems, then I would say no--every
> > > bug can be reformulated as a type error.  If you require your type
> > > system to be less powerful, then some bugs must escape it.
> >
> > I don't think so. Even with a Turing complete type system, a program's
> > runtime behavior is still something different from its static behavior.
> > (This is the other side of the "types are not tags" issue--not only
> > is it the case that there are things static types can do that tags
> > can't, it is also the case that there are things tags can do that
> > static types can't.)
>
> I agree.  The point of static checking is to *not* run the program.  If
> the type system gets too complicated, it may be a de-facto interpreter.

Aha! A lightbulb just went off.

Consider that a program is a function parameterized on its input.
Static analysis is exactly the field of what we can say about a program
without knowing the values of its parameters.


Marshall
From: Joe Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151443591.286204.16620@x69g2000cwx.googlegroups.com>
Marshall wrote:
> Joe Marshall wrote:
> > Looking back in comp.lang.lisp, I see these examples:
> >
> > (defun noisy-apply (f arglist)
> >   (format t "I am now about to apply ~s to ~s" f arglist)
> >   (apply f arglist))
> >
> > (defun blackhole (argument)
> >   (declare (ignore argument))
> >   #'blackhole)
> >
> > But wait a sec.  It seems that these were examples I invented in
> > response to the same question from you!
>
> Ah, how well I remember that thread, and how little I got from it.
>
> My memories of that thread are
> 1) the troll who claimed that Java was unable to read two numbers from
> standard input and add them together.
> 2) The fact that all the smart people agreed the blackhole
> function indicated something profound.
> 3) The fact that I didn't understand the black hole function.
>
> The noisy-apply function I think I understand; it's generic on the
> entire arglist. In fact, if I read it correctly, it's even generic
> on the *arity* of the function, which is actually pretty impressive.
> True?

Yes.

> This is an issue I've been wrestling with in my own type
> system investigations: how to address genericity across arity.
>
> Does noisy-apply get invoked the same way as other functions?
> That would be cool.

Yes, but in testing I found an incompatibility.  Here's a better
definition:

(defun noisy-apply (f &rest args)
  (format t "~&Applying ~s to ~s." f (apply #'list* args))
  (apply f (apply #'list* args)))

CL-USER> (apply #'+ '(2 3))
5

CL-USER> (noisy-apply #'+ '(2 3))
Applying #<function + 20113532> to (2 3).
5

The internal application of list* flattens the argument list.  The
Common Lisp APPLY function has variable arity and this change makes my
NOISY-APPLY be identical.
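For non-Lispers, the spread-the-last-argument convention of Common Lisp's APPLY can be sketched in Python (the names here are invented for illustration; this is not code from the original thread):

```python
import operator

def noisy_apply(f, *args):
    # Like Common Lisp's APPLY: the final argument is a list of the
    # remaining arguments, which gets spliced onto the ones before it.
    flat = list(args[:-1]) + list(args[-1])
    print(f"Applying {f} to {flat}.")
    return f(*flat)

noisy_apply(operator.add, [2, 3])   # -> 5
noisy_apply(operator.add, 2, [3])   # -> 5 as well
```

As with NOISY-APPLY above, the wrapper is generic over both the argument types and the arity of the function it applies.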

> As to the black hole function, could you explain it a bit?

It is a function that takes any argument and returns the function
itself.

> I apologize
> for my lisp-ignorance. I am sure there is a world of significance
> in the # ' on the third line, but I have no idea what it is.

The #' is a namespace operator.  Since functions and variables have
different namespaces, I'm indicating that I wish to return the function
named blackhole rather than any variable of the same name that might be
around.
From: Pascal Costanza
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <4gdn8pF1l8ltjU1@individual.net>
Joe Marshall wrote:

> Yes, but in testing I found an incompatibility.  Here's a better
> definition:
> 
> (defun noisy-apply (f &rest args)
>   (format t "~&Applying ~s to ~s." f (apply #'list* args))
>   (apply f (apply #'list* args)))

This should be as follows, shouldn't it?

(defun noisy-apply (f &rest args)
   (format t "~&Applying ~s to ~s." f (apply #'list* args))
   (apply f args))


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: Joe Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151449779.523778.310160@m73g2000cwd.googlegroups.com>
Pascal Costanza wrote:
> Joe Marshall wrote:
>
> > Yes, but in testing I found an incompatibility.  Here's a better
> > definition:
> >
> > (defun noisy-apply (f &rest args)
> >   (format t "~&Applying ~s to ~s." f (apply #'list* args))
> >   (apply f (apply #'list* args)))
>
> This should be as follows, shouldn't it?
>
> (defun noisy-apply (f &rest args)
>    (format t "~&Applying ~s to ~s." f (apply #'list* args))
>    (apply f args))

Try it.
From: Pascal Costanza
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <4gfchrF1mpbg7U2@individual.net>
Joe Marshall wrote:
> Pascal Costanza wrote:
>> Joe Marshall wrote:
>>
>>> Yes, but in testing I found an incompatibility.  Here's a better
>>> definition:
>>>
>>> (defun noisy-apply (f &rest args)
>>>   (format t "~&Applying ~s to ~s." f (apply #'list* args))
>>>   (apply f (apply #'list* args)))
>> This should be as follows, shouldn't it?
>>
>> (defun noisy-apply (f &rest args)
>>    (format t "~&Applying ~s to ~s." f (apply #'list* args))
>>    (apply f args))
> 
> Try it.

Ah, right. Sorry, I have confused this with a different idiom.

Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: Juho Snellman
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <slrnea5092.kvn.jsnell@sbz-30.cs.Helsinki.FI>
> Joe Marshall wrote:
>> Yes, but in testing I found an incompatibility.  Here's a better
>> definition:
>> 
>> (defun noisy-apply (f &rest args)
>>   (format t "~&Applying ~s to ~s." f (apply #'list* args))
>>   (apply f (apply #'list* args)))
[...]

How about:

(defun noisy-apply (f &rest args)
  (format t "~&Applying ~s to ~s." f (apply #'list* args))
  (apply #'apply f args))

-- 
Juho Snellman
From: Joe Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151514067.732996.286470@d56g2000cwd.googlegroups.com>
Juho Snellman wrote:
> > Joe Marshall wrote:
> >> Yes, but in testing I found an incompatibility.  Here's a better
> >> definition:
> >>
> >> (defun noisy-apply (f &rest args)
> >>   (format t "~&Applying ~s to ~s." f (apply #'list* args))
> >>   (apply f (apply #'list* args)))
> [...]
>
> How about:
>
> (defun noisy-apply (f &rest args)
>   (format t "~&Applying ~s to ~s." f (apply #'list* args))
>   (apply #'apply f args))

Nice!
From: Pascal Costanza
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <4gdkr6F1ldalhU4@individual.net>
Marshall wrote:
> Joe Marshall wrote:
>> Marshall wrote:

>> It isn't clear to me which programs we would have to give up, either.
>> I don't have much experience in sophisticated typed languages.  It is
>> rather easy to find programs that baffle an unsophisticated typed
>> language (C, C++, Java, etc.).
> 
> C and Java, certainly, but I'm wary these days about making
> any statement about limitations on C++'s type system, for it is
> subtle and quick to anger.
> 
>> Looking back in comp.lang.lisp, I see these examples:
>>
>> (defun noisy-apply (f arglist)
>>   (format t "I am now about to apply ~s to ~s" f arglist)
>>   (apply f arglist))
>>
>> (defun blackhole (argument)
>>   (declare (ignore argument))
>>   #'blackhole)
>>
> The noisy-apply function I think I understand; it's generic on the
> entire arglist. In fact, if I read it correctly, it's even generic
> on the *arity* of the function, which is actually pretty impressive.
> True? This is an issue I've been wrestling with in my own type
> system investigations: how to address genericity across arity.
> 
> Does noisy-apply get invoked the same way as other functions?
> That would be cool.
> 
> As to the black hole function, could you explain it a bit? I apologize
> for my lisp-ignorance. I am sure there is a world of significance
> in the # ' on the third line, but I have no idea what it is.

You can ignore the #'. In Scheme this is as follows:

(define (blackhole argument)
   blackhole)

It just means that the function blackhole returns the function blackhole.
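The same one-liner works in Python, which, like Scheme, puts functions and variables in a single namespace (a sketch for illustration only):

```python
def blackhole(argument):
    # Ignore the argument and return the function object itself,
    # so any chain of calls keeps yielding blackhole.
    return blackhole

assert blackhole(1) is blackhole
assert blackhole("x")(None)(3.14) is blackhole  # calls chain indefinitely
```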


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: Chris Smith
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <MPG.1f0b5f831fd60c06989723@news.altopia.net>
Pascal Costanza <··@p-cos.net> wrote:
> You can ignore the #'. In Scheme this is as follows:
> 
> (define (blackhole argument)
>    blackhole)
> 
> It just means that the function blackhole returns the function blackhole.

So, in other words, it's equivalent to (Y (\fa.f)) in lambda calculus, 
where \ is lambda, and Y is the fixed point combinator?

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: QCD Apprentice
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7rnra$m3k$3@news.doit.wisc.edu>
Joe Marshall wrote:
> Marshall wrote:
>> Yes, an important question (IMHO the *more* important question
>> than the terminology) is what *programs* do we give up if we
>> wish to use static typing? I have never been able to pin this
>> one down at all.
> 
> It would depend on the type system, naturally.
> 
> It isn't clear to me which programs we would have to give up, either.
> I don't have much experience in sophisticated typed languages.  It is
> rather easy to find programs that baffle an unsophisticated typed
> language (C, C++, Java, etc.).
> 
> Looking back in comp.lang.lisp, I see these examples:
> 
> (defun noisy-apply (f arglist)
>   (format t "I am now about to apply ~s to ~s" f arglist)
>   (apply f arglist))
> 
> (defun blackhole (argument)
>   (declare (ignore argument))
>   #'blackhole)
> 
> But wait a sec.  It seems that these were examples I invented in
> response to the same question from you!
> 
> 
>> The real question is, are there some programs that we
>> can't write *at all* in a statically typed language, because
>> they'll *never* be typable?
> 
> Certainly!  As soon as you can reflect on the type system you can
> construct programs that type-check iff they fail to type-check.

Sorry, but can you elaborate on this last point a little?  I think I 
missed something.
From: Joe Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151445583.367455.288650@m73g2000cwd.googlegroups.com>
QCD Apprentice wrote:
> Joe Marshall wrote:
> > Marshall wrote:
> >>
> >> The real question is, are there some programs that we
> >> can't write *at all* in a statically typed language, because
> >> they'll *never* be typable?
> >
> > Certainly!  As soon as you can reflect on the type system you can
> > construct programs that type-check iff they fail to type-check.
>
> Sorry, but can you elaborate on this last point a little?  I think I
> missed something.

This is just the halting problem lifted up into the type domain.
From: Greg Buchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151449926.348233.20910@p79g2000cwp.googlegroups.com>
Joe Marshall wrote:
> It isn't clear to me which programs we would have to give up, either.
> I don't have much experience in sophisticated typed languages.  It is
> rather easy to find programs that baffle an unsophisticated typed
> language (C, C++, Java, etc.).
>
> Looking back in comp.lang.lisp, I see these examples:
>
> (defun noisy-apply (f arglist)
>   (format t "I am now about to apply ~s to ~s" f arglist)
>   (apply f arglist))

    Just for fun, here is how you might do "apply" in Haskell...

{-# OPTIONS -fglasgow-exts #-}

class Apply x y z | x y -> z where
    apply :: x -> y -> z

instance Apply (a->b) a b where
    apply f x = f x

instance Apply b as c => Apply (a->b) (a,as) c where
    apply f (x,xs) = apply (f x) xs

f :: Int -> Double -> String -> Bool -> Int
f x y z True = x + floor y * length z
f x y z False= x * floor y + length z

args =  (1::Int,(3.1415::Double,("flub",True)))

main = print $ apply f args


...which depends on the fact that functions are automatically curried
in Haskell.  You could do something similar for fully curried function
objects in C++.   Also of possible interest...

    Functions with the variable number of (variously typed) arguments
    http://okmij.org/ftp/Haskell/types.html#polyvar-fn

...But now I'm curious about how to create the "apply" function in a
language like Scheme.  I suppose you could do something like...

  (define (apply fun args)
    (eval (cons fun args)))

...but "eval" seems a little like overkill.  Is there a better way?


Greg Buchholz
From: Diez B. Roggisch
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <4gd56eF1n07dbU1@uni-berlin.de>
>> The C++ type system is Turing complete, although in practical terms
>> it limits how much processing power it will spend on types at
>> compile time.
> 
> I think templates only have to expand to seven levels, so you are
> severely limited here.

You can adjust the iteration depth. However, as Turing-completeness implies
the potential to create infinite loops, _some_ boundary has to be enforced.
Otherwise the user hitting C-c will do. :)
 
Diez
From: David Hopwood
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <Bnhog.485604$xt.51621@fe3.news.blueyonder.co.uk>
Joe Marshall wrote:
> It isn't clear to me which programs we would have to give up, either.
> I don't have much experience in sophisticated typed languages.  It is
> rather easy to find programs that baffle an unsophisticated typed
> language (C, C++, Java, etc.).
> 
> Looking back in comp.lang.lisp, I see these examples:
> 
> (defun noisy-apply (f arglist)
>   (format t "I am now about to apply ~s to ~s" f arglist)
>   (apply f arglist))

'noisy-apply' is typeable using either of these systems:

  <http://www.cs.jhu.edu/~pari/papers/fool2004/first-class_FOOL2004.pdf>
  <http://www.coli.uni-saarland.de/publikationen/softcopies/Muller:1998:TIF.pdf>

(it should be easy to see the similarity to a message forwarder).

> (defun blackhole (argument)
>   (declare (ignore argument))
>   #'blackhole)

This is typeable in any system with universally quantified types (including
most practical systems with parametric polymorphism); it has type
"forall a . a -> #'blackhole".

>>The real question is, are there some programs that we
>>can't write *at all* in a statically typed language, because
>>they'll *never* be typable?
> 
> Certainly!  As soon as you can reflect on the type system you can
> construct programs that type-check iff they fail to type-check.

A program that causes the type checker not to terminate (which is the
effect of trying to construct such a thing in the type-reflective systems
I know about) is hardly useful.

In any case, I think the question implied the condition: "and that you
can write in a language with no static type checking."

[...]
>>>If you allow Turing Complete type systems, then I would say no--every
>>>bug can be reformulated as a type error.  If you require your type
>>>system to be less powerful, then some bugs must escape it.
>>
>>I don't think so. Even with a Turing complete type system, a program's
>>runtime behavior is still something different from its static behavior.
>>(This is the other side of the "types are not tags" issue--not only
>>is it the case that there are things static types can do that tags
>>can't, it is also the case that there are things tags can do that
>>static types can't.)
> 
> I agree.  The point of static checking is to *not* run the program.  If
> the type system gets too complicated, it may be a de-facto interpreter.

The following LtU post:

  <http://lambda-the-ultimate.org/node/1085>

argues that types should be expressed in the same language that is being
typed. I am not altogether convinced, but I can see the writer's point.

There do exist practical languages in which types can be defined as
arbitrary predicates or coercion functions, for example E (www.erights.org).
E implementations currently perform type checking only at runtime, however,
although it was intended to allow static analysis for types that could be
expressed in a conventional static type system. I think the delay in
implementing this has more to do with the E developers' focus on other
(security and concurrency) issues, rather than an inherent difficulty.

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Joe Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151450513.458623.139460@b68g2000cwa.googlegroups.com>
David Hopwood wrote:
> Joe Marshall wrote:
>
> > (defun blackhole (argument)
> >   (declare (ignore argument))
> >   #'blackhole)
>
> This is typeable in any system with universally quantified types (including
> most practical systems with parametric polymorphism); it has type
> "forall a . a -> #'blackhole".

The #' is just a namespace operator.  The function accepts a single
argument and returns the function itself.  You need a type that is a
fixed point (in addition to the universally quantified argument).
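For illustration only (TypeScript, not a language under discussion in this thread): a type system that admits recursive type aliases can name that fixed point directly, which is exactly what plain HM inference refuses to do without something like -rectypes.

```typescript
// The fixed-point type: a function that accepts any argument
// and returns a function of the same type as itself.
type Blackhole = (arg: unknown) => Blackhole;

const blackhole: Blackhole = (_arg) => blackhole;

// Chained applications all type-check statically,
// and each call returns the same function.
const stillBlackhole: Blackhole = blackhole(1)("two")(null);
```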

> >>The real question is, are there some programs that we
> >>can't write *at all* in a statically typed language, because
> >>they'll *never* be typable?
> >
> > Certainly!  As soon as you can reflect on the type system you can
> > construct programs that type-check iff they fail to type-check.
>
> A program that causes the type checker not to terminate (which is the
> effect of trying to construct such a thing in the type-reflective systems
> I know about) is hardly useful.
>
> In any case, I think the question implied the condition: "and that you
> can write in a language with no static type checking."

Presumably the remainder of the program does something interesting!

The point is that there exists (by construction) programs that can
never be statically checked.  The next question is whether such
programs are `interesting'.  Certainly a program that is simply
designed to contradict the type checker is of limited entertainment
value, but there should be programs that are independently
non-checkable against a sufficiently powerful type system.
From: David Hopwood
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <Uujog.229504$8W1.48150@fe1.news.blueyonder.co.uk>
Joe Marshall wrote:
> David Hopwood wrote:
>>Joe Marshall wrote:
>>
>>>(defun blackhole (argument)
>>>  (declare (ignore argument))
>>>  #'blackhole)
>>
>>This is typeable in any system with universally quantified types (including
>>most practical systems with parametric polymorphism); it has type
>>"forall a . a -> #'blackhole".
> 
> The #' is just a namespace operator.  The function accepts a single
> argument and returns the function itself.  You need a type that is a
> fixed point (in addition to the universally quantified argument).

Yes, see the correction in my follow-up.

>>>>The real question is, are there some programs that we
>>>>can't write *at all* in a statically typed language, because
>>>>they'll *never* be typable?
>>>
>>>Certainly!  As soon as you can reflect on the type system you can
>>>construct programs that type-check iff they fail to type-check.
>>
>>A program that causes the type checker not to terminate (which is the
>>effect of trying to construct such a thing in the type-reflective systems
>>I know about) is hardly useful.
>>
>>In any case, I think the question implied the condition: "and that you
>>can write in a language with no static type checking."
> 
> Presumably the remainder of the program does something interesting!
> 
> The point is that there exists (by construction) programs that can
> never be statically checked.

I don't think you've shown that. I would like to see a more explicit
construction of a dynamically typed [*] program with a given specification,
where it is not possible to write a statically typed program that meets
the same specification using the same algorithms.

AFAIK, it is *always* possible to construct such a statically typed
program in a type system that supports typerec or gradual typing. The
informal construction you give above doesn't contradict that, because
it depends on reflecting on a [static] type system, and in a dynamically
typed program there is no type system to reflect on.


[*] Let's put aside disagreements about terminology for the time being,
    although I still don't like the term "dynamically typed".

> The next question is whether such
> programs are `interesting'.  Certainly a program that is simply
> designed to contradict the type checker is of limited entertainment
> value,

I don't know, it would probably have more entertainment value than most
executive toys :-)

> but there should be programs that are independently non
> checkable against a sufficiently powerful type system.

Why?

Note that I'm not claiming that you can check any desirable property of
a program (that would contradict Rice's Theorem), only that you can
express any dynamically typed program in a statically typed language --
with static checks where possible and dynamic checks where necessary.

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Paul Rubin
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <7xwtb2j8ir.fsf@ruckus.brouhaha.com>
David Hopwood <····················@blueyonder.co.uk> writes:
> Note that I'm not claiming that you can check any desirable property of
> a program (that would contradict Rice's Theorem), only that you can
> express any dynamically typed program in a statically typed language --
> with static checks where possible and dynamic checks where necessary.

It starts to look like sufficiently powerful static type systems are
confusing enough, that programming with them is at least as bug-prone
as imperative programming in dynamically typed languages.  The static
type checker can spot type mismatches at compile time, but the
types themselves are easier and easier to get wrong.
From: David Hopwood
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <vttog.230289$8W1.53535@fe1.news.blueyonder.co.uk>
Paul Rubin wrote:
> David Hopwood <····················@blueyonder.co.uk> writes:
> 
>>Note that I'm not claiming that you can check any desirable property of
>>a program (that would contradict Rice's Theorem), only that you can
>>express any dynamically typed program in a statically typed language --
>>with static checks where possible and dynamic checks where necessary.
> 
> It starts to look like sufficiently powerful static type systems are
> confusing enough, that programming with them is at least as bug-prone
> as imperative programming in dynamically typed languages.  The static
> type checker can spot type mismatches at compile time, but the
> types themselves are easier and easier to get wrong.

My assertion above does not depend on having a "sufficiently powerful static
type system" to express a given dynamically typed program. It is true for
static type systems that are not particularly complicated, and suffice to
express all dynamically typed programs.

I disagree with your implication that in general, static type systems for
practical languages are getting too confusing, but that's a separate issue.
(Obviously one can find type systems in research proposals that are too
confusing as they stand.)

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7tisp$n9i$1@online.de>
Paul Rubin schrieb:
> It starts to look like sufficiently powerful static type systems are
> confusing enough, that programming with them is at least as bug-prone
> as imperative programming in dynamically typed languages.  The static
> type checker can spot type mismatches at compile time, but the
> types themselves are easier and easier to get wrong.

That's where type inference comes into play. Usually you don't write the 
types, the compiler infers them for you, so you don't get them wrong.

Occasionally, you still write down the types, if only for documentation 
purposes (the general advice is: do the type annotations for external 
interfaces, but not internally).

BTW if you get a type wrong, you'll be told by the compiler, so this is 
still less evil than bugs in the code that pop up during testing (and 
*far* less evil than bugs that pop up after roll-out).
And the general consensus among FPL programmers is that you get the hang 
of it fairly quickly (one poster mentioned "two to three months" - this 
doesn't seem to be slower than learning to interpret syntax error 
messages, so it's OK considering it's an entirely new kind of diagnostics).

Regards,
Jo
From: Joe Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151520997.731688.193710@p79g2000cwp.googlegroups.com>
David Hopwood wrote:
> Joe Marshall wrote:
> >
> > The point is that there exists (by construction) programs that can
> > never be statically checked.
>
> I don't think you've shown that. I would like to see a more explicit
> construction of a dynamically typed [*] program with a given specification,
> where it is not possible to write a statically typed program that meets
> the same specification using the same algorithms.

I think we're at cross-purposes here.  It is trivial to create a static
type system that has a `universal type' and defers all the required
type narrowing to runtime.  In effect, we've `turned off' the static
typing (because everything trivially type checks at compile time).

However, a sufficiently powerful type system must be incomplete or
inconsistent (by Gödel).  A type system proves theorems about a
program.  If the type system is rich enough, then there are unprovable,
yet true theorems.  That is, programs that are type-safe but do not
type check.  A type system that could reflect on itself is easily
powerful enough to qualify.

> AFAIK, it is *always* possible to construct such a statically typed
> program in a type system that supports typerec or gradual typing. The
> informal construction you give above doesn't contradict that, because
> it depends on reflecting on a [static] type system, and in a dynamically
> typed program there is no type system to reflect on.

A static type system is usually used for universal proofs:  For all
runtime values of such and such...   Dynamic checks usually provide
existential refutations:  This particular value isn't a string.  It may
be the case that there is no universal proof, yet no existential
refutation.
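That distinction can be sketched concretely (in TypeScript, purely as an illustration, not something from the thread): a generic signature is a universal claim discharged once at compile time, while a runtime guard refutes one particular value.

```typescript
// Universal proof (static): for all types T and all runtime values x of T,
// this returns a value of the same type. Checked once, for every value.
function identity<T>(x: T): T {
  return x;
}

// Existential refutation (dynamic): rejects this particular value
// at runtime; says nothing about any other value.
function asString(x: unknown): string {
  if (typeof x !== "string") {
    throw new TypeError("this particular value is not a string");
  }
  return x;
}
```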

>
> Note that I'm not claiming that you can check any desirable property of
> a program (that would contradict Rice's Theorem), only that you can
> express any dynamically typed program in a statically typed language --
> with static checks where possible and dynamic checks where necessary.

Sure.  And I'm not saying that you cannot express these programs in a
static language, but rather that there exist programs that have
desirable properties (that run without error and produce the right
answer) but that the desirable properties cannot be statically checked.

The real question is how common these programs are, and what sort of
desirable properties are we trying to check.
From: David Hopwood
Subject: Re: What is Expressiveness in a Computer Language [correction]
Date: 
Message-ID: <7bjog.229500$8W1.29971@fe1.news.blueyonder.co.uk>
David Hopwood wrote:
> Joe Marshall wrote:
> 
>>(defun blackhole (argument)
>>  (declare (ignore argument))
>>  #'blackhole)
> 
> This is typeable in any system with universally quantified types (including
> most practical systems with parametric polymorphism); it has type
> "forall a . a -> #'blackhole".

Ignore this answer; I misinterpreted the code.

I believe this example requires recursive types. It can also be expressed
in a gradual typing system, but possibly only using an unknown ('?') type.

ISTR that O'Caml at one point (before version 1.06) supported general
recursive types, although I don't know whether it would have supported
this particular example.

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Neelakantan Krishnaswami
Subject: Re: What is Expressiveness in a Computer Language [correction]
Date: 
Message-ID: <slrnea3tdi.o85.neelk@gs3106.sp.cs.cmu.edu>
In article <······················@fe1.news.blueyonder.co.uk>, David Hopwood
wrote:
> 
> ISTR that O'Caml at one point (before version 1.06) supported general
> recursive types, although I don't know whether it would have supported
> this particular example.

Sure:

  ~: ocaml -rectypes
  	  Objective Caml version 3.08.3
  
  # let rec blackhole(arg) = blackhole;;
  val blackhole : 'b -> 'a as 'a = <fun>


This is a good example of a type system feature that I don't want,
*because* it types more programs. The reason is that programs that
typecheck with recursive types, but don't type under standard HM type
inference, are almost always erroneous programs -- such types almost
always arise from mistakes like leaving off arguments to functions.

For example, if I write the fold left function, and forget to pass it
the tail of the list in the recursive call, it typechecks:

  let rec foldl f init list = 
    match list with 
     | [] -> init
     | x :: xs -> foldl (f x init)  (* missing xs argument *)

  val foldl : ('b -> ('c -> 'b list -> 'c as 'c) -> 'a as 'a) -> 'c

I'd really prefer this to be a type error, because there actually is
an error in this program!


-- 
Neel Krishnaswami
·····@cs.cmu.edu
From: Andreas Rossberg
Subject: Re: What is Expressiveness in a Computer Language [correction]
Date: 
Message-ID: <e7tjon$9a1av$1@hades.rz.uni-saarland.de>
David Hopwood wrote:
>>
>>>(defun blackhole (argument)
>>> (declare (ignore argument))
>>> #'blackhole)

> I believe this example requires recursive types. It can also be expressed
> in a gradual typing system, but possibly only using an unknown ('?') type.
> 
> ISTR that O'Caml at one point (before version 1.06) supported general
> recursive types, although I don't know whether it would have supported
> this particular example.

No problem at all. It still is possible today if you really want:

   ~/> ocaml -rectypes
         Objective Caml version 3.08.3

   # let rec blackhole x = blackhole;;
   val blackhole : 'b -> 'a as 'a = <fun>

The problem is, though, that almost everything can be typed once you 
have unrestricted recursive types (e.g. missing arguments etc), and 
consequently many actual errors remain unflagged (which clearly shows 
that typing is not only about potential value class mismatches). 
Moreover, there are very few practical uses of such a feature, and they 
can always be coded easily with recursive datatypes.

It is a pragmatic decision born from experience that you simply do *not 
want* to have this, even though you easily could. E.g. for OCaml, 
unrestricted recursive typing was removed as default because of frequent 
user complaints.

Which is why this actually is a very bad example to choose for dynamic 
typing advocacy... ;-)

- Andreas
From: Joe Marshall
Subject: Re: What is Expressiveness in a Computer Language [correction]
Date: 
Message-ID: <1151521689.654605.261320@p79g2000cwp.googlegroups.com>
Andreas Rossberg wrote:
>
>    ~/> ocaml -rectypes
>          Objective Caml version 3.08.3
>
>    # let rec blackhole x = blackhole;;
>    val blackhole : 'b -> 'a as 'a = <fun>
>
> The problem is, though, that almost everything can be typed once you
> have unrestricted recursive types (e.g. missing arguments etc), and
> consequently many actual errors remain unflagged (which clearly shows
> that typing is not only about potential value class mismatches).
> Moreover, there are very few practical uses of such a feature, and they
> can always be coded easily with recursive datatypes.
>
> It is a pragmatic decision born from experience that you simply do *not
> want* to have this, even though you easily could. E.g. for OCaml,
> unrestricted recursive typing was removed as default because of frequent
> user complaints.
>
> Which is why this actually is a very bad example to choose for dynamic
> typing advocacy... ;-)

Actually, this seems a *good* example.  The problem seems to be that
you end up throwing the baby out with the bathwater:  your static type
system stops catching the errors you want once you make it powerful
enough to express certain programs.

So now it seems to come down to a matter of taste:  use a restricted
language and catch certain errors early or use an unrestricted language
and miss certain errors.  It is interesting that the PLT people have
made this tradeoff as well.  In the DrScheme system, there are
different `language levels' that provide a restricted subset of Scheme.
At the beginner level, first-class procedures are not allowed.  This
is obviously restrictive, but it has the advantage that extraneous
parentheses (a common beginner mistake) cannot be misinterpreted as the
intent to invoke a first-class procedure.
From: ········@ps.uni-sb.de
Subject: Re: What is Expressiveness in a Computer Language [correction]
Date: 
Message-ID: <1151615408.920886.237600@y41g2000cwy.googlegroups.com>
Joe Marshall wrote:
> Andreas Rossberg wrote:
> >
> > Which is why this actually is a very bad example to choose for dynamic
> > typing advocacy... ;-)
>
> Actually, this seems a *good* example.  The problem seems to be that
> you end up throwing the baby out with the bathwater:  your static type
> system stops catching the errors you want once you make it powerful
> enough to express certain programs.

That is not the case: you can still express these programs, you just
need to do an indirection through a datatype.
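The indirection through a datatype can be sketched like this (in TypeScript rather than OCaml, purely as an illustration): hiding the recursive occurrence behind a named wrapper gives the checker an explicit place to tie the knot, so no unrestricted recursive function type is needed.

```typescript
// The recursion goes through a named datatype (here an interface),
// not through a bare recursive function type.
interface Wrapped {
  call: (arg: unknown) => Wrapped;
}

const blackhole: Wrapped = {
  call: (_arg) => blackhole,
};

// Usage: unwrap explicitly at each application.
const still: Wrapped = blackhole.call(1).call("two");
```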

- Andreas
From: Ketil Malde
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <eg7j33xdgf.fsf@polarvier.ii.uib.no>
"Marshall" <···············@gmail.com> writes:

> There are also what I call "packaging" issues, such as
> being able to run partly-wrong programs on purpose so
> that one would have the opportunity to do runtime analysis
> without having to, say, implement parts of some interface
> that one isn't interested in testing yet. These could also
> be solved in a statically typed language. (Although
> historically it hasn't been done that way.)

I keep hearing this, but I guess I fail to understand it.  How
"partly-wrong" do you require the program to be?

During development, I frequently sprinkle my (Haskell) program with
'undefined' and 'error "implement later"' - which then lets me run the
implemented parts until execution hits one of the 'undefined's.

The "cost" of static typing for running an incomplete program is thus
simply that I have to name entities I refer to.  I'd argue that this
is good practice anyway, since it's easy to see what remains to be
done. (Python doesn't seem to, but presumably a serious dynamically
typed system would warn about references to entities not in scope?) 
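The same idiom is available in most statically typed languages; a sketch in TypeScript (an illustration, not a language from this thread) of sprinkling 'error "implement later"' stubs:

```typescript
// A stub whose return type is `never`: it satisfies any expected
// type statically, and fails only if execution actually reaches it.
function todo(what: string): never {
  throw new Error(`implement later: ${what}`);
}

type Shape =
  | { kind: "square"; side: number }
  | { kind: "circle"; r: number };

// The implemented branch runs; the stubbed branch fails only when hit.
function area(shape: Shape): number {
  if (shape.kind === "square") return shape.side * shape.side;
  return todo("circle area");
}
```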

-k
-- 
If I haven't seen further, it is by standing in the footprints of giants
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151420689.919204.12770@x69g2000cwx.googlegroups.com>
Ketil Malde wrote:
> "Marshall" <···············@gmail.com> writes:
>
> > There are also what I call "packaging" issues, such as
> > being able to run partly-wrong programs on purpose so
> > that one would have the opportunity to do runtime analysis
> > without having to, say, implement parts of some interface
> > that one isn't interested in testing yet. These could also
> > be solved in a statically typed language. (Although
> > historically it hasn't been done that way.)
>
> I keep hearing this, but I guess I fail to understand it.  How
> "partly-wrong" do you require the program to be?
>
> During development, I frequently sprinkle my (Haskell) program with
> 'undefined' and 'error "implement later"' - which then lets me run the
> implemented parts until execution hits one of the 'undefined's.
>
> The "cost" of static typing for running an incomplete program is thus
> simply that I have to name entities I refer to.  I'd argue that this
> is good practice anyway, since it's easy to see what remains to be
> done. (Python doesn't seem to, but presumably a serious dynamically
> typed system would warn about references to entities not in scope?)

I'll give you my understanding of it, but bear in mind that I only
program in statically typed languages, and I in fact do exactly what
you describe above: stub out unimplemented methods.

The issue is that, while stubbing out things may not be expensive,
it is not free either. There is some mental switching cost to being
in a state where one writes some code, wants to execute it, but
can't, because of type system constraints that are globally applicable
but not locally applicable to the task at hand, or to the specific
subprogram one is working on right at that moment.

Since I'm very used to doing it, it doesn't seem like an issue to
me, but programmers in dynamic languages complain bitterly
about this. It is my feeling that programming languages should
try to be inclusive, and since this feature is easily enough
accommodated (as a 'lenient' switch to execution), it ought
to be.


Marshall
From: Duane Rettig
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <o0sllqh9x3.fsf@franz.com>
Ketil Malde <··········@ii.uib.no> writes:

> "Marshall" <···············@gmail.com> writes:
>
>> There are also what I call "packaging" issues, such as
>> being able to run partly-wrong programs on purpose so
>> that one would have the opportunity to do runtime analysis
>> without having to, say, implement parts of some interface
>> that one isn't interested in testing yet. These could also
>> be solved in a statically typed language. (Although
>> historically it hasn't been done that way.)
>
> I keep hearing this, but I guess I fail to understand it.  How
> "partly-wrong" do you require the program to be?

This conclusion is false.  To be "wrong", whether partly or fully,
a program needs to specifically not conform to the requirements that
the programmer gives it.  You are assuming that only a completed
program can be right.  Let me give a counterexample:

Consider this part of a CL program:

CL-USER(1): (defun foo (x)
              (declare (optimize speed (safety 0) (debug 0))
                       (fixnum x))
              (bar (the fixnum (1+ x))))
FOO
CL-USER(2): 

This is of course not a complete program, because the function BAR is
not yet defined.  If I try to run it, it will of course get an error.
But does that require then that I call this a "partly wrong" program?
No, of course not; it is not a program; it is a function, and for my
purposes it is completely correct (I've debugged it already :-)

So what can we do with an incomplete program?  Nothing?  Well, no.  Of
course, we can't run it (or can we?).  But even before talking about
running the program, we can at least do some reasoning about it:

 1. It seems to have some static type declarations (in a dynamic
language?  oh my :-).

 2. The 1+ operation limits the kinds of operations that would be
acceptable, even in the absence of static type declarations.

 3. The declarations can be ignored by the CL implementation, so #2
might indeed come into play.  I ensured that the declarations would
be "trusted" in Allegro CL, by declaring high speed and low safety.

 4. I can compile this function.  Note that I get a warning about the
incompleteness of the program:

CL-USER(2): (compile 'foo)
Warning: While compiling these undefined functions were referenced: BAR.
FOO
NIL
NIL
CL-USER(3): 

 5. I can disassemble the function.  Note that it is a complete
function, despite the fact that BAR is undefined:

CL-USER(3): (disassemble 'foo)
;; disassembly of #<Function FOO>
;; formals: X
;; constant vector:
0: BAR

;; code start: #x406f07ec:
   0: 83 c0 04 addl	eax,$4
   3: 8b 5e 12 movl	ebx,[esi+18]  ; BAR
   6: b1 01     movb	cl,$1
   8: ff e7     jmp	*edi
CL-USER(4): 

 6. I can _even_ run the program!  Yes, I get an error:

CL-USER(4): (foo 10)
Error: attempt to call `BAR' which is an undefined function.
  [condition type: UNDEFINED-FUNCTION]

Restart actions (select using :continue):
 0: Try calling BAR again.
 1: Return a value instead of calling BAR.
 2: Try calling a function other than BAR.
 3: Setf the symbol-function of BAR and call it again.
 4: Return to Top Level (an "abort" restart).
 5: Abort entirely from this (lisp) process.
[1] CL-USER(5): 

 7. But note first that I have many options, including retrying the
call to BAR.  What was this call to BAR?  I can reason about that as
well, by getting a backtrace:

[1] CL-USER(5): :zoom
Evaluation stack:

   (ERROR #<UNDEFINED-FUNCTION @ #x406f3dfa>)
 ->(BAR 11)
   [... EXCL::%EVAL ]
   (EVAL (FOO 10))
   (TPL:TOP-LEVEL-READ-EVAL-PRINT-LOOP)
   (TPL:START-INTERACTIVE-TOP-LEVEL
      #<TERMINAL-SIMPLE-STREAM [initial terminal io] fd 0/1 @ #x40120e5a>
      #<Function TOP-LEVEL-READ-EVAL-PRINT-LOOP> ...)
[1] CL-USER(6): 

 8. If I then define BAR, with or without compiling it, and then take
that option to retry the call:

[1] CL-USER(6): (defun bar (x) (1- x))
BAR
[1] CL-USER(7): :cont
10
CL-USER(8): 

Hmm, I was able to complete the program dynamically?  Who ever heard
of doing that? :-)

> During development, I frequently sprinkle my (Haskell) program with
> 'undefined' and 'error "implement later"' - which then lets me run the
> implemented parts until execution hits one of the 'undefined's.

Why do it manually?  And what do you do when you've hit the undefined?
Start the program over?

> The "cost" of static typing for running an incomplete program is thus
> simply that I have to name entities I refer to.  I'd argue that this
> is good practice anyway, since it's easy to see what remains to be
> done.

Good practice, yes, but why not have the language help you to practice
it?

-- 
Duane Rettig    ·····@franz.com    Franz Inc.  http://www.franz.com/
555 12th St., Suite 1450               http://www.555citycenter.com/
Oakland, Ca. 94607        Phone: (510) 452-2000; Fax: (510) 452-0182   
From: David Hopwood
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <PUaog.484573$xt.282436@fe3.news.blueyonder.co.uk>
Marshall wrote:
> The real question is, are there some programs that we
> can't write *at all* in a statically typed language, because
> they'll *never* be typable?

In a statically typed language that has a "dynamic" type, all
dynamically typed programs are straightforwardly expressible.

In a statically typed, Turing-complete language that does not have a
"dynamic" type, it is always *possible* to express a dynamically typed
program, but this may require a global transformation of the program
or use of an interpreter -- i.e. the "Felleisen expressiveness" of the
language is reduced.

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151420883.426116.288300@b68g2000cwa.googlegroups.com>
David Hopwood wrote:
> Marshall wrote:
> > The real question is, are there some programs that we
> > can't write *at all* in a statically typed language, because
> > they'll *never* be typable?
>
> In a statically typed language that has a "dynamic" type, all
> dynamically typed programs are straightforwardly expressible.

So, how does this "dynamic" type work? It can't simply be
the "any" type, because that type has no/few functions defined
on it. It strikes me that *when* exactly binding happens will
matter. In a statically typed language, it may be that all
binding occurs at compile time, and in a dynamic language,
it may be that all binding occurs at runtime. So you might
have to change the binding mechanism as well. Does the
"dynamic" type allow this?

 
Marshall
From: David Hopwood
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <DKdog.484675$xt.25388@fe3.news.blueyonder.co.uk>
Marshall wrote:
> David Hopwood wrote:
>>Marshall wrote:
>>
>>>The real question is, are there some programs that we
>>>can't write *at all* in a statically typed language, because
>>>they'll *never* be typable?
>>
>>In a statically typed language that has a "dynamic" type, all
>>dynamically typed programs are straightforwardly expressible.
> 
> So, how does this "dynamic" type work?

<http://citeseer.ist.psu.edu/abadi89dynamic.html>

> It can't simply be the "any" type, because that type has no/few
> functions defined on it.

It isn't. From the abstract of the above paper:

  [...] even in statically typed languages, there is often the need to
  deal with data whose type cannot be determined at compile time. To handle
  such situations safely, we propose to add a type Dynamic whose values are
  pairs of a value v and a type tag T where v has the type denoted by T.
  Instances of Dynamic are built with an explicit tagging construct and
  inspected with a type safe typecase construct.

"Gradual typing" as described in
<http://www.cs.rice.edu/~jgs3847/pubs/pubs/2006/siek06:_gradual.pdf> is
another alternative. The difference between gradual typing and a
"dynamic" type is one of convenience rather than expressiveness --
gradual typing does not require explicit tagging and typecase constructs.
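The tagging and typecase machinery from that abstract can be sketched as follows (TypeScript used purely as an illustration; it is not the language of the paper). A Dynamic value is a pair of a type tag and a value, and the type-safe typecase inspects the tag before the value can be used at a specific type:

```typescript
// Dynamic: a value v paired with a type tag T, where v has the type
// denoted by T (a two-tag sketch of the construction in the abstract).
type Dynamic =
  | { tag: "number"; value: number }
  | { tag: "string"; value: string };

// Explicit tagging construct.
const dyn = (v: number | string): Dynamic =>
  typeof v === "number"
    ? { tag: "number", value: v }
    : { tag: "string", value: v };

// Type-safe typecase: each branch sees the value at its
// statically known type.
function describe(d: Dynamic): string {
  switch (d.tag) {
    case "number": return `number ${d.value + 1}`;
    case "string": return `string ${d.value.toUpperCase()}`;
  }
}
```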

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151430821.728345.31600@p79g2000cwp.googlegroups.com>
David Hopwood wrote:
> Marshall wrote:
> > David Hopwood wrote:
> >>Marshall wrote:
> >>
> >>>The real question is, are there some programs that we
> >>>can't write *at all* in a statically typed language, because
> >>>they'll *never* be typable?
> >>
> >>In a statically typed language that has a "dynamic" type, all
> >>dynamically typed programs are straightforwardly expressible.
> >
> > So, how does this "dynamic" type work?
>
> <http://citeseer.ist.psu.edu/abadi89dynamic.html>
>
> > It can't simply be the "any" type, because that type has no/few
> > functions defined on it.
>
> It isn't. From the abstract of the above paper:
>
>   [...] even in statically typed languages, there is often the need to
>   deal with data whose type cannot be determined at compile time. To handle
>   such situations safely, we propose to add a type Dynamic whose values are
>   pairs of a value v and a type tag T where v has the type denoted by T.
>   Instances of Dynamic are built with an explicit tagging construct and
>   inspected with a type safe typecase construct.

Well, all this says is that the type "dynamic" is a way to explicitly
indicate the inclusion of RTTI. But that doesn't address my objection;
if a type-safe typecase construct is required, it's not like using
a dynamic language. Dynamic languages don't require typecase to
inspect values before one can, say, invoke a function.


> "Gradual typing" as described in
> <http://www.cs.rice.edu/~jgs3847/pubs/pubs/2006/siek06:_gradual.pdf> is
> another alternative. The difference between gradual typing and a
> "dynamic" type is one of convenience rather than expressiveness --
> gradual typing does not require explicit tagging and typecase constructs.

Perhaps this is the one I should read; it sounds closer to what I'm
talking about.


Marshall
From: David Hopwood
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <B3fog.484799$xt.409283@fe3.news.blueyonder.co.uk>
Marshall wrote:
> David Hopwood wrote:
>>Marshall wrote:
>>>David Hopwood wrote:
>>>>Marshall wrote:
>>>>
>>>>>The real question is, are there some programs that we
>>>>>can't write *at all* in a statically typed language, because
>>>>>they'll *never* be typable?
>>>>
>>>>In a statically typed language that has a "dynamic" type, all
>>>>dynamically typed programs are straightforwardly expressible.
>>>
>>>So, how does this "dynamic" type work?
>>
>><http://citeseer.ist.psu.edu/abadi89dynamic.html>
>>
>>>It can't simply be the "any" type, because that type has no/few
>>>functions defined on it.
>>
>>It isn't. From the abstract of the above paper:
>>
>>  [...] even in statically typed languages, there is often the need to
>>  deal with data whose type cannot be determined at compile time. To handle
>>  such situations safely, we propose to add a type Dynamic whose values are
>>  pairs of a value v and a type tag T where v has the type denoted by T.
>>  Instances of Dynamic are built with an explicit tagging construct and
>>  inspected with a type safe typecase construct.
> 
> Well, all this says is that the type "dynamic" is a way to explicitly
> indicate the inclusion of rtti. But that doesn't address my objection;
> if a typesafe typecase construct is required, it's not like using
> a dynamic language. They don't require typecase to inspect values
> before one can, say, invoke a function.

I was answering the question posed above: "are there some programs that
we can't write *at all* in a statically typed language...?"
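
A loose Python sketch of the scheme in the abstract quoted above: a Dynamic value pairs a value with an explicit type tag, and a typecase construct is the only way to look inside. (The names `Dynamic` and `typecase` here are illustrative, not from any real library; Python's runtime tags merely stand in for the static machinery.)

```python
# Sketch of the "Dynamic" type from Abadi et al.: a value paired with
# an explicit type tag, inspected only through a typecase construct.
# All names here are illustrative.

class Dynamic:
    def __init__(self, value, tag):
        # "tag" stands in for the type denoted by T in the abstract.
        self.value = value
        self.tag = tag

def typecase(d, cases, default):
    """Type-safe inspection: run the branch matching d's tag,
    or the default branch if no tag matches."""
    handler = cases.get(d.tag)
    if handler is not None:
        return handler(d.value)
    return default(d.value)

d = Dynamic(42, "int")
result = typecase(d,
                  {"int": lambda n: n + 1,
                   "str": lambda s: len(s)},
                  lambda v: None)   # result is 43
```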

>>"Gradual typing" as described in
>><http://www.cs.rice.edu/~jgs3847/pubs/pubs/2006/siek06:_gradual.pdf> is
>>another alternative. The difference between gradual typing and a
>>"dynamic" type is one of convenience rather than expressiveness --
>>gradual typing does not require explicit tagging and typecase constructs.
> 
> Perhaps this is the one I should read; it sounds closer to what I'm
> talking about.

Right; convenience is obviously important, as well as expressiveness.

-- 
David Hopwood <····················@blueyonder.co.uk>
From: David Hopwood
Subject: Re: What is Expressiveness in a Computer Language [correction]
Date: 
Message-ID: <QGgog.485335$xt.416310@fe3.news.blueyonder.co.uk>
David Hopwood wrote:
> Marshall wrote:
>>David Hopwood wrote:
>>>Marshall wrote:
>>>
>>>>The real question is, are there some programs that we
>>>>can't write *at all* in a statically typed language, because
>>>>they'll *never* be typable?
>>>
>>>In a statically typed language that has a "dynamic" type, all
>>>dynamically typed programs are straightforwardly expressible.

Correction: as the paper on gradual typing referenced below points
out in section 5, something like the "typerec" construct of
<http://citeseer.ist.psu.edu/harper95compiling.html> would be
needed, in order for a system with a "dynamic" type to be equivalent
in expressiveness to dynamic typing or gradual typing.

("typerec" can be used to implement "dynamic"; it is more general.)

>>So, how does this "dynamic" type work?
> 
> <http://citeseer.ist.psu.edu/abadi89dynamic.html>
> 
>>It can't simply be the "any" type, because that type has no/few
>>functions defined on it.
> 
> It isn't. From the abstract of the above paper:
> 
>   [...] even in statically typed languages, there is often the need to
>   deal with data whose type cannot be determined at compile time. To handle
>   such situations safely, we propose to add a type Dynamic whose values are
>   pairs of a value v and a type tag T where v has the type denoted by T.
>   Instances of Dynamic are built with an explicit tagging construct and
>   inspected with a type safe typecase construct.
> 
> "Gradual typing" as described in
> <http://www.cs.rice.edu/~jgs3847/pubs/pubs/2006/siek06:_gradual.pdf> is
> another alternative. The difference between gradual typing and a
> "dynamic" type

This should be: the difference between gradual typing and a system with
"typerec"...

> is one of convenience rather than expressiveness --
> gradual typing does not require explicit tagging and typecase constructs.

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Pascal Costanza
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <4gdkmiF1ldalhU3@individual.net>
David Hopwood wrote:
> Marshall wrote:
>> The real question is, are there some programs that we
>> can't write *at all* in a statically typed language, because
>> they'll *never* be typable?
> 
> In a statically typed language that has a "dynamic" type, all
> dynamically typed programs are straightforwardly expressible.

What about programs where the types change at runtime?


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: David Hopwood
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <ZQhog.486001$xt.89406@fe3.news.blueyonder.co.uk>
Pascal Costanza wrote:
> David Hopwood wrote:
>> Marshall wrote:
>>
>>> The real question is, are there some programs that we
>>> can't write *at all* in a statically typed language, because
>>> they'll *never* be typable?
>>
>> In a statically typed language that has a "dynamic" type, all
>> dynamically typed programs are straightforwardly expressible.
> 
> What about programs where the types change at runtime?

Staged compilation is perfectly compatible with static typing.
Static typing only requires that it be possible to prove absence
of some category of type errors when the types are known; it
does not require that all types are known before the first-stage
program is run.

There are, however, staged compilation systems that guarantee that
the generated program will be typeable if the first-stage program
is.

(It's clear that to compare the expressiveness of statically and
dynamically typed languages, the languages must be comparable in
other respects than their type system. Staged compilation is the
equivalent feature to 'eval'.)
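
As a rough illustration of that correspondence, using Python's `compile` and `exec` merely as stand-ins for a staged system (a sketch, not how any particular staged compiler works): the first stage generates the complete second-stage program, and the generated program is checked as a whole before any of it runs.

```python
# Sketch: staged computation in place of 'eval'. The first stage builds
# the text of a second-stage program; the generated program is checked
# (here: compiled) up front, before any of its code executes.

def stage1(n):
    # First stage: generate second-stage source from a value known
    # only at (first-stage) run time.
    return f"def f(x):\n    return x * {n}\n"

source = stage1(3)
code = compile(source, "<stage2>", "exec")  # whole program checked here
ns = {}
exec(code, ns)          # install the generated definition
result = ns["f"](14)    # run the second stage: 42
```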

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Pascal Costanza
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <4gfcclF1mpbg7U1@individual.net>
David Hopwood wrote:
> Pascal Costanza wrote:
>> David Hopwood wrote:
>>> Marshall wrote:
>>>
>>>> The real question is, are there some programs that we
>>>> can't write *at all* in a statically typed language, because
>>>> they'll *never* be typable?
>>> In a statically typed language that has a "dynamic" type, all
>>> dynamically typed programs are straightforwardly expressible.
>> What about programs where the types change at runtime?
> 
> Staged compilation is perfectly compatible with static typing.
> Static typing only requires that it be possible to prove absence
> of some category of type errors when the types are known; it
> does not require that all types are known before the first-stage
> program is run.

Can you change the types of the program that is already running, or are 
the levels strictly separated?

> There are, however, staged compilation systems that guarantee that
> the generated program will be typeable if the first-stage program
> is.

...and I guess that this again reduces the kinds of things you can express.


> (It's clear that to compare the expressiveness of statically and
> dynamically typed languages, the languages must be comparable in
> other respects than their type system. Staged compilation is the
> equivalent feature to 'eval'.)

If it is equivalent to eval (i.e., executed at runtime), and the static 
type checker that is part of eval yields a type error, then you still 
get a type error at runtime!


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: David Hopwood
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <6gwog.488405$xt.80841@fe3.news.blueyonder.co.uk>
Pascal Costanza wrote:
> David Hopwood wrote:
>> Pascal Costanza wrote:
>>> David Hopwood wrote:
>>>> Marshall wrote:
>>>>
>>>>> The real question is, are there some programs that we
>>>>> can't write *at all* in a statically typed language, because
>>>>> they'll *never* be typable?
>>>>
>>>> In a statically typed language that has a "dynamic" type, all
>>>> dynamically typed programs are straightforwardly expressible.
>>>
>>> What about programs where the types change at runtime?
>>
>> Staged compilation is perfectly compatible with static typing.
>> Static typing only requires that it be possible to prove absence
>> of some category of type errors when the types are known; it
>> does not require that all types are known before the first-stage
>> program is run.
> 
> Can you change the types of the program that is already running, or are
> the levels strictly separated?

In most staged compilation systems this is intentionally not permitted.
But this is not a type system issue. You can't change the types in a
running program because you can't change the program, period. And you
can't do that because most designers of these systems consider directly
self-modifying code to be a bad idea (I agree with them).

Note that prohibiting directly self-modifying code does not prevent a
program from specifying another program to *replace* it.

>> There are, however, staged compilation systems that guarantee that
>> the generated program will be typeable if the first-stage program
>> is.
> 
> ...and I guess that this reduces again the kinds of things you can express.

Yes. If you don't want that, use a system that does not impose this
restriction/guarantee.

>> (It's clear that to compare the expressiveness of statically and
>> dynamically typed languages, the languages must be comparable in
>> other respects than their type system. Staged compilation is the
>> equivalent feature to 'eval'.)
> 
> If it is equivalent to eval (i.e., executed at runtime), and the static
> type checker that is part of eval yields a type error, then you still
> get a type error at runtime!

You can choose to use a system in which this is impossible because only
typeable programs can be generated, or one in which non-typeable programs
can be generated. In the latter case, type errors are detected at the
earliest possible opportunity, as soon as the program code is known and
before any of that code has been executed. (In the case of eval, OTOH,
the erroneous code may cause visible side effects before any run-time
error occurs.)

I don't know what else you could ask for.

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Andreas Rossberg
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7u43j$9a1av$3@hades.rz.uni-saarland.de>
David Hopwood wrote:
> 
> (In the case of eval, OTOH,
> the erroneous code may cause visible side effects before any run-time
> error occurs.)

Not necessarily. You can replace the primitive eval by compile, which 
delivers a function encapsulating the program, so you can check the type 
of the function before actually running it. Eval itself can easily be 
expressed on top of this as a polymorphic function, which does not run 
the program if it does not have the desired type:

   eval ['a] s = typecase compile s of
       f : (()->'a) -> f ()
       _ -> raise TypeError

- Andreas
From: David Hopwood
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <S3yog.230709$8W1.24255@fe1.news.blueyonder.co.uk>
Andreas Rossberg wrote:
> David Hopwood wrote:
> 
>> (In the case of eval, OTOH,
>> the erroneous code may cause visible side effects before any run-time
>> error occurs.)
> 
> Not necessarily. You can replace the primitive eval by compile, which
> delivers a function encapsulating the program, so you can check the type
> of the function before actually running it. Eval itself can easily be
> expressed on top of this as a polymorphic function, which does not run
> the program if it does not have the desired type:
> 
>   eval ['a] s = typecase compile s of
>       f : (()->'a) -> f ()
>       _ -> raise TypeError

What I meant was, in the case of eval in an untyped ("dynamically typed")
language.

The approach you've just outlined is an implementation of staged compilation
in a typed language.

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Pascal Costanza
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <4gfiufF1mg6h9U1@individual.net>
David Hopwood wrote:
> Pascal Costanza wrote:
>> David Hopwood wrote:
>>> Pascal Costanza wrote:
>>>> David Hopwood wrote:
>>>>> Marshall wrote:
>>>>>
>>>>>> The real question is, are there some programs that we
>>>>>> can't write *at all* in a statically typed language, because
>>>>>> they'll *never* be typable?
>>>>> In a statically typed language that has a "dynamic" type, all
>>>>> dynamically typed programs are straightforwardly expressible.
>>>> What about programs where the types change at runtime?
>>> Staged compilation is perfectly compatible with static typing.
>>> Static typing only requires that it be possible to prove absence
>>> of some category of type errors when the types are known; it
>>> does not require that all types are known before the first-stage
>>> program is run.
>> Can you change the types of the program that is already running, or are
>> the levels strictly separated?
> 
> In most staged compilation systems this is intentionally not permitted.
> But this is not a type system issue. You can't change the types in a
> running program because you can't change the program, period. And you
> can't do that because most designers of these systems consider directly
> self-modifying code to be a bad idea (I agree with them).

Whether you consider something you cannot do with statically typed 
languages a bad idea or not is irrelevant. You were asking for things 
that you cannot do with statically typed languages.

There are at least systems that a lot of people rely on (the JVM, .NET) 
that achieve runtime efficiency primarily by executing what is 
essentially self-modifying code. These runtime optimization systems have 
been researched primarily for the language Self, and also implemented in 
Strongtalk, CLOS, etc., to various degrees.

Beyond that, I am convinced that the ability to update a running system 
without the need to shut it down can be an important asset.

> Note that prohibiting directly self-modifying code does not prevent a
> program from specifying another program to *replace* it.

...and this creates problems with moving data from one version of a 
program to the next.


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: Matthias Blume
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <m1r719fi2a.fsf@hana.uchicago.edu>
Pascal Costanza <··@p-cos.net> writes:

> Whether you consider something you cannot do with statically typed
> languages a bad idea or not is irrelevant. You were asking for things
> that you cannot do with statically typed languages.

The whole point of static type systems is to make sure that there are
things that one cannot do.  So the fact that there are such things is
not an argument per se against static types.

[ ... ]

> Beyond that, I am convinced that the ability to update a running
> system without the need to shut it down can be an important asset.

And I am convinced that updating a running system in the style of,
e.g., Erlang, can be statically typed.

>> Note that prohibiting directly self-modifying code does not prevent a
>> program from specifying another program to *replace* it.
>
> ...and this creates problems with moving data from one version of a
> program to the next.

How does this "create" such a problem?  The problem is there in either
approach.  In fact, I believe that the best chance we have of
addressing the problem is by adopting the "replace the code" model
along with a "translate the data where necessary at the time of
replacement".  Translating the data, i.e., re-establishing the
invariants expected by the updated/replaced code, seems much harder
(to me) in the case of self-modifying code.  Erlang got this one
right.
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151518563.638141.313110@b68g2000cwa.googlegroups.com>
Matthias Blume wrote:
>
> How does this "create" such a problem?  The problem is there in either
> approach.  In fact, I believe that the best chance we have of
> addressing the problem is by adopting the "replace the code" model
> along with a "translate the data where necessary at the time of
> replacement".  Translating the data, i.e., re-establishing the
> invariants expected by the updated/replaced code, seems much harder
> (to me) in the case of self-modifying code.  Erlang got this one
> right.

Pardon my ignorance, but what is the Erlang solution to this problem?


Marshall
From: Matthias Blume
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <m1irmlf6n8.fsf@hana.uchicago.edu>
"Marshall" <···············@gmail.com> writes:

> Matthias Blume wrote:
>>
>> How does this "create" such a problem?  The problem is there in either
>> approach.  In fact, I believe that the best chance we have of
>> addressing the problem is by adopting the "replace the code" model
>> along with a "translate the data where necessary at the time of
>> replacement".  Translating the data, i.e., re-establishing the
>> invariants expected by the updated/replaced code, seems much harder
>> (to me) in the case of self-modifying code.  Erlang got this one
>> right.
>
> Pardon my ignorance, but what is the Erlang solution to this problem?

Erlang relies on a combination of purity, concurrency, and message
passing, where messages can carry higher-order values.

Data structures are immutable, and each computational agent is a
thread.  Most threads consist of a loop that explicitly passes state
around.  It dispatches on some input event, applies a state
transformer (which is a pure function), produces some output event (if
necessary), and goes back to the beginning of the loop (by
tail-calling itself) with the new state.

When a code update is necessary, a special event is sent to the
thread, passing along the new code to be executed.  The old code, upon
receiving such an event, ceases to go back to its own loop entry point
(by simply not performing the self-tail-call).  Instead it tail-calls
the new code with the current state.

If the new code can live with the same data structures that the old
code had, then it can directly proceed to perform real work.  If new
invariants are to be established, it can first run a translation
function on the current state before resuming operation.

From what I hear from the Erlang folks that I have talked to, this
approach works extremely well in practice.
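
The update protocol just described can be sketched in Python (iteratively, since Python lacks tail calls; the message shapes and names here are illustrative, not Erlang's actual API):

```python
# Sketch of the Erlang-style update above: a server loop over an inbox,
# state passed explicitly, and an 'upgrade' message carrying the new
# step function plus a translator that re-establishes its invariants.

from collections import deque

def counter_v1(state, msg):
    # Pure state transformer: old state + event -> new state.
    return state + msg

def counter_v2(state, msg):
    # New code expects the state as a dict.
    state["count"] += msg
    return state

def run(step, state, inbox):
    while inbox:
        msg = inbox.popleft()
        if isinstance(msg, tuple) and msg[0] == "upgrade":
            _, new_step, translate = msg
            # Stop looping through the old code; continue with the new
            # code after translating the current state.
            step, state = new_step, translate(state)
        else:
            state = step(state, msg)
    return state

inbox = deque([1, 2, ("upgrade", counter_v2, lambda n: {"count": n}), 3])
final = run(counter_v1, 0, inbox)   # {'count': 6}
```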
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e85866$clu$1@online.de>
Matthias Blume schrieb:
> Erlang relies on a combination of purity, concurrency, and message
> passing, where messages can carry higher-order values.
> 
> Data structures are immutable, and each computational agent is a
> thread.  Most threads consist of a loop that explicitly passes state
> around.  It dispatches on some input event, applies a state
> transformer (which is a pure function), produces some output event (if
> necessary), and goes back to the beginning of the loop (by
> tail-calling itself) with the new state.

Actually any Erlang process, when seen from the outside, is impure: it 
has observable state.
However, from what I hear, such state is kept to a minimum. I.e. the 
state involved is just the state that's mandated by the purpose of the 
process, not by computational bookkeeping - you won't send file 
descriptors in a message, but maybe information about the state of some 
hardware, or about a permanent log.

So to me, the approach of Erlang seems to amount to "make pure 
programming so easy and efficient that you aren't tempted to introduce state 
that isn't already there".

Regards,
Jo
From: Pascal Costanza
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <4gfo6tF1cju7vU1@individual.net>
Matthias Blume wrote:
> Pascal Costanza <··@p-cos.net> writes:
> 
>> Whether you consider something you cannot do with statically typed
>> languages a bad idea or not is irrelevant. You were asking for things
>> that you cannot do with statically typed languages.
> 
> The whole point of static type systems is to make sure that there are
> things that one cannot do.  So the fact that there are such things is
> not an argument per se against static types.

I am not arguing against static type systems. I am just responding to 
the question what the things are that you cannot do in statically typed 
languages.

> [ ... ]
> 
>> Beyond that, I am convinced that the ability to update a running
>> system without the need to shut it down can be an important asset.
> 
> And I am convinced that updating a running system in the style of,
> e.g., Erlang, can be statically typed.

Maybe. The interesting question then is whether you can express the 
kinds of dynamic updates that are relevant in practice. Because a static 
type system always restricts what kinds of runtime behavior you can 
express in your language. I am still skeptical, because changing the 
types at runtime is basically changing the assumptions that the static 
type checker has used to check the program's types in the first place.

For example, all the approaches that I have seen in statically typed 
languages deal with adding to a running program (say, class fields and 
methods, etc.), but not with changing or removing parts of it.

>>> Note that prohibiting directly self-modifying code does not prevent a
>>> program from specifying another program to *replace* it.
>> ...and this creates problems with moving data from one version of a
>> program to the next.
> 
> How does this "create" such a problem?  The problem is there in either
> approach.  In fact, I believe that the best chance we have of
> addressing the problem is by adopting the "replace the code" model
> along with a "translate the data where necessary at the time of
> replacement".  Translating the data, i.e., re-establishing the
> invariants expected by the updated/replaced code, seems much harder
> (to me) in the case of self-modifying code.  Erlang got this one
> right.

...and the "translate the data where necessary" approach is essentially 
triggered by a dynamic type test (if value x is of an old version of 
type T, update it to reflect the new version of type T [1]). QED.


Pascal

[1] BTW, that's also the approach taken in CLOS.

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: Matthias Blume
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <m1mzbxf725.fsf@hana.uchicago.edu>
Pascal Costanza <··@p-cos.net> writes:

>> And I am convinced that updating a running system in the style of,
>> e.g., Erlang, can be statically typed.
>
> Maybe. The interesting question then is whether you can express the
> kinds of dynamic updates that are relevant in practice. Because a
> static type system always restricts what kinds of runtime behavior you
> can express in your language. I am still skeptical, because changing
> the types at runtime is basically changing the assumptions that the
> static type checker has used to check the program's types in the first
> place.

That's why I find the Erlang model to be more promising.

I am extremely skeptical of code mutation at runtime which would
"change types", because to me types are approximations of program
invariants.  So if you do a modification that changes the types, it is
rather likely that you did something that also changed the invariants,
and existing code relying on those invariants will now break.

> For example, all the approaches that I have seen in statically typed
> languages deal with adding to a running program (say, class fields and
> methods, etc.), but not with changing to, or removing from it.

Of course, there are good reasons for that: removing fields or
changing their invariants requires that all /deployed/ code which
relied on their existence or their invariants must be made aware of
this change.  This is a semantic problem which happens to reveal
itself as a typing problem.  By making types dynamic, the problem does
not go away but is merely swept under the rug.

>>>> Note that prohibiting directly self-modifying code does not prevent a
>>>> program from specifying another program to *replace* it.
>>> ...and this creates problems with moving data from one version of a
>>> program to the next.
>> How does this "create" such a problem?  The problem is there in
>> either
>> approach.  In fact, I believe that the best chance we have of
>> addressing the problem is by adopting the "replace the code" model
>> along with a "translate the data where necessary at the time of
>> replacement".  Translating the data, i.e., re-establishing the
>> invariants expected by the updated/replaced code, seems much harder
>> (to me) in the case of self-modifying code.  Erlang got this one
>> right.
>
> ...and the "translate the data where necessary" approach is
> essentially triggered by a dynamic type test (if value x is of an old
> version of type T, update it to reflect the new version of type T
> [1]). QED.

But this test would have to already exist in code that was deployed
/before/ the change!  How did this code know what to test for, and how
did it know how to translate the data?  Plus, how do you detect that
some piece of data is "of an old version of type T"?  If v has type T
and T "changes" (whatever that might mean), how can you tell that v's
type is "the old T" rather than "the new T"!  Are there two different
Ts around now?  If so, how can you say that T has changed?

The bottom line is that even the concept of "changing types at
runtime" makes little sense.  Until someone shows me a /careful/
formalization that actually works, I just can't take it very
seriously.
From: Pascal Costanza
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <4gg291F1lsqo9U1@individual.net>
Matthias Blume wrote:
> Pascal Costanza <··@p-cos.net> writes:
> 
>>> And I am convinced that updating a running system in the style of,
>>> e.g., Erlang, can be statically typed.
>> Maybe. The interesting question then is whether you can express the
>> kinds of dynamic updates that are relevant in practice. Because a
>> static type system always restricts what kinds of runtime behavior you
>> can express in your language. I am still skeptical, because changing
>> the types at runtime is basically changing the assumptions that the
>> static type checker has used to check the program's types in the first
>> place.
> 
> That's why I find the Erlang model to be more promising.
> 
> I am extremely skeptical of code mutation at runtime which would
> "change types", because to me types are approximations of program
> invariants.  So if you do a modification that changes the types, it is
> rather likely that you did something that also changed the invariants,
> and existing code relying on those invariants will now break.

...no, it will throw exceptions that you can catch and handle, maybe 
interactively.

>> For example, all the approaches that I have seen in statically typed
>> languages deal with adding to a running program (say, class fields and
>> methods, etc.), but not with changing to, or removing from it.
> 
> Of course, there are good reasons for that: removing fields or
> changing their invariants requires that all /deployed/ code which
> relied on their existence or their invariants must be made aware of
> this change.  This is a semantic problem which happens to reveal
> itself as a typing problem.  By making types dynamic, the problem does
> not go away but is merely swept under the rug.

...and yet this requirement sometimes comes up.

>>>>> Note that prohibiting directly self-modifying code does not prevent a
>>>>> program from specifying another program to *replace* it.
>>>> ...and this creates problems with moving data from one version of a
>>>> program to the next.
>>> How does this "create" such a problem?  The problem is there in
>>> either
>>> approach.  In fact, I believe that the best chance we have of
>>> addressing the problem is by adopting the "replace the code" model
>>> along with a "translate the data where necessary at the time of
>>> replacement".  Translating the data, i.e., re-establishing the
>>> invariants expected by the updated/replaced code, seems much harder
>>> (to me) in the case of self-modifying code.  Erlang got this one
>>> right.
>> ...and the "translate the data where necessary" approach is
>> essentially triggered by a dynamic type test (if value x is of an old
>> version of type T, update it to reflect the new version of type T
>> [1]). QED.
> 
> But this test would have to already exist in code that was deployed
> /before/ the change!  How did this code know what to test for, and how
> did it know how to translate the data?

In CLOS, the test is indeed already there from the beginning. It's part 
of the runtime semantics. The translation of the data is handled by 
'update-instance-for-redefined-class, which is a generic function for 
which you can define your own methods. So this is user-extensible and 
can, for example, be provided as part of the program update.

Furthermore, slot accesses typically go through generic functions (aka 
setters and getters in other languages) for which you can provide 
methods that can perform useful behavior in case the corresponding slots 
have gone. These methods can also be provided as part of the program update.

Note that I am not claiming that CLOS provides the perfect solution, but 
it seems to work well enough that people use it in practice.
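
A minimal Python sketch of this mechanism, with a hypothetical `slot_value` standing in for CLOS slot access and a user-supplied `updater` standing in for a method on `update-instance-for-redefined-class` (all names are illustrative; real CLOS works at the metaobject level):

```python
# Sketch of the CLOS mechanism described above: instances carry a
# reference to a versioned class object; a user-extensible hook
# translates an instance's data when the instance is found to belong
# to an old version of its class.

class VersionedClass:
    def __init__(self, version, updater=None):
        self.version = version
        self.updater = updater   # user-supplied translation, like a
                                 # method on update-instance-for-redefined-class

class Instance:
    def __init__(self, cls, data):
        self.cls = cls
        self.data = data

def slot_value(inst, current_cls, name):
    # The dynamic test: is this instance of an old version of its class?
    if inst.cls.version != current_cls.version:
        inst.data = current_cls.updater(inst.data)  # translate the data
        inst.cls = current_cls                      # now up to date
    return inst.data[name]

point_v1 = VersionedClass(1)
p = Instance(point_v1, {"x": 3, "y": 4})

# Redefine the class: polar representation replaces x/y; the program
# update supplies the translation.
point_v2 = VersionedClass(
    2, updater=lambda d: {"rho": (d["x"]**2 + d["y"]**2) ** 0.5})

r = slot_value(p, point_v2, "rho")   # 5.0
```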

> Plus, how do you detect that
> some piece of data is "of an old version of type T"?  If v has type T
> and T "changes" (whatever that might mean), how can you tell that v's
> type is "the old T" rather than "the new T"!  Are there two different
> Ts around now?  If so, how can you say that T has changed?

Presumably, objects have references to their class metaobjects which 
contain version information and references to more current versions of 
themselves. This is not rocket science.

> The bottom line is that even the concept of "changing types at
> runtime" makes little sense.  Until someone shows me a /careful/
> formalization that actually works, I just can't take it very
> seriously.

Maybe this feature just isn't made for you.


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: Chris Smith
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <MPG.1f0a747a457f0a2898971e@news.altopia.net>
Chris F Clark <···@shell01.TheWorld.com> wrote:
> And to me the question is what kinds of types apply to these dynamic
> programs, where in fact you may have to solve the halting problem to
> know exactly when some statement is executed.

Yes, I believe (static) type systems will always end up approximating 
(conservatively) the possible behavior of programs in order to perform 
their verification.

> Or, perhaps we have static type checkers which can do
> computations of unbounded complexity.  However, I thought that one of
> the characteristics of type systems was that they did not allow
> unbounded complexity and weren't Turing Complete.

Honestly, I suspect you'd get disagreement within the field of type 
theory about this.  Certainly, there are plenty of researchers who have 
proposed type systems that potentially don't even terminate.  The 
consensus appears to be that they are worth studying within the field of 
type theory, but I note that Pierce still hasn't dropped the word 
"tractable" from his definition in his introductory text, despite 
acknowledging only a couple pages later that languages exist whose type 
systems are undecidable, and apparently co-authoring a paper on one of 
them.  The consensus seems to be that such systems are interesting if 
they terminate quickly for interesting cases (in which case, I suppose, 
you could hope to eventually be able to formalize what cases are 
interesting, and derive a truly tractable type system that checks that 
interesting subset).

Interestingly, Pierce gives ML as an example of a language whose type 
checker does not necessarily run in polynomial time (thus failing some 
concepts of "tractable") but that does just fine in practice.  I am just 
quoting here, so I don't know exactly how this is true.  Marshall 
mentioned template meta-programming in C++, which is definitely Turing-
complete.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Greg Buchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151368443.491445.294480@p79g2000cwp.googlegroups.com>
Chris F Clark wrote:
> Thus, as we traverse a list, the first element might be an integer,
> the second a floating point value, the third a sub-list, the fourth
> and fifth, two more integers, and so on.  If you look statically at
> the head of the list, we have a very wide union of types going by.
> However, perhaps there is a mapping going on that can be discerned.
> For example, perhaps the list has 3 distinct types of elements
> (integers, floating points, and sub-lists) and it circles through the
> types in the order, first having one of each type, then two of each
> type, then four of each type, then eight, and so on.

  Sounds like an interesting problem.  Although not the exact type
specified above, here's something pretty similar that I could think of
implementing in Haskell.  (I don't know what a "sub-list" is, for
example).  Maybe some Haskell wizard could get rid of the tuples?


data Clark a b c = Nil | Cons a (Clark b c (a,a)) deriving Show

clark = (Cons 42 (Cons 3.14 (Cons "abc"
        (Cons (1,2) (Cons (1.2,3.4) (Cons ("foo","bar")
        (Cons ((9,8),(7,6)) (Cons ((0.1,0.2),(0.3,0.4)) Nil))))))))

main = print clark
From: Chris F Clark
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <sddac7zjji4.fsf@shell01.TheWorld.com>
"Greg Buchholz" <················@yahoo.com> writes:

> Chris F Clark wrote:
> > Thus, as we traverse a list, the first element might be an integer,
> > the second a floating point value, the third a sub-list, the fourth
> > and fifth, two more integers, and so on.  If you look statically at
> > the head of the list, we have a very wide union of types going by.
> > However, perhaps there is a mapping going on that can be discerned.
> > For example, perhaps the list has 3 distinct types of elements
> > (integers, floating points, and sub-lists) and it circles through the
> > types in the order, first having one of each type, then two of each
> > type, then four of each type, then eight, and so on.
> 
>   Sounds like an interesting problem.  Although not the exact type
> specified above, here's something pretty similar that I could think of
> implementing in Haskell.  (I don't know what a "sub-list" is, for
> example).  Maybe some Haskell wizard could get rid of the tuples?
> 
> 
> data Clark a b c = Nil | Cons a (Clark b c (a,a)) deriving Show
> 
> clark = (Cons 42 (Cons 3.14 (Cons "abc"
>         (Cons (1,2) (Cons (1.2,3.4) (Cons ("foo","bar")
>         (Cons ((9,8),(7,6)) (Cons ((0.1,0.2),(0.3,0.4)) Nil))))))))
> 
> main = print clark

Very impressive.  It looks right to me and simple enough to
understand.  I must find the time to learn a modern FP language.  Can
you write a fold for this that prints the data as a binary tree of
triples?  I have to believe it isn't that hard....

-Chris
        
From: Greg Buchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151425317.853996.5040@b68g2000cwa.googlegroups.com>
Chris F Clark wrote:
> Very impressive.  It looks right to me and simple enough to
> understand.  I must find the time to learn a modern FP language.  Can
> you write a fold for this that prints the data as a binary tree of
> triples?  I have to believe it isn't that hard....

{- Refactoring this as a fold is left as an exercise to the reader -}

data Clark a b c = Nil | Cons a (Clark b c (a,a)) deriving Show

clark =
  (Cons 42 (Cons 3.14 (Cons "abc"
  (Cons (1,2) (Cons (1.2,3.4) (Cons ("foo","bar")
  (Cons ((9,8),(7,6)) (Cons ((0.1,0.2),(0.3,0.4))
                        (Cons (("w","x"),("y","z")) Nil)))))))))

main = print (toTree clark)

data Tree a = Empty | Node a (Tree a) (Tree a) deriving Show

toTree :: Clark a b c -> Tree (a,b,c)
toTree (Cons x (Cons y (Cons z rest)))
         = Node (x,y,z)
                (toTree (fst (lift_c rest)))
                (toTree (snd (lift_c rest)))
toTree _ = Empty

lift_c :: Clark (a,a) (b,b) (c,c) -> (Clark a b c, Clark a b c)
lift_c Nil = (Nil,Nil)
lift_c (Cons (x,y) rest) = (Cons x (fst (lift_c rest)),
                            Cons y (snd (lift_c rest)))
From: David Hopwood
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <G%xng.477395$tc.466834@fe2.news.blueyonder.co.uk>
Chris F Clark wrote:
> Chris Smith <·······@twu.net> writes: 
>  
>>Unfortunately, I have to again reject this idea.  There is no such
>>restriction on type theory.  Rather, the word type is defined by type
>>theorists to mean the things that they talk about. 
> 
> Do you reject that there could be something more general than what a 
> type theorist discusses?  Or do you reject calling such things a type?
>  
> Later you write: 
> 
>>because we could say that anything that checks types is a type system,
>>and then worry about verifying that it's a sound type system without
>>worrying about whether it's a subset of the perfect type system. 
> 
> I'm particularly interested if something unsound (and perhaps 
> ambiguous) could be called a type system.

Yes, but not a useful one. The situation is the same as with unsound
formal systems; they still satisfy the definition of a formal system.

-- 
David Hopwood <····················@blueyonder.co.uk>
From: David Hopwood
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <U1yng.477449$tc.249586@fe2.news.blueyonder.co.uk>
David Hopwood wrote:
> Chris F Clark wrote:
> 
>>I'm particularly interested if something unsound (and perhaps 
>>ambiguous) could be called a type system.
> 
> Yes, but not a useful one. The situation is the same as with unsound
> formal systems; they still satisfy the definition of a formal system.

I meant "inconsistent formal systems".

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Dr.Ruud
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7hmnh.1ds.1@news.isolution.nl>
Chris Smith schreef:

> Static types are not fuzzy

Static types can be fuzzy as well. For example: a language can define
that extra accuracy and bits may be used for the implementation of
calculations: d = a * b / c
Often some minimum is guaranteed.


> I see it as quite reasonable when there's an effort by several
> participants in this thread to either imply or say outright that
> static type systems and dynamic type systems are variations of
> something generally called a "type system", and given that static
> type systems are quite formally defined, that we'd want to see a
> formal definition for a dynamic type system before accepting the
> proposition that they are of a kind with each other.

The 'dynamic type' is just another type.

-- 
Affijn, Ruud

"Gewoon is een tijger."
From: Chris Smith
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <MPG.1f05f9415a672d0c9896ee@news.altopia.net>
Dr.Ruud <··········@isolution.nl> wrote:
> Chris Smith schreef:
> 
> > Static types are not fuzzy
> 
> Static types can be fuzzy as well. For example: a language can define
> that extra accuracy and bits may be used for the implementation of
> calculations: d = a * b / c
> Often some minimum is guaranteed.

Who cares how many bits are declared by the type?  Some specific 
implementation may tie that kind of representation information to the 
type for convenience, but the type itself is not affected by how many 
bits are used by the representation.  The only thing affected by the 
representation is the evaluation semantics and the performance of the 
language, both of which only come into play after it's been shown that 
the program is well-typed.
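A small illustration (mine, in Haskell): two types can share a representation bit for bit and still be distinct to the checker, because the type judgement never consults the representation.

```haskell
-- Identical representation (a Double), distinct types: the checker
-- rejects mixing them, regardless of how many bits either one uses.
newtype Celsius    = Celsius Double    deriving (Show, Eq)
newtype Fahrenheit = Fahrenheit Double deriving (Show, Eq)

toF :: Celsius -> Fahrenheit
toF (Celsius c) = Fahrenheit (c * 9 / 5 + 32)
```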

> The 'dynamic type' is just another type.

That's essentially equivalent to giving up.  I doubt many people would 
be happy with the conclusion that dynamically typed languages are typed, 
but have only one type which is appropriate for all possible operations.  
That type system would not be implemented, since it's trivial and 
behaves identically to the lack of a type system, and then we're back 
where we started.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Anton van Straaten
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <g7eng.156935$F_3.115360@newssvr29.news.prodigy.net>
Chris Smith wrote:
> Dr.Ruud <··········@isolution.nl> wrote:
...
>>The 'dynamic type' is just another type.
> 
> 
> That's essentially equivalent to giving up.  I doubt many people would 
> be happy with the conclusion that dynamically typed languages are typed, 
> but have only one type which is appropriate for all possible operations.  

I'm not sure if this is what you're getting at, but what you've written 
is precisely the position that type theorists take.  Having "only one 
type which is appropriate for all possible operations" is exactly what 
the term "untyped" implies.
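To make the "one type" reading concrete, here is a rough Haskell sketch: collapse every dynamic value into a single universal static type, and what were dynamic type errors become run-time checks on the tags.

```haskell
-- The "unityped" view: one static type for all values, with the
-- run-time tags living inside it.
data Univ = UNum Double
          | UStr String
          | UBool Bool
  deriving (Show, Eq)

-- A "dynamic type error" is just a failed run-time tag check:
uAdd :: Univ -> Univ -> Univ
uAdd (UNum x) (UNum y) = UNum (x + y)
uAdd _        _        = error "type error: + expects numbers"
```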

> That type system would not be implemented, since it's trivial and 
> behaves identically to the lack of a type system, and then we're back 
> where we started.

This is why I've suggested that "untyped" can be a misleading term, when 
applied to dynamically-typed languages.

Anton
From: Vesa Karvonen
Subject: Saying "latently-typed language" is making a category mistake
Date: 
Message-ID: <e7gdcf$65m$1@oravannahka.helsinki.fi>
In comp.lang.functional Anton van Straaten <·····@appsolutions.com> wrote:
[...]
> I reject this comparison.  There's much more to it than that.  The point 
> is that the reasoning which programmers perform when working with a 
> program in a latently-typed language bears many close similarities to 
> the purpose and behavior of type systems.

> This isn't an attempt to jump on any bandwagons, it's an attempt to 
> characterize what is actually happening in real programs and with real 
> programmers.  I'm relating that activity to type systems because that is 
> what it most closely relates to.
[...]

I think that we're finally getting to the bottom of things.  While reading
your responses something became very clear to me: latent-typing and latent-
types are not a property of languages.  Latent-typing, also known as
informal reasoning, is something that all programmers do as a normal part
of programming.  To say that a language is latently-typed is to make a
category mistake, because latent-typing is not a property of languages.

A programmer, working in any language, whether typed or not, performs
informal reasoning.  I think it is fair to say that there is a
correspondence between type theory and such informal reasoning.  The
correspondence is like the correspondence between informal and formal
math.  *But* , informal reasoning (latent-typing) is not a property of
languages.

An example of a form of informal reasoning that (practically) every
programmer does daily is termination analysis.  There are type systems
that guarantee termination, but I think it is fair to say that it is not
yet understood how to make a practical general purpose language, whose
type system would guarantee termination (or at least I'm not aware of such
a language).  It should also be clear that termination analysis need not
be done informally.  Given a program, it may be possible to formally prove
that it terminates.
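One formal counterpart of that daily reasoning, sketched in Haskell: structural recursion.  Each recursive call receives a strict sub-structure of the argument, so the recursion must bottom out; type systems that guarantee termination essentially enforce this shape.

```haskell
-- Termination by structural recursion: xs is strictly smaller than
-- (_ : xs), so len cannot recurse forever on a finite list.
len :: [a] -> Int
len []       = 0
len (_ : xs) = 1 + len xs
```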

I'm now more convinced than ever that "(latently|dynamically)-typed
language" is an oxymoron.  The terminology really needs to be fixed.

-Vesa Karvonen
From: Chris Smith
Subject: Re: Saying "latently-typed language" is making a category mistake
Date: 
Message-ID: <MPG.1f05bdcdada1826b9896e9@news.altopia.net>
Vesa Karvonen <·············@cs.helsinki.fi> wrote:
> I think that we're finally getting to the bottom of things.  While reading
> your reponses something became very clear to me: latent-typing and latent-
> types are not a property of languages.  Latent-typing, also known as
> informal reasoning, is something that all programmers do as a normal part
> of programming.  To say that a language is latently-typed is to make a
> category mistake, because latent-typing is not a property of languages.

It's pretty clear to me that there are two different things here.  There 
is the set of reasoning that is done about programs by programmers.  
This may or may not have some subset called "type reasoning", depending 
on whether there's a real difference between type reasoning and any 
other reasoning about program behavior, or it's just a word that we've 
learned from early experience to use in certain rather arbitrary 
contexts.  In either case, then there are language features that look 
sort of type-like which can support that same reasoning about program 
behavior.

It is, to me, an interesting question whether those language features 
are really intrinsically connected to that form of reasoning about 
program behavior, or whether they are just one way of doing it where 
others would suffice as well.  Clearly, there is some variation in these 
supporting language features between languages that have them... for 
example, whether there are only a fixed set of type tags, or an infinite 
(or logically infinite, anyway) set, or whether the type tags may have 
other type tags as parameters, or whether they even exist as tags at all 
or as sets of possible behaviors.  Across all these varieties, the 
phrase "latently typed language" seems to be a reasonable shorthand for 
"language that supports the programmer in doing latent-type reasoning".  
Of course, there are still problems here, but really only if you share 
my skepticism that there's any difference between type reasoning and any 
other reasoning that's done by programmers on their programs.

Perhaps, if you wanted to be quite careful about your distinctions, you 
may want to choose different words for the two.  However, then you run 
into that same pesky "you can't choose other people's terminology" block 
again.

> A programmer, working in any language, whether typed or not, performs
> informal reasoning.  I think it is fair to say that there is a
> correspondence between type theory and such informal reasoning.  The
> correspondence is like the correspondence between informal and formal
> math.

I'd be more careful in the analogy, there.  Since type theory (as 
applied to programming languages, anyway) is really just a set of 
methods of proving arbitrary statements about program behaviors, a 
corresponding "informal type theory" would just be the entire set of 
informal reasoning by programmers about their programs.  That would 
agree with my skepticism, but probably not with too many other people.

> An example of a form of informal reasoning that (practically) every
> programmer does daily is termination analysis.

Or perhaps you agree with my skepticism, here.  This last paragraph 
sounds like you're even more convinced of it than I am.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Chris Uppal
Subject: Re: Saying "latently-typed language" is making a category mistake
Date: 
Message-ID: <449c5b85$0$656$bed64819@news.gradwell.net>
Chris Smith wrote:

> Perhaps, if you wanted to be quite careful about your distinctions, you
> may want to choose different words for the two.  However, then you run
> into that same pesky "you can't choose other people's terminology" block
> again.

There may be more flexibility in this area than you'd think.  I get the
impression that (after only a few years' hard work) the terms "weak typing" and
"strong typing", used as synonyms for what we are here calling "dynamic typing"
and "static typing", are starting to die out.

So be of good hope !

I may yet be persuaded to talk of "static type checks" vs "dynamic checks" (with
a slight, veiled, sneer on the word "type" -- oh, so very limiting ;-)


> I'd be more careful in the analogy, there.  Since type theory (as
> applied to programming languages, anyway) is really just a set of
> methods of proving arbitrary statements about program behaviors,

Not "arbitrary" -- there is only a specific class of statements that any given
type system can address.  And, even if a given statement can be proved, then it
might not be something that most people would want to call a statement of type
(e.g. x > y).


It seems to me that most (all ?  by definition ??)  kinds of reasoning where we
want to invoke the word "type" tend to have a form where we reduce values (and
other things we want to reason about) to equivalence classes[*] w.r.t the
judgements we wish to make, and (usually) enrich that structure with
more-or-less stereotypical rules about which classes are substitutable for
which other ones. So that once we know what "type" something is, we can
short-circuit a load of reasoning based on the language semantics, by
translating into type-land where we already have a much simpler calculus to
guide us to the same answer.

(Or representative symbols, or...  The point is just that we throw away the
details which don't affect the judgements.)


> a
> corresponding "informal type theory" would just be the entire set of
> informal reasoning by programmers about their programs.

I think that, just as for static theorem proving, there is informal reasoning
that fits the "type" label, and informal reasoning that doesn't.

I would say that if informal reasoning (or explanations) takes a form
more-or-less like "X can [not] do Y because it is [not] a Z", then that's
taking advantage of exactly the short-cuts mentioned above, and I would want to
call that informal type-based reasoning.  But I certainly wouldn't call most of
the reasoning I do type-based.

    -- chris
From: Chris Smith
Subject: Re: Saying "latently-typed language" is making a category mistake
Date: 
Message-ID: <MPG.1f062ebf181df5bf9896f0@news.altopia.net>
Chris Uppal wrote:
> > I'd be more careful in the analogy, there.  Since type theory (as
> > applied to programming languages, anyway) is really just a set of
> > methods of proving arbitrary statements about program behaviors,
> 
> Not "arbitrary" -- there is only a specific class of statements that any given
> type system can address.  And, even if a given statement can be proved, then it
> might not be something that most people would want to call a statement of type
> (e.g. x > y).

[My understanding is that we're discussing static types and type theory 
here.  If you mean dynamic types, then what I'm about to say won't 
apply.]

I acknowledge that there are things that type systems are not going to 
be capable of proving.  These are not excluded a priori from the set of 
problems that would count as type errors if the language's type system 
caught them.  They are merely excluded for pragmatic reasons from the 
set of errors that someone would be likely to write a type system to 
check.  That is the case simply because there is a decided disadvantage 
in having a compiler that isn't guaranteed to ever finish compiling your 
program, or that doesn't finish in polynomial time on some 
representative quantity in nearly all cases, etc.

On the other hand, statements about program behavior that can be proved 
in reasonable time frames are universally valid subjects for type 
systems.  If they don't look like type errors now, they nevertheless 
will when the type system is completed to demonstrate them.  If someone 
were to write a language in which it could be statically proved that a 
pair of expressions has the property that when each is evaluated to 
their respective values x and y, (x > y) will be true, then proving this 
would be a valid subject of type theory.  We would then define that 
property as a type, whose values are all pairs that have the property 
(or, I suppose, a nominal type whose values are constrained to be pairs 
that have this property, though that would get painful very quickly), 
and then checking this property when that type is required -- such as 
when I am passing (x - y) as a parameter to a function that requires a 
positive integer -- would be called type checking.
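A rough sketch of that idea in Haskell (checked at run time here, since Haskell's type system cannot discharge the proof itself; a hypothetical richer system would do it statically): make passing the check the only way to build a value of the type.

```haskell
-- A poor man's refinement type: the only way to obtain a Positive is
-- to pass the (> 0) check, so any function taking a Positive may
-- assume the property holds.
newtype Positive = Positive Int deriving (Show, Eq)

positive :: Int -> Maybe Positive
positive n
  | n > 0     = Just (Positive n)
  | otherwise = Nothing
```

Here `positive (7 - 3)` yields `Just (Positive 4)`, while `positive (3 - 7)` yields `Nothing`: the check stands in for the proof obligation.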

In fact, we've come full circle.  The initial point on which I disagreed 
with Torben, while it was couched in a terminological dispute, was 
whether there existed a (strict) subset of programming errors that are 
defined to be type errors.  Torben defined the relationship between 
static and dynamic types in terms of when they solved "type errors", 
implying that this set is the same between the two.  A lot of the 
questions I'm asking elsewhere in this thread are chosen to help me 
understand whether such a set truly exists for the dynamic type world; 
but I believe I know enough that I can absolutely deny its existence 
within the realm of type theory.  I believe that even Pierce's phrase 
"for proving the absence of certain program behaviors" in his definition 
of a type system only excludes possible proofs that are not interesting 
for determining correctness, if in fact it limits the possible scope of 
type systems at all.

However, your definition of type errors may be relevant for 
understanding concepts of dynamic types.  I had understood you to say 
earlier that you thought something could validly qualify as a dynamic 
type system without regard for the type of problem that it verifies 
(that was denoted by "something" in a previous conversation).  I suppose 
I was misinterpreting you.

So your definition was:

> It seems to me that most (all ?  by definition ??)  kinds of reasoning where we
> want to invoke the word "type" tend to have a form where we reduce values (and
> other things we want to reason about) to equivalence classes[*] w.r.t the
> judgements we wish to make, and (usually) enrich that structure with
> more-or-less stereotypical rules about which classes are substitutable for
> which other ones. So that once we know what "type" something is, we can
> short-circuit a load of reasoning based on the language semantics, by
> translating into type-land where we already have a much simpler calculus to
> guide us to the same answer.
> 
> (Or representative symbols, or...  The point is just that we throw away the
> details which don't affect the judgements.)

I don't even know where I'd start in considering the forms of reasoning 
that are used in proving things.  Hmm.  I'll have to ponder this for a 
while.

> I think that, just as for static theorem proving, there is informal reasoning
> that fits the "type" label, and informal reasoning that doesn't.

So while I disagree in the static case, it seems likely that this is 
true of what is meant by dynamic types, at least by most people in this 
discussion.  I'd previously classified you as not agreeing.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Chris Smith
Subject: Re: Saying "latently-typed language" is making a category mistake
Date: 
Message-ID: <MPG.1f087618d92b2bef9896fd@news.altopia.net>
Chris Uppal wrote:
> It seems to me that most (all ?  by definition ??)  kinds of reasoning where we
> want to invoke the word "type" tend to have a form where we reduce values (and
> other things we want to reason about) to equivalence classes[*] w.r.t the
> judgements we wish to make, and (usually) enrich that structure with
> more-or-less stereotypical rules about which classes are substitutable for
> which other ones. So that once we know what "type" something is, we can
> short-circuit a load of reasoning based on the language semantics, by
> translating into type-land where we already have a much simpler calculus to
> guide us to the same answer.
> 
> (Or representative symbols, or...  The point is just that we throw away the
> details which don't affect the judgements.)

One question: is this more specific than simply saying that we use 
predicate logic?  A predicate divides a set into two subsets: those 
elements for which the predicate is true, and those for which it is 
false.  If we then apply axioms or theorems of the form P(x) -> Q(x), 
then we are reasoning from an "equivalence class", throwing away the 
details we don't care about (anything that's irrelevant to the 
predicate), and applying stereotypical rules (the theorem or axiom).  I 
don't quite understand what you mean by substituting classes.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Patricia Shanahan
Subject: Re: Saying "latently-typed language" is making a category mistake
Date: 
Message-ID: <86Qmg.10725$o4.1907@newsread2.news.pas.earthlink.net>
Vesa Karvonen wrote:
...
> An example of a form of informal reasoning that (practically) every
> programmer does daily is termination analysis.  There are type systems
> that guarantee termination, but I think that is fair to say that it is not
> yet understood how to make a practical general purpose language, whose
> type system would guarantee termination (or at least I'm not aware of such
> a language).  It should also be clear that termination analysis need not
> be done informally.  Given a program, it may be possible to formally prove
> that it terminates.

To make the halting problem decidable one would have to do one of two
things: Depend on memory size limits, or have a language that really is
less expressive, at a very deep level, than any of the languages
mentioned in the newsgroups header for this message.

A language for which the halting problem is decidable must also be a
language in which it is impossible to simulate an arbitrary Turing
machine (TM). Otherwise, one could decide the notoriously undecidable TM
halting problem by generating a language X program that simulates the TM
and deciding whether the language X program halts.

One way out might be to depend on the boundedness of physical memory. A
language with a fixed maximum memory size cannot simulate an arbitrary
TM. However, the number of states for a program is so great that a
method that depends on its finiteness, but would not work for an
infinite memory model, is unlikely to be practical.
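For what it's worth, the finite-state decider is easy to sketch (in Haskell): simulate and watch for a repeated state.  It terminates because the state space is finite, and the set of seen states is exactly why it is impractical at real memory sizes.

```haskell
import qualified Data.Set as Set

-- Decides halting for a deterministic system with finitely many
-- reachable states: Nothing means "halted"; revisiting a state means
-- the system loops forever.
halts :: Ord s => (s -> Maybe s) -> s -> Bool
halts step = go Set.empty
  where
    go seen s
      | s `Set.member` seen = False            -- cycle: never halts
      | otherwise           = case step s of
          Nothing -> True                      -- no successor: halted
          Just s' -> go (Set.insert s seen) s'
```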

Patricia
From: Chris Smith
Subject: Re: Saying "latently-typed language" is making a category mistake
Date: 
Message-ID: <MPG.1f05b47d3b0b14599896e8@news.altopia.net>
Patricia Shanahan <····@acm.org> wrote:
> Vesa Karvonen wrote:
> ...
> > An example of a form of informal reasoning that (practically) every
> > programmer does daily is termination analysis.  There are type systems
> > that guarantee termination, but I think that is fair to say that it is not
> > yet understood how to make a practical general purpose language, whose
> > type system would guarantee termination (or at least I'm not aware of such
> > a language).  It should also be clear that termination analysis need not
> > be done informally.  Given a program, it may be possible to formally prove
> > that it terminates.
> 
> To make the halting problem decidable one would have to do one of two
> things: Depend on memory size limits, or have a language that really is
> less expressive, at a very deep level, than any of the languages
> mentioned in the newsgroups header for this message.

Patricia, perhaps I'm misunderstanding you here.  To make the halting 
problem decidable is quite a different thing than to say "given a 
program, I may be able to prove that it terminates" (so long as you also 
acknowledge the implicit other possibility: I also may not be able to 
prove it in any finite bounded amount of time, even if it does 
terminate!)

The fact that the halting problem is undecidable is not a sufficient 
reason to reject the kind of analysis that is performed by programmers 
to convince themselves that their programs are guaranteed to terminate.  
Indeed, proofs that algorithms are guaranteed to terminate even in 
pathological cases are quite common.  Indeed, every assignment of a 
worst-case time bound to an algorithm is also a proof of termination.

This is, of course, a very good reason to reject the idea that the 
static type system for any Turing-complete language could ever perform 
this same kind of analysis.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Patricia Shanahan
Subject: Re: Saying "latently-typed language" is making a category mistake
Date: 
Message-ID: <rLYmg.191$NP4.106@newsread1.news.pas.earthlink.net>
Chris Smith wrote:
> Patricia Shanahan <····@acm.org> wrote:
>> Vesa Karvonen wrote:
>> ...
>>> An example of a form of informal reasoning that (practically) every
>>> programmer does daily is termination analysis.  There are type systems
>>> that guarantee termination, but I think that is fair to say that it is not
>>> yet understood how to make a practical general purpose language, whose
>>> type system would guarantee termination (or at least I'm not aware of such
>>> a language).  It should also be clear that termination analysis need not
>>> be done informally.  Given a program, it may be possible to formally prove
>>> that it terminates.
>> To make the halting problem decidable one would have to do one of two
>> things: Depend on memory size limits, or have a language that really is
>> less expressive, at a very deep level, than any of the languages
>> mentioned in the newsgroups header for this message.
> 
> Patricia, perhaps I'm misunderstanding you here.  To make the halting 
> problem decidable is quite a different thing than to say "given a 
> program, I may be able to prove that it terminates" (so long as you also 
> acknowledge the implicit other possibility: I also may not be able to 
> prove it in any finite bounded amount of time, even if it does 
> terminate!)
> 
> The fact that the halting problem is undecidable is not a sufficient 
> reason to reject the kind of analysis that is performed by programmers 
> to convince themselves that their programs are guaranteed to terminate.  
> Indeed, proofs that algorithms are guaranteed to terminate even in 
> pathological cases are quite common.  Indeed, every assignment of a 
> worst-case time bound to an algorithm is also a proof of termination.
> 
> This is, of course, a very good reason to reject the idea that the 
> static type system for any Turing-complete language could ever perform 
> this same kind of analysis.
> 

Of course, we can, and should, prefer programs we can prove terminate.
Languages can make it easier or harder to write provably terminating
programs.

Patricia
From: Pascal Costanza
Subject: Re: Saying "latently-typed language" is making a category mistake
Date: 
Message-ID: <4g251pF1knc6tU1@individual.net>
Patricia Shanahan wrote:
> Vesa Karvonen wrote:
> ...
>> An example of a form of informal reasoning that (practically) every
>> programmer does daily is termination analysis.  There are type systems
>> that guarantee termination, but I think that is fair to say that it is 
>> not
>> yet understood how to make a practical general purpose language, whose
>> type system would guarantee termination (or at least I'm not aware of 
>> such
>> a language).  It should also be clear that termination analysis need not
>> be done informally.  Given a program, it may be possible to formally 
>> prove
>> that it terminates.
> 
> To make the halting problem decidable one would have to do one of two
> things: Depend on memory size limits, or have a language that really is
> less expressive, at a very deep level, than any of the languages
> mentioned in the newsgroups header for this message.

Not quite. See http://en.wikipedia.org/wiki/ACL2


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: Matthias Blume
Subject: Re: Saying "latently-typed language" is making a category mistake
Date: 
Message-ID: <m2zmg46mem.fsf@hanabi.local>
Pascal Costanza <··@p-cos.net> writes:

> Patricia Shanahan wrote:
>> Vesa Karvonen wrote:
>> ...
>>> An example of a form of informal reasoning that (practically) every
>>> programmer does daily is termination analysis.  There are type systems
>>> that guarantee termination, but I think that is fair to say that it
>>> is not
>>> yet understood how to make a practical general purpose language, whose
>>> type system would guarantee termination (or at least I'm not aware
>>> of such
>>> a language).  It should also be clear that termination analysis need not
>>> be done informally.  Given a program, it may be possible to
>>> formally prove
>>> that it terminates.
>> To make the halting problem decidable one would have to do one of
>> two
>> things: Depend on memory size limits, or have a language that really is
>> less expressive, at a very deep level, than any of the languages
>> mentioned in the newsgroups header for this message.
>
> Not quite. See http://en.wikipedia.org/wiki/ACL2

What do you mean "not quite"?  Of course, Patricia is absolutely
right.  Termination-guaranteeing languages are fundamentally less
expressive than Turing-complete languages.  ACL2 was not mentioned in
the newsgroup header.
From: Pascal Costanza
Subject: Re: Saying "latently-typed language" is making a category mistake
Date: 
Message-ID: <4g28s6F1k9npnU1@individual.net>
Matthias Blume wrote:
> Pascal Costanza <··@p-cos.net> writes:
> 
>> Patricia Shanahan wrote:
>>> Vesa Karvonen wrote:
>>> ...
>>>> An example of a form of informal reasoning that (practically) every
>>>> programmer does daily is termination analysis.  There are type systems
>>>> that guarantee termination, but I think that is fair to say that it
>>>> is not
>>>> yet understood how to make a practical general purpose language, whose
>>>> type system would guarantee termination (or at least I'm not aware
>>>> of such
>>>> a language).  It should also be clear that termination analysis need not
>>>> be done informally.  Given a program, it may be possible to
>>>> formally prove
>>>> that it terminates.
>>> To make the halting problem decidable one would have to do one of
>>> two
>>> things: Depend on memory size limits, or have a language that really is
>>> less expressive, at a very deep level, than any of the languages
>>> mentioned in the newsgroups header for this message.
>> Not quite. See http://en.wikipedia.org/wiki/ACL2
> 
> What do you mean "not quite"?  Of course, Patricia is absolutely
> right.  Termination-guaranteeing languages are fundamentally less
> expressive than Turing-complete languages.  ACL2 was not mentioned in
> the newsgroup header.

ACL2 is a subset of Common Lisp, and programs written in ACL2 are 
executable in Common Lisp. comp.lang.lisp is not only about Common Lisp, 
but even if it were, ACL2 would fit.


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: Matthias Blume
Subject: Re: Saying "latently-typed language" is making a category mistake
Date: 
Message-ID: <m1hd2ckkez.fsf@hana.uchicago.edu>
Pascal Costanza <··@p-cos.net> writes:

> Matthias Blume wrote:
>> Pascal Costanza <··@p-cos.net> writes:
>> 
>>> Patricia Shanahan wrote:
>>>> Vesa Karvonen wrote:
>>>> ...
>>>>> An example of a form of informal reasoning that (practically) every
>>>>> programmer does daily is termination analysis.  There are type systems
>>>>> that guarantee termination, but I think that is fair to say that it
>>>>> is not
>>>>> yet understood how to make a practical general purpose language, whose
>>>>> type system would guarantee termination (or at least I'm not aware
>>>>> of such
>>>>> a language).  It should also be clear that termination analysis need not
>>>>> be done informally.  Given a program, it may be possible to
>>>>> formally prove
>>>>> that it terminates.
>>>> To make the halting problem decidable one would have to do one of
>>>> two
>>>> things: Depend on memory size limits, or have a language that really is
>>>> less expressive, at a very deep level, than any of the languages
>>>> mentioned in the newsgroups header for this message.
>>> Not quite. See http://en.wikipedia.org/wiki/ACL2
>> What do you mean "not quite"?  Of course, Patricia is absolutely
>> right.  Termination-guaranteeing languages are fundamentally less
>> expressive than Turing-complete languages.  ACL2 was not mentioned in
>> the newsgroup header.
>
> ACL2 is a subset of Common Lisp, and programs written in ACL2 are
> executable in Common Lisp. comp.lang.lisp is not only about Common
> Lisp, but even if it were, ACL2 would fit.

So what?

Patricia said "less expressive", you said "subset".  Where is the
contradiction?

Perhaps you could also bring up the subset of Common Lisp where the
only legal program has the form:

   nil

Obviously, this is a subset of Common Lisp and, therefore, fits
comp.lang.lisp.  Right?

Yes, if you restrict the language to make it less expressive, you can
guarantee termination.  Some restrictions that fit the bill already
exist.  IMHO, it is still wrong to say that "Lisp guarantees
termination" just because there is a subset of Lisp that does.

Matthias

PS: I'm not sure if this language guarantees termination, though.  :-)
From: Patricia Shanahan
Subject: Re: Saying "latently-typed language" is making a category mistake
Date: 
Message-ID: <IBSmg.10811$o4.4857@newsread2.news.pas.earthlink.net>
Pascal Costanza wrote:
> Matthias Blume wrote:
>> Pascal Costanza <··@p-cos.net> writes:
>>
>>> Patricia Shanahan wrote:
>>>> Vesa Karvonen wrote:
>>>> ...
>>>>> An example of a form of informal reasoning that (practically) every
>>>>> programmer does daily is termination analysis.  There are type systems
>>>>> that guarantee termination, but I think that is fair to say that it
>>>>> is not
>>>>> yet understood how to make a practical general purpose language, whose
>>>>> type system would guarantee termination (or at least I'm not aware
>>>>> of such
>>>>> a language).  It should also be clear that termination analysis 
>>>>> need not
>>>>> be done informally.  Given a program, it may be possible to
>>>>> formally prove
>>>>> that it terminates.
>>>> To make the halting problem decidable one would have to do one of
>>>> two
>>>> things: Depend on memory size limits, or have a language that really is
>>>> less expressive, at a very deep level, than any of the languages
>>>> mentioned in the newsgroups header for this message.
>>> Not quite. See http://en.wikipedia.org/wiki/ACL2
>>
>> What do you mean "not quite"?  Of course, Patricia is absolutely
>> right.  Termination-guaranteeing languages are fundamentally less
>> expressive than Turing-complete languages.  ACL2 was not mentioned in
>> the newsgroup header.
> 
> ACL2 is a subset of Common Lisp, and programs written in ACL2 are 
> executable in Common Lisp. comp.lang.lisp is not only about Common Lisp, 
> but even if it were, ACL2 would fit.

To prove Turing-completeness of ACL2 from Turing-completeness of Common
Lisp you would need to run the reduction the other way round, showing
that any Common Lisp program can be converted to, or emulated by, an
ACL2 program.

Patricia
From: Pascal Costanza
Subject: Re: Saying "latently-typed language" is making a category mistake
Date: 
Message-ID: <4g2e1bF1kmiteU1@individual.net>
Patricia Shanahan wrote:
> Pascal Costanza wrote:
>> Matthias Blume wrote:
>>> Pascal Costanza <··@p-cos.net> writes:
>>>
>>>> Patricia Shanahan wrote:
>>>>> Vesa Karvonen wrote:
>>>>> ...
>>>>>> An example of a form of informal reasoning that (practically) every
>>>>>> programmer does daily is termination analysis.  There are type 
>>>>>> systems
>>>>>> that guarantee termination, but I think that is fair to say that it
>>>>>> is not
>>>>>> yet understood how to make a practical general purpose language, 
>>>>>> whose
>>>>>> type system would guarantee termination (or at least I'm not aware
>>>>>> of such
>>>>>> a language).  It should also be clear that termination analysis 
>>>>>> need not
>>>>>> be done informally.  Given a program, it may be possible to
>>>>>> formally prove
>>>>>> that it terminates.
>>>>> To make the halting problem decidable one would have to do one of
>>>>> two
>>>>> things: Depend on memory size limits, or have a language that 
>>>>> really is
>>>>> less expressive, at a very deep level, than any of the languages
>>>>> mentioned in the newsgroups header for this message.
>>>> Not quite. See http://en.wikipedia.org/wiki/ACL2
>>>
>>> What do you mean "not quite"?  Of course, Patricia is absolutely
>>> right.  Termination-guaranteeing languages are fundamentally less
>>> expressive than Turing-complete languages.  ACL2 was not mentioned in
>>> the newsgroup header.
>>
>> ACL2 is a subset of Common Lisp, and programs written in ACL2 are 
>> executable in Common Lisp. comp.lang.lisp is not only about Common 
>> Lisp, but even if it were, ACL2 would fit.
> 
> To prove Turing-completeness of ACL2 from Turing-completeness of Common
> Lisp you would need to run the reduction the other way round, showing
> that any Common Lisp program can be converted to, or emulated by, an
> ACL2 program.

Sorry, obviously I was far from being clear. ACL2 is not 
Turing-complete. All iterations must be expressed in terms of 
well-founded recursion.
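
To illustrate the discipline (a hypothetical sketch in Python, not ACL2
syntax): every recursive definition must strictly decrease some measure
into a well-founded order, which is what makes the admission argument go
through.

```python
# Hypothetical sketch of well-founded recursion (illustrative only;
# ACL2's actual admission process works on a subset of Common Lisp,
# not Python). Every recursive call strictly decreases a measure --
# here the natural number n -- so termination is guaranteed.

def sum_to(n: int) -> int:
    """Sum of the integers 1..n."""
    assert n >= 0  # the measure stays in the well-founded naturals
    if n == 0:
        return 0
    return n + sum_to(n - 1)  # call with a strictly smaller measure

print(sum_to(10))  # -> 55
```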


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: Chris Uppal
Subject: Re: Saying "latently-typed language" is making a category mistake
Date: 
Message-ID: <449c1f4c$0$662$bed64819@news.gradwell.net>
Pascal Costanza wrote:

> Sorry, obviously I was far from being clear. ACL2 is not
> Turing-complete. All iterations must be expressed in terms of
> well-founded recursion.

How expressive does that end up being for real problems?  I mean, obviously in
some sense it's crippling, but how much of a restriction would that be for
non-academic programming?  Could I write an accountancy package in it, or an
adventure game, or a webserver, with not too much more difficulty than in a
similar language without that one aspect to its type system?

Hmm, come to think of it, those all have endless loops at the outer level, with
the body of the loop being an event handler, so maybe only the handler should
be required to be guaranteed to terminate.

    -- chris
From: Marshall
Subject: Re: Saying "latently-typed language" is making a category mistake
Date: 
Message-ID: <1151099384.490397.22510@i40g2000cwc.googlegroups.com>
Chris Uppal wrote:
> Pascal Costanza wrote:
>
> > Sorry, obviously I was far from being clear. ACL2 is not
> > Turing-complete. All iterations must be expressed in terms of
> > well-founded recursion.
>
> How expressive does that end up being for real problems ?   I mean obviously in
> some sense it's crippling, but how much of a restiction would that be for
> non-accademic programming.  Could I write an accountancy package in it, or an
> adventure games, or a webserver, with not too much more difficulty than in a
> similar language without that one aspect to its type system ?
>
> Hmm, come to think of it those all hae endless loops at the outer level, with
> the body of the loop being an event handler, so maybe only the handler should
> be required to guaranteed to terminate.

An example of a non-Turing-complete language (at least the core
language; the standard is getting huge) with guaranteed termination,
that is used everywhere and quite useful, is SQL.

I wouldn't want to write an accounting package in it; its abstraction
facilities are too weak. But the language is much more powerful
than it is usually given credit for.


Marshall
From: Pascal Costanza
Subject: Re: Saying "latently-typed language" is making a category mistake
Date: 
Message-ID: <4g30ugF1l571tU1@individual.net>
Chris Uppal wrote:
> Pascal Costanza wrote:
> 
>> Sorry, obviously I was far from being clear. ACL2 is not
>> Turing-complete. All iterations must be expressed in terms of
>> well-founded recursion.
> 
> How expressive does that end up being for real problems ?   I mean obviously in
> some sense it's crippling, but how much of a restiction would that be for
> non-accademic programming. 

I don't know your definition of "real problem" ;), but ACL2 is 
definitely used in industrial settings.

See 
http://www.cs.utexas.edu/users/moore/publications/how-to-prove-thms/intro-to-acl2.pdf


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: David Hopwood
Subject: Re: Saying "latently-typed language" is making a category mistake
Date: 
Message-ID: <0neng.213767$8W1.1948@fe1.news.blueyonder.co.uk>
Patricia Shanahan wrote:
> Vesa Karvonen wrote:
> ...
> 
>> An example of a form of informal reasoning that (practically) every
>> programmer does daily is termination analysis.  There are type systems
>> that guarantee termination, but I think that is fair to say that it is
>> not yet understood how to make a practical general purpose language, whose
>> type system would guarantee termination (or at least I'm not aware of
>> such a language).  It should also be clear that termination analysis need
>> not be done informally.  Given a program, it may be possible to formally
>> prove that it terminates.
> 
> To make the halting problem decidable one would have to do one of two
> things: Depend on memory size limits, or have a language that really is
> less expressive, at a very deep level, than any of the languages
> mentioned in the newsgroups header for this message.

I don't think Vesa was talking about trying to solve the halting problem.

A type system that required termination would indeed significantly restrict
language expressiveness -- mainly because many interactive processes are
*intended* not to terminate.

A type system that required an annotation on all subprograms that do not
provably terminate, OTOH, would not impact expressiveness at all, and would
be very useful. Such a type system could work by treating some dependent
type parameters as variants which must strictly decrease in a recursive
call or loop. For example, consider a recursive quicksort implementation.
The type of the 'sort' routine would take an array of length
(dependent type parameter) n. Since it only performs recursive calls to
itself with parameter strictly less than n, it is not difficult to prove
automatically that the quicksort terminates. The programmer would probably
just have to give hints in some cases as to which parameters are to be
treated as variants; the rest can be inferred.
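
A sketch of that argument, in Python rather than a dependently-typed
language (the list length stands in for the dependent parameter n):

```python
# Sketch of the termination argument above: both recursive calls
# receive strictly fewer elements than the input (the pivot is
# excluded from both partitions), so the length is a strictly
# decreasing variant a checker could verify.

def quicksort(xs: list) -> list:
    if len(xs) <= 1:
        return xs
    pivot, rest = xs[0], xs[1:]
    smaller = [x for x in rest if x < pivot]   # len(smaller) < len(xs)
    larger = [x for x in rest if x >= pivot]   # len(larger) < len(xs)
    return quicksort(smaller) + [pivot] + quicksort(larger)

print(quicksort([3, 1, 4, 1, 5]))  # -> [1, 1, 3, 4, 5]
```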

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Marshall
Subject: Re: Saying "latently-typed language" is making a category mistake
Date: 
Message-ID: <1151171209.684163.143410@c74g2000cwc.googlegroups.com>
David Hopwood wrote:
>
> A type system that required an annotation on all subprograms that do not
> provably terminate, OTOH, would not impact expressiveness at all, and would
> be very useful.

Interesting. I have always imagined doing this by allowing an
annotation on all subprograms that *do* provably terminate. If
you go the other way, you have to annotate every function that
uses general recursion (or iteration if you swing that way) and that
seems like it might be burdensome. Further, it imposes the
annotation requirement even where the programmer might not
care about it, which the reverse does not do.


Marshall
From: Gabriel Dos Reis
Subject: Re: Saying "latently-typed language" is making a category mistake
Date: 
Message-ID: <m3zmg2tkdy.fsf@uniton.integrable-solutions.net>
"Marshall" <···············@gmail.com> writes:

| David Hopwood wrote:
| >
| > A type system that required an annotation on all subprograms that do not
| > provably terminate, OTOH, would not impact expressiveness at all, and would
| > be very useful.
| 
| Interesting. I have always imagined doing this by allowing an
| annotation on all subprograms that *do* provably terminate. If
| you go the other way, you have to annotate every function that
| uses general recursion (or iteration if you swing that way) and that
| seems like it might be burdensome. Further, it imposes the
| annotation requirement even where the programer might not
| care about it, which the reverse does not do.

simple things should stay simple.  Recursions that provably terminate
are among the simplest ones.  Annotations in those cases could be
allowed, but not required.  Otherwise the system might become very
irritating to program with.

-- Gaby
From: Marshall
Subject: Re: Saying "latently-typed language" is making a category mistake
Date: 
Message-ID: <1151175272.674919.141210@y41g2000cwy.googlegroups.com>
Gabriel Dos Reis wrote:
> "Marshall" <···············@gmail.com> writes:
>
> | David Hopwood wrote:
> | >
> | > A type system that required an annotation on all subprograms that do not
> | > provably terminate, OTOH, would not impact expressiveness at all, and would
> | > be very useful.
> |
> | Interesting. I have always imagined doing this by allowing an
> | annotation on all subprograms that *do* provably terminate. If
> | you go the other way, you have to annotate every function that
> | uses general recursion (or iteration if you swing that way) and that
> | seems like it might be burdensome. Further, it imposes the
> | annotation requirement even where the programer might not
> | care about it, which the reverse does not do.
>
> simple things should stay simple.  Recursions that provably terminate
> are among the simplest ones.  Annotations in those cases could be
> allowed, but not required.  Otherwise the system might become very
> irritating to program with.

Yes, exactly my point.


Marshall
From: David Hopwood
Subject: Termination and type systems
Date: 
Message-ID: <4ving.213803$8W1.62401@fe1.news.blueyonder.co.uk>
Marshall wrote:
> David Hopwood wrote:
> 
>>A type system that required an annotation on all subprograms that do not
>>provably terminate, OTOH, would not impact expressiveness at all, and would
>>be very useful.
> 
> Interesting. I have always imagined doing this by allowing an
> annotation on all subprograms that *do* provably terminate. If
> you go the other way, you have to annotate every function that
> uses general recursion (or iteration if you swing that way) and that
> seems like it might be burdensome.

Not at all. Almost all subprograms provably terminate (with a fairly
easy proof), even if they use general recursion or iteration.

If it were not the case that almost all functions provably terminate,
then the whole idea would be hopeless. If a subprogram F calls G, then
in order to prove that F terminates, we probably have to prove that G
terminates. Consider a program where only half of all subprograms are
annotated as provably terminating. In that case, we would be faced with
very many cases where the proof cannot be discharged, because an
annotated subprogram calls an unannotated one.

If, on the other hand, more than, say, 95% of subprograms provably
terminate, then it is much more likely that we (or the inference
mechanism) can easily discharge any particular proof involving more
than one subprogram. So provably terminating should be the default,
and other cases should be annotated.

In some languages, annotations may never or almost never be needed,
because they are implied by other characteristics. For example, the
concurrency model used in the language E (www.erights.org) is such
that there are implicit top-level event loops which do not terminate
as long as the associated "vat" exists, but handling a message is
always required to terminate.
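
A hypothetical sketch of that discipline (Python, not E syntax): the
top-level loop is the only construct annotated as non-terminating, while
each message handler is an ordinary function whose termination can be
checked in isolation.

```python
# Hypothetical sketch of the event-loop discipline described above
# (not E syntax): only the top-level loop is non-terminating; each
# handler ("turn") is a plain function with no unbounded loops, so a
# checker could prove it terminating on its own.

import queue

def handle(message: str) -> str:
    """One 'turn': must terminate, and trivially does."""
    return message.upper()

def run_vat(inbox: "queue.Queue", results: list) -> None:
    while True:               # the one annotated non-terminating loop
        message = inbox.get()
        if message is None:   # shutdown sentinel, only for this demo
            return
        results.append(handle(message))

inbox, results = queue.Queue(), []
for msg in ("hello", "world", None):
    inbox.put(msg)
run_vat(inbox, results)
print(results)  # -> ['HELLO', 'WORLD']
```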

> Further, it imposes the
> annotation requirement even where the programer might not
> care about it, which the reverse does not do.

If the annotation marks not-provably-terminating subprograms, then it
calls attention to those subprograms. This is what we want, since it is
less safe/correct to use a nonterminating subprogram than a terminating
one (in some contexts).

There could perhaps be a case for distinguishing annotations for
"intended not to terminate", and "intended to terminate, but we
couldn't prove it".

I do not know how well such a type system would work in practice; it may
be that typical programs would involve too many non-trivial proofs. This
is something that would have to be tried out in a research language.

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Marshall
Subject: Re: Termination and type systems
Date: 
Message-ID: <1151265765.438798.203480@r2g2000cwb.googlegroups.com>
David Hopwood wrote:
> Marshall wrote:
> > David Hopwood wrote:
> >
> >>A type system that required an annotation on all subprograms that do not
> >>provably terminate, OTOH, would not impact expressiveness at all, and would
> >>be very useful.
> >
> > Interesting. I have always imagined doing this by allowing an
> > annotation on all subprograms that *do* provably terminate. If
> > you go the other way, you have to annotate every function that
> > uses general recursion (or iteration if you swing that way) and that
> > seems like it might be burdensome.
>
> Not at all. Almost all subprograms provably terminate (with a fairly
> easy proof), even if they use general recursion or iteration.

Well, um, hmmm. If a subprogram uses recursion, and it is not
structural recursion, then I don't know how to go about proving
it terminates. Are there proof-of-termination techniques I don't
know about?

If we can specify a total order on the domain or some
subcomponents of the domain, and show that recursive
calls are always invoked with arguments less than (by
the order) those of the parent call, (and the domain is
well founded) then we have a termination proof.

(If we don't use recursion at all, and we only invoke
proven-terminating functions, same thing; actually
this is a degenerate case of the above.)

For iteration? Okay, a bounded for-loop is probably easy,
but a while loop? What about a function that calculates
the next prime number larger than its argument? Do we
have to embed a proof of an infinity of primes somehow?
That seems burdensome.
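
One possible answer, sketched under the assumption that the checker
accepts Bertrand's postulate (for n > 1 there is always a prime strictly
between n and 2n) as a lemma: the open-ended while loop becomes a
bounded for loop, and no proof of the infinitude of primes is needed.

```python
# Sketch: turning an "unbounded" prime search into a provably bounded
# loop. Bertrand's postulate guarantees a prime in (n, 2n) for n > 1,
# so the search space is finite; the checker needs only that lemma,
# not a proof that infinitely many primes exist.

def is_prime(k: int) -> bool:
    if k < 2:
        return False
    d = 2
    while d * d <= k:  # bounded: d climbs toward sqrt(k)
        if k % d == 0:
            return False
        d += 1
    return True

def next_prime(n: int) -> int:
    """Smallest prime strictly greater than n, for n >= 1."""
    for candidate in range(n + 1, 2 * n + 1):  # finite by Bertrand
        if is_prime(candidate):
            return candidate
    raise AssertionError("unreachable for n >= 1, by Bertrand's postulate")

print(next_prime(10))  # -> 11
```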


> If it were not the case that almost all functions provably terminate,
> then the whole idea would be hopeless.

I agree!


> If a subprogram F calls G, then
> in order to prove that F terminates, we probably have to prove that G
> terminates. Consider a program where only half of all subprograms are
> annotated as provably terminating. In that case, we would be faced with
> very many cases where the proof cannot be discharged, because an
> annotated subprogram calls an unannotated one.

Right, and you'd have to be applying the non-terminating annotation
all over the place.



> If, on the other hand, more than, say, 95% of subprograms provably
> terminate, then it is much more likely that we (or the inference
> mechanism) can easily discharge any particular proof involving more
> than one subprogram. So provably terminating should be the default,
> and other cases should be annotated.

Well, I can still imagine that the programmer doesn't care to have
non-termination examined for every part of his code. In which case,
he would still be required to add annotations even when he doesn't
care about a particular subprogram's lack of a termination proof.

The pay-for-it-only-if-you-want-it approach has some benefits.
On the other hand, if it really is as common and easy as you
propose, then annotating only when no proof is available is
perhaps feasible.

I'm still a bit sceptical, though.


> I do not know how well such a type system would work in practice; it may
> be that typical programs would involve too many non-trivial proofs. This
> is something that would have to be tried out in a research language.

Yeah.


Marshall
From: Dirk Thierbach
Subject: Re: Termination and type systems
Date: 
Message-ID: <20060626111326.18B1.0.NOFFLE@dthierbach.news.arcor.de>
David Hopwood <····················@blueyonder.co.uk> wrote:
> Marshall wrote:
>> David Hopwood wrote:

>>> A type system that required an annotation on all subprograms that
>>> do not provably terminate, OTOH, would not impact expressiveness
>>> at all, and would be very useful.

>> Interesting. I have always imagined doing this by allowing an
>> annotation on all subprograms that *do* provably terminate. 

Maybe the paper "Linear types and non-size-increasing polynomial time
computation" by Martin Hofmann is interesting in this respect.
From the abstract:

  We propose a linear type system with recursion operators for inductive
  datatypes which ensures that all definable functions are polynomial
  time computable. The system improves upon previous such systems in
  that recursive definitions can be arbitrarily nested; in particular,
  no predicativity or modality restrictions are made.

It does not only ensure termination, but termination in polynomial time,
so you can use those types to carry information about this as well.

> If the annotation marks not-provably-terminating subprograms, then it
> calls attention to those subprograms. This is what we want, since it is
> less safe/correct to use a nonterminating subprogram than a terminating
> one (in some contexts).

That would certainly be nice, but it may be almost impossible to do in
practice. It's already hard enough to guarantee termination with the
extra information present in the type annotation. If this information
is not present, then the language probably has to be restricted so
severely to ensure termination that it is more or less useless.

- Dirk
From: David Hopwood
Subject: Re: Termination and type systems
Date: 
Message-ID: <x7Zng.482167$xt.178802@fe3.news.blueyonder.co.uk>
Dirk Thierbach wrote:
> David Hopwood <····················@blueyonder.co.uk> wrote:
>>Marshall wrote:
>>>David Hopwood wrote:
> 
>>>>A type system that required an annotation on all subprograms that
>>>>do not provably terminate, OTOH, would not impact expressiveness
>>>>at all, and would be very useful.
> 
>>>Interesting. I have always imagined doing this by allowing an
>>>annotation on all subprograms that *do* provably terminate. 
> 
> Maybe the paper "Linear types and non-size-increasing polynomial time
> computation" by Martin Hofmann is interesting in this respect.

<http://citeseer.ist.psu.edu/hofmann98linear.html>

> From the abstract:
> 
>   We propose a linear type system with recursion operators for inductive
>   datatypes which ensures that all definable functions are polynomial
>   time computable. The system improves upon previous such systems in
>   that recursive definitions can be arbitrarily nested; in particular,
>   no predicativity or modality restrictions are made.
> 
> It does not only ensure termination, but termination in polynomial time,
> so you can use those types to carry information about this as well.

That's interesting, but linear typing imposes some quite severe
restrictions on programming style. From the example of 'h' on page 2,
it's clear that the reason for the linearity restriction is just to
ensure polynomial-time termination, and is not needed if we only want
to prove termination.

>>If the annotation marks not-provably-terminating subprograms, then it
>>calls attention to those subprograms. This is what we want, since it is
>>less safe/correct to use a nonterminating subprogram than a terminating
>>one (in some contexts).
> 
> That would be certainly nice, but it may be almost impossible to do in
> practice. It's already hard enough to guarantee termination with the
> extra information present in the type annotation. If this information
> is not present, then the language has to be probably restricted so
> severely to ensure termination that it is more or less useless.

I think you're overestimating the difficulty of the problem. The fact
that *in general* it can be arbitrarily hard to prove termination, can
obscure the fact that for large parts of a typical program, proving
termination is usually trivial.

I just did a manual termination analysis for an 11,000-line multithreaded
C program; it's part of a control system for some printer hardware. This
program has:

 - 212 functions that trivially terminate. By this I mean that the function
   only contains loops that have easily recognized integer loop variants,
   and only calls other functions that are known to trivially terminate,
   without use of recursion. A compiler could easily prove these functions
   terminating without need for any annotations.

 - 8 functions that implement non-terminating event loops.

 - 23 functions that intentionally block. All of these should terminate
   (because of timeouts, or because another thread should do something that
   releases the block), but this cannot be easily proven with the kind of
   analysis we're considering here.

 - 3 functions that read from a stdio stream into a fixed-length buffer. This
   is guaranteed to terminate when reading a normal file, but not when reading
   from a pipe. In fact the code never reads from a pipe, but that is not
   locally provable.

 - 2 functions that iterate over a linked list of network adaptors returned
   by the Win32 GetAdaptorsInfo API. (This structure could be recognized as
   finite if we were calling a standard library abstraction over the OS API,
   but we are relying on Windows not to return a cyclic list.)

 - 1 function that iterates over files in a directory. (Ditto.)

 - 2 functions that iterate over a queue, implemented as a linked list. The queue
   is finite, but this is not locally provable. Possibly an artifact of the lack
   of abstraction from using C, where the loop is not easily recognizable as an
   iteration over a finite data structure.

 - 1 function with a loop that terminates for a non-trivial reason that involves
   some integer arithmetic, and is probably not expressible in a type system.
   The code already had a comment informally justifying why the loop terminates,
   and noting a constraint needed to avoid an arithmetic overflow.
   (Bignum arithmetic would avoid the need for that constraint.)

 - 2 functions implementing a simple parsing loop, which terminates because the
   position of the current token is guaranteed to advance -- but this probably
   would not be provable given how the loop is currently written. A straightforward
   change (to make the 'next token' function return the number of characters to
   advance, asserted to be > 0, instead of a pointer to the next token) would make
   the code slightly clearer, and trivially terminating.


So for this program, out of 254 functions:

 - 83.5% are trivially terminating already.

 - 12.2% would need annotations saying that they are intended to block or not
   to terminate. This is largely an artifact -- we could remove the need for
   *all* of these annotations by adopting an event-loop concurrency model, and
   only attempting to prove termination of functions within a "turn" (one
   iteration of each top-level loop).

 - 2.4% would need annotations saying that termination is too hard to prove --
   but only because they interact with the operating system.
   This is obviously not a type system problem; fixing it requires careful
   design of the language's standard library.

 - 0.8% would need annotations only in C; in a language with standard libraries
   that provide data structures that are guaranteed to be finite, they probably
   would not.

 - 1.2% would need changes to the code, or are genuinely too difficult to prove
   to be terminating using a type system.


This is obviously not a large program (although I wouldn't want to do a manual
analysis like this for a much larger one), and it may be atypical in that it has
almost no use of recursion (or maybe that *is* typical for C programs). It would
be interesting to do the same kind of analysis for a functional program, or one
that does a more "algorithmically intensive" task.
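
The parsing-loop change suggested above can be sketched like this
(a Python stand-in for the C code; `scan_token` is a hypothetical
tokenizer, not the program's real one): the tokenizer returns a
character count asserted to be positive, so the remaining input length
is an obviously decreasing loop variant.

```python
# Sketch of the suggested restructuring: the tokenizer returns how many
# characters to advance (asserted > 0), so the variant len(text) - pos
# strictly decreases on every iteration and termination is trivial.

def scan_token(text: str, pos: int) -> tuple:
    """Return (token, advance) with advance > 0 whenever pos < len(text)."""
    end = pos
    while end < len(text) and not text[end].isspace():
        end += 1  # consume the token itself
    while end < len(text) and text[end].isspace():
        end += 1  # consume trailing whitespace
    return text[pos:end].strip(), end - pos

def tokenize(text: str) -> list:
    tokens, pos = [], 0
    while pos < len(text):      # variant: len(text) - pos
        token, advance = scan_token(text, pos)
        assert advance > 0      # guarantees the variant decreases
        if token:
            tokens.append(token)
        pos += advance
    return tokens

print(tokenize("let x = 1"))  # -> ['let', 'x', '=', '1']
```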

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Dirk Thierbach
Subject: Re: Termination and type systems
Date: 
Message-ID: <20060627065532.5BA.0.NOFFLE@dthierbach.news.arcor.de>
David Hopwood <····················@blueyonder.co.uk> wrote:
> Dirk Thierbach wrote:

> That's interesting, but linear typing imposes some quite severe
> restrictions on programming style. From the example of 'h' on page 2,
> it's clear that the reason for the linearity restriction is just to
> ensure polynomial-time termination, and is not needed if we only want
> to prove termination.

Yes. It's just an example of what can actually be done with a type system.

>> It's already hard enough to guarantee termination with the extra
>> information present in the type annotation. If this information is
>> not present, then the language has to be probably restricted so
>> severely to ensure termination that it is more or less useless.

> I think you're overestimating the difficulty of the problem. The fact
> that *in general* it can be arbitrarily hard to prove termination, can
> obscure the fact that for large parts of a typical program, proving
> termination is usually trivial.

Yes. The problem is the small parts where it's not trivial (and the
size of this part depends on the actual problem the program is trying
to solve). Either you make the language so restrictive that these
parts are hard (see your remark above) or impossible to write, or
you'll be left with a language where the compiler cannot prove
termination of those small parts, so it's not a "type system" in the
usual sense.

Of course one could write an optional tool that automatically proves
termination of the easy parts, and leaves the rest to the programmer,
but again, that's a different thing.

- Dirk
From: Matthias Blume
Subject: Re: Saying "latently-typed language" is making a category mistake
Date: 
Message-ID: <m2veqq6j3j.fsf@hanabi.local>
David Hopwood <····················@blueyonder.co.uk> writes:

> Patricia Shanahan wrote:
>> Vesa Karvonen wrote:
>> ...
>> 
>>> An example of a form of informal reasoning that (practically) every
>>> programmer does daily is termination analysis.  There are type systems
>>> that guarantee termination, but I think that is fair to say that it is
>>> not yet understood how to make a practical general purpose language, whose
>>> type system would guarantee termination (or at least I'm not aware of
>>> such a language).  It should also be clear that termination analysis need
>>> not be done informally.  Given a program, it may be possible to formally
>>> prove that it terminates.
>> 
>> To make the halting problem decidable one would have to do one of two
>> things: Depend on memory size limits, or have a language that really is
>> less expressive, at a very deep level, than any of the languages
>> mentioned in the newsgroups header for this message.
>
> I don't think Vesa was talking about trying to solve the halting problem.
>
> A type system that required termination would indeed significantly restrict
> language expressiveness -- mainly because many interactive processes are
> *intended* not to terminate.

Most interactive processes are written in such a way that they
(effectively) consist of an infinitely repeated application of some
function f that maps the current state and the input to the new state
and the output.

   f : state * input -> state * output

This function f itself has to terminate, i.e., it has to be
guaranteed that after any given input, there will eventually be an
output.  In most interactive systems the requirements are in fact much
stricter: the output should come "soon" after the input has been
received.

I am pretty confident that the f for most (if not all) existing
interactive systems could be coded in a language that enforces
termination.  Only the loop that repeatedly applies f would have to be
coded in a less restrictive language.
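The shape Matthias describes can be sketched as follows (Python used here purely as an illustration; the function and variable names are hypothetical, not from any actual system):

```python
# Sketch of an interactive process as repeated application of a
# terminating step function f : state * input -> state * output.

def f(state, inp):
    """One 'turn': trivially terminating -- no unbounded loops inside."""
    new_state = state + [inp]
    output = f"echo: {inp} (seen {len(new_state)} inputs)"
    return new_state, output

def run(inputs):
    """The outer driver loop -- the only part that need not terminate.
    Here it is bounded by a finite input list so it can be demonstrated;
    a real system would loop forever, reading input each iteration."""
    state, outputs = [], []
    for inp in inputs:
        state, out = f(state, inp)
        outputs.append(out)
    return outputs
```

Only `run` would need to live in the less restrictive language; `f` could be checked for termination.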
From: David Hopwood
Subject: Re: Saying "latently-typed language" is making a category mistake
Date: 
Message-ID: <lcyng.477585$tc.94439@fe2.news.blueyonder.co.uk>
Matthias Blume wrote:
> David Hopwood <····················@blueyonder.co.uk> writes:
>>Patricia Shanahan wrote:
>>>Vesa Karvonen wrote:
>>>...
>>>
>>>>An example of a form of informal reasoning that (practically) every
>>>>programmer does daily is termination analysis.  [...]
>>>>Given a program, it may be possible to formally
>>>>prove that it terminates.
>>>
>>>To make the halting problem decidable one would have to do one of two
>>>things: Depend on memory size limits, or have a language that really is
>>>less expressive, at a very deep level, than any of the languages
>>>mentioned in the newsgroups header for this message.
>>
>>I don't think Vesa was talking about trying to solve the halting problem.
>>
>>A type system that required termination would indeed significantly restrict
>>language expressiveness -- mainly because many interactive processes are
>>*intended* not to terminate.
> 
> Most interactive processes are written in such a way that they
> (effectively) consist of an infinitely repeated application of some
> function f that maps the current state and the input to the new state
> and the output.
> 
>    f : state * input -> state * output
> 
> This function f itself has to terminate, i.e., if t has to be
> guaranteed that after any given input, there will eventually be an
> output.  In most interactive systems the requirements are in fact much
> stricter: the output should come "soon" after the input has been
> received.
> 
> I am pretty confident that the f for most (if not all) existing
> interactive systems could be coded in a language that enforces
> termination.  Only the loop that repeatedly applies f would have to be
> coded in a less restrictive language.

This is absolutely consistent with what I said. While f could be coded in
a system that *required* termination, the outer loop could not.

As I mentioned in a follow-up, event loop languages such as E enforce this
program structure, which would almost or entirely eliminate the need for
annotations in a type system that proves termination of some subprograms.

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Anton van Straaten
Subject: Re: Saying "latently-typed language" is making a category mistake
Date: 
Message-ID: <oAQmg.49060$fb2.24820@newssvr27.news.prodigy.net>
Vesa Karvonen wrote:
> I think that we're finally getting to the bottom of things.  While reading
> your reponses something became very clear to me: latent-typing and latent-
> types are not a property of languages.  Latent-typing, also known as
> informal reasoning, is something that all programmers do as a normal part
> of programming.  To say that a language is latently-typed is to make a
> category mistake, because latent-typing is not a property of languages.
> 
> A programmer, working in any language, whether typed or not, performs
> informal reasoning.  I think that is fair to say that there is a
> correspondence between type theory and such informal reasoning.  The
> correspondence is like the correspondence between informal and formal
> math.  *But* , informal reasoning (latent-typing) is not a property of
> languages.

Well, it's obviously the case that latent types as I've described them 
are not part of the usual semantics of dynamically typed languages.  In 
other messages, I've mentioned types like "number -> number" which have 
no meaning in a dynamically typed language.  You can only write them in 
comments (unless you implement some kind of type handling system), and 
language implementations aren't aware of such types.

OTOH, a programmer reasoning explicitly about such types, writing them 
in comments, and perhaps using assertions to check them has, in a sense, 
defined a language.  Having done that, and reasoned about the types 
in his program, he manually erases them, "leaving" code written in the 
original dynamically-typed language.  You can think of it as though it 
were generated code, complete with comments describing types, injected 
during the erasure process.

So, to address your category error objection, I would say that latent 
typing is a property of latently-typed languages, which are typically 
informally-defined supersets of what we know as dynamically-typed languages.

I bet that doesn't make you happy, though.  :D

Still, if that sounds a bit far-fetched, let me start again at ground 
level, with a Haskell vs. Scheme example:

   let double x = x * 2

vs.:

   (define (double x) (* x 2))

Programmers in both languages do informal reasoning to figure out the 
type of 'double'.  I'm assuming that the average Haskell programmer 
doesn't write out a proof whenever he wants to know the type of a term, 
and doesn't have a compiler handy.

But the Haskell programmer's informal reasoning takes place in the 
context of a well-defined formal type system.  He knows what the "type 
of double" means: the language defines that for him.

The type-aware Scheme programmer doesn't have that luxury: before he can 
talk about types, he has to invent a type system, something to give 
meaning to an expression such as "number -> number".  Performing that 
invention gives him types -- albeit informal types, a.k.a. latent types.

In the Haskell case, the types are a property of the language.  If 
you're willing to acknowledge the existence of something like latent 
types, what are they a property of?  Just the amorphous informal cloud 
which surrounds dynamically-typed languages?  Is that a satisfactory 
explanation of these two quite similar examples?

I want to mention two other senses in which latent types become 
connected to real languages.  That doesn't make them properties of the 
formal semantics of the language, but the connection is a real one at a 
different level.

The first is that in a language without a rich formal type system, 
informal reasoning outside of the formal type system becomes much more 
important.  Conversely, in Haskell, even if you accept the existence of 
latent types, they're close enough to static types that it's hardly 
necessary to consider them.  This is why latent types are primarily 
associated with languages without rich formal type systems.

The second connection is via tags: these are part of the definition of a 
dynamically-typed language, and if the programmer is reasoning 
explicitly about latent types, tags are a useful tool to help ensure 
that assumptions about types aren't violated.  So this is a connection 
between a feature in the semantics of the language, and these 
extra-linguistic latent types.

> An example of a form of informal reasoning that (practically) every
> programmer does daily is termination analysis.  There are type systems
> that guarantee termination, but I think that is fair to say that it is not
> yet understood how to make a practical general purpose language, whose
> type system would guarantee termination (or at least I'm not aware of such
> a language).  It should also be clear that termination analysis need not
> be done informally.  Given a program, it may be possible to formally prove
> that it terminates.

Right.  And this is partly why talking about latent types, as opposed to 
the more general "informal reasoning", makes sense: because latent types 
are addressing the same kinds of things that static types can capture. 
Type-like things.

> I'm now more convinced than ever that "(latently|dynamically)-typed
> language" is an oxymoron.  The terminology really needs to be fixed.

I agree that fixing is needed.  The challenge is to do it in a way that 
accounts for, rather than simply ignores, the many informal correlations 
to formal type concepts that exist in dynamically-typed languages. 
Otherwise, the situation won't improve.

Anton
From: Jürgen Exner
Subject: Re: Saying "latently-typed language" is making a category mistake
Date: 
Message-ID: <_kSmg.983$Wh.149@trnddc04>
Vesa Karvonen wrote:
> In comp.lang.functional Anton van Straaten <·····@appsolutions.com>
> wrote: [...]
>> I reject this comparison.  There's much more to it than that.  The
>
> I think that we're finally getting to the bottom of things.  While
> reading your reponses something became very clear to me:
> latent-typing and latent- types are not a property of languages.
> Latent-typing, also known as informal reasoning, is something that

And what does this have to do with Perl? Are you an alter ego of Xah Lee who 
spills his 'wisdom' across any newsgroup he can find?

jue 
From: Pascal Costanza
Subject: Re: Saying "latently-typed language" is making a category mistake
Date: 
Message-ID: <4g24rlF1l16viU1@individual.net>
Vesa Karvonen wrote:
> In comp.lang.functional Anton van Straaten <·····@appsolutions.com> wrote:
> [...]
>> I reject this comparison.  There's much more to it than that.  The point 
>> is that the reasoning which programmers perform when working with an 
>> program in a latently-typed language bears many close similiarities to 
>> the purpose and behavior of type systems.
> 
>> This isn't an attempt to jump on any bandwagons, it's an attempt to 
>> characterize what is actually happening in real programs and with real 
>> programmers.  I'm relating that activity to type systems because that is 
>> what it most closely relates to.
> [...]
> 
> I think that we're finally getting to the bottom of things.  While reading
> your reponses something became very clear to me: latent-typing and latent-
> types are not a property of languages.  Latent-typing, also known as
> informal reasoning, is something that all programmers do as a normal part
> of programming.  To say that a language is latently-typed is to make a
> category mistake, because latent-typing is not a property of languages.

I disagree with you and agree with Anton. Here, it is helpful to 
understand the history of Scheme a bit: parts of its design are a 
reaction to what Schemers perceived as having failed in Common Lisp (and 
other previous Lisp dialects).

One particularly illuminating example is the treatment of nil in Common 
Lisp. That value is a very strange beast in Common Lisp because it 
stands for several concepts at the same time: most importantly the empty 
list and the boolean false value. Its type is also "interesting": it is 
both a list and a symbol at the same time. It is also "interesting" that 
its quoted value is equivalent to the value nil itself. This means that 
the following two forms are equivalent:

(if nil 42 4711)
(if 'nil 42 4711)

Both forms evaluate to 4711.

It's also the case that taking the car or cdr (first or rest) of nil 
doesn't give you an error, but simply returns nil as well.

The advantage of this design is that it allows you to express a lot of 
code in a very compact way. See 
http://www.apl.jhu.edu/~hall/lisp/Scheme-Ballad.text for a nice 
illustration.
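The contrast between the two designs can be sketched like this (an illustrative Python model, not actual Lisp or Scheme code; the names are hypothetical):

```python
# Common Lisp style: nil is simultaneously false and the empty list,
# and (cdr nil) simply returns nil -- no error.
NIL = None

def cl_cdr(x):
    if x is NIL:
        return NIL          # nil-punning: cdr of nil is nil
    return x[1]             # x is a (car, cdr) pair

# Scheme style: the empty list is a distinct value, disjoint from #f,
# and taking cdr of the empty list is an error.
EMPTY = ()

def scheme_cdr(x):
    if x is EMPTY:
        raise TypeError("cdr: expected a pair, got the empty list")
    return x[1]
```

The Scheme version admits a clean typed reading (cdr : pair -> value); the nil-punning version is the one that is "mostly impossible" to disambiguate.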

The disadvantage is that it is mostly impossible to have a typed view of 
nil, at least one that clearly disambiguates all the cases. There are 
also other examples where Common Lisp conflates different types, and 
sometimes only for special cases. [1]

Now compare this with the Scheme specification, especially this section: 
http://www.schemers.org/Documents/Standards/R5RS/HTML/r5rs-Z-H-6.html#%25_sec_3.2

This clearly deviates strongly from Common Lisp (and other Lisp 
dialects). The emphasis here is on a clear separation of all the types 
specified in the Scheme standard, without any exception. This is exactly 
what makes it straightforward in Scheme to have a latently typed view of 
programs, in the sense that Anton describes. So latent typing is a 
property that can at least be enabled / supported by a programming 
language, so it is reasonable to talk about this as a property of some 
dynamically typed languages.



Pascal

[1] Yet Common Lisp allows you to write beautiful code, more often than 
not especially _because_ of these "weird" conflations, but that's a 
different topic.

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: David Hopwood
Subject: Re: Saying "latently-typed language" is making a category mistake
Date: 
Message-ID: <qCing.477123$xt.142183@fe3.news.blueyonder.co.uk>
Pascal Costanza wrote:
> Vesa Karvonen wrote:
> 
>> I think that we're finally getting to the bottom of things.  While
>> reading your reponses something became very clear to me: latent-typing and
>> latent-types are not a property of languages.  Latent-typing, also known as
>> informal reasoning, is something that all programmers do as a normal part
>> of programming.  To say that a language is latently-typed is to make a
>> category mistake, because latent-typing is not a property of languages.
> 
> I disagree with you and agree with Anton. Here, it is helpful to
> understand the history of Scheme a bit: parts of its design are a
> reaction to what Schemers perceived as having failed in Common Lisp (and
> other previous Lisp dialects).
> 
> One particularly illuminating example is the treatment of nil in Common
> Lisp. That value is a very strange beast in Common Lisp because it
> stands for several concepts at the same time: most importantly the empty
> list and the boolean false value. Its type is also "interesting": it is
> both a list and a symbol at the same time. It is also "interesting" that
> its quoted value is equivalent to the value nil itself. This means that
> the following two forms are equivalent:
> 
> (if nil 42 4711)
> (if 'nil 42 4711)
> 
> Both forms evaluate to 4711.
> 
> It's also the case that taking the car or cdr (first or rest) of nil
> doesn't give you an error, but simply returns nil as well.
> 
> The advantage of this design is that it allows you to express a lot of
> code in a very compact way. See
> http://www.apl.jhu.edu/~hall/lisp/Scheme-Ballad.text for a nice
> illustration.
> 
> The disadvantage is that it is mostly impossible to have a typed view of
> nil, at least one that clearly disambiguates all the cases. There are
> also other examples where Common Lisp conflates different types, and
> sometimes only for special cases. [1]
> 
> Now compare this with the Scheme specification, especially this section:
> http://www.schemers.org/Documents/Standards/R5RS/HTML/r5rs-Z-H-6.html#%25_sec_3.2
> 
> This clearly deviates strongly from Common Lisp (and other Lisp
> dialects). The emphasis here is on a clear separation of all the types
> specified in the Scheme standard, without any exception. This is exactly
> what makes it straightforward in Scheme to have a latently typed view of
> programs, in the sense that Anton describes. So latent typing is a
> property that can at least be enabled / supported by a programming
> language, so it is reasonable to talk about this as a property of some
> dynamically typed languages.

If anything, I think that this example supports my and Vesa's point.
The example demonstrates that languages
*that are not distinguished in whether they are called latently typed*
support informal reasoning about types to varying degrees.

-- 
David Hopwood <····················@blueyonder.co.uk>
From: David Hopwood
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <A%cng.475952$xt.286049@fe3.news.blueyonder.co.uk>
Anton van Straaten wrote:
> I'm suggesting that if a language classifies and tags values in a way
> that supports the programmer in static reasoning about the behavior of
> terms, that calling it "untyped" does not capture the entire picture,
> even if it's technically accurate in a restricted sense (i.e. in the
> sense that terms don't have static types that are known within the
> language).
> 
> Let me come at this from another direction: what do you call the
> classifications into number, string, vector etc. that a language like
> Scheme does?  And when someone writes a program which includes the
> following lines, how would you characterize the contents of the comment:
> 
> ; third : integer -> integer
> (define (third n) (quotient n 3))

I would call it an informal type annotation. But the very fact that
it has to be expressed as a comment, and is not checked, means that
the *language* is not typed (even though Scheme is dynamically tagged,
and even though dynamic tagging provides *partial* support for a
programming style that uses this kind of informal annotation).

-- 
David Hopwood <····················@blueyonder.co.uk>
From: David Hopwood
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <tkjng.477126$xt.379128@fe3.news.blueyonder.co.uk>
David Hopwood wrote:
> Anton van Straaten wrote:
> 
>>I'm suggesting that if a language classifies and tags values in a way
>>that supports the programmer in static reasoning about the behavior of
>>terms, that calling it "untyped" does not capture the entire picture,
>>even if it's technically accurate in a restricted sense (i.e. in the
>>sense that terms don't have static types that are known within the
>>language).
>>
>>Let me come at this from another direction: what do you call the
>>classifications into number, string, vector etc. that a language like
>>Scheme does?  And when someone writes a program which includes the
>>following lines, how would you characterize the contents of the comment:
>>
>>; third : integer -> integer
>>(define (third n) (quotient n 3))
> 
> I would call it an informal type annotation. But the very fact that
> it has to be expressed as a comment, and is not checked,

What I meant to say here is "and is not used in any way by the language
implementation,"

> means that
> the *language* is not typed (even though Scheme is dynamically tagged,
> and even though dynamic tagging provides *partial* support for a
> programming style that uses this kind of informal annotation).

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Ketil Malde
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <egodwgyz8c.fsf@polarvier.ii.uib.no>
Anton van Straaten <·····@appsolutions.com> writes:

> But a program as seen by the programmer has types: the programmer
> performs (static) type inference when reasoning about the program, and
> debugs those inferences when debugging the program, finally ending up
> with a program which has a perfectly good type scheme.

I'd be tempted to go further, and say that the programmer performs 
(informally, incompletely, and probably incorrectly) a proof of
correctness, and that type correctness is just a part of general
correctness. 

-k
-- 
If I haven't seen further, it is by standing in the footprints of giants
From: Chris Smith
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <MPG.1f0118eb59a614fa9896cd@news.altopia.net>
Joe Marshall <··········@gmail.com> wrote:
> They *do* have a related meaning.  Consider this code fragment:
> (car "a string")

My feeling is that this code is obviously wrong.  It is so obviously 
wrong, in fact, that the majority of automated error detection systems, 
if written for Lisp, would probably manage to identify it as wrong at 
some point.  This includes the Lisp runtime.  So far, I haven't 
mentioned types.

> A string is *not* a valid argument to CAR.  Ask anyone why and
> they will tell you `It's the wrong type.'

Indeed, they will.  We're assuming, of course that they know a little 
Lisp... otherwise, it may be entirely reasonable for someone to expect 
that (car "a string") is 'a' and (cdr "a string") is " string"... but 
I'll ignore that, even though I haven't yet convinced myself that it's 
not relevant to why this is colloquially considered a type error.

I believe that in this sense, the 'anyone' actually means "type" in the 
sense that you mean "type".  The fact that a static type system detects 
this error is somewhat coincidental (except for the fact that, again, 
any general error-detection scheme worth its salt probably detects this 
error), and orthogonal to whether it is considered a type error by our 
hypothetical 'anyone'.

> Both `static typing' and `dynamic typing' (in the colloquial sense) are
> strategies to detect this sort of error.

I will repeat that static typing is a strategy for detecting errors in 
general, on the basis of tractable syntactic methods.  There are some 
types of errors that are easier to detect in such a system than 
others... but several examples have been given of problems solved by 
static type systems that are not of the colloquial "It's the wrong 
type" variety that you mention here.  The examples so far have included 
detecting division by zero, or array bounds checking.  Other type 
systems can check dimensionality (correct units).

Another particularly interesting example may be the following from 
Ocaml:

  let my_sqrt x = if x < 0.0 then None else Some(sqrt(x));;

Then, if I attempt to use my_sqrt in a context that requires a float, 
the compiler will complain about a type violation, since the type of the 
expression is "float option".  So this is a type error *in Ocaml*, but 
it's not the kind of thing that gets intuitively classified as a type 
error.  In fact, it's roughly equivalent to a NullPointerException at 
runtime in Java, and few Java programmers would consider a 
NullPointerException to be somehow "actually a type error" that the 
compiler just doesn't catch.  In this case, when the error appears in 
Ocaml, it appears to be "obviously" a type error, but that's only 
because the type system was designed to catch some class of program 
errors, of which this is a member.
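The rough analogue of that Ocaml example in another language (a Python sketch, with hypothetical names; Python's Optional hint is not statically enforced the way Ocaml's option type is) makes the parallel to unchecked nulls visible:

```python
import math
from typing import Optional

# Analogue of: let my_sqrt x = if x < 0.0 then None else Some(sqrt(x));;
def my_sqrt(x: float) -> Optional[float]:
    return None if x < 0.0 else math.sqrt(x)

def needs_float(y: float) -> float:
    return y + 1.0

# Ocaml rejects passing a 'float option' where a 'float' is required;
# here the unchecked call needs_float(my_sqrt(-1.0)) would instead fail
# at runtime -- the moral equivalent of a NullPointerException.
def checked_use(x: float) -> float:
    res = my_sqrt(x)
    return needs_float(res) if res is not None else float("nan")
```

What Ocaml classifies as a type error is, in the unchecked setting, just an ordinary runtime failure.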

> It's hardly mythical.  (car "a string") is obviously an error and you
> don't need a static type system to know that.

Sure.  The question is whether it means much to say that it's a "type 
error".  So far, I'd agree with either of two statements, depending on 
the usage of the word "type":

a) Yes, it means something, but Torben's definition of a static type 
system was wrong, because static type systems are not specifically 
looking for type errors.

    or

b) No, "type error" just means "error that can be caught by the type 
system", so it is circular and meaningless to use the phrase in defining 
a kind of type system.

> I mean that this has been argued time and time again in comp.lang.lisp
> and probably the other groups as well.

My apologies, then.  It has not been discussed so often in any newsgroup 
that I followed up until now, though Marshall has now convinced me to 
read comp.lang.functional, so I might see these endless discussions from 
now on.

> In fact, we become rather confused when you say `a
> correctly typed program cannot go wrong at runtime' because we've seen
> plenty of runtime errors from code that is `correctly typed'.

Actually, I become a little confused by that as well.  I suppose it 
would be true of a "perfect" static type system, but I haven't seen one 
of those yet.  (Someone did email me earlier today to point out that the 
type system of a specification language called LOTOS supposedly is 
perfect in that sense, that every correctly typed program is also 
correct, but I've yet to verify this for myself.  It seems rather 
difficult to believe.)

Unless I suddenly have some kind of realization in the future about the 
feasibility of a perfect type system, I probably won't make that 
statement that you say confuses you.

> >     An attempt to generalize the definition of "type" from programming
> >     language type theory to eliminate the requirement that they are
> >     syntactic in nature yields something meaningless.  Any concept of
> >     "type" that is not syntactic is a completely different thing from
> >     static types.
> 
> Agreed.  That is why there is the qualifier `dynamic'.  This indicates
> that it is a completely different thing from static types.

If we agree about this, then there is no need to continue this 
discussion.  I'm not sure we do agree, though, because I doubt we'd be 
right here in this conversation if we did.

This aspect of being a "completely different thing" is why I objected to 
Torben's statement of the form: static type systems detect type 
violations at compile time, whereas dynamic type systems detect type 
violations at runtime.  The problem with that statement is that "type 
violations" means different things in the first and second half, and 
that's obscured by the form of the statement.  It would perhaps be 
clearer to say something like:

    Static type systems detect some bugs at compile time, whereas
    dynamic type systems detect type violations at runtime.

Here's one interesting consequence of the change.  It is easy to 
recognize that static and dynamic type systems are largely orthogonal to 
each other in the sense above.  Java, for example (since that's one of 
the newsgroups on the distribution list for this thread), restricting 
the field of view to reference types for simplicity's sake, clearly has 
both a very well-developed static type system, and a somewhat well-
developed dynamic type system.  There are dynamic "type" errors that 
pass the compiler and are caught by the runtime; and there are errors 
that are caught by the static type system.  There is indeed considerable 
overlap involved, but nevertheless, neither is made redundant.  In fact, 
one way of understanding the headaches of Java 1.5 generics is to note 
that the two different meanings of "type errors" are no longer in 
agreement with each other!

> We're all rubes here, so don't try to educate us with your
> high-falutin' technical terms.

That's not my intention.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Chris Smith
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <MPG.1f011a01ffcb49649896ce@news.altopia.net>
I <·······@twu.net> wrote:
>     Static type systems detect some bugs at compile time, whereas
>     dynamic type systems detect type violations at runtime.

PS: In order to satisfy the Python group (among others not on the cross-
post list), we'd need to include "duck typing," which fits neither of 
the two definitions above.  We'd probably need to modify the definition 
of dynamic type systems, since most sources tend to classify it as a 
dynamic type system.  It's getting rather late, though, and I don't 
intend to think about how to do that.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Joe Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150823341.079992.98350@y41g2000cwy.googlegroups.com>
Chris Smith wrote:
> Joe Marshall <··········@gmail.com> wrote:
> >
> > Agreed.  That is why there is the qualifier `dynamic'.  This indicates
> > that it is a completely different thing from static types.
>
> If we agree about this, then there is no need to continue this
> discussion.  I'm not sure we do agree, though, because I doubt we'd be
> right here in this conversation if we did.

I think we do agree.

The issue of `static vs. dynamic types' comes up about twice a year in
comp.lang.lisp.  It generally gets pretty heated, but eventually people
come to understand what the other person is saying (or they get bored
and drop out of the conversation - I'm not sure which).  Someone always
points out that the phrase `dynamic types' really has no meaning in the
world of static type analysis.  (Conversely, the notion of a `static
type' that is available at runtime has no meaning in the dynamic
world.)  Much confusion usually follows.

You'll get much farther in your arguments by explaining what you mean
in detail rather than attempting to force a unification of terminology.
You'll also get farther by remembering that many of the people here
have not had much experience with real static type systems.  The static
typing of C++ or Java is so primitive that it is barely an example of
static typing at all, yet these are the most common examples of
statically typed languages people typically encounter.
From: genea
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150827075.175209.8800@i40g2000cwc.googlegroups.com>
Joe Marshall wrote:
{...}
> The issue of `static vs. dynamic types' comes up about twice a year in
> comp.lang.lisp  It generally gets pretty heated, but eventually people
> come to understand what the other person is saying (or they get bored
> and drop out of the conversation - I'm not sure which).  {...}

  I think that the thing about "Language Expressiveness" is just so
elusive, as it is based on each programmer's way of thinking about
things, and also the general types of problems that that programmer is
dealing with on a daily basis.  There are folks out there like the Paul
Grahams of the world, who do wonderfully complex things in Lisp,
eschew totally the facilities of CLOS, the Lisp object system, and
get the job done ... just because they can hack and have a mental picture
of what all the type matches are "in their head".  I used to use Forth,
where everything goes on a stack, and it is up to the programmer to
remember what the heck the type of a thing was that was stored there...
maybe an address of a string, another word {read: function}, or an
integer...  NO TYPING AT ALL, but can you be expressive in Forth?  You
can if you are a good Forth programmer... NOW that being said, I think
that the reason I like Haskell, a very strongly typed language, is that
because of its type system, the language is able to do things like
lazy evaluation, and even though it is strongly typed, and has
wonderful things like type classes, a person can write wildly
expressive code, and NEVER write a thing like:
    fromtoby :: forall b a.
            (Num a, Enum a) =>
            a -> a -> a -> (a -> b) -> [b]
The above was generated by the Haskell compiler for me... and it does
that all the time, without any fuss.  I just wrote the function and it
did the rest for me...  By the way, the function was a replacement for
the { for / to / by } construct of a C-like language, and it was done
in one line...  THAT TO ME IS EXPRESSIVE: when I can build whole new
features into my language in just a few lines... usually only one
line...  I think that is why guys who are good to great in dynamic
(or, if it floats your boat, type-free) languages like Lisp and Scheme
find their languages so expressive: they have found macros, or
whatever other facility, to give them the power... to extend their
language to meet a problem, or maybe to fit closer to an entire class
of problems... giving them the tool box.  Haskellers, folks using
Ruby, Python, ML, OCaml, Unicon... even C or whatever... by building
either modules or libraries, and understanding that whole attitude of
tool building, can be equally as expressive "for their own way of
doing things".  Heck, I don't use one size of hammer out in my
workshop, and I sure as hell don't on my box...  I find myself picking
up Icon and its derivative Unicon to do a lot of data-washing chores,
as they allow functions as first-class objects... any type can be
stored in a list... tables... like a hash... with any type as the data
and the key... you can do things like

i := 0
every write(i +:= 1 || read())

which will prefix the lines read from stdin with a sequence of line
numbers... pretty damn concise and expressive in my book...  Now for
other problems... Haskell, with its single-type lists... which of
course can have tuples, which themselves have tuples in them with a
list as one element of that tuple... etc... and you can build
accessors for all of that, for function application brought to bear on
one element of one tuple, allowing mappings of that function to all of
that particular element of... well, you get the idea.  It is all about
how you see things and what your particular mindset is...  I would
agree with someone who said early on in the discussion that thing
about "types warping your mind": depending on the individual, it is
either a good warp or a bad warp, but that is why there are Ruby and
Python, and Haskell, and even Forth... for a given problem and a given
programmer, any one of those, or even COBOL or Fortran, might be the
ideal tool... if nothing else, based on that person's familiarity or
existing tool and code base.

enough out of me...
-- gene
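[For readers who don't know Haskell: the fromtoby construct gene describes (a one-line replacement for a C-style for/to/by loop, whose type signature the compiler inferred for him) might be sketched in Python roughly as follows. This rendering is an illustration of the idea, not code from the original post, and it assumes integer bounds with a positive step.]

```python
# Hypothetical Python analogue of gene's fromtoby: step from `start` up
# to `stop` (inclusive) in increments of `by`, apply `f` to each value,
# and collect the results.  The Haskell original worked for any Enum
# type; this sketch only handles integers with a positive step.
def fromtoby(start, stop, by, f):
    return [f(x) for x in range(start, stop + 1, by)]

print(fromtoby(1, 10, 3, lambda x: x * x))  # [1, 16, 49, 100]
```

[The point being illustrated is the one gene makes: a whole loop construct becomes a one-liner that can be reused everywhere.]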
From: David Hopwood
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <UW3mg.463049$xt.149948@fe3.news.blueyonder.co.uk>
genea wrote:
> [...] NOW that being said, I think
> that the reason I like Haskell, a very strongly typed language, is that
> because of its type system, the language is able to do things like
> lazy evaluation, [...]

Lazy evaluation does not depend on, nor is it particularly helped by
static typing (assuming that's what you mean by "strongly typed" here).

An example of a non-statically-typed language that supports lazy evaluation
is Oz. (Lazy functions are explicitly declared in Oz, as opposed to Haskell's
implicit lazy evaluation, but that's not because of the difference in type
systems.)

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Pascal Costanza
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <4fo19aF1jsudeU1@individual.net>
Chris Smith wrote:
> Torben Ægidius Mogensen <·······@app-3.diku.dk> wrote:
>> That's not really the difference between static and dynamic typing.
>> Static typing means that there exists a typing at compile-time that
>> guarantees against run-time type violations.  Dynamic typing means
>> that such violations are detected at run-time.  This is orthogonal to
>> strong versus weak typing, which is about whether such violations are
>> detected at all.  The archetypal weakly typed language is machine code
>> -- you can happily load a floating point value from memory, add it to
>> a string pointer and jump to the resulting value.  ML and Scheme are
>> both strongly typed, but one is statically typed and the other
>> dynamically typed.
> 
> Knowing that it'll cause a lot of strenuous objection, I'll nevertheless 
> interject my plea not to abuse the word "type" with a phrase like 
> "dynamically typed".  If anyone considers "untyped" to be pejorative, 
> as some people apparently do, then I'll note that another common term is 
> "type-free," which is marketing-approved but doesn't carry the 
> misleading connotations of "dynamically typed."  We are quickly losing 
> any rational meaning whatsoever to the word "type," and that's quite a 
> shame.

The words "untyped" or "type-free" only make sense in a purely 
statically typed setting. In a dynamically typed setting, they are 
meaningless, in the sense that there are _of course_ types that the 
runtime system respects.

Types can be represented at runtime via type tags. You could insist on 
using the term "dynamically tagged languages", but this wouldn't change 
a lot. Exactly _because_ it doesn't make sense in a statically typed 
setting, the term "dynamically typed language" is good enough to 
communicate what we are talking about - i.e. not (static) typing.
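[Pascal's point about runtime type tags can be made concrete in Python, one of the dynamically typed languages mentioned in this thread. This is an illustrative sketch added here, not part of the original post.]

```python
# Runtime type tags in a dynamically typed language: every value
# carries a tag, and the runtime consults the tags before performing
# an operation.
x = 42
y = "forty-two"

print(type(x).__name__)  # int
print(type(y).__name__)  # str

# The runtime "respects" the tags: adding an int to a str is detected
# and rejected at run-time rather than silently misinterpreting bits.
try:
    x + y
except TypeError as e:
    print("caught:", e)
```

[This is exactly the sense in which a "type-free" language still has types that the runtime system enforces.]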

> By way of extending the point, let me mention that there is no such 
> thing as a universal class of things that are called "run-time type 
> violations".  At runtime, there is merely correct code and incorrect 
> code.

No, there is more: There is safe and unsafe code (i.e., code that throws 
exceptions or that potentially just does random things). There are also 
runtime systems where you have the chance to fix the reason that caused 
the exception and continue to run your program. The latter play very 
well with dynamic types / type tags.

> To the extent that anything is called a "type" at runtime, this 
> is a different usage of the word from the usage by which we may define 
> languages as being statically typed (which means just "typed").  In 
> typed OO languages, this runtime usage is often called the "class", for 
> example, to distinguish it from type.

What type of person are you to tell other people what terminology to use? ;)

Ha! Here I used "type" in just another sense of the word. ;)

It is important to know the context in which you are discussing things. 
For example, "we" Common Lispers use the term "type" as defined in 
http://www.lispworks.com/documentation/HyperSpec/Body/26_glo_t.htm . You 
cannot possibly argue that "our" use of the word "type" is incorrect 
because in "our" context, when we talk about Common Lisp, the use of the 
word "type" better be consistent with that definition. (You can say that 
you don't like the definition, that it is unsound, or whatever, but that 
doesn't change anything here.)

> This cleaner terminology eliminates a lot of confusion.  For example, it 
> clarifies that there is no binary division between strongly typed 
> languages and weakly typed languages, since the division between a "type 
> error" and any other kind of error is arbitrary, depending only on 
> whether the type system in a particular language happens to catch that 
> error.  For example, type systems have been defined to try to catch unit 
> errors in scientific programming, or to catch out-of-bounds array 
> indices... yet these are not commonly called "type errors" only because 
> such systems are not in common use.

What type system catches division by zero? That is, statically? Would 
you like to program in such a language?

> This isn't just a matter of preference in terminology.  The definitions 
> above (which are, in my experience, used widely by most non-academic 
> language design discussions) actually limit our understanding of 
> language design by pretending that certain delicate trade-offs such as 
> the extent of the type system, or which language behavior is allowed to 
> be non-deterministic or undefined, are etched in stone.  This is simply 
> not so.  If types DON'T mean a compile-time method for proving the 
> absence of certain program behaviors, then they don't mean anything at 
> all.  Pretending that there's a distinction at runtime between "type 
> errors" and "other errors" serves only to confuse things and 
> artificially limit which problems we are willing to concieve as being 
> solvable by types.

Your problem doesn't exist. Just say "types" when you're amongst your 
own folks, and "static types" when you're amongst a broader audience, 
and everything's fine. Instead of focusing on terminology, just focus on 
the contents.


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: Chris Smith
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <MPG.1f0093611056cbb19896c7@news.altopia.net>
Pascal Costanza <··@p-cos.net> wrote:
> Types can be represented at runtime via type tags. You could insist on 
> using the term "dynamically tagged languages", but this wouldn't change 
> a lot. Exactly _because_ it doesn't make sense in a statically typed 
> setting, the term "dynamically typed language" is good enough to 
> communicate what we are talking about - i.e. not (static) typing.

Okay, fair enough.  It's certainly possible to use the same sequence of 
letters to mean two different things in different contexts.  The problem 
arises, then, when Torben writes:

: That's not really the difference between static and dynamic typing.
: Static typing means that there exists a typing at compile-time that
: guarantees against run-time type violations.  Dynamic typing means
: that such violations are detected at run-time.

This is clearly not using the word "type" to mean two different things 
in different contexts.  Rather, it is speaking under the mistaken 
impression that "static typing" and "dynamic typing" are varieties of 
some general thing called "typing."  In fact, the phrase "dynamically 
typed" was invented to do precisely that.  My argument is not really 
with LISP programmers talking about types, by which they would mean 
approximately the same thing Java programmers mean by "class."  My point 
here concerns the confusion that results from the conception that there 
is this binary distinction (or continuum, or any other simple 
relationship) between a "statically typed" and a "dynamically typed" 
language.

Torben's (and I don't mean to single out Torben -- the terminology is 
used quite widely) classification of dynamic versus static type systems 
depends on the misconception that there is some universal definition to 
the term "type error" or "type violation" and that the only question is 
how we address these well-defined things.  It's that misconception that 
I aim to challenge.

> No, there is more: There is safe and unsafe code (i.e., code that throws 
> exceptions or that potentially just does random things). There are also 
> runtime systems where you have the chance to fix the reason that caused 
> the exception and continue to run your program. The latter play very 
> well with dynamic types / type tags.

Yes, I was oversimplifying.

> What type system catches division by zero? That is, statically?

I can define such a type system trivially.  To do so, I simply define a 
type for integers, Z, and a subtype for non-zero integers, Z'.  I then 
define the language such that division is only possible in an expression 
that looks like << z / z' >>, where z has type Z and z' has type Z'.  
The language may then contain an expression:

  z 0? t1 : t2

in which t1 is evaluated in the parent type environment, but t2 is 
evaluated in the type environment augmented by (z -> Z'), the type of 
the expression is the intersection type of t1 and t2 evaluated in those 
type environments, and the evaluation rules are defined as you probably 
expect.
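[Chris's hypothetical typing rule can be mimicked at run-time in Python; this sketch is my own illustration, not part of the original post. The zero test plays the role of the `z 0? t1 : t2` form: division is only ever performed in the branch where z is known to be non-zero, mirroring the type environment augmented by (z -> Z').]

```python
# Runtime analogue of the sketched static rule: the division appears
# only in the branch where z has been checked, i.e. where z would have
# the refined type Z' (non-zero integer) under Chris's rule.
def guarded_div(n, z, fallback):
    if z == 0:
        return fallback  # t1: the result when no refinement holds
    return n // z        # t2: here z is known non-zero, so division is legal

print(guarded_div(10, 2, -1))  # 5
print(guarded_div(10, 0, -1))  # -1
```

[A static checker enforcing this rule would simply reject any division not syntactically guarded this way, which is what makes the language safe but awkward to use.]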

> Would you like to program in such a language?

No.  Type systems for real programming languages are, of course, a 
balance between rigor and usability.  This particular set of type rules 
doesn't seem to exhibit a good balance.  Perhaps there is a way to 
achieve it in a way that is more workable, but it's not immediately 
obvious.

As another example, from Pierce's text "Types and Programming 
Languages", Pierce writes: "Static elimination of array-bounds checking 
is a long-standing goal for type system designers.  In principle, the 
necessary mechanisms (based on dependent types) are well understood, but 
packaging them in a form that balances expressive power, predictability 
and tractability of typechecking, and complexity of program annotations 
remains a significant challenge."  Again, this could quite validly be 
described as a type error, just like division by zero or ANY other 
program error... it's just that the type system that solves it doesn't 
look appealing, so everyone punts the job to runtime checks (or, in some 
cases, to the CPU's memory protection features and/or the user's ability 
to fix resulting data corruption).

Why aren't these things commonly considered type errors?  There is only 
one reason: there exists no widely used language which solves them with 
types.  (I mean in the programming language type theory sense of "type"; 
since many languages "tag" arrays with annotations indicating their 
dimensions, I guess you could say that we do solve them with types in 
the LISP sense).

> Your problem doesn't exist. Just say "types" when you're amongst your 
> own folks, and "static types" when you're amongst a broader audience, 
> and everything's fine.

I think I've explained why that's not the case.  I don't have a 
complaint about anyone speaking of types.  It's the confusion from 
pretending that the two definitions are comparable that I'm pointing 
out.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Pascal Costanza
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <4fo9c6F1k7pesU1@individual.net>
Chris Smith wrote:
> Pascal Costanza <··@p-cos.net> wrote:
>> Types can be represented at runtime via type tags. You could insist on 
>> using the term "dynamically tagged languages", but this wouldn't change 
>> a lot. Exactly _because_ it doesn't make sense in a statically typed 
>> setting, the term "dynamically typed language" is good enough to 
>> communicate what we are talking about - i.e. not (static) typing.
> 
> Okay, fair enough.  It's certainly possible to use the same sequence of 
> letters to mean two different things in different contexts.  The problem 
> arises, then, when Torben writes:
> 
> : That's not really the difference between static and dynamic typing.
> : Static typing means that there exists a typing at compile-time that
> : guarantees against run-time type violations.  Dynamic typing means
> : that such violations are detected at run-time.
> 
> This is clearly not using the word "type" to mean two different things 
> in different contexts.  Rather, it is speaking under the mistaken 
> impression that "static typing" and "dynamic typing" are varieties of 
> some general thing called "typing."  In fact, the phrase "dynamically 
> typed" was invented to do precisely that.  My argument is not really 
> with LISP programmers talking about types, by which they would mean 
> approximately the same thing Java programmers mean by "class."  My point 
> here concerns the confusion that results from the conception that there 
> is this binary distinction (or continuum, or any other simple 
> relationship) between a "statically typed" and a "dynamically typed" 
> language.

There is an overlap in the sense that some static type systems cover 
only types as sets of values whose correct use could as well be checked 
dynamically.

Yes, it's correct that more advanced static type systems can provide 
more semantics than that (and vice versa).


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: David Hopwood
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <up3mg.438312$tc.289369@fe2.news.blueyonder.co.uk>
Pascal Costanza wrote:
> Chris Smith wrote:
> 
>> Knowing that it'll cause a lot of strenuous objection, I'll
>> nevertheless interject my plea not to abuse the word "type" with a
>> phrase like "dynamically typed".  If anyone considers "untyped" to be
>> pejorative, as some people apparently do, then I'll note that another
>> common term is "type-free," which is marketing-approved but doesn't
>> carry the misleading connotations of "dynamically typed."  We are
>> quickly losing any rational meaning whatsoever to the word "type," and
>> that's quite a shame.
> 
> The words "untyped" or "type-free" only make sense in a purely
> statically typed setting. In a dynamically typed setting, they are
> meaningless, in the sense that there are _of course_ types that the
> runtime system respects.
> 
> Types can be represented at runtime via type tags. You could insist on
> using the term "dynamically tagged languages", but this wouldn't change
> a lot. Exactly _because_ it doesn't make sense in a statically typed
> setting, the term "dynamically typed language" is good enough to
> communicate what we are talking about - i.e. not (static) typing.

Oh, but it *does* make sense to talk about dynamic tagging in a statically
typed language.

That's part of what makes the term "dynamically typed" harmful: it implies
a dichotomy between "dynamically typed" and "statically typed" languages,
when in fact dynamic tagging and static typing are (mostly) independent
features.

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Ben Morrow
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <5btmm3-ghu.ln1@osiris.mauzo.dyndns.org>
Quoth David Hopwood <····················@blueyonder.co.uk>:
> Pascal Costanza wrote:
> > Chris Smith wrote:
> > 
> > Types can be represented at runtime via type tags. You could insist on
> > using the term "dynamically tagged languages", but this wouldn't change
> > a lot. Exactly _because_ it doesn't make sense in a statically typed
> > setting, the term "dynamically typed language" is good enough to
> > communicate what we are talking about - i.e. not (static) typing.
> 
> Oh, but it *does* make sense to talk about dynamic tagging in a statically
> typed language.

Though I'm *seriously* reluctant to encourage this thread...

A prime example of this is Perl, which has both static and dynamic
typing. Variables are statically typed scalar/array/hash, and then
scalars are dynamically typed string/int/unsigned/float/ref.

> That's part of what makes the term "dynamically typed" harmful: it implies
> a dichotomy between "dynamically typed" and "statically typed" languages,
> when in fact dynamic tagging and static typing are (mostly) independent
> features.

Nevertheless, I see no problem in calling both of these 'typing'. They
are both means to the same end: causing a bunch of bits to be
interpreted in a meaningful fashion. The only difference is whether the
distinction is made at compile- or run-time. The above para had no
ambiguities...

Ben

-- 
Every twenty-four hours about 34k children die from the effects of poverty.
Meanwhile, the latest estimate is that 2800 people died on 9/11, so it's like
that image, that ghastly, grey-billowing, double-barrelled fall, repeated
twelve times every day. Full of children. [Iain Banks]  ·········@tiscali.co.uk
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150902731.828394.175560@m73g2000cwd.googlegroups.com>
David Hopwood wrote:
>
> Oh, but it *does* make sense to talk about dynamic tagging in a statically
> typed language.
>
> That's part of what makes the term "dynamically typed" harmful: it implies
> a dichotomy between "dynamically typed" and "statically typed" languages,
> when in fact dynamic tagging and static typing are (mostly) independent
> features.

That's really coming home to me in this thread: the terminology is *so*
bad. I have noticed this previously in the differences between
structural and nominal typing; many typing issues associated with this
distinction are falsely labeled as static-vs-dynamic issues, since so
many statically typed languages are nominally typed.

We need entirely new, finer grained terminology.


Marshall
From: Chris Uppal
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <449acc54$0$664$bed64819@news.gradwell.net>
Joe Marshall wrote:

> What we need is an FAQ entry for how to talk about types with people
> who are technically adept, but non-specialists.  Or alternatively, an
> FAQ of how to explain the term `dynamic typing' to a type theorist.

You could point people at
    "a regular series on object-oriented type theory, aimed
    specifically at non-theoreticians."
which was published on/in JoT from:
    http://www.jot.fm/issues/issue_2002_05/column5
to
    http://www.jot.fm/issues/issue_2005_09/column1

Only 20 episodes! (But #3 seems to be missing.)

Actually the first one has (in section four) a quick and painless overview of
several kinds of type theory.  I haven't read the rest (yet, and maybe never
;-)

    -- chris
From: Andreas Rossberg
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7b25p$8v164$1@hades.rz.uni-saarland.de>
David Hopwood wrote:
> 
> Oh, but it *does* make sense to talk about dynamic tagging in a statically
> typed language.

It even makes perfect sense to talk about dynamic typing in a statically 
typed language - but keeping the terminology straight, this rather 
refers to something like described in the well-known paper of the same 
title (and its numerous follow-ups):

   Martin Abadi, Luca Cardelli, Benjamin Pierce, Gordon Plotkin
   Dynamic typing in a statically-typed language.
   Proc. 16th Symposium on Principles of Programming Languages, 1989
   / TOPLAS 13(2), 1991

Note how this is totally different from simple tagging, because it deals 
with real types at runtime.

- Andreas
From: Vesa Karvonen
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e77vma$ipe$1@oravannahka.helsinki.fi>
In comp.lang.functional Chris Smith <·······@twu.net> wrote:
[...]
> Knowing that it'll cause a lot of strenuous objection, I'll nevertheless 
> interject my plea not to abuse the word "type" with a phrase like 
> "dynamically typed".  If anyone considers "untyped" to be pejorative, 
> as some people apparently do, then I'll note that another common term is 
> "type-free," which is marketing-approved but doesn't carry the 
> misleading connotations of "dynamically typed."  We are quickly losing 
> any rational meaning whatsoever to the word "type," and that's quite a 
> shame.
[...]

FWIW, I agree and have argued similarly on many occasions (both on the
net (e.g. http://groups.google.fi/group/comp.programming/msg/ba3ccfde4734313a?hl=fi&)
and person-to-person).  The widely used terminology (statically /
dynamically typed, weakly / strongly typed) is extremely confusing to
beginners and even to many with considerable practical experience.

-Vesa Karvonen
From: Joachim Durchholz
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e79rh0$e5k$1@online.de>
Torben Ægidius Mogensen wrote:
> That's not really the difference between static and dynamic typing.
> Static typing means that there exists a typing at compile-time that
> guarantees against run-time type violations.  Dynamic typing means
> that such violations are detected at run-time. 

Agreed.

> This is orthogonal to
> strong versus weak typing, which is about whether such violations are
> detected at all.  The archetypal weakly typed language is machine code
> -- you can happily load a floating point value from memory, add it to
> a string pointer and jump to the resulting value.

I'd rather call machine code "untyped".
("Strong typing" and "weak typing" don't have universally accepted 
definitions anyway, and I'm not sure that this terminology is helpful.)

> Anyway, type inference for statically typed languages doesn't make them
> any more dynamically typed.  It just moves the burden of assigning the
> types from the programmer to the compiler.  And (for HM type systems)
> the compiler doesn't "guess" at a type -- it finds the unique most
> general type from which all other legal types (within the type system)
> can be found by instantiation.

Hmm... I think this distinction doesn't cover all cases.

Assume a language that
a) defines that a program is "type-correct" iff HM inference establishes 
that there are no type errors
b) compiles a type-incorrect program anyway, with an established, 
rigorous semantics for such programs (e.g. by throwing exceptions as 
appropriate).
The compiler might actually refuse to compile type-incorrect programs, 
depending on compiler flags and/or declarations in the code.

Typed ("strongly typed") it is, but is it statically typed or 
dynamically typed?
("Softly typed" doesn't capture it well enough - if it's declarations in 
the code, then those parts of the code are statically typed.)

> You miss some of the other benefits of static typing,
> though, such as a richer type system -- soft typing often lacks
> features like polymorphism (it will find a set of monomorphic
> instances rather than the most general type) and type classes.

That's not a property of soft typing per se, it's a consequence of 
tacking on type inference on a dynamically-typed language that wasn't 
designed for allowing strong type guarantees.

Regards,
Jo
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150902275.129780.50260@p79g2000cwp.googlegroups.com>
Joachim Durchholz wrote:
>
> Hmm... I think this distinction doesn't cover all cases.
>
> Assume a language that
> a) defines that a program is "type-correct" iff HM inference establishes
> that there are no type errors
> b) compiles a type-incorrect program anyway, with an established,
> rigorous semantics for such programs (e.g. by throwing exceptions as
> appropriate).
> The compiler might actually refuse to compile type-incorrect programs,
> depending on compiler flags and/or declarations in the code.
>
> Typed ("strongly typed") it is, but is it statically typed or
> dynamically typed?

I think what this highlights is the fact that our existing terminology
is not up to the task of representing all the possible design
choices we could make. Some parts of dynamic vs. static
are mutually exclusive; some parts are orthogonal. Maybe
we have reached the point where trying to cram everything
into one of two possible ways of doing things isn't going
to cut it any more.

Could it be that the US two-party system has influenced our
thinking?</joke>


Marshall
From: Chris Smith
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <MPG.1f033f2c90b5a97e9896d4@news.altopia.net>
Marshall <···············@gmail.com> wrote:
> I think what this highlights is the fact that our existing terminology
> is not up to the task of representing all the possible design
> choices we could make. Some parts of dynamic vs. static
> are mutually exclusive; some parts are orthogonal.

Really?  I can see that in a strong enough static type system, many 
dynamic typing features would become unobservable and therefore would be 
pragmatically excluded from any probable implementations... but I don't 
see any other kind of mutual exclusion between the two.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150986901.299913.139200@m73g2000cwd.googlegroups.com>
Pascal Costanza wrote:
>
> A statically typed language requires you to think about two models of
> your program at the same time: the static type model and the dynamic
> behavioral model. A static type system ensures that these two
> _different_ (that's important!) perspectives are always in sync. This is
> especially valuable in settings where you know your domain well and want
> to rely on feedback by your compiler that you haven't made any mistakes
> in encoding your knowledge. (A static type system based on type
> inferencing doesn't essentially change the requirement to think in two
> models at the same time.)
>
> A dynamically typed language is especially well suited when you don't
> (yet) have a good idea about your domain and you want to use programming
> especially to explore that domain. Some static typing advocates claim
> that static typing is still suitable for exploring domains because of
> the compiler's feedback about the preliminary encoding of your
> incomplete knowledge, but the disadvantages are a) that you still have
> to think about two models at the same time when you don't even have
> _one_ model ready and b) that you cannot just run your incomplete
> program to see what it does as part of your exploration.
>
> A statically typed language with a dynamic type treats dynamic typing as
> the exception, not as the general approach, so this doesn't help a lot
> in the second setting (or so it seems to me).
>
> A language like Common Lisp treats static typing as the exception, so
> you can write a program without static types / type checks, but later on
> add type declarations as soon as you get a better understanding of your
> domain. Common Lisp implementations like CMUCL or SBCL even include
> static type inference to aid you here, which gives you warnings but
> still allows you to run a program even in the presence of static type
> errors. I guess the feedback you get from such a system is probably not
> "strong" enough to be appreciated by static typing advocates in the
> first setting (where you have a good understanding of your domain).

I am sceptical of the idea that when programming in a dynamically
typed language one doesn't have to think about both models as well.
I don't have a good model of the mental process of working
in a dynamically typed language, but how could that be the case?
(I'm not asking rhetorically.) Do you then run your program over
and over, mechanically correcting the code each time you discover
a type error? In other words, if you're not thinking of the type model,
are you using the runtime behavior of the program as an assistant,
the way I use the static analysis of the program as an assistant?

I don't accept the idea about pairing the appropriateness of each
system according to whether one is doing exploratory programming.
I do exploratory programming all the time, and I use the static type
system as an aide in doing so. Rather I think this is just another
manifestation of the differences in the mental processes between
static typed programmers and dynamic type programmers, which
we are beginning to glimpse but which is still mostly unknown.

Oh, and I also want to say that of all the cross-posted mega threads
on static vs. dynamic typing, this is the best one ever. Most info;
least flames. Yay us!


Marshall
From: Pascal Costanza
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <449BD37B.10904@p-cos.net>
Marshall wrote:

> I am sceptical of the idea that when programming in a dynamically
> typed language one doesn't have to think about both models as well.
> I don't have a good model of the mental process of working
> in a dynamically typed language, but how could that be the case?
> (I'm not asking rhetorically.) Do you then run your program over
> and over, mechanically correcting the code each time you discover
> a type error? In other words, if you're not thinking of the type model,
> are you using the runtime behavior of the program as an assistant,
> the way I use the static analysis of the program as an assistant?

Yes.

> I don't accept the idea about pairing the appropriateness of each
> system according to whether one is doing exploratory programming.
> I do exploratory programming all the time, and I use the static type
> system as an aid in doing so. Rather, I think this is just another
> manifestation of the differences in the mental processes between
> static typed programmers and dynamic type programmers, which
> we are beginning to glimpse but which is still mostly unknown.

Probably.

> Oh, and I also want to say that of all the cross-posted mega threads
> on static vs. dynamic typing, this is the best one ever. Most info;
> least flames. Yay us!

Yay! :)


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150988625.527780.148710@c74g2000cwc.googlegroups.com>
Pascal Costanza wrote:
>
> Consider a simple expression like 'a + b': In a dynamically typed
> language, all I need to have in mind is that the program will attempt to
> add two numbers. In a statically typed language, I additionally need to
> know that there must be a guarantee that a and b will always hold numbers.

I still don't really see the difference.

I would not expect that the dynamic programmer will be
thinking that this code will have two numbers most of the
time but sometimes not, and fail. I would expect that in both
static and dynamic, the thought is that that code is adding
two numbers, with the difference being the static context
gives one a proof that this is so. In this simple example,
the static case is better, but this is not free, and the cost
of the static case is evident elsewhere, but maybe not
illuminated by this example.

This thread's exploration of the mindset of the two kinds
of programmers is difficult. It is actually quite difficult
(possibly impossible) to reconstruct mental states
through introspection. Nonetheless I don't see any
other way to proceed. Pair programming?


> My goal is not to convince anyone, my goal is to illustrate for those
> who are interested in getting a possibly different perspective.

Yes, and thank you for doing so.


Marshall
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150995359.404878.316920@m73g2000cwd.googlegroups.com>
Marshall wrote:
>
> In this simple example,
> the static case is better, but this is not free, and the cost
> of the static case is evident elsewhere, but maybe not
> illuminated by this example.

Ugh, please forgive my ham-fisted use of the word "better."
Let me try again:

In this simple example, the static case provides you with
a guarantee of type safety, but this is not free, and the
cost of the static case may be evident elsewhere, even
if not illuminated by this example.


Marshall
From: Chris F Clark
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <sddhd2dvwav.fsf@shell01.TheWorld.com>
Pascal Costanza wrote:
> Consider a simple expression like 'a + b': In a dynamically typed
> language, all I need to have in mind is that the program will attempt to
> add two numbers. In a statically typed language, I additionally need to
> know that there must be a guarantee that a and b will always hold numbers.

"Marshall" <···············@gmail.com> replied:
> I would not expect that the dynamic programmer will be
> thinking that this code will have two numbers most of the
> time but sometimes not, and fail. I would expect that in both
> static and dynamic, the thought is that that code is adding
> two numbers, with the difference being the static context
> gives one a proof that this is so.

I don't think the problem is down at the atom level.  It's really
at the structure (tuple) level.  If we view 'a + b' as applying the
function + to some non-atomic objects a and b, the question is how
well we specify what a and b are.  From my limited exposure to
programs that use dynamic typing, I would surmise that developers who
use dynamic typing have very broad and open definitions of the "types"
of a and b, and they don't want the type system complaining before they
have nailed the definitions down.  And, if you read some of
the arguments in this topic, one assumes that they want to even be
able to correct the "types" of a and b at run-time, when they find out
that they have made a mistake without taking the system down.

So, for a concrete example, let b be a "file" type (either text file
or directory) and we get a whole system up and running with those two
types.  But, we discover a need for "symbolic links".  So, we want to
change the file type to have 3 sub-types.  In a dynamic type system,
because the file "type" is loose (it's operationally defined by what
function one applies to b), this isn't an issue.  As each function is
recoded to deal with the 3rd sub-type, the program becomes more
functional.  If we need to demo the program before we have worked out
how symbolic links work for some operation, it is not a problem.  Run
the application, but don't exercise that combination of type and
operation.
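
The scenario above can be sketched in a few lines. (Python is used here
purely for concreteness, and all the class and function names are
invented for the illustration.)

```python
class TextFile:
    def __init__(self, name, contents):
        self.name = name
        self.contents = contents

class Directory:
    def __init__(self, name, entries):
        self.name = name
        self.entries = entries

# Added later, without touching the two classes above:
class SymbolicLink:
    def __init__(self, name, target):
        self.name = name
        self.contents = target   # "contents" here is the target's name

def rename(f, new_name):
    # Looks only at the 'name' field, so it works for the new
    # subtype immediately, with no changes.
    f.name = new_name
    return f

def spell_check(f, dictionary):
    # Naively reads 'contents'; for a SymbolicLink this appears to
    # work but checks the target's file name instead of real text --
    # exactly the structure-level problem described above.
    return [w for w in f.contents.split() if w not in dictionary]
```

Running the program is fine as long as nothing applies spell_check to a
SymbolicLink; each function can be recoded for the third subtype at its
own pace.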

In many ways, this can also be done in languages with type inference.
I don't understand the process by which one does it, so I can't
explain it.  Perhaps someone else will--please....

Back to the example, the file type is not an atomic type, but
generally some structure/tuple/list/record/class with members or
fields.  A function which renames files by looking at only the name
field may work perfectly adequately with the new symbolic link subtype
without change because the new subtype has the same name field and
uses it the same way.  A spell-check function which works on text
files using the "contents" field might need to be recoded for symbolic
links if the contents field for that subtype is "different" (e.g. the
name of the target).  The naive spell check function might appear to
work, but really do the wrong thing (i.e. checking whether the target file
name is made of legal words).  Thus, the type problems are not at the
atomic level (where contents is a string), but at the structure level,
where one needs to know which fields have which meanings for which
subtypes.

-Chris
From: Pascal Bourguignon
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <873bdx9l8m.fsf@thalassa.informatimago.com>
"Marshall" <···············@gmail.com> writes:

> Pascal Costanza wrote:
>>
>> Consider a simple expression like 'a + b': In a dynamically typed
>> language, all I need to have in mind is that the program will attempt to
>> add two numbers. In a statically typed language, I additionally need to
>> know that there must be a guarantee that a and b will always hold numbers.
>
> I still don't really see the difference.
>
> I would not expect that the dynamic programmer will be
> thinking that this code will have two numbers most of the
> time but sometimes not, and fail. I would expect that in both
> static and dynamic, the thought is that that code is adding
> two numbers, with the difference being the static context
> gives one a proof that this is so. In this simple example,
> the static case is better, but this is not free, and the cost
> of the static case is evident elsewhere, but maybe not
> illuminated by this example.
>
> This thread's exploration of the mindset of the two kinds
> of programmers is difficult. It is actually quite difficult,
> (possibly impossible) to reconstruct mental states
> though introspection. Nonetheless I don't see any
> other way to proceed. Pair programming?

Well, this is a question of data flow.  As I explained, there's a whole
body of functions that don't concretely process the data they get.
But of course, eventually you must write a function that does some
concrete processing on the data it gets.  That's when you consider the
type of the values. Such functions may be generic functions with
methods dispatching on the actual type of the parameters, or you may
encounter some TYPECASE or COND inside the function before calling
non-abstract "primitives" that work only on some specific type.
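
The TYPECASE/COND pattern described above might look like this (a
minimal sketch in Python rather than Lisp; the function name is an
invention for illustration):

```python
def describe(x):
    # Only this low-level "primitive" inspects the concrete type;
    # everything that calls it can stay generic about its data.
    if isinstance(x, int):
        return f"integer {x}"
    elif isinstance(x, str):
        return f"string {x!r}"
    else:
        return "something else"
```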

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/

"I have challenged the entire quality assurance team to a Bat-Leth
contest.  They will not concern us again."
From: Pascal Costanza
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <4g22l7F1j8p4mU1@individual.net>
Marshall wrote:
> Pascal Costanza wrote:
>> Consider a simple expression like 'a + b': In a dynamically typed
>> language, all I need to have in mind is that the program will attempt to
>> add two numbers. In a statically typed language, I additionally need to
>> know that there must be a guarantee that a and b will always hold numbers.
> 
> I still don't really see the difference.
> 
> I would not expect that the dynamic programmer will be
> thinking that this code will have two numbers most of the
> time but sometimes not, and fail. I would expect that in both
> static and dynamic, the thought is that that code is adding
> two numbers, with the difference being the static context
> gives one a proof that this is so.

There is a third option: it may be that at the point where I am writing 
this code, I simply don't worry yet about whether a and b will always be 
numbers. In case something other than numbers pops up, I can then decide 
how to proceed from there.

> In this simple example,
> the static case is better, but this is not free, and the cost
> of the static case is evident elsewhere, but maybe not
> illuminated by this example.

Yes, maybe the example is not the best one. This kind of example, 
however, occurs quite often when programming in an object-oriented 
style, where you don't know yet what objects will and will not respond 
to a message / generic function. Even in the example above, it could be 
that you can give an appropriate definition for + for objects other than 
numbers.


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151081129.290382.297530@r2g2000cwb.googlegroups.com>
Pascal Costanza wrote:
> Marshall wrote:
> > Pascal Costanza wrote:
> >> Consider a simple expression like 'a + b': In a dynamically typed
> >> language, all I need to have in mind is that the program will attempt to
> >> add two numbers. In a statically typed language, I additionally need to
> >> know that there must be a guarantee that a and b will always hold numbers.
> >
> > I still don't really see the difference.
> >
> > I would not expect that the dynamic programmer will be
> > thinking that this code will have two numbers most of the
> > time but sometimes not, and fail. I would expect that in both
> > static and dynamic, the thought is that that code is adding
> > two numbers, with the difference being the static context
> > gives one a proof that this is so.
>
> There is a third option: it may be that at the point where I am writing
> this code, I simply don't worry yet about whether a and b will always be
> numbers. In case something other than numbers pops up, I can then
> decide how to proceed from there.

Ouch; I have a really hard time understanding this.

I can't see how you'd call + on a and b if you think they might
not be numbers. If they could be something other than numbers,
and you're treating them as if they are, is that sort of like
doing a case analysis and only filling in one of the cases?
If so, wouldn't you want to record that fact somehow?


Marshall
From: Chris Smith
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <MPG.1f05cdfeea12187a9896eb@news.altopia.net>
Marshall <···············@gmail.com> wrote:
> Ouch; I have a really hard time understanding this.
> 
> I can't see how you'd call + on a and b if you think they might
> not be numbers. If they could be something other than numbers,
> and you're treating them as if they are, is that sort of like
> doing a case analysis and only filling in one of the cases?
> If so, wouldn't you want to record that fact somehow?

The obvious answer -- I don't know if it's what Pascal meant -- is that 
they might be 4x4 matrices, or anything else that behaves predictably 
under some operation that could be called addition.  As a mathematical 
analogy, the entire mathematical field of group theory comes from the 
idea of performing operations on values without really knowing what the 
values (or the operations) really are, but only knowing a few axioms by 
which the relationship between the objects and the operation is 
constrained.

[As an interesting coincidence, group theory papers frequently use the 
addition symbol to represent the (arbitrary) binary operation of an 
Abelian group.  Not that this has anything to do with choosing "+" for 
the example here.]

Programming languages do this all the time, as well.  The most popular 
example is the OO sense of the word polymorphism.  That's all about 
being able to write code that works with a range of values regardless of 
(or, at least, a range that is less constraining than equality in) types.
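
A one-line illustration of that kind of polymorphism (a toy sketch, not
from the posts; Python is used only for concreteness):

```python
def double(x):
    # No constraint on x beyond "supports an addition-like
    # operation": numbers, strings, lists, matrices, ...
    return x + x
```

The same definition serves integers, strings, and sequences without
naming any of those types, exactly as a group-theoretic argument holds
for any structure satisfying the axioms.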

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Chris Smith
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <MPG.1f05cf3d1a88cce09896ec@news.altopia.net>
Chris Smith <·······@twu.net> wrote:
> Programming languages do this all the time, as well.  The most popular 
> example is the OO sense of the word polymorphism.  That's all about 
> being able to write code that works with a range of values regardless of 
> (or, at least, a range that is less constraining than equality in) types.

It's now dawned on me that I need not have restricted this to the OO 
sense of polymorphism.  So let me rephrase.  Polymorphism *is* when 
programming languages do that.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151100023.126405.56230@c74g2000cwc.googlegroups.com>
Chris Smith wrote:
> Marshall <···············@gmail.com> wrote:
> > Ouch; I have a really hard time understanding this.
> >
> > I can't see how you'd call + on a and b if you think they might
> > not be numbers. If they could be something other than numbers,
> > and you're treating them as if they are, is that sort of like
> > doing a case analysis and only filling in one of the cases?
> > If so, wouldn't you want to record that fact somehow?
>
> The obvious answer -- I don't know if it's what Pascal meant -- is that
> they might be 4x4 matrices, or anything else that behaves predictably
> under some operation that could be called addition.  As a mathematical
> analogy, the entire mathematical field of group theory comes from the
> idea of performing operations on values without really knowing what the
> values (or the operations) really are, but only knowing a few axioms by
> which the relationship between the objects and the operation is
> constrained.
>
> [As an interesting coincidence, group theory papers frequently use the
> addition symbol to represent the (arbitrary) binary operation of an
> Abelian group.  Not that this has anything to do with choosing "+" for
> the example here.]

Sure. This is what abstract algebra is all about; group theory is
just an example. In particular, much of Stepanov's life work
has been developing abstract algebra within the framework
of a static type system.

http://www.stepanovpapers.com/
In particular, see the paper "Greatest Common Measure, the
Last 2500 Years."


> Programming languages do this all the time, as well.  The most popular
> example is the OO sense of the word polymorphism.  That's all about
> being able to write code that works with a range of values regardless of
> (or, at least, a range that is less constraining than equality in) types.

Exactly.


Marshall
From: Duane Rettig
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <o0bqsj33eq.fsf@franz.com>
"Marshall" <···············@gmail.com> writes:

> Pascal Costanza wrote:
>> Marshall wrote:
>> > Pascal Costanza wrote:
>> >> Consider a simple expression like 'a + b': In a dynamically typed
>> >> language, all I need to have in mind is that the program will attempt to
>> >> add two numbers. In a statically typed language, I additionally need to
>> >> know that there must be a guarantee that a and b will always hold numbers.
>> >
>> > I still don't really see the difference.
>> >
>> > I would not expect that the dynamic programmer will be
>> > thinking that this code will have two numbers most of the
>> > time but sometimes not, and fail. I would expect that in both
>> > static and dynamic, the thought is that that code is adding
>> > two numbers, with the difference being the static context
>> > gives one a proof that this is so.
>>
>> There is a third option: it may be that at the point where I am writing
>> this code, I simply don't worry yet about whether a and b will always be
>> numbers. In case something other than numbers pops up, I can then
>> decide how to proceed from there.
>
> Ouch; I have a really hard time understanding this.
>
> I can't see how you'd call + on a and b if you think they might
> not be numbers. If they could be something other than numbers,
> and you're treating them as if they are, is that sort of like
> doing a case analysis and only filling in one of the cases?
> If so, wouldn't you want to record that fact somehow?

But suppose + were polymorphic and someone (dynamically) added a
method on + to handle strings?  If the type error were that a and b
happened to be strings, there would be at least two courses of action
in the development of the program: limit a and b so that they are
never strings at this point, or define a method on + to add correct
functionality for the + operator.  Note that I use the term "correct",
because the correctness of the subprogram (including any added methods
on +) depends both on the correctness of usage within the language,
and also correctness of extensibilities allowed by the language for
the purpose of satisfying goals of the program.  In the many cases
where the eventual goals of the program are not known at the beginning
of the design, this extensibility is a good thing to have.

Note that the above paragraph explicitly leaves out Common Lisp,
because + is not extensible in CL.  CL does, however, allow for methods
to be defined on functions not already defined by the standard, so
the argument could still apply to some generic function named my-+,
or even on my-package::+ where the CL defined + operator is shadowed.
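
The extensible-operator idea can be sketched with a generic function.
(Python's singledispatch stands in for CLOS-style dispatch here; my_add
is an invented name playing the role of the my-+ above.)

```python
from functools import singledispatch

@singledispatch
def my_add(a, b):
    # No method for this type yet: the dynamic analogue of a type error.
    raise TypeError(f"no my_add method for {type(a).__name__}")

@my_add.register
def _(a: int, b):
    return a + b

# Added later, while the system is running, to "correct" the type
# error for strings instead of forbidding them:
@my_add.register
def _(a: str, b):
    return a + b
```

Whether adding the string method is the correct course of action, or
whether strings should be excluded upstream, is exactly the design
decision discussed in the paragraph above.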

-- 
Duane Rettig    ·····@franz.com    Franz Inc.  http://www.franz.com/
555 12th St., Suite 1450               http://www.555citycenter.com/
Oakland, Ca. 94607        Phone: (510) 452-2000; Fax: (510) 452-0182   
From: Darren New
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <gcVmg.3875$MF6.3310@tornado.socal.rr.com>
Marshall wrote:
> I can't see how you'd call + on a and b if you think they might
> not be numbers. 

Now substitute "<" for "+" and see if you can make the same argument. :-)


-- 
   Darren New / San Diego, CA, USA (PST)
     Native Americans used every part
     of the buffalo, including the wings.
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151099579.205144.145130@r2g2000cwb.googlegroups.com>
Darren New wrote:
> Marshall wrote:
> > I can't see how you'd call + on a and b if you think they might
> > not be numbers.
>
> Now substitute "<" for "+" and see if you can make the same argument. :-)

If your point is about overloading, then I don't see how it affects my
point. My point was, I don't see why you'd be writing code using
operators that you thought might not be applicable to the operands.


Marshall
From: Darren New
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <rRZmg.20658$uy3.7348@tornado.socal.rr.com>
Marshall wrote:
> point. My point was, I don't see why you'd be writing code using
> operators that you thought might not be applicable to the operands.

I see. I thought your point was that you might write code for which you 
didn't know the type of the arguments. Unless you equate "type" with what 
Smalltalk calls "protocol" (i.e., the type is the collection of 
operators applicable to values in the type), these are two different 
statements.

-- 
   Darren New / San Diego, CA, USA (PST)
     Native Americans used every part
     of the buffalo, including the wings.
From: Duane Rettig
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <o07j37339p.fsf@franz.com>
Darren New <····@san.rr.com> writes:

> Marshall wrote:
>> I can't see how you'd call + on a and b if you think they might
>> not be numbers.
>
> Now substitute "<" for "+" and see if you can make the same argument. :-)

Sure.  Properly extended to handle strings, "abc" < "def" might return
true.

-- 
Duane Rettig    ·····@franz.com    Franz Inc.  http://www.franz.com/
555 12th St., Suite 1450               http://www.555citycenter.com/
Oakland, Ca. 94607        Phone: (510) 452-2000; Fax: (510) 452-0182   
From: Andreas Rossberg
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7echr$8r8i0$3@hades.rz.uni-saarland.de>
Pascal Bourguignon wrote:
> 
> For example, sort doesn't need to know what type the objects it sorts
> are.  It only needs to be given a function that is able to compare the
> objects.

Sure. That's why any decent type system supports polymorphism of this 
sort. (And some of them can even infer which comparison function to pass 
for individual calls, so that the programmer does not have to bother.)
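
What such a polymorphic sort looks like, sketched in Python with type
variables (sort_with and its signature are inventions for illustration;
languages like ML or Haskell would infer this type automatically):

```python
from typing import Callable, TypeVar

T = TypeVar("T")

def sort_with(items: list[T], le: Callable[[T, T], bool]) -> list[T]:
    # Insertion sort parameterized over the comparison, so the
    # element type T never has to be named concretely.
    out: list[T] = []
    for x in items:
        i = 0
        while i < len(out) and le(out[i], x):
            i += 1
        out.insert(i, x)
    return out
```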

- Andreas
From: Pascal Bourguignon
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <87ac859m1p.fsf@thalassa.informatimago.com>
Matthias Blume <····@my.address.elsewhere> writes:

> Pascal Bourguignon <···@informatimago.com> writes:
>
>> Moreover, a good proportion of the program and a good number of
>> algorithms don't even need to know the type of the objects they
>> manipulate.
>>
>> For example, sort doesn't need to know what type the objects it sorts
>> are.  It only needs to be given a function that is able to compare the
>> objects.
>
> Of course, some statically typed languages handle this sort of thing
> routinely.
>
>> Only a few "primitive" functions need specific types.
>
> Your sort function from above also has a specific type -- a type which
> represents the fact that the objects to be sorted must be acceptable
> input to the comparison function.

Well, not exactly.  sort is a higher level function. The type of its
arguments is an implicit parameter of the sort function.

     (sort "Hello World"  (function char<=))
     --> " HWdellloor"

     (sort '(52 12 42 37) (function <=))
     --> (12 37 42 52)

     (sort (list (make-instance 'person               :name "Pascal")
                 (make-instance 'unit                 :name "Pascal")
                 (make-instance 'programming-language :name "Pascal"))
           (lambda (a b) (string<= (class-name (class-of a))
                                   (class-name (class-of b)))))
     --> (#<PERSON #x205763FE>
          #<PROGRAMMING-LANGUAGE #x205765BE>
          #<UNIT #x205764DE>)


In Common Lisp, sort is specified to take a parameter of type SEQUENCE
= (or vector list) (if a list, it should be a proper list),
and a function taking two arguments (of any type)
and returning a generalized boolean (that is, anything can be returned;
NIL is false, anything else is true).


So you could say that:

   (sort (sequence element-type)
         (function (element-type element-type) boolean))
    --> (sequence element-type)

but element-type is not a direct parameter of sort, and can change for
all calls, even at the same call point:

(mapcar (lambda (s) (sort s (lambda (a b) (<= (sxhash a) (sxhash b)))))
        (list (vector 52 12 42 37)
              (list   52 12 42 37)
              (list "abc" 'def (make-instance 'person :name "Zorro") 76)))
--> (#(12 37 42 52)
      (12 37 42 52)
      (76 #<PERSON #x2058D496> DEF "abc"))


>> So basically, you've got a big black box of application code in the
>> middle that doesn't care what type of value they get, and you've got a
>> few input values of a specific type, a few processing functions
>> needing a specific type and returning a specific type, and a few
>> output values that are expected to be of a specific type.  At anytime,
>> you may change the type of the input values, and ensure that the
>> needed processing functions will be able to handle this new input
>> type, and the output gets mapped to the expected type.
>
> ...or you type-check your "black box" and make sure that no matter how
> you will ever change the type of the inputs (in accordance with the
> interface type of the box) you get a valid program.

When?  At run-time?  All the modifications I spoke of can be done at
run-time in Lisp.

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
The mighty hunter
Returns with gifts of plump birds,
Your foot just squashed one.
From: Matthias Blume
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <m1odwlkuc2.fsf@hana.uchicago.edu>
Pascal Bourguignon <···@informatimago.com> writes:

> Matthias Blume <····@my.address.elsewhere> writes:
>
>> Pascal Bourguignon <···@informatimago.com> writes:
>>
>>> Moreover, a good proportion of the program and a good number of
>>> algorithms don't even need to know the type of the objects they
>>> manipulate.
>>>
>>> For example, sort doesn't need to know what type the objects it sorts
>>> are.  It only needs to be given a function that is able to compare the
>>> objects.
>>
>> Of course, some statically typed languages handle this sort of thing
>> routinely.
>>
>>> Only a few "primitive" functions need specific types.
>>
>> Your sort function from above also has a specific type -- a type which
>> represents the fact that the objects to be sorted must be acceptable
>> input to the comparison function.
>
> Well, not exactly.

What do you mean by "not exactly".

>  sort is a higher level function. The type of its
> arguments is an implicit parameter of the sort function.

What do you mean by "higher-level"? Maybe you meant "higher-order" or
"polymorphic"?

[ rest snipped ]

You might want to look up "System F".

>>> So basically, you've got a big black box of application code in the
>>> middle that doesn't care what type of value they get, and you've got a
>>> few input values of a specific type, a few processing functions
>>> needing a specific type and returning a specific type, and a few
>>> output values that are expected to be of a specific type.  At anytime,
>>> you may change the type of the input values, and ensure that the
>>> needed processing functions will be able to handle this new input
>>> type, and the output gets mapped to the expected type.
>>
>> ...or you type-check your "black box" and make sure that no matter how
>> you will ever change the type of the inputs (in accordance with the
>> interface type of the box) you get a valid program.
>
> When?

When what?
From: Pascal Bourguignon
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <87y7vp85sc.fsf@thalassa.informatimago.com>
Matthias Blume <····@my.address.elsewhere> writes:

> Pascal Bourguignon <···@informatimago.com> writes:
>
>> Matthias Blume <····@my.address.elsewhere> writes:
>>
>>> Pascal Bourguignon <···@informatimago.com> writes:
>>>
>>>> Moreover, a good proportion of the program and a good number of
>>>> algorithms don't even need to know the type of the objects they
>>>> manipulate.
>>>>
>>>> For example, sort doesn't need to know what type the objects it sorts
>>>> are.  It only needs to be given a function that is able to compare the
>>>> objects.
>>>
>>> Of course, some statically typed languages handle this sort of thing
>>> routinely.
>>>
>>>> Only a few "primitive" functions need specific types.
>>>
>>> Your sort function from above also has a specific type -- a type which
>>> represents the fact that the objects to be sorted must be acceptable
>>> input to the comparison function.
>>
>> Well, not exactly.
>
> What do you mean by "not exactly".
>
>>  sort is a higher level function. The type of its
>> arguments is an implicit parameter of the sort function.
>
> What do you mean by "higher-level"? Maybe you meant "higher-order" or
> "polymorphic"?

Yes, that's what I wanted to say.

> [ rest snipped ]
>
> You might want to look up "System F".
> [...]
>>> ...or you type-check your "black box" and make sure that no matter how
>>> you will ever change the type of the inputs (in accordance with the
>>> interface type of the box) you get a valid program.
>>
>> When?
>
> When what?

When will you type-check the "black box"?

A function such as:

(defun f (x y)
   (if (g x)
      (h x y)
      (i y x)))

in the context of a given program could be type-inferred statically as
taking an integer and a string as arguments.  If the compiler did this
inference, it could perhaps generate code specific to these types.

But it's always possible that at run-time new functions and new
function calls will be generated, such as:

(let ((x "two"))
  (eval `(defmethod g ((self ,(type-of x))) t))
  (eval `(defmethod h ((x ,(type-of x)) (y string)) 
           (,(intern (format nil "DO-SOMETHING-WITH-A-~A" (type-of x))) x) 
           (do-something-with-a-string y)))
  (funcall (compile nil `(let ((x ,x)) (lambda () (f x "Hi!"))))))

Will you execute the whole type inference on the whole program "black
box" every time you define a new function?  Will you recompile all the
"black box" functions to take into account the new type the arguments
can be now?

This wouldn't be too efficient.  Let's just say that by default, all
arguments and variables are of type T, so the type checking is trivial,
and the generated code is, by default, totally generic.


Only the few concrete, low-level functions need to know the types of
their arguments and variables.  In these functions, either the lisp
compiler will do the type inference (starting from the predefined
primitives), or the programmer will declare the types to inform the
compiler what to expect.

(defun do-something-with-a-string (x)
  (declare (string x))
  ...)

(defun do-something-with-a-integer (x)
  (declare (integer x))
  ...)

...

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/

PUBLIC NOTICE AS REQUIRED BY LAW: Any use of this product, in any
manner whatsoever, will increase the amount of disorder in the
universe. Although no liability is implied herein, the consumer is
warned that this process will ultimately lead to the heat death of
the universe.
From: David Hopwood
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1PEmg.472643$xt.199547@fe3.news.blueyonder.co.uk>
Pascal Bourguignon wrote:
> But it's always possible at run-time that new functions and new
> function calls be generated such as:
> 
> (let ((x "two"))
>   (eval `(defmethod g ((self ,(type-of x))) t))
>   (eval `(defmethod h ((x ,(type-of x)) (y string)) 
>            (,(intern (format nil "DO-SOMETHING-WITH-A-~A" (type-of x))) x) 
>            (do-something-with-a-string y)))
>   (funcall (compile nil `(let ((x ,x)) (lambda () (f x "Hi!"))))))
> 
> Will you execute the whole type-inference on the whole program "black
> box" everytime you define a new function?  Will you recompile all the
> "black box" functions to take into account the new type the arguments
> can be now?

Yes, why not?

> This wouldn't be too efficient.

It's rare, so it doesn't need to be efficient. 'eval' is inherently
inefficient, anyway.

-- 
David Hopwood <····················@blueyonder.co.uk>
From: David Hopwood
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <5GEmg.208020$8W1.13213@fe1.news.blueyonder.co.uk>
Pascal Bourguignon wrote:
> Pascal Costanza <··@p-cos.net> writes:
>>Andreas Rossberg wrote:
>>>Pascal Costanza wrote:
>>>
>>>>Consider a simple expression like 'a + b': In a dynamically typed
>>>>language, all I need to have in mind is that the program will
>>>>attempt to add two numbers. In a statically typed language, I
>>>>additionally need to know that there must be a guarantee that a and b
>>>>will always hold numbers.
>>>
>>>I'm confused. Are you telling that you just write a+b in your
>>>programs without trying to ensure that a and b are in fact numbers??
>>
>>Basically, yes.
>>
>>Note that this is a simplistic example. Consider, instead, sending a
>>message to an object, or calling a generic function, without ensuring
>>that there will be applicable methods for all possible cases. When I
>>get a "message not understood" exception, I can then decide whether
>>that kind of object shouldn't be a receiver in the first place, or
>>else whether I should define an appropriate method. I don't want to be
>>forced to decide this upfront, because either I don't want to be
>>bothered, or maybe I simply can't because I don't understand the
>>domain well enough yet, or maybe I want to keep a hook to be able to
>>update the program appropriately while it is running.
> 
> Moreover, a good proportion of the program and a good number of
> algorithms don't even need to know the type of the objects they
> manipulate.
> 
> For example, sort doesn't need to know what type the objects it sorts
> are.  It only needs to be given a function that is able to compare the
> objects.

But this is true also in a statically typed language with parametric
polymorphism.
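The point about parametric polymorphism can be sketched concretely (a hypothetical illustration in Python's optional type-hint syntax, not code from the thread): a sort written against a supplied comparison function never inspects the element type, and a checker can verify it generically.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")  # the element type stays abstract


def sort_by(items: List[T], less: Callable[[T, T], bool]) -> List[T]:
    """Insertion sort that relies only on the supplied comparison."""
    result: List[T] = []
    for item in items:
        i = 0
        while i < len(result) and less(result[i], item):
            i += 1
        result.insert(i, item)
    return result


print(sort_by([3, 1, 2], lambda a, b: a < b))    # works on numbers
print(sort_by(["b", "a"], lambda a, b: a < b))   # and on strings
```

The same definition serves every element type, exactly as `sort` in a dynamically typed language would, but the type variable `T` lets a static checker confirm that the comparison and the elements agree.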

[...]
> Why should adding a few functions or methods, and providing input
> values of a new type be rejected from a statically checked  point of
> view by a compiled program that would be mostly bit-for-bit the same
> with or without this new type?

It usually wouldn't be -- adding methods in a typical statically typed
OO language is unlikely to cause type errors (unless there is a naming
conflict, in some cases). Nor would adding new types or new functions.

(*Using* new methods without declaring them would cause an error, yes.)

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Chris Smith
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <MPG.1f033ed6670cd7239896d3@news.altopia.net>
Joachim Durchholz <··@durchholz.org> wrote:
> Assume a language that
> a) defines that a program is "type-correct" iff HM inference establishes 
> that there are no type errors
> b) compiles a type-incorrect program anyway, with an established 
> rigorous semantics for such programs (e.g. by throwing exceptions as 
> appropriate).

So the compiler now attempts to prove theorems about the program, but 
once it has done so it uses the results merely to optimize its runtime 
behavior and then throws the results away.  I'd call that not a 
statically typed language, then.  The type-checking behavior is actually 
rather irrelevant both to the set of valid programs of the language, and 
to the language semantics (since the same could be accomplished without 
the type checking).  It is only relevant to performance.  Obviously, the 
language probably qualifies as dynamically typed for most common 
definitions of that term, but I'm not ready to accept one definition and 
claim to understand it, yet, so I'll be cautious about classifying the 
language.

> The compiler might actually refuse to compile type-incorrect programs, 
> depending on compiler flags and/or declarations in the code.

Then those compiler flags would cause the compiler to accept a different 
language, and that different language would be a statically typed 
language (by which I don't mean to exclude the possibility of its also 
being dynamically typed).

> Typed ("strongly typed") it is, but is it statically typed or 
> dynamically typed?

So my answer is that it's not statically typed in the first case, and is 
statically typed in the second case, and it intuitively appears to be 
dynamically typed at least in the first, and possibly in the second as 
well.

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Joe Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150998222.352746.65520@i40g2000cwc.googlegroups.com>
Chris Smith wrote:
> Joachim Durchholz <··@durchholz.org> wrote:
> > Assume a language that
> > a) defines that a program is "type-correct" iff HM inference establishes
> > that there are no type errors
> > b) compiles a type-incorrect program anyway, with an established
> > rigorous semantics for such programs (e.g. by throwing exceptions as
> > appropriate).
>
> So the compiler now attempts to prove theorems about the program, but
> once it has done so it uses the results merely to optimize its runtime
> behavior and then throws the results away.  I'd call that not a
> statically typed language, then.

You're assuming that type-correctness is an all-or-nothing property
(well, technically it *is*, but bear with me).  What if the compiler is
unable to prove a theorem about the entire program, but *can* prove a
theorem about a subset of the program?  The theorems would naturally be
conditional, e.g.  `Provided the input is an integer, the program is
type-safe', or time-bounded, e.g. `Until the program attempts to invoke
function FOO, the program is type-safe.'

Of course, we could encode that by restricting the type of the input
and everything would be copacetic, but suppose there is a requirement
that floating point numbers are valid input.  For some reason, our
program is not type-safe for floats, but as a developer who is working
on the integer math routines, I have no intention of running that code.
 The compiler cannot prove that I won't perversely enter a float, but
it can prove that if I enter an integer everything is type-safe.  I can
therefore run, debug, and use a subset of the program.
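This scenario can be mimicked in any dynamically checked language (a hypothetical Python sketch, not code from the thread): the integer routine is sound and usable today, the float routine is broken, and nothing prevents running and debugging the working subset.

```python
def int_double(n):
    return n + n            # the mature, working integer routine


def float_double(x):
    return x + "!"          # broken: fails on any float it ever receives


def double(v):
    # 'Provided the input is an integer, the program is type-safe.'
    if isinstance(v, int):
        return int_double(v)
    return float_double(v)  # reached only if someone perversely enters a float


print(double(21))  # 42 -- the integer subset runs fine today
```

A whole-program checker would reject this outright because of `float_double`; the conditional theorem lets the developer keep working on the part that is provably fine.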

That's the important point:  I want to run broken code.  I want to run
as much of the working fragments as I can, and I want a `safety net' to
prevent me from performing undefined operations, but I want the safety
net to catch me at the *last* possible moment.  I'm not playing it safe
and staying where the compiler can prove I'll be ok.  I'm living
dangerously and wandering near the edge where the compiler can't quite
prove that I'll fail.

Once I've done the major amount of exploratory programming, I may very
well want to tighten things up.  I'd like to have the compiler prove
that some mature library is type-safe and that all callers to the
library use it in a type-correct manner.  I'd like the compiler to say
`Hey, did you know that if the user enters a floating point number
instead of his name that your program will crash and burn?', but I
don't want that sort of information until I'm pretty sure about what I
want the program to do in the normal case.
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151026856.064429.36460@c74g2000cwc.googlegroups.com>
Joe Marshall wrote:
>
> That's the important point:  I want to run broken code.

I want to make sure I understand. I can think of several things
you might mean by this. It could be:
1) I want to run my program, even though I know parts of it
are broken, because I think there are parts that are not broken
and I want to try them out.
2) I want to run my program, even though it is broken, and I
want to run right up to a broken part and trap there, so I can
use the runtime facilities of the language to inspect what's
going on.


> I want to run
> as much of the working fragments as I can, and I want a `safety net' to
> prevent me from performing undefined operations, but I want the safety
> net to catch me at the *last* possible moment.

This statement is interesting, because the conventional wisdom (at
least as I'm used to hearing it) is that it is best to catch bugs
at the *first* possible moment. But I think maybe we're talking
about different continua here. The last last last possible moment
is after the software has shipped to the customer, and I'm pretty
sure that's not what you mean. I think maybe you mean something
more like 2) above.


Marshall
From: Joe Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151079774.429512.68840@c74g2000cwc.googlegroups.com>
Marshall wrote:
> Joe Marshall wrote:
> >
> > That's the important point:  I want to run broken code.
>
> I want to make sure I understand. I can think of several things
> you might mean by this. It could be:
> 1) I want to run my program, even though I know parts of it
> are broken, because I think there are parts that are not broken
> and I want to try them out.

I think I mean a little of each of these.

Nearly *every* program I have ever used is buggy, so this is actually a
normal state of affairs even when I'm not debugging or developing.

> 2) I want to run my program, even though it is broken, and I
> want to run right up to a broken part and trap there, so I can
> use the runtime facilities of the language to inspect what's
> going on.

I do this quite often.  Sometimes I'll develop `in the debugger'.  I'll
change some piece of code and run the program until it traps.  Then,
without exiting the debugger, I'll fix the immediate problem and
restart the program at the point it trapped.  This technique takes a
bit of practice, but if you are working on something that's complex and
has a lot of state, it saves a lot of time because you don't have to
reconstruct the state every time you make a change.

> > I want to run
> > as much of the working fragments as I can, and I want a `safety net' to
> > prevent me from performing undefined operations, but I want the safety
> > net to catch me at the *last* possible moment.
>
> This statement is interesting, because the conventional wisdom (at
> least as I'm used to hearing it) is that it is best to catch bugs
> at the *first* possible moment. But I think maybe we're talking
> about different continua here. The last last last possible moment
> is after the software has shipped to the customer, and I'm pretty
> sure that's not what you mean. I think maybe you mean something
> more like 2) above.

It is often the case that the proximate cause of a bug is nowhere near
where the fix needs to be applied.  This is especially true with the
really hard bugs.  A static type system will tell me that there (may)
exist a path through the code that results in an error, but the runtime
check that fails will provide a witness.
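The witness idea can be made concrete (a hypothetical Python sketch, not code from the thread): a static analysis could only warn that some path may pass a bad key, while the failing dynamic check hands back the exact offending value.

```python
def index_all(items, keys):
    """Look up each key; a dynamic check traps on the first bad key."""
    out = []
    for k in keys:
        if not isinstance(k, int):
            # The runtime failure carries a witness: the offending value.
            raise TypeError(f"non-integer key: {k!r}")
        out.append(items[k])
    return out


try:
    index_all(["a", "b", "c"], [0, 2, 1.5])
except TypeError as e:
    print(e)  # names the concrete input that triggered the failure
```

The exception message, plus the stack at the trap point, is precisely the witness that helps track the fault back to wherever the bad key was produced.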
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151098655.828408.286630@y41g2000cwy.googlegroups.com>
Joe Marshall wrote:
> Marshall wrote:
> > Joe Marshall wrote:
> > >
> > > That's the important point:  I want to run broken code.
> >
> > I want to make sure I understand. I can think of several things
> > you might mean by this. It could be:
> > 1) I want to run my program, even though I know parts of it
> > are broken, because I think there are parts that are not broken
> > and I want to try them out.
>
> I think I mean a little of each of these.
>
> Nearly *every* program I have ever used is buggy, so this is actually a
> normal state of affairs even when I'm not debugging or developing.

That makes some sense to me.

It seems quite reasonable, as a language feature, to be
able to say to the computer: I know there are a few type
errors; just run this damn program as best you can.


> > 2) I want to run my program, even though it is broken, and I
> > want to run right up to a broken part and trap there, so I can
> > use the runtime facilities of the language to inspect what's
> > going on.
>
> I do this quite often.  Sometimes I'll develop `in the debugger'.  I'll
> change some piece of code and run the program until it traps.  Then,
> without exiting the debugger, I'll fix the immediate problem and
> restart the program at the point it trapped.  This technique takes a
> bit of practice, but if you are working on something that's complex and
> has a lot of state, it saves a lot of time because you don't have to
> reconstruct the state every time you make a change.

Wow, interesting.

(I say the following only to point out differing strategies of
development, not to say one or the other is right or bad or
whatever.)

I occasionally find myself doing the above, and when I do,
I take it as a sign that I've screwed up. I find that process
excruciating, and I do everything I can to avoid it. Over the
years, I've gotten more and more adept at trying to turn
as many bugs as possible into type errors, so I don't have
to run the debugger.

Now, there might be an argument to be made that if I had
been using a dynamic language, the process wouldn't be
so bad, and I wouldn't dislike it so much. But maybe:

As a strawman: suppose there are two broad categories
of mental strategies for thinking about bugs, and people
fall naturally into one way or the other, the way there
are morning people and night people. One group is
drawn to the static analysis, one group hates it.
One group hates working in the debugger, one group
is able to use that tool very effectively and elegantly.

Anyway, it's a thought.


> > > I want to run
> > > as much of the working fragments as I can, and I want a `safety net' to
> > > prevent me from performing undefined operations, but I want the safety
> > > net to catch me at the *last* possible moment.
> >
> > This statement is interesting, because the conventional wisdom (at
> > least as I'm used to hearing it) is that it is best to catch bugs
> > at the *first* possible moment. But I think maybe we're talking
> > about different continua here. The last last last possible moment
> > is after the software has shipped to the customer, and I'm pretty
> > sure that's not what you mean. I think maybe you mean something
> > more like 2) above.
>
> It is often the case that the proximate cause of a bug is nowhere near
> where the fix needs to be applied.  This is especially true with the
> really hard bugs.  A static type system will tell me that there (may)
> exist a path through the code that results in an error, but the runtime
> check that fails will provide a witness.

Ah, the witness! I see the witness idea as supporting my thesis
that some people are using formal existential reasoning as a
tool for informal universal reasoning.

Very interesting. I want to again emphasize that I am not
interested in labelling one or the other of these approaches
as better or worse; rather, I want to understand both of
them, particularly to see if it is possible to have a single
language that incorporate both approaches.

This is not to say that I don't find one approach more appealing
than the other; rather it is to assert that I can tell the difference
between my personal preference and Truth. (Sometimes.)


Marshall
From: David Hopwood
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <mAjng.465989$tc.228623@fe2.news.blueyonder.co.uk>
Marshall wrote:
> Joe Marshall wrote:
> 
>>Marshall wrote:
> 
>>>2) I want to run my program, even though it is broken, and I
>>>want to run right up to a broken part and trap there, so I can
>>>use the runtime facilities of the language to inspect what's
>>>going on.
>>
>>I do this quite often.  Sometimes I'll develop `in the debugger'.  I'll
>>change some piece of code and run the program until it traps.  Then,
>>without exiting the debugger, I'll fix the immediate problem and
>>restart the program at the point it trapped.  This technique takes a
>>bit of practice, but if you are working on something that's complex and
>>has a lot of state, it saves a lot of time because you don't have to
>>reconstruct the state every time you make a change.

The problem with this is that from that point on, what you're running
is neither the old nor the new program, since its state may have been
influenced by the bug before you corrected it.

I find it far simpler to just restart the program after correcting
anything. If this is too difficult, I change the design to make it
less difficult.

> Wow, interesting.
> 
> (I say the following only to point out differing strategies of
> development, not to say one or the other is right or bad or
> whatever.)
> 
> I occasionally find myself doing the above, and when I do,
> I take it as a sign that I've screwed up. I find that process
> excruciating, and I do everything I can to avoid it. Over the
> years, I've gotten more and more adept at trying to turn
> as many bugs as possible into type errors, so I don't have
> to run the debugger.
> 
> Now, there might be an argument to be made that if I had
> been using a dynamic language, the process wouldn't be
> so bad, and I wouldn't dislike it so much. But maybe:
> 
> As a strawman: suppose there are two broad categories
> of mental strategies for thinking about bugs, and people
> fall naturally into one way or the other, the way there
> are morning people and night people. One group is
> drawn to the static analysis, one group hates it.
> One group hates working in the debugger, one group
> is able to use that tool very effectively and elegantly.
> 
> Anyway, it's a thought.

I don't buy this -- or at least, I am not in either group.

A good debugger is invaluable regardless of your attitude to type
systems. Recently I was debugging two programs written to do similar
things in the same statically typed language (C), but where a debugger
could not be used for one of the programs. It took maybe 5 times
longer to find and fix each bug without the debugger, and I found it
a much more frustrating experience.

-- 
David Hopwood <····················@blueyonder.co.uk>
From: Joe Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151341232.468150.254980@c74g2000cwc.googlegroups.com>
David Hopwood wrote:
> > Joe Marshall wrote:
> >>
> >>I do this quite often.  Sometimes I'll develop `in the debugger'.  I'll
> >>change some piece of code and run the program until it traps.  Then,
> >>without exiting the debugger, I'll fix the immediate problem and
> >>restart the program at the point it trapped.  This technique takes a
> >>bit of practice, but if you are working on something that's complex and
> >>has a lot of state, it saves a lot of time because you don't have to
> >>reconstruct the state every time you make a change.
>
> The problem with this is that from that point on, what you're running
> is neither the old nor the new program, since its state may have been
> influenced by the bug before you corrected it.

Yes.  The hope is that it is closer to the new program than to the old.

> I find it far simpler to just restart the program after correcting
> anything. If this is too difficult, I change the design to make it
> less difficult.

In the most recent case where I was doing this, I was debugging
transaction rollback that involved multiple database extents.  It was
somewhat painful to set up a clean database to the point where I wanted
to try the rollback, and I wanted a pristine database for each trial so
I could examine the raw bits left by a rollback.  It was pretty easy to
deal with simple errors in the debugger, so I chose to do that instead.

>
> > Wow, interesting.
> >
> > (I say the following only to point out differing strategies of
> > development, not to say one or the other is right or bad or
> > whatever.)
> >
> > I occasionally find myself doing the above, and when I do,
> > I take it as a sign that I've screwed up. I find that process
> > excruciating, and I do everything I can to avoid it. Over the
> > years, I've gotten more and more adept at trying to turn
> > as many bugs as possible into type errors, so I don't have
> > to run the debugger.
> >
> > Now, there might be an argument to be made that if I had
> > been using a dynamic language, the process wouldn't be
> > so bad, and I wouldn't dislike it so much. But maybe:
> >
> > As a strawman: suppose there are two broad categories
> > of mental strategies for thinking about bugs, and people
> > fall naturally into one way or the other, the way there
> > are morning people and night people. One group is
> > drawn to the static analysis, one group hates it.
> > One group hates working in the debugger, one group
> > is able to use that tool very effectively and elegantly.
> >
> > Anyway, it's a thought.
>
> I don't buy this -- or at least, I am not in either group.
>
> A good debugger is invaluable regardless of your attitude to type
> systems. Recently I was debugging two programs written to do similar
> things in the same statically typed language (C), but where a debugger
> could not be used for one of the programs. It took maybe 5 times
> longer to find and fix each bug without the debugger, and I found it
> a much more frustrating experience.
> 
> -- 
> David Hopwood <····················@blueyonder.co.uk>
From: Ole Nielsby
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <44a19c52$0$15784$14726298@news.sunsite.dk>
David Hopwood <...nospam....uk> wrote:

> A good debugger is invaluable regardless of your attitude
> to type systems.

I found that certain language features help greatly in pinning
down the errors when programming in my own impure fp language
(PILS).

Originally, I implemented a single-stepping debugger for the
language, but I trashed it for three reasons, of which two are
rather uninteresting here: it messed up the GUI system, and the
evaluation order of PILS made single-stepping a confusing
experience.

The interesting reason is: I didn't need single-stepping, partly
because the programming system is unit-test friendly, partly
because of language features that seem to mimic the concept
of taking responsibility.

Basically, the PILS execution mechanism is pessimistic. When
a function is called, the assumption is it won't work, and the
only way the function can return a result is by throwing an
exception. So, if a function call doesn't throw an exception,
it has failed and execution will stop unless the function call
is explicitly marked as fault tolerant, as in

    f (x) else "undefined"

This has been merged with tail call flattening - rather than
throwing the result, an (expression, binding) thing is thrown
(I guess .NET'ers would call it an anonymous parameterless
delegate), and the expression is then evaluated after the throw.
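The "return by throwing" convention can be mimicked in a mainstream language (a hypothetical Python sketch, not PILS itself; all names are invented for illustration): success is delivered through an exception, a plain return signals failure, and a failing call stops execution unless the call site opts into fault tolerance.

```python
class Success(Exception):
    """Carries a successful result out of a function."""
    def __init__(self, value):
        self.value = value


def call(f, *args, default=None, tolerant=False):
    """Invoke f, which 'returns' by raising Success; a plain return means failure."""
    try:
        f(*args)
    except Success as s:
        return s.value
    if tolerant:
        return default          # the `f (x) else "undefined"` form
    raise RuntimeError(f"{f.__name__} failed")  # guts exposed in public


def halve(x):
    if x % 2 == 0:
        raise Success(x // 2)   # taking responsibility: the job got done
    # falling through (a plain return) means the match failed


print(call(halve, 10))                                      # 5
print(call(halve, 7, default="undefined", tolerant=True))   # undefined
```

The sketch captures the pessimistic stance: a function that merely returns has not earned its result, and only an explicit `tolerant` call site gets things done without anyone taking responsibility.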

Mimickally speaking, when a function body performs this
throw, it takes responsibility for getting the job done, and
if it fails, it will suffer the peril of having its guts exposed
in public by the error message window - much like certain
medieval penal systems. I will spare you the gory details;
just let me add that to honor the demands of modern
bureaucracy, I added a "try the other office" feature so that
there are ways of getting things done without ever taking
responsibility.

Strangely, this all started 20 years ago as I struggled
implementing an early PILS version on an 8-bit Z80
processor, at a blazing 4 MHz. I needed a fast way of
escaping from a failing pattern match, and it turned
out the conditional return operation of this particular
processor was the fastest I could get. Somehow, the
concept of "if it returns, it's bad" crept into the language
design. I was puzzled by its expressiveness and the ease
of using it; only years later did I realize that its ease
of use came from the mimicking of responsibility.

I'm still puzzled that the notion of mimicking a cultural/
social concept came from a microprocessor instruction.
It seems like I - and perhaps computer language designers
in general - have a blind spot. We dig physical objects and
mathematical theories, and don't take hints from the
"soft sciences"...

Perhaps I'm over-generalizing, it just strikes me that the
dispute always seems to stand between math versus object
orientation, with seemingly no systematic attempts at
utilizing insights from the study of languages and cultural
concepts. It's as if COBOL and VB scared us from
ever considering the prospect of making programming
languages readable to the uninitiated.
 
From: Pascal Costanza
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <4g23hgF1l4434U1@individual.net>
Marshall wrote:
> Joe Marshall wrote:
>> That's the important point:  I want to run broken code.
> 
> I want to make sure I understand. I can think of several things
> you might mean by this. It could be:
> 1) I want to run my program, even though I know parts of it
> are broken, because I think there are parts that are not broken
> and I want to try them out.
> 2) I want to run my program, even though it is broken, and I
> want to run right up to a broken part and trap there, so I can
> use the runtime facilities of the language to inspect what's
> going on.
> 
> 
>> I want to run
>> as much of the working fragments as I can, and I want a `safety net' to
>> prevent me from performing undefined operations, but I want the safety
>> net to catch me at the *last* possible moment.
> 
> This statement is interesting, because the conventional wisdom (at
> least as I'm used to hearing it) is that it is best to catch bugs
> at the *first* possible moment. But I think maybe we're talking
> about different continua here. The last last last possible moment
> is after the software has shipped to the customer, and I'm pretty
> sure that's not what you mean. I think maybe you mean something
> more like 2) above.

Nowadays, we have more options wrt what it means to "ship" code. It 
could be that your program simply runs as a (web) service to which you 
have access even after the customer has started to use the program. See 
http://www.paulgraham.com/road.html for a good essay on this idea.


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: Neelakantan Krishnaswami
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <slrne9lvbv.r45.neelk@gs3106.sp.cs.cmu.edu>
In article <·······················@i40g2000cwc.googlegroups.com>, Joe
Marshall wrote:
> 
> That's the important point:  I want to run broken code.  I want to run
> as much of the working fragments as I can, and I want a `safety net' to
> prevent me from performing undefined operations, but I want the safety
> net to catch me at the *last* possible moment.  I'm not playing it safe
> and staying where the compiler can prove I'll be ok.  I'm living
> dangerously and wandering near the edge where the compiler can't quite
> prove that I'll fail.

Hi Joe,

How do you write programs? Specifically, how do you write and debug
higher-order programs that involve lots of combinators (eg, code
that's partially CPS-converted, or in state-passing style, and also
uses maps and folds)?

The reason I ask is that I see that there are Scheme programmers that
manage to do this successfully. However, I switched to ML because I
just couldn't get that kind of code right without having type errors
to guide me. 

Since people like you and Matthias and Shriram obviously *can* write
this kind of code, I'm curious what your strategies are.

-- 
Neel Krishnaswami
·····@cs.cmu.edu
From: Ketil Malde
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <egy7vkz0k9.fsf@polarvier.ii.uib.no>
Chris Smith <·······@twu.net> writes:

> Joachim Durchholz <··@durchholz.org> wrote:
>> Assume a language that
>> a) defines that a program is "type-correct" iff HM inference establishes 
>> that there are no type errors
>> b) compiles a type-incorrect program anyway, with an established 
>> rigorous semantics for such programs (e.g. by throwing exceptions as 
>> appropriate).

> So the compiler now attempts to prove theorems about the program, but 
> once it has done so it uses the results merely to optimize its runtime 
> behavior and then throws the results away. 

Hmm... here's my compiler front end (for Haskell or other languages
that ..er.. define 'undefined'):

  while(type error) {
    replace an incorrectly typed function with 'undefined'
  }

Wouldn't the resulting program still be statically typed?

In practice I prefer to (and do) define troublesome functions as
'undefined' manually during development.
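The same trick is easy to mimic outside Haskell (a hypothetical Python sketch, not code from the thread): stub out the functions that don't type-check yet, and the rest of the program runs until a stub is actually reached.

```python
def undefined(*_args, **_kwargs):
    # Analogue of Haskell's `undefined`: crash only if actually evaluated.
    raise NotImplementedError("reached an undefined function")


# The troublesome function is stubbed out during development...
parse_config = undefined


def greet(name):
    return f"hello, {name}"


print(greet("world"))      # the working fragment still runs
# parse_config("app.ini")  # ...and only this call would trap
```

As in Ketil's front end, the substitution preserves well-typedness of everything else while deferring the failure of the untyped fragment to the moment it is used.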

-k
-- 
If I haven't seen further, it is by standing in the footprints of giants
From: Pascal Costanza
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <4ffdntF1iei4hU1@individual.net>
Torben Ægidius Mogensen wrote:
> Pascal Costanza <··@p-cos.net> writes:
> 
>> Torben Ægidius Mogensen wrote:
>>
>>> On a similar note, is a statically typed language more or less
>>> expressive than a dynamically typed language?  Some would say less, as
>>> you can write programs in a dynamically typed language that you can't
>>> compile in a statically typed language (without a lot of encoding),
>>> whereas the converse isn't true.
>> It's important to get the levels right here: A programming language
>> with a rich static type system is more expressive at the type level,
>> but less expressive at the base level (for some useful notion of
>> expressiveness ;).
>>
>>> However, I think this is misleading,
>>> as it ignores the feedback issue: It takes longer for the average
>>> programmer to get the program working in the dynamically typed
>>> language.
>> This doesn't seem to capture what I hear from Haskell programmers who
>> say that it typically takes quite a while to convince the Haskell
>> compiler to accept their programs. (They perceive this to be
>> worthwhile because of some benefits wrt correctness they claim to get
>> in return.)
> 
> That's the point: Bugs that in dynamically typed languages would
> require testing to find are found by the compiler in a statically
> typed language.

Yes. However, unfortunately statically typed languages also reject 
programs that don't have such bugs. It's a tradeoff whether you want to 
spend time to deal with them or not.

> So while it may take longer to get a program that gets
> past the compiler, it takes less time to get a program that works.

That's incorrect. See http://haskell.org/papers/NSWC/jfp.ps - especially 
Figure 3.


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: fireblade
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150489731.537704.241070@c74g2000cwc.googlegroups.com>
I don't know anything about Perl and little about Python so I skipped
those parts.

However, your Example: Symbolic Computation rings some bells.
The concept of a program that can write programs, especially one that
can manipulate and create variables, arrived to me far before
I'd heard about Lisp and its macros.
So the concept of expressing symbols is something natural, contained in
Lisp, that other languages are suppressing.

Graham stated this more clearly than I, that programming in one
language
makes you think in that language.

I've read a bunch of tips from C/C++/Java book like:
1. Avoid macros.
2. Avoid functions that do very complicated things and they look naive?

In the opposite is the Lisp with it's wishfull thinking and almost
unlimited way of thinking and implementing your ideas

Somebody said that Lisp is like  a lever it helps you achieve something
that
you were afraid to even think  to do it in a lesser languages.

Actually Lisp is more like a pulley.
A lot of strength for heavy lifting
and enaphe rope to hang yourself .

the choice is all yours.
cheers
bobi
From: Torben Ægidius Mogensen
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <7zwtbh8h59.fsf@app-1.diku.dk>
Pascal Costanza <··@p-cos.net> writes:

> Torben Ægidius Mogensen wrote:

> > So while it may take longer to get a program that gets
> > past the compiler, it takes less time to get a program that works.
> 
> That's incorrect. See http://haskell.org/papers/NSWC/jfp.ps -
> especially Figure 3.

There are many differences between these languages other than static
vs. dynamic typing, and some of these differences are likely to be more
significant.  What you need to test is languages with similar features
and syntax, except one is statically typed and the other dynamically
typed.

And since these languages would be quite similar, you can use the same
test persons: First let one half solve a problem in the statically
typed language and the other half the same problem in the dynamically
typed language, then swap for the next problem.  If you let a dozen
persons each solve half a dozen problems, half in the statically typed
language and half in the dynamically typed language (using different
splits for each problem), you might get a useful figure.
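For concreteness, here is one minimal balanced assignment, sketched in
Python for illustration (the parity scheme is the editor's own example,
not Torben's proposal; his "different splits for each problem" could be
any balanced design):

```python
# Hypothetical crossover assignment: 12 test persons, 6 problems,
# two languages (statically vs. dynamically typed).
PERSONS, PROBLEMS = 12, 6

def language(person, problem):
    # Even index sum -> statically typed language, odd -> dynamically
    # typed; the split alternates from one problem to the next.
    return "static" if (person + problem) % 2 == 0 else "dynamic"

# Each problem is solved by half the persons in each language, and each
# person uses each language on half of the problems.
for q in range(PROBLEMS):
    assert sum(language(p, q) == "static" for p in range(PERSONS)) == PERSONS // 2
for p in range(PERSONS):
    assert sum(language(p, q) == "static" for q in range(PROBLEMS)) == PROBLEMS // 2
print("balanced")
```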

        Torben
From: Pascal Costanza
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <4ffulsF1ie6ifU1@individual.net>
Torben Ægidius Mogensen wrote:
> Pascal Costanza <··@p-cos.net> writes:
> 
>> Torben Ægidius Mogensen wrote:
> 
>>> So while it may take longer to get a program that gets
>>> past the compiler, it takes less time to get a program that works.
>> That's incorrect. See http://haskell.org/papers/NSWC/jfp.ps -
>> especially Figure 3.
> 
> There are many differences between these languages other than static
> vs. dynamic typing, and some of these differences are likely to be more
> significant.  What you need to test is languages with similar features
> and syntax, except one is statically typed and the other dynamically
> typed.
> 
> And since these languages would be quite similar, you can use the same
> test persons: First let one half solve a problem in the statically
> typed language and the other half the same problem in the dynamically
> typed language, then swap for the next problem.  If you let a dozen
> persons each solve half a dozen problems, half in the statically typed
> language and half in the dynamically typed language (using different
> splits for each problem), you might get a useful figure.

...and until then claims about the influence of static type systems on 
the speed with which you can implement working programs are purely 
guesswork. That's the only point I need to make to show that your 
original unqualified statement, namely that it takes less time to get a 
program that works, is incorrect.


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: Neelakantan Krishnaswami
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <slrne98hlu.phm.neelk@gs3106.sp.cs.cmu.edu>
In article <···············@individual.net>, Pascal Costanza wrote:
> 
> ...and until then claims about the influence of static type systems
> on the speed with which you can implement working programs are
> purely guesswork. That's the only point I need to make to show that
> your original unqualified statement, namely that it takes less time
> to get a program that works, is incorrect.

No, they're more than guesswork. You can have perfectly good
explanations without p-values.

For example, I switched over from using Dylan and Scheme to using ML,
because I wanted to program in a very higher-order style. I found that
I couldn't effectively write these kinds of programs in a dynamically
typed setting, because I had trouble localizing errors. That is, when
you build up functions from combinators, the time at which you
construct an incorrect function can be very far from the place where
you actually use that function. And worse, the stack trace you get is
usually unhelpful, since it contains things like '<anonymous call>'
ten times over. However, almost all of my errors were caught at the
site of the error with static type checking, because the function
signatures didn't match up.

Conversely, someone who tells me that he implemented a lazy patching
service to a CL program on top of CHANGE-CLASS to add functionality to
a service that can't afford any downtime is also making a perfectly
intelligible non-statistical claim.

I mean, yeah, we have to rely on our judgement and experience to
evaluate the relative importance of such explanations, but really our
capacity to do that is what distinguishes us from the tools we build.



-- 
Neel Krishnaswami
·····@cs.cmu.edu
From: Matthew D Swank
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <pan.2006.06.17.19.54.24.884809@c.net>
On Sat, 17 Jun 2006 18:15:26 +0000, Neelakantan Krishnaswami wrote:

> In article <···············@individual.net>, Pascal Costanza wrote:
>> 
>> ...and until then claims about the influence of static type systems
>> on the speed with which you can implement working programs are
>> purely guesswork. That's the only point I need to make to show that
>> your original unqualified statement, namely that it takes less time
>> to get a program that works, is incorrect.
> 
> No, they're more than guesswork. You can have perfectly good
> explanations without p-values.
> 
> For example, I switched over from using Dylan and Scheme to using ML,
> because I wanted to program in a very higher-order style. I found that
> I couldn't effectively write these kinds of programs in a dynamically
> typed setting, because I had trouble localizing errors. That is, when
> you build up functions from combinators, the time at which you
> construct an incorrect function can be very far from the place where
> you actually use that function. And worse, the stack trace you get is
> usually unhelpful, since it contains things like '<anonymous call>'
> ten times over. However, almost all of my errors were caught at the
> site of the error with static type checking, because the function
> signatures didn't match up.
> 

I also struggle with this in Common Lisp.  CLOS (especially funcallable
instances) can help, but the overhead of such objects is significant if
I am creating millions of short-lived closures.  It is not necessarily a
static vs. dynamic typing issue, but it does seem more expensive to
create and track (higher-order) function type information at run-time.  Is
this a fundamental limitation of dynamically typed languages or just
(Common) Lisp?


Matt
"You do not really understand something unless you can
 explain it to your grandmother." — Albert Einstein.
From: Neelakantan Krishnaswami
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <slrne99juq.rko.neelk@gs3106.sp.cs.cmu.edu>
In article <······························@c.net>, Matthew D Swank wrote:
> On Sat, 17 Jun 2006 18:15:26 +0000, Neelakantan Krishnaswami wrote:
>> 
>> For example, I switched over from using Dylan and Scheme to using
>> ML, because I wanted to program in a very higher-order style. I
>> found that I couldn't effectively write these kinds of programs in
>> a dynamically typed setting, because I had trouble localizing
>> errors. That is, when you build up functions from combinators, the
>> time at which you construct an incorrect function can be very far
>> from the place where you actually use that function.
> 
> I also struggle with this in Common Lisp.  CLOS (especially
> funcallable instances) can help, but the overhead of such objects
> is significant if I am creating millions of short-lived closures.  It
> is not necessarily a static vs. dynamic typing issue, but it does
> seem more expensive to create and track (higher-order) function
> type information at run-time.  Is this a fundamental limitation of
> dynamically typed languages or just (Common) Lisp?

I think the main issue for development is that you want the blame for
the error to be in the right place -- the error message should tell
you what was the cause of the error, and not just where the error
happened. This is hard in a functional language, because combinators
delay the usage of a function. 

Eg, if you have a compose function:

(define (compose f g) 
  (lambda (x) (f (g x))))

and you write (compose map +), then the call to compose will succeed,
even though the returned function will fail when it is called (due to
map not getting enough arguments). 
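The same delayed-blame effect can be sketched in Python (an editor's
analogue, not Neel's code; `len` and `abs` stand in for `map` and `+`):

```python
def compose(f, g):
    """Return the composition of f and g; nothing is checked here."""
    return lambda x: f(g(x))

# Building the broken composition succeeds silently:
h = compose(len, abs)   # abs yields a number, but len wants a sequence

# The TypeError only appears later, at the call site, far from where
# the incorrect function was actually constructed:
try:
    h(-3)
except TypeError as e:
    print("failed at use, not at construction:", e)
```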

However, if you use a contract system like the one DrScheme has, then
the runtime will have enough info to successfully locate the source of
a runtime error. I haven't used DrScheme's contracts, but I bet they
would address many of those ease-of-development issues.

The main difficulty with contract checking, IMO, is that it can change
the space complexity of your program. Functional programs rely on
tail-call optimization to run in constant stack space. However, if you
need to check the post-condition of a call, then your recursive calls
can go from being in tail position to being in non-tail position,
which can shift space usage from O(1) to O(N). Eg:

(define (foo x) (returns: blah?) ;; totally made-up syntax
   ...
   (foo expr))

becomes

(define (foo x)
   ...
   (let ((result (foo expr)))
      (if (blah? result)
          result
          (raise-appropriate-error))))

I think this is fundamental to doing runtime postcondition checks, but
it's possible there's some clever trick I missed.
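The shape of that transformation can be sketched in Python (which has no
tail-call optimization anyway, but the structural point still shows: the
recursive call's result must come back through the checking frame;
`returns` here is a made-up helper, not a real library):

```python
import functools

def returns(pred):
    """Hypothetical postcondition contract: check every return value."""
    def deco(fn):
        @functools.wraps(fn)
        def wrapper(*args):
            result = fn(*args)        # the recursive call's result must
            if not pred(result):      # return here to be checked...
                raise ValueError("postcondition failed for %r" % (result,))
            return result             # ...so the call is no longer in
        return wrapper                # tail position
    return deco

@returns(lambda r: r >= 0)
def countdown(n):
    if n == 0:
        return 0
    return countdown(n - 1)  # recurses through the checking wrapper

print(countdown(5))  # -> 0
```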


-- 
Neel Krishnaswami
·····@cs.cmu.edu
From: Pascal Costanza
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <4flj1aF1j6f35U2@individual.net>
Matthew D Swank wrote:
> On Sat, 17 Jun 2006 18:15:26 +0000, Neelakantan Krishnaswami wrote:
> 
>> In article <···············@individual.net>, Pascal Costanza wrote:
>>> ...and until then claims about the influence of static type systems
>>> on the speed with which you can implement working programs are
>>> purely guesswork. That's the only point I need to make to show that
>>> your original unqualified statement, namely that it takes less time
>>> to get a program that works, is incorrect.
>> No, they're more than guesswork. You can have perfectly good
>> explanations without p-values.
>>
>> For example, I switched over from using Dylan and Scheme to using ML,
>> because I wanted to program in a very higher-order style. I found that
>> I couldn't effectively write these kinds of programs in a dynamically
>> typed setting, because I had trouble localizing errors. That is, when
>> you build up functions from combinators, the time at which you
>> construct an incorrect function can be very far from the place where
>> you actually use that function. And worse, the stack trace you get is
>> usually unhelpful, since it contains things like '<anonymous call>'
>> ten times over. However, almost all of my errors were caught at the
>> site of the error with static type checking, because the function
>> signatures didn't match up.
>>
> 
> I also struggle with this in Common Lisp.  CLOS (especially funcallable
> instances) can help, but the overhead of such objects is significant if
> I am creating millions of short-lived closures.  It is not necessarily a
> static vs. dynamic typing issue, but it does seem more expensive to
> create and track (higher-order) function type information at run-time.  Is
> this a fundamental limitation of dynamically typed languages or just
> (Common) Lisp?

Have you tried something like this?

(defmacro named-lambda (name (&rest lambda-list) &body body)
   (if *debug*
      `(flet ((,name ,lambda-list ,@body)) #',name)
      `(lambda ,lambda-list ,@body)))


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: Matthew D Swank
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <pan.2006.06.19.01.15.38.950317@c.net>
On Sun, 18 Jun 2006 20:06:34 +0200, Pascal Costanza wrote:

> Matthew D Swank wrote:
>> On Sat, 17 Jun 2006 18:15:26 +0000, Neelakantan Krishnaswami wrote:
>> 
>>> In article <···············@individual.net>, Pascal Costanza wrote:
>>>> ...and until then claims about the influence of static type systems
>>>> on the speed with which you can implement working programs are
>>>> purely guesswork. That's the only point I need to make to show that
>>>> your original unqualified statement, namely that it takes less time
>>>> to get a program that works, is incorrect.
>>> No, they're more than guesswork. You can have perfectly good
>>> explanations without p-values.
>>>
>>> For example, I switched over from using Dylan and Scheme to using ML,
>>> because I wanted to program in a very higher-order style. I found that
>>> I couldn't effectively write these kinds of programs in a dynamically
>>> typed setting, because I had trouble localizing errors. That is, when
>>> you build up functions from combinators, the time at which you
>>> construct an incorrect function can be very far from the place where
>>> you actually use that function. And worse, the stack trace you get is
>>> usually unhelpful, since it contains things like '<anonymous call>'
>>> ten times over. However, almost all of my errors were caught at the
>>> site of the error with static type checking, because the function
>>> signatures didn't match up.
>>>
>> 
>> I also struggle with this in Common Lisp.  CLOS (especially funcallable
>> instances) can help, but the overhead of such objects is significant if
>> I am creating millions of short-lived closures.  It is not necessarily a
>> static vs. dynamic typing issue, but it does seem more expensive to
>> create and track (higher-order) function type information at run-time.  Is
>> this a fundamental limitation of dynamically typed languages or just
>> (Common) Lisp?
> 
> Have you tried something like this?
> 
> (defmacro named-lambda (name (&rest lambda-list) &body body)
>    (if *debug*
>       `(flet ((,name ,lambda-list ,@body)) #',name)
>       `(lambda ,lambda-list ,@body)))
> 
> 
> Pascal

That is useful in debugging. However, I was being a little imprecise in my
complaint.  The other major reason (previously omitted) that I wrap
functions in CLOS instances is so I can use generic functions.  What would
be nice is some limited deftype support in CLOS.  I realize that this is
almost the same thing as asking for predicate dispatch. 

What I had in mind though, is something I'll call "tag dispatch": the
ability to define union types by simply adding objects to a weak container.

Something like:

(defmacro define-tag-type (name &optional qualifier)
  (let ((pred-name (gensym))
        (val-name (gensym)))
    `(let ((members (make-weak-collection)))
       (defun ,pred-name (val)
         (and (weak-collection-member val members) t))
       (deftype ,name ()
         (list 'satisfies ',pred-name))
       (defmethod make-instance ((class (eql ',name)) &rest args)
         (let ((,val-name (getf args :val)))
           ,@(if qualifier 
                 `((unless (funcall ,qualifier ,val-name)
                     (error "~a is not a valid ~a" ,val-name ',name))
                   (add-if-not-member ,val-name members)
                   ,val-name)
                 `((add-if-not-member ,val-name members)
                   ,val-name))))
       ',name)))

And, then, be able to use these types as method specializers.

This is part of the reason I've been slowly making my way through AMOP.
However, weak containers aren't standard, or even well supported across
implementations.*

Matt

*Am I right in assuming that using standard containers (lists, hashes,
etc) will cause members of the collection never to be garbage collected?
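The footnote's question can be checked by experiment; here is a Python
sketch (an editor's illustration, with `weakref.WeakSet` standing in for
the hypothetical `make-weak-collection`; a dummy class is used because
many builtins such as ints and strings can't be weakly referenced):

```python
import gc
import weakref

class Member:
    """Dummy member class: instances support weak references."""

strong = set()               # an ordinary container...
weak = weakref.WeakSet()     # ...and a weak one

a, b = Member(), Member()
strong.add(a)
weak.add(b)

del a, b                     # drop the only outside references
gc.collect()

print(len(strong))  # -> 1: the ordinary set pins its member forever
print(len(weak))    # -> 0: the weak collection let its member be collected
```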

-- 
"You do not really understand something unless you can
 explain it to your grandmother." — Albert Einstein.
From: Pascal Bourguignon
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <878xntsuny.fsf@thalassa.informatimago.com>
Matthew D Swank <·······································@c.net> writes:
> That is useful in debugging. However, I was being a little imprecise in my
> complaint.  The other major reason (previously omitted) that I wrap
> functions in CLOS instances is so I can use generic functions.  What would
> be nice is some limited deftype support in CLOS.  I realize that this is
> almost the same thing as asking for predicate dispatch. 
>
> What I had in mind though, is something I'll call "tag dispatch": the
> ability to define union types by simply adding objects to a weak container.
>
> Something like:
>
> (defmacro define-tag-type (name &optional qualifier)
>   (let ((pred-name (gensym))
>          (val-name (gensym)))
>     `(let ((members (make-weak-collection)))
>        (defun ,pred-name (val)
>          (and (weak-collection-member val members) t))
>        (deftype ,name ()
>          (list 'satisfies ',pred-name))
>        (defmethod make-instance ((class (eql ',name)) &rest args)
>          (let ((,val-name (getf args :val)))
>            ,@(if qualifier 
>                  `((unless (funcall ,qualifier ,val-name)
>                      (error "~a is not a valid ~a" ,val-name ',name))
>                    (add-if-not-member ,val-name members)
>                    ,val-name)
>                  `((add-if-not-member ,val-name members)
>                    ,val-name))))
>        ',name)))
>
> And, then, be able to use these types in method qualifiers.  
>
> This part of the reason I've been slowly making my way through AMOP. 
> However, weak-containers aren't standard, or even well supported across
> implementations.*

Have a look at closer-weak at:
http://www.informatimago.com/develop/lisp/index.html#clext


> *Am I right in assuming that using standard containers (lists, hashes,
> etc) will cause members of the collection never to be garbage collected?

Yes, of course.


-- 
__Pascal Bourguignon__                     http://www.informatimago.com/

The world will now reboot.  don't bother saving your artefacts.
From: Matthew D Swank
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <pan.2006.06.19.21.24.08.822649@c.net>
On Mon, 19 Jun 2006 05:15:13 +0200, Pascal Bourguignon wrote:

> Have a look at closer-weak at:
> http://www.informatimago.com/develop/lisp/index.html#clext

Thanks!

-- 
"You do not really understand something unless you can
 explain it to your grandmother." — Albert Einstein.
From: Pascal Costanza
Subject: Predicate dispatch, was Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <4fo1oeF1jiad6U1@individual.net>
Matthew D Swank wrote:
> On Sun, 18 Jun 2006 20:06:34 +0200, Pascal Costanza wrote:
> 
>> Matthew D Swank wrote:
>>> On Sat, 17 Jun 2006 18:15:26 +0000, Neelakantan Krishnaswami wrote:
>>>
>>>> In article <···············@individual.net>, Pascal Costanza wrote:
>>>>> ...and until then claims about the influence of static type systems
>>>>> on the speed with which you can implement working programs are
>>>>> purely guesswork. That's the only point I need to make to show that
>>>>> your original unqualified statement, namely that it takes less time
>>>>> to get a program that works, is incorrect.
>>>> No, they're more than guesswork. You can have perfectly good
>>>> explanations without p-values.
>>>>
>>>> For example, I switched over from using Dylan and Scheme to using ML,
>>>> because I wanted to program in a very higher-order style. I found that
>>>> I couldn't effectively write these kinds of programs in a dynamically
>>>> typed setting, because I had trouble localizing errors. That is, when
>>>> you build up functions from combinators, the time at which you
>>>> construct an incorrect function can be very far from the place where
>>>> you actually use that function. And worse, the stack trace you get is
>>>> usually unhelpful, since it contains things like '<anonymous call>'
>>>> ten times over. However, almost all of my errors were caught at the
>>>> site of the error with static type checking, because the function
>>>> signatures didn't match up.
>>>>
>>> I also struggle with this in Common Lisp.  CLOS (especially funcallable
>>> instances) can help, but the overhead of such objects is significant if
>>> I am creating millions of short-lived closures.  It is not necessarily a
>>> static vs. dynamic typing issue, but it does seem more expensive to
>>> create and track (higher-order) function type information at run-time.  Is
>>> this a fundamental limitation of dynamically typed languages or just
>>> (Common) Lisp?
>> Have you tried something like this?
>>
>> (defmacro named-lambda (name (&rest lambda-list) &body body)
>>    (if *debug*
>>       `(flet ((,name ,lambda-list ,@body)) #',name)
>>       `(lambda ,lambda-list ,@body)))
>>
>>
>> Pascal
> 
> That is useful in debugging. However, I was being a little imprecise in my
> complaint.  The other major reason (previously omitted) that I wrap
> functions in CLOS instances is so I can use generic functions.  What would
> be nice is some limited deftype support in CLOS.  I realize that this is
> almost the same thing as asking for predicate dispatch. 
> 
> What I had in mind though, is something I'll call "tag dispatch": the
> ability to define union types by simply adding objects to a weak container.
> 
> Something like:
> 
> (defmacro define-tag-type (name &optional qualifier)
>   (let ((pred-name (gensym))
>          (val-name (gensym)))
>     `(let ((members (make-weak-collection)))
>        (defun ,pred-name (val)
>          (and (weak-collection-member val members) t))
>        (deftype ,name ()
>          (list 'satisfies ',pred-name))
>        (defmethod make-instance ((class (eql ',name)) &rest args)
>          (let ((,val-name (getf args :val)))
>            ,@(if qualifier 
>                  `((unless (funcall ,qualifier ,val-name)
>                      (error "~a is not a valid ~a" ,val-name ',name))
>                    (add-if-not-member ,val-name members)
>                    ,val-name)
>                  `((add-if-not-member ,val-name members)
>                    ,val-name))))
>        ',name)))
> 
> And, then, be able to use these types in method qualifiers.  
> 
> This part of the reason I've been slowly making my way through AMOP. 
> However, weak-containers aren't standard, or even well supported across
> implementations.*

That shouldn't be too hard to implement with the CLOS MOP. The hard part 
with predicate dispatch is to determine an order in which predicates are 
checked. For example, you can consider dispatching on classes as a 
predicate dispatch on the predicate typep. What you get for free when 
dispatching on classes is that the class hierarchy gives you a "natural" 
order of specificity. Note that eql specializers are specified to always 
be more specific than their classes - exactly to make it possible to 
still rely on such a "natural" specificity.

The problem with general predicate dispatch is that in general, you 
cannot sort predicates in a meaningful way. (Consider predicates for 
prime and odd numbers - due to the presence of 2 in the list of prime 
numbers, prime is not "naturally" more specific than odd, only almost. ;)

However, what you describe seems to be solvable by specifying that 
members of collections are more specific than their respective classes 
(similar to eql specializers). So it should be possible to make this 
work, at least in principle.
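That resolution rule can be sketched in Python (all names invented by
the editor): membership predicates get a more specific rank than class
predicates, mirroring how eql specializers always beat their classes.

```python
# A method is (rank, applicability predicate, implementation); lower rank
# means more specific. Rank 0: member of the collection; rank 1: instance
# of the class -- so collection membership always wins, like an eql
# specializer beating its class.
special = {2, 3, 5, 7}

methods = [
    (0, lambda x: x in special,       lambda x: "collection method"),
    (1, lambda x: isinstance(x, int), lambda x: "class method"),
]

def dispatch(x):
    # Try predicates in order of specificity; first applicable one wins.
    for _rank, applicable, impl in sorted(methods, key=lambda m: m[0]):
        if applicable(x):
            return impl(x)
    raise TypeError("no applicable method for %r" % (x,))

print(dispatch(3))   # -> collection method
print(dispatch(4))   # -> class method
```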


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: Pascal Costanza
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <4flipeF1j6f35U1@individual.net>
Neelakantan Krishnaswami wrote:
> In article <···············@individual.net>, Pascal Costanza wrote:
>> ...and until then claims about the influence of static type systems
>> on the speed with which you can implement working programs are
>> purely guesswork. That's the only point I need to make to show that
>> your original unqualified statement, namely that it takes less time
>> to get a program that works, is incorrect.
> 
> No, they're more than guesswork. You can have perfectly good
> explanations without p-values.
> 
> For example, I switched over from using Dylan and Scheme to using ML,
> because I wanted to program in a very higher-order style. I found that
> I couldn't effectively write these kinds of programs in a dynamically
> typed setting, because I had trouble localizing errors. That is, when
> you build up functions from combinators, the time at which you
> construct an incorrect function can be very far from the place where
> you actually use that function. And worse, the stack trace you get is
> usually unhelpful, since it contains things like '<anonymous call>'
> ten times over. However, almost all of my errors were caught at the
> site of the error with static type checking, because the function
> signatures didn't match up.
> 
> Conversely, someone who tells me that he implemented a lazy patching
> service to a CL program on top of CHANGE-CLASS to add functionality to
> a service that can't afford any downtime is also making a perfectly
> intelligible non-statistical claim.
> 
> I mean, yeah, we have to rely on our judgement and experience to
> evaluate the relative importance of such explanations, but really our
> capacity to do that is what distinguishes us from the tools we build.

I was worried about Torben's original unqualified statement. I am not at 
all against making informed decisions. (My impression wrt debugging is 
that Common Lisp implementations are generally more helpful than Scheme 
ones - but I could simply be not experienced enough with Scheme. I don't 
know about Dylan in this regard.)


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: Dr.Ruud
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e6ueor.1bk.1@news.isolution.nl>
Torben Ægidius Mogensen wrote:

> Bugs that in dynamically typed languages would
> require testing to find are found by the compiler in a statically
> typed language.  So whil[e ]it may take [l]onger to get a program
> that[ ]gets past the compiler, it takes less time to get a program
> that works.

If it were that simple, I would say: compile time type inference is the
only way to go.

-- 
Affijn, Ruud

"Gewoon is een tijger."
From: genea
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1150470822.647000.323290@i40g2000cwc.googlegroups.com>
Torben Ægidius Mogensen wrote:

> There are several aspects relevant to this issue, some of which are:
>  - Compactness: How much do I have to type to do what I want?
 ......
>  - Naturality: How much effort does it take to convert the concepts of
>    my problem into the concepts of the language?
>  - Feedback: Will the language provide sensible feedback when I write
>    nonsensical things?
>  - Reuse: How much effort does it take to reuse/change code to solve a
>    similar problem?
.....

I am fairly new to Haskell, but the compactness of the language and the
way you can express a lot in a very small amount of real estate are
very important. I used to program a lot in Forth back in the '80s -
okay, I'm a dinosaur! - but "good" definitions were usually very short
and sweet. Unicon/Icon, which I used {still do!} in the imperative
world, is also very compact. I will give an example that covers
compactness and reuse, and that, because of static typing, will give
back type-mismatch info when you load a new "a" or "xs" into it to
form a new function.
-- What is happening below: a is replaced with the curried function
-- ((*) 3) and xs with the list [1..6]. When that definition of f is
-- used, it is type-checked to see whether the elements of xs match
-- the type of a, so if this were compiled, it would be checked and
-- guaranteed to work.

Prelude> let f a xs = putStr $ foldr (++) "\n" $ map (((++) "\n") . show . a) xs
Prelude> :t f
f :: forall a b. (Show b) => (a -> b) -> [a] -> IO ()
Prelude> f ((*) 3) [1..6]
3
6
9
12
15
18
Prelude>

Another substitution of parameters, using the same definition of f:
the polymorphic parameters allowed in Haskell add to the versatility
and reusability angle:

Prelude> f sqrt  [0.5,1.0..4]
0.7071067811865476
1.0
1.224744871391589
1.4142135623730951
1.5811388300841898
1.7320508075688772
1.8708286933869707
2.0

Same function "f", now used with a different type:
[0.5,1.0..4] :: forall a. (Fractional a, Enum a) => [a]

I don't know - this just makes programming fun, for me anyway, and if
it is fun, it is expressive. I've heard this said about Ruby and
Unicon, to name a few - some would say Python - but it really applies
to the functional languages too. With all their strict typing, the
type inference mechanisms mean it usually isn't that big a deal. If
you just get into it and can learn to take some constructive criticism
from your compiler, well hey, it is a really patient teacher. You
might get frustrated at times, but the compiler will happily remind
you of the same type mismatches until you get a handle on some
concept, and never once complain.
Happy Programming to all!
-- gene
From: Pascal Costanza
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <4esnesF1gag05U2@individual.net>
Xah Lee wrote:
> in March, i posted a essay “What is Expressiveness in a Computer
> Language”, archived at:
> http://xahlee.org/perl-python/what_is_expresiveness.html
> 
> I was informed then that there is a academic paper written on this
> subject.
> 
> On the Expressive Power of Programming Languages, by Matthias
> Felleisen, 1990.
> http://www.ccs.neu.edu/home/cobbe/pl-seminar-jr/notes/2003-sep-26/expressive-slides.pdf
> 
> Has anyone read this paper? And, would anyone be interested in giving a
> summary?

The paper itself has a good summary.


Pascal

-- 
3rd European Lisp Workshop
July 3 - Nantes, France - co-located with ECOOP 2006
http://lisp-ecoop06.bknr.net/
From: PofN
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1149877726.498412.294850@y43g2000cwc.googlegroups.com>
Xah Lee wrote:
[the usual off-topic trolling stuff]

Shit, da troll is back. Abuse reports need to go to
abuse [] pacbell.net and abuse [] swbell.net this time.
From: Kaz Kylheku
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1149877938.754797.119750@g10g2000cwb.googlegroups.com>
Xah Lee wrote:
> Has anyone read this paper? And, would anyone be interested in giving a
> summary?

Not you, of course. Too busy preparing the next diatribe against UNIX,
Perl, etc. ;)
From: Claudio Grondi
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <e7f0l7$jlt$1@newsreader2.netcologne.de>
Xah Lee wrote:
> in March, i posted a essay “What is Expressiveness in a Computer
> Language”, archived at:
> http://xahlee.org/perl-python/what_is_expresiveness.html
> 
> I was informed then that there is a academic paper written on this
> subject.
> 
> On the Expressive Power of Programming Languages, by Matthias
> Felleisen, 1990.
> http://www.ccs.neu.edu/home/cobbe/pl-seminar-jr/notes/2003-sep-26/expressive-slides.pdf
> 
> Has anyone read this paper? And, would anyone be interested in giving a
> summary?
> 
> thanks.
> 
>    Xah
>    ···@xahlee.org
>  ∑ http://xahlee.org/
> 
Watching this thread grow, it appears to me that at least one thing 
becomes evident here:

Xah's unwillingness to learn from bad past experience contaminates 
others (who are still posting to his trolling threads).

Here is another try to rescue those who are still innocent enough not to 
know what I am speaking about:

Citation from http://www.xahlee.org/Netiquette_dir/_/art_trolling.html :
"""
What I want this document to focus on is how to create entertaining 
trolls. I have drawn on the expertise of the writers of some of 
Usenet's finest and best remembered trolls. Trolls are for fun. The 
object of recreational trolling is to sit back and laugh at all those 
gullible idiots that will believe *anything*.
[...]
Section 5    Know Your Audience
Remember that you have two audiences. The people who are going to get 
the maximum enjoyment out of your post are other trollers. You need to 
keep in contact with them through both your troll itself and the way you 
direct its effect. It is trollers that you are trying to entertain so be 
creative - trollers don't just want a laugh from you they want to see 
good trolls so that they can also learn how to improve their own in the 
never ending search for the perfect troll.
[...]
Section 6    Following-Up
Try not to follow-up to your own troll. The troll itself quickly becomes
forgotten in the chaos and if you just sit back you can avoid being 
blamed for causing it. Remember, if you do follow up you are talking to 
an idiot. Treat them with the ill-respect they deserve.
"""

Claudio Grondi (a past 'gullible idiot' who learned to enjoy the fun of 
being the audience)
From: Chris Smith
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <MPG.1f04b4c5169cbb149896e3@news.altopia.net>
Claudio Grondi <··············@freenet.de> wrote:
> Watching this thread grow, it appears to me that at least one thing 
> becomes evident here:
> 
> Xah's unwillingness to learn from bad past experience contaminates 
> others (who are still posting to his trolling threads).
> 
> Here is another try to rescue those who are still innocent enough not to 
> know what I am speaking about:

I am enjoying the discussion.  I think several other people are, too.  
(At least, I hope so!)  It matters not one whit to me who started the 
thread, or the merits of the originating post.  Why does it matter to 
you?

-- 
Chris Smith - Lead Software Developer / Technical Trainer
MindIQ Corporation
From: Timo Stamm
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <449b07bb$0$11073$9b4e6d93@newsread4.arcor-online.net>
Claudio Grondi wrote:
> Watching this thread grow, it appears to me that at least one thing 
> becomes evident here:
> 
> Xah's unwillingness to learn from bad past experience contaminates 
> others (who are still posting to his trolling threads).

This is actually one of the most interesting threads I have read in a 
long time. If you ignore the evangelism, there is a lot of high-quality 
information and first-hand experience you couldn't find in a dozen books.
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151028214.668765.182330@p79g2000cwp.googlegroups.com>
Timo Stamm wrote:
>
> This is actually one of the most interesting threads I have read in a
> long time. If you ignore the evangelism, there is a lot of high-quality
> information and first-hand experience you couldn't find in a dozen books.

Hear hear! This is an *excellent* thread. The evangelism is at
rock-bottom
and the open exploration of other people's way of thinking is at what
looks to me like an all-time high.


Marshall
From: Joe Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151080381.150026.236440@r2g2000cwb.googlegroups.com>
Marshall wrote:
> Timo Stamm wrote:
> >
> > This is actually one of the most interesting threads I have read in a
> > long time. If you ignore the evangelism, there is a lot of high-quality
> > information and first-hand experience you couldn't find in a dozen books.
>
> Hear hear! This is an *excellent* thread. The evangelism is at
> rock-bottom
> and the open exploration of other people's way of thinking is at what
> looks to me like an all-time high.

What are you, some sort of neo-static pseudo-relational
object-disoriented fascist?
From: Marshall
Subject: Re: What is Expressiveness in a Computer Language
Date: 
Message-ID: <1151099051.667267.202730@g10g2000cwb.googlegroups.com>
Joe Marshall wrote:
> Marshall wrote:
> > Timo Stamm wrote:
> > >
> > > This is actually one of the most interesting threads I have read in a
> > > long time. If you ignore the evangelism, there is a lot of high-quality
> > > information and first-hand experience you couldn't find in a dozen books.
> >
> > Hear hear! This is an *excellent* thread. The evangelism is at
> > rock-bottom
> > and the open exploration of other people's way of thinking is at what
> > looks to me like an all-time high.
>
> What are you, some sort of neo-static pseudo-relational
> object-disoriented fascist?

I prefer to think in terms of "object orientatedness."

But you're close. I'd say rather that I'm a quasi-theoretical,
neo-relational, crypto-logical fascist. But definitely fascist.

Once after a lengthy debate with a schemer, he
dubbed me (good naturedly) "Il Duce." Heh.


Marshall
From: Dr.Ruud
Subject: Typing (was: Re: What is Expressiveness in a Computer Language)
Date: 
Message-ID: <e7f96i.p0.1@news.isolution.nl>
Timo Stamm schreef:

> This is actually one of the most interesting threads I have read in a
> long time. If you ignore the evangelism, there is a lot of
> high-quality information and first-hand experience you couldn't find
> in a dozen books.

Much of what is talked about is covered in these articles (and their links):
http://www.mindview.net/WebLog/log-0066
http://en.wikipedia.org/wiki/Dynamic_typing

-- 
Affijn, Ruud

"Gewoon is een tijger." ("Ordinary is a tiger.")