ANSI question (arose from the wordcount thread)

From: Nicolas Neuss
Subject: ANSI question (arose from the wordcount thread)
Date: Tue, 07 Jun 2005 08:39:41 +0000
Message-ID: <87wtp6h86q.fsf@ortler.iwr.uni-heidelberg.de>

Hello,

in the wordcount thread, Eric Lavigne has spotted a problem in a construct
of mine which boils down to the following question: is the following code
snippet conforming or not?  If yes, it would be a good candidate for Paul's
test suite.  Implementations differ on it: Allegro gives an error,
CMUCL/SBCL yield 1, and CLISP yields 7 (which was the desired result).

(let ((*readtable* (copy-readtable)))
  (set-syntax-from-char #\7 #\Space)
  (length (format nil "~7D" 1)))

Yours, Nicolas.

From: Paul F. Dietz
Subject: Re: ANSI question (arose from the wordcount thread)
Date: Tue, 07 Jun 2005 08:55:41 +0000
Message-ID: <tZWdnQX-RcmQ_DjfRVn-hw@dls.net>

Nicolas Neuss wrote:
> Hello,
> 
> in the wordcount thread, Eric Lavigne has spotted a problem in a construct
> of mine which boils down to the following question: is the following code
> snippet conforming or not?  If yes, it would be a good candidate for Paul's
> test suite.  Implementations differ on it: Allegro gives an error,
> CMUCL/SBCL yield 1, and CLISP yields 7 (which was the desired result).
> 
> (let ((*readtable* (copy-readtable)))
>   (set-syntax-from-char #\7 #\Space)
>     (length (format nil "~7D" 1)))

The string there is read long before the readtable fiddling is done,
so there's no reason why the 7 it contains should be treated unusually.

	Paul

From: Harald Hanche-Olsen
Subject: Re: ANSI question (arose from the wordcount thread)
Date: Tue, 07 Jun 2005 10:12:05 +0000
Message-ID: <pcou0kamq6i.fsf@shuttle.math.ntnu.no>

+ "Paul F. Dietz" <·····@dls.net>:

| Nicolas Neuss wrote:
| > Hello,
| > in the wordcount thread, Eric Lavigne has spotted a problem in a construct
| > of mine which boils down to the following question: is the following code
| > snippet conforming or not?  If yes, it would be a good candidate for Paul's
| > test suite.  Implementations differ on it: Allegro gives an error,
| > CMUCL/SBCL yield 1, and CLISP yields 7 (which was the desired result).
| > (let ((*readtable* (copy-readtable)))
| >   (set-syntax-from-char #\7 #\Space)
| >     (length (format nil "~7D" 1)))
| 
| The string there is read long before the readtable fiddling is done,
| so there's no reason why the 7 it contains should be treated unusually.

But even if the readtable fiddling had been done earlier, why should
that affect how a string is read?  A string is a string is a string.
(Unless you mess with the readtable syntax for the double quote.)

I also fail to see why readtable syntax should have anything to do
with how format interprets its argument.  Oh, wait - FORMAT needs to
read the argument.  But surely, that needs to be done with standard
I/O syntax, or else with custom code?  I'd say the code is conforming,
and should yield 7.

-- 
* Harald Hanche-Olsen     <URL:http://www.math.ntnu.no/~hanche/>
- Debating gives most of us much more psychological satisfaction
  than thinking does: but it deprives us of whatever chance there is
  of getting closer to the truth.  -- C.P. Snow

From: Marcin 'Qrczak' Kowalczyk
Subject: Re: ANSI question (arose from the wordcount thread)
Date: Tue, 07 Jun 2005 11:33:04 +0000
Message-ID: <8764wqbdvz.fsf@qrnik.zagroda>

Harald Hanche-Olsen <······@math.ntnu.no> writes:

> I also fail to see why readtable syntax should have anything to do
> with how format interprets its argument.  Oh, wait - FORMAT needs to
> read the argument.  But surely, that needs to be done with standard
> I/O syntax, or else with custom code?  I'd say the code is conforming,
> and should yield 7.

This shows the danger of aspect-oriented-programming-like practices,
i.e. changing the behavior of a function by means other than providing
different arguments. It may be convenient, but it is error-prone.

Without knowing implementation details it's hard to say which settings
that tweak the behavior of a function like READ also influence the behavior
of a function like FORMAT, which may or may not use READ internally.

Even if it's specified that it's not influenced (in this case
http://www.lisp.org/HyperSpec/Body/sec_22-3.html says "signed (sign
is optional) decimal numbers" which means that it shouldn't depend on
reading parameters), it's easy to make an error in the implementation.
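
A minimal sketch of what "custom code" for the prefix parameter could look
like, written in C rather than Lisp purely for illustration.  The function
name prefix_param_or is hypothetical and is not taken from any Lisp
implementation; the point is only that a signed decimal number can be parsed
by looking at the characters alone, with no reader settings involved.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not taken from any Lisp implementation.
 * Returns the first prefix parameter of a directive such as "~7D" or
 * "~-3D", or `fallback` if the directive has no numeric parameter.
 * Only the characters of the string are inspected; no reader state or
 * other global setting is consulted.  Overflow is not handled.
 */
long prefix_param_or(const char *directive, long fallback)
{
    assert(directive != NULL);
    if (directive[0] != '~')
        return fallback;

    const char *p = directive + 1;
    int negative = 0;
    if (*p == '+' || *p == '-') {   /* the sign is optional */
        negative = (*p == '-');
        p++;
    }
    if (*p < '0' || *p > '9')       /* no digits: no parameter */
        return fallback;

    long value = 0;
    while (*p >= '0' && *p <= '9') {
        value = value * 10 + (*p - '0');
        p++;
    }
    return negative ? -value : value;
}
```

Under these assumptions, prefix_param_or("~7D", 0) gives 7 whatever the rest
of the program has done, which is the behavior the quoted wording of the
spec seems to call for.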

-- 
   __("<         Marcin Kowalczyk
   \__/       ······@knm.org.pl
    ^^     http://qrnik.knm.org.pl/~qrczak/

From: Barry Margolin
Subject: Re: ANSI question (arose from the wordcount thread)
Date: Wed, 08 Jun 2005 05:31:40 +0000
Message-ID: <barmar-13065E.01314008062005@comcast.dca.giganews.com>

In article <··············@qrnik.zagroda>,
 Marcin 'Qrczak' Kowalczyk <······@knm.org.pl> wrote:

> Harald Hanche-Olsen <······@math.ntnu.no> writes:
> 
> > I also fail to see why readtable syntax should have anything to do
> > with how format interprets its argument.  Oh, wait - FORMAT needs to
> > read the argument.  But surely, that needs to be done with standard
> > I/O syntax, or else with custom code?  I'd say the code is conforming,
> > and should yield 7.
> 
> This shows the danger of aspect-oriented-programming-like practices,
> i.e. changing the behavior of a function by means other than providing
> different arguments. It may be convenient, but it is error-prone.

This reminds me of the common implementations of IP "dotted-quad" 
address parsing.  Most of them are written in C and use its atoi() 
function to translate each component of the address.  But this function 
implements C's convention of treating a number beginning with 0 as 
octal.  The result is that many users get unexpected results when they 
format IP addresses like 192.168.010.004 -- this is parsed as if it were 
192.168.8.4.

I've never heard anyone say that this was intentional; I suspect it just 
happened in some early Unix implementation, and as a result it has 
become a de facto standard.

-- 
Barry Margolin, ······@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***

From: Matthias Buelow
Subject: Re: ANSI question (arose from the wordcount thread)
Date: Mon, 13 Jun 2005 01:41:55 +0000
Message-ID: <3h46j2Fekih1U1@news.dfncis.de>

Barry Margolin wrote:

> This reminds me of the common implementations of IP "dotted-quad" 
> address parsing.  Most of them are written in C and use its atoi() 
> function to translate each component of the address.  But this function 
> implements C's convention of treating a number beginning with 0 as 
> octal.  The result is that many users get unexpected results when they 
> format IP addresses like 192.168.010.004 -- this is parsed as if it were 
> 192.168.8.4.
> 
> I've never heard anyone say that this was intentional; I suspect it just 
> happened in some early Unix implementation, and as a result it has 
> become a de facto standard.

I would rather think it was a bug in some particular implementation.

$ cat t.c
#include <stdlib.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    if (argc>1)
        printf("%d\n", atoi(argv[1]));
    return 0;
}
$ cc t.c
$ ./a.out 010
10
$

(tested on FreeBSD and Fedora Linux).

mkb.

From: Barry Margolin
Subject: Re: ANSI question (arose from the wordcount thread)
Date: Tue, 14 Jun 2005 00:42:43 +0000
Message-ID: <barmar-21BFA6.20424313062005@comcast.dca.giganews.com>

In article <··············@news.dfncis.de>,
 Matthias Buelow <···@incubus.de> wrote:

> Barry Margolin wrote:
> 
> > This reminds me of the common implementations of IP "dotted-quad" 
> > address parsing.  Most of them are written in C and use its atoi() 
> > function to translate each component of the address.  But this function 
> > implements C's convention of treating a number beginning with 0 as 
> > octal.  The result is that many users get unexpected results when they 
> > format IP addresses like 192.168.010.004 -- this is parsed as if it were 
> > 192.168.8.4.
> > 
> > I've never heard anyone say that this was intentional; I suspect it just 
> > happened in some early Unix implementation, and as a result it has 
> > become a de facto standard.
> 
> I would rather think it was a bug in some particular implementation.
> 
> $ cat t.c
> #include <stdlib.h>
> #include <stdio.h>
> 
> int main(int argc, char *argv[])
> {
>     if (argc>1)
>         printf("%d\n", atoi(argv[1]));
>     return 0;
> }
> $ cc t.c
> $ ./a.out 010
> 10
> $
> 
> (tested on FreeBSD and Fedora Linux).

I was apparently confusing atoi() and strtol().  The latter, when called 
with a base of 0, is defined to recognize 0 and 0x prefixes for octal and hex.
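
A small sketch of the difference, and of how it would produce the
dotted-quad behavior described earlier.  The helper names parse_component
and pack_dotted_quad are made up for this example and do not come from any
real resolver library; returning 0 on malformed input is a shortcut that a
real parser would not take.

```c
#include <assert.h>
#include <stdlib.h>

/* Parse one number with strtol().  base 10 is always decimal; base 0 lets
 * the prefix choose: 0x/0X means hex, a leading 0 means octal. */
long parse_component(const char *text, int base)
{
    assert(text != NULL);
    return strtol(text, NULL, base);
}

/* Hypothetical dotted-quad parser: reads "a.b.c.d" with strtol() in the
 * given base and packs the four components into 32 bits.
 * Returns 0 on malformed input (sketch only: that makes 0.0.0.0
 * indistinguishable from an error). */
unsigned long pack_dotted_quad(const char *text, int base)
{
    assert(text != NULL);
    unsigned long packed = 0;
    const char *p = text;

    for (int i = 0; i < 4; i++) {
        char *end;
        long octet = strtol(p, &end, base);

        if (end == p || octet < 0 || octet > 255)
            return 0;                   /* no digits, or out of range */
        if (i < 3) {
            if (*end != '.')
                return 0;               /* expected a separator */
            end++;
        } else if (*end != '\0') {
            return 0;                   /* trailing junk */
        }
        packed = (packed << 8) | (unsigned long)octet;
        p = end;
    }
    return packed;
}
```

Here atoi("010") and parse_component("010", 10) both give 10, while
parse_component("010", 0) gives 8.  Likewise
pack_dotted_quad("192.168.010.004", 0) packs the same value as
pack_dotted_quad("192.168.8.4", 10), whereas with base 10 the third
component stays 10.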

-- 
Barry Margolin, ······@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***

From: Kent M Pitman
Subject: Re: ANSI question (arose from the wordcount thread)
Date: Wed, 08 Jun 2005 05:41:49 +0000
Message-ID: <ud5qxv1u3.fsf@nhplace.com>

Marcin 'Qrczak' Kowalczyk <······@knm.org.pl> writes:

> Harald Hanche-Olsen <······@math.ntnu.no> writes:
> 
> > I also fail to see why readtable syntax should have anything to do
> > with how format interprets its argument.  Oh, wait - FORMAT needs to
> > read the argument.  But surely, that needs to be done with standard
> > I/O syntax, or else with custom code?  I'd say the code is conforming,
> > and should yield 7.
> 
> This shows the danger of aspect-oriented-programming-like practices,
> i.e. changing the behavior of a function by means other than providing
> different arguments. It may be convenient, but it is error-prone.
> 
> Without knowing implementation details it's hard to say which settings
> that tweak the behavior of a function like READ also influence the behavior
> of a function like FORMAT, which may or may not use READ internally.

I think the fact that conforming programs are allowed to change these
parameters implies that doing so will not violate the integrity of the
processor.  Sounds like a simple bug to me.

> Even if it's specified that it's not influenced (in this case
> http://www.lisp.org/HyperSpec/Body/sec_22-3.html says "signed (sign
> is optional) decimal numbers" which means that it shouldn't depend on
> reading parameters), it's easy to make an error in the implementation.

But, importantly, errors can be fixed.

Your remarks above about aspect-oriented programming as an
implementation strategy seem to suggest that you could argue it wasn't
a bug but a feature... I just don't see it, though.