Formatting large numbers with ~g?

From: Raymond Toy (RT/EUS)
Subject: Formatting large numbers with ~g?
Date: Thu, 24 Jan 2008 15:00:44 +0000
Message-ID: <sxd3asnl143.fsf@rtp.ericsson.se>

In the hyperspec, there is this paragraph:

    The full form is ~w,d,e,k,overflowchar,padchar,exponentcharG. The
    format in which to print arg depends on the magnitude (absolute
    value) of the arg. Let n be an integer such that 10^n-1 <= |arg| <
    10^n. Let ee equal e+2, or 4 if e is omitted. Let ww equal w-ee,
    or nil if w is omitted. If d is omitted, first let q be the number
    of digits needed to print arg with no loss of information and
    without leading or trailing zeros; then let d equal (max q (min n
    7)). Let dd equal d-n.

    If 0 <= dd <= d, then arg is printed as if by the format
    directives

        ~ww,dd,,overflowchar,padcharF~ee@T

My question is about the part "q be the number of digits needed to
print arg with no loss of information and without leading or trailing
zeros".  What is that supposed to mean?  Does it mean significant
digits or the whole number?

For example, what should (format nil "~g" 1d300) print?  CMUCL
currently prints 1 followed by 300 zeros.  This is correct if we
interpret q as requiring all the nonsignificant zeros to be printed
too.  But 1d300 really only needs 1 digit to be printed (barring
round-off issues), so 1d300 would be formatted as 1.000000d300.

This, to me, makes more sense.  q is really the number of significant
digits needed to print the number without loss of information.

Is my interpretation wrong?

Ray
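
For reference, the arithmetic in the quoted paragraph can be sketched as
below. This is only an illustration under stated assumptions: G-CHOICE is
a hypothetical helper, not part of any implementation, and it takes n and
q as inputs instead of deriving them from the number. It reads the bound
on n as 10^(n-1) <= |arg| < 10^n, which gives n = 301 for 1d300.

```lisp
;; Hypothetical helper: applies the quoted rules for d and dd, given n and q.
(defun g-choice (n q &optional d)
  "Return (values d dd directive) for ~G, where DIRECTIVE is :F or :E."
  (let* ((d (or d (max q (min n 7))))   ; d defaults to (max q (min n 7))
         (dd (- d n)))                  ; dd = d - n
    (values d dd (if (<= 0 dd d) :f :e))))

;; q counted as significant digits (q = 1 for 1d300):
(g-choice 301 1)    ; => 7, -294, :E
;; q counted as every digit of the whole number (q = 301):
(g-choice 301 301)  ; => 301, 0, :F
```

The second call corresponds to the reading that prints 1 followed by 300
zeros; the first selects ~E.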

From: Geoffrey Summerhayes
Subject: Re: Formatting large numbers with ~g?
Date: Thu, 24 Jan 2008 22:06:08 +0000
Message-ID: <c4ada7a6-3cfb-44ef-ae1e-26395945302f@s8g2000prg.googlegroups.com>

On Jan 24, 10:00 am, ···········@ericsson.com (Raymond Toy (RT/EUS))
wrote:
> In the hyperspec, there is this paragraph:
>
>     The full form is ~w,d,e,k,overflowchar,padchar,exponentcharG. The
>     format in which to print arg depends on the magnitude (absolute
>     value) of the arg. Let n be an integer such that 10^n-1 <= |arg| <
>     10^n. Let ee equal e+2, or 4 if e is omitted. Let ww equal w-ee,
>     or nil if w is omitted. If d is omitted, first let q be the number
>     of digits needed to print arg with no loss of information and
>     without leading or trailing zeros; then let d equal (max q (min n
>     7)). Let dd equal d-n.
>
>     If 0 <= dd <= d, then arg is printed as if by the format
>     directives
>
>         ~ww,dd,,overflowchar,padcharF~ee@T
>
> My question is about the part "q be the number of digits needed to
> print arg with no loss of information and without leading or trailing
> zeros".  What is that supposed to mean?  Does it mean significant
> digits or the whole number?
>
> For example, what should (format nil "~g" 1d300) print?  CMUCL
> currently prints 1 followed by 300 zeros.  This is correct if we
> interpret q as requiring all the nonsignificant zeros to be printed
> too.  But 1d300 really only needs 1 digit to be printed (barring
> round-off issues), so 1d300 would be formatted as 1.000000d300.
>
> This, to me, makes more sense.  q is really the number of significant
> digits needed to print the number without loss of information.
>
> Is my interpretation wrong?

At first, I was going to say 'Yes, your interpretation is wrong.'
Then I started looking at the specification a little more closely and
changed my mind. It's the d = (max q (min n 7)) part: there isn't
a reason for it. If they wanted q to cover the whole number,
that would mean q >= n, so (max q 7) would be enough.

Then I fooled around with ~G a bit more, looked at the examples,
looked at the spec, and decided, IMHO, that ~G is broken. I
don't think it does what it should be doing.

My personal opinion is that the section for dd should say:

  If d is omitted, first let q be the number of
  significant digits needed to print arg with no
  loss of information then let d equal (max q (min n 7))
  and let dd equal d-n. OTHERWISE let dd equal d.

That way we'd get (from "|~20,3G|"):

* (format t "|~16,3F~4@T|" 100)
|         100.000    |
* (format t "|~20,3E|" 1d30)
|           1.000d+30|

Rather than what we currently get:

* (format t "|~20,3G|" 100)
|            100.    |
* (format t "|~20,3G|" 1d30)
|           1.000d+30|

Which strikes me as what was likely the
original intention. But I'm just guessing,
any comments Kent?

----
Geoff

From: Raymond Toy (RT/EUS)
Subject: Re: Formatting large numbers with ~g?
Date: Fri, 25 Jan 2008 19:01:08 +0000
Message-ID: <sxdd4rpk9vv.fsf@rtp.ericsson.se>

    G> My personal opinion is that the section for dd should say:

    G>   If d is omitted, first let q be the number of
    G>   significant digits needed to print arg with no
    G>   loss of information then let d equal (max q (min n 7))
    G>   and let dd equal d-n. OTHERWISE let dd equal d.

I believe there are also cases where dd > w for ~F or d > w for ~E.  I
think these are unspecified behaviour, so the spec should probably
adjust dd and d so that bad values aren't used.

    G> That way we'd get (from "|~20,3G|"):

    G> * (format t "|~16,3F~4@T|" 100)
    G> |         100.000    |
    G> * (format t "|~20,3E|" 1d30)
    G> |           1.000d+30|

This makes sense to me.

Ray

From: Raymond Toy (RT/EUS)
Subject: Re: Formatting large numbers with ~g?
Date: Fri, 25 Jan 2008 22:58:37 +0000
Message-ID: <sxd8x2djyw2.fsf@rtp.ericsson.se>

>>>>> "Raymond" == Raymond Toy <···········@ericsson.com> writes:

    G> My personal opinion is that the section for dd should say:

    G> If d is omitted, first let q be the number of
    G> significant digits needed to print arg with no
    G> loss of information then let d equal (max q (min n 7))
    G> and let dd equal d-n. OTHERWISE let dd equal d.

    Raymond> I believe there are also cases where dd > w for ~F or d > w for ~E.  I
    Raymond> think these are unspecified behaviour, so the spec should probably
    Raymond> adjust dd and d so that bad values aren't used.

One other question.  Let's assume q is the number of significant
digits and consider (format nil "~g" 1.2d40).  According to the spec,
we set d to 7, and should use ~w,de to print out the number.  But if
the original format was ~g, do we really want to use ~,7e to print out
the answer?  I think it would make more sense to use ~e, since d
wasn't originally specified.

Ray

From: Geoffrey Summerhayes
Subject: Re: Formatting large numbers with ~g?
Date: Tue, 29 Jan 2008 16:09:15 +0000
Message-ID: <b021ed07-eaf2-43ab-921c-b1edb995e8c6@l32g2000hse.googlegroups.com>

On Jan 25, 5:58 pm, ···········@ericsson.com (Raymond Toy (RT/EUS))
wrote:
>
> One other question.  Let's assume q is the number of significant
> digits and consider (format nil "~g" 1.2d40).  According to the spec,
> we set d to 7, and should use ~w,de to print out the number.  But if
> the original format was ~g, do we really want to use ~,7e to print out
> the answer?  I think it would make more sense to use ~e, since d
> wasn't originally specified.

Maybe. Having a default value does make for a
nicer table output over a range of values.

Not that it matters all that much: ~g on the
implementations I've looked at seems fairly
consistent in output, but it's easy enough
to provide your own formatting rule using ~/

CL > (loop for x in '(3 3.14159 1d30 -2.7182818d200)
           do (format t "~&|~20,5,4,,'%/G/|~%" x))
|       3.00000      |
|       3.14159      |
|       1.00000D+0030|
|      -2.71828D+0200|
NIL
CL > (loop for x in '(3 3.14159 1d30 -2.7182818d200)
           do (format t "~&|~20,5,2,,'%@/G/|~%" x))
|        +3.00000    |
|        +3.14159    |
|        +1.00000D+30|
|%%%%%%%%%%%%%%%%%%%%|

----
Geoff

From: Raymond Toy (RT/EUS)
Subject: Re: Formatting large numbers with ~g?
Date: Wed, 30 Jan 2008 15:22:07 +0000
Message-ID: <sxdsl0fgwyo.fsf@rtp.ericsson.se>

>>>>> "Geoffrey" == Geoffrey Summerhayes <·······@gmail.com> writes:

    Geoffrey> On Jan 25, 5:58 pm, ···········@ericsson.com (Raymond Toy (RT/EUS))
    Geoffrey> wrote:
    >> 
    >> One other question.  Let's assume q is the number of significant
    >> digits and consider (format nil "~g" 1.2d40).  According to the spec,
    >> we set d to 7, and should use ~w,de to print out the number.  But if
    >> the original format was ~g, do we really want to use ~,7e to print out
    >> the answer?  I think it would make more sense to use ~e, since d
    >> wasn't originally specified.

    Geoffrey> Maybe. Having a default value does make for a
    Geoffrey> nicer table output over a range of values.

    Geoffrey> Not that it matters all that much: ~g on the
    Geoffrey> implementations I've looked at seems fairly
    Geoffrey> consistent in output, but it's easy enough
    Geoffrey> to provide your own formatting rule using ~/

Yes, of course, I can always create my own.  I thought about a Fortran
equivalent that would drop the exponent marker if the exponent was
greater than or equal to 100.  Too lazy to actually implement it,
though.

    Geoffrey> CL > (loop for x in '(3 3.14159 1d30 -2.7182818d200)
    Geoffrey>            do (format t "~&|~20,5,4,,'%/G/|~%" x))
    Geoffrey> |       3.00000      |
    Geoffrey> |       3.14159      |
    Geoffrey> |       1.00000D+0030|
    Geoffrey> |      -2.71828D+0200|

The current CVS sources for cmucl produces:

|        3.0000      |
|        3.1416      |
|       1.00000d+0030|
|      -2.71828d+0200|

I had noticed that if you follow the spec, and use the computed dd
value, you would get one less significant digit with ~F than with ~E.
I thought about making them the same.  Then you'd get a nice print out
like you show.  I like your output better. :-)

Another issue.  What should (format nil "~8g" 1.2345678d7) produce?
If we follow the spec, ww = 4, n = 8, q = 8, d = 8, and dd = 0.  Then we
should use ~4,0F~4@T.  Of course 12345678 can't fit in a field of
width 4, so we print out "12345678.    "

But 1.2345678d7 can be printed nicely as ~8e:  "1.235d+7", so we
don't gratuitously exceed the specified field width.

Ray

From: Geoffrey Summerhayes
Subject: Re: Formatting large numbers with ~g?
Date: Wed, 30 Jan 2008 19:02:22 +0000
Message-ID: <eb022b4f-b282-4dc5-b2ea-7d1587d450b4@v4g2000hsf.googlegroups.com>

On Jan 30, 10:22 am, ···········@ericsson.com (Raymond Toy (RT/EUS))
wrote:
> >>>>> "Geoffrey" == Geoffrey Summerhayes <·······@gmail.com> writes:
>
>
>     Geoffrey> CL > (loop for x in '(3 3.14159 1d30 -2.7182818d200)
>     Geoffrey>            do (format t "~&|~20,5,4,,'%/G/|~%" x))
>     Geoffrey> |       3.00000      |
>     Geoffrey> |       3.14159      |
>     Geoffrey> |       1.00000D+0030|
>     Geoffrey> |      -2.71828D+0200|
>
> The current CVS sources for cmucl produces:
>
> |        3.0000      |
> |        3.1416      |
> |       1.00000d+0030|
> |      -2.71828d+0200|
>
> I had noticed that if you follow the spec, and use the computed dd
> value, you would get one less significant digit with ~F than with ~E.
> I thought about making them the same.  Then you'd get a nice print out
> like you show.  I like your output better. :-)

;;;; little ugly, CALCULATE-D is terrible so I'm
;;;; not including it :-P Hopefully the format strings
;;;; will survive the posting :)

(defun G (stream argument colon at-sign
          &optional w d e k overflow padchar exponentchar)
  (declare (ignore colon))
  (let* ((n (if (zerop argument) 1
                (1+ (floor (realpart (log argument 10))))))
         (ee (if (null e) 4 (+ 2 e)))
         (ww (if (null w) nil (- w ee)))
         (dd (if (null d)
                 (progn
                   (setf d (calculate-d n argument))
                   (- d n))
                 d)))
    (format stream
            ;; The inner FORMAT builds the control string for ~F or ~E.
            (if (<= n d)
                (format nil
                        "~~~{~@[~A~],~}~{~@['~A~]~^,~}~@[@~*~]F~~~A,,,~@['~A~]<~~>"
                        (list ww dd nil) (list overflow padchar) at-sign ee padchar)
                (format nil
                        "~~~{~@[~A~],~}~{~@['~A~]~^,~}~@[@~]E"
                        (list w d e k) (list overflow padchar exponentchar) at-sign))
            argument)))

> Another issue.  What should (format nil "~8g" 1.2345678d7) produce?
> If we follow the spec, ww = 4, n = 8, q = 8, d = 8, and dd = 0.  Then we
> should use ~4,0F~4@T.  Of course 12345678 can't fit in a field of
> width 4, so we print out "12345678.    "
>
> But 1.2345678d7 can be printed nicely as ~8e:  "1.235d+7", so we
> don't gratuitously exceed the specified field width.

That is a decision I agree with, the numbers printed in this
case aren't the same. 12345678 versus 12350000.

Without more specific instructions the algorithm will always
choose the accuracy of the presentation over its formatting.

----
Geoff
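
As an aside on building the control string: FORMAT's V prefix parameter
takes its value from the argument list, and a NIL argument counts as an
omitted parameter, so the ~ww,ddF~ee@T form can be sketched without
generating a control string first. PRINT-F-BRANCH is a hypothetical name,
and overflowchar and padchar are left out for brevity.

```lisp
;; Hypothetical helper: the ~F branch of ~G written as "~ww,ddF~ee@T".
;; Each V consumes one argument; NIL means "parameter omitted".
(defun print-f-branch (arg ww dd ee)
  (format nil "~V,VF~V@T" ww dd arg ee))

;; For "~20,3G" with e omitted: ee = 4 and ww = 16, and dd = 3 under the
;; proposed rule that dd equals d when d is given.
(format t "|~A|~%" (print-f-branch 100 16 3 4))
```

The argument order follows the directives: ww and dd come before the
number, and ee comes after it because ~V@T is processed last.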

From: Geoffrey Summerhayes
Subject: Re: Formatting large numbers with ~g?
Date: Wed, 30 Jan 2008 19:33:26 +0000
Message-ID: <d136cd15-313f-4377-85d7-b37a4e002d54@h11g2000prf.googlegroups.com>

On Jan 30, 2:02 pm, Geoffrey Summerhayes <·······@gmail.com> wrote:
>
> ;;;; little ugly, CALCULATE-D is terrible so I'm
> ;;;; not including it :-P Hopefully the format strings
> ;;;; will survive the posting :)
>
> (defun G(stream argument colon at-sign &optional w d e k overflow
> padchar exponentchar)
>  (let* ((n (if (zerop argument) 1
>             (1+ (floor (realpart (log argument 10))))))

Bloody H. Copied an earlier version to the thumb drive.
Changed the IF to (or (not numberp argument) (zerop argument))
to handle non-numeric arguments.

>        (format nil
>           "~~~{~@[~A~],~}~{~@['~A~]~^,~}~@[@~*~]F~~~A,,,~@['~A~]<~~>"
>               (list ww dd nil)(list overflow padchar) at-sign ee
> padchar)

Fooled around with the idea of adding padchar to
both sides of a number, didn't really like the result
so that got dropped. Changed the ~A's after as well.

---
Geoff

From: Raymond Toy (RT/EUS)
Subject: Re: Formatting large numbers with ~g?
Date: Wed, 30 Jan 2008 21:12:04 +0000
Message-ID: <sxdk5lrggrf.fsf@rtp.ericsson.se>

>>>>> "Geoffrey" == Geoffrey Summerhayes <·······@gmail.com> writes:

    Geoffrey> ;;;; little ugly, CALCULATE-D is terrible so I'm
    Geoffrey> ;;;; not including it :-P Hopefully the format strings
    Geoffrey> ;;;; will survive the posting :)

[code snipped]

EFORMATTOOHAIRY

:-)

Nice solution.  There's probably a problem with roundoff when
calculating n via log, but that can be solved.

    >> Another issue.  What should (format nil "~8g" 1.2345678d7) produce?
    >> If we follow the spec, ww = 4, n = 8, q = 8, d = 8, and dd = 0.  Then we
    >> should use ~4,0F~4@T.  Of course 12345678 can't fit in a field of
    >> width 4, so we print out "12345678.    "
    >> 
    >> But 1.2345678d7 can be printed nicely as ~8e:  "1.235d+7", so we
    >> don't gratuitously exceed the specified field width.

    Geoffrey> That is a decision I agree with, the numbers printed in this
    Geoffrey> case aren't the same. 12345678 versus 12350000.

    Geoffrey> Without more specific instructions the algorithm will always
    Geoffrey> choose the accuracy of the presentation over its formatting.

Yes, this is acceptable to me.  

What about this case: (format nil "~8g" 1.234567d0).  Here, ww = 4, n
= 2, q = 7, d = 7, and dd = 6.  This is supposed to be printed as
~4,6F.  Can't remember if this is a legal ~F spec or not.  But this
could be printed as ~4,2F without violating the width constraint:

"1.23    ".

But then this would have fewer digits than using ~8e:

"1.235d+0".

Ray
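
On the roundoff concern with computing n via LOG: one way around it,
sketched here as an assumption and not as what any implementation does,
is to compare the exact rational value of the float against powers of
ten. DECIMAL-EXPONENT is a hypothetical name; it returns 0 for a zero
argument.

```lisp
(defun decimal-exponent (x)
  "Return the integer n with 10^(n-1) <= |x| < 10^n, or 0 when x is zero."
  (if (zerop x)
      0
      (let ((r (abs (rational x)))      ; exact value of the float
            (n 0))
        ;; Step n up while |x| is at or above 10^n ...
        (loop while (>= r (expt 10 n)) do (incf n))
        ;; ... or down while |x| is below 10^(n-1).
        (loop while (< r (expt 10 (1- n))) do (decf n))
        n)))

(decimal-exponent 1.2345678d7)  ; => 8
(decimal-exponent 3.14159d-2)   ; => -1
```

The comparisons are exact, at the cost of bignum arithmetic when the
exponent is large.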

From: Geoffrey Summerhayes
Subject: Re: Formatting large numbers with ~g?
Date: Wed, 30 Jan 2008 21:53:19 +0000
Message-ID: <22e81f39-5c8b-4f8a-a35c-f8d58e617f74@v17g2000hsa.googlegroups.com>

On Jan 30, 4:12 pm, ···········@ericsson.com (Raymond Toy (RT/EUS))
wrote:
> >>>>> "Geoffrey" == Geoffrey Summerhayes <·······@gmail.com> writes:
>
>     Geoffrey> ;;;; little ugly, CALCULATE-D is terrible so I'm
>     Geoffrey> ;;;; not including it :-P Hopefully the format strings
>     Geoffrey> ;;;; will survive the posting :)
>
> [code snipped]
>
> EFORMATTOOHAIRY
>
> :-)
>
> Nice solution.  There's probably a problem with roundoff when
> calculating n via log, but that can be solved.

Yeah, probably, it was more of a proof of concept.
Determining q is a tougher problem, the work involved
is akin to printing the number twice algorithm-wise. Once
to determine significant digits, then the actual
print.

<*snip*>

> What about this case: (format nil "~8g" 1.234567d0).  Here, ww = 4, n
> = 2, q = 7, d = 7, and dd = 6.  This is supposed to be printed as
> ~4,6F.  Can't remember if this is a legal ~F spec or not.  But this
> could be printed as ~4,2F without violating the width constraint:
>
> "1.23    ".
>

7-2=5 :)

Since there's no way to print ~4,5F without violating w, I'd say
this applied:
  If the overflowchar parameter is omitted, then the scaled value
  is printed using more than w characters, as many more as may be
  needed.

So it would print as ~,5F. Same thing, accuracy vs format.

---
Geoff
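
The "print it twice" idea for q can be sketched by reusing the Lisp
printer: print the number once with PRIN1 and count the mantissa digits.
This assumes the printer emits the shortest digit string that reads back
as the same float, which is common but implementation-dependent.
SIGNIFICANT-DIGITS is a hypothetical name.

```lisp
(defun significant-digits (x)
  "Count the digits PRIN1 uses for X, ignoring leading and trailing zeros."
  (let* ((text (prin1-to-string (abs x)))
         ;; Keep everything before the exponent marker, if there is one.
         (mantissa (subseq text 0 (position-if #'alpha-char-p text)))
         (digits (remove-if-not #'digit-char-p mantissa))
         (trimmed (string-trim "0" digits)))
    ;; Zero trims down to nothing; count it as one digit.
    (max 1 (length trimmed))))
```

On an implementation that prints 1d300 as 1.0d300, this gives q = 1 for
that number, which is the significant-digits reading discussed above.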

From: Raymond Toy (RT/EUS)
Subject: Re: Formatting large numbers with ~g?
Date: Thu, 31 Jan 2008 14:33:08 +0000
Message-ID: <sxdbq72gj4r.fsf@rtp.ericsson.se>

>>>>> "Geoffrey" == Geoffrey Summerhayes <·······@gmail.com> writes:

    Geoffrey> On Jan 30, 4:12 pm, ···········@ericsson.com (Raymond Toy (RT/EUS))
    Geoffrey> wrote:
    >> >>>>> "Geoffrey" == Geoffrey Summerhayes <·······@gmail.com> writes:
    >> 
    >>     Geoffrey> ;;;; little ugly, CALCULATE-D is terrible so I'm
    >>     Geoffrey> ;;;; not including it :-P Hopefully the format strings
    >>     Geoffrey> ;;;; will survive the posting :)
    >> 
    >> [code snipped]
    >> 
    >> EFORMATTOOHAIRY
    >> 
    >> :-)
    >> 
    >> Nice solution.  There's probably a problem with roundoff when
    >> calculating n via log, but that can be solved.

    Geoffrey> Yeah, probably, it was more of a proof of concept.
    Geoffrey> Determining q is a tougher problem, the work involved
    Geoffrey> is akin to printing the number twice algorithm-wise. Once
    Geoffrey> to determine significant digits, then the actual
    Geoffrey> print.

CMUCL also prints it twice.  

    Geoffrey> <*snip*>

    >> What about this case: (format nil "~8g" 1.234567d0).  Here, ww = 4, n
    >> = 2, q = 7, d = 7, and dd = 6.  This is supposed to be printed as
    >> ~4,6F.  Can't remember if this is a legal ~F spec or not.  But this
    >> could be printed as ~4,2F without violating the width constraint:
    >> 
    >> "1.23    ".
    >> 

    Geoffrey> 7-2=5 :)

    Geoffrey> Since there's no way to print ~4,5F without violating w, I'd say
    Geoffrey> this applied:
    Geoffrey>   If the overflowchar parameter is omitted, then the scaled value
    Geoffrey>   is printed using more than w characters, as many more as may be
    Geoffrey>   needed.

    Geoffrey> So it would print as ~,5F. Same thing, accuracy vs format.

That is in the section for ~F, and not ~G, but I think this is a good
reason to let ~G defer to ~F.

So, let me summarize the results of this discussion:

o q is the number of significant digits needed to print the number
  without loss of accuracy.
o If d is given, it is used in both ~F and ~E
o Everything else is as given in the spec.

With this change, we get

(loop for x in '(3 3.14159 3.1415926535d1 3.14159d-1 3.14159d-2
			  3.1415926535d3 1d30 -2.7182818d200)
           do (format t "~&|~20,5,4,,'%G|~%" x))
|       3.00000      |
|       3.14159      |
|      31.41593      |
|       0.31416      |
|       3.14159d-0002|
|    3141.59265      |
|       1.00000d+0030|
|      -2.71828d+0200|

Note, however, that Fortran gives this:

      print 1, 3.0, 3.14159,3.1415926535d1, 3.14159d-1, 3.14159d-2,
     $     3.1415926535d3, 1d30, -2.7182818d200
    1 format(1x,'|',1pg20.5e4,'|')
      end
      
 |        3.0000      |
 |        3.1416      |
 |        31.416      |
 |       0.31416      |
 |       3.14159E-0002|
 |        3141.6      |
 |       1.00000E+0030|
 |      -2.71828E+0200|

This is what ~G would give, if, for ~F, we used dd, even if d were
given.  For consistency with Fortran, perhaps it should be done this
way.

Ray

From: Kent M Pitman
Subject: Re: Formatting large numbers with ~g?
Date: Fri, 01 Feb 2008 05:01:38 +0000
Message-ID: <ufxwdz2vh.fsf@nhplace.com>

···········@ericsson.com (Raymond Toy (RT/EUS)) writes:

Sorry for the delay in responding.  My copy of the hardcopy ANSI CL
document was stored off-site and I had to make time to retrieve it
before I could comment on some of the issues I wanted to touch.

And also, apologies in advance for the long text-lines in code
examples below.  I didn't want to break them...

> In the hyperspec, there is this paragraph:
> 
>     The full form is ~w,d,e,k,overflowchar,padchar,exponentcharG. The
>     format in which to print arg depends on the magnitude (absolute
>     value) of the arg. Let n be an integer such that 
>      10^n-1 <= |arg| < 10^n

I don't think it affects this discussion, but this transcription worried
me and I had to dredge out my hardcopy to be sure ... Apparently, this
is a CLHS formatting bug.  The ANSI CL spec has the "n-1" raised, as in:

    n-1              n
  10    <= |arg| < 10

So (absent 2D formatting), it should be shown as

  10^(n-1) <= |arg| < 10^n

(I'll make a note that this should get looked at in the next version of CLHS.)

>     Let ee equal e+2, or 4 if e is omitted. Let ww equal w-ee,
>     or nil if w is omitted. If d is omitted, first let q be the number
>     of digits needed to print arg with no loss of information and
>     without leading or trailing zeros; then let d equal (max q (min n
>     7)). Let dd equal d-n.
> 
>     If 0 <= dd <= d, then arg is printed as if by the format
>     directives
> 
>         ~ww,dd,,overflowchar,padcharF~ee@T

According to the online version of CLTL2 (which is a sibling, not a
parent, of ANSI CL, but which shares the common ancestor of CLTL1 and
which has change bars where it differs), the original CLTL requirement
is:

|     The full form is ~w,d,e,k,overflowchar,padchar,exponentcharG. The
|     format in which to print arg depends on the magnitude (absolute
|     value) of the arg. Let n be an integer such that
|      10^(n-1) <= |arg| < 10^n
|     (If arg is zero, let n be 0.)
|     Let ee equal e+2, or 4 if e is omitted. Let ww equal w-ee,
|     or nil if w is omitted. If d is omitted, first let q be the number
|     of digits needed to print arg with no loss of information and
|     without leading or trailing zeros; then let d equal (max q (min n
|     7)). Let dd equal d-n.
|
|     If 0 <= dd <= d, then arg is printed as if by the format directives
|
|       ~ww,dd,,overflowchar,padcharF~ee@T

I'm not sure what happened to the "(If arg is zero, let n be 0.)"  by
the way, but it's missing in ANSI CL, too, so even if it's an editing
error, it's what we have.  (There may be a cleanup issue or it might
have gotten moved or rewritten somewhere--I didn't look hard enough to
know.  Both I and my predecessor editor, Kathy Chapman, tried hard not
to gratuitously delete text from CLTL unless by committee vote or
unless the same information was being moved to another place.  But
it's not inconceivable that we goofed somewhere.)

I'm also not sure about the "d" here, now that I look at it, since d
is BOTH passed as a parameter AND given a formula for computing it.
I don't know if that works by strict side-effect, or if those were two
different d's, one that is only used locally and not passed through to
the call to E.

> My question is about the part "q be the number of digits needed to
> print arg with no loss of information and without leading or trailing
> zeros".  What is that supposed to mean?  Does it mean significant
> digits or the whole number?
> 
> For example, what should (format nil "~g" 1d300) print?  CMUCL
> currently prints 1 followed by 300 zeros.  This is correct if we
> interpret q as requiring all the nonsignificant zeros to be printed
> too.  But 1d300 really only needs 1 digit to be printed (barring
> round-off issues), so 1d300 would be formatted as 1.000000d300.

Why so many zeros?  See some notes on this below.

> This, to me, makes more sense.  q is really the number of significant
> digits needed to print the number without loss of information.
> 
> Is my interpretation wrong?

Your interpretation of what conforming means may be the right thing to
inquire about.

It is possible to have text that is differently construed by different
vendors, and for both interpretations to be correct in the sense that
neither can be shown to be canonically right.

This is very important in the case of some deliberate ambiguity, but
also means that in the case of accidental ambiguity, vendors are free
to make their case and do as they think best.  I would hope that as
these issues were raised, where there is no major reason to do
otherwise, vendors would tend to converge on one or another interpretation.

Speaking only to my personal preference, with emphasis that I have no
special authority to interpret the meaning (nor even did I have such
special authority when I was editor, which was a technically neutral
role), I have to say that I think ~G is by its nature something that
is not trying to be about "tight control of notational device" since
it allows different styles to be used for different arguments.  What
it seems to me to be about is making rational heuristic choices that
optimize space, so I certainly think that if there is an ambiguity, it
should be resolved in favor of not printing gratuitous digits and
instead trying to make sure things fit.  Which favors not printing 300
digits if they will not have a numerical correctness effect.  But that's
just my personal preference, and I would stop short of saying that either
interpretation is wrong.

Speaking only to my personal opinions and preference from here on, and
with emphasis that I have no special authority to interpret the
meaning (nor even did I have such special authority when I was editor,
which was a technically neutral role), I would say this:

First, it seems to me that the philosophy behind ~G choosing one or the
other of the ~E or ~F formats is that "it's trying to do something 
good-looking".  And so it seems odd to me that anyone could consider
"print 300 digits" good-looking.  That makes me think the interpretation
that favors that should not be preferred if there is an interpretation
that is consistent, that favors a shorter text, and that doesn't lose
precision.

So it's not obvious to me why 1d300 is not the right thing to print here.
It seems to me that...
 Given arg = 1.0d300 = 10^300
    10^(300) <= |arg| < 10^(301)
 therefore n=301
 ee = 4
 ww = nil
 q = 1
 d = (max 1 (min 301 7)) = (max 1 7) = 7
 dd = 7-301 = -294

(If I didn't make a mistake.) Continuing...

Is 0 <= -294 <= 7 ? No.  So don't use ~F.

What I'd expect would be to use ~E.

If you assume the "d" to be passed to ~E is the original d (omitted),
this results in:
(format nil "~E" 1.0d300)
=> "9.999999999999999D+299" ; This troubles me also because it's not numerically equal.  Hopefully just an LWW bug.  I'd have expected something more like "1.0D+300" here.

If you assume the "d" to be passed is the newly computed d, it looks like
(format nil "~,7E" 1.0d300)
=> "10.0000000D+299" ; LWW 5.0.1 [???]

And yet, LWW 5.0.1 (Personal) gives
(format nil "~G" 1.0d300)
"1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000.0    "

So, personally, I'm as baffled as the next person on this.  It doesn't
do what I'd have expected in at least this one implementation.  But
then, I emphasize, that doesn't make it automatically wrong.  It's
something where the vendor might have other interpretations.

And following on this...

1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000.0
+1F++0 #| +1F++0 is single-float plus-infinity |#

But...

1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000.0d0
1.0D300

(So the presence of the "d" marker here is potentially critical.)

In sum...

(progn
  (print (list (lisp-implementation-type) (lisp-implementation-version)))
  (dolist (num '(1.0d300 1.0e300 1.0e35))
    (dolist (format-string '("~E" "~F" "~G" "~,7E" "~,7F" "~,7G"))
      (let ((numstr (format nil format-string num)))
        (format t "~&~A~6T~S  ~S  ~S" format-string num (= (read-from-string numstr) num) numstr)))))

("LispWorks Personal Edition" "5.0.1") 
~E    1.0D300  NIL  "9.999999999999999D+299"
~F    1.0D300  T  "1.0D300"
~G    1.0D300  NIL  "1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000.0    "
~,7E  1.0D300  T  "10.0000000D+299"
~,7F  1.0D300  NIL  "1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000.0000000"
~,7G  1.0D300  T  "10.0000000D+299"
~E    +1F++0 #| +1F++0 is single-float plus-infinity |#  T  "+1F++0 #| +1F++0 is single-float plus-infinity |#"
~F    +1F++0 #| +1F++0 is single-float plus-infinity |#  T  "+1F++0 #| +1F++0 is single-float plus-infinity |#"
~G    +1F++0 #| +1F++0 is single-float plus-infinity |#  T  "+1F++0 #| +1F++0 is single-float plus-infinity |#"
~,7E  +1F++0 #| +1F++0 is single-float plus-infinity |#  T  "+1F++0 #| +1F++0 is single-float plus-infinity |#"
~,7F  +1F++0 #| +1F++0 is single-float plus-infinity |#  T  "+1F++0 #| +1F++0 is single-float plus-infinity |#"
~,7G  +1F++0 #| +1F++0 is single-float plus-infinity |#  T  "+1F++0 #| +1F++0 is single-float plus-infinity |#"
~E    1.0E35  T  "1.0E+35"
~F    1.0E35  T  "1.0E35"
~G    1.0E35  NIL  "100000000000000000000000000000000000.    "
~,7E  1.0E35  T  "1.0000000E+35"
~,7F  1.0E35  T  "100000000000000000000000000000000000.0000000"
~,7G  1.0E35  T  "1.0000000E+35"
NIL

The question of whether it's reasonable to print #|...|# after certain
floats, or whether ~E and friends should copy this effect, I'll leave
for another day...  Too many other issues here.

Certainly a can of worms.

From: Raymond Toy (RT/EUS)
Subject: Re: Formatting large numbers with ~g?
Date: Fri, 01 Feb 2008 15:41:11 +0000
Message-ID: <sxdr6fwfzvs.fsf@rtp.ericsson.se>

>>>>> "Kent" == Kent M Pitman <Kent> writes:

    Kent> ···········@ericsson.com (Raymond Toy (RT/EUS)) writes:

[nice clear explanation snipped]

    Kent> I'm also not sure about the "d" here, now that I look at it, since d
    Kent> is BOTH passed as a parameter AND given a formula for computing it.
    Kent> I don't know if that works by strict side-effect, or if those were two
    Kent> different d's, one that is only used locally and not passed through to
    Kent> the call to E.

My interpretation so far is that the computed dd (whether the original
d is given or not) is used for ~F.  But for ~E, we use the original d,
not the computed one.  My rationale is that it makes the results look
like what Fortran would produce.  (Fortran requires that w and d be
given, though.)
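
As a rough illustration of that reading (a sketch only: the name
g-control-string is made up here, n and q are passed in rather than
computed, and w, e and the remaining parameters are assumed omitted,
so ee = 4 and ww is nil), the choice of directive might look like this:

```lisp
(defun g-control-string (n q &optional d)
  "Return the control string ~G would delegate to under this reading.
N is the decimal exponent from the spec, Q the digit count, and D the
d parameter as the user gave it (NIL if omitted)."
  (let* ((computed-d (or d (max q (min n 7))))  ; default d from the spec
         (dd (- computed-d n)))
    (if (<= 0 dd computed-d)
        ;; Fixed format: pass the computed dd, then pad with ee = 4 columns.
        (format nil "~~,~DF~~4@T" dd)
        ;; Exponential format: pass the ORIGINAL d, not the computed one.
        (if d
            (format nil "~~,~DE" d)
            "~E"))))
```

With n = 301 and q = 1 (the 1d300 case), an omitted d selects "~E" and
an explicit d of 7 selects "~,7E".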


    Kent> So it's not obvious to me why 1d300 is not the right thing to print here.
    Kent> It seems to me that...
    Kent>  Given arg = 1.0d300 = 10^300
    Kent>     10^(300) <= |arg| < 10^(301)
    Kent>  therefore n=301
    Kent>  ee = 4
    Kent>  ww = nil
    Kent>  q = 1
    Kent>  d = (max 1 (min 301 7)) = (max 1 7) = 7
    Kent>  dd = 7-301 = -294

    Kent> (If I didn't make a mistake.) Continuing...

    Kent> Is 0 <= -294 <= 7 ? No.  So don't use ~F.
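
Those steps can be checked mechanically.  A minimal sketch, with q
passed in explicitly since its meaning is the open question; the names
decimal-exponent and g-parameters are hypothetical, and taking n from
the exact rational value of the float is an assumption, not something
the spec spells out:

```lisp
(defun decimal-exponent (x)
  "Return the integer n with 10^(n-1) <= X < 10^n, for a positive rational X."
  (assert (plusp x))
  (let ((n 0))
    ;; Raise n until 10^n exceeds x, then lower it while x < 10^(n-1).
    (loop while (>= x (expt 10 n)) do (incf n))
    (loop while (< x (expt 10 (1- n))) do (decf n))
    n))

(defun g-parameters (arg q &optional d)
  "Return n, d, dd and whether 0 <= dd <= d holds (i.e. the ~F branch)."
  (let* ((n (decimal-exponent (abs (rational arg))))
         (d (or d (max q (min n 7))))
         (dd (- d n)))
    (values n d dd (<= 0 dd d))))

;; (g-parameters 1d300 1)   => 301, 7, -294, NIL
;; (g-parameters 1d300 301) => 301, 301, 0, T
```

With q = 1 this reproduces the figures above.  With q = 301 (counting
every digit of the integer part) d becomes 301 and dd becomes 0, which
selects the ~F branch; that seems consistent with the very long
fixed-format output some implementations produce.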

    Kent> What I'd expect would be to use ~E.

    Kent> If you assume the "d" to be passed to ~E is the original d (omitted),
    Kent> this results in:
    Kent> (format nil "~E" 1.0d300)
    Kent> => "9.999999999999999D+299" ; This troubles me also because it's not numerically equal.  Hopefully just an LWW bug.  I'd have expected something more like "1.0D+300" here.

I'm pretty sure this is a bug.  CMUCL used to print something like
this due to a roundoff error.  That has been replaced with a different
printing algorithm.

    Kent> If you assume the "d" to be passed is the newly computed d, it looks like
    Kent> (format nil "~,7d" 1.0d300)
    Kent> => "10.0000000D+299" ; LWW 5.0.1 [???]

Likewise, this is also probably a roundoff error.  I assume you didn't
really mean ~,7d but either ~,7g or ~,7e.

    Kent> And yet, LWW 5.0.1 (Personal) gives
    Kent> (format nil "~G" 1.0d300)
    Kent> "1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000.0    "

Someone tells me that an earlier version of LW used to print 1.0d+300,
but now it prints what you show above.

    Kent> So, personally, I'm as baffled as the next person on this.  It doesn't
    Kent> do what I'd have expected in at least this one implementation.  But
    Kent> then, I emphasize, that doesn't make it automatically wrong.  It's
    Kent> something where the vendor might have other interpretations.

    Kent> And following on this...

    Kent> 1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000.0
    Kent> +1F++0 #| +1F++0 is single-float plus-infinity |#

Presumably LW reads that back as a single-float, but 1e300 is too big
for a single-float, so the result is infinity.  Just curious: does
+1f++0 get read back in as a number?

    Kent> In sum...

So I think we all agree that q is really the number of significant
digits, not all the digits needed to print out the number.

Ray

From: Kent M Pitman
Subject: Re: Formatting large numbers with ~g?
Date: Fri, 01 Feb 2008 19:34:26 +0000
Message-ID: <uodb0cvy5.fsf@nhplace.com>

···········@ericsson.com (Raymond Toy (RT/EUS)) writes:

> So I think we all agree that q is really the number of significant
> digits, not all the digits needed to print out the number.

Well, no one has expressed a disagreement on what they wish for.
That's a pretty weak form of universal quantification.  It also
doesn't imply that it's a canonical opinion; I have a preferred
interpretation but I think there is legitimate wiggle room were
someone to disagree.