Help with using format control strings

From: Jon Allen Boone
Subject: Help with using format control strings
Date: Tue, 15 Oct 2002 20:16:11 +0000
Message-ID: <m3k7kj8nlg.fsf@validus.delamancha.org>

Folks,

    I'm curious about the On-Line HyperSpec section which describes
  the ~R format control string.  The URL is:

   http://www.lispworks.com/reference/HyperSpec/Body/22_cba.htm

    It describes the full form of the control string as:

    ~radix,mincol,padchar,commachar,comma-intervalR

    although the ~B format control string, which has the radix fixed
    at 2, is:

    ~mincol,padchar,commachar,comma-intervalB

    The examples given [at the URL I included] include:

    (format nil "~19,0,' ,4:B" 3333) => "0000 1101 0000 0101"

    Every CL implementation I've tried fails on this example
  [apparently because the 0 needs to be single-quoted to make it a
  character?].  I tried CLISP, CMUCL, LispWorks(win32), ACL(both win32
  and Linux) and CormanLisp.

    I take it that the format control should do the following:

    19 minimum columns
    0 is the pad char
    <SPACE> is the commachar
    comma-chars should occur every 4 bits

    Now, if I replace 0 with '0 it produces the following output on
    all of the above mentioned implementations:

    (format nil "~19,'0,' ,4:B" 3333)
    "000001101 0000 0101"

    Is the pad-char output [the first 4 leading '0s] not supposed to
  be comma'd every 4 characters as the rest of the output is [and as
  the example indicates]?

    This came up in some exercises I'm working through in a new book,
  _Hacker's Delight_ by Henry S. Warren, Jr. [ISBN 0-201-91465-4]
  which is in spirit a follow-on to the 1972 MIT Research memo HAKMEM
  [according to Guy L. Steele, Jr., who wrote the Foreword].

    Rather than just memorize the formulas, I am attempting to
  convince myself that they are correct by looking at the binary
  representation of numbers and performing the operations in my head.
  So, I wanted to use 32-bit padded binary representation to print out
  the numbers.

    On a related note, I'm confused about how to go about getting a
  binary representation of a negative number.  When I hand a negative
  value to the format control above, it just sticks a negative sign in
  front of the binary digits that represent the absolute value of the
  number in question - like this:

  (format nil "~19,'0,' ,4:B" -3333)
  "0000-1101 0000 0101"

    Can someone steer me in the right direction?

-jon
-- 
------------------
Jon Allen Boone
········@delamancha.org

From: Erik Naggum
Subject: Re: Help with using format control strings
Date: Wed, 16 Oct 2002 02:34:11 +0000
Message-ID: <3243724451319802@naggum.no>

* Jon Allen Boone
| Every CL implementation I've tried fails on this example [apparently
| because the 0 needs to be single-quoted to make it a character?].

  That example is one of the very small number of actual bugs in the
  standard.  Note, however, that examples are not authoritative in any way
  and are not formally part of the standard.  This is an editorial decision
  I support wholeheartedly.  Examples are generally a bad thing outside of
  carefully controlled educational settings (as tutorials) because people
  just as easily learn wrong things from them as the intended purpose if
  they are not sufficiently guided in what they are supposed to learn.

  In this case, you have not read the specification, but have "believed"
  the example.  There is really nothing more important to say on this than
  that this is the wrong order of preference.  The right order of preference
  in a specification and especially a standard is the normative clauses,
  then the normative clauses, then the normative clauses again just to make
  sure, and only if you got the point from the normative clauses, you can
  look at the non-normative clauses, in which you should report anything
  not immediately reconcilable with the normative clauses as a mistake in
  the standard, not with your understanding of it.

| This came up in some exercises I'm working through in a new book,
| _Hacker's Delight_ by Henry S. Warren, Jr. [ISBN 0-201-91465-4] which is
| in spirit a follow-on to the 1972 MIT Research memo HAKMEM [according to
| Guy L. Steele, Jr., who wrote the Foreword].

  Thanks for the reference.  I had somehow missed this book.

| Rather than just memorize the formulas, I am attempting to convince
| myself that they are correct by looking at the binary representation of
| numbers and performing the operations in my head.  So, I wanted to use
| 32-bit padded binary representation to print out the numbers.

  Since you are on the topic of cute hacks, try

(format <stream> "~41,'0,' ,4:B" (dpb <integer> (byte 32 0) (ash 1 32)))

  This also solves your negative 32-bit integer printing problem neatly.
  Exercise: Explain why.  :)
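
  One possible reading of the hack, as a minimal sketch: DPB deposits
  the low 32 bits of the integer into 2^32, so bit 32 acts as a guard
  bit and all 32 bits below it are printed as real digits.  The output
  is therefore always 41 characters (33 digits plus 8 separators) and
  the pad character never comes into play.  Negative integers work
  because DPB treats them as two's complement.  The name BINARY-32 is
  hypothetical, and dropping the leading "1 " is an assumption about
  the output one wants, not part of the original expression.

```lisp
;; Hypothetical helper built on the DPB expression above.
(defun binary-32 (integer)
  "Return the low 32 bits of INTEGER as binary digits in groups of four."
  (let ((guarded (dpb integer (byte 32 0) (ash 1 32)))) ; 2^32 + low 32 bits
    ;; The guard bit prints as a leading "1 "; SUBSEQ drops those two
    ;; characters so only the 32 payload bits remain.
    (subseq (format nil "~,,' ,4:B" guarded) 2)))

(binary-32 3333)   ; => "0000 0000 0000 0000 0000 1101 0000 0101"
(binary-32 -3333)  ; => "1111 1111 1111 1111 1111 0010 1111 1011"
```

  Integers that do not fit in 32 bits are silently truncated to their
  low 32 bits.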

-- 
Erik Naggum, Oslo, Norway

Act from reason, and failure makes you rethink and study harder.
Act from faith, and failure makes you blame someone and push harder.

From: Thomas A. Russ
Subject: Re: Help with using format control strings
Date: Thu, 17 Oct 2002 23:39:13 +0000
Message-ID: <ymiy98wbppa.fsf@sevak.isi.edu>

Jon Allen Boone <········@delamancha.org> writes:

>     Now, if I replace 0 with '0 it produces the following output on
>     all of the above mentioned implementations:
> 
>     (format nil "~19,'0,' ,4:B" 3333)
>     "000001101 0000 0101"
> 
>     Is the pad-char output [the first 4 leading '0s] not supposed to
>   be comma'd every 4 characters as the rest of the output is [and as
>   the example indicates]?

Apparently the implementations all add the pad character to the output
to fill it in without applying the comma rule.
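
The behaviour can be imitated in two steps, as a rough sketch: group
the digits first, then left-pad the finished string with ~A.  The
helper name GROUP-THEN-PAD is hypothetical and is only meant to mirror
the output reported above, not to describe how any implementation is
actually written.

```lisp
;; Hypothetical helper: commas first, padding second.
(defun group-then-pad (integer mincol padchar)
  (let ((grouped (format nil "~,,' ,4:B" integer)))  ; digits and commas only
    ;; ~mincol,colinc,minpad,padchar@A pads on the left of the string.
    (format nil "~v,,,v@A" mincol padchar grouped)))

(group-then-pad 3333 19 #\0)  ; => "000001101 0000 0101"
(group-then-pad 3333 19 #\*)  ; => "*****1101 0000 0101"
```

The first result is the same string that was reported for
(format nil "~19,'0,' ,4:B" 3333).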

I think a while back (a fairly long while), this was discussed here.  I
think the final sense was that this was a fairly reasonable practice,
since otherwise Lisp would need to do rather involved inference to
figure out what was reasonable.  This is mainly because you would really
only want to have the padchar following the comma convention if it were
a legal digit character.  If one were to choose * instead of 0

  (format nil "~19,'*,' ,4:B" 3333)

   "*****1101 0000 0101"  is probably nicer than
   "**** 1101 0000 0101"

This is even more clearly so with the following example:

  (format nil "~19,'*,' ,4:B" 1333)
  "******101 0011 0101"  versus
  "**** *101 0011 0101"

It really only makes sense to apply the comma convention to the padding
when one is trying to zero pad a number.  Now I suppose that one could
make a case for treating zero specially, but that wasn't done.

And the : and @ flags were already taken :<

>   On a related note, I'm confused about how to go about getting a
>   binary representation of a negative number.  When I hand a negative
>   value to the format control above, it just sticks a negative sign in
>   front of the binary digits that represent the absolute value of the
>   number in question - like this:
> 
>   (format nil "~19,'0,' ,4:B" -3333)
>   "0000-1101 0000 0101"
> 
>     Can someone steer me in the right direction?

Well, that is because there isn't really a 2's complement way of storing
arbitrary precision integers.

I think that you really need to come up with your own specialized
formatting routines for handling this.
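
A minimal sketch of such a routine, assuming a fixed field width is
acceptable: LDB extracts the low bits of an integer, and the bit
operations treat a negative integer as two's complement with an
unbounded run of leading 1 bits, so a fixed-width slice is well
defined.  The name TWOS-COMPLEMENT-BITS and the default width of 32
are hypothetical choices.

```lisp
;; Hypothetical helper: INTEGER as a WIDTH-bit two's-complement field.
(defun twos-complement-bits (integer &optional (width 32))
  ;; (ldb (byte width 0) integer) is always non-negative, so ~B never
  ;; prints a sign; zero padding is safe because no commas are used.
  (format nil "~v,'0B" width (ldb (byte width 0) integer)))

(twos-complement-bits -3333)    ; => "11111111111111111111001011111011"
(twos-complement-bits 3333 16)  ; => "0000110100000101"
```

Values that need more than WIDTH bits are truncated without warning,
so the caller has to pick a width that fits the data.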

-- 
Thomas A. Russ,  USC/Information Sciences Institute          ···@isi.edu