From: William Bland
Subject: Opposite of FORMAT?
Date: 
Message-ID: <pan.2004.10.08.02.06.00.496284@abstractnonsense.com>
One of the recent threads on CL's FORMAT vs. C's printf got me thinking: 
In the standard C library you have a kind of opposite to printf, in the
form of scanf.  Has anyone written something like this in CL, to act as an
opposite to FORMAT?

For example I'm thinking you would be able to do

(unformat stream "~@R" var1)

to read a roman numeral from stream into var1.  Has anyone done it?  Are
there hidden (or obvious!) problems?

Cheers,
	Bill.
-- 
"If you give someone Fortran, he has Fortran. If you give someone Lisp,
he has any language he pleases." -- Guy Steele
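
A minimal sketch of what the inverse of ~@R might look like, using only standard Common Lisp. PARSE-ROMAN is a hypothetical name, not part of the standard or of any code posted in this thread, and it reads from a string rather than a stream. It leans on FORMAT itself to reject malformed numerals by printing the result back and comparing.

```lisp
(defun parse-roman (string)
  "Return the integer denoted by the Roman numeral STRING.
Signals an error unless FORMAT prints the result back as STRING."
  (let ((digit-values '((#\I . 1) (#\V . 5) (#\X . 10) (#\L . 50)
                        (#\C . 100) (#\D . 500) (#\M . 1000)))
        (total 0)
        (previous 0))
    ;; Scan right to left: a digit smaller than its right neighbour
    ;; is subtracted, anything else is added.
    (loop for ch across (reverse string)
          for value = (or (cdr (assoc (char-upcase ch) digit-values))
                          (error "~S is not a Roman digit" ch))
          do (if (< value previous)
                 (decf total value)
                 (incf total value))
             (setf previous value))
    ;; Round-trip through FORMAT to reject input such as "IIII" or "".
    (unless (and (plusp total)
                 (string-equal string (format nil "~@R" total)))
      (error "~S is not a well-formed Roman numeral" string))
    total))
```

With this definition, (parse-roman "MCMXCIV") evaluates to 1994. The range of numbers accepted is whatever range the implementation's ~@R supports.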

From: Jeff
Subject: Re: Opposite of FORMAT?
Date: 
Message-ID: <Mep9d.142767$wV.61794@attbi_s54>
William Bland wrote:

> One of the recent threads on CL's FORMAT vs. C's printf got me
> thinking:  In the standard C library you have a kind of opposite to
> printf, in the form of scanf.  Has anyone written something like this
> in CL, to act as an opposite to FORMAT?

Not to sound rude, but isn't that what READ-FROM-STRING is for? Why
bother writing a parser when one is already built into the language for
you?

Jeff
From: Tim Bradshaw
Subject: Re: Opposite of FORMAT?
Date: 
Message-ID: <1097228607.046943.270720@f14g2000cwb.googlegroups.com>
Jeff wrote:
> William Bland wrote:
>
> > One of the recent threads on CL's FORMAT vs. C's printf got me
> > thinking:  In the standard C library you have a kind of opposite to
> > printf, in the form of scanf.  Has anyone written something like this
> > in CL, to act as an opposite to FORMAT?
>
> Not to sound rude, but isn't that what READ-FROM-STRING is for? Why
> bother writing a parser when one is already built into the language for
> you?
> 
> Jeff
From: William Bland
Subject: Re: Opposite of FORMAT?
Date: 
Message-ID: <pan.2004.10.08.06.58.26.642762@abstractnonsense.com>
On Fri, 08 Oct 2004 05:05:48 +0000, Jeff wrote:

> William Bland wrote:
> 
>> One of the recent threads on CL's FORMAT vs. C's printf got me
>> thinking:  In the standard C library you have a kind of opposite to
>> printf, in the form of scanf.  Has anyone written something like this
>> in CL, to act as an opposite to FORMAT?
> 
> Not to sound rude, but isn't that what READ-FROM-STRING is for? Why
> bother writing a parser when one is already built into the language for
> you?

Not to sound rude, but READ-FROM-STRING doesn't do what I was talking
about.  It doesn't read roman numerals, to take an obvious example.
Perhaps more usefully I was thinking the hypothetical function would be
able to do things like:

(unformat "This is the second time I've explained it"
          "This is the ~:R time I've explained it" n)
=> n=2

(unformat "This is the foo time I've explained it"
          "This is the ~:R time I've explained it" n)
=> error

(unformat "Serial number: 49-96-02-D2"
          "Serial number: ~,,'-,2:X" serial)
=> serial=1234567890

(unformat "|tom|dick|harry|"
          "|~{~A|~}" list)
=> list=(tom dick harry)

Note that the first argument to FORMAT can be a string or a stream.  The
hypothetical UNFORMAT is just the same.

I don't claim the function would be easy to write (for me at least,
being just a CL newbie), but I do think it would be useful.  And it
clearly does things that READ-FROM-STRING doesn't.

Cheers,
	Bill.
-- 
"If you give someone Fortran, he has Fortran. If you give someone Lisp,
he has any language he pleases." -- Guy Steele
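
One naive way to get the behaviour of the first two examples is to run FORMAT forwards over candidate integers until the output matches. This is only a sketch under that assumption: UNFORMAT-ORDINAL and its LIMIT keyword are hypothetical names, and a brute-force search is obviously not how a general UNFORMAT would work.

```lisp
(defun unformat-ordinal (word &key (limit 1000))
  "Return the positive integer below LIMIT that ~:R prints as WORD.
Signals an error if no such integer exists."
  (or (loop for n from 1 below limit
            ;; STRING-EQUAL ignores case, so "Second" also matches.
            when (string-equal word (format nil "~:R" n))
              return n)
      (error "~S is not an English ordinal below ~D" word limit)))
```

Here (unformat-ordinal "second") evaluates to 2, while (unformat-ordinal "foo") signals an error, as in the examples.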
From: Jock Cooper
Subject: Re: Opposite of FORMAT?
Date: 
Message-ID: <m3brfd16hh.fsf@jcooper02.sagepub.com>
William Bland <·······@abstractnonsense.com> writes:

> On Fri, 08 Oct 2004 05:05:48 +0000, Jeff wrote:
> 
> > William Bland wrote:
> > 
> >> One of the recent threads on CL's FORMAT vs. C's printf got me
> >> thinking:  In the standard C library you have a kind of opposite to
> >> printf, in the form of scanf.  Has anyone written something like this
> >> in CL, to act as an opposite to FORMAT?
> > 
> > Not to sound rude, but isn't that what READ-FROM-STRING is for? Why
> > bother writing a parser when one is already built into the language for
> > you?
> 
> Not to sound rude, but READ-FROM-STRING doesn't do what I was talking
> about.  It doesn't read roman numerals, to take an obvious example.
> Perhaps more usefully I was thinking the hypothetical function would be
> able to do things like:
> 
> (unformat "This is the second time I've explained it"
>           "This is the ~:R time I've explained it" n)
> => n=2
> 
> (unformat "This is the foo time I've explained it"
>           "This is the ~:R time I've explained it" n)
> => error
> 
> (unformat "Serial number: 49-96-02-D2"
>           "Serial number: ~,,'-,2:X" serial)
> => serial=1234567890
> 
> (unformat "|tom|dick|harry|"
>           "|~{~A|~}" list)
> => list=(tom dick harry)
> 
> Note that the first argument to FORMAT can be a string or a stream.  The
> hypothetical UNFORMAT is just the same.
> 
> I don't claim the function would be easy to write (for me at least,
> being just a CL newbie), but I do think it would be useful.  And it
> clearly does things that READ-FROM-STRING doesn't.
> 

I think you would be better off using a regexp package for stuff like this.
Yes you'd get something like ("second" 12 18) back instead of '2' but I 
think the code would be more readable for nontrivial cases.

Also, FORMAT's operators are much more powerful than *printf/*scanf - right
off the bat you'll have to chuck a bunch of them that just aren't simple 
output codes like ~[ ~] ~// etc.
From: William Bland
Subject: Re: Opposite of FORMAT?
Date: 
Message-ID: <pan.2004.10.08.20.13.53.262409@abstractnonsense.com>
On Fri, 08 Oct 2004 10:10:02 -0700, Jock Cooper wrote:

> William Bland <·······@abstractnonsense.com> writes:
>> 
>> Perhaps more usefully I was thinking the hypothetical function would be
>> able to do things like:
>> 
>> (unformat "This is the second time I've explained it"
>>           "This is the ~:R time I've explained it" n)
>> => n=2
>> 
>> (unformat "This is the foo time I've explained it"
>>           "This is the ~:R time I've explained it" n)
>> => error
>> 
>> (unformat "Serial number: 49-96-02-D2"
>>           "Serial number: ~,,'-,2:X" serial)
>> => serial=1234567890
>> 
>> (unformat "|tom|dick|harry|"
>>           "|~{~A|~}" list)
>> => list=(tom dick harry)
> 
> I think you would be better off using a regexp package for stuff like
> this. Yes you'd get something like ("second" 12 18) back instead of '2'
> but I think the code would be more readable for nontrivial cases.

I'm not convinced.  I don't think regex-based code for checking that
"Serial number: 49-96-02-D2" is well-formed, and then returning the
integer 1234567890, would be as readable as in the example above.

> Also, FORMAT's operators are much more powerful than *printf/*scanf - right
> off the bat you'll have to chuck a bunch of them that just aren't simple 
> output codes like ~[ ~] ~// etc.

I don't see why you would get rid of ~[~] - could be quite useful:

(unformat "I am a man with a plan"
          "I am a ~[man~;car~;duck~] with a ~[hat~;coat~;plan~]"
          thing1 thing2)
=> thing1 = 0, thing2 = 2

Of course there would be some things, like ~//, that wouldn't make sense,
but it does seem to me that they are few and that a lot of FORMAT would
make sense... maybe it's time I just went ahead and tried implementing it
(after I have a look at the format-setf code Lars posted a link to of
course).  If you're right I'll find out it's useless and regex is much
better, and I will still have learned something in the process, so it
won't be wasted effort...  Hmm, maybe I'll do it.

Cheers,
	Bill.
-- 
"If you give someone Fortran, he has Fortran. If you give someone Lisp,
he has any language he pleases." -- Guy Steele
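
A rough sketch of how the ~[...~] case might be inverted for a single directive, given the list of clauses and a position in the input. MATCH-CHOICE is a hypothetical helper, and a real UNFORMAT would still have to parse the control string and cope with clauses that are prefixes of one another.

```lisp
(defun match-choice (string start choices)
  "Find the first element of CHOICES that occurs in STRING at START.
Returns its index and the position just past it, or NIL if none match."
  (loop for choice in choices
        for index from 0
        for end = (+ start (length choice))
        when (and (<= end (length string))
                  (string= choice string :start2 start :end2 end))
          return (values index end)))

;; "I am a " is 7 characters long, so the first choice starts at 7.
(match-choice "I am a man with a plan" 7 '("man" "car" "duck"))
;; " with a " is 8 more characters, so the second choice starts at 18.
(match-choice "I am a man with a plan" 18 '("hat" "coat" "plan"))
```

The two calls return the indices 0 and 2, matching thing1 and thing2 in the example.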
From: Jock Cooper
Subject: Re: Opposite of FORMAT?
Date: 
Message-ID: <m37jq02a8o.fsf@jcooper02.sagepub.com>
William Bland <·······@abstractnonsense.com> writes:

> On Fri, 08 Oct 2004 10:10:02 -0700, Jock Cooper wrote:
> 
> > William Bland <·······@abstractnonsense.com> writes:
> >> 
> >> Perhaps more usefully I was thinking the hypothetical function would be
> >> able to do things like:
> >> 
> >> (unformat "This is the second time I've explained it"
> >>           "This is the ~:R time I've explained it" n)
> >> => n=2
> >> 
> >> (unformat "This is the foo time I've explained it"
> >>           "This is the ~:R time I've explained it" n)
> >> => error
> >> 
> >> (unformat "Serial number: 49-96-02-D2"
> >>           "Serial number: ~,,'-,2:X" serial)
> >> => serial=1234567890
> >> 
> >> (unformat "|tom|dick|harry|"
> >>           "|~{~A|~}" list)
> >> => list=(tom dick harry)
> > 
> > I think you would be better off using a regexp package for stuff like
> > this. Yes you'd get something like ("second" 12 18) back instead of '2'
> > but I think the code would be more readable for nontrivial cases.
> 
> I'm not convinced.  I don't think regex-based code for checking that
> "Serial number: 49-96-02-D2" is well-formed, and then returning the
> integer 1234567890, would be as readable as in the example above.

Well, certainly using a regexp package would require you to post
process the result but I think you'd have more power in specifying the
pattern.  In this example any variation from using numbers (decimal or
hex) will exceed FORMAT's power to match.  Then you will be wishing
you could say things like "match any character in [0-9A-Z]", which
brings you back to regexp type stuff.

Something along these lines is not so bad:

(defvar *sn-pat* "^Serial number: ([0-9a-fA-F-]+)$")
(let ((match (match-regexp *sn-pat* input-string)))
  (when match 
    (read-from-string (s+ "#x" (delete #\- (get-sub-match match 0))))))
; assuming s+ is shorthand for concatenate 'string
; and get-sub-match is defined to get the nth substr match
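
For comparison, roughly the same check can be sketched without a regexp package, using only standard functions. PARSE-SERIAL-NUMBER is a hypothetical name. Like the pattern above, it accepts hyphens anywhere among the hex digits rather than insisting on groups of two.

```lisp
(defun parse-serial-number (line)
  "Return the integer encoded in LINE, or NIL if LINE is not the text
\"Serial number: \" followed by hex digits and hyphens."
  (let* ((prefix "Serial number: ")
         (start (length prefix)))
    (when (and (> (length line) start)
               (string= prefix line :end2 start))
      ;; Drop the separators, then insist that only hex digits remain.
      (let ((digits (remove #\- (subseq line start))))
        (and (plusp (length digits))
             (every (lambda (ch) (digit-char-p ch 16)) digits)
             (parse-integer digits :radix 16))))))
```

(parse-serial-number "Serial number: 49-96-02-D2") evaluates to 1234567890, and (format nil "~,,'-,2:X" 1234567890) prints the same serial back.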

> 
> > Also, FORMAT's operators are much more powerful than *printf/*scanf - right
> > off the bat you'll have to chuck a bunch of them that just aren't simple 
> > output codes like ~[ ~] ~// etc.
> 
> I don't see why you would get rid of ~[~] - could be quite useful:
> 
> (unformat "I am a man with a plan"
>           "I am a ~[man~;car~;duck~] with a ~[hat~;coat~;plan~]"
>           thing1 thing2)
> => thing1 = 0, thing2 = 2

To me "~[man~;car~;duck~]" is just a clunkier way of writing the regexp
"(man|car|duck)"; granted, with regexp you get the string back, not an
index.  This may not be bad anyway.  I think that regexp and the
format codes here are in essence doing the same thing, that is,
specifying an expected/allowed pattern.
 
> Of course there would be some things, like ~//, that wouldn't make sense,
> but it does seem to me that they are few and that a lot of FORMAT would
> make sense... maybe it's time I just went ahead and tried implementing it
> (after I have a look at the format-setf code Lars posted a link to of
> course).  If you're right I'll find out it's useless and regex is much
> better, and I will still have learned something in the process, so it
> won't be wasted effort...  Hmm, maybe I'll do it.
> 

Yes, it definitely would be fun and informative to write and hack
around with.
From: Christophe Rhodes
Subject: Re: Opposite of FORMAT?
Date: 
Message-ID: <sqhdp5m09m.fsf@cam.ac.uk>
William Bland <·······@abstractnonsense.com> writes:

> Of course there would be some things, like ~//, that wouldn't make sense,
> but it does seem to me that they are few and that a lot of FORMAT would
> make sense... maybe it's time I just went ahead and tried implementing it
> (after I have a look at the format-setf code Lars posted a link to of
> course).  If you're right I'll find out it's useless and regex is much
> better, and I will still have learned something in the process, so it
> won't be wasted effort...  Hmm, maybe I'll do it.

Good luck.  I enjoyed writing the code I wrote that many years ago,
but I wouldn't recommend its wide dissemination without quite a lot of
work: because of FORMAT's flexibility and graceful degradation, things
such as (setf (format nil "~2,'0D" x) "frob") make conceptual sense;
there is of course the trivially uninvertible (format nil "~A~A" x y),
and there are many many other edge cases.  The reason that explicit
parsers (of which regexes are a class) are favoured is probably
because they at least have the virtue of being relatively unambiguous.

Christophe
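
The "~A~A" case can be seen directly at the REPL. This is only an illustration of the ambiguity with made-up arguments: two different argument lists give the same output, so no inverse can tell which pair was intended.

```lisp
(let ((one (format nil "~A~A" "ab" "c"))
      (two (format nil "~A~A" "a" "bc")))
  ;; Both calls produce "abc", so the split point is lost.
  (list one two (string= one two)))
```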
From: Jeff
Subject: Re: Opposite of FORMAT?
Date: 
Message-ID: <tUw9d.211937$D%.5489@attbi_s51>
William Bland wrote:

> On Fri, 08 Oct 2004 05:05:48 +0000, Jeff wrote:
> 
> > William Bland wrote:
> > 
> >> One of the recent threads on CL's FORMAT vs. C's printf got me
> >> thinking:  In the standard C library you have a kind of opposite to
> >> printf, in the form of scanf.  Has anyone written something like
> >> this in CL, to act as an opposite to FORMAT?
> > 
> > Not to sound rude, but isn't that what READ-FROM-STRING is for? Why
> > bother writing a parser when one is already built into the language
> > for you?
> 
> Not to sound rude, but READ-FROM-STRING doesn't do what I was talking
> about.  It doesn't read roman numerals, to take an obvious example.
> Perhaps more usefully I was thinking the hypothetical function would
> be able to do things like:
 
Well, at least neither of us is being rude... ;)

What I mean to imply is that I think most people tend not to "use" what
they have at their disposal. Yes, sometimes we are forced to deal with
data that isn't of our choosing -- in which case we'll need a regular
expression parser or something else. But I find that, more often than
not, we paint ourselves into corners.

What do you need this for? Is the roman numerals example real life? If
not, then why not use what's already there? If you later needed
something to handle roman numerals, just make a new reader macro to
handle that.

The libc (scan|print)f functions arose because C doesn't have a parser
and programmers needed one. But even they have their limits. They don't
parse roman numerals, and their biggest drawback is that they can't be
extended. You can extend the lisp reader.

If you really needed to parse arbitrary data from a string, wouldn't it
be just as easy to use a regular expression?

Jeff
From: William Bland
Subject: Re: Opposite of FORMAT?
Date: 
Message-ID: <pan.2004.10.08.14.52.52.80287@abstractnonsense.com>
On Fri, 08 Oct 2004 13:48:09 +0000, Jeff wrote:

> William Bland wrote:
> 
>> On Fri, 08 Oct 2004 05:05:48 +0000, Jeff wrote:
>> 
>> > William Bland wrote:
>> > 
>> >> One of the recent threads on CL's FORMAT vs. C's printf got me
>> >> thinking:  In the standard C library you have a kind of opposite to
>> >> printf, in the form of scanf.  Has anyone written something like
>> >> this in CL, to act as an opposite to FORMAT?
>> > 
>> > Not to sound rude, but isn't that what READ-FROM-STRING is for? Why
>> > bother writing a parser when one is already built into the language
>> > for you?
>> 
>> Not to sound rude, but READ-FROM-STRING doesn't do what I was talking
>> about.  It doesn't read roman numerals, to take an obvious example.
>> Perhaps more usefully I was thinking the hypothetical function would
>> be able to do things like:
>  
> Well, at least neither of us is being rude... ;)

:-)

> What do you need this for? Is the roman numerals example real life? If
> not, then why not use what's already there? If you later needed
> something to handle roman numerals, just make a new reader macro to
> handle that.

No, being able to read roman numerals isn't very useful - just like
being able to write them with FORMAT isn't very useful most of the time.
But I do think the other examples I gave are useful.

> If you really needed to parse arbitrary data from a string, wouldn't it
> be just as easy to use a regular expression?

Sure.  And then you have to learn *two* sub-languages:  FORMAT for output,
and whatever regular expression language you choose for input.  I know
some regex languages already, but I still think it would be nicer if I
only had to know *one* sub-language and could use it both for output and
for input.

Cheers,
	Bill.
-- 
"If you give someone Fortran, he has Fortran. If you give someone Lisp,
he has any language he pleases." -- Guy Steele
From: Pascal Bourguignon
Subject: Re: Opposite of FORMAT?
Date: 
Message-ID: <878yahzx1k.fsf@thalassa.informatimago.com>
William Bland <·······@abstractnonsense.com> writes:

> One of the recent threads on CL's FORMAT vs. C's printf got me thinking: 
> In the standard C library you have a kind of opposite to printf, in the
> form of scanf.  Has anyone written something like this in CL, to act as an
> opposite to FORMAT?
> 
> For example I'm thinking you would be able to do
> 
> (unformat stream "~@R" var1)
> 
> to read a roman numeral from stream into var1.  Has anyone done it?  Are
> there hidden (or obvious!) problems?

Check:

http://www.google.com/groups?q=string-to-number+group:comp.lang.lisp+author:Pascal+author:Bourguignon&hl=en&lr=&selm=87y8vuyrgi.fsf%40thalassa.informatimago.com&rnum=1

You can easily "improve" from this...

Also, check:

http://www.lispworks.com/reference/HyperSpec/Body/v_pr_rda.htm
http://www.lispworks.com/reference/HyperSpec/Body/m_pr_unr.htm

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/

Voting Democrat or Republican is like choosing a cabin in the Titanic.
From: Lars Brinkhoff
Subject: Re: Opposite of FORMAT?
Date: 
Message-ID: <85vfdlbual.fsf@junk.nocrew.org>
William Bland <·······@abstractnonsense.com> writes:
> (unformat stream "~@R" var1)
> to read a roman numeral from stream into var1.

http://www-jcsu.jesus.cam.ac.uk/~csr21/format-setf.lisp

-- 
Lars Brinkhoff,         Services for Unix, Linux, GCC, HTTP
Brinkhoff Consulting    http://www.brinkhoff.se/