From: B.B.
Subject: harder questions from lisp newbie
Date: 
Message-ID: <DoNotSpamthegoat4-9B12B9.18375011042004@library.airnews.net>
   I have tab-delimited data files all over the place.  I'd like to 
build some of my experimental programs around reading them in, doing 
whatever processing, and spitting new tab-delimited files back out.  The 
hangup is that these files have a mix of line termination schemes.  Some 
terminate with a lf (#\Linefeed), some use a cr (#\Return), and others 
use the crlf pair.
   Is there such a thing as readline-preserving-whitespace (presumably a 
function I could tell to read into a string until it sees either a cr or 
nl character, leaving that character on the input) or at least a neat 
way of getting that effect?
   I attempted to build one from parts, but it's broken and I do not 
understand why.

(defun readline-funky-terminals (&optional (stream t))
  (loop for Line = (make-string 1000)
        and C = (read-char stream) then (read-char stream)
        and place = 0 then (1+ place)
        do (print C) ;debug
           (case C
             (#\Linefeed
               (print 'linefeed)                     ;debug
               (return (values Line #\Linefeed)))
             (#\Return
               (print 'return)                       ;debug
               (return (values Line #\Return)))
             (otherwise
               (format t "setting ~W at ~A" C place) ;debug
               (print (setf (elt Line place) C))))   ;debug!!
        until  (> place 1000)
        finally
          (print 'crap)                              ;debug
          (values Line #\Newline)))

Behavior ought to be that it will read characters from a stream until 
seeing a newline or carriage return, at which point it will return two 
values: a sequence (string type) containing all the prior characters and 
the terminal character.  Aborts and returns a #\newline if the sequence 
gets full.
   Here's what has me confused:  at the end of a line from a file I got 
this output:

#\o setting #\o at 62
#\o 
#\p setting #\p at 63
#\p 
#\Newline 
LINEFEED 
"" ;
#\Newline

   The CASE saw a linefeed, but everything else wants to call it a 
newline.  Is this due to an automatic translation the printer does when 
showing all of this?  In other words, are those #\Newlines really 
#\Linefeeds?  If so, can I suppress that behavior?
   Also, why is the string printed out empty?  Am I abusing sequences?  
I figure I am since I'm treating them just like I would C strings.

   Other questions I haven't had time to experiment much with yet.  How 
do I get format to print a real-life tab character without passing it as 
an argument?  I'd like to print records back out in tab-delimited 
fashion, but I don't want to double the number of arguments I pass just 
to get tabbed output.
   With regards to calling to C code, I've found 
http://clisp.sourceforge.net/impnotes.html#affi but is there a page 
anywhere that gives more examples?  And is there a way for C programs to 
call Lisp programs inside clisp?  Or is there a lightweight lisp I could 
embed in a C program?
   In the above function, the error handling is crap.  What I'd prefer 
would be a buffer object I can dynamically grow.  Is there a class like 
that available in a library I can play with?  Or is there a way I can 
get that effect with built-ins without much complexity?
   Any word on the sanity of simply having this function throw an error if 
the fixed-size buffer overruns?  I suppose I could let it go and fall 
back on elt's error, but that seems ugly.

-- 
B.B.           --I am not a goat!       thegoat4 at airmail.net
    Fire the stupid--Vote.

From: Adam Warner
Subject: Re: harder questions from lisp newbie
Date: 
Message-ID: <pan.2004.04.12.00.05.47.655719@consulting.net.nz>
Hi B.B.,

> I have tab-delimited data files all over the place.  I'd like to build
> some of my experimental programs around reading them in, doing whatever
> processing, and spitting new tab-delimited files back out.  The hangup
> is that these files have a mix of line termination schemes.  Some
> terminate with a lf (#\Linefeed), some use a cr (#\Return), and others
> use the crlf pair.

You mentioned CLISP later in your post. There is no way to read the
distinction using a character stream in CLISP:
<http://clisp.sourceforge.net/impnotes.html#clhs-newline>

To distinguish the terminators you will need a binary stream. Note that
CLISP prohibits setting *terminal-io* to a binary stream:
<http://www.geocrawler.com/mail/msg.php3?msg_id=6008951&list=1124>
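
A minimal sketch of the binary-stream approach, assuming the files are
ASCII or Latin-1 so that each byte maps to one character. The function
names are hypothetical, not part of CLISP or any library. The file is
read as bytes, and the splitter records which terminator ended each line:

```lisp
(defun read-file-octets (path)
  "Return the contents of the file at PATH as a vector of bytes."
  (with-open-file (in path :element-type '(unsigned-byte 8))
    (let ((octets (make-array (file-length in)
                              :element-type '(unsigned-byte 8))))
      (read-sequence octets in)
      octets)))

(defun split-octets-into-lines (octets)
  "Split OCTETS into a list of (LINE . TERMINATOR) pairs.
TERMINATOR is :LF, :CR, :CRLF, or :EOF for a final unterminated line."
  (let ((lines '())
        (start 0)
        (i 0)
        (n (length octets)))
    (flet ((emit (end terminator)
             ;; CODE-CHAR on raw bytes assumes ASCII or Latin-1 data.
             (push (cons (map 'string #'code-char (subseq octets start end))
                         terminator)
                   lines)))
      (loop while (< i n)
            do (let ((byte (aref octets i)))
                 (cond ((= byte 10)               ; LF
                        (emit i :lf)
                        (setf i (1+ i)
                              start i))
                       ((= byte 13)               ; CR, maybe followed by LF
                        (cond ((and (< (1+ i) n)
                                    (= (aref octets (1+ i)) 10))
                               (emit i :crlf)
                               (setf i (+ i 2)))
                              (t
                               (emit i :cr)
                               (setf i (1+ i))))
                        (setf start i))
                       (t (incf i)))))
      ;; Text after the last terminator, if any.
      (when (< start n)
        (emit n :eof)))
    (nreverse lines)))
```

Usage would look like (split-octets-into-lines (read-file-octets "data.txt")),
where "data.txt" is a placeholder file name. Reading the whole file first
avoids the need to peek at the byte after a CR, which binary streams do
not offer portably.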

Regards,
Adam
From: Kenny Tilton
Subject: Re: harder questions from lisp newbie
Date: 
Message-ID: <pqwec.25723$Nn4.4982952@twister.nyc.rr.com>
B.B. wrote:
>    I have tab-delimited data files all over the place.  I'd like to 
> build some of my experimental programs around reading them in, doing 
> whatever processing, and spitting new tab-delimited files back out.  The 
> hangup is that these files have a mix of line termination schemes.  Some 
> terminate with a lf (#\Linefeed), some use a cr (#\Return), and others 
> use the crlf pair.
>    Is there such a thing as readline-preserving-whitespace (presumably a 
> function I could tell to read into a string until it sees either a cr or 
> nl character, leaving that character on the input) or at least a neat 
> way of getting that effect?
>    I attempted to build one from parts, but it's broken and I do not 
> understand why.
> 
> (defun readline-funky-terminals (&optional (stream t))
>   (loop for Line = (make-string 1000)

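This should be "(loop with line = ...". WITH initializes a variable once, 
before iteration starts; FOR is an iteration clause, and with no THEN part 
its form is re-evaluated on every pass. So you were rebinding "line" to a 
fresh string on each read-char, and the characters stored earlier were lost.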
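
A small sketch of the difference, using two hypothetical functions that
are identical except for WITH versus FOR. Both try to copy a list of
characters into a string:

```lisp
(defun collect-with (chars)
  "LINE is created once, so every character lands in the same string."
  (loop with line = (make-string (length chars) :initial-element #\Space)
        for c in chars
        for i from 0
        do (setf (char line i) c)
        finally (return line)))

(defun collect-for (chars)
  "LINE is re-created on each pass, so earlier characters are lost."
  (loop for line = (make-string (length chars) :initial-element #\Space)
        for c in chars
        for i from 0
        do (setf (char line i) c)
        finally (return line)))
```

(collect-with '(#\a #\b #\c)) returns "abc", while collect-for returns a
string that no longer holds all three characters.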

>         and C = (read-char stream) then (read-char stream)
>         and place = 0 then (1+ place)
>         do (print C) ;debug
>            (case C
>              (#\Linefeed
>                (print 'linefeed)                     ;debug
>                (return (values Line #\Linefeed)))
>              (#\Return
>                (print 'return)                       ;debug
>                (return (values Line #\Return)))
>              (otherwise
>                (format t "setting ~W at ~A" C place) ;debug
>                (print (setf (elt Line place) C))))   ;debug!!
>         until  (> place 1000)
>         finally
>           (print 'crap)                              ;debug
>           (values Line #\Newline)))
> 

...snip...

>    Also, why is the string printed out empty?  Am I abusing sequences?  
> I figure I am since I'm treating them just like I would C strings.

What you did was OK and, if there are no other issues, it may work once you 
change the "for" to "with", but ...


>    In the above function, the error handling is crap.  What I'd prefer 
> would be a buffer object I can dynamically grow.  Is there a class like 
> that available in a library I can play with?  Or is there a way I can 
> get that effect with built-ins without much complexity?

here is another way to build up a string, in a function which changes 
stuff like "glPushMatrix" to the symbol 'gl-push-matrix:

(defun lisp-fn (n$) ;; string input
   (loop with ln = (make-array 0 :element-type 'character
                                 :adjustable t :fill-pointer 0)
         and n$len = (length n$)
         for n upfrom 0
         for c across n$
         when (and (plusp n)
                (upper-case-p c)
                (or (lower-case-p (elt n$ (1- n)))
                    (unless (>= (1+ n) n$len)
                      (lower-case-p (elt n$ (1+ n))))))
         do (vector-push-extend #\- ln)
         do (vector-push-extend (char-upcase c) ln)
         finally (return (intern ln))))

If you just return ln it will be a lisp string of the correct length.
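
Stripped down to the buffer technique alone, a sketch might look like
this (collect-chars is a hypothetical name). The fill pointer tracks how
many characters are in use, so LENGTH and the printer see only those:

```lisp
(defun collect-chars (chars)
  "Return a string holding the characters in the list CHARS, in order."
  (let ((buf (make-array 0 :element-type 'character
                           :adjustable t :fill-pointer 0)))
    ;; VECTOR-PUSH-EXTEND grows the underlying storage as needed.
    (dolist (c chars buf)
      (vector-push-extend c buf))))
```

The result satisfies STRINGP, so it can be handed to anything that
expects a string, with no fixed 1000-character limit to overrun.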

kt

-- 
Home? http://tilton-technology.com
Cells? http://www.common-lisp.net/project/cells/
Cello? http://www.common-lisp.net/project/cello/
Why Lisp? http://alu.cliki.net/RtL%20Highlight%20Film
Your Project Here! http://alu.cliki.net/Industry%20Application
From: Thomas F. Burdick
Subject: Re: harder questions from lisp newbie
Date: 
Message-ID: <xcvn05h9kac.fsf@famine.OCF.Berkeley.EDU>
Kenny Tilton <·······@nyc.rr.com> writes:

> B.B. wrote:
>
> >    In the above function, the error handling is crap.  What I'd prefer 
> > would be a buffer object I can dynamically grow.  Is there a class like 
> > that available in a library I can play with?  Or is there a way I can 
> > get that effect with built-ins without much complexity?
> 
> here is another way to build up a string, in a function which changes 
> stuff like "glPushMatrix" to the symbol 'gl-push-matrix:
> 
> (defun lisp-fn (n$) ;; string input
>    (loop with ln = (make-array 0 :element-type 'character
>                                  :adjustable t :fill-pointer 0)
>          and n$len = (length n$)
>          for n upfrom 0
>          for c across n$
>          when (and (plusp n)
>                 (upper-case-p c)
>                 (or (lower-case-p (elt n$ (1- n)))
>                     (unless (>= (1+ n) n$len)
>                       (lower-case-p (elt n$ (1+ n))))))
>          do (vector-push-extend #\- ln)
>          do (vector-push-extend (char-upcase c) ln)
>          finally (return (intern ln))))
> 
> If you just return ln it will be a lisp string of the correct length.

For the OP, there are two techniques to build up a string of unknown
length; what Kenny just showed you, using adjustable vectors and
vector-push-extend, or you can use with-output-to-string:

  (with-output-to-string (string)
    (loop with *print-pretty* = nil
          ...
          do (print foo string)
          ...))

Use whichever feels stylistically better at the time, and don't sweat
the technique.  In CMUCL/SBCL, with-output-to-string is actually more
efficient.
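
As a complete instance of the with-output-to-string technique, here is a
sketch that writes one tab-delimited record. tab-join is a hypothetical
helper name, and it assumes the fields are already strings:

```lisp
(defun tab-join (fields)
  "Return the strings in FIELDS joined by tab characters."
  (with-output-to-string (out)
    (loop for (field . more) on fields
          do (write-string field out)
          ;; Emit a tab only between fields, not after the last one.
          when more do (write-char #\Tab out))))
```

Because the tab is written with WRITE-CHAR, no extra arguments have to be
threaded through a FORMAT call to get tabbed output.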

-- 
           /|_     .-----------------------.                        
         ,'  .\  / | No to Imperialist war |                        
     ,--'    _,'   | Wage class war!       |                        
    /       /      `-----------------------'                        
   (   -.  |                               
   |     ) |                               
  (`-.  '--.)                              
   `. )----'                               
From: Wade Humeniuk
Subject: Re: harder questions from lisp newbie
Date: 
Message-ID: <KXyec.48025$J56.9992@edtnps89>
Try this function

(defun readline (&optional (stream *standard-input*))
   "Similar to common-lisp:read-line but has control
    over line termination.  Will return line terminated
    by LF, CR or end-of-file."
   (let ((string (make-array 128 :adjustable t :fill-pointer 0 :element-type 'base-char)))
     (loop for c = (read-char stream nil nil)
           while c do
           (case c
             ((#\lf #\cr) (return (values string c)))
             (otherwise (vector-push-extend c string)))
           finally (return (values string :eof)))))

Wade
From: Peter Seibel
Subject: Re: harder questions from lisp newbie
Date: 
Message-ID: <m3ekqtp1xi.fsf@javamonkey.com>
Wade Humeniuk <····································@telus.net> writes:

> Try this function
>
> (defun readline (&optional (stream *standard-input*))
>    "Similar to common-lisp:read-line but has control
>     over line termination.  Will return line terminated
>     by LF, CR or end-of-file."
>    (let ((string (make-array 128 :adjustable t :fill-pointer 0 :element-type 'base-char)))
>      (loop for c = (read-char stream nil nil)
>            while c do
>            (case c
>              ((#\lf #\cr) (return (values string c)))
>              (otherwise (vector-push-extend c string)))
>            finally (return (values string :eof)))))

Does that really work in some implementation? Modulo bivalent streams,
if stream is a character-stream, as it must be given that you're
passing it to READ-CHAR, then you'd never get a #\lf or #\cr back as
the stream would convert them (or the right combination thereof) to a
#\Newline character.

-Peter

-- 
Peter Seibel                                      ·····@javamonkey.com

         Lisp is the red pill. -- John Fraser, comp.lang.lisp
From: Duane Rettig
Subject: Re: harder questions from lisp newbie
Date: 
Message-ID: <4fzb9hydh.fsf@franz.com>
Peter Seibel <·····@javamonkey.com> writes:

> Wade Humeniuk <····································@telus.net> writes:
> 
> > Try this function
> >
> > (defun readline (&optional (stream *standard-input*))
> >    "Similar to common-lisp:read-line but has control
> >     over line termination.  Will return line terminated
> >     by LF, CR or end-of-file."
> >    (let ((string (make-array 128 :adjustable t :fill-pointer 0 :element-type 'base-char)))
> >      (loop for c = (read-char stream nil nil)
> >            while c do
> >            (case c
> >              ((#\lf #\cr) (return (values string c)))
> >              (otherwise (vector-push-extend c string)))
> >            finally (return (values string :eof)))))
> 
> Does that really work in some implementation? Modulo bivalent streams,
> if stream is a character-stream, as it must be given that you're
> passing it to READ-CHAR, then you'd never get a #\lf or #\cr back as
> the stream would convert them (or the right combination thereof) to a
> #\Newline character.

Note however in 13.1.7 that #\newline and #\linefeed might in fact be
the same character.
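
One way to see what a given implementation does is to compare the
characters directly. This is only a sketch with a hypothetical function
name; the codes it reports are implementation-dependent:

```lisp
(defun describe-line-characters ()
  "Return (NAME CODE SAME-AS-NEWLINE-P) for Newline, Linefeed, and Return."
  (loop for (name . char) in (list (cons "Newline" #\Newline)
                                   (cons "Linefeed" #\Linefeed)
                                   (cons "Return" #\Return))
        collect (list name (char-code char) (char= char #\Newline))))
```

If the Linefeed entry reports that it is the same as #\Newline, then a
CASE clause for #\Linefeed will match a character that the printer shows
as #\Newline, which is consistent with the transcript in the original
post.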

-- 
Duane Rettig    ·····@franz.com    Franz Inc.  http://www.franz.com/
555 12th St., Suite 1450               http://www.555citycenter.com/
Oakland, Ca. 94607        Phone: (510) 452-2000; Fax: (510) 452-0182   
From: Peter Seibel
Subject: Re: harder questions from lisp newbie
Date: 
Message-ID: <m3hdvmno77.fsf@javamonkey.com>
Duane Rettig <·····@franz.com> writes:

> Peter Seibel <·····@javamonkey.com> writes:
>
>> Wade Humeniuk <····································@telus.net> writes:
>> 
>> > Try this function
>> >
>> > (defun readline (&optional (stream *standard-input*))
>> >    "Similar to common-lisp:read-line but has control
>> >     over line termination.  Will return line terminated
>> >     by LF, CR or end-of-file."
>> >    (let ((string (make-array 128 :adjustable t :fill-pointer 0 :element-type 'base-char)))
>> >      (loop for c = (read-char stream nil nil)
>> >            while c do
>> >            (case c
>> >              ((#\lf #\cr) (return (values string c)))
>> >              (otherwise (vector-push-extend c string)))
>> >            finally (return (values string :eof)))))
>> 
>> Does that really work in some implementation? Modulo bivalent streams,
>> if stream is a character-stream, as it must be given that you're
>> passing it to READ-CHAR, then you'd never get a #\lf or #\cr back as
>> the stream would convert them (or the right combination thereof) to a
>> #\Newline character.
>
> Note however in 13.1.7 that #\newline and #\linefeed might in fact be
> the same character.

But they might not. So this code should probably also treat #\Newline
as an end of line marker in case, for instance, on a CRLF platform the
#\Return #\Linefeed sequence gets translated to a distinct #\Newline
character that isn't EQL to either of those two characters.
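
A sketch of that adjustment, under the assumption that treating all three
characters as terminators is acceptable. read-line-any-terminator is a
hypothetical name, and the element type is widened to CHARACTER so that
non-base characters do not signal an error:

```lisp
(defun read-line-any-terminator (&optional (stream *standard-input*))
  "Read up to a Newline, Linefeed, Return, or end of file.
Returns the line and the terminating character, or :EOF."
  (let ((line (make-array 128 :element-type 'character
                              :adjustable t :fill-pointer 0)))
    (loop for c = (read-char stream nil nil)
          while c
          ;; MEMBER is safe even if two of these names denote one character.
          do (if (member c '(#\Newline #\Linefeed #\Return))
                 (return (values line c))
                 (vector-push-extend c line))
          finally (return (values line :eof)))))
```

Note that this does not merge a #\Return #\Linefeed pair: where both
characters reach READ-CHAR, the call after the #\Return returns an empty
line terminated by #\Linefeed, and the caller has to skip it.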

-Peter

From: Duane Rettig
Subject: Re: harder questions from lisp newbie
Date: 
Message-ID: <47jwikrhw.fsf@franz.com>
Peter Seibel <·····@javamonkey.com> writes:

> Duane Rettig <·····@franz.com> writes:
> 
> > Peter Seibel <·····@javamonkey.com> writes:
> >
> >> Wade Humeniuk <····································@telus.net> writes:
> >> 
> >> > Try this function
> >> >
> >> > (defun readline (&optional (stream *standard-input*))
> >> >    "Similar to common-lisp:read-line but has control
> >> >     over line termination.  Will return line terminated
> >> >     by LF, CR or end-of-file."
> >> >    (let ((string (make-array 128 :adjustable t :fill-pointer 0 :element-type 'base-char)))
> >> >      (loop for c = (read-char stream nil nil)
> >> >            while c do
> >> >            (case c
> >> >              ((#\lf #\cr) (return (values string c)))
> >> >              (otherwise (vector-push-extend c string)))
> >> >            finally (return (values string :eof)))))
> >> 
> >> Does that really work in some implementation? Modulo bivalent streams,
> >> if stream is a character-stream, as it must be given that you're
> >> passing it to READ-CHAR, then you'd never get a #\lf or #\cr back as
> >> the stream would convert them (or the right combination thereof) to a
> >> #\Newline character.
> >
> > Note however in 13.1.7 that #\newline and #\linefeed might in fact be
> > the same character.
> 
> But they might not. So this code should probably also treat #\Newline
> as an end of line marker in case, for instance, on a CRLF platform the
> #\Return #\Linefeed sequence gets translated to a distinct #\Newline
> character that isn't EQL to either of those two characters.

Your question was "Does that really work on some implementation?",
and the answer should be "Yes, on any implementation for which #\Newline
and #\Linefeed are the same".  The #\cr is redundant, in the code above;
if the system is one which combines cr/lf into one character, but the
lisp's read-char doesn't do that merge to #\newline, then there are
likely to be other problems anyway.

The fact that there are likely implementations on which the above code
doesn't work makes the code non-portable, but it doesn't stop the answer
to the question "are there some systems on which it works?" from being
"yes".

From: Peter Seibel
Subject: Re: harder questions from lisp newbie
Date: 
Message-ID: <m3isg2m1lz.fsf@javamonkey.com>
Duane Rettig <·····@franz.com> writes:

> Peter Seibel <·····@javamonkey.com> writes:
>
>> Duane Rettig <·····@franz.com> writes:
>> 
>> > Peter Seibel <·····@javamonkey.com> writes:
>> >
>> >> Wade Humeniuk <····································@telus.net> writes:
>> >> 
>> >> > Try this function
>> >> >
>> >> > (defun readline (&optional (stream *standard-input*))
>> >> >    "Similar to common-lisp:read-line but has control
>> >> >     over line termination.  Will return line terminated
>> >> >     by LF, CR or end-of-file."
>> >> >    (let ((string (make-array 128 :adjustable t :fill-pointer 0 :element-type 'base-char)))
>> >> >      (loop for c = (read-char stream nil nil)
>> >> >            while c do
>> >> >            (case c
>> >> >              ((#\lf #\cr) (return (values string c)))
>> >> >              (otherwise (vector-push-extend c string)))
>> >> >            finally (return (values string :eof)))))
>> >> 
>> >> Does that really work in some implementation? Modulo bivalent streams,
>> >> if stream is a character-stream, as it must be given that you're
>> >> passing it to READ-CHAR, then you'd never get a #\lf or #\cr back as
>> >> the stream would convert them (or the right combination thereof) to a
>> >> #\Newline character.
>> >
>> > Note however in 13.1.7 that #\newline and #\linefeed might in fact be
>> > the same character.
>> 
>> But they might not. So this code should probably also treat #\Newline
>> as an end of line marker in case, for instance, on a CRLF platform the
>> #\Return #\Linefeed sequence gets translated to a distinct #\Newline
>> character that isn't EQL to either of those two characters.
>
> Your question was "Does that really work on some implementation?",
> and the answer should be "Yes, on any implementation for which #\Newline
> and #\Linefeed are the same".  The #\cr is redundant, in the code above;
> if the system is one which combines cr/lf into one character, but the
> lisp's read-char doesn't do that merge to #\newline, then there are
> likely to be other problems anyway.
>
> The fact that there are likely implementations on which the above code
> doesn't work makes the code non-portable, but it doesn't stop the answer
> to the question "are there some systems on which it works?" from being
> "yes".

Good point. I forgot (and didn't read) what my own original question
was.

-Peter
