From: Richard Smith
Subject: newbie post - by delimited text file reader suite any good?
Date: 
Message-ID: <m2slq3yw3n.fsf@82-35-197-237.cable.ubr01.enfi.blueyonder.co.uk>
Hello everyone

I've seen the advanced communication going on on the group.  On the
other hand :-) , here is my group of functions to process mainly TSV
(Tab Separated Variable) files.

I've used recursive design in line with eg. "ANSI Common Lisp" by Paul
Graham - so easy for a novice like me to understand.
(avoided the "loop" macro, with its Perl-like (?) infix notation)

Sample invocation

(my-conv-nums-tree (my-delimfile-slurp-to-list #\tab "x^2.txt"))
((0 0) (0.010101 1.0203E-4) (0.0...


where "x^2.txt"
0	0
0.010101	0.00010203
0.020202	0.000408122
0.030303	0.000918274
...


Useful?

Improvements?

Comments?


Richard Smith


--------------------------------


; Work recursively so can understand the thing!

; Do all "reversing", trimming spaces and that sort of stuff with the
; unix tools beforehand

; in the "toplevel" there can be a problem with "tab" characters not
; being "honoured" in a typed-in sample input string - producing a
; confusing non-function of the function - which is not the function's
; fault
;eg. (my-delimtext-parse #\tab "hi	everyone	I'm	here")

; this is not tail-recursive, so has limits on size of file (?)

(defun my-delimtext-parse (delimchar mystring)
  "read substrings into list from a string on basis of a specified delimiter"
  (when (not (zerop (length mystring)))
    (let ((dp (position delimchar mystring)))
      (cons
       (subseq mystring 0 dp)
       (if dp
	   (my-delimtext-parse delimchar (subseq mystring (+ dp 1))))))))


(defun my-delimfile-slurp-to-list (delimchar thisfile)
  "slurp entire delim'd file into lisp list of (sub)strings"
  (let ((thelist ()))
    (with-open-file (mystream thisfile :direction :input)
		    (do ((thisline (read-line mystream nil 'eof) 
				   (read-line mystream nil 'eof)))
			((eql thisline 'eof) (nreverse thelist))
		      (setf thelist (cons (my-delimtext-parse delimchar thisline) thelist))))))


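(defun my-conv-nums-tree (tr)
  "convert string representations of numbers to number
for a list of lists of strings"
  (cond ((stringp tr)
         ;; read once, with *read-eval* off so "#." in the data cannot run code;
         ;; empty or unreadable strings are returned unchanged
         (let ((val (let ((*read-eval* nil))
                      (ignore-errors (read-from-string tr)))))
           (if (numberp val) val tr)))
        ((atom tr) tr)
        (t
         (cons (my-conv-nums-tree (car tr))
               (my-conv-nums-tree (cdr tr))))))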
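
On the recursion-depth question in the comments above: my-delimtext-parse
recurses once per field, and the file itself is walked with DO, so the
depth is bounded by the number of fields on one line rather than by the
size of the file. If very wide lines are a concern, a non-recursive
variant is possible. What follows is only a sketch: the name
delimtext-parse-iter is made up for illustration, and unlike the
original it keeps a trailing empty field and returns ("") for an empty
line.

```lisp
;; DELIMTEXT-PARSE-ITER is a hypothetical name, not part of the original post.
(defun delimtext-parse-iter (delimchar string)
  "Split STRING on DELIMCHAR without recursion, keeping empty fields."
  (do* ((start 0 (1+ end))                            ; start of the current field
        (end (position delimchar string :start start) ; next delimiter, or NIL
             (position delimchar string :start start))
        (fields (list (subseq string start end))      ; fields so far, newest first
                (cons (subseq string start end) fields)))
       ((null end) (nreverse fields))))               ; no delimiter left: done

;; Example call on a tab-separated line:
(delimtext-parse-iter #\Tab (format nil "0.010101~C0.00010203" #\Tab))
```

DO* is used instead of DO so that each step form can see the values just
computed by the ones before it.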


--------------------------------

From: Rob Warnock
Subject: Re: newbie post - by delimited text file reader suite any good?
Date: 
Message-ID: <fOqdnQnaE_GetJjZnZ2dnUVZ_t-dnZ2d@speakeasy.net>
Richard Smith  <······@weldsmith4.co.uk> wrote:
+---------------
| here is my group of functions to process mainly TSV
| (Tab Separated Variable) files.
...
| (defun my-delimtext-parse (delimchar mystring) ...)
+---------------

You might want to look at <http://www.cliki.net/SPLIT-SEQUENCE>,
which was designed by community consensus over a considerable
period of discussion [and as a result provides most of the
optional arguments one expects of CL sequence functions].
It can be used like this:

    > (with-open-file (s "x^2.txt")
	(loop for line = (read-line s nil nil)
	      while line
	  collect (split-sequence:split-sequence #\tab line)))

    (("0" "0") ("0.010101" "0.00010203") ("0.020202" "0.000408122")
     ("0.030303" "0.000918274"))
    > 


-Rob

-----
Rob Warnock			<····@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607
From: Pascal Bourguignon
Subject: Re: newbie post - by delimited text file reader suite any good?
Date: 
Message-ID: <87y7zv2g2s.fsf@thalassa.informatimago.com>
Richard Smith <······@weldsmith4.co.uk> writes:

> Hello everyone
>
> I've seen the advanced communication going on on the group.  On the
> other hand :-) , here is my group of functions to process mainly TSV
> (Tab Separated Variable) files.
>
> I've used recursive design in line with eg. "ANSI Common Lisp" by Paul
> Graham - so easy for a novice like me to understand.
> (avoided the "loop" macro, with its Perl-like (?) infix notation)
>
> Sample invocation
>
> (my-conv-nums-tree (my-delimfile-slurp-to-list #\tab "x^2.txt"))
> ((0 0) (0.010101 1.0203E-4) (0.0...
>
>
> where "x^2.txt"
> 0	0
> 0.010101	0.00010203
> 0.020202	0.000408122
> 0.030303	0.000918274
> ...

Common Lisp is more powerful than you think:

(let ((*read-eval* nil))
  (with-open-file (file "x^2.txt")
    (loop for line = (read-line file nil nil)
          while line 
          collect (ignore-errors (read-from-string (format nil "(~A)" line))))))

--> ((0 0) (0.010101 1.0203E-4) (0.020202 4.08122E-4) (0.030303 9.18274E-4))

You might have a more complex file format, but a tabulation is not a
reason for a custom-made parser.
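
To make the per-line step above easier to try at the toplevel, here is
the same idea pulled out into a small function. This is only a sketch:
READ-FIELDS is a made-up name, and it assumes the standard readtable
(where a tab counts as whitespace). Note that any token that is not a
number is read as a symbol, and that a line the reader cannot parse
comes back as NIL because of IGNORE-ERRORS.

```lisp
;; READ-FIELDS is a hypothetical helper name, not from the thread.
(defun read-fields (line)
  "Read LINE as if it were wrapped in parentheses; NIL on a reader error."
  (let ((*read-eval* nil))        ; the reader rejects #. instead of evaluating it
    (ignore-errors
      (read-from-string (format nil "(~A)" line)))))

(read-fields (format nil "0.5~C2" #\Tab))   ; two numbers separated by a tab
(read-fields "#.(+ 1 2)")                   ; reader error is caught, giving NIL
```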



Otherwise, I'd lose all these "my" or "my-" prefixes.
If you want to mark your territory, you can use a package:

(defpackage "UK.CO.WELDSMITH4.SMITH.RICHARD.DELIM-FILE"
   (:use "COMMON-LISP"))
(in-package  "UK.CO.WELDSMITH4.SMITH.RICHARD.DELIM-FILE")
...

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/
From: Tayssir John Gabbour
Subject: Re: newbie post - by delimited text file reader suite any good?
Date: 
Message-ID: <1141179148.882030.166900@i39g2000cwa.googlegroups.com>
Pascal Bourguignon wrote:
> Richard Smith <······@weldsmith4.co.uk> writes:
> > Sample invocation
> > (my-conv-nums-tree (my-delimfile-slurp-to-list #\tab "x^2.txt"))
> > ((0 0) (0.010101 1.0203E-4) (0.0...
>
> Common Lisp is more powerful than you think:
>
> (let ((*read-eval* nil))
>   (with-open-file (file "x^2.txt")
>     (loop for line = (read-line file nil nil)
>           while line
>           collect (ignore-errors (read-from-string (format nil "(~A)" line))))))

You know, I've probably never seen code which more typifies Lisp.

* "Eh, slap parentheses around anything and it'll be fine."
* "Error-handling only matters when it matters."
* "Oh yeah, remember to bind *read-eval* to nil."


Tayssir
--
See no evil, *read-no-evil*
From: Pascal Bourguignon
Subject: Re: newbie post - by delimited text file reader suite any good?
Date: 
Message-ID: <87bqwq3nle.fsf@thalassa.informatimago.com>
"Tayssir John Gabbour" <···········@yahoo.com> writes:

> Pascal Bourguignon wrote:
>> Richard Smith <······@weldsmith4.co.uk> writes:
>> > Sample invocation
>> > (my-conv-nums-tree (my-delimfile-slurp-to-list #\tab "x^2.txt"))
>> > ((0 0) (0.010101 1.0203E-4) (0.0...
>>
>> Common Lisp is more powerful than you think:
>>
>> (let ((*read-eval* nil))
>>   (with-open-file (file "x^2.txt")
>>     (loop for line = (read-line file nil nil)
>>           while line
>>           collect (ignore-errors (read-from-string (format nil "(~A)" line))))))
>
> You know, I've probably never seen code which more typifies Lisp.
>
> * "Eh, slap parentheses around anything and it'll be fine."
> * "Error-handling only matters when it matters."
> * "Oh yeah, remember to bind *read-eval* to nil."

Well, given the lack of specifications, you can hardly do a finer job...

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/

Pour moi, la grande question n'a jamais été: «Qui suis-je? Où vais-je?»
comme l'a formulé si adroitement notre ami Pascal, mais plutôt:
«Comment vais-je m'en tirer?» -- Jean Yanne
From: Tayssir John Gabbour
Subject: Re: newbie post - by delimited text file reader suite any good?
Date: 
Message-ID: <1141182348.544669.227780@p10g2000cwp.googlegroups.com>
Pascal Bourguignon wrote:
> "Tayssir John Gabbour" <···········@yahoo.com> writes:
> >> Common Lisp is more powerful than you think:
> >>
> >> (let ((*read-eval* nil))
> >>   (with-open-file (file "x^2.txt")
> >>     (loop for line = (read-line file nil nil)
> >>           while line
> >>           collect (ignore-errors (read-from-string (format nil "(~A)" line))))))
> >
> > You know, I've probably never seen code which more typifies Lisp.
> >
> > * "Eh, slap parentheses around anything and it'll be fine."
> > * "Error-handling only matters when it matters."
> > * "Oh yeah, remember to bind *read-eval* to nil."
>
> Well, given the lack of specifications, you can hardly do a finer job...

Just to be clear, that was the opposite of a criticism; little things
like that code snippet impress me, I guess....

Reminds me of a Lisp meeting when we wanted to move monitors around to
do a nice presentation. Most people's instinct is to fumble around to
clear a place (move cables & etc), then move the monitor. However, this
fellow impatiently picked up the heavy CRT, ambled towards the
cluttered table, and said, "Ok, I'm gonna need to put this heavy thing
down..." and for some reason, it struck me as an unusually efficient
thing to do, as I cleared a spot faster than I'd have otherwise done,
methodical as I perhaps was then. (Having been used to
bondage-and-discipline tools like Java.)

Contrast that with a Java-and-Rational using company I worked with.
Needed to move monitors downstairs. To my horror, I realized how deep
the madness of busy-work sunk into one's soul -- as a leader grabbed a
monitor and trudged down the stairs, with everyone following lock-step,
I simply grabbed the nearby monitor-cart, loaded it up with 4 or 8
monitors, and pushed it into the elevator...


Tayssir
From: Pascal Bourguignon
Subject: Re: newbie post - by delimited text file reader suite any good?
Date: 
Message-ID: <877j7e3g7x.fsf@thalassa.informatimago.com>
"Tayssir John Gabbour" <···········@yahoo.com> writes:

> Pascal Bourguignon wrote:
>> "Tayssir John Gabbour" <···········@yahoo.com> writes:
>> >> Common Lisp is more powerful than you think:
>> >>
>> >> (let ((*read-eval* nil))
>> >>   (with-open-file (file "x^2.txt")
>> >>     (loop for line = (read-line file nil nil)
>> >>           while line
>> >>           collect (ignore-errors (read-from-string (format nil "(~A)" line))))))
>> >
>> > You know, I've probably never seen code which more typifies Lisp.
>> >
>> > * "Eh, slap parentheses around anything and it'll be fine."
>> > * "Error-handling only matters when it matters."
>> > * "Oh yeah, remember to bind *read-eval* to nil."
>>
>> Well, given the lack of specifications, you can hardly do a finer job...
>
> Just to be clear, that was the opposite of a criticism; little things
> like that code snippet impress me, I guess....

Ok. Good :-)

> Reminds me of a Lisp meeting when we wanted to move monitors around to
> do a nice presentation. Most people's instinct is to fumble around to
> clear a place (move cables & etc), then move the monitor. However, this
> fellow impatiently picked up the heavy CRT, ambled towards the
> cluttered table, and said, "Ok, I'm gonna need to put this heavy thing
> down..." and for some reason, it struck me as an unusually efficient
> thing to do, as I cleared a spot faster than I'd have otherwise done,
> methodical as I perhaps was then. (Having been used to
> bondage-and-discipline tools like Java.)
>
> Contrast that with a Java-and-Rational using company I worked with.
> Needed to move monitors downstairs. To my horror, I realized how deep
> the madness of busy-work sunk into one's soul -- as a leader grabbed a
> monitor and trudged down the stairs, with everyone following lock-step,
> I simply grabbed the nearby monitor-cart, loaded it up with 4 or 8
> monitors, and pushed it into the elevator...

At least you could do it yourself.  I've seen corporations where to
move a computer you had to print a form in triplicate and let the
specialized department do it...

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/

"Klingon function calls do not have "parameters" -- they have
"arguments" and they ALWAYS WIN THEM."
From: David Trudgett
Subject: Re: newbie post - by delimited text file reader suite any good?
Date: 
Message-ID: <m38xrv2bk8.fsf@rr.trudgett>
Pascal Bourguignon <······@informatimago.com> writes:

> Richard Smith <······@weldsmith4.co.uk> writes:
>
>> Hello everyone
>>
>> I've seen the advanced communication going on on the group.  On the
>> other hand :-) , here is my group of functions to process mainly TSV
>> (Tab Separated Variable) files.

c.l.l is just as much for newbies like us (well, I suppose I'm not
quite a newbie any more) as anyone else.


>>
>> I've used recursive design in line with eg. "ANSI Common Lisp" by
>> Paul Graham - so easy for a novice like me to understand.  (avoided
>> the "loop" macro, with its Perl-like (?) infix notation)
>>
>> Sample invocation
>>
>> (my-conv-nums-tree (my-delimfile-slurp-to-list #\tab "x^2.txt"))
>> ((0 0) (0.010101 1.0203E-4) (0.0...
>>
>>
>> where "x^2.txt"
>> 0	0
>> 0.010101	0.00010203
>> 0.020202	0.000408122
>> 0.030303	0.000918274
>> ...
>
> Common Lisp is more powerful than you think:
>
> (let ((*read-eval* nil))
>   (with-open-file (file "x^2.txt")
>     (loop for line = (read-line file nil nil)
>           while line 
>           collect (ignore-errors 
>                      (read-from-string (format nil "(~A)" line))))))
>
> --> ((0 0) (0.010101 1.0203E-4) (0.020202 4.08122E-4) 
>     (0.030303 9.18274E-4))
>
> You might have a more complex file format, but a tabulation is not a
> reason for a custom-made parser.

Welcome to c.l.l, Richard. I'm sure Pascal isn't *deliberately* trying
to make you feel like a fool.

Your code showed some interesting features, and is pretty good for a
CL newbie, which all of us are or have been at one time or another.

Naturally, suggestions for improvement could be made at several
different levels, from "do it a completely different way" ala Pascal
B., to "try using this function instead of that", to "why not name
this some other way?" I'm sure you'll get some handy hints from better
Lispers than me. If not, I'll chime in.

Cheers,

David



-- 

David Trudgett
http://www.zeta.org.au/~wpower/

What I don't know is not as much of a problem
as what I am sure I know that just ain't so.

    -- Mark Twain
From: Pascal Bourguignon
Subject: Re: newbie post - by delimited text file reader suite any good?
Date: 
Message-ID: <87irqy3ogj.fsf@thalassa.informatimago.com>
David Trudgett <······@zeta.org.au.nospamplease> writes:
>> Common Lisp is more powerful than you think:
>>
>> (let ((*read-eval* nil))
>>   (with-open-file (file "x^2.txt")
>>     (loop for line = (read-line file nil nil)
>>           while line 
>>           collect (ignore-errors 
>>                      (read-from-string (format nil "(~A)" line))))))
>>
>> --> ((0 0) (0.010101 1.0203E-4) (0.020202 4.08122E-4) 
>>     (0.030303 9.18274E-4))
>>
>> You might have a more complex file format, but a tabulation is not a
>> reason for a custom-made parser.
>
> Welcome to c.l.l, Richard. I'm sure Pascal isn't *deliberately* trying
> to make you feel like a fool.

Did I?  If so, it sure wasn't deliberate.

By the way, from the way the specifications were written, I assumed
that the line was an important structuring construct, and thus read
the data line by line, and then, reread it to parse each line.

Often, this is not really a constraint, and we know that we only have
to read the items two by two; then it's even simpler:

 (let ((*read-eval* nil))
   (with-open-file (file "x^2.txt")
     (loop for left  = (read file nil nil)
           for right = (read file nil nil)
           while right ; or perhaps: (and left right)
           collect (list left right))))
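
One thing to watch with the `while right` test above: it also stops as
soon as a datum happens to read as NIL. A variant that uses a fresh
end-of-file marker avoids that. This is only a sketch; READ-PAIRS is a
made-up name, and it takes an already open stream so it can be tried
with WITH-INPUT-FROM-STRING. Like the version above, it drops an
unpaired trailing item.

```lisp
;; READ-PAIRS is a hypothetical name used for illustration.
(defun read-pairs (stream)
  "Read successive pairs of objects from STREAM until end of file."
  (let ((*read-eval* nil)
        (eof (list :eof)))               ; fresh object, cannot occur in the data
    (loop for left  = (read stream nil eof)
          for right = (if (eq left eof) eof (read stream nil eof))
          until (eq right eof)
          collect (list left right))))

(with-input-from-string (s "0 0 0.5 0.25")
  (read-pairs s))
```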

-- 
__Pascal Bourguignon__                     http://www.informatimago.com/

"Specifications are for the weak and timid!"
From: Lars Brinkhoff
Subject: Re: newbie post - by delimited text file reader suite any good?
Date: 
Message-ID: <85zmkacvi9.fsf@junk.nocrew.org>
Richard Smith <······@weldsmith4.co.uk> writes:
> (setf thelist (cons (my-delimtext-parse delimchar thisline) thelist))

(push (my-delimtext-parse delimchar thisline) thelist)
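
PUSH is a standard macro; for a plain variable, (push item place) has
the same effect as (setf place (cons item place)). A minimal sketch
showing both forms side by side (PUSH-DEMO is a made-up name):

```lisp
;; PUSH-DEMO is a hypothetical name used for illustration.
(defun push-demo ()
  (let ((thelist '()))
    (push (list "0" "0") thelist)                    ; the short form
    (setf thelist (cons (list "1" "1") thelist))     ; the long form, same effect
    thelist))                                        ; newest element comes first

(push-demo)
```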
From: Richard Smith
Subject: Re: newbie post - by delimited text file reader suite any good?
Date: 
Message-ID: <m2mzgayo8o.fsf@82-35-197-237.cable.ubr01.enfi.blueyonder.co.uk>
Lars Brinkhoff <·········@nocrew.org> writes:

> Richard Smith <······@weldsmith4.co.uk> writes:
> > (setf thelist (cons (my-delimtext-parse delimchar thisline) thelist))
> 
> (push (my-delimtext-parse delimchar thisline) thelist)

That is a good tip.  I didn't like the look of the original line.


For all other posts:

"split-sequence" - yes, but it's got that "loop" construct in it, so
I find it impenetrable.

"csv-parser" - got that working well on CSV files - but big "proper"
(!) program to do much more complicated job.  Could have hacked it to
do tab-delimited -- but started afresh.

"fare-csv" - looks like a fabulous program, but again it's aimed at
CSV files - particularly files produced by MSExcel (?)


Any good tutorials on understanding the "loop" construct?  It doesn't
come naturally after learning Lisp by working with lists and tending
towards a recursive style of programming.


Thanks Lars and folk in general

Richard Smith
From: Rob Warnock
Subject: Re: newbie post - by delimited text file reader suite any good?
Date: 
Message-ID: <Bd-dncGe4ppr8pvZRVn-tg@speakeasy.net>
Richard Smith  <······@weldsmith4.co.uk> wrote:
+---------------
| "split-sequence" - yes, but it's got that "loop" construct in it, so
| I find it impenetrable.
+---------------

(*sigh*) So just rewrite it with DO [which is IMHO uglier
than the LOOP version, but YM obviously V]:

    > (with-open-file (s "x^2.txt")
	(do ((line (read-line s nil nil) (read-line s nil nil))
	     (results nil (cons (split-sequence:split-sequence #\tab line)
				results)))
	    ((null line) (reverse results))))

    (("0" "0") ("0.010101" "0.00010203") ("0.020202" "0.000408122")
     ("0.030303" "0.000918274"))
    >

Or, if you prefer an imperative, side-effecting style
[generates pretty much the same code]:

      (with-open-file (s "x^2.txt")
	(do ((line (read-line s nil nil) (read-line s nil nil))
	     (results nil))
	    ((null line) (reverse results))
	  (push (split-sequence:split-sequence #\tab line) results)))

+---------------
| "csv-parser" - got that working well on CSV files - but big "proper"
| (!) program to do much more complicated job.  Could have hacked it to
| do tab-delimited -- but started afresh.
+---------------

I didn't see that request in the thread, or I would have posted this
[of course, it uses that ugly LOOP thingy]:

    ;;; PARSE-CSV-LINE -- Parse one CSV line into a list of fields,
    ;;; stripping quotes and field-internal escape characters.
    ;;; Lexical states: '(normal quoted escaped quoted+escaped)
    ;;;
    (defun parse-csv-line (line)
      (when (or (string= line "")           ; special-case blank lines
		(char= #\# (char line 0)))  ; or those starting with "#"
	(return-from parse-csv-line '()))
      (loop for c across line
	    with state = 'normal
	    and results = '()
	    and chars = '() do
	(ecase state
	  ((normal)
	   (case c
	     ((#\") (setq state 'quoted))
	     ((#\\) (setq state 'escaped))
	     ((#\,)
	      (push (coerce (nreverse chars) 'string) results)
	      (setq chars '()))
	     (t (push c chars))))
	  ((quoted)
	   (case c
	     ((#\") (setq state 'normal))
	     ((#\\) (setq state 'quoted+escaped))
	     (t (push c chars))))
	  ((escaped) (push c chars) (setq state 'normal))
	  ((quoted+escaped) (push c chars) (setq state 'quoted)))
	finally
	 (progn
	   (push (coerce (nreverse chars) 'string) results) ; close open field
	   (return (nreverse results)))))


-Rob

-----
Rob Warnock			<····@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607
From: Marc Battyani
Subject: Re: newbie post - by delimited text file reader suite any good?
Date: 
Message-ID: <o8-dna4h_qIXl5rZRVny0Q@giganews.com>
"Rob Warnock" <····@rpw3.org> wrote

> Or, if you prefer an imperative, side-effecting style
> [generates pretty much the same code]:
>
>     (with-open-file (s "x^2.txt")
>       (do ((line (read-line s nil nil) (read-line s nil nil))
>            (results nil))
>           ((null line) (reverse results))
>         (push (split-sequence:split-sequence #\tab line) results)))
>
> I didn't see that request in the thread, or I would have posted this
> [of course, it uses that ugly LOOP thingy]:
>
[...]

As this topic comes up frequently, I've put an improved version of this, sent
by Rob, as a code snippet in the Common Lisp Directory so that it can be
re-used next time ;-)

Here it is: http://www.cl-user.net/asp/html-docs/process-file-snippset

Marc
From: Edi Weitz
Subject: Re: newbie post - by delimited text file reader suite any good?
Date: 
Message-ID: <u7j7dq3bk.fsf@agharta.de>
On Wed, 01 Mar 2006 19:16:32 GMT, Richard Smith <······@weldsmith4.co.uk> wrote:

> Any good tutorials on understanding the "loop" construct?

  <http://www.gigamonkeys.com/book/loop-for-black-belts.html>
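
  For anyone coming from the DO-based code earlier in the thread, it may
  help to see the same line-collecting idiom written both ways. This is
  only a sketch; LINES-DO and LINES-LOOP are made-up names, and both
  take an open stream so they can be tried with WITH-INPUT-FROM-STRING.

```lisp
;; LINES-DO and LINES-LOOP are hypothetical names used for illustration.
(defun lines-do (stream)
  "Collect the lines of STREAM with DO and an explicit accumulator."
  (do ((line (read-line stream nil nil) (read-line stream nil nil))
       (acc '() (cons line acc)))        ; steps in parallel: conses the old LINE
      ((null line) (nreverse acc))))

(defun lines-loop (stream)
  "Collect the lines of STREAM with LOOP."
  (loop for line = (read-line stream nil nil)   ; re-evaluated on every pass
        while line                              ; stop at end of file
        collect line))                          ; LOOP keeps the list in order

(with-input-from-string (s (format nil "0 0~%1 1"))
  (lines-loop s))
```

  Each LOOP keyword maps onto one part of the DO form: FOR onto the
  variable with its step, WHILE onto the end test, COLLECT onto the
  accumulator plus the final reverse.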

-- 

European Common Lisp Meeting 2006: <http://weitz.de/eclm2006/>

Real email: (replace (subseq ·········@agharta.de" 5) "edi")