From: nels tomlinson
Subject: need help reading csv file into lisp
Date: 
Message-ID: <3AAF7AE4.D739EE3F@purdue.edu>
Hello, all,

I'm new to lisp, and in need of a kickstart.   I'm trying to read a
comma separated values file using Gnu common lisp, and have the result
be a lisp array of numbers (assuming that the csv file has ascii
representations of numbers in it...).   I've figured out how to read an
arbitrary file as strings, but then I can't quite figure out how to
parse the strings.  Probably I'm going about this all wrong, so any
example code or pointers into the hyperref would be much appreciated.

Thanks,
Nels

P.S.
So far, I've puzzled out the following bit of code, which  reads in the
csv file as a list of strings:

(defun read-all-lines (input-stream eof-ind)
  (let ((result (read-line input-stream nil eof-ind)))
    (if (eq result eof-ind)
        nil
        (cons result (read-all-lines input-stream eof-ind)))))

(with-open-file (instream "./test.slime")
  (read-all-lines instream (list '$eof$)))

If the csv file is

1.2,3.4
5.6,7.8

that returns

("1.2,3.4" "5.6,7.8")

Now I have only to parse that into an array... I seem to be stuck there.

From: Joe Marshall
Subject: Re: need help reading csv file into lisp
Date: 
Message-ID: <zoeo7429.fsf@content-integrity.com>
nels tomlinson <········@purdue.edu> writes:

> Hello, all,
> 
> I'm new to lisp, and in need of a kickstart.   I'm trying to read a
> comma separated values file using Gnu common lisp, and have the result
> be a lisp array of numbers (assuming that the csv file has ascii
> representations of numbers in it...).   I've figured out how to read an
> arbitrary file as strings, but then I can't quite figure out how to
> parse the strings.  Probably I'm going about this all wrong, so any
> example code or pointers into the hyperref would be much appreciated.
> 
> Thanks,
> Nels
> 
> P.S.
> So far, I've puzzled out the following bit of code, which  reads in the
> csv file as a list of strings:
> 
> (defun read-all-lines (input-stream eof-ind)
>   (let ((result (read-line input-stream nil eof-ind)))
>     (if (eq result eof-ind)
>  nil
>       (cons result (read-all-lines input-stream eof-ind)))));

This will be ok for small files, but you are liable to run out of
stack on larger files.  If you turn this into a loop, you can avoid
that:

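(defun read-all-lines (input-stream)
  ;; LINE is read in the init and step forms, so the :eof marker
  ;; itself is never consed onto RESULT.
  (do ((line (read-line input-stream nil :eof)
             (read-line input-stream nil :eof))
       (result nil (cons line result)))
      ((eq line :eof) (nreverse result))))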

Because it is a loop, it will accumulate the lines in reverse order,
so we nreverse the result.  Note that since read-line will be
returning a string, any non-string will do as an EOF value.  I like
using the :eof keyword.

Alternatively, you could track the end of the list:

(defun read-all-lines (input-stream)
  (let* ((line nil)
         (result (cons nil nil))
         (tail result))
    (loop until (eq (setq line (read-line input-stream nil :eof)) :eof)
          do (setq tail (setf (cdr tail) (cons line nil))))
    (cdr result)))

> 
> (with-open-file (instream "./test.slime") (read-all-lines instream (list
> '$eof$)));
> 
> If the csv file is
> 
> 1.2,3.4
> 5.6,7.8
> 
> that returns
> 
> ("1.2,3.4" "5.6,7.8")
> 
> Now I have only to parse that into an array... I seem to be stuck there.

Rather than collecting a list of strings, parse each string as you
collect it.  POSITION and READ-FROM-STRING come in handy:

(defun read-fields (string character)
  "Returns the results of calling READ on the substrings of the
   original string split on the character.  Substrings will not
   include character.  Adjacent instances of character will result
   in NILs in the result."
  (let ((*read-eval* nil)) ;; avoid nasty surprises
    (do ((scan (position character string)
               (position character string :start (1+ scan)))
         (previous 0 (1+ scan))
         (answer nil (cons (read-from-string string nil nil :start previous :end scan) answer)))
        ((null scan) (nreverse (cons (read-from-string string nil nil :start previous) answer))))))

(defun parse-csv-stream (input-stream)
  (let* ((line nil)
         (result (cons nil nil))
         (tail result)
         (line-count 0))
    (loop until (eq (setq line (read-line input-stream nil :eof)) :eof)
          do (incf line-count) (setq tail (setf (cdr tail) (cons (read-fields line #\,) nil))))
    (values (cdr result) line-count)))

parse-csv-stream will return a list of lists; each sublist holds the
results of calling read on the pieces of the line between the commas.
Now all you have to do is make the array:

(multiple-value-bind (parsed-lines length)
    (with-open-file (stream "foo.csv") (parse-csv-stream stream))
  (make-array (list length 2) :initial-contents parsed-lines))

This code is rather ad hoc; it will only correctly parse files where
the values are guaranteed to be in the form of Common Lisp numbers,
where there are 2 numbers per line (well, that's easily changed, but
this assumes that every line has the same number of values), there are
no extraneous characters, and you want to split the lines on a single
character (the comma).
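
As a minimal sketch of lifting the two-values-per-line assumption, the
column count can be taken from the first parsed row instead of being
hard-coded.  The name rows-to-array is hypothetical, and it assumes the
list-of-lists shape that parse-csv-stream returns; MAKE-ARRAY will
still signal an error if the rows differ in length.

```lisp
;; Hypothetical helper, not part of the code above.
(defun rows-to-array (rows)
  "Build a 2-D array from ROWS, a list of equal-length lists."
  (make-array (list (length rows) (length (first rows)))
              :initial-contents rows))

;; Three values per line instead of two:
(array-dimensions (rows-to-array '((1.2 3.4 0.5) (5.6 7.8 0.9))))
;; => (2 3)
```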

A better solution would allow you to parameterize how you decode a
line (e.g., so you could pass in a set of characters to split upon),
what to do with ill-formed lines, how to filter out extraneous
characters, etc.  If performance is important, you might want to write
a little state machine driven off the character codes, rather than
reading each line into a string, reading the string, then discarding
the string.
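
One possible shape for that parameterization, as a rough sketch only:
the names split-fields and parse-line and the keyword arguments
:delimiters, :field-parser and :on-bad-field are all hypothetical.
The delimiter set is a string, each field goes through a caller-supplied
parser, and a field whose parser signals an error is handed to a
separate handler.

```lisp
;; Hypothetical names throughout; a sketch, not a finished parser.
(defun split-fields (string delimiters)
  "Split STRING at any character found in the sequence DELIMITERS.
Adjacent delimiters yield empty strings."
  (loop with start = 0
        for end = (position-if (lambda (ch) (find ch delimiters))
                               string :start start)
        collect (subseq string start end)
        while end
        do (setf start (1+ end))))

(defun parse-line (string &key (delimiters ",")
                               (field-parser #'identity)
                               (on-bad-field (constantly nil)))
  "Apply FIELD-PARSER to each field of STRING.
If FIELD-PARSER signals an error, call ON-BAD-FIELD on the raw field."
  (mapcar (lambda (field)
            (handler-case (funcall field-parser field)
              (error () (funcall on-bad-field field))))
          (split-fields string delimiters)))

;; Split on comma or semicolon; the ill-formed field "x" becomes NIL.
(parse-line "1;2,x" :delimiters ",;" :field-parser #'parse-integer)
;; => (1 2 NIL)
```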

But for the task at hand, it should work ok.

(I had to hack something similar, but it needed ultra-high performance
because it was the bottleneck in a certain process.  I used a foreign
function to call mmap to get the file contents into memory, then I
faked up a displaced string to point at the contents.  Certain fields
were identified by fixed string prefixes, so I used some code that,
when given a pattern string, would write the lisp code that would
perform a boyer-moore search on an arbitrary target.  I then fed this
to the compiler to generate specialized, but very fast, string search
functions.  The resulting code was a couple of orders of magnitude
faster than the naive original code that used READ-LINE and SEARCH.)


From: Jose Carlos Senciales
Subject: Re: need help reading csv file into lisp
Date: 
Message-ID: <3AAF9468.BF30538F@lcc.uma.es>
(defun read-all-lines (input-stream eof-ind)
  (let ((line (read-line input-stream nil eof-ind)))
    ;; Test for EOF before SUBSTITUTE, which only accepts sequences.
    (if (eq line eof-ind)
        nil
        (cons (substitute #\Space #\, line)
              (read-all-lines input-stream eof-ind)))))

If the csv file is

1.2,3.4
5.6,7.8

that returns

("1.2 3.4" "5.6 7.8")

And then use read-from-string, for example:

(defparameter *Arr* (make-array '(10)))  ; an array

(DO* ((Dato "1.2 2.3 5.5 6.7")           ; an example string
      (i 0 (1+ i))
      ;; Valor-Linea holds (value next-index) from the latest read
      (Valor-Linea (multiple-value-list
                    (read-from-string Dato NIL 'EOF :start 0))
                   (multiple-value-list
                    (read-from-string Dato NIL 'EOF
                                      :start (cadr Valor-Linea)))))
     ((equal 'EOF (car Valor-Linea)))
  (setf (aref *Arr* i) (car Valor-Linea)))


Hope this helps you :-)



--
*

          *                *

*
----------------------------
José Carlos Senciales Chaves
    ······@lcc.uma.es
----------------------------
From: ········@hex.net
Subject: Re: need help reading csv file into lisp
Date: 
Message-ID: <BzNr6.57816$lj4.1414796@news6.giganews.com>
nels tomlinson <········@purdue.edu> writes:
> I'm new to lisp, and in need of a kickstart.   I'm trying to read a
> comma separated values file using Gnu common lisp, and have the result
> be a lisp array of numbers (assuming that the csv file has ascii
> representations of numbers in it...).   I've figured out how to read an
> arbitrary file as strings, but then I can't quite figure out how to
> parse the strings.  Probably I'm going about this all wrong, so any
> example code or pointers into the hyperref would be much appreciated.
> 
> Thanks,
> Nels
> 
> P.S.
> So far, I've puzzled out the following bit of code, which  reads in the
> csv file as a list of strings:
> 
> (defun read-all-lines (input-stream eof-ind)
>   (let ((result (read-line input-stream nil eof-ind)))
>     (if (eq result eof-ind)
>  nil
>       (cons result (read-all-lines input-stream eof-ind)))));
> 
> (with-open-file (instream "./test.slime") (read-all-lines instream (list
> '$eof$)));
> 
> If the csv file is
> 
> 1.2,3.4
> 5.6,7.8
> 
> that returns
> 
> ("1.2,3.4" "5.6,7.8")
> 
> Now I have only to parse that into an array... I seem to be stuck there.

You're going to have to produce some sort of state machine to process
those strings.

A somewhat oversimplified place to start:

(defun parse-csv-line (string)
  (let ((collected-value "")
        (state 'invalue)
        ;; Append | to provide a visible terminator
        (full-string (concatenate 'string string "|")))
    (loop
      for ch across full-string
      do
      (cond
        ;; Cases:  digits, dots and minus signs; commas and the terminator
        ;; States: 'invalue 'donefield
        ((member ch '(#\1 #\2 #\3 #\4 #\5 #\6 #\7 #\8 #\9 #\0 #\. #\-))
         (when (eq state 'donefield)
           (setf collected-value "")
           (setf state 'invalue))
         (setf collected-value
               (concatenate 'string collected-value (string ch))))
        ((member ch '(#\, #\|))
         (setf state 'donefield)))
      when (eq state 'donefield)
      collect collected-value)))

10. Break [16]> (parse-csv-line "1.2,3.4,5.6,7.8,9")

("1.2" "3.4" "5.6" "7.8" "9")
11. Break [17]> (parse-csv-line "1.4,2.5")

("1.4" "2.5")
11. Break [17]> 

Note that this produces a list of strings that you'll then need to
transform into numbers.  And that it doesn't deal with alphanumeric
values, or spaces, or CSV fields that are enclosed by quotes, or #\\
inside such quotes to "quote the quotes."  
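
For the string-to-number step, one hedged possibility is to hand each
field to READ-FROM-STRING and keep the result only if it is a number.
The name field-to-number is hypothetical, and the sketch assumes each
field holds at most one token: "1 2" would quietly yield 1, and a
non-numeric field such as "abc" yields NIL but interns a symbol as a
side effect.

```lisp
;; Hypothetical helper; assumes at most one token per field.
(defun field-to-number (field)
  "Return the number written in FIELD, or NIL if FIELD is not a number."
  (let* ((*read-eval* nil)   ; do not evaluate #. forms found in the data
         (value (ignore-errors (read-from-string field nil nil))))
    (and (numberp value) value)))

;; Applied to the kind of list parse-csv-line returns:
(mapcar #'field-to-number '("1.2" "3.4" "9" "abc" ""))
;; => (1.2 3.4 9 NIL NIL)
```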

So considerable learning is still available :-)
-- 
(concatenate 'string "cbbrowne" ·@acm.org")
http://www.ntlug.org/~cbbrowne/lisp.html
Rules of  the Evil  Overlord #120. "Since  nothing is  more irritating
than a hero  defeating you with basic math skills,  all of my personal
weapons  will be  modified to  fire one  more shot  than  the standard
issue." <http://www.eviloverlord.com/>
From: nels tomlinson
Subject: Re: need help reading csv file into lisp
Date: 
Message-ID: <3AAFF4B7.7BB3CEF0@purdue.edu>
Hello, again,

Thanks for all the help.  There is a lot there, and it looks like plenty to
get me over the hump.  When I get it digested, I'll try to post a coherent
summary, and my  read-csv-into-array function if it's usable.

Thanks again,
Nels

From: Kent M Pitman
Subject: Re: need help reading csv file into lisp
Date: 
Message-ID: <sfw7l1s0wxs.fsf@world.std.com>
nels tomlinson <········@purdue.edu> writes:

> (defun read-all-lines (input-stream eof-ind)
>   (let ((result (read-line input-stream nil eof-ind)))
>     (if (eq result eof-ind)
>  nil
>       (cons result (read-all-lines input-stream eof-ind)))));
> 
> (with-open-file (instream "./test.slime") (read-all-lines instream (list
> '$eof$)));

Dispense with this '$eof$ thing.  You only have to make up weird names
like this if obvious objects will conflict.  READ-LINE defaults its eof 
object to NIL, which is a fine value, because it never ordinarily returns
that and because NIL is the natural false value.  And  you're not returning
the value, so whatever it is doesn't matter.

  (defun read-all-lines (input-stream)
    (loop for line = (read-line input-stream nil nil)
          while line
          collect line))

This won't blow out of stack space, which yours will. This won't require
making up extra objects for eof-values that aren't really needed.  You only
need the trick you're using when doing READ, which can return arbitrary 
objects, and even then I'd do:

 (defvar *eof* (list '*eof*))

 ... (read input-stream nil *eof*) ...

rather than making a new eof value every time.
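
A small sketch of that READ case, with the hypothetical helper name
read-all-forms.  The input below contains NIL as an ordinary datum, so
NIL could not serve as the end-of-file marker here, while the unique
cons in *eof* can never be produced by READ.

```lisp
(defvar *eof* (list '*eof*))  ; one fresh cons, created once

;; Hypothetical helper name.
(defun read-all-forms (input-stream)
  "Collect every object READ returns from INPUT-STREAM until end of file."
  (let ((*read-eval* nil))    ; do not evaluate #. forms in the input
    (loop for form = (read input-stream nil *eof*)
          until (eq form *eof*)
          collect form)))

(with-input-from-string (s "1 nil (a b) 3")
  (read-all-forms s))
;; => (1 NIL (A B) 3)
```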