From: Janet
Subject: Reading an excel spreadsheet using Lisp
Date: 
Message-ID: <3fdc241b$1@news.starhub.net.sg>
Hi guys, I am looking to for a lisp program that will read rows & columns
from an excel spreadsheet.
Does anyone out there has such a program or knows the whereabouts of one.
Thanks in advance!!

From: Chris Perkins
Subject: Re: Reading an excel spreadsheet using Lisp
Date: 
Message-ID: <6cb6c81f.0312141550.c520452@posting.google.com>
"Janet" <·····@starhub.net.sg> wrote in message news:<··········@news.starhub.net.sg>...
> Hi guys, I am looking to for a lisp program that will read rows & columns
> from an excel spreadsheet.
> Does anyone out there has such a program or knows the whereabouts of one.
> Thanks in advance!!

Other people have given the best advice on approaching your problem. 
But, if you absolutely must read the Excel spreadsheet directly from
file, those files are in the Microsoft OLE 2 Compound Document format.
 That format has a number of API's in the Win32 library, but the
format itself is _not_ documented.   The API routines you'll want
revolve around IStorage, IStream, and others.  Check out
http://msdn.microsoft.com for details.

Essentially, that document type is the FAT16 file system mapped to a
single file instead of a hard drive.  And, as compound document
formats go, it's pretty bad.  It makes no provisions for compression,
encryption, signatures, etc.


The Apache Jakarta project has a member project named POI
(http://jakarta.apache.org/poi/) that has produced some open source
Java classes for reading/writing OLE files, and in particular, Excel
files.   If memory serves, POIFS was the first library produced that
was later developed into the whole POI project, and POIFS stands for
"Poorly Implemented File System".


I personally don't know of any CL code for parsing these abominations,
but would also be interested if anyone knows of anything.


Chris Perkins
Media Lab, Inc.
From: Edi Weitz
Subject: Re: Reading an excel spreadsheet using Lisp
Date: 
Message-ID: <87r7z7c56f.fsf@bird.agharta.de>
On 14 Dec 2003 15:50:38 -0800, ········@medialab.com (Chris Perkins) wrote:

> But, if you absolutely must read the Excel spreadsheet directly from
> file, those files are in the Microsoft OLE 2 Compound Document
> format.  That format has a number of API's in the Win32 library, but
> the format itself is _not_ documented.  The API routines you'll want
> revolve around IStorage, IStream, and others.  Check out
> http://msdn.microsoft.com for details.
>
> Essentially, that document type is the FAT16 file system mapped to a
> single file instead of a hard drive.  And, as compound document
> formats go, it's pretty bad.  It makes no provisions for
> compression, encryption, signatures, etc.
>
> The Apache Jakarta project has a member project named POI
> (http://jakarta.apache.org/poi/) that has produced some open source
> Java classes for reading/writing OLE files, and in particular, Excel
> files.  If memory serves, POIFS was the first library produced that
> was later developed into the whole POI project, and POIFS stands for
> "Poorly Implemented File System".

There's also a Perl module which allegedly can parse native Excel
files:

  <http://search.cpan.org/~kwitknr/Spreadsheet-ParseExcel-0.2602/>

Edi.
From: Bruno Haible
Subject: Re: Reading an excel spreadsheet using Lisp
Date: 
Message-ID: <brku7q$8o5$3@laposte.ilog.fr>
Chris Perkins <········@medialab.com> wrote:
>
> The Apache Jakarta project has a member project named POI
> (http://jakarta.apache.org/poi/) that has produced some open source
> Java classes for reading/writing OLE files, and in particular, Excel
> files.

Unfortunately, in its current state, this library fails to handle cells
with formulas. With other approaches, you get the result of evaluating
the formula; but with POI, you get neither the formula nor its result.
If the Excel files you handle contain no formulas, POI might still be
fine for you...

                       Bruno
From: Barry Wilkes
Subject: Re: Reading an excel spreadsheet using Lisp
Date: 
Message-ID: <ekv7pdd2.fsf@acm.org>
"Janet" <·····@starhub.net.sg> writes:

> Hi guys, I am looking to for a lisp program that will read rows & columns
> from an excel spreadsheet.
> Does anyone out there has such a program or knows the whereabouts of one.
> Thanks in advance!!

I guess there are at least three ways to approach this, depending on what you
are trying to acheive.

1.  Read the raw Excel file format.  This is hard.  You also have to modify
your software everytime MS modifies this propriatory format.

2.  Save the Excel file in a format that is easy to read.  Something like .csv
(comma seperated variable) format.  If all you want is access to data that has
been stored in a spreadsheet, this may be acceptable.  

3.  Use OLE/COM/Automation.  This is possibly the easiest way, assuming you
don't have to write your own COM substrate.  If you are using LispWorks, then
you are home and dry.  Just read up on the Excel object model and the
LispWorks interface to COM. 

Frankly, if I'm looking to modify an Excel spreadsheet programmatically, I
would use (gasp!) VBA.  Much easier.  If I needed to do some stuff that was
better done in another language, I would be more inclined to write an Excel
addin,  or use COM.  Then just use this from VBA.


Barry.
From: Jock Cooper
Subject: Re: Reading an excel spreadsheet using Lisp
Date: 
Message-ID: <m3u1422ah8.fsf@jcooper02.sagepub.com>
Barry Wilkes <·······@acm.org> writes:

> "Janet" <·····@starhub.net.sg> writes:
> 
> > Hi guys, I am looking to for a lisp program that will read rows & columns
> > from an excel spreadsheet.
> > Does anyone out there has such a program or knows the whereabouts of one.
> > Thanks in advance!!
>snip
> 3.  Use OLE/COM/Automation.  This is possibly the easiest way, assuming you
> don't have to write your own COM substrate.  If you are using LispWorks, then
> you are home and dry.  Just read up on the Excel object model and the
> LispWorks interface to COM. 

For ACL users Allegro also has this ability.. I just wrote a report
module that creates output in MS WORD format.  At least one of the
example programs that Franz supplies manipulates an excel sheet.  The
example code is a little clunky, but some syntactic sugar should make it
manageable.  I whipped up a few macros and my code (for example) looks
something like this

(let% logoshape = call shapes.addtextbox :orientation 1 :left 37 :top top-pos 
                                          :width 250 :height 35)
(set% logoshape.line.forecolor.rgb = #x00ffffff)
(call% logoshape.select)
(call% selection.inlineshapes.addpicture :filename "logofile.bmp" 
                                   :linktofile nil :savewithdocument t))
(let% borders = prop document.sections[1].borders)
(set% borders.enable = t)
(loop for item from 1 to 4
      do
      (set% borders[item].linewidth = 24)))

Not nearly as nice as doing the same thing from an M$ language but
definitely usable.

The doc page for ACL's OLE is here:
http://www.franz.com/support/documentation/6.2/doc/ole.htm


--
Jock Cooper
www.fractal-recursions.com
From: Bruno Haible
Subject: Re: Reading an excel spreadsheet using Lisp
Date: 
Message-ID: <brktu7$8o5$2@laposte.ilog.fr>
Barry Wilkes <·······@acm.org> wrote:
> I guess there are at least three ways to approach this
> 1.  Read the raw Excel file format.
> 2.  Save the Excel file in a format that is easy to read.
> 3.  Use OLE/COM/Automation.

There is also
4. (On Windows systems only) Define an ODBC data source pointing to your
   Excel file, and access this ODBC data source through a Lisp/ODBC interface.

                      Bruno
From: Alain Picard
Subject: Re: Reading an excel spreadsheet using Lisp
Date: 
Message-ID: <873cbl8a3v.fsf@memetrics.com>
Bruno Haible <·····@clisp.org> writes:

>
> There is also
> 4. (On Windows systems only) Define an ODBC data source pointing to your
>    Excel file, and access this ODBC data source through a Lisp/ODBC interface.

That's fantastic!  I didn't know you could do that?  Do
you know if the reverse is possible; i.e. writing into
an excel file via ODBC?
From: Dmitri Ivanov
Subject: Re: Reading an excel spreadsheet using Lisp
Date: 
Message-ID: <brn40h$26u$2@news.aha.ru>
Hello Alain,
"Alain Picard" <·······················@optushome.com.au> wrote:

AP> Bruno Haible <·····@clisp.org> writes:
AP>
AP>>
AP>> There is also
AP>> 4. (On Windows systems only) Define an ODBC data source pointing
AP>> to your   Excel file, and access this ODBC data source through a
AP>> Lisp/ODBC interface.
AP>
AP> That's fantastic!  I didn't know you could do that?  Do
AP> you know if the reverse is possible; i.e. writing into
AP> an excel file via ODBC?

Yes, it is. You can insert and update, but not delete.
Below is an example of interacting via YstokSQL.

(setq excel-db (sql:connect "excel-test" :autocommit t))

(sql:create-table [Table1]  ; create a sheet
  '(([id] integer)
    ([str] string)))

(dotimes (i 10)
  (sql:insert-records :into [Table1] :attributes '([id] [str])
                      :values (list i (random-string 30))))

(sql:update-records [Table1] :attributes [str] :values "My string"
                             :where [= [id] 2])
--
Sincerely,
Dmitri Ivanov
lisp.ystok.ru
From: james anderson
Subject: Re: Reading an excel spreadsheet using Lisp
Date: 
Message-ID: <3FDC8F10.B2CBCDF0@setf.de>
export the spreadsheet as comma separated values, define a readtable in which
comma is whitespace, read the lines into a string buffer, and read the data
out of the string buffer using the readtable.

...


Janet wrote:
> 
> Hi guys, I am looking to for a lisp program that will read rows & columns
> from an excel spreadsheet.
> Does anyone out there has such a program or knows the whereabouts of one.
> Thanks in advance!!
From: Edi Weitz
Subject: Re: Reading an excel spreadsheet using Lisp
Date: 
Message-ID: <87k74zdqpm.fsf@bird.agharta.de>
On Sun, 14 Dec 2003 17:30:21 +0100, james anderson <··············@setf.de> wrote:

> export the spreadsheet as comma separated values, define a readtable
> in which comma is whitespace, read the lines into a string buffer,
> and read the data out of the string buffer using the readtable.

That won't work if the data contains literal commas because they have
to be quoted somehow.

I've occasionally used the code below to read CSV data generated by MS
Excel. It turns out that (despite of the "comma" in CSV) the Excel
delimiter is a semicolon and literal semicolons are embedded into
strings enclosed by quotes. Quotes themselves are encoded by two
quotes in a row.

Edi.


  * (with-open-file (s "test.csv")
      (loop for line = (read-line s nil)
            while line
            do (format t "~A~%   ->   ~S~%~%" line (csv-split line))))
  1;2;3;4;5
     ->   ("1" "2" "3" "4" "5")

  "";";";"""";4;5
     ->   ("" ";" "\"" "4" "5")

  """abc""";abc;"""""";"abc;abc";"""abc;abc"""
     ->   ("\"abc\"" "abc" "\"\"" "abc;abc" "\"abc;abc\"")

  "abc""";abc;"""""";"abc;abc";"""abc;abc"""
     ->   ("abc\"" "abc" "\"\"" "abc;abc" "\"abc;abc\"")

  ;;"""""";"abc;abc";
     ->   ("" "" "\"\"" "abc;abc" "")

  NIL



  (defun csv-split (line)
    "Reads a line with comma-separated data and returns a list of the
  corresponding values \(as fresh strings)."
    (loop with start = 0
          for even = t then (if (char= #\" char) (not even) even)
          for char across line
          for pos from 0
          ;; only accept semicolon as delimiter if the number of quotes
          ;; already seen is even
          when (and (char= #\; char) even)
          collect (un-quote line start pos) into result
          and do (setq start (1+ pos))
          finally (return (nconc result
                                 (list (un-quote line start))))))

  (defun un-quote (string start &optional (end (length string)))
    "Unquotes and returns the part of the string STRING denoted by the
  bounding index designators START and END. This function always returns
  a fresh string."
    (cond ((and (< start (length string))
                (char= #\" (char string start)))
            ;; starts with a quote, so we must unquote
            (when (>= (1+ start) end)
              (error "Strings starting with a quote must be at least two characters long."))
            (when (char/= #\" (char string (1- end)))
              (error "Expected quote at position ~A in string ~S."
                     (1- end) string))
            (let ((collector (make-array (- end start 2)
                                         :element-type 'character
                                         :fill-pointer 0))
                  (pos (1+ start)))
              (loop
                (cond ((>= pos (1- end))
                        ;; done, so return what we've collected
                        (return-from un-quote collector))
                      ((char= #\" (char string pos))
                        ;; looking at a quote, so skip two quotes and
                        ;; collect one
                        (cond ((= (1+ pos) (1- end))
                                (error "String ~S has an odd number of quotes."
                                       (subseq string start end)))
                              ((char/= #\" (char string (1+ pos)))
                                (error "Expected quote at position ~A in string ~S."
                                       (1+ pos) string))
                              (t
                                (vector-push-extend #\" collector)
                                (incf pos 2))))
                      (t
                        ;; any other character, collect it
                        (vector-push-extend (char string pos) collector)
                        (incf pos))))))
          (t
            ;; does not start with a quote, so just return the substring
            (subseq string start end))))
From: Edi Weitz
Subject: Re: Reading an excel spreadsheet using Lisp
Date: 
Message-ID: <87brqbdq0a.fsf@bird.agharta.de>
On Sun, 14 Dec 2003 22:39:17 +0100, Edi Weitz <···@agharta.de> wrote:

> On Sun, 14 Dec 2003 17:30:21 +0100, james anderson <··············@setf.de> wrote:
>
>> export the spreadsheet as comma separated values, define a
>> readtable in which comma is whitespace, read the lines into a
>> string buffer, and read the data out of the string buffer using the
>> readtable.
>
> That won't work if the data contains literal commas because they
> have to be quoted somehow.

Sorry, I wasn't precise enough: This approach /will/ work if the data
is encoded such that strings are always enclosed in quotes and quotes
are escaped as in CL strings. This seems to be the standard meaning
for "comma-separated values" but, alas, it isn't the convention used
by MS Excel.

Edi.

PS: By "quotes" I actually mean "double quotes" here and in my other
    message.
From: Bruno Haible
Subject: Re: Reading an excel spreadsheet using Lisp
Date: 
Message-ID: <brkt3i$8o5$1@laposte.ilog.fr>
Edi Weitz <···@agharta.de> wrote:
> It turns out that (despite of the "comma" in CSV) the Excel
> delimiter is a semicolon

That's because you're operating in a German locale. In this locale,
the decimal separator is a comma and the column separator is a semicolon.
Thus if your program is fed a CSV file with unknown origin, you have to
resort to heuristics in order to decide whether to interpret the comma as
a column separator or as a decimal separator.

                  Bruno
From: Edi Weitz
Subject: Re: Reading an excel spreadsheet using Lisp
Date: 
Message-ID: <87n09taokx.fsf@bird.agharta.de>
On 15 Dec 2003 18:04:34 GMT, Bruno Haible <·····@clisp.org> wrote:

> Edi Weitz <···@agharta.de> wrote:
>> It turns out that (despite of the "comma" in CSV) the Excel
>> delimiter is a semicolon
>
> That's because you're operating in a German locale. In this locale,
> the decimal separator is a comma and the column separator is a
> semicolon.

Hehe - where do you know that from? Actually, I'm using an English
version of Windows XP and an English version of Office XP. But you're
right - I had set the locale for number representation to German to be
compatible with one of my clients.

Thanks,
Edi.
From: Rob Warnock
Subject: Re: Reading an excel spreadsheet using Lisp
Date: 
Message-ID: <2qydnU8FD8--B0CiXTWc-w@speakeasy.net>
james anderson  <··············@setf.de> wrote:
+---------------
| export the spreadsheet as comma separated values, define a readtable...
+---------------

Using readtables for that is overkill, and besides, it doesn't
handle quoted strings with the field delimiter in it, or escaped
quotes or escaped escapes [all of which I have seen in CSV files].
Here's a quick hack[1] I wrote once that handles all of those;

;;; PARSE-CSV-LINE -- Parse one CSV line into a list of fields,
;;; stripping quotes and field-internal escape characters.
;;; Simple FSM with states '(:NORMAL :QUOTED :ESCAPED :QUOTED+ESCAPED).
(defun parse-csv-line (line)
  (when (string= line "")
    (return-from parse-csv-line '()))
  ;; assert: line contains at least one field
  (loop for c across line
        with state = :normal
        and results = '()
        and chars = '()
    do (ecase state
	 ((:normal)
	  (case c
	    ((#\") (setq state :quoted))
	    ((#\\) (setq state :escaped))
	    ((#\,)
	     (push (coerce (nreverse chars) 'string) results)
	     (setq chars '()))
	    (t (push c chars))))
	 ((:quoted)
	  (case c
	    ((#\") (setq state :normal))
	    ((#\\) (setq state :quoted+escaped))
	    (t (push c chars))))
	 ((:escaped) (push c chars) (setq state :normal))
	 ((:quoted+escaped) (push c chars) (setq state :quoted)))
    finally
     (progn
       (push (coerce (nreverse chars) 'string) results) ; close open field
       (return (nreverse results)))))

;;; sample driver
(defun parse-csv-file (filename)
  (with-open-file (s filename)
	 (loop for line = (read-line s nil nil)
	       while line
	   collect (parse-csv-line line))))


-Rob

[1] Magic constants for delimiter & escape characters are hard-coded,
    unlike Alain Picard's nice, generalized, parameterized version.

-----
Rob Warnock			<····@rpw3.org>
627 26th Avenue			<URL:http://rpw3.org/>
San Mateo, CA 94403		(650)572-2607
From: Alain Picard
Subject: Re: Reading an excel spreadsheet using Lisp
Date: 
Message-ID: <878ylebrf0.fsf@memetrics.com>
james anderson <··············@setf.de> writes:

> export the spreadsheet as comma separated values, define a readtable in which
> comma is whitespace, read the lines into a string buffer, and read the data
> out of the string buffer using the readtable.

And  Here's some code to do it:

================================================================
;;                      -*- mode: lisp -*-
;;
;; CSV parsing/writing utilities, a la Microsoft Excel.
;;
;; Author: Alain Picard <·······@optushome.com.au>
;;         (also ············@memetrics.com)
;;
;; Version: 0.2
;; File: $Id: csv-parser.lisp,v 1.8 2003/12/15 04:24:19 kooks Exp $
;;
;; License:
;; This code is placed under the Lesser GNU Public License (LGPL)
;; (see http://www.fsf.org/licenses/lgpl.html) as
;; clarified for Lisp by Franz when they released AllegroServe (see
;; http://allegroserve.sourceforge.net/license-allogroserve.txt)
;;
;; What this clarification basically means is that compiling this
;; file and loading it into your lisp image, either at time of
;; delivery or runtime, does not make your program a derivative of
;; this one.  YOU ARE EXPRESSLY PERMITTED TO LOAD THIS FILE INTO
;; YOUR LISP IMAGE, AT ANY TIME, FOR ANY AND ALL (INCLUDING COMMERCIAL)
;; PURPOSES.  In particular, clause 5 of the LGPL is NOT invoked
;; by you embedding or loading this code, interpeted or compiled,
;; into your application.
;;
;; Of course, if you make modifications to this file, then the terms
;; of the LGPL hold, and you must redistribute the sources of this
;; file and your modifications with your application.
;;
;; Lastly, if you DO make useful changes to this code, I would
;; appreciate receiving the changes (though this NOT a requirement
;; of the license of this code.)
;;
;;
;; This software is "as is", and has no warranty of any kind.  The
;; author assumes no responsibility for the consequences of any use
;; of this software.
;;
;;
;; Notes
;; -----
;;  Differences from Far�'s CSV reader:
;;  * this one can import DOS formatted files into a unix image,
;;    and not have the redundant ^M splattered at the end of the fields.
;;    This is important when processing a file on a Unix server which
;;    is being uploaded via HTTP by some poor schmuck on a windoze box.
;;
;;  * Handles empty trailing fields properly
;;
;;  * comes with test suite, to see the semantics I've chosen to implement
;;
;;  * I've included a couple of high-level goodies, like
;;    do-csv-file and map-csv-file, as well as the ability
;;    to control the range of line-numbers on which to apply
;;    your code.  Pretty simple stuff, but useful.
;;
;;  * You get a CSV-file writer, for free!  :-)
;;
;;;;


(in-package :common-lisp-user)

(defpackage :csv-parser
  (:use :common-lisp)
  (:export  #:*field-separator*
	    #:*quote-character*
	    #:read-csv-line
	    #:do-csv-file
	    #:map-csv-file
	    #:write-csv-line))

(in-package :csv-parser)

(defparameter *field-separator* #\,
  "The character used to indicate the end of a field
   in a CSV file.")

(defparameter *quote-character* #\"
  "The character used to protect embedded field separators
   (usually commas) and whitespace within a field.

   To import the *quote-character* itself, you must have
   it printed twice in the input stream.")

(defvar *state* nil
  "Holds a function which knows how to handle chars based
   on what we've seen so far.")

(defvar *spaces-gobbled* nil
  "Keeps track of how many blanks have been skipped.  When emitting
   a field, we can trim extra right spaces, if appropriate, using this.")

(defvar *current-field* nil
  "Holds the field we are currently working on.")

(defvar *fields* nil
  "Holds the fields we have collected/parsed so far.")

(defvar *num-fields*
  "Holds the number of fields we have collected/parsed so far.")


;; Public
(defun read-csv-line (stream)
  "Read one line form a stream containing CSV data.
   Returns two values; a list of strings parsed, and
   the number of parsed values."
  (let ((*state* #'skip-white-space)
	(*spaces-gobbled* 0)
	(*fields* ())
	(*num-fields*  0)
	(*current-field* (make-empty-field)))
    (catch 'end-of-line
      (loop
       (funcall *state* (read-char stream nil :eof))))
    (values (nreverse *fields*)
	    *num-fields*)))

;; Public
(defun map-csv-file (file fn &key limit (skip-lines 0))
  "Call FN (up to LIMIT times, if specified) with
   a list containing the fields parsed from the CSV
   file FILE.

   SKIP-LINES, if provided, is the number of lines to skip
   before starting to call FN.

   *FIELD-SEPARATOR* and *QUOTE-CHARACTER* can be bound to
   modify what separates fields and delimits fields."
  (with-open-file (stream file :direction :input)
    (loop repeat skip-lines
	  do (read-csv-line stream))
    (if limit
	(loop as line = (read-csv-line stream)
	      while line
	      repeat limit
	      do (funcall fn line))
	(loop as line = (read-csv-line stream)
	      while line
	      do (funcall fn line)))))


;; Public
(defmacro do-csv-file (((fields num-fields) file &key limit (skip-lines 0))
                       &body body)
  "Repeatedly call BODY on CSV file FILE, binding
   FIELDS and NUM-FIELDS to a list containing the parsed fields,
   and the number of fields.
   Code runs inside a block with tagname NIL, so you
   may call (RETURN).

   *FIELD-SEPARATOR* and *QUOTE-CHARACTER* can be bound to
   modify what separates fields and delimits fields."
  (let ((stream (gensym "STREAM"))
	(count  (gensym "COUNT"))
	(glimit (gensym "LIMIT")))
    `(with-open-file (,stream ,file :direction :input)
      (loop repeat ,skip-lines
            do     (read-csv-line ,stream))
      (loop for ,count upfrom 0
            with ,glimit = ,limit
            do
            (multiple-value-bind (,fields ,num-fields) (read-csv-line ,stream)
	      (when (or (null ,fields)
			(and ,glimit
			     (>= ,count ,glimit)))
		(return))
	      ,@body)))))


;;;; Utilities

(defun change-state (state)
  (setf *state*
	(ecase state
	  (:skip         #'skip-white-space)
	  (:first-quote  #'got-first-quote)
	  (:second-quote #'got-second-quote)
	  (:regular      #'regular-field))))

(declaim (inline add-char))
(defun add-char (char)
  (declare  (type base-char char))
  (vector-push-extend char *current-field*))

(defun make-empty-field ()
  (make-array 0
	      :fill-pointer 0
	      :adjustable   t
	      :element-type 'base-char))

(defun remove-last-n-chars (n)
  (setf (fill-pointer *current-field*)
	(- (length *current-field*)
	   n)))

(declaim (inline quote-char-p end-of-line-char-p end-of-field-char-p
		 white-space-char-p))
(defun quote-char-p (char)
  (char= char *quote-character*))

(defun end-of-line-char-p (char)
  (char= char #\Newline))

(defun end-of-field-char-p (char)
  (char= char *field-separator*))

(defun white-space-char-p (char)
  (or
   (char= char #\Space)
   (char= char #\Tab)
   (char= char #\Return)))  ; For DOS style line termination

;;  States:
;;    * skip-white-space (initial state).
;;    * regular-field    handle things like ` foo bar '
;;    * got-first-quote  handle things like ` " foo X'
;;    * got-second-quote handle things like ` " foo  "X '
;;

(defun skip-white-space (char)
  (cond
    ((or (eq char :eof)
	 (end-of-line-char-p char))
     (when *fields*
       ;; If no fields are present, this was a completeley
       ;; blank line.  Otherwise, collect the last null field.
       (emit-field))
     (throw 'end-of-line nil))

    ((end-of-field-char-p char)
     ;; Careful to check for end-of-field _before_ whitespace,
     ;; as maybe TAB is the end-of-field marker.
     (emit-field))

    ((white-space-char-p char)
     ; skip
     nil)

    ((quote-char-p char)
     (change-state :first-quote))

    (t
     (change-state :regular)
     (add-char char))))

(defun regular-field (char)
  (cond
    ((or (eq char :eof)
	 (end-of-line-char-p char))
     (emit-field)
     (throw 'end-of-line nil))

    ((end-of-field-char-p char)
     (emit-field))

    ((white-space-char-p char)
     (add-char char)
     (incf *spaces-gobbled*))

    ((quote-char-p char)
     (error "Got a quote after regular characters; ~
             incorrectly formatted CSV file."))


    (t
     (setf *spaces-gobbled* 0)
     (add-char char))))

(defun got-first-quote (char)
  (cond
    ((eq char :eof)
     (error "Ran out of characters before finishing quoted field."))

    ((quote-char-p char)
     (change-state :second-quote))

    (t ; collect anything else
     (add-char char))))

(defun got-second-quote (char)
  (cond
    ((or (eq char :eof)
	 (end-of-line-char-p char))
     (emit-field)
     (throw 'end-of-line nil))

    ((quote-char-p char)
     ;; This is the weird embedded "" scenario
     (add-char char)
     (change-state :first-quote))

    ((end-of-field-char-p char)
     (emit-field))

    ((white-space-char-p char)
     (incf *spaces-gobbled*))

    (t
     (error "Got unexpected non-blank char after end of a quoted field"))))

(defun emit-field ()
  (cond
    ((eq *state* #'skip-white-space)
     (push nil *fields*))

    ((eq *state* #'got-second-quote)
     (push *current-field* *fields*))

    ((eq *state* #'regular-field)
     (remove-last-n-chars *spaces-gobbled*)
     (push *current-field* *fields*))

    (t
     (assert nil nil "Bug!")))

  (incf *num-fields*)
  (setf *spaces-gobbled* 0
	*state*          #'skip-white-space
	*current-field*  (make-empty-field)))



;;;; test harness
#+(or)
(trace skip-white-space got-first-quote got-second-quote regular-field
       emit-field add-char change-state remove-last-n-chars)

#+(or)(csv-tests)

;;  Call this; if nothing asserts, you win.
;;  (csv-tests)

(defun csv-tests ()
  (csv-test-blank)
  (csv-test-blanks)
  (csv-test-empty)
  (csv-test-simple)
  (csv-test-quoted)
  (csv-test-space-handling)
  (csv-test-other-delimiters)
  (csv-test-embedded-lines)
  (csv-test-embedded-commas))

(defun csv-test-blank ()
  (with-input-from-string (s "")
    (assert (eq nil (read-csv-line s)))
    (assert (eq nil (read-csv-line s)))))

(defun csv-test-blanks ()
  (with-input-from-string (s "  ")
    (assert (eq nil (read-csv-line s)))
    (assert (eq nil (read-csv-line s)))))

(defun csv-test-empty ()
  (with-input-from-string (s " ,,   , ")
    (assert (equal (list nil nil nil nil) (read-csv-line s)))
    (assert (eq nil (read-csv-line s)))))

(defun csv-test-simple ()
  (with-input-from-string (s "foo,bar,baz")
    (assert (equal (list "foo" "bar" "baz")
		   (read-csv-line s)))))

(defun csv-test-quoted ()
  (with-input-from-string (s "\"foo\",\"bar\",\"baz\"")
    (assert (equal (list "foo" "bar" "baz")
		   (read-csv-line s)))))

(defun csv-test-space-handling ()
  ;; leading/trailing blanks (but not intra word blanks)
  ;; are stripped unless the whole thing is quoted
  (with-input-from-string (s "\"  foo  bar  \",  foo bar  ")
    (assert (equal (list "  foo  bar  " "foo bar")
		   (read-csv-line s)))))

(defun csv-test-other-delimiters ()
  (let ((*field-separator* #\|)
	(*quote-character* #\'))
    (with-input-from-string (s "'foo'|'bar'|'baz'")
      (assert (equal (list "foo" "bar" "baz")
		     (read-csv-line s))))))

(defun csv-test-double-quotes ()
  (with-input-from-string (s "\"foo \"\" bar\",2,3,\"\", \" \"")
    (assert (equal (list "foo \" bar" "2" "3" "" " ")
		   (read-csv-line s)))))

(defun csv-test-embedded-lines ()
  (with-input-from-string (s "\"foo
 and bar\",\" 2 \",3")
    (assert (equal (list "foo
 and bar" " 2 " "3")
		   (read-csv-line s)))))

(defun csv-test-embedded-commas ()
  (with-input-from-string (s "\"foo , bar\",2,3")
    (assert (equal (list "foo , bar" "2" "3")
		   (read-csv-line s)))))

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;



;;;; Writing utilities

(defun write-csv-line (stream fields)
  "Write one CSV line to STREAM, containing fields.
   FIELDS is any (1d) sequence containing strings, symbols or numbers.

   Try to write it as esthetically pleasing as possible,
   i.e. don't output the *quote-character* unless necessary
   to protect the integrity of the data."

  (etypecase fields
    (cons   (write-csv-line-on-cons stream fields))
    (array  (write-csv-line-on-array stream fields))))

(defun write-csv-line-on-cons (stream fields)
  (loop for rest on fields
	while rest
	do
	(write-csv-field stream (first rest))
	(when (cdr rest)
	  (write-char *field-separator* stream)))
  (terpri stream))

(defun write-csv-line-on-array (stream fields)
  (loop for field across fields
	repeat (1- (length fields))
	do
	(write-csv-field stream field)
	(write-char *field-separator* stream)

	finally
	(write-csv-field stream (aref fields (1- (length fields))))
	(terpri stream)))

(defun write-csv-field (stream field)
  (etypecase field
    (null t)
    (number (princ field stream))
    (string (write-csv-string-safely stream field))
    (symbol (write-csv-string-safely stream (symbol-name field)))))

(defun special-char-p (char)
  (or (char= char *field-separator*)
      (char= char *quote-character*)
      (white-space-char-p char)))

(defun write-csv-string-safely (stream string)
  (if (find-if #'special-char-p string)
      (write-protected-copy stream string)
      (princ string stream)))

(defun write-protected-copy (stream field)
  (write-char *quote-character* stream)
  (loop for c across field
	do
	(write-char c stream)
	(when (char= c *quote-character*)
	  ;; Double it
	  (write-char c stream)))
  (write-char *quote-character* stream))


;;; END OF FILE