From: dave linenberg
Subject: newbie needs a little help parsing a comma delimited file
Date: 
Message-ID: <37F2A4C4.C115ADD8@home.com>
Hello all,

 I am trying to learn lisp ( I am in the middle of Touretzky's Gentle
Introduction book, and I've got Winston's book as well - I just started
trying to learn lisp about a week ago).  I have been programming C++ for
the last 10 years, and I work on a financial trading floor.  I would
like to start converting a bunch of C++
code to lisp ( I have franz's lisp  for linux, and am delighted with it)

I would like to do the following:
take an ascii file which is uniformly comma delimited ( except at the
end of each line)   i.e.

double,double,double,double,double, double
double,double,double,double,double, double
.
.
double,double,double,double,double, double

for an unknown number of lines, and create a list
in the form (  (double,double,double,double,double, double)
                      (double,double,double,double,double, double)
                       .
                       .
                        (double,double,double,double,double, double))

I am playing around with the with-open-file and the read functions, but
I am not really sure whether to read the file in character by character,
peeking ahead for a comma, or to parse it line by line.

Any source code or pointers would be incredibly helpful.  By converting
my C++ code to lisp,
I know I will eliminate A LOT of past and future work that I have been
slaving over in the C++ world ( like representing integers, doubles,
chars, char*   as generic pieces of data ) which lisp hands me for free.

In fact, for the past 2 years I have been working on genericizing
functions ( in the form of template function
objects) and objects.  Lisp does all this already.  It's pretty damned
amazing.

 ( Anyway, here is part of some C++ code - a static member function
which gets fed lines of a file, and then returns the individual pieces
in a vector of strings.  I know lisp is going to be easier, once I
get some experience!! )

//------------------------------------------------------------
// Parse a string using delimiter, returning individual strings in a vector.
// The boolean indicates whether there is a delimiter at end of line or not.
vector<string> CRecordset::parseLine( string line, string delimiter,
                                      bool isDelimiterAtEOL )
{
  vector<string> substrings;
  string::size_type pos = 0;
  string::size_type posPrevious = pos;
  string::iterator itBegin =line.begin();
  while (( pos = line.find_first_of(delimiter , pos)) != string::npos) {

    string substring(itBegin+posPrevious,itBegin+pos);
    substrings.push_back(substring);
    pos++;
    posPrevious = pos;
  }

  // handle case of no delimiter at End of Line
  // yet we want that piece of data
  //
  // i.e.   5,10,189
  //
  if (isDelimiterAtEOL == false ) {
    string::iterator itEnd = line.end();
    string substring(itBegin+ posPrevious, itEnd );
    substrings.push_back(substring);
  }


  return substrings;
}
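
For comparison, here is a minimal Common Lisp sketch of the same splitting
step. It is not from the thread: the name split-on-delimiter is made up for
illustration, and it takes a single delimiter character rather than a string
of delimiters. Unlike the C++ version it has no end-of-line flag; a trailing
delimiter simply produces an empty string as the last element.

```lisp
;; Hypothetical helper (an assumption, not part of the original post):
;; split STRING at each occurrence of the character DELIMITER and
;; return the pieces as a list of fresh strings.
(defun split-on-delimiter (string &optional (delimiter #\,))
  (loop with start = 0
        ;; END is the index of the next delimiter, or NIL if none is left.
        for end = (position delimiter string :start start)
        ;; SUBSEQ with END = NIL takes everything up to the end of STRING.
        collect (subseq string start end)
        while end
        do (setf start (1+ end))))

(split-on-delimiter "5,10,189")
;; => ("5" "10" "189")
```

The pieces are still strings here; turning them into numbers is a separate
step.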

From: Samir Barjoud
Subject: Re: newbie needs a little help parsing a comma delimited file
Date: 
Message-ID: <wkln9pp5df.fsf@mindspring.com>
dave linenberg <········@home.com> writes:

> Hello all,
> 
>  I am trying to learn lisp ( I am in the middle of Touretzky's Gentle
> Introduction book, and I've got Winston's book as well - I just started
> trying to learn lisp about a week ago).  I have been programming C++ for
> the last 10 years, and I work on a financial trading floor.  I would
> like to start converting a bunch of C++
> code to lisp ( I have franz's lisp  for linux, and am delighted with it)
> 
> I would like to do the following:
> take an ascii file which is uniformly comma delimited ( except at the
> end of each line)   i.e.
> 
> double,double,double,double,double, double
> double,double,double,double,double, double
> .

If you want to do this parser in LISP as an exercise in learning LISP,
fine.  However, if you just want to read this data into LISP, why not
have the C++ code write out the data as a LISP list?  You already have
the code to read it into C++, after all.

-- 
Samir Barjoud
·····@mindspring.com
From: dave linenberg
Subject: Re: newbie needs a little help parsing a comma delimited file
Date: 
Message-ID: <37F2BED4.6D031ADB@home.com>
Samir Barjoud wrote:

>
>
> If you want to do this parser in LISP as an exercise in learning LISP,
> fine.  However, if you just want to read this data into LISP, why not
> have the C++ code write out the data as a LISP list?  You already have
> the code to read it into C++, after all.
>
> --
> Samir Barjoud
> ·····@mindspring.com

I want to do this purely as an exercise in learning LISP.  I would like those
with experience doing
this simple ( for others :)    ) exercise to point me towards enlightenment.

dave linenberg
From: Kent M Pitman
Subject: Re: newbie needs a little help parsing a comma delimited file
Date: 
Message-ID: <sfw7ll9gh80.fsf@world.std.com>
dave linenberg <········@home.com> writes:

> Samir Barjoud wrote:
> >
> > If you want to do this parser in LISP as an exercise in learning LISP,
> > fine.  However, if you just want to read this data into LISP, why not
> > have the C++ code write out the data as a LISP list?  You already have
> > the code to read it into C++, after all.
> >
> > --
> > Samir Barjoud
> > ·····@mindspring.com
> 
> I want to do this purely as an exercise in learning LISP.  I would like those
> with experience doing
> this simple ( for others :)    ) exercise to point me towards enlightenment.

Part of learning a language, of course, is learning how to do what it
wants you to do in the style it wants you to do it, so Samir's suggestion
*is* learning LISP, in a sense.  Printing out info in C++ style and
then asking LISP to read it in that way will cause you to write a program
in Lisp that is not very lispy, and so what it teaches you about the
language is questionable.

Here's an example of one way to do it that takes advantage of the fact
that you're doing something that involves relatively simple, predictable
syntax and where you might not mind redefining certain syntaxes of Lisp
that might be used differently in other cases.  But I really don't recommend
this as a general solution and am not sure it teaches good practices to
novices.

(defvar *my-readtable* (copy-readtable *readtable*))

(set-syntax-from-char #\, #\Space *my-readtable*)

(defun parse-comma-delimited-numbers-from-string (string)
  (with-input-from-string (string-stream string)
    (parse-comma-delimited-numbers-from-stream string-stream)))

(defun parse-comma-delimited-numbers-from-stream (stream)
  (let ((*readtable* *my-readtable*)
        (*read-eval* nil)) ;disable trojan horses
    (loop for x = (read stream nil nil)
          while x
          collect x)))

(parse-comma-delimited-numbers-from-string "1.0,3.0,5.3d7,6.2")
=> (1.0 3.0 5.3E7 6.2)

;; Note: I tested an earlier version of this but then edited it a lot
;;  to make it prettier so may have broken it somewhere along the way.
;;  The basic concept should work, though.
From: Rob Warnock
Subject: Re: newbie needs a little help parsing a comma delimited file
Date: 
Message-ID: <7t9n4q$d736v@fido.engr.sgi.com>
Kent M Pitman  <······@world.std.com> wrote:
+---------------
| (defvar *my-readtable* (copy-readtable *readtable*))
| (set-syntax-from-char #\, #\Space *my-readtable*)
| (defun parse-comma-delimited-numbers-from-string (string)
|   (with-input-from-string (string-stream string)
|     (parse-comma-delimited-numbers-from-stream string-stream)))
| (defun parse-comma-delimited-numbers-from-stream (stream)
|   (let ((*readtable* *my-readtable*)
|         (*read-eval* nil)) ;disable trojan horses
|     (loop for x = (read stream nil nil)
|           while x
|           collect x)))
| 
| (parse-comma-delimited-numbers-from-string "1.0,3.0,5.3d7,6.2")
| => (1.0 3.0 5.3E7 6.2)
| 
| ;; Note: I tested an earlier version of this but then edited it a lot
| ;;  to make it prettier so may have broken it somewhere along the way.
| ;;  The basic concept should work, though.
+---------------

Hmmm... The test case works fine in CLISP, but CMUCL 18b complains:

	* (parse-comma-delimited-numbers-from-string "1.0,3.0,5.3d7,6.2")
	Reader error at 4 on #<String-Input Stream>:
	Comma not inside a backquote.
	[...thence to debugger...]
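
One possible way around this, sketched here and NOT tested against
CMUCL 18b, is to stop relying on set-syntax-from-char and instead install
an explicit reader macro for the comma that returns no values. The standard
reader discards a macro character whose function returns zero values, so
the comma is skipped. The names *comma-skipping-readtable* and
read-comma-delimited-numbers are made up for this sketch.

```lisp
;; Assumption: a fresh copy of the standard readtable is acceptable here.
(defvar *comma-skipping-readtable* (copy-readtable nil))

;; Replace the backquote-related comma macro with one that reads nothing
;; and returns zero values, so the reader ignores the comma.
(set-macro-character #\,
                     (lambda (stream char)
                       (declare (ignore stream char))
                       (values))
                     nil                        ; terminating macro character
                     *comma-skipping-readtable*)

(defun read-comma-delimited-numbers (string)
  (let ((*readtable* *comma-skipping-readtable*)
        (*read-eval* nil))                      ; no #. evaluation
    (with-input-from-string (stream string)
      ;; READ returns NIL at end of input, which ends the loop.
      (loop for x = (read stream nil nil)
            while x
            collect x))))

(read-comma-delimited-numbers "1,2,3")
;; => (1 2 3)
```

As with the original, this only makes sense for simple, trusted input, since
anything else the Lisp reader understands will also be read.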


-Rob

-----
Rob Warnock, 8L-846		····@sgi.com
Applied Networking		http://reality.sgi.com/rpw3/
Silicon Graphics, Inc.		Phone: 650-933-1673
1600 Amphitheatre Pkwy.		FAX: 650-933-0511
Mountain View, CA  94043	PP-ASEL-IA
From: Rainer Joswig
Subject: Re: newbie needs a little help parsing a comma delimited file
Date: 
Message-ID: <joswig-0410991025050001@194.163.195.67>
In article <············@fido.engr.sgi.com>, ····@rigden.engr.sgi.com (Rob Warnock) wrote:

> | (parse-comma-delimited-numbers-from-string "1.0,3.0,5.3d7,6.2")
> | => (1.0 3.0 5.3E7 6.2)
> | 
> | ;; Note: I tested an earlier version of this but then edited it a lot
> | ;;  to make it prettier so may have broken it somewhere along the way.
> | ;;  The basic concept should work, though.
> +---------------
> 
> Hmmm... The test case works fine in CLISP, but CMUCL 18b complains:
> 
>         * (parse-comma-delimited-numbers-from-string "1.0,3.0,5.3d7,6.2")
>         Reader error at 4 on #<String-Input Stream>:
>         Comma not inside a backquote.
>         [...thence to debugger...]

Maybe it is time for a bug report. ;-)
From: dave linenberg
Subject: Re: newbie needs a little help parsing a comma delimited file
Date: 
Message-ID: <37F96086.BFEAD306@home.com>
Hi all,

I originally had the question about parsing comma delimited files. ( I just started learning
lisp a week ago)   It is VERY common to have to parse such files ( exported from excel,  data
files holding equity/ forex/ interest-rate instrument data etc. ) on the trading floor where
I work.

In any case, I came up with a simple function,

(defun comma-parse (in-string)
  (concatenate 'string "(" (substitute #\space #\, in-string) ")"))

which simply substitutes spaces for the commas, if they exist, and adds parentheses to both
sides of the string. If you wrap that with the read-from-string function, i.e.

(read-from-string (comma-parse "1,2,3,4,5"))

you get

(1 2 3 4 5)

which is exactly what I wanted. This can easily be implemented in a loop or do construct
which reads the file line by line until the eof, appending or collecting the lines into the
desired form.

(   (1 2 3 4 5)
    (1 1 5 9 2)
.
.
    )
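
A minimal sketch of that line-by-line loop might look like the following.
The function name parse-comma-delimited-file is made up for illustration,
comma-parse is repeated so the example stands alone, and binding
*read-eval* to nil is an added precaution for files you did not write
yourself.

```lisp
(defun comma-parse (in-string)
  (concatenate 'string "(" (substitute #\Space #\, in-string) ")"))

;; Hypothetical wrapper (an assumption, not from the post): read PATHNAME
;; line by line and collect one list per line.
(defun parse-comma-delimited-file (pathname)
  (with-open-file (in pathname :direction :input)
    (let ((*read-eval* nil))                     ; disable #. evaluation
      ;; READ-LINE returns NIL at end of file, which ends the loop.
      (loop for line = (read-line in nil nil)
            while line
            collect (read-from-string (comma-parse line))))))
```

Given a file whose two lines are 1,2,3,4,5 and 1,1,5,9,2 this should return
((1 2 3 4 5) (1 1 5 9 2)). A blank line in the file comes back as NIL, since
"()" reads as the empty list.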

Although I have pretty much no experience with lisp, I am blown away by how beautiful and
powerful the language is.  Although I haven't even scratched the surface of the language, it
is immediately apparent that the amount of code required to do amazingly powerful things is
quite small.

It's amazing that doing the above in C++ would require quite a bit of code, which would
ultimately be a hack implemented by some programmer(s) implementing containers (lists /
vectors) holding containers of generic objects (string, double, int), each with a char*
constructor, to be read from a file.  I've done this, and it's taken a while (i.e. at least
a year), and lisp gives this, and a lot more, for free!!

dave linenberg

