From: Dustin Withers
Subject: Tree Creation
Date: 
Message-ID: <1159544167.111023.241480@c28g2000cwb.googlegroups.com>
Hello All,

I'm working on a project to parse EDI documents and I'm learning Lisp
at the same time :) It's not anything that needs to be done quickly (we
are already using an outsourced application) but I would love to use
Lisp in a production environment.

Given this mapping:
(defparameter *850-mapping*
  '((("ISA")
     (("GS")
      (("ST"))
      (("BEG"))
      (("CSH"))
      (("DTM"))
      (("N1"))
      (("PO1")
       (("CTP"))
       (("PID"))
       (("PO4")))
      (("CTT")))
     (("SE"))
     (("GE")))
    (("IEA"))))

And this semi-parsed document:
(("ISA" "00" "          " "00" "          " "ZZ" "AMAZON         " "ZZ"
  "1111           " "0 11126" "1117" "U" "11100" "000000111" "0" "P" ">")
 ("GS" "PO" "AMAZON" "1111" "20060726" "1116" "65001" "X" "004010")
 ("ST" "850" "0001")
 ("BEG" "00" "NE" "P811711" "" "20060726")
 ("CSH" "Y")
 ("DTM" "064" "20060726")
 ("DTM" "063" "20060802")
 ("N1" "ST" "" "92" "TUL1")
 ("PO1" "1" "XX" "EA" "XX" "PE" "UP" "685387094514")
 ("PO1" "2" "XX" "EA" "XX" "PE" "UP" "685387022391")
 ("CTT" "2" "43")
 ("SE" "10" "0001")
 ("GE" "1" "65001")
 ("IEA" "1" "000000065")
 (""))

What might be a good direction for me to head in if I want to turn the
semi-parsed document into:
((("ISA" "00" "          " "00" "          " "ZZ" "AMAZON         " "ZZ"
   "1111           " "0 11126" "1117" "U" "11100" "000000111" "0" "P" ">")
  (("GS" "PO" "AMAZON" "1111" "20060726" "1116" "65001" "X" "004010")
   (("ST" "850" "0001"))
   (("BEG" "00" "NE" "P811711" "" "20060726"))
   (("CSH" "Y"))
   (("DTM" "064" "20060726"))
   (("DTM" "063" "20060802"))
   (("N1" "ST" "" "92" "TUL1"))
   (("PO1" "1" "XX" "EA" "XX" "PE" "UP" "685387094514"))
   (("PO1" "2" "XX" "EA" "XX" "PE" "UP" "685387022391"))
   (("CTT" "2" "43")))
  (("SE" "10" "0001"))
  (("GE" "1" "65001")))
 (("IEA" "1" "000000065")))

The real mapping will have more nodes than I show here; this is just a
small sample of the nodes of an 850.  You'll notice that the child nodes
of PO1 listed in the mapping (CTP, PID and PO4) are not used in the
example document.

Here is my current (non-working) solution:
(defun parse-to-tree (document mapping)
  (if document
      (parse-to-tree (cdr document)
                     (subst (car document)
                            (list (caar document))
                            mapping
                            :test #'equal))
      mapping))

The problem is that it only replaces the first one and then I lose any
recurring nodes.  One idea I had was to create a mapping on the fly
that has as many nodes as there are in the document and then somehow
keep subst from substituting all nodes in the mapping.  Is this a good
idea?  Also, after watching Rainer's DSL video I thought about putting
the documents into objects, but my CLOS skills are non-existent...
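
To make the failure concrete, here is a minimal sketch on a cut-down
mapping.  The names *mini-mapping* and fill-placeholder are made up for
this illustration and are not part of the code above.  SUBST replaces
the ("DTM") placeholder with the first DTM segment, so by the time the
second DTM arrives there is no placeholder left to match and the
segment is silently dropped.

```lisp
;; Hypothetical miniature of the mapping, only to show the behaviour.
(defparameter *mini-mapping*
  '((("GS")
     (("DTM"))
     (("N1")))))

(defun fill-placeholder (segment mapping)
  "Replace every (NAME) placeholder in MAPPING with SEGMENT, the same
SUBST call that PARSE-TO-TREE makes for each segment."
  (subst segment (list (car segment)) mapping :test #'equal))

;; First DTM: the ("DTM") placeholder is found and replaced.
(defparameter *after-first-dtm*
  (fill-placeholder '("DTM" "064" "20060726") *mini-mapping*))

;; Second DTM: nothing in the tree is EQUAL to ("DTM") any more,
;; so the tree comes back unchanged and the segment is lost.
(defparameter *after-second-dtm*
  (fill-placeholder '("DTM" "063" "20060802") *after-first-dtm*))
```

Note also that placeholders which never receive a segment, such as
("N1") here, stay in the result as bare names.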

Thanks for any help,
-dustin

From: Pascal Bourguignon
Subject: Re: Tree Creation
Date: 
Message-ID: <87irj6s3xl.fsf@thalassa.informatimago.com>
"Dustin Withers" <·········@gmail.com> writes:

> Hello All,
>
> I'm working on a project to parse EDI documents and I'm learning Lisp
> at the same time :) It's not anything that needs to be done quickly (we
> are already using an outsourced application) but I would love to use
> Lisp in a production environment.
>
> Given this mapping:
> (defparameter *850-mapping*
>   '((("ISA")
>      (("GS")
>       (("ST"))
>       (("BEG"))
>       (("CSH"))
>       (("DTM"))
>       (("N1"))
>       (("PO1")
>        (("CTP"))
>        (("PID"))
>        (("PO4")))
>       (("CTT")))
>      (("SE"))
>      (("GE")))
>     (("IEA"))))

This defines a grammar (it might be useful to put it in a more
grammar-like form though).


> What might be a good direction for me to head in if I want to turn the
> semi-parsed document into:

Therefore I would write a parser generator from the grammar specified.
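
As a rough illustration of that direction, and well short of a real
parser generator, the mapping itself can be walked as a grammar by a
small recursive function.  This is only a sketch under assumptions:
parse-level, node-name and node-children are hypothetical names, the
demo mapping and segments are shortened stand-ins for the ones in the
question, siblings may appear in any order and any number of times, and
a segment name is assumed to occur at only one level of the mapping.

```lisp
(defparameter *demo-mapping*          ; shortened stand-in for *850-mapping*
  '((("ISA")
     (("GS")
      (("DTM"))
      (("PO1")
       (("PID")))
      (("CTT")))
     (("GE")))
    (("IEA"))))

(defparameter *demo-segments*         ; shortened stand-in for the document
  '(("ISA" "00") ("GS" "PO") ("DTM" "064") ("DTM" "063")
    ("PO1" "1") ("PO1" "2") ("CTT" "2") ("GE" "1") ("IEA" "1") ("")))

(defun node-name (spec)
  "Name of a mapping node: ((\"PO1\") child...) -> \"PO1\"."
  (car (car spec)))

(defun node-children (spec)
  "Child specs of a mapping node."
  (cdr spec))

(defun parse-level (specs segments)
  "Consume leading SEGMENTS whose names appear in SPECS.
Returns two values: the tree nodes built and the unconsumed segments."
  (let ((nodes '()))
    (loop
      (let ((spec (find (car (first segments)) specs
                        :key #'node-name :test #'equal)))
        ;; Stop at end of input or at a segment this level does not know;
        ;; the caller one level up gets a chance to consume it.
        (unless (and segments spec)
          (return (values (nreverse nodes) segments)))
        (multiple-value-bind (children remaining)
            (parse-level (node-children spec) (rest segments))
          (push (cons (first segments) children) nodes)
          (setf segments remaining))))))
```

Calling (parse-level *demo-mapping* *demo-segments*) builds one node
per segment, so repeated DTM and PO1 segments are kept, and unused
children such as PID simply do not appear.  The second return value
holds whatever was not consumed, here the trailing ("").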


-- 
__Pascal Bourguignon__                     http://www.informatimago.com/

"Logiciels libres : nourris au code source sans farine animale."

From: Dustin Withers
Subject: Re: Tree Creation
Date: 
Message-ID: <1159680103.121224.66790@m7g2000cwm.googlegroups.com>
Pascal Bourguignon wrote:
> "Dustin Withers" <·········@gmail.com> writes:
>
> > Hello All,
> >
> > I'm working on a project to parse EDI documents and I'm learning Lisp
> > at the same time :) It's not anything that needs to be done quickly (we
> > are already using an outsourced application) but I would love to use
> > Lisp in a production environment.
> >
> > Given this mapping:
> > (defparameter *850-mapping*
> >   '((("ISA")
> >      (("GS")
> >       (("ST"))
> >       (("BEG"))
> >       (("CSH"))
> >       (("DTM"))
> >       (("N1"))
> >       (("PO1")
> >        (("CTP"))
> >        (("PID"))
> >        (("PO4")))
> >       (("CTT")))
> >      (("SE"))
> >      (("GE")))
> >     (("IEA"))))
>
> This defines a grammar (it might be useful to put it in a more
> grammar-like form though).
>

What would be a more grammar-like form? EDI documents (ASC X12) have two
delimiters, a segment delimiter and an element delimiter.  The first
"element" of a segment is the name of the segment.  Each subsequent
element, up to the next segment, is referenced by its position (BEG01,
BEG02, etc.).  The document itself contains no relationship data.  I'm
trying to figure out what a more verbose mapping would look like.
Rainer Joswig's DSL example uses character indexes of the rows, and his
mapping is very simple and straightforward.  I don't know what other
information I could include in the mapping that would make it more of a
grammar definition.  Should I use CL-PPCRE to define the grammar?
Currently I'm using SPLIT-SEQUENCE to break the document into a list
representation.
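
For readers following along, the two-delimiter structure can be
sketched in plain Common Lisp without any library.  This is an
illustration only: split-on, parse-segments and segment-element are
hypothetical names, and the "~" segment terminator and "*" element
separator are assumed defaults (a real X12 interchange declares its
delimiters in the ISA segment).

```lisp
(defun split-on (delimiter string)
  "Split STRING at every DELIMITER character, keeping empty fields."
  (loop with start = 0
        for end = (position delimiter string :start start)
        collect (subseq string start end)
        while end
        do (setf start (1+ end))))

(defun parse-segments (text &key (segment-delimiter #\~)
                                 (element-delimiter #\*))
  "Turn raw TEXT into a list of segments, each a list of element strings.
The first element of each segment is the segment name."
  (mapcar (lambda (segment) (split-on element-delimiter segment))
          (split-on segment-delimiter text)))

(defun segment-element (segment n)
  "Return element N of SEGMENT, counting from 1 as in BEG01, BEG02.
Position 0 holds the segment name, so NTH lines up directly."
  (nth n segment))
```

With these definitions,
(parse-segments "ST*850*0001~BEG*00*NE*P811711**20060726~")
yields ("ST" "850" "0001"), the BEG segment with its empty fourth
element preserved, and a final ("") produced by the trailing
terminator, which may be where the ("") at the end of the semi-parsed
document comes from.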

>
> > What might be a good direction for me to head in if I want to turn the
> > semi-parsed document into:
>
> Therefore I would write a parser generator from the grammar specified.

Thanks for the help on this,
-dustin

From: Pascal Bourguignon
Subject: Re: Tree Creation
Date: 
Message-ID: <874puoqdly.fsf@thalassa.informatimago.com>
"Dustin Withers" <·········@gmail.com> writes:

> Pascal Bourguignon wrote:
>> "Dustin Withers" <·········@gmail.com> writes:
>>
>> > Hello All,
>> >
>> > I'm working on a project to parse EDI documents and I'm learning Lisp
>> > at the same time :) It's not anything that needs to be done quickly (we
>> > are already using an outsourced application) but I would love to use
>> > Lisp in a production environment.
>> >
>> > Given this mapping:
>> > (defparameter *850-mapping*
>> >   '((("ISA")
>> >      (("GS")
>> >       (("ST"))
>> >       (("BEG"))
>> >       (("CSH"))
>> >       (("DTM"))
>> >       (("N1"))
>> >       (("PO1")
>> >        (("CTP"))
>> >        (("PID"))
>> >        (("PO4")))
>> >       (("CTT")))
>> >      (("SE"))
>> >      (("GE")))
>> >     (("IEA"))))
>>
>> This defines a grammar (it might be useful to put it in a more
>> grammar-like form though).
>
> What would be a more grammar-like form? EDI documents (ASC X12) have two
> delimiters, a segment and an element delimiter.  The first "element" of
> the segment is the name of the segment.  Each subsequent element, up to
> the next segment, is sequentially referenced (BEG01, BEG02, etc...).
> The document itself contains no relationship data.

There are several levels.  Obviously, the ASN.1 level is taken into account.

When you have a file format with records, some mandatory, some
optional, some repeatable, with a hierarchical structure like the one
denoted by the above *850-mapping*, you can describe this file format
with a grammar.  You can then use a parser (and therefore a parser
generator) to read the file, and a synthesizer (and therefore a
synthesizer generator) to write the file.


doc850 ::= isa iea .
isa    ::= 'ISA' isadata gs* [se] [ge] .
gs     ::= 'GS' gsdata [st] [beg] [csh] dtm* [n1] po1* [ctt] .
st     ::= 'ST' stdata .
beg    ::= 'BEG' begdata .
csh    ::= 'CSH' cshdata .
...
po1    ::= 'PO1' po1data [ctp] [pid] [po4] .
ctt    ::= 'CTT' cttdata .
...

isadata ::= field1 field2 field3 field4 name field6 field7 field8
            field9 field10 field11 field12 field13 field14 field15 '>' .

;; ("ISA" "00" "          " "00" "          " "ZZ" "AMAZON         " "ZZ"
;;   "1111           " "0 11126" "1117" "U" "11100" "000000111" "0" "P"
;;   ">") 

...
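
One possible way to carry such a grammar in Lisp, sketched here under
assumptions rather than as a definitive design: each rule lists its
children in order, tagged :one for a mandatory item, :optional for
[x], and :many for x*.  The encoding, the name parse-items, and the
shortened *demo-grammar* are all made up for this illustration, and
the field-level rules (isadata and so on) are left out.

```lisp
(defparameter *demo-grammar*   ; shortened, hypothetical encoding of the rules
  '(("ISA" ("GS" :many) ("SE" :optional) ("GE" :optional))
    ("GS"  ("ST" :optional) ("DTM" :many) ("PO1" :many) ("CTT" :optional))
    ("PO1" ("PID" :optional))))

(defun parse-items (items segments grammar)
  "Match ITEMS, a list of (name occurrence), in order against SEGMENTS.
Returns two values: the tree nodes built and the unconsumed segments.
Signals an error when a :ONE item is missing."
  (let ((nodes '()))
    (dolist (item items (values (nreverse nodes) segments))
      (destructuring-bind (name occurrence) item
        (let ((matched 0))
          ;; :MANY keeps consuming; :ONE and :OPTIONAL take at most one.
          (loop while (and (equal (car (first segments)) name)
                           (or (eq occurrence :many) (zerop matched)))
                do (multiple-value-bind (children remaining)
                       ;; Names without a rule are leaves (no children).
                       (parse-items (cdr (assoc name grammar :test #'equal))
                                    (rest segments)
                                    grammar)
                     (push (cons (first segments) children) nodes)
                     (setf segments remaining)
                     (incf matched)))
          (when (and (eq occurrence :one) (zerop matched))
            (error "Missing required segment ~A" name)))))))
```

The top-level rule doc850 ::= isa iea then becomes the call
(parse-items '(("ISA" :one) ("IEA" :one)) segments *demo-grammar*).
Unlike a plain tree walk, this version checks order and occurrence, so
a document without an IEA segment is rejected instead of quietly
accepted.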


-- 
__Pascal Bourguignon__                     http://www.informatimago.com/

In a World without Walls and Fences, 
who needs Windows and Gates?