Hello,
I am interested in reading Penn Treebank parses into a Lisp program,
specifically from the output of Eugene Charniak's parser. The parses
are already in s-exp format, but problems arise due to the inclusion of
unquoted periods, among other things. I am still learning CL, so this
could be totally trivial. I have googled around and not found much.
Does anyone know of code that already does this?
If not, what approach would you recommend?
Some ideas I had were to change the readtable and try to read it in as
Lisp, or to read it into a string and then treat it as a DSL.
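To make the second idea concrete, here is a minimal sketch of a hand-written reader that bypasses the Lisp reader entirely. The function names (WHITESPACE-P, SKIP-WHITESPACE, READ-TOKEN, READ-TREE) are hypothetical, and it assumes the input uses only parentheses and whitespace as delimiters. Every label and word is kept as a string, so ".", "," and "''" need no escaping.

```lisp
;; Sketch only: all names here are made up for illustration.
(defun whitespace-p (char)
  (member char '(#\Space #\Tab #\Newline #\Return #\Page)))

(defun skip-whitespace (stream)
  ;; Consume characters until the next non-whitespace one (or end of input).
  (loop for char = (peek-char nil stream nil nil)
        while (and char (whitespace-p char))
        do (read-char stream)))

(defun read-token (stream)
  ;; Collect characters up to the next delimiter and return them as a string.
  (with-output-to-string (out)
    (loop for char = (peek-char nil stream nil nil)
          until (or (null char)
                    (whitespace-p char)
                    (char= char #\()
                    (char= char #\)))
          do (write-char (read-char stream) out))))

(defun read-tree (stream)
  "Read one tree from STREAM as nested lists of strings; NIL at end of input."
  (skip-whitespace stream)
  (let ((char (peek-char nil stream nil nil)))
    (cond ((null char) nil)
          ((char= char #\))
           (error "Unexpected close parenthesis in treebank input."))
          ((char= char #\()
           (read-char stream)                 ; consume the open parenthesis
           (let ((children '()))
             (loop
               (skip-whitespace stream)
               (let ((next (peek-char nil stream nil nil)))
                 (cond ((null next)
                        (error "Unbalanced parenthesis in treebank input."))
                       ((char= next #\))
                        (read-char stream)    ; consume the close parenthesis
                        (return (nreverse children)))
                       (t (push (read-tree stream) children)))))))
          (t (read-token stream)))))
```

For example:

```lisp
(with-input-from-string (in "(S1 (S (NP (DT The) (NN dog)) (VP (VBD barked)) (. .)))")
  (read-tree in))
;; => ("S1" ("S" ("NP" ("DT" "The") ("NN" "dog")) ("VP" ("VBD" "barked")) ("." ".")))
```

One limitation of this sketch: an empty list "()" also comes back as NIL, so it cannot be told apart from end of input.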
Thank you.
Chris Millward
"Chris Millward" <··············@gmail.com> writes:
> Hello,
> I am interested in reading penn treebank parses into a lisp program,
> specifically from the output of Eugene Charniak's parser. The parses
> are already in s-exp format, but problems arise due to the inclusion of
> unquoted periods, among other things. I am still learning CL, so this
> could be totally trivial. I have googled around and not found much.
Have you considered writing a Perl script to escape the offending
characters in your treebank files? I have used this approach to load
LDC data into Lisp in the past.
> Have you considered writing a Perl script to escape the offending
> characters in your treebank files? I have used this approach to load
> ldc data into Lisp in the past.
Yes, but I wanted to do it all in Lisp as a learning experience. Thanks
for the suggestion though.
On 15 Jul 2005 12:15:48 -0700, <··············@gmail.com> wrote:
>
>> Have you considered writing a Perl script to escape the offending
>> characters in your treebank files? I have used this approach to load
>> ldc data into Lisp in the past.
>
> Yes, but I wanted to do it all in Lisp as a learning experience. Thanks
> for the suggestion though.
This sounds like a job for regular expressions, and thus CL-PPCRE. It
could probably also be fixed by modifying the readtable (but ask
someone else for clues). Or use Perl; it depends on what you want to learn.
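As a rough sketch of the regular-expression route, the following wraps every token in double quotes so that the standard reader sees only lists of strings. It assumes CL-PPCRE is already loaded (for example through Quicklisp), and the two function names are hypothetical. PRIN1-TO-STRING does the quoting, so any backslash or double quote inside a token (such as the "\/" found in some treebank tokens) is escaped rather than lost.

```lisp
;; Sketch only; assumes CL-PPCRE is loaded, e.g. (ql:quickload "cl-ppcre").
;; QUOTE-TREEBANK-TOKENS and PARSE-TREEBANK-STRING are made-up names.
(defun quote-treebank-tokens (text)
  ;; A token is any run of characters that is neither whitespace nor a
  ;; parenthesis. With :SIMPLE-CALLS T the function receives the matched text.
  (cl-ppcre:regex-replace-all "[^\\s()]+" text
                              (lambda (token) (prin1-to-string token))
                              :simple-calls t))

(defun parse-treebank-string (text)
  ;; Disable #. evaluation, since the input is untrusted data.
  (let ((*read-eval* nil))
    (read-from-string (quote-treebank-tokens text))))
```

Calling (parse-treebank-string "(S1 (S (NP (NN dog)) (. .)))") should then return nested lists of strings, with the period tokens arriving as the string ".".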
--
LOOP :: a Domain Specific Language.