Robert Maas, see http://tinyurl.com/uh3t wrote:
> Do you know of any free software, written in Java or Common Lisp,
> which parses and/or generates MS-Word documents?
You underestimate the task. Check out OpenOffice, KOffice, et al. to
find libraries that attempt compatibility. It's a quirky, poorly
documented binary format. It doesn't matter what language you use.
- Daniel
On Tue, 12 Feb 2008 07:13:46 +0100, D Herring
<········@at.tentpost.dot.com> wrote:
> Robert Maas, see http://tinyurl.com/uh3t wrote:
>> Do you know of any free software, written in Java or Common Lisp,
>> which parses and/or generates MS-Word documents?
>
> You underestimate the task. Check out OpenOffice, KOffice, et al. to
> find libraries that attempt compatibility. It's a quirky, poorly
> documented binary format. It doesn't matter what language you use.
>
> - Daniel
That is true only with qualifications. The newer Word 2007 format is a
zip-compressed XML format and is well documented, and it can be read
with cxml. Microsoft provides a conversion tool that lets older Word
versions read and write this format.
So I would:
1. Get the converter.
2. Load the document into Word.
3. Convert the document to Word 2007 format.
4. Load it using cxml (unzip first).
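
Since the original question also allowed Java, step 4 can be sketched
there with nothing but the standard library. This is a rough sketch, not
a complete reader: it assumes the main body is stored in the archive
entry word/document.xml and that visible text sits in w:t elements
inside w:p paragraphs of the WordprocessingML namespace. The class name
DocxText and the method extractText are invented for illustration. It
ignores formatting, headers, footnotes and tables, and a paragraph
nested inside another (e.g. in a text box) would be emitted twice.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class DocxText {
    // Main WordprocessingML namespace (an assumption about the input file).
    static final String W_NS =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Returns the plain text of a .docx stream, one line per w:p paragraph. */
    public static String extractText(InputStream docx) throws Exception {
        // "Unzip first": a .docx is a zip archive; find the main document part.
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.getName().equals("word/document.xml")) {
                    // readAllBytes (Java 9+) stops at the end of this entry.
                    return paragraphs(zip.readAllBytes());
                }
            }
        }
        throw new IOException("word/document.xml not found in archive");
    }

    // Parses the XML and concatenates the w:t text runs of each paragraph.
    private static String paragraphs(byte[] xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc =
            factory.newDocumentBuilder().parse(new ByteArrayInputStream(xml));
        StringBuilder out = new StringBuilder();
        NodeList paras = doc.getElementsByTagNameNS(W_NS, "p");
        for (int i = 0; i < paras.getLength(); i++) {
            Element para = (Element) paras.item(i);
            NodeList runs = para.getElementsByTagNameNS(W_NS, "t");
            for (int j = 0; j < runs.getLength(); j++) {
                out.append(runs.item(j).getTextContent());
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Usage: java DocxText converted.docx
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.print(extractText(in));
        }
    }
}
```

The same two stages (open the zip, then hand word/document.xml to an
XML parser) are what the CL zip and cxml libraries listed below would
be used for.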
Open XML File Format:
http://msdn2.microsoft.com/en-us/library/aa338205.aspx
Conversion tool:
http://www.microsoft.com/downloads/details.aspx?FamilyId=941b3470-3ae9-4aee-8f43-c6bb74cd1466&displaylang=en
CL zip lib:
http://common-lisp.net/project/zip/
CL cxml:
http://common-lisp.net/project/cxml/
--------------
John Thingstad