Discussion:
Common Lisp Library for MS Word files (.doc or .docx)?
Mark H. David
2011-02-28 23:46:54 UTC
Does anyone know of any CL libraries for dealing with Microsoft Word files?
Tools for creating them, reading from them, parsing them, converting
them to plain text or other formats, things like that?
Thanks,
-Mark
Chris Perkins
2011-03-01 01:07:58 UTC
Mark,

I don't know of any Common Lisp libraries, but the Apache Software
Foundation has a Java library for that: Apache POI
(http://poi.apache.org/). I used it several years ago and it worked well,
though I was not reading .doc files.

Hope this helps,

Chris
_______________________________________________
pro mailing list
http://common-lisp.net/cgi-bin/mailman/listinfo/pro
Daniel Herring
2011-03-01 01:20:15 UTC
Post by Mark H. David
Does anyone know of any CL libraries for dealing with Microsoft Word files?
Tools for creating them, reading from them, parsing them, converting
them to plain text or other formats, things like that?
I suspect that RDNZL, the .NET bridge for Common Lisp, might provide the
best results. You can use it to hook into the beast itself, i.e. to drive
Word directly.

Another approach is to hook into the code for another office suite such
as Open/LibreOffice, AbiWord, or KWord.

In addition to Apache POI, there is also wvWare, but it doesn't support
the new XML formats...

Right when the libraries were becoming good at doc, MS went and changed
formats. Funny coincidence, that.

Later,
Daniel
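
For the "new XML formats" mentioned above, a .docx file is a ZIP archive whose main body lives in word/document.xml, so plain-text extraction needs only a zip reader and an XML parser. The following is a minimal sketch in Python rather than Common Lisp, and the function name docx_to_text is made up for illustration. It ignores headers, footnotes, tables of contents and formatting; the same steps should map onto whatever zip and XML libraries are available in CL.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by elements in word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_to_text(source):
    """Return the body text of a .docx file, one line per paragraph.

    `source` is a file path or a binary file-like object.
    (Hypothetical helper; only the main document part is read.)
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is split across <w:t> elements inside runs.
        pieces = [node.text or "" for node in para.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(pieces))
    return "\n".join(paragraphs)
```

This only covers reading; the older binary .doc format needs a different approach, such as the external tools named in this thread.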
Shaneal Manek
2011-03-01 01:53:59 UTC
A few years back I used the standalone 'antiword' binary to convert
.doc files to plaintext. It seemed to work pretty well.

-Shaneal
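
Calling the standalone antiword binary from a program, as described above, amounts to running it on the file and capturing standard output. A minimal sketch in Python, assuming antiword is installed and on PATH; the wrapper name doc_to_text is made up for illustration, and the same pattern should carry over to a CL implementation's facility for running external programs.

```python
import shutil
import subprocess


def doc_to_text(path, antiword="antiword"):
    """Return the plain text of a .doc file by running the antiword binary.

    Hypothetical wrapper: raises FileNotFoundError if the binary is not
    on PATH, and CalledProcessError if antiword exits with an error.
    """
    exe = shutil.which(antiword)
    if exe is None:
        raise FileNotFoundError(f"{antiword!r} not found on PATH")
    # antiword writes the converted text to standard output.
    result = subprocess.run([exe, path], capture_output=True, check=True)
    return result.stdout.decode("utf-8", errors="replace")
```

The output encoding depends on how antiword is configured, so undecodable bytes are replaced rather than treated as an error.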
Peter Seibel
2011-03-01 02:40:05 UTC
I have some code that I used when doing my books for parsing and
generating RTF. Worked pretty well but is nowhere near polished or
well packaged.

-Peter
--
Peter Seibel
http://www.codequarterly.com/
Matthew Mondor
2011-03-01 04:24:26 UTC
On Mon, 28 Feb 2011 18:46:54 -0500
Post by Mark H. David
Does anyone know of any CL libraries for dealing with Microsoft Word files?
Tools for creating them, reading from them, parsing them, converting
them to plain text or other formats, things like that?
Unfortunately also not CL, but other utilities to look into would be
antiword and OdfConverter (which also claims to support OOXML).
--
Matt
Alessio Stalla
2011-03-01 08:40:35 UTC
Post by Mark H. David
Does anyone know of any CL libraries for dealing with Microsoft Word files?
Tools for creating them, reading from them, parsing them, converting
them to plain text or other formats, things like that?
Thanks,
-Mark
I know some time ago Knut Olav Bøhmer was fiddling with OpenOffice.org
and ABCL. I don't know if that ever resulted in a library suiting your
purpose.

hth,
Alessio
