Greetings,

Is anybody aware of any code that will allow me to read .rtf or .doc or .pdf
or .htm as plain text (so I can open a StreamReader on them)? Thanks,

-Dave

Re: Convert .rtf or .doc or .pdf or .htm to plain txt by David

David
Fri Jan 28 08:17:11 CST 2005


"Dave" <nospam@yahoo.com> wrote in message
news:uEOED%23TBFHA.2624@TK2MSFTNGP11.phx.gbl...
> Greetings,
>
> Is anybody aware of any code that will allow me to read .rtf or .doc or
> .pdf or .htm as plain text (so I can do a streamreader off them). Thanks,
>

Each format would require a different tool. Microsoft Word can open .rtf and,
of course, .doc, and save either one as plain text.

For PDF, check out pdftotext.exe from the Xpdf project:

http://www.foolabs.com/xpdf/download.html

From their web site:

"Xpdf is an open source viewer for Portable Document Format (PDF) files.
(These are sometimes also called 'Acrobat' files, from the name of
Adobe's PDF software.) The Xpdf project also includes a PDF text extractor,
PDF-to-PostScript converter, and various other utilities.

Xpdf runs under the X Window System on UNIX, VMS, and OS/2. The non-X
components (pdftops, pdftotext, etc.) also run on Win32 systems and should
run on pretty much any system with a decent C++ compiler."


It's a command-line tool, so you would need to shell out to it and then open
a StreamReader against the output file.
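
The shell-out approach described above could be sketched roughly as follows.
This is only an illustration in Python (the thread itself targets .NET, where
System.Diagnostics.Process and StreamReader would play the same roles). It
assumes pdftotext is on the PATH; the helper names build_command and
pdf_to_text are made up for this example, and the -enc UTF-8 option is passed
so that the encoding of the output file is known when it is read back.

```python
import subprocess
import tempfile
from pathlib import Path


def build_command(pdf_path, txt_path, exe="pdftotext"):
    """Return the argument list for: pdftotext -enc UTF-8 <input.pdf> <output.txt>."""
    return [exe, "-enc", "UTF-8", str(pdf_path), str(txt_path)]


def pdf_to_text(pdf_path, exe="pdftotext", run=subprocess.run):
    """Shell out to pdftotext, then read back the text file it wrote."""
    with tempfile.TemporaryDirectory() as tmp:
        txt_path = Path(tmp) / "out.txt"
        # check=True raises CalledProcessError if the tool exits non-zero.
        run(build_command(pdf_path, txt_path, exe), check=True)
        # Read the output before the temporary directory is cleaned up.
        return txt_path.read_text(encoding="utf-8", errors="replace")
```

A call such as pdf_to_text("report.pdf") (the file name is hypothetical) would
return the extracted text as a single string. The run parameter exists only so
the external call can be swapped out when the tool is not installed.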

David




Re: Convert .rtf or .doc or .pdf or .htm to plain txt by Beringer

Beringer
Fri Jan 28 08:59:43 CST 2005

As a related topic:
Does anybody know of code examples showing how to convert RTF to HTML, XML, etc.?

Thanks in advance,
Eric

"David Browne" <davidbaxterbrowne no potted meat@hotmail.com> wrote in
message news:eJNMQQUBFHA.2180@TK2MSFTNGP12.phx.gbl...
> Each format would require a different tool. Microsoft Word can do .rtf
> and, of course, .doc.



Re: Convert .rtf or .doc or .pdf or .htm to plain txt by Dave

Dave
Fri Jan 28 12:38:32 CST 2005

David,

This tool from Foolabs does exactly what I was looking for. I am looking to
use it, though, in the .NET Compact Framework. Is there a way to do that?

-Dave

"David Browne" <davidbaxterbrowne no potted meat@hotmail.com> wrote in
message news:eJNMQQUBFHA.2180@TK2MSFTNGP12.phx.gbl...
> But for PDF check out the pdftotext.exe from the XPDF library
>
> http://www.foolabs.com/xpdf/download.html
>
> It's a commandline tool so you would need to shell out to it, and then
> open a streamreader against the output file.



Re: Convert .rtf or .doc or .pdf or .htm to plain txt by David

David
Sat Jan 29 15:25:58 CST 2005


"Dave" <nospam@yahoo.com> wrote in message
news:%23fV9SiWBFHA.3588@TK2MSFTNGP11.phx.gbl...
> David,
>
> This tool from Foolabs does exactly what I was looking for. I am looking
> to use it, though, in the .NET Compact Framework. Is there a way to do
> that?
>

It's not managed code; it's a native binary compiled from C++. It might
run as-is, or you might be able to compile the source for your platform.

David