Hello all.

I have an Exchange 2003 environment with over 300 user accounts.

When we get a new user, I create an HTM signature file, open it in
Word, save it as RTF and then save it as TXT. These files are put in a
folder on a network share named after the user's username.

For example, if the user's name is John Smith, the network share
would contain a folder called jsmith, and within that folder would be
3 files: signature.htm, signature.rtf and signature.txt.

During login, the login script takes these files and xcopies them
into the Signatures folder in the user's profile. I then have a VBS
script (courtesy of Sue Mosher, Microsoft MVP) that sets this
signature file as the current signature.

My problem is this... I do not want to have to OPEN the HTM file to
create the RTF and the TXT. I COULD create the RTF simply by doing an
xcopy "path\sig.htm" "path\sig.rtf" and that works fine. Trouble is,
when creating the TXT the same way, ALL the HTML code becomes visible.

Is it possible at all to do this via VBScript?

Thanks in advance.

Re: Document conversion by mr_unreliable

mr_unreliable
Thu Jul 19 16:15:36 CDT 2007

matthewconlon@gmail.com wrote:
> My problem is this... I do not want to have to OPEN the HTM file to
> create the RTF and the TXT. I COULD create the rtf simply by doing an
> xcopy "path\sig.htm path\sig.rtf" and that works fine. Trouble is when
> creating the TXT, ALL the html code becomes visible.
>
> Is it possible at all to do this via VB script?
>

If you are talking about a general case of stripping out the
html tags, there are various utilities around to do this.

However, if the "signature" is just some text in a paragraph
tag (for example), then it is easy to script that.

You can instantiate Internet Explorer, load the page, and
then use the DHTML "innerText" property on the paragraph,
to get the (er, um) inner text.
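
A minimal sketch of that Internet Explorer approach, for illustration
only. The path in sPage is a hypothetical placeholder, and the script
reads the text of the whole body rather than a single paragraph; it
assumes it is run with cscript/wscript on a machine where IE
automation is available.

```vbscript
Option Explicit

' Hypothetical path; point this at the real signature file.
Dim sPage : sPage = "C:\temp\signature.htm"

' Start an invisible IE instance and load the page.
Dim oIE : Set oIE = CreateObject("InternetExplorer.Application")
oIE.Visible = False
oIE.Navigate sPage

' Wait until the page has finished loading (READYSTATE_COMPLETE = 4).
Do While oIE.Busy Or oIE.ReadyState <> 4
    WScript.Sleep 100
Loop

' innerText is a property: the element's text with the markup removed.
WScript.Echo oIE.Document.body.innerText

' Always shut the hidden browser down again.
oIE.Quit
```

To read only one element, the same property can be taken from that
element instead of from body.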

If the html is really, really simple, you could just
write a simple script using the string functions to
extract your "signature". For example, if the signature
was contained in a paragraph tag, you could find the
beginning tag using InStr with "<p>", and the end
using InStr with "</p>". Then use the Mid function
to extract the text.
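
As a rough sketch of the string-function idea (ExtractParagraph is a
made-up helper name, and the sample HTML is invented for the
example). It only matches a bare "<p>" tag, so a paragraph tag that
carries attributes would need a smarter search.

```vbscript
Option Explicit

' Returns the text between the first <p> and the following </p>,
' or an empty string when either tag is missing.
Function ExtractParagraph(sHTML)
    Dim nStart, nEnd
    ExtractParagraph = ""

    ' Find the opening tag (case-insensitive) and step past it.
    nStart = InStr(1, sHTML, "<p>", vbTextCompare)
    If nStart = 0 Then Exit Function
    nStart = nStart + Len("<p>")

    ' Find the closing tag that follows the opening tag.
    nEnd = InStr(nStart, sHTML, "</p>", vbTextCompare)
    If nEnd = 0 Then Exit Function

    ' Cut out everything in between and drop surrounding spaces.
    ExtractParagraph = Trim(Mid(sHTML, nStart, nEnd - nStart))
End Function

WScript.Echo ExtractParagraph("<html><body><p>John Smith</p></body></html>")
```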

cheers, jw
____________________________________________________________

You got questions? WE GOT ANSWERS!!! ..(but,
no guarantee the answers will be applicable to the questions)


Re: Document conversion by ekkehard

ekkehard
Fri Jul 20 05:25:04 CDT 2007

mr_unreliable wrote:
> matthewconlon@gmail.com wrote:
>> My problem is this... I do not want to have to OPEN the HTM file to
>> create the RTF and the TXT. I COULD create the rtf simply by doing an
>> xcopy "path\sig.htm path\sig.rtf" and that works fine. Trouble is when
>> creating the TXT, ALL the html code becomes visible.
>>
>> Is it possible at all to do this via VB script?
>>
>
> If you are talking about a general case of stripping out the
> html tags, there are various utilities around to do this.
>
> However, if the "signature" is just some text in a paragraph
> tag (for example), then it is easy to script that.
>
> You can instantiate internet explorer, load the page, and
> then use the dhtml "innerText" method on the paragraph,
> to get the (er, um) inner text.

Hi mr_unreliable,

in the context of your (very much appreciated) answer to my
question regarding the use of HtmlFile - wouldn't the task
of loading and manipulating a .html file via the DOM be a case
where the HtmlFile object could be used instead of the full IE?

Thanks

Ekkehard

>
> If the html is really, really simple, you could just
> write a simple script using the string functions to
> extract your "signature". For example if the signature
> was contained in a paragraph tag, you could find the
> beginning tag using instr with "<p>", and the end
> using instr with "</p>". Then use the mid function
> to extract the text.
>
> cheers, jw

Re: Document conversion by mr_unreliable

mr_unreliable
Fri Jul 20 10:52:00 CDT 2007

ekkehard.horner wrote:
> in the context of your (very much appreciated) answer to my
> question regarding the use of HtmlFile - wouldn't the task
> of loading and manipulating a .html file via DOM be a case where
> the HtmlFile object could be used instead of the full IE?
>
Servus Ekkehard,

I don't quite understand your question.

I know about the "document object" associated with html pages,
_but_ afaik that doc object doesn't exist outside of a browser,
(like IE). In other words, the browser (IE) creates the doc
object for itself (internal use) upon loading the page, and
makes it accessible to you so you can get/set information
from the doc object.

Your question seems to suggest that there is an "HtmlFile
Object" available to script, from simply reading in an
html file. If there is such a capability available to
script, I have not yet heard of it.

Allow me to repeat myself. If your html file is not too
complicated, you can get the text you want out of it, by
simply using primitive scripting string functions, such as
instr, and mid. And this comes without needing any
"HtmlFile Object".

Mit freundlichen Grüßen, jw

Re: Document conversion by ekkehard

ekkehard
Fri Jul 20 14:10:03 CDT 2007

mr_unreliable wrote:
> ekkehard.horner wrote:
>> in the context of your (very much appreciated) answer to my
>> question regarding the use of HtmlFile - wouldn't the task
>> of loading and manipulating a .html file via DOM be a case where
>> the HtmlFile object could be used instead of the full IE?
>>
> Servus Ekkehard,
>
Hi mr_unreliable,

> I am not understanding your question very well.
>
Sorry about that; perhaps some code will make clear what I mean:

' Build a small HTML document as a string.
Dim sHTML : sHTML = Join( Array( _
"<html>" _
, " <head>" _
, " </head>" _
, " <body>" _
, " <p id = ""pChange"">" _
, " change me" _
, " </p>" _
, " </body>" _
, "</html>" _
), vbCrLf )

' Create a stand-alone HTML document and load the string into it.
Dim oHTML : Set oHTML = CreateObject( "htmlfile" )
oHTML.Write sHTML
oHTML.Close

' Read and change the paragraph through the DOM.
Dim oP : Set oP = oHTML.getElementById( "pChange" )
WScript.Echo oP.innerText
oP.innerText = "gesagt, getan!"
WScript.Echo oP.innerText
WScript.Echo oHTML.documentElement.innerHTML

> I know about the "document object" associated with html pages,
> _but_ afaik that doc object doesn't exist outside of a browser,
> (like IE). In other words, the browser (IE) creates the doc
> object for itself (internal use) upon loading the page, and
> makes it accessible to you so you can get/set information
> from the doc object.
>
oHTML - created from HtmlFile, not involving IE - looks like a usable
HTML document to me. Of course I hope I can still use it when, in
IE 7.5, all access to the document is strictly forbidden; and of
course I'm afraid that as soon as I really rely on it, Microsoft will
discontinue support for it.
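
Applied to the signature files from the original question, the same
idea might look roughly like the sketch below. The UNC paths are
hypothetical placeholders, and it assumes the .htm file is plain ANSI
text whose visible content should end up in the .txt file.

```vbscript
Option Explicit
Const ForReading = 1

' Hypothetical locations; adjust to the real share and user folder.
Dim sSrc : sSrc = "\\server\share\jsmith\signature.htm"
Dim sDst : sDst = "\\server\share\jsmith\signature.txt"

Dim oFS : Set oFS = CreateObject("Scripting.FileSystemObject")

' Read the whole HTML file into a string.
Dim oIn : Set oIn = oFS.OpenTextFile(sSrc, ForReading)
Dim sHTML : sHTML = oIn.ReadAll
oIn.Close

' Let the htmlfile object parse it, without starting IE itself.
Dim oHTML : Set oHTML = CreateObject("htmlfile")
oHTML.Write sHTML
oHTML.Close

' Write the text of the body, with the markup removed, to the .txt file.
Dim oOut : Set oOut = oFS.CreateTextFile(sDst, True)
oOut.Write oHTML.body.innerText
oOut.Close
```

CreateTextFile writes ANSI by default; passing True as a third
argument would produce a Unicode file instead, which may matter if
the signature contains characters outside the system code page.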

> Your question seems to suggest that there is an "HtmlFile
> Object" available to script, from simply reading in an
> html file. If there is such a capability available to
> script, I have not yet heard of it.
>
That increases my fears: There must be a reason for the lack
of 'visibility' of this component.

> Allow me to repeat myself. If your html file is not too
> complicated, you can get the text you want out of it, by
> simply using primitive scripting string functions, such as
> instr, and mid. And this comes without needing any
> "HtmlFile Object".
>
I agree that string/RegExp functions may be used to manipulate
HTML, but if possible at all I prefer to use methods fitted
to the hierarchical structure of the data.
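
For comparison, the RegExp route could be sketched as follows.
StripTags is a made-up helper name and the sample input is invented;
the pattern simply deletes anything between "<" and ">", so it does
not decode entities such as &nbsp; and it does not turn <br> or </p>
into line breaks, which is part of why a DOM-based method is
attractive.

```vbscript
Option Explicit

' Removes everything that looks like a tag from an HTML string.
Function StripTags(sHTML)
    Dim oRE : Set oRE = New RegExp
    oRE.Global = True          ' replace every match, not only the first
    oRE.IgnoreCase = True
    oRE.Pattern = "<[^>]*>"    ' "<", any run of non-">" characters, ">"
    StripTags = oRE.Replace(sHTML, "")
End Function

WScript.Echo StripTags("<p><b>John Smith</b><br>IT Department</p>")
```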

> Mit freundlichen Grüßen, jw
Regards
Ekkehard

o.k., I FINALLY "get it" (verstehen Sie) by mr_unreliable

mr_unreliable
Fri Jul 20 15:53:03 CDT 2007

Thanks for the code.

I finally "get it".

Essentially, the confusion stems from my "object finder"
not being able to find an "htmlfile" object.

However, what I did find is an "HTMLDocument" class in the
mshtml.tlb (typelib). The HTMLDocument class supports the
write and close methods you used in your script. And, I
suspect that the htmldocument class is what IE gives you
back when you ask for the document object of a web page.

As best I can tell, the HTMLDocument class is implemented in
mshtml.dll, the rendering engine that IE hosts through
shdocvw.dll (and/or its more recent cousins). As you probably
already know, that engine is the "guts" (American slang for
the driving engine) of IE.

And so, while you claim to not be using IE, but rather
something called the "htmlfile" object, in fact I would
assert that "under-the-covers" (more U.S. slang) what
you are getting is maybe not the "full IE", but at least
the "guts" of it. The "InternetExplorer.Application"
object that scripters use to invoke IE is the browser shell
from shdocvw.dll, wrapped around that same engine.

Now back to your original question (gasp!). Maybe Microsoft
is dumping you, maybe not. Take a look at the HtmlDocument
methods and properties as contained in the .NET Framework.

http://msdn2.microsoft.com/en-us/library/system.windows.forms.htmldocument_methods.aspx

It could be that Microsoft may be closing out access to
the document object in IE 7.5 as you say, but all the "works"
are "in there", in the .NET Framework. And I would assert
that Microsoft is not going to dump the .NET Framework any
time soon -- for another 10 years at least. So a clever
person ought to be able to figure out a way to use the
HtmlDocument class from the .NET Framework to get what you
want (even if you have to start writing your scripts in
Visual Basic 2005 Express).

mfg, jw