Mark
Mon Jul 18 02:52:08 CDT 2005
"Mark Schupp" <notvalid@email.net> wrote in message
news:%23jm$HQ9hFHA.3656@TK2MSFTNGP09.phx.gbl...
> Create the simplest page that you can which reproduces the problem and
> post the entire code here.
Strangely, when I tried to do that, the problem disappeared. Worse, I
noticed that the same characters were being displayed correctly in another
frame of the same frameset!
I did some more research and found an explanation (though the fact that
it's implemented like this blows my mind):
--------------
Literal strings in a script are still encoded by using @CodePage (if
present) or the AspCodePage metabase property value (if set), or the system
ANSI code page. If you set Response.CodePage or Session.CodePage explicitly,
do so before sending nonliteral strings to the client. If you use literal
and nonliteral strings in the same page, make sure the code page of
@CodePage matches the code page of Response.CodePage,
--------------
The above is an excerpt from this page:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/iissdk/html/268f1db1-9a36-4591-956b-d7269aeadcb0.asp
We haven't been setting any codepage. I had thought we tried to set it as a
possible solution, using @codepage and saving the source as utf-8, but I see
I didn't even come close: there are included files, and I did not imagine
I'd *need* to set it elsewhere to make it effective.
So to cause my pages to be output as utf-8 I need all of this:
<%@ CodePage=65001 Language="VBScript" %>
<%
' Label the response as UTF-8 and encode dynamic output as UTF-8.
Response.CharSet = "utf-8"
Response.CodePage = 65001
%>
[...]
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; CHARSET=utf-8">
AND the source file must be saved as utf-8, and any included files should be
saved as utf-8 also? That's straight nuts!
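The "saved as utf-8" part can at least be checked offline for a page and
its includes. What follows is only a rough sketch in Python, not ASP; the
helper name is_valid_utf8 and the sample byte strings are made up for
illustration.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical file contents: the same one-character literal saved two ways.
saved_as_ansi = 'x = "\u00ef"'.encode("cp1252")  # the literal is one byte, 0xEF
saved_as_utf8 = 'x = "\u00ef"'.encode("utf-8")   # the literal is two bytes, 0xC3 0xAF

print(is_valid_utf8(saved_as_ansi))  # False
print(is_valid_utf8(saved_as_utf8))  # True
```

For real files, read the bytes in binary mode (open(path, "rb")) and pass
them to the helper. A file that is pure 7-bit ASCII passes either way, so
only files with characters above 127 can fail.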
Also the doc contradicts itself: in one spot it says there can be only one
codepage (which seems reasonable), but in another it says, "...or the literal
strings are encoded differently from the nonliteral strings and display
incorrectly." If there were only one codepage, all output would be encoded
accordingly.
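Read literally, that second quote implies two encoders at work: literal
text goes out under @CodePage (or the default), while nonliteral strings
follow Response.CodePage. A rough Python sketch of such a mismatch,
assuming literals go out as Windows-1252 and dynamic strings as UTF-8
(both choices are assumptions for illustration, not something traced
inside ASP):

```python
text = "\u00ef"  # the character with code 239 in Windows-1252

literal_bytes = text.encode("cp1252")  # assumed: literal part of the page
dynamic_bytes = text.encode("utf-8")   # assumed: Response.Write output

page = literal_bytes + dynamic_bytes   # both end up in one response body

# Read as Windows-1252, the dynamic copy turns into two wrong characters.
as_cp1252 = page.decode("cp1252")
print([hex(ord(c)) for c in as_cp1252])  # ['0xef', '0xc3', '0xaf']

# Read as UTF-8, the literal copy is not even a valid byte sequence.
try:
    page.decode("utf-8")
    valid_as_utf8 = True
except UnicodeDecodeError:
    valid_as_utf8 = False
print(valid_as_utf8)  # False
```

Under those assumptions no single charset decodes the whole body
correctly, which would fit the "display incorrectly" wording.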
Unreal! And this is supposed to be an improvement over IIS5? IIS5 did not
seem to suffer from the same potential for ambiguity.
What's more, ANSI is perfectly capable of displaying 8 bit characters. So
it seems to me that IIS6 goes to way too much trouble trying to ascertain
the codepage, when all we really wanted was ANSI by default. In the end,
without having any explicit codepage set, it makes wrong assumptions and
incorrectly encodes some characters.
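For reference, this is what the four character codes in question look
like on the wire under an ANSI code page versus UTF-8. It is a Python
sketch, and it assumes Chr() on the server maps through Windows-1252,
which depends on the system code page:

```python
codes = range(239, 243)  # Chr(239) through Chr(242)

wire = {}
for code in codes:
    ch = bytes([code]).decode("cp1252")  # assumed: Chr() under code page 1252
    wire[code] = (ch.encode("cp1252").hex(), ch.encode("utf-8").hex())

for code, (ansi_hex, utf8_hex) in wire.items():
    print(code, ansi_hex, utf8_hex)

# 239 ef c3af
# 240 f0 c3b0
# 241 f1 c3b1
# 242 f2 c3b2
```

In the ANSI column each character is a single byte equal to its code; in
the UTF-8 column each one becomes two bytes, so a client that still reads
the body as single-byte text and renders it in a symbol font would show
two unrelated glyphs per character.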
Another irony: I tried saving the source as unicode, and it bitterly
complained. So all strings in VBS are unicode, yet the script engine that
handles them is unable to read source files saved as unicode? So how could
unicode be native? Does it read ANSI source files and convert to unicode?
Riiight.... Add one to the reasons I gotta call BS on the "Unicode Native to
VBS" myth.
So in the end I said to hell with it, and substituted some chars in webdings
that look close enough, and fit in 7 bits. It works.
Thanks for the reply,
Mark
> --
> --Mark Schupp
>
>
> "Mark J. McGinty" <mmcginty@spamfromyou.com> wrote in message
> news:nbbBe.30594$8o.26125@fed1read03...
>>
>> "David Wang [Msft]" <someone@online.microsoft.com> wrote in message
>> news:%2375ojE6hFHA.1480@TK2MSFTNGP10.phx.gbl...
>>> IIS6 itself does not do any such conversion, so I am not certain the
>>> issue has to do with 8bit characters.
>>>
>>> Page Frameworks like ASP or the web browser (based on
>>> Content-Encoding/Language hints from the response) can do such
>>> conversion, but those are parameters you need to control if you wish
>>> your page to be consistently interpreted.
>>
>> Thanks for your reply.
>>
>> It is definitely a server side issue, I looked at the response using
>> Ethereal, before it reaches the browser. I also tried sending characters
>> with values over 128 using the default font, as a reality check.
>> Characters that use the 8th bit are being transformed somewhere in
>> between the VBS/ASP script -- which does a Response.Write(Chr(239)) --
>> and the requesting client's TCP socket.
>>
>>
>>> Can you describe what codepage your ASP page is configured to be
>>> interpreted as, and whether you send any additional response headers
>>> that may affect how the browser interprets your response?
>>
>> It was implicit; we also tried explicitly setting it to ANSI and to
>> UTF-8. (Yes, I saved the script source files as UTF-8 after adding the
>> @CodePage directive.) We also added the DTD that VS7 generates for new
>> HTML documents. The content type is ASP's default, HTML Document
>> according to IE. Ethereal confirms it: the type is text/html.
>>
>>
>>> Because in order for the little arrow outlines to display properly,
>>> these two things have to happen:
>>> 1. The response entity body must contain characters whose character
>>> code is 239-242
>>
>> Therein lies a big part of the problem, because those character values
>> are not being sent as written to the response context.
>>
>>
>>> 2. The browser must choose a code page (based on response headers) which
>>> selects a font which maps little arrow outlines to character codes
>>> 239-242
>>
>> We set the font via CSS for just the elements that display these
>> characters. We surely do not want the entire page to use wingdings (it
>> would be extremely difficult to read that way.) Wingdings is installed
>> by default on all Windows machines since Windows 95 iirc.
>>
>> Further, as I stated, the value of the characters in question has been
>> altered by the time the content makes it to the browser. The page *is*
>> displaying wingdings characters where we expect it to, they just are not
>> the correct wingdings characters... because their value has been altered
>> or transformed in some strange way.
>>
>>
>>> You must ensure those two things happen; neither IIS nor the browser can
>>> make it automagically happen.
>>
>> I surely didn't expect that from either of them, and as I stated, this
>> same code worked perfectly in Win2K/IIS5. It's not something we set out
>> on a lark to do, wistfully hoping it would miraculously happen; we
>> have working code already deployed on the next-to-latest major release
>> of IIS. So I suspect it has something to do with IIS6 MIME type handling
>> "enhancements".
>>
>> In practice I've all but decided to just say the hell with it, and
>> replace the wingding characters with images. Yes it will add a few KB to
>> the size of the content, and a number of extra requests will be generated
>> by the page as it renders (even cached images generate an HTTP request
>> per instance of the image, delivering them inline as individual
>> characters incurs much less client request overhead) but nobody with
>> broadband will ever know the difference.
>>
>> Even so I'd love to get to the bottom of this, because I saw a loosely
>> related issue (involving XML) in another NG. If it alters any XML, it
>> won't be so easy to work around.
>>
>>
>> -Mark
>>
>>
>>> --
>>> //David
>>> IIS
>>> http://blogs.msdn.com/David.Wang
>>> This posting is provided "AS IS" with no warranties, and confers no
>>> rights.
>>> //
>>> "Mark J. McGinty" <mmcginty@spamfromyou.com> wrote in message
>>> news:eueP9MzhFHA.1412@TK2MSFTNGP09.phx.gbl...
>>> Greets,
>>>
>>> Part of the content of one of our web pages uses wingdings and Chr(239)
>>> through Chr(242) (which are little arrow outlines, though that's not
>>> really important.)
>>>
>>> It worked just fine in Windows 2000 Server, but now under Server 2003 it
>>> seems that characters above 127 get converted somehow, and our code no
>>> longer produces the desired effect.
>>>
>>> Does anyone know how to make it send our content without modification,
>>> or how to encode it in a way that it makes it out to the browser with
>>> the intended character value (as opposed to some thoroughly useless
>>> conversion to a 7 bit value)?
>>>
>>> tia,
>>> Mark