I need to select data from an nchar field in a SQL Server database, persist
it to XML, then transform it using XSL (all in a COM component called from
ASP), and output the resulting HTML (Response.Write the string returned
from COM+).
The data may contain european characters such as

? é

and also chinese characters such as

???? (it looks like SQL Server stores these as
&#20004;&#20010;&#26376;&#21069;)

There seem to be a lot of stages along the way where the encoding can be set
and/or will default to a certain encoding (COM defaults to UTF-16?)

What do I need to set to be able to see (and input) both European and
Chinese (Simplified) characters?

So far I'm using;

XML FROM RECORDSET-------------------------------------------------------------

In the recordset the Chinese char is &#20010; but when persisted to XML the
ampersand at the front is shown as &amp;

<data>&amp;#20010;</data>

So I'm replacing "&amp;#" with "&#" to output &#20010; in the XML
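
For illustration, here is a minimal sketch of why the ampersand gets escaped, and one possible alternative to the string replace. It uses Python's standard library rather than ADO/MSXML, so it only mirrors the behaviour described above; the variable names are made up for the example, and it assumes the field really holds the literal text "&#20010;" rather than the character itself.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical field value: the literal characters & # 2 0 0 1 0 ;
stored = "&#20010;"

# Putting that text into an element and serializing it escapes the
# ampersand, because a bare "&" is not allowed in XML character data.
el = ET.Element("data")
el.text = stored
escaped = ET.tostring(el, encoding="unicode")
print(escaped)   # <data>&amp;#20010;</data>

# Decoding the reference first stores the real character (U+4E2A),
# so no search-and-replace on the serialized XML is needed.
el.text = html.unescape(stored)
decoded = ET.tostring(el, encoding="unicode")
print(decoded)   # <data>个</data>
```

Decoding before the value goes into the XML avoids patching the serialized string afterwards, which can also rewrite an "&amp;#" that was genuinely meant as text.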

XML HEADER----------------------------------------------------------------------

I grab the documentElement.xml of the XML from several recordsets and build
up a 'master' XML string. The header that I use on this string is;

<?xml version='1.0' encoding='ISO-8859-1'?>
(should I change this to <?xml version='1.0' encoding='utf-8'?>)
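
As a rough illustration of what the choice between those two headers means for the characters in question, here is a small Python sketch (not MSXML). The sample text is an assumption built from the character codes quoted earlier plus "é".

```python
# Sample text: U+4E24 U+4E2A U+6708 U+524D, a space, then U+00E9 (é).
text = "\u4e24\u4e2a\u6708\u524d \u00e9"

# UTF-8 can represent every character directly and round-trips cleanly.
utf8_bytes = text.encode("utf-8")
round_trip = utf8_bytes.decode("utf-8")

# ISO-8859-1 has no code points for the Chinese characters.
try:
    text.encode("iso-8859-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False
print(latin1_ok)    # False

# They can only survive in ISO-8859-1 as numeric character references.
latin1_refs = text.encode("iso-8859-1", errors="xmlcharrefreplace")
print(latin1_refs)  # b'&#20004;&#20010;&#26376;&#21069; \xe9'
```

So an ISO-8859-1 document can still carry Chinese text, but only as &#NNNNN; references, while a UTF-8 document can hold both kinds of character as they are.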

XSL TRANSFORMATION--------------------------------------------------------------

I then transform the XML using XSL, with no encoding specified that I can
think of.

ASP-----------------------------------------------------------------------------

The resulting HTML string is returned to the ASP page from the COM component.

At the moment I have no codepage set in the ASP. What effect would setting
this have?

HTML----------------------------------------------------------------------------

The META tag in the HTML is set to:

<META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"/>
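
To show why the declared charset and the bytes actually sent need to agree, here is a small Python sketch (an illustration only, not ASP). It assumes the page bytes are written as UTF-8 while the browser is told ISO-8859-1, which is one way such a mismatch can arise.

```python
# One accented character, encoded the way a UTF-8 response would send it.
text = "\u00e9"                      # é
sent = text.encode("utf-8")
print(sent)                          # b'\xc3\xa9'

# A browser honouring charset=ISO-8859-1 decodes each byte separately.
shown = sent.decode("iso-8859-1")
print(shown)                         # Ã©

# Decoding with the encoding that was actually used restores the text.
correct = sent.decode("utf-8")
print(correct == text)               # True
```

Whatever encoding is chosen, the bytes written to the response and the charset named in the META tag (or Content-Type header) have to be the same one.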

Re: unicode by Han

Han
Tue Jul 29 21:03:21 CDT 2003

Rather than a Unicode issue, this may be an issue of stream vs. string. With
a stream, the original encoding is always preserved. Without seeing your
code, my guess may be hasty. Anyway, try:

oXml.transformNodeToObject oXsl, Response

This way, the original encoding is preserved no matter what encoding you
used.

Re: unicode by russ

russ
Tue Jul 29 21:45:20 CDT 2003

It seems the problem occurs when I use loadXML to pass the XML string into
an XML object;

in strXML;

<z:row tindex="794773" tddesc="&#20004;&#20010;&#26376;&#21069;&#21018; é "
tdline="2" tdindex="55257"/>

but after I try;

objxml.loadXML (strXML)

I get

<z:row tindex="794773" tddesc="????? é " tdline="2" tdindex="55257"/>

The head of the string has an encoding specified;

strXML header = <?xml version='1.0' encoding='ISO-8859-1'?>

but after calling the loadXML the encoding has gone;

The header of objxml.xml = <?xml version="1.0"?>

This all happens before I transform using the XSL...
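
One possible reading of the "?????" above, sketched in Python rather than MSXML: parsing turns the numeric references into real characters, and question marks appear only when those characters are later pushed through a code page that cannot represent them. Whether that is what happens in this component is an assumption; the attribute value below is shortened from the one quoted above.

```python
import xml.etree.ElementTree as ET

# Parsing resolves &#NNNNN; references into real Unicode characters.
root = ET.fromstring('<row tddesc="&#20004;&#20010; &#233;"/>')
value = root.get("tddesc")
print(len(value))    # 4 (two Chinese characters, a space, and é)

# Forcing the text into a Western code page replaces what it cannot hold.
lossy = value.encode("cp1252", errors="replace")
print(lossy)         # b'?? \xe9'
```

If that is the cause, the data in the DOM may still be intact and only the place where the string is viewed or written out is lossy.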


Re: unicode by Han

Han
Tue Jul 29 22:20:45 CDT 2003

"loadXML" is another string-based approach rather than a stream-based one.
The string going in and out is always treated as UTF-16, no matter what
encoding you set. The same goes for the "xml" property.

Usually there is no need to "loadXML" the saved recordset if you control the
whole scenario, and it is inefficient as well. Better is:

rs.save xmldoc, 1
or,
rs.save response, 1

This way, or something like it, the encoding is always preserved. I think
that once the encoding is broken, it cannot be recovered.
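
The string vs. stream distinction described above can be sketched in Python (an analogy, not the MSXML API): serializing to a text string carries no byte encoding, so no encoding declaration is written, while serializing to bytes has to pick one and declares it. The element and attribute names are made up for the example.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring('<row tddesc="&#20010; &#233;"/>')

# "String way": the result is Unicode text with no encoding declaration.
as_string = ET.tostring(root, encoding="unicode")
print(as_string.startswith("<?xml"))   # False

# "Stream way": the result is bytes in a named encoding, declared up front.
as_bytes = ET.tostring(root, encoding="iso-8859-1")
print(as_bytes.startswith(b"<?xml"))   # True

# Characters outside ISO-8859-1 are written as numeric references,
# while é is written as the single byte 0xE9.
print(b"&#20010;" in as_bytes, b"\xe9" in as_bytes)   # True True
```

This matches the observation that the encoding attribute disappears from the header once the document has been through a string: a string has no byte encoding left to declare.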