Hi all

I am trying to send a stream of data from Windows CE to a Java program
on my desktop over Bluetooth.

However, there seems to be some inconsistency with regard to character
encoding (i.e. some squares are shown together with some correct
characters).

I suspect that re-encoding is required using
new String(readBuffer, sourceEncoding);

How can the sourceEncoding be found? There are many types of Unicode
encoding. Which of these is being used by my Pocket PC?

Ponce

Re: UNICODE in Windows CE by MSenne

MSenne
Thu Jan 12 16:06:53 CST 2006

There are 4 encodings that Pocket PCs (or at least the .NET CF) use.
ASCII (7 of 8 bits used)
UTF7 (7 of 8 bits used)
UTF8 (8-bit units, 1 to 4 bytes per character)
Unicode (UTF-16 little-endian, 16-bit units)

In my experience UTF7 displays non-standard characters in controls the
best, UTF8 is the best for storing non-standard characters.
ASCII characters correspond to their Unicode equivalents exactly in
decimal value; each one just takes an extra zero byte to store, which
shows up like a NULL square when displayed.

If you know you are dealing with Unicode (16 bits/char) holding only
ASCII-range text then just loop through the list one byte at a time and
remove all the NULLs (int value of the byte is 0).
If you want to have a robust solution that can autodetect then we'll
need info as to what language you are writing this in for starters.

Have fun!
-MSenne
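
A minimal Java sketch of the decoding step asked about in the original post. It assumes the Windows CE side sends its native wide-character buffers unchanged, which would arrive as UTF-16 little-endian bytes; the class name Utf16Decode, the method decode, and the sample bytes are hypothetical and not taken from the thread.

```java
import java.nio.charset.StandardCharsets;

final class Utf16Decode {
    // Decodes the first len bytes of readBuffer as UTF-16 little-endian text.
    // Assumption: the sender writes raw 16-bit characters, low byte first.
    static String decode(byte[] readBuffer, int len) {
        return new String(readBuffer, 0, len, StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        // "Hi" as 16-bit little-endian characters: 'H' = 48 00, 'i' = 69 00.
        byte[] readBuffer = { 0x48, 0x00, 0x69, 0x00 };
        System.out.println(decode(readBuffer, readBuffer.length)); // prints "Hi"
    }
}
```

Naming the charset explicitly keeps the zero bytes as part of each character instead of letting them show up as separate square glyphs.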


Re: UNICODE in Windows CE by ctacke

ctacke
Thu Jan 12 20:27:10 CST 2006

Looping through removing nulls isn't really the way to go; using
Encoding.Unicode to convert is a far better way. Unicode is 2 bytes for a
reason - it's possible that the second byte means something.

-Chris
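
To illustrate the point about the second byte, here is a small Java sketch (Java being the receiving side in this thread) that compares the null-stripping shortcut with a real UTF-16LE decode. The class and method names are hypothetical, and the euro sign is just an example of a character whose high byte is not zero.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class NullStripDemo {
    // The shortcut under discussion: drop every zero byte, read the rest as ASCII.
    static String stripNulls(byte[] data) {
        ByteArrayOutputStream kept = new ByteArrayOutputStream();
        for (byte b : data) {
            if (b != 0) {
                kept.write(b);
            }
        }
        return new String(kept.toByteArray(), StandardCharsets.US_ASCII);
    }

    // A proper conversion: treat each byte pair as one UTF-16LE code unit.
    static String decodeUtf16Le(byte[] data) {
        return new String(data, StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        byte[] ascii = "Hi".getBytes(StandardCharsets.UTF_16LE);     // 48 00 69 00
        byte[] euro = "\u20AC5".getBytes(StandardCharsets.UTF_16LE); // AC 20 35 00
        System.out.println(stripNulls(ascii).equals("Hi"));       // true
        System.out.println(stripNulls(euro).equals("\u20AC5"));   // false
        System.out.println(decodeUtf16Le(euro).equals("\u20AC5")); // true
    }
}
```

Stripping zero bytes only round-trips text in the ASCII range; as soon as the high byte carries data, the two bytes of one character are misread as two separate characters.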


Re: UNICODE in Windows CE by MSenne

MSenne
Fri Jan 13 08:55:34 CST 2006

No, not the best way to go. But if he wasn't using the .NET CF that's
how I'd probably go for testing a simple "solution." If you want to
convert it from Unicode before sending it to Java then you're going to
lose 1 of the 2 bytes of each character in any kind of conversion.
Dropping the zero high byte is a basic Unicode->ASCII conversion, and it
is only lossless for ASCII-range characters. That's all you end up doing
by removing the NULLs, but there are better ways to do it if you are
using managed code.

Looks like you are using .NET, so I guess I'll try to actually answer
your question this time:
How can the source encoding be found? Try the GetEncoding method from
the System.Text.Encoding class, which returns the Encoding object for a
given code page or name.
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpref/html/frlrfsystemtextencodingclassgetencodingtopic.asp

-MSenne


Re: UNICODE in Windows CE by ponce

ponce
Sat Jan 14 02:43:19 CST 2006

Hi MSenne

Sorry for the late reply...

I am using the Windows API to do the Windows CE programming. So far, it
works by removing the NULLs in the Java program. I believe that it will
work as well if I convert it to ASCII first before sending, specifically
in my case. I wish that managed code could be used, but it would take up
too much time to create a C interface for the SDK which I am using.
Hence, I have to stay with the API.

One more thing: are there any recommended sites for learning about
strings? It gets quite confusing as a new Windows programmer to work
with a mixture of TCHAR, char, LPSTR, bytes... and conversion between
the formats because of the different devices being used...

Thanks :)
Ponce
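
As an alternative to removing the NULLs on the Java side, the incoming stream could be wrapped in a reader that decodes UTF-16LE directly. This is only a sketch under the assumption that the device keeps sending unconverted 16-bit characters; Utf16StreamReader and readAll are hypothetical names, and a ByteArrayInputStream stands in for the Bluetooth connection's input stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

final class Utf16StreamReader {
    // Reads everything from in as UTF-16LE text. The reader keeps decoder state
    // between reads, so a character split across two reads is still decoded.
    // The caller owns the stream and is responsible for closing it.
    static String readAll(InputStream in) throws IOException {
        Reader reader = new InputStreamReader(in, StandardCharsets.UTF_16LE);
        StringBuilder text = new StringBuilder();
        char[] chunk = new char[256];
        int n;
        while ((n = reader.read(chunk)) != -1) {
            text.append(chunk, 0, n);
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the bytes arriving over Bluetooth (made-up sample data).
        byte[] wire = "Pocket PC".getBytes(StandardCharsets.UTF_16LE);
        System.out.println(readAll(new ByteArrayInputStream(wire))); // prints "Pocket PC"
    }
}
```

Decoding at the reader level avoids handling raw bytes by hand, so odd-length reads from the connection do not split a character in half.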



Re: UNICODE in Windows CE by Hao

Hao
Mon Jan 16 03:10:36 CST 2006

Well, I was also confused by this stuff, and I found 2 useful articles
to understand Unicode

http://www.tenouk.com/ModuleG.html

http://www.codeproject.com/string/cppstringguide1.asp


Re: UNICODE in Windows CE by ponce

ponce
Tue Jan 17 09:02:28 CST 2006

That's cool! :)
