How would one go about reading a Japanese EUC encoded file in Visual C++? In
MSDN, I looked up EUC and it said it was 51932, in the 932 family with JIS
and SJIS. I tried opening my file like this:

std::wifstream src;
src.imbue(std::locale("ja_JP.euc-jp"));
src.open(L"post.euc");

but it gave me a runtime error. So I passed the code page family instead,
as shown below:

std::wifstream src;
src.imbue(std::locale(".932"));
src.open(L"post.euc");

The program now runs, but the Japanese text is garbled. If I look at the
file in Internet Explorer, the browser defaults to SJIS and I get the same
garbled text. But if I manually change the display encoding to EUC, the
file displays correctly.

Is it not possible to use C++ to do this? Will I need Win32 API file I/O
functions instead?

Re: EUC encoding with Visual C++? by Ulrich

Ulrich
Mon Oct 23 02:12:51 CDT 2006

Bryan wrote:
> How would one go about reading a Japanese EUC encoded file in Visual C++?
> In MSDN, I looked up EUC and it said it was 51932, in the 932 family with
> JIS and SJIS. I tried opening my file like this:
>
> std::wifstream src;
> src.imbue(std::locale("ja_JP.euc-jp"));
> src.open(L"post.euc");
[...]
> Is it not possible to use C++ to do this? Will I need Win32 API file I/O
> functions instead?

Locales can be extended to handle encodings that the vendor-supplied
locales do not support, so it is possible to do this in C++ and you do not
need the Win32 file I/O functions. The codecvt facet governs codeset
conversions, so that is where you would need to write an EUC implementation.
I'm not 100% sure that no suitable facet is supplied, though; that kind of
information is normally found in the vendor's documentation.
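
To make the codecvt idea a bit more concrete, here is a rough sketch of what
such a facet could look like in modern C++ (C++11 or later). Everything named
here (EucJpSketchCodecvt, decode_with_sketch) is made up for illustration. It
only decodes ASCII and the half-width katakana that EUC-JP encodes as the byte
0x8E followed by 0xA1..0xDF; the two- and three-byte kanji ranges need
JIS X 0208 / JIS X 0212 mapping tables, which are not included, so those bytes
are simply reported as errors. do_out and do_length are left at their
base-class defaults, so treat it as a read-only starting point, not a finished
implementation.

```cpp
#include <cwchar>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical facet: ASCII plus EUC-JP half-width katakana only.
class EucJpSketchCodecvt
    : public std::codecvt<wchar_t, char, std::mbstate_t> {
protected:
    result do_in(std::mbstate_t&, const char* from, const char* from_end,
                 const char*& from_next, wchar_t* to, wchar_t* to_end,
                 wchar_t*& to_next) const override {
        from_next = from;
        to_next = to;
        while (from_next != from_end && to_next != to_end) {
            unsigned char b0 = static_cast<unsigned char>(*from_next);
            if (b0 < 0x80) {
                // Plain ASCII maps to the same code point.
                *to_next++ = static_cast<wchar_t>(b0);
                ++from_next;
            } else if (b0 == 0x8E) {
                // Single shift 2: one trailing byte of JIS X 0201 katakana.
                if (from_end - from_next < 2) return partial;
                unsigned char b1 = static_cast<unsigned char>(from_next[1]);
                if (b1 < 0xA1 || b1 > 0xDF) return error;
                *to_next++ = static_cast<wchar_t>(0xFF61 + (b1 - 0xA1));
                from_next += 2;
            } else {
                // Kanji ranges would need mapping tables (not shown).
                return error;
            }
        }
        return from_next == from_end ? ok : partial;
    }
    bool do_always_noconv() const noexcept override { return false; }
    int do_encoding() const noexcept override { return 0; }   // variable width
    int do_max_length() const noexcept override { return 3; } // longest EUC-JP form
};

// Helper for trying the facet without a file stream.
std::wstring decode_with_sketch(const std::string& bytes) {
    EucJpSketchCodecvt cvt;
    std::mbstate_t state = std::mbstate_t();
    std::wstring out(bytes.size(), L'\0'); // never more wide chars than bytes
    const char* from_next = nullptr;
    wchar_t* to_next = nullptr;
    std::codecvt_base::result r =
        cvt.in(state, bytes.data(), bytes.data() + bytes.size(), from_next,
               &out[0], &out[0] + out.size(), to_next);
    if (r != std::codecvt_base::ok)
        throw std::runtime_error("input not covered by this sketch");
    out.resize(static_cast<std::size_t>(to_next - &out[0]));
    return out;
}
```

With a stream, the facet would be installed before opening the file, along
the lines of src.imbue(std::locale(std::locale::classic(), new
EucJpSketchCodecvt)); the locale then owns and deletes the facet.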

Uli


Re: EUC encoding with Visual C++? by Bryan

Bryan
Mon Oct 23 23:18:02 CDT 2006

It seems like encodings are a fairly tricky business. And I was hoping to
be able to use regular C++ and have this little app work on Windows and
Solaris! Looks like I will have to hack away at the Solaris version too.
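
Since locale names differ between platforms, one way to keep the source
portable is to try a list of candidate names and use the first one the
runtime accepts. The std::locale constructor throws std::runtime_error for a
name it does not recognize, which is presumably the runtime error from my
first attempt. The helper name first_available_locale is made up, and the
candidate names in the usage note are assumptions that need checking on each
system.

```cpp
#include <locale>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: returns the first locale the runtime can construct.
std::locale first_available_locale(const std::vector<std::string>& names) {
    for (const std::string& name : names) {
        try {
            return std::locale(name.c_str());
        } catch (const std::runtime_error&) {
            // This name is not supported here; try the next candidate.
        }
    }
    throw std::runtime_error("none of the candidate locale names is supported");
}
```

Usage might look like src.imbue(first_available_locale({".20932",
"ja_JP.eucJP"})); where the second name is only a guess at what a Solaris
system would accept.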

I did get the EUC encoded file to load properly. I found that the following
works:

std::wifstream src;
src.imbue(std::locale(".20932"));

It seems to work for EUC and JIS. So, now on to the next challenge: reading
a UTF-8 file. :(
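
For the UTF-8 case, one option that does not depend on any locale name is to
read the file as raw bytes and decode them by hand. Below is a minimal sketch
of such a decoder; the function name decode_utf8 is made up, and it only
checks the basic byte structure (it does not reject overlong forms or
surrogate code points), so it is a starting point and not a validating
decoder.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: turns UTF-8 bytes into Unicode code points.
std::u32string decode_utf8(const std::string& bytes) {
    std::u32string out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        unsigned char b0 = static_cast<unsigned char>(bytes[i]);
        char32_t cp = 0;
        std::size_t len = 0;
        // The lead byte tells us how long the sequence is.
        if (b0 < 0x80)                { cp = b0;        len = 1; }
        else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; len = 2; }
        else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; len = 3; }
        else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; len = 4; }
        else throw std::runtime_error("invalid UTF-8 lead byte");

        if (i + len > bytes.size())
            throw std::runtime_error("truncated UTF-8 sequence");

        // Each continuation byte contributes its low six bits.
        for (std::size_t k = 1; k < len; ++k) {
            unsigned char b = static_cast<unsigned char>(bytes[i + k]);
            if ((b & 0xC0) != 0x80)
                throw std::runtime_error("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (b & 0x3F);
        }
        out.push_back(cp);
        i += len;
    }
    return out;
}
```

The bytes could come from a plain std::ifstream opened in binary mode, so no
wide stream or codecvt facet is involved.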