Hello,
In Java, I need to convert a String to UTF-8:
1) I read a string from an XML file. The header of the file declares UTF-8.
That is probably not completely true, because:
if I open the original XML with Notepad or IE, I get: "é 곿¶êæóñ."
(IE switched to Unicode)
if I open it in another (proprietary) editor, I get: "é 곿¶êæóñ."
Let's call this string "originalStr".
2) I convert this string to UTF-8. This is where my problem lies.
Let's call the resulting String "convertedStr".
3) Then I generate an XML file in UTF-8, and I want to get this result:
"é ??????ó?".
I tried the following:
String convertedStr = new String(originalStr.getBytes("UTF8"));
but in IE I get:
"é 곿¶êæóñ."
I tried:
String convertedStr = new String(originalStr.getBytes(),"UTF-8");
but I get white spaces in IE:
" "
I tried:
String convertedStr = new String(originalStr.getBytes("UTF-8"),"UTF-8");
but I get:
é 곿¶êæóñ.
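For illustration, here is a minimal sketch of why the third attempt above cannot change anything: encoding a String to UTF-8 bytes and decoding those same bytes as UTF-8 gives back an equal String. It also shows one commonly used repair, which applies only under the assumption (not confirmed anywhere in this thread) that the file held UTF-8 bytes that were decoded as ISO-8859-1 when it was read. The class and method names are hypothetical, and `StandardCharsets` assumes Java 7 or later.

```java
import java.nio.charset.StandardCharsets;

class RoundTripDemo {
    // Encode to UTF-8 and decode with the same charset: the result equals the input.
    static String utf8RoundTrip(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8);
    }

    // Hypothetical repair. ASSUMPTION: s was produced by decoding UTF-8 bytes
    // as ISO-8859-1. Re-encoding as ISO-8859-1 recovers the original bytes,
    // which are then decoded as UTF-8.
    static String reinterpretLatin1AsUtf8(String s) {
        return new String(s.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // U+00C3 U+00A9 is what the UTF-8 bytes of U+00E9 look like when read as ISO-8859-1.
        String misdecoded = "\u00C3\u00A9";
        System.out.println(utf8RoundTrip(misdecoded).equals(misdecoded));
        System.out.println(reinterpretLatin1AsUtf8(misdecoded).equals("\u00E9"));
    }
}
```

If the assumption does not hold for originalStr, the repair method will not produce the expected text, so it is worth checking on a known sample first.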
Other notes:
- My JVM starts with ISO1 encoding settings, and I cannot change that.
- I am 100% sure that result 3) can be obtained in a browser, because it
works with another system that reads the same source XML.
- I cannot change the first XML file or the way I read it; I only
have originalStr as input.
- Of course, I want a converter that works with all ISO2 characters and
with others too, not just these ones.
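As a sketch of how the output side can be made independent of the JVM's default encoding noted above: a Java String has no encoding of its own, so the charset only matters at the moment the text is turned into bytes. The example below names UTF-8 explicitly at that point instead of relying on the default. The class name, method name, and the `value` element are hypothetical, and the text is not XML-escaped, so this is an illustration rather than a complete writer.

```java
import java.nio.charset.StandardCharsets;

class Utf8XmlBytes {
    // Builds a tiny XML document and encodes it as UTF-8, regardless of
    // the JVM's default charset. ASSUMPTION: text needs no XML escaping.
    static byte[] toUtf8Xml(String text) {
        String doc = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                + "<value>" + text + "</value>\n";
        return doc.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // U+00E9 takes two bytes in UTF-8, a plain ASCII letter takes one.
        int accented = toUtf8Xml("\u00E9").length;
        int ascii = toUtf8Xml("e").length;
        System.out.println(accented - ascii);
    }
}
```

The resulting bytes can be written to a file through any `OutputStream`. Going through a `Writer` created without an explicit charset would bring the default encoding back in.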
** An answer before Monday the 17th is mandatory **
Thanks in advance for your help.
Regards,
Bruno B |
Clarification of Question by brunob-ga on 13 Jan 2005 09:52 PST
I will try to reorganize my question a little:
1) I must convert a String to UTF-8.
2) What I expect to see is "é ??????ó?" (in IE or another UTF-8 compliant reader).
The only thing I can get after trying many conversions is "é 곿¶êæóñ."
The source of this string is an XML file. If I open it, I get
é 곿¶êæóñ.
or
é 곿¶êæóñ.
I know the conversion is possible because the round trip IE (ISO2
page) > XML (claims UTF-8) > IE (ISO2 page) works well in another system.
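Since the same text shows up differently depending on the viewer, one way to take the viewer out of the picture is to print the code points that originalStr actually contains. The helper below is a small sketch for that purpose; its name is hypothetical and it assumes Java 8 or later for `codePoints()`.

```java
class CodePointDump {
    // Returns the code points of s as space-separated U+XXXX values.
    static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(dump("\u00E9a")); // prints "U+00E9 U+0061"
    }
}
```

Comparing this dump with the code points of the expected text shows whether the damage is already present in the String or only appears when it is displayed.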
Thanks,
Regards,
Bruno
|
Request for Question Clarification by googleexpert-ga on 13 Jan 2005 21:22 PST
Hi brunob-ga,
I have tried converting your original XML file using different source encodings:
1.) Converted the original XML to UTF-8 using many different source encodings
2.) Viewed the converted file in a web browser using different text encodings
3.) The XML file looked readable in the browser with Traditional Chinese,
Thai, and Cyrillic
Going back to your example:
[if I open this original xml with notepad or IE I get : originalStr
(IE switched to unicode)
if i open it in another editor (proprietary) i get : originalStr2]
Now, in my converted (UTF-8) XML file, what I get is:
[if I open this original xml with notepad or IE I get: originalStr2
(IE switched to unicode)
if i open it in another editor (proprietary) i get : originalStr2]
However, this only seems to happen when viewing in Traditional Chinese.
Given the encodings that I converted with
and the encodings that I viewed the XML file with,
is my result what you are looking for?
I hope to hear from you soon.
-googleexpert
|
Clarification of Question by brunob-ga on 14 Jan 2005 06:20 PST
Hello,
Thanks for the answer. Unfortunately, what I'm looking for is a
generated XML file in UTF-8 that displays the glyphs "é ??????ó?" when
opened in IE.
Thanks,
Regards,
|