Q: java : convert a string to UTF-8 ( No Answer,   3 Comments )
Question  
Subject: java : convert a string to UTF-8
Category: Computers > Programming
Asked by: brunob-ga
List Price: $100.00
Posted: 13 Jan 2005 07:22 PST
Expires: 20 Jan 2005 06:47 PST
Question ID: 456589
Hello,

In java, I need to convert a String to UTF-8 :

1) I read a string from an XML file. The file's header declares it to be UTF-8.
That is probably not entirely true, because:
if I open the original XML with Notepad or IE I get: "é 곿¶êæóñ."
(IE switched to Unicode)
if I open it in another (proprietary) editor I get: "é 곿¶êæóñ."

Let's call this string "originalStr"


2) I convert this string to UTF-8 (this is where my problem lies).
Let's call the result String "convertedStr".

3) ...then I generate an XML file in UTF-8 and I wish to get this result:
"é ??????ó?". 

I tried the following :
String convertedStr = new String(originalStr.getBytes("UTF8"));
but I get in IE :
"é 곿¶êæóñ."

I tried : 
String convertedStr = new String(originalStr.getBytes(),"UTF-8");
but I get white spaces in IE:
"    "
I tried :
String convertedStr = new String(originalStr.getBytes("UTF-8"),"UTF-8");
but I get
é 곿¶êæóñ.
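
A note on the three attempts above, as a minimal sketch rather than a definitive diagnosis: a Java String holds Unicode characters and has no encoding of its own, so encoding it to bytes and decoding those bytes with the same charset gives back an equal String. Decoding with a different charset than the one used for encoding can damage the text instead. The class name EncodingAttempts, its method names, and the sample text are hypothetical, and StandardCharsets assumes Java 7 or later.

```java
import java.nio.charset.StandardCharsets;

class EncodingAttempts {
    // Encode to UTF-8 bytes, then decode the same bytes as UTF-8.
    // The result equals the input, so this conversion changes nothing.
    static String sameCharsetRoundTrip(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8);
    }

    // Encode as ISO-8859-1, then decode as UTF-8. Non-ASCII characters
    // usually form invalid UTF-8 sequences, which decode to U+FFFD.
    static String mismatchedRoundTrip(String s) {
        return new String(s.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String sample = "\u00e9 x"; // e with acute accent, space, x (illustrative only)
        System.out.println(sample.equals(sameCharsetRoundTrip(sample)));
        System.out.println(mismatchedRoundTrip(sample).indexOf('\uFFFD') >= 0);
    }
}
```

Under these assumptions, the encoding only matters at the points where bytes become characters (reading the file) and where characters become bytes (writing the result).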


Other notes : 
- my JVM starts with ISO1 encoding parameters and I cannot change it.
- I'm 100% sure I can obtain the result 3) in a navigator because it
works with another system that reads the same source XML
- I cannot change the first XML, nor the way I read it; I just
have originalStr as input
- of course I want a converter that works with all ISO2 characters but
also with others, not only these ones.
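
On the first note: the JVM default encoding does not have to be changed, because the output charset can be chosen per writer. The following is a minimal sketch, assuming the goal is to write the XML bytes as UTF-8; the class name Utf8XmlOutput, its methods, and the mytest element are hypothetical, and the text is assumed to need no XML escaping.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class Utf8XmlOutput {
    // Writes a tiny XML document as UTF-8 bytes, whatever the JVM default charset is.
    // The text is assumed to contain no XML special characters such as < or &.
    static void write(OutputStream out, String text) throws IOException {
        Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);
        writer.write("<?xml version='1.0' encoding='UTF-8'?>\n");
        writer.write("<mytest>" + text + "</mytest>\n");
        writer.flush();
    }

    // Convenience wrapper that returns the encoded bytes.
    static byte[] toBytes(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            write(buffer, text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = toBytes("\u0142"); // U+0142, one character, two bytes in UTF-8
        System.out.println(bytes.length);
    }
}
```

To write a file, a FileOutputStream can be passed to write in place of the in-memory buffer.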

** An answer before Monday the 17th is mandatory **

Thanks in advance for your help

Regards,
Bruno B

Clarification of Question by brunob-ga on 13 Jan 2005 09:52 PST
I will try to reorganize my question a little:

1) I must convert a String to UTF-8.

2) What I expect to see is "é ??????ó?" (in IE or other UTF-8 compliant reader)
The only thing I can get after trying many conversions is "é 곿¶êæóñ."

The source of this string is an XML file. If I open it, I get
é 곿¶êæóñ.
or
é 곿¶êæóñ.

I know the conversion is possible because the round trip IE (page
ISO2) > XML (claims UTF-8) > IE (page ISO2) works well in another system.

Thanks,
Regards,
Bruno

Request for Question Clarification by googleexpert-ga on 13 Jan 2005 21:22 PST
Hi brunob-ga,
I have tried converting your original xml file using different "source encodings"
1.)    Converted original xml using many different "Source Encodings" to UTF-8
2.)  Viewed converted file in web browser using different text encodings
3.)  xml file looked readable in browser with (Traditional Chinese,
Thai, and Cyrillic)

Going back to your example:
[if I open this original xml with notepad or IE I get : originalStr
(IE switched to unicode)
if i open it in another editor (proprietary) i get : originalStr2]

Now, in my converted (UTF-8) xml file, what I get is:
[if I open this original xml with notepad or IE I get: originalStr2 
(IE switched to unicode)
if i open it in another editor (proprietary) i get : originalStr2]
However, this only seems to happen when viewing in Traditional Chinese.

Based on the encodings that I converted with, 
and the encodings that I viewed the xml file with...

Is my result what you are looking for?

I hope to hear from you soon.

-googleexpert

Clarification of Question by brunob-ga on 14 Jan 2005 06:20 PST
Hello,

Thanks for the answer. Unfortunately, what I'm looking for is a
generated XML file in UTF-8 that displays the glyphs "é ??????ó?" when
opened in IE.

Thanks
Regards,
Answer  
There is no answer at this time.

Comments  
Subject: Re: java : convert a string to UTF-8
From: vladimir-ga on 13 Jan 2005 08:33 PST
 
Try this:

String convertedStr = new String(originalStr.getBytes("ISO8859-1"), "ISO8859-2");

Vladimir
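
What this one-liner does, as a sketch: it turns each character of originalStr into its ISO-8859-1 byte and reads that byte again as ISO-8859-2, so Latin-1 characters become the Latin-2 characters that share the same byte value. It can only help if originalStr came from ISO-8859-2 bytes that were read as ISO-8859-1, which is an assumption here. The sample input is illustrative and not taken from the asker's file, and the class name Latin1ToLatin2 is hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class Latin1ToLatin2 {
    static final Charset LATIN2 = Charset.forName("ISO-8859-2");

    // Recover the original bytes via ISO-8859-1, then decode them as ISO-8859-2.
    static String reinterpret(String s) {
        return new String(s.getBytes(StandardCharsets.ISO_8859_1), LATIN2);
    }

    public static void main(String[] args) {
        // Bytes B1 B3 BF read as ISO-8859-1 (plus-minus, superscript three, inverted question mark).
        String misread = "\u00b1\u00b3\u00bf";
        String fixed = reinterpret(misread);
        // Print code points in hex to stay independent of the console encoding.
        for (char c : fixed.toCharArray()) {
            System.out.println(Integer.toHexString(c));
        }
    }
}
```

The result is still a Java String; it only shows up correctly in a UTF-8 file if it is then written with an explicit UTF-8 writer.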
Subject: Re: java : convert a string to UTF-8
From: brunob-ga on 13 Jan 2005 09:37 PST
 
Hello, 

I tried 
String convertedStr = new String(originalStr.getBytes("ISO8859-1"), "ISO8859-2");
but I get  : é? ??????ó?.

Following your idea, I also tried :
String convertedStr = new String(originalStr.getBytes("ISO8859-2"), "UTF-8");
I get spaces. I also tried:
String convertedStr = new String(originalStr.getBytes("ISO8859-1"), "UTF-8");
but got the same result: spaces in IE. In the console, though, I obtained
other, unrelated characters.

Cheers,
Bruno
Subject: Re: java : convert a string to UTF-8
From: willsmithbcn-ga on 14 Jan 2005 11:45 PST
 
vladimir, I'm able to see your target string in my IE6 browser when
executing the following servlet (tested in Apache Tomcat, in both
attributes and PCDATA)

The trick is 

a) You set the content type to "text/xml; charset=UTF-8" before
getting the PrintWriter from the response object.
b) You declare the XML to be UTF-8 in the XML declaration <?xml ... ?>

PS: If this works for you please send me the 100 bucks by e-mail ;-)

Daniel Macho Ortiz.

import java.io.*;
import java.text.*;
import java.util.*;
import javax.servlet.*;
import javax.servlet.http.*;

/**
 * UTF-8 Test Servlet.
 *
 * @author Daniel Macho Ortiz
 */

public class UTF8Servlet extends HttpServlet {


    public void doGet(HttpServletRequest request,
                      HttpServletResponse response)
        throws IOException, ServletException
    {
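        // Reinterpret the literal's bytes (platform default charset) as UTF-8.
        String s = new String("é Ä?Å?żÅ?Ä?Ä?óÅ?".getBytes(),"UTF-8");

        // Set the charset before obtaining the writer so the response is encoded as UTF-8.
        response.setContentType("text/xml; charset=UTF-8");
        PrintWriter out = response.getWriter();

        out.println("<?xml version='1.0' encoding='UTF-8' ?>"
            + "<xml><mytest attr='" + s + "'>" + s + "</mytest></xml>");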

    }
}
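
The servlet's getBytes() call uses the platform default charset, so the trick depends on the JVM default being ISO-8859-1 (as the asker's is said to be). A variant with both charsets spelled out might look like the sketch below. The class and method names are hypothetical, and it assumes the damaged string came from UTF-8 bytes that were decoded as ISO-8859-1; if some other charset was used for that wrong decoding (windows-1252, for example), that charset would have to be used in its place.

```java
import java.nio.charset.StandardCharsets;

class MojibakeRepair {
    // Undo a wrong ISO-8859-1 decoding: recover the original bytes,
    // then decode them as the UTF-8 they were all along.
    static String repair(String misdecoded) {
        return new String(misdecoded.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u00e9 \u0142\u00f3\u017c"; // illustrative sample text
        // Simulate the damage: UTF-8 bytes read as ISO-8859-1.
        String misdecoded = new String(original.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
        System.out.println(repair(misdecoded).equals(original));
    }
}
```

ISO-8859-1 maps every byte value to a character, which is why the original bytes can be recovered without loss in this case.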
