IDE support for UTF8?

Look at this code please:

String text = "سلام";
PrintStream out = new PrintStream(System.out, true, "UTF8");
out.println(text);

When I run this code in NetBeans, it prints:
ط³ظ„ط§ظ…

But the same code works well in Eclipse!

Why?

But when I run

String text = "سلام";
PrintStream out = new PrintStream(System.out, true, "windows-1256");
out.println(text);

it works well in NetBeans and fails in Eclipse!

Why do Java IDEs behave so strangely?

Could someone please answer?

Could be a lot of things.

My gut feeling is that the Eclipse file is being stored using a character set that doesn’t support the characters you’re trying to print out.

Can you print out the code points instead of the characters themselves to verify that the correct characters are being printed by your output stream?
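For what it's worth, the garbled NetBeans line looks like what you would get if UTF-8 bytes were decoded as windows-1256. The following is only a sketch to illustrate that idea; the names `MojibakeDemo` and `utf8ReadAsWindows1256` are made up for this example, and it assumes the JDK in use provides the windows-1256 charset. It prints code points rather than glyphs, so the console cannot distort the result.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class MojibakeDemo {

    // Encodes the text as UTF-8, then decodes those bytes as if they were windows-1256.
    static String utf8ReadAsWindows1256(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, Charset.forName("windows-1256"));
    }

    public static void main(String[] args) {
        // The four letters from the question, written as Unicode escapes
        // so that the encoding of this source file does not matter.
        String text = "\u0633\u0644\u0627\u0645";
        String garbled = utf8ReadAsWindows1256(text);

        // Print each code point in hex instead of the glyph itself.
        garbled.codePoints().forEach(cp -> System.out.println(Integer.toHexString(cp)));
    }
}
```

Run as written, the code points printed are 637, b3, 638, 201e, 637, a7, 638, 2026, which is the string ط³ظ„ط§ظ… from the first post. That would suggest the program's bytes are fine and the console is decoding them with the wrong charset, though it does not show which setting in either IDE is responsible.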

First of all, thank you for your guidance.

Can you print out the code points instead of the characters themselves to verify that the correct characters are being printed by your output stream?

How do I do that?

This code might work for you.


public class ListCodePoints {
	
	public static void main(String[] args) {
		
		String string = "ENTER TEXT HERE\u0643";
		
		char[] charArray = string.toCharArray();
		
		for (int i = 0; i < charArray.length; i++) {
			
			int codePoint = Character.codePointAt(charArray, i);
			
			String hexString = Integer.toHexString(codePoint);
			
			System.out.println(hexString);
			
		}
	}
}

You’re looking to see if the values match between the two IDEs. Like I said before, my guess is that the strings you’re saving are being changed between Eclipse and NetBeans.

This is the output:

20
54
45
58
54
20
48
45
52
45
643

Yep, of course.

You should replace the string I put in the code, “ENTER TEXT HERE\u0643”, with the text that you’re trying to print out and see what the numbers are.

You should then be able to check both IDEs to see if they are outputting the same numbers. If they are, I’m not sure what to tell you, but if they aren’t, then look into how Eclipse and NetBeans encode their saved files.


package sockettest;
import java.io.*;
import java.net.*;

public class Main {

    public static void main(String[] args) {

        Socket ClientSocket = null;
        //DataOutputStream os = null;
        //DataInputStream is = null;
        BufferedWriter inchar=null;
        BufferedReader bufferchar=null;

        try {
            ClientSocket = new Socket("127.0.0.1", 9000);
            
            inchar=new BufferedWriter(new OutputStreamWriter(ClientSocket.getOutputStream(),"windows-1256"));
            bufferchar=new BufferedReader(new InputStreamReader(ClientSocket.getInputStream(),"windows-1256"));
        } catch (UnknownHostException e) {
            System.err.println("Don't know about host: hostname");
        } catch (IOException e) {
            System.err.println("Couldn't get I/O for the connection to: hostname");
        }

    if (ClientSocket != null &&  inchar != null && bufferchar != null) {
            try {
                  inchar.write("من اگرمی گردم\n"
                          + "باز برخواهم گشت\n");
                                 
                  inchar.flush();
                  ClientSocket.shutdownOutput();
                  int c;
                  while ((c = bufferchar.read()) != -1) {
                          System.out.print((char)c);
                  }
                           
                  inchar.close();
                  bufferchar.close();
                  ClientSocket.close();
                  } catch (UnknownHostException e) {
                  System.err.println("Trying to connect to unknown host: " + e);
                  } catch (IOException e) {
                  System.err.println("IOException:  " + e);
                 }
               }
             }
         }

package socketserver;
import java.io.*;
import java.net.*;

public class Main {

    public static void main(String args[]) {

// declare a server socket and a client socket for the server
        ServerSocket echoServer = null;
        String line;
        BufferedReader bufferchar=null;
        PrintStream os;
        Socket ServersideSocket = null;

        try {
           echoServer = new ServerSocket(9000);
        }
        catch (IOException e) {
           System.out.println(e);
        }
// Create a socket object from the ServerSocket to listen and accept
// connections.
// Open input and output streams
    try {
           ServersideSocket = echoServer.accept();
           bufferchar=new BufferedReader(new InputStreamReader( ServersideSocket.getInputStream(),"windows-1256"));
           os = new PrintStream( ServersideSocket.getOutputStream(),true,"windows-1256");
// As long as we receive data, echo that data back to the client.
           while ((line = bufferchar.readLine()) != null) {
             
             os.println(line);
             os.flush();
           }
        }
    catch (IOException e) {
           System.out.println(e);
        }
    }
}

Output

من اگرم? گردم
بم گشت

It prints a question mark. Why?

Could someone please answer?

Still waiting for an answer…

Because that character isn’t in the windows-1256 character set.
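A quick way to check that claim is to ask the charset's encoder directly. This is a minimal sketch, assuming the windows-1256 charset is available in your JDK; `EncodableCheck` and `unencodable` are names invented here. The test string is the first line the client sends, written with Unicode escapes.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.util.ArrayList;
import java.util.List;

class EncodableCheck {

    // Lists, as U+XXXX strings, the code points of text that the named charset cannot encode.
    static List<String> unencodable(String text, String charsetName) {
        CharsetEncoder encoder = Charset.forName(charsetName).newEncoder();
        List<String> missing = new ArrayList<>();
        for (int cp : text.codePoints().toArray()) {
            String single = new String(Character.toChars(cp));
            if (!encoder.canEncode(single)) {
                missing.add(String.format("U+%04X", cp));
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // First line written by the client, as Unicode escapes.
        String line = "\u0645\u0646 \u0627\u06AF\u0631\u0645\u06CC \u06AF\u0631\u062F\u0645";

        System.out.println("windows-1256: " + unencodable(line, "windows-1256"));
        System.out.println("UTF-8: " + unencodable(line, "UTF-8"));
    }
}
```

On a standard JDK this should report U+06CC (ARABIC LETTER FARSI YEH) for windows-1256 and nothing for UTF-8. windows-1256 does contain the Arabic yeh U+064A, which looks similar in some positions but is a different character.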

I ran this code:

String defaultEncoding = java.nio.charset.Charset.defaultCharset().name();

This code returns the encoding that must be used in order to show the text well.

It returns Windows-1256.

But when I use it in that code, the text is not shown well.

I searched the following page for that specific character and did not find it:
http://smontagu.damowmow.com/genEncodingTest.cgi?family=windows&codepage=1256

So, are we/you confident that the character encoding method is working properly? Are you completely confident that the missing character is in the windows-1256 encoding?

Have you considered using utf-8 for all your encoding instead of using windows-1256?

inchar.write("من اگرمی گردم\n"
        + "باز برخواهم گشت\n");

This is the output:

من اگرم? گردم
باز برخواهم گشت

The missing characters are م , ی

ی U+06CC

م U+0645

I also changed windows-1256 to UTF8, but got the same result.

I just ran your code and changed BOTH files to use UTF8 as the character encoding and the code worked properly.

Please verify that the client and the server code are both using UTF8 for their input/output streams.

The lesson here is that if you’re attempting internationalization, don’t mess around. Use UTF8 or UTF16 as your character encoding. As far as I can see, there is no point in using any other character set.

But please realize that there are differences between UTF8 and UTF16:
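One visible difference is how many bytes each encoding uses per character. A small sketch, with a made-up class name `Utf8VsUtf16`, that counts the bytes for an ASCII string and for the four Arabic letters from the first post:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class Utf8VsUtf16 {

    // Number of bytes the text occupies when encoded with the given charset.
    static int byteLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String ascii = "abc";
        String arabic = "\u0633\u0644\u0627\u0645"; // the four letters from the first post

        System.out.println(byteLength(ascii, StandardCharsets.UTF_8));     // 3
        System.out.println(byteLength(ascii, StandardCharsets.UTF_16BE));  // 6
        System.out.println(byteLength(arabic, StandardCharsets.UTF_8));    // 8
        System.out.println(byteLength(arabic, StandardCharsets.UTF_16BE)); // 8
        System.out.println(byteLength(arabic, StandardCharsets.UTF_16));   // 10, includes a byte order mark
    }
}
```

ASCII text takes one byte per character in UTF-8 and two in UTF-16, while these Arabic letters take two bytes in either. Java's `UTF_16` charset also writes a two-byte byte order mark when encoding, which `UTF_16BE` does not.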

Please post the code here; I will run it in NetBeans and tell you the result. Thank you very much.

Which IDE do you use?

I use Eclipse.

When saving these two files make sure that your IDE saves the files with UTF-8 encoding.


import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.net.Socket;
import java.net.UnknownHostException;

public class ArabianClient {
	
	public static void main(String[] args) {

		Socket ClientSocket = null;
		//DataOutputStream os = null;
		//DataInputStream is = null;
		BufferedWriter inchar=null;
		BufferedReader bufferchar=null;

		try {
			ClientSocket = new Socket("127.0.0.1", 9000);

			inchar=new BufferedWriter(new OutputStreamWriter(ClientSocket.getOutputStream(),"UTF8"));
			bufferchar=new BufferedReader(new InputStreamReader(ClientSocket.getInputStream(),"UTF8"));
		} catch (UnknownHostException e) {
			System.err.println("Don't know about host: hostname");
		} catch (IOException e) {
			System.err.println("Couldn't get I/O for the connection to: hostname");
		}

		if (ClientSocket != null &&  inchar != null && bufferchar != null) {
			try {
				inchar.write("من اگرمی گردم\n"
						+ "باز برخواهم گشت\n");

				inchar.flush();
				ClientSocket.shutdownOutput();
				int c;
				while ((c = bufferchar.read()) != -1) {
					System.out.print((char)c);
				}

				inchar.close();
				bufferchar.close();
				ClientSocket.close();
			} catch (UnknownHostException e) {
				System.err.println("Trying to connect to unknown host: " + e);
			} catch (IOException e) {
				System.err.println("IOException:  " + e);
			}
		}
	}
}


import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.net.ServerSocket;
import java.net.Socket;

public class ArabianServer {

	public static void main(String args[]) {

		// declare a server socket and a client socket for the server
		ServerSocket echoServer = null;
		String line;
		BufferedReader bufferchar=null;
		PrintStream os;
		Socket ServersideSocket = null;

		try {
			echoServer = new ServerSocket(9000);
		}
		catch (IOException e) {
			System.out.println(e);
		}
		// Create a socket object from the ServerSocket to listen and accept
		// connections.
		// Open input and output streams
		try {
			ServersideSocket = echoServer.accept();
			bufferchar=new BufferedReader(new InputStreamReader( ServersideSocket.getInputStream(),"UTF8"));
			os = new PrintStream( ServersideSocket.getOutputStream(),true,"UTF8");
			// As long as we receive data, echo that data back to the client.
			while ((line = bufferchar.readLine()) != null) {

				os.println(line);
				os.flush();
			}
		}
		catch (IOException e) {
			System.out.println(e);
		}
	}
}

I ran the code with UTF8 in use.

Output:

من اگرم? گردم
باز برخواهم گشت

[Screenshot of the IDE: NetBeans 6.5]
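
One possible explanation for the remaining question mark, which nothing in this thread confirms: the client echoes the reply through `System.out`, and `System.out` encodes with the platform default charset unless told otherwise. Earlier in the thread that default was reported as windows-1256, which has no slot for U+06CC. The sketch below, with invented names `ConsoleEncodingSketch`, `printedBytes` and `hex`, captures what a `PrintStream` actually writes for that one character under each charset. It assumes the JDK provides windows-1256.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

class ConsoleEncodingSketch {

    // Prints the text through a PrintStream that uses the named charset
    // and returns the raw bytes that reached the underlying stream.
    static byte[] printedBytes(String text, String charsetName) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try {
            PrintStream out = new PrintStream(sink, true, charsetName);
            out.print(text);
            out.flush();
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
        return sink.toByteArray();
    }

    // Formats bytes as space-separated lowercase hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String farsiYeh = "\u06CC";
        System.out.println(hex(printedBytes(farsiYeh, "windows-1256"))); // 3f, the byte for '?'
        System.out.println(hex(printedBytes(farsiYeh, "UTF-8")));        // db 8c
    }
}
```

If this is the cause, wrapping the console stream in the client, as in `new PrintStream(System.out, true, "UTF8")` from the first post, and setting the IDE's output console to UTF-8 might help. The two have to agree, and where that console setting lives depends on the IDE.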