  1. #1
    SitePoint Evangelist
    Join Date
    Apr 2003
    Location
    lisboa
    Posts
    423
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    charset=ISO-8859-1 doesnt have the euro symbol

    i wrote this blog using java and mysql; when i enter the euro symbol (€), all i get when i retrieve the data from the database is a question mark
    does anyone have any idea how to solve this?
    as i said, i'm using charset=ISO-8859-1 (latin1 in mysql)

    thanks in advance

  2. #2
    Grüße aus'm Pott
    Pullo's Avatar
    Join Date
    Jun 2007
    Location
    Germany
    Posts
    5,941
    Mentioned
    215 Post(s)
    Tagged
    12 Thread(s)
    Hi there,

    ISO/IEC 8859-1 is missing some characters for French and Finnish text, as well as the euro sign.
    Could you not simply specify another charset on your pages, such as UTF-8 or ISO-8859-15?
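
    As a rough illustration of why the euro sign can turn into a question mark, here is a minimal Java sketch. The class name EuroEncodingDemo and the helper hex are made up for this example, and it assumes the JRE provides the ISO-8859-15 charset (most do). When a character has no mapping in the target charset, String.getBytes substitutes that charset's replacement byte, which for ISO-8859-1 is '?':

```java
import java.nio.charset.Charset;

class EuroEncodingDemo {
    // Encode text in the named charset and return the bytes as upper-case hex.
    static String hex(String text, String charsetName) {
        byte[] bytes = text.getBytes(Charset.forName(charsetName));
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Written as a Unicode escape so the source file's own encoding does not matter.
        String euro = "\u20AC";
        System.out.println("ISO-8859-1  -> " + hex(euro, "ISO-8859-1"));  // 3F, i.e. '?'
        System.out.println("ISO-8859-15 -> " + hex(euro, "ISO-8859-15")); // A4
        System.out.println("UTF-8       -> " + hex(euro, "UTF-8"));       // E282AC
    }
}
```

    A stored 0x3F is a real question mark, so no page charset can bring the euro sign back afterwards; the loss has to be prevented at the point where the text is converted.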

  3. #3
    SitePoint Evangelist
    Join Date
    Apr 2003
    Location
    lisboa
    Posts
    423
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    i tried utf-8 but got some strange characters, so i'm gonna try the other
    brb

  4. #4
    SitePoint Evangelist
    Join Date
    Apr 2003
    Location
    lisboa
    Posts
    423
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    i tried
    <meta http-equiv = "Content-Type" content = "text/html; charset = iso-8859-15">
    but no luck; still the question mark instead...

  5. #5
    SitePoint Evangelist
    Join Date
    Apr 2003
    Location
    lisboa
    Posts
    423
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    i tried again with utf-8, without success

  6. #6
    Grüße aus'm Pott
    Pullo's Avatar
    Join Date
    Jun 2007
    Location
    Germany
    Posts
    5,941
    Mentioned
    215 Post(s)
    Tagged
    12 Thread(s)
    Can you check the character set of the database table in which your content is stored?
    What is it?

  7. #7
    Grüße aus'm Pott
    Pullo's Avatar
    Join Date
    Jun 2007
    Location
    Germany
    Posts
    5,941
    Mentioned
    215 Post(s)
    Tagged
    12 Thread(s)
    Hi,

    So, to summarize:
    One or more fields in your database (which is a latin1_whatever) show the Euro symbol fine in PHPMyAdmin.
    However, when you try to output these fields in a webpage, the Euro sign shows up as the question-mark.
    You are using the following meta tag on the page: <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    Is that correct?

    Could you provide the code you are using to read the data from the database and to output it on the page?

  8. #8
    SitePoint Evangelist
    Join Date
    Apr 2003
    Location
    lisboa
    Posts
    423
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    >> One or more fields in your database (which is a latin1_whatever) show the Euro symbol fine in PHPMyAdmin.
    no, when i open mysql query browser, i already have a question mark

    >>
    when you try to output these fields in a webpage, the Euro sign shows up as the question-mark.
    yes

    >>
    You are using the following meta tag on the page: <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    Is that correct?
    no, i'm currently using
    <meta http-equiv = "Content-Type" content = "text/html; charset = iso-8859-1"> meta tag, but i tested with both charset = iso-8859-15 and also with UTF-8

    >>
    Could you provide the code you are using to read the data from the database and to output it on the page.

    i think it would also be relevant to post the code that inserts into the db
    the servlet that writes to the db:
    Code Java:
    package blog;
     
    import java.io.*;
    import javax.servlet.*;
    import javax.servlet.http.*;
    import java.sql.*;
    import bd.*;
     
     
    public class Escrever extends HttpServlet {
        private JdbcAccess access;
        private int linhas;
     
     
        public void init() throws ServletException {
            access = new JdbcAccess("avulsas");
        }
     
     
        public void doGet(HttpServletRequest request, HttpServletResponse response) throws IOException, ServletException {
            String titulo = request.getParameter("titulo");
            String texto = request.getParameter("texto");
            long data = java.lang.System.currentTimeMillis() / 1000;
     
     
            String sql = "INSERT INTO posts (data, titulo, texto) VALUES (" + data + ", '" + titulo + "',  '" + texto + "')";
     
     
            try {
                linhas = access.executaUpdate(sql);
            }
            catch (SQLException msg) {}
     
     
            response.setContentType("text/html");
     
     
            if (linhas ==1) {
                PrintWriter out = response.getWriter();
                out.println("<html>");
                out.println("<head>");
                out.println("<title>Escrever</title>");
                out.println("<meta HTTP-EQUIV=\"REFRESH\" content=\"0; url=http://rsacramento.no-ip.org/Blog\">");
                out.println("</head>");
                out.println("<body>");
                out.println("</body>");
                out.println("</html>");
            }
        }
     
     
        public void doPost(HttpServletRequest request, HttpServletResponse response) throws IOException, ServletException {
            doGet(request, response);
        }
    }

    the servlet i use to read from db is next:
    Code Java:
    package blog;
     
    import java.io.*;
    import javax.servlet.*;
    import javax.servlet.http.*;
    import java.sql.*;
    import bd.*;
    import util.*;
     
     
    public class Avulsas extends HttpServlet {
        private JdbcAccess access;
     
     
        public void init() throws ServletException {
            access = new JdbcAccess("avulsas");
        }
     
     
        public void doGet(HttpServletRequest request, HttpServletResponse response) throws IOException, ServletException {
            String sql = "SELECT * FROM posts order by data desc LIMIT 5";
            String apagina = "";
     
     
            try {
                ResultSet rs = access.executaQuery(sql);
                access.fecha(access.getConnection());
     
     
                apagina = AvulsasUtil.formata(rs);
                request.setAttribute("apagina", apagina);
                getServletContext().getRequestDispatcher("/jsp/avulsas.jsp").forward(request, response);
            }
            catch (SQLException msg) {
    //          String erro = "De momento não é possível comunicar com a base de dados.<br /> Tente mais tarde.";
                Object erro = msg.toString();
                request.setAttribute("erro", erro);
                getServletContext().getRequestDispatcher("/jsp/erro.jsp").forward(request, response);
            }
        }
     
     
        public void doPost(HttpServletRequest request, HttpServletResponse response) throws IOException, ServletException {
            doGet(request, response);
        }
    }
    and also:
    Code Java:
    package util;
     
    import java.sql.*;
    import java.text.*;
     
     
    public class AvulsasUtil {
        public static synchronized String formata(ResultSet rs) throws SQLException {
            StringBuilder pagina = new StringBuilder();
            String dataTemporaria = "";
     
     
            while (rs.next()) {
                long mili = rs.getLong(2);
                mili = mili * 1000;
                java.util.Date data = new java.util.Date(mili);
                String padraoExtenso = "EEEEEE, d 'de' MMMMMM 'de' yyyy";
                String padraoHora = "HH:mm";
                SimpleDateFormat  sdfExtenso = new SimpleDateFormat(padraoExtenso);
                SimpleDateFormat  sdfHora = new SimpleDateFormat(padraoHora);
                String dataPorExtenso = sdfExtenso.format(data);
                String dataHoraria = sdfHora.format(data);
                String titulo = rs.getString(3);
                String texto = rs.getString(4);
                if (!dataTemporaria.equals(dataPorExtenso)) {
                    dataTemporaria = dataPorExtenso;
                    pagina.append("<h1>" + dataTemporaria + "</h1>\n");
                    pagina.append("<h2>" + titulo + "</h2>\n");
                    pagina.append("<p><span class = \"horas\">" +
                            dataHoraria +
                            " </span>" +
                            texto +
                            "</p>\n");
                }
                else {
                    pagina.append("<h2>" + titulo + "</h2>\n");
                    pagina.append("<p><span class = \"horas\">" +
                            dataHoraria +
                            " </span>" +
                            texto +
                            "</p>\n");
                }
            }
     
     
            return pagina.toString();
        }
    }

    hope it helps
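
    One plausible place where the euro sign gets lost on the way in, sketched with the standard library only. The class FormDecodingDemo is hypothetical, and the assumption that the container decodes request parameters as ISO-8859-1 unless told otherwise is not confirmed anywhere in this thread; it is a common servlet default. Browsers typically submit forms from a page labelled ISO-8859-1 as windows-1252, where the euro sign is the byte 0x80:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class FormDecodingDemo {
    // Decode a percent-encoded form value as if the server assumed the given charset.
    static String decode(String raw, String charsetName) {
        try {
            return URLDecoder.decode(raw, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    // Show each character as U+XXXX so that invisible control characters become visible.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Euro sign as submitted from a windows-1252 (or "ISO-8859-1") page.
        System.out.println(codePoints(decode("%80", "ISO-8859-1")));       // U+0080 (a control character)
        System.out.println(codePoints(decode("%80", "windows-1252")));     // U+20AC (the euro sign)
        // Euro sign as submitted from a UTF-8 page.
        System.out.println(codePoints(decode("%E2%82%AC", "UTF-8")));      // U+20AC
        System.out.println(codePoints(decode("%E2%82%AC", "ISO-8859-1"))); // U+00E2 U+0082 U+00AC
    }
}
```

    If something like this is happening in Escrever, calling request.setCharacterEncoding(...) before the first getParameter call, with the charset the form was actually submitted in, would be the usual thing to try.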

  9. #9
    Grüße aus'm Pott
    Pullo's Avatar
    Join Date
    Jun 2007
    Location
    Germany
    Posts
    5,941
    Mentioned
    215 Post(s)
    Tagged
    12 Thread(s)
    Hi,

    If you can't see the Euro symbol in PHPMyAdmin, that is not a good sign.

    I know it sounds obvious, but did you try setting the correct encoding in your browser?
    Which browser are you using?
    Which encoding do you have?
    Does this problem occur in all browsers?

    Regarding your code, I'm afraid my Java isn't wonderful.
    Had it been PHP and had the Euro sign been displaying properly in PHPMyAdmin, I would have had a bunch of suggestions.

    As it is, if changing your browser's charset encoding doesn't help, it might be the case that the Euro symbol isn't being stored correctly in the first place.

  10. #10
    SitePoint Evangelist
    Join Date
    Apr 2003
    Location
    lisboa
    Posts
    423
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    i'm testing with the latest opera, a very recent chrome and ie8, and it's the same in all of them
    i read this article:
    http://www.oracle.com/technetwork/ar...et-142283.html,
    but it was no help to me either
    >> it might be the case that the Euro symbol isn't being stored correctly in the first place
    yeap

    thanks anyway

  11. #11
    SitePoint Evangelist
    Join Date
    Apr 2003
    Location
    lisboa
    Posts
    423
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    i guess it must have something to do with the server's charset... (using tomcat)

  12. #12
    SitePoint Zealot Michel Merlin's Avatar
    Join Date
    Mar 2005
    Location
    Versailles (France)
    Posts
    169
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Encode in local charsets (ISO-8859-1, Shift-JIS, etc) and use FINANCIAL Euro symbol


    Recommendation:

    1. First you need the same charset everywhere in your information handling chain, seamlessly from forms and email to web pages to the DB, including all the corresponding back-and-forth interfaces.
    2. For this you need to select a charset that will actually work in the real world. If dealing with the public in (North or South) America or Western Europe, the only combination that is currently efficient (hence affordable and reliable), while waiting for UTF-8 to become ready, is ISO-8859-1 + EUR (the ISO 4217 3-letter FINANCIAL symbol). In Japan, JIS or Shift-JIS + EUR. And accordingly in the rest of the world.

    Explanation:

    1. In real life in France I often receive French email messages from big companies in which all the French accents have been stripped (no matter which mailer they use), making them ugly and difficult to read, though still readable. This is apparently because, being usually English-speaking, they still encode in UTF-8, unaware (since in English UTF-8 brings no difference, drawback or benefit over ASCII) that UTF-8 is the cause of their problems with non-ASCII chars. By contrast, the email messages I receive in Western languages (FR, EN, DE) from most other companies or individuals are encoded in ISO-8859-1 and free of charset problems. This (temporary, I hope) situation is IMO because compatibility problems between UTF-8 and traditional fixed-length charsets have been underestimated by the official bodies in charge of enforcing UTF-8. As a result, UTF-8 problems in the real world with non-ASCII characters:
      1. are nonexistent in English, where all characters are ASCII, so encoding in UTF-8 is actually encoding in ASCII;
      2. are few in Western European languages, where few chars are non-ASCII, so encoding in UTF-8 does make documents inelegant, but not unreadable;
      3. are total in Japan, where most chars are non-ASCII, so UTF-8 not only increases the document size but, in the real world, causes most characters to be replaced with mojibake, making UTF-8 widely rejected by regular people (Note: I still need, and would appreciate, more recent, direct, helpful, precise and reliable checks and facts in English about charsets in Japan, from able persons, if possible Japanese or living in Japan; same about China mainland, Hong-Kong and Taiwan).

    2. If you send some text (through email or a form) to someone in the public, you have no control over what they will do with that text (editing, replying, forwarding), and particularly over what programs or charset(s) will be used down the workflow. Many of your correspondents will, knowingly or not, use their local charset, so if you have encoded in another one (namely UTF-8 if you are NOT writing in English), they will run into a lot of big problems with no solution apparent to them, which is why they go back to traditional charsets or remove accents.
    3. In the real world, UTF-8's goal (efficiently representing all the 0.1-1-million Unicode characters in the world) has only been successfully achieved in completely closed pure-UTF-8 environments built with careful intelligent thinking and sufficient resources, such as Wikipedia; others generally tend to go back to "traditional" local fixed-length charsets (ISO-8859-1 in Western European languages, JIS for email and Shift-JIS for web pages in Japanese, etc).
    4. ISO-8859-15's main goal (and effect) is to introduce the Euro typographical symbol "€", but it does so by putting it in place of the general currency typographical symbol "¤". So in the real world, if you send an ISO-8859-15-encoded "€", somewhere down the workflow it will inevitably get replaced with an ISO-8859-1-encoded "¤", creating harmful confusion and making ISO-8859-15 unsafe, hence effectively unusable. By contrast, the Euro financial symbol "EUR" is recognized, understood, read, written, conveyed and transcribed immediately, without ambiguity or error, by any person or machine or program world-wide, from financial traders to shoe shiners, from Bhutan to Manhattan. So, after (inter alia) my various posts and emails, many sites like amazon (.fr, .de, etc) or wikipedia (all) have now switched, in their use or recommendations, from "€" to "EUR".

    Details: For Long URLs, Accentuated Chars, encode as Quoted-Printable, Western European (ISO), use EUR for Euro symbol of Sun 19 Nov 2006.

    Versailles, Thu 13 Dec 2012 22:00:00 +0100
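
    The substitution described in point 4 can be checked with a few lines of Java. This is only a sketch: CurrencySignDemo and decodeByte are names invented for the example, and it assumes the JRE provides the ISO-8859-15 charset. The same byte, 0xA4, is read as the currency sign under ISO-8859-1 and as the euro sign under ISO-8859-15:

```java
import java.nio.charset.Charset;

class CurrencySignDemo {
    // Decode a single byte under the named charset.
    static String decodeByte(int value, String charsetName) {
        return new String(new byte[] {(byte) value}, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        String asLatin1 = decodeByte(0xA4, "ISO-8859-1");
        String asLatin9 = decodeByte(0xA4, "ISO-8859-15");
        System.out.println(asLatin1.equals("\u00A4")); // true: U+00A4, currency sign
        System.out.println(asLatin9.equals("\u20AC")); // true: U+20AC, euro sign
    }
}
```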

  13. #13
    SitePoint Wizard bronze trophy Jeff Mott's Avatar
    Join Date
    Jul 2009
    Posts
    1,278
    Mentioned
    18 Post(s)
    Tagged
    0 Thread(s)
    Hi, Michel. Thanks for your very detailed replies! I hope you don't mind a follow-up question. Perhaps it's due to me being in an English-speaking bubble, but my understanding was that UTF-8 is universally understood by now. Many large websites use it, I presume successfully. In Western Europe or eastern countries, is there still software being used that doesn't support UTF-8?
    "First make it work. Then make it better."

  14. #14
    SitePoint Zealot Michel Merlin's Avatar
    Join Date
    Mar 2005
    Location
    Versailles (France)
    Posts
    169
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Big sites tend to UTF-8 for EN pages, local charsets for forms

    Quote Originally Posted by Jeff Mott View Post
    (Thu 13 Dec 2012 22:15 GMT)
    ...my understanding was that UTF-8 is universally understood by now. Many large websites use it, I presume successfully.
    In the 2 sites you link ( http://www.google.fr and http://www.yahoo.co.jp ) and in the very page we are posting on right now (charset=ISO-8859-1 doesnt have the euro symbol), let's check the charset they state in their HTTP headers (using HTTP Web-Sniffer 1.0.44) and in their HTML source (as a reminder, whatever we may think about it, the HTTP header has priority over the HTML source):

    1. http://www.google.fr (and http://www.google.co.jp BTW) actually uses ISO-8859-1 (states "Content-Type: text/html; charset=ISO-8859-1" in its HTTP Header, and nothing in its HTML source)
    2. http://www.yahoo.co.jp (as well as www.yahoo.fr, that redirects to http://fr.yahoo.com, or as http://www.yahoo.com ) actually uses UTF-8 (states "Content-Type: text/html; charset=utf-8" in its HTTP Header, and <meta http-equiv="content-type" content="text/html; charset=utf-8"> in its HTML source)
    3. this SitePoint page actually uses ISO-8859-1 (states "Content-Type: text/html; charset=ISO-8859-1" in its HTTP Header, and <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"> in its HTML source), and nevertheless displays correctly the Euro "€" and Currency "¤" typographical symbols (and some).

    Notice however that between my checks of 2008 and 2011, some sites have converted from UTF-8 to local charsets, some the other way; a typical case is SONY, where the global and US sites have switched from ISO-8859-1 to UTF-8 (which will do them no harm at all, since for English UTF-8 is actually ASCII), while local sites have remained in, or converted to, local charsets, especially in their form pages: see SONY and Sony USA (from ISO-8859-1 to UTF-8), Sony Global (ISO-8859-1), Sony JP (Shift-JIS), Sony FR (UTF-8) > Contact (UTF-8) > Form (Windows-1252).
    Quote Originally Posted by Jeff Mott View Post
    In Western Europe or eastern countries, is there still software being used that doesn't support UTF-8?
    Sure, everything exists in nature, yet the software that doesn't support it at all must be rare. However, many sites support UTF-8 only incompletely or wrongly. A notable case is Microsoft, which despite its vast resources never corrected Outlook Express' big UTF-8 flaw in editing HTML source, and took years before correctly taming UTF-8 everywhere else.

    Versailles, Fri 14 Dec 2012 12:15:10 +0100
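
    The precedence rule mentioned above (the HTTP header wins over the meta tag) can be sketched as follows. This is a deliberately simplified model with invented names (DeclaredCharsetDemo, charsetOf, effective); real browsers also consider things such as a byte order mark and their own fallback settings, which are ignored here:

```java
import java.util.Optional;

class DeclaredCharsetDemo {
    // Extract the charset parameter from a Content-Type value such as
    // "text/html; charset=ISO-8859-1". Spaces around '=' are tolerated.
    static Optional<String> charsetOf(String contentType) {
        if (contentType == null) {
            return Optional.empty();
        }
        for (String part : contentType.split(";")) {
            int eq = part.indexOf('=');
            if (eq < 0) {
                continue;
            }
            String name = part.substring(0, eq).trim();
            String value = part.substring(eq + 1).trim();
            if (name.equalsIgnoreCase("charset") && !value.isEmpty()) {
                return Optional.of(value);
            }
        }
        return Optional.empty();
    }

    // The charset from the HTTP header is used if present; otherwise the one from the meta tag.
    static Optional<String> effective(String httpContentType, String metaContent) {
        Optional<String> fromHeader = charsetOf(httpContentType);
        return fromHeader.isPresent() ? fromHeader : charsetOf(metaContent);
    }

    public static void main(String[] args) {
        // Header and meta disagree: the header value is the one that counts.
        System.out.println(effective("text/html; charset=ISO-8859-1",
                                     "text/html; charset=utf-8").get()); // ISO-8859-1
        // Header has no charset: fall back to the meta tag.
        System.out.println(effective("text/html",
                                     "text/html; charset = iso-8859-15").get()); // iso-8859-15
    }
}
```

    In practice this means that changing only the meta tag may have no effect if the server already sends a charset in its Content-Type header.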

  15. #15
    SitePoint Evangelist
    Join Date
    Apr 2003
    Location
    lisboa
    Posts
    423
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Michel Merlin View Post

    1. this SitePoint page actually uses ISO-8859-1 (states "Content-Type: text/html; charset=ISO-8859-1" in its HTTP Header, and <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"> in its HTML source), and nevertheless displays correctly the Euro "€" and Currency "¤" typographical symbols (and some).
    that's intriguing: how do they do it that i cant have it?

  16. #16
    SitePoint Zealot Michel Merlin's Avatar
    Join Date
    Mar 2005
    Location
    Versailles (France)
    Posts
    169
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by rfl View Post
    (15:37 GMT)
    i cant have it?
    I guess what you mean is that the Euro and Currency signs are not displayed properly for you. It can't be a font problem, since the fonts used (Arial, Verdana) are very common. So, have you checked that your browser is set to detect the charset in the web page, as described in the "Details" link in the last line of my post of 21:00?

    Versailles, Fri 14 Dec 2012 18:07:25 +0100

  17. #17
    SitePoint Evangelist
    Join Date
    Apr 2003
    Location
    lisboa
    Posts
    423
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    yes, in opera, chrome and ie8

  18. #18
    SitePoint Wizard bronze trophy Jeff Mott's Avatar
    Join Date
    Jul 2009
    Posts
    1,278
    Mentioned
    18 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by rfl View Post
    that's intriguing: how do they do it that i cant have it?
    I suspect the euro and currency symbols are special cases. The browser probably error corrects for the euro symbol, because even though it isn't in the iso-8859-1 set, it is in the windows-1252 set. And the currency symbol is actually in iso-8859-1, so that one is legal.

    If yours is coming through as a question mark, then I suspect it's not the browser but your server that is sending it that way. You'll need to follow Michel's advice to have "the same charset everywhere in your information handling chain." If either your application or your database is Latin1, then either one of those could be replacing illegal characters with the question mark. You'll need to pick a charset that can support all characters (in the English-speaking world, UTF-8 is by far the most popular choice), and make sure everything is using that charset.
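
    A quick way to see which charsets can represent which symbols, as a hedged sketch (CanEncodeDemo and canEncode are names made up here; it assumes the JRE provides windows-1252, which standard JDKs do):

```java
import java.nio.charset.Charset;

class CanEncodeDemo {
    // True if every character of text has a mapping in the named charset.
    static boolean canEncode(String text, String charsetName) {
        return Charset.forName(charsetName).newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String euro = "\u20AC";     // euro sign
        String currency = "\u00A4"; // generic currency sign
        System.out.println(canEncode(euro, "ISO-8859-1"));     // false
        System.out.println(canEncode(currency, "ISO-8859-1")); // true
        System.out.println(canEncode(euro, "windows-1252"));   // true
        System.out.println(canEncode(euro, "UTF-8"));          // true
    }
}
```

    A check like this could be run on incoming text before it is written to the database, to find out whether the chosen charset can hold it at all.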
    "First make it work. Then make it better."

  19. #19
    SitePoint Evangelist
    Join Date
    Apr 2003
    Location
    lisboa
    Posts
    423
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    i'm still working on how to alter tomcat's charset
    a bit off topic: i notice that in my app, if i have a " character or a - character, there it goes again - i get a question mark; but if i edit it, i mean, rewrite it, i get it right!

