SitePoint Sponsor

User Tag List

Results 1 to 17 of 17
  1. #1
    Barefoot on the Moon! silver trophy Force Flow's Avatar
    Join Date
    Jul 2003
    Location
    Northeastern USA
    Posts
    4,606
    Mentioned
    56 Post(s)
    Tagged
    1 Thread(s)

    ¬ characters appearing in database values

    I have a strange issue where I type text into a textarea, it gets UPDATE to a MySQL database. mb_detect_encoding() on the $_POST variable in PHP returns "ASCII".

    Then, when the text is read back from the database, added o the textarea field, I just update it to the database again, but this time all the spaces have a ¬ characters in front of them. mb_detect_encoding() on the $_POST variable gives "UTF-8". Example data in the database:

    Code:
    Due¬ to¬ inclement¬ weather,¬ we¬ will¬ be¬ closed¬ on¬ 1/4.
    The HTML headers have the utf8 meta tag:

    HTML Code:
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
    OS platform (windows vs *nix), PHP, MySQL versions don't seem to matter--the issue seems to happen everywhere.

    The database is using the default collation of latin1_swedish_ci. Changing the collation to utf8_general_ci doesn't make a difference.

    I've been banging my head against the wall for hours on this issue--I've never encountered it before. The reason this matters is that I need to count the number of characters and limit the string to x number of characters. When I try to save the string to the database for a second time, there are extra characters counted in the string, but when I do a var_dump() of the string, there is nothing extra there. I don't get it.

    Any ideas?
    Visit The Blog | Follow On Twitter
    301tool 1.1.5 - URL redirector & shortener (PHP/MySQL)
    Can be hosted on and utilize your own domain

  2. #2
    Programming Team silver trophybronze trophy
    Mittineague's Avatar
    Join Date
    Jul 2005
    Location
    West Springfield, Massachusetts
    Posts
    17,140
    Mentioned
    190 Post(s)
    Tagged
    2 Thread(s)
    It's definately a character encoding issue. I've seen this before when some files are saved as UTF-8
    (the BOM "space" gets saved as an ¬)

    Are you using the optional arguments?
    PHP Code:
    mb_detect_encoding($str'UTF-8'true); 

  3. #3
    Barefoot on the Moon! silver trophy Force Flow's Avatar
    Join Date
    Jul 2003
    Location
    Northeastern USA
    Posts
    4,606
    Mentioned
    56 Post(s)
    Tagged
    1 Thread(s)
    Quote Originally Posted by Mittineague View Post
    Are you using the optional arguments?
    PHP Code:
    mb_detect_encoding($str'UTF-8'true); 
    No.
    PHP Code:
    $result mb_detect_encoding($str
    The result ends up being either "ASCII" or "UTF-8". I'm not doing a boolean check. I was trying to figure out what encoding was actually being used.

    So, since you're saying the BOM space may be the issue, do you know how I might address it?
    Visit The Blog | Follow On Twitter
    301tool 1.1.5 - URL redirector & shortener (PHP/MySQL)
    Can be hosted on and utilize your own domain

  4. #4
    Programming Team silver trophybronze trophy
    Mittineague's Avatar
    Join Date
    Jul 2005
    Location
    West Springfield, Massachusetts
    Posts
    17,140
    Mentioned
    190 Post(s)
    Tagged
    2 Thread(s)
    I haven't had problems since I changed everything to UTF-8 i.e.
    My text editior
    Web pages
    Database

    ASCII is OK if you know you'll only ever need that set of characters I suppose but UTF-8 has those characters covered and others as well.

    I think if you add in the arguments you can "force" the strings to seen as UTF-8 instead of ASCII

    You might end up foo-bar-ing your database if you change the charset and collation but I think it would be worth doing for the long term benefits.
    Save a back-up most definately and if possible experiment on localhost before messing with a live online database.

  5. #5
    Barefoot on the Moon! silver trophy Force Flow's Avatar
    Join Date
    Jul 2003
    Location
    Northeastern USA
    Posts
    4,606
    Mentioned
    56 Post(s)
    Tagged
    1 Thread(s)
    Quote Originally Posted by Mittineague View Post
    I haven't had problems since I changed everything to UTF-8 i.e.
    My text editior
    Web pages
    Database

    ASCII is OK if you know you'll only ever need that set of characters I suppose but UTF-8 has those characters covered and others as well.

    I think if you add in the arguments you can "force" the strings to seen as UTF-8 instead of ASCII

    You might end up foo-bar-ing your database if you change the charset and collation but I think it would be worth doing for the long term benefits.
    Save a back-up most definately and if possible experiment on localhost before messing with a live online database.
    Yes, but *how*? I've found all sorts of articles giving bits and pieces and differing information, so I have no idea if one change I make actually works or not.

    [edit]:
    I have Eclipse set to UTF-8 by default.

    I have the utf8 HTML meta tag on all pages.

    I have UTF-8 set for PDO:
    PHP Code:
    $dbconn = new PDO('mysql:host='.$host.';dbname='.$database_name$user$password, array(PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES utf8")); 
    I've run these SQL statements for my database and tables:
    Code:
    SET NAMES utf8;
    ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_general_ci;
    ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;
    However, that annoying ¬ character still keeps showing up in VARCHAR and TEXT fields in the database--but only when I update a field for a second time based on what was loaded from the database. This issue only seems to surface for HTML textarea fields, and not text input fields.

    What else is left?
    Last edited by Force Flow; Mar 21, 2014 at 17:19.
    Visit The Blog | Follow On Twitter
    301tool 1.1.5 - URL redirector & shortener (PHP/MySQL)
    Can be hosted on and utilize your own domain

  6. #6
    Programming Team silver trophybronze trophy
    Mittineague's Avatar
    Join Date
    Jul 2005
    Location
    West Springfield, Massachusetts
    Posts
    17,140
    Mentioned
    190 Post(s)
    Tagged
    2 Thread(s)
    It still does it for new INSERTS or only with old content?

    Hopefully only old then, removing the "¬" could be done.

    Otherwise I'm suspecting it's an Eclipse setting that's in play.

  7. #7
    Barefoot on the Moon! silver trophy Force Flow's Avatar
    Join Date
    Jul 2003
    Location
    Northeastern USA
    Posts
    4,606
    Mentioned
    56 Post(s)
    Tagged
    1 Thread(s)
    Quote Originally Posted by Mittineague View Post
    It still does it for new INSERTS or only with old content?

    Hopefully only old then, removing the "¬" could be done.

    Otherwise I'm suspecting it's an Eclipse setting that's in play.
    The condition under which this happens:
    1) update/insert text from a previously blank textarea or when all the textarea text is deleted and re-entered. (PHP sees the supplied text prior to UPDATE/INSERT as ASCII).
    2) Get the text from the database and display it in the textarea (PHP sees the retrieved text as UTF8).
    3) Leave the text in the textarea unchanged or alter it in some way, update it in the database, and then the extra characters appear. (PHP sees the supplied text prior to UPDATE as UTF8).
    Visit The Blog | Follow On Twitter
    301tool 1.1.5 - URL redirector & shortener (PHP/MySQL)
    Can be hosted on and utilize your own domain

  8. #8
    From space with love silver trophy
    SpacePhoenix's Avatar
    Join Date
    May 2007
    Location
    Poole, UK
    Posts
    5,014
    Mentioned
    103 Post(s)
    Tagged
    0 Thread(s)
    What version of PHP is in use?
    Community Team Advisor
    Forum Guidelines: Posting FAQ Signatures FAQ Self Promotion FAQ
    Help the Mods: What's Fluff? Report Fluff/Spam to a Moderator

  9. #9
    Barefoot on the Moon! silver trophy Force Flow's Avatar
    Join Date
    Jul 2003
    Location
    Northeastern USA
    Posts
    4,606
    Mentioned
    56 Post(s)
    Tagged
    1 Thread(s)
    Quote Originally Posted by SpacePhoenix View Post
    What version of PHP is in use?
    I've tried it on 5.3.27 and 5.4.7.
    Visit The Blog | Follow On Twitter
    301tool 1.1.5 - URL redirector & shortener (PHP/MySQL)
    Can be hosted on and utilize your own domain

  10. #10
    From space with love silver trophy
    SpacePhoenix's Avatar
    Join Date
    May 2007
    Location
    Poole, UK
    Posts
    5,014
    Mentioned
    103 Post(s)
    Tagged
    0 Thread(s)
    In the site's header, what is the value for the meta:

    Code:
    <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
    Community Team Advisor
    Forum Guidelines: Posting FAQ Signatures FAQ Self Promotion FAQ
    Help the Mods: What's Fluff? Report Fluff/Spam to a Moderator

  11. #11
    SitePoint Mentor bronze trophy
    John_Betong's Avatar
    Join Date
    Aug 2005
    Location
    City of Angels
    Posts
    1,819
    Mentioned
    73 Post(s)
    Tagged
    6 Thread(s)
    I read somewhere that it was essential to have <meta http-equiv="Content-Type" content="text/html; charset=utf-8">immediately after declaring <head> to ensure that the contents uses the declared charset.

    I was wondering if immediately after your UPDATE statement, a PHP script is called to extract the text from the database before the html header statement is called.

    Have you tried setting the PHP header(...).
    Code:
    <?php header('content-type: text/html; charset: utf-8'); ?>
    
    

  12. #12
    From space with love silver trophy
    SpacePhoenix's Avatar
    Join Date
    May 2007
    Location
    Poole, UK
    Posts
    5,014
    Mentioned
    103 Post(s)
    Tagged
    0 Thread(s)
    Can you please post the output of a SHOW CREATE TABLE for the table(s) concerned?

    What versions of MySQL are both servers using?
    Community Team Advisor
    Forum Guidelines: Posting FAQ Signatures FAQ Self Promotion FAQ
    Help the Mods: What's Fluff? Report Fluff/Spam to a Moderator

  13. #13
    Barefoot on the Moon! silver trophy Force Flow's Avatar
    Join Date
    Jul 2003
    Location
    Northeastern USA
    Posts
    4,606
    Mentioned
    56 Post(s)
    Tagged
    1 Thread(s)
    Quote Originally Posted by John_Betong View Post
    Have you tried setting the PHP header(...).
    No. I would've thought the meta tag would be enough. I've never needed to set a PHP header for content-type before.


    Quote Originally Posted by SpacePhoenix View Post
    Can you please post the output of a SHOW CREATE TABLE for the table(s) concerned?

    What versions of MySQL are both servers using?

    I've tried with MySQL 5.5.27 and 5.5.32.

    Here's the creation script, but with just the table in question:
    Code:
    SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0;
    SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0;
    SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='TRADITIONAL,ALLOW_INVALID_DATES';
    
    -- -----------------------------------------------------
    -- Table `MessageTemplates`
    -- -----------------------------------------------------
    CREATE TABLE IF NOT EXISTS `MessageTemplates` (
      `MID` INT UNSIGNED NOT NULL AUTO_INCREMENT,
      `name` VARCHAR(255) NOT NULL,
      `subject` VARCHAR(255) NULL,
      `msg` VARCHAR(255) NULL COMMENT 'my comment',
      `msgLong` TEXT NULL COMMENT 'my comment',
      `lastupdated` DATETIME NULL,
      PRIMARY KEY (`MID`),
      UNIQUE INDEX `name_UNIQUE` (`name` ASC))
    ENGINE = InnoDB;
    
    SET SQL_MODE=@OLD_SQL_MODE;
    SET FOREIGN_KEY_CHECKS=@OLD_FOREIGN_KEY_CHECKS;
    SET UNIQUE_CHECKS=@OLD_UNIQUE_CHECKS;
    Here is an export of the table with data. One row is for an INSERT. The other is what happens after an UPDATE (when using the text supplied from the database. If I delete and type something, the UPDATE doesn't introduce the extra character).

    Note that I have a PHP function that trims the string to 160 characters. Since each ¬ character adds to the string length, there is a difference between the two strings.

    Code:
    SET SQL_MODE="NO_AUTO_VALUE_ON_ZERO";
    SET time_zone = "+00:00";
    
    
    /*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
    /*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
    /*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
    /*!40101 SET NAMES utf8 */;
    
    
    --
    -- Table structure for table `messagetemplates`
    --
    
    CREATE TABLE IF NOT EXISTS `messagetemplates` (
      `MID` int(11) unsigned NOT NULL AUTO_INCREMENT,
      `name` varchar(255) NOT NULL,
      `subject` varchar(255) DEFAULT NULL,
      `msg` varchar(255) DEFAULT NULL COMMENT 'my comment',
      `msgLong` longtext COMMENT 'my comment',
      `lastupdated` datetime DEFAULT NULL,
      PRIMARY KEY (`MID`),
      UNIQUE KEY `name` (`name`)
    ) ENGINE=InnoDB  DEFAULT CHARSET=utf8 AUTO_INCREMENT=14 ;
    
    --
    -- Dumping data for table `messagetemplates`
    --
    
    INSERT INTO `messagetemplates` (`MID`, `name`, `subject`, `msg`, `msgLong`, `lastupdated`) VALUES
    (12,  'INSERT Test', '', 'Due to inclement weather, the primary event  location will be closed on 1/4/14. We apologize for the inconvenience.  Please visit example.org for updates.', '', '2014-03-24 17:43:40'),
    (13,    'UPDATE Test', '',  'Due¬ to¬ inclement¬ weather,¬ the¬ primary¬ event¬ location¬ will¬ be¬ closed¬ on¬ 1/4/14.¬ We¬ apologize¬ for¬ the¬ inconvenience.¬ Please¬ visit¬ example.org¬',    '', '2014-03-24 17:43:59');
    
    /*!40101 SET CHARACTER_SET_CLIENT=@OLD_CHARACTER_SET_CLIENT */;
    /*!40101 SET CHARACTER_SET_RESULTS=@OLD_CHARACTER_SET_RESULTS */;
    /*!40101 SET COLLATION_CONNECTION=@OLD_COLLATION_CONNECTION */;
    Last edited by Force Flow; Mar 24, 2014 at 12:40.
    Visit The Blog | Follow On Twitter
    301tool 1.1.5 - URL redirector & shortener (PHP/MySQL)
    Can be hosted on and utilize your own domain

  14. #14
    Barefoot on the Moon! silver trophy Force Flow's Avatar
    Join Date
    Jul 2003
    Location
    Northeastern USA
    Posts
    4,606
    Mentioned
    56 Post(s)
    Tagged
    1 Thread(s)
    To maybe see why the spaces were being replaced with wonky characters, I put the string into an array of characters:

    Code:
    array(171) {
      [0]=>
      string(1) "D"
      [1]=>
      string(1) "u"
      [2]=>
      string(1) "e"
      [3]=>
      string(1) "�"
      [4]=>
      string(1) "�"
      [5]=>
      string(1) "t"
      [6]=>
      string(1) "o"
    }
    And used ord() to get the ASCII code for each character

    Code:
    array(171) {
      [0]=>
      int(68)
      [1]=>
      int(117)
      [2]=>
      int(101)
      [3]=>
      int(194)
      [4]=>
      int(160)
      [5]=>
      int(116)
      [6]=>
      int(111)
    }
    I'm not sure if that provides any useful clues or not.
    Visit The Blog | Follow On Twitter
    301tool 1.1.5 - URL redirector & shortener (PHP/MySQL)
    Can be hosted on and utilize your own domain

  15. #15
    Barefoot on the Moon! silver trophy Force Flow's Avatar
    Join Date
    Jul 2003
    Location
    Northeastern USA
    Posts
    4,606
    Mentioned
    56 Post(s)
    Tagged
    1 Thread(s)
    Agh! Ok, I found the problem.

    The library I was using to generate HTML input/textarea fields was automatically replacing spaces with the nbsp special character. Somehow going back and forth between the database, the spaces were getting messed up a bit.

    I noticed this when I was looking at the HTML source and noticed a difference between the textarea values depending on when on was UPDATING the text.

    Once I removed that feature, everything started working as I would normally expect.

    Good grief.

    Thank you to everyone who took the time to look at this and help with troubleshooting
    Visit The Blog | Follow On Twitter
    301tool 1.1.5 - URL redirector & shortener (PHP/MySQL)
    Can be hosted on and utilize your own domain

  16. #16
    From space with love silver trophy
    SpacePhoenix's Avatar
    Join Date
    May 2007
    Location
    Poole, UK
    Posts
    5,014
    Mentioned
    103 Post(s)
    Tagged
    0 Thread(s)
    What processing or functions is the data being run through between its entry into the $_POST array and its insertion to the database?
    Community Team Advisor
    Forum Guidelines: Posting FAQ Signatures FAQ Self Promotion FAQ
    Help the Mods: What's Fluff? Report Fluff/Spam to a Moderator

  17. #17
    Barefoot on the Moon! silver trophy Force Flow's Avatar
    Join Date
    Jul 2003
    Location
    Northeastern USA
    Posts
    4,606
    Mentioned
    56 Post(s)
    Tagged
    1 Thread(s)
    Quote Originally Posted by SpacePhoenix View Post
    What processing or functions is the data being run through between its entry into the $_POST array and its insertion to the database?
    That's basically what the problem was, although I didn't think of it until later since the code processing the $_POST array was the only part I was looking at, and not the code that created the input fields later.
    Last edited by Force Flow; Mar 26, 2014 at 07:10. Reason: typo
    Visit The Blog | Follow On Twitter
    301tool 1.1.5 - URL redirector & shortener (PHP/MySQL)
    Can be hosted on and utilize your own domain


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •