  1. #1
    SitePoint Member
    Join Date
    Mar 2009
    Posts
    7
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    £ (pound sign) not working with preg_match etc...

    I am loading a webpage stored locally (i.e. not from a live site) using file_get_contents.

    I then try to split the source on '£', i.e. split('£', $html_source), but it doesn't find the £. I also tried searching with strstr and preg_match, but neither of these found the £.

    Note that yes, they are literal £ characters and not &pound; entities etc. Is it to do with the character encoding of the file?

  2. #2
    AnthonySterling (Twitter: @AnthonySterling)
    Join Date
    Apr 2008
    Location
    North-East, UK.
    Posts
    6,111
    Mentioned
    3 Post(s)
    Tagged
    0 Thread(s)
    A quick test shows all work as expected this end...

    PHP Code:
    <?php
    $sString = 'Foo£Bar';

    print_r(preg_split('~£~', $sString));
    /*
    Array
    (
        [0] => Foo
        [1] => Bar
    )
    */

    echo (strstr($sString, '£') !== false) ? 'Found' : 'Not found';
    /*
    Found
    */

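    // Note: split() was deprecated in PHP 5.3 and removed in PHP 7.0;
    // explode('£', $sString) returns the same result on current versions.
    print_r(split('£', $sString));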
    /*
    Array
    (
        [0] => Foo
        [1] => Bar
    )
    */
    ?>
    You could check the encoding of the document, or even take a peek at the request headers to see what they say.
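One way to check the encoding from PHP itself is sketched below. It assumes the mbstring extension is available, and the two sample strings are hypothetical stand-ins for the loaded page. A string that fails the UTF-8 check but contains bytes above 127 is probably in a single-byte encoding such as ISO-8859-1.

```php
<?php
// Hypothetical sample data: the same text in two different encodings.
$latin1 = "Foo" . "\xA3" . "Bar";     // ISO-8859-1: £ is the single byte 0xA3
$utf8   = "Foo" . "\xC2\xA3" . "Bar"; // UTF-8: £ is the two bytes 0xC2 0xA3

// mb_check_encoding() reports whether the bytes are valid in the given encoding.
var_dump(mb_check_encoding($latin1, 'UTF-8')); // bool(false)
var_dump(mb_check_encoding($utf8, 'UTF-8'));   // bool(true)
```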

    Good luck.
    @AnthonySterling: I'm a PHP developer, a consultant for oopnorth.com and the organiser of @phpne, a PHP User Group covering the North-East of England.

  3. #3
    SitePoint Member
    Thanks for your reply.

    A simple example like the one you gave does work; however, loading a webpage and then splitting on the pound symbol doesn't seem to work for me. I am probably doing something very daft ;-)

    I tried converting the HTML source using htmlentities() and THEN splitting using &pound; and this worked.

    I have now run the following code to determine the actual byte values of the pound symbols in the HTML source:


    PHP Code:
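    $Length = strlen($HTMLSource);

    // Print each byte of the source alongside its numeric value.
    // The index starts at 0 so the first byte is not skipped.
    for ($Index = 0; $Index < $Length; $Index++)
    {
        $Char = $HTMLSource[$Index];
        echo "$Char = " . ord($Char) . "<br>\n";
    }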

    The pound symbols are byte value 163, as I expected.

    So why isn't it splitting? I still don't know.

  4. #4
    SitePoint Member
    This works though:

    PHP Code:
    print_r(split(chr(163), $HTMLSource)); 
    But this code gives 194 as the result:
    PHP Code:
    echo (ord('£')); 
    So it does look like some kind of character mapping problem. Does anyone know what is going on? Is it my editor (BBEdit) that is causing the problem?
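The 194 is consistent with the script file being saved as UTF-8: there, £ is the two-byte sequence 0xC2 0xA3, and ord() returns only the first byte. The sketch below spells the bytes out with escape sequences so it does not depend on how the file is saved. The variable names and the sample page string are illustrative only.

```php
<?php
// "\xC2\xA3" is the UTF-8 byte sequence for £; "\xA3" is the ISO-8859-1 byte.
$poundUtf8   = "\xC2\xA3";
$poundLatin1 = "\xA3";

echo strlen($poundUtf8), "\n";  // 2
echo ord($poundUtf8), "\n";     // 194 (0xC2, the first byte only)
echo ord($poundUtf8[1]), "\n";  // 163
echo ord($poundLatin1), "\n";   // 163

// A Latin-1 page never contains the byte pair C2 A3, so that search fails.
$page = "Foo" . $poundLatin1 . "Bar";
var_dump(strpos($page, $poundUtf8));   // bool(false)
var_dump(strpos($page, $poundLatin1)); // int(3)
```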

    The odd thing is that I can do a view source of the html in question, then copy the pound symbol and insert it straight into the code doing the split, and it STILL doesn't work!!!! Very irritating ;-)

  5. #5
    SitePoint Member
    It's my editor!

    I have now re-opened the file with character mapping "ISO Latin 1"; it seemed to be UTF-8 before. I will try to make this character mapping my default.
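An alternative to changing the editor's encoding is to leave the script as UTF-8 and convert the loaded page instead. This is only a sketch: it assumes the page really is ISO-8859-1 and that the mbstring extension is installed, and the sample string stands in for the result of file_get_contents.

```php
<?php
// Hypothetical Latin-1 page content; in practice this comes from file_get_contents().
$page = "Foo" . "\xA3" . "Bar";

// Convert the page to UTF-8 so it matches a UTF-8 encoded script file.
$pageUtf8 = mb_convert_encoding($page, 'UTF-8', 'ISO-8859-1');

// "\xC2\xA3" is what a literal £ becomes in a UTF-8 encoded source file.
$parts = explode("\xC2\xA3", $pageUtf8);
print_r($parts);
/*
Array
(
    [0] => Foo
    [1] => Bar
)
*/
```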

    Talk about a wild goose chase... I feel like POUNDing my fists on the floor

    haha!

  6. #6
    AnthonySterling (Twitter: @AnthonySterling)
    Excellent investigatory skills, a nice example for many others to follow...

    Well done, happy you got it sorted.

  7. #7
    SitePoint Member

    Thanks Anthony

    At least I learnt something along the way

