Thread: regex question

  1. #1
    SitePoint Wizard
    Join Date
    Oct 2005
    Location
    London
    Posts
    1,678

    regex question

    I have this regex:
    Code PHP:
    $replace = ereg_replace("\&+^amp\;$", "&amp;");

    What I want to say is: find any & that isn't followed by amp; and replace it with &amp;. As I have it now, I think it finds any & that is followed by amp; and replaces it with &amp;... although I bet there are errors in there. I'm just learning this stuff and I've written that off the top of my head, so it's bound to be wrong.

    Any ideas?

  2. #2
    . shoooo... silver trophy logic_earth's Avatar
    Join Date
    Oct 2005
    Location
    CA
    Posts
    9,013
    Logic without the fatal effects.
    All code snippets are licensed under WTFPL.


  3. #3
    SitePoint Wizard
    Join Date
    Oct 2005
    Location
    London
    Posts
    1,678
    Hi,

    Thanks for the quick response. Does that cater for an & that already has amp; on the end? I guess it does, as that expression searches for an & and nothing else?

  4. #4
    . shoooo... silver trophy logic_earth's Avatar
    Join Date
    Oct 2005
    Location
    CA
    Posts
    9,013
    Yes, using the code below will preserve existing entities.
    PHP Code:
    $line = 'something & nothing &amp; find it';

    $line = htmlentities($line, ENT_NOQUOTES, 'UTF-8', false);
    # something &amp; nothing &amp; find it


  5. #5
    SitePoint Wizard stereofrog's Avatar
    Join Date
    Apr 2004
    Location
    germany
    Posts
    4,324
    Doesn't work for me.

    @elduderino: "not followed by" is "?!" in regexp language (a negative lookahead), so "& not followed by amp;" will be

    Code:
    /&(?!amp;)/
    and "& not followed by something that looks like an HTML entity" will be

    Code:
    /&(?!\w+;)/
    You'll need the preg_ functions to use these expressions, not ereg_.
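
To make the difference between the two patterns above concrete, here is a minimal sketch using preg_replace. The sample string and variable names are made up for illustration and are not from the thread.

```php
<?php
// Hypothetical sample input: one bare ampersand, one already-escaped
// ampersand, and one named entity.
$html = 'Tom & Jerry &amp; friends &copy; 2008';

// "& not followed by amp;": escapes every & except the one that starts &amp;.
// Note that this pattern does NOT protect other entities such as &copy;.
$onlyAmp = preg_replace('/&(?!amp;)/', '&amp;', $html);

// "& not followed by word characters and a semicolon": leaves anything that
// looks like a named entity alone.
$anyEntity = preg_replace('/&(?!\w+;)/', '&amp;', $html);

echo $onlyAmp, "\n";   // Tom &amp; Jerry &amp; friends &amp;copy; 2008
echo $anyEntity, "\n"; // Tom &amp; Jerry &amp; friends &copy; 2008
```

The lookahead only inspects what comes after the &, so the replacement never swallows the characters that follow it.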

  6. #6
    SitePoint Wizard
    Join Date
    Oct 2005
    Location
    London
    Posts
    1,678
    Hi,

    Yeah, htmlentities replaces everything, so I just get a page of escaped chars.

    @stereofrog: kinda following you...

    We currently have this:

    $html = ereg_replace("&([^a-z]+[^;])", "&amp;\\1", $html);

    which is trying to say "an ampersand not followed by some chars and not followed by a ;", but it doesn't work. Can you do a "not followed by" after you've already done a "not followed by"?

  7. #7
    . shoooo... silver trophy logic_earth's Avatar
    Join Date
    Oct 2005
    Location
    CA
    Posts
    9,013
    It worked in my test. Are you sure you passed false as the last argument?

    htmlentities ( string $string [, int $quote_style [, string $charset [, bool $double_encode]]] )

    $double_encode should be false.

    EDIT: OK, I should have read further down the page.
    5.2.3: The double_encode parameter was added.
    It worked for me because I'm on the latest version of PHP.
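
As a rough illustration of what the fourth argument changes, the sketch below runs htmlentities with $double_encode set to false and to true. The sample string is invented, and the code assumes PHP 5.2.3 or later.

```php
<?php
// Hypothetical input: a bare ampersand, an existing entity, and some markup.
$line = 'fish & chips &amp; <b>peas</b>';

// double_encode = false: existing entities such as &amp; are left as they are.
$preserved = htmlentities($line, ENT_NOQUOTES, 'UTF-8', false);

// double_encode = true (the default): the & inside &amp; is escaped again.
$doubled = htmlentities($line, ENT_NOQUOTES, 'UTF-8', true);

echo $preserved, "\n"; // fish &amp; chips &amp; &lt;b&gt;peas&lt;/b&gt;
echo $doubled, "\n";   // fish &amp; chips &amp;amp; &lt;b&gt;peas&lt;/b&gt;
```

Either way the < and > of the tags are escaped too, which matters if the input is a whole HTML page rather than a text fragment.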


  8. #8
    SitePoint Wizard stereofrog's Avatar
    Join Date
    Apr 2004
    Location
    germany
    Posts
    4,324
    Kinda lost you here. "Not followed by" is "?!", not ^, which is something completely different.
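
A small sketch of why the two are not interchangeable: inside brackets, ^ builds a negated character class, which still has to match and consume one character, while (?!...) only peeks ahead. The input string and variable names here are hypothetical.

```php
<?php
// Hypothetical input: a bare &, an existing &amp;, and a bare & at the very end.
$html = 'a & b &amp; c &';

// Negated class: "& followed by one character that is not 'a'".
// The matched character is consumed and lost in the replacement, and an &
// at the end of the string cannot match because no character follows it.
$withClass = preg_replace('/&[^a]/', '&amp;', $html);

// Negative lookahead: "& not followed by amp;". Nothing after the & is consumed.
$withLookahead = preg_replace('/&(?!amp;)/', '&amp;', $html);

echo $withClass, "\n";     // a &amp;b &amp; c &
echo $withLookahead, "\n"; // a &amp; b &amp; c &amp;
```

Outside brackets, ^ means something else again: it anchors the match to the start of the string.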

  9. #9
    SitePoint Wizard
    Join Date
    Oct 2005
    Location
    London
    Posts
    1,678
    Hi logic_earth, I might be wrong, but I can't see how htmlentities doesn't turn every < and > into &lt; and &gt;. We're running the regex on a whole page.

    This seems to work:
    PHP Code:
    $html = preg_replace("/&(?![a-z]+;)/", "&amp;", $html);
    but now all our pound signs show up as &#36;!

  10. #10
    SitePoint Wizard
    Join Date
    Oct 2005
    Location
    London
    Posts
    1,678
    Got it:

    PHP Code:
    $html = preg_replace("/&(?!#?[a-z0-9]+;)/", "&amp;", $html);
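
For anyone reusing this, here is a quick check of that pattern against a made-up string containing a numeric entity, a bare ampersand, and two named entities. The case-insensitive variant with the i modifier is an assumption added for illustration, not something posted in the thread.

```php
<?php
// Hypothetical input: numeric entity, bare &, lowercase and capitalised named entities.
$html = 'Price: &#163;5 & up, &copy; &Eacute;cole';

// Pattern from the post above: allows an optional # and then lowercase letters or digits.
$result = preg_replace('/&(?!#?[a-z0-9]+;)/', '&amp;', $html);

// Assumed variant: the i modifier also protects entities with capital letters.
$resultInsensitive = preg_replace('/&(?!#?[a-z0-9]+;)/i', '&amp;', $html);

echo $result, "\n";            // Price: &#163;5 &amp; up, &copy; &amp;Eacute;cole
echo $resultInsensitive, "\n"; // Price: &#163;5 &amp; up, &copy; &Eacute;cole
```

Because the character class is lowercase only, an entity such as &Eacute; would still be escaped a second time unless the match is made case-insensitive.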

