  1. #1
    SitePoint Guru
    Join Date
    Jan 2007
    Posts
    967
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)

    Regular Expressions question

    How would I search a string for links and add Google tracking to the end of each one (e.g. &utm_medium=email)?

    So "<a href='page.php' >" would be converted to "<a href='page.php?utm_medium=email' >"

    Thanks

  2. #2
    play of mind Ernie1's Avatar
    Join Date
    Sep 2005
    Posts
    1,252
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Use this:
    PHP Code:
    <?php
    $string = '<a href="page.php">';

    $pattern = "/(.*\.php)/";

    $replacement = '$1?utm_medium=email';

    echo preg_replace($pattern, $replacement, $string);

  3. #3
    SitePoint Guru
    Join Date
    Jan 2007
    Posts
    967
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    Thanks a million!
    E

  4. #4
    SitePoint Guru
    Join Date
    Jan 2007
    Posts
    967
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    I tried out your example, but I only used page.php as an illustration. The link might have any extension, or none.
    I've been playing with what you wrote and this is what I came up with:

    Code PHP:
    $string = 'Lorem ipsum <a href="page.php"> dolor sit amet, consectetur adipiscing elit. Phasellus at neque sit amet metus laoreet venenatis quis quis quam. Fusce fringilla orci nisi, vitae pellentesque justo. <a href="page.php?go=help"> Integer sed nulla lacus. Sed congue dui enim. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nunc gravida volutpat felis, sit amet fringilla lectus vestibulum at.';
     
    $pattern = "/(.*href\=\"[^\r\n]*)\"/";
    $replacement = '$1?utm_medium=email"';
     
    echo htmlentities(preg_replace($pattern, $replacement, $string));

    The problem with this is that it only changes the last link. Also, if the link already has a query string, the result should be "<a href="/page?go=123&utm_medium=email" >" rather than a second query string added to the end. It should also work with single or double quotes.

    Any ideas?

    Thanks again E

  5. #5
    SitePoint Enthusiast nrg_alpha's Avatar
    Join Date
    Dec 2008
    Posts
    81
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Here would be my approach:

    PHP Code:
    $string = 'Lorem ipsum <a href="page.php"> dolor sit amet</a>, consectetur adipiscing elit. <a href="page.php?go=help"> Integer sed nulla lacus</a>. Sed congue dui enim.';
    echo $string = preg_replace('#<a\b[^>]*href\s?=\s?[\'"]\K[^\'"]+#i', '$0?utm_medium=email', $string);
    output (via right-click view source):
    Code:
    Lorem ipsum <a href="page.php?utm_medium=email"> dolor sit amet</a>, consectetur adipiscing elit. <a href="page.php?go=help?utm_medium=email"> Integer sed nulla lacus</a>. Sed congue dui enim.
    Note that in your sample string, you neglected to close your anchor tags (missing </a>), so I arbitrarily added some. I also shortened the thing, as we don't need a mile long sample to demonstrate something.
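
    The output above still shows page.php?go=help?utm_medium=email for the link that already has a query string. One possible way to handle that case is to swap preg_replace for preg_replace_callback and pick the separator per link. This is only a rough sketch: the function name add_tracking and its $param argument are made up for illustration, and it assumes the URLs contain no #fragment and that a bare & in the attribute is acceptable (strict HTML would want &amp;).

```php
<?php
// Hypothetical helper (not from the thread): appends a tracking parameter to
// the href of every <a> tag, using '&' when a query string already exists.
function add_tracking(string $html, string $param = 'utm_medium=email'): string
{
    // Same pattern as in the post: \K drops everything matched so far,
    // so only the URL itself ends up in $m[0]. Works with ' or " quotes.
    $pattern = '#<a\b[^>]*href\s?=\s?[\'"]\K[^\'"]+#i';

    return preg_replace_callback(
        $pattern,
        function (array $m) use ($param): string {
            // Choose the separator depending on whether the URL has a '?'.
            $sep = (strpos($m[0], '?') === false) ? '?' : '&';
            return $m[0] . $sep . $param;
        },
        $html
    );
}

echo add_tracking('<a href="page.php">one</a> <a href=\'page.php?go=help\'>two</a>');
```

    With the two sample links, the first should get ?utm_medium=email and the second &utm_medium=email.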

    Off Topic:


    @Ernie...
    your pattern was "/(.*\.php)/". As a general rule, .* (or .+) is best avoided, because the * and + quantifiers are greedy: they match as much as they can, then backtrack only if need be. In my sample string above, the .* in your pattern would run all the way to the end of the string, then backtrack until it reaches the first .php it finds (which would be the second URL's '.php'), and treat ALL of that as a single match (clearly not what we want). If anything, you could make the quantifier lazy, like so: .*? This protects against such mishaps.
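
    To make the greedy/lazy difference concrete, here is a small comparison. The two-link sample string is made up for the demo and is not the one from the thread:

```php
<?php
// Illustrative sample string (an assumption, not taken from the thread).
$string = '<a href="page.php">one</a> <a href="other.php">two</a>';
$replacement = '$1?utm_medium=email';

// Greedy: .* runs to the end of the string and backtracks to the LAST ".php",
// so there is a single match and only the second link is tagged.
$greedy = preg_replace('/(.*\.php)/', $replacement, $string);

// Lazy: .*? stops at the first ".php" it can reach, so each link is matched
// separately and both are tagged.
$lazy = preg_replace('/(.*?\.php)/', $replacement, $string);

echo $greedy, "\n", $lazy, "\n";
```

    The lazy version still relies on every link ending in .php, so it is a fix for the quantifier only, not for links with other extensions or none.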

