  1. #1
    Non-Member
    Join Date
    Jan 2004
    Location
    Seattle
    Posts
    4,328

    str_replace 1st occurrence only (and much more!)

    I'll probably eventually turn this into several threads, but I thought I'd start by posting it all here.

    I have a database table (gw_geog) that contains the names of the world's nations, states, etc. Another database contains articles about all these places. I assume I can use a str_replace function to link words in these articles that also occur in gw_geog.

    For example, consider the following sentence:

    Code:
    Brazil and Venezuela have both become economic powerhouses.
    If I display the article with a variable, like $Article, then I could use the following script to link the words Brazil and Venezuela:

    PHP Code:
    $Article = str_replace('Brazil', '<a href="/World/Brazil/">Brazil</a>', $Article);

    $Article = str_replace('Venezuela', '<a href="/World/Venezuela/">Venezuela</a>', $Article);
    (Yes, I know how to combine both words in a single array; I just can't remember how to write it at the moment.)
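
    For reference, a minimal sketch of the array form mentioned above. The sample sentence and the variable names $search, $replace and $Linked are only illustrative; str_replace() pairs each search term with the replacement at the same index.

    ```php
    <?php
    // Illustrative sample text; in practice this would come from the articles database.
    $Article = 'Brazil and Venezuela have both become economic powerhouses.';

    // Parallel arrays: each search term is paired with the replacement at the same index.
    $search  = array('Brazil', 'Venezuela');
    $replace = array(
        '<a href="/World/Brazil/">Brazil</a>',
        '<a href="/World/Venezuela/">Venezuela</a>',
    );

    // str_replace() returns the new string, so the result has to be assigned.
    $Linked = str_replace($search, $replace, $Article);

    echo $Linked;
    ```

    This form still links every occurrence of each word, which is the problem described next.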

    What I'd like to know is how to link the FIRST occurrence of each word only. If my article mentions Brazil three dozen times, I don't want three dozen links.

    I have a couple other related questions, which may deserve separate threads, but I'll ask them here, in case someone knows of a total solution.

    First, consider the word Arctic. If I use a str_replace script to link it, then how can I prevent it from being linked when used with the word "Ocean," like this?:

    Code:
    The <a href="/World/Arctic/">Arctic</a> Ocean covers much of the <a href="/World/Arctic/">Arctic</a>.
    This is what it should look like:

    Code:
    The <a href="/World/Arctic_Ocean/">Arctic Ocean</a> covers much of the <a href="/World/Arctic/">Arctic</a>.
    Similarly, how do I prevent this kind of problem?:

    Code:
    The <a href="/World/Europe/">Europe</a>an Union...
    Obviously, I want to link the word Europe but not EuropeAN.

    Unless someone has a better suggestion, I think I might have a partial fix. First, I'd fill a database table or array with some problem words, like Arctic and Arctic Ocean. Then I'd use the following str_replace function:

    PHP Code:
    $Article = str_replace('Arctic Ocean', '<span class="Link">Arctic Ocean</span>', $Article);
    Except I'd write the script so that all of these problem compound words are surrounded by span tags.

    Then I'd write a script that links the word Arctic only if it isn't preceded by a span tag.

    That should leave Arctic Ocean untouched. The next step would be to write a script linking the compound words, like Arctic Ocean, perhaps removing the span tags in the process.

    Does this sound like something that might work? If so, how can I write a script that links only words that aren't preceded by span tags?

    That still doesn't solve the problem of EuropeAN, but perhaps I could apply a similar solution to it.

    Thanks!

  2. #2
    Non-Member
    Join Date
    Jan 2003
    Posts
    5,748
    > If my article mentions Brazil three dozen times, I don't want three dozen links.

    Nope... The function will replace all occurrences of the word you are looking for (the word Brazil, for example), so just call the function once.

  3. #3
    Non-Member
    Join Date
    Jan 2003
    Posts
    5,748
    As for your other questions, you are looking for regular expressions. The function you are using at the moment isn't flexible enough to handle whole-word or context-dependent matches.
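
    A minimal sketch of what a regular-expression version could look like. The helper name linkFirstWholeWord() and the sample sentence are made up for illustration; preg_quote() and preg_replace_callback() are standard PHP functions, and the closure syntax assumes PHP 5.3 or later.

    ```php
    <?php
    // Hypothetical helper (not from the thread): link only the first whole-word match.
    function linkFirstWholeWord($text, $word, $url) {
        // \b marks a word boundary, so "Europe" does not match inside "European".
        // preg_quote() escapes regex metacharacters; '~' is the pattern delimiter.
        $pattern = '~\b' . preg_quote($word, '~') . '\b~';

        return preg_replace_callback(
            $pattern,
            function ($m) use ($url) {
                // $m[0] is the matched word exactly as it appears in the text.
                return '<a href="' . $url . '">' . $m[0] . '</a>';
            },
            $text,
            1 // limit: replace the first match only
        );
    }

    $sentence = 'The European Union covers much of Europe, and Europe is large.';
    $result   = linkFirstWholeWord($sentence, 'Europe', '/World/Europe/');

    echo $result;
    ```

    On its own this does not handle the Arctic / Arctic Ocean case, and it would also match words inside existing HTML attributes, so treat it as a starting point only.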

  4. #4
    Non-Member
    Join Date
    Jan 2004
    Location
    Seattle
    Posts
    4,328
    Originally posted by Dr Livingston:
    > If my article mentions Brazil three dozen times, I don't want three dozen links.

    Nope... The function will replace all occurrences of the word you are looking for (the word Brazil, for example), so just call the function once.
    Yes, I understand that str_replace will replace every instance of Brazil. I'm asking if there's some sort of trick to make it replace only the first instance. Or is that what you mean when you say, "just use the function once"? If so, I'm confused; how do you use str_replace just once?

    Thanks.

  5. #5
    SitePoint Zealot ejg
    Join Date
    Jun 2007
    Posts
    141
    Find the position of the first occurrence using strpos() then replace that instance with substr_replace().
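
    A minimal sketch of that approach. The helper name replaceFirst() and the sample text are illustrative only; strpos() and substr_replace() are the built-in functions named above.

    ```php
    <?php
    // Hypothetical helper: replace only the first occurrence of $needle in $haystack.
    function replaceFirst($haystack, $needle, $replacement) {
        $pos = strpos($haystack, $needle);

        // strpos() returns false when nothing is found; compare strictly,
        // because a match at position 0 is also "falsy".
        if ($pos === false) {
            return $haystack;
        }

        // Swap out exactly strlen($needle) characters starting at $pos.
        return substr_replace($haystack, $replacement, $pos, strlen($needle));
    }

    $Article = 'Brazil is large. Brazil borders Venezuela. Brazil exports coffee.';
    $Linked  = replaceFirst($Article, 'Brazil', '<a href="/World/Brazil/">Brazil</a>');

    echo $Linked;
    ```

    This is a plain substring match, so it would still link the "Europe" inside "European".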

  6. #6
    SitePoint Zealot the DtTvB
    Join Date
    Jul 2006
    Location
    Thailand
    Posts
    162
    Try this code:

    PHP Code:
    <?php

    //================//
    // Linkify words. //
    //================//

    //
    // Set up some variables.
    //
    $Dcount  = 0;        // Last placeholder number used.
    $Dlist   = array();  // Placeholder number => text it stands for.
    $Dtarget = '';       // URL for the word currently being linked.
    $Dtext   = '';       // Copy of the text, used to avoid placeholder clashes.

    //
    // Get the next replacement: advance $Dcount until the
    // placeholder __N__ does not already occur in the text.
    //
    function linkifyWord__nexttoken() {
        global $Dcount, $Dlist, $Dtarget, $Dtext;
        while (1) {
            $Dcount ++;
            if (strpos($Dtext, '__' . $Dcount . '__') === false)
                return;
        }
    }

    //
    // This function will return the link,
    // called from the linkifyWord function.
    //
    function linkifyWord__callback($m) {
        global $Dcount, $Dlist, $Dtarget, $Dtext;
        linkifyWord__nexttoken ();
        $Dlist[$Dcount] = '<a href="' . $Dtarget . '">' . $m[2] . '</a>';
        return $m[1] . '__' . $Dcount . '__' . $m[3];
    }

    //
    // This function linkifies the word (first occurrence only).
    //
    function linkifyWord(&$text, $word, $link) {

        //
        // Set-up variables.
        //
        global $Dcount, $Dlist, $Dtarget, $Dtext;
        $Dtarget = $link;
        $Dtext   = $text;

        //
        // Pad the text so a word at either end still has
        // a non-word character on both sides.
        //
        $text = ' ' . $text . ' ';

        //
        // Replace (the final argument, 1, limits it to the first match).
        //
        $text = preg_replace_callback('~(\W)(' . preg_quote($word, '~') . ')(\W)~i', 'linkifyWord__callback', $text, 1);

        //
        // Unpad.
        //
        $text = substr($text, 1);
        $text = substr($text, 0, -1);

    }

    //
    // Finish: swap every placeholder for its stored text.
    //
    function linkifyFinish(&$text) {
        global $Dlist;
        foreach ($Dlist as $k => $v) {
            $text = str_replace('__' . $k . '__', $v, $text);
        }
        $Dlist = array();
    }

    //
    // Start (Callback)
    //
    function linkifyStart__callback($m) {
        global $Dcount, $Dlist, $Dtarget, $Dtext;
        linkifyWord__nexttoken ();
        $Dlist[$Dcount] = $m[1];
        return '__' . $Dcount . '__';
    }

    //
    // Protect the HTML tags!
    //
    function linkifyStart(&$text) {
        global $Dcount, $Dlist, $Dtarget, $Dtext;
        $Dcount  = 0;
        $Dlist   = array();
        $Dtarget = '';
        $Dtext   = '';
        $text = preg_replace_callback('~(<\w+\s+[^>]+>)~', 'linkifyStart__callback', $text);
    }


    //===============//
    // Example code. //
    //===============//

    $text = '
        <ul>
            <li>Brazil and Venezuela have both become economic powerhouses.</li>
            <li>Brazil Brazil Brazil Brazil Brazil Brazil Brazil Brazil Brazil Brazil Brazil</li>
            <li>The Arctic Ocean covers much of the Arctic.</li>
            <li>The European Union...</li>
            <li>The European Union...</li>
        </ul>'
    ;

    linkifyStart ($text);

    linkifyWord ($text, 'Arctic Ocean', '/World/Arctic_Ocean/');
    linkifyWord ($text, 'Venezuela', '/World/Venezuela/');
    linkifyWord ($text, 'European', '/World/Europe/');
    linkifyWord ($text, 'Brazil', '/World/Brazil/');
    linkifyWord ($text, 'Arctic', '/World/Arctic/');
    linkifyWord ($text, 'Europe', '/World/Europe/');

    linkifyFinish ($text);

    echo $text;

    ?>
    Note: Replace longer words first, then shorter ones.
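
    If the words come from a database table rather than a hand-written list, the longest-first order can be produced by sorting. A minimal sketch, assuming a hypothetical word => URL array named $links (the real data would come from gw_geog):

    ```php
    <?php
    // Hypothetical word => link map, deliberately listed out of order.
    $links = array(
        'Arctic'       => '/World/Arctic/',
        'Europe'       => '/World/Europe/',
        'Arctic Ocean' => '/World/Arctic_Ocean/',
    );

    // Sort by key length, longest first, so "Arctic Ocean" is handled before "Arctic".
    uksort($links, function ($a, $b) {
        return strlen($b) - strlen($a);
    });

    $order = array_keys($links);
    print_r($order);

    // The sorted map could then drive the functions from the script above:
    // foreach ($links as $word => $url) { linkifyWord($text, $word, $url); }
    ```

    Words of equal length can come out in either order depending on the PHP version, which does not matter here because neither can contain the other unless they are identical.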

  7. #7
    Non-Member
    Join Date
    Jan 2004
    Location
    Seattle
    Posts
    4,328
    Wow, thanks. I'll have to play with that for a while to adapt it to my database, but it looks impressive. I guess I should have searched Google for "Linkify," eh?

  8. #8
    Non-Member
    Join Date
    Jan 2004
    Location
    Seattle
    Posts
    4,328
    Never mind; I made a dumb mistake.
    Last edited by geosite; Jun 16, 2007 at 21:36.

  9. #9
    Non-Member
    Join Date
    Jan 2004
    Location
    Seattle
    Posts
    4,328
    I edited the last two posts after I solved the problems. Your script's amazingly easy to use.
    Last edited by geosite; Jul 11, 2007 at 13:14.

