  1. #1
    Wanna-be Apple nut silver trophy M. Johansson's Avatar
    Join Date
    Sep 2000
    Location
    Halmstad, Sweden
    Posts
    7,400
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)
    Hello everyone - I'm thinking about doing an evil script.

    This is what I want to do:
    I have a list of high-bid keywords at search123. I want to extract the keywords from this list and put them in my MySQL database. The keyword list is here. It looks a bit weird, which is why I need your help to filter it.

    Next, if any of these words are found in the forums and/or articles - they will be replaced with a link to the keyword.

    Mind you - I won't let my forums look like this. Instead, I'm just going to make the links non-underlined, in a color slightly different from the message text color. On mouseover, however, they will turn link-blue and become underlined.

    Can you please guide me in how to design these two scripts?
    Mattias Johansson
    Short, Swedish, Web Developer

    Buttons and Dog Tags with your custom design:
    FatStatement.com

  2. #2
    Grumpy Mole Man Skunk's Avatar
    Join Date
    Jan 2001
    Location
    Lawrence, Kansas
    Posts
    2,066
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    That's a LOT of keywords. I think if you stick them all in a MySQL database it could be a major performance grind. It might be worth storing them in a flat text file instead (I dunno if this will be faster or not, though).

    The way I would do it is this: I'd put all of the keywords and key phrases in a big text file, one phrase per line. Then I'd read the contents of the text file into a PHP array using the file() function. Next I would loop through the array doing a search and replace on the text to be processed: for each term, I would search for it using one of PHP's regular expression functions and, if it exists, replace it with the link to the keyword.

    Shouldn't be too hard to do
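
The steps above can be sketched roughly as follows. This is only an illustration, not code from the thread: the function name, the `$baseUrl` parameter and the example URL are made up, and it uses `preg_replace()` as the regular expression function. One caveat of the simple loop: a later keyword can match inside a link that was inserted for an earlier one.

```php
<?php
// Hypothetical helper following the steps described above.
// Assumes $keywordFile holds one keyword or phrase per line and that
// $baseUrl is a search URL prefix (an assumption for this sketch).
function link_keywords_from_file(string $keywordFile, string $text, string $baseUrl): string
{
    // file() reads the whole file into an array, one element per line.
    $keywords = file($keywordFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($keywords === false) {
        return $text; // file could not be read: leave the text untouched
    }

    foreach ($keywords as $keyword) {
        // preg_quote() escapes regex metacharacters inside the keyword;
        // \b stops "car" from matching inside "carpet"; /i ignores case.
        $pattern = '/\b' . preg_quote($keyword, '/') . '\b/i';

        // Only replace when the term actually occurs in the text.
        if (preg_match($pattern, $text) === 1) {
            // $0 is the matched text, so the original casing is kept.
            $link = '<a href="' . $baseUrl . urlencode($keyword) . '">$0</a>';
            $text = preg_replace($pattern, $link, $text) ?? $text;
        }
    }
    return $text;
}
```

It would be called with the path to the keyword file, the post text, and a URL prefix such as `http://example.com/search?q=`.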

  3. #3
    Grumpy Mole Man Skunk's Avatar
    Incidentally, grabbing all of those keywords and converting them into a text file like the one I described should be extremely easy. Simply save the word file to disk and then write a PHP script that loads everything into an array as I described. Now loop through the array running a regular expression that deletes everything after the tab (\t) on each line. That will clean the file up and leave you with just a list of keywords.

  4. #4
    Wanna-be Apple nut silver trophy M. Johansson's Avatar
    Ah, it is regular expressions that I have to learn.

    /me looks at PHPBuilder.com's tutorial

    /me stares

    /me gets a cup of coffee

    Thank you, Skunk - I really do love you.

  5. #5
    Grumpy Mole Man Skunk's Avatar
    Ooh sorry - forgot about regexps being a right sod (I spent gawd knows how many hours reading a chapter on them in Programming Perl before I figured them out).

    This code is untested but it should work:

    $str = eregi_replace("\t.+", "", $str);

    That should take a line of text and cut off everything after (and including) the first tab.
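
For anyone running this on a current PHP version: `eregi_replace()` was removed in PHP 7.0, so the same idea needs the `preg_` functions. A minimal sketch, assuming the lines come from `file()` and that each line is a keyword, a tab, then other data (the function name and the sample data are made up):

```php
<?php
// Strip everything from the first tab onward on each line,
// leaving only the keyword. $lines is assumed to be what file()
// returns for the saved keyword list.
function strip_after_tab(array $lines): array
{
    $keywords = [];
    foreach ($lines as $line) {
        // /\t.*/s removes the first tab and everything after it; the "s"
        // flag lets "." match the trailing newline that file() keeps.
        $keyword = trim(preg_replace('/\t.*/s', '', $line));
        if ($keyword !== '') {
            $keywords[] = $keyword; // skip blank lines
        }
    }
    return $keywords;
}

// Made-up sample lines in the assumed "keyword<TAB>bid" layout.
$sample = ["web hosting\t1.25\n", "cheap flights\t0.80\n", "\n"];
print_r(strip_after_tab($sample));
```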

  6. #6
    Wanna-be Apple nut silver trophy M. Johansson's Avatar
    Yes, I've just discovered that

    regular expressions == satan

  7. #7
    Database Jedi MattR's Avatar
    Join Date
    Jan 2001
    Location
    buried in the database shell (Washington, DC)
    Posts
    1,107
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    regexps are the devil, yes

  8. #8
    SitePoint Addict
    Join Date
    Dec 2000
    Location
    BOSTON MA
    Posts
    335
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Originally posted by truelight
    regular expressions == satan
    what did you expect when doing an evil script?
    . . . chris

  9. #9
    Wanna-be Apple nut silver trophy M. Johansson's Avatar
    I have been looking into regexp a little more, and I must say that it's complicated to the verge of insanity.

    How can ^ represent the beginning of string, and why does $ represent the end?

    Why does . represent "any character"?

    It would be easier to understand it if somebody actually explained why the darned things were laid out as they were.

    By the way, how do you make a regular expression readable? A block of PHP code is generally very easy to read, but this regular expression, which checks if a number of dollars is properly written...

    ^([0-9]+|[0-9]{1,3}(,[0-9]{3})*)(\.[0-9]{1,2})?$

    ...just makes me scream. Euuuurrrggh
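
On the readability question, one option in PHP's `preg_` functions is the `x` (extended) pattern modifier, which ignores whitespace inside the pattern and allows `#` comments. Here is a sketch of the same dollar-amount pattern laid out that way; the constant and function names are made up for the example:

```php
<?php
// The dollar-amount pattern from above, written with the PCRE "x" flag
// so that each piece can sit on its own line with a comment.
const DOLLAR_PATTERN = '/
    ^                   # start of the string
    (
        [0-9]+          # either plain digits, as in 1500
      |                 # or
        [0-9]{1,3}      # one to three leading digits
        (,[0-9]{3})*    # followed by groups of a comma and three digits
    )
    (\.[0-9]{1,2})?     # optionally a dot and one or two digits
    $                   # end of the string
/x';

// Returns true when $s looks like a properly written dollar amount.
function is_dollar_amount(string $s): bool
{
    return preg_match(DOLLAR_PATTERN, $s) === 1;
}

var_dump(is_dollar_amount('1,500.50'));
var_dump(is_dollar_amount('1,50'));
```

The pattern itself is unchanged; only the layout differs, so it accepts and rejects the same strings as the one-line version.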

  10. #10
    SitePoint Wizard silver trophy Karl's Avatar
    Join Date
    Jul 1999
    Location
    Derbyshire, UK
    Posts
    4,411
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Seeing as I am a bit of a sadist and like to play devil's advocate, I would say that MySQL is easily up to the job of storing all those keywords. I loaded them into Excel and counted 17,062 keywords - MySQL is up to storing that many rows.

    Getting them into MySQL is very easy: you can use Excel to split the data, export it as a CSV file, and then import that into MySQL.

    The best way to do this would be to modify your forum database to add a field that marks whether the post has had links added for keywords. Then, once a day, check for posts that have not had links added and run a script to add links to them.

    That is the only way I can see of doing it. If you try to do it on the fly, you will wait forever for posts and articles to load.

    PHP Code:
    while ( $row = mysql_fetch_assoc($res) ) {
        $keyword = $row['Keywords'];
        $post = eregi_replace( "($keyword)", "<a href='http://www.placetolinkto.com/search.pl?keyword=\\1'>\\1</a>", $post );
    }

    Something like the above code would do the replacement for you.
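
A note for current PHP versions: `mysql_fetch_assoc()` and `eregi_replace()` were both removed in PHP 7.0. Below is a rough sketch of the same replacement using `preg_replace_callback()`, assuming the keywords have already been fetched into an array. The function name is made up, and doing the replacement in a single pass is a swapped-in technique, not what the code above does. The URL is the placeholder from the post.

```php
<?php
// Hypothetical helper: wrap each keyword occurrence in a link, in one pass.
// $keywords is assumed to be a plain array of strings from the database.
function add_keyword_links(string $post, array $keywords): string
{
    if ($keywords === []) {
        return $post;
    }

    // Longest first, so that "web hosting" wins over "web".
    usort($keywords, fn($a, $b) => strlen($b) <=> strlen($a));

    // preg_quote() escapes characters such as "." or "+" inside keywords.
    $alternatives = array_map(fn($k) => preg_quote($k, '/'), $keywords);
    $pattern = '/\b(' . implode('|', $alternatives) . ')\b/i';

    // One pass over the text, so inserted links are never scanned again.
    $result = preg_replace_callback($pattern, function (array $m): string {
        $url = 'http://www.placetolinkto.com/search.pl?keyword=' . urlencode($m[1]);
        return '<a href="' . htmlspecialchars($url) . '">' . $m[1] . '</a>';
    }, $post);

    return $result ?? $post; // fall back to the original text on a regex error
}
```

Two limits worth knowing: the `\b` word boundaries will not match keywords that begin or end with punctuation, and a list of many thousands of keywords may be too large for one pattern, in which case it would have to be split into batches.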
    Karl Austin :: Profile :: KDA Web Services Ltd.
    Business Web Hosting :: Managed Dedicated Hosting
    Call 0800 542 9764 today and ask how we can help your business grow.

  11. #11
    Wanna-be Apple nut silver trophy M. Johansson's Avatar
    I was actually thinking about adding the keywords at the time of posting, rather than doing a cron job.

    Your script is precisely what I was looking for, though. Thanks - you saved me a lot of work finding that one out.

  12. #12
    SitePoint Wizard silver trophy Karl's Avatar
    Adding it at the time of posting is going to take a fair while as well. I thought about that too. The other option is to run:

    exec('php /path/to/phpfile.php postID > /dev/null 2>&1 &');

    which would execute a PHP file that adds the links to the post specified by postID. With the output redirected, it should in theory run the script in the background, so exec() returns to PHP instantly and the post is added without delay.
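
A small sketch of how such a command might be put together safely. According to the PHP manual, `exec()` only returns without waiting if the program's output is redirected, so the sketch adds a redirect to `/dev/null`. The function name is made up, and a Unix-like shell is assumed.

```php
<?php
// Build the shell command for a background worker. The script path and the
// idea of passing the post ID as an argument come from the post above;
// the function name is an assumption for this sketch.
function build_background_command(string $script, int $postId): string
{
    // escapeshellarg() quotes each argument so it cannot break the command.
    // "> /dev/null 2>&1" discards output so exec() does not wait for it;
    // the trailing "&" puts the process in the background.
    return 'php ' . escapeshellarg($script) . ' ' . escapeshellarg((string) $postId)
        . ' > /dev/null 2>&1 &';
}

// To actually launch the worker, pass the command to exec():
// exec(build_background_command('/path/to/phpfile.php', 42));
echo build_background_command('/path/to/phpfile.php', 42), "\n";
```

Inside the worker script, the post ID would then be available as `$argv[1]`.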

  13. #13
    Talk to the /dev/null Theiggsta's Avatar
    Join Date
    Mar 2001
    Location
    Tampa, FL
    Posts
    376
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    str_replace is faster than eregi and isn't satanically slow like eregi or preg, but that's just my opinion.

    *grabs a cup of coffee*

  14. #14
    SitePoint Wizard silver trophy Karl's Avatar
    I'm aware of str_replace being fast, but I was being lazy, so it was easier to use back referencing to place the keyword in the link.
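
To show the trade-off being discussed, here is a rough sketch of the plain string version. It uses `str_ireplace()`, the case-insensitive sibling of `str_replace()`. The function name is made up and the URL is the placeholder used earlier in the thread. No speed figures are claimed here.

```php
<?php
// Hypothetical helper: link one keyword using plain string functions.
function link_keyword_plain(string $post, string $keyword): string
{
    $link = '<a href="http://www.placetolinkto.com/search.pl?keyword='
        . urlencode($keyword) . '">' . $keyword . '</a>';

    // There is no back reference here, so every match is replaced by the
    // keyword as spelled in the list, not as spelled in the post.
    return str_ireplace($keyword, $link, $post);
}

echo link_keyword_plain('Cheap Hosting deals', 'hosting'), "\n";
```

Besides losing the original casing, this version has no notion of word boundaries, so a keyword like "car" would also be linked inside "carpet".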

  15. #15
    Talk to the /dev/null Theiggsta's Avatar
    whatever suits you the best...but I use str and I have my reasons...

