  1. #1
    SitePoint Guru team1504's Avatar
    Join Date
    May 2010
    Location
    Okemos, Michigan, USA
    Posts
    732
    Mentioned
    1 Post(s)
    Tagged
    0 Thread(s)

    Help with PHP search-engine. Says there are no results?

    Hello,
    The linked .zip file contains a search engine written in PHP, along with its assets.

    It worked fine until I cleared the files in the search_cache folder and added new URLs to be searched in config.php.

    Now it says that there are no results for the word 'hello' when searched, even though all of the URLs to be searched in config.php have the word hello in them.

    Could anyone please help me figure out why, all of a sudden, I am getting no results?

    Here are the files.

    Or, here is just the code that powers the search engine (although I don't think anything is wrong with it, as I didn't change it and it worked fine).

  2. #2
    SitePoint Wizard silver trophybronze trophy Cups's Avatar
    Join Date
    Oct 2006
    Location
    France, deep rural.
    Posts
    6,869
    Mentioned
    17 Post(s)
    Tagged
    1 Thread(s)
    "It worked fine until I cleared the files in the search_cache and added new URLs to be searched in config.php"

    So what does your new config file now contain? Post the code. Wrap it in tags like this [ php ] [ /php ] (but without the spaces shown).

  3. #3
    SitePoint Guru team1504's Avatar
    Thank you for replying, @Cups. It contains the following:

    PHP Code:
    <?php


    /*********************************************/
    /**                                         **/
    /**        USER CONFIGURATION START         **/
    /**                                         **/
    /*********************************************/

    $GLOBALS['_SEARCH_HTML_WEBSITES'] = array(
        "http://www.the-irf.com/hello/hello1.html",
        "http://www.the-irf.com/hello/hello2.html",
        "http://www.the-irf.com/hello/hello3.html",
        "http://www.the-irf.com/hello/hello4.html",
        "http://www.the-irf.com/hello/hello5.html",
        "http://www.the-irf.com/hello/hello6.html",
        "http://www.the-irf.com/hello/hello7.html",
        "http://www.the-irf.com/hello/hello8.html",
    );
    $GLOBALS['_SEARCH_HTML_DEPTH'] = 3;
    $GLOBALS['_SEARCH_CACHE_LENGTH'] = 1;
    $GLOBALS['_SEARCH_ALL_IGNORE'] = array(
        "#~$#",
        "#/\\.#",
        "#/\\.ht#",
        "#private#i",
        "#phpsearch_files#i",
        "#search\\.php#i",
    );




    /*********************************************/
    /**                                         **/
    /**        USER CONFIGURATION END           **/
    /**                                         **/
    /*********************************************/








    // advanced stuff below:

    // any files found matching these links will be searched as well. 
    // if you have PDF support the pdf contents will be searched as well :) 
    $GLOBALS['_SEARCH_FILES_INCLUDE'] = array(
        "#\.jpg$#i",
        "#\.jpeg$#i",
        "#\.gif$#i",
        "#\.png$#i",
        "#\.exe$#i",
        "#\.pdf$#i",
        "#\.zip$#i",
        "#\.doc$#i",
        "#\.docx$#i",
        "#\.avi$#i",
        "#\.mov$#i",
        "#\.mpg$#i",
        "#\.mpeg$#i",
    );
    // any content type matching these regex's will be indexed and searched
    $GLOBALS['_SEARCH_HTML_INCLUDE'] = array(
        "#text/html#i",
    );

    // any content type matching these regex's will be downloaded and treated as a pdf, converted to text, and indexed
    // (if supported by server software)
    $GLOBALS['_SEARCH_PDF_INCLUDE'] = array(
        "#application/pdf#i",
    );

    // nfi why i've used globals instead of define ... meh. same dif.
    // min number of characters in search string.
    $GLOBALS['_SEARCH_MIN_CHARS']        = 4;
    $GLOBALS['_SEARCH_SUMMARY_LENGTH']   = 110;
    $GLOBALS['_SEARCH_PER_PAGE']         = 10;
    $GLOBALS['_SEARCH_SHOW_BOX']         = true;
    $GLOBALS['_SEARCH_SHOW_STYLESHEET']  = true; // use the inbuilt stylesheet or not? ie: phpsearch.css
    $GLOBALS['_SEARCH_COMBINE']          = true; // set this to false if results take long time
    $GLOBALS['_SEARCH_FILES_FOLDER']     = "phpsearch_files/"; // end it in a slash. path from search.php has to be writable
    $GLOBALS['_SEARCH_CACHE_FOLDER']     = $GLOBALS['_SEARCH_FILES_FOLDER']."search_cache/"; // end it in a slash. path from search.php has to be writable
    $GLOBALS['_SEARCH_FILE_HEADER']      = $GLOBALS['_SEARCH_FILES_FOLDER'].'header.php'; // file that contains anything before search results
    $GLOBALS['_SEARCH_FILE_FOOTER']      = $GLOBALS['_SEARCH_FILES_FOLDER'].'footer.php'; // file that contains anything after search results
    define("_SEARCH_DEBUG",false);
    define("_SEARCH_DEMO",false);
    // advanced: comma separated list of callback functions to pull search results from. 
    // Add your own if you would like to (eg:) pull results from a database as well as the default. 
    // See search_sample_callback() for example usage.
    $GLOBALS['_SEARCH_CALLBACKS'] = "search_sample_callback";

    function search_sample_callback($keyword, $search_depth) {
        if (!_SEARCH_DEMO) return false;
        // only search on the first iteration, not after splitting keyword up further.
        if ($search_depth > 0) return false;
        return array(
            "something" => array(
                "importance" => 0.7, // 1 will push result to the top.
                "title" => "Example PHP callback result", // main title to display, keyword will be highlighted automatically.
                "summary" => "For advanced PHP users, you can inject results in with a callback function.", // summary text to display under search result, keyword highlighted in this.
                "url" => "http://google.com/search?q=".urlencode($keyword), // Url to display
            ),
        );
    }
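
One thing that can be checked from the posted config alone is whether any URL would be skipped by the `_SEARCH_ALL_IGNORE` patterns. The sketch below assumes the search script applies each pattern to the URL with `preg_match`, which the posted code does not confirm. `url_is_ignored` is a hypothetical helper, not part of the search engine, and the localhost URL is an invented example of a page copied into the script's own folder.

```php
<?php
// Hypothetical helper: returns true if any ignore pattern matches the URL.
// Assumption: the search script applies these patterns with preg_match().
function url_is_ignored(string $url, array $ignorePatterns): bool
{
    foreach ($ignorePatterns as $pattern) {
        if (preg_match($pattern, $url) === 1) {
            return true;
        }
    }
    return false;
}

// Patterns copied from the posted config.php (single-quoted here, so a
// single backslash is enough to escape the dot).
$ignore = array(
    '#~$#',
    '#/\.#',
    '#/\.ht#',
    '#private#i',
    '#phpsearch_files#i',
    '#search\.php#i',
);

$urls = array(
    'http://www.the-irf.com/hello/hello1.html',
    // Invented example: a local copy stored under phpsearch_files/.
    'http://localhost/phpsearch_files/hello1.html',
);

foreach ($urls as $url) {
    echo $url, ' => ', url_is_ignored($url, $ignore) ? 'ignored' : 'not ignored', "\n";
}
```

Under that assumption, none of the eight configured URLs match an ignore pattern, but a sample page saved inside `phpsearch_files/` would.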

  4. #4
    SitePoint Wizard silver trophybronze trophy Cups's Avatar
    All looks OK; thanks for using the [ PHP ] tags.

    I did spot this though:

    PHP Code:
    define("_SEARCH_DEBUG",false); 
    Try turning that to true.

  5. #5
    SitePoint Guru team1504's Avatar
    No problem.

    Okay, well now I get the following, which looks like errors:


    - try2: http://www.the-irf.com/hello/hello1.html
    Crawling URL at depth 0 'http://www.the-irf.com/hello/hello1.html'
    for url (1) 'http://www.the-irf.com/hello/hello1.html' links: 0
    - try2: http://www.the-irf.com/hello/hello2.html
    Crawling URL at depth 0 'http://www.the-irf.com/hello/hello2.html'
    for url (1) 'http://www.the-irf.com/hello/hello2.html' links: 0
    - try2: http://www.the-irf.com/hello/hello3.html
    Crawling URL at depth 0 'http://www.the-irf.com/hello/hello3.html'
    for url (1) 'http://www.the-irf.com/hello/hello3.html' links: 0
    - try2: http://www.the-irf.com/hello/hello4.html
    Crawling URL at depth 0 'http://www.the-irf.com/hello/hello4.html'
    for url (1) 'http://www.the-irf.com/hello/hello4.html' links: 0
    - try2: http://www.the-irf.com/hello/hello5.html
    Crawling URL at depth 0 'http://www.the-irf.com/hello/hello5.html'
    for url (1) 'http://www.the-irf.com/hello/hello5.html' links: 0
    - try2: http://www.the-irf.com/hello/hello6.html
    Crawling URL at depth 0 'http://www.the-irf.com/hello/hello6.html'
    for url (1) 'http://www.the-irf.com/hello/hello6.html' links: 0
    - try2: http://www.the-irf.com/hello/hello7.html
    Crawling URL at depth 0 'http://www.the-irf.com/hello/hello7.html'
    for url (1) 'http://www.the-irf.com/hello/hello7.html' links: 0
    - try2: http://www.the-irf.com/hello/hello8.html
    Crawling URL at depth 0 'http://www.the-irf.com/hello/hello8.html'
    for url (1) 'http://www.the-irf.com/hello/hello8.html' links: 0
    Array ( [hello] => 1 )

  6. #6
    SitePoint Wizard silver trophybronze trophy Cups's Avatar
    Are you sure you are connected to the internet, that those URLs exist, and that there are some test links on them that match the patterns?

    I'm sorry, but I can only tell you about things which look obvious to me, having never used that software.
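
A quick way to rule out the pages themselves is to test whether the visible text of a page really contains the keyword. The sketch below is independent of the search engine; `page_contains_keyword` is a hypothetical helper, and the two inline sample pages are invented so that it runs without network access.

```php
<?php
// Hypothetical helper: true if the visible text of $html contains $keyword,
// compared case-insensitively. Markup is removed with strip_tags() first,
// so a keyword that appears only inside a tag does not count.
function page_contains_keyword(string $html, string $keyword): bool
{
    $text = html_entity_decode(strip_tags($html));
    return stripos($text, $keyword) !== false;
}

// Invented sample pages.
$withKeyword = '<html><body><p>Hello, world.</p></body></html>';
$onlyInTag   = '<html><body><p class="hello">Goodbye.</p></body></html>';

var_dump(page_contains_keyword($withKeyword, 'hello')); // bool(true)
var_dump(page_contains_keyword($onlyInTag, 'hello'));   // bool(false)
```

To run the same check against a live page, the HTML could be fetched with `file_get_contents($url)`, which returns `false` on failure and needs `allow_url_fopen` enabled for remote URLs.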

  7. #7
    SitePoint Guru team1504's Avatar
    Yes, I believe I was connected to the internet when this happened.
    This is all happening on a test server through MAMP PRO; I haven't tried it on an actual server yet.

    The URLs do exist, but there are no links on any of those pages. They are just sample pages with text. No anchors at all. Is that what is tripping this up?
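
The `links: 0` lines in the debug output are at least consistent with pages that have no anchors. The sketch below counts anchors with PHP's `DOMDocument`; `count_links` is a hypothetical helper and the sample markup is invented. Whether the search script needs links in order to index a page is not something the posted code settles.

```php
<?php
// Hypothetical helper: counts <a> elements that carry an href attribute.
function count_links(string $html): int
{
    // Collect libxml warnings internally instead of printing them,
    // since real-world HTML is often not well-formed.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $count = 0;
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        if ($anchor->hasAttribute('href')) {
            $count++;
        }
    }
    return $count;
}

// Invented sample pages: one text-only, one with a link and a named anchor.
$textOnly  = '<html><body><p>hello</p></body></html>';
$withLinks = '<html><body><a href="hello2.html">next</a> <a name="top"></a></body></html>';

echo count_links($textOnly), "\n";  // 0
echo count_links($withLinks), "\n"; // 1
```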

  8. #8
    SitePoint Guru team1504's Avatar
    Okay, someone suggested that possibly the script can only search files on the server it is on. So I downloaded all the sample pages and put them on the same test server.

    Now I'm not getting those messages. But it still says no results, and I am getting the last part of the message:
    Array ( [hello] => 1 )

    hello being the keyword I searched for. It looks like that remaining part, which is printed when debugging is turned on, is supposed to be part of something but is printed instead. Do you think I am correct?
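
Since the problem started after the cache was cleared, and the config comments say the cache path has to be writable, one more thing worth ruling out is the folder itself. The sketch below is not part of the search engine and is not a confirmed cause; `folder_status` is a hypothetical helper, and the paths are the ones from the posted config.php, resolved relative to wherever the script is run.

```php
<?php
// Hypothetical helper: reports whether a folder exists and is writable.
function folder_status(string $path): string
{
    if (!is_dir($path)) {
        return 'missing';
    }
    return is_writable($path) ? 'ok' : 'not writable';
}

// Paths taken from the posted config.php (relative to search.php).
$filesFolder = 'phpsearch_files/';
$cacheFolder = $filesFolder . 'search_cache/';

echo $cacheFolder, ': ', folder_status($cacheFolder), "\n";
```

Run from the same directory as search.php, anything other than `ok` would mean the script may not be able to rebuild its cache.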

  9. #9
    SitePoint Wizard silver trophybronze trophy Cups's Avatar
    TBH the flag is called `debug` but it just looks as though it has gone into verbose mode, possibly because there are no errors.

  10. #10
    SitePoint Guru team1504's Avatar
    Yeah, that is what I was thinking too. Okay, thank you very much for your help thus far!

