  1. #1
    SitePoint Zealot darkwarrior
    Join Date
    Dec 2010
    Posts
    171
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    How to sanitize a title (strip illegal characters entirely)?

    I was using the function below to clean post titles to create a GUID link and unique name, but while I thought it was working, it doesn't seem to be doing anything useful. For instance, this:

    "l'sm,-$~~*&dfs$%*£!"!/"

    Is meant to be the sanitized title from "l'sm, $~~*&DFS$%*£!"!/".

    I don't quite understand the preg_replace function or how the patterns are derived. How would I go about eliminating these characters? I'm guessing that using a name like that in a URL will not end well.


    PHP Code:
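    //SANITIZE A TITLE FOR USE
    function cleanTitle($title, $optional = '', $type = '') {
        // Strip HTML tags and surrounding whitespace
        $title = strip_tags(trim($title));
        // Remove HTML entities such as &amp;
        $title = preg_replace('/&.+?;/', '', $title);

        // Fall back to $optional if nothing is left
        if ('' === $title || false === $title)
            $title = $optional;

        if ($type === 'cleanname') {
            $title = strtolower($title);
            // Replace each run of whitespace with a hyphen
            $title = preg_replace('/\s+/', '-', $title);
        }
        return $title;
    }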


  2. #2
    SitePoint Wizard Cups
    Join Date
    Oct 2006
    Location
    France, deep rural.
    Posts
    6,869
    Mentioned
    17 Post(s)
    Tagged
    1 Thread(s)
    Not quite sure what you are after, but if you know what you want to allow in (a white-list) then you can remove everything which is not on that list:
    PHP Code:
    // remove everything except digits, letters (upper and lowercase),
    // spaces and the chars . , /
    $input = '0123 Big Street bc < &lt; ?.,/#';
    $output = preg_replace('#[^0-9a-z .,/]#i', '', $input);
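For reference, a minimal sketch of what that whitelist yields on its own sample input. The second step reuses the entity pattern from cleanTitle() in the first post; chaining the two is an assumption made here for illustration, not something proposed in the thread, and the variable names are hypothetical.

```php
<?php
$input = '0123 Big Street bc < &lt; ?.,/#';

// Whitelist only: the "&" and ";" of "&lt;" are removed, but "lt" survives.
$whitelistOnly = preg_replace('#[^0-9a-z .,/]#i', '', $input);
echo $whitelistOnly, "\n"; // 0123 Big Street bc  lt .,/

// Strip entities first (pattern taken from cleanTitle()), then whitelist.
$noEntities    = preg_replace('/&.+?;/', '', $input);
$entitiesFirst = preg_replace('#[^0-9a-z .,/]#i', '', $noEntities);
echo $entitiesFirst, "\n"; // "lt" is gone; only spaces remain before ".,/"
```

The order matters: once the whitelist has removed "&" and ";", the leftover "lt" can no longer be recognised as part of an entity.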

  3. #3
    SitePoint Zealot darkwarrior
    Largely I want to remove anything that would not be acceptable in a URL as this particular function should be creating a unique text name for each post that could be used in a link such as http://www.sitepoit.com/post.php?tit...o-i-am-a-title

    In which case I believe your function would be fine if it removes everything but numbers and letters. Though looking at the URL here, it leaves in () also. Is there a good learning resource for preg_replace besides the PHP Manual? It doesn't really explain how to actually write the type of filter you want. I look at your code and I can understand what it is doing up to 'z', and obviously the '' is the replacement string, but I do not follow what the ' .,/]#i' is achieving.

    Also, thank you for the prompt help.

  4. #4
    SitePoint Wizard Cups
    .,/ - permit those chars
    ] - end of character class definition
    # - the closing delimiter, as in "#[load of rules]#"; I could use almost any character, you usually see "/[load of rules]/"
    i - the modifier meaning ignore case; there are others, e.g. s = make the dot match newlines too, etc.
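The breakdown above can be checked with a small sketch. The sample string and the variable names are hypothetical; the patterns are the one from the earlier post plus two variations on it.

```php
<?php
$input = 'Flat 4B, 12/14 High St. (rear)'; // hypothetical sample address

// '#' as delimiter: the '/' inside the character class needs no escaping.
$hashDelim = preg_replace('#[^0-9a-z .,/]#i', '', $input);

// '/' as delimiter: same rule, but the '/' in the class must be escaped.
$slashDelim = preg_replace('/[^0-9a-z .,\/]/i', '', $input);

// No "i" modifier: uppercase letters are no longer on the whitelist.
$caseSensitive = preg_replace('#[^0-9a-z .,/]#', '', $input);

echo $hashDelim, "\n";     // Flat 4B, 12/14 High St. rear
echo $slashDelim, "\n";    // identical to the line above
echo $caseSensitive, "\n"; // lat 4, 12/14 igh t. rear
```

Picking a delimiter that does not occur in the pattern, as '#' does here, avoids the backslash escaping.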

    Going back to the nub of your problem, is it the case that you want to take a title such as

    "Roberts my mothers' brother (in law)"

    and turn it into a url-friendly string:

    "roberts-my-mothers-brother-in-law"

    Or are you trying to strip bad chars from an entire URL (http://www. etc etc)?
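A minimal sketch of the first option, turning a title into a URL-friendly string. The function name slugify() and its $fallback parameter are hypothetical; the entity pattern is borrowed from cleanTitle() in the first post, and dropping apostrophes instead of hyphenating them is an assumption chosen to match the example output above.

```php
<?php
// Hypothetical helper; neither the name nor the signature comes from the thread.
function slugify(string $title, string $fallback = ''): string
{
    $slug = strtolower(trim(strip_tags($title)));
    // Remove HTML entities such as &amp; (same pattern as cleanTitle()).
    $slug = preg_replace('/&.+?;/', '', $slug);
    // Drop apostrophes so "mothers'" does not leave a stray hyphen mid-slug.
    $slug = str_replace("'", '', $slug);
    // Replace every run of characters outside a-z and 0-9 with one hyphen.
    $slug = preg_replace('/[^a-z0-9]+/', '-', $slug);
    // Remove hyphens left at either end.
    $slug = trim($slug, '-');

    return $slug === '' ? $fallback : $slug;
}

echo slugify("Roberts my mothers' brother (in law)"), "\n";
// roberts-my-mothers-brother-in-law

echo slugify('l\'sm, $~~*&DFS$%*£!"!/'), "\n";
// lsm-dfs
```

This sketch treats anything outside a-z and 0-9 as a separator, so accented letters are dropped rather than transliterated, and it does nothing to guarantee that two different titles produce different slugs.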

  5. #5
    SitePoint Zealot darkwarrior
    The first one, create a URL friendly string.

    Thanks for the explanation about preg_replace.

  6. #6
    SitePoint Wizard Cups
    In that case, what you are creating is often termed a "slug".

    Search this forum for the words "slug" or "slugify" to find quite a bit of discussion on this matter: not only how to create them, but how to store and use them, especially in tandem with Apache's mod_rewrite.

    Come back if you cannot find the discussions or have any questions.

  7. #7
    SitePoint Zealot darkwarrior
    Ah OK, thank you for your help, Cups. I will go investigate slugs.

  8. #8
    Non-Member bronze trophy
    Join Date
    Nov 2009
    Location
    Keene, NH
    Posts
    3,760
    Mentioned
    23 Post(s)
    Tagged
    0 Thread(s)

