  1. #1
    jackli
    Join Date
    Sep 2005
    Posts
    436

    PHP URL Shortening

    What's the best way to shorten long URLs that look like form.php?state=ca&country=usa&area=bay_area ...
    into
    form.php/usa/ca/bay_area
    or
    form/usa/ca/bay_area
    ?

    I've tried to devise a method that doesn't require any packages, but it involves dynamically creating directories with a script that turns all the GET parameters into folders. I recently learned that there's a limit on the number of directories and files allowed in a *nix directory, so this method doesn't scale well enough...

  2. #2
    clamcrusher
    Join Date
    Mar 2006
    Posts
    6,132
    tons of directories is the only solution i know of if you don't want to use a url rewriting solution like mod_rewrite. well, either that or make your 404 error page handle it, which i don't think is a good idea.

  3. #3
    jackli
    do you know how mod_rewrite works? does it also create directories? if so, it wouldn't be suitable for a large site either, given the limit on inodes/directories in *nix.

  4. #4
    kromey
    Join Date
    Sep 2006
    Location
    Fairbanks, AK
    Posts
    1,621
    No, it does not create directories. It internally remaps URLs to different resources. It's often used to take a long string of directories (e.g. /form/usa/ca/bay_area) and remap it to a different resource with GET parameters (e.g. form.php?state=ca&country=usa&area=bay_area). Google "mod_rewrite tutorials" and I'm sure you'll have it up and running in no time.
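
    A minimal sketch of that kind of remapping, for illustration only. It assumes the .htaccess file sits in the same directory as form.php and that the path segments come in country/state/area order, as in the example URL; neither detail is confirmed in this thread.

    ```apache
    # Sketch only: assumes this .htaccess lives next to form.php and that the
    # segments arrive in country/state/area order.
    RewriteEngine On

    # /form/usa/ca/bay_area -> form.php?country=usa&state=ca&area=bay_area
    # No directories are created; the request is remapped internally.
    RewriteRule ^form/([^/]+)/([^/]+)/([^/]+)/?$ form.php?country=$1&state=$2&area=$3 [L,QSA]
    ```

    If content negotiation (MultiViews) is enabled, it may need to be switched off so that /form/... reaches the rule instead of being matched to form.php directly.
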
    PHP questions? RTFM
    MySQL questions? RTFM

  5. #5
    jackli
    good news! it turns out this mod_rewrite thing is installed with my cpanel!

    i got a basic implementation to work, but ideally, i'd like to make these "virtual directories" dynamic.

    this would mean using php to dynamically write the .htaccess file.

    are there any security issues, here? precautions i should take? would it be safe for all users to access the same .htaccess file (in same parent directory), or should each user have her own physical directory and corresponding .htaccess file?

  6. #6
    clamcrusher
    it's not the best idea to make your .htaccess files writable by php, since that will probably make them writable by all users on the server.

    instead, consider either using some type of pattern/prefix for the urls you want rewritten, or using some type of catch-all with exclusions.

    exclusions would be something like
    "rewrite all urls except a few select patterns, and/or some specifically named urls/dirs"
    or
    "rewrite all urls that do not actually map to a real, existing file or directory, or don't have a file extension etc..."

    or some combination of the above components.

    mod_rewrite is regex aware so it is extremely powerful.


    directives set in a .htaccess file cascade to the directories underneath it.
    a pretty powerful and simple system can be achieved by rewriting all urls by default, and then writing a few exclusions to match certain directories you don't want rewriting to occur in. include a few other exclusions for stuff in your webroot like your robots.txt, favicon.ico etc...

    once the request has been rewritten to your php script, then you can use php to decide what to do from there. for example, query the db to see if this "page" really exists.
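
    As a rough sketch of the catch-all-with-exclusions idea, something like the following could go in the webroot .htaccess. index.php is a hypothetical script name and images/ is a made-up example directory; neither comes from this thread.

    ```apache
    # Sketch only: index.php and images/ are hypothetical names.
    RewriteEngine On

    # Exclusions: leave these specifically named files and directories alone.
    RewriteRule ^(robots\.txt|favicon\.ico)$ - [L]
    RewriteRule ^images/ - [L]

    # Exclusion: skip anything that maps to a real, existing file or directory.
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d

    # Catch-all: hand every remaining request to the php script.
    RewriteRule ^ index.php [L]
    ```

    Inside the script, the originally requested path is typically available in $_SERVER['REQUEST_URI'], which can then be checked against the db.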

  7. #7
    jackli
    ok... is there a way to remap domain.com/folder/subfolder/
    with a slash at the end
    to script.php?i=folder&j=subfolder

    ... or is this how you tell whether a site is using real directories or mod_rewrite remapping?

  8. #8
    clamcrusher
    Code:
    RewriteEngine on
    # map /folder/subfolder/ (trailing slash required) to script.php?i=folder&j=subfolder
    RewriteRule ^([^/]+)/([^/]+)/$ script.php?i=$1&j=$2 [L,QSA]
    couldn't really understand the question at the end of your post.

  9. #9
    jackli
    sorry. suppose you see a url in your browser on a site you don't have server-side access to. is there a way to tell whether a directory you see is being rerouted by mod_rewrite or if it's an actual physical directory?

    also, is there a way for this to process spaces? using ([A-z0-9\s]) doesn't work...
    example of why? how else would you process a query like script.php?name=Abe Lincoln?

  10. #10
    php_daemon
    Join Date
    Mar 2006
    Posts
    5,284
    Quote Originally Posted by jackli View Post
    also, is there a way for this to process spaces? using ([A-z0-9\s]) doesn't work...
    example of why? how else would you process a query like script.php?name=Abe Lincoln?
    A space is not a legal url character, so it gets encoded as %20. Use + in your url instead.
    Saul

  11. #11
    clamcrusher
    someone could make an educated guess that it is being rewritten, but they couldn't be sure.

    \s is common to pcre regex, i believe mod_rewrite uses posix regex.
    try an actual space character, i think that should work.

  12. #12
    jackli
    is there a way to encode the %20 into the .htaccess?

    [A-z%20]{1,75} doesn't work

  13. #13
    php_daemon
    Not really sure, but try [A-z+]{1,75}
    Saul

  14. #14
    jackli
    no, php_daemon, that wouldn't really help unless i have my url encoded with +'s

    my url is encoded with %20 (the name is also stored in the mysql db as "Abe Lincoln" -- with a space between first and last names)

    is there a way to verify that the name is alphabetical and contains one space encoded as %20?

  15. #15
    php_daemon
    Well, no, not sure why the one with %20 doesn't work. Can you try encoding your urls with +?
    Saul

  16. #16
    jackli
    well, i'd initially used physical directories like /users/Abe%20Lincoln

    these URLs are already in use, and I don't want to invalidate them unless there is a security issue with using %20 instead of +

    mod_rewrite just redirects these and the new virtual directories to the script, which i can easily modify to accept +, but again, i would like the old directory infrastructure ---- with the %20's ---- to work.

    question re-posed for late-comers to the thread: is there a way to make mod_rewrite allow for url-encoded spaces and alphabetical characters, such as "Abe Lincoln"?

  17. #17
    clamcrusher
    Code:
    [A-z]+[\ ][A-z]+
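
    For context, a hedged sketch of how a pattern like this might sit inside a full rule. The users/ prefix and script.php are taken from examples elsewhere in this thread; the [B] flag and the exact pattern are assumptions, not something posted here.

    ```apache
    # Sketch only: users/ and script.php follow this thread's examples; the
    # exact pattern and flags are assumptions.
    RewriteEngine On

    # The path is already URL-decoded when the rule runs, so %20 shows up as a
    # literal space. It is backslash-escaped because unescaped whitespace
    # separates the directive's arguments.
    # [A-Za-z] is used instead of [A-z], which also matches [ \ ] ^ _ and `.
    # B re-escapes the captured text before it is placed in the query string.
    RewriteRule ^users/([A-Za-z]+\ [A-Za-z]+)/?$ script.php?name=$1 [B,L,QSA]
    ```

    Without B, some recent Apache releases may refuse a rewrite whose query string ends up containing a raw space.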

  18. #18
    jackli
    ah ha! it works like this. ^([A-z/ ])$ (with a space after the slash)

    and it also works to truncate excess spaces for urls that look like script.php?name=Abe%20%20%20Lincoln ... with any number of spaces between Abe and Lincoln

    should have taken clam's initial suggestion.

    now, one last question before i'm totally completely satisfied with this :

    is there a security reason why i might want to use + instead of %20 in the encoded url?

  19. #19
    jackli
    oops. i posted the message above before i got to read clam's latest reply... thanks clam!

    also, it looks like without explicitly checking for where the space in the URL is, this remapping can work for all sorts of typo-ed urls... because it truncates multiple spaces to one, and trims, too!

    like /users/%20%20Abe%20%20Lincoln%20%20 with spaces all over

  20. #20
    php_daemon
    No security issues.

    Though I've got a question: when is the request url decoded?
    Saul

  21. #21
    jackli
    it's never decoded. the mysql table's name column expects names with spaces between first and last. (it is varchar, and existing data is in such form). the name is accepted with the space in mysql fetch.

    ... tangent security issue, here?

  22. #22
    php_daemon
    No, I'm just wondering why the space works in mod_rewrite while %20 doesn't. That goes against my belief that the urls are decoded by php rather than apache.
    Saul

  23. #23
    clamcrusher
    the webserver urldecodes
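
    Since the decoding happens before the rule pattern is matched, a rule that wants to check for a literal %20 would have to look at the raw request line instead. A hedged sketch of that idea, assuming the .htaccess is in the webroot and reusing the users/ prefix and script.php from this thread as placeholders:

    ```apache
    # Sketch only: assumes a webroot .htaccess; users/ and script.php are
    # placeholders borrowed from this thread's examples.
    RewriteEngine On

    # %{THE_REQUEST} is the raw request line, for example
    # "GET /users/Abe%20Lincoln HTTP/1.1". It is not decoded, so the %20 can
    # be tested for literally here.
    RewriteCond %{THE_REQUEST} ^[A-Z]+\ /users/[A-Za-z]+%20[A-Za-z]+/?(\?|\ )

    # The RewriteRule pattern sees the decoded path, "users/Abe Lincoln".
    # The two name parts are rejoined with +, which php reads as a space in a
    # query string.
    RewriteRule ^users/([A-Za-z]+)\ ([A-Za-z]+)/?$ script.php?name=$1+$2 [L,QSA]
    ```

    With both lines in place, only requests that arrive with exactly one %20 between two alphabetical parts get rewritten.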

  24. #24
    php_daemon
    Quote Originally Posted by clamcrusher View Post
    the webserver urldecodes
    That's what I wanted to hear.
    Saul

