  1. #1
    SitePoint Zealot
    Join Date
    Feb 2002
    Posts
    127

    Still can't get to grips with reg expressions

    Hi All

    I am trying to validate a URL. I want to be able to validate the following URLs:

    http://xxx.xxx/
    http://xxx.xxx.xxx/
    http://xxx.xxx.xxx.xx/

    I have been trying to write a reg exp that reads like this:

    http://xxx. AND (xxx OR xxx.xxx OR xxx.xxx.xx) AND /

    Here is what I have been trying:

    var urlReg = "^http:\\/\\/([a-zA-Z0-9])+\\.{1}([a-zA-Z0-9-]+)|([a-zA-Z0-9]+\\.{1}[a-zA-Z0-9]+)|([a-zA-Z0-9]+\\.{1}[a-zA-Z0-9]+\\.{1}[a-zA-Z0-9]{2})?\\/$";

    Any suggestions on why this won't work?

    Regards, Ben

  2. #2
    Perl/Mason Guru Flawless_koder's Avatar
    Join Date
    Feb 2002
    Location
    Gatwick, UK
    Posts
    1,206
    You can validate all you want....

    At the end of the day, a pattern match is no assurance that the URL actually works (if you understand me), e.g. www.fishingforgoatsinthepacific.com

    (I REALLY hope that URL doesn't work)

    Better still, why not actually validate it server side?

    You could have a single script fetch the URL, using LWP::Simple or something similar, check the response, and return a true/false result.

    That'd work better - wouldn't it ?
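
    As a rough sketch of that idea in JavaScript rather than Perl (so this is not LWP::Simple itself): the helper below asks the server for the URL and reports true or false. The name urlResponds and the injectable fetchImpl parameter are assumptions made for illustration, and it presumes a runtime with a global fetch, such as Node 18+.

```javascript
// Hypothetical helper: resolves to true when the URL answers with a 2xx or
// 3xx status, false otherwise. `fetchImpl` defaults to the global fetch and
// is a parameter only so the logic can be exercised without a network.
async function urlResponds(url, fetchImpl = fetch) {
  try {
    // HEAD asks for the headers only, so no page body is downloaded.
    const response = await fetchImpl(url, { method: "HEAD", redirect: "follow" });
    return response.status >= 200 && response.status < 400;
  } catch (err) {
    // DNS failure, refused connection, malformed URL and so on.
    return false;
  }
}
```

    Some servers refuse HEAD requests, so a real check might need to fall back to GET before deciding the URL is dead.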

    Flawless
    ---=| If you're going to buy a pet - get a Shetland Giraffe |=---

  3. #3
    SitePoint Enthusiast
    Join Date
    Mar 2001
    Location
    northern Maine
    Posts
    52
    This is something I was working on a while ago. It doesn't quite work yet, but it may be something to start with:

    ((?:\w[\w-]*\.)*\w(?:[\w-]{0,66}(?=\.(?:com)|(?:net)|(?:org)$))|(?:[\w-]{2,66}(?=\.info$))|(?:[\w-]{2,21}(?=\.biz$))|(?:[\w-]{0,21}(?!\.(?:com)|(?:net)|(?:org)|(?:biz)|(?:info)$)(?=\.[a-z]{2,6}(?:\.[a-z]{2})?$)))

    The goal of that is to match valid domain names. The individual components work fine, but when jumbled together with |, they don't, which is something I intend to fix...
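
    One likely cause of the trouble with | is precedence: alternation binds more loosely than anything else in a regex, so a lookahead such as (?=\.(?:com)|(?:net)|(?:org)$) reads as "\.com" OR "net" OR "org$" unless the alternatives share a group. The same applies to the pattern in the first post. A minimal JavaScript sketch follows; the name looksLikeSiteUrl and the "two to four labels" reading of the original question are assumptions.

```javascript
// `|` has the lowest precedence in a regex, so without a group each
// alternative takes everything on its side of the bar.
const ungrouped = /^http:\/\/a|b\/$/;     // (^http://a) OR (b/$)
const grouped = /^http:\/\/(?:a|b)\/$/;   // ^http:// then (a OR b) then /$

// Assumed reading of the original question: "http://", then two to four
// dot-separated labels of letters, digits or hyphens, then a closing "/".
const urlReg = /^http:\/\/[a-z0-9-]+(?:\.[a-z0-9-]+){1,3}\/$/i;

// Hypothetical wrapper around the pattern above.
function looksLikeSiteUrl(text) {
  return urlReg.test(text);
}
```

    Here ungrouped accepts "http://a-anything-at-all" because only its first alternative is anchored at the start, while grouped rejects it. With the counted repetition there is no need for alternation at all: looksLikeSiteUrl("http://xxx.xxx.xxx.xx/") is true and looksLikeSiteUrl("http://xxx/") is false.
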
    Jason - Contact Me
    Supermoderator @ CodingForums

  4. #4
    SitePoint Addict been's Avatar
    Join Date
    May 2002
    Location
    Gent, Belgium
    Posts
    284
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Code:
    #([a-z]+?://){1}([a-z0-9\-\.,\?!%\*_\#:;~\\&$@\/=\+]+)#si
    This is a fairly good PCRE for matching a URL (I got it from phpBB); maybe you could change it to
    Code:
    #http://([a-z0-9\-\.,\?!%\*_\#:;~\\&$@\/=\+]+)#si
    But, like Flawless_koder said, that doesn't guarantee that the domain or page the URL points to really exists.
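
    Since the original question is in JavaScript, here is a hedged port of the second pattern. As written it is unanchored, so it finds a URL anywhere inside a longer string; for validating a form field it probably wants ^ and $ around it. The variable names are made up for illustration.

```javascript
// JavaScript port of the character class from the pattern above, with ^ and $
// added so the whole string has to be a URL. The `s` flag is dropped because
// the pattern has no dot outside the character class.
const httpUrlPattern = /^http:\/\/[a-z0-9\-.,?!%*_#:;~\\&$@\/=+]+$/i;

// The same pattern without anchors, for comparison: it searches inside text.
const unanchoredPattern = /http:\/\/[a-z0-9\-.,?!%*_#:;~\\&$@\/=+]+/i;
```

    With these, httpUrlPattern.test("see http://xxx.xxx/ for details") is false while unanchoredPattern accepts the same string. Both are loose about what follows "http://" (even "http://!!!" passes), which suits spotting links in text more than strict validation.
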
    Per
    Everything
    works on a PowerPoint slide

  5. #5
    SitePoint Zealot matiefert's Avatar
    Join Date
    Nov 2001
    Location
    Bay area, California
    Posts
    188
    If you are willing (and allowed by your ISP) to use Rebol, the only expression you need is:

    url? variableName

    And that's it - no regex messes at all.
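
    For comparison only, and not Rebol: JavaScript runtimes that ship the standard URL class (current browsers and Node) allow a similar regex-free check. The function name isHttpUrl is an assumption, and this tests that the string parses as an http URL, not that the site exists.

```javascript
// Hypothetical helper: true when the text parses as an absolute http: URL.
function isHttpUrl(text) {
  try {
    const parsed = new URL(text); // throws a TypeError on unparsable input
    return parsed.protocol === "http:";
  } catch (err) {
    return false;
  }
}
```

    For example, isHttpUrl("http://xxx.xxx/") is true, while isHttpUrl("xxx.xxx") and isHttpUrl("ftp://xxx.xxx/") are both false.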

    More info: http://www.rebol.com/

    enjoy! :-)

    Marj

