  1. #1
    SitePoint Zealot boognish's Avatar
    Join Date
    Sep 2005
    Location
    Leeds
    Posts
    102
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Regex - match anything between href=" and " that contains ä

    I'm trying to do a regex find-and-replace on a load of links that contain foreign characters such as ä, replacing them with their encoded version, %C3%A4.

    For example, I would want to replace
    <a href="wähle.html">Wähle</a>
    with
    <a href="w%C3%A4hle.html">Wähle</a>

    Can anyone tell me how I do this?

  2. #2
    Grüße aus'm Pott
    Pullo's Avatar
    Join Date
    Jun 2007
    Location
    Germany
    Posts
    5,890
    Mentioned
    211 Post(s)
    Tagged
    12 Thread(s)
    Hi,

    What are you using to do this?
    Dreamweaver? A scripting language like Ruby?

    Something like this maybe:
    Code Ruby:
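    # Sample link containing a non-ASCII character
    l = '<a href="wähle.html">Wähle</a>'
    # Take the href attribute and replace ä inside it only
    h = l.match(/href=".*?"/).to_s.gsub(/ä/, "ae")
    # Put the modified attribute back into the link
    l = l.sub(/href=".*?"/, h)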

    Bit ugly, but it does the job.
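
The snippet above handles one link and one character. As a rough sketch of the percent-encoding the question asks for, the same idea can be applied to every href in a string by passing a block to gsub. The helper name encode_hrefs is hypothetical, and the sketch assumes the input is UTF-8 and that every href value is wrapped in double quotes.

```ruby
# Hypothetical helper (not from the thread): percent-encode every
# non-ASCII character found inside href="..." values.
# Assumes UTF-8 input and double-quoted attribute values.
def encode_hrefs(html)
  html.gsub(/href="(.*?)"/) do
    url = Regexp.last_match(1)
    # Turn each non-ASCII character into its UTF-8 bytes written as %XX.
    encoded = url.gsub(/[^\x00-\x7F]/) do |ch|
      ch.bytes.map { |b| format('%%%02X', b) }.join
    end
    %(href="#{encoded}")
  end
end

puts encode_hrefs('<a href="wähle.html">Wähle</a>')
```

Only the text captured inside the href is rewritten, so the visible link text keeps its original characters.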

  3. #3
    SitePoint Zealot boognish's Avatar
    Thanks for your reply! I'm using a program called powergrep which can search regex statements and replace with what I want. So in my original example in the replace field I put '%C3%A4', but what regex can I use in the search field to match anything between href=" and " that contains ä?

  4. #4
    Pullo's Avatar
    Hi,
    I was just about to download Powergrep to try it out, then I saw it costs 120€. Oops.
    So, let me understand: You have a folder full of HTML files and want to use Powergrep to search through all of these files, line by line, and replace any occurrences of foreign characters within an href attribute with their encoded version. I.e. href="wähle.html" would become href="w%C3%A4hle.html".
    Is that correct?

  5. #5
    SitePoint Zealot boognish's Avatar
    Yes that's exactly right, any ideas?

    Powergrep is a great program, I use it a lot, definitely worth the money!

  6. #6
    Pullo's Avatar
    Hi,

    I downloaded Powergrep (test version) and I've got your answer (I hope).

    In Powergrep:
    • With your directory selected, go to the Action tab
    • Select search and replace
    • Set Search type to Regular expression
    • In Search enter: href="(.*?)"
    • In replace enter: href="$1"
    • Set a tick by "Extra processing. Perform a search and replace on the replacement text or collect text"
    • In extra processing search type "ä"
    • In extra processing replace type "%C3%A4"
    • Rinse and repeat
    This works for me, but please use the preview function before altering anything.

    I hope this helps you.
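
For readers without Powergrep, the two-stage procedure above (find each href, then search and replace only inside the matched text, with a preview before writing) might be sketched in Ruby as follows. The names REPLACEMENTS, rewrite_hrefs and process_folder are hypothetical, the character table is only an example to extend, and the files are assumed to be UTF-8.

```ruby
# Hypothetical mapping (an assumption): extend it with whatever
# characters the links actually contain. This is the "rinse and repeat" step.
REPLACEMENTS = { 'ä' => '%C3%A4', 'ö' => '%C3%B6', 'ü' => '%C3%BC' }.freeze

# Stage 1 finds each href value; stage 2 rewrites only that captured text.
def rewrite_hrefs(html)
  html.gsub(/href="(.*?)"/) do
    inner = Regexp.last_match(1)
    REPLACEMENTS.each { |char, code| inner = inner.gsub(char, code) }
    %(href="#{inner}")
  end
end

# Returns the paths of .html files under dir that would change.
# Files are written only when apply is true, so the default is a preview.
def process_folder(dir, apply: false)
  Dir.glob(File.join(dir, '**', '*.html')).select do |path|
    original = File.read(path, encoding: 'UTF-8')
    updated = rewrite_hrefs(original)
    next false if updated == original

    File.write(path, updated) if apply
    true
  end
end
```

Calling process_folder on a directory without apply: true only lists the affected files, which plays the role of the preview; passing apply: true rewrites them in place, so a backup beforehand is a sensible precaution.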

  7. #7
    SitePoint Zealot boognish's Avatar
    Danke sch%C3%B6%0An!!

    Worked perfectly. Thank you very much for your help, very kind

  8. #8
    Pullo's Avatar
    Originally Posted by boognish:
    Danke sch%C3%B6%0An!!
    Sweet! That made me laugh out loud

