  1. #1
    SitePoint Zealot boognish's Avatar
    Join Date
    Sep 2005
    Location
    Leeds
    Posts
    102
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Regex - match anything between href=" and " that contains ä

    I'm trying to do a regex find and replace on a load of links that contain foreign characters such as ä, replacing each character with its encoded version (%C3%A4 for ä).

    For example, I would want to replace
    <a href="wähle.html">Wähle</a>
    with
    <a href="w%C3%A4hle.html">Wähle</a>

    Can anyone tell me how I do this?

  2. #2
    Grüße aus'm Pott gold trophysilver trophybronze trophy
    Pullo's Avatar
    Join Date
    Jun 2007
    Location
    Germany
    Posts
    5,352
    Mentioned
    179 Post(s)
    Tagged
    9 Thread(s)
    Hi,

    What are you using to do this?
    Dreamweaver? A scripting language like Ruby?

    Something like this maybe:
    Code Ruby:
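    l = '<a href="wähle.html">Wähle</a>'
    # Take the href attribute and percent-encode ä inside it
    h = l.match(/href=".*?"/).to_s.gsub(/ä/, "%C3%A4")
    # Put the encoded attribute back in place of the original
    l = l.sub(/href=".*?"/, h)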

    Bit ugly, but it does the job.
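
    If more than one foreign character is involved, the same idea can probably be generalised. The sketch below is only an illustration: the helper name encode_hrefs is made up, and it assumes the text is UTF-8 and that every href value is wrapped in double quotes. It percent-encodes each non-ASCII character inside every href and leaves the link text alone.

```ruby
# Hypothetical helper (not from the thread): percent-encode every non-ASCII
# character inside double-quoted href values. Assumes UTF-8 input.
def encode_hrefs(html)
  html.gsub(/href="(.*?)"/) do
    value = Regexp.last_match(1)
    # Replace each non-ASCII character with the %XX form of its UTF-8 bytes
    encoded = value.gsub(/[^[:ascii:]]/) do |ch|
      ch.bytes.map { |b| format("%%%02X", b) }.join
    end
    %(href="#{encoded}")
  end
end

puts encode_hrefs('<a href="wähle.html">Wähle</a>')
```

    For the example link this should leave the visible text "Wähle" untouched and only change the attribute value.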

  3. #3
    SitePoint Zealot boognish's Avatar
    Thanks for your reply! I'm using a program called Powergrep, which can search with a regex and replace the matches with whatever I want. In my original example I put '%C3%A4' in the replace field, but what regex can I use in the search field to match anything between href=" and " that contains ä?

  4. #4
    Grüße aus'm Pott gold trophysilver trophybronze trophy
    Pullo's Avatar
    Hi,
    I was just about to download Powergrep to try it out, then I saw it costs 120€. Oops
    So, let me understand: You have a folder full of html files and want to use Powergrep to search through all of these files, line by line, and replace any occurrences of foreign characters within an href attribute with their encoded version. I.e. href="wähle.html" would become href="w%C3%A4hle.html".
    Is that correct?

  5. #5
    SitePoint Zealot boognish's Avatar
    Yes, that's exactly right. Any ideas?

    Powergrep is a great program, I use it a lot, definitely worth the money!

  6. #6
    Grüße aus'm Pott gold trophysilver trophybronze trophy
    Pullo's Avatar
    Hi,

    I downloaded Powergrep (test version) and I've got your answer (I hope).

    In Powergrep:
    • With your directory selected, go to the Action tab
    • Select search and replace
    • Set Search type to Regular expression
    • In Search enter: href="(.*?)"
    • In Replace enter: href="$1"
    • Tick "Extra processing. Perform a search and replace on the replacement text or collect text"
    • In the extra processing Search field enter: ä
    • In the extra processing Replace field enter: %C3%A4
    • Rinse and repeat for each further character you need to encode
    This works for me, but please use the preview function before altering anything.

    I hope this helps you.
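
    For anyone without Powergrep, the same two-stage idea (match each href first, then replace ä only inside the matched text) could be sketched in Ruby roughly like this. The helper name encode_umlauts_in_dir and its dry_run option are my own inventions standing in for the preview step, and the sketch assumes UTF-8 files with a .html extension and double-quoted href values.

```ruby
# Hypothetical helper mirroring the steps above: stage one matches each href
# attribute, stage two replaces ä only inside the matched text.
# With dry_run: true (the default) nothing is written; the method only
# returns the paths of the files that would change.
def encode_umlauts_in_dir(dir, dry_run: true)
  changed = []
  Dir.glob(File.join(dir, "**", "*.html")).sort.each do |path|
    original = File.read(path, encoding: "UTF-8")
    updated = original.gsub(/href="(.*?)"/) { |href| href.gsub("ä", "%C3%A4") }
    next if updated == original

    changed << path
    File.write(path, updated) unless dry_run
  end
  changed
end

# Usage (the directory name is a placeholder):
#   puts encode_umlauts_in_dir("site")            # preview only
#   encode_umlauts_in_dir("site", dry_run: false) # actually rewrite the files
```

    As with the preview function, it is worth checking the dry run output, and keeping a backup, before letting it write anything.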

  7. #7
    SitePoint Zealot boognish's Avatar
    Danke sch%C3%B6%0An!!

    Worked perfectly. Thank you very much for your help, very kind

  8. #8
    Grüße aus'm Pott gold trophysilver trophybronze trophy
    Pullo's Avatar
    Quote: Originally posted by boognish
    Danke sch%C3%B6%0An!!
    Sweet! That made me laugh out loud

