  1. #1
    SitePoint Enthusiast Deo's Avatar
    Join Date
    Oct 2003
    Location
    Washington, USA
    Posts
    56

    Strip out international characters...

    Hi,

    I've run across a problem that I haven't really had to deal with before. I'm basically taking dynamic content and filtering out different parts of the content before it is displayed.

    The problem is, there are occasionally international characters in the content that I can't get to be "filtered out".

    Anyone have any idea on how to accomplish this?

    ~Deo

  2. #2
    SitePoint Zealot hvoice's Avatar
    Join Date
    Sep 2003
    Location
    New York
    Posts
    186
    I'm kind of curious how to do this as well. Does anyone have any suggestions?

    Boris

  3. #3
    PHP manual bot bronze trophy Gaheris's Avatar
    Join Date
    Oct 2003
    Location
    Germany
    Posts
    2,195
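    This might not be exactly what you are looking for, but anyway, I found this one in the strtr comments in the PHP Manual at php.net.
    PHP Code:
    // Character-for-character mapping: assumes a single-byte encoding
    // (ISO-8859-1 / Windows-1252), not UTF-8.
    function removeaccents($string) {
        return strtr($string,
            "ŠŒŽšœžŸ¥µÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝßàáâãäåæçèéêëìíîïðñòóôõöøùúûüýÿ",
            "SOZsozYYuAAAAAAACEEEEIIIIDNOOOOOOUUUUYsaaaaaaaceeeeiiiionoooooouuuuyy");
    }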


  4. #4
    SitePoint Addict shad0w's Avatar
    Join Date
    Aug 2003
    Location
    PA
    Posts
    239
    I'm a bit confused about what exactly you're trying to do, but what Gaheris posted should work for converting all accented or foreign characters to normal characters.
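
If the content is UTF-8 rather than a single-byte charset, a character-for-character strtr() call is not reliable, because each accented character takes more than one byte. A minimal sketch of the same idea using the array form of strtr() might look like the following. The function name `remove_accents_utf8` and the short mapping table are assumptions for illustration, not something from the PHP Manual comment.

```php
<?php
// Hypothetical helper, not from the thread: maps a few accented UTF-8
// characters to plain ASCII. The table is illustrative, not exhaustive.
function remove_accents_utf8(string $text): string
{
    $map = [
        'À' => 'A', 'É' => 'E', 'Ñ' => 'N', 'Ö' => 'O', 'Ü' => 'U',
        'à' => 'a', 'é' => 'e', 'ñ' => 'n', 'ö' => 'o', 'ü' => 'u',
        'ç' => 'c', 'ß' => 'ss',
    ];
    // With an array argument, strtr() replaces whole substrings,
    // so each multi-byte key is matched as a unit.
    return strtr($text, $map);
}

echo remove_accents_utf8('Señor Müller, café'), "\n";
```

Characters that are not in the table pass through unchanged, so the table would need to be extended to cover whatever actually appears in the content.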

  5. #5
    SitePoint Enthusiast Deo's Avatar
    Join Date
    Oct 2003
    Location
    Washington, USA
    Posts
    56
    Well, it's kind of like this:

    The content being pulled has tabs and spaces that are of different languages. So PHP is unable to remove the character codes because they aren't legal for the charset. I'm basically trying to find these characters and remove them.

    My first thought was to do a preg_replace to filter out everything that wasn't a-z, A-Z, 0-9, -, or _, but it didn't seem to work.
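
For reference, a whitelist filter along those lines could be sketched as below. This is only a guess at the intent: the function name `keep_basic_chars` is hypothetical, and keeping the plain space character is an added assumption. The pattern runs without the /u modifier, so it works byte by byte and drops every byte outside the allowed set, even when the input is not valid in any particular charset.

```php
<?php
// Hypothetical helper: keep only a-z, A-Z, 0-9, space, underscore and hyphen.
function keep_basic_chars(string $text): string
{
    // The hyphen sits last in the class so it is taken literally.
    // No /u modifier: matching is per byte, so every byte of a
    // multi-byte character falls outside the class and is removed.
    $clean = preg_replace('/[^a-zA-Z0-9 _-]/', '', $text);

    // preg_replace() returns null on error; fall back to an empty string.
    return $clean ?? '';
}

echo keep_basic_chars("caf\xC3\xA9 menu\t2024_v1-final"), "\n";
```

One common reason a filter like this appears not to work is an unescaped hyphen in the middle of the character class, which PCRE reads as a range.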

    If anyone has any other ideas, I'm all ears.

    ~Deo

  6. #6
    PHP manual bot bronze trophy Gaheris's Avatar
    Join Date
    Oct 2003
    Location
    Germany
    Posts
    2,195
    Could you show us some example data?

  7. #7
    SitePoint Addict shad0w's Avatar
    Join Date
    Aug 2003
    Location
    PA
    Posts
    239
    I don't think there are tabs and spaces in different languages. There's just a tab and a space, whether it's in Chinese or English.
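
In case the data does turn out to contain a space-like character other than the ASCII space or tab (Unicode defines several, such as the no-break space U+00A0 and the ideographic space U+3000), a rough sketch for normalizing them follows. It assumes the input is UTF-8, and the function name `normalize_spaces` is hypothetical.

```php
<?php
// Hypothetical helper: collapse tabs and Unicode space separators
// into a single ASCII space. Assumes UTF-8 input.
function normalize_spaces(string $text): string
{
    // \p{Zs} matches Unicode space separators (U+0020, U+00A0, U+3000, ...).
    // The /u modifier makes PCRE treat pattern and subject as UTF-8.
    $out = preg_replace('/[\p{Zs}\t]+/u', ' ', $text);

    // preg_replace() returns null when the subject is not valid UTF-8;
    // in that case the text is returned unchanged.
    return $out ?? $text;
}

echo normalize_spaces("a\u{00A0}b\u{3000}c\td"), "\n";
```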

