  1. #1
    monitormensch oerdec's Avatar
    Join Date
    Sep 2004
    Location
    Hamburg
    Posts
    706

    problems with ereg_replace()

    Hi,

    I want to clean up MS Word HTML. I found some code that works for me, but I don't know how to extend it. I tried to add <table>, <tr> and <td> but failed.

    Here's the original code:

    Code:
    	// strip tags, still leaving attributes, second variable is allowable tags
    	$content = strip_tags($content, '<p><b><i><u><a><h1><h2><h3><h4><h4><h5><h6>');

    	// removes the attributes for allowed tags, use separate replace for heading tags since a
    	// heading tag is two characters
    	$content = ereg_replace("<([p|b|i|u])[^>]*>", "<\\1>", $content);
    	$content = ereg_replace("<([h1|h2|h3|h4|h5|h6][1-8])[^>]*>", "<\\1>", $content);

    Could you tell me how to extend it?

    oerdec//

  2. #2
    SitePoint Enthusiast
    Join Date
    Jul 2004
    Location
    Cyberia
    Posts
    94
    Add this line:
    PHP Code:
    $content = ereg_replace("<([table|tr|td|th|tbody])[^>]*>", "<\\1>", $content);
    Also, I think the last line is wrong; it should be either:
    PHP Code:
    $content = ereg_replace("<([h1|h2|h3|h4|h5|h6])[^>]*>", "<\\1>", $content);
    or:
    PHP Code:
    $content = ereg_replace("<([h][1-6])[^>]*>", "<\\1>", $content);
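
A note on the bracket syntax in these patterns: square brackets define a character class, which matches exactly one character from the set, so `[table|tr|td|th|tbody]` does not mean "table or tr or td ...". Alternation needs a parenthesised group instead. Here is a small sketch of the difference. It uses `preg_match` (PCRE) rather than the `ereg` functions, which were removed in PHP 7, and the sample string is made up for illustration:

```php
<?php
// Hypothetical sample input, not taken from the thread.
$tag = '<table border="1">';

// Character class: captures a single character from the set.
preg_match('/<([table|tr|td|th|tbody])[^>]*>/', $tag, $classMatch);

// Alternation group: captures one of the whole tag names.
// \b stops a short name from matching the start of a longer one.
preg_match('/<(table|tbody|tr|td|th)\b[^>]*>/', $tag, $groupMatch);

echo $classMatch[1], "\n"; // the single character "t"
echo $groupMatch[1], "\n"; // the full name "table"
```

With the character class, a replacement of `<\\1>` would therefore write `<t>` in place of the table tag.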

  3. #3
    monitormensch oerdec's Avatar
    Join Date
    Sep 2004
    Location
    Hamburg
    Posts
    706
    Hmm, it doesn't work.

    This is my current code. I tried different options but couldn't get it to work.

    PHP Code:
    // strip tags, still leaving attributes, second variable is allowable tags
    $content = strip_tags($content, '<p><b><i><u><a><h1><h2><h3><h4><h4><h5><h6><table><tr><td><th><tbody>');

    // removes the attributes for allowed tags, use separate replace for heading tags since a
    // heading tag is two characters
    $content = ereg_replace("<([p|b|i|u])[^>]*>", "<\\1>", $content);
    $content = ereg_replace("<([h1|h2|h3|h4|h5|h6])[^>]*>", "<\\1>", $content);
    $content = ereg_replace("<([table|tr|td|th|tbody])[^>]*>", "<\\1>", $content);
    Any ideas?

    oerdec

  4. #4
    monitormensch oerdec's Avatar
    Join Date
    Sep 2004
    Location
    Hamburg
    Posts
    706
    ... <tables> were removed completely.
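
A likely explanation for the vanishing tables: `[table|tr|td|th|tbody]` is a character class, so it captures only one character and `<table border="1">` is rewritten to `<t>`, which a browser does not recognise. Below is a minimal sketch of one way to strip the attributes with a real alternation group. It assumes `preg_replace` is acceptable in place of `ereg_replace` (the `ereg` functions were removed in PHP 7); `strip_attributes()` is a hypothetical helper name and the sample markup is made up for illustration.

```php
<?php
// Sketch only: strip_attributes() is a hypothetical helper name.
function strip_attributes($html)
{
    // Tags whose attributes should be dropped. <a> is deliberately left out
    // so links keep their href, as in the original code.
    $tags = 'table|tbody|tr|td|th|h[1-6]|p|b|i|u';

    // \b makes sure "b" does not match "<body>" and that names match in full.
    // The i modifier also catches upper-case tags such as <TABLE>.
    return preg_replace('/<(' . $tags . ')\b[^>]*>/i', '<$1>', $html);
}

// Made-up sample input standing in for the Word HTML.
$content = '<table border="1"><tr class="x"><td style="y">Hi</td></tr></table>';

// Drop every tag that is not on the allow list (attributes survive this step).
$content = strip_tags($content, '<p><b><i><u><a><h1><h2><h3><h4><h5><h6><table><tbody><tr><td><th>');

// Then remove the attributes from the tags that remain.
$content = strip_attributes($content);

echo $content; // <table><tr><td>Hi</td></tr></table>
```

Closing tags such as `</td>` carry no attributes and are left alone, because the pattern only matches a `<` followed directly by a tag name. An attribute value that itself contains `>` would trip up this pattern, so treat it as a starting point rather than a full HTML sanitiser.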

