  1. #1
    SitePoint Guru
    Join Date
    Mar 2002
    Posts
    608

    Simple Regex Question (Translate html to ubb)

    Hey,

    I have successfully been able to translate HTML to UBB, except for the font tags.

    The problem is a font tag can come at any point, and I must translate it as such:

    Code:
    <font face="verdana" size="2"> test </font> must become (dynamically)
    
    [font=verdana] [size=2] test [/font=verdana] [/size=2]
    Now, if I knew there would be just ONE reference to font, I might be ok. But the problem is, I want to dynamically take any font reference, including the NUMBER of size, and dynamically create the
    [/font=fontname] [/size=thisfontsize]

    I don't know how to do that dynamically, especially if there are 2-5 different fonts or font sizes being used.

    I tried searching preg_match and other regex functions...I'll still do so, but I had to ask just to see if this was even possible to do dynamically. Every other tag I can replace with ease.

    Thank you.

  2. #2
    SitePoint Evangelist ldivinag's Avatar
    Join Date
    Jan 2005
    Location
    N37 33* W122 3*
    Posts
    414
    try this:

    1. you need to find a "<font" starter. and it needs a closing ">"...

    2. everything between it is an attribute with a value, right? and their order is random.

    once you can grab that string, it's as simple as EXPLODE'ing it using the space as a separator.

    for example:

    PHP Code:
    $font_string = '<font face="verdana" size="2">';
    $font_string_array = explode(" ", $font_string);
    $ubb_string = "";
    $closer_string = "";

    //  now just loop $font_string_array and look up the ATTRIBUTE
    //  reserve words like FACE, SIZE, COLOR, etc.

    foreach ($font_string_array as $attrib)
    {
        //  also, you need to deal with the first and last elements of
        //  $font_string_array since those are the beginning and the end of the string...
        //  unless of course you have already stripped the "<font" and the terminating ">" out of the original string...
        //  here the "<font" piece is skipped and the quotes and ">" are trimmed off the value.

        $attrib_parts = explode("=", $attrib);
        if (count($attrib_parts) < 2) continue;
        $value = trim($attrib_parts[1], "\"'>");

        //  if everything went right:
        //  $attrib_parts[0] = "face"
        //  $value = "verdana" according to your sample...

        switch ($attrib_parts[0])
        {
            case "face":
                $ubb_string = $ubb_string . "[font=" . $value . "] ";
                $closer_string = $closer_string . "[/font=" . $value . "] ";
                break;
            case "size":
                $ubb_string = $ubb_string . "[size=" . $value . "] ";
                $closer_string = $closer_string . "[/size=" . $value . "] ";
                break;

            //  insert the rest of the attributes here...
        }
    }
    now, i didn't do any error checking on whether the actual values for each attrib are valid. that's up to you..

    also, you have to deal with the actual text that is between the html... but that's the easy part...
    leo d.
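
The original question was how to produce the right closers when several font tags appear in one text. One possible approach, sketched below, is to match every opening and closing font tag in a single pass and keep a stack of closer strings, so each `</font>` gets the closers of the opener it belongs to. The function name `font_to_ubb` is hypothetical, and the sketch assumes the tags are properly nested and that only `face` and `size` matter. Closers are emitted in the same order as the openers, as in the sample from the first post.

```php
<?php
// Hypothetical helper: converts <font ...>...</font> pairs to UBB-style tags.
// Assumes properly nested tags; attributes other than face/size are ignored.
function font_to_ubb(string $html): string
{
    $closers = [];  // stack: one closer string per currently open <font>

    $result = preg_replace_callback(
        '~<font\b([^>]*)>|</font\s*>~i',
        function (array $m) use (&$closers): string {
            // Closing tag: emit the closers remembered for the matching opener.
            if (stripos($m[0], '</font') === 0) {
                return array_pop($closers) ?? '';
            }

            // Opening tag: collect name=value pairs, in whatever order they appear.
            // The (?|...) branch-reset group puts the value in group 2 for
            // double-quoted, single-quoted and unquoted values alike.
            preg_match_all(
                '~(\w+)\s*=\s*(?|"([^"]*)"|\'([^\']*)\'|([^\s>]+))~',
                $m[1],
                $attrs,
                PREG_SET_ORDER
            );

            $map = ['face' => 'font', 'size' => 'size'];
            $open = [];
            $close = [];
            foreach ($attrs as $a) {
                $name = strtolower($a[1]);
                if (isset($map[$name])) {
                    $open[]  = "[{$map[$name]}={$a[2]}]";
                    $close[] = "[/{$map[$name]}={$a[2]}]";
                }
            }

            $closers[] = implode(' ', $close);
            return implode(' ', $open);
        },
        $html
    );

    // preg_replace_callback returns null on a regex error; fall back to the input.
    return $result ?? $html;
}

echo font_to_ubb('<font face="verdana" size="2"> test </font>'), "\n";
```

Because the closers live on a stack, two to five different fonts or sizes in the same text should not need any special handling, as long as every opener has its own `</font>`.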

  3. #3
    SitePoint Wizard stereofrog's Avatar
    Join Date
    Apr 2004
    Location
    germany
    Posts
    4,324
    You cannot reliably parse HTML with regular expressions. Use an HTML SAX parser instead.
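
A minimal sketch of the event-driven idea, assuming PHP's built-in expat-based `xml_*` functions (ext-xml) stand in for the SAX parser meant here. That parser only accepts well-formed markup, so this is an illustration under that assumption rather than a general HTML solution. The function name `font_to_ubb_sax` is hypothetical, and tags other than `font` are dropped while their text is kept.

```php
<?php
// Hypothetical helper using PHP's SAX-style XML parser (ext-xml).
// Assumes well-formed input: every <font> is closed and attribute values are quoted.
function font_to_ubb_sax(string $html): string
{
    $out = '';
    $closers = [];  // stack: one closer string per open <font>
    $map = ['FACE' => 'font', 'SIZE' => 'size'];  // names arrive upper-cased by default

    $parser = xml_parser_create('UTF-8');

    xml_set_element_handler(
        $parser,
        // Start-tag event: write the openers and remember the matching closers.
        function ($p, string $name, array $attrs) use (&$out, &$closers, $map): void {
            if ($name !== 'FONT') {
                return;
            }
            $open = [];
            $close = [];
            foreach ($attrs as $attrName => $value) {
                if (isset($map[$attrName])) {
                    $open[]  = "[{$map[$attrName]}={$value}]";
                    $close[] = "[/{$map[$attrName]}={$value}]";
                }
            }
            $out .= implode(' ', $open);
            $closers[] = implode(' ', $close);
        },
        // End-tag event: write the closers of the most recently opened <font>.
        function ($p, string $name) use (&$out, &$closers): void {
            if ($name === 'FONT') {
                $out .= array_pop($closers) ?? '';
            }
        }
    );

    // Text event: copy character data through unchanged.
    xml_set_character_data_handler($parser, function ($p, string $text) use (&$out): void {
        $out .= $text;
    });

    // Wrap in a dummy root so fragments with several top-level nodes still parse.
    if (!xml_parse($parser, "<root>{$html}</root>", true)) {
        throw new RuntimeException((string) xml_error_string(xml_get_error_code($parser)));
    }

    return $out;
}

echo font_to_ubb_sax('<font face="verdana" size="2"> test </font>'), "\n";
```

For real-world HTML that is not well-formed, a more tolerant HTML parser would be needed. The handler structure (start tag, end tag, text) would stay the same.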

