  1. #1
    SitePoint Guru whisher's Avatar
    Join Date
    May 2006
    Location
    Kakiland
    Posts
    732

    HTML Purifier: before or after the DB insert?

    Hi,
    Is it better to run HTML Purifier before or after inserting into the DB?


    Bye.

  2. #2
    SitePoint Enthusiast
    Join Date
    Feb 2010
    Posts
    40
    Why would you store something if it could possibly do your database harm? Purify before you do anything with it.

    Input data should nearly always be validated and purified before anything is done with it.

  3. #3
    SitePoint Guru whisher's Avatar
    Join Date
    May 2006
    Location
    Kakiland
    Posts
    732
    Quote: Originally posted by aaronfalloon
    Why would you store something if it could possibly do your database harm? Purify before you do anything with it.

    Input data should nearly always be validated and purified before anything is done with it.
    I disagree (about purifying beforehand, not about validating). I don't think it harms the DB
    if I insert, for instance,
    < script >doHarm </script>
    The real trouble comes from displaying it in the view
    without escaping it (htmlentities).
    IMHO

  4. #4
    Theoretical Physics Student bronze trophy Jake Arkinstall's Avatar
    Join Date
    May 2006
    Location
    Lancaster University, UK
    Posts
    7,062
    Escape only for the circumstance you are in.

    In other words: escape the input against SQL injection before it goes into the database, using mysql_real_escape_string, PDO prepared statements, etc.

    When it comes to outputting HTML, escape for HTML.

    If you escape for HTML when putting it into the database, you'll need to unescape it for non-HTML output, e.g. when editing it inside a textarea or serving it as a .txt file.

    Stay prepared for circumstances where you won't necessarily be outputting the content as HTML.
    Jake Arkinstall
    "Sometimes you don't need to reinvent the wheel;
    Sometimes its enough to make that wheel more rounded"-Molona
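
A minimal sketch of the "escape only for the circumstance you are in" advice in post #4. The thread is about PHP, but the idea is language-independent, so this uses only Python's standard library (sqlite3 and html.escape) as stand-ins for PDO prepared statements and htmlentities. The table name and sample string are assumptions made for illustration.

```python
import html
import sqlite3

# In-memory database with a hypothetical "comments" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (id INTEGER PRIMARY KEY, body TEXT)")

raw = "<script>doHarm()</script> O'Reilly"

# Storage context: a parameterized query keeps the value out of the SQL text,
# so the raw string is stored unchanged and cannot alter the statement.
conn.execute("INSERT INTO comments (body) VALUES (?)", (raw,))

stored = conn.execute("SELECT body FROM comments WHERE id = 1").fetchone()[0]

# HTML context: escape only at output time.
as_html = "<p>" + html.escape(stored) + "</p>"

# Plain-text context (for example a .txt download): no HTML escaping applied.
as_text = stored
```

Because the database holds the unmodified input, each output path can apply the escaping that fits it, and nothing has to be unescaped later.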

  5. #5
    SitePoint Guru whisher's Avatar
    Join Date
    May 2006
    Location
    Kakiland
    Posts
    732
    Quote: Originally posted by Jake Arkinstall
    Escape only for the circumstance you are in.

    In other words: escape the input against SQL injection before it goes into the database, using mysql_real_escape_string, PDO prepared statements, etc.

    When it comes to outputting HTML, escape for HTML.

    If you escape for HTML when putting it into the database, you'll need to unescape it for non-HTML output, e.g. when editing it inside a textarea or serving it as a .txt file.

    Stay prepared for circumstances where you won't necessarily be outputting the content as HTML.

    I agree

    Bye and thanks for the help.

    RIP Dan Schulz

  6. #6
    PHP Guru lampcms.com's Avatar
    Join Date
    Jan 2009
    Posts
    921
    I would go with "before", because the purifier is a fairly slow script. It's slow because it does a lot of work: it inspects each Unicode character first and then inspects the HTML.
    On their website, they recommend running the purifier before the insert; otherwise, if you run it after the SQL SELECT on every page load, your pages will load much more slowly.
    My project: Open source Q&A
    (similar to StackOverflow)
    powered by php+MongoDB
    Source on github, collaborators welcome!

  7. #7
    SitePoint Guru whisher's Avatar
    Join Date
    May 2006
    Location
    Kakiland
    Posts
    732
    Quote: Originally posted by lampcms.com
    I would go with "before", because the purifier is a fairly slow script. It's slow because it does a lot of work: it inspects each Unicode character first and then inspects the HTML.
    On their website, they recommend running the purifier before the insert; otherwise, if you run it after the SQL SELECT on every page load, your pages will load much more slowly.
    Thanks for the point, but I saw that it has a cache system.
    Can that improve performance?
    I'm quite new to HTML Purifier.

    Bye

  8. #8
    SitePoint Wizard
    Join Date
    Mar 2008
    Posts
    1,149
    I thought HTML Purifier didn't have a bundled cache system.

    And yes, you should either do it before (and perhaps store it somewhere else) or do it after, but cache the results.
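
A rough sketch of the "do it after, but cache the results" option from post #8, written in Python's standard library rather than PHP. `slow_sanitize` is a hypothetical stand-in for an expensive sanitizer (it simply escapes everything), and the in-process dictionary stands in for whatever cache store a real site would use. Neither is part of HTML Purifier.

```python
import hashlib
import html

stats = {"calls": 0}  # counts how often the expensive step actually runs

def slow_sanitize(raw):
    """Hypothetical stand-in for an expensive sanitizer; it just escapes everything."""
    stats["calls"] += 1
    return html.escape(raw)

_cache = {}  # maps a hash of the raw input to its sanitized form

def sanitized(raw):
    """Return the sanitized form of raw, computing it at most once per distinct input."""
    key = hashlib.sha256(raw.encode("utf-8")).hexdigest()
    if key not in _cache:
        _cache[key] = slow_sanitize(raw)
    return _cache[key]
```

Keying the cache on a hash of the raw content means an edited post gets a new key automatically, so stale entries are never served for changed input.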

  9. #9
    SitePoint Wizard bronze trophy
    Join Date
    Jul 2008
    Posts
    5,757
    If having both a processed and an unprocessed version of the data would be useful to you in different situations, you can always store both versions in the DB. You just need to take care to keep the two in sync.

    For example, I'm not sure whether it does, but this very forum might store both a version of my post with the raw BBCode (so I can edit my post) and a version with the BBCode transformed to HTML.

    A separate caching layer may or may not be more suitable instead. It depends.
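
One way the "store both versions" idea in post #9 could look, again as a Python standard-library sketch rather than PHP. The `posts` table, its column names, and the `render` function are all assumptions for illustration. `render` stands in for the real transformation (BBCode to HTML, a sanitizer, and so on).

```python
import html
import sqlite3

def render(raw):
    """Hypothetical stand-in for the real transformation of raw input into HTML."""
    return "<p>" + html.escape(raw) + "</p>"

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts ("
    "id INTEGER PRIMARY KEY, raw TEXT NOT NULL, rendered TEXT NOT NULL)"
)

def save_post(conn, post_id, raw):
    """Write both columns in one statement so they cannot drift apart."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO posts (id, raw, rendered) VALUES (?, ?, ?)",
            (post_id, raw, render(raw)),
        )
```

Routing every write through a single function like `save_post` is what keeps the two versions in sync: the edit form reads `raw`, the page view reads `rendered`, and neither column is ever updated alone.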

