SitePoint Sponsor

User Tag List

Results 1 to 5 of 5
  1. #1
    SitePoint Wizard silver trophybronze trophy Cups's Avatar
    Join Date
    Oct 2006
    Location
    France, deep rural.
    Posts
    6,869
    Mentioned
    17 Post(s)
    Tagged
    1 Thread(s)

    delete rows where count > x ?

    I have 2 tables similar to this;
    Code:
    rss_news
    =======
    link (PK)
    feed_ref (FK)
    title - varchar
    pubDate - timestamp
    
    rss_feeds
    =======
    feed PK
    maximum_items INT
    
    Examples:
    
    rss_news
    -------------------------------------------------------------------------
    bbc.co.uk/news/item1234 | bbc | "Man finds jumper" | 2009-11-01 12:12:00
    cnn.com/news.12997rr78  | cnn | "Water found in lake" | 2008-10-30 14:45:09    
    
    rss_feeds
    ---------
    bbc  | 10 
    cnn  |  5
    I works like an aggregator, I periodically update that table with info from the original rss feed, adding news rows and updating those whose pubDate has changed - and this works fine.

    I want to do a cleanup operation though, so there are never more that say, 5 feeds in there for a given feed name at a time, that is a setting I would get from each feeds maximum_items setting.

    Can anyone give me a steer on how to construct the sql statement which would :

    go through the rss_feeds table and for each feed / maximum_items, go to the rss_news table and order the items by date and delete any old feeds which are in there ( above the target number, e.g. 5).

    Can that be done in one statement or do I need to go through them one at a time?

  2. #2
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,220
    Mentioned
    58 Post(s)
    Tagged
    3 Thread(s)
    caution: untested
    Code:
    DELETE rss_news
      FROM rss_news 
    INNER
      JOIN rss_news AS other_news
        ON other_news.feed_ref = rss_news.feed_ref
       AND other_news.pubDate  > rss_news.pubDate 
    INNER
      JOIN rss_feeds 
        ON rss_feeds.feed = rss_news.feed_ref
    GROUP
        BY rss_news.link
    HAVING COUNT(*) > rss_feeds.maximum_items
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  3. #3
    SitePoint Wizard silver trophybronze trophy Cups's Avatar
    Join Date
    Oct 2006
    Location
    France, deep rural.
    Posts
    6,869
    Mentioned
    17 Post(s)
    Tagged
    1 Thread(s)
    I am getting error 1064 on that query.

    "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near "

    'GROUP
    BY rss_news.link
    HAVING COUNT(rss_news.link) > rss_feeds.max_no' at line 10

    If I understand correctly so far, 'other_table' is a temporary copy of 'rss_news', and is sorted in date order.

    But that is as far as I have understood up to now.

    Do you want me to post a bit of test data?

  4. #4
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,220
    Mentioned
    58 Post(s)
    Tagged
    3 Thread(s)
    maybe you can do it in two steps?

    change the DELETE into a SELECT (with only rss_news.link in the SELECT clause)

    this will return the feeds you want to delete
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  5. #5
    SitePoint Wizard silver trophybronze trophy Cups's Avatar
    Join Date
    Oct 2006
    Location
    France, deep rural.
    Posts
    6,869
    Mentioned
    17 Post(s)
    Tagged
    1 Thread(s)
    Yeah, I forgot to say I had tried that before, not having seen "DELETE rss_news ... " before, and not wanting to fry my test data.

    The error moves down to:

    Unknown column 'rss_feeds.maximum_items' in 'having clause' err 1054

    Not to worry, I will do it the old fashioned way, it only happens 3 times a day to about 10 feeds.


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •