SitePoint Sponsor

User Tag List

Results 1 to 7 of 7

Hybrid View

  1. #1
    Pandora Tipem's Avatar
    Join Date
    Feb 2006
    Posts
    450
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Counting regexp matches with MySQL?

    Is there any way to COUNT regexp matches in MySQL and store it in a column in the results set?

    e.g.,

    SELECT name, COUNT(REGEXP '[ae]') as matches FROM table
    If so, could you provide a couple of examples? I need to do something like this for sorting purposes.
    Tipem no fancy sig, yup
    Need Neopets Help? Try Neopets
    fun adoptable eggs

  2. #2
    SitePoint Addict wibble wobble's Avatar
    Join Date
    Dec 2008
    Posts
    242
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Cant tell what you are trying to match with regex, but this should work:
    SELECT SUM(name REGEXP '[ae]') AS matches FROM table

    (I use 'SUM' because 'name REGEXP exp' returns 1 or 0.)
    Find freelance jobs from all the major sites in one place:
    on twitter / on the web / twitter rss feed

  3. #3
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,273
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    actually, the fact that REGEXP returns 0 or 1 (instead of TRUE or FALSE) is in my opinion not sufficient reason to sum these values directly

    it's better to write it as a CASE expression -- it's more obvious, and easier to swap with some other relevancy expression should you ever want something other than a REGEXP test
    Code:
    SELECT SUM(
            CASE WHEN name REGEXP '[ae]'
                 THEN 1 
                 ELSE 0 END
              ) AS matches 
      FROM ...
    tipem, you might also be interested in this, if you're looking to sort by relevance...
    Simple Keyword Relevance

    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  4. #4
    SitePoint Addict wibble wobble's Avatar
    Join Date
    Dec 2008
    Posts
    242
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by r937 View Post
    actually, the fact that REGEXP returns 0 or 1 (instead of TRUE or FALSE) is in my opinion not sufficient reason to sum these values directly

    it's better to write it as a CASE expression -- it's more obvious, and easier to swap with some other relevancy expression should you ever want something other than a REGEXP test
    Code:
    SELECT SUM(
            CASE WHEN name REGEXP '[ae]'
                 THEN 1 
                 ELSE 0 END
              ) AS matches 
      FROM ...
    tipem, you might also be interested in this, if you're looking to sort by relevance...
    Simple Keyword Relevance

    Tipem: This is why r937 deserves your vote in the SP Awards for SQL guru

    One thing I'm wondering: you want to use CASE for clarity and maintainability. Why not just give the query a comment saying "Sums because REGEXP returns 1 or 0". Over a largish data set, surely the case statements would slow it down quite a bit?
    Find freelance jobs from all the major sites in one place:
    on twitter / on the web / twitter rss feed

  5. #5
    Pandora Tipem's Avatar
    Join Date
    Feb 2006
    Posts
    450
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    You guys truly rock my socks! Awesomesauce!

    I have run into a problem though... I am trying to find all of the following instances and then count them (yes, with the square brackets included): [a], [e], [p], [l], [d], [w], and [f]. The regexp that I am using is \[[aefdwpl]\] to create this query (obviously this query is just to test on one row):

    Code mysql:
    SELECT w_attack, SUM(
            CASE WHEN w_attack REGEXP '\[[aefdwpl]\]'
                 THEN 1 
                 ELSE 0 END
              ) AS attack_icons 
      FROM weapons WHERE w_id = 1 GROUP BY w_id
    But unfortuately it is returning only 1 for attack_icons (when in fact, there are 23 matches). Does the MySQL REGEXP function match everything? Or just find one match and then stop? Let me know... how should I go about fixing this problem?

    Thanks in advanced! Again, you guys rock! I should nominate r937 for SQL guru...
    Tipem no fancy sig, yup
    Need Neopets Help? Try Neopets
    fun adoptable eggs

  6. #6
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,273
    Mentioned
    60 Post(s)
    Tagged
    3 Thread(s)
    Quote Originally Posted by Tipem View Post
    I am trying to find all of the following instances and then count them (yes, with the square brackets included): [a], [e], [p], [l], [d], [w], and [f].
    try this:
    Code:
    , SUM(
       CASE WHEN w_attack IN ( '[a]', '[e]', '[p]', '[l]', '[d]', '[w]', '[f]' )
            THEN 1 
            ELSE 0 END ) AS attack_icons
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  7. #7
    SitePoint Addict wibble wobble's Avatar
    Join Date
    Dec 2008
    Posts
    242
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    MySQL doesnt support regex replacements (which is what you would need: replace other chars with nothing, then count the resulting length). Do this in your programming/scripting language and store the result in a new field ('attack_icons_count' maybe).

    Edit: If you need help with it, let me know and I'll do a blog post on it.
    Find freelance jobs from all the major sites in one place:
    on twitter / on the web / twitter rss feed


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •