Adding a "scoring" option to a bookmarking system

Forbes · January 31, 2011, 12:45pm

Hi guys!

I’ve developed a bookmarking application, and it’s become apparent that I need to deal with duplicate bookmarks in a sensible way.

The long-term aim is to have a way of “scoring” bookmarks, so that when someone adds a bookmark that already exists, the score for the original is incremented.

The reasons for doing this are, I no doubt, obvious.

So far, the bookmarks table looks like this:

CREATE TABLE `bookmarks` (
  `id` int(11) NOT NULL auto_increment,
  `user_id` mediumint(11) NOT NULL,
  `url` text,
  `title` text,
  `snippet` text,
  `datetime` datetime default NULL,
  `status` enum('public','private') NOT NULL default 'public',
  PRIMARY KEY  (`id`),
  KEY `datetime` (`datetime`),
  FULLTEXT KEY `title` (`title`,`snippet`)
);

I have a few ideas in my mind how I might build a scoring system, but I’d much prefer to run this by you guys first, before committing to something that’s either not practical or scales poorly over time.

Any help would be much appreciated!

Forbes · February 1, 2011, 8:24am

As I said previously.

r937 · January 31, 2011, 10:38pm

first thing you have to do is remove everything from that table that relates to which user(s) bookmarked a particular bookmark, and put that in a separate table

Forbes · January 31, 2011, 10:50pm

Hi, and thanks for the reply!

That’s one of the options I’m thinking of going with — placing the URL itself into a separate table, along with the score and the ID of the originating bookmark.

I was also thinking of chaining the bookmark URLs together, so that all other bookmarks would then fall into a parent-child relationship, along the lines of:

id, parent_id, child_id, url, score

That way, when an originating parent bookmark is viewed, all of the child bookmarks can be retrieved.

Is that a viable method, or are there better ways of doing this?

r937 · February 1, 2011, 12:21am

chaining? parent-child?

on what basis are two urls related?

r937 · February 1, 2011, 10:04am

okay, but what does “originating” mean?

i still don’t see how two urls can be related

i’m quite familiar with the parent-child structure, but i just don’t see it applying here

Forbes · February 1, 2011, 10:55am

Someone will have added the first originating bookmark to which all duplicates are a child of.

r937 · February 1, 2011, 11:12am

no, that’s not a good idea

duplicate the url each time? no

Forbes · February 1, 2011, 11:19am

That’s what I’m trying to figure out. But then there’s the issue of determining exact matching URLs, which becomes a parameter cleansing issue.

Rather than me guessing — which is exactly what I’m doing right now — I was hoping someone would actually explain the most appropriate way of doing this.

guelphdad · February 1, 2011, 12:49pm

Add a UNIQUE key on URL. Create a column called counter or something like that.

See INSERT ON DUPLICATE UPDATE syntax in the manual.

That way a duplicate is not inserted but your Counter column is updated instead.

Forbes · February 1, 2011, 8:48pm

Hi!

I’m using INSERT ON DUPLICATE UPDATE elsewhere in the application, so I’m at least familiar with that. In fact, I’m doing almost exactly as you’re recommending for the analytics side of things (user visits to bookmarks).

Right, I can see how this would work now.

I’m assuming I’m right to keep the parent-child method, so I can keep track of all those who added the same bookmark afterwards, yes?

Thanks for your time! Much appreciated.

guelphdad · February 1, 2011, 9:07pm

No, you don’t need a parent-child method at all, the link isn’t a child of the same link, it is a duplicate of the link. You would use parent-child if you were drilling down:

Electronics -> Televisions - Sanyo -> Model123

If you want to know who added the bookmarks then you allow all inserts into the table with duplicate URLs and those who entered them. You would use a GROUP BY clause to group the URLs together and get a count of how many you had.

You would be using three tables. The one you have above (bookmarks) with the user_id column removed entirely, a user table defining information about your users and a users_bookmarks table with two columns userid, url_id and a new row for each person who enters a bookmark. You may have 50 people with different userid(s) with the same url_id in the table.

Forbes · February 3, 2011, 4:07pm

guelphdad:

No, you don’t need a parent-child method at all, the link isn’t a child of the same link, it is a duplicate of the link. You would use parent-child if you were drilling down:

Electronics -> Televisions - Sanyo -> Model123

If you want to know who added the bookmarks then you allow all inserts into the table with duplicate URLs and those who entered them. You would use a GROUP BY clause to group the URLs together and get a count of how many you had.

You would be using three tables. The one you have above (bookmarks) with the user_id column removed entirely, a user table defining information about your users and a users_bookmarks table with two columns userid, url_id and a new row for each person who enters a bookmark. You may have 50 people with different userid(s) with the same url_id in the table.

This is actually the one option I thought there might be a way of avoiding.

As I said, I’d thought through a few ideas, and this was the only one I could think of that would work fully. But I thought it was too complex and there was, perhaps, a better way.

OK, if that’s the way, so be it.

And thanks for your time, here. Always appreciated.

Topic		Replies	Views
Group similar results, listing users Databases	56	2622	March 2, 2011
Optiminzing a query Databases	8	926	February 14, 2011
Multiple on duplicate update Databases	1	369	January 16, 2010
Add to value if exists or insert if doesn't Databases	4	667	July 19, 2010
MySql How to edit all duplicate mysql values Databases	10	1424	May 14, 2015

Adding a "scoring" option to a bookmarking system

Related topics