The goal is to query my db_table and return the most popular rows based on the hits count for the last 7 days, 14 days and so on, something I can then easily change if needed.
I need a way to check the hits count over a certain date.
Small example
I have a table called: venues_table with the following columns:
id (int)
name (varchar)
hits (mediumint)
SELECT name
FROM venues_table
ORDER BY hits
Typical row:
id: 2
name: Venue name
hits: 34
How and where do I set the timestamp?
Do I need another table?
What is the best approach for something like this?
Ok, so we create a hits table and record every hit for each venue, then, when we want to find out the popular venues, we simply query the table based on the date and count the hits for each name(index).
Is that what you mean, sound correct?
Example
table_hits
id (int)
hits (mediumint)
name (varchar/index)
date (timestamp or date)
Could I use date instead of timestamp, does it matter?
Quick draft
SELECT name
FROM hits_table
WHERE date(?)
AND name = ?
ORDER BY hits
LIMIT 10
Also wondering, if the above is correct…
Say eventually we have over hundred venues, each venue is receiving 100+ hits a day, spread that over a couple of years… very big table thousands of rows.
Will this cause any problems?
Slow queries? Processing power?
notice what happens if you run a query like this –
SELECT COUNT(*) as hits
FROM venue_hits
WHERE venue_id = 937
AND dtstamp >= '2016-10-01'
AND dtstamp < '2016-11-01'
notice this gets all of October’s hits for that given venue, and specifically includes timestamps right up until midnight of the last day of the month
but more important, because of the way the primary key is defined, it acts as a covering index and the actual table rows are not accessed at all!!
The problem I have is the venues table is already built which already has venue_id(varchar) as the PK.
notice this gets all of October’s hits for that given venue, and specifically includes timestamps right up until midnight of the last day of the month
This makes sense, though the idea is to show the popular 10 venues in a list. I want to show the names of the venues with the most hits for that date, not count hits for only one venue. Unless I have misunderstood?
October
name 1 (20 hits)
name 2 (14 hits)
name 3 (12 hits)
...
I did think we would need to use a JOIN.
but more important, because of the way the primary key is defined, it acts as a covering index and the actual table rows are not accessed at all!!
Just read a little about covering index, first time I have heard of this, sounds good, this will help with the speedy queries
[quote=“computerbarry, post:5, topic:242540, full:true”]
This makes sense, though the idea is to show the popular 10 venues in a list. I want to show the names of the venues with the most hits for that date, not count hits for only one venue. [/quote]
this “top ten” report by necessity requires counting all hits (for the selected time period) for all venues
you need another index …
ALTER TABLE venue_hits ADD INDEX ( dtstamp , venue_id )
As mentioned @r937, the venues table already has venue_id as the PK which is varchar, I’m generally using the name of the venue as the PK in the venues table.
Slightly confused if I create venue_id as (int) in the hits table, they will not match up when I run the query?
One is int and the other is varchar.
What I have done is create:
id (int) | primary key | index
venue_id (varchar 60) | index
dtstamp (datetime) | primary key | index
Is this correct?
And then run a query something like:
SELECT COUNT(*) as hits
FROM venue_hits
WHERE venue_id = "name of venue"
AND dtstamp >= '2016-10-01'
AND dtstamp < '2016-11-01'
I can then query the hits table based on the matching venue_id, which is the name of the venue.
Will this work the same r937 and benefit from the covering index?
And why do we have both id and dtstamp as PKs, I thought only one PK was allowed per table?
And thanks for the link @John_Betong will keep this for reference no need to alter things just yet, this is a new table. No data as yet.
[quote=“computerbarry, post:8, topic:242540, full:true”]
As mentioned @r937, the venues table already has venue_id as the PK which is varchar, I’m generally using the name of the venue as the PK in the venues table.[/quote]
not a good idea… this is where you would use an AUTO_INCREMENT integer PK
[quote=“computerbarry, post:8, topic:242540, full:true”]
What I have done is create:
id (int) | primary key | index
venue_id (varchar 60) | index
dtstamp (datetime) | primary key | index
Is this correct?[/quote]
nope… this is where you would ~not~ use a surrogate PK
not a good idea… this is where you would use an AUTO_INCREMENT integer PK
Seemed like a good idea at the time, the venue name will never change and will never be duplicated in the venues table. It was mainly an SEO purpose for url inclusion.
I have a live instance and the venues table has been running for some time with lots of live data running off venue_id which is tied up with multiple queries already.
you want this –
CREATE TABLE venue_hits
( venue_id INTEGER NOT NULL
, dtstamp DATETIME NOT NULL
, PRIMARY KEY ( venue_id , dtstamp )
);
So there is no way to query the hits table if I don’t create a AUTO_INCREMENT integer PK on the venues table?
One thing is certain, I can’t change the name of venue_id or the type inside the venues table.
Could we change the venue_id inside hits table to id? And then add a id onto venues table? (which would mean moving the PK from venue_id onto id)
And just wondering, how do I group the hits if every id is different?
Meaning, how do I associate the venue to the hit if I have no name?
perhaps you weren’t aware that the PK could be composite, i.e. consisting of more than one column
[quote=“computerbarry, post:10, topic:242540, full:true”]
So there is no way to query the hits table if I don’t create a AUTO_INCREMENT integer PK on the venues table?[/quote]
what??!! of course you can query the table!!!
look, you mention a number of things that can’t change… and i’m getting lost
why don’t you post the tables that ABSOLUTELY CANNOT CHANGE and we’ll go from there
I also have a hits column which I use currently, hence me creating the new hits table to get the extra queries for the different dates.
In short, I need to create the hits table so I can record all the hits for each venue from the venues_tbl which I’ll then query and show a separate list based on most popular hits for that time period, weather it be 7days october etc.
I think what we’ve done so far is good, just the little issue linking the hits to the venue_id above.
are there any tables which have a FK to the venue_id in the venues_tbl table?
FK in my head, I just remember.
Its not a proper FK in DB terms - I have venue_id inside my reviews table which I use and query for comments about the venue.
I know it’s probably not the best db design, a little old. Seems to work ok for small site easy to remember.
as a trivial aside, you do not need to append _tbl into the name of a table
This is a little old, at the time everybody was using it, I thought it was cool, advanced
also, SMALLINT(16) will hold ~exactly~ the same range of numbers as SMALLINT(2) or SMALLINT(937)
I had an issue some time back if I used 2 the third 3 char would not show, thanks for the sharing, am I little rusted ha
I’ve been writing lots of prepared statements which I might use, in the process of switch from mysql to mysqli.
after this is done, browse the data inside your new table to assure yourself it is exactly the same as your existing table, except it has an auto_increment id
when you’re happy, run this –
[code]DROP TABLE venues_tbl;
RENAME TABLE venues_new TO venues_tbl;
[/code]
you can now proceed to create the venue_hits table as previously described
And back to the hits table, just to clarify, want to make sure I get this right:
CREATE TABLE venue_hits
( venue_id INTEGER NOT NULL
, dtstamp DATETIME NOT NULL
, PRIMARY KEY INDEX ( venue_id , dtstamp )
);
Where do I add the venue_id name, else how do I know which hits relate to what venue from the venues table?
I’ve also added the INDEX above as you mentioned in the alter table snippet, is this correct?