For my thesis research Im doing some weblog analysis using a custom database table as log.
While I’m getting most of the info I need by just querying all the cells and putting it in an array, the database contains a text coumn in the format varchar(255) that contains internal paths of the hits on my website.
I want to see if these internal paths are a certain value such as ‘cart’ or ‘cart/checkout/review’ and put that in the datafile that I use for analysis.
The problem is that I’m grouping the hits in the mysql database on unique clients or sessions, and in the grouping process not all the path values of all the hits in a session are stored, so this doesn’t work:
// Set headings for csv file
$rows = null;
$rows[] = array('Client ID','Group ID','path','reference url','Page Views','Impression','Impression Count','Segment','User ID','Total Time Recorded','timestamp','Date','Cart','Cart Review');
// Query all access rows
$query="SELECT cid,gid,url,ref,COUNT(*),MAX(impression),SUM(impression),segment,uid,SUM(timer),timestamp,date FROM statspluslog WHERE staff =0 AND cap =1 AND hostname != 'xxx' GROUP BY cid";
$result = mysql_query($query);
while ($r = mysql_fetch_assoc($result)) {
if ($r['url'] == 'cart') {
$r['cart'] = 1;
} else {
$r['cart'] = 0;
}
if ($r['url'] == 'cart/checkout/review') {
$r['cartreview'] = 1;
} else {
$r['cartreview'] = 0;
}
//print_r($r);
$rows[] = $r;
}
//Write csv to file
$fp = fopen('statspluslog-client.csv', 'w');
foreach ($rows as $row) {
fputcsv($fp, $row);
}
fclose($fp);
print "finished CLIENT LOG TO CSV!\
\\r";
mysql_close($conn);
Instead I reckon I should have some sort of aggregate function to see if ‘cart’ exists in the url column for any of the rows that are collapsed by the GROUP BY clause, something like this:
SELECT cid,gid,url,ref,COUNT(*),MAX(impression),SUBSTRING(“cart”,url).
Unfortunately it seems such a function does not exist, so how can I access the url column for all the columns that are collapsed by the GROUP BY clause? Or is there another way to do this?
My “Simply SQL” book by rudy discusses something called pushing down the GROUP BY clauses into a subquery but I don’t really understand this part, nor see if it can apply to my situation.
thx for any help.