Problems trying to streamline this query

I’m trying to combine some queries on a web page and I’m having unexpected results.

We’re using a home-built CMS that manages our website, and it holds things like the page title and the contact person for that page. Our website is broken down into topic areas, so we have a summary page that shows all the topics listed in a table, along with the default contact person for that topic. It also lists total pages per topic, and pages with no contact or not title.

The summary page is built using 4 queries. One to list out the topics with their default contact, and three more to do the total pages, the pages with no contact, and the pages with no title.

Here are the queries from the old, working version of the page:

Main query to get topics and contacts:


SELECT    default_contacts.topic_id
		, default_contacts.topic_name
		, default_contacts.contact_id
		, maindir.first_name & ' ' & maindir.last_name AS fullname
		, pagecontacts.page_topic
		, pagecontacts.page_id
		, pagecontacts.page_title
		, pagecontacts.contact_id
FROM ( ( default_contacts
LEFT JOIN maindir
		ON maindir.main_id = default_contacts.contact_id )
LEFT JOIN pagecontacts
		ON default_contacts.topic_name = pagecontacts.page_topic )
ORDER BY topic_name

Then these three to get the totals

Total pages in topic:


SELECT COUNT(page_id) as a
FROM pagecontacts
WHERE page_topic = '#gettopics.page_topic#'

Pages with no contact:


SELECT COUNT(page_id) as b
FROM pagecontacts
WHERE page_topic = '#gettopics.page_topic#'
AND contact_id = 0

Pages with no title:


SELECT COUNT(page_id) as c
FROM pagecontacts
WHERE page_topic = '#gettopics.page_topic#'
AND ( page_title IS NULL
OR page_title = '' )

Sine we’re stuck with MS Access, I’ve been trying to figure out a way to streamline pages. So one of the things I thought of was to combine the above four queries into one. Well, sort of one. I still have the same number of select statements, so it might not be ideal. This is what I came up with:


SELECT    default_contacts.topic_id
		, default_contacts.topic_name
		, default_contacts.contact_id
		, maindir.first_name & ' ' & maindir.last_name AS fullname
		, pagecontacts.page_topic
		, pagecontacts.page_id
		, pagecontacts.page_title
		, pagecontacts.contact_id
		, ( SELECT COUNT(page_id)
				FROM pagecontacts
				WHERE page_topic = topic_name ) as a		
		, ( SELECT COUNT(page_id)
				FROM pagecontacts 
				WHERE page_topic = topic_name
				AND contact_id = 0 ) as b
		, ( SELECT COUNT(page_id)
				FROM pagecontacts
				WHERE page_topic = topic_name
				AND ( page_title IS NULL
				OR page_title = '' ) ) as c
FROM ( ( default_contacts
LEFT JOIN maindir
		ON maindir.main_id = default_contacts.contact_id )		
LEFT JOIN pagecontacts
		ON default_contacts.topic_name = pagecontacts.page_topic )
ORDER BY topic_name

When I run this on my local machine, it takes waaaaaay longer than the four separate queries, and when I copy it to our test server, the query times out. I would expect that at worse, it would perform identical to the separate ones and not worse.

What am I doing wrong?

it looks like you confuse db engine the same table names… try that

SELECT    dc.topic_id
		, dc.topic_name
		, dc.contact_id
		, md.first_name & ' ' & md.last_name AS fullname
		, pc.page_topic
		, pc.page_id
		, pc.page_title
		, pc.contact_id
		, ( SELECT COUNT(page_id)
				FROM pagecontacts pp1
				WHERE pp1.page_topic = dc.topic_name ) AS a		
		, ( SELECT COUNT(page_id)
				FROM pagecontacts pp2
				WHERE pp2.page_topic = dc.topic_name
				AND pp2.contact_id = 0 ) AS b
		, ( SELECT COUNT(page_id)
				FROM pagecontacts pp3
				WHERE pp3.page_topic = dc.topic_name
				AND ( pp3.page_title IS NULL
				OR pp3.page_title = '' ) ) AS c
FROM ( ( default_contacts as dc
LEFT JOIN maindir as md
		ON md.main_id = dc.contact_id )		
LEFT JOIN pagecontacts as pc
		ON dc.topic_name = pc.page_topic )
ORDER BY dc.topic_name

i’m curious about the LEFT JOINs you have

is it possible for default_contacts.contact_id to have a value which does not exist in the maindir table?

is it possible for default_contacts.topic_name to have a value which does not exist in the pagercontacts table?

i think a few words on the purpose of each of your tables would help me a lot

i can’t approach the COUNT queries until i understand the main query…

No, but a topic can have a default contact who’s not been assigned yet. Without the left join there, the topic wouldn’t show up on the list so we couldn’t click it to assign a contact.

Yes. The way the CMS is set up is that you add a topic first. That topic appears in the defaults_contact table, but the entries in pagecontacts may not be there yet because you have to browse the page to add them. We have some topics with thousands (and even tens of thousands) of pages, so adding them manually wasn’t an option. The CMS therefore adds the pages as they’re browsed (on the dev server) which includes running a link check across them. So one pass of the link checker not only checks links, it also adds all the pages for that topic to the CMS.

So we have the following tables:

maindir: contains our complete staff listing (names, email addies, office location, phone number, etc)

pagecontacts: list of pages (by path), topic name for each page, and ID numbers for the primary, secondary, etc contacts. The contact ID is the same as the ID in the maindir table. Also contains the title for the page (so we can use it in the H1 and title tag with the latter being buried in an include and hard to get to otherwise.

default_contacts: The ID of the default contact for each topic. Also serves as the list of topics. This ID is the “fallback” contact in the event we neglect to assign an ID to a page in the pagecontacts table.

It might seem a little quirky, but this database started off just being a way to populate our online office directory, and the maindir table was the only thing in there. We soon realized it could do a lot more, so it’s sort of evolved into what it is now. It’s basically an evolution over 3 or 4 years, and despite how quirky it seems, it’s ended up making the website really easy to update, and the pages load super fast despite the Access DB. Once we get a real database under us, things should really work well.

Hope that helps explain things!

Right now I’ve put the original 4 queries back, but I’ve changed the three that do the counts to a QofQ of the main query, and that seems to have sped things up a little, but I’m really interested in the best way to do it.

sorry for the delay in replying

i looked at this several times, and i am sorry, i must admit defeat

despite your explanations, i ~still~ cannot understand the relationships in your tables

however, the fact that you’re using query-of-query on the results of the main query in order to get your counts, well, that tells me that the counts shouldn’t have been incorporated into the main query in the first place

the reason is, since you are returning all the detailed rows anyway, the best place to count them is in the application

i cannot vouch for the performance of query-of-query in general but i would be genuinely surprised if it’s not substantially better to do counts in the app than in the database

No apology needed rudy! I’m the one sorry for not explaining it sufficiently!

I think one of my problems is that every since I read your book, I seem to go out of my way to do everything in the SQL that I can possibly make it do! So I end up with those extremely complex queries and very simple CF code.

I’ll take another look at this app tomorrow and see if there’s a way to make the CF code to the counting, although since the counts are sill based on the database results (I need to count how many pages exist for a certain condition in the database), I’m not sure how practical it’ll be.

there is – query-of-query

Well that’s what I was doing before! But now I’ve changed it to increment a counter for each case. It seems faster, but EVERYTHING is faster at home. I’ll bring the page to work tomorrow and see if it’s faster through the coffee stirrer straw that’s our connection to the dev server.

OOOH! It actually is faster!

I uploaded it to the server at work, and new version of the page loads significantly faster. I’ll have to remember to keep the math out of the SQL and put it in the app from now on. :tup:

you can do all kinds of math in SQL, as long as it doesn’t add to the data retrieval load

e.g. GROUP BY with a HAVING restriction is best done in SQL