Tricky SQL query - Need some expert advice

Both my buddy and I are computer programmers. He called me on the phone tonight with a SQL question that has us both stumped. He’s been working on it 5+ hours and can’t figure it out.

He said he’s using Postgres 8.1 for the database (which I’ve never used), so ideally it should be SQL that can run on any database. He said if it was Oracle he had a few ideas he could try.

He’s got a table with an auto-increment number field, a date/time stamp, and X, Y, Z fields. Every few minutes, his program inserts a new row with the current date/time.

He needs a query that will give him the first row entered for each day for the past 365 days (or whatever value he wants to use).

My suggestion was to add a new field to the table with a “Y/N” flag and have it be “Y” for the first entry of the day and “N” for the others, but he doesn’t want to perform that check every few minutes when it inserts a row. The more I think about it, the more I like my idea…

Any idea how to get an SQL statement to do what he wants?

I’m not familiar with the specifics of Postgres, but this would work in T-SQL.


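SELECT A.*
FROM [table] AS A
	JOIN ( 
		-- One row per calendar day, holding the lowest ID inserted that day.
		-- 365 stands in for X; the ORDER BY makes TOP keep the most recent days.
		SELECT TOP 365
			DATEPART( YEAR , dateField ) AS Y , DATEPART( MONTH , dateField ) AS M , DATEPART( DAY , dateField ) AS D , MIN( ID ) AS FirstID
		FROM [table]
		GROUP BY DATEPART( YEAR , dateField ) , DATEPART( MONTH , dateField ) , DATEPART( DAY , dateField ) 
		ORDER BY Y DESC , M DESC , D DESC
	) AS Data ON Data.FirstID = A.ID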

I had him email me the info…

Postgres Database (v8.1)

Primary Goal: Give me the first row inserted every day for the past X days (any group of columns).
Secondary Goal: Give me a single row for every day for the past X days.

Table Makeup:
Name             | Type
---------------- | --------------------
table_id         | Integer (actually, a “serial” type in Postgres), unique to each row
table_timestamp  | Timestamp without time zone (ex. ‘YYYY-MM-DD HH-mm-ss’), not unique
table_column_a   | Integer, not unique
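
For anyone who wants to try this locally, the table described above could be declared roughly as follows. This is only a sketch: the schema and table names are borrowed from his query, and the PRIMARY KEY and NOT NULL constraints are my assumptions (he only said the ID is unique).

```sql
-- Sketch of the described table; the constraints are assumptions.
-- Assumes the schema my_schema already exists.
CREATE TABLE my_schema.my_table (
    table_id        serial PRIMARY KEY,                   -- auto-increment, unique per row
    table_timestamp timestamp without time zone NOT NULL, -- insert time, not unique
    table_column_a  integer                               -- payload column, not unique
);
```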

Here is a query that will give me the unique dates, but no ID.

SELECT EXTRACT('day' FROM table_timestamp) AS d,
       EXTRACT('month' FROM table_timestamp) AS m,
       EXTRACT('year' FROM table_timestamp) AS y
       -- the hour/minute parts, table_id and table_column_a cannot be selected
       -- here because they are not part of the GROUP BY
FROM my_schema.my_table t
GROUP BY y, m, d
ORDER BY y, m, d
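
One way to get the ID back is to group by calendar day in a subquery, take the lowest table_id per day, and join that to the table. A minimal sketch, assuming table_id grows in insertion order (which a serial column normally does) and using 365 as the value of X; the aliases t, f and first_id are just illustrative. The CAST and MIN parts are standard SQL, but the date arithmetic (CURRENT_DATE - 365) is written the Postgres way and may need adjusting on other databases.

```sql
-- First row inserted on each day, for roughly the past 365 days.
SELECT t.table_id, t.table_timestamp, t.table_column_a
FROM my_schema.my_table t
JOIN (
    -- one row per calendar day: the smallest ID inserted that day
    SELECT MIN(table_id) AS first_id
    FROM my_schema.my_table
    WHERE table_timestamp >= CURRENT_DATE - 365
    GROUP BY CAST(table_timestamp AS date)
) f ON f.first_id = t.table_id
ORDER BY t.table_timestamp;
```

Days with no inserts simply do not appear in the result.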

More from him…

Data is inserted as it is generated, which depends on current conditions, operator instructions, etc… There is no way to know how many rows will be inserted each day.

I will have to do this across all tables in the database (around 1,000 tables), so I want to keep any subqueries to a minimum (as well as triggers).
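
If he is willing to give up portability, Postgres has a DISTINCT ON clause that avoids the subquery entirely. The following is a sketch under the same assumptions as before (table_id increases with insertion order, X is 365). DISTINCT ON is Postgres-specific and will not run on other databases.

```sql
-- Keep only the first row of each calendar day; the ORDER BY decides
-- which row counts as "first" (here, the lowest table_id of the day).
SELECT DISTINCT ON (CAST(table_timestamp AS date))
       table_id, table_timestamp, table_column_a
FROM my_schema.my_table
WHERE table_timestamp >= CURRENT_DATE - 365
ORDER BY CAST(table_timestamp AS date), table_id;

-- Optional: an index on the timestamp may help the date-range filter.
-- The index name is hypothetical.
CREATE INDEX my_table_timestamp_idx ON my_schema.my_table (table_timestamp);
```

Since the same statement has to be repeated for about 1,000 tables, it is probably easiest to generate it from a template with the table name substituted in.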