Hierarchical PostgreSQL Query

I’ve long been a MySQL addict, but I was told that PostgreSQL is a better choice for working with hierarchical queries - a very important feature for an animals website I’m working on. So I downloaded and installed PostgreSQL 9.5, which is running alongside MySQL right now.

Can anyone tell me how to write a hierarchical query, given the data in my table?

This is the schema from my table (as copied from pgAdmin III):

CREATE TABLE public.gz_life_mammals (
 id integer NOT NULL,
 taxon text NOT NULL,
 parent text NOT NULL,
 parent_id smallint NOT NULL,
 slug text,
 name_common text,
 plural text,
 extinct smallint NOT NULL,
 rank smallint NOT NULL,
 key smallint NOT NULL,
 CONSTRAINT "Primary Key" PRIMARY KEY (id)
);

I’m just figuring out how to write PostgreSQL queries, but this is my first successful attempt…

$dbh = pg_connect("host=localhost dbname=geozoo user=postgres");

if (!$dbh) {
die("Error in connection: " . pg_last_error());
}
$sql = "SELECT * FROM gz_life_mammals";

$result = pg_query($dbh, $sql);

while ($row = pg_fetch_array($result)) {
 echo "ID: " . $row[0] . " | ";
 echo "Taxon: " . $row[1] . " | ";
 echo "ParentID: " . $row[3] . "<br>";
}

// free memory
pg_free_result($result);

// close connection
pg_close($dbh);

The most important table fields for this exercise are the first four (id, taxon, parent and parent_id. The data looks like this:

ID | TAXON         | PARENT    | PARENT_ID
1  | Mammalia      | Chordata  | 1
2  | Carnivora     | Mammalia  | 2
3  | Felidae       | Carnivora | 3
4  | Panthera    | Felidae   | 4
5  | Panthera-leo   | Panthera | 5
6  | Panthera-tigris | Panthera | 5

Where the last two rows represent the lion (Panthera leo) and tiger (Panthera tigris). My ultimate goal is to be able to display both names and numbers (counts) of everything in the hierarchy. So if I navigate to MySite/panthera-leo, it displays “Class: Mammalia, Order: Carnivora.” And if I navigate to MySite/mammalia, it might display something like this:

Orders: 19
Families: 58
Genera: 688
Species: 8,034

The resources I’ve checked out so far have kind of thrown me for a loop, so maybe I should start by asking a simple question: Can this be done with one query, or will I need separate queries for extracting names and counting rows?

In fact, I’ve already figured out a way to use multiple “stairstep” queries to display grandparents and children with MySQL, so I’d really like to focus on counts. Can anyone show me how to write a PostgreSQL query that counts rows on various taxonomic levels?

I tried one script which I asked about @ Problem with PostgreSQL Query with Function (T_FUNCTION error) but it was too complex for me to figure out. I’d like to try and find either a simpler script or a different approach. Thanks.

In order to query a tree or hierarchy of data, you need either to build out the hierarchy model yourself or the database supports some sort of tree data structure. I found a pretty need SO question and a great answer on the subject.

http://stackoverflow.com/questions/4048151/what-are-the-options-for-storing-hierarchical-data-in-a-relational-database

That doesn’t answer your question directly, but it might get your thinking on the right track.

Scott

Basic google search revealed this.

Don’t believe everything you hear. You can get my just fine using MySQL and one of the after mentioned pattens by s_molinari. Not to mention if you will be using a shared host it is going to be difficult to find a provider with postgress. I think the disadvantages outweigh the advantages of using PostgreSQL over MySQL.

Wow, thanks for all the tips; very revealing. It is very revealing that The Catalogue of Life uses MySQL with their database, which has over one million rows. I haven’t waded through all their PHP code, but judging from their database setup, it looks like they don’t even use recursive queries.

I’m tempted to emulate them and just slap two or three extra columns (e.g. ‘Family’, ‘Order’) in my tables, so I can just count all the species that have the value ‘Carnivora’ in the field Order, for example.

This topic was automatically closed 91 days after the last reply. New replies are no longer allowed.