Help selecting database system

I need some advise about selecting a database system.

Currently, I have a client that has an existing system for keeping track of, among other things, county land records. The bulk of the data in this database is static data that is reloaded once a day early in the morning. It is used throughout the day for users to search for property records, subdivision covenants, etc. Right now, there is a one-to-one relationship between the tables in the database and the files in the county’s AS/400. The data is in need of normalizing, but that will require a considerable time investment because the existing search systems would have to be rewritten to account for the new database structure.

The system is currently using PostgreSQL.

My question is, what would be the best database system to use for the system. The client is looking for suggestions about how to speed up the site.

As an example I have tables:

CREATE TABLE ilgf (
    "type" character(2),
    book integer,
    page integer,
    subpage character(2),
    o_or_e character varying(13),
    seq integer,
    name character varying(32),
    "month" integer,
    "day" integer,
    "year" integer,
    index_type integer
);
CREATE TABLE ilf (
    "type" character(2),
    book integer,
    page integer,
    subpage character(1),
    "month" integer,
    "day" integer,
    "year" integer,
    "time" integer,
    nature character varying(20),
    description character varying(30),
    receipt_number character varying(9),
    amount numeric(30,2),
    "operator" character(3),
    last_file_main bigint,
    grantor_count integer,
    grantee_count integer,
    subdivision_lots integer,
    metes_bounds integer,
    daily_flag character(1),
    docket_number integer,
    parcel_number character varying(13),
    parcel_description character varying(30),
    acreage numeric(30,3),
    ending_page integer,
    ending_subpage character(1),
    ucc_number integer,
    instrument_number bigint,
    doc_type character(2),
    data_1 character varying(30),
    data_2 character varying(30)
);

Table ilgf has ~1.7 million records, table ilf has ~600,000 records. The tables are related to each other on the ‘book’ and ‘page’ fields.

One of the queries that is run for the subdivision covenant search is:

SELECT DISTINCT ilgf.name
   FROM ilgf, ilf
  WHERE ilgf.book = ilf.book AND ilgf.page = ilf.page AND (ilf.doc_type = 14 OR ilf.doc_type = 15 OR ilf.doc_type = 28 OR ilf.doc_type = 29 OR ilf.doc_type = 93 OR ilf.doc_type = 98)
  ORDER BY ilgf.name;

This query is used to retrieve unique property names to use as the source for an auto complete textbox. This query takes ~30 seconds to execute in the current system.

Also, to give some scope to the size of the database, when the data is reloaded everynight, each of the ~15 involved tables are truncated and their data reloaded from the AS/400 files to pick up changes and new entries. This process takes ~6 hours to complete.

So, what I need/would like is any input, or tips about either a database system that would be better suited to serve this static data, or some performance tuning types for the current database system.

I know that this is a vague question, and what constitutes the “best” DBMS is very subjective. I’m really just looking for opinions and suggestions that I can use as a jumping off point for more research. It may be that PostgreSQL is the best option, but the server needs to be performance tuned. I need some discussion about how to approach this problem to get me headed in the right direction.

Thanks for the help, and thanks for reading my long winded post :slight_smile:

do those tables have any indexes defined on them?

but surely not in ~both~ tables, since those are the join columns!!

i.e. if there are 5 rows with the same book/page in one table, and 6 rows with the same book/page in the other table, then your join will return 30 rows!!

to get back to your original question…

My question is, what would be the best database system to use for the system. The client is looking for suggestions about how to speed up the site.
speeding up the site will ~not~ happen if you swap out one database management system for another, without addressing the underlying indexing issues

Yes, there are indexes.

For ilgf:

CREATE INDEX ilgf_index_book ON ilgf USING btree (book);
CREATE INDEX ilgf_index_date ON ilgf USING btree (datify("year", "month", "day"));
CREATE INDEX ilgf_index_name ON ilgf USING btree (name);
CREATE INDEX ilgf_index_page ON ilgf USING btree (page);
CREATE INDEX ilgf_index_seq ON ilgf USING btree (seq);
CREATE INDEX ilgf_index_subpage ON ilgf USING btree (subpage);
CREATE INDEX ilgf_index_type ON ilgf USING btree ("type");
CREATE INDEX "ilgf_index_type_book_page_subpage_oORe" ON ilgf USING btree ("type", book, page, subpage, o_or_e);

For ilf:

CREATE INDEX ilf_index_book ON ilf USING btree (book);
CREATE INDEX ilf_index_day ON ilf USING btree ("day");
CREATE INDEX ilf_index_instrument_number ON ilf USING btree (instrument_number);
CREATE INDEX ilf_index_month ON ilf USING btree ("month");
CREATE INDEX ilf_index_page ON ilf USING btree (page);
CREATE INDEX ilf_index_subpage ON ilf USING btree (subpage);
CREATE INDEX ilf_index_type ON ilf USING btree ("type");
CREATE INDEX ilf_index_type_book_page_subpage ON ilf USING btree ("type", book, page, subpage);
CREATE INDEX ilf_index_year ON ilf USING btree ("year");

The reason neither of the tables have Primary or Foreign keys is because, and this may be an acceptable practice, the Primary/Foreign keys would have to encompass, in the ilgf table anyway, every field in the table. There can be multiple rows with the same book/page combination, etc. It takes every field to make up a unique row. Likewise, there can be multiple rows in the ilgf table that point to the same row in the ilf table.

For example ilgf rows:

<row>
 <column name="type">T </column>
 <column name="book">2226</column>
 <column name="page">934</column>
 <column name="subpage" null="null"></column>
 <column name="o_or_e">O</column>
 <column name="seq">31</column>
 <column name="name">BATES KEITH A ET AL</column>
 <column name="month">8</column>
 <column name="day">13</column>
 <column name="year">2007</column>
 <column name="index_type">2</column>
</row>
<row>
 <column name="type">T </column>
 <column name="book">2226</column>
 <column name="page">934</column>
 <column name="subpage" null="null"></column>
 <column name="o_or_e">O</column>
 <column name="seq">32</column>
 <column name="name">BATES FREDA ET AL</column>
 <column name="month">8</column>
 <column name="day">13</column>
 <column name="year">2007</column>
 <column name="index_type">2</column>
</row>

Both point to ilf row

<row>
 <column name="type">T </column>
 <column name="book">2226</column>
 <column name="page">934</column>
 <column name="subpage" null="null"></column>
 <column name="month">8</column>
 <column name="day">13</column>
 <column name="year">2007</column>
 <column name="time">1540</column>
 <column name="nature">AMENDED PROTECTIVE C</column>
 <column name="description" null="null"></column>
 <column name="receipt_number" null="null"></column>
 <column name="amount">0.00</column>
 <column name="operator">DAD</column>
 <column name="last_file_main">20070813165745</column>
 <column name="grantor_count">95</column>
 <column name="grantee_count">2</column>
 <column name="subdivision_lots">62</column>
 <column name="metes_bounds">0</column>
 <column name="daily_flag" null="null"></column>
 <column name="docket_number">0</column>
 <column name="parcel_number" null="null"></column>
 <column name="parcel_description" null="null"></column>
 <column name="acreage">0.000</column>
 <column name="ending_page">974</column>
 <column name="ending_subpage" null="null"></column>
 <column name="ucc_number">0</column>
 <column name="instrument_number">542141</column>
 <column name="doc_type">15</column>
 <column name="data_1">D</column>
 <column name="data_2" null="null"></column>
</row>

seems like there should also be PRIMARY and FOREIGN KEYs

the index on book (in both tables) should be on (book,page), and i wonder if the index on page is needed

there should probably be an index on doc_type