SitePoint Sponsor

User Tag List

Results 1 to 5 of 5
  1. #1
    _ silver trophy ses5909's Avatar
    Join Date
    Jul 2003
    Location
    NoVa
    Posts
    5,466
    Mentioned
    1 Post(s)
    Tagged
    1 Thread(s)

    Working on a Database Design - Input Wanted!

    I'm working on redesigning a database for an application which allows users to get estimates for different insurance plans by asking plan-specific questions. The user is able to save and retrieve past estimates.

    In the current system, the database stores the user's input in a seperate table for each plan. So if there are 25 plans, there are 25 tables. Some of the columns will be the same for all of the plans, but alot are different. You can see an example of the input tables below.


    Code:
    input_plan10				input_plan11
    -------------------			-------------------
    estimate_id				estimate_id
    column_1				column_1
    column_2				column_4
    column_3				column_7
    column_4				column_8
    .....					.....
    column_40				column_62
    Once the calculations are performed, the results of the estimate are shown to the user. This output is saved in a series of output tables so when the user retrieves the estimate, all of the calculations do not need to be performed again. The model is the same as that of the input tables; if there are 25 plans, there are 25 tables. Some of the columns will be the same for all of the plans but there are differences. The output tables are structured like the input tables, but with different columns.

    The current database is informix and the new db is SQL Server 2008. We get about 40k estimates run per year and we maintain past year's data. The average number of columns in each plan table for both inputs and outputs is 40.


    I know there has to be a better way to do this and wanted to get some opinions here. In our new design, we will have a table that holds all of the unique questions for all plans and then a question cross reference table that has the question id, plan id, and some other information needed by the app.


    We thought of possibly having one (very large) table to store the inputs for all of the plans which would be along the lines of what I believe is an EAV approach:
    Code:
    InputTable
    -------------
    EstimateID
    PlanID
    FieldKey
    Value
    So the data would look like:
    Code:
    EstimateID 	PlanID		FieldKey			Value
    -----------------------------------------------------------------------------
    1		10		Old Column Name			Column Value
    2		11		Old Column Name			Column Value
    This approach is not normalized however. Keep in mind that with at least 40k estimates run per year, and an average of 40 questions per estimate, we're looking at 1,600,000 records to search through for one year so this approach doesn't seem great at all.


    Another approach would be instead of having the FieldKey column, have a foreign key to the question cross reference table and the user's answer as the value. We don't need to store the Plan Id here because the Plan ID is stored in the question cross reference table.

    Code:
    InputTable
    ------------
    EstimateID
    QuestionXrefID
    Value
    While this helps with the normalization, this table can still grow large and I believe will be slow to query.

    We thought of possibly having an archive table so if the estimate date is more than 6 months old, move it to an archive table because most people retrieve existing estimates witin a few weeks of creating them. So this could help reduce the size of the primary input table, thus speeding up queries.

    I don't feel like any of these approaches is ideal so if you have any recommendations, let me have them! I appreciate any and all opinions on this. Let me know if I didn't explain something clear enough.
    Sara

  2. #2
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,251
    Mentioned
    59 Post(s)
    Tagged
    3 Thread(s)
    speed of queries depends entirely on proper indexing, not on the number of rows in the table, so removing older rows to archive tables will not improve performance but will increase complexity if older data is requested

    EAV is to be avoided at all costs

    i'm not sure i understand the rationale for having a separate table for each plan -- why can't you use just one table?
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  3. #3
    _ silver trophy ses5909's Avatar
    Join Date
    Jul 2003
    Location
    NoVa
    Posts
    5,466
    Mentioned
    1 Post(s)
    Tagged
    1 Thread(s)
    I definitely don't want to use multiple tables; that is the current design. It is a coldfusion/informix app that we are redeveloping.

    So the better approach would be to just have the one table with the question cross-ref id and not worry about the archive table?

    The Primary key would be a combination on the EstimateID and QuestionXrefID. I didn't know if further indexing would help in this scenario.

    Code:
    InputTable
    ------------
    EstimateID
    QuestionXrefID
    Value
    Sara

  4. #4
    SQL Consultant gold trophysilver trophybronze trophy
    r937's Avatar
    Join Date
    Jul 2002
    Location
    Toronto, Canada
    Posts
    39,251
    Mentioned
    59 Post(s)
    Tagged
    3 Thread(s)
    this is a typical many-to-many cross-reference table, and by making the PK a composite key consisting of (EstimateID,QuestionXrefID) you are optimizing all queries which have a filter on EstimateID, i.e. find results for a specific user

    for the other direction (find user results for a specific question) you will need an additional index on QuestionXrefID
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL
    "giving out my real stuffs"

  5. #5
    _ silver trophy ses5909's Avatar
    Join Date
    Jul 2003
    Location
    NoVa
    Posts
    5,466
    Mentioned
    1 Post(s)
    Tagged
    1 Thread(s)
    Awesome. Thanks Rudy.
    Sara


Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •