Rate/Critique/improve my database indexing idea

Disclaimer: This is mostly a just-for-fun theorycrafting train of thought.

So traditionally, storage is saved through relational schema design (normalization). Theoretically, couldn't storage be saved further by keying groups of data on an underlying checksum?

I was thinking predictive analytics could identify which groups of values naturally occur together. When such a group appears, an underlying table stores the data once, with a checksum as its key, ready for when the same group appears again. On insert, the checksum is computed: if it already exists, the row just stores a reference; otherwise the data is stored in full. (When querying, you only visibly see the data, never the checksums.)

So say you have columns A-F, and the analytics finds that columns B, C, and E quite often occur with the same set of values. That value set is stored once in the hidden underlying table under its checksum, and the main table just stores the checksum in place of those three columns.
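The idea above could be sketched roughly like this (a toy in-memory version, not a real database engine; the column names, the `BCE_ref` field, and the dict-as-hidden-table are all assumptions for illustration):

```python
import hashlib

# "Hidden" underlying table: checksum -> (B, C, E) value group
group_store = {}

def checksum(values):
    # SHA-256 over a canonical encoding of the value group
    data = "\x1f".join(map(str, values)).encode()
    return hashlib.sha256(data).hexdigest()

def insert(row):
    """row: dict with columns A-F. Returns the compressed row as stored."""
    group = (row["B"], row["C"], row["E"])
    key = checksum(group)
    existing = group_store.get(key)
    if existing is not None and existing != group:
        # Guard against the theoretical clash by verifying the full values
        raise ValueError("checksum collision")
    group_store[key] = group
    stored = {k: v for k, v in row.items() if k not in ("B", "C", "E")}
    stored["BCE_ref"] = key  # main table stores only the reference
    return stored

def query(stored):
    """Expand a stored row back into the full row the user sees."""
    b, c, e = group_store[stored["BCE_ref"]]
    full = {k: v for k, v in stored.items() if k != "BCE_ref"}
    full.update({"B": b, "C": c, "E": e})
    return full
```

Insert two rows sharing the same B/C/E values and the hidden table holds that group only once, while `query` reconstructs the original rows.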

Have I reinvented the wheel?
Does it already exist?
Is it silly because checksums can theoretically clash? :slight_smile:
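On the clash question: in principle yes, but with a cryptographic hash the odds are negligible. A rough birthday-bound estimate (assuming a 256-bit hash such as SHA-256, and that the system verifies the full values on a match anyway, as a belt-and-braces measure):

```python
# Birthday-bound approximation: p ~ n^2 / 2^(bits+1)
# for n distinct value groups hashed into a bits-bit space.
def collision_probability(n, bits=256):
    return n * n / 2 ** (bits + 1)

# Even a trillion distinct stored groups gives a vanishingly small chance:
p = collision_probability(10**12)  # on the order of 1e-54
```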

http://wiki.c2.com/?PrematureOptimization

https://shreevatsa.wordpress.com/2008/05/16/premature-optimization-is-the-root-of-all-evil/

I shall bear this in mind for general coding :slight_smile:
