2 queries vs 1 + processing

So it’s the old question all over again.

Which is more efficient; running 2 queries to select my data, or running 1 and using PHP to process the results.

Layout 1:
two tables, identical structure; i’m using generalized names.
fk1 int(5) (FOREIGN KEY)
fk2 int(5) (FOREIGN KEY)
data varchar(50)


$res = $sql->query("SELECT data FROM table1 WHERE fk1 = $somevalue AND fk2 = $someothervalue");
$data1 = $res->fetch_all();
$res = $sql->query("SELECT data FROM table2 WHERE fk1 = $somevalue AND fk2 = $someothervalue");
$data2 = $res->fetch_all();

Layout 2:
one table;
fk1 int(5) (FOREIGN KEY)
fk2 int(5) (FOREIGN KEY)
data varchar(50)
type tinyint(1)


$res = $sql->query("SELECT data,type FROM table2 WHERE fk1 = $somevalue AND fk2 = $someothervalue");
while($row = $res->fetch_array()) {
    $data[$row['type']][] = $row['data'];
}

(Figured it’d be easier to just lump it in a single array)

I think the fastest way to test would be actually running both scripts say, a thousand times, and return the difference between time() before and after for both, then compare.

Edit: this of course only counts if by ‘efficient’ you mean fast.

Isn’t it always the same old answer though? Well, it depends … :wink:

how many records will potentially be in each table
whether you have correctly used indexes on these tables
whether you are mostly reading or writing to these tables
whether you have also need to access ‘type’ and ‘data’ independently of each other in other places in your application
if a database is the correct data-holder for every table (vs say, a cached PHP array)

There’s actually a third FK that i forgot about, btw.

In order:
1: How many different video and audio file formats are there in the world?
2: Should be.
3: 99.99999% Reading.
4&5: It’s a matrix. fk1 x fk2 x fk3 x type = array(data); (or fk1 x fk2 x fk3 = 2 arrays(data))

Let me guess then: 3 tables:
file_type (audio, video, both)
OS (win, mac, *nix)
file_data (e.g. avi, mjpeg)

Extension (mp4, ogg, etc etc etc etc etc etc...) x Download/Streaming (Bit value 0/1) x (wouldnt be a table, obviously) Output Format (Flash,WindowsMedia, etc)

DataOut1: Metafile format possibilities
DataOut2: Set of possible services to output from. (Apache, WM, LightHttp, etc)

DataOut1: So I pick Flash and your matrix shows me which extensions can be used?

Is that it?

So your question is about the appropriate use of a relational lookup table (or tables)?

So where does Apache, LightHttp etc come in to play?

It’s a CDN system;

For a given, specific, file X;
X has an extension, and is either Download or Streaming. (This information is in a database table already, and is being used by the CDN’s inbuilt software).

Based on the Extension, a set of Output Formats are available.

Based on the extension, Download/Streaming type, and the chosen Output Format, a set of metafile outputs are possible.
Based on the extension, Download/Streaming type, and the chosen Output Format, a set of Delivery Services are possible (This set will later be array-intersected with the array of delivery services available on a given CDN)

As you rarely write to it, what if you baked the whole thing in one table?

ext | download | format | metafile_option

mp4, 1, Flash, optionA
mp4, 0, Flash, optionA
mp4, 1, Flash, optionB
mp4, 0, Flash, optionC

I dont believe I would use PHP for a CDN system myself, as the language is not optimal for such a use. If you want the content delivery to be efficient your much better of writing it in C++.

For your question, it depends how advanced you will make the system.

For example, a file can both be available as a stream or download depending on the context. I.e. for example if I want I can stream the file directly to view it, or download it to view it later.

If this should be possible then you would need to separate the download/stream part so it also is tied into the specific articles, allowing only one of the options or both depending on the article in question.

For the tables, you should be able to join those in one query. The main table would be the one with the extensions, then two more tables, one with the download option and one with the metafile options. Then you just populate the download/metafile tables with the information required per extension.

You can take a look on this page if you want to read up on how to join multiple tables together in SQL. MySQL :: MySQL 5.0 Reference Manual :: 12.2.8.1 JOIN Syntax

Well I have to hold my hands up here as I have no experience of CDN systems - I’m not quite even sure what the “metafile options” are in this case either.

What I was imagining eventually was an sqlite file based table which can be transferred and moved about between servers as a plain file, then to some degree ‘unbaking’ the table to free up only those options which applied to that particular server (or node on the CDN) and MAYBE keeping that section in memory as an sqlite heap table.

I am not sure how feasible that would be, or even if it triggers any ideas - but as I now understand the original question a bit better, I think Kokos’ idea might stand as the best suggestion TBH.

1 query will be faster, and I agree with Cups’ solution (to a degree).

Something else to consider is not to use file-extension to determine mime-type. Parse the file and read the mime-type. Extensions can be deceiving.

I would do something like this (loose scaffolding of an architecture, but gives you an idea of the data you’re dealing with):

List of Mime Types:

mime_type

mp4
flv
ogg

List of Delivery Services Available:

service | method | protocol | required_software

FlashStream | stream | RTMP | Flash 10
HTTPDownload | download | HTTP | Browser
QuicktimeStream | stream | RTSP | Quicktime Player
HTML5Stream | stream | HTTP | HTML-5 Enabled Browser

List of Services Available for Each Mime Type:

mime_type | service

flv | FlashStream
mp4 | FlashStream
mp4 | QuicktimeStream
mp4 | HTTPDownload
ogg | HTML5Stream