From Static Content to Database-Driven Content

So I have been building an “Add a Comment (to an article)” module, but decided to put that on hold temporarily to tackle a new problem.

Up until now, all articles on my website are separate files, e.g.

  • “benefits_of_using_quickbooks.php”
  • “incorporating_your_startup.php”
  • “contractors_a_flexible_alternative.php”

I would like to put these articles in a database, but am unsure of where to begin?!

Some questions:
1.) Can I store my already marked-up articles in HTML into a database field and just echo it?

2.) Do I just create a “place-holder” page and then append some query string, e.g.

w w w.MyWebsite.com/articles/index.php?id=1234

3.) I would think you’d want to have an “Article ID” but I also want “pretty” URLs, so how does that relate to Question #2?

That should get me started…

Thanks,

Debbie

  1. Yes you can, however for performance reasons (it will take a few microseconds to load the HTML) it would be quicker to store a reference to a cached file with the html and including that instead.

2/3) Nailed it :slight_smile: However you can also use htaccess mod_rewrite to make the urls’ prettier.

A breif code example would be:


// index.php?id=10101

// Query database for the ID, put results into $result

// Query would look something like: SELECT * FROM `ww_pages` WHERE `page_id` = ID;

if(isset($result)){ // lets assume if nothing is in the database, nothing is returned.
include($results['html_cached']);
}else{
// 404 error
}


# .htaccess

RewriteEngine on
RewriteRule ^page/([^/\\.]+)/?$ index.php?id=$1 [L]

I am not understanding what you mean about the caching part.

My articles are pretty formatted/marked up, yet they are simple HTML.

2/3) Nailed it :slight_smile: However you can also use htaccess mod_rewrite to make the urls’ prettier.

A brief code example would be:


// index.php?id=10101

// Query database for the ID, put results into $result
if(isset($result)){ // lets assume if nothing is in the database, nothing is returned.
include($results['html_cached']);
}else{
// 404 error
}

What is $result??

What is $results[‘html_cached’]??


# .htaccess

RewriteEngine on
RewriteRule ^page/([^/\\.]+)/?$ index.php?id=$1 [L]

Can you explain what this says in English and how it works?

Debbie

One strategy used is to store the article as html in your database, so then it can be pulled back for editing, in say a WYSIWYG editor.

When an ‘edit and save’ cycle is performed, you rewrite a straight html file probably with a common header and footer surrounding the article body itself.

This would be termed the cached copy. It is pre-prepared and ready to be served - without necessarily going to the database to get content.

eg

header.html


<html><body><menu></menu>

footer.html


</body></html>

So when an article, lets say articleID=23 is saved, you [fphp]file_get_contents/fphp the header, get the article body itself - perhaps along with storing it in your database, and [fphp]file_get_contents/fphp the footer and store them all using some kind of file write function as:

index.php (pseudo code)


grab incoming $article_text
grab incoming $article_title
store $article_text in the database

$header = file_get_contents('header.html')
$footer = file_get_contents('footer.html')

write to $article_title . '.html' the string:
$header . £article_text . $footer

Clearly index.php will be more complicated than that, but essentially this is what it has to do.

my-article-today.html


<html><body><menu></menu>
<h1>My Article Today</h1><p>blah etc</p>
</body></html>

So you can go on just picking up the html file from the public side of your website, which will be slightly quicker than hitting your database. Sometimes this strategy is termed ‘baking’ in that it presents a pre-baked loaf - all ready to go.

It may not be the best solution all the time, for example what if you need to change the menu? You will have to re-bake all of your files…

The idea of mod_rewrite is best explained here:

mod_rewrite: A Beginner’s Guide to URL Rewriting Article » SitePoint

or here:

Apache’s mod_rewrite

So you store the main article as HTML in your database, but then export that out, combine it with Header and Footer HTML, and then “bake” a finished article file called <<ArticleTitle>>.html??

Is that correct?

Debbie

You already have $article_text and $article_title as variables.

You perform two operations on them, store in database, and then when successful, go on to bake in a html file with surrounding code. No need to get it from the database again, you already have the text in index.php

This is a simplified answer to your question “what is meant by creating a ‘cache’?”.

Think of it like this if you like:


$header = "<html><body><menu></menu>"; 
$footer = "</body></html>";
$article_title = "<h1>My Story Today</h1>";
$article_text = "<p>blah etc</p>";
$file_name = 'my-story-today.html'; // ***
// store text and title in your database:
// of course you have to escape the values, but essentially:
"insert into articles (title, text) values ('$article_title', '$article_text');"
// if that operation returns positive ...

file_put_contents($file_name, $header . '<h1>' .$article_title .'</h1>' 
                        . $article_text . $footer)

*** this is the trick turn My Story Today into “my-story-today” (sometimes termed “slugging” - which you can search this forum for) which can be used to a) create the predictable and what must be a unique html file name and b) become the SEO friendly link - so to the second part of your question, how you turn:

/articles/my-story-today

into :

/articles/index.php?articleID=23

OR EVEN

/articles/index.php?article=my-story-today (why do you need an ID at all? use the slug as the unique primary key)

There are lots of other issues which you will have to address, what happens if the article is edited and the title changes? How will I enforce uniqueness in the file name? (or slug)

Accept that this may not be the best solution for you, but it is a strategy you could pursue. At least you have a couple of terms to search the wider web for now.

BTW there is absolutely nothing wrong with just having a public index.php file which does this:
index.php?articleID=23


<?php

$id = (int)$_GET['articleID'];

// get the title and the text from the database
"select title, text from articles where articleID = $id"

echo $header;
echo '<h1>' . $row['title']. '</h1>';
echo $row['text'];
echo $footer;

Thats what most seem to do, and its probably the best way to start - then add caching if you are inclined/feel the need. This way if something changes in your menu in the header file, your website is updated immediately.

Cups,

How much does accessing an article directly from my database (versus article caching) slow things down?

Debbie

I’m confused about the order of things…

I thought you queried the article from the database by using a URL like:

w w w.MyWebsite.com/index.php?articleID=123

and then you used Mod_Rewrite to create a virtual URL that reads…

w w w.MyWebsite.com/all-about-creating-dynamic-content.php

Debbie

Not enough to worry about unless you are serving millions of pages, or you are doing multiple sql requests per page (grabbing and building complex menu trees for example) or you have some evidence that this is causing undue stress on your server.

This type of caching (to file system) is only one of many caching techniques.

[google]caching strategies in PHP[/google]

I’m confused about the order of things…

I thought you queried the article from the database by using a URL like:

w w w.MyWebsite.com/index.php?articleID=123

and then you used Mod_Rewrite to create a virtual URL that reads…

w w w.MyWebsite.com/all-about-creating-dynamic-content.php

Try this explanation:

w w w.MyWebsite.com/all-about-creating-dynamic-content

is what you use in your links, when a user clicks on a link, and sees that in their address bar …

Behind the scenes, Apache then re-routes the request to

w w w.MyWebsite.com/index.php?articleID=123

which then does the business of building the page, but essentially,

w w w.MyWebsite.com/all-about-creating-dynamic-content

remains in the web browser address bar for the user (and search engines)

So while I appreciate your suggestion, it sounds like I need to K.I.S.S. and just access content from my database and save the caching for when I give Amazon.com a run for the money?! :smiley:

Try this explanation:

w w w.MyWebsite.com/all-about-creating-dynamic-content

is what you use in your links, when a user clicks on a link, and sees that in their address bar …

Behind the scenes, Apache then re-routes the request to

w w w.MyWebsite.com/index.php?articleID=123

which then does the business of building the page, but essentially,

w w w.MyWebsite.com/all-about-creating-dynamic-content

remains in the web browser address bar for the user (and search engines)

So, dumb question,…

In the past I created a physical file called “all-about-creating-dynamic-content.php”. And when I had to reference that file for a link or whatever, I just went into NetBeans (or my file system), copied the file name, and pasted it into a link, include, whatever.

Because the file name was self-documenting, it was very easy to do.

Now if I no longer have physical files, and if every article is ultimately referenced by id=1, id=2, …, id=62194802, then it’s going to be a real pain remembering that “all-about-creating-dynamic-content.php” is id=429876210.

Follow me?

I think this may be a large part of my confusion?! :-/

Lastly, to your earlier point…

Behind the scenes, Apache then re-routes the request to

w w w.MyWebsite.com/index.php?articleID=123

which then does the business of building the page, but essentially,

w w w.MyWebsite.com/all-about-creating-dynamic-content

remains in the web browser address bar for the user (and search engines)

So how hard is it to create a mod_rewrite that parses things up like that? :confused:

Ideally, I would like to keep the same “virtual structure” as the former “physical structure” that I had. Meaning, my Home Page would be:

[INDENT]w w w.MyWebsite.com
[/INDENT]
And any articles would have a URL of…

[INDENT]w w w.MyWebsite.com/articles/some-article-name-goes-here.php
[/INDENT]
…even though they technically would be…

[INDENT]w w w.MyWebsite.com/articles/page.php?id=1234.php
[/INDENT]

Thanks,

Debbie

I accept you want to KISS, but it is interesting that your existing solution is in fact a form of cacheing.

Now you know.

mod_rewrite, I am sorry, I cannot offer a better explanation than those I linked to, because those are the references I use myself.

You have urls like this:

w w w.MyWebsite.com/all-about-creating-dynamic-content.php

I am advocating urls like this:

w w w.MyWebsite.com/all-about-creating-dynamic-content (no php)

I also advocate a database whose setup would be something on the lines of


articles
======
slug varchar(80)
title varchar(80)
text text
category varchar(50)
last_update datetimestamp

And the entry for

w w w.MyWebsite.com/all-about-creating-dynamic-content

Would be:


articles
======
"all-about-creating-dynamic-content"
"All about creating Dynamic Content"
"blah de blah, de blarrrr"
"Technology"
2011-07-06 12:15:00

I would not advocate the use of false keys like id = 123 because:

a) you cannot understand what it happening in your db when looking at all these numeric keys
b) you dont need them
c) its easier to then use Apaches mod_rewrite as some kind of page-controller without all the messing around of un-mapping the ids to titles, as you have quite rightly pointed out

BUT there are drawbacks and I have mentioned them higher up on this thread. If you have further questions about them, then fire away.

Why do you drop the .php??

I also advocate a database whose setup would be something on the lines of


articles
======
slug varchar(80)
title varchar(80)
text text
category varchar(50)
last_update datetimestamp

And the entry for

w w w.MyWebsite.com/all-about-creating-dynamic-content

Would be:


articles
======
"all-about-creating-dynamic-content"
"All about creating Dynamic Content"
"blah de blah, de blarrrr"
"Technology"
2011-07-06 12:15:00

Well, I have that, but having a database background, I still through in an “id” as the PK, and will just put a unique index on “urlTitle” (aka “slug”).

I would not advocate the use of false keys like id = 123 because:

a) you cannot understand what it happening in your db when looking at all these numeric keys
b) you dont need them
c) its easier to then use Apaches mod_rewrite as some kind of page-controller without all the messing around of un-mapping the ids to titles, as you have quite rightly pointed out

Okay.

BUT there are drawbacks and I have mentioned them higher up on this thread. If you have further questions about them, then fire away.

I’m sorry, what drawbacks are you referring to?


Here is what I have so far…

I created a page called “article.php” that is my article template, and it is located in my WEB_ROOT.

I also still have a page called “article_index.php” which looks similar to the home page on Ars Technica.

So that means I have…

www. MyWebsite.com/article_index.php

And when I click on one of the links I get…

www. MyWebsite.com/article.php?id=1

I guess I no longer need a directory called “articles” to store my physical article files, right?

Does that mean that I just dump “article_index.php” in the WEB_ROOT? (It used to be in “/articles”

Also, I’m sorry, but how do I get away from “id=1”?

Is it as simple as changing my hyperlink to:


	<a href="<?php echo WEB_ROOT ?>article.php?id='mail-meters-are-a-good-investment'">Mail Meters are a Good Investment</a>

Debbie

Drop the .php as it is pointless.

Keep Ids if you want, the cost is miniscule.

Lose the single quotes. You just said that article.php is at your webroot, so just go to the home directory of the website with ‘/’


<a href="/article.php?id=mail-meters-are-a-good-investment">Mail Meters are a Good Investment</a>

A problem you will hit if you let others contribute to your site or you want your Titles to be constructed with proper English is:

Mail Meters, really a Good Investment?

But some chars are illegal or get turned into the likes of %20 as urls, so you have to be able to reliably slugify Mail Meters, really a Good Investment? to mail-meters-really-a-good-investment

What about articles with the same title? That is a likely scenario so you would also need a published date or something. To make thing less complicated I would always include the ID and place the title in the URL merely as filler data.

/article.php?id=89765&title=doesn’t matter

Then use a rewrite like:

/article/{title}/{id}

This all really a mute point considering everything else that has to be done before this even really becomes relevant.

How so?! How can you display a .php page without a .php file extension??

Keep Ids if you want, the cost is miniscule.

I’ll need it for my Comments module…

Lose the single quotes. You just said that article.php is at your webroot, so just go to the home directory of the website with ‘/’


<a href="/article.php?id=mail-meters-are-a-good-investment">Mail Meters are a Good Investment</a>

Guess I could. I just use my constant to make it easier and safer to switch from Dev to Production.

A problem you will hit if you let others contribute to your site or you want your Titles to be constructed with proper English is:

Mail Meters, really a Good Investment?

But some chars are illegal or get turned into the likes of %20 as urls, so you have to be able to reliably slugify Mail Meters, really a Good Investment? to mail-meters-really-a-good-investment

No other contributors. And I’m still a pretty manual kind of gal.

I will be composing the articles myself and manually typing in what I think is the best “urlTitle” >< “Title”.

Debbie

Cups,

You didn’t answer my key questions…

Debbie

I agree with Oddz. (I think a lot of online newspapers concatenate the “Article Title” with the “Article ID” to guarantee uniqueness.

To make thing less complicated I would always include the ID and place the title in the URL merely as filler data.

/article.php?id=89765&title=doesn’t matter

Then use a rewrite like:

/article/{title}/{id}

This all really a mute point considering everything else that has to be done before this even really becomes relevant.

“Moot” point, Oddz. :wink:

Yes, you just said what I was saying before I got to what you were saying?! :slight_smile:

Debbie

Many newspapers use the date as a unique identifier.

/2011/07/my-article

Anyhow, seeing as moot points are being raised:

a) if you are the only one submitting articles, you can easily ensure uniqueness yourself
b) I already articulated that in your case you should use IDs in my reply at #7 above, see how you get on with it, it will work fine.

My advocacy of using slugs is borne from designing multi-user systems where I find ridiculous the constant unmapping of id=1234 to ‘My Title’, and my file cacheing works incredibly efficiently as a result.

I have a site where there are literally 1000s of cached pages and I go directly from SEO friendly URLS to cached page content.

My POV served only to illustrate a possible relationship between caching, mod_rewrite and dynamic content - and at least I aired my points.

The best book I read on all the different types of cacheing was Schlossnagle’s Advanced PHP Programming - but it appears pretty old now.