Looking for feedback on a PDO tutorial I wrote

I am not sure it this is a right place to ask, but I found this community most friendly and thoughtful, so I want to try. If it’s not appropriate, feel free to delete this thread.

I spent several years on Stack Overflow answering questions on PHP. Being interested personally in database wrappers, I kept an eye on PDO related questions in particular.

With a lot of data to review, one is able to collect some statistics, and even trace the common patterns of (mis)use. So did I. From such an experience, and due to lack of really good tutorials on PDO, I set out for creating one of my own. Finally, I was able to come up with a tutorial, biased towards the very patterns of misuse I was able to spot - aiming to correct them. Please keep it in mind this fact if some statements would look too imperative to you - it’s only to counter massive use of bad practices.

At the same time, I was able to discover some neat PDO features that are almost totally unknown to developers, which I find a great waste. So I tried to advertise them as well.

It is called (The only proper) PDO tutorial and if you can take time I’ll be grateful for any feedback or questions which I’ll try to answer in as much detail as possible.

Some topics that could be of the most interest are:

  • Error handling. Exceptions
  • Return types.
  • Getting data out of statement in dozens different formats. fetchAll()
  • Getting row count with PDO
  • Calling stored procedures in PDO
  • Running multiple queries with PDO
  • Mysqlnd and buffered queries. Huge datasets.
8 Likes

Just taking a quick look. it looks like it is very thorough, so without reading the detail(yet) I can say good job! :smile:

Scott

1 Like

A very thorough article. I plan on going back and reading it completely from start to finish because I think it is a very valuable resource, having everything about PDO in one spot, in a well-organized, readable fashion.

I’m not qualified to make any comments about the correctness of the information - I’m trusting that you made sure of that yourself, but I do have one suggestion. There are a number of spelling (typos?) and grammatical errors that you might want to fix.

2 Likes

The arrogance is a bit dangerous. On the one hand it is good to be confident. On the other hand it does make people like me perhaps check a bit closer to see where errors are made.

For example:

This is just plain wrong. You can create as many pdo instances as you want as long as you are using the same dsn. The will only be one actual sql database connection. Easy to show, just run:

for($i = 0; $i < 1000; $i++) {
  new PDO($dsn, $user, $pass, $opt);
}

Works fine. No real impact on the server.

Of course I agree that in general one should avoid creating multiple pdo instances. But given the rather arrogant tone of your article, I would be careful to avoid actual factual errors.

3 Likes

First of all, thank you for a practical feedback on a particular statement - the exact kind of comment I were looking for.

Now to the argument itself. A have a couple objections

  1. Database connection considered to be a relatively heavy operation, that takes much more time than a query. That is why there is a persistent connection - exactly a thing to reduce this overhead.
  2. Your code makes the old connection closed while the new one is created. But it is often can be seen in the newbies’ codes that each class is using its own connection. And in this case your server will be killed for sure. Change your code to make it keep created connections and see. Would you accept a “Too many connections” error as a proof?
  3. Like I said above, the rather demanding tone is caused by the overwhelming number of people doing it wrong way. So I am trying to counter.

But advise is taken. I have to soften the statements for sure. I am already doing so. You just didn’t see this article a month ago :slight_smile:
Besides, English is not my native language and people told me already that my phrasing souns blunt even when it is not intended to.
Hope that a proofreader whom I am going to hire will make it well.

4 Likes

If I were you, I’d title the article, “PDO: the right way”. :smile:

Scott

3 Likes

Actually, I had a look at the article a few days back and I must say it is a very comprehensive description of PDO, I liked especially the part about all the different constant parameters for fetching data, which are very scarcely described in the official documentation. A job well done!

Now the critique :slight_smile: I agree with others that a softer tone would be more appropriate. I can’t speak for others but to me when I read phrases like “the only proper way to…” doesn’t add to the credibility of the author but rather stimulates me to find other “proper ways” and therefore disagree with him. In the same way:

I am not fond of calling anything “the right way” because you never know when it may be beneficial to go out of the established rules. People who are really wise and knowledgeable don’t advertise their methods as the (only) right ones but rather present the proper ways in a convincing style by providing good reasoning and examples. In other works - the reader should come to the conclusion him/herself that your way is truly the right way - not because you call it the right way but because your arguments make sense! This is only my subjective point of view, of course…

Now a few quotes to comment on:

[…] While prepared statement is the only proper way to run queries, if any variable is going to be used in the query.

This is where “the only proper way” is a statement taken too far - because technically this sentence is wrong. Prepared statements is not the only proper way because it is perfectly fine to use escaping functions and “jumble” data with SQL (as some like to call it) and the security is maintained - after all the database drivers provide native escaping functions for this purpose. The thing is that often using prepared statements is more elegant, less error prone, easier for humans not to screw or forget about something, result in better coding style, etc, etc. I’d rather see good reasoning for using prepared statements than just reading “the only proper way”.

Another example I found a bit inaccurate:

Passing data into execute() should be considered default and most convenient method.
When this method is used, all values will be bound as strings (save for NULL values, that will be sent to the query as is, i.e. as SQL NULL), but don’t worry: it’s all right in every aspect, save for only one issue with LIMIT clause in emulation mode described below.

And the SO answer it links to has currently the rank of -2! This doesn’t look credible for a newbie. Moreover, sending all data as strings is not always without potential negative effects so your statement “it’s all right in every aspect” is not true. For example, there are some cases where the query execution time will meaningfully increase when a number is sent as a string or in extreme cases the query may perform differently (for example when dealing with bigints). Also, other databases may not be as forgiving of the correct type as mysql, therefore the safest way would be binding params with the actual type. This can be important for people writing applications for wider audiences supporting many database engines.

And lastly - not a critique but rather a recommendation for improvement in the future - it appears that you wrote this article based on mysql only. It would be beneficial if you tested your findings with other databases, too, because sometimes PDO may behave differently in subtle and unexpected ways and therefore some of your recommendations may not apply generally to everyone using PDO. Of course, this would mean a lot of work on your part so don’t feel pressured! :sunglasses:

But overall, I’d say it’s a pretty good article filling in many holes in php docs!

4 Likes

So any fields in the database that are not strings will fail if the latest version of mySQL is used with the default strict setting or and older version is used and strict is turned on.

To work with the latest mySQL that would require that strict be disabled first. Not good as the setting was changed to the default to strict for a good reason.

2 Likes

As far as I know mysql’s strict mode is not that strict about types of values used in a query. Strict mode will reject improper values (like out of range or too long for a field) but will still allow to use numeric values in quotes and vice-versa. That would be too extreme a change. Even postgresql, which is always in strict mode, allows quoted numbers. However, there may be some less common (and unexpected) cases where this makes a difference. I’ve heard that some other databases (maybe SQL Server or Oracle?) outright reject numbers sent as string.

2 Likes

I had it reject ‘0’ for numeric fields and ‘0000-00-00’ for date fields - that’s how I found out that strict mode was the default when I upgraded to mysql 5.7

You should add ORM too, so we can do work better and faster.
It is painful task to match all field,its types and values.
I am expecting pluggable ORM with this tutorial

1 Like

It is a play on the web site PHP: the right way. I feel, if you do have a page like that, it also must be kept up-to-date too. And that also means, if it becomes outdated for any reason, that should also be made clear. So, at one point, PDO: the right way, could also say, “WARNING: Don’t use PDO anymore! Extension X is what you will want to use.” :smile:

Scott

2 Likes

With PDO having been around since late July 2004 it probably is closer to its end of life than it is to its birth. We probably don’t have long to wait for a new “improved” interface to be introduced.:wink:

1 Like

May I ask you to verify it once more? For me it works with '0' and only reject an empty string, ''.

mysql> select @@sql_mode;
+--------------------------------------------+
| @@sql_mode                                 |
+--------------------------------------------+
| STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION |
+--------------------------------------------+
1 row in set (0.00 sec)

mysql> show create table users;
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                                                                                                                      |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| users | CREATE TABLE `users` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(255) DEFAULT NULL,
  `pass` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=8 DEFAULT CHARSET=utf8 |
+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> insert into users values ('0','','');
Query OK, 1 row affected (0.06 sec)

mysql> insert into users values ('','','');
ERROR 1366 (22007): Incorrect integer value: '' for column 'id' at row 1
1 Like

thank you for the in-depth review. I definitely should reconsider the phrasing and reasoning.

[quote=“Lemon_Juice, post:7, topic:215875”]
For example, there are some cases where the query execution time will meaningfully increase when a number is sent as a string[/quote]
During several weeks I tried hard to find any evidence for this, but no success yet.
Yes, I am pretty aware that DBAs every time say so, but I need a practical example that can be offered to the readers.

I tried with bigints and it worked for me all right. The only issue I encountered with if when you actually put a bigint right in the query without quotes, which is treated as a float in this case, with all the usual outcome. While bigint put in quotes worked fine.

Well, this one looks like a killer one. Indeed the article is hugely biased towards mysql and it shouldn’t be - as PDO is an abstraction. And this very statement in question indeed is based on my experience with mysql only. So it seems I have to mark out mysql-specific statements as exceptional cases. Or may be I should just re-title it as “…for PHP/MySQL developers”.

1 Like

Might have been an empty string (can’t remember as I changed it to 0 and then it worked.

@WebMachine

I am not a native speaker and all my English is self-taught. So I have problems recognizing errors.
I’ve ordered a proofreading service, and corrected all the mistakes they found. May I ask you to take a glance now, and tell whether you find the grammar acceptable?

I’ve corrected only first five chapters though. As soon as I make sure they do the job well, I’ll order to proofread the rest.

2 Likes

It makes a big difference with the spelling errors gone, and the grammar improved. I actually went through the first ten chapters for you and recorded some things that I found distracting as I was reading to understand what you were saying.

I am not a grammar expert, so I skipped over things that maybe were not correct, but did not make it hard to read.

I don’t want to detract from the important stuff - which is the information you are recording here, but if someone has to re-read a sentence to get its meaning, it obscures the real value of the tutorial.
And eventually this will be very important if you want to publish the tutorial somewhere.

BTW, I am learning a lot from reading it. :smiley:

PDO-tutorial.doc (25.5 KB)

4 Likes

@WebMachine thanks a lot, I am really surprised by such a detailed report. I’ve fixed all the issues you mention. The sentence on empty catch I rephrased this way, adding a link:

if the error can be bypassed, you can use try…catch for this. However, do not make it a habit. Empty catch in every aspect works as error suppression operator, and so equally evil it is.

Hope that now it reads clear.

1 Like

Yes, it’s difficult to find real life examples, a few years ago when I was very curious about it I managed to find a query which took 1.5 seconds to perform with a quoted number vs 1 second with unquoted number. Unfortunately, I don’t have that query any more and I don’t have several hours spare to look for one now :). But for sure it was a more complicated query, most probably with a join and a subquery, and returning a large result set.

And still we don’t know how other databases will behave in such cases, I’ve read that SQL Server and Oracle will type-cast quoted numbers to appropriate numeric types but that the execution plan can be different and the type-casting can be turned off in configuration, in which case such queries will fail.

I have found this:

// result: 3.602879701896397e18
SELECT 18014398509481984*'200';

// result: 3602879701896396800
SELECT 18014398509481984*200;

It is also hard to find examples where quotes around numbers make a difference but they exists in some less common cases. I also find myself often sending all values as strings because it is more convenient, however it is important to be aware that potential problems can occur - and I would probably send numbers as numbers whenever I am dealing with complicated queries, those returning large results or dealing with precise calculations.

Interestingly, postgresql seems to ignore integer hints and sends all numbers as strings even when using PDO::PARAM_INT so postgresql is less problematic than mysql in this case and the above two selects return same results. If the PDO driver is written this way then we can assume that postgresql doesn’t have any problems with extracting numeric values from strings.

However, there’s also the boolean type, which exists in postgresql but in mysql it is emulated as 0 or 1. And sending booleans in a single execute() method is not that straightforward, suppose we are binding a value to a boolean datatype in postgresql or tinyint in mysql

$stmt->execute([true]);
$stmt->execute([1]);

These should work for both databases. However, this:

$stmt->execute([false]);

will fail in postgresql, because the value will be sent as an empty string, which is not allowed for booleans. However, this will work:

$stmt->execute(['off']);

only in postgresql, because off is a special keyword that maps to false. A universal way would be like this:

$stmt->bindValue(1, (bool) $myBoolean, PDO::PARAM_BOOL);

This only shows that PDO does not abstract binding values to work equally across database types, and it’s understandable. However, I think that binding values with specifying the correct type and casting the variable to the desired type (like in the above statement) results in most universal and predictable behaviour. Whether it’s worth the trouble binding each variable like this remains to be judged by the programmer.

3 Likes