Tell me about... PDO?

Discussion in 'PHP' started by deathshadow, Dec 23, 2008.

  1. #1
    I've been writing database software for close to two decades now, and programming in general since 1979 - so needless to say coding is nothing new, though I've only really been dealing with php/mySQL for about five years now.

    In that time I've dabbled with PDO (php database objects), but had very little reason to resort to objects in what is an inherently an interpreted language. I'm also leery of objects in php because much like Java and C++, objects feel shoehorned into the language instead of having a sensible structure - unlike say smalltalk and Modula which is where I learned objects - objects make a HELL of a lot more sense in language where pre-declaration of variables and typing is the norm.

    I'm diving headlong into PDO right now for a host of reasons - first the multi-support for mySQL/msSQL,postgre, sqlite, etc, etc... from a single codebase is too attractive to turn away from - much less I THINK (I may be wrong as I've not tested this yet) I can pass my PDO object as a reference between functions, letting me keep my SQL connection as a LOCAL variable making it a lot harder for code appendage hacks to gain access to the DB.

    I guess my question is that for those of you (if any) using PDO are there any pitfalls or problems I should be aware of in it's use. I'm talking performance issues, overhead differences, differences in how you handle the resultant data, etc. I'm trying to document and understand as much as I can before I go high-hog on actually coding the fifth generation of my CMS system with it. (Which is the first version I'm planning on a public release of)

    In particular I've heard mention of performance issues with 'prepared' queries, but have been unable to find any hard data or benchmark any real difference between it and old school 'constructed'. If anything the ability to 'safely' submit an array of values to the execute command and have it auto-sanitized seems to execute faster... much less that if you have multiple iterations of changes or queries to make it is WAY faster than constructed.

    For example, if we had a two dimensional array, first dimension as record number, second dimension of values 'joinDate' and 'lastPost', to query each 'row' of our array off a table 'old school' would go:

    foreach ($searchValues as $row) {
    	$query="SELECT * FROM users 
    		WHERE joinDate < ".sanitize($row['joinDate'])."
    		AND lastPost > ".sanitize($row['lastPost'])."
    	";
    	$results=mysql_query($query);
    	/* do something with the results here */
    }
    Code (markup):
    Because on each iteration we have to manually sanitize our values, and construct a string, that's going to be pretty slow becuase it's all PHP code and not handled by native code functions.

    using PDO that would look something like this ($DB is the PDO object)

    $query='SELECT * FROM users
    	WHERE joinDate < :joinDate AND lastPost > :lastPost
    ';
    $queryHandler=$db->prepare($query);
    foreach ($searchValues as $row) {
    	$queryHandler->execute($row);
    	$results=$queryHandler->fetchAll();
    	/* do something with the results here */
    }
    
    Code (markup):
    The extra variables give it a bit of extra overhead, but inside the loop would be faster since we are passing our variables to the native code library instead of the parser.

    So I see the advantages - I'm just wondering where the penalties are or if there's any 'quirks' I really need to be aware of. One of the biggest things I'm wondering about are all the extra 'driver option' flags that the documentation on php.net is vague/nonexistant on exactly what they do. My google-fu is failing me badly on that subject.
     
    deathshadow, Dec 23, 2008 IP
  2. wmtips

    wmtips Well-Known Member

    Messages:
    601
    Likes Received:
    70
    Best Answers:
    1
    Trophy Points:
    150
    #2
    I don't use PDO yet (I use PEAR_DB instead of "pure" mysql functions though) and cannot help on PDO internals. But IMHO PDO and parameterized queries is a good choice considering several factors:

    1. Security. No SQL injection is possible.
    2. Cross-platform. Allows to change RDBMS without code rewriting. Maybe you'll want to move to PostgreSQL in the future ;)
    3. Better for some RDBMSes. For example, Oracle tries to cache "internally parsed (planned)" queries for faster execution, so it is better for it to have one parameterized query than a million of different but similar queries.

    Explore this thread for some more PDO advantages.
     
    wmtips, Dec 23, 2008 IP
  3. risoknop

    risoknop Peon

    Messages:
    914
    Likes Received:
    24
    Best Answers:
    0
    Trophy Points:
    0
    #3
    On the contrary, prepared statements are faster than ordinary queries (and if you sue transactions as well, performance will get even better). But even if they weren't, the security and flexibility benefits are just too great to ignore.

    I have been using PDO for only about half a year or so (after recommendation from PHP developers at Sitepoint) and I must say there hasn't been a single problem with it. Before that I've been using MySQLi object which worked well too but was dependant on MySQL and I needed to be able to work with different databases as well without rewriting large parts my applications.
     
    risoknop, Dec 23, 2008 IP
  4. deathshadow

    deathshadow Acclaimed Member

    Messages:
    9,732
    Likes Received:
    1,999
    Best Answers:
    253
    Trophy Points:
    515
    #4
    I'd love to, but I'm permabanned for life over there for using the word wigger. (no, that's not a joke) They even blocked me at the IP address level. (Yes, I know I could use a proxy - I just can't be bothered)
     
    deathshadow, Dec 23, 2008 IP
  5. risoknop

    risoknop Peon

    Messages:
    914
    Likes Received:
    24
    Best Answers:
    0
    Trophy Points:
    0
    #5
    Maybe I am just slow today but this seems wrong:

    
    $query='SELECT * FROM users
    	WHERE joinDate < :joinDate AND lastPost > :lastPost
    ';
    $queryHandler=$db->prepare($query);
    foreach ($searchValues as $row) {
    	$queryHandler->execute($row);
    	$results=$queryHandler->fetchAll();
    	/* do something with the results here */
    }
    
    PHP:
    You need to bind parameters to prepared statement with bindParam(). And by the way, use bindValue() instead, it's more readable and easier to work with, imho.
     
    risoknop, Dec 23, 2008 IP
  6. deathshadow

    deathshadow Acclaimed Member

    Messages:
    9,732
    Likes Received:
    1,999
    Best Answers:
    253
    Trophy Points:
    515
    #6
    Since execute can 'insert' an array of values, if our $row had this structure -

    ':joinDate' => value
    ':lastPost' => value

    That should work, as per example number 2 at:
    http://us3.php.net/manual/en/pdostatement.execute.php

    I'm just using a variable array to pass it instead of inline declaring a new array... Which would be a LOT faster than multiple function calls passing more variable copies, which BindParam and/or BindValue would be. Three function calls inside a loop or one? Gee, lemme think. Remember the rules of interpreted language optimizations, move complex declarations outside the loop, and make as few function calls inside the loop as possible.

    ... and BindParam would be faster than BindValue since you are only passing a reference to be evaluated at execute, not creating a new variable instance... and if you format your array to key => Value using carriage returns and tabs you can be just as legible as having separate parameters, possibly more so, than calling the 'bind' functions.

    -- edit --

    Just occurred to me an example may illustrate that better. Assuming that $row is the following (psuedodates, we'll assume they actually have mySQL compatable dates in them)

    $row=array (
    	':joinDate' => '30-10-2007',
    	':lastPost' => '23-12-2008'
    );
    Code (markup):
    and $query and $queryHandler were set up thus:
    $query='SELECT * FROM users
    	WHERE joinDate < :joinDate AND lastPost > :lastPost
    ';
    $queryHandler=$db->prepare($query);
    
    Code (markup):
    The following would all be functionally identical.

    using BindParam:
    $queryHandler->bindParam(':joinDate',$row[':joinDate']);
    $queryHandler->bindParam(':lastPost',$row[':lastPost']);
    $queryHandler->execute();
    Code (markup):
    using manual array insertion:
    $queryHandler->execute(array(
    	':joinDate' => $row[':joinDate'],
    	':lastPost' => $row[':lastPost']
    ));
    Code (markup):
    or just passing our array that already has the proper values:
    $queryHandler->execute($row);
    Code (markup):
    Should all do the same thing, though at different execution speeds, the last of those being the fastest, the first with BindParam being slowest.

    though the bindParam page ALSO throws in more constants with no links to where they are actually DEFINED. Just as I can't find the values you pass as 'Driver options' on the __construct method, I can't find where PDO::pARAM_* constants are actually explained. They have a quick list at:
    http://us2.php.net/manual/en/pdo.constants.php

    But those descriptions are pretty damned vague... suprising given how WELL php.net usually explains things. (I'd kill for a site that good for C#, C++ or Java)
     
    deathshadow, Dec 23, 2008 IP
  7. deathshadow

    deathshadow Acclaimed Member

    Messages:
    9,732
    Likes Received:
    1,999
    Best Answers:
    253
    Trophy Points:
    515
    #7
    Oh, you could also do that this way, though I wouldn't just because I never trust automatic indexing of parameters:

    $query='SELECT * FROM users
    	WHERE joinDate < ? AND lastPost > ?
    ';
    $queryHandler=$db->prepare($query);
    $queryHandler->execute(array(
    	$row[':joinDate'],
    	$row[':lastPost']
    ));
    Code (markup):
     
    deathshadow, Dec 23, 2008 IP
  8. risoknop

    risoknop Peon

    Messages:
    914
    Likes Received:
    24
    Best Answers:
    0
    Trophy Points:
    0
    #8
    Well but if you pass it as an array to execute() method, you cannot tell PDO which values are integer, string etc... so that would add some overhead I guess, or not?

    But what you are saying makes some sense, even though the performance differencies will be probably negligible.
     
    risoknop, Dec 23, 2008 IP
  9. deathshadow

    deathshadow Acclaimed Member

    Messages:
    9,732
    Likes Received:
    1,999
    Best Answers:
    253
    Trophy Points:
    515
    #9
    True, but given that they have no param constants for other data types like date?

    That's what I'm really wondering about, the constants set the data type - shouldn't the SQL end of things be smart enough to be handling that based on the FIELD you are plugging values into or comparing against? Normal SQL queries (my, postgre, ms or tiny) don't have typecasting in a normal query, why does PDO make such a big deal out of it? Is that part of the auto-sanitize to prevent code injections or something (no, that still doesn't quite make sense) or just something one of the more.. unusual databases makes use of (DB2? ODBC? FireBird? Oracle?)
     
    deathshadow, Dec 23, 2008 IP