Hang around the CodeGrrl forums for a while, and a few things become apparent. The same sloppy coding mistakes keep cropping up over and over again, causing the same problems over and over again. The mistakes are well-intentioned, in that people are generally trying to do the right thing; they just often don’t have enough knowledge of the whys and hows of what they’re trying to do.
With that in mind, I’ve attempted to assemble some of the most common mistakes I see young scripters making, explain why they’re mistakes, and offer some alternatives.
What It Is(n’t)
I suppose I should start at the start; what, exactly, is PHP? I once had a girl tell me it was “much harder than HTML and CSS” and yes, this is true. But why is it true?
HTML is what’s known as a markup language. I suppose the best analogy for it is that it’s an adjective language; it’s used to describe how something (in this case, text) looks. PHP, meanwhile, is a verb language, used to describe what something does.
Technically, PHP gets thrown into the basket of ‘scripting languages’ (as opposed to so-called ‘real’ programming languages like C), mostly because it’s loosely typed, manages memory for you, and is interpreted rather than compiled. Loosely typed means that a variable isn’t tied to one kind of data: you can happily declare a variable and stick a string in it, then over-write it with a number, then turn it into an array. Managed memory means that you don’t have to worry about handling system memory when writing PHP scripts (they’re scripts, not ‘programs’); PHP cleans up after itself when a script finishes, so the memory leaks that plague C programmers are rarely your problem.
Interpreted means that you do not (here’s some more technical stuff) compile your code into a binary in order to run it. In a compiled language, like C or C++, you must first write the code in a text file, then run it through a compiler. The compiler is a program that, very simply, checks the code for errors and, if that’s all okay, turns it into a binary file; in Windows, these are .exe files. Programs, in other words. You then run your binary; if it still doesn’t do what you want, you go back to your code, add some more bits and repeat the process. The compiler does a bunch of other stuff, too (such as linking), but that’s the basics of it.
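Going back to the loosely typed bit for a second: here is a minimal sketch of one variable changing type three times. The names $thing and $types are made up for illustration, and gettype() is only there to report what PHP thinks the variable currently holds.

```php
<?php
// One variable, three different types in a row. PHP never complains.
$thing = "forty-two";            // starts life as a string
$types = array(gettype($thing));

$thing = 42;                     // over-written with a number
$types[] = gettype($thing);

$thing = array($thing, 43);      // turned into an array holding the old value
$types[] = gettype($thing);

// Show the type PHP reported at each step.
print implode(", ", $types) . "\n";
```

Run from the command line, this prints string, integer, array. Try the same trick in C and the compiler will refuse to build it.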
In comparison, PHP is an interpreted language. This means that instead of running the code through a compiler in order to get a binary, we run the code through an interpreter, which gives us the output; no binary is created, and we must have the interpreter handy in order to see the output of our script. For PHP, the interpreter is hidden away inside our webserver; this is what it means when hosts say they are ‘running PHP’. Without the interpreter talking to the webserver, our PHP would just show up as flat text files. But on a server configured to run PHP, the webserver recognises our file extension (usually .php, but also sometimes .phtml or .php3/.php4) and sends that file to the interpreter. The interpreter then ‘does’ the code, and sends the output (usually an HTML page) back to the webserver, which passes it on to the browser for display.
This is also why you hear PHP being talked about as a server-side language: the code is processed by your server before the result is handed over to the client (i.e. the browser of the person visiting your website). In contrast, HTML is a client-side language; it’s up to the individual client to ‘interpret’ the HTML rather than the webserver. That’s why the same webpage can look different from one visitor’s browser to the next.
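If the server-side idea still feels abstract, a tiny sketch might help. The function renderGreeting() and the name passed to it are invented for this example; the point is only that PHP runs on the server and what leaves the server is plain HTML.

```php
<?php
// Hypothetical helper: builds the HTML that the browser will eventually receive.
function renderGreeting($name)
{
    // htmlspecialchars() escapes characters like & and < so the name is safe to print.
    return "<p>Hello, " . htmlspecialchars($name) . "!</p>\n";
}

// The interpreter runs this on the server; only the resulting HTML is sent out.
print renderGreeting("Ringo & friends");
```

View the source of the finished page in a browser and there is no PHP in sight, just the paragraph tag and its contents.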
Common Mistake #1: MySQL
Oh, right, MySQL. MySQL — and, in fact, any database — is yet another program that runs alongside our webserver and our PHP interpreter. In this case, it’s the PHP interpreter that ‘hands control’ over to MySQL when it encounters code that instructs it to do so; such as everyone’s favourite mysql_query(). The ‘hows’ of this handover, however, are something I often see causing problems.
So how does PHP ‘talk’ to MySQL? The exact low-level specifics of it aren’t really relevant, but from an abstract perspective, PHP does this by creating a link, which is analogous to PHP picking up the phone and making a call to MySQL. Now, MySQL is a bit like a helpdesk; it can accept multiple phone calls (from PHP or other sources) at any one time, up to a configured maximum (the max_connections setting, which defaults to 151 in current MySQL versions, though your host may have set it differently). Every time you visit a webpage that uses a database, you are making a new phonecall. If someone else happens to be browsing the same website as you at the same time, that’s another phonecall. If someone is browsing another site on the same server? Yup, another phonecall. The PHP interpreter can make many phonecalls to MySQL, but generally each individual PHP script only likes to be on one phonecall at a time. This is generally where people run into problems.
The problems generally arise when people are trying to integrate more than one script at a time; for example running Enth and WordPress on the same page. If you think of these kinds of webpages as a conversation, here is what usually happens:
PHP: [reading down its script] Oh crap, WordPress needs some data out of the database. [dials] Hello, MySQL?
MySQL: Hello mysql@localhost, how may I help you?
PHP: Yeah, hi. I need some data from yourwordpressdatabase.
MySQL: Sure! Let me just patch you through…
WordPress: Hi there, how can I help you?
PHP: Can you get me the data for the last four days’ worth of blog posts?
WordPress: Of course, sending now…
PHP: Thanks! [reads some more] Crap, now Enth needs something. [hangs up phone, redials] Hello MySQL?
MySQL: Hello mysql@localhost, how may I help you?
PHP: Yeah, hi. I need some data from yourenthdatabase.
MySQL: Sure! Let me just patch you through…
Enth: Hi there, how can I help you?
PHP: Can you get me the data for the last three joined fanlistings?
Enth: Of course, sending now…
PHP: Thanks! [reads some more] Damnit, now WordPress wants a list of categories; can you get that for me?
Enth: I’m sorry, I cannot find the data you require.
Wow, that was goofy.
Anyway, here we have a classic communications problem. If you think of the mysql_connect() command as the instruction to ‘dial’ up the MySQL server, and the mysql_select_db() command as PHP asking the MySQL switchboard to patch it through to the correct department, hopefully you can see where the problem with ‘mixing scripts’ comes from. That is, PHP scripts are not very smart, and generally only like being on one phonecall to MySQL at once. So what does a PHP script do when it encounters a second mysql_connect() (usually caused by including one script’s header file underneath another’s)? In effect, it hangs up the first ‘call’: any query that doesn’t name a link goes out on whichever connection was opened most recently, and to whichever database was selected last. So when the part of the script rolls around that requires it to get data out of the first database again, it gets very confused.
How to fix this?
Well, the simplest answer is to put all the tables you’re going to use on any one page into the same database. That is, stick your Enth stuff and your WordPress stuff into the same database. Then it doesn’t really matter if you prematurely instruct PHP to ‘hang up’ its connection to MySQL, since the reconnect should be to the same database. Most scripts nowadays are written with a ‘prefix’ option for exactly this situation; that is, all Enth tables are prefixed with enth_ while all WordPress ones are prefixed with wp_. That stops table name clashes.
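Here is a rough sketch of the one-database approach. To keep it runnable without a MySQL server, it assumes PHP’s PDO extension with an in-memory SQLite database standing in for the shared database; the table names and contents are made up, and real WordPress and Enth tables have many more columns.

```php
<?php
// Assumption: an in-memory SQLite database stands in for the one shared MySQL database.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Made-up tables: the wp_ and enth_ prefixes keep each script's tables apart.
$db->exec("CREATE TABLE wp_posts (title TEXT)");
$db->exec("CREATE TABLE enth_joined (subject TEXT)");
$db->exec("INSERT INTO wp_posts VALUES ('Hello world')");
$db->exec("INSERT INTO enth_joined VALUES ('Shiina Ringo')");

// One connection, one database: both scripts' data is reachable in any order.
$post      = $db->query("SELECT title FROM wp_posts")->fetchColumn();
$fan       = $db->query("SELECT subject FROM enth_joined")->fetchColumn();
$postAgain = $db->query("SELECT title FROM wp_posts")->fetchColumn();

print "$post, $fan, $postAgain\n";
```

Notice that going back to the ‘WordPress’ table after talking to the ‘Enth’ one just works, because nobody ever had to hang up.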
The other solution is to force PHP to make two (or more!) phonecalls at once. PHP can do this, but its initial instinct is not to. The way to force it is to use mysql_connect() (it won’t work with mysql_pconnect()) and stick an extra 1 at the end, as the fourth argument.
Check the following script:
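<?php
// The fourth argument (1, i.e. true) forces a brand-new link instead of reusing $link1.
$link1 = mysql_connect( 'localhost', 'user', 'password' );
$link2 = mysql_connect( 'localhost', 'user', 'password', 1 );
// Pass the link explicitly so each call goes to the connection we mean.
mysql_select_db( 'db1', $link1 );
mysql_select_db( 'db2', $link2 );
$sql1 = mysql_query( "SHOW TABLES", $link1 );
$sql2 = mysql_query( "SHOW TABLES", $link2 );
// List the tables in db1...
while( $t1 = mysql_fetch_row( $sql1 ) )
    print $t1[0] . "<br />\n";
print "<br />\n";
// ...then the tables in db2.
while( $t2 = mysql_fetch_row( $sql2 ) )
    print $t2[0] . "<br />\n";
mysql_close( $link1 );
mysql_close( $link2 );
?>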
This script essentially forces PHP to open two connections to MySQL simultaneously, represented here by $link1 and $link2. Generally, if PHP sees two calls to mysql_connect() using the same connection data, it will keep only one connection and make the second a reference to the first; the fourth argument tells it to open a genuinely new link instead. You can then use these two links to open two databases. By default, PHP always looks in the last selected database on any one connection. Since we have two connections here, not one, we can work on two databases. Change the variables around and try the script. Now take out the 1 and try it again.
See?
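One practical wrinkle: the mysql_* functions were removed in PHP 7, so the script above only runs on older installs. The underlying idea (keep each connection in its own variable and say which one you mean) carries over to newer interfaces, though. Below is a minimal sketch of that idea, assuming the PDO extension with two in-memory SQLite databases standing in for db1 and db2; the helper listTables() and the table names are made up for illustration.

```php
<?php
// Assumption: two in-memory SQLite databases stand in for db1 and db2 on a MySQL server.
$link1 = new PDO('sqlite::memory:');
$link2 = new PDO('sqlite::memory:');

// Hypothetical tables, one per database.
$link1->exec("CREATE TABLE posts (title TEXT)");
$link2->exec("CREATE TABLE members (name TEXT)");

// Hypothetical helper: list the tables visible through one particular link.
function listTables(PDO $link)
{
    $sql = "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name";
    return $link->query($sql)->fetchAll(PDO::FETCH_COLUMN);
}

// Every call names its link, so opening $link2 never disturbs $link1.
print implode(", ", listTables($link1)) . "\n";
print implode(", ", listTables($link2)) . "\n";
print implode(", ", listTables($link1)) . "\n";
```

Because the link is always passed in, there is no ‘last opened connection’ for PHP to get confused about.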
Common Mistake #2: Redeclaring Variables
I think this one is 90% caused by bad foundations laid down by CodeGrrl’s Build-a-Blog, though that might be finger pointing, and I’m sure I’ve seen it in a few other scripts too. So what is it?
How many times have you seen (or done) the following:
$sql = mysql_query( "SELECT * FROM sometable" )
    or die( mysql_error() );
while( $r = mysql_fetch_array( $sql ) ){
    $field1 = $r['field1'];
    $field2 = $r['field2'];
    $field3 = $r['field3'];
    $field4 = $r['field4'];
    print "$field1, $field2, $field3, $field4<br />";
}
Come on, fess up; I know you have.
So, what’s wrong with doing this? Technically, nothing, though it does sort of have the effect of announcing to the serious coders of the world that you’re, erm, a bit of a noob. Sorry, but… it does. Really.
Why? Because it’s redundant, and at least notionally it’s wasting memory. Memory management is not really something we worry about much in PHP scripts unless we make dumb mistakes like running MySQL queries that never end. This is a result of PHP looking after memory for us (remember we talked about that above?). However, just because we don’t have to worry about something doesn’t mean we shouldn’t, and in my opinion it’s extremely sloppy coding to redeclare a bunch of perfectly good variables. Arrays and objects aren’t scary; we can work with them just as easily as we can any other thing. Sometimes easier; try running print_r() on a returned array/object some time and then think of just how useful that can be.
PHP is a very gentle language (if you don’t believe me, go write web applications in CGI Perl or, horrors, C++ some day), and has an extremely robust database interface. mysql_fetch_array() and its counterpart mysql_fetch_object() (my personal choice; no real reason, I’ve just always used it) have been programmed with special loving care to return the most human-readable output possible. So please learn to use them; it’s not hard.
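For comparison, here is roughly what the loop looks like when the row is used as it arrives. This is only a sketch: $rows is a hand-made array standing in for whatever the database fetch would hand back, and the field names are invented.

```php
<?php
// Assumption: $rows stands in for the rows a database fetch loop would return.
$rows = array(
    array('title' => 'First post',  'author' => 'alice'),
    array('title' => 'Second post', 'author' => 'bob'),
);

$lines = array();
foreach ($rows as $r) {
    // No "$title = $r['title'];" step: the array element is used directly.
    $lines[] = "{$r['title']} by {$r['author']}";
}

print implode("\n", $lines) . "\n";
```

Same output, fewer lines, and nothing to keep in sync when a field gets renamed.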
Not redeclaring variables also extends into other areas. For example, calling functions with return values on variables. I commonly see this sort of thing:
$var2 = stripslashes( $var1 );
stripslashes() here could be any function call that returns a formatted version of the variable passed to it, where you do not want to keep the contents of the unformatted variable. In this sort of situation, you don’t have to create a second variable to hold the output. Instead you can do the following:
$var1 = stripslashes( $var1 );
Or even:
print "This is a stripped slash string: " . stripslashes( $var1 );
Tips and Tricks: Associative Arrays
Since I made such an impassioned plea for people not to redeclare associative arrays into ‘flat’ variables, I might as well give some tricks on how to use them.
First, what is an associative array? For that matter, what the hell is an array?
Most variables in PHP are ‘flat’; that is, they only hold one value:
$var1 = "Shiina Ringo";
$var2 = 42;
$var3 = '!';
An array, on the other hand, is one ‘wrapper’ variable that holds multiple values. Generally we use them to collate ‘like’ values, which is why they’re good for MySQL output. If we’re getting weblog posts out of a database, all the information about that post ‘relates’: the post’s title, the post’s date, the post’s text and so on. I guess you can think of arrays as a bit like folders in your operating system. Maybe you keep all your music files in a folder called My Music. You don’t put your pictures in My Music (they probably go in My Pictures) because they’re not music. You also (hopefully!) don’t stick all your files in the root of C:. So it is with arrays; we use them to lump together stuff that, well, goes together.
There are two types of arrays; associative and indexed. Indexed arrays are ‘classic’ arrays; each element (that’s the different values) in the array is referenced by a number, starting with 0 and ending at n-1 (the number of elements total minus 1). They are written like $arrname[0], $arrname[1] and so on.
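A quick sketch of an indexed array in action, using a made-up $fruit array. The only things to notice are that the numbering starts at 0 and that the last element sits at count() minus 1.

```php
<?php
// An indexed array: PHP numbers the elements for us, starting at 0.
$fruit = array("apple", "banana", "cherry");

$first = $fruit[0];                   // the first element
$last  = $fruit[count($fruit) - 1];   // element n-1, i.e. the last one

// Walk the array by number, from 0 up to n-1.
for ($i = 0; $i < count($fruit); $i++) {
    print "$i: $fruit[$i]\n";
}
```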
The second type of array is called an associative array in PHP (in other languages they’re called other things, such as dictionaries); these are arrays that are referenced by a word, called a key, rather than a number. They aren’t scary, and we can use the elements of associative arrays just like any other variable.
somefunction( $array['key1'], $array['key2'] );
$array['key3'] = stripslashes( $array['key3'] );
print "This is an array: $array[key1], $array[key2]";
And so on. Notice, however, one thing. When we use an associative array inside a double-quoted string (variables aren’t filled in at all inside single quotes) we drop the quotes from around the key word. Outside a string, older versions of PHP would also let you get away with leaving the quotes off: if PHP saw a bare word it didn’t understand, it treated it as a string and grumbled with a notice. However, it’s bad form to do this, and PHP 8 treats it as a fatal error. Why? Because a bare word is really the name of a constant; if the key of your associative array is inside quotes, PHP knows that it’s to be treated as a key and not something else. The key doesn’t have to be written out by hand, either. Have a look at the following:
<?php
function getKeyNum(){
    return 'banana';
}
$arr = array(
    'apple' => "Mmm, apples!",
    'banana' => "Yuck!"
);
$k = getKeyNum();
print $arr[$k];
?>
Yup, you can use other variables to dynamically select which array key you want to access. You can do it with function calls, too; the brackets will take any expression that produces a key.
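To put the function version to the test, here is a small sketch. pickKey() is a made-up helper and the array is the same fruit example as before.

```php
<?php
// Hypothetical helper that decides which key we want.
function pickKey()
{
    return 'banana';
}

$arr = array(
    'apple'  => "Mmm, apples!",
    'banana' => "Yuck!",
);

// The function call goes straight inside the brackets; no intermediate variable needed.
$viaFunction = $arr[pickKey()];

// Any expression that produces a key works the same way.
$viaExpression = $arr['app' . 'le'];

print "$viaFunction $viaExpression\n";
```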
The general rule? Avoid confusing PHP, and know when to use quotes and when not to:
// outside of a string, use quotes
$arr['key'];
// inside a string, drop the quotes
$var = "$arr[key]";
// referencing dynamically, drop the quotes
$arr[$var];
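Here are those rules side by side in something you can actually run. The array contents are invented, and two lines go slightly beyond the rules above: the curly-brace form, which lets you keep the quotes inside a double-quoted string, and a single-quoted string, where PHP substitutes nothing at all.

```php
<?php
$arr = array('key' => 'value', 'other' => 'thing');
$var = 'other';

$outside = $arr['key'];           // outside a string: quote the key
$inside  = "Got $arr[key]";       // inside double quotes: bare key
$braced  = "Got {$arr['key']}";   // inside double quotes with braces: quotes come back
$dynamic = $arr[$var];            // variable as the key: no quotes
$literal = 'Got $arr[key]';       // single quotes: printed exactly as typed

print "$outside | $inside | $braced | $dynamic | $literal\n";
```

The brace form is handy once keys get fiddly (nested arrays, for instance), but for simple cases either style inside a string gives the same result.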
Final Words…
There are more things I could cover — so many more — but they will have to wait until our next instalment I think.
My final word, however, is this; don’t be intimidated. I’m sure some of you have gotten down to this point and are freaking out; “It’s all so much to remember!” The temptation to do things poorly or the ‘easy way’ is very strong, but please try and avoid it. Start from good foundations and it’s much, much easier to write good code. And good code is strong code, and strong code is much, much harder to hack than weak or sloppy code. It’s worth it in the long run, I swear.
Until next time; happy coding.