14th December, 2006
Part 2: Getting Started With PHP
Thursday, 11:41 am in Build-a-Blog v2.0
Oh, you’re back. Good.
Welcome to Part 2 of BaB2. So, where did we get up to? Hopefully you’ll all remember that last time we set up a database schema and introduced some basic design concepts. Today we’re going to extend that a little further, and hopefully by the end we’re going to have a nice page you can log into and use to create a new blog post. In other words, we’re going to get our hands a bit dirty in PHP. Neato.
But not just yet; we’ve got some more designing to do, and don’t all groan at once. Every single computer science student ever in the entire history of computer science is groaning along with you. I used to do it as well, when all my tutors would make me write out project plans and design documents and blah blah blah. I was always of the opinion that code just got written; you took a vague idea, threw yourself in, and watched it grow. To some extent, I still program like that, but past experiences have meant that even my most haphazard attempts generally have some kind of logic behind them; it’s the logic I’ve intuitively built up over years of abortive programming projects. So, let’s do some right now. We’ll get to the coding soon, I promise.
2.1 Designer Code
Firstly we need to think about the physical structure of our application, in this case a weblog script. What pages will we need? I can think of, at the most basic, two; one to display posts and one to make them. But is that really all? We’re also going to need to connect to the database. Now, technically we could just do this by making individual calls to mysql_connect() at the top of each page, but this is a bit of a waste of code (and programming is all about minimizing the amount of code you actually have to write). It would be much easier if we had one file containing our database connection information, and we then just included that. The other thing to consider is the fact that there will also likely be certain configuration variables which would be prudent for us to throw into a separate, includable, file; things like pathing and cookie information. Because it’s highly likely that we are going to be moving this script around (ie. between our ‘dev’ environment and our ‘production’ environment), certain values will change between servers. If we hard code these in, we’re going to have to go through and edit them on every page in which they appear in order to make our pages work. This is no good; all coders are lazy at heart and we’d much prefer not to have to do that sort of thing.
The final thing we need to consider is that, invariably, we’re going to start developing a small function library of useful routines for our blog. For example, notice how we stored the post date as a DATETIME rather than in epoch time? This means that every time we want to do format manipulation on it we’re going to have to convert it back to epoch. We could do this manually each time, but it would be much simpler to write ourselves a function we could call instead. As a general rule of thumb, any set of two or more lines of code that you’re going to use more than once (or even might potentially use more than once) should go into a function, and it’s logical to stick those functions into a separate file.
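To make that concrete, a conversion helper along these lines would be a natural first entry in the function library. This is only a sketch: the name bab_datetime_to_epoch() is hypothetical, and it assumes the DATETIME values are stored in UTC, which may not match your server.

```php
<?php
// Hypothetical helper: convert a MySQL DATETIME string such as
// '2006-12-14 11:41:00' into epoch seconds. The function name and
// the UTC default are assumptions made for this sketch.
function bab_datetime_to_epoch( $datetime, $timezone = 'UTC' )
{
    $dt = DateTime::createFromFormat(
        'Y-m-d H:i:s',
        $datetime,
        new DateTimeZone( $timezone )
    );

    // createFromFormat() returns false when the string doesn't match the format
    if( $dt === false )
        return false;

    return $dt->getTimestamp();
}

// once we have epoch seconds, gmdate() can reformat them however we like
$epoch = bab_datetime_to_epoch( '2006-12-14 11:41:00' );
print gmdate( 'jS F, Y', $epoch );
```

Writing it once means every page that needs a prettier date calls the same two lines instead of repeating the conversion.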
Our little project is starting to grow; we’ve moved from two potential files to at least four, if not five.
Where are all these files going to go? Any page that will be directly accessed from the browser (our view post and new post pages, for example) is probably going to go in our root web directory. Include files, however, which are not designed to be accessed directly, should be hidden away in a folder. For the extremely security-paranoid, this folder should probably go above the webroot, but this isn’t always practical or possible (ie. if we’re hosting our blog in a subfolder of a domain). I generally stick my includes below the main site folder, but in this tutorial feel free to do either.
While we’re thinking of folders, it’s probably also worthwhile to think of what we’re going to do with our admin pages. Currently, we’ve only got one (the new post page), but as the site grows it’s highly likely we’ll start accruing more. Basic neatness alone says we should probably collate all these together somewhere, and this also gives the added benefit that we can protect this folder by .htpasswd.
Finally, we’re going to need to think of names for our folders. I’m going to call mine /ba.blog and /ba.admin, and with that in mind, we can start to see the following directory structure emerging:
/index.php
/ba.blog/values.inc.php // database and values include
/ba.blog/functions.inc.php // function library
/ba.admin/newpost.php // the new post page
You’ll notice that I’ve given the includes the name *.inc.php. This is just habit as opposed to something ‘mandatory’. It’s also a bit of a security gotcha; note that any file with any extension will be parsed as PHP code when included in an include() statement. This is one of the (many) reasons why you should never, never include() based around raw user-input (ie. query strings), and should never, never HTTP include() files hosted on a server you don’t have control over. There’s also an issue around giving your PHP includes non-standard extensions. I’ve seen a couple of scripts around the place that use *.inc extensions. Don’t do that either; always give your files a PHP extension. Why? Because it tells the webserver to actually process that file as opposed to treating it like raw text. Say you’re storing important configuration information, such as your database username and password, in a *.inc file. Say someone out there figures this out, and directly navigates to your include file. What do they see? A raw print out of that page, including the plaintext of all your login info. That’s bad, and just goes to show that just because you can do something, doesn’t necessarily mean that you should.
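If you ever do need to pick an include based on something the visitor sends, one common approach is to look the request up in a fixed list rather than passing it to include() directly. The sketch below is illustrative only: bab_resolve_include() and the page names and paths in it are hypothetical and not part of this tutorial’s files.

```php
<?php
// Hypothetical whitelist: the only files a request is allowed to name.
// The keys and paths are made up for illustration.
function bab_resolve_include( $requested )
{
    $allowed = array(
        'about'   => 'pages/about.inc.php',
        'contact' => 'pages/contact.inc.php',
    );

    // anything that isn't a known key is rejected outright
    if( !is_string( $requested ) || !isset( $allowed[$requested] ) )
        return false;

    return $allowed[$requested];
}

// a query string like ?page=../../etc/passwd never gets near include()
$file = bab_resolve_include( '../../etc/passwd' );
if( $file === false )
    print "Refusing to include that.\n";
```

The raw input is only ever used as an array key; the path handed to include() always comes from our own list.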
2.2 Getting a Head(er)
Okay, we’ve done some basic planning and it’s time to start hacking some code (that’s the verb all the fly programmers use, yo). We’re going to write our values.inc.php file first, so fire up your text editor of choice and save a new (empty) file into your /ba.blog folder with that name. Now we’ve got an empty file, we’re going to do the very first thing we’re going to be doing on every single PHP page we write from now on:
<?php
/***********************************************************************
FILE: values.inc.php
BEGUN: __TODAY__
AUTHOR: __YOURNAME__
PURPOSE: Set some basic values and connect to the database.
NOTES: From the Build-a-Blog v2.0 tutorial.
<http://void-star.net/archive.php/bab>
***********************************************************************/
?>
First the extreme basics. Though it’s not always necessary in includes (see above security warning), it’s best-practice to always put all of our PHP code in between PHP’s tags. In this case, they are <?php ?>. These tags act as delimiters to the PHP engine to tell it when it should be treating our script like PHP, and when it should be treating it like text. The reason we can do this is because of the way PHP integrates in with HTML; unlike most other programming languages, a PHP script file does not have to be 100% PHP. PHP has two sets of tags; the ones here are what’s known as long tags. We can also use <? ?>, or short tags. Note that short tags can technically be disabled in PHP’s configuration file (via the short_open_tag setting), but I’ve yet to encounter a server that actually does.
And what about the stuff in between the asterisks? This is what’s known as a comment block. Comments in coding are lines of text which are not parsed by the PHP interpreter and are really only there to make the code more readable to programmers (ie. you). There are two types of comments in PHP; line comments and block comments. Block comments, as above, are delimited by /* comment goes here */. They can span multiple lines, and everything in between the slash-stars is the comment. Line comments are delimited by // comment and can only go on one line. It’s good practice to always outline what your code is doing in comments, and there’s no time to start like the present.
Because comments are completely ignored by the interpreter, we can have fun and put in as many of them as we want. Here I’ve given the outline of some basic header information that tells you important stuff about your file, such as its name, when it was started (now), who wrote it (you), what it does, and other notes as applicable, in this case a short plug for myself (huzzah). You should fill in your name and today’s date. Obviously not every file has to have this information, or have it in this format, but it’s generally considered good practice. Additionally, every coder tends to have their own ‘style’ of header info; this one is mine, and it’s pretty wordy (like me). Look at five different scripts by five different people and I guarantee you you’ll see five different-looking headers.
Okay, we’ve got our skeleton; time to back-fill some code.
The first thing we want to do is connect to the database. There are two ways we can do this; mysql_connect() and mysql_pconnect(). What’s the difference? mysql_pconnect() creates what’s known as a persistent connection, so that every time your script is run it attempts to use the same channel to talk to the database. You don’t mysql_close() a persistent connection like you do with mysql_connect(), and it can help to reduce database load on high-traffic scripts. There’s also a (somewhat advanced) issue around embedding multiple scripts using multiple different databases on the same page which goes a bit beyond this tutorial, but which I’ve previously written about here.
For the purposes of this tutorial, I’m going to use mysql_pconnect().
We also need to decide what we’re going to do with our database connection values. One of the gotchas you always hear about in badly-formed PHP scripts is database values being stored in insecure variables which are then compromised (ie. by printing them out to the browser). The thing to realise here is that we don’t really need to store our database values in variables at all; the reason most distributed scripts do this is either because they’re using multiple calls to mysql_connect(), or for other distribution reasons where changing the value would affect multiple files. This is not going to be the case for us; we’re only ever going to have one call to open the database link, and when we move our script around it’s just as easy to change the values there as it is anywhere else.
With that in mind, the call to open the database looks like this:
$link = mysql_pconnect( 'localhost', 'username', 'password' )
    or die( "<b>Could not connect to database!</b><br />\n". mysql_error() );
Obviously you’ll need to change username and password to reflect your database values (just remember to keep them inside the single-quotes). The localhost bit you can probably leave. This tells the function where your MySQL database is actually located; since the vast majority of web servers have colocated databases, it’s generally safe to just leave it as localhost.
Notice we’ve also put in an or die() clause. This comes up if, for some reason, the connection fails. Note that it’s not always ‘safe’ or desirable to notify visitors to your site if you’ve got errors, but in this case we’re going to leave this in. The call to die() essentially stops all further execution of the script; since this is the very first thing in our file, if the connection to the database fails nothing further will execute, and the message sent to die() will be displayed in the browser. In this case, it’s going to print out “Could not connect to database!” followed by the output of mysql_error(), which will likely say exactly the same thing. Take note of how we’ve ‘joined’ the output from the mysql_error() function to the rest of the string sent to die() using the concatenation operator, the period. We’ll be seeing this more later, so don’t worry if you’re not quite sure just yet what it’s doing.
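If the period is new to you, here it is on its own, away from the database call. This is just a toy illustration; the $error text is a made-up placeholder standing in for whatever mysql_error() would actually return.

```php
<?php
// placeholder text standing in for the return value of mysql_error()
$error = 'example error text';

// the period glues the two strings together, left to right
$message = "<b>Could not connect to database!</b><br />\n" . $error;

print $message;
```

Whatever sits on either side of the period is converted to a string and joined, so the same trick works for function results, variables and constants alike.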
Okay, so we’ve set up our link to MySQL; next we need to tell it which database we want to use.
mysql_select_db( 'tutorial', $link )
    or die( "<b>Could not select database.</b><br />\n\n". mysql_error() );
Again remembering to change tutorial to the name of your database as created in Part 1.
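One caveat for anyone following along on a newer server: the mysql_* functions were removed in PHP 7.0. A rough equivalent using the mysqli extension might look like the sketch below. The bab_connect() name and its parameters are hypothetical and not part of this tutorial’s files; it also selects the database in the same call, so there is no separate mysql_select_db() step.

```php
<?php
// Sketch only: the function name and parameters are assumptions.
function bab_connect( $host, $user, $pass, $db )
{
    // ask mysqli to return false on failure rather than throw an
    // exception, so the familiar "or die()" pattern still applies
    mysqli_report( MYSQLI_REPORT_OFF );

    // the 'p:' host prefix requests a persistent connection,
    // and the fourth argument selects the database
    $link = mysqli_connect( 'p:' . $host, $user, $pass, $db )
        or die( "<b>Could not connect to database!</b><br />\n" . mysqli_connect_error() );

    return $link;
}
```

You would call it once with your own values, something like bab_connect( 'localhost', 'username', 'password', 'tutorial' ), in place of the two statements above.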
Okay, now we’ve got an awesome file that we can use to connect to our database, but we’re not done here yet! Instead we’re going to start defining some ‘generic’ variables to do with our blog; stuff that we’re going to need in multiple places but which is likely to change if we move servers. With that in mind, stick the following in your file, above the database connection:
define( 'BAB_PATH', '/path/to/my/blog/dir' );
define( 'BAB_ADMIN_PATH', BAB_PATH .'/ba.admin' );
define( 'BAB_URI', 'http://url.to/my/blog/dir' );
define( 'BAB_ADMIN_URI', BAB_URI .'/ba.admin' );
Here is where you’ll need to know your path and URI information (which you should already have assuming you read Part 1 properly), plus the directory you are using for your admin folder. Change the values as appropriate.
What are we doing here? What the heck is a define()? Why aren’t we using variables? Good question. The main answer is because I’m using it as an example to show you what a define() is; you could just as easily use a variable (kinda). Values set by define(), however, unlike variables, can never be changed during the execution of a script, which in some respects makes them safer. They aren’t, however, affected by scoping (more on that later), which in some respects makes them less safe. While BAB_URI is more of a convenience thing, there’s something to be said for not allowing people to see your path info, which makes BAB_PATH semi-sensitive.
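The scoping point is easier to see than to describe. In this small sketch (the EXAMPLE_PATH constant, the $example_path variable and the function name are all made up for illustration) the constant is visible inside a function while the ordinary variable is not:

```php
<?php
define( 'EXAMPLE_PATH', '/path/to/my/blog/dir' );  // a constant
$example_path = '/path/to/my/blog/dir';            // an ordinary variable

function example_show_scope()
{
    // constants are visible everywhere, including inside functions
    $from_constant = EXAMPLE_PATH;

    // the outer variable is NOT visible here unless we explicitly ask for it
    $from_variable = isset( $example_path ) ? $example_path : null;

    return array( $from_constant, $from_variable );
}

list( $c, $v ) = example_show_scope();
// $c holds the path; $v is null because the variable was out of scope
```

That reach is exactly why constants are convenient for things like paths, and also why you should think twice about what you put in them.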
Anyway, we’ll likely be coming back to it later, but for now we’re done with values.inc.php. Our completed file should look something like this:
<?php
/***********************************************************************
FILE: values.inc.php
BEGUN: __TODAY__
AUTHOR: __YOURNAME__
PURPOSE: Set some basic values and connect to the database.
NOTES: From the Build-a-Blog v2.0 tutorial.
<http://void-star.net/archive.php/bab>
***********************************************************************/
// define some environment values
define( 'BAB_PATH', '/path/to/my/blog/dir' );
define( 'BAB_ADMIN_PATH', BAB_PATH .'/ba.admin' );
define( 'BAB_URI', 'http://url.to/my/blog/dir' );
define( 'BAB_ADMIN_URI', BAB_URI .'/ba.admin' );
// connect to the database
$link = mysql_pconnect( 'localhost', 'username', 'password' )
    or die( "<b>Could not connect to database!</b><br />\n". mysql_error() );
mysql_select_db( 'tutorial', $link )
    or die( "<b>Could not select database.</b><br />\n\n". mysql_error() );
?>
If you want, visit this page in your browser (remembering to access it from your webserver, not as a file); you should get a blank page with no output.
2.3 Form Processing 101
Okay, we’re finally ready to start writing our post-to-weblog page. Open up a new blank text file and save this one in your admin folder as newpost.php. But don’t rush in and start opening PHP tags just yet! It’s HTML time.
<html>
<head>
<title>New Post</title>
</head>
<body>
<h1>New Blog Post</h1>
<form id="newPostForm" action="<?=BAB_ADMIN_URI?>/newpost.php" method="post">
<input type="hidden" name="f" value="doNewPost" />
<fieldset>
<label for="ptitle">
Title:
<input type="text" id="ptitle" name="d[title]" />
</label>
<label for="ptime">
Time:
<input type="text" id="ptime" name="d[time]" />
</label>
<label for="ptext">
Text:
<textarea id="ptext" name="d[text]"></textarea>
</label>
<label for="psubmit">
Submit:
<input type="submit" id="psubmit" name="d[submit]" value="Make Post" />
</label>
</fieldset>
</form>
</body>
</html>
It sure ain’t pretty, but it’s a form and it works. Okay, I admit; it’s hideous. Take some time out and give it some nice CSS, I’ll still be here when you get back…
… okay, you’ve now got a more leet form. I saved my stylesheet as /ba.skin/default/admin.css. I can imagine that sometime down the track I am going to want to implement some kind of site skinning system, so I’m going to plan out my stylesheets now in a way that would make this task easier in the end. This isn’t by any means essential, but it hopefully demonstrates that right from the start you need to keep an idea in your head of where you want to end up.
Back to our ‘script’, observant people will notice that there is one teensy little piece of PHP that has slipped in; <?=BAB_ADMIN_URI?>. Remember that BAB_ADMIN_URI is one of the constants we defined earlier, and it’s hiding in something that looks like PHP short tags. But what is that equals sign doing? <?= ?> is something that I rarely see outside my own scripts (so I’m sure there’s some horrible flaw with it or something), but functionally it’s shorthand for <? print() ?>. If you view-source on the file, you’ll notice it’s currently printing the word ‘BAB_ADMIN_URI’ to the browser (on PHP 8 and later, an undefined constant stops the script with an error instead). This isn’t what we want, but it’s currently doing so because there’s one very important thing we’ve forgotten; we haven’t included our values file! Let’s do that now; add the following at the top of your new post file:
<?php
/***********************************************************************
FILE: ba.admin/newpost.php
BEGUN: __TODAY__
AUTHOR: __YOURNAME__
PURPOSE: Make new posts.
NOTES: From the Build-a-Blog v2.0 tutorial.
<http://void-star.net/archive.php/bab>
***********************************************************************/
// include our common files
require_once( '../ba.blog/values.inc.php' );
?>
require_once(), which you might not have seen before, works a bit like include() (and a lot like require()), in that it will include the file passed to it. Unlike include(), however, it will kill the script if the file cannot be found (include() will simply raise a warning and carry on). Since our script won’t work at all if the values file can’t be found, we probably want this as a nice smack in the face notifying us that our file cannot be found. The _once part tells the script that, no matter how many times we try and require the values.inc.php file, it should only ever be added in once.
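You can watch the _once behaviour for yourself with a throwaway script. This sketch is separate from our blog files; it writes a tiny temporary include (the $tmp file and the bab_loads counter are made up for the demonstration) and counts how many times it actually runs:

```php
<?php
// write a tiny throwaway include file that bumps a counter each time it runs
$tmp = tempnam( sys_get_temp_dir(), 'bab' );
file_put_contents( $tmp, '<?php $GLOBALS["bab_loads"]++;' );

$GLOBALS['bab_loads'] = 0;

require_once( $tmp );   // runs the file: counter is now 1
require_once( $tmp );   // already loaded, so skipped: still 1
$after_once = $GLOBALS['bab_loads'];

include( $tmp );        // plain include() runs the file again every time
include( $tmp );
$after_include = $GLOBALS['bab_loads'];

// tidy up the temporary file
unlink( $tmp );
```

Here $after_once ends up as 1 and $after_include as 3. Notice too that the temporary file has no .php extension at all, yet it is still parsed as PHP when included.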
Refresh and view-source the page again in your browser; it should now be filling in the form’s action with the full URI to our file. That means we’re working.
Let’s add one more line; above the require, add the following:
define( 'IS_ADMIN', true );
We’ll come to this later when we go through putting a login to our admin sections (more planning ahead).
Okay, we’ve got a form and a connection to the database, but it doesn’t ‘work’. Well, it does; try putting the following line in the PHP block underneath the require:
print_r( $_POST );
Now add some data into the form and press the submit button. Huzzah! There’s our data. print_r() is a great function; it’s a bit like print(), only it expands out arrays and objects so we can see the full contents. It’s excellent for debugging purposes. Anyway, now that we’ve proven that our data is being submitted, we don’t really need this anymore so delete the line.
Now we have to decide exactly what we’re going to do with our data. How are we going to get it into the database? Well, first we need to know a little about how PHP picks up data submitted to it in forms. It’s fairly straightforward; all form fields, as defined by their name value, go into a special array based on the form’s method. Our form uses post (as should yours; this prevents people from using simple query-string hacks to input data since this gets parsed as get), which means that our data is going into the $_POST array. An array is a special type of variable in PHP that can hold multiple different chunks of data, referenced by indexes (the word or number in the square brackets). I love arrays, so much, in fact, that I have actually passed our form data itself as an array; notice the name attributes of most of our fields hold values like d[word]. This tells PHP that they should be passed as an array. Complicated? Probably at this stage, but the more you use arrays the easier they become.
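You can try the bracket naming without a browser at all. PHP’s parse_str() function applies the same name-to-array rules to a raw query-style string; the sketch below feeds it a made-up form body (the field values are invented for illustration) and picks values back out of the nested array:

```php
<?php
// the kind of raw body a browser might send for our form (values are made up)
$body = 'f=doNewPost&d[title]=Hello&d[time]=now&d[text]=First+post';

// parse_str() turns bracketed names into nested arrays, much as PHP does for $_POST
parse_str( $body, $fields );

// plain fields are reached with one index, bracketed ones with two
$function = $fields['f'];
$title    = $fields['d']['title'];
$text     = $fields['d']['text'];
```

Every field named d[something] lands inside the one d array, which is what lets us hand the whole lot to a function in a single variable later on.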
If you noticed from our print_r() earlier, after submitting our form our $_POST array looks something like this:
Array
(
[f] => doNewPost
[d] => Array
(
[title] => dsdsds
[time] => dsdsds
[text] => dsdsdsdsddddddd
[submit] => Make Post
)
)
What is this telling us? Well, firstly it’s telling us that $_POST contains two keys; named f and d here. $_POST['f'] contains a simple string, doNewPost, however $_POST['d'] is in itself an array which currently has four keys of its own (title, time, text and submit) that hold the data we have entered into the form. Hopefully it’s reasonably obvious that d in this instance is shorthand for ‘data’, since this is the array holding exactly that. But what’s f for? Well, it’s short for function, because it’s going to tell us which one of those we want to use to parse our data. It’s going to do this, incidentally, because we’re going to make it, not because it’s some magic compulsory PHP thing. Remember what I was saying before about coder ‘style’? This is another example of that; we don’t have to process the form data this way, but we’re going to.
So let’s do just that. Back in the PHP section of your page, just underneath the require statement, put the following:
// isset() avoids an undefined-index complaint before the form is submitted
if( isset( $_POST['f'] ) && $_POST['f'] == 'doNewPost' )
    $r = doNewPost( $_POST['d'] );
See how we were using f as a marker letting us know when our form was submitted? When we see that it’s equal to our expected value (as opposed to not defined, as is the case before we submit the form) it tells us to call a function, doNewPost(), and passes the d array as a value. But wait a moment, what’s this doNewPost() function? You’ll notice that if you try and submit the form now it will give you an error telling you PHP cannot find the specified function; that’s because we’re going to write it ourselves.
Next time.
Part 1 Files: values.inc.php, newpost.php, admin.css.