24th November, 2006

Part 1: The Database

Friday, 8:24 am in Build-a-Blog v2.0

“What’s this?” I hear you ask in fascinated Jack Skellington style.  ”Build A Blog v2.0?  You mean… she’s actually re-writing it?”

That’s right, kids, welcome to the first instalment of the re-write of CodeGrrl’s seminal Build A Blog tutorial.  It’s a weighty undertaking; with its mighty promise, the BAB tutorial has been leading young, would-be coders to the murky waters of PHP for over two years now, and more power to it.  However, in that time it has been a source of tremendous frustration to ‘established’ coders as the coding foundations it lays down are shakey at best and outright dangerous at worst.  Nevertheless, the idea is still a great one and with that in mind I’ve tasked myself with re-writing the behemoth.  Wish me luck.

But I’m sure you’re not here to hear about that; you’re here to learn how to write your own blog, from scratch, and hopefully give yourselves some solid foundations in PHP while you’re at it.  So hold on tight, kids, and here we go…


1.1 What You’ll Need…

First off, you’re going to need a basic background in PHP and maybe a little MySQL.  It doesn’t have to be extensive, but if you’ve never worked with the language (or any language) before this probably isn’t the turorial for you.  While I’m going to try to explain my code as I go along, it’s highly likely that I will miss out on important details like “what is the difference between a variable and a value” and “what is an if-statement”.  There are lots of good PHP 101 tutorials on the internet (and also a lot more bad ones); start there first, then come back.  It almost goes without saying, but I should also point out that if you want to actually learn to code you will have to read further than just this tutorial.  This tutorial will (hopefully) teach you how to write a very basic weblogging script; it’s not going to teach you how to be a 1337 coder.  Only you can teach you that one.

With the attitude stuff out of the way, what else are we going to need?

  • A Webserver:  For this tutorial I’m going to be assuming that you’re running Apache on a *NIX-like environment.  If you’re not, no biggie but be aware that there may be some information (specifically pathing locations) that might need changing; I’ll try and point these out as we go along.  If, for some odd reason, you’re running PHP on IIS… er, can’t help you.  Should work.  I guess…  Also, because the first few stages of this script are going to be about as secure as leaving your money under your doormat, it’s probably a good idea to start developing your blog on your own computer.  This isn’t as hard as it sounds.  If you’re running Windows, you can grab one of the many bundled Apache/PHP/MySQL packages such as XAMPP.  Just run the installer and you’re good to go.  For those of you running OS X or *NIX variants, you’ve probably already got the necessary stuff installed on your machines.  If not, bundled packages are also avaliable for most distributions; check your friendly local package manager.  If all else fails, you can use a public, online server but it’s advisable to keep your working directory password locked via .htaccess or, at worse, security-by-obscurity.  Don’t for the love of grock make your domain’s index page your development environment.  Seriously.  Just don’t.
  • PHP & MySQL:  See above.  As with everything, it’s best to use the latest stable releases of these two products.  More on both of these as we come to it.
  • Your Database Login Info:  Depending on where you’re doing your development, you are either going to have one or two sets of login information for MySQL.  The first – which you will have if you’re running MySQL on your home machine, or if you own your own domain – will be what’s known as the root account.  In fresh MySQL installs this is often root with a blank password.  Not particularly secure, but okay for a standalone machine.  The root account should have full permission over all aspects of the database; never ever use the root account in PHP scripts in a live environment.  For this, you want a second account with limited privileges; it should only have access to the blog database, and only have the SELECT, INSERT and DELETE privileges.  Most CPanels have a GUI tool that will let you set this sort of thing up, as does phpMyAdmin (see below).
  • phpMyAdmin:  An invaluable tool that no-one working with PHP and MySQL should do without.  phpMyAdmin provides a web-based GUI for working with your database; most domains will come with one installed at the CPanel, though you can also install it yourself separately.  Remember if you’re putting it on a public server to pay careful attention to the security (since it has godlike access over your databases).  Technically, you can do without phpMyAdmin but it’s a bit like trying to reheat dinner without a microwave.  I wouldn’t want you to go without dinner.
  • Your Server Path Info:  This includes both the path and the URI of the webserver which will be hosting your blog.  The path is the location of your files according to the filesystem.  On Windows they look like C:\Some Dir\Some Other Dir\htdocs\myblog, while on *NIX variants they are frequently things like /users/home/myname/www/myblog.  URIs, on the other hand, are web addresses; http://mydomain.com/myblog or http://localhost/myblog.  If you plan on developing your site locally and moving it online later (as you all should be), you will need to know the information from both.
  • A Text Editor:  To write your files.  Technically, you can use Notepad, but your life will be much, much easier by investing in something like UltraEdit.  Something with syntax highlighting and auto-indenting.  On my mac, I use Smultron.  *NIX people, if you’re feeling adventurous you can try something like vi or emacs, but GUI text editors are a dime a dozen.
  • Good Music and Some Snacks:  Whack on the iTunes and grab yourself a latte/juice/Coke/whatever.  The most important step.

Got all that?  Honest?  Really, really honest or did you just skim down the headings waiting for the spoonfeeding?  Because if you did, you might as well stop reading right now.  Seriously; coding is a tricky endeavour and if you’ve never done it before the learning curve is monstrous.  I’m not writing all this stuff for fun, I’m writing it to try and help you not only learn to code, not only to produce your own super-leet blog script, but also to try and help you avoid the traps and pitfalls I had when I was first starting out.  So you’re gonna have to do some reading.  And if you don’t get it, re-reading.

Okay, now we’re ready.


1.2 A Cunning Schema

The first step for our blog, and in fact most any project*, is to design what’s know as a schema.  A schema is, well, the overall ‘appearance’ of your database; how you’re going to store the data, what you’re going to store and how it relates to other things you’ve stored previously.

But first, a little about MySQL.

MySQL is what’s known as a relational database.  It’s one of the most widely-used and popular relational databases, mostly because it’s free and pretty much every single personal domain on the internet uses one.  It’s pretty good, as far as databases go; it’s missing some of the more advanced features of, say Oracle, but it’s nothing you’re going to miss at this stage.  Besides, PHP loves MySQL and has all kinds of handy in-built functions to interface with it.  Believe me, when you’re tried to do something horrible such as, say, making PHP interface to Access or db2 over ODBC you’ll be eternally grateful for the mysql_ functions.

Like all relational databases, MySQL is made up of a collection of databases.  Each separate database contains zero or more tables.  Tables can talk to other tables within the same database**, and are described as being made up of fields.  Data is stored in table rows.  There are a couple of important concepts in database design, most of which fall under the heading of what’s known as normalization.  You might hear programmers talking about ‘First Normal Form’, for example, and though there are lots of them and they get quite technical, here is a brief summary:

  • Each row in a table should have some kind of unique identifier.  This is most commonly (but not always) an arbitrarily-assigned automatically incrementing number and is known as a primary key.
  • Each individual field should contain only one ‘bit’ of data (this is probably the most commonly broken Normal Form; I do it all the time).
  • Each table should only contain data about one type of thing.

Which probably all sounds a bit vague and abstract right about now, but hopefully should get clearer.

Finally, MySQL databases – again, like all relational databases – use what are known as SQL queries.  SQL stands for ‘structured query language’ and it is the language that MySQL understands in order to make it do things.  Between relational databases, all SQL is similar but not always 100% the same.  It’s also fairly human-readable, so if you’re unsure about a query read it out loud and see what it sounds like it should be doing.

Anyway, we want to make a blog, right?  So, the first step is to think of what different ‘chunks’ of information go into a blog post.  A very, very basic list probably looks something like this:

  • The date the post was made.
  • The title of the post.
  • The text of the post.

And that’s probably it, really.  There are, of course, more things you could add; mood, music, category, tags… the list goes on.  Since this is your blog that you are writing for you to use, feel free to chuck in additional features as we go along if you feel confident (being wary of featuritis).  If not, you can always go back in and edit them later.  Note that at this stage there are a couple of things which definitely don’t go in this table; the first is comments, and the second is extensive author/user information.  Both of these things should go into a separate table, in keeping with dot-point three above.  You can also make good arguments for ‘currently’ information (mood, music), tags and category definitions to also go in separate tables, but that’s a skinning cats argument***.

Anyway, with our basic schema in mind we move onto step two; deciding what kind of data we want to store.  MySQL has a vast multitude of types that can be assigned to various fields.  The most basic of these is TEXT, which is essentially just vast strings of characters.  It’s perfectly suited to the body of our posts.  Closely related is VARCHAR; functionally, the VARCHAR type is pretty much like text except it can be assigned a length.  So, for example VARCHAR(255) can store up to 255 characters, while VARCHAR(1) can only store single letters.  It’s perfect for any text field where we want to control the ultimate length of input, such as our post title field.  We should pick a length that is restricted but still lets us put in some punchy headlines; I like the number 50.  Finally, we need to decide what we’re going to do about the date.  We basically have two options here; TIMESTAMP and DATETIME.  TIMESTAMP stores dates as what’s known as epoch time, or the number of seconds that have passed since the first Unix machine was turned on (midnight UTC, 1 January 1970).  They are not particularly readable, but most PHP date functions use epoch time as their base.  DATETIME is much more understandable – it stores dates in the format YYYY-MM-DD HH:MM:SS – but requires a bit of extra conversion to manipulate in PHP.  Personally, I hate TIMESTAMP with a passion (long story, has to do with timezones), so we’re going to use DATETIME.

“But wait!” I hear the more perceptive of you crying.  “What about our primary key?”

Good point.  In this instance, our primary key is going to be the tried-and-true auto-incrementing integer, mostly because there’s not a lot of other unique data we could use.  I suppose you could use the subject, but what happens if you make a post without a subject?  Besides, after seven years of blogging I can tell you this; no matter how creative, you will re-use subject lines eventually.

Okay, so we’ve got our basic design.  Now how do we get it into MySQL?


1.3 MySQL 101

From here on in we’re going to start executing some SQL queries.  Because I like skinning cats, I’m going to write out the basic queries and leave it up to you to decide how you want to execute them; command line, phpMyAdmin, whatever.  If you’re very precocious, you can even write a basic SQL input script (or copy someone else’s).  Up to you.

Anyway, the first step here is to actually create the database you want to store your blog in.  Think of a name for it.  I’m going to call mine tutorial but don’t use that for yours; this is step one of security-by-obscurity.  Having an obscure schema or naming convention is one of the most basic things you can do to try and prevent hack attempts.  It’s by no means where you should stop, but it does add one more little hiccup along the road.

CREATE DATABASE `tutorial`;

Simple.  Execute that code in your query manager of choice and you should now have a shiny new empty database.  Note that if you’re running CPanel on a public domain it’s highly likely you won’t be able to do this; instead you will need to go via your CPanel MySQL page.  While you’re there, you should also set up (if you haven’t already) the user under which your script will eventually run.  In phpMyAdmin, you can do this via the ‘Privileges’ link on the main pMA page.  I’m going to call my user bab and give it a password of ‘password’.  You should make your own login info up.  I’m also going to give it extremely limited privileges, and only in the blog database.  I’ve had a couple of scripts hacked in the past and had the hackers execute DROP commands costing me all my data; not fun, and easy to prevent.

CREATE USER bab@localhost IDENTIFIED BY 'password';
GRANT SELECT, INSERT, UPDATE, DELETE ON `tutorial` TO bab@localhost;

Now that we’ve seen two SQL queries, there should be a couple of things you notice.  The first one is the mixing of cases.  SQL commands aren’t case-sensitive (but table and field names are; database names depend on if you’re running Windows or *NIX), but the convention is to write them in uppercase.  It’s also convention, but not necessary, to quote database and table names using backticks (the key next to the 1 on a QUERTY keyboard), and other attributes using apostrophes.  Finally, all SQL queries should be terminated by a semicolon.

Now we’ve set up our user and our database.  Note that some of you might not be able to execute this step quite so cleanly, likely because you’re on shared hosting and only get one database.  That’s okay; it’s not optimal, but it’s okay.

Our second step is to make sure we’re actually working on the tutorial database.  If you’re using pMA, this is a matter of simply selecting the appropriate name from the drop-down menu, but the raw SQL is:

USE `tutorial`;

Next we want to actually create the table that will store our blog posts.  This is another instance where you will need to decide on a couple of names for things.  Naming conventions are actually very important in programming, and everyone’s got a favourite method.  All throughout this tutorial I’m going to be using my personal preferences for variable, function and database names but feel free to adapt what I write to what looks good and makes sense to you.  Want BiCapitalization?  Use it!  Need more_underscores?  Sure thing.  camelCaps?  Them too.  The only real rule is to keep it consistent and logical (at least to you, if not to anyone else).

CREATE TABLE `blogentries` (
  `postID`        INT( 7 ) UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  `postDateTime`  DATETIME          NOT NULL,
  `postSubject`   VARCHAR( 50 )     NOT NULL,
  `postText`      TEXT              NOT NULL
) COMMENT = 'Log posts for BaB.';

Note that if you’ve only got one database available to you, it might be worth prefixing the name of your table with bab_ or something similar.

The other thing to remember about SQL is that it collapses whitespace (newlines, spaces, tabs and so forth).  So feel free to indent as much as you want.

Anyway, here is our basic table creation command.  We’ve created four fields with fairly descriptive names and assigned them sensible types.  We’ve defined postID as the primary key, and set it to auto-increment.  This means that now when we add some data to the table, we can omit a value for postID and it will be automagically backfilled with the next appropriate unused number.  Notice we’ve also made postID as an unsigned integer.  Setting an integer to unsigned tells the database (actually the underlying OS architecture) that we only want postive numbers in the field (which can also include zero).  Essentially it just means that we can potentially put more biggerer numbers since MySQL doesn’t have to ‘reserve’ a bit to delimit negatives.  It’s kinda technical.

As for the NOT NULL stuff…  er, well.  I dunno really.  I rarely ever use NULLs in MySQL, so it’s just kind of there as a default.

Finally, we’ve added a short comment letting us know what the table is there for.

And that’s it; you’ve made your database and learnt some good fundamentals on design, normalization and naming while you’ve been at it.  And to think, we haven’t even touched PHP yet.  That’s next module, trust me.  So, stay tuned, we’ll be right back…

____________________________________________
* Okay, technically it’s the second step; the first step is figuring out what the damn project is to start with.
** Technically, I do believe it is possible to make tables from different databases talk to one another.  However it’s not common in personal web applications and, quite frankly, I’ve never done it.  Ever.
*** As in, “there’s more than one way to”.

Comments

  1. User Avatar

    you should make me do this some time. I’ll be your guinea rat

  2. User Avatar

    You use PHP at work, right?  Do it at there and call it ‘professional development’. tongue.png

  3. User Avatar

    Not a bad idea, boss is away for a fortnight, so it may be quiet…or crazy.

  4. User Avatar

    Oooh… much more detailed than the original!  
    Actually, this is a much better start than the other one has.  Some time ago, I finished the original BAB tutorials and made my own blog.  But I never used it because I thought the code was (and IS) a mess.  And it wasn’t very secure either.  So instead of fixing it, I said to myself,  “I’ll abandon this and write something from the beginning when I have more experience”.  That’s what I have in mind.

    I learn something new every time I read articles and tutorials like this one.  So thanks.  smile.png

  5. User Avatar

    Heh, I taught myself PHP by writing sk.log, so I’m a vast and proud supporter of the “write your own blog/CMS” trend; the only way you learn a language is by thinking up a project, sitting down and actually doing it. Following tutorials will only get you so far, in the end you’ve gotta just bite the bullet and beat out your own code, even if it’s for something really basic.

    My main problem, though, is that I have an attention span of approximately three weeks. Three weeks of tutorial and then it was all like, “Meh… I could continue but… ooh, shiny!” sweatdrop.png

  6. User Avatar

    It’s in the part of writing a user interface and control panel where I get lazy with scripts…  I have been working on a script for people who want to have their personal links manager (like bookmarks) for a while now.  The entire planning is done (I always do that first), but I started getting lazy when I reached the most important part.  And once I stop, it’s hard to get back to it.  But I really want to finish it, so I’ll try again later.

    Yes, writing your own blog or CMS with the purpose of learning from it is a wonderful idea!  Whenever I want a script and I can’t find one that is to my liking, or that suits my needs, I write the idea down on paper or the computer, hoping to write my own version someday.  It’s because I am always unhappy with what I find. I always want to add or change something.  That is why I try to stick with small scripts.  Sometimes I feel it’s better that way.  If I feel like I want to change something, it’s easier with smaller scripts, I think…

  7. User Avatar

    Haha, yeah. sk.log’s been floating around for, what, seven years now and it still doesn’t have admin features for half of it’s shit. That’s the major reason I never really released it publicly. tongue.png

  8. User Avatar

    sk.log sounds like it’d be an awesome script if you finished it and made a public release, to be honest.  I was looking at the screenshots and reading about its features last night.
    Is sk.log’s code decent and understandable enough for someone who wants to learn from it and play around?  I ask because I know it’s old.

Add Comment
auto insert line breaks
use log.code
use smilies
Verification
  • v-s.net v0.6 and all content (unless noted) © Dee.
  • sk.log v0.6 spat this out in 2.102 seconds.
  • 78 / 181,759
artistic-twobyfour