Character Type Functions
19 Dec 2004
An oft-overlooked PHP extension is ctype, a collection of functions that tell you whether every character in a string belongs to a particular character class, such as alphanumeric. This extension is built-in as of PHP 4.3.0, so you may not have to do anything special before you can start using it.
The ctype functions are particularly useful for handling $_GET and $_POST data. Scalar elements in these superglobal arrays always arrive as strings (a field named like foo[] arrives as an array of them), and because they are sent by the client, you must treat them with suspicion.
Security-conscious PHP developers frequently use regular expressions to filter external data. While this is still the best approach in many cases, there are a few common character classes that are easier to filter with ctype functions:
- ctype_alnum - alphanumeric
- ctype_alpha - alphabetic
- ctype_cntrl - control
- ctype_digit - numeric (decimal digits only, so no sign or decimal point)
- ctype_graph - printable (except whitespace)
- ctype_lower - lowercase
- ctype_print - printable
- ctype_punct - punctuation (defined as any printable character that is not whitespace or alphanumeric)
- ctype_space - whitespace
- ctype_upper - uppercase
- ctype_xdigit - hexadecimal
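To get a feel for a few of the functions listed above, here is a minimal sketch that runs them against some made-up sample strings, followed by a comparison with a commonly written regular expression. The sample values and variable names are assumptions chosen only for illustration.

```php
<?php
/* Hypothetical sample inputs, chosen only for illustration. */
$samples = array('abc123', 'abc 123', 'DEADbeef', '2004');

$results = array();
foreach ($samples as $sample)
{
    $results[] = array(
        'input'  => $sample,
        'alnum'  => ctype_alnum($sample),
        'digit'  => ctype_digit($sample),
        'xdigit' => ctype_xdigit($sample),
    );
}

var_dump($results);

/*
 * In a regular expression, $ also matches just before a trailing newline,
 * so this common pattern accepts "abc123\n". ctype_alnum() checks every
 * character, newline included, and rejects it.
 */
$tricky   = "abc123\n";
$regex_ok = preg_match('/^[A-Za-z0-9]+$/', $tricky) === 1;
$ctype_ok = ctype_alnum($tricky);

var_dump($regex_ok, $ctype_ok);
```

The space in 'abc 123' makes all three checks fail, because every character in the string must belong to the class. The regular expression can be tightened with \z or the D modifier, but the ctype call needs no such care.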
A nice side-effect of using ctype functions is that they take locale into account. For example, I consider alphabetic characters to be [A-Za-z], but this isn't true everywhere. In fact, many common European names have characters that are not accounted for in my simplistic pattern.
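The locale behavior can be sketched roughly as follows. This is only an illustration under stated assumptions: the German locale names are guesses that may not be installed on a given system, and because ctype inspects individual bytes, the test character is the single-byte ISO-8859-1 encoding of "ü" rather than its UTF-8 form.

```php
<?php
/* Byte 0xFC is "ü" in ISO-8859-1; used here as a single-byte test character. */
$u_umlaut = "\xFC";

/* In the "C" locale only A-Z and a-z count as alphabetic. */
setlocale(LC_CTYPE, 'C');
$in_c_locale = ctype_alpha($u_umlaut);

/*
 * Assumption: one of these single-byte German locales may be available.
 * setlocale() tries each name in turn and returns false if none is installed.
 */
$locale = setlocale(LC_CTYPE, 'de_DE.ISO-8859-1', 'de_DE.ISO8859-1', 'de_DE@euro');
$in_german_locale = ($locale === false) ? null : ctype_alpha($u_umlaut);

var_dump($in_c_locale, $locale, $in_german_locale);

/* Restore the "C" locale so later code is unaffected. */
setlocale(LC_CTYPE, 'C');
```

If none of the assumed locales is installed, $in_german_locale stays null and the sketch shows only the "C" locale result.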
Here is an example using ctype_alnum() that tests whether $_POST['username'] is alphanumeric:
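<?php
$clean = array();

/* Accept the username only if it was submitted as a string and is alphanumeric. */
if (isset($_POST['username'])
    && is_string($_POST['username'])
    && ctype_alnum($_POST['username']))
{
    $clean['username'] = $_POST['username'];
}
else
{
    /* Error */
}
?>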
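The same pattern extends to several fields at once. The following is a rough sketch, not a definitive implementation: filter_ctype() is a hypothetical helper (it is not part of the ctype extension), and $post is a stand-in for $_POST with invented field names so the code runs outside a web request.

```php
<?php
/*
 * Hypothetical helper: returns $source[$key] when it is a string accepted
 * by the named ctype function, otherwise null.
 */
function filter_ctype($source, $key, $ctype_function)
{
    if (!isset($source[$key]) || !is_string($source[$key]))
    {
        return null;
    }

    return $ctype_function($source[$key]) ? $source[$key] : null;
}

/* Stand-in for $_POST; the fields and values are made up. */
$post = array('username' => 'chris2004', 'age' => '27', 'zip' => '1000 1');

/* Which ctype function each field must satisfy. */
$rules = array(
    'username' => 'ctype_alnum',
    'age'      => 'ctype_digit',
    'zip'      => 'ctype_digit',
);

$clean  = array();
$errors = array();
foreach ($rules as $field => $check)
{
    $value = filter_ctype($post, $field, $check);
    if ($value === null)
    {
        $errors[] = $field;
    }
    else
    {
        $clean[$field] = $value;
    }
}

var_dump($clean, $errors);
```

Only values that pass their check ever reach $clean, so the rest of the script can trust whatever it finds there. Here 'zip' fails because of the embedded space and is recorded in $errors instead.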
There are plenty of cases where a regular expression is still best, but I think the ctype functions are worth a look.