Friday, January 25, 2008

Five Things Web Developers Should Stop Doing

This may not come as a surprise, but I spend a lot of time on the Internet. Whether it’s browsing around for my own enjoyment or diligently working on a web-based application, I end up seeing both the end result and the inner workings of a lot of other people’s development work. And while a large majority of design elements are ultimately a matter of preference, there are certain web development techniques and implementation choices that I find myself shaking my head at, and I’d like to address a few of them here. The following is a list of five things that, in my opinion, web developers should simply stop doing.

1 – Including application code and HTML in the same file

Although many web scripting languages are tailored for alternating between application code and HTML by use of special tags, the failings of this architecture become apparent fairly quickly when developing robust web applications. Not only does this inline scripting method create messy, often confusing code, but it can discourage effective use of functions and make it harder to split the designer and programmer roles between people who may not share one another’s skill sets. The answer here is to use a templating system to separate the application code from the HTML presentation. Templating functionality is available for nearly every web development language, and is an integral part of pretty much any development framework (e.g. Ruby on Rails, CakePHP, FUSE).
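To make the separation concrete, here is a minimal sketch in PHP rather than a real templating engine. The function names, the {{placeholder}} syntax, and the sample data are all hypothetical and chosen only for illustration; a framework's templating layer would do far more.

```php
<?php
// Application code: gathers data and contains no HTML.
// The article data here is hypothetical; a real application would query a database.
function load_article(): array
{
    return ['title' => 'Five Things', 'author' => 'J. <Doe>'];
}

// Tiny illustrative renderer: swaps {{name}} placeholders for HTML-escaped values.
function render_template(string $template, array $vars): string
{
    $replacements = [];
    foreach ($vars as $name => $value) {
        $replacements['{{' . $name . '}}'] = htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8');
    }
    return strtr($template, $replacements);
}

// Presentation: markup the designer owns (normally kept in its own file).
$template = "<h1>{{title}}</h1>\n<p>By {{author}}</p>\n";

echo render_template($template, load_article());
```

The point is only that the markup and the data-gathering code can be edited independently, which is the property a full templating system gives you.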

2 – Embedding video with a technology other than Flash Full Motion Video

Until Flash FMV became widely available, a common method for video playback on websites involved encoding multiple versions of the same video, then asking the user which player he or she preferred to use (e.g. RealPlayer, Windows Media, or QuickTime). This was always a necessary evil, as developers needed to ensure that the site content was available to all visitors. However, presenting potentially confusing video preference questions to the user can often lead to abandonment, not to mention that encoding, uploading, and linking multiple versions of the same video can be a time-consuming process.

Thanks to the introduction of full motion video capabilities in Adobe Flash, which has shipped alongside the most popular browsers for several years, developers now have some level of certainty that at least one video player will be available to the majority of users. Additionally, Flash FMV removes the need to spawn an external application for playback, which is another scenario that can lead to abandonment if the site visitor is unsure of how to answer their browser’s security questions.

Although sites such as YouTube have unfortunately given many people the impression that Flash FMV is only capable of low quality videos with poorly synced audio, this is simply not the case. Adobe has even launched a “Flash HD gallery” (available at http://www.adobe.com/products/hdvideo/hdgallery/ ) that showcases Flash’s HD playback abilities. However, most embedded videos (news clips, etc.) are short, small clips that download quickly, so even a medium or low quality encode will suffice. If your specific needs dictate that you must leverage the more advanced features of players like QuickTime or Windows Media, then you will have to use what best suits your end goal, but otherwise, stick with Flash.

3 – Implementing Flash pieces that introduce custom UI elements

Flash is a phenomenal addition to any developer’s toolkit, and well-designed Flash pieces can significantly enhance both the aesthetics and functionality of a website or web-based application. However, one thing that many Flash developers fail to steer clear of is overusing Flash where it’s not necessary, to the point of introducing custom user interface elements that can end up hampering usability. As an example, consider something that’s unfortunately fairly common – a Flash-based block of text with a scrollbar that is also implemented within the Flash piece itself. Not only is this an unnecessary use of Flash, since the same effect can be accomplished with fairly simple CSS, but you may be alienating visitors who simply aren’t tech-savvy enough to adjust their understanding of the browser’s UI elements on the fly. It may not be apparent to some users that they’re even looking at a scrollbar, especially if the bar is stylized or implemented in such a way that it doesn’t behave like the standard scrollbar. Your visitors are used to the way their browser functions and how they use its features to browse the web, so your best bet is not to alter basic UI elements.

4 – Using long query strings where they’re not necessary

Most web applications rely on the URI’s query string to bring in relevant data that is acted upon by the application code, but poor design choices often cause the query string to grow to unreasonable, unnecessary lengths. Long query strings can severely hamper the ease with which users can link to particular pages on your site, so you could very well be losing visitors because they didn’t quite get the full query string when a friend copied and pasted it over to them.

The first thing to do to clean up your query strings is simply to use small identifiers for both variables and values. Try to use numeric IDs instead of long text strings to identify a specific resource, and keep your variable names short. You should also avoid passing data that could easily be extracted from data you already have. For instance, don’t pass both an item name and item ID through the query string – you can just pass the item ID and pull the name from the database.

If you want to go a bit further in cleaning up or even eliminating query strings, look into URI rewriting. URI rewriting is a fairly simple process by which the friendly URI the user sees (e.g. /Blog/2008/01) is transparently translated into something more useful on the server side (e.g. blog_list.php?year=2008&month=01). Nearly all Model-View-Controller frameworks (Ruby on Rails, FUSE, CakePHP, etc.) have advanced techniques for rewriting the URI on the fly.
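As a rough illustration of the translation step only, the sketch below maps the friendly URI from the example above onto the internal script name. The function name is hypothetical and PHP 7.1 or later is assumed. In practice this work is usually done by the web server's rewrite rules or by a framework's router, not by hand-written code like this.

```php
<?php
// Translate a friendly URI such as /Blog/2008/01 into the internal form
// blog_list.php?year=2008&month=01. Returns null when the URI does not match.
function rewrite_blog_uri(string $uri): ?string
{
    // Four-digit year, two-digit month, optional trailing slash.
    if (preg_match('#^/Blog/(\d{4})/(\d{2})/?$#', $uri, $matches) !== 1) {
        return null;
    }
    // Pass '&' explicitly so the result does not depend on php.ini settings.
    $query = http_build_query(['year' => $matches[1], 'month' => $matches[2]], '', '&');
    return 'blog_list.php?' . $query;
}

var_dump(rewrite_blog_uri('/Blog/2008/01'));
var_dump(rewrite_blog_uri('/Blog/latest'));
```

The first call yields the internal URI from the example, while the second returns null because the path does not fit the year/month pattern.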


5 – Sizing images by means of the width and height attributes of the img tag

This one should be a no-brainer, but I still see it fairly regularly. While it’s quick and easy to force an image down to a certain size by using the width and height attributes of the img tag, you’re doing yourself a disservice in at least two ways by utilizing this technique. The first problem with this method of image sizing is that web browsers aren’t particularly good at shrinking or enlarging images. Many browsers do little or no resampling, so you often get a pixelated version of your original image, even when shrinking it. The second issue with sizing images by way of the browser is that you may be wasting a lot of bandwidth. An image that’s 1000 pixels by 800 pixels has a much larger file size than one that’s 200 by 160 pixels, so if you’re forcing it to appear at the smaller size anyway, you’re transferring a lot of extra data for no reason by resizing it on the client side. To resize images to the size you need, use any of the widely available free tools or websites that allow you to do so.
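If you would rather resize on the server than use a desktop tool, something along these lines may work. This is a sketch that assumes PHP 7.1 or later with the GD extension installed; the function names, the JPEG-only handling, and the quality value of 85 are my own choices for illustration.

```php
<?php
// Compute dimensions that fit inside a bounding box while preserving the
// aspect ratio. Pure arithmetic; never enlarges the image.
function scaled_dimensions(int $width, int $height, int $maxWidth, int $maxHeight): array
{
    $ratio = min($maxWidth / $width, $maxHeight / $height, 1.0);
    return [max(1, (int) round($width * $ratio)), max(1, (int) round($height * $ratio))];
}

// Resample a JPEG file to fit the bounding box and save the smaller copy.
// Requires the GD extension (an assumption of this sketch).
function resize_jpeg(string $source, string $destination, int $maxWidth, int $maxHeight): bool
{
    $original = imagecreatefromjpeg($source);
    if ($original === false) {
        return false;
    }
    $width = imagesx($original);
    $height = imagesy($original);
    [$newWidth, $newHeight] = scaled_dimensions($width, $height, $maxWidth, $maxHeight);

    $resized = imagecreatetruecolor($newWidth, $newHeight);
    imagecopyresampled($resized, $original, 0, 0, 0, 0, $newWidth, $newHeight, $width, $height);

    return imagejpeg($resized, $destination, 85);
}
```

With the numbers from the paragraph above, a 1000 by 800 image limited to a 200 by 200 box comes out at 200 by 160, and only that smaller file needs to be sent to the visitor.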


So there you have it – just a few things I’ve seen during the course of my Web travels that I personally think should be done away with. Especially as development trends continue to shift toward more user-friendly, AJAX-enabled “Web 2.0” applications, it’s important to remember to leave behind techniques that, although familiar, have either been deprecated or were never great ideas in the first place.

PHP Security in a shared hosting environment

Since its inception in 1994 as a set of basic development components, PHP has grown into one of the web’s most powerful development engines, having since been installed on literally millions of servers worldwide. And although PHP offers both the versatility and the built-in functionality to run in a reasonably secure fashion, most of those servers are configured in such a way that PHP scripts are at high risk for compromise.

Most PHP-enabled webservers are configured in such a way that the mod_php Apache module is loaded along with Apache itself, thereby allowing HTTP requests to be passed through the PHP engine, which preprocesses the data before it is sent to the client. While this configuration provides a simple, efficient way to get PHP up and running, it raises security issues when working in the most common webserver environment: shared virtual hosting.

Generally, it is unnecessary and wasteful to dedicate an entire server to hosting just one website. Since most sites demand only a small fraction of a server’s available resources, it is more common to have one server be home to a large number of virtual hosts. A virtual host is simply a configuration entry that points requests for a specific hostname (for instance, www.mydomain.com) to a particular directory (for instance, /home/www/mydomain.com).

The shared hosting model, though economical, immediately presents a security concern, since the HTTP server (for instance, Apache or Microsoft IIS) needs to have a considerable amount of control over the files and directories that are to be served to the client. If your application offers the ability to upload files posted through web forms, the problem is further compounded since the HTTP server now needs write permission on the destination directory. In the common virtual hosting configuration discussed above, if the HTTP server has write permission to that directory, then any user running a PHP script on that same server can also write to the directory. Obviously, this presents a major security concern.

However, there are steps that can be implemented, as a server administrator or as a user, that will eliminate or mitigate the security issues, or at least isolate individual users so that a script exploit on one host cannot easily affect other hosts on the same server. In this article, I will discuss a few methods for more securely configuring PHP, and will offer some security-conscious techniques to use when writing applications.

For the sake of convenience, I have grouped the article into three categories: configuration directives and environment settings that can only be changed by a server administrator, a basic overview of PHP wrapping for the application developer, and general practices for securing PHP code. Even if you are not administering your own server, I recommend reading through the first section in order to gain an understanding of the problem so that you can know what to expect from your web host. This article makes the assumption that your environment has PHP running on a Linux/Unix variant, with Apache acting as the HTTP server.


From the administrator’s perspective: Configuring your shared PHP environment

As the administrator of a virtual hosting server, you have full control over the HTTP server and the PHP engine, which is the ideal condition for tuning and securing your environment. First, let’s talk about separating the PHP interpreter from the Apache server, so that we negate the file permissions problem discussed above.

The PHP interpreter can be invoked in three different ways: as an Apache module (discussed above), as a CGI binary, and as a CLI. Since the CLI (Command Line Interface) isn’t relevant for serving web pages, I won’t be addressing it in this article. As mentioned above, the most common (and often default) method for invoking PHP is as an Apache module. However, let’s look at an alternative way of invoking PHP – namely, as a CGI binary.

When PHP is invoked as a CGI binary, Apache launches the PHP interpreter only when needed, passing the necessary input (environment variables, POST data, etc.) to the PHP executable, then collecting the output and sending it to the client. In this scenario, the PHP process is separate from the Apache process, which makes it possible to run the two as different users, thereby eliminating the permissions problem discussed above. However, by default, PHP is run as the Apache user, so we haven’t yet solved the problem simply by running PHP as a CGI binary. Our next step is to “wrap” PHP so that it is invoked as a user that we specify, not as the Apache user.

Note: Installing PHP as a CGI binary can introduce other security concerns that are worth being aware of. While PHP is generally secure out of the box, it is advisable to take a look at http://us.php.net/manual/en/security.cgi-bin.php.

While Apache does have its own mechanism, suEXEC, for wrapping CGI programs, I will not be discussing it in this article. Instead, we’re going to look at another open source package: suPHP. Written by Sebastian Marsching, suPHP is a fairly simple Apache module that nicely wraps the PHP binary “in order to change the UID of the process executing the PHP interpreter” (suphp.org).

In my experience, installing suPHP has always been a fairly pleasant (as far as these things go) endeavor. You will need to refer to the suPHP instructions at http://www.suphp.org for installation information for your particular UNIX distribution, but as a lightweight application that makes use of Apache’s dynamic module API, installation of suPHP should be trivial.

There are many articles on using PHP’s ini directives to help lock down your server, and although that is not the focus of this article, I would now like to briefly touch on a few directives you should be aware of.

First, be sure to enable PHP’s safe_mode. Although some older applications have trouble running with safe_mode enabled, most have been updated to account for this directive, and its benefits are simply too numerous to ignore, especially if you are not able to use a PHP wrapper such as suPHP. Safe mode was originally slated for removal in PHP 6 in favor of alternative methods of implementing file and directory security, and was ultimately deprecated in PHP 5.3 and removed in PHP 5.4; on versions that still offer it, you should leave it enabled.

Next, if you cannot implement suPHP or another PHP wrapper, it’s a good idea to set open_basedir in all of your virtual hosts. Set “php_admin_value open_basedir /path/to/vhost/root” as a directive in your virtual host configuration to ensure that PHP is restricted from reading any files outside of the virtual host’s document root.

Finally, have a look at the disable_functions directive. While you do want to make sure that your security procedures don’t prevent your users’ applications from running as they should, it’s often the case that few, if any, users will need the more potentially hazardous functions such as passthru(), exec(), and shell_exec(). If none of your users need these functions, it’s a good idea to disable them.

(Note: In the event that only one user or application needs these functions, suPHP allows you to specify individual php.ini files for specific virtual hosts, which offers a middle ground between allowing these functions globally and restricting applications that need to use them for legitimate purposes.)
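From the application side, it can be handy to see which functions an administrator has switched off. The sketch below assumes PHP 7 or later and relies on disable_functions holding a comma-separated list of function names; the helper names are hypothetical.

```php
<?php
// Split a disable_functions value (a comma-separated list) into clean names.
function parse_disabled_functions(string $iniValue): array
{
    $names = array_map('trim', explode(',', $iniValue));
    return array_values(array_filter($names, 'strlen'));
}

// Report whether a function name appears in the given disabled list.
function is_disabled(string $function, string $iniValue): bool
{
    $disabled = array_map('strtolower', parse_disabled_functions($iniValue));
    return in_array(strtolower($function), $disabled, true);
}

// In a running script, read the actual setting from the environment.
$setting = (string) ini_get('disable_functions');
foreach (['passthru', 'exec', 'shell_exec'] as $name) {
    echo $name, ': ', is_disabled($name, $setting) ? 'disabled' : 'available', "\n";
}
```

Running this on a host is a quick way to confirm whether the restrictions described above are actually in effect for your account.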

From the developer’s perspective: Why do I want a PHP wrapper?

Why do you want suPHP, or a PHP wrapper at all? Let’s look at a very common example – uploading images.

It’s a common condition that an application needs to accept image uploads via the web, and many developers operating in a shared hosting environment have run into the problem where, on the first try, PHP displays a “permission denied” error when trying to move the uploaded file into its destination directory. Our user vs. HTTP server permission problem is back again: the HTTP server (running as user “www” or “apache”) does not have the proper permissions to write to a directory owned by the developer’s account. The common solution to this problem is to set the permissions on the destination directory to 777, giving all users system-wide read, write, and execute access. While this does work and your uploads can now flow freely, you’ve just ensured that any other user on the system – there are probably hundreds – could very easily issue an “rm -rf /path/to/your/uploads”, which would quickly and effectively delete everything in the directory. While I personally like to think that there exists camaraderie between users on the same server, this probably isn’t true, and you also have to consider that someone else’s account may have been compromised (probably by a lack of input checking on an upload form – more on that below).

With suPHP (or another PHP wrapper) enabled, you are free to leave your upload destination directories with the same permission as your other web-accessible files – namely that only your user account has write access, and the Apache user can read files and traverse directories. In fact, if your user account is in the same group as the Apache user, you can set these directories to have permission of 750, which is much more restrictive than a wide-open 777.
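As a small illustration of the 750 setting, the sketch below creates an upload directory and applies that mode explicitly. It assumes a Linux/Unix host and PHP 7 or later; the helper names and the demo path under the system temp directory are hypothetical stand-ins for your own upload location.

```php
<?php
// Create a directory that only its owner can write to:
// owner rwx, group r-x, others no access (mode 0750).
function make_private_upload_dir(string $path): bool
{
    if (!is_dir($path) && !mkdir($path, 0750, true)) {
        return false;
    }
    // mkdir() is subject to the process umask, so set the mode explicitly.
    return chmod($path, 0750);
}

// Return the permission bits as an octal string such as "750".
function permission_string(string $path): string
{
    clearstatcache(true, $path);
    return decoct(fileperms($path) & 0777);
}

// Demo path is hypothetical; substitute your real upload directory.
$demoDir = sys_get_temp_dir() . '/uploads_demo';
if (make_private_upload_dir($demoDir)) {
    echo $demoDir, ' has mode ', permission_string($demoDir), "\n";
}
```

Under a wrapper such as suPHP, the script runs as your own account, so it can write to this directory without any world-writable permissions.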

The primary downside to using a PHP wrapper is that there is in fact a performance hit, since the PHP interpreter has to be invoked for every request, rather than being started as part of the Apache server. However, in my experience, the performance decrease is generally unnoticeable. If your application is extremely performance-critical, you will want to run benchmarks before deciding to use a wrapped PHP environment, or consider graduating to your own dedicated server where you can ensure that only you and your developers have any kind of access to your application files. In the dedicated server scenario, the security concerns of using PHP as an Apache module are largely mitigated. (Note, however, that if one site on your dedicated server is exploited in a mod_php setup, other sites or files will most likely be vulnerable as well, whereas a server running suPHP with PHP’s safe_mode enabled and the open_basedir directive configured will generally be able to jail the attack to one virtual host’s document root.)

General practices for PHP application security

At this point, I’d like to briefly go over just a few coding techniques you can use to increase the security of your application. Please understand that this is by no means a comprehensive list, and simply adhering to the suggestions below does not ensure that your application is secure. However, you should make it a point to be security conscious when writing code, rather than trusting your environment to eliminate or mitigate any potential attacks on your application.

1. Sanity-check your data

- This is probably the simplest and most effective way of preventing exploits in your application. Sanity checking just means that if you’re expecting the user to enter a number, make sure you actually received a number. If you’re expecting a string with alphanumeric characters only, verify that that’s what you got. Also, never trust this kind of validation to JavaScript only, as JavaScript can easily be disabled or bypassed on the client side. Finally, never pass user data directly to an SQL query without validating and escaping it first. While PHP’s magic_quotes mechanism can help blunt some SQL injection attacks (attacks in which a user enters data in such a way as to run their own arbitrary queries), it is not a complete defense, and it was deprecated in PHP 5.3 and removed in PHP 5.4, so again you should not rely exclusively on the environment for your application security.
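A minimal sketch of this kind of server-side check might look like the following. It assumes PHP 7.1 or later; the helper names and the 'id' and 'user' request parameters are hypothetical. Note that these checks complement proper query escaping or parameterized queries and do not replace them.

```php
<?php
// Return the value as an int if it is a whole number, otherwise null.
function expect_int($value): ?int
{
    $result = filter_var($value, FILTER_VALIDATE_INT);
    return $result === false ? null : $result;
}

// Return the string if it is non-empty and strictly ASCII alphanumeric, otherwise null.
function expect_alnum($value): ?string
{
    if (is_string($value) && preg_match('/\A[A-Za-z0-9]+\z/', $value) === 1) {
        return $value;
    }
    return null;
}

// Hypothetical request parameters: reject the request if either check fails.
$id = expect_int($_GET['id'] ?? null);
$user = expect_alnum($_GET['user'] ?? null);
if ($id === null || $user === null) {
    echo "Invalid input\n";
}
```

Returning null for anything unexpected forces the calling code to handle the bad case explicitly instead of passing the raw value along.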

2. Check the type and extension of uploaded files

- Allowing file uploads is inherently risky, but very often it’s a necessary part of an application. PHP allows you to gain a lot of information about uploaded files before they’re ever written to their final destination, so make use of the information contained in the $_FILES array to ensure that you’re getting the type of file you’re expecting. A basic way of validating the file type is simply to ensure that the extension of the file indicates that it is (or is purported to be) the type of file you’re expecting. A common exploit for upload scripts is for an attacker to upload a malicious PHP script to your site, then browse to the uploaded script to gain control of your files. Even a basic check to make sure that, for instance, only files with a .jpg extension are allowed to be uploaded would prevent this type of exploit. However, I also recommend verifying the MIME type of the file, which is contained in the $_FILES array under the key ‘type’, and will look like: “image/jpeg” or “application/pdf”. Keep in mind that this value is reported by the browser and can be forged, so treat it as an additional check rather than proof of the file’s contents. Be as restrictive as possible – rather than validating against a list of extensions that are NOT allowed (php, exe, etc.), check to make sure that the extension and/or MIME type matches a small group of file types that ARE allowed.
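The allow-list approach could be sketched roughly as follows, assuming PHP 7 or later. The function name, the particular extensions in the list, and the 'photo' form field mentioned in the comment are hypothetical; adjust the list to the file types your application actually accepts.

```php
<?php
// Validate one entry of $_FILES (for example $_FILES['photo'], a hypothetical
// field name) against a small allow-list of extension => MIME type pairs.
function is_allowed_upload(array $file): bool
{
    $allowed = [
        'jpg'  => 'image/jpeg',
        'jpeg' => 'image/jpeg',
        'png'  => 'image/png',
    ];

    // Reject anything PHP itself flagged as a failed or missing upload.
    if (($file['error'] ?? UPLOAD_ERR_NO_FILE) !== UPLOAD_ERR_OK) {
        return false;
    }

    // Only the final extension counts, so "shell.php" is rejected.
    $extension = strtolower(pathinfo($file['name'] ?? '', PATHINFO_EXTENSION));
    if (!isset($allowed[$extension])) {
        return false;
    }

    // 'type' is reported by the browser, so this is a secondary check only.
    return ($file['type'] ?? '') === $allowed[$extension];
}
```

A caller would run this before moving the file to its final destination with move_uploaded_file(), and refuse the upload whenever it returns false.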

3. Use a .php extension for ALL files with PHP code contained in them

- Often I come across files in PHP projects that have a .inc extension, because they are meant to be included, not browsed to directly from the web. This is a common condition, but there’s a potential security issue here if those files contain any sensitive data (e.g. database passwords). Because .inc files are not parsed by the PHP interpreter by default, they can be passed directly to the client side if they’re available via a web request, which would allow anyone to read the PHP code directly. Hopefully the directory these files reside in is denied read access by webserver rules (see below), but even so, there’s no sense in risking an accident where the user ends up being able to browse directly to the file. Use .inc.php.

4. Make use of your webserver’s access control rules (e.g. .htaccess)

- Even if all the PHP files in your application have a .php extension as discussed in #3, you should still make use of your HTTP server’s access control to prohibit any files in sensitive directories from being served via the web. For instance, if you keep your database passwords in “include/db.inc.php”, this file, and all files in the include/ directory, should be prevented from being served via the web. Even though the .php extension will ensure that client-side users can’t read the code if the PHP interpreter is functioning, there is the potential condition that the HTTP server has loaded without the PHP interpreter. Botched upgrades or configuration errors can sometimes cause this condition, and in the event that someone browses to a PHP file without the PHP interpreter ever having been loaded, they will again see the code just as you do when editing the files. In Apache, disabling directory access is usually as simple as creating a file called .htaccess (note the leading period) in the directory, then adding the line: “Deny from All” (no quotes) to that file and saving it. (On Apache 2.4 and later, the equivalent directive is “Require all denied”.)