PHP in the web development world: are we doing it all wrong?
Some thoughts and ponderings on how "the frameworks out there" might not be getting it quite as right as they should.
The UNIX principle applied?
Something that is currently overlooked in a large part of the PHP community is that not all design patterns are necessarily implemented in PHP code. Your webserver can be considered the front controller of your application too. Routes can be defined using your filesystem and webserver configuration. Not everything has to be Object Oriented to be well organised and well structured. To gain perspective, here are some ideas to help you think outside the box, and not let OO patterns drive you beyond reason.
Some perfectly good solutions to, for example, caching and distributed processing are commonly overlooked because of the simple presumption that everything should be done inside PHP, or should at least be controlled by PHP. In my opinion, this might be a symptom of the Not Invented Here syndrome, and is thus by definition something to be questioned. It makes modular reimplementation on a different platform or in a different language virtually impossible, which hampers long-term development of web applications. You cannot and must not feel confident that anything you want can or should be done in PHP. The language is irrelevant and the platform is irrelevant; the choice for PHP should be based on those aspects that make PHP a better choice than another platform or language. If the outcome is PHP, use it. If it is not, that should have the least possible impact on the design choices you make for the rest of your application.
Why is the question important?
We need to be more and more prepared for distributed and parallel processing, which means we need to slice applications down into small and simple pieces that can be processed on any CPU or filesystem. Yet while we employ nice OO implementations of design patterns inside frameworks like Zend Framework, Symfony and the like [1], we are in fact building monolithic applications that can be called and routed only through a single front controller file (usually index.php). That doesn't make the design of the application internals monolithic per se, but it does make the application as such monolithic.
Continuing with this type of development is in huge conflict with various philosophies, such as the UNIX principle of building programs that do one thing and do it well. Instead, we build one application that does a lot of things very well [2]. This UNIX principle is not something to be overlooked lightly, because it is historically the single greatest success factor of free software and UNIX-based development such as Linux, BSD and GNU, not to mention PHP. Note that free software also means "free to choose what software you want to use": not only free in terms of money or legalities, but also free in terms of dependency and freedom of choice.
Moreover, note that the UNIX philosophy bears a strong resemblance to the Principle of Least Knowledge and Separation of Concerns, which, in my opinion, are by far the two most vital fundamentals of software design when keeping maintainability, testability and long-term development in mind. Parts of the application that need updating, because either the code has become irrelevant or some better component has arisen, should be easy to cut out and replace with another. Think of, for example, replacing Apache with Nginx, or even replacing the current PHP payment module with a Java one.
Any component in an application should be responsible for one thing in the application, and one thing only. Conversely, the component that does the job should be the component that is the very best at that job. You wouldn't write your own SQL implementation if you know there are perfectly good relational databases out there. So why build your own page caching mechanism if the webserver can cache pages for you? Why build your own routing and dispatching system if the webserver has perfectly good routing and dispatching systems in place?
What's with those quote MVC unquote frameworks?
PHP does one thing very well: it is a very fast interpreter of a reasonably simple programming language, and is best suited for processing variables (typically CGI requests) and responding with text output (typically HTML formatted content), while communicating with any backend such as MySQL and filesystems.
MVC frameworks deviate from this premise. They hold the PHP application responsible for far more than just handling an incoming request and serving the response. This could be a problem. Putting all those responsibilities on the shoulders of a single application complicates its design considerably. The danger is that inside this application, it becomes harder to identify the responsibilities and concerns of the different components. If you separated those responsibilities from the main application, you would find that a lot of what we deal with inside our application is irrelevant to the application's own concern.
The best example of this is caching. Whether you are caching data coming from the database (which a properly configured MySQL can already do for you) or caching entire pages (which can be done by the web server), this does not necessarily belong in your PHP code. It belongs in your application, yes, but your application consists of more than just PHP. It is the entire stack of software, each part of which follows the UNIX principle: Linux, Apache, PHP and MySQL. Of course, substitute any of these with an equivalent alternative if you have reason to.
Another example is routing and dispatching. One of the most tedious and tiresome jobs inside your application is making sure that every link is SEO friendly; i.e. that it contains words that are relevant to the page's content, and that its hierarchical position is relevant to where the page belongs in the site, while still having an idea of what the canonical URL of that page should be to avoid duplicate indexing by search engines. I have not had a single project where taking this into account did not cost a few hours at best. Why? Because we think too hard about it, and we fear the simple solution might impact performance.
Where complexity is, is where the problem lies
Whenever you find you're adding complexity to your application because of non-functional requirements such as performance reasons or cost reduction, you need to take a step back and re-evaluate the problem at hand.
A very simple example of this is a YAML file read by a set of PHP classes, producing a PHP array. This array can be cached in a compiled form using var_export. Having the application itself responsible for this compilation adds complexity to the set of responsibilities of the application. It would be much easier to have the application depend on a PHP configuration file, and let some build process make sure the YAML file is compiled into PHP.
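To make the idea concrete, here is a minimal sketch of such a build step. The function name compileConfig, the file name and the sample settings are all made up for illustration, and the YAML parsing itself is assumed to be done by whatever parser you prefer; the sketch starts from the already parsed array.

```php
<?php
// Build-time sketch: turn an already parsed configuration array into a
// plain PHP file that the application can simply include.
// compileConfig(), the file name and the settings are hypothetical.

function compileConfig(array $config, string $targetFile): void
{
    // var_export() with true as second argument returns PHP source for the value.
    $source = "<?php\nreturn " . var_export($config, true) . ";\n";

    if (file_put_contents($targetFile, $source) === false) {
        throw new RuntimeException("Could not write $targetFile");
    }
}

// In a real build step this array would come from your YAML parser.
$parsed = [
    'database' => ['host' => 'localhost', 'port' => 3306],
    'debug'    => false,
];

$target = sys_get_temp_dir() . '/config.compiled.php';
compileConfig($parsed, $target);

// This is all the application itself does; it never sees any YAML.
$config = require $target;
```

The application ends up with a single require and no knowledge of where the file came from, which is the whole point.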
Another example is compilation of template files. If you have a set of template files in a language other than PHP, for example Twig or Smarty, you can make the compilation of these files part of a build process. This doesn't have to be on-the-fly. PHP programmers are used to having stuff work "on the fly", but it is a convenience that can bite you in the ass. It's far easier to have a build script that makes sure all cacheable data is flushed than to have your application responsible for checking the freshness of these caches. To be even more exact: your application doesn't even need to know that the stuff it's reading is actually a cache. It only needs some files, and they're there.
My final example is SEO friendly URLs. There is no reason the application itself should worry about this. We have lots of available tools to rewrite outbound and inbound URLs to whatever we need to rewrite. The only thing such a tool needs to know is a very simple "from" => "to" mapping of all these URLs, and that's it. The program can translate friendly URLs into internal URLs, and vice versa. If the program does that, and does that very well, we are out of the pickle.
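As a rough sketch of how little such a translator needs: the map, the URLs and the function names below are hypothetical, and a real setup might just as well feed the same map to the webserver's rewrite facilities instead of to PHP.

```php
<?php
// Sketch of a standalone URL translator driven by one "from" => "to" map.
// The map contents and the function names are hypothetical.

$friendlyToInternal = [
    '/shoes/red-sneakers' => '/product.php?id=42',
    '/about-us'           => '/page.php?name=about',
];

// The reverse direction is derived from the very same map.
$internalToFriendly = array_flip($friendlyToInternal);

// Inbound: friendly URL to internal URL; unknown URLs pass through unchanged.
function toInternal(string $url, array $map): string
{
    return $map[$url] ?? $url;
}

// Outbound: replace every known internal URL found in an href attribute.
function rewriteOutbound(string $html, array $reverseMap): string
{
    $rewritten = preg_replace_callback(
        '/href="([^"]*)"/',
        function (array $m) use ($reverseMap): string {
            return 'href="' . ($reverseMap[$m[1]] ?? $m[1]) . '"';
        },
        $html
    );

    // preg_replace_callback() returns null on error; fall back to the input.
    return $rewritten ?? $html;
}

echo toInternal('/shoes/red-sneakers', $friendlyToInternal), "\n";
echo rewriteOutbound('<a href="/page.php?name=about">About</a>', $internalToFriendly), "\n";
```

The application keeps emitting and receiving its internal URLs; only the translator knows about the friendly ones.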
Data versus layout
Formatting the data of your application into whatever format (usually HTML) is the responsibility of some sort of view renderer. Action Controllers have the responsibility of translating a request into some sort of response, and usually dispatch to a view renderer internally to have the response formatted as HTML. Now here's something fishy going on. The action controllers gain three responsibilities in this way; two explicit (handling the request with the appropriate action and returning its result) and one implicit: rendering the result. Since the result of any action always is data, the action controller should not be responsible for the formatting of the data. It is in fact irrelevant at that point in time. You'll just want to know what the result of that action is; either some state (success or failure) or some data. It makes much more sense to have an intermediate format for this (be it a PHP array of data, XML, JSON, or some other form), and have the view on this data or result be detached entirely from the action's operation. This way, the action controller has only one responsibility: returning the request's result. Note that the request can be defined as simply as a few method parameters; it needn't be an HTTP request as such.
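A minimal sketch of that separation, assuming a plain PHP array as the intermediate format. All function names and the sample data are invented for illustration; the point is only that the action knows nothing about HTML or JSON.

```php
<?php
// Sketch: the action only returns data; rendering is a separate, later step.
// All function names and the sample data are hypothetical.

// The "request" is just a method parameter here, not an HTTP request.
function showProductAction(int $id): array
{
    // A real action would load this from some backend.
    return ['id' => $id, 'name' => 'Red sneakers', 'priceInCents' => 5995];
}

// One possible view on the result: JSON.
function renderJson(array $data): string
{
    return json_encode($data, JSON_THROW_ON_ERROR);
}

// Another view on the very same result: an HTML definition list.
function renderHtml(array $data): string
{
    $rows = '';
    foreach ($data as $key => $value) {
        $rows .= '<dt>' . htmlspecialchars((string) $key) . '</dt>'
               . '<dd>' . htmlspecialchars((string) $value) . '</dd>';
    }

    return '<dl>' . $rows . '</dl>';
}

$result = showProductAction(42);

echo renderJson($result), "\n";
echo renderHtml($result), "\n";
```

Which renderer gets picked, if any, is decided outside the action, so the action can be called just as easily from a test or a command line script.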
The front controller of the application can be responsible for dispatching one front request to one or more action requests, each of which has its own result. Whether or not the front controller renders this response as HTML is moot.
Why not utilize PHP as a build result?
I think we all might gain more from using PHP as a result of our project's build. We might simply program in a higher level programming language, and have some compilation or build process generate all kinds of PHP files that may contain a myriad of repetitive code, just because that is what the PHP interpreter is ultimately best at: simply preprocessing the hypertext, instead of doing it the other way around and having PHP implement byte code caches and high level programming features. Maybe PHP was the right language right before it got cocky...