Sunday, 5 October 2014

But is it web scale?

Before we start to cover the topic of how to achieve parallel concurrency in PHP, we should first think about when it is appropriate.

You may hear veterans of programming say (and newbies parrot) things like:
"Threading is not web scale."
For many, this is enough to write off parallelism as something we shouldn't do in our web applications. It seems obvious that there is simply no need to multi-thread the rendering of a template, the sending of email, or any of the other laborious tasks that a web application must carry out in order to be useful.

But rarely do you see an explanation of why this is the case: why shouldn't your blog be able to multi-thread a response?

The reasoning is simple: threading at the front end of a web application doesn't make good sense. If a controller instructs the operating system to create even a reasonable number of threads per request, for example 8, and a small number of clients request that controller at the same time, for example 100, you are asking your hardware to execute 800 threads concurrently.

CPUs would have to look very different from how they do today before anyone could say that is a good idea.
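The arithmetic behind that figure can be sketched in a few lines of PHP (7 or later). The function name and the core count of 8 are assumptions made purely for illustration; substitute the numbers for your own hardware.

```php
<?php
// Hypothetical helper: how many threads exist at once if every in-flight
// request creates its own threads.
function totalThreads(int $threadsPerRequest, int $concurrentRequests): int
{
    return $threadsPerRequest * $concurrentRequests;
}

$cores   = 8; // assumed core count, purely illustrative
$threads = totalThreads(8, 100);

// Each core would have to be shared by this many threads.
$threadsPerCore = intdiv($threads, $cores);

echo "$threads threads on $cores cores: $threadsPerCore threads per core\n";
```

The point of the sketch is that the thread count grows with traffic, while the number of cores does not.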

This is not how we make use of 1:1 parallel concurrency in any language that supports it.

Some programming environments employ an N:1 (many-to-one) or M:N (many-to-many) model, whereby the kernel is not necessarily aware of every thread created; this is colloquially known as green threading. PHP uses a 1:1 (one-to-one) model, whereby the kernel is aware of every thread created.

So, when is it appropriate?

Take MySQL as an example: it uses parallelism to provide its extremely complex services, as do many other pieces of software in the web stack.

When we look at how they provide their services, we can see that MySQL's complex execution environment is isolated from the application that uses it by process boundaries; that is to say, services are provided by some sane form of inter-process communication (IPC) and/or remote procedure call (RPC).

This is how we make use of parallel concurrency; we separate and isolate those parts of our application that require what parallelism provides.
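As a rough illustration of that separation, the sketch below has a "frontend" hand a request to a "backend" running in a separate process and read the answer back over pipes. Everything in it is hypothetical: the askBackend() function, the one-line-per-request protocol, and the backend itself, which merely upper-cases its input as a stand-in for real work. A real backend would be a long-running service started independently and reached over a socket; it is spawned here only so the example is self-contained and can be run from the command line.

```php
<?php
// Hypothetical backend, spawned here only so the sketch is self-contained.
// It reads one request per line from its STDIN and answers on its STDOUT.
$backendCode = 'while (is_string($line = fgets(STDIN))) { echo strtoupper($line); }';

// Send one request across the process boundary and wait for the answer.
function askBackend(string $backendCode, string $request): string
{
    $descriptors = [
        0 => ['pipe', 'r'], // the backend reads requests from this pipe
        1 => ['pipe', 'w'], // the backend writes responses to this pipe
    ];
    $command = escapeshellarg(PHP_BINARY) . ' -r ' . escapeshellarg($backendCode);

    $backend = proc_open($command, $descriptors, $pipes);
    if (!is_resource($backend)) {
        throw new RuntimeException('could not start the backend process');
    }

    fwrite($pipes[0], $request . "\n"); // send the request
    fclose($pipes[0]);                  // no more requests: the backend sees EOF

    $response = stream_get_contents($pipes[1]); // read until the backend exits
    fclose($pipes[1]);
    proc_close($backend);

    return rtrim($response, "\n");
}

echo askBackend($backendCode, 'render report'), "\n";
```

The frontend knows nothing about how the backend does its work; whatever complexity the backend needs, threads included, stays on the other side of the pipe.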

We keep our frontends simple, we keep them scalable.

Our backends can create and manipulate as many threads as they need to service the frontend of our applications. They can be responsible for all of the complexity in an isolated, controlled and well-designed environment.

Can PHP really be used to write system services?

Historically we have advised against writing long running scripts in PHP.

Today, PHP is suitable for such things. There are pitfalls and things to think about, but there is no technical barrier to writing long-running scripts, even ones that run indefinitely.

As in Java, or any other language that might be used to write system services or long-running code, care must be taken to ensure that the code you write is logically able to run indefinitely: if you append an infinite number of members to an array, you will need an infinite amount of memory.
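One way to stay within a fixed memory budget is to cap any collection that a long-running loop appends to. The following is a minimal sketch for PHP 7.1 or later; the BoundedLog class, its limit of 100, and the loop standing in for a service are all hypothetical.

```php
<?php
// Hypothetical fixed-size history: keeps only the most recent $limit items,
// so memory use stays flat however long the script runs.
final class BoundedLog
{
    private $items = [];
    private $limit;

    public function __construct(int $limit)
    {
        $this->limit = $limit;
    }

    public function append(string $item): void
    {
        $this->items[] = $item;
        if (count($this->items) > $this->limit) {
            array_shift($this->items); // discard the oldest entry
        }
    }

    public function count(): int
    {
        return count($this->items);
    }

    public function all(): array
    {
        return $this->items;
    }
}

$log = new BoundedLog(100);

// Stand-in for a service loop that would otherwise run indefinitely.
for ($i = 0; $i < 10000; $i++) {
    $log->append("event $i");
}

echo $log->count(), " entries retained\n";
```

array_shift() renumbers the array on every call, which is acceptable for a small limit like this one; a larger buffer might be better served by a queue structure such as SplQueue.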

With great power ...

Parallelism is one of the most powerful tools in our toolbox; multicore and multiprocessor systems have changed computing forever.

But with great power comes great responsibility: don't abuse it. Remember the story of the controller that created 800 threads under a tiny amount of traffic and, whatever you do, ensure this can never happen.