Tuesday, 28 October 2014

Remember Tommy Flowers

Fig 1. Tommy Flowers MBE
Tommy Flowers might well be a name you never heard before today, and yet, he should be remembered as one of the giants upon the shoulders of which we all stand as computer programmers.

Modern computing was, at least in part, born of war, as many technological advancements are. As a result, such advancements can be shrouded in secrecy by necessity, robbing the people involved of the recognition they deserve, of their place in history.

This is a travesty.

We can never give Tommy the recognition he deserves, but we can speak his name, and we can remember him.

Tommy was born on 22nd December 1905, the son of a bricklayer, in the East End of London. He was 8 years old when the First World War broke out. By May 1915, when the first bombs were dropped on London, he was just 9 years old.

Tommy did not have a privileged life and he didn't mingle with high society; he worked extremely hard to become anything.

As a young man Tommy took an apprenticeship in mechanical engineering at the Royal Arsenal in Woolwich, London.

While working as an apprentice at the Royal Arsenal, he took evening classes at the University of London, earning an electrical engineering degree.

In 1926 Tommy took a research position in telecommunications at the General Post Office, in London.

The world was recovering, slowly, from the First World War when Tommy took his position at the General Post Office. He settled into the life of a researcher, and focused upon the development of an all-electronic telephone exchange.

Tommy could not have known then, but everything he had done with his life was preparing him to save the lives of thousands, and play his part in changing the world.

The world began to go to war again in 1939. By 1941 information technology was already playing a big part in the success of the Allies. Alan Turing was already at Bletchley Park, playing a big part in cracking the Enigma code.

It was in February 1941 that Tommy first came into contact with the Bletchley Park team, when the director of the General Post Office was asked by Alan Turing for help. Turing needed engineers and researchers able to build his machines.

Initially Turing wanted engineers to construct a relay based machine, still focused on cracking, ever faster, the Enigma code used to encrypt Nazi communications.

The initial project fell through, but Turing noticed that Tommy was dedicated, and clearly gifted. In 1943 Turing introduced Tommy to Max Newman. Newman was in charge of the effort to automate part of the cryptanalysis of the Lorenz cipher.

Lorenz was much more complicated than Enigma; from an engineering point of view it demanded a much more complex machine. Tommy and another researcher by the name of Frank Morrell worked on the first machine to decipher Lorenz.

The first machine was named the Heath Robinson, after cartoonist William Heath Robinson, known for drawing wantonly complex machinery that was supposed to carry out simple tasks.

Fig 2. A potato peeler by W. Heath Robinson

The Heath Robinson had a couple of dozen valves and can be considered the predecessor to Colossus. Heath Robinson was effective, but slow, and difficult to operate.

The construction of Colossus was the brainchild of Tommy. As the name suggests, it was an immense machine, consisting of, at first, 1800 valves, and occupying an entire room. It would be the world's first electronic digital programmable computer.

Tommy went to the powers that be at Bletchley Park with his vast experience, and his new idea, and was met with skepticism, since at that time the most complicated comparable device had something like 150 valves.

He tried to argue that the telephone exchange was of a comparable complexity and was reliable because it was operated in a stable controlled environment and was powered on all the time.

Tommy knew from his work at the General Post Office that valves were perfectly reliable, so long as you did not turn them off, since they come under most stress when being powered on.

They refused to fund Colossus; he was met with the notion that by the time his machine was operational, the war would be over and there would be no need for it.

He obviously knew they were wrong, and he began work on Colossus anyway, with his own funds. Later he got the backing of his superiors at the General Post Office, which afforded him rapid delivery of the parts required to construct Colossus.

His team of dedicated engineers built the first working Colossus in just 11 months. It worked, 5 times faster than Heath Robinson. Shortly after, the machine was dismantled and moved to Bletchley Park.

On 1st June 1944, a Mark 2 machine was operational at Bletchley Park. It immediately provided vital information regarding the D-Day landings planned for 5th June, just 4 days after the machine went operational.

The act of ignoring the opinion of his superiors at Bletchley Park, the act of using his own funds to begin working on the first Colossus when he did, afforded them that vital intelligence.

By the end of the war there were 10 Colossi in operation.

The war was won. The government recognized Tommy's effort by granting him £1000, which didn't cover the cost of his personal expenses for the initial development of the machine, and was shared between him and his engineers.

After the war the machines were dismantled, and their blueprints said to be destroyed, in an effort to maintain secrecy. Anyone who had worked on Colossus was bound by the Official Secrets Act.

When Tommy approached the Bank of England for a loan to build a new machine, they simply didn't believe it would work. Bound by the Official Secrets Act, he could not tell the bank he had already built such machines, and that they had just helped to win a world war.

Tommy returned to his job at the General Post Office, continuing work on all electronic telephone exchanges, a project brought to completion in 1950.

Tommy Flowers died on 28th October 1998, a man of 92 years.

16 years ago today, the world lost a real life hero, and a pioneer.

Remember him.

Sunday, 26 October 2014


Fig 1. The wrong end of a stick
phpdbg is a debugging platform that was merged into PHP 5.6. Recently it got some cool updates, and it is the first working debugger for PHP7.

One chap did all the work required to make it compatible with PHP7, and wrote all the cool new stuff. That was Bob Weinand (@bwoebi).

Before I go further, thank you Bob.

One of these cool updates was the improvement of remote operation of phpdbg.

When phpdbg was merged, you could operate it remotely, but, it was what I referred to as protocol-free, that is to say that stdin and stdout were a socket pair, like an old inetd service. A quite horrible design in reality.

When phpdbg was merged, we had probably spent about a week writing it; it didn't have every problem solved, and it still doesn't.

We were only focused, for a very short time, on writing an interactive debugger, like gdb is to C.

We never intended to step on the toes of xdebug, xdebug is very mature software, a staple of the ecosystem today, and has been for many years. You would have to be barking mad to ignore that.

We had a quite narrow scope, there are things in the debugger clearly aimed at internals developers.

All we wanted was something like gdb for the Zend VM.

Debugging PHP at the engine level is not so easy, when you are trying to do things like write patches for Zend, introduce new opcodes, bend the rules in an extension somehow.

I think the second kind of break point we supported was an opline address, something almost completely unknown to most PHP programmers, impossible to discover in user land by normal means.

We need to be able to debug normal code too, so, based on a simplified set of gdb commands, we set about giving it the kind of features we need to debug code, as it happens the kind of features PHP programmers need too.

At this point, we thought we did have something quite cool, something useful for more than just internals developers. We showed it to a few people, who exhibited an interest in trying it. While that conversation was going on it was requested, by Andi Gutmans at Zend, that we RFC for inclusion and distribute phpdbg with PHP.

We were not aiming for that, we were happy to just write it, and get a tiny patch merged into php-src to make distribution outside of the php-src tree possible for Windows.

We were of course pretty pleased with ourselves; we did the RFC thing and it was voted in, unanimously.

So far, phpdbg has been included in two dot releases of PHP, most people probably haven't used it yet, it's still very young software.

Before any release of PHP included phpdbg, the users of PHPStorm started talking about integration, someone used their feature request system to request that the PHPStorm team at least investigate it.

You can guess their investigations showed that phpdbg wasn't really suitable for use in an IDE.

However, they started a productive conversation with us, they were keen on trying something new, and we are quite happy to put effort into making phpdbg more accessible.

At first, I said we should use dbgp, because maximum gain. However, I didn't do the work, and I wasn't involved in the conversation with the PHPStorm guys very much. It turns out that dbgp is just not suitable for our needs.

That's all we did, we haven't started any war.

We are still not stepping on the toes of xdebug, xdebug is still extremely mature software and people will continue to use it.

phpdbg is fundamentally different to xdebug, you cannot in all cases use phpdbg for the same things as xdebug.

phpdbg does some stuff that xdebug doesn't do; however, the number of people really needing those things is probably small.

You could say that there is some competition, because of overlap. Perfectly healthy competition, that we never really wanted to get into.

Whatever about it, there is no war.

We dropped the ball, and misunderstood the situation, which made internals noisy.

The title of the vote on the RFC is "Distribute phpdbg with PHP5.6+".

At no time until this weekend was it mentioned that we should need to go through the RFC process to develop our project.

This is not the understanding we actually had, and I'm not sure what we intend to do about it yet, or even if we can do anything.

Tuesday, 21 October 2014

Unicorn or Unicode

Fig 1. A Unicorn
Sometimes we will get an idea that would create something our limited foresight says will be beautiful. We chase after the idea, regardless of everything.

We have unicorns in programming.

This morning I want to talk briefly about the approach some of us are trying to make in adding Unicode string support to PHP7.

Unicode all the things !

PHP6 was a real thing, a bunch of effort went into it. Among other things it aimed to introduce Unicode string support at the level of the language, so that all strings were Unicode.

For many practical reasons, the project was all but abandoned. All of the features bar the Unicode string support were back ported from the 6 branch to 5.3 and released.

I'm going to assert that Unicode everywhere in PHP is a unicorn. History proves my assertion to be true, we could not overcome the performance problem inherent in treating all strings as Unicode, there was no beautiful animal.

It was absolutely worth doing, until code is released it is research, valuable research that all of us can learn from.

Some time in the future, it might be worth trying again.

Be sensible, Unicode some of the things !

PHP7 got fast, but the problem of not having any decent Unicode string support in PHP is hanging around like a bad smell. I don't think any of us want to destroy our new found performance by chasing unicorns.

So some of us have put together a wrapper around ICU's UnicodeString class. The PHP API is derived, in part, from work done by Nikita Popov (@nikita_ppv).

The extension developed into something that could support backends other than ICU, there is a native Windows backend in the pipeline, for example.

It performs just enough internals magic to provide a decent iterator, sensible casts, and the ability to read dimensions.

Sure, you have to choose which strings you want to work with as Unicode strings, however, that seems like the only sensible option to me anyway. I never really needed the string passed to the include construct to be Unicode.
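To illustrate the opt-in approach, here is a minimal sketch written in Python rather than PHP. The `UString` class and everything in it is hypothetical, it is not the extension's API; it only models the three things mentioned above: iteration by code point, a cast back to a plain byte string, and reading a dimension.

```python
class UString:
    """Hypothetical opt-in wrapper: only strings wrapped in it are treated as Unicode."""

    def __init__(self, raw: bytes, encoding: str = "utf-8"):
        self._encoding = encoding
        self._text = raw.decode(encoding)  # decode once, up front

    def __iter__(self):
        # Iterate code points, not bytes.
        return iter(self._text)

    def __getitem__(self, index: int) -> str:
        # "Read a dimension": index by code point.
        return self._text[index]

    def __len__(self) -> int:
        return len(self._text)

    def __bytes__(self) -> bytes:
        # Cast back to the plain byte string.
        return self._text.encode(self._encoding)


raw = "na\u00efve".encode("utf-8")  # a plain byte string, 6 bytes long
wrapped = UString(raw)              # the same data, viewed as 5 code points
print(len(raw), len(wrapped))
```

Everything that is never wrapped stays a plain byte string and pays no cost, which is the point of choosing.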

You can have a look at the RFC, prepared by Phil Sturgeon (@philsturgeon) here.

The extension is available here.

Let the discussion begin.

Monday, 20 October 2014


Fig 1. An actual miracle

All the time I am awake, I think about writing code, if I'm not thinking about it, I'm either driving, or doing it. I don't need very much at all to be happy, to be productive.

I am productive, I purposely involve myself in things where I know I can be useful. I do that at work, I do that for PHP. I do that so that every morning, I have some code to think about.

Not very long ago, I was presented with a horrible, horrible problem. I ended up with my 2 kids and partner in a 20 foot long caravan on some friends' land, totally broke, totally stuck.

I woke up nearly every day, I tried to be involved, I tried to do my work, slowly, my ability to concentrate slipped away from me. I was overwhelmed with dread, paralyzed by the thought that I failed. My health, my perspective deteriorated, very quickly.

I really thought I was stuck, I kept letting people down and thought it wouldn't be long before the offer to be involved in the conversation would slip away, with everything else.

It is extremely difficult to ask for help, it's even more difficult when you have nobody to ask.

My dear friend Anthony (@ircmaxell) suggested that I reach out to the community, and ask for help. It took me a week, of abject misery, to accept that this was a good idea. I worked out what I would need to get back on my feet, and after some input from Cal Evans (@CalEvans) on the content for the page, we put up a gofundme campaign.

What happened next, beggars belief. The whole community started to tweet and talk about it, it was a gigantic, much needed hug.

In the first few hours, there were thousands of pounds donated. I watched the email from gofundme and twitter come flooding in as if it was happening to someone else. I read the email from gofundme that said I was getting such a response that I would be getting a digest of emails, rather than an email for each donation, completely in awe of what seemed to be happening.

All of us, my kids, my partner, sat watching the screen. I cried, several times. I struggled to explain to my kids that people they have never met were giving us this money, because they wanted us to have a house.

After 24 hours, the total was over £11000, nearly twice what I had said I needed.

It still didn't really seem real, I was sure that something would go wrong. I got my banking details sorted on the gofundme site, and waited, an agonizing 7 days or thereabouts, while gofundme verified my identity and commenced transfers.

At 2am one morning, the first payment cleared my bank account. £10000 was waiting for me. My partner and I had talked all week about what to do next, and we had looked all over the country for somewhere we could afford to make a fresh start. We had decided, for many reasons, that we wanted to go and live on the Isle of Wight. By half past two that morning, the kids were asleep in the back of the car and we were heading for the coast.

It took another week or so, then we found home:

Fig 2. Home

This morning, I just want to say thank you, for the single greatest thing the PHP community has ever done, for anyone. I feel incredibly lucky that someone was me.

To everyone that spared us a thought, to everyone that donated, to my friends who had to put up with my moaning and whining, all of you are amazing.

Thanks :)

Sunday, 19 October 2014

Building complexity

Fig 1. A fig, right ?
Recently there has been much discussion relating to FIG. Sparked by an open letter to FIG written by Anthony (@ircmaxell).

The letter communicates a concern that he and others in the community have; recently FIG seem to be coming up with wantonly complicated solutions to what can be simple problems.

These communications are for everybody, they are open, and they use, as literary devices, particular drafts and code examples. I suspect that had some unwanted side effects; these letters weren't about any particular PSR.

In the follow up to this letter, the closing statement really speaks to all of us:
By getting creative with OO solutions, you can build incredibly powerful, strong and (most importantly) simple abstractions. The best way to build complexity is by composing simplicity. If you start with complex, you can never get simple.

This morning, I want to talk briefly about a refocused FIG, and why that matters.

What is FIG ?

Some of us may not really know what FIG is about, while it's hard to believe there is anybody left that has never heard of a "PSR", we should start by looking at what FIG is really about, as a group.
The idea behind the group is for project representatives to talk about the commonalities between our projects and find ways we can work together. Our main audience is each other, but we’re very aware that the rest of the PHP community is watching. If other folks want to adopt what we’re doing they are welcome to do so, but that is not the aim.
A fine summary of what they were all about, perhaps. However ...

It is no accident that we are watching !

We live in a new world, where the developers of a framework don't have a very different role to the developers of any project.

All of us build complexity by composing simplicity.

We obviously shouldn't aim for complex solutions, what we are doing is complicated. If it's even realistic to talk about solutions that aren't complicated it's arguably because of the ecosystem supporting us.

We compose our modern solutions from what we would like to be simple components. Those components do need to integrate.

FIG should be able to recognize that the standards they create can affect the ecosystem supporting all of us; they are in some cases prohibitive of starting with simplicity. They are positioned to do an important job, and should recognize the implications of their work.

If your aim is true, you will likely hit your target. I'd like to rewrite the What is FIG? quote to read more like this:
The idea behind the group is to aid the PHP ecosystem in their pursuit to build complexity by composing simplicity by defining simple, focused and extensible common interfaces for modern components.
I desperately need them to hit that target, so does everyone supported by the ecosystem.

Everyone is obviously watching.

Wednesday, 15 October 2014

Hulk Syndrome

Fig 1. An incredible hulk

[Banner] Ah, see. I don't get a suit of armor. I'm exposed, like a nerve. It's a nightmare.
What brilliant use of language that is, to explain why hulk is so angry all the time.

In just using the words "exposed" and "nerve", it communicates so much, that hulk is in unimaginable pain, that it's difficult to just think about what he is doing.

This morning I'm going to try to explain Hulk Syndrome, the thing that seemingly happens to us when we think we are being unjustly criticized, and propose a way to tame the beast.

Banner was exposed to gamma rays, why are programmers angry ?

From the first time you go out into the community and try to talk about something, you are ruthlessly criticized. Criticism is usually a good thing, it usually comes from a good place, but can be none the less difficult to hear.

Criticisms are our gamma rays, after prolonged exposure, whenever we are out in public, we are exposed, like a nerve. It can be a nightmare.

It's not enough that the rational bits of our brain can recognize the obvious truth that criticism usually comes from a good place. In the heat of the moment, we can tend to display characteristics of Hulk Syndrome, even if we don't let our symptoms leak onto twitter, email, or any other communications.

How to tame the beast ?

We don't have to be slaves to our brain chemistry, and emotions. I like to think I am getting mine under control. I can easily be wrong and maybe I'm as hulk as ever, but I don't feel it.

I achieve this by doing a very simple thing. Whenever I feel as if some communication contains unjust criticism, I ask myself the following:

Can I reword the communication, so that if there are questions being asked, they provoke a solution or insight, so that if there are genuine concerns being raised, I notice them ?

I think this works because it's illogical to write something to myself that is hostile, it's illogical to perceive it as hostile in any way.

It doesn't matter what you do to tame the beast, so long as by the time subsequent communications take place, you are actually thinking about what is being said to you.

Why not just hulk smash the haters ?

Just like hulk, we destroy everything around us when we do. Not in the real world, obviously, but in our working relationships with people who care so much they are reaching out to communicate in the first place.

Fig 2. Thor, "reasoning" with hulk
[Thor] We are not your enemies, Banner. Try to think!

Tuesday, 14 October 2014

I'm all about the context

Fig 1. Meghan Trainor
PHP is a dynamically typed language written in C, and C is anything but dynamic. So, in some sense, variables are not dynamically typed. Why then, when at least some of the infrastructure exists to allow users to work with types, do we always stop short of actually allowing it ?

This morning I'm going to try to explain the obsession with context, even justify it.

What really makes PHP "dynamic" ?

We stop short of exposing the strict typing that already partly exists in internals because the thing that makes PHP a dynamic language, is its ability to coerce the type of a variable depending on the context the variable was used in.

This is, in practice, what makes PHP dynamic, it is the reason that a function expecting a string can accept an integer and still continue executing.

What could change that ?

There are some RFCs and ideas floating around the community that, on the face of it at least, would be brilliant for PHP, they really solve problems.

The first example is scalar type hints, this subject has been in RFC many times, in many different forms.

The strictest forms, where the implementation will cause Zend to barf if the wrong type is passed, are always flat out rejected. That's obviously an approach that is far too strict for PHP.

Other patches exist for scalar hints that perform the same kind of coercion as Zend does for internal functions. This seems to be the most suitable implementation, but even so it can create unexpected behaviour because of the way Zend currently performs coercion.

The solution to this problem is likely to be fixing the way Zend performs coercion first, so that coercion performed by internal and user functions is always the same, and both are more intuitive.

Another, more complicated example is the Scalar Objects work done by Nikita Popov.

I love this idea, it does solve problems, however it causes some too: what should happen if you call ::length() on an integer ?

Today, strlen will accept an integer and treat it as a string, because in the context of a call to strlen, the first parameter is a string.
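The difference can be modelled in a few lines. This is a sketch in Python, not PHP, and both functions are hypothetical stand-ins: `strlen` imitates a function whose context says "this parameter is a string", while `length_method` imitates a method that only exists on strings.

```python
def strlen(value) -> int:
    """Context-driven: the parameter is a string here, so coerce whatever arrives."""
    return len(str(value))


def length_method(value) -> int:
    """Type-driven: the method belongs to strings, so other types have no answer."""
    if not isinstance(value, str):
        raise TypeError(f"{type(value).__name__} has no length()")
    return len(value)


print(strlen(12345))  # the integer is treated as the string "12345"
try:
    length_method(12345)
except TypeError as error:
    print(error)      # the question the scalar objects idea has to answer
```

In the first function the call site decides the type; in the second the value does, and that is the change in nature being discussed.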

There doesn't seem to be a reasonable way around this, so if we introduce this kind of functionality in core in its current form, we are undoubtedly changing the nature of PHP. Not quite changing from dynamic to strict, but we're changing it, blurring the line even further.

Being PHP, hopefully means remaining dynamic

There probably are solutions to the problem of introducing the kinds of things experienced developers seem to want, while keeping PHP as dynamic as it is today. But, we should think very carefully before getting behind something that is only good for a very very small percentage of us.

Being easy to get started with has all kinds of positive, very valuable, knock-on effects. You could argue that this affects how easy it is to train new developers, how easy it is for new teams to reach productivity, how easy it is for a project to actually succeed.

Since being dynamic and easy appear to go hand in hand, we should value the current dynamic nature of PHP, as much as we value how easy it is to get going.

Monday, 13 October 2014

Who is driving this thing ?

Fig 1. A car crash
If nobody has acceptable control over a project, that project will crash. This year PHP celebrates 20 years of existence. For many years PHP has been the most widely used programming language on the web. I think we can say with some certainty that PHP is under control.

Quite often I come across the following sentiment:

Fig 2. A passionate individual

In a recent inspiring talk, given by Anthony (@ircmaxell) at #phpnw14, he encourages us to look upon those we normally perceive as trolls as passionate individuals.

I'm really trying to take this advice seriously; it has obvious benefits; even if that comment was made to flame and annoy, an opportunity has presented itself to ask and answer some important questions.

Specifically, the following:
  • Who controls PHP ? 
  • What are those people with control aiming for ?
I'm going to attempt answering those questions this morning.


Who controls it ?

In recent years, the PHP project has adopted an RFC (request for comments) based proposal system: Every time somebody wants to make a change to PHP, they have to write an RFC outlining their plan, communicate to all of the voting members what justifies their proposed changes, discuss on open mailing lists the implications and details of their proposal with the community at large. All of that has to happen before a vote is taken to decide if the proposed changes will be integrated into the PHP distribution.

Voting members are, for the most part, made up of internals developers, documentation maintainers, extension maintainers, and other php.net systems designers and administrators.

To become a voting member, you do not need to be a C programming god, you don't even need to know C in any detail. What you do need to do is dedicate some of your time to trying to improve the PHP project.

One of our greatest strengths is definitely the documentation project, it is a huge part of what makes PHP so accessible, and remain so accessible. Being a documentation maintainer equips you with all kinds of intricate knowledge, it teaches you an enormous amount.

Some of the most valuable input during the RFC process comes from those individuals that don't actually write in C, but have encyclopedic knowledge of the way PHP works in the real world, because they have dedicated so much of their time to documentation bugs and maintenance.

What are those people with control aiming for ? 

There is no answer I can give that applies to everyone, nor should there be.

If a person really thinks that their favourite framework, component, or package is not being represented in the decisions being made, then they need to appeal to the leaders of that community to do what it takes to get involved, or even do it themselves.

It might seem unfair, unfair that in some sense you have to earn a right to vote.

Facta, non verba.

You are familiar with this concept, it means "deeds, not words".

Words are cheap; it seems obvious that just because someone is talking doesn't necessarily mean they are worth listening to.

There are a lot of people talking; we should take seriously individuals that have proven they have the kind of knowledge it takes to make good decisions when it comes to RFCs, knowledge you can only get by being really involved.

Enough words, get involved.

Friday, 10 October 2014

To everyone but that guy

Fig 1. Internet Jesus
It took many hours (read: seconds) of research to discover who Internet Jesus actually is. We can assume that the advice you are about to receive does not apply to Him, and Him alone.

It's Friday today, and I thought I would take some time to share with you something we try to do at my place of work; probably the best use of Fridays you have ever heard of.

Refactoring Friday

During the week, we carefully stitch, or recklessly smash together code; we are getting our job done. We've ticked off our tasks and moved on.

Except for Him, we all make mistakes, and we could all make good use of another hour on every task we have ever completed.

Soon, the intricate knowledge of what you have done this week will be gone from memory. In a surprisingly short amount of time, you will come across code you wrote this week and wonder why the world is the way it is, you will pray for a better world.

Whether you work in a shop with a million developers or you work on your own, I encourage you to ask whoever calls the shots to leave Friday alone.

Make Refactoring Friday a thing; before every week ends, allow yourself to revisit the tasks you have completed and just ask the following questions:
  • Have I violated any internal standards ?
  • Have I violated any good programming practices ?
  • Could this code be improved ?
If you can't honestly answer no to all of those questions, your code needs to be refactored; right now, while the intricate knowledge still has a chance of being in place, is the best time to do it.


Wednesday, 8 October 2014

Monkeys and Humans

Fig 1. A (cheeky) monkey
For some people, it's easier to make sense of just what a monkey is if we assume that all a monkey is ever trying to do is be a human. That if it could just lose the hair and walk upright, we could give it a bank account, job, car, and the rest of it; they'd fit right in.

Since HHVM became a thing, there has been this kind of attitude, that somehow Zend is now behind, and that it should strive to be HHVM; lose the hairy bits and it will fit right in.

Monkeys and humans make perfectly good livings, just the way they are, as are Zend and HHVM.

In recent months Zend has made extremely profitable gains, not by introducing a Just-in-Time capable engine, but by aggressively refactoring the Zend engine, introducing better conventions with regard to memory usage, and as a result of that, better APIs; it's paying off, extremely well. These changes will be in the wild when PHP "7" is released.

I want to encourage people to think differently about where to take Zend next; the world is better off with monkeys and humans.

Zend is a complicated animal, but it's got nothing on HHVM; the introduction of JIT capabilities at the level of the engine will probably bite us in the ass.

I understand the attraction of running code directly on the CPU, I understand that even while most of us don't need it, none the less, we want our code to be as fast as physics and the engineering of the day will allow.

There is another way, a much less intrusive, more productive way.

Not very long ago, after some late night conversations and bar room programming, Anthony Ferrara (@ircmaxell) and I came upon the idea to be able to compile those bits of your code you know are very slow to machine code, in userland; I set about exposing libjit to userland and JITFu was born, Anthony set about writing the frontend compiler and Recki-CT was born.

Today Recki-CT is able to generate machine code (using JITFu) and C; this means you can turn a limited subset of the PHP language into machine code at runtime, or into a pecl extension as part of your deployment process.

The beauty of Recki-CT is that it is entirely written in PHP, so many more people can be involved in the development.

The community is quite brilliant at solving problems; for nearly every problem that is solved in the core, there is a better solution in the wild. There is now no reason not to approach the problem of making your code fast differently; what Recki-CT is doing is putting the power at your fingertips.

If you have even a faint interest, I encourage you to get involved: check out Recki-CT and start reading.

Sunday, 5 October 2014

But, is it web scale ?

Before we start to cover the topic of how to achieve parallel concurrency in PHP, we should first think about when it is appropriate.

You may hear veterans of programming say (and newbies parrot) things like:
Threading is not web scale.
This is enough to write off parallelism as something we shouldn't do for our web applications; it seems obvious that there is simply no need to multi-thread the rendering of a template, the sending of email, or any other of the laborious tasks that a web application must carry out in order to be useful.

But rarely do you see an explanation of why this is the case: Why shouldn't your blog be able to multi-thread a response ?

The reasoning is simple: Threading at the front end of a web application doesn't make good sense; if a controller instructs the operating system to create even a reasonable number of threads, for example 8, and a small number of clients request the controller, for example 100, you are asking your hardware to execute 800 threads concurrently.

CPUs would have to look very different from how they do today before anyone could say that is a good idea.

This is not how we make use of 1:1 parallel concurrency in any language that supports it.

Some programming environments employ an N:1 (many-to-one) or M:N (many-to-many) model, whereby the kernel is not necessarily aware of every thread created; this is colloquially known as green threading. PHP uses a 1:1 (one-to-one) model, whereby the kernel is aware of every thread created.

So, when is it appropriate ?

Take MySQL as an example; it uses parallelism to realize its extremely complex services, and many other bits of software in the web stack do the same.

When we look at how they provide their services, we can see that MySQL's complex execution environment is isolated from the service that uses it by process boundaries; that is to say, services are provided by some sane form of IPC and/or RPC.

This is how we make use of parallel concurrency; we separate and isolate those parts of our application that require what parallelism provides.

We keep our frontends simple, we keep them scalable.

Our backends can create and manipulate as many threads as they need to service the frontend of our applications; they can be responsible for all of the complexity in an isolated, controlled and well designed environment.
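As a rough illustration of that separation, here is a minimal sketch in Python (chosen for brevity, not PHP). The `Backend` class, `controller` function and pool size of 8 are all hypothetical names and numbers of my own, and the in-process pool stands in for a real service that would sit behind IPC or RPC. The point is only that the frontend never creates a thread, so 100 requests cannot become 800 threads.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class Backend:
    """Hypothetical service that owns every thread in the application."""

    def __init__(self, workers=8):
        # The pool size is the hard upper bound on threads, whatever the traffic.
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._lock = threading.Lock()
        self.worker_names = set()

    def _work(self, payload):
        # Record which worker ran the job, then do some stand-in work.
        with self._lock:
            self.worker_names.add(threading.current_thread().name)
        return payload.upper()

    def submit(self, payload):
        return self._pool.submit(self._work, payload)

    def shutdown(self):
        self._pool.shutdown(wait=True)


def controller(backend, request):
    # The frontend stays simple: it asks the backend and waits for the answer.
    return backend.submit(request).result()


def serve(requests, workers=8):
    backend = Backend(workers)
    try:
        responses = [controller(backend, r) for r in requests]
    finally:
        backend.shutdown()
    return responses, backend.worker_names
```

Calling `serve` with 100 requests returns 100 responses, and `worker_names` never holds more than `workers` entries.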

Can PHP really be used to write system services ?

Historically we have advised against writing long running scripts in PHP.

Today, PHP is suitable for such things. Obviously there are pitfalls and things to think about, but there is no technical barrier to writing long-running scripts, even ones that run indefinitely.

Obviously, as in Java or any other language that might be used to write system services or long-running code, care must be taken to ensure that the code you write is logically able to run indefinitely; if you append an infinite number of members to an array, you will need an infinite amount of memory.
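To make the array example concrete, here is a small sketch in Python (used for brevity; the same idea applies to a PHP array you trim yourself). The `run` function and `HISTORY_LIMIT` are hypothetical; the only technique shown is capping a collection so that a loop which runs forever keeps a fixed memory footprint.

```python
from collections import deque

HISTORY_LIMIT = 1000  # hypothetical cap on how much history we keep


def run(events, history_limit=HISTORY_LIMIT):
    # A deque with maxlen silently discards the oldest member when full,
    # so memory use stays bounded however long the loop runs.
    recent = deque(maxlen=history_limit)
    processed = 0
    for event in events:
        recent.append(event)
        processed += 1
    return processed, recent
```

Feeding `run` ten thousand events with a limit of 100 processes all of them but retains only the last 100.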

With great power ...

Parallelism is one of the most powerful tools in our toolbox; multicore and multiprocessor systems have changed computing forever.

But with great power comes great responsibility; don't abuse it. Remember the story of the controller that created 800 threads with a tiny amount of traffic; whatever you do, ensure this can never happen.

Wednesday, 1 October 2014

A synchronous explanation of concurrency

Relatively new on the PHP scene are the following words:
  • Concurrent
  • Parallel
  • Asynchronous
These words have very specific meanings; they are not interchangeable.

It's important to communicate our ideas precisely, so that we don't confuse our audience. We should not try to make what we are saying more complex than it is, quite the opposite; it is in our best interest to make our ideas as palatable as possible.

I intend to provide a coherent, correct explanation of what concurrency in programming is all about.

I'm going to use the example of a program which has three distinct tasks to execute; what the tasks are is unimportant for the explanation.

Synchronous Execution

The following diagram shows our program's execution:
 ---      ||
| 0 |     ||
| 0 |     ||
| 0 |     ||
| 0 |     ||
| 0 |     ||
| 0 |    \  /
|---|     ||
| 1 |     ||
| 1 |     ||
| 1 |    Time
| 1 |     ||
| 1 |     ||
| 1 |     ||
|---|     ||
| 2 |    \  /
| 2 |     ||
| 2 |     ||
| 2 |     ||
| 2 |     ||
| 2 |     ||
 ---      ||
We can see that the instructions for each task are executed in a linear fashion, by a single thread of execution.
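The same picture can be sketched in code. This is a minimal Python illustration (Python for brevity; `task` and `run_synchronously` are hypothetical names), where each task has six "instructions" as in the diagram and records its number every time it executes one.

```python
def task(task_id, steps, trace):
    # Each loop iteration stands for one instruction of the task.
    for _ in range(steps):
        trace.append(task_id)


def run_synchronously(task_count=3, steps=6):
    trace = []
    # One thread of execution: a task only starts when the previous one ends.
    for task_id in range(task_count):
        task(task_id, steps, trace)
    return trace
```

The returned trace is six 0s, then six 1s, then six 2s, matching the diagram from top to bottom.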

Parallel Concurrency

For execution of our tasks to be parallel requires more than one thread of execution:
 -----------      ||
| 0 | 1 | 2 |     ||
| 0 | 1 | 2 |     ||
| 0 | 1 | 2 |     ||
| 0 | 1 | 2 |    \  /
| 0 | 1 | 2 |     ||
| 0 | 1 | 2 |     ||
 -----------      ||
We can see from the diagram that our tasks run truly concurrently, concurrently with respect to time, reducing the overall time it takes to execute all three tasks.
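A minimal sketch of the structure, in Python for brevity, gives each task its own kernel thread (the 1:1 model). The names are hypothetical, and there is an important caveat: in CPython the global interpreter lock stops threads from executing Python bytecode simultaneously, so treat this as showing the shape of the model rather than a measured speed-up.

```python
import threading


def task(task_id, steps, traces):
    # Each thread appends only to its own list, so no lock is needed here.
    for _ in range(steps):
        traces[task_id].append(task_id)


def run_in_threads(task_count=3, steps=6):
    traces = {i: [] for i in range(task_count)}
    threads = [
        threading.Thread(target=task, args=(i, steps, traces), name=f"task-{i}")
        for i in range(task_count)
    ]
    for thread in threads:
        thread.start()   # all three threads of execution now exist at once
    for thread in threads:
        thread.join()    # wait for every task to finish
    return traces
```

Each column of the diagram corresponds to one entry in the returned dictionary.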

Asynchronous Concurrency

The following diagram shows the asynchronous model:
 ---      ||
| 2 |     ||
| 1 |     ||
| 0 |     ||
| 1 |     ||
| 0 |     ||
| 1 |    \  /
|---|     ||
| 1 |     ||
| 2 |     ||
| 1 |    Time
| 0 |     ||
| 1 |     ||
| 2 |     ||
|---|     ||
| 0 |    \  /
| 0 |     ||
| 2 |     ||
| 2 |     ||
| 2 |     ||
| 0 |     ||
 ---      ||
We can see that the tasks are interleaved by the programmer, forcing the tasks to execute concurrently with respect to each other, but it seems to take as long as the synchronous model to execute.
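One way to sketch interleaving by the programmer is with generators and a tiny round-robin scheduler, shown here in Python for brevity. The scheduler and its names are my own illustration, not a real library, and it interleaves in strict rotation rather than in the irregular order drawn above.

```python
from collections import deque


def task(task_id, steps, trace):
    for _ in range(steps):
        trace.append(task_id)  # execute one instruction
        yield                  # then hand control back to the scheduler


def run_interleaved(task_count=3, steps=6):
    trace = []
    ready = deque(task(i, steps, trace) for i in range(task_count))
    while ready:
        current = ready.popleft()
        try:
            next(current)          # run the task up to its next yield
            ready.append(current)  # not finished: back of the queue
        except StopIteration:
            pass                   # finished: drop it
    return trace
```

The trace comes out as 0, 1, 2 repeated six times: still 18 instructions on a single thread, which is why it takes as long as the synchronous version.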

Why write asynchronous code ?

Asynchronous concurrency is most useful when our tasks are I/O bound, where a considerable amount of time is actually spent waiting.

The following diagram shows the synchronous execution of our I/O bound tasks:
 ---      ||
| 0 |     ||
| 0 |     ||
| - |     ||
| - |     ||
| 0 |     ||
| 0 |    \  /
|---|     ||
| 1 |     ||
| 1 |     ||
| - |    Time
| - |     ||
| 1 |     ||
| 1 |     ||
|---|     ||
| 2 |    \  /
| 2 |     ||
| - |     ||
| - |     ||
| 2 |     ||
| 2 |     ||
 ---      ||
We can see that there is time spent doing literally nothing while waiting for subsystems and/or hardware to do their job !

The asynchronous model has our instructions interleaved, allowing us to eliminate waiting and continue executing another task's instructions, making the diagram for asynchronous I/O bound code look more like:
 ---      ||
| 2 |     ||
| 0 |     ||
| 1 |     ||
| 0 |    \  /
|---|     ||
| 2 |     ||
| 1 |    Time
| 0 |     ||
| 2 |     ||
|---|     ||
| 1 |    \  /
| 1 |     ||
| 2 |     ||
| 0 |     ||
 ---      ||
So asynchronous concurrency can also reduce the time it takes to execute the same I/O bound instructions, by executing another task's instructions while one task is waiting.
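Here is a minimal sketch of that effect using Python's asyncio (Python for brevity). The task names and the 0.05 second wait are assumptions of mine, and `asyncio.sleep` stands in for real I/O such as a database or network call.

```python
import asyncio
import time


async def task(task_id, wait, trace):
    trace.append((task_id, "before wait"))
    await asyncio.sleep(wait)  # stand-in for waiting on I/O
    trace.append((task_id, "after wait"))


async def run_all(task_count=3, wait=0.05):
    trace = []
    started = time.perf_counter()
    # gather schedules every task; while one waits, the others get to run.
    await asyncio.gather(*(task(i, wait, trace) for i in range(task_count)))
    elapsed = time.perf_counter() - started
    return trace, elapsed


def main():
    return asyncio.run(run_all())
```

All three tasks reach their wait before any of them resumes, so the waits overlap and the elapsed time should be close to one wait rather than three.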

Why write parallel code ? 

Asynchronous concurrency can only help in the case of I/O bound code; parallel concurrency has a much bigger domain (the rest).

Just like asynchronous concurrency, however, there are limits; it is a mistake to think that throwing threads at anything will make it faster.

Achieving parallel concurrency is a huge subject, far too big for the footnote of my first blog post; I intend to cover the subject in detail.