Tuesday, 20 August 2019


Regarding contribution to open source projects

The following scenario has been played out countless times, by a large number of [first time] contributors, to a large number of OSS projects:

  • Find OSS project we love
  • Attempt to navigate [source of] project
  • Find this confusing
  • Decide that there needs to be some "cleanup"
  • Start making pull requests to cleanup [the source code]
  • Pull requests are ignored, refused, or argued against
  • Dismay

Most of the time, we quickly get a feel for the kind of pull requests that will be accepted, everybody moves on ...

But I wonder if they move on before putting any effort into explaining why this type of contribution is not always met with jumps for joy ...

I wonder if we can't say some things about all open source projects, that might guide new contributors to making the kind of contributions that have a good chance of being accepted. Perhaps, more importantly, making contributions in such a way that the (new) contributor becomes an asset for the project.


Communicating with a group of people at once, often about complicated technical matters, is not an easy thing, no matter the calibre of the programmer.

The following sections will seek to provide guidance regarding communications with existing contributing members of a project.

Signal to Noise

Contributing members of a project generally have a steady signal to noise ratio, that's to say, on a yearly (or quarterly, whatever) basis, they make about the same amount of noise, as they do contribute to progress. The ratio varies, depending on the contributor, but steadiness makes for normality, so typically we all get used to even the loudest contributors.

New contributors on the other hand, if nothing else than by virtue of the fact that they have more questions to ask than anybody else, tend to have an unsteady signal to noise ratio, and because the entities you are dealing with are human, this simply annoys them.

The people being annoyed, they very well remember being just as annoying as you are being, so don't take it to heart ...

But maybe there are some things you can do as a new contributor to try and keep a steady ratio ...

Mental Metabolism

New contributors are in learning mode, no matter how familiar they are with the subject matter, language, or patterns used in the project.

In learning mode, we tend to ask as many questions as occur to us, and seek answers as quickly as possible.

This tends to result in noisy communications: A thing to consider is that for every communication you might be sending notification to every subscriber to the medium, be it a github repository, mailing list, forum, or whatever.

Unless you are providing a patch for a life support system and lives are in danger, there is absolutely nothing that can't wait 24 hours.

By waiting a day between your communications, you are affording other contributors time to have their say (and possibly answer questions before you have had the opportunity to ask them). You are also providing yourself much more thinking time, possibly increasing the quality of your communication, and certainly reducing noise.

O(1) Communications

New contributions tend to garner responses or input from several sources, and having waited 24 hours since your last communication, you might have a pile of communications to respond to.

You can further reduce noise levels by responding to everybody at once, don't be afraid to tag people, segment, or otherwise format communications to achieve this.

Once you have the response for the day written out, you might try to compress it by finding overlap, you might even try to preempt what will happen next and try to provide answers to the coming questions in advance.

Source of Truth

Healthy projects tend to develop experts that don't necessarily have any involvement in contributing directly to the project: There are many many more Ruby experts than there are Ruby contributors, mutatis mutandis, the same is true for any project you like. These experts tend to be available outside of official channels, sometimes in groups, sometimes even in realtime.

Let opening an issue, pull request, or otherwise sending "official" communications be a last resort. Realise that the project you are dealing with is never the only source of truth, and that by finding other sources you can further and greatly reduce your noise level.

Choosing Battles

Be ready to gracefully walk away from any contribution or idea you had, be ready before you send your first communication to just be wrong.

Begin wrong is fine, that's how we learn ... It's sometimes hard to see that in the moment though, and find ourselves fighting tooth and nail for what we think is correct.

Realise that your first contributions are not likely to be your most important. You're dealing with humans, and if you make a lot of noise at an early stage, your voice will carry less weight later on.

The Pull Request

Whether communication should precede a pull request is sometimes determined by the rules of the project.

If you are making this decision, then consider how well the project deals with and responds to pull requests. 

If there seems to be evidence that the project struggles to deal with pull requests and if there is a channel for communications other than the channel opened with the pull request, consider communicating before opening a pull request.

Cost and Value

A pull request must result in a net gain for the project.

There are so many ways to achieve a gain that it would be pointless to try to enumerate them. The most common path to a loss is as a result of miscalculation of cost.

As the author of the pull request, you might want to say that the cost associated is whatever it cost you to write the pull request.

But that's not the only cost associated, here's a better starting place for a list of costs:

  • The time it took you to write the pull request
  • The time it's going to take someone, or some people to review the pull request
  • The time it's going to take someone to merge the pull request
  • The time it's going to take someone, or some people to document your changes
  • The time it's going to cost anyone familiar with whatever you are changing

All of this, before the correctness of the pull request even matters. 

Of course, correctness does matter, as well as the content of the pull request ... In some cases, a pull request will have an ongoing, or ongoing costs associated with it. 

Generally, if that ongoing cost is going to be incurred by humans we call it technical debt.

Generally, if that ongoing cost is going to be incurred by the machines executing the code, we refer to it as implementation complexity, or some other fancy term.

These costs hard to quantify, and there is likely disparity between your method of calculating these costs and the methods of existing contributors.

The value your pull request provides must exceed the sum of these costs (as calculated by existing contributors) to be worthwhile, and you should be utterly confident that it does that before you open the pull request.

Final Words

Don't let anything, including the words herein, deter you from your goal of becoming a contributor. 

Achieving your goal will be your reward ...

Sunday, 28 July 2019

Nailed Lids

Fig 1. A Black Box

When we are developing, we go to great lengths to take measurements and gather insights with every kind of testing under the sun, coverage, reviews and so on. These measurements, insights, and processes give us the confidence to take what we made to production.

Production ... a kind of black box, with the lid nailed shut.

You can take a crow bar and pry open one corner of the lid, and using the torch on your phone illuminate one corner of the box. Common crow bars are the Xdebug profiler and XHprof ... These are all great tools, but they are still only illuminating a tiny part of the box, one process at a time.

A technology gaining popularity in recent years is Application Performance Monitoring. APM solutions, typically by taking a large hammer to the box, can provide valuable insight. The large hammer leaves its mark though.

Note: I can only see the source code of open source APM agents.

All of these solutions have one very major drawback: They undertake their work in the same thread as is supposed to be executing PHP - it doesn't matter how well written the code is, or how low the vendor claims the overhead is. It is a mathematical certainty that using these solutions will have a detrimental impact on the performance of the code they are meant to be profiling and or providing insights about.

Clever APM agents will send data in parallel, nevertheless, the majority of their work is still done in the PHP thread.

All of these solutions do one or both of:

  • override the executor function
  • overwrite functions in function tables

Without going into too much detail: Ordinarily the vm is stackless, that means when the executor function (the vm) is entered and a function is called in the code being executed, the executor is not re-entered. Setting the executor function breaks this behaviour, meaning that recursion can lead to stack overflow.

Overwriting functions used to be the bread and butter of hacky extensions like runkit and uopz, and it used to be simple. With the advent of immutable classes, it's not so simple anymore - the function table of the class you are editing, and functions therein, may reside in opcache's SHM. Changing that memory is considered illegal. The VM aggressively uses caching to avoid hashtable lookups, changing a function that exists in the run time cache of another function will lead to faults if that cache entry is subsequently referenced.

A quick word on XHprof and (some) derivatives ... these use the RDTSC instruction as a timing source, have a quick read of the Wikipedia, this hasn't been a good idea in a very long time. They do indeed set affinity to maximize reliability, nevertheless the fragility of using this is unquestionable, and more modern portable API's exist ... nevertheless, it works, and I don't hear everyone being confused that their profiles don't make sense, so more of a technical gripe than anything.

Note: Tideways no longer uses RDTSC, but does use the modern equivalent.

Of course, you can find safe ways to overwrite functions, and maybe a recursive executor is not so terrible for you ...

Conventional wisdom is that if you want to trace or otherwise observe the runtime of PHP, you have to use the hooks that Zend provides and your knowledge of how the Zend layer works. As a result many extensions do these things, or otherwise have a similar detrimental impact on the performance of code. But, they are generally aimed at development time, not production. Doing these things in one process so that Xdebug can debug (or profile) it, pcov can provide coverage for it, or uopz can let your 100 year old tests run is not so bad, a reasonable price to pay for the value being extracted.

Doing these things to a few processes at a time in production, such that APM solutions have enough of a stream of data to provide valuable insights, might also be reasonable. Similarly an APM agent may be extremely lightweight and perform something more akin to the function of a request logger than that of a profiler, limiting their ability to provide insight but making them suitable for production.


First some words about the differences between our development and production environments ...

Our development and staging environments may well operate at capacity, they may well have no spare cores, and no spare cycles - they have every core pinned at 100% usage or close and no capacity to create more processes.

Our production environments must by definition have the ability to deal with production demand. While every core that is running a PHP process might be pinned at 100% or close, we have spare cores and or idle processes.

Getting the lid off the box ...

Stat is a super modern, high performance provider of profile information for production. It uses parallel uio and an atomic ring buffer to make profile data for a set of PHP processes available in realtime over a unix or TCP socket.

Stat does all its work in parallel to PHP, which overcomes the first major drawback of any existing solution. It has no need to set an executor, or otherwise interfere with the runtime of PHP.

Stat is a work in progress, and it may be a month or more before the first release happens, however, if anyone wants to get started on working on any user interfaces (which I will not be writing), I'd be happy to start collaborating on that immediately.

You can find a bit more information in the readme.

That's all I have to say about that right now ...

Wednesday, 17 July 2019

Trimming the Phat

Fig 1. A very fancy Tomb

We all think we know how dead code elimination works, we can just reference code coverage, or run static analysis, or rely on our own internal model of the code, which is always absolutely perfect ...

Dead can mean multiple things when we're talking about code, at least:
  • Compiler context - dead code is "unreachable", it can never be executed
  • Execution context - dead code is "unused", it has not been, or is not going to be called
The distinction between compile and execute in PHP is somewhat blurred, as a result, some dead code detection that should be part of the compiler have traditionally been part of code coverage. In the latest versions of PHP, opcache eliminates unreachable code.

Static analysis and coverage reports can tell you about dead code in those narrow scopes defined, but there is another sense in which code might be considered dead:
  • Code that is in fact unused in production
My manager recently asked me to come up with something so that we can detect dead code in this production sense.

I'm quite adept at bending PHP to my will, however, this task presents some not insignificant challenges. Normally, when we want to abuse PHP in some strange way, we're doing so in the name of testing. 

Testing is a nice quiet place, where there's only one process to care about, not much can go wrong. If you are careful, you can write some really nice tooling, the overhead is very acceptable, and people rave about it on twitter (pcov).

Production on the other hand is a scary place, where mistakes may cost a lot of money, where there are in the order of hundreds of processes to care about: Extracting statistical information from hundreds of processes without adversely affecting their performance is not a simple task.


Tombs is my solution to the problem of detecting code that is unused in production. Code that even though may be reported as reachable, covered, or used, is in fact never called in production.

There's something quite pleasing about the requirements for a task translating almost perfectly into the solution. The requirements for Tombs were:
  • Must not use more CPU time in PHP processes than is absolutely necessary (i.e. be production ready)
  • Must report statistics centrally for every process in a pool
The first requirement, aside from the obvious, means that Tombs needs to have an API that faces the system rather than user land PHP, we can't inject code into production so the processes that gather statistics must be separate and might be on different machines entirely.

The second requirement means that Tombs needs to use shared mapped memory, like O+ or APC(u).

O+ and APC(u) both achieve safety in their use of shared memory by multiple processes using mutual exclusion - implemented either as file locks, pthread mutex, or the windows equivalent - this makes perfect sense for them. It means that even though many processes may compile the about to be cached file, or execute the function that returns the about to be cached variable, only one process can insert the file or variable into shared memory.

Reporting live statistics about a system is similar to trying to count the number of birds in flight over the earth - it will change while reporting. In this environment, mutex makes very little sense, what we need here is a lock free implementation of the structure that stores information we need to share. We need to know that no matter how large the set of data being returned, we don't have to exclude other processes from continuing to manipulate that data concurrently.

Using Tombs

Simply load Tombs in a production environment and without modifying any code, allow normal execution to take place over the course of hours, days, or weeks. 

Now, when you open the Tombs socket the data returned represents the functions and methods that have not been executed by any process in the pool since Tombs was started.

Using this data, you can now make decisions about the removal or refactoring of code to reduce or hopefully eliminate dead code.

If you use Tombs, reach out to me and tell me how it worked out for you ....

Saturday, 30 March 2019


Fig 1. A chap performing the Detroit JIT
Unless you have been living under a rock, or are from the past (in which case, welcome), you will be aware that a JIT is coming to PHP 8: The vote ended, quietly, today, with a vast majority in favour of merging into PHP 8, so, it's official.

Throw some crazy shapes in celebration, suggestion given in Fig 1, and it's even called "The (Detroit) JIT" ...

Now sit down and read the following myth busting article, we're going to clear up some confusion around what the JIT is, what it will benefit, and delve into how it works (but only a little, because I don't want you to be bored).

Since I don't know who I'm talking to, I'm going to start at the beginning with the simple questions and work up to the complex ones, if you already are sure you know the answer to the question in a heading, you can skip that part ...

What is JIT ?

PHP implements a virtual machine, a kind of virtual processor - we call it Zend VM. PHP compiles your human readable script into instructions that the virtual machine understands (we call them opcodes), this stage of execution is what we refer to as "Compile Time". At the "Runtime" stage of execution the virtual machine (Zend VM) executes your code's instructions (opcodes).

This all works very well, and tools like APC (in the past) and OPCache (today) cache your code's instructions (opcodes) so that "Compile Time" only happens when it must.

First, one line to explain what JIT is in general: Just-in-time is a compiler strategy that takes an intermediate representation of code and turns it into architecture dependent machine code at runtime - just-in-time for execution.

In PHP, this means that the JIT treats the instructions generated for the Zend VM as the intermediate representation and emits architecture dependent machine code, so that the host of your code is no longer the Zend VM, but your CPU directly.

Why does PHP need a JIT ?

The focus of the PHP internals community since slightly before PHP 7.0 has been performance, brought about by healthy competition from Facebook's HHVM project. The majority of the core changes in PHP 7.0 were contained in the PHPNG patch, which improved significantly the way in which PHP utilizes memory and CPU at its core, since then every one of us has been forced to keep one eye on performance.

Since PHP 7.0 some performance improvements have been made, optimizations for the HashTable (a core data structure for PHP), specializations in the Zend VM for certain opcodes, specializations in the compiler for certain sequences, and a constant stream of improvements to the Optimizer component of OPCache ... and many others besides, too boring to list.

It's a brute fact that these optimizations can only take us so far and we are rapidly approaching, or maybe have already met, a brick wall in our ability to improve it any further.

Caveat: When we say things like "we can't improve it any further", what we really mean is, "the trade-offs we would have to make to improve it any further no longer look appealing" ... whenever we talk about performance optimizations, we're talking about trade-offs. Often, trade-offs in simplicity for performance. We would all like to think that the simplest code is the fastest code, but that simply is not the case in the modern world of C programming. The fastest code is often that code which is prepared to take advantage of architecture dependent intrinsics or platform (compiler) dependent builtins. Simplicity just is not a guarantee of the best performance ...

At this time, the ability for PHP to JIT would appear to be the best way to squeeze more performance from PHP.

Will the JIT make my website faster ?

In all probability, not significantly.

Maybe not the answer you were expecting: In the general case, applications written in PHP are I/O bound, and JIT works best on CPU bound code.

What on earth does "I/O and CPU bound" mean ?

When we want to describe the general performance characteristics of a piece of code, or an application, we use the terms I/O bound and CPU bound.

In the simplest possible terms:
  • An I/O bound piece of code would go faster if we could improve (reduce, optimize) the I/O it is doing.
  • A CPU bound piece of code would go faster if we could improve (reduce, optimize) the instructions the CPU is executing - or (magically) increase the clock speed of the CPU :)
A piece of code, or an application, may be I/O bound, CPU bound, or bound equally to CPU and I/O.

In general, PHP applications tend to be I/O bound - the thing that is slowing them down is the I/O which they are performing - connecting, reading, and writing to databases, caches, files, sockets and so on.

What does CPU bound PHP look like ?

CPU bound code is not something a lot of PHP programmers will be familiar with, because of the nature of most PHP applications - their job tends to be connect to some database, and or possibly a cache, do some light lifting and spit out an html/json/xml response.

You may look around your codebase and find lots of code that has nothing whatever to do with I/O, code that is calling functions completely disconnected from I/O even, and be confused that I seem to be implying that this doesn't make your application CPU bound, even though there may be many more lines of code that deal with non I/O than I/O.

PHP is actually quite fast, it's one of the fastest interpreted languages in the world. There is no remarkable difference between the Zend VM calling a function that has nothing to do with I/O, and making the same call in machine code. There is clearly a difference, but the fact is that machine code has a calling convention, and Zend VM has a calling convention, machine code has a prologue and Zend VM has a prologue: Whether you call some_c_level_function() in Zend Opcodes or machine code doesn't make a significant difference to the performance of the application making the call - although it may seem to make a significant difference to that call.

Note: A calling convention is (roughly) a sequence of instructions executed *before* entering into another function, a prologue is a sequence of instructions executed *at entry* into another function: The calling convention in both cases pushes arguments onto the stack, and the prologue pops them off the stack.

What about loops, and tail calls and X I hear you ask: PHP is actually quite smart, with the Optimizer component of OPCache enabled your code is transformed as if by magic into the most efficient form you could have written.

It's important to note now that JIT doesn't change the calling convention of Zend Functions from the convention established by the VM - Zend must be able to switch between JIT and VM modes at any time and so the decision was taken to retain the calling convention established by the VM. As a result those calls that you see everywhere aren't remarkably faster when JIT'd.

If you want to see what CPU bound PHP code looks like, look in Zend/bench.php ... This is obviously an extreme example of CPU bound code, but it should drive home the point that where the JIT really shines is in the area of mathematics.

Did PHP make the ultimate trade-off to make math faster ?

No. We did it to widen the scope of PHP, and considerably so. 

Without wanting to toot our own horn, we have the web covered - If you are a web programmer in 2019 and you haven't considered using PHP for your next project, then you are doing the web wrong - in this very biased PHP developer's opinion.

To improve the ability to execute math faster in PHP seems, at a glance, to be a very narrow scope. 

However, this in fact opens the door on things such as machine learning, 3d rendering, 2d (gui) rendering, and data analysis, to name just a few.

Why can't we have this in PHP 7.4 ?

I just called the JIT "the ultimate trade-off", and I think it is: It's arguably one of the most complex compiler strategies ever invented, maybe the most complex. To introduce a JIT is to introduce considerable complexity.

If you ask Dmitry (the author of the JIT) if he made PHP complex, he would say "No, I hate complexity" (that's a direct quote).

At bottom, complex is anything we do not understand, and at the moment, there are very few internals developers (less than a handful) that truly understand the implementation of JIT that we have.

PHP 7.4 is coming up fast, merging into PHP 7.4 would leave us with a version of PHP that less than a handful of people could debug, fix, or improve (in any real sense). This is just not an acceptable situation for those people that voted no on merging into PHP 7.4.

In the time between now and PHP 8, many of us will be working in our spare time to understand the JIT: We still have features we want to implement and tools we need to rewrite for PHP 8, and first we must understand the JIT. We need this time, and are very grateful that a majority of voters saw fit to give it to us.

Complex is not synonymous with horrible: Complex can be beautiful, like a nebula, and the JIT is that kind of complex. You can, in principle, fully understand something complex and only make a marginal reduction in the apparent complexity of that thing. In other words, even when there are 20 internals developers who are as familiar with the JIT as Dmitry is, it doesn't really change the complex nature of JIT.

Will development of PHP slow down ?

There's no reason to think that it will. We have enough time that we can say with confidence that by the time PHP 8 is generally available, there will be enough of us familiar with the JIT to function at least as well as we do today when it comes to fixing bugs and pushing PHP forward.

When trying to square this with the view that a JIT is inherently complex, consider that the majority of our time spent on introducing new features is actually spent discussing that feature. For the majority of features, and even fixes, code may take in the order of minutes or hours to write, and discussions take in the order of weeks or months. In rare cases, the code for a feature may take in the order of hours or days to write, but the discussion will always take longer in those rare cases.

That's all I have to say about that ... 

Enjoy your weekend.

Wednesday, 13 February 2019

Parallel PHP: The Next Chapter

Some years ago, to prove some people on the internet wrong, and because I had a break from normal work - the first such break in years - I decided to write pthreads. My memory fails me a little, but from what I can recall, nobody actually saw that first version, I developed the idea over the following weeks and months and was allowed to publish this work to PECL. It was my introduction to serious internals programming.

The thing I was proving wrong is that PHP is not designed to be used in threads: This is flat out wrong and has been since the 22nd of May in the year 2000, when TSRM was merged into PHP. TSRM - Thread Safe Resource Manager allows you to build PHP such that it can be embedded in a threaded server, like Apache. These builds are colloquially referred to as ZTS - Zend Thread Safe. PHP is very much designed to be used in threads. You may hear people say that TSRM is unstable, or that it's not safe ... there were mistakes in the original implementation, like any software. But it is safe, and it is theoretically sound, and it has been in use on Windows as the primary mode of executing PHP since shortly after it was merged, necessarily so because of a lack of support for proper forking.

What it's not designed to do, or rather I should say what has been given no attention except by me, is exposing threads to userland. The reason for this is the architecture that PHP has, often referred to as "share nothing" seems to be antithetical to threads: Normally, when you start a thread, it operates in the same address space as the thread or process that created it, they share everything. I've been using threads in other languages for a very long time, and the very thing that other people think makes PHP an unsuitable candidate makes me think it more suitable than other languages. The thing that makes programming with threads hard is precisely that they share data, the cognitive overhead increases with each additional thread you create. The models you have to build in your head become unreasonable and prone to mistakes. That is why more modern languages like Go try to hide all of the complexities of threading behind a nice simple API, so that the programmer doesn't actually need to learn about the intricacies of how to use a condition variable, or a mutex, or when to synchronize, they only have to learn how the simple API works.

Having never written a threading API before pthreads, and being left entirely on my own to do it even when the code became public, maybe I made some questionable decisions. I couldn't accept that using parallelism could be easy, I would repeat like a mantra that threading is hard and API's can't solve that. I wanted pthreads to expose the same kind of API that Java has, and my focus could not be shifted by reason. I vaguely remember the first time I went into IRC on internals to talk about pthreads, and people, including Rasmus, tried to reason with me that I was maybe making a mistake, that threading at the frontend of a website doesn't make sense, others said they would have preferred a simpler API ... these pleas fell on deaf ears, and I regret it. I spent many hundreds, possibly thousands of hours writing and rewriting pthreads until it is what you see today, a kind of monster that about 4 people really understand excluding myself, that only the same number of projects have really managed to deploy with any success.

A slight tangent: Threading at the frontend of a website, on the face of it, doesn't make sense: If you have a web request that creates a "reasonable" number of threads, let's say 8, and 1000 clients come along at once, you are asking your hardware to execute 8*1000 threads not including the threads or processes that done the creation, this is unreasonable and cannot possibly scale (with a 1:1 threading model, which is what pthreads has). That said, other languages do manage to integrate parallelism into the web response, but it's tricky, and takes a lot of thought and expertise. I've never suggested that you should build software in this way, and would never suggest it, eventually I prohibited the use of pthreads in anything but the CLI in an attempt to force users to be reasonable.

Tangent over: The great thing about being wrong, and acknowledging that you are wrong, is that you have the opportunity to learn, and do better. If we were never wrong, programming would be boring, we would just chug out code for our entire lives, and miss out on the feeling of truly understanding something for the first time, a feeling I love and cherish, and that keeps me at my keyboard.

I've been aware of my mistakes for some time, but pthreads does have some large projects relying on it, and mostly it's development is controlled by other people now. Occasionally I will give advice or commit something, but I can't, in good conscience, tell anyone to use it, it's simply too hard.

While aware, there was nothing driving me to write another API, until recently when Zend made their intention to merge the JIT  they have been working on for years into PHP. Just think for a moment what it would mean to be able to execute machine code in parallel in user land PHP ... this is not a thing I could have ever imagined happening all those years ago, but it would seem a possibility in the not too distant future.

pthreads has to jump through so many hoops to make threads work and provide the API that it does, that it's not reasonable to talk about it being able to execute machine code.

Recently, I set to work on a new threading API, named Parallel, it is not an exact clone of any existing threading API, it is an API focused on being simple and hiding the complexity inherent in utilising parallelism in your application, it is also focused on being forward compatible with the JIT, for that day when we can actually execute machine code in userland and in parallel.

I should mention now that the current implementation of the JIT, which is not finished, doesn't actually have proper support for TSRM, a fact that only became evident in the past few days. However, the conversation some of us had with Dmitry (the author of the JIT) about the lack of support for ZTS in JIT has inspired him to look at ways to make ZTS with and without a JIT much more efficient. So, it's highly likely that this support will come, although I can't say when.

While the JIT is an exciting prospect for parallelism, I'm also excited to be able to provide a really nice, simple API, that any PHP programmer can understand and use. For the next year at the very least, the JIT doesn't exist for most PHP developers, and you can get to work getting to know how to use parallel ...

Parallel is not complete, but is stable: More features are planned, something like Go's channels would be a nice addition, and I've already started to think about and discuss the implementation of this with other internals developers.

I wish you all the best of luck ... and for those people thinking about using pthreads for new code: don't be silly.

Monday, 28 January 2019

Running for Coverage

Today we're going to look at the history and the future of coverage collection in PHP.

History is the easy bit: For most of the history of PHP, Xdebug has provided the only implementation to php-code-coverage. Simple.

Then in 2015, just after phpdbg was merged into PHP, some clever sausages extended the instruction logging facility that I wrote into phpdbg for internals developers in order to provide another implementation to php-code-coverage. To paraphrase a popular book "and they saw that it was good" ...

But was it good !? It was fast, it didn't add any complication to phpdbg and for a lot of people, they didn't notice (or didn't care about) the mistakes phpdbg was making.

That's right, it was merged with mistakes ...

You might think the job of a coverage collector is just to hook into Zend and find out what lines have been executed any way it can, and on the face of it, that's true (and difficult to get wrong, you might think).

But, think more carefully about it, and you'll realise that actually a coverage collector must know what instructions are executable (and important to the user), if for no other reason than Zend will insert an implicit return statement into all functions (or top level code, file) even if you have an explicit one. It does this so that all functions certainly end with return, this is important for boring internal reasons as well as obvious ones.

A graphic example of why it's important to know which instructions are executable is this:

/* 1 */ function foo($bar) {
/* 2 */    if ($bar) {
/* 3 */        return true;
/* 4 */    }
/* 5 */ }
At the end of this function, on line 5, Zend inserts that implicit return, a collector that doesn't know if that return is an executable instruction must ignore it, and so will report inaccurate coverage of the function if the first control path is taken.

Quite early on in the life of Xdebug, Derick developed branch analysis. At the time, it was the only implementation of branch analysis for PHP code, so a very valuable thing. Branch analysis allows Xdebug to determine that the implicit return is important, and so mark it as coverable/executable code to be included in any trace.

In addition the branch analysis in Xdebug is the basis for its support for branch or path coverage, which is in my opinion, and Derick's, the most valuable feature of Xdebug's coverage, but unfortunately unusable in its current state. Although there are plans to improve that, and I may even help. There's no real bias here.

phpdbg has no such analysis, and used a fast, but inaccurate method that results in ignoring all implicit returns, executable or not. This makes the reports phpdbg generates less than accurate, and less than trustworthy ... but it does do it fast ... ish.

So now it's 2015, we have two debuggers, both with support for coverage, one of them obviously superior to the other, and one of them fast, but inferior.

Now I want you to question whether it makes sense for a debugger to have support for code coverage at all; gdb has no such thing, I haven't used any Java since 2015, but I don't remember seeing code coverage collection as a feature of any debugger I ever used for it, quite late on in the game Visual Studio did get support for coverage, but not as an extension of debugging, but a feature of the IDE itself ...

The answer is no, it doesn't really make sense, the two things interfere with each other. A debugger must gain such a degree of control over the execution environment it is debugging that they come with unavoidable overhead, they may even need to change (or have the ability to change) the path that is taken through code which is antithetical to collecting coverage. A coverage collection tool needs to do the opposite of a debugger: Change as little as possible, try not to slow down the execution of code more than is absolutely necessary to do your job - Not a concern for a debugger, nobody really cares if a debugger is slow - it will spend a lot of its time paused, waiting for you to figure out what to do next !

I should say now that Derick flat disagrees me with here, and his reasoning is not wrong: Xdebug started as a debugging aid, the more fully blown features such as step debugging and profiling were added later. So adding coverage to the set of tools wasn't crazy, it makes sense.

But we have two debuggers that support coverage, we're so spoiled, or unlucky ... or foolhardy ... or doomed ...

The fact is that while Xdebug has superior collection to phpdbg, most people disable Xdebug in their CI, and in their development environments for performance reasons. If they run coverage in CI, it's for a small project, or they use phpdbg and put up with the mistakes (or don't know about  them).

When Sebastian Bergmann was conversing with the clever sausages that made phpdbg support coverage collection, he even warned them they were following a doomed path, and suggested it might be better if code coverage was a standalone extension, nevertheless he merged their work and we all moved forward.

Doing something about this situation has been on my todo list since shortly after the driver for phpdbg was merged into php-code-coverage, not very high up on my todo list, but present.

Quite recently I saw a blog post from Sebastian about making Xdebug collect coverage faster, he failed to mention phpdbg in the whole article which was questioned by reddit users and twitterers alike. He didn't mention it because he knows very well what its limitations are, and didn't want to encourage people to use something that makes mistakes. Totally fair.

But, someone on the internet said a thing and my mind became occupied, completely, with fixing this problem. There's no good reason that in 2019, we can't collect accurate coverage, and fast.

I set to work on PCOV, which is a standalone extension that implements the kind of interface that php-code-coverage needs, it does this with as little overhead as possible, as it should. At first, I copied the faulty method of ignoring executable returns from phpdbg, I done this to prototype it as fast as possible and see just what kind of performance we can get. The results were remarkable, the overhead was so very low that it managed to outpace phpdbg on every test suite I ran, by a considerable margin.

Even though flawed, I thought this is worth sharing, so I made it nice and made a readme and opened a pull request to have the PHP part of the driver merged into php-code-coverage. But I didn't stop thinking ...

I then read from a post on Dericks blog that mentioned he was looking for ways to improve the performance of coverage collection in Xdebug. In the blog post is a one liner about preferring correctness over speed.

I absolutely agree with preferring correctness over speed, and I couldn't sleep knowing that I had just introduced a known flaw in brand new code, sure it was faster than phpdbg, but objectively not better at the job of collecting accurate coverage.

It so happens that Xdebug is not the only software in the ecosystem that performs analysis of code, in fact Optimizer, part of PHP for many years now, also performs analysis and is the "source of truth" for what is an executable Zend instruction, since non-executable ones are destined to be removed automatically during one of its many optimization passes.

You will notice that nobody ever complains that Optimizer is slow to analyze code, the reason for this, is that it's not slow at all, it has a very succinct implementation of a control flow graph ... not much of PHP is succinct, but so well abstracted is this feature of Optimizer that you can lift it from opcache and drop it into whatever you like with very little work.

That is what I did next ... So now PCOV and Zend agree absolutely, and always will, about what is executable code.

It may seem presumptious of me to talk about the future of coverage being PCOV, but humbly, I'd like to suggest that it should be, and like you to consider that I'm talking about the distant future, not tomorrow: I think phpdbg and maybe Xdebug should drop that feature altogether and maybe we can team up and add some really cool but usable and fast features to PCOV that php-code-coverage always wanted, such as branch or path coverage, a much superior criteria than line coverage.

I should make clear at this point that Derick is not so keen on that idea currently, and would like to pursue his own path for Xdebug, with plans to refactor and improve upcoming Xdebug releases, possibly containing the many features within Xdebug - so it is able to behave only as a profiler, or a debugger, or a collection tool. Honestly I would be surprised if he wanted to drop anything from such mature and widely deployed software, but let's see where we are in 5 years, perhaps ...

At this moment, you will find it hard to use PCOV in your projects as I'm waiting for Sebastian to review the pull request and make his decision, presumably about the version of php-code-coverage that PCOV will first be included in.

It's Monday morning, and I've got nothing better to do than write a blog post ... When you can use it easily, another post will follow.

That's all for now, enjoy your week :)

Sunday, 27 January 2019

Faking It

Fig 1. A Mockingbird
As well as mentoring and code review one of my main tasks at work is to improve the test suites and improve the testing and development methodologies we use. This is no small task and has resulted in the publication of a few extensions, one of them is uopz.

Before we continue; I work in the real world, where not all code was written yesterday using the best standards and best methods, and it's fine to say "just fix your code", but totally unrealistic, we have to deal with reality.

You may not hear people raving about using uopz, because for most of us, if you are doing the sort of things that uopz makes possible, there is something wrong with the way you are writing code. This is technically true, of course, nevertheless, we have work to do.

When I first started my current job, our test suite was bound tightly to runkit, and failed all the time. When I wrote uopz and we adopted it, it stabilised and developers could once again get on with their work. All the while we would repeat to each other that we would move away from uopz by improving our tests and code. This hasn't really happened, instead, because uopz allows certain crazy things, they are the crazy things that were being done in tests.

It's five years later, and uopz takes up a fair amount of my time with each new PHP version, it's quite a headache, for a temporary solution. Realising that actually this is entirely my fault because I gave them the tools to work like this, I decided we needed new tools.

Some time ago, I wrote an extension called Componere (latin, composer ... I like latin names), the purpose of Componere is to allow the programmer to (re)define complex types at runtime. I showed this to a couple of colleagues, got some "wow, cool", but they never picked it up. I later realised that they didn't pick it up because it relies on the author having at least some knowledge of the way classes are built internally. So even though extremely powerful, it got ignored.

We required a higher level API, and so I've written a mocking framework in PHP, built on top of Componere that is almost without limitation. It's name is "mimus" which is taken from Latin, "mimic" and shared with the first part of the Latin binomial name for the animals everyone knows as Mockingbirds.

I am fully aware that the ecosystem contains within it many mocking frameworks, however they all have the same set of problems, limitations on what methods you can mock or how. This may be good enough for the vast majority of people, you can just write your code so that you can mock the parts you need at test time. However, if you have 3M LOC, it's not so simple; We need to do some of the things that can't be supported properly if you write the whole framework in PHP, such as stubbing privates, statics, finals. We also have 14 tonnes (number pulled from the air) of tests split across many suites and projects, this makes invoking PHP parsing in PHP several tens of thousands of times unrealistic.  Wipe that cringe off your face, real world, remember ...

While you don't hear people raving about uopz, I happen to know it's used in some very large scale deployments of PHP: I know that there's a number of people out there doing exactly the same sort of horrible things in tests that we were doing, and that mimus is freeing us from, slowly.

We're a month into the switch from uopz and hacking the engine apart to a more modern, more sensible world. The developers are really enjoying themselves using mimus too, which is a bonus, and probably born of the fact that they don't have to feel quite so "dirty" when writing tests.

I'm not going to repeat the readme for mimus here, or show any code, because it's Sunday and you probably don't want that. Tomorrow morning, before you write another test that uses uopz, check out mimus ...

Enjoy the rest of your Sunday ...