Sunday, 28 July 2019

Nailed Lids

Fig 1. A Black Box

When we are developing, we go to great lengths to take measurements and gather insights with every kind of testing under the sun, coverage, reviews and so on. These measurements, insights, and processes give us the confidence to take what we made to production.

Production ... a kind of black box, with the lid nailed shut.

You can take a crow bar and pry open one corner of the lid, and using the torch on your phone illuminate one corner of the box. Common crow bars are the Xdebug profiler and XHprof ... These are all great tools, but they are still only illuminating a tiny part of the box, one process at a time.

A technology gaining popularity in recent years is Application Performance Monitoring. APM solutions, typically by taking a large hammer to the box, can provide valuable insight. The large hammer leaves its mark though.

Note: I can only see the source code of open source APM agents.

All of these solutions have one very major drawback: They undertake their work in the same thread as is supposed to be executing PHP - it doesn't matter how well written the code is, or how low the vendor claims the overhead is. It is a mathematical certainty that using these solutions will have a detrimental impact on the performance of the code they are meant to be profiling and or providing insights about.

Clever APM agents will send data in parallel, nevertheless, the majority of their work is still done in the PHP thread.

All of these solutions do one or both of:

  • override the executor function
  • overwrite functions in function tables

Without going into too much detail: Ordinarily the vm is stackless, that means when the executor function (the vm) is entered and a function is called in the code being executed, the executor is not re-entered. Setting the executor function breaks this behaviour, meaning that recursion can lead to stack overflow.

Overwriting functions used to be the bread and butter of hacky extensions like runkit and uopz, and it used to be simple. With the advent of immutable classes, it's not so simple anymore - the function table of the class you are editing, and functions therein, may reside in opcache's SHM. Changing that memory is considered illegal. The VM aggressively uses caching to avoid hashtable lookups, changing a function that exists in the run time cache of another function will lead to faults if that cache entry is subsequently referenced.

A quick word on XHprof and (some) derivatives ... these use the RDTSC instruction as a timing source, have a quick read of the Wikipedia, this hasn't been a good idea in a very long time. They do indeed set affinity to maximize reliability, nevertheless the fragility of using this is unquestionable, and more modern portable API's exist ... nevertheless, it works, and I don't hear everyone being confused that their profiles don't make sense, so more of a technical gripe than anything.

Note: Tideways no longer uses RDTSC, but does use the modern equivalent.

Of course, you can find safe ways to overwrite functions, and maybe a recursive executor is not so terrible for you ...

Conventional wisdom is that if you want to trace or otherwise observe the runtime of PHP, you have to use the hooks that Zend provides and your knowledge of how the Zend layer works. As a result many extensions do these things, or otherwise have a similar detrimental impact on the performance of code. But, they are generally aimed at development time, not production. Doing these things in one process so that Xdebug can debug (or profile) it, pcov can provide coverage for it, or uopz can let your 100 year old tests run is not so bad, a reasonable price to pay for the value being extracted.

Doing these things to a few processes at a time in production, such that APM solutions have enough of a stream of data to provide valuable insights, might also be reasonable. Similarly an APM agent may be extremely lightweight and perform something more akin to the function of a request logger than that of a profiler, limiting their ability to provide insight but making them suitable for production.


First some words about the differences between our development and production environments ...

Our development and staging environments may well operate at capacity, they may well have no spare cores, and no spare cycles - they have every core pinned at 100% usage or close and no capacity to create more processes.

Our production environments must by definition have the ability to deal with production demand. While every core that is running a PHP process might be pinned at 100% or close, we have spare cores and or idle processes.

Getting the lid off the box ...

Stat is a super modern, high performance provider of profile information for production. It uses parallel uio and an atomic ring buffer to make profile data for a set of PHP processes available in realtime over a unix or TCP socket.

Stat does all its work in parallel to PHP, which overcomes the first major drawback of any existing solution. It has no need to set an executor, or otherwise interfere with the runtime of PHP.

Stat is a work in progress, and it may be a month or more before the first release happens, however, if anyone wants to get started on working on any user interfaces (which I will not be writing), I'd be happy to start collaborating on that immediately.

You can find a bit more information in the readme.

That's all I have to say about that right now ...


  1. Hi, Joe. I am willing to go with building user interface :-) How can I start?

    1. Maybe open an issue so we can discuss your ideas there ?

  2. Thank you for sharing the post.
    Blogger at Radio Box