Monday, 28 January 2019

Running for Coverage

Today we're going to look at the history and the future of coverage collection in PHP.

History is the easy bit: For most of the history of PHP, Xdebug has provided the only implementation to php-code-coverage. Simple.

Then in 2015, just after phpdbg was merged into PHP, some clever sausages extended the instruction logging facility that I wrote into phpdbg for internals developers in order to provide another implementation to php-code-coverage. To paraphrase a popular book "and they saw that it was good" ...

But was it good !? It was fast, it didn't add any complication to phpdbg, and a lot of people didn't notice (or didn't care about) the mistakes phpdbg was making.

That's right, it was merged with mistakes ...

You might think the job of a coverage collector is just to hook into Zend and find out what lines have been executed any way it can, and on the face of it, that's true (and difficult to get wrong, you might think).

But think more carefully about it, and you'll realise that a coverage collector must actually know which instructions are executable (and important to the user), if for no other reason than that Zend will insert an implicit return statement into all functions (or top level code, file) even if you have an explicit one. It does this so that all functions certainly end with a return; this is important for boring internal reasons as well as obvious ones.

A graphic example of why it's important to know which instructions are executable is this:

/* 1 */ function foo($bar) {
/* 2 */    if ($bar) {
/* 3 */        return true;
/* 4 */    }
/* 5 */ }

At the end of this function, on line 5, Zend inserts that implicit return. A collector that doesn't know whether that return is an executable instruction must ignore it, and so will report inaccurate coverage of the function if only the first control path is taken: lines 2 and 3 are hit, line 5 never is, and yet the function appears fully covered.
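
To put rough numbers on that, here is a minimal sketch of the arithmetic. It assumes, purely for illustration, that only lines 2, 3 and 5 of foo() carry executable instructions; lineCoverage() is a hypothetical helper, not part of any collector.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: percentage of executable lines that were executed.
function lineCoverage(array $executable, array $executed): float
{
    $hit = array_intersect($executable, $executed);
    return 100.0 * count($hit) / count($executable);
}

// Calling foo(true) runs lines 2 and 3, then returns; line 5 is never reached.
$executed = [2, 3];

// A collector that ignores the implicit return treats only 2 and 3 as executable.
$naive = lineCoverage([2, 3], $executed);

// A collector that knows line 5 is executable includes it in the total.
$accurate = lineCoverage([2, 3, 5], $executed);

printf("naive: %.1f%%, accurate: %.1f%%\n", $naive, $accurate);
// prints "naive: 100.0%, accurate: 66.7%"
```

Same test run, same executed lines, and the only thing that changed is what the collector believes is executable.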

Quite early on in the life of Xdebug, Derick developed branch analysis. At the time, it was the only implementation of branch analysis for PHP code, so a very valuable thing. Branch analysis allows Xdebug to determine that the implicit return is important, and so mark it as coverable/executable code to be included in any trace.

In addition, the branch analysis in Xdebug is the basis for its support for branch or path coverage, which is, in my opinion and Derick's, the most valuable feature of Xdebug's coverage, but unfortunately unusable in its current state. There are plans to improve that, though, and I may even help. There's no real bias here.

phpdbg has no such analysis, and uses a fast but inaccurate method that results in ignoring all implicit returns, executable or not. This makes the reports phpdbg generates less than accurate, and less than trustworthy ... but it does do it fast ... ish.

So now it's 2015, we have two debuggers, both with support for coverage, one of them obviously superior to the other, and one of them fast, but inferior.

Now I want you to question whether it makes sense for a debugger to have support for code coverage at all. gdb has no such thing. I haven't used any Java since 2015, but I don't remember seeing code coverage collection as a feature of any debugger I ever used for it. Quite late on in the game Visual Studio did get support for coverage, not as an extension of debugging, but as a feature of the IDE itself ...

The answer is no, it doesn't really make sense: the two things interfere with each other. A debugger must gain such a degree of control over the execution environment it is debugging that it comes with unavoidable overhead; it may even need to change (or have the ability to change) the path that is taken through code, which is antithetical to collecting coverage. A coverage collection tool needs to do the opposite of a debugger: change as little as possible, and try not to slow down the execution of code more than is absolutely necessary to do its job. That's not a concern for a debugger, nobody really cares if a debugger is slow - it will spend a lot of its time paused, waiting for you to figure out what to do next !

I should say now that Derick flat disagrees with me here, and his reasoning is not wrong: Xdebug started as a debugging aid, and the more fully blown features such as step debugging and profiling were added later. So adding coverage to the set of tools wasn't crazy, it makes sense.

But we have two debuggers that support coverage, we're so spoiled, or unlucky ... or foolhardy ... or doomed ...

The fact is that while Xdebug has superior collection to phpdbg, most people disable Xdebug in their CI and in their development environments for performance reasons. If they run coverage in CI, it's for a small project, or they use phpdbg and put up with the mistakes (or don't know about them).

When Sebastian Bergmann was conversing with the clever sausages that made phpdbg support coverage collection, he even warned them they were following a doomed path, and suggested it might be better if code coverage was a standalone extension. Nevertheless, he merged their work and we all moved forward.

Doing something about this situation has been on my todo list since shortly after the driver for phpdbg was merged into php-code-coverage, not very high up on my todo list, but present.

Quite recently I saw a blog post from Sebastian about making Xdebug collect coverage faster. He didn't mention phpdbg anywhere in the article, which was questioned by reddit users and twitterers alike. He didn't mention it because he knows very well what its limitations are, and didn't want to encourage people to use something that makes mistakes. Totally fair.

But, someone on the internet said a thing and my mind became occupied, completely, with fixing this problem. There's no good reason that in 2019, we can't collect accurate coverage, and fast.

I set to work on PCOV, a standalone extension that implements the kind of interface that php-code-coverage needs, and does so with as little overhead as possible, as it should. At first, I copied the faulty method of ignoring executable returns from phpdbg; I did this to prototype it as fast as possible and see just what kind of performance we could get. The results were remarkable: the overhead was so low that it managed to outpace phpdbg on every test suite I ran, by a considerable margin.
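
For a feel of what that kind of interface looks like from userland, here is a rough sketch. It assumes the \pcov\start(), \pcov\stop(), \pcov\collect() and \pcov\clear() functions and the \pcov\inclusive constant as listed in PCOV's README; the exact API at the time of writing may differ, and coverageFor() is a hypothetical wrapper made up for this example. It returns null when the extension isn't loaded, so it is safe to run anywhere.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical wrapper: run $code under PCOV and return the line data
 * collected for $file, or null when the pcov extension is not loaded.
 */
function coverageFor(callable $code, string $file): ?array
{
    if (!extension_loaded('pcov')) {
        return null; // nothing to collect without the extension
    }

    $path = realpath($file);
    if ($path === false) {
        return []; // unknown file, so there is nothing to report
    }

    \pcov\clear(); // assumption: drops data left over from earlier runs
    \pcov\start();
    try {
        $code();
    } finally {
        \pcov\stop(); // always stop recording, even if $code throws
    }

    // Assumption: collect() returns an array keyed by file path.
    $data = \pcov\collect(\pcov\inclusive, [$path]);

    return $data[$path] ?? [];
}

$lines = coverageFor(function () {
    return strlen('hello');
}, __FILE__);

var_dump($lines === null ? 'pcov not loaded' : count($lines));
```

Whether a given file shows up at all depends on how the extension is configured (which directory it is told to watch, for instance), so treat an empty result as "check your settings" rather than "nothing ran".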

Even though it was flawed, I thought this was worth sharing, so I made it nice, wrote a readme, and opened a pull request to have the PHP part of the driver merged into php-code-coverage. But I didn't stop thinking ...

I then read a post on Derick's blog that mentioned he was looking for ways to improve the performance of coverage collection in Xdebug. In the blog post is a one liner about preferring correctness over speed.

I absolutely agree with preferring correctness over speed, and I couldn't sleep knowing that I had just introduced a known flaw in brand new code. Sure, it was faster than phpdbg, but objectively not better at the job of collecting accurate coverage.

It so happens that Xdebug is not the only software in the ecosystem that performs analysis of code, in fact Optimizer, part of PHP for many years now, also performs analysis and is the "source of truth" for what is an executable Zend instruction, since non-executable ones are destined to be removed automatically during one of its many optimization passes.

You will notice that nobody ever complains that Optimizer is slow to analyze code. The reason is that it's not slow at all: it has a very succinct implementation of a control flow graph ... not much of PHP is succinct, but this feature of Optimizer is so well abstracted that you can lift it from opcache and drop it into whatever you like with very little work.

That is what I did next ... So now PCOV and Zend agree absolutely, and always will, about what is executable code.
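
To make the idea concrete, here is a toy model of that kind of analysis. It is emphatically not Optimizer's implementation: the instruction arrays, reachable() and executableLines() are all made up for illustration, and the op names only loosely mirror Zend's. The point is simply that walking the control flow from the first instruction tells you which implicit returns can ever run.

```php
<?php
declare(strict_types=1);

/**
 * Toy reachability walk over a flat instruction list.
 * Each instruction is ['op' => string, 'line' => int, 'target' => int (jumps only)].
 * Returns the sorted indexes of instructions reachable from index 0.
 */
function reachable(array $ops): array
{
    $seen = [];
    $work = [0];
    while ($work) {
        $i = array_pop($work);
        if (isset($seen[$i]) || !isset($ops[$i])) {
            continue; // already visited, or past the end
        }
        $seen[$i] = true;
        switch ($ops[$i]['op']) {
            case 'RETURN':
                break; // leaves the function, no successors
            case 'JMP':
                $work[] = $ops[$i]['target'];
                break;
            case 'JMPZ':
                $work[] = $ops[$i]['target']; // condition was false
                $work[] = $i + 1;             // condition was true
                break;
            default:
                $work[] = $i + 1;             // plain fall through
        }
    }
    $indexes = array_keys($seen);
    sort($indexes);
    return $indexes;
}

/** Lines that carry at least one reachable instruction. */
function executableLines(array $ops): array
{
    $lines = [];
    foreach (reachable($ops) as $i) {
        $lines[$ops[$i]['line']] = true;
    }
    $lines = array_keys($lines);
    sort($lines);
    return $lines;
}

// foo() from earlier: the implicit return on line 5 is the JMPZ target.
$foo = [
    ['op' => 'JMPZ',   'line' => 2, 'target' => 2],
    ['op' => 'RETURN', 'line' => 3],
    ['op' => 'RETURN', 'line' => 5], // implicit
];

// A function whose body is one unconditional return: the implicit one is dead.
$bar = [
    ['op' => 'RETURN', 'line' => 2],
    ['op' => 'RETURN', 'line' => 3], // implicit, never reached
];

echo implode(',', executableLines($foo)), "\n"; // prints "2,3,5"
echo implode(',', executableLines($bar)), "\n"; // prints "2"
```

So in this model the implicit return in foo() counts, and the one in the second function doesn't, without any special casing of returns at all.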

It may seem presumptuous of me to talk about the future of coverage being PCOV, but humbly, I'd like to suggest that it should be, and I'd like you to consider that I'm talking about the distant future, not tomorrow: I think phpdbg and maybe Xdebug should drop that feature altogether, and maybe we can team up and add some really cool but usable and fast features to PCOV that php-code-coverage always wanted, such as branch or path coverage, a far superior criterion to line coverage.

I should make clear at this point that Derick is not so keen on that idea currently, and would like to pursue his own path for Xdebug, with plans to refactor and improve upcoming Xdebug releases, possibly containing the many features within Xdebug - so it is able to behave only as a profiler, or a debugger, or a collection tool. Honestly I would be surprised if he wanted to drop anything from such mature and widely deployed software, but let's see where we are in 5 years, perhaps ...

At this moment, you will find it hard to use PCOV in your projects as I'm waiting for Sebastian to review the pull request and make his decision, presumably about the version of php-code-coverage that PCOV will first be included in.

It's Monday morning, and I've got nothing better to do than write a blog post ... When you can use it easily, another post will follow.

That's all for now, enjoy your week :)

8 comments:

  1. PCOV is amazing! I stopped tracking or caring about PHP code coverage a while back because of how slow or inaccurate it was. Also, I know our code coverage is not all that great, and there is no practical way to raise it significantly in any reasonable amount of time, so why bother depressing myself? But then when I heard of PCOV, I got interested. I was able to get it installed (with the help of a workmate), and managed to write some code with the php-code-coverage that tracked and reported code coverage that ran fast enough to not annoy me.

    Then I even found a way to make code coverage relevant to us again. We now have a test class that other test classes can inherit from when they want to enforce "strict code coverage". The idea is to primarily use it with tests that have a one to one mapping with the class they are testing. The test will figure out, based on its own name, which class it is testing. It will then track its own completely independent code coverage of that class. The only coverage that gets reported is any missed lines, and this is considered to mean the test failed.

    So it is a very strict form of code coverage requirements in two ways. First, it requires 100% coverage of the class under test. Second, that coverage must occur entirely as a result of that test; coverage from other tests run in that suite doesn't count.

    The beauty of it is how it is practical to use on legacy code with little or no pre-existing code coverage, because it is not project wide coverage tracking or enforcing. When you want to use TDD to start a new class, you start by making a test that inherits from that strict one. This will ensure that right from the start, your new class has 100% coverage. If it has 100% coverage in the beginning (which it will if it is empty), it is actually quite easy to keep it that way going forward (assuming you use TDD or at least write tests immediately after writing code). And if for whatever reason you decided you don't want that strict coverage requirement anymore for that class, you just change the test to not inherit from that class.
