Saturday, 8 November 2014

Future Notes

Fig 1. Go PHP "7"
Now that the master branch of PHP (PHP7) is stabilizing, it is time for those of us who maintain extensions to think about upgrading our source code to work with the new APIs and conventions in PHP7.

Without going into too much detail about why PHP7 is the way it is, I'm going to go through the major differences extension maintainers or authors need to be aware of.

The Zend Object

Those of us who don't hate our users provide object-oriented APIs, and this is an area where PHP7 is different from the outset.

When an extension registers a class entry in the MINIT routine, it sets a handler named create_object.

A typical create_object handler for a 5 series extension looks something like:
zend_object_value my_create_object(zend_class_entry *ce TSRMLS_DC) {
    zend_object_value retval;
    MY *object = (MY*) emalloc(sizeof(*object));

    /* zend_object_std_init also needs the class entry */
    zend_object_std_init(&object->std, ce TSRMLS_CC);
    /* ... */

    retval.handlers = &my_handlers;
    retval.handle   = zend_objects_store_put(
        object,
        my_dtor, my_free, my_clone TSRMLS_CC);

    return retval;
}

We can see that the handler is expected to return a zend_object_value, and is expected to store the object in the object store itself.

A typical create_object handler for the 7 series looks like:
zend_object* my_create_object(zend_class_entry *ce TSRMLS_DC) {
    MY *object = (MY*) emalloc(sizeof(*object));

    /* zend_object_std_init also needs the class entry */
    zend_object_std_init(&object->std, ce TSRMLS_CC);
    /* ... */

    object->std.handlers = &my_handlers;

    return &object->std;
}

The differences are subtle: a zval in PHP7 holds a zend_object* directly in its value union, and zend_object_std_init calls zend_objects_store_put itself, so the handler no longer has to.

In addition, the destroy, free and clone handlers are now part of the object handlers struct (my_handlers), rather than set by the call to zend_objects_store_put.

To fetch the allocated object in the PHP5 series, a call to zend_object_store_get_object was required. Since the zend_object* is now part of the value union, this call is eliminated in the PHP7 series.

Objects stored in the object store in the PHP5 series must have a zend_object as the first member of the struct, for example:
typedef struct _my {
    zend_object std;
    int  my_integer;
    /* other members here */
} MY;

Objects in PHP7 do not have the same limitation. This means that fetching the object allocated by a create_object handler in PHP7, given a zval, can be performed as follows:
PHP_METHOD(My, method) {
    MY *object;

    if (zend_parse_parameters_none() != SUCCESS) {
        return;
    }
    
    object = (MY*) ((char*)Z_OBJ_P(getThis()) - XtOffsetOf(MY, std));    
}

Even simpler, if one sticks to the established convention of having zend_object be the first member in the structure, the object allocated by create_object can be accessed as follows:
PHP_METHOD(My, method) {
    MY *object;

    if (zend_parse_parameters_none() != SUCCESS) {
        return;
    }
    
    object = (MY*) Z_OBJ_P(getThis());  
}
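
To make the arithmetic in the two listings above concrete, here is a minimal standalone sketch in plain C. It does not use the Zend headers: mock_object is a hypothetical stand-in for zend_object, MY_TRAILING and MY_LEADING are illustrative names, and offsetof from <stddef.h> plays the role of XtOffsetOf.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for zend_object; the real structure has more members. */
typedef struct _mock_object {
    unsigned int handle;
} mock_object;

/* The embedded object is NOT the first member, so its offset is non-zero. */
typedef struct _my_trailing {
    int         my_integer;
    mock_object std;
} MY_TRAILING;

/* The embedded object IS the first member, so its offset is zero. */
typedef struct _my_leading {
    mock_object std;
    int         my_integer;
} MY_LEADING;

/* Step back from the embedded member to the start of the container. */
static MY_TRAILING *my_trailing_from_obj(mock_object *obj) {
    assert(obj != NULL);
    return (MY_TRAILING *) ((char *) obj - offsetof(MY_TRAILING, std));
}

/* With a zero offset the subtraction would be a no-op, so a cast suffices. */
static MY_LEADING *my_leading_from_obj(mock_object *obj) {
    assert(obj != NULL);
    return (MY_LEADING *) obj;
}
```

Given a MY_TRAILING t, my_trailing_from_obj(&t.std) yields &t again. Wrapping the cast in a small helper like this keeps the pointer arithmetic in one place instead of repeating it in every method.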

Every little counts. The fact that objects can be accessed with pointer arithmetic is super cool, and saves many calls to zend_object_store_get_object for even the simplest extension.

The only other thing to remember is that Zend needs to know the offset of the zend_object in your object's structure. If zend_object is not the first member, the offset should be stored in the object's handlers, in the field named offset. This is typically done during MINIT, when handler structures are first created by the extension.
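
Why the engine needs that offset can be sketched in standalone C. Everything here is a mock under stated assumptions: mock_handlers models only the offset and free_obj fields, my_minit and my_create_object stand in for the extension's MINIT and create_object, and the engine_* functions are hypothetical names for what the engine does when all it has is a pointer to the embedded object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct _mock_object mock_object;

/* Stand-in for the object handlers table; only two fields are modelled. */
typedef struct _mock_handlers {
    size_t offset;                      /* offset of the embedded object in the container */
    void (*free_obj)(mock_object *obj); /* releases what the container owns */
} mock_handlers;

struct _mock_object {
    const mock_handlers *handlers;
};

typedef struct _my {
    int         my_integer;
    mock_object std;                    /* deliberately not the first member */
} MY;

static int my_free_calls = 0;
static mock_handlers my_handlers;

static void my_free_obj(mock_object *obj) {
    (void) obj;                         /* nothing to release in this sketch */
    my_free_calls++;
}

/* Plays the role of MINIT: fill in the handlers once, including the offset. */
static void my_minit(void) {
    my_handlers.offset   = offsetof(MY, std);
    my_handlers.free_obj = my_free_obj;
}

/* Plays the role of create_object: allocate the container, return the member. */
static mock_object *my_create_object(void) {
    MY *object = (MY *) calloc(1, sizeof(*object));
    if (object == NULL) {
        return NULL;
    }
    object->std.handlers = &my_handlers;
    return &object->std;
}

/* The engine only sees mock_object*, so it uses the offset to find the allocation. */
static void *engine_allocation_base(mock_object *obj) {
    assert(obj != NULL && obj->handlers != NULL);
    return (char *) obj - obj->handlers->offset;
}

/* Call the free handler, then release the whole container. */
static void engine_release(mock_object *obj) {
    void *base = engine_allocation_base(obj);
    obj->handlers->free_obj(obj);
    free(base);
}
```

If the offset were left at zero here, engine_release would hand free() a pointer into the middle of the allocation, which is the kind of mistake the offset field exists to prevent.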

Levels of Indirection

While we are C programmers, and so have intricate knowledge of pointers with triple indirection, indirection has an undeniable cognitive overhead. Many levels of indirection make code harder to read and debug, and in my opinion one of the best improvements in PHP7 is dropping the convention that it's okay to work with pointers with many levels of indirection.

This affects everything, from the fact that Z_*_PP macros no longer exist, to the HashTable and other Zend APIs having significant changes.
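
The difference in calling style can be illustrated with a standalone mock. This is a sketch and not the Zend API: mock_value stands in for a zval, the two tiny tables are plain arrays, and mock_find5 and mock_find7 are hypothetical names for a lookup that writes through an out-parameter and a lookup that simply returns a pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MOCK_SUCCESS  0
#define MOCK_FAILURE -1

/* Hypothetical stand-in for a zval. */
typedef struct _mock_value {
    long lval;
} mock_value;

/* 5-style table: each slot stores a pointer to a value. */
typedef struct _mock_entry5 {
    const char *key;
    mock_value *value;
} mock_entry5;

/* 7-style table: each slot stores the value itself. */
typedef struct _mock_entry7 {
    const char *key;
    mock_value  value;
} mock_entry7;

/* 5-style lookup: status code returned, result written through an out-parameter. */
static int mock_find5(mock_entry5 *table, size_t count,
                      const char *key, mock_value ***out) {
    size_t i;
    assert(out != NULL);
    for (i = 0; i < count; i++) {
        if (strcmp(table[i].key, key) == 0) {
            *out = &table[i].value;
            return MOCK_SUCCESS;
        }
    }
    return MOCK_FAILURE;
}

/* 7-style lookup: the result is the return value, NULL when the key is absent. */
static mock_value *mock_find7(mock_entry7 *table, size_t count, const char *key) {
    size_t i;
    for (i = 0; i < count; i++) {
        if (strcmp(table[i].key, key) == 0) {
            return &table[i].value;
        }
    }
    return NULL;
}
```

A caller of mock_find5 has to declare a mock_value **, pass its address, and dereference twice to reach lval. A caller of mock_find7 checks one pointer against NULL and dereferences once.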

HashTable and Strings

Hash tables are a staple of any extension, and Zend. The API has always felt like it was a compromise, and I'm pleased to observe that PHP7 finally has a nice HashTable API.

The first obvious change is that, where API functions used to take a char* and an int to represent a string and its length respectively, PHP7 makes use of the zend_string structure for keys.

The zend_string structure in PHP7 can be refcounted and have hashes pre-calculated, rather cool.
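
What refcounting and a pre-calculated hash buy you can be shown with a simplified standalone model. mock_string is a hypothetical stand-in and does not match the real zend_string layout, the mock_string_* helpers are illustrative names, and the times-33 hash is chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for zend_string; the real layout differs. */
typedef struct _mock_string {
    unsigned int  refcount;
    unsigned long hash;   /* 0 means "not calculated yet" */
    size_t        len;
    char          val[];  /* string bytes follow the header */
} mock_string;

/* Counts how often the hash is actually computed. */
static int mock_hash_computations = 0;

/* Allocate header and bytes together, starting with one reference. */
static mock_string *mock_string_init(const char *str, size_t len) {
    mock_string *s = (mock_string *) malloc(sizeof(*s) + len + 1);
    if (s == NULL) {
        return NULL;
    }
    s->refcount = 1;
    s->hash     = 0;
    s->len      = len;
    memcpy(s->val, str, len);
    s->val[len] = '\0';
    return s;
}

/* "Copying" only bumps the refcount; no bytes are duplicated. */
static mock_string *mock_string_copy(mock_string *s) {
    s->refcount++;
    return s;
}

/* Drop one reference and free the string when the last one goes away. */
static void mock_string_release(mock_string *s) {
    if (--s->refcount == 0) {
        free(s);
    }
}

/* Compute the hash on first use and reuse the cached value afterwards. */
static unsigned long mock_string_hash(mock_string *s) {
    if (s->hash == 0) {
        unsigned long h = 5381;
        size_t i;
        for (i = 0; i < s->len; i++) {
            h = h * 33 + (unsigned char) s->val[i];
        }
        s->hash = h;
        mock_hash_computations++;
    }
    return s->hash;
}
```

A key that is shared between several holders and looked up repeatedly is hashed once and never duplicated, which is the saving a char* and length pair cannot offer.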

Consider PHP5 code that performs the familiar operation of fetching from a HashTable:

PHP_METHOD(My, method) {
    char *str;
    int   str_len;
    zval  **value;
    MY    *object = (MY*) zend_object_store_get_object(getThis() TSRMLS_CC);

    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s", &str, &str_len) != SUCCESS) {
        return;
    }

    /* PHP5 key lengths include the terminating NUL, hence str_len + 1 */
    if (zend_hash_find(object->table, str, str_len + 1, (void**) &value) != SUCCESS) {
        /* handle failure */
        return;
    }
}

In contrast, the PHP7 code is much less stupid:
PHP_METHOD(My, method) {
    zend_string *str;
    zval        *value;
    MY          *object = (MY*) ((char*)Z_OBJ_P(getThis()) - XtOffsetOf(MY, std));

    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "S", &str) != SUCCESS) {
        return;
    }

    value = zend_hash_find(object->table, str);
    if (value == NULL) {
        /* handle failure */
        return;
    }
}

Note that zend_parse_parameters has a new type specifier, "S", for a zend_string*.

Sometimes a zend_string will not be available, and it may be inefficient to create one just for a lookup or some other HashTable operation. For these cases the HashTable API has a set of functions with _str_ in their name, for example zend_hash_str_find, which still accept a char* and a size_t.

Where to start?

I have provided a brief explanation of the main differences affecting extension maintainers in PHP7. Some people might be able to get started with just this information.

I don't know how anyone else learned how to program for Zend, but personally, I read code.

Now is a good time to dig around in some of the headers so you can see in detail what has changed:
