Fig 1. Go PHP "7"
Now that the master branch of PHP (PHP7) is stabilizing, it is time for those of us who maintain extensions to think about beginning the task of upgrading the source code to work with the new APIs and conventions in PHP7.
Without going into too much detail about why PHP7 is the way it is, I'm going to go through the major differences extension maintainers or authors need to be aware of.
The Zend Object
Those of us who don't hate our users provide object-oriented APIs, and this is where PHP7 differs from the outset.
When an extension registers a class entry in the MINIT routine, it sets a handler named create_object.
A typical create_object handler for a 5 series extension looks something like:
zend_object_value my_create_object(zend_class_entry *ce TSRMLS_DC) {
    zend_object_value retval;
    MY *object = (MY*) emalloc(sizeof(*object));
    /* the class entry is passed so the object knows its class */
    zend_object_std_init(&object->std, ce TSRMLS_CC);
    /* ... */
    retval.handlers = &my_handlers;
    /* register the object, with its dtor, free and clone callbacks */
    retval.handle = zend_objects_store_put(
        object,
        my_dtor, my_free, my_clone TSRMLS_CC);
    return retval;
}
We can see that the handler is expected to return a zend_object_value, and is expected to store the object in the object store itself.
A typical create_object handler for the 7 series, by contrast, looks like:
zend_object* my_create_object(zend_class_entry *ce TSRMLS_DC) {
    MY *object = (MY*) emalloc(sizeof(*object));
    /* also places the object in the object store */
    zend_object_std_init(&object->std, ce TSRMLS_CC);
    /* ... */
    object->std.handlers = &my_handlers;
    return &object->std;
}
The subtle differences are because a zval in PHP7 has a zend_object* in the value union, and zend_object_std_init calls zend_objects_store_put.
In addition, the destroy, free and clone handlers are now part of the object handlers struct (my_handlers), rather than set by the call to zend_objects_store_put.
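The idea of carrying the destroy and free callbacks in a per-object handler table can be modeled outside of PHP. The following is a minimal standalone sketch, not Zend code: every name in it (sketch_object, sketch_handlers, sketch_release and so on) is a hypothetical stand-in invented for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration only; these are not the Zend types. */
typedef struct sketch_object sketch_object;

typedef struct sketch_handlers {
    void (*dtor_obj)(sketch_object *obj);  /* user-visible destruction */
    void (*free_obj)(sketch_object *obj);  /* releases the memory */
} sketch_handlers;

struct sketch_object {
    const sketch_handlers *handlers;  /* every object carries its own table */
    int destroyed;
};

int sketch_free_count = 0;

void sketch_dtor(sketch_object *obj) {
    obj->destroyed = 1;
}

void sketch_free(sketch_object *obj) {
    sketch_free_count++;
    free(obj);
}

const sketch_handlers sketch_default_handlers = { sketch_dtor, sketch_free };

/* Creation only attaches the table; no callbacks are handed to a store. */
sketch_object *sketch_create(void) {
    sketch_object *obj = calloc(1, sizeof(*obj));
    assert(obj != NULL);
    obj->handlers = &sketch_default_handlers;
    return obj;
}

/* The "engine" side: tear an object down purely through its handler table. */
void sketch_release(sketch_object *obj) {
    obj->handlers->dtor_obj(obj);
    obj->handlers->free_obj(obj);
}
```

Because the callbacks travel with the object, the code that releases it needs no per-call registration, which is roughly why the create handler shrinks.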
To fetch the allocated object in the PHP5 series, a call to zend_object_store_get_object was required. Since the zend_object* is now part of the value union, this call is eliminated in the PHP7 series.
Objects stored in the object store in the PHP5 series required that the first member of the struct be a zend_object, for example:
typedef struct _my {
    zend_object std;
    int my_integer;
    /* other members here */
} MY;
Objects in PHP7 do not have the same limitation. This means that fetching the object allocated by a create_object handler in PHP7, given a zval, can be performed as follows:
PHP_METHOD(My, method) {
    MY *object;
    if (zend_parse_parameters_none() != SUCCESS) {
        return;
    }
    object = (MY*) ((char*)Z_OBJ_P(getThis()) - XtOffsetOf(MY, std));
}
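The pointer arithmetic in that last line is plain C and can be tried without PHP. Here is a small standalone sketch using the standard offsetof macro; base_object, my_object, my_new and my_from_base are hypothetical names made up for the example, not part of the Zend API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the engine's object header. */
typedef struct base_object {
    int refcount;
} base_object;

/* An extension structure whose header is deliberately NOT the first member. */
typedef struct my_object {
    int my_integer;
    base_object std;
} my_object;

/* Allocate the whole structure; callers normally only see &obj->std. */
my_object *my_new(int value) {
    my_object *obj = calloc(1, sizeof(*obj));
    assert(obj != NULL);
    obj->my_integer = value;
    obj->std.refcount = 1;
    return obj;
}

/* Step back from the embedded header to the start of the enclosing struct. */
my_object *my_from_base(base_object *base) {
    return (my_object *) ((char *) base - offsetof(my_object, std));
}
```

Given only the base_object pointer, my_from_base recovers the enclosing my_object with one subtraction and no lookup.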
Even simpler, if one sticks to the established convention of having zend_object be the first member in the structure, the object allocated by create_object can be accessed as follows:
PHP_METHOD(My, method) {
    MY *object;
    if (zend_parse_parameters_none() != SUCCESS) {
        return;
    }
    object = (MY*) Z_OBJ_P(getThis());
}
Every little helps: the fact that objects can be accessed with pointer arithmetic is super cool, and saves many calls to zend_object_store_get_object for even the simplest extension.
The only other thing to remember is that Zend needs to know the offset of the zend_object in your object's structure. If zend_object is not the first member, the offset should be stored in the object's handlers, in the field named offset. This is typically done during MINIT, when handler structures are first created by the extension.
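To see why the engine wants that offset, consider generic code that has to find the start of an allocation while knowing nothing about the extension's structure. The sketch below is a standalone model under that assumption; hdr_object, hdr_handlers, counter and the functions around them are hypothetical names, and only the field name offset comes from the text above. Released PHP 7 versions appear to store declared property slots directly after the zend_object, which is one reason extensions often place it last and so need a non-zero offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the Zend definitions. */
typedef struct hdr_object hdr_object;

typedef struct hdr_handlers {
    size_t offset;  /* distance from allocation start to the embedded header */
} hdr_handlers;

struct hdr_object {
    const hdr_handlers *handlers;
};

/* Extension structure with the header after its own members. */
typedef struct counter {
    long count;
    hdr_object std;
} counter;

hdr_handlers counter_handlers;

/* Filled in once at start-up, much as an extension would during MINIT. */
void counter_startup(void) {
    counter_handlers.offset = offsetof(counter, std);
}

hdr_object *counter_create(void) {
    counter *c = calloc(1, sizeof(*c));
    assert(c != NULL);
    c->std.handlers = &counter_handlers;
    return &c->std;
}

/* Generic code: knows nothing about `counter`, only the recorded offset. */
void *hdr_allocation_start(hdr_object *obj) {
    return (char *) obj - obj->handlers->offset;
}

/* Freeing must use the allocation start, not the header address. */
void hdr_free(hdr_object *obj) {
    free(hdr_allocation_start(obj));
}
```

Without the recorded offset, hdr_free would hand free() a pointer into the middle of the block.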
Levels of Indirection
While we are C programmers, and so have intricate knowledge of pointers with triple indirection, indirection has an undeniable cognitive overhead. Many levels of indirection make it harder to read and debug code, and in my opinion one of the best improvements in PHP7 is to drop the convention that it's okay to work with pointers with many levels of indirection.
This affects everything, from the fact that Z_*_PP macros no longer exist, to the HashTable and other Zend APIs having significant changes.
HashTable and Strings
Hash tables are a staple of any extension, and Zend. The API has always felt like it was a compromise, and I'm pleased to observe that PHP7 finally has a nice HashTable API.
The first obvious change is that where API functions used to take a char * and an int to represent a string and its length respectively, PHP7 makes use of the zend_string structure for keys.
The zend_string structure in PHP7 can be refcounted and have hashes pre-calculated, rather cool.
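What "refcounted with a pre-calculated hash" buys you can be shown with a toy string type. This is only a sketch of the general technique: rc_string and its functions are hypothetical, the layout is not zend_string's, and the hash function is an arbitrary simple one chosen for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical refcounted string; not the zend_string layout. */
typedef struct rc_string {
    unsigned refcount;
    int hash_valid;       /* has the hash been computed yet? */
    unsigned long hash;   /* cached so repeated lookups skip the loop */
    size_t len;
    char val[];           /* bytes stored inline, NUL terminated */
} rc_string;

int rc_hash_computations = 0;  /* instrumentation for the example only */

rc_string *rc_string_new(const char *s, size_t len) {
    rc_string *str = malloc(sizeof(*str) + len + 1);
    assert(str != NULL);
    str->refcount = 1;
    str->hash_valid = 0;
    str->hash = 0;
    str->len = len;
    memcpy(str->val, s, len);
    str->val[len] = '\0';
    return str;
}

/* Sharing a string is a counter bump, not a copy of the bytes. */
rc_string *rc_string_copy(rc_string *str) {
    str->refcount++;
    return str;
}

void rc_string_release(rc_string *str) {
    if (--str->refcount == 0) {
        free(str);
    }
}

/* Compute the hash on first use, then reuse the cached value. */
unsigned long rc_string_hash(rc_string *str) {
    if (!str->hash_valid) {
        unsigned long h = 5381;
        for (size_t i = 0; i < str->len; i++) {
            h = h * 33 + (unsigned char) str->val[i];
        }
        str->hash = h;
        str->hash_valid = 1;
        rc_hash_computations++;
    }
    return str->hash;
}
```

A key that is passed around and looked up many times is hashed once and never duplicated, which is the appeal for hash table keys.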
If we look at PHP5 code that performs the familiar operation of fetching from a HashTable:
PHP_METHOD(My, method) {
    char *str;
    int str_len;
    zval **value;
    MY *object = (MY*) zend_object_store_get_object(getThis() TSRMLS_CC);
    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s", &str, &str_len) != SUCCESS) {
        return;
    }
    /* the result comes back through an out-parameter */
    if (zend_hash_find(object->table, str, str_len, (void**) &value) != SUCCESS) {
        /* handle failure */
        return;
    }
}
In contrast, the PHP7 code is much less stupid:
PHP_METHOD(My, method) {
    zend_string *str;
    zval *value;
    MY *object = (MY*) ((char*)Z_OBJ_P(getThis()) - XtOffsetOf(MY, std));
    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "S", &str) != SUCCESS) {
        return;
    }
    /* returns the zval directly, or NULL when the key is absent */
    value = zend_hash_find(object->table, str);
    if (value == NULL) {
        /* handle failure */
        return;
    }
}
Note that zend_parse_parameters has a new type specifier, "S", for a zend_string*.
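The change in lookup style between the two listings can be modeled with a toy table. The sketch below is standalone C with hypothetical names (kv_entry, kv_table, kv_find_out, kv_find); it mimics the shape of the two APIs, a status code plus out-parameter versus a pointer-or-NULL return, and nothing about how HashTable is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy table; a linear scan stands in for real hashing. */
typedef struct kv_entry {
    const char *key;
    int value;
} kv_entry;

typedef struct kv_table {
    kv_entry *entries;
    size_t count;
} kv_table;

/* Older style: status code, with the result written through an out-parameter. */
int kv_find_out(kv_table *t, const char *key, int **out) {
    for (size_t i = 0; i < t->count; i++) {
        if (strcmp(t->entries[i].key, key) == 0) {
            *out = &t->entries[i].value;
            return 0;
        }
    }
    return -1;
}

/* Newer style: return the pointer itself, NULL meaning "not found". */
int *kv_find(kv_table *t, const char *key) {
    for (size_t i = 0; i < t->count; i++) {
        if (strcmp(t->entries[i].key, key) == 0) {
            return &t->entries[i].value;
        }
    }
    return NULL;
}
```

The second form removes one level of indirection at every call site, and the caller has one thing to check rather than two.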
Sometimes a zend_string will not be available, and it may be inefficient to create one for a lookup or some other HashTable operation. For these cases the HashTable API has a set of functions with _str_ in their name, for example zend_hash_str_find, which still accept a char* and a size_t.
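The reason for keeping a raw-buffer entry point next to the string-object one can be sketched with the same kind of toy table. All names here (len_string, ls_entry, ls_table, ls_table_str_find, ls_table_find) are hypothetical and the lookup is a linear scan; the point is only that a pointer plus a length is enough to search, so no string object has to be built first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical length-carrying string and table; not Zend structures. */
typedef struct len_string {
    size_t len;
    const char *val;
} len_string;

typedef struct ls_entry {
    len_string key;
    int value;
} ls_entry;

typedef struct ls_table {
    ls_entry *entries;
    size_t count;
} ls_table;

/* Raw-buffer variant: any bytes plus a length, no terminator required. */
int *ls_table_str_find(ls_table *t, const char *key, size_t len) {
    for (size_t i = 0; i < t->count; i++) {
        const len_string *k = &t->entries[i].key;
        if (k->len == len && memcmp(k->val, key, len) == 0) {
            return &t->entries[i].value;
        }
    }
    return NULL;
}

/* String-object variant: for callers that already hold a len_string. */
int *ls_table_find(ls_table *t, const len_string *key) {
    return ls_table_str_find(t, key->val, key->len);
}
```

With the raw variant a caller can look up a slice of a larger buffer, for example the first three bytes of "barbaz", without allocating anything.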
Where to start?
I have provided a brief explanation of the main differences affecting extension maintainers in PHP7; some people might be able to get started with just this information.
I don't know how anyone else learned how to program for Zend, but personally, I read code.
Now is a good time to dig around in some of the headers so you can see in detail what has changed: