How to Learn About PHP Internals

Where can I learn about PHP internals?

The PHP Manual has a (sadly mostly empty) chapter on PHP internals.

The main development mailing list is internals@lists.php.net. You can sign up via php.net and/or use Markmail to search the archives.

The git repository for PHP is located on git.php.net, but there is also a mirror on GitHub.

For browsing the source code you should use the lxr.php.net cross reference tool.

The PHP wiki has a list of various resources on PHP development (blog posts, books, slides, etc).

In particular there is an (older) book by Sara Golemon: Extending and Embedding PHP.

A more current and ongoing effort is http://www.phpinternalsbook.com

If you have questions, you should try the #php.pecl room on efnet.


Also see this presentation by Sebastian Bergmann about Compiler Internals:

  • http://www.scribd.com/doc/18171982/PHP-Compiler-Internals

And make sure to check Nikic's blog. He's got a number of posts on how to read the source:

  • http://nikic.github.com/

In addition to that, you can check the PHP Credits for individual contributors:

  • http://www.php.net/credits.php

A number of them run their own blogs which might contain more information.

PHP Internals: How does TSRMLS_FETCH Work?

Firstly, I would not pay too much attention to what the manual says on PHP's internals. It is very outdated, and there's a good chance it is going to be removed from the manual in the near future. There are two websites currently dedicated to PHP's internals: PHPInternalsBook.com and PHPInternals.net (I author content for the latter). There are also a couple of good blogs to follow, including Nikita's and Julien's.

The TSRM in PHP's 5.x series was quite invasive. To access any Zend globals from within a function, you had to choose between fetching the TLS memory pointer from a function call (such as pthread_getspecific, which was relatively expensive) and propagating the TLS memory pointer through function parameters (a messy and error-prone affair, but the faster way). The TSRMLS_FETCH macro you mentioned was used for the former approach.
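To make the trade-off concrete, here is a minimal standalone sketch of the two styles. It is not PHP's actual code: every demo_* name is hypothetical, and demo_fetch_ls() is only a stand-in for an expensive lookup such as pthread_getspecific, instrumented with a counter so the number of lookups can be compared.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-thread globals, standing in for Zend's real globals. */
typedef struct {
    long error_count;
} demo_globals;

static _Thread_local demo_globals demo_thread_globals;
static _Thread_local unsigned long demo_lookup_calls; /* counts simulated lookups */

/* Stand-in for the expensive lookup (pthread_getspecific and similar). */
static void *demo_fetch_ls(void)
{
    demo_lookup_calls++;
    return &demo_thread_globals;
}

/* Style 1: fetch the pointer inside the function, as TSRMLS_FETCH did. */
static void demo_bump_fetching(void)
{
    demo_globals *g = demo_fetch_ls();
    g->error_count++;
}

/* Style 2: the caller hands the pointer down as a parameter. */
static void demo_bump_propagating(void *tsrm_ls)
{
    assert(tsrm_ls != NULL);
    ((demo_globals *)tsrm_ls)->error_count++;
}

/* Returns how many lookups n calls cost in the fetching style. */
static unsigned long demo_run_fetching(int n)
{
    demo_lookup_calls = 0;
    for (int i = 0; i < n; i++) {
        demo_bump_fetching();
    }
    return demo_lookup_calls;
}

/* Returns how many lookups n calls cost in the propagating style. */
static unsigned long demo_run_propagating(int n)
{
    demo_lookup_calls = 0;
    void *tsrm_ls = demo_fetch_ls(); /* one lookup, then passed along */
    for (int i = 0; i < n; i++) {
        demo_bump_propagating(tsrm_ls);
    }
    return demo_lookup_calls;
}
```

In this model the fetching style pays one lookup per call, while the propagating style pays one lookup in total but forces an extra parameter onto every function signature along the way.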

In PHP 7.x, propagating the TLS memory pointer (via the TSRMLS_D, TSRMLS_DC, TSRMLS_C and TSRMLS_CC macros) has been removed completely (though the macros themselves are still defined for backwards compatibility - they just won't do anything). The preferred way to access the TSRM's TLS now is via its static cache. This is basically just a thread-local global variable used to hold the current TLS memory pointer.

Here are the relevant macros:

#define TSRMLS_CACHE _tsrm_ls_cache // the TLS global variable
#define TSRMLS_CACHE_DEFINE() TSRM_TLS void *TSRMLS_CACHE = NULL; // define it
#define TSRMLS_CACHE_UPDATE() TSRMLS_CACHE = tsrm_get_ls_cache() // update it - i.e. calls pthread_getspecific()
#define TSRMLS_CACHE_RESET() TSRMLS_CACHE = NULL // reset it

Using the above macros does require special care to update the static cache appropriately (usually during the GINIT, and sometimes RINIT, phases of an extension). However, it is a cleaner way to provide access to the TLS memory pointer without the mess of propagating it via function parameters or the performance hit of always fetching it (via pthread_getspecific and similar).
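The static cache idea can be sketched outside of PHP as follows. This is a simplified model under stated assumptions, not the TSRM implementation: all cache_demo_* and CACHE_DEMO_* names are hypothetical, cache_demo_get_ls() stands in for tsrm_get_ls_cache(), and cache_demo_ginit() only plays the role that a GINIT handler would.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-thread globals block. */
typedef struct {
    long request_count;
} cache_demo_globals;

static _Thread_local cache_demo_globals cache_demo_storage;
static _Thread_local unsigned long cache_demo_lookups; /* counts simulated lookups */

/* Stand-in for tsrm_get_ls_cache(), i.e. the expensive lookup. */
static void *cache_demo_get_ls(void)
{
    cache_demo_lookups++;
    return &cache_demo_storage;
}

/* Rough analogues of TSRMLS_CACHE_DEFINE, _UPDATE and _RESET. */
static _Thread_local void *cache_demo_ls = NULL;
#define CACHE_DEMO_UPDATE() (cache_demo_ls = cache_demo_get_ls())
#define CACHE_DEMO_RESET()  (cache_demo_ls = NULL)

/* Accessor that reads a field through the cached pointer. */
#define CACHE_DEMO_G(field) (((cache_demo_globals *)cache_demo_ls)->field)

/* Plays the role of GINIT: refresh the cache once for this thread. */
static void cache_demo_ginit(void)
{
    CACHE_DEMO_UPDATE();
    CACHE_DEMO_G(request_count) = 0;
}

/* Any later function reaches the globals without a lookup or extra parameter. */
static void cache_demo_handle_request(void)
{
    assert(cache_demo_ls != NULL); /* the cache must have been updated first */
    CACHE_DEMO_G(request_count)++;
}
```

The assert in cache_demo_handle_request() mirrors the "special care" mentioned above: if the cache is never updated on a given thread, every access through it dereferences a NULL pointer.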

Extra reading:

  • Native TLS (the RFC that introduced this change in PHP 7.0)
  • Threads and PHP (Julien's blog post on the TSRM)

Good in-depth technical documentation on PHP

I would start with the internals section of the PHP Manual.

There's also an internals page in the PHP Wiki.

PHP Internals: Difference Between INI Macros

(the following is my best guess on how all this works -- corrections welcomed in the comments)

The STD_PHP_INI_ENTRY and STD_PHP_INI_ENTRY_EX macros allow an end-user-programmer to create ini settings whose values are saved in memory (and, presumably, can be set and fetched via ini_set/ini_get). The PHP_INI_ENTRY and PHP_INI_ENTRY_EX macros allow an end-user-programmer to create ini settings that trigger a callback function once, and then take some action in their own program/extension (i.e. setting some global-ish state in their program not related to PHP's ini system).

The _EX versions of the macros take an extra parameter -- a callback that PHP will use to display the ini value in places like phpinfo. For example, the ldap.max_links ini definition in the ldap extension passes a display_link_numbers callback, which the extension's source then defines.

The STD_ macros are designed to work with a specific sort of state struct. Using the above ldap.max_links example again, the three key parameters are max_links, zend_ldap_globals, and ldap_globals.

STD_PHP_INI_ENTRY_EX("ldap.max_links", "-1", PHP_INI_SYSTEM, OnUpdateLong,
                     max_links, zend_ldap_globals, ldap_globals,
                     display_link_numbers)

Above, the zend_ldap_globals parameter is the name of a struct definition, set up with the ZEND_BEGIN_MODULE_GLOBALS and ZEND_END_MODULE_GLOBALS macros. The ldap extension's source contains the macro calls that create the zend_ldap_globals definition. The max_links parameter is a field on this same struct.

Finally, the ldap_globals parameter names an instance of that struct, which the extension declares with ZEND_DECLARE_MODULE_GLOBALS and initializes in its PHP_GINIT_FUNCTION. This gives the programmer a module-wide global (one copy per thread in thread-safe builds) and (I think) is the memory where PHP will store the ini's value. You can see this global being set up in the ldap extension's source.

When you've set up a struct to hold your ini's state like the above, you can then use a set of predefined PHP callbacks (OnUpdateLong above) to have these values automatically set when the PHP user sets a value via php.ini (or one of the various other places a PHP ini value can be set, depending on which PHP_INI constant you've passed to your STD_ macro (PHP_INI_SYSTEM above)).
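As a rough illustration of why the STD_ macros need a field name, a struct type, and an instance, here is a simplified standalone model. It assumes the entry records the field's offset within the struct plus a pointer to the instance, and that the update callback writes to base + offset. All ini_demo_* and INI_DEMO_* names are hypothetical, and none of this is PHP's real API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical module globals, analogous to zend_ldap_globals. */
typedef struct {
    long num_links;
    long max_links;
} ini_demo_ldap_globals_t;

/* The instance, analogous to ldap_globals. */
static ini_demo_ldap_globals_t ini_demo_ldap_globals;

/* Update callback type: receives the new value and where to store it. */
typedef int (*ini_demo_on_modify)(const char *new_value, size_t field_offset, void *base);

typedef struct {
    const char *name;
    const char *default_value;
    ini_demo_on_modify on_modify;
    size_t field_offset; /* where the field lives inside the struct */
    void *base;          /* which struct instance to write into */
} ini_demo_entry;

/* Toy version of a STD_ entry: turns (field, type, instance) into offset + base. */
#define INI_DEMO_STD_ENTRY(name, def, on_modify, field, struct_type, instance) \
    { name, def, on_modify, offsetof(struct_type, field), &instance }

/* Toy counterpart of OnUpdateLong: parse the string and store it in the field. */
static int ini_demo_on_update_long(const char *new_value, size_t field_offset, void *base)
{
    assert(base != NULL);
    long *target = (long *)((char *)base + field_offset);
    *target = strtol(new_value, NULL, 10);
    return 0;
}

static ini_demo_entry ini_demo_entries[] = {
    INI_DEMO_STD_ENTRY("ldap.max_links", "-1", ini_demo_on_update_long,
                       max_links, ini_demo_ldap_globals_t, ini_demo_ldap_globals),
};

/* Applies every entry's default value, as registration would. */
static void ini_demo_register(void)
{
    size_t count = sizeof(ini_demo_entries) / sizeof(ini_demo_entries[0]);
    for (size_t i = 0; i < count; i++) {
        ini_demo_entry *e = &ini_demo_entries[i];
        e->on_modify(e->default_value, e->field_offset, e->base);
    }
}

/* Simulates a later change to the setting, e.g. from php.ini. */
static int ini_demo_set(ini_demo_entry *entry, const char *value)
{
    return entry->on_modify(value, entry->field_offset, entry->base);
}
```

In this model, an entry without the STD_ parameters would carry only the callback, leaving it up to the extension author to decide where (or whether) to store the value.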


