Understanding PHP's internal function definitions (PHP's Source Code for PHP Developers - Part 2)

Note: This article was originally published at Planet PHP on 16 March 2012.

Welcome to the second part of the "PHP's Source Code for PHP Developers" series.

In the previous part, ircmaxell explained where to find the PHP source code and how it is structured, and gave a short introduction to C (the language PHP is written in). If you missed that post, you should probably read it before starting this one.

What we'll cover in this article is locating the definitions of internal functions in the PHP codebase, as well as understanding them.

How to find function definitions

For a start, let's try to find out how the strpos function is defined.

The first thing to try is to go to the PHP 5.4 source code root and type strpos into the search box at the top of the page. The result will be a huge listing of strpos occurrences in the PHP source code.

As this doesn't really help us much, we use a little trick: instead of searching for just strpos, we search for "PHP_FUNCTION strpos" (don't forget the quotes, they are important).

Now we are left with only two entries:

/PHP_5_4/ext/standard/
    php_string.h    48      PHP_FUNCTION(strpos);
    string.c        1789    PHP_FUNCTION(strpos)

The first thing to notice is that both occurrences are in the ext/standard folder. This is exactly where one would expect to find them, as the strpos function (together with pretty much all other string, array and file functions) is part of the standard extension.

Now open both links in new tabs and see what code hides behind them.

You'll find that the first link leads you to the php_string.h file, which is full of code looking like this:

    // ...
    PHP_FUNCTION(strpos);
    PHP_FUNCTION(stripos);
    PHP_FUNCTION(strrpos);
    PHP_FUNCTION(strripos);
    PHP_FUNCTION(strrchr);
    PHP_FUNCTION(substr);
    // ...

This is exactly what a typical header file (a file ending in .h) looks like: a plain list of functions that are defined elsewhere. We aren't really interested in this, as we already know what we're looking for.

The second link is much more interesting: It leads to the string.c file, which contains the actual source code of the function.
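The split between the two files can be sketched in plain C. This is only an illustration: DEMO_FUNCTION and demo_strpos are made-up names, and the macro is a simplified stand-in rather than PHP's real PHP_FUNCTION macro. The point is that the same macro line followed by a semicolon is a declaration (header), while the same line followed by a body is the definition (source file).

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a macro like PHP_FUNCTION: it turns a short
 * name into a full C function signature. This is NOT PHP's real macro. */
#define DEMO_FUNCTION(name) \
    long demo_##name(const char *haystack, const char *needle)

/* What a header (.h) would hold: a declaration only, ending in a semicolon. */
DEMO_FUNCTION(strpos);

/* What the source (.c) file would hold: the same macro, followed by a body. */
DEMO_FUNCTION(strpos)
{
    const char *found;

    assert(haystack != NULL && needle != NULL);
    found = strstr(haystack, needle);

    /* Offset of the first match, or -1 if the needle does not occur. */
    return found ? (long)(found - haystack) : -1;
}
```

Searching such a codebase for the macro name plus the function name finds exactly these two places, which is why the quoted search above narrows the results so well.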

Before I walk you through the code step by step, I'd recommend that you try to understand the function by yourself. It's a really simple function and most things should be clear even if you don't know the exact details.

The skeleton of a PHP function

All PHP functions share the same basic structure. At the top there are a few variable declarations, then there is a zend_parse_parameters call, then comes the main logic, with RETURN_*** and php_error_docref calls intermixed.
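To make the three phases concrete, here is a rough plain-C analogue of that shape. It does not use the PHP API at all: demo_strpos_call, DEMO_SUCCESS and DEMO_FAILURE are hypothetical names, the argument array stands in for what zend_parse_parameters would extract, and fprintf stands in for php_error_docref. The offset check is an assumption made for the sketch, not a quote from string.c.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_SUCCESS 0
#define DEMO_FAILURE -1

/* Hypothetical analogue of an internal function: arguments arrive as
 * strings, the result is written to *position (-1 means "not found"). */
int demo_strpos_call(int argc, const char **argv, long *position)
{
    /* 1. Variable declarations at the top. */
    const char *haystack;
    const char *needle;
    const char *found = NULL;
    long offset = 0;
    size_t haystack_len;

    /* 2. Parameter parsing: two required arguments, one optional. */
    if (argc < 2 || argc > 3) {
        fprintf(stderr, "expected 2 or 3 parameters, %d given\n", argc);
        return DEMO_FAILURE;
    }
    haystack = argv[0];
    needle = argv[1];
    if (argc == 3) {
        offset = strtol(argv[2], NULL, 10);
    }
    haystack_len = strlen(haystack);

    /* 3. Main logic, with error reports and returns intermixed. */
    if (offset < 0 || (size_t)offset > haystack_len) {
        fprintf(stderr, "offset %ld is outside the string\n", offset);
        return DEMO_FAILURE;
    }
    found = strstr(haystack + offset, needle);
    *position = found ? (long)(found - haystack) : -1;
    return DEMO_SUCCESS;
}
```

A failed parse returns early before any real work is done, while "not found" is an ordinary result rather than an error.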

So, let's start with the variable declarations:

    zval *needle;
    char *haystack;
    char *found = NULL;
    char needle_char[2];
    long offset = 0;
    int haystack_len;

The first line declares needle as a pointer to a zval. A zval is PHP's internal representation of an arbitrary PHP value. What exactly it looks like will be the subject of the next post.

The second line declares haystack as a pointer to a character. At this point you'll have to remember that in C, arrays are usually passed around as pointers to their first element. That is, haystack will point to the first character of the $haystack string you passed in. Then haystack + 1 will point to the second character, haystack + 2 to the third, and so on. So one could read the whole string by incrementing the pointer by one each time.
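A minimal sketch of this pointer arithmetic, using two hypothetical helpers (demo_skip and demo_char_at are not part of PHP):

```c
#include <stddef.h>

/* Hypothetical helper: a pointer to the character n positions after the
 * first one, i.e. haystack + n. */
const char *demo_skip(const char *haystack, size_t n)
{
    return haystack + n;
}

/* Hypothetical helper: the character at zero-based position i.
 * *(haystack + i) means exactly the same as haystack[i]. */
char demo_char_at(const char *haystack, size_t i)
{
    return *(haystack + i);
}
```

For the string "abc", demo_char_at(s, 0) yields 'a' and demo_char_at(s, 2) yields 'c'. Neither helper checks bounds, so the caller has to know how long the string is.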

The problem arising here is that PHP has to know when the string ends. Otherwise it would always keep incrementing the pointer without ever stopping. In order to deal with this, PHP also stores an explicit length, here in the haystack_len variable.
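The difference an explicit length makes can be sketched as follows. demo_find_byte is a hypothetical helper, not PHP code. It searches exactly haystack_len bytes and so does not depend on a terminating NUL byte, which matters because a PHP string may itself contain NUL bytes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: offset of the first occurrence of byte c within the
 * first haystack_len bytes of haystack, or -1 if it does not occur there. */
long demo_find_byte(const char *haystack, size_t haystack_len, char c)
{
    /* memchr stops after haystack_len bytes, not at the first NUL byte. */
    const char *found = memchr(haystack, c, haystack_len);

    return found ? (long)(found - haystack) : -1;
}
```

With the five-byte buffer "ab\0cd", strlen reports a length of 2 because it stops at the embedded NUL, whereas demo_find_byte with an explicit length of 5 still finds 'd' at offset 4.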

The last declaration of interest to us at this point is the offset variable, which will be used to store the third parameter of the function: the offset to start searching at. It is declared as a

Truncated by Planet PHP, read more at the original (another 9803 bytes)