Why do I have to use the reference operator (&) in a function call?
Setup
I am borrowing a function from an open source CMS that I frequently use for a custom project.
Its purpose is not important to this question, but if you want to know, it's a simple static cache designed to reduce database queries. I can call getObject 10 times in one page load and not have to worry about hitting the database 10 times.
Code
A simplified version of the function looks like this:
function &staticStorage($name, $default_value = NULL)
{
    // Persists across calls for the lifetime of the request.
    static $data = array();
    if (isset($data[$name]))
    {
        return $data[$name];
    }
    $data[$name] = $default_value;
    return $data[$name];
}
This function would be called in something like this:
function getObject($object_id)
{
    $object = &staticStorage('object_' . $object_id);
    if ($object)
    {
        return $object;
    }
    // This query isn't accurate but that's ok it's not important to the question.
    $object = databaseQuery('SELECT * FROM Objects WHERE id = @object_id',
        array('@object_id' => $object_id));
    return $object;
}
The idea is that once I call staticStorage, the returned value will update the static storage as it is changed.
The problem
My interest is in the line $object = &staticStorage('object_' . $object_id); and specifically the & in front of the function call. The staticStorage function returns a reference already, so I initially did not include the reference operator preceding the function call. However, without the reference operator preceding the function call, it does not work correctly.
My understanding of pointers is that if I return a pointer, PHP will automatically cast the variable as a pointer. $a = &$b will cause $a to point to the value of $b.
The question
Why? If the function returns a reference why do I have to use the reference operator to precede the function call?
From the PHP docs
Note: Unlike parameter passing, here you have to use & in both places - to indicate that you want to return by reference, not a copy, and to indicate that reference binding, rather than usual assignment, should be done for $myValue.
http://php.net/manual/en/language.references.return.php
Basically, it's to help the PHP interpreter. The first use, in the function definition, is to return the reference; the second is to bind the assignment by reference instead of by value.
By putting the & in the function declaration, the function will return a memory address of the return value. The assignment, on receiving this memory address, would interpret the value as an int unless explicitly told otherwise; this is why the second & is needed for the assignment operator.
EDIT: As pointed out by @ringø below, it does not return a memory address, but rather an object that will be treated like a copy (technically copy-on-write).
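As a rough illustration of the doc note quoted above, here is a minimal sketch showing the difference the call-site & makes. The counter() function and the variable names are hypothetical and exist only for this example:

```php
<?php
// Hypothetical helper: returns its static variable by reference.
function &counter()
{
    static $count = 0;
    return $count;
}

// Plain assignment: $copy receives the value, not a binding to $count.
$copy = counter();
$copy++;              // changes only $copy

// Reference binding: $alias is bound to the static $count itself.
$alias = &counter();
$alias += 10;         // changes the static $count

// Read the static again to see which of the two writes reached it.
$check = counter();
```

With both & in place, only the write through $alias should be visible on the next call; the write through $copy stays local to $copy.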
The PHP doc explains how to use functions that return references, and why.
In your code, the getObject() function also needs a & (and so does the call to it); otherwise the reference is lost and the data, while usable, falls back to PHP copy-on-write: the returned data and the source data both point to the same actual data until one of them changes, at which point there are two blocks of data with distinct lives.
This wouldn't work (syntax error):
    $a = array(1, 2, 3);
    return &$a;
This doesn't work as intended (no reference returned):
    $a = array(1, 2, 3);
    $ref = &$a;
    return $ref;
And, as you said, without adding the & to the function call, no reference is returned either.
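To make the point about getObject() concrete, here is a sketch of the question's code with the & added at every level. It is an assumption-laden illustration, not the CMS's actual code: loadObject() is a hypothetical stand-in for the database query, and the array it returns is made up.

```php
<?php
function &staticStorage($name, $default_value = null)
{
    static $data = array();
    if (!isset($data[$name])) {
        $data[$name] = $default_value;
    }
    return $data[$name];
}

// Hypothetical stand-in for the database query in the question.
function loadObject($object_id)
{
    return array('id' => $object_id, 'title' => 'Object ' . $object_id);
}

// Note the & here as well: getObject() must itself return by reference.
function &getObject($object_id)
{
    $object = &staticStorage('object_' . $object_id);
    if (!$object) {
        $object = loadObject($object_id); // written through to the static cache
    }
    return $object;
}

// Bound by reference: changes reach the cached entry.
$first = &getObject(7);
$first['title'] = 'Renamed';

// Plain assignment: a copy-on-write value, detached as soon as it changes.
$copy = getObject(7);
$copy['title'] = 'Local only';

$second = getObject(7);
```

Here $second should see the change made through $first but not the one made through $copy, which is the copy-on-write behaviour described above.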
To the question as to why... There doesn't seem to be a consistent answer.
When one & is missing, PHP treats the data as if it isn't a reference (like returning an array, for instance), with no warning whatsoever.
PHP has evolved over the years but still inherits some of its initial poor design choices. This seems to be one of them: the syntax is error-prone, since one may easily miss a & and get no warning, and why not return a reference directly, as in return &$var;? PHP has made some progress, but traces of poor design persist.
You may also be interested in this note from the doc page linked above:
Do not use return-by-reference to increase performance. The engine will automatically optimize this on its own. Only return references when you have a valid technical reason to do so.
Finally, it's better not to look too hard for equivalences between pointers in C and PHP references (Perl is closer than PHP in this regard). PHP adds a layer between variables and the actual pointer to data, and references point to that layer rather than to the actual data. A reference is not a pointer. If $a is an array and $b is a reference to $a, using either $a or $b to access the array is equivalent. There is no dereference syntax, such as *$b in C. $b should be seen as an alias of $a. This is also the reason a function can only return a reference to a variable.
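The alias view can be seen in a few lines. This is only a small sketch of the behaviour described above, using the same $a and $b names:

```php
<?php
$a = array(1, 2, 3);
$b = &$a;       // $a and $b are now two names for the same array

$b[] = 4;       // append through $b
$a[0] = 'x';    // overwrite through $a; no dereference syntax needed

// unset() removes only the name $b; the array stays reachable through $a.
unset($b);
```

After this runs, $a holds every change made through either name, and removing $b does not affect the data, which is what you would expect from an alias rather than a C-style pointer.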