Month: March 2008

Development

Problems arising from PHP type casting in ==

While trying to work through the issues I mentioned in the last post I started doing some serious digging and testing. Here is what I have found.

PHP seems to use == to determine equivalence in array_search, in_array and switch, while array_key_exists appears to use either === or strcmp.

The result of this is that array_search and in_array will return improper results when used on an array with mixed string and integer values. (Another thing I found, that may or may not be related, is that array keys will be cast from strings to integers when those strings are valid integer values.)

array_search() with mixed types

$one = array (
  'abc',
  'abc1',
  111,
  '111b',
  2,
  '2xyz',
  '123a',
  123
);
$two = $one;
for ($i = 0; $i < count($one); $i++) {
  $xkey = array_search($one[$i], $two);
  if(strcmp(strval($one[$i]), strval($two[$xkey])) != 0) {
    // This should NEVER be reached, but it is, often!
    $eq = 'FALSE';
  } else {
    $eq = 'true';
  }
}

Row  $one  $two  Correct?  Found  Notes
0    abc   abc   true      0      abc == abc : array_search($one[0], $two) where $one[0] = string(3) "abc"
1    abc1  abc1  true      1      abc1 == abc1 : array_search($one[1], $two) where $one[1] = string(4) "abc1"
2    111   111   true      2      111 == 111 : array_search($one[2], $two) where $one[2] = int(111)
3    111b  111b  FALSE     2      111b == 111 : array_search($one[3], $two) where $one[3] = string(4) "111b"
4    2     2     true      4      2 == 2 : array_search($one[4], $two) where $one[4] = int(2)
5    2xyz  2xyz  FALSE     4      2xyz == 2 : array_search($one[5], $two) where $one[5] = string(4) "2xyz"
6    123a  123a  true      6      123a == 123a : array_search($one[6], $two) where $one[6] = string(4) "123a"
7    123   123   FALSE     6      123 == 123a : array_search($one[7], $two) where $one[7] = int(123)
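
For comparison, array_search() accepts an optional third argument, strict, which makes it compare with === instead of ==. The sketch below reuses the mixed array from the test above and suggests that every value then finds its own key. Note that strict mode also stops int 123 from matching the string '123', which may or may not be what you want.

```php
<?php
// Same mixed string/integer array as in the test above.
$one = array('abc', 'abc1', 111, '111b', 2, '2xyz', '123a', 123);

$found = array();
foreach ($one as $i => $val) {
    // Third argument true: compare with === (type and value), no casting.
    $found[$i] = array_search($val, $one, true);
}

// Each value is now found at its own index.
var_dump($found === array_keys($one));
```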

array_search() with all strings

$one = array (
  'abc',
  'abc1',
  '111',
  '111b',
  '2',
  '2xyz',
  '123a',
  '123'
);
$two = $one;
for ($i = 0; $i < count($one); $i++) {
  $xkey = array_search($one[$i], $two);
  if(strcmp(strval($one[$i]), strval($two[$xkey])) != 0) {
    // This should NEVER be reached, and with all strings it isn't.
    $eq = 'FALSE';
  } else {
    $eq = 'true';
  }
}

Row  $one  $two  Correct?  Found  Notes
0    abc   abc   true      0      abc == abc : array_search($one[0], $two) where $one[0] = string(3) "abc"
1    abc1  abc1  true      1      abc1 == abc1 : array_search($one[1], $two) where $one[1] = string(4) "abc1"
2    111   111   true      2      111 == 111 : array_search($one[2], $two) where $one[2] = string(3) "111"
3    111b  111b  true      3      111b == 111b : array_search($one[3], $two) where $one[3] = string(4) "111b"
4    2     2     true      4      2 == 2 : array_search($one[4], $two) where $one[4] = string(1) "2"
5    2xyz  2xyz  true      5      2xyz == 2xyz : array_search($one[5], $two) where $one[5] = string(4) "2xyz"
6    123a  123a  true      6      123a == 123a : array_search($one[6], $two) where $one[6] = string(4) "123a"
7    123   123   true      7      123 == 123 : array_search($one[7], $two) where $one[7] = string(3) "123"

in_array() and array_key_exists()

$array = array('111'=>'111', '11b'=>'11b', '222b'=>'222b','2x22'=>'2x22');
$keys = array_keys($array);
$searches = array('111b',222,11,'222b',2);
foreach ($searches as $search) {
  $ia = (in_array($search, $array))?'true':'false';
  $ake = (array_key_exists($search, $array))?'true':'false';
  if ($search === '222b') {
    // This is the only place where either should return true
    $iaf = ($ia == 'true')?" class=\"$true\"":" class=\"$false\"";
    $akef = ($ake == 'true')?" class=\"$true\"":" class=\"$false\"";
    $notes = "** Both should be true **";
  } else {
    $iaf = ($ia == 'false')?" class=\"$true\"":" class=\"$false\"";
    $akef = ($ake == 'false')?" class=\"$true\"":" class=\"$false\"";
    $notes = "Both should be false";
  }
}

Notice how the integer-like key '111' is cast to type int, both in the original array and in the result of array_keys. The other keys stay strings.

$array:
array(4) {
  [111]=>
  string(3) "111"
  ["11b"]=>
  string(3) "11b"
  ["222b"]=>
  string(4) "222b"
  ["2x22"]=>
  string(4) "2x22"
}

$keys:
array(4) {
  [0]=>
  int(111)
  [1]=>
  string(3) "11b"
  [2]=>
  string(4) "222b"
  [3]=>
  string(4) "2x22"
}

Search Item  in_array  array_key_exists  Notes
111b         false     false             Both should be false
222          true      false             Both should be false
11           true      false             Both should be false
222b         true      true              ** Both should be true **
2            true      false             Both should be false

So, it appears that array_key_exists() uses either === or strcmp(), while in_array() uses ==.

NOTE: Calling array_key_exists() with the string value ‘111’ returns true for an item whose key is int 111. This is not the behavior of ===, but it is what strcmp() on the string forms would give. The key casting noted earlier is a more likely explanation than strcmp() being used internally: an integer-like string key is converted to an int both when the array is built and when a key is looked up.
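
The key handling and the strict form of in_array() can be checked with a short script. This is only a sketch built on the test array from above; the variable names are made up for illustration. The third argument of in_array() switches the comparison from == to ===.

```php
<?php
$array = array('111' => '111', '11b' => '11b', '222b' => '222b', '2x22' => '2x22');

// Integer-like string keys are stored as ints, so both lookups hit the same key.
$byString = array_key_exists('111', $array); // '111' is normalized to int 111
$byInt    = array_key_exists(111, $array);

// Keys that are not plain decimal integers stay strings and must match exactly.
$partial  = array_key_exists(222, $array);   // '222b' is a different key

// in_array() with the third argument true compares with ===.
$strictHits = array();
foreach (array('111b', 222, 11, '222b', 2) as $search) {
    $strictHits[] = in_array($search, $array, true);
}

var_dump($byString, $byInt, $partial, $strictHits);
```

With the strict flag, only the search for '222b' should report a match, which is the result the table above was expecting.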

The difference between the 3 operations is clear:

$a        $b                $a == $b    $a === $b    strcmp(strval($a),strval($b))
int(123)  string(4) "123b"  bool(true)  bool(false)  int(-1)
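
The row above can be reproduced with a few lines. This is a sketch only, and compare_three() is a hypothetical helper, not a PHP built-in. The == result is the one that depends on PHP's casting rules: PHP 8.0 changed how numbers are compared with non-numeric strings, so on newer versions that entry may read false, while === and strcmp() behave the same either way.

```php
<?php
// compare_three() is a hypothetical helper for illustration only.
function compare_three($a, $b)
{
    return array(
        'loose'  => ($a == $b),                     // == : may cast before comparing
        'strict' => ($a === $b),                    // === : type and value must match
        'strcmp' => strcmp(strval($a), strval($b)), // < 0, 0 or > 0 on the string forms
    );
}

$result = compare_three(123, '123b');
var_dump($result);
```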

Another area where this becomes an issue is in switch statements. Take the following, for example:

switch()

$array = array(111,'222b');
foreach($array as $val)
{
  $row = ($row == 'row')?'offset_row':'row';
  $false = ($false == 'false')?'offset_false':'false';
  $true = ($true == 'true')?'offset_true':'true';
  switch($val)
  {
    case '111b': // this displays
      $match = '111b';
      $f = " class=\"$false\"";
      $notes = "Incorrect: should have fallen through to next case";
      break;
    case 111: // never makes it here even tho this is correct
      $match = 111;
      $f = " class=\"$true\"";
      $notes = "** Correct **";
      break;
    case 222: // this displays
      $match = 222;
      $f = " class=\"$false\"";
      $notes = "Incorrect: should have fallen through to next case";
      break;
    case '222b': // never makes it here even tho this is correct
      $match = '222b';
      $f = " class=\"$true\"";
      $notes = "** Correct **";
      break;
    default:
      $match = 'no match';
      $f = " class=\"$false\"";
      $notes = "Incorrect: should have matched";
      break;
  }
}

Search Item  Match  Notes
111          111b   Incorrect: should have fallen through to next case
222b         222    Incorrect: should have fallen through to next case
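
One way to sidestep the loose comparison, assuming an exact type-and-value match is what is wanted, is to switch on true and spell out each case with ===. The sketch below shows the idea; match_case() is a hypothetical function name used only for this example.

```php
<?php
// match_case() is a hypothetical helper; it returns a label for the case that matched.
function match_case($val)
{
    // switch (true) runs the first case whose expression evaluates to true,
    // so each case can do its own strict === comparison.
    switch (true) {
        case $val === '111b':
            return "string '111b'";
        case $val === 111:
            return 'int 111';
        case $val === 222:
            return 'int 222';
        case $val === '222b':
            return "string '222b'";
        default:
            return 'no match';
    }
}

foreach (array(111, '222b') as $val) {
    echo $val, ' matched ', match_case($val), "\n";
}
```

The cost is a more verbose switch, but the order of the cases no longer decides which value wins.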

PHP array_search implicit cast of search term

array_search returns the wrong key when it searches an array that mixes numeric values (123) with alphanumeric strings that begin with the same digits and continue with letters (‘123a’).

The results are kind of bizarre, but they can be explained by a bug in PHP’s equivalence test. When testing for equivalence with ==, PHP determines that 123 == ‘123xyz’, because it casts the string to an integer for the comparison (so ‘123xyz’ becomes 123). This is documented on bugs.php.net (http://bugs.php.net/bug.php?id=23110), but it leads to problems, because both switch and array_search use == for comparison.

So, using:

$one = array (
  'abc',
  'abc1',
  111,
  '111b',
  2,
  '2xyz',
  '123a',
  123
);
$two = $one;

foreach($one as $val)
{
  $key = array_search($val, $two);
  if ($key !== false) {
    // Print the pair on one line, followed by the strcmp verdict.
    echo "$val == {$two[$key]} -- ";
    if (strcmp(strval($val), strval($two[$key])) == 0) {
      echo "strcmp returns true\n";
    } else {
      echo "strcmp returns false\n";
    }
  } else {
    echo "$val not found \n";
  }
}

results in:

abc == abc -- strcmp returns true
abc1 == abc1 -- strcmp returns true
111 == 111 -- strcmp returns true
111b == 111 -- strcmp returns false
2 == 2 -- strcmp returns true
2xyz == 2 -- strcmp returns false
123a == 123a -- strcmp returns true
123 == 123a -- strcmp returns false

This becomes a real problem when you can’t be sure that the values in an array are all of the same type. However, if you are sure that all the values in the array are of type string then array_search works flawlessly.

I am still unsure how to work around this. However, I think a version of array_search that doesn’t implicitly cast the search value would be of great use.
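
For what it is worth, array_search() takes an optional third argument that switches the comparison to ===. That avoids the cast, but it also means int 123 no longer matches the string '123'. If values should match whenever their string forms are identical, a small wrapper can do that instead. The sketch below is one possible approach; array_search_str() is a hypothetical name, not a PHP function.

```php
<?php
// array_search_str() is a hypothetical helper, not part of PHP.
// It returns the first key whose value has the same string form as $needle,
// or false when there is none (mirroring array_search's return convention).
function array_search_str($needle, array $haystack)
{
    $needle = strval($needle);
    foreach ($haystack as $key => $value) {
        // Both sides are strings here, and === never casts.
        if (strval($value) === $needle) {
            return $key;
        }
    }
    return false;
}

$one = array('abc', 'abc1', 111, '111b', 2, '2xyz', '123a', 123);

var_dump(array_search_str('111b', $one)); // key 3
var_dump(array_search_str(123, $one));    // key 7
var_dump(array_search_str('123', $one));  // key 7: same string form as int 123
```

As with array_search, the result should be checked with !== false, because a match at key 0 is a valid answer.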