Description
The following code:
<?php
$units = ['5', '10', '5', '3A', '5', '5'];
$unique = array_unique($units, SORT_REGULAR);
print_r($unique);
Resulted in this output:
Array
(
    [0] => 5
    [1] => 10
    [3] => 3A
    [4] => 5
)
But I expected this output instead:
Array
(
    [0] => 5
    [1] => 10
    [3] => 3A
)
Root Cause
The algorithm:
- Sort the array using the comparison function from php_get_data_compare_func_unstable()
- Walk through sorted array comparing only adjacent elements
- Delete duplicates when adjacent elements compare equal
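The three steps above can be modelled in userland PHP. This is only a minimal sketch, not the C implementation: sketch_unique_by_sort() is a hypothetical name, the input is assumed to be a list (keys 0..n-1), and uasort() with the <=> operator stands in for the internal sort.

```php
<?php
// Hypothetical userland model of the sort-then-scan strategy (assumption: list input).
function sketch_unique_by_sort(array $input): array
{
    // Step 1: sort by value with loose comparison, keeping the original keys.
    $sorted = $input;
    uasort($sorted, fn ($a, $b) => $a <=> $b);

    // Step 2: walk the sorted keys, comparing each element only with the last kept one.
    $keys = array_keys($sorted);
    $lastKept = $keys[0] ?? null;
    foreach (array_slice($keys, 1) as $key) {
        if (($input[$lastKept] <=> $input[$key]) !== 0) {
            $lastKept = $key; // different value: a new run starts here
            continue;
        }
        // Step 3: neighbours compare equal, so drop the one that came later in the input.
        $drop = max($lastKept, $key);
        $lastKept = min($lastKept, $key);
        unset($input[$drop]);
    }

    return $input;
}

print_r(sketch_unique_by_sort([3, 1, 3, 2, 1]));
```

With values that have a consistent ordering, such as the integers above, every duplicate ends up next to its twin and is removed. With the mixed strings from this report, equal values can be separated by the sort, and the scan has no way to notice.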
The bug:
SORT_REGULAR uses zend_compare(), which calls zendi_smart_strcmp() for string comparisons. That function is non-transitive when numeric and non-numeric strings are mixed:
"5" < "10" → true (numeric comparison: 5 < 10)
"10" < "3A" → true (lexicographic: "1" < "3")
"3A" < "5" → true (lexicographic: "3" < "5")
Together these form a cycle: "5" < "10" < "3A" < "5".
Because the comparison is non-transitive, sorting algorithms (which require transitive comparisons) produce inconsistent results depending on input order.
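The cycle can be checked directly with PHP's loose comparison operator, which applies the same rules to two strings. A minimal sketch; the variable names are illustrative only:

```php
<?php
// Each pair [a, b] is tested with the loose "<" operator.
$pairs = [['5', '10'], ['10', '3A'], ['3A', '5']];

$results = [];
foreach ($pairs as [$a, $b]) {
    $results[] = $a < $b;
    printf('"%s" < "%s": %s' . PHP_EOL, $a, $b, var_export($a < $b, true));
}
// All three lines print "true", so no consistent ordering of the three values exists.
```

Only the first pair is compared numerically, because both operands are numeric strings. As soon as "3A" is involved, the comparison falls back to byte-wise string comparison.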
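The deduplication then walks the sorted array, comparing each element only with the last kept one. For this input, a sorted order consistent with the output above is "5", "5", "10", "3A", "5", "5" (original keys 0, 2, 1, 3, 4, 5):
lastkept = position_0; // "5" (key 0)
position_1 "5" == "5" → delete (key 2)
position_2 "10" != "5" → keep, lastkept = position_2
position_3 "3A" != "10" → keep, lastkept = position_3
position_4 "5" != "3A" → keep ← Bug! Never compared to position_0
position_5 "5" == "5" → delete (key 5)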
The root issue: a non-transitive comparison breaks the sort's guarantee that equal values end up next to each other. The adjacent-only scan is correct in itself, but it relies on a properly sorted array, and a proper sort requires a transitive comparison.
Comparison with SORT_STRING
<?php
$units = ['5', '10', '5', '3A', '5', '5'];
echo count(array_unique($units, SORT_REGULAR)) . "\n"; // 4 ✗ Wrong
echo count(array_unique($units, SORT_STRING)) . "\n"; // 3 ✓ Correct
SORT_STRING compares the values purely as strings, with no numeric interpretation. That comparison is consistent, so every duplicate is detected.
Workaround
For flat arrays of scalar values, use array_unique() with the SORT_STRING flag, which is also the default when no flag is passed.
<?php
$unique = array_unique($array, SORT_STRING);
For arrays of arrays or objects, filter the duplicates manually with in_array():
<?php
$uniqueAddr = [];
foreach ($addresses as $addr) {
    if (! in_array($addr, $uniqueAddr)) {
        $uniqueAddr[] = $addr;
    }
}
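The same loop can be wrapped in a small helper. This is a sketch under stated assumptions: unique_loose() is a hypothetical name, it keeps the original keys the way array_unique() does, and it uses loose in_array() matching, so its cost grows quadratically with the array size.

```php
<?php
// Hypothetical helper: compares every element with all previously kept ones,
// so the result does not depend on any sort order.
function unique_loose(array $items): array
{
    $unique = [];
    foreach ($items as $key => $item) {
        // Loose (==) match against everything kept so far.
        if (! in_array($item, $unique)) {
            $unique[$key] = $item; // first occurrence wins, key preserved
        }
    }

    return $unique;
}

print_r(unique_loose(['5', '10', '5', '3A', '5', '5']));
```

For the array from this report, the helper keeps keys 0, 1 and 3, which is the output I expected from array_unique() with SORT_REGULAR.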
PHP Version
PHP 8.4.13 (cli) (built: Sep 26 2025 00:45:36) (NTS clang 15.0.0)
Copyright (c) The PHP Group
Built by Laravel Herd
Zend Engine v4.4.13, Copyright (c) Zend Technologies
with Zend OPcache v8.4.13, Copyright (c), by Zend Technologies