Using array_unique() with multidimensional arrays
There's one problem with array_unique(): by default, it doesn't work with multidimensional arrays. Here's an example:
$array = array(
    array(
        'id' => 123,
        'name' => 'Some Product',
        'ean' => '1234567890123'
    ),
    array(
        'id' => 123,
        'name' => 'Some Product',
        'ean' => '4852950174938'
    ),
    array(
        'id' => 123,
        'name' => 'Some Product',
        'ean' => '1234567890123'
    ),
);
$uniqueArray = array_unique($array);
var_dump($uniqueArray);
Two elements are exactly the same and one has a different EAN, yet var_dump() prints the following:
array(1) {
  [0]=>
  array(3) {
    ["id"]=>
    int(123)
    ["name"]=>
    string(12) "Some Product"
    ["ean"]=>
    string(13) "1234567890123"
  }
}
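Obviously this is unexpected behaviour. array_unique() threw out the second element, which is clearly not the same as elements 1 and 3. The reason is that array_unique() compares elements by their string representation by default, and every array converts to the same string, "Array". The easiest workaround I came across is to use md5 hashes to compare the elements. All you need to do is iterate over the first dimension, serialize each element and create an MD5 hash of it for comparison: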
/**
 * Create unique arrays using an md5 hash
 *
 * @param array $array
 * @param bool $preserveKeys Keep the original keys instead of reindexing
 * @return array
 */
function arrayUnique($array, $preserveKeys = false)
{
    // Unique array for return
    $arrayRewrite = array();
    // Array with the md5 hashes
    $arrayHashes = array();
    foreach ($array as $key => $item) {
        // Serialize the current element and create an md5 hash
        $hash = md5(serialize($item));
        // If the md5 didn't come up yet, add the element
        // to arrayRewrite, otherwise drop it
        if (!isset($arrayHashes[$hash])) {
            // Save the current element hash
            $arrayHashes[$hash] = $hash;
            // Add element to the unique array
            if ($preserveKeys) {
                $arrayRewrite[$key] = $item;
            } else {
                $arrayRewrite[] = $item;
            }
        }
    }
    return $arrayRewrite;
}
$uniqueArray = arrayUnique($array);
var_dump($uniqueArray);
Now the result is the one array_unique() should have already given:
array(2) {
  [0]=>
  array(3) {
    ["id"]=>
    int(123)
    ["name"]=>
    string(12) "Some Product"
    ["ean"]=>
    string(13) "1234567890123"
  }
  [1]=>
  array(3) {
    ["id"]=>
    int(123)
    ["name"]=>
    string(12) "Some Product"
    ["ean"]=>
    string(13) "4852950174938"
  }
}
This works with as many dimensions as you like.
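As a rough illustration of that claim, the sketch below applies the same serialize-and-hash idea to elements nested three levels deep and also shows the $preserveKeys switch. The function name arrayUniqueSketch and the sample data are made up for this example. Keep in mind that serialize() is sensitive to key order and value types, so two elements only count as duplicates if they serialize identically.

```php
<?php
// Hypothetical compact restatement of the approach from the post.
function arrayUniqueSketch(array $array, $preserveKeys = false)
{
    $unique = array();
    $seen = array();
    foreach ($array as $key => $item) {
        // serialize() walks nested arrays, so any depth ends up in the hash
        $hash = md5(serialize($item));
        if (isset($seen[$hash])) {
            continue; // duplicate of an earlier element
        }
        $seen[$hash] = true;
        if ($preserveKeys) {
            $unique[$key] = $item;
        } else {
            $unique[] = $item;
        }
    }
    return $unique;
}

// Made-up sample data, nested three levels deep
$nested = array(
    'a' => array('id' => 1, 'tags' => array('x', array('y', 'z'))),
    'b' => array('id' => 1, 'tags' => array('x', array('y', 'q'))),
    'c' => array('id' => 1, 'tags' => array('x', array('y', 'z'))),
);

$reindexed = arrayUniqueSketch($nested);   // reindexed from 0
$keyed = arrayUniqueSketch($nested, true); // original keys kept

var_dump(array_keys($reindexed), array_keys($keyed));
```

Element 'c' is dropped because it serializes to the same string as 'a', while 'b' survives because it differs in the innermost array.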
25 comments
Lorem
24.02.2009, 16:42 o'clock
Did you fill in a bug report? This is really bad behavior.
Jeremy
10.03.2009, 22:45 o'clock
Thank you. Clear, concise, fast. Works fantastic. Well done and thanks again.
Dominik Jungowski
13.03.2009, 11:57 o'clock
@Lorem: I just did (although there was already a similar bug report from 2001): http://bugs.php.net/bug.php?id=47642
Frank
05.05.2009, 00:29 o'clock
I need a fix for this as I've stumbled across this bug today. Unfortunately, your code outputs the exact same result as array_unique().
Dominik Jungowski
05.05.2009, 09:34 o'clock
An example where it puts out the same would be nice.
As for the example from the blogpost: I have just tested it once more and it worked as it should.
btw. the bug report was closed ("was never intended to work with multi-dimensional arrays") but at least the documentation has been updated
Dominik Jungowski
12.06.2009, 16:46 o'clock
There was indeed one bug when using associative arrays. For that reason I updated the blogpost and added the preserveKeys parameter to the function.
zeromatrix
15.11.2009, 02:05 o'clock
Thank you very much for this knowledge. After changing from PHP version 4 to 5, my old code no longer worked correctly, but this works very well!!
17.12.2009, 08:43 o'clock
awesome - this rocks! and so do you!!
Mukul
12.02.2010, 14:12 o'clock
Thanks for this piece of code…
John Flynn
01.03.2010, 21:03 o'clock
I have borrowed this functionality and left credit in the comments. Good idea. Thanks!
Christian
30.04.2010, 16:04 o'clock
AWESOME CODE!!!! Real cool! Thx thx thx!
Christian
30.04.2010, 18:03 o'clock
Is this function working with arrays of any size? Although it worked fine with smaller ones (a few hundred entries), it didn't seem to work with an array with 5500 entries as the resulting array was still the same size (and there are lots of duplicates in there). The smaller ones I tested it with were actually parts of this big one.
Not sure if it's a function limit or some other server/PHP limitation.
Christian
Dominik Jungowski
30.04.2010, 20:51 o'clock
Well, I haven't tested it with an array that large so far, but I can take a look at it and see what results I get.
Christian
02.05.2010, 12:11 o'clock
FYI: My array with 5500 entries is actually one of the smaller ones. The biggest one contains 110.000 entries. Support for that size would be much appreciated! ;)
14.05.2010, 11:30 o'clock
Thanks, it solved my problem also in minutes, great work!
Dominik Jungowski
14.05.2010, 17:18 o'clock
@Michael: I haven't had the time yet to check out your problem, but I will do it soon!
KingIsulgard
14.07.2010, 15:43 o'clock
This is really bad coding. I wouldn't use it. This code can be so much more efficient.
I wrote this code. It has the option to preserve keys, but default is non preserving keys. It is an optional parameter.
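function arrayUnique($array, $preserveKeys = false) {
    $newArray = array();
    $newKeys = array();
    // Add items to new array and remove doubles
    foreach ($array as $key => $item) {
        $hash = serialize($item);
        // Remember only the key of the first occurrence,
        // so that keys and items stay aligned
        if (!array_key_exists($hash, $newArray)) {
            $newKeys[] = $key;
        }
        $newArray[$hash] = $item;
    }
    // Set keys
    if ($preserveKeys) {
        // array_combine() returns the new array, so assign the result
        $newArray = array_combine($newKeys, $newArray);
    } else {
        $newArray = array_values($newArray);
    }
    return $newArray;
}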
Achin Sharma
23.08.2010, 15:28 o'clock
hmm cool script but somehow it didnt work for me.. alas i am kind of stuck at this..
Dominik Jungowski
15.09.2010, 09:31 o'clock
@Christian: I can't seem to reconstruct your problem. I have just tested the code with an array of 5004 entries and the code worked fine. Do you have any more information you can provide, like samples of some array entries?
Christo
29.09.2010, 23:52 o'clock
Very effective! Fixed my SQL join issue straight away, and it's a logical solution I can understand.
Thank you
>_<
Bernie
26.01.2011, 08:52 o'clock
While I think, this is quite a piece of sexy code, it did not work for me.
I need to deal with a two-dimensional array of integers (array(x,2)), where some of the tuples are identical (and hence need to be removed).
While trying to debug the problem, I found that "serialize()" did not return the same string for identical tuples - the md5-idea is therefore doomed to fail.
I wonder about the "serialize()"-effect, though. It suggests, that the tuples concerned are actually not identical. But from the unserialized object this is not observable…
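One possible explanation (an assumption, since the actual data isn't shown): serialize() records the type of every value, so a tuple of integers and a tuple of numeric strings look identical when printed but serialize differently. This can happen when some rows come from a database or from form input. A minimal sketch, with a hypothetical normalizeTuple() helper that casts every value to int before comparing:

```php
<?php
$fromCode = array(3, 7);   // integers
$fromDb = array('3', '7'); // numeric strings, e.g. from a query result

// Loose comparison treats the two tuples as equal ...
$looselyEqual = ($fromCode == $fromDb);
// ... but their serialized forms differ because the types differ
$sameSerialized = (serialize($fromCode) === serialize($fromDb));

// Hypothetical helper: cast every value to int so the types match
function normalizeTuple(array $tuple)
{
    return array_map('intval', $tuple);
}

$sameAfterCast = serialize(normalizeTuple($fromCode)) === serialize(normalizeTuple($fromDb));

var_dump($looselyEqual, $sameSerialized, $sameAfterCast);
```

If the tuples in question really do mix types like this, normalizing them before calling arrayUnique() should make the duplicates hash to the same value.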
yi
05.09.2011, 11:30 o'clock
you can actually use array_unique() function with flag SORT_REGULAR to achieve the same result
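For reference, a short sketch of this suggestion applied to the array from the post. The second parameter of array_unique() has been available since PHP 5.2.9; with SORT_REGULAR the elements are compared with PHP's regular (loose) comparison instead of as strings. Treat it as a sketch rather than a drop-in replacement: under loose comparison, values that differ only in type (for example 1 and '1') count as equal, so it is worth testing against real data.

```php
<?php
$array = array(
    array('id' => 123, 'name' => 'Some Product', 'ean' => '1234567890123'),
    array('id' => 123, 'name' => 'Some Product', 'ean' => '4852950174938'),
    array('id' => 123, 'name' => 'Some Product', 'ean' => '1234567890123'),
);

// SORT_REGULAR compares the sub-arrays element by element
$uniqueArray = array_unique($array, SORT_REGULAR);

// array_unique() keeps the original keys; reindex if that matters
$uniqueArray = array_values($uniqueArray);

var_dump(count($uniqueArray)); // int(2)
```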
carlos
19.01.2012, 00:27 o'clock
yi is right. Just use SORT_REGULAR.
31 January 2009
Peter Rother
09.02.2009, 18:45 o'clock
Nice example, thx. I think i can use it in future projects.