Using array_unique() with multidimensional arrays

31 January 2009

There's one problem with array_unique(): It doesn't work with multidimensional arrays. Here's an example:

$array = array(
    array(
        'id' => 123,
        'name' => 'Some Product',
        'ean' => '1234567890123'
    ),
    array(
        'id' => 123,
        'name' => 'Some Product',
        'ean' => '4852950174938'
    ),
    array(
        'id' => 123,
        'name' => 'Some Product',
        'ean' => '1234567890123'
    ),
);
$uniqueArray = array_unique($array);
var_dump($uniqueArray);

Two of the elements are exactly the same and one has a different EAN, yet var_dump() prints the following:

array(1) {
  [0]=>
  array(3) {
    ["id"]=>
    int(123)
    ["name"]=>
    string(12) "Some Product"
    ["ean"]=>
    string(13) "1234567890123"
  }
}
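
A minimal sketch of why the result collapses to a single element, assuming a current PHP version where array_unique() compares elements as strings by default (the SORT_STRING flag). The variable names are only for illustration:

```php
<?php
// Every array casts to the literal string "Array". PHP also raises an
// "Array to string conversion" notice or warning, silenced here with @.
$first  = @(string) array('ean' => '1234567890123');
$second = @(string) array('ean' => '4852950174938');

var_dump($first);             // string(5) "Array"
var_dump($first === $second); // bool(true)
```

Under a string comparison all nested arrays therefore look identical, and only the first one survives.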

Obviously this is unexpected behaviour: array_unique() threw out the second element, which is clearly not the same as elements 1 and 3. The easiest workaround I came across is to compare MD5 hashes of the elements. All you need to do is iterate over the first dimension, serialize each element and create an MD5 hash of it for comparison:

/**
 * Create unique arrays using an MD5 hash
 *
 * @param array $array
 * @param bool $preserveKeys Keep the original keys instead of reindexing
 * @return array
 */
function arrayUnique($array, $preserveKeys = false)
{
    // Unique array for return
    $arrayRewrite = array();
    // Array with the MD5 hashes
    $arrayHashes = array();
    foreach ($array as $key => $item) {
        // Serialize the current element and create an MD5 hash
        $hash = md5(serialize($item));
        // If the hash didn't come up yet, add the element to
        // $arrayRewrite, otherwise drop it
        if (!isset($arrayHashes[$hash])) {
            // Save the current element hash
            $arrayHashes[$hash] = $hash;
            // Add element to the unique array
            if ($preserveKeys) {
                $arrayRewrite[$key] = $item;
            } else {
                $arrayRewrite[] = $item;
            }
        }
    }
    return $arrayRewrite;
}

$uniqueArray = arrayUnique($array);
var_dump($uniqueArray);

Now the result is the one array_unique() should have already given:

array(2) {
  [0]=>
  array(3) {
    ["id"]=>
    int(123)
    ["name"]=>
    string(12) "Some Product"
    ["ean"]=>
    string(13) "1234567890123"
  }
  [1]=>
  array(3) {
    ["id"]=>
    int(123)
    ["name"]=>
    string(12) "Some Product"
    ["ean"]=>
    string(13) "4852950174938"
  }
}

This works with as many dimensions as you like.
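
A small sketch of the nested case and of the $preserveKeys option. The $orders and $strict data are made up for illustration, and the function is a compact copy of arrayUnique(), repeated only so that the snippet runs on its own. It also shows one thing to keep in mind: serialize() is sensitive to value types and to key order, so elements that merely look alike stay distinct.

```php
<?php
// Compact copy of arrayUnique() from this post, guarded so it does not
// clash with an already loaded definition.
if (!function_exists('arrayUnique')) {
    function arrayUnique($array, $preserveKeys = false)
    {
        $unique = array();
        $seen = array();
        foreach ($array as $key => $item) {
            $hash = md5(serialize($item));
            if (!isset($seen[$hash])) {
                $seen[$hash] = true;
                if ($preserveKeys) {
                    $unique[$key] = $item;
                } else {
                    $unique[] = $item;
                }
            }
        }
        return $unique;
    }
}

// Hypothetical sample data with three levels of nesting
$orders = array(
    'a' => array('id' => 1, 'items' => array(array('sku' => 'X', 'qty' => 2))),
    'b' => array('id' => 1, 'items' => array(array('sku' => 'X', 'qty' => 2))),
    'c' => array('id' => 1, 'items' => array(array('sku' => 'X', 'qty' => 3))),
);

// 'b' duplicates 'a' and is dropped; with $preserveKeys the keys survive
$byKey = arrayUnique($orders, true);
print_r(array_keys($byKey)); // keys 'a' and 'c'

// All three elements below are kept, because their serialized forms differ
$strict = arrayUnique(array(
    array('id' => 1, 'qty' => 2),
    array('id' => '1', 'qty' => 2), // same value, different type
    array('qty' => 2, 'id' => 1),   // same pairs, different key order
));
echo count($strict), "\n"; // 3
```

If type or key-order differences should not count, the elements would have to be normalised (for example cast and key-sorted) before hashing.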

25 comments

Peter Rother
09.02.2009, 18:45 o'clock

Nice example, thx. I think i can use it in future projects.

Lorem
24.02.2009, 16:42 o'clock

Did you fill in a bug report? This is really bad behavior.

Jeremy
10.03.2009, 22:45 o'clock

Thank you. Clear, concise, fast. Works fantastic. Well done and thanks again.

Dominik Jungowski
13.03.2009, 11:57 o'clock

@Lorem: I just did (although there was already a similar bug report from 2001): http://bugs.php.net/bug.php?id=47642

Frank
05.05.2009, 00:29 o'clock

I need a fix for this as I've stumbled across this bug today. Unfortunately, your code outputs the exact same result as array_unique().

Dominik Jungowski
05.05.2009, 09:34 o'clock

An example where it puts out the same would be nice.

As for the example from the blogpost: I have just tested it once more and it worked as it should.

btw. the bug report was closed ("was never intended to work with multi-dimensional arrays") but at least the documentation has been updated

Dominik Jungowski
12.06.2009, 16:46 o'clock

There was indeed one bug when using associative arrays. For that reason I updated the blogpost and added the preserveKeys parameter to the function.

zeromatrix
15.11.2009, 02:05 o'clock

Thank you very much for this knowledge. After changing from PHP version 4 to 5 my old code no longer worked correctly, but this works very well!!


17.12.2009, 08:43 o'clock

awesome - this rocks! and so do you!!

Mukul
12.02.2010, 14:12 o'clock

Thanks for this piece of code…

John Flynn
01.03.2010, 21:03 o'clock

I have borrowed this functionality and left credit in the comments. Good idea. Thanks!

Christian
30.04.2010, 16:04 o'clock

AWESOME CODE!!!! Real cool! Thx thx thx!

Christian
30.04.2010, 18:03 o'clock

Is this function working with arrays of any size? Although it worked fine with smaller ones (a few hundred entries), it didn't seem to work with an array with 5500 entries as the resulting array was still the same size (and there are lots of duplicates in there). The smaller ones I tested it with were actually parts of this big one.

Not sure if it's a function limit or some other server/PHP limitation.

Christian

Dominik Jungowski
30.04.2010, 20:51 o'clock

Well, I haven't tested it with an array that large so far, but I can take a look at it and see what results I get.

Christian
02.05.2010, 12:11 o'clock

FYI: My array with 5500 entries is actually one of the smaller ones. The biggest one contains 110.000 entries. Support for that size would be much appreciated! ;)


14.05.2010, 11:30 o'clock

Thanks, it solved my problem also in minutes, great work!

Dominik Jungowski
14.05.2010, 17:18 o'clock

@Michael: I haven't had the time yet to check out your problem, but I will do it soon!

KingIsulgard
14.07.2010, 15:43 o'clock

This is really bad coding. I wouldn't use it. This code can be so much more efficient.

I wrote this code. It has the option to preserve keys, but default is non preserving keys. It is an optional parameter.

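function arrayUnique($array, $preserveKeys = false) {
    $newArray = array();
    $newKeys = array();

    // Add items to new array and remove doubles
    foreach ($array as $key => $item) {
        $serialized = serialize($item);
        // Record the key only for the first occurrence, so that
        // $newKeys and $newArray always have the same length
        if (!array_key_exists($serialized, $newArray)) {
            $newArray[$serialized] = $item;
            $newKeys[] = $key;
        }
    }

    // Set keys
    if ($preserveKeys) {
        // array_combine() returns the combined array, so assign the result
        $newArray = array_combine($newKeys, $newArray);
    } else {
        $newArray = array_values($newArray);
    }

    return $newArray;
}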

Achin Sharma
23.08.2010, 15:28 o'clock

hmm cool script but somehow it didnt work for me.. alas i am kind of stuck at this..

Dominik Jungowski
15.09.2010, 09:31 o'clock

@Christian: I can't seem to reconstruct your problem. I have just tested the code with an array of 5004 entries and the code worked fine. Do you have any more information you can provide, like samples of some array entries?

Christo
29.09.2010, 23:52 o'clock

Very effective! Fixed my SQL join issue straight away, and its a logical solution I can understand.
Thank you
>_<

Yuriy
26.10.2010, 13:37 o'clock

Great and WORKING example. Thank you.

Bernie
26.01.2011, 08:52 o'clock

While I think, this is quite a piece of sexy code, it did not work for me.

I need to deal with a two-dimensional array of integers (array(x,2)), where some of the tuples are identical (and hence need to be removed).

While trying to debug the problem, I found that "serialize()" did not return the same string for identical tuples - the md5-idea is therefore doomed to fail.

I wonder about the "serialize()"-effect, though. It suggests, that the tuples concerned are actually not identical. But from the unserialized object this is not observable…

yi
05.09.2011, 11:30 o'clock

you can actually use the array_unique() function with the SORT_REGULAR flag to achieve the same result

carlos
19.01.2012, 00:27 o'clock

yi is right. Just use SORT_REGULAR.
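
The SORT_REGULAR suggestion from the comments can be sketched as follows, assuming PHP 5.2.9 or later, where array_unique() accepts a second flags parameter. The $products array mirrors the example from the post; $codes is made up to show a caveat: SORT_REGULAR uses PHP's loose comparison, so two numeric strings such as '0123' and '123' are compared as numbers, whereas a serialize()-based comparison keeps them apart.

```php
<?php
$products = array(
    array('id' => 123, 'name' => 'Some Product', 'ean' => '1234567890123'),
    array('id' => 123, 'name' => 'Some Product', 'ean' => '4852950174938'),
    array('id' => 123, 'name' => 'Some Product', 'ean' => '1234567890123'),
);

// SORT_REGULAR compares the nested arrays element by element
$unique = array_unique($products, SORT_REGULAR);
var_dump(count($unique)); // int(2)

// array_unique() keeps the original keys; reindex if that matters
$unique = array_values($unique);

// Caveat: numeric strings are compared by value, not character by character
$codes = array(array('ean' => '0123'), array('ean' => '123'));
$loose = array_unique($codes, SORT_REGULAR);
$strict = array_unique(array_map('serialize', $codes));

var_dump(count($loose));  // int(1): '0123' == '123'
var_dump(count($strict)); // int(2): the serialized strings differ
```

Which of the two behaviours is wanted depends on the data. For identifiers with significant leading zeros the serialize()-based comparison is probably the safer choice.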
