Array_Unique VS Array_Flip

array_unique vs array_flip

I benchmarked it for you: CodePad

Your intuition on this was correct!

// Build 1000 random values between 0 and 100, so duplicates are guaranteed.
$test=array();
for($run=0; $run<1000; $run++) {
    $test[]=rand(0,100);
}

// Time 100 runs of array_unique().
$time=microtime(true);

for($run=0; $run<100; $run++) {
    $out=array_unique($test);
}

$time=microtime(true)-$time;
echo 'Array Unique: '.$time."\n";

// Time 100 runs of array_keys(array_flip()).
$time=microtime(true);

for($run=0; $run<100; $run++) {
    $out=array_keys(array_flip($test));
}

$time=microtime(true)-$time;
echo 'Keys Flip: '.$time."\n";

// Time 100 runs of the double flip.
$time=microtime(true);

for($run=0; $run<100; $run++) {
    $out=array_flip(array_flip($test));
}

$time=microtime(true)-$time;
echo 'Flip Flip: '.$time."\n";

Output:

Array Unique: 1.1829199790955
Keys Flip: 0.0084578990936279
Flip Flip: 0.0083951950073242

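Note that array_keys(array_flip($array)) returns the distinct values with fresh sequential keys, which in many cases may be what you want. For plain integer or string data it gives the same values as array_values(array_unique($array)), only much faster. array_flip(array_flip($array)) keeps keys from the original array, like array_unique($array), and is also much faster, but it is not strictly identical: array_unique() keeps the first key of each duplicated value, whereas the double flip keeps the last one. Both flip variants only work when every value is an integer or a string, and numeric strings come back as integers.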
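
A minimal sketch of those key differences, using a made-up four-element array ($input and the other variable names are illustrative, not part of the benchmark):

```php
<?php
// Illustrative input: the value 3 appears at keys 0 and 2.
$input = [3, 1, 3, 2];

// array_unique() keeps the first key seen for each value.
$unique = array_unique($input);              // [0 => 3, 1 => 1, 3 => 2]

// The double flip keeps the last key seen for each value.
$flipFlip = array_flip(array_flip($input));  // [2 => 3, 1 => 1, 3 => 2]

// Keys of a single flip: the distinct values, reindexed from 0.
$keysFlip = array_keys(array_flip($input));  // [3, 1, 2]

var_dump($keysFlip === array_values($unique)); // bool(true)
var_dump($flipFlip === $unique);               // bool(false): the keys differ
```

If the keys do not matter to you, the variants are interchangeable for data like this; if they do, check which key you expect to survive.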

Why does array_unique sort the values?

If you think about it algorithmically, the way to remove duplicates is to go through a list, keep track of the items you have found, and drop anything that is already in that "found this" list. One easy way to accomplish this is to sort the list first, because duplicates then sit next to each other and are cheap to spot. Never mind a computer: which of these lists would you find easier to remove duplicates from?

apple
banana
cantaloupe
apple
durian
apple
banana
cantaloupe

or

apple
apple
apple
banana
banana
cantaloupe
cantaloupe
durian
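
The sort-then-scan idea can be sketched roughly like this. dedupeBySorting() is a hypothetical helper written for illustration; it is not how PHP implements array_unique(), and unlike array_unique() it returns the values in sorted order with new keys.

```php
<?php
// Hypothetical helper illustrating the idea only.
function dedupeBySorting(array $items): array
{
    sort($items, SORT_STRING);   // equal values become neighbours; keys reset to 0..n-1
    $result = [];
    $previous = null;
    foreach ($items as $i => $item) {
        // Keep an item only if it differs from the one just before it.
        if ($i === 0 || $item !== $previous) {
            $result[] = $item;
        }
        $previous = $item;
    }
    return $result;
}

$fruit = ['apple', 'banana', 'cantaloupe', 'apple', 'durian',
          'apple', 'banana', 'cantaloupe'];
$deduped = dedupeBySorting($fruit);
print_r($deduped);   // apple, banana, cantaloupe, durian
```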

Edit: After looking into it a bit (and finding this article), it looks like while the two both get the job done, they are not functionally equivalent, or at least they aren't always. To paraphrase a couple of these points:

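  1. array_unique() may sort the values internally, as you noted, but it returns them in their original order and keeps the first key of each duplicate. array_flip(array_flip()) keeps the last key instead, so the two results can differ in their keys, though this might be desired.
  2. array_flip() can only turn integer and string values into keys; anything else (objects, arrays, floats, booleans, null) is skipped with a warning. So the flip method won't work out of the box on all arrays, whereas array_unique() is not limited to those two types.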
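
A short sketch of the type-related differences, with made-up sample values. The @ operator is used here only to keep the demo output free of the warning; it is not a recommendation.

```php
<?php
// Illustrative values only: numeric strings with a duplicate.
$strings = ['1', '2', '1'];

// array_unique() leaves the surviving values untouched.
$unique = array_values(array_unique($strings));             // ['1', '2']

// Flipping turns values into array keys, and PHP casts integer-like
// string keys to int, so the strings come back as integers.
$flipped = array_values(array_flip(array_flip($strings)));  // [1, 2]

// Values that are neither int nor string cannot become keys;
// array_flip() skips them and raises a warning for each one.
$mixed = @array_flip(['a', 1.5, null, 'b']);                // ['a' => 0, 'b' => 3]

var_dump($unique, $flipped, $mixed);
```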

PHP Performance question: Faster to leave duplicates in array that will be searched or do array_unique?

I think array_unique is slower than in_array but it makes sense if you want to search the array more than one time or if you want to save memory.

Another option is to use array_flip (which also drops duplicate values, because they become keys) and then check with isset or array_key_exists, which are much faster than in_array. Personally, I would go this way.
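
A minimal sketch of that approach, assuming string values; $haystack and $needles are hypothetical sample data:

```php
<?php
// Hypothetical data: a haystack with duplicates and two needles.
$haystack = ['red', 'green', 'red', 'blue', 'green'];
$needles  = ['blue', 'pink'];

// Flip once: values become keys and duplicates collapse.
$lookup = array_flip($haystack);   // ['red' => 2, 'green' => 4, 'blue' => 3]

$found = [];
foreach ($needles as $needle) {
    // isset() is a hash lookup; in_array() would scan the array each time.
    $found[$needle] = isset($lookup[$needle]);
}

var_dump($found);   // 'blue' => true, 'pink' => false
```

The flip costs one pass over the array, so it tends to pay off when the same array is searched repeatedly.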

How to read two big files and compare contents

You won't need to call array_filter() or array_unique() if you are going to call array_flip() -- it will eliminate the duplicates for you because you can't have duplicate keys in the same level of an array.

Furthermore:

  1. array_unique() is stated to be slower than array_flip() (and there are times when it is slower than two array_flip()s)
  2. array_filter() has a bad reputation for killing falsey/empty/null/zero-ish data, so I will caution you not to use its default behavior.
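  3. array_flip() sets up the very speedy isset() check. isset() will likely outperform array_key_exists() because it is a language construct rather than a function call. It does report false for a key whose value is null, but that cannot happen here: after array_flip() every value is one of the original keys.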
  4. I am adding the FILE_SKIP_EMPTY_LINES flag to file() call so that your lookup array is potentially smaller.
  5. Calling rtrim() on every line of your big file may be causing some drag too. Do you know if you have consistently identical newline characters in both files? It would spare you six hundred million calls to rtrim() if you can safely remove the FILE_IGNORE_NEW_LINES flag from the file() call. Alternatively, if you know which newlines (e.g. \n or \r\n) trail the big.txt lines, you can append the specific newline(s) to the $lookup keys. That means preparing the smaller file's data instead of trimming every line of the big file.
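
To illustrate point 3, here is a small sketch of how isset() and array_key_exists() differ on null values, and why that difference does not matter for a flipped lookup array. The arrays are made-up examples.

```php
<?php
// Illustrative array with one null value.
$data = ['a' => 1, 'b' => null];

$issetB  = isset($data['b']);             // false: the value is null
$existsB = array_key_exists('b', $data);  // true: the key exists

// After array_flip() the values are the original keys (ints here),
// never null, so isset() is a safe membership test.
$lookup  = array_flip(['foo', 'bar']);    // ['foo' => 0, 'bar' => 1]
$hasFoo  = isset($lookup['foo']);         // true, even though the value is 0
$hasBaz  = isset($lookup['baz']);         // false

var_dump($issetB, $existsB, $hasFoo, $hasBaz);
```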

Untested Code:

$lookup = array_flip(file('small.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES));
if ($file = fopen('big.txt', 'r')) {
    // fgets() returns false at end of file, which ends the loop
    while (($line = fgets($file)) !== false) {
        $line = rtrim($line);
        if (isset($lookup[$line])) {
            echo "$line : exists.\n";
        }
    }
    fclose($file);
}

How to count the distinct values in an array?

You can flip the array, and check its length:

echo count(array_flip($arr));

This works because an array index must be unique, so you end up with one element for every unique item in the original array:

http://php.net/manual/en/function.array-flip.php

If a value has several occurrences, the latest key will be used as its value, and all others will be lost.

This is (was?) somewhat faster than count(array_unique($arr)), but unless you are calling it a LOT, array_unique is much more descriptive, so it is probably the better option.
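
For integer or string values the two counts agree, as this small sketch with a made-up $arr shows:

```php
<?php
// Illustrative input with three distinct values.
$arr = ['a', 'b', 'a', 'c', 'b'];

$viaFlip   = count(array_flip($arr));    // 3
$viaUnique = count(array_unique($arr));  // 3

var_dump($viaFlip === $viaUnique);       // bool(true)
```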

Find two values that are equal in array

All of my methods will return the desired result as long as there IS a duplicate. It is also assumed, because of your sample input, that there is only 1 duplicate in the array. The difference between my methods (and the other answers on this page) will be milliseconds at most for your input size. Because your users will not be able to distinguish between any of the correct methods on this page, I suggest that the method you implement should be determined by "readability", "simplicity", and/or "brevity". There are many coders who always default to for/foreach/while loops. There are others who always defer to functional iterators. Your choice will probably just come down to "your coding style".

Input:

$arr=[1,1,2,3,4];

Method #1: array_count_values(), arsort(), key()

$result=array_count_values($arr);
arsort($result); // this doesn't return an array, so cannot be nested
echo key($result);
// if no duplicate, this will return the first value from the input array

Explanation: generate new array of value occurrences, sort new array by occurrences from highest to lowest, return the key.


Method #2: array_count_values(), array_filter(), key()

echo key(array_filter(array_count_values($arr),function($v){return $v!=1;}));
// if no duplicate, this will return null

Explanation: generate the array of value occurrences, filter out the 1's, return the lone key.


Method #3: array_unique(), array_diff_key(), current()

echo current(array_diff_key($arr,array_unique($arr)));
// if no duplicate, this will return false

Explanation: remove duplicates and preserve the keys, find element that went missing, return the lone value.

Further consideration after reading: https://www.exakat.io/avoid-array_unique/ and the accepted answer from array_unique vs array_flip I have a new favorite 2-function one-liner...

Method #4: array_count_values(), array_flip()

echo array_flip(array_count_values($arr))[2];
// if no duplicate, this will show a notice (a warning as of PHP 8) because it is trying to access a non-existent element
// you can use a suppressor '@' like this:
// echo @array_flip(array_count_values($arr))[2];
// this will return null on no duplicate

Explanation: count the occurrences (which makes keys of the values), swap the keys and values (creating a 2-element array), access the 2 key without a function call. Quick-smart!

If you wanted to implement Method #4, you can write something like this:(demo)

$dupe=@array_flip(array_count_values($arr))[2];
if($dupe!==null){
    echo "The pair number is $dupe";
}else{
    echo "There were no pairs";
}

There will be many ways to achieve your desired result as you can see from all of the answers, but I'll stop here.

How to array_push unique values inside another array

You could foreach() through $DocumentID and check for the current value in $UniqueDocumentID with in_array() and if not present add it. Or use the proper tool:

$UniqueDocumentID = array_unique($DocumentID);

To your comment about wanting sequential keys:

$UniqueDocumentID = array_values(array_unique($DocumentID));

The long way around:

$UniqueDocumentID = array();

foreach($DocumentID as $value) {
    if(!in_array($value, $UniqueDocumentID)) {
        $UniqueDocumentID[] = $value;
    }
}

How to remove duplicate values from an array in PHP

Use array_unique().

Example:

$array = array(1, 2, 2, 3);
$array = array_unique($array); // Values are now 1, 2, 3 (original keys 0, 1, 3 are kept)

PHP array_intersect + array_flip with array that has values multiple times

The following code does the job. I hope it is self-explanatory.

array_unique(array_intersect_key($arr1, array_flip($arr2)))
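
Since the original question's data is not shown here, the following is only a sketch under an assumption: that $arr2 lists the keys of $arr1 to keep, and that $arr1 contains some values more than once. The sample arrays are invented for illustration.

```php
<?php
// Assumed shapes: $arr1 maps keys to values, $arr2 lists wanted keys.
$arr1 = ['x' => 'apple', 'y' => 'banana', 'z' => 'apple', 'w' => 'kiwi'];
$arr2 = ['x', 'z', 'w', 'x'];   // repeated keys are harmless after the flip

// Keep only the entries of $arr1 whose key appears in $arr2.
$selected = array_intersect_key($arr1, array_flip($arr2));
// ['x' => 'apple', 'z' => 'apple', 'w' => 'kiwi']

// Drop repeated values, keeping the first key for each.
$result = array_unique($selected);
// ['x' => 'apple', 'w' => 'kiwi']

print_r($result);
```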

