array_unique vs array_flip
I benchmarked it for you: CodePad
Your intuition on this was correct!
$test = array();
for ($run = 0; $run < 1000; $run++) {
    $test[] = rand(0, 100);
}

// Time 100 runs of each approach on the same 1000-element array.
$time = microtime(true);
for ($run = 0; $run < 100; $run++) {
    $out = array_unique($test);
}
$time = microtime(true) - $time;
echo 'Array Unique: ' . $time . "\n";

$time = microtime(true);
for ($run = 0; $run < 100; $run++) {
    $out = array_keys(array_flip($test));
}
$time = microtime(true) - $time;
echo 'Keys Flip: ' . $time . "\n";

$time = microtime(true);
for ($run = 0; $run < 100; $run++) {
    $out = array_flip(array_flip($test));
}
$time = microtime(true) - $time;
echo 'Flip Flip: ' . $time . "\n";
Output:
Array Unique: 1.1829199790955
Keys Flip: 0.0084578990936279
Flip Flip: 0.0083951950073242
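Note that array_keys(array_flip($array)) returns the unique values with fresh sequential keys, which in many cases may be what you want (the same result as array_values(array_unique($array)), except much faster), whereas array_flip(array_flip($array)) preserves keys the way array_unique($array) does (again much faster). The two are not strictly identical, though: array_unique() keeps the first key of each duplicated value, while the double flip keeps the last. Both flip variants only work when every value is an integer or a string.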
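A small sketch of how the three approaches treat keys may help here. The sample array is hypothetical and chosen so that 'a' appears at keys 0 and 2:

```php
<?php
// Hypothetical sample data: 'a' appears at keys 0 and 2.
$array = ['a', 'b', 'a', 'c'];

// Keeps the FIRST key seen for each value: [0 => 'a', 1 => 'b', 3 => 'c']
$unique = array_unique($array);

// Keeps the LAST key seen for each value: [2 => 'a', 1 => 'b', 3 => 'c']
$flipFlip = array_flip(array_flip($array));

// Discards the original keys and re-indexes from 0: ['a', 'b', 'c']
$keysFlip = array_keys(array_flip($array));

var_export($unique);
var_export($flipFlip);
var_export($keysFlip);
```

If the original keys matter, it is worth checking which occurrence's key your code expects before swapping one approach for the other.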
Why does array_unique sort the values?
If you think about it algorithmically, the way to remove duplicates is to go through a list, keep track of items you find, and get rid of things that are already in that "found this" list. One easy way to accomplish this is to sort the list first, because duplicates then sit next to each other and are cheap to spot. Try it by hand, never mind on a computer: which one of these lists is easier to remove duplicates from?
apple
banana
cantaloupe
apple
durian
apple
banana
cantaloupe
or
apple
apple
apple
banana
banana
cantaloupe
cantaloupe
durian
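To make the idea concrete, here is a minimal sketch of sort-based de-duplication. It only illustrates the reasoning above and is not a claim about how array_unique() is implemented internally; the function name dedupeSorted is made up for this example.

```php
<?php
// Hypothetical helper: remove duplicates by sorting first, then
// comparing each item only with its immediate predecessor.
function dedupeSorted(array $items): array
{
    sort($items); // duplicates are now adjacent; keys are re-indexed from 0

    $out = [];
    foreach ($items as $i => $item) {
        // Keep the item if it is the first one or differs from the previous one.
        if ($i === 0 || $item !== $items[$i - 1]) {
            $out[] = $item;
        }
    }
    return $out;
}

$fruit = ['apple', 'banana', 'cantaloupe', 'apple', 'durian', 'apple', 'banana', 'cantaloupe'];
print_r(dedupeSorted($fruit));
```

Note that this version returns the values in sorted order with new keys, unlike array_unique(), which preserves the original keys.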
Edit: After looking into it a bit (and finding this article), it looks like while the two both get the job done, they are not functionally equivalent, or at least they aren't always. To paraphrase a couple of these points:
- array_unique() sorts the values, as you noted, so array_flip(array_flip()) wouldn't return the same-ordered array -- but this might be desired.
- array_flip() can only flip integer and string values; any other value (an object, for example) triggers a warning and is skipped, so the flip method wouldn't work out of the box on all arrays, whereas the sort method accepts other value types.
PHP Performance question: Faster to leave duplicates in array that will be searched or do array_unique?
I think array_unique is slower than in_array, but it makes sense if you want to search the array more than once or if you want to save memory.
Another option is to use array_flip (which will also drop duplicate values, since they become keys) and then use isset or array_key_exists, since they are way faster than in_array. Personally, I would go this way.
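A minimal sketch of the flip-then-isset approach, using made-up sample data:

```php
<?php
// Hypothetical data for illustration; 'red' is duplicated on purpose.
$haystack = ['red', 'green', 'red', 'blue'];

// Values become keys, so duplicates collapse into a single entry.
$lookup = array_flip($haystack);

// Key lookups are hash lookups; in_array() would scan the whole array.
$found   = isset($lookup['green']);  // true
$missing = isset($lookup['purple']); // false

var_dump($found, $missing);
```

The flipped values are the original integer keys, which are never null, so isset() is safe to use here. This only works when the searched values are integers or strings.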
How to read two big files and compare contents
You won't need to call array_filter() or array_unique() if you are going to call array_flip() -- it will eliminate the duplicates for you because you can't have duplicate keys in the same level of an array.
Furthermore:
- array_unique() is stated to be slower than array_flip() (and there are times when it is slower than two array_flip()s).
- array_filter() has a bad reputation for killing falsey/empty/null/zero-ish data, so I will caution you not to use its default behavior.
- array_flip() sets up the very speedy isset() check. isset() will likely outperform array_key_exists() because isset() doesn't check for null values.
- I am adding the FILE_SKIP_EMPTY_LINES flag to the file() call so that your lookup array is potentially smaller.
- Calling rtrim() on every line of your big file may be causing some drag too. Do you know if you have consistently identical newline characters on both files? It would spare you six hundred million calls of rtrim() if you can safely remove the FILE_IGNORE_NEW_LINES flag from the file() call. Alternatively, if you know the newlines (e.g. \n? or \r\n?) that trail the big.txt lines, you can append specific newline(s) to the $lookup keys -- this means preparing the smaller file's data versus every line of the big file.
Untested Code:
$lookup = array_flip(file('small.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES));
if ($file = fopen('big.txt', 'r')) {
    // fgets() returns false at end of file, which ends the loop
    while (($line = fgets($file)) !== false) {
        $line = rtrim($line);
        if (isset($lookup[$line])) {
            echo "$line : exists.\n";
        }
    }
    fclose($file);
}
How to count the distinct values in an array?
You can flip the array, and check its length:
echo count(array_flip($arr));
This works because an array index must be unique, so you end up with one element for every unique item in the original array:
http://php.net/manual/en/function.array-flip.php
If a value has several occurrences, the latest key will be used as its value, and all others will be lost.
This is (was?) somewhat faster than array_unique, but unless you are calling this a LOT, array_unique is a lot more descriptive, so probably the better option.
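For comparison, a short sketch showing that both approaches give the same count. The sample array is hypothetical and holds only strings, which array_flip() requires (integers work too):

```php
<?php
// Hypothetical sample data with three distinct values.
$arr = ['x', 'y', 'x', 'z', 'y'];

// Flip: each distinct value becomes exactly one key.
$viaFlip = count(array_flip($arr));

// The more descriptive alternative.
$viaUnique = count(array_unique($arr));

echo $viaFlip, ' ', $viaUnique, "\n";
```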
Find two values that are equal in array
All of my methods will return the desired result as long as there IS a duplicate. It is also assumed, because of your sample input, that there is only 1 duplicate in the array. The difference between my methods (and the other answers on this page) will be milliseconds at most for your input size. Because your users will not be able to distinguish between any of the correct methods on this page, I will suggest that the method that you implement should be determined by "readability", "simplicity", and/or "brevity". There are many coders who always default to for/foreach/while loops. There are others who always defer to functional iterators. Your choice will probably just come down to "your coding style".
Input:
$arr=[1,1,2,3,4];
Method #1: array_count_values(), arsort(), key()
$result=array_count_values($arr);
arsort($result); // this doesn't return an array, so cannot be nested
echo key($result);
// if no duplicate, this will return the first value from the input array
Explanation: generate new array of value occurrences, sort new array by occurrences from highest to lowest, return the key.
Method #2: array_count_values(), array_filter(), key()
echo key(array_filter(array_count_values($arr),function($v){return $v!=1;}));
// if no duplicate, this will return null
Explanation: generate the array of value occurrences, filter out the 1's, return the lone key.
Method #3: array_unique(), array_diff_key(), current()
echo current(array_diff_key($arr,array_unique($arr)));
// if no duplicate, this will return false
Explanation: remove duplicates and preserve the keys, find element that went missing, return the lone value.
Further consideration: after reading https://www.exakat.io/avoid-array_unique/ and the accepted answer from "array_unique vs array_flip", I have a new favorite 2-function one-liner...
Method #4: array_count_values(), array_flip()
echo array_flip(array_count_values($arr))[2];
// if no duplicate, this will show a notice because it is trying to access a non-existent element
// you can use a suppressor '@' like this:
// echo @array_flip(array_count_values($arr))[2];
// this will return null on no duplicate
Explanation: count the occurrences (which makes keys of the values), swap the keys and values (creating a 2-element array), access the 2 key without a function call. Quick-smart!
If you wanted to implement Method #4, you can write something like this (demo):
$dupe=@array_flip(array_count_values($arr))[2];
if($dupe!==null){
echo "The pair number is $dupe";
}else{
echo "There were no pairs";
}
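As an alternative to the '@' suppressor, the null coalescing operator (PHP 7+) avoids the notice on a missing key. This is a sketch, not part of the original answer, and the function name findPair is made up for the example:

```php
<?php
// Hypothetical helper wrapping Method #4.
// Returns the value that occurs exactly twice, or null if there is none.
function findPair(array $arr)
{
    // ?? yields null when key 2 does not exist, without raising a notice.
    return array_flip(array_count_values($arr))[2] ?? null;
}

$dupe = findPair([1, 1, 2, 3, 4]);
$none = findPair([1, 2, 3]);

echo $dupe !== null ? "The pair number is $dupe" : "There were no pairs";
```

Like Method #4 itself, this assumes the array holds only integers or strings and that at most one value occurs exactly twice.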
There will be many ways to achieve your desired result as you can see from all of the answers, but I'll stop here.
How to array_push unique values inside another array
You could foreach() through $DocumentID and check for the current value in $UniqueDocumentID with in_array() and if not present add it. Or use the proper tool:
$UniqueDocumentID = array_unique($DocumentID);
To your comment about wanting sequential keys:
$UniqueDocumentID = array_values(array_unique($DocumentID));
The long way around:
$UniqueDocumentID = array();
foreach($DocumentID as $value) {
if(!in_array($value, $UniqueDocumentID)) {
$UniqueDocumentID[] = $value;
}
}
How to remove duplicate values from an array in PHP
Use array_unique().
Example:
$array = array(1, 2, 2, 3);
$array = array_unique($array); // Array is now (1, 2, 3)
PHP array_intersect + array_flip with array that has values multiple times
The following code does the job. I hope it is self-explanatory.
array_unique(array_intersect_key($arr1, array_flip($arr2)))
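Since the one-liner is terse, here is a worked sketch. The contents of $arr1 and $arr2 are assumptions made for illustration (the original question's data is not shown here): $arr1 maps keys to values, and $arr2 lists the wanted keys, some of them more than once.

```php
<?php
// Assumed sample data, not taken from the original question.
$arr1 = ['a' => 'apple', 'b' => 'banana', 'c' => 'apple', 'd' => 'durian'];
$arr2 = ['a', 'c', 'd', 'a']; // wanted keys; 'a' is repeated

// array_flip() turns the wanted keys into array keys (the repeat collapses),
// array_intersect_key() keeps the $arr1 entries with those keys,
// array_unique() then drops repeated values, keeping the first key.
$result = array_unique(array_intersect_key($arr1, array_flip($arr2)));

print_r($result);
```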