What's the Difference Between Arrays and Hashes

What's the difference between arrays and hashes?

From Ruby-Doc:

Arrays are ordered, integer-indexed collections of any object. Array indexing starts at 0, as in C or Java. A negative index is assumed to be relative to the end of the array—that is, an index of -1 indicates the last element of the array, -2 is the next to last element in the array, and so on. Look here for more.
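A short Ruby sketch of the indexing rules quoted above may help. The variable names are made up for illustration.

```ruby
letters = ["a", "b", "c", "d"]

first        = letters[0]    # indexing starts at 0
last         = letters[-1]   # -1 is the last element
next_to_last = letters[-2]   # -2 is the next to last element
missing      = letters[10]   # reading past the end returns nil rather than raising

mixed = [1, "two", :three, 4.0]   # elements can be any object
```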

A Hash is a collection of key-value pairs. It is similar to an Array, except that indexing is done via arbitrary keys of any object type, not an integer index. Hashes enumerate their values in the order that the corresponding keys were inserted.

Hashes have a default value that is returned when accessing keys that do not exist in the hash. By default, that value is nil. Look here for more.
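The same points can be sketched for hashes. The names ages and counts are hypothetical, and Hash.new(0) is just one common way of choosing a different default.

```ruby
ages = { "alice" => 30, :bob => 25, 3 => "three" }   # keys can be any object

alice_age = ages["alice"]   # looked up by key, not by position
carol_age = ages["carol"]   # missing key: returns the default, nil

key_order = ages.keys       # enumerated in insertion order

# A different default can be supplied when the hash is created.
counts = Hash.new(0)
counts["x"] += 1            # starts from the default 0
```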

What are the differences of Array and Hash in PHP?

Both the things you are describing are arrays. The only difference between the two is that you are explicitly setting the keys for the second one, and as such they are known as associative arrays. I do not know where you got the Hash terminology from (Perl?) but that's not what they are referred to as in PHP.

So, for example, if you were to do this:

$foo = array(1,2,3,4,5);
print_r($foo);

The output would be:

Array
(
[0] => 1
[1] => 2
[2] => 3
[3] => 4
[4] => 5
)

As you can see, the keys to access the individual values you put in were created for you, but are there nonetheless. So this array is, essentially, associative as well. The other "type" of array works exactly the same way, except you are explicitly saying "I want to access this value with this key" instead of relying on automatic numeric indexes (although the key you provide could also be numeric).

$bar = array('uno' => 'one', 'dos' => 'two');
print_r($bar);

Would output:

Array
(
[uno] => one
[dos] => two
)

As you might then expect, doing print $bar['uno'] would output one, and doing print $foo[0] from the first example would output 1.

As far as functions go, PHP functions will most of the time take either one of these "types" of array and do what you want them to, but there are distinctions to be aware of, as some functions will do funky stuff to your indexes and some won't. It is usually best to read the documentation before using an array function, as it will note what the output will be depending on the keys of the array.

You should read the manual for more information.

Unexpected access performance differences between arrays and hashes

Consider the memory layout of an array of arrays, say with dimensions 3x3... you've got something like this:

memory address        usage/content
base                  [0][0]
base+sizeof(int)      [0][1]
base+2*sizeof(int)    [0][2]
base+3*sizeof(int)    [1][0]
base+4*sizeof(int)    [1][1]
...

Given an array of dimensions [M][N], all that's needed to access the element at indices [i][j] is to add the base memory address to the data element size times (i * N + j)... a tiny bit of simple arithmetic, and therefore extremely fast.
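The arithmetic can be sketched in Ruby by storing a grid in one flat array, which stands in for contiguous memory. In a row-major layout the row index is multiplied by the number of columns. The helper flat_index is hypothetical, and this only illustrates the address calculation, not how Ruby lays out its own arrays.

```ruby
# Hypothetical helper: offset of element [i][j] in a row-major grid
# that has n_cols columns per row.
def flat_index(i, j, n_cols)
  i * n_cols + j
end

rows, cols = 3, 3
flat = (0...rows * cols).to_a   # each slot holds its own offset, 0 through 8

center   = flat[flat_index(1, 1, cols)]   # element [1][1] sits at offset 4
last_row = flat[flat_index(2, 0, cols)]   # element [2][0] sits at offset 6
```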

Hashes are far more complicated data structures and inherently slower. With a hash, you first need to take time to hash the key. The harder the hash function works to scatter keys (even similar ones) pseudo-randomly across its output range, the slower it tends to be; if it doesn't make that effort, you'll get more collisions in the hash table and slower performance there instead. Then the hash value needs to be mapped onto the current hash table size (usually using "%"), and finally you need to compare keys to see whether you've found the hoped-for key, a colliding element or an empty slot. It's a far more involved process than array indexing. You should probably do some background reading about hash functions and hash table implementations....
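To make those steps concrete, here is a toy hash table with separate chaining, written in Ruby. It is only a sketch of the lookup steps described above: the class name ToyHashTable and its bucket layout are invented for illustration, and real implementations (including Ruby's own Hash) are organised differently and resize as they grow.

```ruby
# Toy hash table using separate chaining: each bucket is a list of
# [key, value] pairs whose keys map to the same slot.
class ToyHashTable
  def initialize(size = 8)
    @buckets = Array.new(size) { [] }
  end

  def []=(key, value)
    bucket = bucket_for(key)
    pair = bucket.find { |k, _| k.eql?(key) }
    if pair
      pair[1] = value          # key already present: overwrite its value
    else
      bucket << [key, value]   # new key, possibly colliding with others
    end
  end

  def [](key)
    # Step 3: compare keys within the bucket to find the right entry.
    pair = bucket_for(key).find { |k, _| k.eql?(key) }
    pair && pair[1]
  end

  private

  def bucket_for(key)
    # Step 1: hash the key. Step 2: map it onto the table size with %.
    @buckets[key.hash % @buckets.size]
  end
end

table = ToyHashTable.new(2)   # deliberately tiny, so keys must collide
table[1] = "a"
table[1000] = "b"
table[12398239] = "c"
table[1000] = "B"             # overwrites the earlier value
```

With only two buckets for three keys, at least two keys share a bucket, so a lookup has to walk that bucket and compare keys. That comparison is the extra work that plain array indexing never does.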

The reason hashes are so often useful is that the key doesn't need to be numeric (you can always work out some formula to generate a number from arbitrary key data) and need not be near-contiguous for memory efficiency (i.e. a hash table with say memory capacity for 5 integers could happily store keys 1, 1000 and 12398239 - whereas for an array keyed on those values there would be a lot of virtual address space wasted for all the indices in between, which have no data anyway, and anyway more data packed into a memory page means more cache hits).

Further - you should be careful with benchmarks - when you do clearly repetitive work with unchanging values overwriting the same variable, an optimiser may avoid it and you may not be timing what you think you are. It's good to use some run-time inputs (e.g. storing different values in the containers) and accumulate some dependent result (e.g. summing the element accesses rather than overwriting it), then outputting the result so any lazy evaluation is forced to conclude. With things like JITs and VMs involved there can also be kinks in your benchmarks as compilation kicks in or branch prediction results are incorporated.
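A minimal sketch of that advice using Ruby's standard Benchmark module follows. The helper sum_by_index, the container size and the fixed shuffle seed are all assumptions made for the example. The timings will vary by machine and Ruby version, so none are shown.

```ruby
require "benchmark"

# Sum every element so that each access feeds a result that is used later.
def sum_by_index(container, keys)
  total = 0
  keys.each { |k| total += container[k] }
  total
end

n = 100_000
list  = Array.new(n) { |i| i * 2 }   # stored values differ per index
table = {}
n.times { |i| table[i] = i * 2 }
keys = (0...n).to_a.shuffle(random: Random.new(42))   # access order set at run time

list_sum = table_sum = nil
list_time  = Benchmark.realtime { list_sum  = sum_by_index(list, keys) }
table_time = Benchmark.realtime { table_sum = sum_by_index(table, keys) }

# Printing the sums means the accumulated result is actually consumed.
puts "array: #{list_time.round(4)}s sum=#{list_sum}"
puts "hash:  #{table_time.round(4)}s sum=#{table_sum}"
```

Comparing the two sums also acts as a sanity check that both loops did the same work.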

When is it better to use an array instead of a hash in Perl?

    • Arrays are indexed by numbers.
    • Hashes are keyed by strings.
    • All indexes up to the highest index exist in an array.
    • Hashes are sparsely indexed. (e.g. "a" and "c" can exist without "b".)

There are many emergent properties. Primarily,

    • Arrays can be used to store ordered lists.
    • It would be ugly and inefficient to use hashes that way.
    • It's not possible to delete an element from an array unless it's the highest indexed element.
    • You can delete from an ordered list implemented using an array, though it is inefficient to remove elements other than the first or last.
    • It's possible to delete an element from a hash, and it's efficient.

Differentiating between arrays and hashes in Javascript

You could check the length property as SLaks suggested, but as soon as you pass it a function object you'll be surprised, because a function in fact has a length property too. Also, if the object has its own length property defined, you'll get the wrong result again.

Your best bet is probably:

function isArray(obj) {
    return Object.prototype.toString.call(obj) === "[object Array]";
}

jQuery uses it, and a "couple of" other people... :)

It is more fail proof than the instanceof way. The method is also suggested by the following article:

'instanceof' considered harmful (or how to write a robust 'isArray') (@kangax)

Another thing to add is that this function is almost identical to the Array.isArray function in the ES5 spec:

15.4.3.2 Array.isArray ( arg )

  1. If Type(arg) is not Object, return false.
  2. If the value of the [[Class]] internal property of arg is "Array", then return true.
  3. Return false.

Find difference between arrays in Ruby, where the elements are hashes

array1 - array2 works by putting the elements of array2 into a temporary hash, then returning all elements of array1 that don't appear in that hash. Elements are matched using their hash and eql? methods, which for hashes with string keys and values gives the same answer as ==.

Comparing two hashes with == gives true if all the keys and values of the hashes match using ==. So

h1 = {'ID' => '7891'}
h2 = {'ID' => '7891'}
h1 == h2

evaluates to true, even though h1 and h2 are different hashes, and the corresponding elements will be correctly removed.

The only consideration I can think of is if you always have strings everywhere in the hash keys and values. If they are sometimes integers, or symbols, like {:ID => 7891} then you aren't going to get the results you want, because :ID == 'ID' and '7891' == 7891 are both false.
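A small sketch of both cases follows. The helper stringify is hypothetical, and converting everything with to_s is just one possible workaround that assumes every key and value can sensibly be compared as a string.

```ruby
array1 = [{ "ID" => "7891" }, { "ID" => "1234" }]
array2 = [{ "ID" => "7891" }]

diff = array1 - array2        # the matching hash is removed

# The same data with a symbol key and an integer value does not match,
# so nothing is removed.
array3 = [{ ID: 7891 }]
unchanged = array1 - array3

# Hypothetical helper: turn every key and value into a string.
def stringify(hash)
  hash.to_h { |k, v| [k.to_s, v.to_s] }
end

normalized_diff = array1.map { |h| stringify(h) } - array3.map { |h| stringify(h) }
```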

What is the difference between @() and @{} in Powershell and when to use one over another?

An array is simply a list of values whereas a hashtable is a collection of key/value pairs.

Here are some examples:

Array

$i = @(1,2,3,4,5)

Hashtable

[hashtable]$i = @{ Number = 1; Shape = "Square"; Color = "Blue"}

ruby Compare values between hashes in different arrays

This will return an array of matching hashes:

res = new_array1.inject([]) { |memo, hash| memo << hash if new_array2.any? { |hash2| hash[:ID] == hash2[:ID] && hash[:index] == hash2[:index] && hash[:column] == hash2[:column] }; memo } 
# => [{:index=>4, :column=>0, :ID=>"ABC"}, {:index=>4, :column=>1, :ID=>"XYZ"}, {:index=>4, :column=>2, :ID=>"BCD-1547"}]

res.each do |hash|
  # do something
end

If an item in new_array1 has the same index, column and ID keys as any item in new_array2 it will be included.

You could also simplify this, if these are the only keys in the hashes, by using == to compare the hashes directly:

res = new_array1.inject([]) { |memo, hash| memo << hash if new_array2.any? { |hash2| hash == hash2 }; memo }

The inject method, also known by its alias reduce, takes a collection and builds a single value from it. Each time the block given to inject is called, it receives the return value of the previous block call and the next element of the collection (the first time, it receives the seed value passed to inject). This lets you build up a value step by step, much as you would with recursion.
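A minimal illustration of the seed and memo mechanics, using made-up numbers rather than the hashes from the question:

```ruby
# Sum with an explicit seed of 0: memo starts at 0, then holds each block result.
total = [1, 2, 3, 4].inject(0) { |memo, n| memo + n }

# Building an array, as in the answer above. The block must end with memo,
# because `memo << n if cond` evaluates to nil when cond is false.
evens = [1, 2, 3, 4].inject([]) { |memo, n| memo << n if n.even?; memo }

# select expresses the same filter without the manual bookkeeping.
evens_via_select = [1, 2, 3, 4].select(&:even?)
```

When the goal is only to keep the elements that pass a test, select is arguably the more direct choice. inject earns its keep when the result is some other kind of value, such as a sum.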

There are some examples of inject here: Need a simple explanation of the inject method

The any? method will return true as soon as the given block returns true for any of the given collection elements. If the block never returns true then any? returns false. So:

[0,0,0,1,0].any? { |num| num == 1 } # => true
[0,0,0,0,0].any? { |num| num == 1 } # => false

