What Is the Point of Heterogenous Arrays

What is the point of heterogeneous arrays?

As katrielalex wrote: There is no reason not to support heterogeneous lists. In fact, disallowing it would require static typing, and we're back to that old debate. But let's refrain from doing so and instead answer the "why would you use that" part...

To be honest, it is not used that much -- if we make use of the exception in your last paragraph and choose a more liberal definition of "implement the same interface" than e.g. Java or C#. Nearly all of my iterable-crunching code expects all items to implement some interface. Of course it does, otherwise it could do very little with them!
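A minimal Python sketch of that "liberal interface" idea: the items have different concrete types, but the code only relies on one shared capability. The names total_length, mixed_sized and mixed_numbers are made up for illustration.

```python
from fractions import Fraction


def total_length(items):
    """Sum the sizes of the items; each item only needs to support len()."""
    return sum(len(item) for item in items)


# Four different concrete types, one shared "interface": len().
mixed_sized = ["abc", [1, 2], (1,), {"key": "value"}]

# Three different numeric types, one shared "interface": addition.
mixed_numbers = [1, 2.5, Fraction(1, 2)]

length_total = total_length(mixed_sized)
number_total = sum(mixed_numbers)
```

Neither list would type-check as a homogeneous list in Java without a common supertype, yet every item satisfies what the consuming code actually needs.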

Don't get me wrong, there are absolutely valid use cases - there's rarely a good reason to write a whole class for containing some data (and even if you add some callables, functional programming sometimes comes to the rescue). A dict would be a more common choice though, and namedtuple is very neat as well. But they are less common than you seem to think, and they are used with thought and discipline, not for cowboy coding.

(Also, your "User as nested list" example is not a good one - since the inner lists are fixed-size, you'd better use tuples, and that makes it valid even in Haskell (the type would be [(String, Integer)]))
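As a rough illustration of the tuple/namedtuple point, here is a sketch that assumes a user record is just a name and an age. The User type and the sample values are hypothetical and not taken from the question.

```python
from collections import namedtuple

# A fixed-size record: the fields differ in type, but every record has the same shape.
User = namedtuple("User", ["name", "age"])

# The list itself is homogeneous: every element is a User.
users = [User("alice", 30), User("bob", 25)]

names = [user.name for user in users]
oldest = max(users, key=lambda user: user.age)
```

A namedtuple is still a plain tuple, so users[0] == ("alice", 30) holds and the data maps directly onto a Haskell-style list of pairs.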

Is heterogeneous array possible?

Standard containers are designed to have all the elements of the same type. There are however several techniques to give the impression of some degree of heterogeneity:

  • use a type that can contain all the types you intend to store (e.g. long long can also store long, int, short and signed char values).
  • use a union type that has one member for each of the types you intend to store
  • use a boost::any
  • use a boost::variant, which is a better alternative to the union mentioned above.

You could also consider using a polymorphic type to store objects of any of its derived types. However, this can be trickier than it appears, due to the risk of slicing.

Why do we go for objects in JavaScript when arrays themselves can hold heterogeneous types of data?

Use an array when you need to access values in a specific order or want to loop through them, and you don't care about key names. If values should be accessed by a string key, objects are the better choice.

Are BSON arrays heterogeneous or homogeneous?

It does say so in the spec.

It says a document is an int32 byte length, followed by a list of elements, terminated by a null byte:

document    ::=     int32 e_list "\x00"

A list of elements is either empty or an element followed by more elements:

e_list  ::=     element e_list
        |       ""

And elements can be of any type BSON supports:

element     ::=     "\x01" e_name double    64-bit binary floating point
| "\x02" e_name string UTF-8 string
| "\x03" e_name document Embedded document
....<snip>....

The first note at the bottom of the page explains that arrays are simply documents with magic ascending string keys.

Array - The document for an array is a normal BSON document with
integer values for the keys, starting with 0 and continuing
sequentially. For example, the array ['red', 'blue'] would be encoded
as the document {'0': 'red', '1': 'blue'}.

As such,
BSON will happily serialize {"Key1":[12, "12", 12.1, "a string", Binary(0x001232)]}
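To make the grammar concrete, here is a minimal hand-rolled encoder sketch in Python. It is not a real BSON library: it only covers double, string, embedded document, array, boolean and int32, and the function names encode_element and encode_document are made up for illustration. The type bytes and layout follow the grammar quoted above.

```python
import struct


def encode_element(name, value):
    """Encode one element: type byte, null-terminated key, then the value."""
    key = name.encode("utf-8") + b"\x00"
    if isinstance(value, bool):  # checked before int, because bool is an int subclass
        return b"\x08" + key + (b"\x01" if value else b"\x00")
    if isinstance(value, float):
        return b"\x01" + key + struct.pack("<d", value)
    if isinstance(value, str):
        data = value.encode("utf-8") + b"\x00"
        return b"\x02" + key + struct.pack("<i", len(data)) + data
    if isinstance(value, dict):
        return b"\x03" + key + encode_document(value)
    if isinstance(value, list):
        # An array is an ordinary document keyed "0", "1", "2", ...
        as_document = {str(index): item for index, item in enumerate(value)}
        return b"\x04" + key + encode_document(as_document)
    if isinstance(value, int):
        return b"\x10" + key + struct.pack("<i", value)  # int32 only in this sketch
    raise TypeError(f"type not covered by this sketch: {type(value).__name__}")


def encode_document(document):
    """document ::= int32 e_list "\\x00", where int32 is the total byte length."""
    body = b"".join(encode_element(k, v) for k, v in document.items()) + b"\x00"
    return struct.pack("<i", len(body) + 4) + body


hello = encode_document({"hello": "world"})
mixed = encode_document({"Key1": [12, "12", 12.1, "a string"]})
```

Each array item carries its own type byte, so nothing in the format forces the items of one array to share a type.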

V8: Heterogeneous Array Literals

From this blog post provided by Mathias, a V8 developer:

Common elements kinds

While running JavaScript code, V8 keeps track of what kind of elements
each array contains. This information allows V8 to optimize any
operations on the array specifically for this type of element. For
example, when you call reduce, map, or forEach on an array, V8 can
optimize those operations based on what kind of elements the array
contains.

Take this array, for example:

const array = [1, 2, 3];

What kinds of elements does it contain? If you’d ask the typeof
operator, it would tell you the array contains numbers. At the
language-level, that’s all you get: JavaScript doesn’t distinguish
between integers, floats, and doubles — they’re all just numbers.
However, at the engine level, we can make more precise distinctions.
The elements kind for this array is PACKED_SMI_ELEMENTS. In V8, the
term Smi refers to the particular format used to store small integers.
(We’ll get to the PACKED part in a minute.)

Later adding a floating-point number to the same array transitions it to a more generic elements kind:

const array = [1, 2, 3];
// elements kind: PACKED_SMI_ELEMENTS
array.push(4.56);
// elements kind: PACKED_DOUBLE_ELEMENTS

Adding a string literal to the array changes its elements kind once again.

const array = [1, 2, 3];
// elements kind: PACKED_SMI_ELEMENTS
array.push(4.56);
// elements kind: PACKED_DOUBLE_ELEMENTS
array.push('x');
// elements kind: PACKED_ELEMENTS

....

V8 assigns an elements kind to each array.
The elements kind of an array is not set in stone — it can change at runtime. In the earlier example, we transitioned from PACKED_SMI_ELEMENTS to PACKED_ELEMENTS.
Elements kind transitions can only go from specific kinds to more general kinds.

Thus, if you're constantly adding different types of data to an array at run time, V8 has to transition the array to a more general elements kind behind the scenes, and it loses the optimizations that the more specific kind allowed.

As for constructor vs. array literal:

If you don’t know all the values ahead of time, create an array using the array literal, and later push the values to it:

const arr = [];
arr.push(10);

This approach ensures that the array never transitions to a holey elements kind. As a result, V8 can optimize any future operations on the array more efficiently.

Also, to clarify what is meant by holey,

Creating holes in the array (i.e. making the array sparse) downgrades
the elements kind to its “holey” variant. Once the array is marked as holey, it’s holey forever — even if it’s packed later!

It might also be worth mentioning that V8 currently has 21 different element kinds.

More resources

  • V8 Internals for JavaScript Developers - a talk by Mathias Bynens
  • JavaScript Engines - How Do They Even? - a talk by Franziska Hinkelmann

Dealing with (members of) heterogeneous arrays

This is the way it has to be because PowerShell uses pipelines. When you run e.g. $array1 | Export-CSV ..., PowerShell starts writing to the CSV file as soon as the first object arrives. At that point it needs to know what the header will look like, as that is the first line in a CSV file. So PowerShell has to assume that the class/properties of the first object represent all the remaining objects in the pipeline. The same goes for Format-Table and similar commands that need to set a style/view before outputting any objects.

The usual workaround is to specify the header manually using Select-Object. It adds all missing properties to all objects with a value of $null. This way, all the objects sent to e.g. Export-CSV will have the same properties defined.

To get the header, you need to collect all unique property names from all objects in your array, e.g.:

$array1 |
ForEach-Object { $_.PSObject.Properties} |
Select-Object -ExpandProperty Name -Unique

Title
Price
Author

Then you can specify that as the header using Select-Object -Property Title,Price,Author before sending the objects to Export-CSV, e.g.:

$a = New-Object -TypeName PSObject
$a | Add-Member -MemberType NoteProperty -Name Title -Value "Journey to the West"
$a | Add-Member -MemberType NoteProperty -Name Price -Value 12

$b = New-Object -TypeName PSObject
$b | Add-Member -MemberType NoteProperty -Name Title -Value "Faust"
$b | Add-Member -MemberType NoteProperty -Name Author -Value "Goethe"

$array = $a,$b

$AllProperties = $array |
ForEach-Object { $_.PSObject.Properties} |
Select-Object -ExpandProperty Name -Unique

$array | Select-Object -Property $AllProperties | Export-CSV -Path "mycsv.out" -NoTypeInformation

This will create this CSV-file:

"Title","Price","Author"
"Journey to the West","12",
"Faust",,"Goethe"

If you have multiple arrays, you can combine them like this: $array = $array1 + $array2
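The "collect the union of property names first, then write every row against that header" idea is not specific to PowerShell. Here is a rough Python analogue using the standard csv module. The helper names union_of_keys and to_csv are made up for illustration, and the records are plain dicts rather than PSObjects.

```python
import csv
import io


def union_of_keys(records):
    """Collect every key that appears in any record, in first-seen order."""
    header = []
    for record in records:
        for key in record:
            if key not in header:
                header.append(key)
    return header


def to_csv(records):
    """Write all records against the combined header; missing fields stay empty."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=union_of_keys(records),
        restval="",            # value used for keys a record does not have
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


books = [
    {"Title": "Journey to the West", "Price": 12},
    {"Title": "Faust", "Author": "Goethe"},
]
csv_text = to_csv(books)
```

Unlike Export-CSV, the csv module only quotes fields when necessary, so the cell contents match the file above but the quoting differs.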

Managing different objects in an heterogeneous Java array

Why don't you add an enum FigureType to your base class that identifies the child class?

public static enum FigureType {
    Square,
    Circle
}

public static class Figure {
    private FigureType type;

    public Figure(FigureType type) {
        this.type = type;
    }

    public FigureType getType() {
        return type;
    }

    public void draw() {
    }

    public String getColor() {
        return null;
    }
}

You would have to add a default constructor to each child class that calls the parent class constructor with its FigureType.

public static class Square extends Figure {

    public Square() {
        super(FigureType.Square);
    }

    @Override
    public void draw() {
        System.out.println("Square");
    }
}

public static class Circle extends Figure {

    public Circle() {
        super(FigureType.Circle);
    }

    @Override
    public void draw() {
        System.out.println("Circle");
    }

    public float getRadius() {
        return 8;
    }
}

Usage:

public static void main(String[] args) {

    Figure[] figures = new Figure[3];
    figures[0] = new Circle();
    figures[1] = new Circle();
    figures[2] = new Square();

    for (Figure figure : figures) {
        figure.getColor();
        figure.draw();
        if (figure.getType() == FigureType.Circle) {
            ((Circle) figure).getRadius();
        }
    }
}

Results:

Circle
Circle
Square

No exception

Use heterogeneous arrays to store different child classes?

Make a container of base class pointers:

std::vector<std::unique_ptr<Color>> colors;

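and insert heap-allocated derived objects (std::make_unique needs C++14; unlike passing a raw new to emplace_back, it cannot leak if the vector fails to grow):

colors.push_back(std::make_unique<Black>("John", "Black", 10, 15));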

How do Swift's heterogenous value type arrays work?

I was curious about the same thing, although I did not have enough time to get completely to the bottom of it. Still, I think I have an approximation worth posting here as an answer.

Firstly, there is this article from Jason Bell, which provides some hints at how it all works behind the scenes (not only for Swift but also for Objective-C and other languages).

Secondly, if I take this simple program:

protocol Foo { }

struct Bar: Foo { }

var fooArray = [Foo]()

fooArray.append(Bar())
fooArray.append(Bar())
fooArray.append(Bar())

let arrayElement = fooArray[0]

print(arrayElement)

... and compile it into LLVM IR by doing swiftc -emit-ir unveil.swift > unveil.ir then I can fish out the following IR code that corresponds to a simple fooArray.append(Bar()):

%15 = getelementptr inbounds %P6unveil3Foo_* %3, i32 0, i32 1
store %swift.type* bitcast (i64* getelementptr inbounds ({ i8**, i64, { i64, i8*, i32, i32, i8*, %swift.type** (%swift.type*)*, %swift.type_pattern*, i32, i32, i32 }*, %swift.type* }* @_TMfV6unveil3Bar, i32 0, i32 1) to %swift.type*), %swift.type** %15, align 8
%16 = getelementptr inbounds %P6unveil3Foo_* %3, i32 0, i32 2
store i8** getelementptr inbounds ([0 x i8*]* @_TWPV6unveil3BarS_3FooS_, i32 0, i32 0), i8*** %16, align 8
%17 = getelementptr inbounds %P6unveil3Foo_* %3, i32 0, i32 0
call void @_TFV6unveil3BarCfMS0_FT_S0_()
%18 = bitcast %P6unveil3Foo_* %3 to %swift.opaque*
call void @_TFSa6appendurfRGSaq__Fq_T_(%swift.opaque* noalias nocapture %18, %swift.type* %14, %Sa* nocapture dereferenceable(8) @_Tv6unveil8fooArrayGSaPS_3Foo__)

Here you can find the LLVM IR syntax, but to me the above means that Swift arrays are really arrays of pointers.

Also, similarly to IR, I can get to the assembly for the same Swift line, which is:

leaq __TWPV6unveil3BarS_3FooS_(%rip), %rax
leaq __TMfV6unveil3Bar(%rip), %rcx
addq $8, %rcx
movq %rcx, -56(%rbp)
movq %rax, -48(%rbp)
callq __TFV6unveil3BarCfMS0_FT_S0_
leaq __Tv6unveil8fooArrayGSaPS_3Foo__(%rip), %rdx
leaq -80(%rbp), %rax
movq %rax, %rdi
movq -160(%rbp), %rsi
callq __TFSa6appendurfRGSaq__Fq_T_

... again, the above manipulates pointers, so that supports the theory.

And finally, there are SIL headers SILWitnessTable.h and SILWitnessVisitor.h from swift.org to be found at swift/include/swift/SIL/ that suggest the same.

Actually, I guess (and I hope that someone who really knows what they're talking about will weigh in here) that value types (e.g. structs) and reference types (i.e. classes) are not so different under the hood in Swift. Probably the main difference is whether copy-on-write is enforced or not.


