Why Is It Considered Bad to Expose List<T>

Why is it considered bad to expose List<T>?

I agree with moose-in-the-jungle here: List<T> is an unconstrained, bloated object that has a lot of "baggage" in it.

Fortunately the solution is simple: expose IList<T> instead.

It exposes a barebones interface that has almost all of List<T>'s methods (with the exception of things like AddRange()), and it doesn't constrain you to the specific List<T> type, which allows your API consumers to use their own custom implementations of IList<T>.

For even more flexibility, consider exposing some collections as IEnumerable<T>, when appropriate.
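As a rough sketch of what this buys the caller, the helper below takes an IList<T> rather than a List<T>. The names ListHelpers and SwapFirstAndLast are made up for illustration and do not come from any library.

```csharp
using System.Collections.Generic;

// Hypothetical helper, for illustration only.
public static class ListHelpers
{
    // The parameter is IList<T>, so callers are not tied to List<T>:
    // a List<T>, a T[] or a custom IList<T> implementation all work.
    public static void SwapFirstAndLast<T>(IList<T> items)
    {
        if (items.Count < 2) return;   // nothing to swap

        T first = items[0];
        items[0] = items[items.Count - 1];
        items[items.Count - 1] = first;
    }
}
```

Both `ListHelpers.SwapFirstAndLast(new List<int> { 1, 2, 3 })` and `ListHelpers.SwapFirstAndLast(new[] { 1, 2, 3 })` compile, because a one-dimensional array also implements IList<T>.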

Is a public readonly List<T> bad design?

Generally speaking, you should always use the most general type, to reduce tight coupling and to expose only those members you actually need. That said, in some situations it might be better to use an ICollection<T> instead, which provides access to basic methods such as Add, Remove and Clear.

However, making the collection readonly or, even better, exposing it through a get-only property is probably a good idea, and nothing can be said against it.

List<T> or IList<T>

If you are exposing your class through a library that others will use, you generally want to expose it via interfaces rather than concrete implementations. This will help if you decide to change the implementation of your class later to use a different concrete class. In that case the users of your library won't need to update their code since the interface doesn't change.

If you are just using it internally, you may not care so much, and using List<T> may be ok.

Why not inherit from List<T>?

There are some good answers here. I would add to them the following points.

What is the correct C# way of representing a data structure, which, "logically" (that is to say, "to the human mind") is just a list of things with a few bells and whistles?

Ask any ten non-computer-programmer people who are familiar with the existence of football to fill in the blank:

A football team is a particular kind of _____

Did anyone say "list of football players with a few bells and whistles", or did they all say "sports team" or "club" or "organization"? Your notion that a football team is a particular kind of list of players is in your human mind and your human mind alone.

List<T> is a mechanism. Football team is a business object -- that is, an object that represents some concept that is in the business domain of the program. Don't mix those! A football team is a kind of team; it has a roster, a roster is a list of players. A roster is not a particular kind of list of players. A roster is a list of players. So make a property called Roster that is a List<Player>. And make it a read-only list (for example an IReadOnlyList<Player>) while you're at it, unless you believe that everyone who knows about a football team gets to delete players from the roster.
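A minimal sketch of that "has a roster" design might look like the following. The type and member names (Player, FootballTeam, Sign, Release) are invented for illustration, and IReadOnlyList<Player> is one possible reading of the read-only roster, assuming .NET 4.5 or later.

```csharp
using System.Collections.Generic;

// Hypothetical domain types, for illustration only.
public class Player
{
    public Player(string name) { Name = name; }
    public string Name { get; private set; }
}

public class FootballTeam
{
    // The mechanism (List<T>) is a private implementation detail.
    private readonly List<Player> _roster = new List<Player>();

    public FootballTeam(string name) { Name = name; }

    public string Name { get; private set; }

    // The team *has* a roster; it is not itself a list.
    // AsReadOnly() wraps the list, so callers cannot add or remove players.
    public IReadOnlyList<Player> Roster
    {
        get { return _roster.AsReadOnly(); }
    }

    // Forwarding methods: the team decides who joins or leaves.
    public void Sign(Player player) { _roster.Add(player); }
    public bool Release(Player player) { return _roster.Remove(player); }
}
```

The two forwarding methods are all the "extra" code that composition costs here, and they give the team a place to enforce its own rules later.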

Is inheriting from List<T> always unacceptable?

Unacceptable to whom? Me? No.

When is it acceptable?

When you're building a mechanism that extends the List<T> mechanism.

What must a programmer consider, when deciding whether to inherit from List<T> or not?

Am I building a mechanism or a business object?

But that's a lot of code! What do I get for all that work?

You spent more time typing up your question than it would have taken you to write forwarding methods for the relevant members of List<T> fifty times over. You're clearly not afraid of verbosity, and we are talking about a very small amount of code here; this is a few minutes' work.

UPDATE

I gave it some more thought and there is another reason to not model a football team as a list of players. In fact it might be a bad idea to model a football team as having a list of players too. The problem with a team as/having a list of players is that what you've got is a snapshot of the team at a moment in time. I don't know what your business case is for this class, but if I had a class that represented a football team I would want to ask it questions like "how many Seahawks players missed games due to injury between 2003 and 2013?" or "What Denver player who previously played for another team had the largest year-over-year increase in yards run?" or "Did the Piggers go all the way this year?"

That is, a football team seems to me to be well modeled as a collection of historical facts such as when a player was recruited, injured, retired, etc. Obviously the current player roster is an important fact that should probably be front-and-center, but there may be other interesting things you want to do with this object that require a more historical perspective.

Collection<T> versus List<T>: what should you use on your interfaces?

To answer the "why" part of the question (why not List<T>?), the reasons are future-proofing and API simplicity.

Future-proofing

List<T> is not designed to be easily extensible by subclassing it; it is designed to be fast for internal implementations. You'll notice the methods on it are not virtual and so cannot be overridden, and there are no hooks into its Add/Insert/Remove operations.

This means that if you need to alter the behavior of the collection in the future (e.g. to reject null objects that people try to add, or to perform additional work when this happens such as updating your class state) then you need to change the type of collection you return to one you can subclass, which will be a breaking interface change (of course changing the semantics of things like not allowing null may also be an interface change, but things like updating your internal class state would not be).

So by returning either a class that can be easily subclassed such as Collection<T> or an interface such as IList<T>, ICollection<T> or IEnumerable<T> you can change your internal implementation to be a different collection type to meet your needs, without breaking the code of consumers because it can still be returned as the type they are expecting.
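To illustrate the hooks that Collection<T> offers and List<T> lacks, here is a small sketch that rejects null items. The class name NonNullCollection is hypothetical; InsertItem and SetItem are the protected virtual methods that Collection<T> provides for this purpose.

```csharp
using System;
using System.Collections.ObjectModel;

// Hypothetical collection, for illustration only.
public class NonNullCollection<T> : Collection<T> where T : class
{
    // Called by Add and Insert.
    protected override void InsertItem(int index, T item)
    {
        if (item == null) throw new ArgumentNullException("item");
        base.InsertItem(index, item);
    }

    // Called by the indexer setter.
    protected override void SetItem(int index, T item)
    {
        if (item == null) throw new ArgumentNullException("item");
        base.SetItem(index, item);
    }
}
```

A property declared as Collection<T> can start out returning a plain Collection<T> and later return a subclass like this one without changing its declared type.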

API Simplicity

List<T> contains a lot of useful operations such as BinarySearch, Sort and so on. However if this is a collection you are exposing then it is likely that you control the semantics of the list, and not the consumers. So while your class internally may need these operations it is very unlikely that consumers of your class would want to (or even should) call them.

As such, by offering a simpler collection class or interface, you reduce the number of members that users of your API see, and make it easier for them to use.

C# List<T>.ToArray performance is bad?

No, that's not true. Performance is good, since all it does is memory-copy all elements (*) to form a new array.

Of course it depends on what you define as "good" or "bad" performance.

(*) references for reference types, values for value types.

EDIT

In response to your comment, using Reflector is a good way to check the implementation (see below). Or just think for a couple of minutes about how you would implement it, and take it on trust that Microsoft's engineers won't come up with a worse solution.

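public T[] ToArray()
{
    // Allocate an array sized to the number of elements actually stored.
    T[] destinationArray = new T[this._size];
    // Copy the first _size elements of the internal buffer in one block.
    Array.Copy(this._items, 0, destinationArray, 0, this._size);
    return destinationArray;
}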

Of course, "good" or "bad" performance only has a meaning relative to some alternative. If in your specific case, there is an alternative technique to achieve your goal that is measurably faster, then you can consider performance to be "bad". If there is no such alternative, then performance is "good" (or "good enough").

EDIT 2

In response to the comment: "No re-construction of objects?" :

No reconstruction for reference types. For value types the values are copied, which could loosely be described as reconstruction.
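The difference can be checked with a short sketch. The Box class and the ToArrayDemo methods below are hypothetical names used only for this demonstration.

```csharp
using System.Collections.Generic;

// Hypothetical types, for illustration only.
public class Box
{
    public int Value;
}

public static class ToArrayDemo
{
    // Reference types: the array holds the same objects as the list.
    public static bool SharesReferences()
    {
        var list = new List<Box> { new Box { Value = 1 } };
        Box[] copy = list.ToArray();
        copy[0].Value = 42;                       // mutate through the array
        return list[0].Value == 42                // the change is visible via the list
            && object.ReferenceEquals(list[0], copy[0]);
    }

    // Value types: the array holds its own copies of the values.
    public static bool CopiesValues()
    {
        var list = new List<int> { 1 };
        int[] copy = list.ToArray();
        copy[0] = 42;                             // changes only the array's copy
        return list[0] == 1;
    }
}
```

Both methods return true: no new Box is constructed by ToArray, while the int is copied by value.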

Data Access Layer: Exposing List: bad idea?

Usually it's best to expose the least powerful interface that the user can still meaningfully work with. If the user just needs some enumerable data, return IEnumerable<User>. If that's not enough because the user needs to be able to modify the list (attention! shouldn't often be the case), return an IList<User>.

/EDIT:

Joel asks a valid question in his comment: Why indeed expose the least powerful interface instead of granting the user maximum power? (paraphrased)

The idea behind this is that the method returning the data might not expect the user to modify its content: another method of the class might still expect the list to be non-empty after a reference to it was returned. Imagine the user removes all data from the list. The other method now has to make an additional check that might otherwise have been unnecessary.

More importantly, this exposes parts of the internal implementation through the return type. If I need to change the implementation in the future so that it no longer uses an IList container, I have a problem: I either need to change the method contract, introducing a build-breaking change, or I need to copy the data into a list container.

As an example, imagine that an efficient implementation uses a Dictionary and just returns the Values collection which doesn't implement IList.
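A sketch of that scenario, assuming a hypothetical User type and UserRepository class (neither comes from the original question):

```csharp
using System.Collections.Generic;

// Hypothetical types, for illustration only.
public class User
{
    public User(int id, string name) { Id = id; Name = name; }
    public int Id { get; private set; }
    public string Name { get; private set; }
}

public class UserRepository
{
    // Internal storage is keyed by id for fast lookup.
    private readonly Dictionary<int, User> _usersById = new Dictionary<int, User>();

    public void Add(User user) { _usersById[user.Id] = user; }

    // Dictionary<TKey, TValue>.ValueCollection does not implement IList<User>,
    // but it does implement IEnumerable<User>, so it can be returned
    // directly without copying into a list first.
    public IEnumerable<User> GetUsers()
    {
        return _usersById.Values;
    }
}
```

Had GetUsers been declared to return IList<User>, this implementation would need something like `new List<User>(_usersById.Values)` on every call.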

Array versus List<T>: When to use which?

It is rare, in reality, that you would want to use an array. Definitely use a List<T> any time you want to add/remove data, since resizing arrays is expensive. If you know the data is fixed length, and you want to micro-optimise for some very specific reason (after benchmarking), then an array may be useful.

List<T> offers a lot more functionality than an array (although LINQ evens it up a bit), and is almost always the right choice. Except for params arguments, of course. ;-p

As a counter - List<T> is one-dimensional, whereas you can have rectangular (etc) arrays like int[,] or string[,,] - but there are other ways of modelling such data (if you need) in an object model.

See also:

  • How/When to abandon the use of Arrays in c#.net?
  • Arrays, What's the point?

That said, I make a lot of use of arrays in my protobuf-net project; entirely for performance:

  • it does a lot of bit-shifting, so a byte[] is pretty much essential for encoding;
  • I use a local rolling byte[] buffer which I fill before sending down to the underlying stream (and v.v.); quicker than BufferedStream etc;
  • it internally uses an array-based model of objects (Foo[] rather than List<Foo>), since the size is fixed once built, and needs to be very fast.

But this is definitely an exception; for general line-of-business processing, a List<T> wins every time.

Is Optional<List> bad practice?

I think the point is moot. There are two possible cases:

1. The caller needs to be aware that default values have been returned

In this case, the caller will not be able to use the orElse()/orElseGet() construct, and will have to check with isPresent(). This is no better than checking whether the list is empty.

2. The caller does not need to be aware that default values have been returned

In which case you might as well hide the implementation details behind a single List getValues() method that returns the default values in case no values were found.


As to the general applicability of using Optional<List>, I think the Brian Goetz quote from this answer says it best:

Our intention was to provide a limited mechanism for library method
return types where there needed to be a clear way to represent "no
result", and using null for such was overwhelmingly likely to cause
errors.

When it comes to lists (and collections in general), there is already a clear way to represent "no result", and that is an empty collection.

OK to return an internal List<T> as an IEnumerable<T> or ICollection<T>?

Not only is there nothing wrong with it, it's actually good practice: expose only what is strictly necessary. That way, the caller can't rely on the fact that the method will return a List<T>, so if for some reason you need to change the implementation to return something else, you won't break your contract. However, the calling code might break if it (incorrectly) made assumptions about what the method actually returns.
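The last point can be sketched as follows. OrderService and Callers are hypothetical names, and the "version 1 / version 2" history is an assumption made up for the example.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical class, for illustration only.
public class OrderService
{
    private readonly List<int> _orderIds = new List<int> { 3, 1, 2 };

    // Version 1 returned the list itself:
    //     return _orderIds;
    // Version 2 returns a lazily sorted sequence instead. The declared
    // return type is unchanged, so the contract still holds.
    public IEnumerable<int> GetOrderIds()
    {
        return _orderIds.OrderBy(id => id);
    }
}

public static class Callers
{
    // Relies only on the contract: works with both versions.
    public static int CountOrders(OrderService service)
    {
        return service.GetOrderIds().Count();
    }

    // Assumes the result is really a List<int>: worked with version 1,
    // throws InvalidCastException with version 2.
    public static int CountOrdersByCasting(OrderService service)
    {
        return ((List<int>)service.GetOrderIds()).Count;
    }
}
```

The first caller keeps working after the change; the second one compiled against an assumption the method never promised.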


