Sunday, July 17, 2005

Collections in .NET

The .NET Framework supports collections of many types, and it often is not obvious which to use when. Here are some referenences on the topic, and some observations of my own.

The main references in the MSDN Library are:

System.Collections and System.Collections.Specialized which describe the standard classes.
But one also needs to understand the interfaces associated with them. They are shown in their respective namespace hierarchies at Collections Hierarchy and Specialized Hierarchy. Collections holds a few general classes and Specialized holds some others which seem to be in two categories - two standard collections which only hold strings, and several to serve as base classes for implementing your collections (presumably type-restricted).

The important classes to hold sets of objects are (C:Collection) (SC: SpecializedCollection) :

  • System.Array (not actually in Collections)
  • C.ArrayList -
  • C.Hashtable - (which is an associative-array or dictionary)
  • C.SortedList - doubly indexed set of items; versatile, but slow
  • C.Queue - simple queue
  • C.Stack - simple stack
  • SC.StringCollection
  • SC.StringDictionary
  • SC.ListDictionary
  • SC.HybridDictionary
The most important interfaces are:

  • ICollection
  • IDictionary
  • IList
  • IEnumerable & IEnumerator
  • ICloneable
  • IComparer

I have been trying to use these collections for a long time, and find it confusing to figure the best one to use in a given situation. I now realize that one reason is that their naming has been remarkably infelicitous, particularly when referencing the underlying interfaces. Here are some examples:

  1. SortedList does NOT implement IList
  2. StringDictionary does NOT implement IDictionary
  3. Hashtable and Dictionary distinctions are unclear. Why not use one name if the same? Mitchell says a dictionary is an abstract data structure, and that it can be implemented in different ways, eg HashTable, LinkedList (called ListDictionary), or the HybridDictionary. Then why not call the Hashtable a HashDictionary? Ah well.

Scott Mitchell wrote a six part Extensive Examination of Data Structures for MSDN online on which spends a lot of time discussing performance isstues. He also has two more introductory articles at his web site. Using Collections, part of a chapter from one of their books, discusses use of the five main classes in Collections, and Specialized Collections discusses briefly the specialized string containers. Some observations to note:

[fill in refs here]

There are two sides to using collections. One is how to load them and the other is how to access them. I am interested in flexible/reusable code, so am looking at using the interfaces rather than the specific classes in routines I write. On the access side it looks like IEnumerable is implemented by all the classes I have listed above (ICollection inherits from it). So if you just want to load all elements into some kind of display list that is fine. Random access (indexed ordinally or by key) is important for larger collections.

When building the collection, I often do not know the size so ones which support the Add method (IList) are very useful. Then you can copy the contents to an array for certain situations where it provides better access.

In addition there is the issue of implementing collections of objects for your own container classes.

Several folks have posted specialized classes of their own to fill needs unmet in the standard framework. Some examples:

Peter Blum has published an AutoSortArrayList Class (ref from Scott Mitchell)

It turns out that there are a few techniques available in C# which are not available in VB.NET. They are demonstrated in the C# Tutorials which show the use of Indexers and Indexed Properties along with some other techniques such as ensuring type-safety for collection enumerators in different ways. The tutorials offer two warnings without explanation:

  • Although indexers are a powerful feature, it is important to use them only when the array-like abstraction makes sense. Always carefully consider whether using regular method(s) would be just as clear.
  • Note Use this technique sparingly! Only use this pattern if the abstraction provided by using array indexing operations significantly clarifies code that uses your class, and if the indexers have both Get and Set accessors.
It appears that the Indexed Properties technique is a bit of a kludge to implement something that is easy in C++ and even VB&VB.NET. I have to experiment with them.

more to come ...


Blogger Allan Wolff said...

I stumbled upon some more resources on Collections in .NET.

Fons Sonnemans posted a table on the SpecializedCollections here.

He also pointed to an article by Jan Tielens: Collections in .NET, Round One

11:07 AM  

Post a Comment

<< Home