Ben Bowen's Blog [Xenoprimate]

Postmortems - Tale of Two Casts 

Guess the Outcome 

About a year ago I came across a bug that took me a while to get to the bottom of. Look at the following code and see if you can predict the output:

static void Main(string[] args) {
	const int ELEMENT_COUNT = 30;
	uint[] uintArray = new uint[ELEMENT_COUNT];
	List<uint> uintList = new List<uint>();

	for (int i = 0; i < ELEMENT_COUNT; ++i) {
		uintArray[i] = (uint)i;
		uintList.Add((uint)i);
	}

	var intEnumerable = uintArray.Cast<int>();
	Console.WriteLine(intEnumerable.Count());
	intEnumerable = uintList.Cast<int>();
	Console.WriteLine(intEnumerable.Count());
}
Here's the answer: the program throws an exception after printing 30 to the console once. If you predicted that, congratulations, because I sure didn't! My code was a little more abstracted than this: I was taking an IEnumerable<uint> and calling .Cast<int>() on it before performing more logic (if I recall correctly, something to do with vertex buffer indices). Nonetheless, in essence, I had something that boiled down to the code above.

LINQ Casts 

First, we have to ask ourselves what outcome we expect from the code above. One tempting answer is two '30's printed to the console. However, this would be wrong.

LINQ's Cast() applies only to one specific kind of typecast, in the pedantic sense of the word. C# uses the term "casting" somewhat loosely to refer to three different kinds of type conversion (conversions, coercions, and casts):

int i = 3;
uint u = (uint) i; // Conversion - When one type is deliberately and explicitly converted to another
long l = i; // Coercion - When one type is implicitly coerced to another

object o = "My string";
string s = (string) o; // Cast - When an object's type is narrowed or widened along its inheritance hierarchy, or reinterpreted to/from an interface
Because C# programmers are used to calling anything that looks like Type a = (Type) b; a cast, they may not often realise that there is a large distinction to be made between the different kinds. In general, an actual typecast (such as string s = (string) myObject;) involves checking that the source object can be cast to the target type and then simply using the same reference with more (or sometimes fewer) members safely available. However, a type conversion (such as int i = (int) myFloat;) often involves reading and interpreting the source value, and converting its binary representation to fit the target type. Under the hood, these are clearly very different operations.
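To make the difference concrete, here is a minimal sketch. The class and method names are hypothetical and exist only for illustration. It shows that a typecast hands back the very same object, while a conversion computes a new value with a different representation:

```csharp
using System;

static class CastVersusConversion {
	// A typecast: the result is the same reference, just viewed as a more specific type.
	public static bool CastPreservesIdentity() {
		object o = "My string";
		string s = (string) o;
		return object.ReferenceEquals(o, s);
	}

	// A conversion: a new int value is computed and the fractional part is discarded.
	public static int ConvertFloat(float f) {
		return (int) f;
	}

	// A conversion: the 32 bits of the int are reinterpreted as a uint, with no overflow check.
	public static uint ConvertNegative(int i) {
		return unchecked((uint) i);
	}

	static void Main() {
		Console.WriteLine(CastPreservesIdentity()); // True
		Console.WriteLine(ConvertFloat(3.7f));      // 3
		Console.WriteLine(ConvertNegative(-1));     // 4294967295
	}
}
```

The explicit unchecked is only there so the result does not depend on whether the project is compiled with overflow checking enabled.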

In fact, it may interest you to know that Visual Basic makes this distinction more obvious with two different conversion functions: CType (for actually converting types) and DirectCast (for performing typecasts).

Getting back on track, then: LINQ's Cast() method is meant to perform only an actual typecast, and not a coercion or conversion. That explains our eventual exception: uints cannot be typecast to ints, only converted. Weirdly enough, though, we still saw one '30' on the console before the application crashed. What gives?

Implementation Details 

The devil's in the details, so let's take a look at how Cast<T>() is actually implemented:

	public static IEnumerable<TResult> Cast<TResult>(this IEnumerable source) {
		IEnumerable<TResult> typedSource = source as IEnumerable<TResult>;
		if (typedSource != null) return typedSource;
		if (source == null) throw Error.ArgumentNull("source");
		return CastIterator<TResult>(source);
	}

	static IEnumerable<TResult> CastIterator<TResult>(IEnumerable source) {
		foreach (object obj in source) yield return (TResult)obj;
	}
As a performance optimisation, the first thing Cast() does is check whether the source can be directly typecast to IEnumerable<TResult>. If it can, great: it simply returns that same reference. Otherwise, the implementation checks that source is not null and then returns an iterator-generated sequence that attempts to typecast each element as it is enumerated. Because the iterator is lazy, an invalid element cast only throws once something (such as Count()) actually enumerates the sequence.
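The two behaviours at play in the iterator path can be sketched as follows. This is an illustrative example with hypothetical names, not part of the original bug. It assumes the iterator path behaves as in the implementation shown above: each element arrives boxed as an object, and unboxing only succeeds to the exact value type that was boxed. The sketch enumerates with foreach rather than Count(), so it does not depend on how a given runtime implements Count() for the returned sequence:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class DeferredCastDemo {
	// Unboxing a boxed uint as an int is not a valid typecast, so this throws.
	public static bool UnboxUintAsIntThrows() {
		object boxed = 5u; // a boxed System.UInt32
		try {
			_ = (int) boxed; // the same kind of cast as (TResult)obj in CastIterator
			return false;
		}
		catch (InvalidCastException) {
			return true;
		}
	}

	// Reports whether the failure surfaces on the Cast() call itself or on enumeration.
	public static string WhereDoesItFail() {
		List<uint> uintList = new List<uint> { 1, 2, 3 };
		IEnumerable<int> ints;
		try {
			ints = uintList.Cast<int>(); // only builds the lazy sequence
		}
		catch (InvalidCastException) {
			return "on Cast()";
		}

		try {
			foreach (int i in ints) { } // forces each element to be typecast
			return "never";
		}
		catch (InvalidCastException) {
			return "on enumeration";
		}
	}

	static void Main() {
		Console.WriteLine(UnboxUintAsIntThrows()); // True
		Console.WriteLine(WhereDoesItFail());      // on enumeration
	}
}
```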

Now, as you will probably know, the following code will fail to compile:

var a = (int[]) uintArray;
var b = (List<int>) uintList;
Both lines report an error akin to "Cannot convert type 'uint[]' to 'int[]'"/"Cannot convert type 'List<uint>' to 'List<int>'". C# refuses to cast a collection of one element type to a collection of another when the language defines no conversion between the two collection types. However, the CLR itself has different ideas.

We can rewrite the code to something like this:

var a = (int[]) (object) uintArray;
var b = (List<int>) (object) uintList;
This code attempts to perform the exact same operation, but by routing each cast through an intermediate cast to object, we've deliberately subverted the C# compiler. Now the CLR must make, at runtime, the check that the compiler was doing for us before. Usually this is a bad thing: we should always prefer to have as much type safety checking at compile time as possible. However, the CLR's rules about which types can be cast to one another are subtly different to C#'s.

The following is a response from Eric Lippert (a C# compiler developer) to Jon Skeet when asked about this very subject:

First source of confusion: in C# we have conflated two completely
different operations as 'cast' operations. The two operations that we
have conflated are what the CLR calls casts and coercions.

    8.3.2 Coercion

    Sometimes it is desirable to take a value of a type that is not    
    assignment-compatible with a location, and convert the value to a
    type that is assignment-compatible. This is accomplished through
    coercion of the value.
 
    Coercion takes a value of a particular type and a desired type and
    attempts to create a value of the desired type that has equivalent
    meaning to the original value. Coercion can result in
    representation changes as well as type changes; hence coercion does
    not necessarily preserve the identity of two objects.
 
    There are two kinds of coercion: widening, which never loses  
    information, and narrowing, in which information might be lost. An
    example of a widening coercion would be coercing a value that is a
    32-bit signed integer to a value that is a 64-bit signed integer.
    An example of a narrowing coercion is the reverse: coercing a 64-
    bit signed integer to a 32-bit signed integer. Programming
    languages often implement widening coercions as implicit
    conversions, whereas narrowing coercions usually require an
    explicit conversion.

    Some widening coercion is built directly into the VES operations on
    the built-in types (see §12.1). All other coercion shall be
    explicitly requested. For the built-in types, the CTS provides
    operations to perform widening coercions with no runtime checks and
    narrowing coercions with runtime checks.

    8.3.3 Casting

    Since a value can be of more than one type, a use of the value
    needs to clearly identify which of its types is being used. Since  
    values are read from locations that are typed, the type of the
    value which is used is the type of the location from which the
    value was read. If a different type is to be used, the value is
    cast to one of its other types. Casting is usually a compile time
    operation, but if the compiler cannot statically know that the
    value is of the target type, a runtime cast check is done. Unlike  
    coercion, a cast never changes the actual type of an object nor
    does it change the representation. Casting preserves the identity
    of objects.

    For example, a runtime check might be needed when casting a value
    read from a location that is typed as holding a value of a
    particular interface. Since an interface is an incomplete
    description of the value, casting that value to be of a different
    interface type will usually result in a runtime cast check.

We conflate these two things in C#, using the same operator syntax and
terminology for both casts and coercions.
 
So now it should be clear that there is no 'cast' from int to float in
the CLR. That's a coercion, not a cast.

Second source of confusion: inconsistency in the CLR spec.

The CLR spec says in section 8.7

    Signed and unsigned integral primitive types can be assigned to
    each other; e.g., int8 := uint8 is valid. For this purpose, bool
    shall be considered compatible with uint8 and vice versa, which
    makes bool := uint8 valid, and vice versa. This is also true for
    arrays of signed and unsigned integral primitive types of the same
    size; e.g., int32[] := uint32[] is valid.
Source: microsoft.public.dotnet.languages.csharp

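As we can see in the bottom paragraph, the CLR considers arrays of signed and unsigned integral primitives of the same size (e.g. int and uint) to be assignment-compatible with each other. Therefore, when the CLR comes to run the statement (int[]) (object) uintArray, it treats this as a valid cast from uint[] to int[] (although C# does not), and the cast succeeds at runtime! The same rule is what lets the source as IEnumerable<TResult> check inside Cast() succeed for our uint[], so Cast<int>() hands back the original array and Count() happily reports 30. No such rule exists for List<uint>, so that call falls through to CastIterator(), which throws as soon as it tries to typecast the first boxed uint to an int.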
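The asymmetry between the array and the list can be checked directly. The following is a small sketch with hypothetical names, assuming a runtime that follows the CLR rule quoted above (Microsoft's .NET runtimes do, as far as I'm aware). Note that each check goes through object, so the C# compiler cannot reject it or fold it away at compile time:

```csharp
using System;
using System.Collections.Generic;

static class RuntimeArrayCastDemo {
	// The runtime cast succeeds and yields the very same array object.
	public static bool ArrayCastSharesReference() {
		uint[] uintArray = { 1, 2, 3 };
		int[] intArray = (int[]) (object) uintArray;
		return object.ReferenceEquals(uintArray, intArray);
	}

	// This mirrors the check that Cast<int>() performs on its source.
	public static bool ArrayIsIntEnumerable() {
		object source = new uint[] { 1, 2, 3 };
		return source is IEnumerable<int>;
	}

	// List<uint> gets no such special treatment.
	public static bool ListIsIntEnumerable() {
		object source = new List<uint> { 1, 2, 3 };
		return source is IEnumerable<int>;
	}

	// Nothing is converted: the same 32 bits are simply read back as a signed int.
	public static int ReadMaxValueThroughIntView() {
		uint[] uintArray = { uint.MaxValue };
		int[] intArray = (int[]) (object) uintArray;
		return intArray[0];
	}

	static void Main() {
		Console.WriteLine(ArrayCastSharesReference());   // True
		Console.WriteLine(ArrayIsIntEnumerable());       // True
		Console.WriteLine(ListIsIntEnumerable());        // False
		Console.WriteLine(ReadMaxValueThroughIntView()); // -1
	}
}
```

The last result is a good reason to be wary of this behaviour: values above int.MaxValue silently show up as negative numbers.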

Conclusion 

In essence, my bug came down to me accidentally hitting on a CLR capability that isn't valid in C#, and then only noticing my error when trying to use the same code with non-array types. I'm happy to admit that it took me quite some time to work out exactly what the issue was!

Anyway, in summary:

LINQ's Cast() method works only for typecasts around the inheritance hierarchy, and not coercions or conversions (use a standard Select() and perform the conversion manually)
C# doesn't allow casts between arrays of signed and unsigned integral types of the same element size (such as uint[] and int[]), but the CLR does, so a cast routed through object (or through Cast()) can inadvertently "force it through"
Don't rely on or use this behaviour, however. If you want to convert between primitive array types, use Buffer.BlockCopy!
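The two recommended alternatives might look something like the sketch below. The class and method names are hypothetical, and the explicit unchecked is my assumption about the desired overflow behaviour (use checked instead if values above int.MaxValue should throw):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SafeConversions {
	// Works for any IEnumerable<uint>: each element is explicitly converted.
	public static List<int> ConvertWithSelect(IEnumerable<uint> source) {
		return source.Select(u => unchecked((int) u)).ToList();
	}

	// Works for arrays: copies the raw bytes into a genuinely separate int[].
	public static int[] ConvertWithBlockCopy(uint[] source) {
		int[] result = new int[source.Length];
		// The final argument of Buffer.BlockCopy is a byte count, not an element count.
		Buffer.BlockCopy(source, 0, result, 0, source.Length * sizeof(uint));
		return result;
	}

	static void Main() {
		List<uint> uintList = new List<uint> { 1, 2, uint.MaxValue };
		Console.WriteLine(string.Join(", ", ConvertWithSelect(uintList))); // 1, 2, -1

		uint[] uintArray = { 1, 2, uint.MaxValue };
		Console.WriteLine(string.Join(", ", ConvertWithBlockCopy(uintArray))); // 1, 2, -1
	}
}
```

Unlike the runtime array cast, both approaches produce a new collection, so later writes to the source are not visible through the result.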