This new feature allows you to specify that a property or field of a class/struct/record must be set in an object initializer:
public class VehicleMetadata {
public required DateOnly ManufactureDate { get; init; }
public required bool IsTaxed { get; init; }
public required VehicleClass Class { get; init; }
}
static void Test() {
var m = new VehicleMetadata { // Fails to compile here unless all 'required' properties are set in the initializer
Class = VehicleClass.Car,
IsTaxed = true,
ManufactureDate = new DateOnly(1990, 1, 19)
};
}
"Required modifier example"
So when is this useful? In my opinion, the right time to use this is when:
You have a type that is somewhat "plain old data", e.g. it's a collection of fields/properties (such as a configuration descriptor or metadata type), and;
At least some of the properties are non-optional.
The benefits here are:
You don't need to use redundant/boilerplate-y constructors to enforce required values being set (especially when those constructors are just passing through parameters and setting the properties), and;
The instantiator of your object gets to use an arguably more flexible/cleaner syntax for setting the mandatory values.
...But what if we want to offer an optional constructor that helps set the required members with some simple processing? Here's an example:
public class VehicleMetadata {
public required DateOnly ManufactureDate { get; init; }
public required bool IsTaxed { get; init; }
public required VehicleClass Class { get; init; }
public VehicleMetadata() { } // Leave the default constructor still available
// This is a new helper ctor that sets all three properties from an official government 'vehicle code' string
[SetsRequiredMembers]
public VehicleMetadata(string governmentalDatabaseMetadataCode) {
var split = governmentalDatabaseMetadataCode.Split('-');
ManufactureDate = new DateOnly(Int32.Parse(split[0]), Int32.Parse(split[1]), Int32.Parse(split[2]));
IsTaxed = Boolean.Parse(split[3]);
Class = Enum.Parse<VehicleClass>(split[4]);
}
}
static void Test() {
var m = new VehicleMetadata("1990-01-19-True-Car");
}
"Required members with SetsRequiredMembers attribute"
The invocation of the constructor alone isn't enough for the compiler unless the constructor is decorated with the SetsRequiredMembers attribute. Otherwise it still expects/requires us to set the required properties in an initializer, even though the constructor is setting them anyway.
I would actually counsel against this pattern entirely. Unfortunately, the SetsRequiredMembers attribute is just a simple "escape-hatch" that tells the compiler to accept invocations of that ctor as always setting all required members, even if they're actually not! There is no static analysis going on, just a check for the presence of the attribute.
It's entirely possible to add a new required field to a type at a later date and forget to update the constructor. The following code compiles and crashes at runtime with a NullReferenceException:
public class VehicleMetadata {
public required DateOnly ManufactureDate { get; init; }
public required bool IsTaxed { get; init; }
public required VehicleClass Class { get; init; }
public required string Model { get; init; } // Note: This is a new required property, but we forget to set it in our custom constructor
public VehicleMetadata() { }
[SetsRequiredMembers]
public VehicleMetadata(string governmentalDatabaseMetadataCode) {
var split = governmentalDatabaseMetadataCode.Split('-');
ManufactureDate = new DateOnly(Int32.Parse(split[0]), Int32.Parse(split[1]), Int32.Parse(split[2]));
IsTaxed = Boolean.Parse(split[3]);
Class = Enum.Parse<VehicleClass>(split[4]);
}
}
static void Test() {
var m = new VehicleMetadata("1990-01-19-True-Car");
Console.WriteLine("Vehicle model length is: " + m.Model.Length); // Oops! m.Model was never set anywhere despite being 'required'!
}
"Bad SetsRequiredMembers usage"
My advice therefore is to never supply "helper" constructors on types with required members, and instead use static factory methods:
public class VehicleMetadata {
public required DateOnly ManufactureDate { get; init; }
public required bool IsTaxed { get; init; }
public required VehicleClass Class { get; init; }
public required string Model { get; init; }
public static VehicleMetadata FromGovernmentCode(string governmentCode, string model) {
var split = governmentCode.Split('-');
return new VehicleMetadata { // Compiler won't let us forget to set 'Model' here. If we add more members in future, we can't compile without updating this method too.
ManufactureDate = new DateOnly(Int32.Parse(split[0]), Int32.Parse(split[1]), Int32.Parse(split[2])),
IsTaxed = Boolean.Parse(split[3]),
Class = Enum.Parse<VehicleClass>(split[4]),
Model = model
};
}
}
static void Test() {
var m = VehicleMetadata.FromGovernmentCode("1990-01-19-True-Car", "BenMobile");
Console.WriteLine("Vehicle model length is: " + m.Model.Length); // Works! :)
}
"Static factory methods for helping construct types with required members"
The raw string literals feature allows you to include line breaks and speechmarks/quotemarks in strings. Simply start the string with an arbitrary number of quotemarks (at least 3), and end it with the same number. Here's the example from Microsoft:
string longMessage = """
This is a long message.
It has several lines.
Some are indented
more than others.
Some should start at the first column.
Some have "quoted text" in them.
""";
"Raw String Literal"
The most useful case for this will almost certainly be working with blocks of XML, JSON, etc. in test cases. Any number of quotemarks are allowed within the string literal, as long as every run of consecutive quotemarks is shorter than the starting/ending delimiter.
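As a rough sketch of the JSON-in-tests case (the payload, type name and `userName` parameter below are made up for illustration), a raw string literal lets the quotes stay unescaped. Prefixing the literal with `$$` additionally turns it into an interpolated raw string whose holes are written with double braces, so the JSON's own single braces are left alone:

```csharp
using System;

static class RawStringJsonSketch {
    // Hypothetical test fixture: plain JSON, no escaping needed for the quotes.
    // The position of the closing """ decides how much leading whitespace is stripped from each line.
    public static string PlainJson() => """
        {
            "name": "Ben",
            "tags": ["a", "b"]
        }
        """;

    // '$$' means interpolation holes are written as {{ ... }}; single braces are literal text.
    public static string InterpolatedJson(string userName) => $$"""
        {
            "name": "{{userName}}"
        }
        """;

    public static void Test() {
        Console.WriteLine(PlainJson());
        Console.WriteLine(InterpolatedJson("Jos"));
    }
}
```

The number of `$` characters sets how many braces open a hole, in the same way that the number of quotemarks sets the delimiter length.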
It's also possible to now linebreak in string interpolations, useful for more complex interpolated values:
Console.WriteLine($"My name is {
GetDatabase()
.GetUserFromId(someId)
.Resolve()
.UserName
}");
"Linebreak in string interpolation"
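A minor change, but you can now declare generic attribute types and supply type arguments when applying them:
public class MyAttribute<T> : Attribute { }
public class SomeContainingType {
[MyAttribute<int>] // Before C# 11 we could not have had a generic attribute like this. Note the type argument must be a concrete type; a type parameter of the containing type is not allowed here
public int SomeProperty { get; }
}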
"Generic attribute example"
This feature was actually added to support "generic math" -- technically a .NET feature and not a C# one so I won't talk about it here, but there's a good link here: Generic Math - MSDN.
Since default interface methods were introduced in C# 8, we've been able to specify static methods in interfaces. This feature has now been extended to allow "static virtual" and "static abstract" methods. Although they seem similar, this newer change actually provides a rather deeper functionality. Here's a basic example:
interface IMyInterface {
static virtual string GetName() => "Ben";
static abstract int GetAge();
}
class MyClass : IMyInterface {
public static int GetAge() => 33; // Won't compile without this declaration, because GetAge() is declared as abstract in IMyInterface.
}
"Static abstract and virtual interface members example"
Notice how the static abstract method 'GetAge()' in IMyInterface means that I must declare a static GetAge() in MyClass.
So what's the value of this? Well, we can access these static members via generic type parameters:
void PrintStaticNameAndAge<T>() where T : IMyInterface {
Console.WriteLine($"Name: {T.GetName()}; Age: {T.GetAge()}");
}
static void Test() {
PrintStaticNameAndAge<MyClass>(); // Prints "Name: Ben; Age: 33" to console
}
"Static abstract and virtual interface members usage example"
Just like non-static virtual members, static virtual members can be overridden by the implementing type if we want to replace the default behaviour.
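Here's a minimal sketch of that overriding behaviour. The interface is repeated so the snippet stands alone; `DefaultNameClass`, `CustomNameClass` and `Describe` are names I've invented for the illustration:

```csharp
using System;

interface IMyInterface {
    static virtual string GetName() => "Ben";
    static abstract int GetAge();
}

// Relies on the default GetName() supplied by the interface
class DefaultNameClass : IMyInterface {
    public static int GetAge() => 33;
}

// Declares its own GetName(), which replaces the interface default when accessed through T
class CustomNameClass : IMyInterface {
    public static string GetName() => "Jos";
    public static int GetAge() => 31;
}

static class StaticVirtualSketch {
    public static string Describe<T>() where T : IMyInterface
        => $"Name: {T.GetName()}; Age: {T.GetAge()}";

    public static void Test() {
        Console.WriteLine(Describe<DefaultNameClass>()); // Prints "Name: Ben; Age: 33"
        Console.WriteLine(Describe<CustomNameClass>());  // Prints "Name: Jos; Age: 31"
    }
}
```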
Anyway, the biggest boon so far from this feature has definitely been generic math... But I'm actually personally more interested in the feature for its powerful metaprogramming capabilities. One thought that instantly jumped to mind is that this gives a way to add a pseudo-generic constructor constraint with non-nullary constructors. In other words, we can get something akin to a where T : new(sometypesHere) generic constraint! Well... Kind of. Here's an example:
interface ISomeFactory<out TSelf> {
static abstract TSelf CreateInstance(int a, string b, bool c);
}
class SomeClass : ISomeFactory<SomeClass> {
public int A { get; init; }
public string B { get; init; }
public bool C { get; init; }
public static SomeClass CreateInstance(int a, string b, bool c) {
return new SomeClass { A = a, B = b, C = c };
}
}
static T InstantiateItemGenerically<T>(int a, string b, bool c) where T : ISomeFactory<T> {
return T.CreateInstance(a, b, c);
}
static void Test() {
var someClass = InstantiateItemGenerically<SomeClass>(1, "hello", true);
Console.WriteLine($"{someClass.A},{someClass.B},{someClass.C}"); // Prints "1,hello,True" on console
}
"Factory static interface approach"
First, we declare a factory interface (ISomeFactory<TSelf>) that defines the kind of "constructor" (i.e. static factory method) we'd like to have generic access to.
Then, we implement our static interface on a class (SomeClass). Notice that the class declares itself as the type parameter for the ISomeFactory declaration. We could 'lie' here and declare another, unrelated type, but then SomeClass would not satisfy the T : ISomeFactory<T> constraint, so we'd get a compiler error when we attempted to use it with InstantiateItemGenerically later on.
Then, we can create a generic method (InstantiateItemGenerically) that has the capability to invoke the static interface method.
Finally, we use the generic method in the Test() method.
Of course, this isn't foolproof:
This only works for types we have the ability to edit ourselves (i.e. we need to be able to declare the factory interface on the type in order to use it this way).
Unlike the new() operator, which can't "lie", the CreateInstance implementation doesn't actually have to return a 'new' instance of TSelf.
...But it's a neat tool to have in the toolbox, I think.
You can now create list patterns, combining them with most other patterns, to match against sequences of patterns:
public record User(int Age, string Name, bool IsLoggedIn);
static void Test() {
var users = new User[] {
new(33, "Ben", true),
new(31, "Jos", false),
new(19, "Rob", false)
};
if (users is [_, { Name.Length: >= 3 }, (19, _, false), ..]) { // List Pattern here
Console.WriteLine("Yep! That matches!");
}
}
"List Pattern example"
The above snippet prints "Yep! That matches!" to the console. The list pattern [_, { Name.Length: >= 3 }, (19, _, false), ..] checks to see whether our users array matches four things (separated by commas):
_: First element can be anything,
{ Name.Length: >= 3 }: The second element's Name property must be at least 3 chars long,
(19, _, false): The third element's Age must be 19, their Name can be anything, and they must not be LoggedIn,
..: The array can be 3 or more elements long, we don't care about anything after the first three Users. In this context, the .. operator is known as a 'slice' operator.
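List patterns also work in switch expressions. The following is a small sketch of my own (the `Describe` method and its messages are invented for the example) that branches on the shape of an array:

```csharp
using System;

static class ListPatternSketch {
    public static string Describe(int[] values) => values switch {
        null => "no array",
        [] => "empty",                                               // Exactly zero elements
        [var only] => $"one element: {only}",                        // Exactly one element
        [var first, .., var last] => $"first: {first}, last: {last}" // Two or more; the slice skips the middle
    };

    public static void Test() {
        Console.WriteLine(Describe(Array.Empty<int>())); // Prints "empty"
        Console.WriteLine(Describe(new[] { 7 }));        // Prints "one element: 7"
        Console.WriteLine(Describe(new[] { 1, 2, 3 }));  // Prints "first: 1, last: 3"
    }
}
```

Note that the slice (..) can appear anywhere in the pattern, not just at the end, but only once per list pattern.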
It's also possible to declare variables inside the pattern. For example, we could modify the match to if (users is [var f, { Name.Length: >= 3 }, (19, _, false), ..] && f.Name == "Ben"), and continue using the variable f in the body of the if statement:
if (users is [var f, { Name.Length: >= 3 }, (19, _, false), ..] && f.Name == "Ben") {
Console.WriteLine($"Yep! That matches! First user: {f}"); // Prints "Yep! That matches! First user: User { Age = 33, Name = Ben, IsLoggedIn = True }"
}
"List Pattern with declared variable"
Finally, you can actually use the slice operator to further sub-specify your pattern or declare another variable, which is a new array made up of the elements that match the slice:
if (users is [_, .. { Length: > 1 }]) { // Here, "Length" refers to a User[] that is created inline by the slice- it contains the Users that weren't matched by the first part of the pattern
Console.WriteLine($"There is more than one user after the first."); // Prints "There is more than one user after the first."
}
if (users is [_, { Name.Length: >= 3 }, (19, _, false), .. var slice]) { // Here, we declare a name for the sliced user array and use it in the if body below
Console.WriteLine($"Remaining users: {String.Join<User>(", ", slice)}"); // Prints "Remaining users: " (the slice is empty here, because our array contains exactly three users)
}
"List Pattern and advanced Slice Operator usage"
This new operator (>>>) allows right-shifting a signed integer without copying the sign bit (essentially mimicking a right-shift on an unsigned integer):
static void Test() {
var i = -100;
var signedShift = i >> 1;
var unsignedShift = i >>> 1;
Console.WriteLine($" I: {i, -12} Binary: {Convert.ToString(i, 2).PadLeft(32, '0')}");
Console.WriteLine($" SignedShift: {signedShift,-12} Binary: {Convert.ToString(signedShift, 2).PadLeft(32, '0')}");
Console.WriteLine($"UnsignedShift: {unsignedShift,-12} Binary: {Convert.ToString(unsignedShift, 2).PadLeft(32, '0')}");
/* Console output:
*
* I: -100 Binary: 11111111111111111111111110011100
* SignedShift: -50 Binary: 11111111111111111111111111001110
* UnsignedShift: 2147483598 Binary: 01111111111111111111111111001110
*/
}
"Unsigned right-shift example"
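To make the "mimicking an unsigned shift" point concrete, here's a small sketch (the method names are mine) comparing the new operator with the cast-based workaround we'd have written before C# 11:

```csharp
using System;

static class UnsignedShiftSketch {
    // Pre-C# 11 approach: reinterpret as uint, shift, then reinterpret back as int
    public static int ShiftViaCast(int value, int count) => unchecked((int) ((uint) value >> count));

    // C# 11 approach: the >>> operator does the same thing directly
    public static int ShiftViaOperator(int value, int count) => value >>> count;

    public static void Test() {
        Console.WriteLine(ShiftViaCast(-100, 1));     // Prints "2147483598"
        Console.WriteLine(ShiftViaOperator(-100, 1)); // Prints "2147483598"
    }
}
```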
In my previous post, I lamented how the implementation of field inline-initializers for structs felt error-prone. In fact, I specifically called out the very fragile case where adding a new constructor to a struct could in some circumstances break code that was simply using the 'default' (no-args) ctor, even though it appeared unrelated:
"All-in-all, I'm not sure allowing us to shoot ourselves in the foot this way is wise. It feels like a bit of a footgun- adding a non-parameterless constructor to a struct that was previously using only field initialization will change the behaviour of every invocation of its parameterless constructor throughout the codebase from using the field initializers to instead acting like default(). I do like the addition of custom parameterless struct constructors, but I could have happily lived without field initializers. I don't think they add much but they have the potential to lead to costly and confusing mistakes, and in my codebases I will probably not use them as much as is possible. I also feel like programmers who aren't aware of these subtleties may get confused when using them or reading any code written that uses them."
Well, starting with C# 11, the compiler now restricts this to some degree (which is a breaking change). The upshot is that the example I gave previously no longer compiles:
// This compiled in C# 10, but emits an error in C# 11:
public readonly struct User { // Error CS8983 A 'struct' with field initializers must include an explicitly declared constructor.
public readonly string Name = "<no name>";
public readonly int Age = -1;
}
"No-Longer compiling struct example"
Personally, I'm still not hugely convinced on the value of inline-initialization for struct fields, but at least this change makes it harder to write fragile code.
On a side-note, C# 11 also made it so that struct constructors don't have to initialize every field (or call this()). I was curious if this was breakable, so I tried this:
struct User {
public int Age { get; }
[SkipLocalsInit]
public User() { }
}
[SkipLocalsInit]
static void Test() {
var userArray = stackalloc User[10000];
for (var i = 0; i < 10000; ++i) {
userArray[i] = new();
}
for (var i = 0; i < 10000; ++i) {
if (userArray[i].Age != 0) Console.WriteLine("Interesting...");
}
}
"Attempting to break auto-ctors in structs"
...But I never saw "Interesting...". I checked the IL and it seems like an initialization to Age is baked right in to my no-args constructor:
.method public hidebysig specialname rtspecialname instance void
.ctor() cil managed
{
.custom instance void [System.Runtime]System.Runtime.CompilerServices.SkipLocalsInitAttribute::.ctor()
= (01 00 00 00 )
.maxstack 8
IL_0000: ldarg.0 // this
IL_0001: ldc.i4.0
IL_0002: stfld int32 BlogTests.Program/User::'<Age>k__BackingField'
// [22 20 - 22 21]
IL_0007: ret
} // end of method User::.ctor
"Auto-ctor MSIL"
...So SkipLocalsInit is going to have no effect.
Strings in C# are encoded as UTF-16 (2-byte characters). A new suffix has been added for string literals that indicates the string should be UTF-8 instead:
var utf8Str = "A test string"u8; // Notice the 'u8' at the end of the string literal
"UTF-8 String Literal"
So, what is the type of utf8Str? Actually, it's a ReadOnlySpan<byte>! In other words, the u8 suffix actually does one thing only: It makes it easier for us to declare a span of bytes that probably represent a UTF-8 string!
If we look at the decompiled source for utf8Str, we can see that it is actually declared as ReadOnlySpan<byte> utf8Str = new ReadOnlySpan<byte>((void*) &\u003CPrivateImplementationDetails\u003E.\u0038AAB4E7E3B82555A541AD98EF18F2706395AC4557408E5E6D37FAD5D4839CBF4, 13);. In a nutshell, this is actually creating a span that refers to string data embedded directly in the compiled assembly itself. This means no garbage, and the same memory location being accessed for the same string every time (which makes the branch predictor/speculative execution pipeline very happy).
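As a quick sanity check of what the suffix gives us, this sketch (the type and method names are my own) compares a u8 literal against the bytes produced by Encoding.UTF8 at runtime. I've used a non-ASCII character to show that the byte count is not the same as the character count:

```csharp
using System;
using System.Text;

static class Utf8LiteralSketch {
    public static int LiteralLength() {
        ReadOnlySpan<byte> literal = "héllo"u8; // 5 chars, but 'é' takes two bytes in UTF-8
        return literal.Length;
    }

    public static bool MatchesEncoderOutput() {
        ReadOnlySpan<byte> literal = "héllo"u8;           // No allocation; refers to data baked in to the assembly
        byte[] encoded = Encoding.UTF8.GetBytes("héllo"); // Allocates a new array on the heap at runtime
        return literal.SequenceEqual(encoded.AsSpan());
    }

    public static void Test() {
        Console.WriteLine(LiteralLength());        // Prints "6"
        Console.WriteLine(MatchesEncoderOutput()); // Prints "True"
    }
}
```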
You can now pattern-match a span of chars as though it were a string:
Span<char> charSpan = stackalloc[] { 'H', 'e', 'l', 'l', 'o' };
var msgToPrint = charSpan switch {
"Hello" => "It said hello!",
"Goodbye" => "It said goodbye.",
_ => ""
};
Console.WriteLine(msgToPrint); // Prints "It said hello!" to console
"Span<char> pattern match example"
I tried it with Span<byte> and UTF-8 strings, but it's not implemented (and because UTF-8 strings are not compile-time constants it probably won't be).
You can now declare any type with the 'file' access modifier (as opposed to e.g. public or internal), which means that only code within the same source code file will be able to see/access it. This is mostly designed for generated source code to make use of.
file class MyClass { // Only accessible within the same .cs file
private protected string SomeProperty { get; set; }
}
"File-Scoped type example"
Finally, we end on probably the most complicated-to-understand new feature of C# 11. C# 7 gave us ref locals and returns and C# 7.2 added ref structs and Span<T> (it might be worth brushing up on those features if you need to before reading this section). Now, with C# 11, we have the ability to declare ref fields.
Everything beyond this point took me days of fiddling with code and reading very dense specs to really get to grips with. I've tried to make the following as easy-to-understand as I possibly can, but it might still take a few reads to fully 'get'. :) Also, I freely admit a lack of expertise here! Please leave comments at the bottom of the page if you spot any corrections or mistakes-- I will update this post as soon as I see them! Also, in this author's humble opinion, the compiler messages associated with everything discussed here can be pretty unhelpful at times.
Ref fields let us store a reference to a value rather than the value itself, and they can only be declared on ref structs. As an example, let us imagine a type that stores a reference to an int and an arbitrary "length" value. These two fields will represent the start of a contiguous allocation of memory (that's the reference) and how many integers are stored in this 'contiguous allocation' (that's the length). We'll use this type to help us Sum and Average the integer span:
readonly ref struct IntSpanAggregateHelper {
readonly ref readonly int _firstNumberRef;
readonly int _length;
public IntSpanAggregateHelper(ref int firstNumberRef, int length) {
_firstNumberRef = ref firstNumberRef;
_length = length;
}
public int Sum() {
var result = 0;
for (var i = 0; i < _length; ++i) {
ref var curNumRef = ref Unsafe.Add(ref Unsafe.AsRef(_firstNumberRef), i);
result += curNumRef;
}
return result;
}
public double Average() => (double) Sum() / _length;
}
static void Test() {
var intArray = new[] { 1, 2, 3, 4, 5 };
var aggHelper = new IntSpanAggregateHelper(ref intArray[0], intArray.Length);
Console.WriteLine($"Sum: {aggHelper.Sum()} | Average: {aggHelper.Average()}"); // Prints "Sum: 15 | Average: 3"
}
"Poor-man's Span"
The constructor stores its first parameter (a reference to an integer that is meant to represent the "start" of a span of integers) in _firstNumberRef, and its second parameter (the "length" of that span) in _length.
The leading/first 'readonly' in readonly ref readonly int _firstNumberRef indicates that the reference itself should be readonly (i.e. we can not change which memory address is being pointed to). This aligns with the usual meaning of 'readonly', i.e. the variable can not be changed.
The second 'readonly' in readonly ref readonly int _firstNumberRef is optional, but it lets us specify that the data pointed to by our reference also can not be modified-- i.e. we can not use this reference to alter the integers in our 'span'.
We've defined two methods, Sum() and Average() which will calculate the sum (total) and average (mean) of all integers in the span.
In the Test() method at the end of the sample we create the span by passing a reference to the first element of intArray as the _firstNumberRef.
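To illustrate the two different 'readonly' positions, here's a sketch of a ref struct with all four combinations. The `RefFieldKinds` type and its field names are invented for this example, and the commented-out lines are ones I'd expect the compiler to reject:

```csharp
using System;

ref struct RefFieldKinds {
    public ref int Plain;                           // Can be re-pointed; target can be written to
    public ref readonly int ReadOnlyTarget;         // Can be re-pointed; target can't be written to via this ref
    public readonly ref int FixedRef;               // Can't be re-pointed after construction; target can be written to
    public readonly ref readonly int FixedReadOnly; // Can't be re-pointed; target can't be written to via this ref

    public RefFieldKinds(ref int a, ref int b, ref int c, ref int d) {
        Plain = ref a;
        ReadOnlyTarget = ref b;
        FixedRef = ref c;
        FixedReadOnly = ref d;
    }
}

static class RefFieldKindsSketch {
    public static (int A, int C) Test() {
        int a = 1, b = 2, c = 3, d = 4;
        var kinds = new RefFieldKinds(ref a, ref b, ref c, ref d);

        kinds.Plain = 10;    // Writes through the reference to 'a'
        kinds.FixedRef = 30; // Also fine: only the reference is readonly, so this writes through to 'c'

        // kinds.ReadOnlyTarget = 20; // Not permitted: the target is readonly via this reference
        // kinds.FixedRef = ref d;    // Not permitted: the reference itself is readonly

        return (a, c); // 'a' is now 10 and 'c' is now 30
    }
}
```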
One thing to note is that a reference to the start of some contiguous block of values combined with the length of that block is basically all a Span<T> is. In other words, the type we defined in this example is just a worse Span<int> (but it works as a nice example of ref fields)!
As with previous versions of C#, the compiler will stop us from attempting to maintain/return/store references to data when that data will go out of scope before the reference itself does. As a reminder of this behaviour, remember that the following method produces a compiler error:
static ref int DoSomethingBad() {
var x = 123;
return ref x; // Can't do this because 'x' only exists on the current method's stack. Once we return from this method, it no longer exists. Dereferencing the returned value would be undefined behaviour.
}
"Attempt to return reference to stack var"
As of C# 11, ref fields introduce more restrictions in the compiler to ensure safety. As always, ref structs can not be stored on the heap (either deliberately or via 'accidental' boxing etc), but it's now theoretically possible to accidentally 'leak' a reference by storing it in a ref field. The compiler will check this for us and prohibit certain operations:
ref struct IntRefWrapper {
public ref int MyRef;
}
static IntRefWrapper DoSomethingBad() {
var x = 123;
var refWrapper = new IntRefWrapper { MyRef = ref x };
return refWrapper; // Won't compile-- compiler knows we're leaking a reference to 'x'.
}
"Attempt to return reference to stack var via ref field"
The compiler protects against accidental leaking of references to invalidated memory locations by tracking a "safe-to-escape" scope for every value and "ref-safe-to-escape" scopes for references to those values. These define the maximum 'scope' that a variable/local/parameter can safely 'escape' to, and the compiler will complain if you violate this constraint. Some examples of scopes include current method, calling method, and return only:
A value with a safe-to-escape scope of calling method is allowed to 'escape' anywhere in the program. You can return it, you can assign it to a field somewhere, you can pass it to other methods, etc.
A value with a safe-to-escape scope of return only can only 'escape' the current method by return statement (or via assignment to an out parameter).
A value with a safe-to-escape scope of current method is not allowed to 'escape' outside the stack for the currently executing method. You can still pass it as a parameter to other methods (excepting certain circumstances which I'll cover further on).
For most standard values calling method is the safe-to-escape scope-- in other words it's almost always permitted to return a value (to the calling method) or assign it to a field somewhere (which is implied: If you can return a value to the calling method you can also reassign it somewhere else). For some values however this may be different; for example current method is used for values that are only valid within the current stack (such as a stackallocated Span<T>).
When attempting to assign a value (e.g. x = y;), the compiler will check that the safe-to-escape scope of the left-hand-operand (x) is narrower-or-equal-to the safe-to-escape scope of the right-hand-operand (y). Similarly, with ref-assignments (e.g. x = ref y;) a similar check is made using the ref-safe-to-escape scope.
Here are some examples of safe-to-escape scopes in action:
// In this simplest example, we can return 'x' from a method and copy its value to an external field
// because its safe-to-escape scope is "calling method". In other words, the variable 'x' is allowed
// to 'escape' the current method:
static int CallingMethodExample(SomeClassType c) {
var x = 123;
c.AnIntegerProperty = x; // 'x' has safe-to-escape scope of "calling method", so this is permitted
return x; // 'x' has safe-to-escape scope of "calling method", so this is permitted
}
// In this example, we show what happens when a value/variable has a safe-to-escape scope of "current method"
// (meaning once the current method terminates its lifetime will also terminate). In other words, in this case
// the variable 'x' represents a value that will cease to have meaning once the current method completes:
static Span<int> CurrentMethodExample(SomeRefStructType s) {
Span<int> x = stackalloc[] { 1, 2, 3 };
s.AnIntegerSpanProperty = x; // 'x' has a safe-to-escape scope of "current method", which means it is NOT allowed to escape OUTSIDE the current method, so this does not compile
return x; // 'x' has a safe-to-escape scope of "current method", which means it is NOT allowed to escape OUTSIDE the current method, so this does not compile
}
"Safe-to-escape examples"
Here are some examples of ref-safe-to-escape scopes in action:
// References passed in as fields on ref-struct parameters have ref-safe-to-escape scopes of "calling method".
// This makes sense because clearly the references ALREADY exist outside this method, they were set when they were passed in.
static ref int CallingMethodExample(SomeRefStructType s1, SomeRefStructType s2) {
s2.AnIntegerRefField = ref s1.AnIntegerRefField; // 'ref s1.AnIntegerRefField' has a ref-safe-to-escape scope of "calling method", so this is permitted
return ref s1.AnIntegerRefField; // 'ref s1.AnIntegerRefField' has a ref-safe-to-escape scope of "calling method", so this is permitted
}
// References passed in directly to methods as parameters have ref-safe-to-escape scopes of "return only".
// "Return only" scope lies somewhere inbetween "current method" and "calling method" -- the ref is allowed
// to escape the current method but ONLY via return statement.
static ref int ReturnOnlyExample(ref int x, SomeRefStructType s) {
s.AnIntegerRefField = ref x; // 'ref x' has a ref-safe-to-escape scope of "return only", so this does not compile
return ref x; // 'ref x' has a ref-safe-to-escape scope of "return only", so this is permitted
}
// References to stack variables will always have a ref-safe-to-escape scope of "current method"
// (meaning those references can exist only within the current method but may not escape any further).
static ref int CurrentMethodExample(SomeRefStructType s) {
var x = 123;
s.AnIntegerRefField = ref x; // 'ref x' has a ref-safe-to-escape scope of "current method", which means the reference is NOT allowed to escape OUTSIDE the current method, so this does not compile
return ref x; // 'ref x' has a ref-safe-to-escape scope of "current method", which means the reference is NOT allowed to escape OUTSIDE the current method, so this does not compile
}
"Ref-safe-to-escape examples"
It should be fairly self-explanatory that ref-safe-to-escape scopes will never be greater than the safe-to-escape scope of the value the reference points to. In fact, that's the whole point of tracking reference scopes-- to make sure that we can't "keep" a reference to something that is no longer a valid value.
One thing that may surprise is that ref parameters (and in parameters) have a ref-safe-to-escape scope of return only. Before C# 11, the scope was actually calling method, but because ref fields did not exist yet this had the same de-facto effect of being return only anyway. In order to make existing APIs compile with the introduction of ref fields, the scope of ref parameters was narrowed. However as we'll see later on, it is possible to re-expand the ref-safe-to-escape scope of ref parameters to calling method in C# 11 (and therefore allow the safe 'leak' of ref parameters to ref fields).
Something a little unintuitive about all of this is that the way a variable is declared is what determines its safe-to-escape scope. This does make sense for the compiler, but it can certainly lead to some rather confusing behaviour:
static Span<int> CreateLengthOneSpanExample1(ref int i) {
var result = new Span<int>(ref i);
return result;
}
static Span<int> CreateLengthOneSpanExample2(ref int i) {
Span<int> result;
result = new Span<int>(ref i); // Fails to compile on this line
return result;
}
static Span<int> CreateLengthOneSpanExample3(ref int i) {
Span<int> result = stackalloc int[1];
result = new Span<int>(ref i);
return result; // Fails to compile on this line
}
"Examples of variable declaration affecting safe-to-escape scope"
Interestingly, only CreateLengthOneSpanExample1 above compiles, the other two methods do not. In each example, the safe-to-escape scope of the local variable result is being determined at the point it is declared (e.g. on the very first line in each method), which is what leads to the compilation errors:
In example 1 (which compiles fine), result is declared with an inline assignment of an instantiation of a ref struct type. The next sentence is quite hard to parse, so maybe read it twice: The value returned by the constructor of a ref struct type has a safe-to-escape scope defined as the narrowest safe-to-escape scope of all parameters passed in to that constructor OR the narrowest ref-safe-to-escape scope of all ref parameters, whichever is narrower. So, in our example, we only passed in a ref i, which has a ref-safe-to-escape scope of return only. Therefore, the safe-to-escape scope of the expression new Span<int>(ref i) is also return only. The scope of result is therefore assigned to also be return only, which means we're allowed to return it on the next line via return result;.
In example 2 however, result is initially declared but not inline-assigned. The compiler treats this the same as if we'd inline-assigned it to default, which has a safe-to-escape scope of calling method. This means we can return result and pass it to other methods as well as assigning it to fields anywhere we like. However, this presents a problem when attempting to "re"assign result on the second line-- the expression new Span<int>(ref i) has a safe-to-escape scope of return only, but in the first line we've already set the safe-to-escape scope of result to calling method. Because the right-hand-operand of the assignment operation has a narrower safe-to-escape scope than the left-hand-operand, the compiler disallows this.
In example 3 (which also does not compile), result is inline-assigned to stackalloc int[1]. By definition, stackalloc expressions have a safe-to-escape scope of current method (which should be obvious), therefore the safe-to-escape scope of result is set to current method also. On the following line, just like in the previous example, we attempt to assign the value of the expression new Span<int>(ref i) to result; but in this example this is permitted, because the right-hand-operand's scope (return only) is wider than the left-hand's (current method). The problem comes when we attempt to return result; on the final line: We can not return a value whose safe-to-escape is current method (rather than at least being return only).
It is possible that future compiler versions will implement simple escape analysis so that the examples above all compile, but there will likely always be complex-enough cases where the compiler has to assume the most rigid possible interpretation of the "rules". Therefore, it's important to understand how these rules are set up.
Furthermore, object initializers follow the same rules as constructors when it comes to assigning safe-to-escape scopes. That means that only the first method in the following example compiles, even though both appear to be doing the same thing:
ref struct SpanWrapper {
public Span<int> IntSpan;
}
static void ExampleOne() {
Span<int> stackSpan = stackalloc[] { 1, 2, 3 };
var spanWrapper = new SpanWrapper { IntSpan = stackSpan };
}
static void ExampleTwo() {
Span<int> stackSpan = stackalloc[] { 1, 2, 3 };
var spanWrapper = new SpanWrapper();
spanWrapper.IntSpan = stackSpan; // Does not compile
}
"Object initializers follow constructor rules for safe-to-escape assignment"
The spec also talks about a concept called "method arguments must match". In essence, this is a declaration of how and when it is safe to mix various arguments as parameters to any method that takes at least one ref struct parameter by reference.
The rule that the compiler uses is best formalized as: "At the point of invocation of the method, it is required that the safe-to-escape lifetimes of all ref struct arguments and the ref-safe-to-escape lifetimes of all ref arguments are wider than or equal to the safe-to-escape lifetimes of all ref ref struct arguments".
The reasoning behind this rule makes sense if you play around with attempting to break ref-safety enough and/or go through the examples in the spec. I'm not going to explain the reasoning in detail here, I'm just going to show the implications of this rule-- but you can read through the spec yourself if you wish.
Using this rule we can investigate some examples. In this first example, we fail to compile:
ref struct SpanWrapper {
public Span<int> IntSpan;
}
static void Test() {
Span<int> heapSpan = new[] { 1, 2, 3 };
Span<int> stackSpan = stackalloc[] { 4, 5, 6 };
var wrapper = new SpanWrapper {
IntSpan = heapSpan
};
Console.WriteLine(LengthMatches(ref wrapper, stackSpan)); // Fails to compile here
}
static bool LengthMatches(ref SpanWrapper wrapper, Span<int> comparand) {
return wrapper.IntSpan.Length == comparand.Length;
}
"Method arguments must match example 1"
The reason this fails to compile is as follows:
heapSpan has a safe-to-escape scope of calling method because it is declared with an inline-assignment to a heap-allocated array.
stackSpan has a safe-to-escape scope of current method because it is declared with an inline-assignment to a stack-allocated array.
wrapper has a safe-to-escape scope of calling method because it is initialized with heapSpan (and therefore it adopts the same safe-to-escape scope).
Finally, because LengthMatches() takes at least one ref argument of a ref struct type, we must apply the "method arguments must match" rule at the call-site. In this instance, the argument wrapper has a safe-to-escape scope of calling method, but stackSpan has a narrower safe-to-escape scope of current method, so this does not compile.
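To contrast with the failing call above, here is a minimal sketch of two variations where the scopes line up. The class and method names (MatchingScopesDemo, BothHeap, BothStack) are made up for illustration. In the first, both spans are heap-backed; in the second, the wrapper itself is initialized from stack memory, so it is narrowed to current method just like the comparand:

```csharp
using System;

ref struct SpanWrapper {
    public Span<int> IntSpan;
}

static class MatchingScopesDemo {
    static bool LengthMatches(ref SpanWrapper wrapper, Span<int> comparand) {
        return wrapper.IntSpan.Length == comparand.Length;
    }

    // Both arguments are heap-backed, so both are safe-to-escape to calling method
    public static bool BothHeap() {
        Span<int> heapSpan = new[] { 1, 2, 3 };
        Span<int> otherHeapSpan = new[] { 4, 5, 6 };
        var wrapper = new SpanWrapper { IntSpan = heapSpan };
        return LengthMatches(ref wrapper, otherHeapSpan);
    }

    // The wrapper is initialized from a stackalloc'd span, so its own
    // safe-to-escape scope is current method, matching the comparand
    public static bool BothStack() {
        Span<int> stackSpan = stackalloc[] { 1, 2, 3 };
        Span<int> otherStackSpan = stackalloc[] { 4, 5, 6 };
        var wrapper = new SpanWrapper { IntSpan = stackSpan };
        return LengthMatches(ref wrapper, otherStackSpan);
    }

    static void Main() {
        Console.WriteLine(BothHeap());  // True
        Console.WriteLine(BothStack()); // True
    }
}
```

In both cases nothing passed alongside the wrapper has a narrower scope than the wrapper itself, which is all the rule asks for.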
Don't forget: Invocations of instance methods on any struct value (ref structs or just regular ones) always use a ref parameter implicitly as part of the way single-dispatch OOP works. For example, in the below code both formulations of IncrementAnInt() are identical as far as the compiler is concerned when talking about safe-to-escape scopes:
struct MyStruct {
public int AnInt;
public void IncrementAnInt() => AnInt++;
}
static void IncrementAnInt(ref MyStruct @this) => @this.AnInt++;
static void Test() {
var s = new MyStruct { AnInt = 3 };
s.IncrementAnInt();
IncrementAnInt(ref s);
Console.WriteLine(s.AnInt); // 5
}
"Single dispatch deconstruction"
...Therefore, invoking instance methods on ref struct instances will always fall under the "method arguments must match" rule, because by definition there is always a ref ref struct parameter involved (i.e. 'this'). The following example does not compile for the exact same violation of "method arguments must match" as the first example:
ref struct SpanWrapper {
public Span<int> IntSpan;
public bool LengthMatches(Span<int> s) => IntSpan.Length == s.Length;
}
static void Test() {
Span<int> heapSpan = new[] { 1, 2, 3 };
Span<int> stackSpan = stackalloc[] { 4, 5, 6 };
var wrapper = new SpanWrapper {
IntSpan = heapSpan
};
Console.WriteLine(wrapper.LengthMatches(stackSpan)); // Fails to compile here
}
"Implicit ref ref struct argument via single dispatch invoking method arguments must match rule"
There are two ways we can fix these examples and make them compile. The first, and preferable, option is to make SpanWrapper a readonly ref struct (simply making IntSpan a readonly field is not enough). This assures the compiler that we're not going to leak a reference via re-assignment to IntSpan inside LengthMatches(), because the type cannot modify any of its fields after construction. Note that, as far as I can tell, this only covers the implicit 'this' of an instance method: a method that takes an explicit ref SpanWrapper could still overwrite the whole struct, so for the first example the parameter would also need to become in SpanWrapper rather than ref SpanWrapper. The other way is via the newly-introduced scoped keyword:
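Before moving on to scoped, here is a minimal sketch of the readonly approach applied to the instance-method version. The type name ReadOnlySpanWrapper and its constructor are assumptions made for this illustration; a constructor is used because a readonly field cannot be set from an object initializer:

```csharp
using System;

// Hypothetical readonly variant of SpanWrapper
readonly ref struct ReadOnlySpanWrapper {
    public readonly Span<int> IntSpan;

    public ReadOnlySpanWrapper(Span<int> intSpan) => IntSpan = intSpan;

    // 'this' is read-only here, so the method has no way to store 's' in IntSpan
    public bool LengthMatches(Span<int> s) => IntSpan.Length == s.Length;
}

static class ReadOnlyWrapperDemo {
    public static bool Run() {
        Span<int> heapSpan = new[] { 1, 2, 3 };
        Span<int> stackSpan = stackalloc[] { 4, 5, 6 };
        var wrapper = new ReadOnlySpanWrapper(heapSpan);
        // Mixing a heap-backed receiver with a stack-backed argument is accepted
        return wrapper.LengthMatches(stackSpan);
    }

    static void Main() => Console.WriteLine(Run()); // True
}
```

This is the same reason calls such as heapSpan.CopyTo(stackSpan) work: Span<T> is itself a readonly ref struct.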
We can deliberately narrow the ref-safe-to-escape scope of a ref local or parameter and the safe-to-escape scope of a ref struct local or parameter using the scoped keyword. The keyword effectively narrows the affected scope to current method. This is useful when we want to guarantee that a reference or value will not escape the method to which it is being passed.
Returning to the previous example, marking s in LengthMatches() as scoped enforces that we can't 'leak' s anywhere:
ref struct SpanWrapper {
public Span<int> IntSpan;
// Notice the 'scoped' modifier here
public bool LengthMatches(scoped Span<int> s) => IntSpan.Length == s.Length;
}
static void Test() {
Span<int> heapSpan = new[] { 1, 2, 3 };
Span<int> stackSpan = stackalloc[] { 4, 5, 6 };
var wrapper = new SpanWrapper {
IntSpan = heapSpan
};
Console.WriteLine(wrapper.LengthMatches(stackSpan)); // This now compiles!
}
"Scoped parameter example"
The compiler is now happy because it knows stackSpan can't 'escape' LengthMatches() (the parameter is now marked as scoped, which reduces its safe-to-escape scope to current method only). The tradeoff is that we cannot assign scoped parameters to fields:
ref struct SpanWrapper {
public Span<int> IntSpan;
// This is NOT permitted unless we remove the 'scoped' modifier-- newSpan is only safe-to-escape inside current method
public void SetIntSpan(scoped Span<int> newSpan) => IntSpan = newSpan;
}
"Scoped field reassignment fail"
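The same modifier should also repair the earlier static formulation that takes the wrapper as an explicit ref parameter. A minimal sketch, reusing the names from that example (ScopedParameterDemo and Run are made up for illustration):

```csharp
using System;

ref struct SpanWrapper {
    public Span<int> IntSpan;
}

static class ScopedParameterDemo {
    // 'scoped' promises that comparand is never stored in wrapper,
    // so it no longer has to match the wrapper's safe-to-escape scope
    static bool LengthMatches(ref SpanWrapper wrapper, scoped Span<int> comparand) {
        return wrapper.IntSpan.Length == comparand.Length;
    }

    public static bool Run() {
        Span<int> heapSpan = new[] { 1, 2, 3 };
        Span<int> stackSpan = stackalloc[] { 4, 5, 6 };
        var wrapper = new SpanWrapper { IntSpan = heapSpan };
        return LengthMatches(ref wrapper, stackSpan);
    }

    static void Main() => Console.WriteLine(Run()); // True
}
```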
Additionally, scoped can be applied to locals to force their safe-to-escape scope to current method, even if the declaration would otherwise assign a different scope:
static void Test() {
var x = 123;
scoped var wrapper = new IntRefWrapper(); // We force 'wrapper' to be scoped to the current method even though the expression 'new IntRefWrapper()' would normally set it to calling method
wrapper.IntRef = ref x; // This wouldn't be permitted without the scoped modifier above
PrintRefVal(wrapper);
}
static void PrintRefVal(IntRefWrapper w) => Console.WriteLine(w.IntRef);
"Scoped local example"
Finally, what if we want to widen the scope of a parameter? As we already explored, the ref-safe-to-escape scope of a ref parameter is return only by default. But what if we wanted to write a method that can assign a passed-in ref parameter elsewhere? In this case, we can mark the parameter with the [UnscopedRef] attribute:
ref struct IntRefContainer {
public ref int RefA;
public ref int RefB;
public ref int RefC;
// UnscopedRef attribute widens the ref-safe-to-escape scope of 'i' from return only to
// calling method, allowing us to assign the reference elsewhere
public void AssignAll([UnscopedRef] ref int i) {
RefA = ref i;
RefB = ref i;
RefC = ref i;
}
}
"UnscopedRef on parameter"
The compiler is aware of the meaning of this attribute and stops us from accidentally leaking references to locals:
static IntRefContainer CreateContainer() {
var x = 123;
var container = new IntRefContainer();
container.AssignAll(ref x); // Compiler prohibits this because of the [UnscopedRef] attribute
return container;
}
"Implications of UnscopedRef attribute"
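For contrast, here is a sketch of a call that I would expect the compiler to accept, on the assumption that a reference to an array element (which lives on the heap) has a ref-safe-to-escape scope of calling method. The storage parameter and the UnscopedRefDemo class are made up for illustration:

```csharp
using System;
using System.Diagnostics.CodeAnalysis;

ref struct IntRefContainer {
    public ref int RefA;
    public ref int RefB;
    public ref int RefC;

    public void AssignAll([UnscopedRef] ref int i) {
        RefA = ref i;
        RefB = ref i;
        RefC = ref i;
    }
}

static class UnscopedRefDemo {
    static IntRefContainer CreateContainer(int[] storage) {
        var container = new IntRefContainer();
        // storage[0] is a heap location, not a local, so the reference
        // may safely outlive this method
        container.AssignAll(ref storage[0]);
        return container;
    }

    public static int Run() {
        var storage = new[] { 123 };
        var container = CreateContainer(storage);
        container.RefB = 456; // Writes through the ref field to storage[0]
        return storage[0];
    }

    static void Main() => Console.WriteLine(Run()); // 456
}
```

The difference from the rejected version is only the lifetime of the thing being referenced: a local dies with the method, whereas the array element does not.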
Finally, we can also apply UnscopedRef to struct methods (and properties). By default, 'this' for structs is treated as a scoped ref, which means we can't return fields of a struct by ref via that this reference. The reasons for this are explained in the spec, but the upshot is that we can widen this's scope to return only rather than current method, thus allowing us to return references from within a struct instance method/property, by annotating it with [UnscopedRef]:
struct IntContainer {
public int A;
public int B;
public int C;
[UnscopedRef] public ref int ARef => ref A; // Won't compile without UnscopedRef
}
"UnscopedRef on struct instance method"
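A short usage sketch of such a property from the caller's side (the UnscopedRefPropertyDemo class and its Run method are made up for illustration). The returned reference aliases the field inside the local struct, so writing through it changes the struct:

```csharp
using System;
using System.Diagnostics.CodeAnalysis;

struct IntContainer {
    public int A;
    public int B;
    public int C;

    [UnscopedRef] public ref int ARef => ref A;
}

static class UnscopedRefPropertyDemo {
    public static int Run() {
        var container = new IntContainer { A = 1, B = 2, C = 3 };
        ref int a = ref container.ARef; // Alias of container.A
        a = 42;                         // Writes through to the struct's field
        return container.A;
    }

    static void Main() => Console.WriteLine(Run()); // 42
}
```

The reference is still tied to the lifetime of container, so the compiler would not let us return it by ref from Run().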
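Everything I described above can be 'turned off' by placing your code in an unsafe context (an unsafe block, method, or type), where the compiler reports ref-safety violations as warnings rather than errors: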
static ref int DoSomethingDumb() {
unsafe {
var x = 123;
return ref x; // Really dumb, but permitted by the compiler because I'm in an 'unsafe' context
}
}
"Unsafe disables scope checking"
Ultimately, much of the purpose of the ref-related changes in recent C# versions is to make it easier to write high-performance code without using unsafe. If you understand all the rules and safety behaviours that I described above, you shouldn't find yourself needing unsafe very often at all-- but it's nice to know the option is there if we need it.
Edit 21st Aug '23: Amended a line in the description of safe-to-escape scopes to include out parameters; see https://www.reddit.com/r/csharp/comments/15x6trv/c_11_recap_with_a_detailed_explanation_of_ref/jx4xuhr/ for more info