C# code performance improvement with Span<T> Type
Span<T> is a struct type introduced with C# 7.2. Its main goal is to let code work with a contiguous region of arbitrary memory (an array, a string, stack-allocated or native memory) without allocating new objects on the heap.
Using Span<T> brings a couple of advantages:
- No heap allocation for the intermediate objects that would otherwise be created.
- Fewer garbage collections, which improves performance because that CPU time goes to the actual work instead. There are also no short-lived, unreferenced objects left behind for the GC to clean up.
Example to understand how Span<T> can give better performance
Before looking at how Span works underneath, let's go through a concrete example that shows how it is used in a real-world scenario.
Below is a method that takes a full name (hardcoded here for simplicity) and returns the first name by calling Substring on it.
public string GetFirstNameFromFullNameWithString()
{
    string fullName = "Sai Kumar";
    string firstName = fullName.Substring(0, fullName.IndexOf(' ', 0));
    return firstName;
}
This looks like a very simple example, but Substring allocates a new string object on the heap and copies the characters into it; the local variable firstName only holds a reference to that object. Once nothing references the new string any more, it becomes garbage, and the GC has to find and reclaim it in a later collection. The memory is not released the moment the method returns. In code that performs this kind of operation frequently, these short-lived strings add up to steady GC pressure.
Now let's rewrite the same method using Span and see how it delivers the advantages discussed above.
public ReadOnlySpan<char> GetFirstNameFromFullNameWithSpan()
{
    string fullName = "Sai Kumar";
    ReadOnlySpan<char> name = fullName;
    ReadOnlySpan<char> firstName = name.Slice(0, name.IndexOf(' '));
    return firstName;
}
The code above returns the first name using a span. Note that it uses ReadOnlySpan<T> rather than Span<T>: both are struct types, but strings are immutable, so a string can only be viewed through a read-only span. Assigning the string to a ReadOnlySpan<char> variable does not copy anything. It creates a small struct on the stack that refers to the string's characters, which stay where they are on the heap. Slice then returns another span over part of those same characters, so the "substring" is just a new view, not a new string. Since nothing new is allocated on the heap, the GC has no extra objects to track or reclaim.
With fewer heap allocations and less GC work, performance improves: the span values themselves live on the thread's stack, which is cheap to use and is cleaned up automatically when the method returns.
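To make the "view, not copy" point concrete, here is a minimal sketch. The class and method names (SliceViewDemo, ZeroMiddle, SliceSharesMemory) are made up for illustration; the span calls used (Slice, Clear, IndexOf, Overlaps) are standard members of Span<T> and MemoryExtensions.

```csharp
using System;

public static class SliceViewDemo
{
    // Clears the middle of an array through a slice. Because the slice is a
    // view over the same memory, the original array is modified.
    public static int[] ZeroMiddle(int[] data)
    {
        if (data.Length < 3) return data;           // nothing between first and last element
        Span<int> all = data;                        // implicit conversion from T[] to Span<T>
        Span<int> middle = all.Slice(1, data.Length - 2);
        middle.Clear();                              // writes default(int) into the backing array
        return data;
    }

    // Returns true when the first-name slice refers to memory inside the
    // original string, i.e. no characters were copied.
    // An empty slice (name starting with a space) reports false.
    public static bool SliceSharesMemory(string fullName)
    {
        ReadOnlySpan<char> name = fullName.AsSpan();
        int space = name.IndexOf(' ');
        ReadOnlySpan<char> first = space < 0 ? name : name.Slice(0, space);
        return name.Overlaps(first);
    }
}
```

Calling SliceViewDemo.ZeroMiddle(new[] { 1, 2, 3, 4 }) leaves the first and last elements untouched and zeroes the two in between, in the very array that was passed in.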
Below are the benchmark details from both methods.
The results make it clear that the Span version performs better. They also show that the String version allocates memory on the heap, whereas the Span version allocates none.
The complete source code follows. I used BenchmarkDotNet to get the benchmark stats above.
using System;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

class Program
{
    static void Main(string[] args)
    {
        Console.WriteLine("Hello World!");
        // Runs every [Benchmark] method in SpanDemo and prints a summary table.
        BenchmarkRunner.Run<SpanDemo>();
        Console.ReadKey();
    }
}

// MemoryDiagnoser adds allocation columns to the results.
[MemoryDiagnoser]
public class SpanDemo
{
    [Benchmark]
    public string GetFirstNameFromFullNameWithString()
    {
        string fullName = "Sai Kumar";
        string firstName = fullName.Substring(0, fullName.IndexOf(' ', 0));
        return firstName;
    }

    [Benchmark]
    public ReadOnlySpan<char> GetFirstNameFromFullNameWithSpan()
    {
        string fullName = "Sai Kumar";
        ReadOnlySpan<char> name = fullName;
        ReadOnlySpan<char> firstName = name.Slice(0, name.IndexOf(' '));
        return firstName;
    }
}
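As a rough cross-check that does not depend on BenchmarkDotNet, the allocation difference can also be observed with GC.GetAllocatedBytesForCurrentThread. This is only a sketch: AllocationCheck and its methods are hypothetical names, the input is assumed to contain a space, and the exact byte counts depend on the runtime, so no numbers are claimed here.

```csharp
using System;

public static class AllocationCheck
{
    // Keeps the loop results observable so the work is not trivially dead.
    private static int _sink;

    // Bytes allocated on this thread while taking the first name with Substring.
    // Assumes fullName contains a space.
    public static long SubstringBytes(string fullName, int iterations)
    {
        int total = 0;
        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < iterations; i++)
        {
            total += fullName.Substring(0, fullName.IndexOf(' ')).Length;
        }
        long after = GC.GetAllocatedBytesForCurrentThread();
        _sink = total;
        return after - before;
    }

    // Same measurement, but slicing a ReadOnlySpan<char> instead.
    public static long SliceBytes(string fullName, int iterations)
    {
        int total = 0;
        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < iterations; i++)
        {
            ReadOnlySpan<char> name = fullName.AsSpan();
            total += name.Slice(0, name.IndexOf(' ')).Length;
        }
        long after = GC.GetAllocatedBytesForCurrentThread();
        _sink = total;
        return after - before;
    }
}
```

Comparing AllocationCheck.SubstringBytes("Sai Kumar", 1000) with AllocationCheck.SliceBytes("Sai Kumar", 1000) should show the Substring loop allocating noticeably more. For real measurements, BenchmarkDotNet remains the better tool because it handles warm-up and statistical noise.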
How Span<T> works internally
We have seen the advantages of using Span<T> for working with contiguous data such as strings. Now let's see what makes this possible and what happens when we create a span variable.
Span<T> is a ref struct. Like any struct it is a value type, and being a ref struct it can only ever live on the stack. Every span holds two fields: one stores the address where the span starts (the start of the data plus any offset into it), and the other stores the number of elements the span covers. Below is an abbreviated definition of Span<T>.
public readonly ref struct Span<T>
{
    // A byref or a native ptr.
    internal readonly ByReference<T> _pointer;
    // The number of elements this Span contains.
    private readonly int _length;
    ...
}
Here the field _pointer is of type ByReference<T>, a special struct that stores the address of the first element the span covers. (In .NET 7 and later the same role is played by a ref T field named _reference.) The _length field is an integer holding the number of elements that belong to the span. In our example, for the first name, _pointer refers to the first character of the fullName string on the heap and _length is 3. If you want the last name from the same fullName, _pointer holds the offset address, i.e., the address of the character where the span should start, and _length holds its length. In our case, _pointer refers to the char "K" and the length is 5. Both values are stored on the stack; only the characters themselves remain on the heap.
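The start-plus-length idea can be sketched as follows. NameParts.Split is a hypothetical helper written for this illustration; it returns two spans over the same string, one per name part, without copying any characters.

```csharp
using System;

public static class NameParts
{
    // Splits "First Last" into two views over the same string.
    // When there is no space, the whole input is treated as the first name.
    public static void Split(string fullName,
                             out ReadOnlySpan<char> first,
                             out ReadOnlySpan<char> last)
    {
        ReadOnlySpan<char> name = fullName.AsSpan();
        int space = name.IndexOf(' ');
        if (space < 0)
        {
            first = name;
            last = ReadOnlySpan<char>.Empty;
            return;
        }

        first = name.Slice(0, space);   // starts at index 0, covers `space` chars
        last = name.Slice(space + 1);   // starts right after the space, runs to the end
    }
}
```

For "Sai Kumar", first has a Length of 3, while last starts at 'K' and has a Length of 5, matching the values described above.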
Limitations of Span<T>
- As Span<T> is a ref struct, it can never be placed on the heap. When the compiler sees a ref struct declared as a field of a class, it raises a compile-time error.
- Can't implement interfaces (before C# 13)
- Can't be boxed to ValueType or object
- Can't be captured in lambda expressions
- Can't be used in iterators (C# 13 relaxes this as long as the span is not used across a yield)
- Can't be a type argument (e.g. in Tuples)
- Can't be used in async methods, but can be used in a non-async method that returns a Task (C# 13 allows span locals in async methods as long as they are not used across an await).
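When one of these limitations gets in the way, a common workaround is Memory<T> / ReadOnlyMemory<T>, which are ordinary structs that can live on the heap and hand out a span on demand through their Span property. The sketch below assumes a hypothetical NameHolder class; AsMemory and the Span property are standard APIs.

```csharp
using System;
using System.Threading.Tasks;

public sealed class NameHolder
{
    // A ReadOnlySpan<char> field would not compile here.
    // ReadOnlyMemory<char> is a regular struct, so it can be a class field.
    private readonly ReadOnlyMemory<char> _fullName;

    public NameHolder(string fullName) => _fullName = fullName.AsMemory();

    // The async method never declares a span local; it only passes
    // _fullName.Span to a synchronous helper after the await.
    public async Task<int> FirstNameLengthAsync()
    {
        await Task.Yield();
        return FirstNameLength(_fullName.Span);
    }

    // Synchronous helper where working with spans is unrestricted.
    private static int FirstNameLength(ReadOnlySpan<char> name)
    {
        int space = name.IndexOf(' ');
        return space < 0 ? name.Length : space;
    }
}
```

The design choice here is to keep the heap-friendly Memory type at the boundaries (fields, async methods) and convert to a span only inside short synchronous helpers.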
Conclusion
Span<T> and ReadOnlySpan<T> can give a significant performance improvement in allocation-heavy, performance-critical code paths, such as slicing strings or buffers in a hot loop.
Happy coding 🙂