Preface
What? Using a C# interpolated string handler to write sscanf, which parses input? Are you sure you don't mean sprintf, which formats output?
Many readers will probably react this way on seeing the title. But we really are implementing sscanf here, not sprintf.
Interpolated string handlers
C# has a feature called interpolated strings. With an interpolated string you can embed the value of a variable directly in the string, for example $"abc{x}def". This replaces the older way of formatting strings, where you pass a template string first and then the arguments one by one, and it is much more convenient.
Building on interpolated strings, C# also supports interpolated string handlers, which let you customize how an interpolated string is processed. A simple example:
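using System;
using System.Runtime.CompilerServices;

[InterpolatedStringHandler]
struct Handler(int literalLength, int formattedCount)
{
    // Called once for each literal segment of the interpolated string
    public void AppendLiteral(string s)
    {
        Console.WriteLine($"Literal: '{s}'");
    }

    // Called once for each interpolated value
    public void AppendFormatted<T>(T v)
    {
        Console.WriteLine($"Value: '{v}'");
    }
}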
To use it, change a parameter that would otherwise be of type string to this Handler type, and any interpolated string passed there is processed the way you defined. The C# compiler automatically turns the interpolated string into the construction of a Handler followed by calls to its methods, and then passes the handler in:
void Foo(Handler handler) { }

var x = 42;
Foo($"abc{x}def");
The example above produces this output:
Literal: 'abc'
Value: '42'
Literal: 'def'
This is a great help for structured logging frameworks. You simply pass in an interpolated string, and the framework can parse it structurally according to how you interpolated it, so you never have to format strings by hand.
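To make the compiler's transformation concrete, the sketch below places the interpolated call next to a hand-written approximation of what it expands to. This is only an approximation of the generated code, not the exact output of the compiler. The two constructor arguments are the total length of the literal parts ("abc" plus "def", so 6) and the number of interpolated values (1).

```csharp
using System;
using System.Runtime.CompilerServices;

var x = 42;

// What we write:
Foo($"abc{x}def");

// Roughly what the compiler generates for the call above (a sketch):
var handler = new Handler(literalLength: 6, formattedCount: 1);
handler.AppendLiteral("abc");
handler.AppendFormatted(x);
handler.AppendLiteral("def");
Foo(handler);

// Both calls print the same three lines, so six lines appear in total.

static void Foo(Handler handler) { }

[InterpolatedStringHandler]
struct Handler(int literalLength, int formattedCount)
{
    public void AppendLiteral(string s) => Console.WriteLine($"Literal: '{s}'");
    public void AppendFormatted<T>(T v) => Console.WriteLine($"Value: '{v}'");
}
```

Note that the work happens while the argument is being evaluated, which is why Foo itself can have an empty body.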
Interpolated string handlers with arguments
In fact, C# interpolated string handlers can also receive additional arguments:
[InterpolatedStringHandler]
struct Handler(int literalLength, int formattedCount, int value)
{
    public void AppendLiteral(string s)
    {
        Console.WriteLine($"Literal: '{s}'");
    }

    public void AppendFormatted<T>(T v)
    {
        Console.WriteLine($"Value: '{v}'");
    }
}

// The attribute names the parameter of Foo to forward to the handler's constructor
void Foo(int value, [InterpolatedStringHandlerArgument("value")] Handler handler) { }

Foo(42, $"abc{x}def");
Here 42 is passed to the value parameter of the handler's constructor, which lets the handler capture context from the caller. In logging scenarios, after all, it is common to choose a different format depending on other arguments.
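As a small illustration of using such an argument, the sketch below forwards a flag to the handler so that it changes how values are rendered. The names Demo, Format, ContextHandler and verbose are hypothetical and exist only for this example.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Text;

var user = "alice";
Console.WriteLine(Demo.Format(false, $"user={user}")); // user=alice
Console.WriteLine(Demo.Format(true, $"user={user}"));  // user=alice (String)

static class Demo
{
    // Hypothetical API: `verbose` is forwarded to the handler's constructor.
    public static string Format(
        bool verbose,
        [InterpolatedStringHandlerArgument("verbose")] ContextHandler handler)
        => handler.ToString();
}

[InterpolatedStringHandler]
struct ContextHandler
{
    private readonly StringBuilder _sb;
    private readonly bool _verbose;

    // The forwarded argument arrives after literalLength and formattedCount.
    public ContextHandler(int literalLength, int formattedCount, bool verbose)
    {
        _sb = new StringBuilder(literalLength);
        _verbose = verbose;
    }

    public void AppendLiteral(string s) => _sb.Append(s);

    public void AppendFormatted<T>(T v)
    {
        _sb.Append(v);
        // In verbose mode, also record the static type of the value.
        if (_verbose) _sb.Append(" (").Append(typeof(T).Name).Append(')');
    }

    public override string ToString() => _sb.ToString();
}
```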
sscanf?
As we all know, C/C++ has a very commonly used function, sscanf. It takes an input text and a format template, followed by pointers to the variables that correspond to the format specifiers, and parses the values into those variables:
#include <stdio.h>

int main(void)
{
    const char* input = "test 123 test";
    const char* format = "test %d test"; /* named "format" because "template" is a keyword in C++ */
    int v = 0;
    sscanf(input, format, &v);
    printf("%d\n", v); // 123
}
So can we build the same thing in C#? Sure! It only takes a little bit of black magic.
Implementing sscanf in C#
First we create an interpolated string handler that takes an argument:
[InterpolatedStringHandler]
ref struct TemplatedStringHandler(int literalLength, int formattedCount, ReadOnlySpan<char> input)
{
    private ReadOnlySpan<char> _input = input;

    public void AppendLiteral(ReadOnlySpan<char> s)
    {
    }

    public void AppendFormatted<T>(T v) where T : ISpanParsable<T>
    {
    }
}
Here we have replaced every string with ReadOnlySpan<char> to reduce allocations.
Following sscanf, in theory we would end up with a signature like this:
void sscanf(ReadOnlySpan<char> input, ReadOnlySpan<char> template, params object[] args);
But obviously, what we actually need here is something like (ref object)[], because we have to pass references in order to update the caller's variables, rather than passing the values of the variables as object. So what should we do?
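To see why params object[] cannot work, consider the minimal sketch below. The helper name TryAssign is hypothetical. The value of v is boxed into the array at the call site, so anything the callee writes lands in the array and never reaches the caller's variable.

```csharp
using System;

int v = 0;
Demo.TryAssign(v);
Console.WriteLine(v); // 0: the caller's variable is untouched

static class Demo
{
    // Hypothetical helper: it can only replace the boxed copy stored in the array.
    public static void TryAssign(params object[] args)
    {
        args[0] = 123;
    }
}
```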
Notice that a C# interpolated string handler already receives each interpolated variable, so we don't need C/C++-style placeholders such as %d to mark where variables go. Instead of "test %d test", we can write $"test {v} test" directly and pass v by reference.
A very natural idea is to change AppendFormatted<T>(T v) to AppendFormatted<T>(ref T v) and be done with it.
However, if you actually try this, you will find that it does not work:
[InterpolatedStringHandler]
ref struct TemplatedStringHandler(int literalLength, int formattedCount, ReadOnlySpan<char> input)
{
    private ReadOnlySpan<char> _input = input;

    public void AppendLiteral(ReadOnlySpan<char> s)
    {
    }

    public void AppendFormatted<T>(ref T v) where T : ISpanParsable<T>
    {
    }
}

void sscanf(ReadOnlySpan<char> input, [InterpolatedStringHandlerArgument("input")] TemplatedStringHandler template);
When we try to call sscanf:
int v = 0;
sscanf("test 123 test", $"test {ref v} test"); // error CS1525: Invalid expression term 'ref'
We get an error! The ref keyword is not valid inside the value part of an interpolated string.
Note that this error comes from the parser of the C# compiler, which means that if we can get rid of the ref at the syntax level, the code will compile.
Then inspiration strikes: doesn't C# have in for passing read-only references? For an in parameter, C# automatically creates the reference and passes it for us, with no need to write ref explicitly at the call site. So let's use this feature to rework the handler:
[InterpolatedStringHandler]
ref struct TemplatedStringHandler(int literalLength, int formattedCount, ReadOnlySpan<char> input)
{
    private ReadOnlySpan<char> _input = input;

    public void AppendLiteral(ReadOnlySpan<char> s)
    {
    }

    public void AppendFormatted<T>(in T v) where T : ISpanParsable<T>
    {
    }
}
Then you will find that the following code can be successfully compiled:
int v = 0;
sscanf("test 123 test", $"test {v} test");
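The reason this compiles is that an in parameter accepts an argument without any keyword at the call site. The sketch below shows the three call shapes; the names Demo and Print are hypothetical. For a plain variable of exactly the parameter's type, the compiler passes a reference to that variable, while for other expressions it passes a reference to a temporary copy.

```csharp
using System;

int v = 42;
Demo.Print(v);     // no keyword needed: a reference to v is passed
Demo.Print(in v);  // explicit form, equivalent for a variable
Demo.Print(v + 1); // not a variable: a reference to a temporary is passed

static class Demo
{
    // Hypothetical helper taking its argument by read-only reference.
    public static void Print(in int value) => Console.WriteLine(value);
}
```

The third case is worth keeping in mind: anything written through such a reference would land in the temporary, not in a caller's variable.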
At this point we are one step away from success: what gets passed in is a read-only reference, but to extract values we need to write through that reference. What should we do?
Fortunately, we have Unsafe.AsRef, which converts a read-only reference into a mutable one. With that last problem solved, we can start on the implementation.
using System;
using System.Runtime.CompilerServices;

[InterpolatedStringHandler]
ref struct TemplatedStringHandler(int literalLength, int formattedCount, ReadOnlySpan<char> input)
{
    private int _index = 0;
    private ReadOnlySpan<char> _input = input;

    public void AppendLiteral(ReadOnlySpan<char> s)
    {
        var offset = Advance(0); // Skip leading whitespace first
        _input = _input[offset..];
        _index += offset;
        if (_input.StartsWith(s)) // Consume the literal part of the template from the input
        {
            _input = _input[s.Length..];
        }
        else
        {
            throw new FormatException($"Cannot find '{s}' in the input string (at index: {_index}).");
        }
        _index += s.Length;
        literalLength -= s.Length;
    }

    public void AppendFormatted<T>(in T v) where T : ISpanParsable<T>
    {
        var offset = Advance(0); // Skip leading whitespace first
        _input = _input[offset..];
        _index += offset;
        var length = Scan(); // Length of the token up to the next whitespace character
        if (T.TryParse(_input[..length], null, out var result)) // Parse!
        {
            // Turn the read-only reference into a mutable one and update the caller's variable
            Unsafe.AsRef(in v) = result;
            _input = _input[length..];
            _index += length;
            formattedCount--;
        }
        else
        {
            throw new FormatException($"Cannot parse '{_input[..length]}' to '{typeof(T)}' (at index: {_index}).");
        }
    }

    // Scan forward until the next whitespace character and return the token length
    private int Scan()
    {
        var length = 0;
        for (var i = 0; i < _input.Length; i++)
        {
            if (_input[i] is ' ' or '\t' or '\r' or '\n') break;
            length++;
        }
        return length;
    }

    // Return the index of the first non-whitespace character at or after start
    private int Advance(int start)
    {
        var length = start;
        while (length < _input.Length && _input[length] is ' ' or '\t' or '\r' or '\n')
        {
            length++;
        }
        return length;
    }
}
Then we provide an sscanf method that exposes our interpolated string handler:
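The write-through step in AppendFormatted can be isolated into a few lines. The sketch below assumes the conversion is done with System.Runtime.CompilerServices.Unsafe.AsRef; the names Demo and Overwrite are hypothetical. It only behaves as intended when the argument is an ordinary writable variable. If the argument is a temporary (a literal, a property value, or an expression that needs a conversion), the write goes to the temporary and is lost, and writing through a reference to genuinely read-only storage is unsafe.

```csharp
using System;
using System.Runtime.CompilerServices;

int v = 0;
Demo.Overwrite(v, 123); // v is a variable of exactly type int, so its address is passed
Console.WriteLine(v);   // 123

static class Demo
{
    // Hypothetical helper: writes through an `in` parameter by
    // reinterpreting the read-only reference as a mutable one.
    public static void Overwrite<T>(in T target, T value)
    {
        Unsafe.AsRef(in target) = value;
    }
}
```

This sidesteps the guarantee that in normally gives callers, so it is best confined to an API such as this sscanf, where updating the argument is the whole point.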
static void sscanf(ReadOnlySpan<char> input, [InterpolatedStringHandlerArgument("input")] TemplatedStringHandler template) { }
Usage
int x = 0;
string y = "";
bool z = false;
DateTime d = default;
sscanf("test 123 hello false 2025/01/01T00:00:00 end", $"test{x}{y}{z}{d}end");
Console.WriteLine(x);
Console.WriteLine(y);
Console.WriteLine(z);
Console.WriteLine(d);
Get the output:
123
hello
False
January 1, 2025 0:00:00
As for scanf, it is just shorthand for sscanf(Console.ReadLine(), template), so having sscanf is entirely sufficient.
Conclusion
C# interpolated string handlers are very powerful. Using this feature, we have implemented a string parsing function that is considerably nicer to use than sscanf in C/C++. It needs no format placeholders, it infers the types automatically, and it even removes the need to pass variable references one by one as trailing arguments. On top of that, the handler itself works on spans and does not allocate.