Understanding asynchronous iterators in C#

Today I will write about asynchronous iterators in C#: the mechanism behind them, the core concepts, and some useful features.

The concept of iterator

Iterators have been part of C# for a long time, and many people are already familiar with them.

They are most useful when a sequence is consumed one element at a time.

For example, take a foreach loop:

foreach (var item in Sources)
{
  Console.WriteLine(item);
}

This loop does something simple: it prints each item in Sources to the console.

Sometimes Sources is a fully buffered collection, for example a List<string>:

IEnumerable<string> Sources(int x)
{
  var list = new List<string>();
  for (int i = 0; i < 5; i++)
    list.Add($"result from Sources, x={x}, result {i}");
  return list;
}

There is a small problem here: before we can print the first item, the whole Sources() method has to run to prepare the data. In real applications this can cost a lot of time and memory. What's more, Sources may be an unbounded sequence, or an open-ended one of unknown length, such as a queue that hands out one item at a time or a queue that has no logical end.

For this case, C# offers a good solution, the iterator:

IEnumerable<string> Sources(int x)
{
  for (int i = 0; i < 5; i++)
    yield return $"result from Sources, x={x}, result {i}";
}

This version behaves much like the previous one, but with a fundamental difference: nothing is buffered, and only one element is made available at a time.
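
The difference can be observed directly. The following is a minimal sketch, assuming a C# 9 or later project with top-level statements; the log list is a hypothetical addition used only to record when the iterator body actually runs.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: records each time the iterator body produces an item.
var log = new List<string>();

// Same shape as the Sources iterator above, plus one log entry per item.
IEnumerable<string> Sources(int x)
{
    for (int i = 0; i < 5; i++)
    {
        log.Add($"producing {i}");
        yield return $"result from Sources, x={x}, result {i}";
    }
}

var sequence = Sources(42);      // the iterator body has not started yet
Console.WriteLine(log.Count);    // 0

foreach (var item in sequence)
{
    Console.WriteLine(item);
    if (log.Count == 2) break;   // stop early: the remaining items are never produced
}

Console.WriteLine(log.Count);    // 2
```

With the List<string> version, all five items would have been built before the first one was printed.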

To help with understanding, here is roughly how the compiler expands the foreach loop:

using (var iter = Sources.GetEnumerator())
{
  while (iter.MoveNext())
  {
    var item = iter.Current;
    Console.WriteLine(item);
  }
}

Of course, this is a conceptual version that omits many details, which we need not worry about here. The general idea is this: the compiler calls GetEnumerator() on the expression passed to foreach, then loops, asking MoveNext() whether there is another item. Each time the answer is yes, it reads the Current property, which holds the element the enumerator has just advanced to.

In the example above, MoveNext()/Current give us forward-only access to a sequence with no size limit. The yield keyword also hides a good deal of complexity, at least in my opinion.

Let's remove yield from the example above and write out the equivalent by hand:

IEnumerable<string> Sources(int x) => new GeneratedEnumerable(x);

class GeneratedEnumerable : IEnumerable<string>
{
  private int x;
  public GeneratedEnumerable(int x) => this.x = x;

  public IEnumerator<string> GetEnumerator() => new GeneratedEnumerator(x);

  IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

class GeneratedEnumerator : IEnumerator<string>
{
  private int x, i;
  public GeneratedEnumerator(int x) => this.x = x;

  public string Current { get; private set; }

  object IEnumerator.Current => Current;

  public void Dispose() { }

  public bool MoveNext()
  {
    if (i < 5)
    {
      Current = $"result from Sources, x={x}, result {i}";
      i++;
      return true;
    }
    else
    {
      return false;
    }
  }

  void IEnumerator.Reset() => throw new NotSupportedException();
}

Written out this way, it is easier to see how the yield version above works:

First, we return an object that implements IEnumerable. Note that IEnumerable and IEnumerator are different things: the former is the sequence, the latter is a cursor over it.

When we call Sources, a GeneratedEnumerable is created. It stores the state parameter x and exposes the required IEnumerable method.

Later, when foreach iterates over the data, GetEnumerator() is called, which in turn creates a GeneratedEnumerator to act as a cursor over the data.

The MoveNext() method logically implements the for loop, but performs only one step each time it is called. Each item is handed back through Current. One more note: the return false in MoveNext() corresponds to the yield break keyword and terminates the iteration.
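
To make the yield break correspondence concrete, here is a small sketch that drives an iterator by hand. SourcesUntil and its limit parameter are hypothetical names introduced only for this illustration, and the code assumes C# 9 or later with top-level statements.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical variant of Sources that ends the sequence early with yield break.
IEnumerable<string> SourcesUntil(int x, int limit)
{
    for (int i = 0; i < 5; i++)
    {
        if (i == limit)
            yield break;   // from here on, MoveNext() returns false
        yield return $"result from Sources, x={x}, result {i}";
    }
}

// Drive the enumerator manually, the way foreach does behind the scenes.
using var iter = SourcesUntil(42, 2).GetEnumerator();
Console.WriteLine(iter.MoveNext());  // True
Console.WriteLine(iter.Current);     // result from Sources, x=42, result 0
Console.WriteLine(iter.MoveNext());  // True
Console.WriteLine(iter.MoveNext());  // False
```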

Easy enough to follow, right?

Next, let's talk about iterators in asynchronous code.

Asynchronous iterators

The iteration above is synchronous. Modern .NET development leans heavily toward asynchronous code written with async/await, especially to improve server scalability.

The biggest problem with the code above is MoveNext(). It is a synchronous method, so if it takes a while to run, the calling thread is blocked for the whole time. In asynchronous code that is unacceptable.

The closest we can get with existing tools is to fetch the data asynchronously:

async Task<List<string>> Sources(int x) {...}

However, fetching the data asynchronously does not solve the buffering problem: the whole list still has to be ready before the first item can be processed.

Fortunately, C# added dedicated support for asynchronous iterators for exactly this purpose:

public interface IAsyncEnumerable<out T>
{
  IAsyncEnumerator<T> GetAsyncEnumerator(CancellationToken cancellationToken = default);
}
public interface IAsyncEnumerator<out T> : IAsyncDisposable
{
  T Current { get; }
  ValueTask<bool> MoveNextAsync();
}
public interface IAsyncDisposable
{
  ValueTask DisposeAsync();
}

Note that asynchronous iterators are built into the framework starting with .NET Standard 2.1 and .NET Core 3.0. On earlier targets, the interfaces have to be added manually through a NuGet package:

dotnet add package Microsoft.Bcl.AsyncInterfaces

At the time of writing, the current version of this package is 5.0.0.

Continuing with the same example:

IAsyncEnumerable<string> Sources(int x) => throw new NotImplementedException();

Let's see what foreach looks like once await is added:

await foreach (var item in Sources)
{
  Console.WriteLine(item);
}

The compiler interprets it roughly as:

await using (var iter = Sources.GetAsyncEnumerator())
{
  while (await iter.MoveNextAsync())
  {
    var item = iter.Current;
    Console.WriteLine(item);
  }
}

There is something new here: await using. It is used the same way as using, but on release it calls DisposeAsync rather than Dispose, so cleanup can also be asynchronous.

This code is very similar to the earlier synchronous version, just with await added. The compiler rewrites it into an asynchronous state machine, which is what makes it asynchronous. I won't explain that mechanism in detail, since it is not the focus of this article.

So, how do we write an asynchronous iterator with yield? Look at the code:

async IAsyncEnumerable<string> Sources(int x)
{
  for (int i = 0; i < 5; i++)
  {
    await Task.Delay(100); // This simulates asynchronous delay
    yield return $"result from Sources, x={x}, result {i}";
  }
}

Well, that is pleasant to read.
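
For anyone who wants to try it, the iterator and the await foreach loop can be put together into one small program. This is only a sketch, assuming .NET Core 3.0 or later and C# 9 top-level statements; the argument 42 is an arbitrary value chosen for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// The asynchronous iterator from the article, written as a local function.
async IAsyncEnumerable<string> Sources(int x)
{
    for (int i = 0; i < 5; i++)
    {
        await Task.Delay(100); // simulated asynchronous work
        yield return $"result from Sources, x={x}, result {i}";
    }
}

// Each item is printed as soon as it is produced, roughly 100 ms apart,
// and no thread is blocked while waiting.
await foreach (var item in Sources(42))
{
    Console.WriteLine(item);
}
```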

Is that all? Not quite. Asynchronous code has one more very important feature: cancellation.

So, how do we cancel an asynchronous iteration?

Cancelling asynchronous iteration

Asynchronous methods support cancellation through CancellationToken, and asynchronous iteration is no exception. Looking at the definition of IAsyncEnumerable<T> above, a cancellation token is passed to the GetAsyncEnumerator() method.

So what do we do when we use await foreach instead of writing the loop by hand? We can write it this way:

await foreach (var item in Sources.WithCancellation(cancellationToken).ConfigureAwait(false))
{
  Console.WriteLine(item);
}

This is equivalent to:

var iter = Sources.GetAsyncEnumerator(cancellationToken);
await using (iter.ConfigureAwait(false))
{
  while (await iter.MoveNextAsync().ConfigureAwait(false))
  {
    var item = iter.Current;
    Console.WriteLine(item);
  }
}

Yes, ConfigureAwait also applies to DisposeAsync(), so the final cleanup becomes:

await iter.DisposeAsync().ConfigureAwait(false);

That covers how the cancellation token gets passed in. How do we use it inside the iterator?

Look at the code:

IAsyncEnumerable<string> Sources(int x) => new SourcesEnumerable(x);
class SourcesEnumerable : IAsyncEnumerable<string>
{
  private int x;
  public SourcesEnumerable(int x) => this.x = x;

  public async IAsyncEnumerator<string> GetAsyncEnumerator(CancellationToken cancellationToken = default)
  {
    for (int i = 0; i < 5; i++)
    {
      await Task.Delay(100, cancellationToken); // Simulate asynchronous delay
      yield return $"result from Sources, x={x}, result {i}";
    }
  }
}

If a CancellationToken is passed through WithCancellation, the iterator is cancelled at the right time, including while it is waiting on the asynchronous data fetch (Task.Delay in this example). Of course, we can also check IsCancellationRequested at any point inside the iterator, or call ThrowIfCancellationRequested().

In addition, the compiler can do this wiring for us through the [EnumeratorCancellation] attribute, so we can also write it like this:

async IAsyncEnumerable<string> Sources(int x, [EnumeratorCancellation] CancellationToken cancellationToken = default)
{
  for (int i = 0; i < 5; i++)
  {
    await Task.Delay(100, cancellationToken); // Simulate asynchronous delay
    yield return $"result from Sources, x={x}, result {i}";
  }
}

This is equivalent to the code above; the only difference is the extra parameter.
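
To see the cancellation flow end to end, here is a hedged sketch that cancels the iteration after two items. It assumes C# 9 or later (top-level statements, and attributes on local function parameters); the received counter and the choice to cancel after two items are hypothetical details added for the demonstration.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

// The iterator from the article: the token supplied via WithCancellation
// reaches the parameter marked with [EnumeratorCancellation].
async IAsyncEnumerable<string> Sources(int x,
    [EnumeratorCancellation] CancellationToken cancellationToken = default)
{
    for (int i = 0; i < 5; i++)
    {
        await Task.Delay(100, cancellationToken); // throws once the token is cancelled
        yield return $"result from Sources, x={x}, result {i}";
    }
}

using var cts = new CancellationTokenSource();
int received = 0; // hypothetical counter, used only for this demonstration

try
{
    await foreach (var item in Sources(42).WithCancellation(cts.Token))
    {
        Console.WriteLine(item);
        if (++received == 2)
            cts.Cancel(); // request cancellation after the second item
    }
}
catch (OperationCanceledException)
{
    // Task.Delay observes the cancelled token on the next step of the iterator.
    Console.WriteLine($"cancelled after {received} items");
}
```

Note that cancellation surfaces to the consumer as an OperationCanceledException thrown out of the await foreach loop, so the caller decides how to handle it.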

In practice, we have the following ways to write the loop:

// No cancellation
await foreach (var item in Sources)

// Cancel via WithCancellation
await foreach (var item in Sources.WithCancellation(cancellationToken))

// Cancel via SourcesAsync
await foreach (var item in SourcesAsync(cancellationToken))

// Cancel via SourcesAsync and WithCancellation
await foreach (var item in SourcesAsync(cancellationToken).WithCancellation(cancellationToken))

// Cancel through different tokens
await foreach (var item in SourcesAsync(tokenA).WithCancellation(tokenB))

These forms suit different application scenarios, but in essence they behave the same. When two different tokens are supplied, cancelling either one cancels the iteration.
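
The two-token case can be checked with a short experiment. This is a sketch under assumptions: SourcesAsync here is a simplified stand-in that yields integers, and ConsumeAsync and its cancelFirst flag are hypothetical helpers written for this example. It again assumes C# 9 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

// Simplified stand-in for SourcesAsync: yields up to five integers.
async IAsyncEnumerable<int> SourcesAsync(
    [EnumeratorCancellation] CancellationToken cancellationToken = default)
{
    for (int i = 0; i < 5; i++)
    {
        await Task.Delay(20, cancellationToken);
        yield return i;
    }
}

// Hypothetical helper: iterates with two different tokens, cancels one of
// them after the first item, and reports how many items were received.
async Task<int> ConsumeAsync(bool cancelFirst)
{
    using var ctsA = new CancellationTokenSource();
    using var ctsB = new CancellationTokenSource();
    int count = 0;
    try
    {
        await foreach (var item in SourcesAsync(ctsA.Token).WithCancellation(ctsB.Token))
        {
            count++;
            (cancelFirst ? ctsA : ctsB).Cancel();
        }
    }
    catch (OperationCanceledException)
    {
        // Either token stops the iteration.
    }
    return count;
}

Console.WriteLine(await ConsumeAsync(cancelFirst: true));   // 1
Console.WriteLine(await ConsumeAsync(cancelFirst: false));  // 1
```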

Summary

Synchronous iteration is used widely in all kinds of code, but asynchronous iteration is used far less. Partly this is because it is relatively new, and partly because it looks a little tangled, so many people are reluctant to touch it.

This article is a summary of some personal experience. I hope it helps everyone understand iteration.
