An in-depth look at asynchronous programming and multithreading in C#

Many developers misunderstand asynchronous code and multithreading: how they work and when to use each. This article explains the difference between the two concepts and shows how to implement them in C#.

Me: "Waiter, this is my first time visiting this restaurant. Does it usually take 4 hours to get the food?"

Waiter: "Oh yes sir. There is only one chef in the kitchen of this restaurant."

Me: "…There's only one chef?"

Waiter: "No, sir, we have several chefs, but only one of them works in the kitchen at a time."

Me: "So the 10 other people standing in the kitchen in chef's uniforms... do nothing? Is the kitchen too small?"

Waiter: "Oh, our kitchen is big, sir."

Me: "Then why don't they work at the same time?"

Waiter: "Sir, this is a good idea, but we haven't figured out how to do it yet."

Me: "Okay, weird. But... hey... where is the chef now? I don't see anyone in the kitchen."

Waiter: "Yes, sir. We ran out of some kitchen supplies and placed an order, so the chef has stopped cooking and is standing outside waiting for the delivery."

Me: "It seems like he could keep cooking while he waits. Couldn't the delivery guy just let him know when he arrives?"

Waiter: "Another wonderful idea, sir. We have a delivery doorbell at the back, but the chef loves to wait. I'll get you some more water."

What a bad restaurant, right? Unfortunately, a lot of programs work like this.

There are two different ways to make this restaurant do better.

First, it is obvious that each individual dinner order can be handled by a different chef. Each order is a list of things that have to happen in a specific sequence (prepare the ingredients, mix them, then cook, etc.). So if each chef is dedicated to working through one such list, several dinner orders can be prepared at the same time.

This is a real-world example of multithreading. Computers can run multiple threads simultaneously, with each thread responsible for performing a series of activities in a specific order.

Then there is asynchronous behavior. To be clear, asynchronous does not mean multithreaded. Remember the chef who was waiting for the delivery? What a waste of time! While waiting, he wasn't doing anything meaningful, such as cooking. Moreover, waiting doesn't make the delivery arrive any faster. Once he has called to order the supplies, the delivery will arrive whenever it arrives, so why stand there waiting? Instead, the delivery man can simply ring the doorbell and say, "Hey, here are your supplies!"

A lot of I/O activity is handled by something outside of your code. For example, consider sending a network request to a remote server. It's like ordering supplies for the restaurant. The only things your code does are make the call and receive the result. If you choose to wait for the result and do nothing at all in between, that is "synchronous" behavior.

However, if you prefer to be interrupted/notified when the result comes back (like the delivery man ringing the doorbell when he arrives), so that you can handle other things in the meantime, that is "asynchronous" behavior.

Asynchronous code can be used whenever the work is done by something that is not directly controlled by the current code. For example, when you write a bunch of data to a hard drive, your code does not perform the actual write operation. It simply asks the hardware to perform the task. So you can start the write with asynchronous code, keep working on other things, and get notified when the write is done.

The advantage of asynchronous code is that it does not require additional threads, which makes it very efficient.

"Wait!" you say. "If there is no extra thread, then who or what is waiting for the result? How does the code know the result has come back?"

Remember that doorbell? Your computer has a mechanism called the "interrupt" system, which works a bit like that doorbell. When your code starts an asynchronous activity, it basically installs a virtual doorbell. When the other task (writing to the hard drive, waiting for a network response, etc.) is complete, the interrupt system "interrupts" the currently running code and rings the doorbell, letting your application know that a finished task is waiting for it. There is no need for a thread to sit there and wait!

Let's quickly review our two tools:

Multithreading: Use an extra thread to perform a series of activities/tasks.

Asynchronous: Use the same thread plus the interrupt system, so that components outside the thread can complete some activity and notify you when it ends.
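To make the contrast concrete, here is a minimal sketch of the two tools side by side. The names SumOnAnotherThread and WaitForDelivery are made up for illustration, and Task.Delay is only a stand-in for a real I/O wait such as a disk read or network call.

```csharp
using System;
using System.Threading.Tasks;

public static class TwoTools
{
    // Multithreading: the loop itself runs on another (thread-pool) thread.
    public static Task<int> SumOnAnotherThread(int n) =>
        Task.Run(() =>
        {
            int sum = 0;
            for (int i = 1; i <= n; i++) sum += i;
            return sum;
        });

    // Asynchronous: no thread is occupied during the wait.
    // Task.Delay is an assumption standing in for real I/O.
    public static async Task<string> WaitForDelivery(int milliseconds)
    {
        await Task.Delay(milliseconds);
        return "supplies delivered";
    }

    public static async Task Main()
    {
        Console.WriteLine(await SumOnAnotherThread(100)); // prints 5050
        Console.WriteLine(await WaitForDelivery(50));     // prints "supplies delivered"
    }
}
```

In the first method a second "chef" does the work; in the second, nobody works at all until the "doorbell" rings.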

UI thread

Another important thing to know is why these tools are worth using. In a .NET application with a user interface, there is a main thread called the UI thread, which is responsible for updating all the visual parts of the screen. By default, this is where everything runs. When you click a button, you expect to see it briefly pressed and then released; drawing that is the responsibility of the UI thread. There is only one UI thread in your app, which means that if it is busy doing heavy calculations, waiting for network requests, or something similar, it can't update what you see on the screen until it's done. The result is that your application appears to "freeze": you can click a button, but nothing seems to happen because the UI thread is busy doing other things.

Ideally, you want the UI thread to be as idle as possible, so that your application always appears responsive to user actions. This is where asynchronous code and multithreading come in. By using these tools, you can ensure that the heavy work is done elsewhere and that the UI thread stays free and responsive.

Now let's see how to use these tools in C#.

Asynchronous operations in C#

The code to perform asynchronous operations is very simple. There are two main keywords to know, "async" and "await", which is why people usually call it async/await. Suppose you have code like this:

public void Loopy()
{
  var hugeFiles = new string[] {
   "Gr8Gonzos_Home_Movie_In_8k_Res.mkv", // 1 GB
   "War_And_Peace_In_150_Languages.rtf", // 1.2 GB
   "Cats_On_Catnip.mpg"         // 0.9 GB
  };

  foreach (var hugeFile in hugeFiles)
  {
    ReadAHugeFile(hugeFile);
  }
  
  Console.WriteLine("All done!");
}


public byte[] ReadAHugeFile(string bigFile)
{
  var fileSize = new FileInfo(bigFile).Length; // Get the file size
  var allData = new byte[fileSize];      // Allocate a byte array as large as our file
  using (var fs = new FileStream(bigFile, FileMode.Open))
  {
    fs.Read(allData, 0, (int)fileSize);   // Read the entire file...
  }
  return allData;               // ...and return those bytes!
}

In its current form, all of this runs synchronously. If you click a button that runs Loopy() from the UI thread, the application will appear to freeze until all three huge files have been read, because each "ReadAHugeFile" call takes a long time and reads synchronously on the UI thread. This is not good! Let's see if we can make ReadAHugeFile asynchronous so that the UI thread can continue to process other things.

Whenever an operation supports asynchronous execution, Microsoft usually gives us both a synchronous and an asynchronous version of it. In the code above, the FileStream object has both "Read" and "ReadAsync" methods. So the first step is to change "fs.Read" to "fs.ReadAsync".

public byte[] ReadAHugeFile(string bigFile)
{
  var fileSize = new FileInfo(bigFile).Length; // Get the file size
  var allData = new byte[fileSize];      // Allocate a byte array as large as our file
  using (var fs = new FileStream(bigFile, FileMode.Open))
  {
    fs.ReadAsync(allData, 0, (int)fileSize); // Start reading the entire file asynchronously...
  }
  return allData;               // ...and return those bytes!
}

If you run it now, it returns immediately and there will be no data in the "allData" byte array. Why?

This is because ReadAsync starts the read and returns a Task object, which is a bit like a bookmark. It is .NET's version of a "promise": once the asynchronous activity (such as reading data from the hard disk) has completed, the result becomes available through the Task object. But if we don't do anything with this Task, the system immediately proceeds to the next line of code, which is our "return allData" line, and returns an array that has not yet been filled with data.

So we need to tell the code to wait for the result, but in a way that lets the original thread keep doing other things in the meantime. To do this we use an "awaiter", which is as simple as adding the word "await" before the async call:

public byte[] ReadAHugeFile(string bigFile)
{
  var fileSize = new FileInfo(bigFile).Length; // Get the file size
  var allData = new byte[fileSize];      // Allocate a byte array as large as our file
  using (var fs = new FileStream(bigFile, FileMode.Open))
  {
    await fs.ReadAsync(allData, 0, (int)fileSize); // Read the entire file asynchronously...
  }
  return allData;               // ...and return those bytes!
}

Uh-oh. If you try this, you will get a compiler error. This is because .NET needs to know that the method is asynchronous and that it will eventually return a byte array. So we add the word "async" before the return type and wrap the return type in Task<…>, like this:

public async Task<byte[]> ReadAHugeFile(string bigFile)
{
  var fileSize = new FileInfo(bigFile).Length; // Get the file size
  var allData = new byte[fileSize];      // Allocate a byte array as large as our file
  using (var fs = new FileStream(bigFile, FileMode.Open))
  {
    await fs.ReadAsync(allData, 0, (int)fileSize); // Read the entire file asynchronously...
  }
  return allData;               // ...and return those bytes!
}

OK! Now we're cooking! If we run our code now, it runs on the UI thread until it reaches the await on the ReadAsync call. At this point, .NET knows that this is an activity that will be carried out by the hard disk, so "await" places a small bookmark at the current location and the UI thread goes back to its normal processing (all the visual updates, etc.).

Then, once the hard drive has read all the data and ReadAsync has copied it into the allData byte array, the task is complete, so the system rings the doorbell to let the original thread know that the result is ready. The original thread says, "Great! Let me pick up where I left off!" As soon as it gets a chance, it goes back to the "await" and proceeds to the next step, which returns the allData array, now filled with our data.

If you have been following along with the examples and are using a recent version of Visual Studio, you will notice this line:
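The bookmark-and-resume behaviour described above can be observed in a small console sketch. SlowReadAsync is a hypothetical stand-in for ReadAHugeFile, and Task.Delay is an assumption that simulates the time spent waiting on the disk.

```csharp
using System;
using System.Threading.Tasks;

public static class BookmarkDemo
{
    // Hypothetical stand-in for a slow read: Task.Delay simulates waiting on the disk.
    public static async Task<int> SlowReadAsync()
    {
        await Task.Delay(200);
        return 42; // placeholder for "the data"
    }

    public static async Task Main()
    {
        // Calling the method starts the work and hands back a Task right away.
        Task<int> pending = SlowReadAsync();
        Console.WriteLine($"Done yet? {pending.IsCompleted}");

        // The caller is free to do other things while the read is in flight.
        Console.WriteLine("Doing other things...");

        // await resumes here once the task has completed and unwraps its result.
        int value = await pending;
        Console.WriteLine($"Done yet? {pending.IsCompleted}, value = {value}");
    }
}
```

With a 200 ms delay, the first "Done yet?" line should report False, because the task has only just started, while the last line reports True with the value 42.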

ReadAHugeFile(hugeFile);

…is now marked with a green squiggly underline, and if you hover over it, it says: "Because this call is not awaited, execution of the current method continues before the call is completed. Consider applying the 'await' operator to the result of the call."

This is Visual Studio letting you know that it recognizes ReadAHugeFile() as an asynchronous method: rather than returning a result, it returns a task. So if you want to wait for the result, you can add an "await":

await ReadAHugeFile(hugeFile);

...but if we do this, then you also have to update the method signature:

public async void Loopy()

Note that if a method does not return anything (void return type), we don't need to wrap the return type in Task<…>. That said, async void is generally reserved for event handlers such as a button click; for other methods, returning Task is usually preferred so that callers can await them.

But let's not do this. Instead, let's take a look at what else we can do with asynchronous code.

If you don't want to wait for the result of ReadAHugeFile(hugeFile) because you don't care about the end result, but you don't like the green underline/warning, there is a special trick for telling the compiler so. Just assign the result to the _ character, like this:

_ = ReadAHugeFile(hugeFile);

This is C# syntax (called a discard) which means "I don't care about the result, and I don't want to be bothered with a warning about it."

OK, let's try something else. If we use await on this line, it will wait for the first file to be read asynchronously, then wait for the second file to be read asynchronously, and finally wait for the third file to be read asynchronously. But...what if we want to read all 3 files asynchronously at the same time and then after all 3 files are finished, we allow the code to continue to the next line?

There is a method called Task.WhenAll(), which is itself an asynchronous method that you can await. You pass in a list of other Task objects and await it; it completes once all of those tasks have completed. So the easiest way is to create a List<Task> object:

List<Task> readingTasks = new List<Task>();

...and then add the Task returned by each ReadAHugeFile() call to the list:

foreach (var hugeFile in hugeFiles) {
   readingTasks.Add(ReadAHugeFile(hugeFile));
}

…Finally, we await Task.WhenAll():

await Task.WhenAll(readingTasks);

The final method is as follows:

public async void Loopy()
{
  var hugeFiles = new string[] {
   "Gr8Gonzos_Home_Movie_In_8k_Res.mkv", // 1 GB
   "War_And_Peace_In_150_Languages.rtf", // 1.2 GB
   "Cats_On_Catnip.mpg"         // 0.9 GB
  };

  List<Task> readingTasks = new List<Task>();
  foreach (var hugeFile in hugeFiles)
  {
    readingTasks.Add(ReadAHugeFile(hugeFile));
  }
  await Task.WhenAll(readingTasks);


  Console.WriteLine("All done!");
}

Some I/O mechanisms work better than others when it comes to parallel activities (e.g., network requests usually work better than hard disk reads, but this depends on the hardware), but the principle is the same.
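The difference between awaiting each read in turn and awaiting Task.WhenAll() can be measured with a rough sketch like the following. FakeReadAsync is a hypothetical substitute for ReadAHugeFile, and Task.Delay is an assumption that simulates the I/O wait, so the numbers only illustrate the principle, not real disk behaviour.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class WhenAllDemo
{
    // Hypothetical stand-in for ReadAHugeFile: just waits for the given time.
    public static async Task FakeReadAsync(int milliseconds)
    {
        await Task.Delay(milliseconds);
    }

    // Awaits each "read" before starting the next one.
    public static async Task<long> OneAtATimeAsync(int[] delays)
    {
        var sw = Stopwatch.StartNew();
        foreach (var d in delays)
        {
            await FakeReadAsync(d);
        }
        return sw.ElapsedMilliseconds;
    }

    // Starts every "read" first, then awaits them all together.
    public static async Task<long> AllAtOnceAsync(int[] delays)
    {
        var sw = Stopwatch.StartNew();
        var tasks = delays.Select(d => FakeReadAsync(d)).ToList();
        await Task.WhenAll(tasks);
        return sw.ElapsedMilliseconds;
    }

    public static async Task Main()
    {
        var delays = new[] { 200, 200, 200 };
        Console.WriteLine($"One at a time: {await OneAtATimeAsync(delays)} ms");
        Console.WriteLine($"All at once:   {await AllAtOnceAsync(delays)} ms");
    }
}
```

With three simulated 200 ms reads, the sequential version should take roughly the sum of the delays, while the Task.WhenAll() version should take roughly as long as the slowest single read.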

The last thing the "await" operator does is extract the final result. In the example above, ReadAHugeFile returns a Task<byte[]>. The magic of await is that, once the task is done, it automatically strips off the Task<> wrapper and hands back the byte[] array, so if you want to access the bytes in Loopy() you can do this:

byte[] data = await ReadAHugeFile(hugeFile);
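The same unwrapping applies when several tasks are awaited together: if every task is a Task<byte[]>, awaiting Task.WhenAll() yields one byte[] per task. The sketch below assumes a hypothetical FakeReadAsync in place of ReadAHugeFile and uses Task.Delay to simulate the read.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class ResultsDemo
{
    // Hypothetical stand-in for ReadAHugeFile: returns fake "file contents" after a short wait.
    public static async Task<byte[]> FakeReadAsync(string name)
    {
        await Task.Delay(50);
        return new byte[name.Length]; // pretend the file is as long as its name
    }

    public static async Task<int> TotalBytesAsync(string[] names)
    {
        // Start all reads, then await them together.
        Task<byte[]>[] tasks = names.Select(n => FakeReadAsync(n)).ToArray();

        // Awaiting WhenAll unwraps every Task<byte[]>: one byte[] per task, in task order.
        byte[][] allData = await Task.WhenAll(tasks);
        return allData.Sum(d => d.Length);
    }

    public static async Task Main()
    {
        int total = await TotalBytesAsync(new[] { "a.mkv", "bb.rtf", "ccc.mpg" });
        Console.WriteLine(total); // 5 + 6 + 7 = 18
    }
}
```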

Again, await is a magical little command that makes asynchronous programming very simple and handles all kinds of small things for you.

Now let's move to multithreading.

Multithreading in C#

Microsoft sometimes gives you 10 different ways to do the same thing, and multithreading is no exception. You have the BackgroundWorker class, Thread, and Task (which comes in several variants). Ultimately, they all do the same thing, just with different features. Nowadays, most people use Tasks because they are easy to set up and use, and they also work well with asynchronous code if you want to combine the two (we'll get to that shortly). If you are curious, there are many articles about the specific differences, but we will use Tasks here.

To run any method on a separate thread, just pass it to the Task.Run() method. For example, suppose you have a method like this:

public void DoRandomCalculations(int howMany)
{
  var rng = new Random();
  for (int i = 0; i < howMany; i++)
  {
    int a = rng.Next(1, 1000);
    int b = rng.Next(1, 1000);
    int sum = 0;
    sum = a + b;
  }
}

We can call it in the current thread like this:

DoRandomCalculations(1000000);

Or we can have another thread do this:

Task.Run(() => DoRandomCalculations(1000000));

Of course, there are a few variations, but that's the overall idea.

One advantage of Task.Run() is that it returns a Task object that we can await. So if you want to run a bunch of code on a separate thread and then wait for it to complete before moving on to the next step, you can use await, just as you saw in the previous section:

var finalData = await Task.Run(() => { /* heavy work here */ return 42; }); // 42 is a placeholder result
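Because Task.Run() returns an awaitable Task, it combines naturally with Task.WhenAll() from the previous section. The following sketch splits one calculation across several worker threads and awaits them all; SumRange and ParallelSumAsync are hypothetical names chosen for this illustration, and the split into equal chunks is just one simple way to divide the work.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelSumDemo
{
    // Plain CPU-bound work: adds up every integer from 'from' to 'to' inclusive.
    public static long SumRange(int from, int to)
    {
        long sum = 0;
        for (int i = from; i <= to; i++) sum += i;
        return sum;
    }

    // Splits 1..n into chunks, sums each chunk on a thread-pool thread, then combines the parts.
    public static async Task<long> ParallelSumAsync(int n, int workers)
    {
        int chunk = n / workers;
        var tasks = new List<Task<long>>();
        for (int w = 0; w < workers; w++)
        {
            // Declared inside the loop so each lambda captures its own values.
            int from = w * chunk + 1;
            int to = (w == workers - 1) ? n : (w + 1) * chunk;
            tasks.Add(Task.Run(() => SumRange(from, to)));
        }

        long[] parts = await Task.WhenAll(tasks); // wait for every worker to finish
        return parts.Sum();
    }

    public static async Task Main()
    {
        Console.WriteLine(await ParallelSumAsync(1_000_000, 4)); // prints 500000500000
    }
}
```

Whether this is actually faster than a single loop depends on the amount of work per chunk and the number of cores available; for tiny workloads the overhead of starting tasks can outweigh the gain.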

Keep in mind that this article covers how to get started and how these concepts work; it is not comprehensive. With this knowledge, though, you should be able to follow more in-depth articles about the more advanced kinds of multithreading and asynchronous coding.
