I have been busy recently and haven’t written a blog post in a long time. This article is about partitioning in PLINQ. The previous article introduced parallel programming; here we will look in detail at the partitions and custom partitioners used in parallel programming.
Let’s start with a thought experiment: suppose we have a 200 MB text file that needs to be read. How can we achieve the best speed? The obvious answer is splitting: break the file into many small pieces and take full advantage of the multiple cores in our CPUs, so that every core is kept busy and efficiency is maximized. The same idea applies in PLINQ: given a data source, if we want to parallelize over it as much as possible, we need to split it into multiple parts that can be consumed simultaneously by multiple threads. That is what a partition is in PLINQ. Of course, Microsoft anticipated this need, so let’s first talk about the default partitioners it provides.
The default partitioners ship with the Task Parallel Library (TPL) and PLINQ: when you call Parallel.ForEach or run a PLINQ query, the source is partitioned internally by default. Convenient, isn’t it? Sometimes, though, you need to control the split yourself, which brings us to the more advanced usage: custom partitioning. There are two basic strategies: partitioning by range and partitioning by chunk. Range partitioning can deliver very good performance for indexable collections such as arrays and IList implementations, but it has one drawback: if a thread finishes its range early, it cannot help the other threads complete their work. Chunk partitioning is the choice when we do not know the size of the collection we want to operate on. With chunk partitioning, each thread or task in a parallel loop or query consumes a chunk of some number of source elements, processes them, and then comes back to retrieve more. The partitioner guarantees that every element is handed out and that there are no duplicates. Chunks can be of any size.
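Before writing a custom partitioner, note that the framework’s built-in Partitioner.Create factory methods already cover the range strategy. The sketch below (array size and the Math.Sqrt workload are illustrative) creates a range partitioner over an index space and feeds it to Parallel.ForEach, so each task iterates a contiguous slice:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class RangePartitionDemo
{
    static void Main()
    {
        double[] results = new double[100_000];

        // Partitioner.Create(from, to) produces range partitions, so each
        // task receives a contiguous slice of the index space.
        var rangePartitioner = Partitioner.Create(0, results.Length);

        Parallel.ForEach(rangePartitioner, range =>
        {
            // range is a Tuple<int, int>: inclusive start, exclusive end.
            for (int i = range.Item1; i < range.Item2; i++)
            {
                results[i] = Math.Sqrt(i);
            }
        });

        Console.WriteLine(results[9]); // prints 3
    }
}
```

Because each partition is a contiguous range, there is no per-element synchronization, which is exactly why range partitioning shines when per-element work is cheap and evenly distributed.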
Generally, range partitioning is faster only when the delegate’s execution time is short to medium, the source has a large number of elements, and the total work per partition is roughly equal. Chunk partitioning is therefore faster in most cases. For sources with few elements, or with long delegate execution times, the two strategies perform roughly the same.
So how do we implement dynamic partitioning? Below is an example taken from MSDN.
Whenever a partition calls MoveNext on the enumerator, the enumerator provides the partition with one list element. In the case of PLINQ and Parallel.ForEach, the partition is a Task instance. Because requests happen concurrently on multiple threads, access to the current index is synchronized.
```csharp
using System;
using System.Collections;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// An orderable dynamic partitioner for lists.
class OrderableListPartitioner<TSource> : OrderablePartitioner<TSource>
{
    private readonly IList<TSource> m_input;

    public OrderableListPartitioner(IList<TSource> input)
        : base(true, false, true)
    {
        m_input = input;
    }

    // Must override to return true.
    public override bool SupportsDynamicPartitions
    {
        get { return true; }
    }

    public override IList<IEnumerator<KeyValuePair<long, TSource>>> GetOrderablePartitions(int partitionCount)
    {
        var dynamicPartitions = GetOrderableDynamicPartitions();
        var partitions = new IEnumerator<KeyValuePair<long, TSource>>[partitionCount];
        for (int i = 0; i < partitionCount; i++)
        {
            partitions[i] = dynamicPartitions.GetEnumerator();
        }
        return partitions;
    }

    public override IEnumerable<KeyValuePair<long, TSource>> GetOrderableDynamicPartitions()
    {
        return new ListDynamicPartitions(m_input);
    }

    private class ListDynamicPartitions : IEnumerable<KeyValuePair<long, TSource>>
    {
        private IList<TSource> m_input;
        private int m_pos = 0;

        internal ListDynamicPartitions(IList<TSource> input)
        {
            m_input = input;
        }

        public IEnumerator<KeyValuePair<long, TSource>> GetEnumerator()
        {
            while (true)
            {
                // Each task gets the next item in the list. The index is
                // incremented in a thread-safe manner to avoid races.
                int elemIndex = Interlocked.Increment(ref m_pos) - 1;
                if (elemIndex >= m_input.Count)
                {
                    yield break;
                }
                yield return new KeyValuePair<long, TSource>(
                    elemIndex, m_input[elemIndex]);
            }
        }

        IEnumerator IEnumerable.GetEnumerator()
        {
            return ((IEnumerable<KeyValuePair<long, TSource>>)this)
                .GetEnumerator();
        }
    }
}

class ConsumerClass
{
    static void Main()
    {
        var nums = Enumerable.Range(0, 10000).ToArray();
        OrderableListPartitioner<int> partitioner = new OrderableListPartitioner<int>(nums);

        // Use with Parallel.ForEach.
        Parallel.ForEach(partitioner, (i) => Console.WriteLine(i));

        // Use with PLINQ.
        var query = from num in partitioner.AsParallel()
                    where num % 2 == 0
                    select num;

        foreach (var v in query)
            Console.WriteLine(v);
    }
}
```
This is an example of chunk partitioning in which each chunk consists of a single element. By handing out more elements at a time you can reduce lock contention and, in theory, achieve better performance. However, at some point larger chunks may require additional load-balancing logic to keep all threads busy until the work is done.
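If you do not need a hand-rolled partitioner, the framework can also produce a chunking, load-balancing partitioner for you via the Partitioner.Create overload that takes an array and a loadBalance flag. A minimal sketch (the data and the even-number query are illustrative):

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;

class ChunkPartitionDemo
{
    static void Main()
    {
        int[] nums = Enumerable.Range(0, 10000).ToArray();

        // loadBalance: true requests a chunking partitioner that hands out
        // elements on demand instead of pre-assigning fixed ranges.
        var chunkPartitioner = Partitioner.Create(nums, loadBalance: true);

        int evenCount = chunkPartitioner.AsParallel()
                                        .Where(n => n % 2 == 0)
                                        .Count();

        Console.WriteLine(evenCount); // prints 5000
    }
}
```

This gives you the same dynamic, on-demand distribution as the custom partitioner above, with the chunk-growth policy chosen by the runtime.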
The above is a detailed explanation of partitioning in C# PLINQ. For more information about partitions in C# PLINQ, please check out my other related articles!