PLINQ – example

.NET 4 adds the ParallelEnumerable class to the System.Linq namespace to split the work of queries
across multiple threads. Whereas the Enumerable class defines extension methods for the
IEnumerable<T> interface, most extension methods of the ParallelEnumerable class extend the
ParallelQuery<TSource> class. One important exception is the AsParallel() method, which extends
IEnumerable<TSource> and returns ParallelQuery<TSource>, so a normal collection class can be
queried in parallel.
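
As a minimal sketch of that exception, the following program calls AsParallel() on an ordinary array and stores the result in a variable typed as ParallelQuery<int>. The class name AsParallelDemo, the helper CountEven(), and the sample values are illustrative assumptions, not part of the original example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AsParallelDemo
{
    // Hypothetical helper: counts the even values of any sequence with a parallel query.
    public static int CountEven(IEnumerable<int> source)
    {
        // AsParallel() extends IEnumerable<TSource> and returns ParallelQuery<TSource>.
        ParallelQuery<int> parallel = source.AsParallel();

        // The receiver is a ParallelQuery<int>, so overload resolution binds to
        // ParallelEnumerable.Count() rather than Enumerable.Count().
        return parallel.Count(x => x % 2 == 0);
    }

    public static void Main()
    {
        int[] numbers = { 1, 2, 3, 4, 5, 6 };
        Console.WriteLine(CountEven(numbers)); // prints 3
    }
}
```

Declaring the variable with its explicit type instead of var makes the switch from IEnumerable<int> to ParallelQuery<int> visible in the code.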

Parallel queries

To demonstrate Parallel LINQ, a large collection is needed. With a small collection that fits inside
the CPU's cache, you are unlikely to see any effect. In the following code, a large int array is
filled with random values:

const int arraySize = 100000000;
var data = new int[arraySize];
var r = new Random();
for (int i = 0; i < arraySize; i++)
{
    // Next(40) returns a non-negative value less than 40.
    data[i] = r.Next(40);
}

Now you can use a LINQ query to filter the data and sum the filtered values. The query defines a
filter with the where clause so that only items with values less than 20 are included, and then the
aggregation method Sum() is invoked. The only difference from the LINQ queries you've seen so far
is the call to the AsParallel() method.

var sum = (from x in data.AsParallel()
           where x < 20
           select x).Sum();

As with the LINQ queries you have seen so far, the compiler translates the query syntax into method
calls, here AsParallel(), Where(), and Sum(). Because select x follows a where clause and only
passes each element through, the compiler omits the Select() call; the method-syntax version below
writes it out explicitly, which does not change the result. AsParallel() is defined by the
ParallelEnumerable class as an extension of the IEnumerable<T> interface, so it can be called on a
simple array. AsParallel() returns ParallelQuery<TSource>. Because of the returned type, the Where()
method chosen by the compiler is ParallelEnumerable.Where() instead of Enumerable.Where(). In the
following code, the Select() and Sum() methods are from ParallelEnumerable as well. In contrast to
the implementation of the Enumerable class, with the ParallelEnumerable class the query is
partitioned so that multiple threads can work on it. The array can be split into multiple parts,
with a different thread filtering the items of each part. After the partitioned work is completed,
the partial results must be merged to get the sum over all parts.

var sum = data.AsParallel().Where(x => x < 20).Select(x => x).Sum();
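
The partition-and-merge idea can be sketched by hand with Parallel.ForEach() and a range partitioner. This is only a rough illustration of the principle under assumed names (PartitionMergeSketch, SumBelow); it is not how PLINQ is implemented internally, and PLINQ chooses its own partitioning strategy.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class PartitionMergeSketch
{
    // Hypothetical helper: sums all values below 'limit' by partitioning the array by hand.
    public static long SumBelow(int[] data, int limit)
    {
        // Partitioner.Create() rejects an empty range, so handle that case up front.
        if (data.Length == 0) return 0;

        long total = 0;
        Parallel.ForEach(
            Partitioner.Create(0, data.Length),   // split the index range into chunks
            () => 0L,                             // each worker starts with its own subtotal
            (range, loopState, subtotal) =>
            {
                // Filter and sum one chunk without touching shared state.
                for (int i = range.Item1; i < range.Item2; i++)
                {
                    if (data[i] < limit) subtotal += data[i];
                }
                return subtotal;
            },
            // Merge step: add each worker's subtotal to the shared total atomically.
            subtotal => Interlocked.Add(ref total, subtotal));
        return total;
    }

    public static void Main()
    {
        int[] data = { 5, 25, 10, 30, 19, 20 };
        Console.WriteLine(SumBelow(data, 20));                              // prints 34
        Console.WriteLine(data.AsParallel().Where(x => x < 20).Sum());      // prints 34
    }
}
```

Keeping a subtotal per worker and merging only once at the end avoids contention on the shared total, which is the same reason a parallel query merges partial results after the partitions are done.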

While this code runs, open Task Manager to see that all CPUs of your system are busy. If you remove
the AsParallel() method, multiple CPUs might not be used. Of course, if your system does not have
multiple CPUs, don't expect to see an improvement with the parallel version.
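
As an alternative to watching Task Manager, the two versions can be timed with a Stopwatch. The following is a minimal sketch: the class name SumTiming, the helper methods, the fixed seed, and the smaller array size are assumptions chosen for illustration, and the measured times depend entirely on the machine, so no timings are shown.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

public static class SumTiming
{
    // Hypothetical helper: fills an array with values from 0 through 39, as in the example above.
    public static int[] CreateData(int size, int seed)
    {
        var r = new Random(seed);
        var data = new int[size];
        for (int i = 0; i < size; i++)
        {
            data[i] = r.Next(40);
        }
        return data;
    }

    // Sequential version: binds to the Enumerable extension methods.
    public static long SequentialSum(int[] data)
    {
        return data.Where(x => x < 20).Sum(x => (long)x);
    }

    // Parallel version: binds to the ParallelEnumerable extension methods.
    public static long ParallelSum(int[] data)
    {
        return data.AsParallel().Where(x => x < 20).Sum(x => (long)x);
    }

    public static void Main()
    {
        Console.WriteLine("Logical processors: {0}", Environment.ProcessorCount);
        int[] data = CreateData(10000000, 42);

        var sw = Stopwatch.StartNew();
        long sequential = SequentialSum(data);
        sw.Stop();
        Console.WriteLine("Sequential: sum {0}, {1} ms", sequential, sw.ElapsedMilliseconds);

        sw = Stopwatch.StartNew();
        long parallel = ParallelSum(data);
        sw.Stop();
        Console.WriteLine("Parallel:   sum {0}, {1} ms", parallel, sw.ElapsedMilliseconds);
    }
}
```

Both sums should match. Whether the parallel time is lower depends on the number of processors reported in the first line and on the size of the array.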