Improving Processing Performance with Parallel.ForEach


The Task Parallel Library (TPL), introduced in .NET Framework 4, can significantly improve processing performance by making better use of all available cores on the host computer. In the typical execution model, a task, which is a unit of work, executes sequentially on a single CPU core. For a recent long-running task, however, I wanted to use parallelization to distribute the work across multiple processors, since my box has four cores and eight logical processors (see Figure 1), and thereby reduce the overall processing time.

 

Figure 1. Logical Processors
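
As a quick sanity check, the number of logical processors the runtime sees can be read from Environment.ProcessorCount; on the machine in Figure 1 it should report 8:

Console.WriteLine(Environment.ProcessorCount); // should report 8 on the machine shown in Figure 1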

In this case, the task was record deletion: over a million records had been added to a table in Dynamics CRM Online in error and needed to be removed. The standard Bulk Delete process was used to attempt the cleanup, but it took a long time and ultimately errored out. So I decided to use the CRM SDK and LINQPad to test a delete script (see Listing 1). The process was the typical one: retrieve the list of Ids to delete, then loop through the result set and delete the records. The initial attempt worked, but it was slow, with a batch of 50,000 records taking about an hour to delete.

Given that the deletion could easily be split into tasks that execute efficiently on their own, it was well suited to parallelism with the TPL. The TPL provides a basic form of structured parallelism via three static methods on the Parallel class (a minimal sketch of all three follows the list):

  • Parallel.Invoke: Executes each of the provided actions, possibly in parallel
  • Parallel.For: Executes a for loop in which iterations may run in parallel
  • Parallel.ForEach: Executes a foreach loop in which iterations may run in parallel
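
For reference, here is a minimal, self-contained sketch of all three methods (plain console output only, unrelated to the CRM deletion script):

using System;
using System.Threading.Tasks;

class ParallelSamples
{
    static void Main()
    {
        // Parallel.Invoke: run a set of independent actions, possibly in parallel
        Parallel.Invoke(
            () => Console.WriteLine("Action 1"),
            () => Console.WriteLine("Action 2"));

        // Parallel.For: iterations of a counted loop may run in parallel
        Parallel.For(0, 10, i => Console.WriteLine("For iteration " + i));

        // Parallel.ForEach: iterations over a collection may run in parallel
        var items = new[] { "a", "b", "c" };
        Parallel.ForEach(items, item => Console.WriteLine("ForEach item " + item));
    }
}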

I leveraged the Parallel.ForEach method and had to make only minor changes to the original query, as shown in Figure 2 and in Listings 1 and 2 below.

Figure 2: foreach query change

 

var entities = history.Where(e => e.tls_User.Id == userId)
                      .Select(e => new { e.Id })
                      .ToList();

foreach (var e in entities)
{
    Delete("history", e.Id);
}

Listing 1: Original Query

 

var entities = history.Where(e => e.tls_User.Id == userId)
                      .Select(e => new { e.Id })
                      .ToList();

Parallel.ForEach(entities, e =>
{
    Delete("history", e.Id);
});

Listing 2: Revised Query

 

After switching from the standard C# foreach to Parallel.ForEach and re-running the query, the process ran almost ten times faster. Instead of ~833 records per minute (~50,000 per hour), the revised query processed ~8,000 records per minute (~480,000 per hour).
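
One practical note: when each iteration makes a call to an online service, unbounded parallelism can run into service-side throttling. If that happens, you can cap the concurrency with ParallelOptions. The sketch below assumes the same entities list and Delete helper from Listing 2, and the value of 8 is only illustrative:

// Cap the number of concurrent delete calls; tune the value for your environment
var options = new ParallelOptions { MaxDegreeOfParallelism = 8 };

Parallel.ForEach(entities, options, e =>
{
    Delete("history", e.Id);
});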

You can also parallelize the LINQ query itself by adding the AsParallel() method and then parallelize the foreach with the ForAll() method. Here’s a simple example:

"abcdef".AsParallel().Select(c => char.ToUpper(c)).ForAll(Console.Write);

All I wanted to do was to parallelize a foreach, so I used its parallel version, Parallel.ForEach().

As you can see, for the deletion process there was a significant performance increase from using parallelism, but that may not always be the case: it depends on several factors, such as the number of CPUs, the number of iterations involved, the type of parallelism (data, task, or dataflow), and whether the problem is embarrassingly parallel. However, programs that are properly designed to take advantage of parallelism can execute faster than their sequential counterparts, which is often a significant market advantage.
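
If you want to check whether parallelism actually pays off for a particular workload, a simple timing comparison is usually enough. The sketch below uses System.Diagnostics.Stopwatch; SimulatedWork is a hypothetical stand-in for a unit of work such as a single delete call:

using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

class TimingComparison
{
    // Hypothetical stand-in for a unit of work, e.g. one blocking service call
    static void SimulatedWork(int i)
    {
        Thread.Sleep(10);
    }

    static void Main()
    {
        var items = Enumerable.Range(0, 500).ToList();

        var sw = Stopwatch.StartNew();
        foreach (var i in items)
        {
            SimulatedWork(i);
        }
        Console.WriteLine("Sequential: " + sw.ElapsedMilliseconds + " ms");

        sw.Restart();
        Parallel.ForEach(items, SimulatedWork);
        Console.WriteLine("Parallel:   " + sw.ElapsedMilliseconds + " ms");
    }
}

On a multi-core machine the parallel loop in this sketch typically finishes in a fraction of the sequential time, though the exact ratio depends on the factors listed above.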

