Going parallel with C#

Web Developer Bartosz discusses how he cut down the processing time for sending 8,000 to 10,000 daily emails using Microsoft's Task Parallel Library.

Sometimes, a few small changes to code can save a significant amount of application processing time.

The Problem

I recently had to refactor a Microsoft Windows Service Application, which is responsible for sending around 8,000 to 10,000 emails daily. A single application run was taking around 8 hours to complete and I wanted to cut this time to around 2 hours max. The processing workflow was fairly simple and could be divided into 4 steps:

  • Iterate through a list of all users
  • Check if a user should receive an email
  • If so, retrieve the email content from an external HTTP request
  • Send an e-mail
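To make the bottleneck concrete, the four steps above can be sketched as a synchronous loop. This is a minimal sketch, not the actual service code: the helper names (`ShouldReceiveEmail`, `SendEmail`) and the content URL are hypothetical stand-ins.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;

class SequentialMailer
{
    static readonly HttpClient Http = new HttpClient();

    // Hypothetical helpers standing in for the real service logic.
    static bool ShouldReceiveEmail(string user) => user.EndsWith("@example.com");
    static void SendEmail(string user, string body) { /* SMTP call omitted */ }

    static void ProcessAll(IEnumerable<string> users)
    {
        foreach (var user in users)                  // step 1: iterate all users
        {
            if (!ShouldReceiveEmail(user)) continue; // step 2: eligibility check

            // Step 3: fetch the email body over HTTP — the ~3.5 s bottleneck,
            // paid once per user because everything runs on a single thread.
            var body = Http.GetStringAsync("https://content.example.com/" + user)
                           .GetAwaiter().GetResult();

            SendEmail(user, body);                   // step 4: send
        }
    }
}
```

Each iteration blocks on the HTTP call, so the total run time is roughly the sum of all the individual waits.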

The code fulfilling the task wasn’t complex, so there wasn’t much to gain from optimising it directly. However, performance analysis showed that the most time-consuming operation throughout the process was step 3 (generating the e-mail content), which took on average 3.5 seconds to complete! This meant that when my application was processing 10,000 emails it was wasting well over 6 hours just waiting for HTTP responses to return from the external website. All of this because the emails were processed synchronously on one thread, so each email had to wait for the previous one to finish. But what if I could process multiple emails at the same time?

The Solution

The Task Parallel Library (TPL) offers a solution for the scenario described above. TPL was first introduced in .NET 4.0 and contains a set of public types and APIs whose main purpose is maximizing code performance by dynamically scaling the degree of concurrency and making the most effective use of modern multicore CPUs. To achieve this, TPL manages the partitioning of work, the scheduling of threads on the ThreadPool, state management and other low-level tasks, creating an abstraction layer which is easier for a programmer to work with. Two main concepts built into TPL are asynchronous and parallel programming. I was fairly familiar with the first of them, which uses the async/await keywords introduced in .NET 4.5. Maybe because they seemed to get all the love from the press and became the new “cool feature” that coincided well with the rise of mobile application development. However, programming mostly in a web environment and using AJAX requests to support the asynchronous aspects of my applications, I had never found myself motivated enough to examine other aspects of TPL. For that reason, discovering the Parallel class and the simplicity of its implementation was such a great surprise for me.

Data parallelism is not a new concept and has been around in .NET for a while. It was introduced as part of the TPL along with the general concept of a “Task”. A Task is an independent unit of work, running within the program and managing resources within its own scope. The Parallel class allows our program to run several of those Tasks concurrently and farms out the calls to each of them on different cores. The number of threads is auto-tuned based upon the configuration of the machine and how each thread is being used by other programs. All public and protected members of the Parallel class are thread-safe, and its most commonly used methods are the For() loop:

  Parallel.For(0, 10, index =>
  {
      Console.WriteLine("Task Id {0} processing index: {1}", Task.CurrentId, index);
  });

And the ForEach() equivalent:

  var numbers = new List<int>() { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };

  Parallel.ForEach(numbers, number =>
  {
      Console.WriteLine("Task Id {0} processing index: {1}", Task.CurrentId, number);
  });

Both methods hold multiple overload varieties to accommodate different scenarios.
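For illustration, here is a sketch of two of those overloads: one taking a ParallelOptions instance (its MaxDegreeOfParallelism property caps the number of concurrent workers, which is useful when the loop body calls an external service), and one exposing a ParallelLoopState for early exit. The numbers and limits here are arbitrary examples.

```csharp
using System;
using System.Threading.Tasks;

class OverloadDemo
{
    static void Main()
    {
        // Overload with ParallelOptions: cap concurrency at 4 workers so an
        // external service isn't hit with unbounded simultaneous requests.
        var options = new ParallelOptions { MaxDegreeOfParallelism = 4 };

        Parallel.For(0, 10, options, index =>
        {
            Console.WriteLine("Task Id {0} processing index: {1}",
                Task.CurrentId, index);
        });

        // Overload with ParallelLoopState: request an early exit,
        // similar to `break` in a sequential loop.
        Parallel.ForEach(new[] { 1, 2, 3, 4, 5 }, (number, state) =>
        {
            if (number > 3) state.Stop();
            else Console.WriteLine("Processing {0}", number);
        });
    }
}
```

Note that the output order is nondeterministic, since iterations run concurrently.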


Processing multiple Tasks in parallel on multiple threads introduces specific challenges that programmers need to be aware of. Most of them originate from the concurrent workflow and the unpredictable order in which each thread is processed. When processing in parallel, writing to shared memory locations should be kept to a minimum. If there is any calculation, or even a simple count, assigned to a global variable, one can expect unpredictable behaviour and exceptions to be thrown. Different approaches can be taken to tackle those situations, such as using locks or, better, the Interlocked class. Both of those allow programmers to handle atomic operations on simple variables and preserve the true value globally. I have used the latter in my application to track the total number of emails sent out to users:

  Interlocked.Increment(ref emailsProcessed);
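A small self-contained sketch shows why the atomic increment matters. A plain `emailsProcessed++` inside the parallel loop is a read-modify-write race and would intermittently lose updates; `Interlocked.Increment` makes the operation atomic, so the final count is always exact.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class Counters
{
    static int emailsProcessed;

    public static int Count()
    {
        emailsProcessed = 0;

        Parallel.For(0, 10000, i =>
        {
            // A plain emailsProcessed++ here would be a lost-update race:
            // two threads can read the same value and both write value + 1.
            // Interlocked.Increment performs the read-modify-write atomically.
            Interlocked.Increment(ref emailsProcessed);
        });

        return emailsProcessed;
    }

    static void Main() =>
        Console.WriteLine(Count()); // always prints 10000
}
```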

In scenarios where we need to handle some more complex logic, and not just a simple count, we can use one of the thread-safe collection classes from the System.Collections.Concurrent namespace. In my application I have used the ConcurrentDictionary object to improve the processing of some data-encrypted properties, and a ConcurrentQueue to hold any exceptions thrown from multiple threads.

Reading from and writing to a database can also be problematic in a multithreaded environment, especially if we use an object-relational mapper like Entity Framework (EF). Version 6.0 introduced async support, but some features, like lazy loading, are still only possible synchronously. My application had an issue with a single DbContext being shared between different threads, which occasionally threw an InvalidOperationException, and I managed to resolve it by creating a new context for each request.

It’s also worth mentioning that the Parallel class is a good choice when the results of the method don’t relate to each other. My application sends e-mails to users and I don’t have to worry about the order in which they are delivered, but in cases where the order of returned items matters, one can use the PLINQ library. PLINQ is a parallel version of LINQ which allows us to use a descriptive and elegant syntax, like so:

  return numbers
      .AsParallel()
      .AsOrdered()
      .Select(number => Process(number));

PLINQ is generally not as fast as a Parallel.ForEach() loop, but it can preserve the order of the collection being processed, so a decision can be made depending on your use case.
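The exception-collecting pattern mentioned earlier can be sketched as follows. This is an illustrative example, not the service code: each worker catches its own exception and enqueues it into a shared ConcurrentQueue, which is safe to write to from many threads at once.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class ExceptionCollection
{
    public static int Run()
    {
        var errors = new ConcurrentQueue<Exception>();

        Parallel.ForEach(new[] { 1, 2, 0, 4 }, number =>
        {
            try
            {
                var result = 10 / number; // throws DivideByZeroException for 0
            }
            catch (Exception ex)
            {
                // ConcurrentQueue<T> is safe for concurrent writers,
                // so no lock is needed around Enqueue.
                errors.Enqueue(ex);
            }
        });

        return errors.Count;
    }

    static void Main() =>
        Console.WriteLine("Errors captured: {0}", Run()); // prints 1
}
```

After the loop completes, the queued exceptions can be logged or rethrown as an AggregateException in one place.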


Most online resources around TPL concentrate on its asynchronous features – async/await. Many libraries are being updated (such as EF 6), or designed from the ground up with async support (OWIN, Web API, SignalR). In all of this buzz, lesser-known features like parallel programming seem to be overlooked. Developers are often not fully aware of their advantages or how easy they are to implement, while sometimes just a few changes to our code may bring truly impressive results. Refactoring my application involved changing the foreach statement and amending some code within the loop to support concurrent processing. The benefit I got from those few changes cut the overall processing time by over 70%.
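Putting the pieces together, the refactored loop might look roughly like this. This is a hedged sketch rather than the author’s actual code: the helper names and URL are hypothetical, and it combines Parallel.ForEach with the Interlocked counter described above.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

class ParallelMailer
{
    static readonly HttpClient Http = new HttpClient();
    static int emailsProcessed;

    // Hypothetical helpers standing in for the real service logic.
    static bool ShouldReceiveEmail(string user) => user.EndsWith("@example.com");
    static void SendEmail(string user, string body) { /* SMTP call omitted */ }

    static void ProcessAll(IEnumerable<string> users)
    {
        Parallel.ForEach(users, user =>
        {
            if (!ShouldReceiveEmail(user)) return; // `return` replaces `continue`

            // Many HTTP waits now overlap instead of queuing
            // behind one another on a single thread.
            var body = Http.GetStringAsync("https://content.example.com/" + user)
                           .GetAwaiter().GetResult();

            SendEmail(user, body);
            Interlocked.Increment(ref emailsProcessed); // thread-safe tally
        });
    }
}
```

Because email order doesn’t matter here, no PLINQ ordering is needed; the only shared state is the counter, which stays consistent via Interlocked.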

By Bartosz Malinowski, Web Developer