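With the arrival of parallel algorithms came parallel collections, that is, thread-safe collections. Microsoft was thorough here and did not forget LINQ either: it also shipped a parallel version of LINQ, PLINQ - Parallel LINQ.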

I. Parallel collections - thread-safe collections

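  Parallel computation runs on several threads at the same time, so each thread's access to shared resources has to be controlled. Let's first see how the ordinary List collection behaves under parallel computation. Create a console application and add a PEnumerable class (you can also write the test directly in the main method, but keeping it separate is recommended), then write the following method: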

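using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Diagnostics; // Stopwatch, used by the timing tests later on
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace ThreadPool
{
    public class PEnumerable
    {
        public static void ListWithParallel()
        {
            // List<T> is not thread-safe: concurrent Add calls can lose items
            // (and may occasionally throw), so the final count is unreliable.
            List<int> list = new List<int>();
            Parallel.For(0, 10000, item =>
            {
                list.Add(item);
            });
            Console.WriteLine("List's count is {0}", list.Count);
        }
    }
}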

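The result shows 5851, yet the loop ran 10000 times. Why is the result wrong? Because List<T> is not a thread-safe collection: every thread can modify it at the same time without any synchronization, so concurrent Add calls overwrite one another and items are lost. The exact number differs from run to run.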
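To make the cause concrete, here is a minimal sketch of one way to get the right count from a plain List<int>: serialize every Add with a lock. The class name LockedListDemo, the method AddWithLock and the lock object gate are names made up for this illustration, not part of the original test class.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class LockedListDemo
{
    // Adds 0..count-1 to a plain List<int> from parallel threads.
    // Every Add goes through the same lock, so no item is lost.
    public static int AddWithLock(int count)
    {
        var list = new List<int>();
        var gate = new object(); // hypothetical name for the lock object

        Parallel.For(0, count, item =>
        {
            lock (gate)
            {
                list.Add(item);
            }
        });

        return list.Count;
    }

    public static void Main()
    {
        // prints "Locked List's count is 10000"
        Console.WriteLine("Locked List's count is {0}", AddWithLock(10000));
    }
}
```

The lock makes the result correct, but every thread now waits its turn on each Add, which gives up much of the benefit of running in parallel for work this small.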

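Next, let's look at the parallel (thread-safe) collections in the System.Collections.Concurrent namespace. We start with the generic ConcurrentBag<T> collection, which is used much like List<T>. Here is a method to test it: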

public static void ConcurrentBagWithPallel()
  {
     // ConcurrentBag<T> is thread-safe, so concurrent Add calls lose nothing.
     ConcurrentBag<int> list = new ConcurrentBag<int>();
     Parallel.For(0, 10000, item =>
     {
        list.Add(item);
     });
     Console.WriteLine("ConcurrentBag's count is {0}", list.Count);
  }

As you can see, the ConcurrentBag result is correct. Now let's modify the code to see how the data is actually stored inside the ConcurrentBag:

public static void ConcurrentBagWithPallel()
  {
     ConcurrentBag<int> list = new ConcurrentBag<int>();
     Parallel.For(0, 10000, item =>
     {
        list.Add(item);
     });
     Console.WriteLine("ConcurrentBag's count is {0}", list.Count);
     // Print only the first few items to see the order they are stored in.
     int n = 0;
     foreach(int i in list)
     {
        if (n > 10)
           break;
        n++;
        Console.WriteLine("Item[{0}] = {1}",n,i);
     }
     // LINQ methods such as Max() work on the bag as on any IEnumerable<T>.
     Console.WriteLine("ConcurrentBag's max item is {0}", list.Max());
  }

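As you can see, the data in a ConcurrentBag is not stored in order; the order is scrambled and effectively arbitrary. The LINQ methods we normally use, such as Max, First and Last, are all still available. Its usage is very similar to Enumerable; see Microsoft's MSDN documentation for the details.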

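There are many more thread-safe collections, and they closely resemble the collections we use every day: ConcurrentDictionary, which is similar to Dictionary, as well as ConcurrentStack, ConcurrentQueue and others.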
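As a rough sketch of how two of these are used, the example below counts values in a ConcurrentDictionary and fills a ConcurrentQueue from parallel threads. The class name OtherConcurrentCollectionsDemo and its methods are invented for this illustration; AddOrUpdate, Enqueue and TryDequeue are the standard members of those types.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class OtherConcurrentCollectionsDemo
{
    // Counts how many numbers in [0, count) have each remainder modulo 3.
    public static ConcurrentDictionary<int, int> CountByRemainder(int count)
    {
        var counts = new ConcurrentDictionary<int, int>();
        Parallel.For(0, count, i =>
        {
            // Inserts 1 for a new key, otherwise replaces the value with old + 1.
            counts.AddOrUpdate(i % 3, 1, (key, old) => old + 1);
        });
        return counts;
    }

    // Enqueues 0..count-1 from parallel threads and returns the queue.
    public static ConcurrentQueue<int> FillQueue(int count)
    {
        var queue = new ConcurrentQueue<int>();
        Parallel.For(0, count, i => queue.Enqueue(i));
        return queue;
    }

    public static void Main()
    {
        var counts = CountByRemainder(9000);
        Console.WriteLine("{0} {1} {2}", counts[0], counts[1], counts[2]);

        var queue = FillQueue(1000);
        int first;
        // TryDequeue returns false instead of throwing when the queue is empty.
        if (queue.TryDequeue(out first))
        {
            Console.WriteLine("Dequeued {0}, {1} items left", first, queue.Count);
        }
    }
}
```

Note the Try... pattern: because another thread may empty the collection between a check and a read, the concurrent collections combine both steps in one call.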

II. Parallel LINQ - usage and performance

1. AsParallel

Earlier we looked at the parallel For and ForEach; now let's see what the parallel version of LINQ looks like. For the test, we add a custom class:

public class Custom
{
    public string Name { get; set; }
    public int Age { get; set; }
    public string Address { get; set; }
}

Then write the following test code:

public static void TestPLinq()
  {
     Stopwatch sw = new Stopwatch();
     List<Custom> customs = new List<Custom>();
     for (int i = 0; i < 2000000; i++)
     {
        customs.Add(new Custom() { Name = "Jack", Age = 21, Address = "NewYork" });
        customs.Add(new Custom() { Name = "Jime", Age = 26, Address = "China" });
        customs.Add(new Custom() { Name = "Tina", Age = 29, Address = "ShangHai" });
        customs.Add(new Custom() { Name = "Luo", Age = 30, Address = "Beijing" });
        customs.Add(new Custom() { Name = "Wang", Age = 60, Address = "Guangdong" });
        customs.Add(new Custom() { Name = "Feng", Age = 25, Address = "YunNan" });
     }
     // Sequential LINQ query.
     sw.Start();
     var result = customs.Where(c => c.Age > 26).ToList();
     sw.Stop();
     Console.WriteLine("Linq time is {0}.",sw.ElapsedMilliseconds);
     // Same query through PLINQ; Restart() resets the timer and starts it again.
     sw.Restart();
     var result2 = customs.AsParallel().Where(c => c.Age > 26).ToList();
     sw.Stop();
     Console.WriteLine("Parallel Linq time is {0}.", sw.ElapsedMilliseconds);
  }
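One thing a timing test like TestPLinq does not show is the order of the results. Like the ConcurrentBag earlier, a PLINQ query does not promise to return items in source order unless you ask for it. The sketch below, with the made-up class name PlinqOrderDemo, uses the standard AsOrdered() operator to keep the source order and checks the result against the sequential query.

```csharp
using System;
using System.Linq;

public static class PlinqOrderDemo
{
    // Returns the even numbers in [0, count), filtered in parallel.
    // AsOrdered() asks PLINQ to preserve the source order in the result.
    public static int[] EvenNumbersOrdered(int count)
    {
        return Enumerable.Range(0, count)
                         .AsParallel()
                         .AsOrdered()
                         .Where(n => n % 2 == 0)
                         .ToArray();
    }

    public static void Main()
    {
        int[] sequential = Enumerable.Range(0, 1000).Where(n => n % 2 == 0).ToArray();
        int[] parallel = EvenNumbersOrdered(1000);

        // Same elements in the same order as the sequential query.
        Console.WriteLine(sequential.SequenceEqual(parallel)); // prints True
    }
}
```

Preserving order costs some extra bookkeeping, so it is worth adding only when the order of the output matters.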

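In TestPLinq, the only difference between the two queries is the added AsParallel() call; when run, the method prints the elapsed time of each query so the two can be compared. A second test, OrderByTest, compares grouping with GroupBy against ToLookup on the same data: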

public static void OrderByTest()
  {
     Stopwatch stopWatch = new Stopwatch();
     List<Custom> customs = new List<Custom>();
     for (int i = 0; i < 2000000; i++)
     {
        customs.Add(new Custom() { Name = "Jack", Age = 21, Address = "NewYork" });
        customs.Add(new Custom() { Name = "Jime", Age = 26, Address = "China" });
        customs.Add(new Custom() { Name = "Tina", Age = 29, Address = "ShangHai" });
        customs.Add(new Custom() { Name = "Luo", Age = 30, Address = "Beijing" });
        customs.Add(new Custom() { Name = "Wang", Age = 60, Address = "Guangdong" });
        customs.Add(new Custom() { Name = "Feng", Age = 25, Address = "YunNan" });
     }
     stopWatch.Restart();
     var groupByAge = customs.GroupBy(item => item.Age).ToList();
     foreach (var item in groupByAge)
     {
        Console.WriteLine("Age={0},count = {1}", item.Key, item.Count());
     }
     stopWatch.Stop();
     Console.WriteLine("Linq group by time is: " + stopWatch.ElapsedMilliseconds);
     stopWatch.Restart();
     var lookupList = customs.ToLookup(i => i.Age);
     foreach (var item in lookupList)
     {
        Console.WriteLine("LookUP:Age={0},count = {1}", item.Key, item.Count());
     }
     stopWatch.Stop();
     Console.WriteLine("LookUp group by time is: " + stopWatch.ElapsedMilliseconds);
  }

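When run, the method prints the item count for each age under both approaches, followed by the elapsed time of each.

The ToLookup method converts the collection into a read-only collection, so when grouping large amounts of data it performs better than List. You can consult the relevant material for more; for reasons of space I won't go into further detail here.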
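For readers who have not used ToLookup before, here is a small sketch of what the resulting lookup offers: it is built once and can then be indexed by key. The class name LookupDemo, the method NamesByAge and the sample entry "Ann" are made up for this illustration.

```csharp
using System;
using System.Linq;

public static class LookupDemo
{
    // Builds a lookup from age to the names of the people of that age.
    public static ILookup<int, string> NamesByAge()
    {
        var people = new[]
        {
            new { Name = "Jack", Age = 21 },
            new { Name = "Tina", Age = 29 },
            new { Name = "Ann",  Age = 21 }, // hypothetical extra entry
        };
        return people.ToLookup(p => p.Age, p => p.Name);
    }

    public static void Main()
    {
        ILookup<int, string> byAge = NamesByAge();

        Console.WriteLine(byAge[21].Count());  // prints 2
        Console.WriteLine(byAge.Contains(29)); // prints True
        // Indexing a missing key yields an empty sequence instead of throwing.
        Console.WriteLine(byAge[99].Any());    // prints False
    }
}
```

Unlike a Dictionary, a lookup cannot be modified after it is created, which is what the read-only remark above refers to.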