Multithreading, lambdas and local variables
My question is: in the code below, can I be sure that the instance methods will access the variables I think they will, or can those variables be changed by another thread while I'm still working with them? Do closures have anything to do with this, i.e. will I be working on a local copy of the IEnumerable<T>, so that enumeration is safe?
To paraphrase my question, do I need any locks if I'm never writing to shared variables?
public class CustomerClass
{
    private Config cfg = (Config)ConfigurationManager.GetSection("Customer");

    public void Run()
    {
        // Materialized once here; only read from inside the parallel loop.
        var serviceGroups = this.cfg.ServiceDeskGroups.Select(n => n.Group).ToList();
        var groupedData = DataReader.GetSourceData().AsEnumerable().GroupBy(n => n.Field<int>("ID"));
        Parallel.ForEach<IGrouping<int, DataRow>, CustomerDataContext>(
            groupedData,
            () => new CustomerDataContext(),   // a separate data context for each worker task
            (g, _, ctx) =>
            {
                var inter = this.FindOrCreateInteraction(ctx, g.Key);
                inter.ID = g.Key;
                inter.Title = g.First().Field<string>("Title");
                this.CalculateSomeProperty(ref inter, serviceGroups);
                return ctx;
            },
            ctx => ctx.SubmitAllChanges());    // runs once for each worker task when it finishes
    }

    private Interaction FindOrCreateInteraction(CustomerDataContext ctx, int ID)
    {
        // Comparison must use ==, not the assignment operator =.
        var inter = ctx.Interactions.Where(n => n.Id == ID).SingleOrDefault();
        if (inter == null)
        {
            inter = new Interaction();
            ctx.InsertOnSubmit(inter);
        }
        return inter;
    }

    private void CalculateSomeProperty(ref Interaction inter, IEnumerable<string> serviceDeskGroups)
    {
        // Reads from the shared List<T> passed in as a parameter. Changes the state of the ref'd object.
        if (serviceDeskGroups.Contains(inter.Group))
        {
            inter.Ours = true;
        }
    }
}
I seem to have found the answer and, in the process, also the question.
The real question was whether local "variables" that actually turn out to be objects can be trusted for concurrent access. The answer is no: if they have internal state that is not handled in a thread-safe manner, all bets are off. The closure doesn't help; it just captures a reference to said object.
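To illustrate the point about closures, here is a minimal sketch (the class and method names are made up for this example). It shows that a lambda sees changes made to a captured list after the lambda was created, so it is not working on a private copy.

```csharp
using System;
using System.Collections.Generic;

public static class ClosureCaptureDemo
{
    // Returns the element count the lambda observes after the list
    // has been mutated outside of it.
    public static int CountSeenByLambda()
    {
        var groups = new List<string> { "a", "b" };

        // The lambda captures the variable 'groups' (a reference to the list),
        // not a snapshot of the list's contents.
        Func<int> count = () => groups.Count;

        groups.Add("c");   // mutation made after the lambda was created

        return count();    // the lambda sees the same list object, now with 3 items
    }

    public static void Main()
    {
        Console.WriteLine(CountSeenByLambda()); // prints 3
    }
}
```

If the closure had copied the list, the result would have been 2. Any thread that holds the same reference can therefore affect what the lambda sees.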
In my specific case, concurrent reads from IEnumerable<T> and no writes to it, it is actually thread safe, because each call to foreach, Contains(), Where(), etc. gets a fresh new IEnumerator, which is only visible from the thread that requested it. Any other objects, however, must also be checked, one by one.
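A rough sketch of that read-only scenario, with hypothetical names and sample data. Many parallel iterations enumerate the same List<T> with foreach while nothing writes to it. The only shared write is the hit counter, which is updated atomically.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ConcurrentReadDemo
{
    // Enumerates one shared list from many parallel iterations and
    // returns how many iterations found the value they were looking for.
    public static int CountHits(int iterations)
    {
        IEnumerable<string> groups = new List<string> { "ServiceDesk", "Network", "Billing" };
        int hits = 0;

        Parallel.For(0, iterations, i =>
        {
            // Each foreach asks the list for its own enumerator, so the
            // iteration position is never shared between threads.
            foreach (var g in groups)
            {
                if (g == "Network")
                {
                    Interlocked.Increment(ref hits); // atomic update of the shared counter
                    break;
                }
            }
        });

        return hits;
    }

    public static void Main()
    {
        Console.WriteLine(CountHits(1000)); // prints 1000
    }
}
```

This only holds while no thread modifies the list. A single concurrent writer would invalidate the assumption.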
So, hooray, no locks or synchronized collections for me :)
Thanks to @ebb and @Dave, although you didn't answer the question directly, you pointed me in the right direction.
If you're interested in the results, this is a run on my home PC (a quad-core) with Thread.SpinWait to simulate the processing time of a row. The real app had an improvement of almost 2X (01:03 vs 00:34) on a dual-core hyper-threaded machine with SQL Server on the local network.
Single-threaded, using foreach. I don't know why, but there is a pretty high number of cross-core context switches.
Using Parallel.ForEach, lock-free with thread-locals where needed.
Right now, from what I can tell, your instance methods are not using any member variables. That makes them stateless and therefore threadsafe. However, in that same case, you'd be better off marking them "static" for code clarity and a slight performance benefit.
If those instance methods were using a member variable, then they'd only be as threadsafe as that variable (for example, if you used a simple list, it would not be threadsafe and you may see weird behavior). Long story short, member variables are the enemy of easy thread safety.
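As a small illustration of "only as threadsafe as that variable", the sketch below (a hypothetical class, not part of the original code) has parallel iterations write to a member List<T>. Because List<T>.Add is not safe for concurrent writers, every write goes through a lock.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public class ResultCollector
{
    // Shared member state: every parallel iteration writes to this list.
    private readonly List<int> results = new List<int>();
    private readonly object gate = new object();

    // Adds 'count' items from parallel iterations and returns the final size.
    public int Collect(int count)
    {
        Parallel.For(0, count, i =>
        {
            // Without the lock, concurrent Add calls could lose items or throw.
            lock (gate)
            {
                results.Add(i);
            }
        });

        return results.Count; // safe to read: Parallel.For has completed
    }

    public static void Main()
    {
        Console.WriteLine(new ResultCollector().Collect(1000)); // prints 1000
    }
}
```

Passing data in as parameters, or using thread-local state as in the question's Parallel.ForEach call, avoids the need for the lock altogether.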
Here's my refactor (disclaimer: not tested). If you want to provide data from outside, you'll stay saner if you pass it in as parameters and don't keep it in member variables:
UPDATE: You asked for a way to reference your read-only list, so I've added that and removed the static tags (so that the instance variable can be shared).
public class CustomerClass
{
    // ArrayList.Synchronized(IList) returns a non-generic IList wrapper
    // (namespace System.Collections), so the field cannot be typed as List<string>.
    private readonly IList someReadOnlyList;

    public CustomerClass()
    {
        List<string> tempList = new List<string>() { "string1", "string2" };
        someReadOnlyList = ArrayList.Synchronized(tempList);
    }

    public void Run()
    {
        var groupedData = DataReader.GetSourceData().AsEnumerable().GroupBy(n => n.Field<int>("ID"));
        Parallel.ForEach<IGrouping<int, DataRow>, CustomerDataContext>(
            groupedData,
            () => new CustomerDataContext(),
            (g, _, ctx) =>
            {
                var inter = FindOrCreateInteraction(ctx, g.Key);
                inter.ID = g.Key;
                inter.Title = g.First().Field<string>("Title");
                CalculateSomeProperty(ref inter);
                return ctx;
            },
            ctx => ctx.SubmitAllChanges());
    }

    private Interaction FindOrCreateInteraction(CustomerDataContext ctx, int ID)
    {
        // Comparison must use ==, not the assignment operator =.
        var query = ctx.Interactions.Where(n => n.Id == ID);
        if (query.Any())
        {
            return query.Single();
        }
        else
        {
            var inter = new Interaction();
            ctx.InsertOnSubmit(inter);
            return inter;
        }
    }

    private void CalculateSomeProperty(ref Interaction inter)
    {
        // Reads from the shared instance list.
        Console.WriteLine(someReadOnlyList[0]);
        //do some other stuff
    }
}