Entity Framework: Avoiding Inserting Duplicates

Say I have the following conceptual model: there are stories that have tags (more than one, so it's a many-to-many relationship), and each tag belongs to a particular category.

My data comes from an external source, and before inserting it I want to make sure that no duplicate tags are added.

Updated code snippet:

    static void Main(string[] args)
    {
        Story story1 = new Story();
        story1.Title = "Introducing the Entity Framework";
        story1.Tags.Add(new Tag { Name = ".net",  });
        story1.Tags.Add(new Tag { Name = "database" });

        Story story2 = new Story();
        story2.Title = "Working with Managed DirectX";
        story2.Tags.Add(new Tag { Name = ".net" });
        story2.Tags.Add(new Tag { Name = "graphics" });

        List<Story> stories = new List<Story>();
        stories.Add(story1);
        stories.Add(story2);

        EfQuestionEntities db = new EfQuestionEntities();

        Category category = (from c in db.Categories
                             where c.Name == "Programming"
                             select c).First();

        foreach (Story story in stories)
        {
            foreach (Tag tag in story.Tags)
            {
                Tag currentTag = tag;
                currentTag = GetTag(tag.Name, category, db);
            }

            db.Stories.AddObject(story);
        }

        db.SaveChanges();
    }

    public static Tag GetTag(string name, Category category, EfQuestionEntities db)
    {
        var dbTag = from t in db.Tags.Include("Category")
                    where t.Name == name
                    select t;

        if (dbTag.Count() > 0)
        {
            return dbTag.First();
        }

        var cachedTag = db.ObjectStateManager.GetObjectStateEntries(EntityState.Added).
            Where(ose => ose.EntitySet == db.Tags.EntitySet).
            Select(ose => ose.Entity).
            Cast<Tag>().Where(x => x.Name == name);

        if (cachedTag.Count() != 0) 
        {
            return cachedTag.First();
        }

        Tag tag = new Tag();
        tag.Name = name;
        tag.Category = category;

        db.Tags.AddObject(tag);

        return tag;
    }

However, I get an exception saying that an object with the same EntityKey already exists in the ObjectContext.

Also, if I remove the else statement, I get an exception about violating an FK constraint, so it seems that the tag's Category property is set to null.


I've had the same problem with EF. Here's what I ended up doing:

  • Instead of doing story1.Tags.Add(new Tag { Name = ".net", }) yourself, route all Tag creation through a helper method, like this: story1.Tags.Add(GetTag(".net")).
  • The GetTag method queries the tags in the context to see whether an entity with that name already exists in the database, like you do. If one exists, it returns that entity.
  • If there is no existing entity, it checks the ObjectStateManager to see whether a matching Tag has been added to the context but not yet written to the db. If it finds a matching Tag, it returns that.
  • If it still has not found the Tag, it creates a new Tag, adds it to the context, and then returns it.
  • In essence, this makes sure that no more than one instance of any Tag (whether it already existed or was just created) is used throughout your program.

Some example code lifted from my project (it uses InventoryItem instead of Tag, but you get the idea).

The check in step 3 is done like this:

    // Second choice: maybe it's not in the database yet, but it's awaiting insertion?
    inventoryItem = context.ObjectStateManager.GetObjectStateEntries(EntityState.Added)
        .Where(ose => ose.EntitySet == context.InventoryItems.EntitySet)
        .Select(ose => ose.Entity)
        .Cast<InventoryItem>()
        .Where(equalityPredicate.Compile())
        .SingleOrDefault();
    
    if (inventoryItem != null) {
        return inventoryItem;
    }
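
The snippet above calls equalityPredicate.Compile(), which suggests that equalityPredicate is an Expression<Func<InventoryItem, bool>> (that type is an assumption, the declaration is not shown). Here is a minimal sketch of that idea with no EF dependency: a plain list stands in for the entities that the ObjectStateManager reports as Added, and PendingLookup and FindPending are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical stand-in for the InventoryItem entity.
public class InventoryItem
{
    public string Name { get; set; }
}

public static class PendingLookup
{
    // 'pending' stands in for the entities whose state is Added.
    public static InventoryItem FindPending(
        IEnumerable<InventoryItem> pending,
        Expression<Func<InventoryItem, bool>> equalityPredicate)
    {
        // Compile() turns the expression tree into an ordinary delegate,
        // which LINQ to Objects can run against in-memory entities.
        Func<InventoryItem, bool> matches = equalityPredicate.Compile();

        // Returns null when nothing matches; throws if more than one matches.
        return pending.Where(matches).SingleOrDefault();
    }
}

public static class Program
{
    public static void Main()
    {
        var pending = new List<InventoryItem>
        {
            new InventoryItem { Name = "hammer" },
            new InventoryItem { Name = "wrench" }
        };

        Expression<Func<InventoryItem, bool>> byName = item => item.Name == "wrench";
        InventoryItem found = PendingLookup.FindPending(pending, byName);
        Console.WriteLine(found != null ? found.Name : "(not pending)"); // prints "wrench"
    }
}
```

Keeping the predicate as an expression means the same object can be passed uncompiled to a database query and compiled for the in-memory check, so both lookups apply the same matching rule.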
    

If the Tag is not found in step 3, here's the code for step 4:

    inventoryItem = new InventoryItem();
    context.InventoryItems.AddObject(inventoryItem);
    return inventoryItem;
    

Update:

It should be used like this:

    Story story1 = new Story();
    story1.Title = "Introducing the Entity Framework";
    story1.Tags.Add(GetTag(".net", category, db));
    story1.Tags.Add(GetTag("database", category, db));
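
Building the story through GetTag matters because, in the question's loop, the result of GetTag is only assigned to the local variable currentTag; the story's Tags collection still holds the original Tag objects, which have no Category. A minimal sketch of that difference in plain C#, with no EF involved: the Tag class is simplified (Category is a string here), and ReassignDemo and Canonical are hypothetical stand-ins for illustration.

```csharp
using System;
using System.Collections.Generic;

// Simplified stand-in for the Tag entity; Category is a plain string here.
public class Tag
{
    public string Name { get; set; }
    public string Category { get; set; }
}

public static class ReassignDemo
{
    // Hypothetical stand-in for GetTag: returns a fully initialised tag.
    public static Tag Canonical(string name)
    {
        return new Tag { Name = name, Category = "Programming" };
    }

    // Mirrors the question's loop: only the local variable is reassigned.
    public static List<Tag> ReassignLocal(List<Tag> tags)
    {
        foreach (Tag tag in tags)
        {
            Tag currentTag = tag;
            currentTag = Canonical(tag.Name); // the list still holds 'tag'
        }
        return tags;
    }

    // Writes the canonical instance back into the collection.
    public static List<Tag> ReplaceInList(List<Tag> tags)
    {
        for (int i = 0; i < tags.Count; i++)
        {
            tags[i] = Canonical(tags[i].Name);
        }
        return tags;
    }

    public static void Main()
    {
        var a = ReassignLocal(new List<Tag> { new Tag { Name = ".net" } });
        Console.WriteLine(a[0].Category ?? "(null)"); // prints "(null)"

        var b = ReplaceInList(new List<Tag> { new Tag { Name = ".net" } });
        Console.WriteLine(b[0].Category ?? "(null)"); // prints "Programming"
    }
}
```

An EF entity collection cannot be indexed like the List<Tag> used in this sketch, which is one more reason to add the tags through GetTag from the start, as in the usage above, rather than trying to swap them afterwards.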
    