equals and circular references: how to resolve infinite recursion?
I have some classes that contain several fields. I need to compare them by value, ie two instances of a class are equal if their fields contain the same data. I have overridden the GetHashCode
and Equals
methods for that.
It can happen that these classes contain circular references.
Example: We want to model institutions (like government, sports clubs, whatever). An institution has a name. A Club
is an institution that has a name and a list of members. Each member is a Person
that has a name and a favourite institution. If a member of a certain club has this club as his favourite institution, we have a circular reference.
But circular references, in conjunction with value equality, lead to infinite recursion. Here is a code example:
interface IInstitution { string Name { get; } }
class Club : IInstitution
{
public string Name { get; set; }
public HashSet<Person> Members { get; set; }
public override int GetHashCode() { return Name.GetHashCode() + Members.Count; }
public override bool Equals(object obj)
{
Club other = obj as Club;
if (other == null)
return false;
return Name.Equals(other.Name) && Members.SetEquals(other.Members);
}
}
class Person
{
public string Name { get; set; }
public IInstitution FavouriteInstitution { get; set; }
public override int GetHashCode() { return Name.GetHashCode(); }
public override bool Equals(object obj)
{
Person other = obj as Person;
if (other == null)
return false;
return Name.Equals(other.Name)
&& FavouriteInstitution.Equals(other.FavouriteInstitution);
}
}
class Program
{
public static void Main()
{
Club c1 = new Club { Name = "myClub", Members = new HashSet<Person>() };
Person p1 = new Person { Name = "Johnny", FavouriteInstitution = c1 }
c1.Members.Add(p1);
Club c2 = new Club { Name = "myClub", Members = new HashSet<Person>() };
Person p2 = new Person { Name = "Johnny", FavouriteInstitution = c2 }
c2.Members.Add(p2);
bool c1_and_c2_equal = c1.Equals(c2); // StackOverflowException!
// c1.Equals(c2) calls Members.SetEquals(other.Members)
// Members.SetEquals(other.Members) calls p1.Equals(p2)
// p1.Equals(p2) calls c1.Equals(c2)
}
}
c1_and_c2_equal
should return true
, and in fact we (humans) can see that they are value-equal with a little bit of thinking, without running into infinite recursion. However, I can't really say how we figure that out. But since it is possible, I hope that there is a way to resolve this problem in code as well!
So the question is: How can I check for value equality without running into infinite recursions?
Note that I need to resolve circular references in general, not only the case from above. I'll call it a 2-circle since c1
references p1
, and p1
references c1
. There can be other n-circles, eg if a club A
has a member M
whose favourite is club B
which has member N
whose favourite club is A
. That would be a 4-circle. Other object models might also allow n-circles with odd numbers n. I am looking for a way to resolve all these problems at once, since I won't know in advance which value n can have.
An easy workaround (used in RDBMS) is to use a unique Id
to identify a Person
(any type). Then you don't need to compare every other property and you never run into such cuircular references.
Another way is to compare differently in Equals
, so provide the deep check only for the type of the Equals
and not for the referenced types. You could use a custom comparer:
public class PersonNameComparer : IEqualityComparer<Person>
{
public bool Equals(Person x, Person y)
{
if (x == null && y == null) return true;
if (x == null || y == null) return false;
if(object.ReferenceEquals(x, y)) return true;
return x.Name == y.Name;
}
public int GetHashCode(Person obj)
{
return obj?.Name?.GetHashCode() ?? int.MinValue;
}
}
Now you can change the Equals
implementation of Club
to avoid that the Members
(Persons) will use their deep check which includes the institution but only their Name
:
public override bool Equals(object obj)
{
if (Object.ReferenceEquals(this, obj))
return true;
Club other = obj as Club;
if (other == null)
return false;
var personNameComparer = new PersonNameComparer();
return Name.Equals(other.Name)
&& Members.Count == other.Members.Count
&& !Members.Except(other.Members, personNameComparer).Any();
}
You notice that i can't use SetEquals
because there is no overload for my custom comparer.
Following the suggestion of Dryadwoods, I changed the Equals
methods so that I can keep track of the items that were already compared.
First we need an equality comparer that checks reference equality for corresponding elements of pairs:
public class ValuePairRefEqualityComparer<T> : IEqualityComparer<(T,T)> where T : class
{
public static ValuePairRefEqualityComparer<T> Instance
= new ValuePairRefEqualityComparer<T>();
private ValuePairRefEqualityComparer() { }
public bool Equals((T,T) x, (T,T) y)
{
return ReferenceEquals(x.Item1, y.Item1)
&& ReferenceEquals(x.Item2, y.Item2);
}
public int GetHashCode((T,T) obj)
{
return RuntimeHelpers.GetHashCode(obj.Item1)
+ 2 * RuntimeHelpers.GetHashCode(obj.Item2);
}
}
And here is the modified Equals
method of Club
:
static HashSet<(Club,Club)> checkedPairs
= new HashSet<(Club,Club)>(ValuePairRefEqualityComparer<Club>.Instance);
public override bool Equals(object obj)
{
Club other = obj as Club;
if (other == null)
return false;
if (!Name.Equals(other.Name))
return;
if (checkedPairs.Contains((this,other)) || checkedPairs.Contains((other,this)))
return true;
checkedPairs.Add((this,other));
bool membersEqual = Members.SetEquals(other.Members);
checkedPairs.Clear();
return membersEqual;
}
The version for Person
is analogous. Note that I add (this,other)
to checkedPairs
and check if either (this,other)
or (other,this)
is contained because it might happen that after the first call of c1.Equals(c2)
, we end up with a call of c2.Equals(c1)
instead of c1.Equals(c2)
. I am not sure if this actually happens, but since I can't see the implementation of SetEquals
, I believe it is a possibility.
Since I am not happy with using a static field for the already checked pairs (it will not work if the program is concurrent!), I asked another question: make a variable last for a call stack.
For the general case that I am interested in
-- where we have classes C1
, ..., Cn
where each of these classes can have any number of VALUES (like int
, string
, ...) as well as any number of REFERENCES to any other classes of C1
, ..., Cn
(eg by having for each type Ci
a field ICollection<Ci>
) --
the question "Are two objects A
and B
equal?" , in the sense of equality that I described here,
seems to be EQUIVALENT to
the question "For two finite, directed, connected, colored graphs G
and H
, does there exist an isomorphism from G
to H
?" .
Here is the equivalence:
object
s (class instances) object
s object
s have the same type and the same values) That's an NP-hard question, so I think I'm going to discard my plan to implement this and go with a circular-reference-free approach instead.
链接地址: http://www.djcxy.com/p/40590.html下一篇: 等于和循环引用:如何解决无限递归?