On Entity Equality

I am taking a small break from my current project, and have started looking at building a common component library that all future projects will be based on. This library will have many things in it, like Specification classes, various interfaces, some helpers and extensions, as well as some persistence-related things. This is where this post comes in. I got the idea from working with NHibernate that I might want to use better entity comparisons than what is built into the CLR Equals method. I took a look at several examples across the internet, including some OSS products, and this is what I came up with. Please comment on it, make suggestions, or just ramble if you wish. If I don’t get any feedback in the next couple of days, I will probably start using it, as I’ll assume nobody saw anything worth mentioning.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

namespace Venue.Infrastructure.Domain
{
    [AttributeUsage(AttributeTargets.Property, AllowMultiple = true)]
    public class SignatureAttribute : Attribute { }

    public abstract class Entity : Entity<int> { }

    public abstract class Entity<T>
    {
        private const int HASH_MULTIPLIER = 31;
        private int? hashCode;
        private IEnumerable<PropertyInfo> signatureProperties;

        public virtual T Id { get; protected set; }

        public override bool Equals(object obj)
        {
            // cast obj to correct type
            Entity<T> candidate = obj as Entity<T>;
            // if candidate is null, the entities are not equal, period
            if (candidate == null)
                return false;
            // if they share memory, the entities are equal, period
            if (ReferenceEquals(this, candidate))
                return true;
            // if one is transient, and the other is not, the entities are not equal
            if ((IsTransient() && !candidate.IsTransient()) || (!IsTransient() && candidate.IsTransient()))
                return false;
            // if Ids are not equal, the entities are not equal
            if (!Id.Equals(candidate.Id))
                return false;
            // get all entity properties with the signature attribute
            IEnumerable<PropertyInfo> properties = GetSignatureProperties();
            // if any property does not match, the entities are not equal
            if (properties.Any())
                foreach (PropertyInfo property in properties)
                    if (!property.GetValue(this, null).Equals(property.GetValue(candidate, null)))
                        return false;
            // default response
            return properties.Any() || base.Equals(obj);
        }

        public override int GetHashCode()
        {
            // if we already have it, return it
            if (hashCode.HasValue)
                return hashCode.Value;
            // start by using the type hashcode
            hashCode = GetType().GetHashCode();
            // mix in the Id hashcode
            hashCode = (hashCode * HASH_MULTIPLIER) ^ Id.GetHashCode();
            // get all entity properties with the signature attribute
            IEnumerable<PropertyInfo> properties = GetSignatureProperties();
            // mix in property value hashcodes
            if (properties.Any())
                foreach (PropertyInfo property in properties)
                    hashCode = (hashCode * HASH_MULTIPLIER) ^ property.GetValue(this, null).GetHashCode();
            // return hashcode
            return hashCode.Value;
        }

        private IEnumerable<PropertyInfo> GetSignatureProperties()
        {
            // cache properties if needed
            if (signatureProperties == null)
                signatureProperties = GetType().GetProperties().Where(p => Attribute.IsDefined(p, typeof(SignatureAttribute), true));
            // return cached properties
            return signatureProperties;
        }

        public virtual bool IsTransient()
        {
            // simple test to see if we have a real Id
            return Id == null || Id.Equals(default(T));
        }
    }
}

Edit: oops…I should be using “unchecked” around the hashing stuff…duly noted.

I’m confused, what are you trying to accomplish? It’s early, I haven’t had much coffee, so sorry if I’m missing the obvious.

I use some classes that I wrote, inspired by a Java library.

In my base value object class I default to EqualsBuilder.ReflectionEquals(…) and HashCodeBuilder.ReflectionHashCode(…) and override when needed.

cheers,
Rui Santos

@pufa, handy, we use something similar for Value Objects. Can I make a small suggestion? It might be a good idea to cache the properties; you can make some easy performance savings by doing so.

The ReflectionXXX() methods have an implementation very similar to what ValueType (struct) uses.

Yes… caching the fields would probably improve performance, but I’ve only used this in my toy apps and never really felt the need to go that far. Any suggestion on how to do it? A hash table keyed by type?

Just a note: I’ve just looked at the files, and both ReflectionXXX methods check or use the Type of the objects passed as arguments. Since NHibernate sometimes creates proxies for the objects it returns, the ReflectionXXX methods will not do.

@Serenarules

OK… you are marking the identity properties with [Signature]. That is fine with me… but I would be careful not to build crazy inheritance chains; overriding the defined attributes could end up not working very well.

I would make it so that it checks properties and fields…

A personal taste… make IsTransient() protected.

And last… just to tease you!! Here is a test for your code…


[TestFixture]
public class SerenarulesTests
{
    public class TestEntity : Entity
    {
        [Signature]
        public virtual object SignatureOne { get { return null; } }
    }

    [Test]
    public void Signature_properties_can_be_null()
    {
        var entity1 = new TestEntity();
        var entity2 = new TestEntity();
        Assert.DoesNotThrow(() => entity1.Equals(entity2));
    }
}

RE: the above test. I did this very test, among others.

Category c1 = new Category(…);
c1.Description = "description";
Category c2 = new Category(…);

Asserts true as expected, but they have different hash codes. This is by design.

What I am testing is not object equality, but domain equality.

Both have the same Id (0 in this case), and both have the same Signature property (“c1” for Title in this case). They therefore represent the same entity.

They have different hash codes though because they have different description values. I set it up this way so that when I get around to writing comparison routines, I can tell more easily when there is a discrepancy between entity instances.
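One thing to weigh against that design: the .NET documentation for Object.GetHashCode states that two objects which compare as equal must return the same hash code, and the hashed collections rely on it. A minimal sketch of what can happen otherwise, using a made-up LooseCategory type (not the Category class from this thread) whose Equals looks only at Id while its hash also mixes in Description:

```csharp
using System.Collections.Generic;

// Hypothetical type for illustration: Equals uses Id only,
// but GetHashCode also depends on Description.
public class LooseCategory
{
    public int Id { get; set; }
    public string Description { get; set; }

    public override bool Equals(object obj)
    {
        var other = obj as LooseCategory;
        return other != null && Id == other.Id;
    }

    public override int GetHashCode()
    {
        // Deliberately inconsistent with Equals, to show the effect.
        return Id ^ (Description == null ? 0 : Description.Length);
    }
}

public static class HashContractDemo
{
    // Stores one instance in a HashSet, then looks it up with an "equal" one.
    public static bool SetFindsEqualInstance()
    {
        var stored = new LooseCategory { Id = 0, Description = "description" };
        var probe = new LooseCategory { Id = 0, Description = null };
        var set = new HashSet<LooseCategory> { stored };
        // stored.Equals(probe) is true, yet the lookup misses because the hashes differ.
        return set.Contains(probe);
    }
}
```

SetFindsEqualInstance() returns false here, so anything keyed on the entity (HashSet, Dictionary, and presumably the sets NHibernate maps collections to) may treat the two instances as unrelated. A separate comparison routine for spotting state differences would avoid overloading GetHashCode with that job.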

Have you found a case where the class does not work?

RE: caching the property collection.

I have considered this using a static dictionary, but haven’t really explored it yet.

RE: inheritance.

Good point. Though I have never once had a need to use multiple inheritance, I’ll keep that in mind. What may be better is to create a base class that simply houses protected methods that end classes can use to write their own calculations.

run the test!

this line: if (!property.GetValue(this, null).Equals(property.GetValue(candidate, null)))

throws a NullReferenceException in the test.
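The exception comes from calling the instance Equals method on a property value that is null. One possible way around it, sketched under the assumption that the static object.Equals(object, object) helper is acceptable here; the SignatureComparer class and the ValuesMatch name are made up for illustration and are not part of the posted Entity<T>:

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical helper, not part of the original Entity<T> class.
public static class SignatureComparer
{
    // Returns true when every listed property has equal values on both objects.
    // The static object.Equals(a, b) treats two nulls as equal and does not
    // throw when either operand is null.
    public static bool ValuesMatch(object left, object right, IEnumerable<PropertyInfo> properties)
    {
        return properties.All(p => object.Equals(p.GetValue(left, null), p.GetValue(right, null)));
    }
}
```

The hashing loop has the same weakness, since it calls GetHashCode() directly on each property value, so it would need a null guard of its own.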

You are caching an IEnumerable… doesn’t that run every time you iterate over it?

You are keeping a reference to the IEnumerable, but this code:

GetType().GetProperties().Where(p => Attribute.IsDefined(p, typeof(SignatureAttribute), true));

is running every time you iterate the IEnumerable?

Also you are caching per instance not Type.

I wouldn’t worry much about the inheritance problem.

but the problem I see is this…

class A : Entity
{
    [Signature]
    public string Foo { get; set; }
    [Signature]
    public string Bar { get; set; }
}

class B : A
{
    [Signature]
    public string OverrideBar { get; set; }
}

B should use Foo from A and OverrideBar from B instead of Bar from A. I haven’t tested it, but I think it’s not possible. But you can always override Equals in B.
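A quick way to check that suspicion, as a sketch: the A and B below are stand-ins without the Entity base class, and SignatureAttribute is redeclared so the snippet compiles on its own. Type.GetProperties() with no arguments also returns public properties inherited from base types, so B reports all three.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Redeclared here only so the sketch is self-contained.
[AttributeUsage(AttributeTargets.Property)]
public class SignatureAttribute : Attribute { }

// Stand-ins for the classes in the post, minus the Entity base.
public class A
{
    [Signature] public string Foo { get; set; }
    [Signature] public string Bar { get; set; }
}

public class B : A
{
    [Signature] public string OverrideBar { get; set; }
}

public static class InheritanceDemo
{
    // Names of all [Signature] properties that reflection reports for a type.
    public static HashSet<string> SignatureNames(Type type)
    {
        return new HashSet<string>(
            type.GetProperties()
                .Where(p => Attribute.IsDefined(p, typeof(SignatureAttribute), true))
                .Select(p => p.Name));
    }
}
```

SignatureNames(typeof(B)) contains Foo, Bar and OverrideBar, so Bar cannot be dropped just by adding a new property in B. Passing BindingFlags.DeclaredOnly to GetProperties would limit the result to B's own properties, but that would lose Foo as well.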

Also you are caching per instance not Type.

Currently, yes. Understand though that it was solely while developing the basic class, to get the mechanism correct. My actual plan is to use an external dictionary class like this:

internal static class PropertyCache
{
    public readonly static IDictionary<Type, IList<PropertyInfo>> EntityProperties = …

    public readonly static IDictionary<Type, IList<PropertyInfo>> ValueObjectProperties = …
}

You are caching an IEnumerable… doesn’t that run every time you iterate over it?

As far as refetching the data, I don’t think so. It is forward only though, so each time you traverse it, you start at the beginning. If this is wrong, let me know.

And I didn’t see the null in your test there. I suppose I need to adjust that code a bit more. Thanks for pointing that one out. =)

I had a look at it in Reflector and this is how GetSignatureProperties() looks.


private IEnumerable<PropertyInfo> GetSignatureProperties()
{
    if (this.signatureProperties == null)
    {
        if (Entity<T>.CS$<>9__CachedAnonymousMethodDelegate1 == null)
        {
            Entity<T>.CS$<>9__CachedAnonymousMethodDelegate1 = new Func<PropertyInfo, bool>(null, (IntPtr) Entity<T>.<GetSignatureProperties>b__0);
        }
        this.signatureProperties = Enumerable.Where<PropertyInfo>(base.GetType().GetProperties(), Entity<T>.CS$<>9__CachedAnonymousMethodDelegate1);
    }
    return this.signatureProperties;
}


this.signatureProperties = Enumerable.Where<PropertyInfo>(base.GetType().GetProperties(), Entity<T>.CS$<>9__CachedAnonymousMethodDelegate1);

I guess base.GetType().GetProperties() ends up being called every time you iterate over signatureProperties.

the fix is easy… call ToList().


GetType().GetProperties().Where(p => Attribute.IsDefined(p, typeof(SignatureAttribute), true)).ToList();

Interesting. Well, if you look at the post above yours, you’ll see that my planned external lookup cache does use lists. So I think we’re good there. In addition, I’ll probably be moving the property-fetching methods to that cache object as well, to isolate the calls a bit more from the entities. For example:

PropertyCache.GetSignatureProperties(Entity<I> entity)

This way, Entity<I> doesn’t have to maintain this list itself.

=)
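For what it’s worth, a per-type cache along those lines might look like the sketch below. It assumes .NET 4 or later for ConcurrentDictionary, takes a Type rather than an entity instance, and redeclares SignatureAttribute so it compiles on its own; the member layout is a guess, not the poster’s actual design.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Redeclared here only so the sketch is self-contained.
[AttributeUsage(AttributeTargets.Property)]
public class SignatureAttribute : Attribute { }

// Sketch of a per-type cache; the member names are assumptions.
internal static class PropertyCache
{
    private static readonly ConcurrentDictionary<Type, IList<PropertyInfo>> EntityProperties =
        new ConcurrentDictionary<Type, IList<PropertyInfo>>();

    // Reflects over the type on first use, then serves the materialized list from the cache.
    public static IList<PropertyInfo> GetSignatureProperties(Type entityType)
    {
        return EntityProperties.GetOrAdd(entityType, t =>
            t.GetProperties()
             .Where(p => Attribute.IsDefined(p, typeof(SignatureAttribute), true))
             .ToList());
    }
}
```

An entity would call PropertyCache.GetSignatureProperties(GetType()). Because the list is built with ToList(), the attribute filter runs once per type instead of once per enumeration. Note that GetType() on an NHibernate proxy returns the proxy type, so proxies would get their own cache entry.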

I guess I’m wrong on this IEnumerable thing :P

this test passes


[TestFixture]
public class testienumerable
{
    public class NumberStore
    {
        public int Count { get; private set; }

        public int[] GetNumbers()
        {
            Count++;
            return Enumerable.Range(1, 10).ToArray();
        }
    }

    [Test]
    public void should_call_getnumber_only_once_when_iterated_twice()
    {
        var numberStore = new NumberStore();
        IEnumerable<int> enumerable = Enumerable.Where(numberStore.GetNumbers(), (i) => i % 2 == 0);
        foreach (var i in enumerable)
        {
        }
        foreach (var i in enumerable)
        {
        }
        Assert.AreEqual(numberStore.Count, 1);
    }
}

That’s kind of what I thought to start with, but thanks for verifying it.

But the Where is executed every time! That is what got me confused.
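To make the distinction concrete, here is a small sketch (the class and method names are invented for the demo). The source array is produced once, but with a cached Where the predicate runs again on every pass; calling ToList() runs it once and stores the results.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Counts predicate calls when the lazy Where result is enumerated twice.
    public static int PredicateCallsWithoutToList()
    {
        int calls = 0;
        int[] source = Enumerable.Range(1, 10).ToArray();   // produced once
        IEnumerable<int> evens = source.Where(i => { calls++; return i % 2 == 0; });
        foreach (var i in evens) { }
        foreach (var i in evens) { }
        return calls;   // 10 elements, filtered on each of the 2 passes
    }

    // Counts predicate calls when the result is materialized with ToList() first.
    public static int PredicateCallsWithToList()
    {
        int calls = 0;
        int[] source = Enumerable.Range(1, 10).ToArray();
        List<int> evens = source.Where(i => { calls++; return i % 2 == 0; }).ToList();
        foreach (var i in evens) { }
        foreach (var i in evens) { }
        return calls;   // the filter ran only while ToList() built the list
    }
}
```

The first method returns 20 and the second returns 10. In the entity code that means GetProperties() is called once per instance, while the Attribute.IsDefined check repeats on every Equals or GetHashCode call unless the result is materialized.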

This whole thing got me to thinking a bit. What exactly should we consider in determining if two entity instances are to be considered equal anyway?

What are we testing? Property value equality or identity equality?

If two Employee objects are instantiated at the same time, and neither has been committed, they both have an Id of 0. Until one or the other is committed, don’t they potentially represent the same entity (each one is potentially the next to be persisted)?

If (this should never happen…) you find yourself with two instances of an entity, where both have the same Id, but each and every other property is different, including whichever one is considered unique by db standards, aren’t we still talking about the same entity, only in two different states of change?

Case in point:

An employee gets married and changes their last name. In your design, first, middle, and last names make up the entity signature (bad idea honestly, but anyway). If you were to ask for an equality check between this newly changed instance and one freshly fetched, they would not equate. But that doesn’t make sense because they are the same entity.

Shouldn’t the only thing that truly matter be the Id?

Entities are different from Values because they have identity.
Both can change, but an entity’s identity shouldn’t change and must be unique.

Up until last night, that was my understanding as well. Now I am not so sure. What if we domain modelled VBulletin? Take this pseudo-code:


// a user registers
User registrant = new User(model.Username);

// they later request a username change
User member = VenueSession.CurrentUser;
member.RequestNameChange(model.RequestedUsername);

// an administrator approves the request
requestor.ApproveNameChangeRequest(VenueSession.CurrentUser);


Does that mean User is not an entity? While I still believe the difference between an Entity and a ValueObject is that Entities have “identity”, I think the definition of what that “identity” is has been diluted over time. The only thing guaranteed to never change is the Id property. ValueObjects do not have these.

I know we should try and protect “hard to change” values with protected sets and constructor input, but does this really mean those values should never change?

Does that make sense?

No, it doesn’t mean they should never change. An Entity has an ID and that should never change. The reason we protect our domain is to reduce the number of places a mutation can occur.

Exactly. So by that account, I needn’t even bother with the SignatureAttribute to mark unique constraint fields at all. Correct?
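If the answer is yes, an Id-only base class could be sketched roughly as follows. This is one possible shape rather than a settled design: it assumes that unsaved (transient) instances are equal only to themselves, which is a different choice from treating two Id-0 objects as the same entity, and SampleUser is a made-up subclass included only so the example can be exercised.

```csharp
using System.Collections.Generic;

// Sketch only: equality by Id, with transient instances equal only to themselves.
public abstract class Entity<T>
{
    public virtual T Id { get; protected set; }

    // True while no real Id has been assigned yet.
    public virtual bool IsTransient()
    {
        return EqualityComparer<T>.Default.Equals(Id, default(T));
    }

    public override bool Equals(object obj)
    {
        var other = obj as Entity<T>;
        if (other == null) return false;
        if (ReferenceEquals(this, other)) return true;
        // Assumption: two different unsaved instances are never equal.
        if (IsTransient() || other.IsTransient()) return false;
        return EqualityComparer<T>.Default.Equals(Id, other.Id);
    }

    public override int GetHashCode()
    {
        // Transient instances fall back to the reference-based hash.
        return IsTransient() ? base.GetHashCode() : EqualityComparer<T>.Default.GetHashCode(Id);
    }
}

// Hypothetical concrete entity, for demonstration only.
public class SampleUser : Entity<int>
{
    public SampleUser(int id) { Id = id; }
}
```

Two caveats with this sketch: the hash code of an instance changes once it is saved and receives an Id, so it should not sit in a hashed collection across that transition, and the concrete types are not compared, so two different entity classes sharing the same T and Id value would count as equal unless a type check is added.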