Friday, February 05, 2010

A while back I wrote a blog post about DataSets and why you shouldn’t use them on service boundaries. The fundamental issues are:

  1. No non-.NET client has any idea what the data you are sending them looks like
  2. Hidden, non-essential data is being passed up and down the wire
  3. Your client gets coupled to the shape of your data (usually a product of the Data Access Layer)

So when I saw that Entity Framework 4.0 supports Self Tracking Entities (STEs) I was interested to see how they would work – after all, automated change tracking is one of the reasons people wanted to use DataSets in service contracts. The idea is that as you manipulate its state, the object itself tracks the changes to its properties and whether it has been newly created or marked for deletion. It does this by implementing an interface called IObjectWithChangeTracker, which is then used by the ObjectContext to work out what needs to be done in terms of persistence. There is an extension method on the ObjectContext called ApplyChanges which does the heavy lifting.

The Entity Framework team has released a T4 Template to generate these STEs from an EDMX file, and the nice thing is that the generated entities themselves have no dependency on the Entity Framework. Only the generated context class has this dependency and, so the story goes, the client needs to know nothing about Entity Framework; only the service does. The client, and the entities, remain ignorant of the persistence model. For this reason STEs have been touted as a powerful tool in n-tier architectures.

All of this seems almost too good to be true … and unfortunately it is.

To understand what the issue is with STEs we have to remember what two of the main goals of Service based systems are:

  1. Support heterogeneous systems – the service ecosystem is not bound to one technology
  2. Decoupling – to ensure changes in a service do not cascade to consumers of the service for technical reasons (there may obviously be business reasons why we might want a change in a service to affect the consumers, such as changes in law or legislation)

So to understand the problem with STEs we need to look at the generated code. Here’s the model I’m using:

[Figure: the EDMX model]

Now let’s look at the T4 Template generated code for, say, the OrderLine:

[DataContract(IsReference = true)]
[KnownType(typeof(Order))]
public partial class OrderLine: IObjectWithChangeTracker, INotifyPropertyChanged
{
    #region Primitive Properties
 
    [DataMember]
    public int id
    {
        get { return _id; }
        set
        {
            if (_id != value)
            {
                ChangeTracker.RecordOriginalValue("id", _id);
                _id = value;
                OnPropertyChanged("id");
            }
        }
    }
    private int _id;
 
    // more details elided for clarity
}

A few things to notice: firstly, this is a DataContract and is therefore designed to be used on WCF contracts – that is its intent; secondly, a bit of work takes place inside the generated property setters. Each setter checks whether the value has actually changed, records the old value, and then calls OnPropertyChanged to raise the PropertyChanged event (defined on INotifyPropertyChanged). Let’s have a look inside OnPropertyChanged:

protected virtual void OnPropertyChanged(String propertyName)
{
     if (ChangeTracker.State != ObjectState.Added && ChangeTracker.State != ObjectState.Deleted)
     {
         ChangeTracker.State = ObjectState.Modified;
     }
     if (_propertyChanged != null)
     {
         _propertyChanged(this, new PropertyChangedEventArgs(propertyName));
     }
}

OK, so it does a bit more than raise the event: it also marks the object as modified in the ChangeTracker. The ChangeTracker state is part of how the STE serializes its self-tracked changes, and it is this data that the ApplyChanges extension method uses to work out what has changed. So the thing to remember here is that the change tracking is performed by code generated into the property setters.

Well, the T4 Template has done its work, so we create our service contract using these conveniently generated types, and the client uses metadata and Add Service Reference to build its proxy code. It gets the OrderLine from the service, updates the quantity and sends it back. The service calls ApplyChanges on the context and then saves the changes and … nothing changes in the database. What on earth went wrong?

At this point we have to step back and think about what the types we use in service contracts are actually doing. Those types are nothing more than serialization helpers – they help us bridge the object world to the XML one. The metadata generation uses the type definition and the attribute annotations to generate a schema (XSD) definition of the data in the class. Notice we’re only talking about data – there is no concept of behavior. And this is the problem. When the Add Service Reference code generation takes place, it’s based on the schema in the service metadata – so the objects *look* right, but the all-important code in the property setters is missing. So you can change the state of entities in the client and the service will never be able to work out that the state has changed – so changes don’t get propagated to the database.
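To make the gap concrete, here is a minimal sketch of the two shapes involved. It is not the real generated code: SketchState, SketchTracker, ServiceSideOrderLine and ProxyOrderLine are hypothetical names, the tracker is a drastically simplified stand-in for the generated one, and the proxy class only approximates what a schema-driven generator would emit.

```csharp
using System.Collections.Generic;

// Simplified, hypothetical stand-in for the generated change tracker.
public enum SketchState { Unchanged, Added, Modified, Deleted }

public class SketchTracker
{
    public SketchState State = SketchState.Unchanged;
    public readonly Dictionary<string, object> OriginalValues =
        new Dictionary<string, object>();
}

// Roughly what the service compiles against: the setter carries behavior.
public class ServiceSideOrderLine
{
    public readonly SketchTracker ChangeTracker = new SketchTracker();
    private int _quantity;

    public int Quantity
    {
        get { return _quantity; }
        set
        {
            if (_quantity != value)
            {
                // Remember the first original value, then flag the change.
                if (!ChangeTracker.OriginalValues.ContainsKey("Quantity"))
                {
                    ChangeTracker.OriginalValues["Quantity"] = _quantity;
                }
                _quantity = value;
                if (ChangeTracker.State != SketchState.Added &&
                    ChangeTracker.State != SketchState.Deleted)
                {
                    ChangeTracker.State = SketchState.Modified;
                }
            }
        }
    }
}

// Roughly what a client gets when the type is rebuilt from schema alone:
// the same data members, but nothing in the setter touches the tracker.
public class ProxyOrderLine
{
    public SketchTracker ChangeTracker = new SketchTracker();
    public int Quantity { get; set; }
}
```

Assigning Quantity on a ServiceSideOrderLine moves its tracker to Modified; the same assignment on a ProxyOrderLine leaves the tracker at Unchanged, which is all the service ever sees when the object comes back.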

There is a workaround for this problem. You take the generated STEs and put them in a separate assembly which you give to the client. Now the client has all of the change tracking code available to it, changes get tracked and the service can work out what has changed and persist it.

But what have we just done? We have forced the client to be .NET. Not only that, we’ve probably compiled against .NET 4.0 and so we are requiring the client to be .NET 4.0 aware – we might as well have taken a dependency on Entity Framework 4.0 in the client at this point. In addition, changes I make to the EDMX file are going to be reflected in the T4 generated code – I have coupled my client to my data access layer. Let’s go back to the problems with using DataSets on service boundaries again. We’re pretty much back where we started – although the data being transmitted is more controlled.

So STEs at first glance look very attractive, but in service terms their effect on the service consumer is similar to that of DataSets. So what is the solution? Well, we’re back with our old friends DTOs and AutoMapper. To produce real decoupling and allow heterogeneous environments we have to explicitly model the *data* being passed at service boundaries. How this is mapped into our data access layer is up to the service. Entity Framework 4.0 certainly improves matters here over 1.0, as we can use POCOs, which aid the testability and flexibility of our service.
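As a rough illustration of the DTO approach, the sketch below keeps the wire contract separate from the persistence-side type and maps between them by hand. Everything in it is an assumption made for the example: the OrderLineEntity shape, the orderId property, the contract namespace and the OrderLineMapper helper are all hypothetical, and a mapping library such as AutoMapper could take over the copying that is written out explicitly here.

```csharp
using System.Runtime.Serialization;

// Hypothetical persistence-side type, e.g. an EF 4.0 POCO.
// Its shape follows the data model and may change with it.
public class OrderLineEntity
{
    public int id { get; set; }
    public int quantity { get; set; }
    public int orderId { get; set; }
}

// The explicit wire contract: data only, named and shaped for consumers.
// The namespace URI is a placeholder.
[DataContract(Name = "OrderLine", Namespace = "urn:example:orders")]
public class OrderLineDto
{
    [DataMember]
    public int Id { get; set; }

    [DataMember]
    public int Quantity { get; set; }
}

// Hand-written mapping that lives inside the service.
public static class OrderLineMapper
{
    // Copy only what the contract exposes.
    public static OrderLineDto ToDto(OrderLineEntity entity)
    {
        return new OrderLineDto { Id = entity.id, Quantity = entity.quantity };
    }

    // Apply an incoming DTO to an entity the service has already loaded;
    // members outside the contract (orderId here) are left untouched.
    public static void ApplyTo(OrderLineDto dto, OrderLineEntity entity)
    {
        entity.quantity = dto.Quantity;
    }
}
```

With this split, renaming a column or reshaping the entity changes only the mapper, while the contract that consumers see stays as it was.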

Friday, February 05, 2010 11:48:08 AM (GMT Standard Time, UTC+00:00)  Comments
Friday, February 05, 2010 2:37:56 PM (GMT Standard Time, UTC+00:00)
Hi Rich,

Good post on EF4 STE's. I would just like to add my two pence on the service-oriented aspect of STE's. When using STE's it's very important to place them in a separate assembly that is referenced by both the client and the service. (When adding the service reference, as you know, the entities will not be generated by the WCF proxy generator.) That assembly should only contain the STE's themselves, not the generated context class. If that is the case, there will be no dependency on EF.

There are two issues with this approach. One is that client and service are sharing an assembly, which basically violates the spirit of service-orientation, in which client and service only share schema and contract, not class. Still, I don't have a problem with that, because the classes themselves are not tied to EF or .NET specifically. The reason why the client needs to reference the assembly is that it needs the change-tracking code. But I don't see a reason why you couldn't break that part out into a helper assembly, which is referenced by the client and not the service. This is the approach I took in my MSDN Magazine article on the subject back in Dec 2008: http://msdn.microsoft.com/en-us/magazine/dd263098.aspx.

The other issue is coupling between the conceptual model and your object model. As you point out, changing the model will require you to re-generate the STE's, although this takes place when rebuilding the project. The more important issue is that you might want your object model to have different property names or a different shape from the conceptual model defined in the edmx file. In this case, DTO's are the way to go, which is also the approach I took in my article (well, STE's didn't exist back then either). The trade-off is that extra code is required in the data access layer to translate between entities and DTO's. The whole POCO support in general, and STE's in particular, obviates the need for that translation, making the developer's job much easier. For a useful comparison between STE's and DTO's, see Danny Simmons' article: http://msdn.microsoft.com/en-us/magazine/ee321569.aspx.

If anyone reading this would like a step-by-step guide on using STE's, I've written a walk-through on it: http://www.develop.com/ef4ntiersupport.

Cheers,
Tony
Friday, February 05, 2010 3:10:33 PM (GMT Standard Time, UTC+00:00)
Hi Tony

Well, you could compile the assembly code under .NET 3.5 SP1 (IIRC the DataContract Ref stuff was introduced then) but not in earlier versions. And you are right that formally you are not tied to EF, but the fact is that unless you share *source* code you will be tied to .NET 4.0 - and as EF is part of both the full and client profile I don't see a huge difference in terms of coupling.

Someone could take the tracking stuff and reimplement it on another platform. But it's basically a proprietary infrastructure that you would be imposing on another platform - for that matter someone could implement something DataSet-like in Java and do roughly the same job.

I just don't really see a huge benefit over using DataSets when all is said and done - other than that it integrates with EF.
Richard
Friday, February 05, 2010 3:53:13 PM (GMT Standard Time, UTC+00:00)
Yes, what I would do is just expose the STE generated entities as normal data contracts, but implement the change-tracking code in a separate helper assembly that is referenced only by the client. There's nothing magical or proprietary in that code, just a mechanism that marks entities as added, modified, or deleted (and caches deleted items). The part that is EF-specific is reading the ChangeTracker data contract and informing the ObjectStateManager of those changes, which takes place in the ApplyChanges extension method. None of that code is exposed to the client, so there's no coupling there.

In terms of the format of the change-tracking data contract, I would concede a coupling, because it has to be written in such a way that some code in the data access layer will need to understand it. But I don't see that as EF-specific. Once a change-tracking data contract is established, you could write any code you want on the client to record it, and on the service to read it. The question of whether you roll your own data contract, or use what is generated for STE's, is incidental IMHO. The complexity of the wire format for the ChangeTracker data contract is nowhere near that of the DataSet, and it should not be difficult for a Java client to understand and update with its own client-side change tracker.

The ADO.NET DataSet is a .NET-specific class, with an overly complex schema that is difficult to implement on another platform, while the ChangeTracker has a much simpler schema that is platform-neutral and not at all difficult to deal with on other platforms. To me that seems like a big difference. ;-)

Cheers,
Tony
Friday, February 05, 2010 6:03:38 PM (GMT Standard Time, UTC+00:00)
I think it's a fragile argument. You could have said the same about context-based correlation in WF 3.5, where the client "just has to pass the context to the service". But essentially you are leaking implementation detail to the service consumer. The whole point of STEs was to bury the self-tracking code so the client was unaware of it - but due to the implementation the client code is painfully aware of it - or you build your own self-tracking infrastructure.

The service is basically handing the client a whole bunch of complexity. For a really *usable* service the complexity needs to be encapsulated and handled by the service.

The only way they could have automated this is to copy the existing values of the entity on serialization of the data contract. They could then examine the current state and the copy of the state and work out what has changed purely from the entity. The problem with this is you end up with the equivalent of ViewState, which would be unwieldy for large amounts of data.
Richard
Friday, February 05, 2010 7:58:35 PM (GMT Standard Time, UTC+00:00)
You make a good point on the complexity aspect. Better than data sets, but not as simple as it could be. In my implementation, each entity simply carried an additional property indicating change-state (an enum of unmodified, added, modified or deleted). That's it. Then I had the service figure out how to apply those states to the database. That was a pretty clean separation, which allowed bulk updates (for ex, Order with OrderDetails that were removed, added or modified), without having to send in original values or query the database for them. I implemented the client change-tracker as a simple collection that could easily be replicated on non-Microsoft platforms.

The problem with the ObjectChangeTracker class is that it not only includes change-state for the entity itself but has info for all the navigation properties as well (items added or removed) and other metadata like OriginalValues and ExtendedProperties. And it includes all the logic for recording change-state. Yikes! My implementation didn't need any of that stuff, but I think it's probably there to make it easier for ApplyChanges on the service-side to get change-state and inform the ObjectStateManager. So in this respect, I definitely agree it is way more complex than it really needs to be and is too coupled to what the service is expecting. They should have abstracted much of that away and placed the change-tracking logic in the ChangeTrackingCollection<T>, as I demonstrated in my article.

I'm not sure why they opted for the approach they took, because it does not bode well for the interoperability story. While there is nothing in there that references EF, and there needs to be a minimal coupling just to know how to format entity change-state, it does carry a lot of extra baggage that ApplyChanges evidently needs to do its work. Pity, because they did have my implementation in hand when designing STE's and could have approached it differently.

Cheers,
Tony

Monday, February 08, 2010 12:06:14 PM (GMT Standard Time, UTC+00:00)
But what if the only consumers are applications written in .NET?
What if I don't want to interop with other systems?
Do I even need WCF? NO...of course.
Can I pass DataSets or STEs? Of course.
Do I need to map and unmap stuff? It would be a loss of time and speed...
Why should I write slow stuff with WCF and mapping with possible interoperability in mind when ALL the software involved is developed by me?
liviu
Wednesday, February 10, 2010 7:22:37 AM (GMT Standard Time, UTC+00:00)
Hi Liviu

OK, let's take the first three points: no interop and therefore not needing WCF. Yes, this is obviously true; you could use remoting or DCOM. However, WCF is generally faster than remoting and can be faster than DCOM depending on the data being passed. It is true that WCF caters for interop, but it also caters very effectively for communication that is .NET to .NET.

Can you pass DataSets or STEs? Yes of course and the interop part of my argument is not applicable here if you are assuming .NET to .NET. However the coupling argument is: your client has visibility of your Data Access Layer which means changes in the DAL will cascade to your client.

Using DTOs requires mapping of one class to another, so yes, there is an impact - but whether that impact is significant is another matter. You are already going over a wire to the service and then making one or more database roundtrips. The impact of mapping is likely lost in the noise. In addition, if you care about that raw speed you probably aren't using EF anyway.

The points I make are about design decisions and stem from the industry's experience in building distributed systems. But as always, your specific requirements and constraints may take you down another path. As with all design decisions, the most important thing is that you understand the trade-offs. What is the right answer? As with all general advice on architecture and design ... it depends.
Richard
Friday, February 19, 2010 11:01:20 PM (GMT Standard Time, UTC+00:00)
Hi Rich,

You pretty much won me over to your point of view. I posted a blog entry with a sample app on Trackable DTO's: http://blog.tonysneed.com/2010/02/19/trackable-dtos-taking-n-tier-a-step-further-with-ef4. The implementation is much simpler and more loosely coupled than self-tracking entities.

Cheers,
Tony