Last Chance to Review Our Yukon CLR Architecture Chapter

TheServerSide.NET are about to post another chapter of our book. This one discusses the new Service Broker architecture. TSS.NET only hosts two chapters of our book at a time, so when the Service Broker chapter is posted, the CLR Architecture one will be removed. If you want to review that chapter, now is the time to do it.

The jump on page for the book is here

Chapter Download Bug Now Fixed

The bug that made downloading chapters of our Yukon book difficult is now fixed. The two chapters currently up for review are:

  • the CLR Architecture - which explains how the CLR is integrated into Yukon and the basics of how to expose functionality from assemblies
  • the In-Proc managed Provider - which explains how data access works from Yukon-resident, CLR-based code.
The Book Review jump on page is here.

Next Chapter of Our Book is Available for Review

The next chapter of the Yukon book by Shawn Wildermuth and me is available for review. This chapter goes into depth about the new managed data access provider for accessing data from within the database itself. It also covers Table Valued Functions and Managed Triggers.

Problems Downloading the Chapters from TheServerSide.NET

It's been reported to me that some people are having problems downloading the chapters for review from TheServerSide.NET. You get stuck having to repeatedly log on to the site. I have reported this bug to TSS.NET and hopefully it will be fixed shortly.

As a workaround, if you log on to the site (using the link in the top right hand corner) before you try to download the chapters, it will all go smoothly.

Why Doesn't Yukon Stop Me Shooting Myself in the Foot?

Now I've been giving my last post some thought, and I thought I'd share my conclusions.

It is obvious that Microsoft are aware of the issues around readonly static fields and reference types in Yukon resident code. You only have to look at the SqlDefinition class to see an example of a type whose preferred usage is as a readonly static field initialized in the static constructor. SqlDefinition members are all read-only operations post construction.

Now from the UNSAFE CAS bucket I would expect to be able to do whatever I like. However, it seems to me that if a reference type is used as a static readonly field by code running in the SAFE or EXTERNAL_ACCESS CAS buckets, then the Yukon verifier should be able to check that the state of these types is never changed. Obviously these types aren't simply flat objects and may contain embedded references to other objects, so the whole tree needs to be chased down. In the end all we are looking for is a use of the IL instructions stfld and maybe ldflda (used for getting a managed pointer to a field - think ref params). So it would seem that on loading an assembly into the database it should be possible to check.

So why isn't this check done? I can think of a few reasons:

  1. The Sql Server team is too damn lazy (I'm pretty sure it's not this one)
  2. The Sql Server team hasn't thought of it (again, looking at the design of the API it must have crossed their minds at least)
  3. They wanted to do it, but they also wanted to ship the product (possibly)
  4. It's much harder than I'm making out
It's number 4 that, I think, is the real answer, and here's why.

Consider the following class:

public class Person
{
  static readonly Pet fluffy;

  static Person()
  {
    fluffy = new Pet();
  }

  public static void StrokePet()
  {
    fluffy.MakeNoise();
  }
}

public class Pet
{
  public void MakeNoise()
  {
    SqlContext.GetPipe().Send("Growl");
  }
}

Well this seems fine - what could go wrong here? Let's alter the code a little ...

public class Person
{
  static readonly Pet fluffy;

  static Person()
  {
    if( DateTime.Now.Hour > 12 )
      fluffy = new Pet();
    else
      fluffy = new Cat();
  }

  public static void StrokePet()
  {
    fluffy.MakeNoise();
  }
}

public class Pet
{
  public virtual void MakeNoise()
  {
    SqlContext.GetPipe().Send("Growl");
  }
}

public class Cat : Pet
{
  int volume;
  public override void MakeNoise()
  {
    string sound = "Purr";
    if( volume > 10 )
      sound = "PURR!";
    SqlContext.GetPipe().Send(sound);
    volume++;
  }
}

OK, now we're in trouble. We don't know whether fluffy is a generic Pet or a Cat, and the verifier can't tell either, as it depends on what time of day the static constructor runs. If it turns out to be a Cat, every call to MakeNoise increments volume, so the object behind the read-only field carries mutable state. And this is the simplest example - we can enormously compound the issue with interfaces and Activator.CreateInstance. So now we need a list of other exclusions:

  • The static field cannot be a non-sealed class
  • The static field cannot be an interface
  • Maybe we could back off slightly and say the static field type cannot use reflection to create instances late bound
  • Maybe we could back off slightly and say the static field type cannot have virtual methods
  • ...

The problem is we end up with a bunch of conditions over and above "it must be immutable" because a verifier cannot determine in many cases how the code paths will look.

So in the end, to allow flexibility in how people build their object hierarchies, the best the platform can do is state the rule of immutability and not pretend it can offer guarantees beyond that. Pretending that the verifier could 100% guarantee things that in some situations it cannot is asking for a slew of nasty bugs hitting people's systems. It's far better in that situation to say "hey, there's an issue you need to be aware of here" and let the developer take responsibility for their own code.

Yukon, AppDomains and Static Fields

Static members of CLR types hold some specific issues for Yukon resident CLR code due to the way in which the AppDomain concept has been mapped into this environment.

Let's take a quick refresher as to what AppDomains are and how they are used within the CLR. All code that executes under the control of the CLR does so within an AppDomain. The AppDomain is analogous to a process in that it acts as a unit of isolation for code running within it. Code in different AppDomains has to perform a significant amount of work to talk to each other (it must use the .NET Remoting infrastructure). If code in one AppDomain encounters a serious problem (such as an unhandled exception), only that AppDomain is affected. Code running in other AppDomains continues execution without impact.

The result of this isolation is that:

  1. Assemblies are loaded on a per-AppDomain basis, which means there is a separate JITed copy of the code for each AppDomain, unless they are specifically loaded domain neutral.
  2. Static members of a type are specific to an AppDomain.

As of the current beta, the decision has been made to have a single AppDomain per database. This means that the CLR's basis of isolation cannot protect two pieces of code executing within a single database from each other - even if they are running in separate transactions. In other words, it is possible for code running in one transaction to make changes to static state that can be seen from another transaction even before the first has committed.

This breaks one of the golden rules of transactions - that of isolation. Therefore, code running from the SAFE or EXTERNAL_ACCESS CAS buckets cannot have non read-only static fields. Code running from the UNSAFE bucket does not have this restriction.

So does this resolve the problem? Unfortunately, it only resolves the problem partially. Ignoring code running from the UNSAFE bucket (it's inherently allowed to do potentially dangerous things), what is wrong with allowing read-only static fields? The problem stems from there being two distinct kinds of types within the CLR type system: value types and reference types. For a value type, a field is actually the data itself - the memory is allocated inline where the field is declared - this means a read-only value type field is constant after construction. However, a reference type field is simply a reference to a block of memory allocated on the Garbage Collected heap. Having a read-only reference type field simply guarantees that the reference remains constant - it says nothing about the state of the object it refers to.

Therefore, it is still possible for code running within one transaction to see state from another - even if the code is from the SAFE bucket.

Given this, for Yukon resident code, it is imperative that any reference type that is held in a static read-only field be immutable once constructed - in other words its state must not be able to be changed. This is the only way we can guarantee that changes from one transaction are truly isolated from another concurrent one.
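The difference between a read-only reference and read-only state is easy to demonstrate outside the database. The following is a minimal sketch in plain C#; AuditState and AuditSettings are hypothetical names made up for illustration and are not part of any Yukon API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type for illustration only; nothing here is a Yukon API.
public static class AuditState
{
    // Read-only value type field: the data itself is fixed once the type is initialized.
    public static readonly int MaxEntries = 100;

    // Read-only reference type field: only the reference is fixed.
    // The list it refers to can still be changed by any caller.
    public static readonly List<string> Seen = new List<string>();

    // Compiles without complaint, yet mutates state shared by every caller
    // in the AppDomain (and List<T> is not even thread-safe).
    public static int Record(string user)
    {
        // Seen = new List<string>();  // would NOT compile: the reference is read-only
        Seen.Add(user);
        return Seen.Count;
    }
}

// One way to follow the immutability rule: a sealed type whose state is
// set in the constructor and exposed only through getters.
public sealed class AuditSettings
{
    private readonly string source;
    private readonly int maxEntries;

    public AuditSettings(string source, int maxEntries)
    {
        this.source = source;
        this.maxEntries = maxEntries;
    }

    public string Source { get { return source; } }
    public int MaxEntries { get { return maxEntries; } }
}
```

Two successive calls to AuditState.Record return different counts, which is exactly the kind of state that would leak between transactions. A static readonly AuditSettings field, by contrast, looks the same to every caller once constructed.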

Event Log Access Should only Require EXTERNAL_ACCESS in Yukon

In this post I was bemoaning the fact that accessing the event log from Yukon resident managed code requires the containing assembly to be loaded into the UNSAFE CAS bucket, whereas I thought the EXTERNAL_ACCESS one should be enough. I've just had it confirmed that this is a bug: the EventLogPermission was accidentally omitted from the EXTERNAL_ACCESS permission set. It will be fixed in the public beta, beta 2.

Incidentally, if you have no idea what I'm talking about, have a look at the CLR Architecture chapter of our book, which is posted here on TheServerSide.NET for public review. It goes into how the Code Access Security environment is set up in Yukon.

Welcome SQL Server 2005

CNET has reported that Yukon and Whidbey are to ship in the first half of 2005 and, as a result, Yukon will officially be called SQL Server 2005.

Writing DDL Triggers in Managed Code in Yukon

In my opinion, one of the neatest features of having the CLR hosted by Yukon is that we can use managed code to bridge out to the non-database world. So we can take non-database data and present it as though it were a result set - even up to the point where we can issue selects and joins against this non-database data source.

Another cool feature is that in Yukon we can now hook triggers on to DDL as well as DML constructs (so things like CREATE TABLE and CREATE USER as well as INSERT and DELETE can fire triggers). Yet another neat thing is that we can create triggers in managed code. When we glue these three new features together we get the ability to provide quite powerful auditing functionality - imagine having the ability to create an Event Log record every time a new user was added to a database.

Now one issue with this is how do we know what happened when a DDL trigger fires - after all we won't get any entries in the INSERTED and DELETED tables. So somehow the database engine needs to deliver the trigger reason (or context) to the trigger. It does this by making the data accessible in the form of an XML document. Here's an example of what the data would look like for a CREATE USER statement:

<?xml version="1.0" encoding="utf-16"?>
<EVENT_INSTANCE>
  <PostTime>2004-03-09T20:04:47.450</PostTime>
  <SPID>53</SPID>
  <EventType>CREATE_USER</EventType>
  <Database>YukonTest</Database>
  <Object>bert</Object>
  <ObjectType>USER</ObjectType>
  <TSQLCommand>
    <SetOptions ANSI_NULLS="ON" 
                ANSI_NULL_DEFAULT="ON" 
                ANSI_PADDING="ON" 
                QUOTED_IDENTIFIER="ON" 
                ENCRYPTED="FALSE" />
    <CommandText>create user bert</CommandText>
  </TSQLCommand>
</EVENT_INSTANCE>

So the data we would be interested in is in this XML document. Specifically in this example I'll extract the new user name from the Object element and the database name from the Database element.

Here's the C# code:

public static void AddUser ()
{
  SqlTriggerContext ctx = SqlContext.GetTriggerContext();

  if (ctx.TriggerAction == TriggerAction.CreateUser)
  {
    string s = new string(ctx.EventData.Value);

    StringReader r = new StringReader(s);
    XmlReader reader = new XmlTextReader(r);
    XmlDocument doc = new XmlDocument();
    doc.Load(reader);

    reader.Dispose();

    XmlNode node = doc.SelectSingleNode("//EVENT_INSTANCE/Database/text()");
    string database = node.Value;
    node = doc.SelectSingleNode("//EVENT_INSTANCE/Object/text()");
    string user = node.Value;

    EventLog evt = new EventLog("Application", ".", "DBAudit");
    evt.WriteEntry(string.Format("User {0} created in database {1}", 
                                 user, database));
  }
}

So first of all I get hold of the details about why this trigger has fired in the form of a SqlTriggerContext. I then extract the XML from the EventData property (which is of SqlChars type). Using XPath I get the user name and database, then add an entry to the event log.
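The XPath extraction step can be tried on its own, outside the database, against the sample document shown earlier. This is a minimal sketch using only System.Xml; EventDataParser and GetField are hypothetical names, and unlike the trigger code above it returns null for a missing element rather than throwing.

```csharp
using System;
using System.Xml;

// Hypothetical helper for illustration; it does not touch any Yukon API.
public static class EventDataParser
{
    // Returns the text of a direct child element of EVENT_INSTANCE,
    // or null if the document has no such element.
    public static string GetField(string eventXml, string elementName)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(eventXml);

        // Anchor the path at the document root instead of searching with "//".
        XmlNode node = doc.SelectSingleNode("/EVENT_INSTANCE/" + elementName);
        return node == null ? null : node.InnerText;
    }
}
```

Given the CREATE USER document above, GetField(xml, "Object") yields the new user name and GetField(xml, "Database") yields the database name. The element name is concatenated straight into the XPath expression, so this sketch assumes it comes from trusted code rather than user input.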

OK, so we have the code, but now we need to expose the functionality within the database. Here's the SQL to install the trigger:

create assembly Triggers
from 'C:\trigger\triggers.dll'
with permission_set=UNSAFE
go

create trigger AddUser
on database for create_user
as external name Triggers:CAddUser::AddUser
go

So there we have it - a powerful extension without having to resort to extended stored procedures. The one thing I find frustrating is that I have to add the assembly to the UNSAFE CAS bucket, otherwise it cannot write to the Event Log. To me it would seem that EXTERNAL_ACCESS should be enough - hopefully this will change by the next beta.

Our Yukon Book is now up for Review

The Yukon book that Shawn Wildermuth and I have been working on for O'Reilly (provisionally entitled Programming Yukon) has just started an open review process. The review process is being hosted by TheServerSide.NET. You can get to the book review jump on page here.

The first chapter is on the Architecture involved with the Yukon .NET support and covers how Yukon hosts the runtime, how assemblies are managed, how Code Access Security works within Yukon and the basics of exposing functionality via Stored Procedures and User Defined Functions.

All feedback is very welcome and should be sent here.

Service Broker Message Validation

One issue that hits people when they write message orientated systems is how to control what messages are allowed to be sent and also how to validate those messages. This blog entry looks at how SSB handles these issues.

SSB enforces that all messages sent must be of a predefined type - called a Message Type. Message Types have a validation associated with them that can be one of three values:

  1. Empty - the message is simply a notification that carries no associated data
  2. Varbinary - the data associated with the message will not be validated by SSB
  3. XML - the data associated with the message will be validated as well-formed XML.

Number 3 can be taken even further. Once you specify that a message must be well-formed XML, you can also, optionally, associate an XML Schema against which the message will be validated (XML Schema can now be loaded as a first class database object).

The syntax for creating a Message Type is as follows (be warned that this is the syntax for the build of Yukon I currently have - the Beta 1 syntax will, apparently, be different).

CREATE MESSAGE TYPE
[http://develop.com/richardb/shippingRequest]
ENCODING VARBINARY

or for a validated XML message

CREATE MESSAGE TYPE
[http://develop.com/richardb/validatedShippingRequest]
ENCODING XML WITH 'http://develop.com/richardb/shippingSchema'

The URL after the CREATE MESSAGE TYPE is simply a unique identifier for the message. Using identifiers of this form makes keeping uniqueness across complex distributed systems simpler.

So this determines the validation of messages. However, how do we restrict what messages may be sent to a particular service? This is the role of a Contract. A Contract determines which message types may be sent and received within a particular dialog - determining which messages can be sent by the dialog initiator and which can be sent by the dialog target (a message can also be specified as being allowed from either). These rules are then enforced by SSB.

Here is an example contract:

CREATE CONTRACT
[http://develop.com/richardb/shippingContract]
(
[http://develop.com/richardb/validatedShippingRequest]
SENT BY INITIATOR
[http://develop.com/richardb/validatedShippingResponse]
SENT BY TARGET
)

This contract is then bound to a service by the service definition.
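To make the rule a contract expresses concrete, here is a toy model in C#. It is only a sketch of the idea (message type mapped to the side allowed to send it); ContractModel, SentBy and DialogSide are hypothetical names and say nothing about how SSB implements the check internally.

```csharp
using System;
using System.Collections.Generic;

// Which side of a dialog a contract lets send a given message type.
public enum SentBy { Initiator, Target, Any }

// The side actually trying to send.
public enum DialogSide { Initiator, Target }

// Toy model of a contract; an illustration only, not an SSB API.
public sealed class ContractModel
{
    private readonly Dictionary<string, SentBy> rules =
        new Dictionary<string, SentBy>();

    // Registers a message type and who may send it; returns this for chaining.
    public ContractModel Allow(string messageType, SentBy sentBy)
    {
        rules[messageType] = sentBy;
        return this;
    }

    // True only if the message type is in the contract and the sending
    // side matches the rule recorded for it.
    public bool CanSend(string messageType, DialogSide side)
    {
        SentBy rule;
        if (!rules.TryGetValue(messageType, out rule))
            return false; // message type is not part of the contract

        if (rule == SentBy.Any)
            return true;

        return (rule == SentBy.Initiator && side == DialogSide.Initiator)
            || (rule == SentBy.Target && side == DialogSide.Target);
    }
}
```

Modelling the shipping contract above means allowing validatedShippingRequest for SentBy.Initiator and validatedShippingResponse for SentBy.Target. CanSend then rejects a response sent by the initiator, as well as any message type the contract does not mention.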
				

So what's the Service Broker?

I've been researching Yukon for a few months as I am writing a book on Yukon for O'Reilly. My co-author is Shawn Wildermuth (the ADO Guy).

My first real view of the product was at the Yukon Tech Preview in Seattle in February 2003 and amongst all the things I already knew about (like XML integration and CLR hosting) one feature really caught my attention - the Service Broker (also known as SSB).

SSB is a reliable messaging infrastructure that is part of the database engine itself. It is built on top of the new Yukon queuing primitives and takes care of the complex parts of message orientated programming. Messages are sent to abstracted endpoints called Services. Services are bound to queues and specify the types of messages that they are prepared to receive - this is enforced by the Service Broker infrastructure.

To send a message to a service, a program starts a "Dialog" from one service (the initiator) to another (the target) and sends one or more messages to the target. These will be readable from the target service's queue only in the order they were sent - even if they were received in a different order - again, functionality that the Service Broker infrastructure provides. The program (most likely a stored procedure) that reads the target queue can send messages back on the same dialog to the initiating service.
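The in-order guarantee can be pictured as a small resequencing buffer. The sketch below assumes each message in a dialog carries a sequence number starting at zero; that numbering scheme, and the DialogResequencer name, are assumptions made for illustration rather than a description of what SSB does internally.

```csharp
using System;
using System.Collections.Generic;

// Toy resequencer for one dialog; an illustration only, not an SSB API.
public sealed class DialogResequencer
{
    // Messages that have arrived but cannot be read yet because an
    // earlier message in the dialog is still missing.
    private readonly Dictionary<long, string> pending =
        new Dictionary<long, string>();

    private long nextExpected = 0;

    // Accepts a message as it arrives and returns the messages that have
    // become readable as a result, in the order they were sent.
    public List<string> Receive(long sequence, string body)
    {
        List<string> ready = new List<string>();

        // Ignore duplicates of messages already delivered or already buffered.
        if (sequence < nextExpected || pending.ContainsKey(sequence))
            return ready;

        pending[sequence] = body;

        // Release the longest run of consecutive messages now available.
        string next;
        while (pending.TryGetValue(nextExpected, out next))
        {
            ready.Add(next);
            pending.Remove(nextExpected);
            nextExpected++;
        }
        return ready;
    }
}
```

If message 1 arrives before message 0, the first Receive call returns nothing. Once message 0 arrives, both are released together in send order.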

The programs that service queues (sometimes called Service Programs) can be autostarted by SSB, and in fact multiple copies can be started if the queue readers aren't keeping up with the rate at which messages are received.

Anyway, I'll leave it there for now and post some more (including syntax) shortly.

Yukon Ho!

Niels has just blogged that Roger Wolter, the Microsoft Programme Manager for the new Service Broker in Yukon, has delivered a session on the technology.

HOORAY

Now I can talk about it :-)

Service Broker is a very cool new feature that builds a reliable messaging infrastructure directly into the Yukon database engine. It handles all the hard stuff about writing message orientated systems (like message ordering, breaking large messages into fragments, recovery, etc).

I'm in the middle of teaching an Essential .NET class at the moment but will hopefully write a proper piece on it tonight when I get back to my hotel.