Arrays, Lists and some IEnumerable Philosophy

Wednesday, September 16, 2009 9:36 PM

So I had an "architecture moment" today at work. I was refactoring some code with Jay Smith and while modifying a few classes, we changed a few data types from List<T> to T[] (from generic lists to plain old arrays) at my request. I'll get to why in a moment. Just let it be noted first that doing this caused two things: first it caused a significant amount of work at merge time for one of my teammates, and then it caused a philosophy discussion about why we should or should not bother with plain arrays instead of their big brother List class. And then the conversation veered in the direction of using IEnumerable<T> as return types and the dangers therein.

Inflicted Pain

When we made all the changes, I thought we were doing all the grunt work for everyone else on the team. I hate when what I do causes a significant amount of work for someone else, especially when the change isn't "needed" but is more of a philosophy type change. It turned out that one of the classes most affected by our changes was being actively worked on by someone else, hence the painful merge for him. Completely my bad.

Array vs List

I have a very strong belief that all code should be self-documenting. Therefore, to me, if I make the data type of a property List<whatever>, I feel that is telling other developers that this is a property expecting to be modified (adding and removing things). An array, however, says to the world that here is the final list as it was at object creation: nothing gets added or removed (individual elements can technically still be overwritten, but the length is fixed). A side result of this belief is that I pretty much never return a List from a method call since there are very few contexts where that makes sense.

Well, there usually is a reason to use List as a data type or return type. Not a good reason, but a reason... I usually see this as a convenience because those lists do get modified by adding dummy objects (like the first item in a dropdown that says "Please Select One") before data binding.
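To make the array-versus-list signal concrete, here is a minimal sketch. StateCatalog and DropDownHelper are hypothetical names invented for illustration, not classes from the project. The catalog hands out an array, and the one caller that needs the "Please Select One" dummy item builds its own List right where the binding happens.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical example type; the names are illustrative only.
public class StateCatalog
{
   private readonly string[] _states;

   public StateCatalog(params string[] states)
   {
      // Copy so later changes to the caller's array don't leak in.
      _states = (string[])states.Clone();
   }

   // Returning T[] signals "this is the finished list". Handing out a copy
   // also keeps callers from overwriting elements of the internal array.
   public string[] GetStates()
   {
      return (string[])_states.Clone();
   }
}

public static class DropDownHelper
{
   // The caller that wants a dummy first item builds its own List at the
   // binding site instead of asking the catalog to expose one.
   public static List<string> WithPlaceholder(string[] items, string placeholder)
   {
      var bindable = new List<string>(items.Length + 1);
      bindable.Add(placeholder);
      bindable.AddRange(items);
      return bindable;
   }
}
```

Calling DropDownHelper.WithPlaceholder(catalog.GetStates(), "Please Select One") gives the UI a list it can bind to while the catalog's own data stays untouched.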

There was one other point mentioned too - performance. Yes, I know, I know. A List is bigger than an array and calling ToList on an IEnumerable is slower than calling ToArray, but not by much. And when I say "not by much", I mean seriously, seriously small amounts. I actually did a bit of testing to be sure, and in the result below ToList came out about 5% slower over 10,000 calls and 424 bytes larger for a list of 1,000 guids, so this part of the argument isn't worth ever mentioning again.

ToList speed (10,000 calls): 2,089 ms
ToArray speed (10,000 calls): 1,986 ms

ToList size: 16,448 bytes
ToArray size: 16,024 bytes

Total guids created: 20,002,000

The code that generated this result is at the bottom of the post.

IEnumerable Gotcha

When I mentioned that I never return List<T> from a method and always choose T[], it was asked why I don't just return IEnumerable<T> instead of an array.

Do you happen to remember the phrase deferred execution? It basically means that any variable of type IEnumerable<T> might be a list of stuff and it might just be an execution plan that will return a list of stuff as the list is enumerated.

The gotcha in that last statement is that the execution plan will run every time the list is enumerated. Did you notice the last line of the above test result? I did that on purpose to show again that an IEnumerable<T> will execute over and over. This is why even though the size of the list of guids is 1000, the GetNewGuid() method was called over 20 million times during the test!
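Here is a stripped-down sketch of that gotcha, separate from the benchmark. DeferredDemo and its members are names I made up for this example. Building the query runs nothing; every enumeration runs the selector again for all 1000 items.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
   // Counts how many times the selector has actually executed.
   public static int Calls;

   private static int Next(int x)
   {
      Calls++;
      return x * 2;
   }

   // Nothing runs here; this only builds the "execution plan".
   public static IEnumerable<int> BuildQuery()
   {
      return Enumerable.Range(0, 1000).Select(x => Next(x));
   }

   // Enumerates the same query 'times' times and reports how many
   // selector calls that caused.
   public static int CountCallsFor(int times)
   {
      Calls = 0;
      IEnumerable<int> query = BuildQuery();
      for (int i = 0; i < times; i++)
         query.ToArray();   // each ToArray() re-runs the whole plan
      return Calls;
   }
}
```

The same arithmetic accounts for the total in the test result: 20,000 timed enumerations plus 2 more in MeasureSize, times 1000 guids each, is 20,002,000.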

So, you certainly can be careful when returning IEnumerable by calling ToArray() or ToList() to force the execution plan and return a finalized list. The problem for my little brain is that I constantly forget to do that! I'll run the application or the unit test, see wacky things happen or poor performance, and then go back and fix things properly. Or at least I used to when I was on the kick where every list was an IEnumerable no matter what. Now by default I use an array and only return IEnumerable when deferred execution is my intention (remember, all code is self-documenting).
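As a rough illustration of that default, here is a sketch with a hypothetical OrderReport class (not from any real project). The array-returning method runs the filter once and hands back a snapshot. The IEnumerable-returning method is lazy on purpose, so it sees whatever is in the source at the moment it is enumerated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical example type; the names are illustrative only.
public class OrderReport
{
   private readonly IEnumerable<decimal> _amounts;

   public OrderReport(IEnumerable<decimal> amounts)
   {
      _amounts = amounts;
   }

   // Array return type: the filter runs exactly once, right here,
   // and the caller gets a finished snapshot.
   public decimal[] GetLargeOrders(decimal threshold)
   {
      return _amounts.Where(a => a >= threshold).ToArray();
   }

   // IEnumerable return type: deferred on purpose. The filter runs
   // each time the caller enumerates the result.
   public IEnumerable<decimal> StreamLargeOrders(decimal threshold)
   {
      return _amounts.Where(a => a >= threshold);
   }
}
```

If the underlying source changes after both methods have been called, the array still holds the old results while the streamed query picks up the new items the next time it is enumerated.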

Enough already!

Ya, ya... that's enough rambling. I really do hate long-winded blog posts! Here's the code I mentioned earlier that generated that result. Happy coding!

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

internal class Program
{
   private static int _numGuids;

   private static void Main()
   {
      IEnumerable<Guid> guids = Enumerable
         .Range(0, 1000)
         .Select(x => GetNewGuid());

      MeasureSpeed(guids);
      MeasureSize(guids);

      Console.WriteLine("Total guids created: {0:n0}", _numGuids);
   }

   private static void MeasureSpeed(IEnumerable<Guid> guids)
   {
      const int iterations = 10000;

      var stopwatch = new Stopwatch();
      stopwatch.Start();
      for (int i = 0; i < iterations; i++) guids.ToList();
      stopwatch.Stop();
      Console.WriteLine("ToList speed ({0:n0} calls): {1:n0} ms", iterations, stopwatch.ElapsedMilliseconds);

      stopwatch.Reset();
      stopwatch.Start();
      for (int i = 0; i < iterations; i++) guids.ToArray();
      stopwatch.Stop();
      Console.WriteLine("ToArray speed ({0:n0} calls): {1:n0} ms", iterations, stopwatch.ElapsedMilliseconds);

      Console.WriteLine();
   }

   private static void MeasureSize(IEnumerable<Guid> guids)
   {
      var startingMemory = GC.GetTotalMemory(true);

      var list = guids.ToList();
      var memoryAfterList = GC.GetTotalMemory(true);
      Console.WriteLine("ToList size: {0:n0} bytes", memoryAfterList - startingMemory);

      var array = guids.ToArray();
      var memoryAfterArray = GC.GetTotalMemory(true);
      Console.WriteLine("ToArray size: {0:n0} bytes", memoryAfterArray - memoryAfterList);

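      // Keep both collections reachable until after the last measurement so the
      // forced collections inside GC.GetTotalMemory(true) cannot reclaim them early.
      GC.KeepAlive(list);
      GC.KeepAlive(array);

      Console.WriteLine();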
   }

   private static Guid GetNewGuid()
   {
      _numGuids++;
      return Guid.NewGuid();
   }
}
Tags: .net, architecture, linq

Linq to NHibernate Repository

Monday, August 31, 2009 9:50 AM

A colleague at work asked for some guidance on creating a generic repository that uses Linq to NHibernate and I thought I'd reply here instead of directly in case anyone else might find the information useful. First things first, here is the repository I use on a project at work. I'll talk to interesting pieces of it below the code.

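using System;
using System.Linq;
using System.Linq.Expressions;
using System.Web;
using FluentNHibernate.Cfg;
using FluentNHibernate.Cfg.Db;
using NHibernate;
using NHibernate.Linq;

public interface IRepository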
{
   ISession NHSsession { get; }

   /// <summary>
   /// Loads a proxy object with nothing but the primary key set.  
   /// Other properties will be pulled from the DB the first time they are accessed.
   /// Generally only use when you know you will NOT be wanting the other properties though.
   /// </summary>
   T Load<T>(object primaryKey);

   T Get<T>(object primaryKey);
   T Get<T>(Expression<Func<T, bool>> predicate);

   IQueryable<T> Find<T>();
   IQueryable<T> Find<T>(Expression<Func<T, bool>> predicate);

   T Add<T>(T entity);
   T Remove<T>(T entity);
}

public class Repository : IRepository
{
   static Repository()
   {
      _sessionFactory = Fluently.Configure()
         .Database(OracleDataClientConfiguration
            .Oracle9
            .ConnectionString(c => c.FromConnectionStringWithKey("CPSDsn"))
            .Driver("NHibernate.Driver.OracleClientDriver")
            .ShowSql()
         )
         .Mappings(mapping => mapping.FluentMappings.AddFromAssemblyOf<Repository>())
         .ExposeConfiguration(config => config.SetInterceptor(new AppInterceptor()))
         .BuildSessionFactory();
   }

   private static readonly ISessionFactory _sessionFactory;

   private static ISession _testingSession;
   public static ISession NHSession
   {
      get { return HttpContext.Current == null ? _testingSession : HttpContext.Current.Items["_nhSession"] as ISession; }
      set
      {
         if (HttpContext.Current == null)
            _testingSession = value;
         else
            HttpContext.Current.Items["_nhSession"] = value;
      }
   }

   public static void BeginUnitOfWork()
   {
      if (NHSession != null)
         throw new ApplicationException("Unit of Work already started");

      NHSession = _sessionFactory.OpenSession();
      NHSession.FlushMode = FlushMode.Commit;
      NHSession.BeginTransaction();
   }

   public static void EndUnitOfWork()
   {
      if (NHSession == null) return;

      NHSession.Transaction.Rollback();
      NHSession.Dispose();
      NHSession = null;
   }

   public static void SubmitChanges()
   {
      try
      {
         NHSession.Transaction.Commit();
         NHSession.BeginTransaction();
      }
      catch
      {
         NHSession.Transaction.Rollback();
         throw;
      }
   }

   public static void CloseSessionFactory()
   {
      _sessionFactory.Dispose();
   }

   /*******************************************************************************/
   /*******************************************************************************/

   public Repository()
   {
      _session = NHSession;
   }

   private readonly ISession _session;

   public ISession NHSsession { get { return _session; } }

   public T Load<T>(object primaryKey)
   {
      return _session.Load<T>(primaryKey);
   }

   public T Get<T>(object primaryKey)
   {
      return _session.Get<T>(primaryKey);
   }

   public T Get<T>(Expression<Func<T, bool>> predicate)
   {
      return Find<T>().SingleOrDefault(predicate);
   }

   public IQueryable<T> Find<T>()
   {
      return _session.Linq<T>();
   }

   public IQueryable<T> Find<T>(Expression<Func<T, bool>> predicate)
   {
      return Find<T>().Where(predicate);
   }

   public T Add<T>(T entity)
   {
      _session.Save(entity);
      return entity;
   }

   public T Remove<T>(T entity)
   {
      _session.Delete(entity);
      return entity;
   }
}

If I were looking at this for the first time, I think these things would jump out at me:

  • Static constructor - we're not using an IoC container to manage the NH session or session factory, so I saw this as the best way to reliably initialize the session factory. Remember that static constructors are guaranteed to be thread safe and are only called once. Pretty much exactly what we need for the NH session factory.
  • All the other static members - since we're not managing the NH session with IoC either, I needed a way to get a session. Also, since this is from a web application, I wanted the unit of work to be per HTTP request. So in the Global.asax events BeginRequest and EndRequest, I call Repository.BeginUnitOfWork() and Repository.EndUnitOfWork(). And of course I didn't want any unexpected DB changes, so you have to explicitly tell the repository to submit changes; otherwise everything gets rolled back when ending the unit of work. The only thing remaining is the call to close the session factory. This is only used in unit tests.
  • All the NH specific stuff - yes, yes, I know. This isn't a repository you could plug into any ORM solution. I used to balk at such things that were implementation specific, but at some point I realized I was missing out on benefits of that chosen implementation. In fact, I'll talk to those benefits next.
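The static constructor point from the first bullet is easy to check in isolation. This is only a sketch: FakeFactory is a made-up stand-in for something expensive like a session factory, and FactoryHolder plays the role of the repository class. Several threads touch the holder at once, and the factory still gets built exactly once.

```csharp
using System;
using System.Threading;

// Made-up stand-in for an expensive-to-build object such as a session factory.
public class FakeFactory
{
   public static int BuildCount;

   public FakeFactory()
   {
      Interlocked.Increment(ref BuildCount);
   }
}

public static class FactoryHolder
{
   private static readonly FakeFactory _factory;

   // The runtime runs this once, before first use of the class, and blocks
   // any other thread that arrives while it is still running.
   static FactoryHolder()
   {
      _factory = new FakeFactory();
   }

   public static FakeFactory Factory { get { return _factory; } }
}

public static class StaticCtorDemo
{
   // Touches the holder from several threads and returns how many
   // factories were built in total.
   public static int TouchFromManyThreads(int threadCount)
   {
      var threads = new Thread[threadCount];
      for (int i = 0; i < threadCount; i++)
      {
         threads[i] = new Thread(() => GC.KeepAlive(FactoryHolder.Factory));
         threads[i].Start();
      }
      foreach (var thread in threads)
         thread.Join();
      return FakeFactory.BuildCount;
   }
}
```

StaticCtorDemo.TouchFromManyThreads(8) reports a single build no matter how the threads interleave, which is the property the repository leans on.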

Having talked to those things that JUMP out at you at first glance, the rest is pretty straightforward. The repository gives you what you need to get entities, get lists of entities, and of course add and remove entities from the DB.

The two methods that are specific to NH (I think) are the Load and Get methods (the Get that takes an object as primary key). Here are the reasons for their existence.

  • T Get<T>(object primaryKey) - if you use NHibernate's Get method, you'll get the added benefit of knowing NH might not have to go to the DB for that entity. If NH already has that entity loaded due to some previous call, then it will just return the one it already has. That's just freaking cool! If however, you use the other Get method that takes a predicate and uses the Linq method SingleOrDefault to get the entity, you'll hit the DB every time even if you're passing in the same predicate every time. Not cool.
  • T Load<T>(object primaryKey) - this one is very cool. If NH already has the requested object in memory, it will return to you the real deal. If not, however, NH does not go to the DB to get it. Instead a proxy object is returned with nothing but the primary key set. You can use that object just as you would the real thing (pass it to constructors, use it as a parameter, etc). The intention is to use it in situations where a reference to the entity is needed, but only for the sake of the relationship (FK in the DB usually), or to get to the primary key value. As a simple example, here is the body of one of the remove methods on one of my repositories: _repository.Remove(_repository.Load<MinorLine>(minorLineId));
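The difference between the three lookups can be modelled with a toy. To be clear, this is not how NHibernate is implemented; ToySession, Customer, and DbHits are all invented here to imitate the behaviour described in the two bullets above, and the placeholder returned by Load never lazy-loads the way a real proxy does.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Customer
{
   public int Id;
   public string Name;
}

// Toy model only: imitates the caching behaviour described above.
public class ToySession
{
   private readonly Dictionary<int, Customer> _table;   // the "database"
   private readonly Dictionary<int, Customer> _cache = new Dictionary<int, Customer>();

   public int DbHits { get; private set; }

   public ToySession(IEnumerable<Customer> rows)
   {
      _table = rows.ToDictionary(r => r.Id);
   }

   // Lookup by key: served from the cache when the entity is already loaded.
   public Customer Get(int id)
   {
      Customer cached;
      if (_cache.TryGetValue(id, out cached)) return cached;

      DbHits++;
      Customer row;
      if (!_table.TryGetValue(id, out row)) return null;
      _cache[id] = row;
      return row;
   }

   // Lookup by predicate: there is no key to check the cache with,
   // so the query goes to the "database" every time.
   public Customer Get(Func<Customer, bool> predicate)
   {
      DbHits++;
      var row = _table.Values.SingleOrDefault(predicate);
      if (row != null) _cache[row.Id] = row;
      return row;
   }

   // Load: return the cached entity if there is one, otherwise a
   // key-only placeholder, without touching the "database".
   public Customer Load(int id)
   {
      Customer cached;
      if (_cache.TryGetValue(id, out cached)) return cached;
      return new Customer { Id = id };
   }
}
```

Calling Get(1) twice costs one hit, calling the predicate version twice costs two, and Load(1) on a fresh session costs none.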

I think that's about it. I hope you find it useful or that it at least sparks ideas for your own repository implementation.

Tags: .net, linqtonhibernate, linq, nhibernate

NHibernate Interceptors

Sunday, August 23, 2009 2:32 PM

A long time ago I asked a question on Stack Overflow about table update events. The title of the question didn't really do it justice - I called it that because I assumed there would be events I could subscribe to that would let me know when NHibernate was about to perform a database operation. In the end I figured out that what I was after is called an Interceptor.

The business case is simple: there is some data considered important enough that we want to know everything that happens to it. Classic auditing situation. The solution to these auditing needs however wasn't your standard single audit table. Instead, for any table containing "auditable" data, there is a sister table suffixed with "_HIST" that contains every column in the master table plus three additional columns - user, date, and action (Insert, Update, or Delete).

So when I joined the team and eventually introduced NHibernate, I started looking for a slick way to handle the auditing needs with NHibernate. My hope was that I could do something in the mappings and thus not have to change the domain entities themselves in any way. Unfortunately I never found something that would allow that, so the below is the best thing I could think of.

First, create an interface to implement on any entities that have auditing needs. This interface will not only tell the NHibernate code I show later that this is an entity to audit, but will also return the auditor that will do the work.

public interface IEntityToAudit
{
   IAuditor Auditor { get; }
}
public interface IAuditor
{
   void AuditInsert();
   void AuditUpdate(object[] previousState, string[] propertyNames);
   void AuditDelete();
}

The reason I'm passing in the previous state on update is because one of our entities only audits a few of the columns in the table it maps to. This means every time an update happens, I have to check those fields specifically to see if I need to insert a row in the audit table.
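For what that check might look like, here is a small sketch. AuditStateHelper is a hypothetical helper of my own, not part of the project or of NHibernate. It assumes previousState and propertyNames are parallel arrays (the value at index i belongs to the property named at index i), and it treats a missing previous state as "changed", which is my own choice for the example.

```csharp
using System;

// Hypothetical helper; the names are illustrative only.
public static class AuditStateHelper
{
   // Looks up the previous value of one property by name.
   public static object PreviousValueOf(string property, object[] previousState, string[] propertyNames)
   {
      int index = Array.IndexOf(propertyNames, property);
      if (index < 0)
         throw new ArgumentException("Unknown property: " + property, "property");
      return previousState[index];
   }

   // True when the audited property differs from its previous value.
   // Assumption: with no previous state available, report a change so
   // the audit row is written rather than silently skipped.
   public static bool HasChanged(string property, object currentValue, object[] previousState, string[] propertyNames)
   {
      if (previousState == null) return true;

      object previous = PreviousValueOf(property, previousState, propertyNames);
      return !object.Equals(previous, currentValue);
   }
}
```

An auditor that only cares about a few columns could call HasChanged once per audited property inside AuditUpdate and skip the insert when none of them changed.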

With that in place, all I have to do on my entities is implement the interface on any objects that map to a table we have to audit. For example:

public class MinorLine : IEntityToAudit
{
   protected MinorLine()
   {
      Auditor = new MinorLineAuditor(this);
   }

   public MinorLine(string code, int costCenter) : this()
   {
      Code = code;
      CostCenter = costCenter;
   }

   public virtual IAuditor Auditor { get; private set; }

   public virtual int Id { get; protected set; }
   public virtual string Code { get; protected set; }
   public virtual int CostCenter { get; set; }
}

You might have noticed that we have to specify the actual implementation of the auditor here. This was for two reasons. First, I didn't want the NHibernate code to have to map from an object to all of the specific types that can be audited. Second, creating it here allowed me to pass in the entity under audit to the auditor's constructor, thus saving me from having to deal with the untyped data that I have in the NHibernate code.

The auditor itself unfortunately just does raw ADO.NET stuff. When asked to do an AuditInsert, it inserts a new row into its audit table with an action of 'Insert'. When asked to do an AuditUpdate, it does the same insert, but with an action of 'Update'. Each of my auditors has one method that does the actual insert. The three methods on the interface just delegate to that method, passing in the action. So below, you can see _minorLine is the entity that was passed into the constructor and actionId is the method parameter specifying which action we're auditing.

cmd.CommandText =
   "INSERT INTO MINOR_LINE_MASTER_HIST " +
   "(                                  " +
   "   MINOR_LINE_ID,                  " +
   "   MINOR_LINE_CODE,                " +
   "   COST_CENTER_NUM,                " +
   "   ACTN_USERID,                    " +
   "   ACTN_ID,                        " +
   "   ACTN_DATE                       " +
   ")                                  " +
   "VALUES                             " +
   "(                                  " +
   "   :MINOR_LINE_ID,                 " +
   "   :MINOR_LINE_CODE,               " +
   "   :COST_CENTER_NUM,               " +
   "   :ACTN_USERID,                   " +
   "   :ACTN_ID,                       " +
   "   SYSTIMESTAMP                    " +
   ")                                  ";

DbHelper.AddInParameter(cmd, "MINOR_LINE_ID", _minorLine.Id);
DbHelper.AddInParameter(cmd, "MINOR_LINE_CODE", _minorLine.Code);
DbHelper.AddInParameter(cmd, "COST_CENTER_NUM", _minorLine.CostCenter);

DbHelper.AddInParameter(cmd, "ACTN_USERID", Username());
DbHelper.AddInParameter(cmd, "ACTN_ID", actionId);

At this point, I've shown you how I flag my entities as an entity to audit and how I handle the actual auditing. The only thing left is to wire up the code that calls the auditors at the right times. I'm using Fluent NHibernate for both the mappings and the configuration. The configuration is where you tie together all this magic. Specifically, notice the call to "SetInterceptor".

_sessionFactory = Fluently.Configure()
    .Database(OracleDataClientConfiguration
       .Oracle9
       .ConnectionString(c => c.FromConnectionStringWithKey("CPSDsn"))
       .Driver("NHibernate.Driver.OracleClientDriver")
       .ShowSql()
    )
    .Mappings(mapping => mapping.FluentMappings.AddFromAssemblyOf<Repository>())
    .ExposeConfiguration(config => config.SetInterceptor(new AppInterceptor()))
    .BuildSessionFactory();

And the AppInterceptor:

using NHibernate;
using NHibernate.Type;

public class AppInterceptor : EmptyInterceptor
{
   public override bool OnSave(object entity, object id, object[] state, string[] propertyNames, IType[] types)
   {
      var entityToAudit = entity as IEntityToAudit;
      if (entityToAudit != null)
         entityToAudit.Auditor.AuditInsert();

      return base.OnSave(entity, id, state, propertyNames, types);
   }

   public override bool OnFlushDirty(object entity, object id, object[] currentState, object[] previousState, string[] propertyNames, IType[] types)
   {
      var entityToAudit = entity as IEntityToAudit;
      if (entityToAudit != null)
         entityToAudit.Auditor.AuditUpdate(previousState, propertyNames);

      return base.OnFlushDirty(entity, id, currentState, previousState, propertyNames, types);
   }

   public override void OnDelete(object entity, object id, object[] state, string[] propertyNames, IType[] types)
   {
      var entityToAudit = entity as IEntityToAudit;
      if (entityToAudit != null)
         entityToAudit.Auditor.AuditDelete();

      base.OnDelete(entity, id, state, propertyNames, types);
   }
}

And that's it! Once in place, I've found that adding new entities to the domain with auditing needs is extremely easy. Just create the entity and do the mappings like you've always done. Then when you're ready, just implement the interface and the auditor and you're done.

Tags: .net, nhibernate

Tulsa School of Dev - LINQ

Sunday, March 29, 2009 6:28 PM

I gave my Advanced LINQ presentation again at Tulsa's School of Dev. It was another great session with a lot of participation from the audience. It's funny how my best presentations are the ones where I don't do all of the talking! :)

As a whole, the event was awesome. It's always good to touch base with others who are active in the development community. I also attended a few really cool presentations. Ken Byrd started me off with a talk on jQuery. Even though I've been using it for a while and the talk was a "basics" talk, I picked up a few new nuggets and that is always cool. I then got to see Zain Naboulsi talk about design patterns. It was another basics talk, but watching him is always worth it as he's such an animated speaker.

The gem of the day was a talk by Chris Patterson about Event Driven Architecture. In the code demo he used Mass Transit of course (which was very cool), but the incredible thing to me was the massive mind shift I experienced during the presentation.

For some reason, I've always disliked messaging. It always seemed like a huge house of cards waiting to come crumbling down. I think it may be due to a few poor implementations I've had to work on. Whatever - I was a knucklehead regardless of the excuse.

In the time span of one hour, Chris made a bunch of connections for me and really caused me to rethink my opinions. I have no idea how I'll be able to take advantage of this new information where I work, but you can bet I'll be thinking about it. Very cool stuff.

On top of that, I won a copy of VS 2008 Pro! In fact, I rode over with Jay Smith and Devlin Liles (thanks for driving, Devlin!) and all three of us came home with prizes.

The only other interesting thing about the day was the weather. Below was my view from where I went to smoke. Despite all the snow, it cleared in time to be an uneventful drive home. Very odd!

See you at the Northwest Arkansas Code Camp!

Tags:

AccountableGov Is Live!

Thursday, January 15, 2009 12:10 PM

At the beginning of last summer a colleague of mine, Jesse Core, asked me what I thought about creating a website with him. I liked the concept of what he wanted and thought it would be a great playground for me. So that was the agreement - I'd be all for it if there weren't time crunches and I could use it as a learning bed for whatever I wanted to play with. Basically I didn't want another job, but I liked having something tangible to build that would allow me to play with the things I don't get to play with at work.

The concept? Straight from the about page:

The need for accountable and transparent government is greater now than ever. While there are numerous avenues to study the voting records of federal and even state officials, it is difficult on the local level. More than likely, if you want to see how your local public officials are voting, you will need to visit City Hall and dig through mountains of papers. This is why we created AccountableGOV.com. It is a free service to cities to offer their citizens a searchable database of voting records. Citizens can now easily view historical votes, determine voting trends, view attendance records and citizen comments.

With that, http://accountablegov.com was born!

Technically, this little adventure has been fun so far. Here are a few geek bullets about it:

I'm sure there's more, but that's all that's coming to mind right now. As you can see though, it's all about learning for me and it's all about fun. Cool stuff!

Another really interesting thing for me has been an early decision I made to define a site mantra: everything interesting about the site will be linkable, including (especially?) the searches. This is huge to me.

What this means is I can take any interesting search result I come up with and email it to others or blog about it. The last thing I wanted was for someone to have to tell a friend, "OK, go to the search page, type these eight keywords, and press enter. How cool is that!" Yuck! Instead they should just be able to say, "check this out!" ('this' being a link of course).
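In code terms, "linkable" mostly means the whole search lives in the URL. Here is a tiny sketch of that idea. The "/search" path and the "q" parameter are made up for the example and are not the site's actual URL scheme.

```csharp
using System;

public static class SearchLinks
{
   // Builds a shareable link that carries the entire search in the query
   // string. "/search" and "q" are hypothetical names for this sketch.
   public static string Build(string siteRoot, params string[] keywords)
   {
      string query = string.Join(" ", keywords);
      return siteRoot.TrimEnd('/') + "/search?q=" + Uri.EscapeDataString(query);
   }
}
```

Anyone who opens the resulting link lands on the same results without retyping a single keyword.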

Needless to say I'm excited. Now that it's live things might really start getting interesting.

Tags: