Arrays, Lists, and some IEnumerable Philosophy

by September 16, 2009 04:36 PM

So I had an "architecture moment" today at work. I was refactoring some code with Jay Smith and while modifying a few classes, we changed a few data types from List<T> to T[] (from generic lists to plain old arrays) at my request. I'll get to why in a moment. Just let it be noted first that doing this caused two things: first it caused a significant amount of work at merge time for one of my teammates, and then it caused a philosophy discussion about why we should or should not bother with plain arrays instead of their big brother List class. And then the conversation veered in the direction of using IEnumerable<T> as return types and the dangers therein.

Inflicted Pain

When we made all the changes, I thought we were doing all the grunt work for everyone else on the team. I hate when what I do causes a significant amount of work for someone else; especially when the change isn't "needed" but is more of a philosophy type change. It turned out that one of the classes heavily affected by our changes was being worked on heavily by someone else. Thus the heavy merge problems for him. Completely my bad.

Array vs List

I have a very strong belief that all code should be self-documenting. Therefore, to me, if I make the data type of a property List<whatever>, I feel that is telling other developers that this is a property expecting to be modified (adding and removing things). An array, however, says to the world that here is the final list as it was at object creation: its length is fixed, so nothing gets added or removed (individual elements can still be reassigned, so it is not truly immutable). A side effect of this belief is that I pretty much never return a List from a method call since there are very few contexts where that makes sense.

Well, there usually is a reason to use List as a data type or return type. Not a good reason, but a reason... I usually see this as a convenience because those lists do get modified by adding dummy objects (like the first item in a dropdown that says "Please Select One") before data binding.
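As a rough illustration of that convention, here is a minimal sketch. The names StateLookup and DropDownHelper and the sample state codes are made up for the example. The lookup returns an array to signal a finished result, and the code that wants a "Please Select One" row builds its own copy instead of mutating a shared List.

```csharp
using System;
using System.Linq;

// Hypothetical data source: the T[] return type says "this is the finished result".
public class StateLookup
{
   public string[] GetStateCodes()
   {
      return new[] { "AR", "MO", "OK" };
   }
}

// Hypothetical UI helper: adds the dummy first row without touching the source.
public static class DropDownHelper
{
   public static string[] WithPlaceholder(string[] items)
   {
      // Concat builds a new sequence; ToArray materializes it once.
      return new[] { "Please Select One" }.Concat(items).ToArray();
   }
}
```

The caller that needs the placeholder pays for one extra array, and everyone else gets a result whose type already says it is not meant to grow.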

There was one other point mentioned too - performance. Yes, I know, I know. A List is bigger than an array and calling ToList on an IEnumerable is slower than calling ToArray, but not by much. And when I say "not by much", I mean seriously, seriously small amounts. I actually did a bit of testing to be sure and this is the result I got, so this part of the argument isn't worth ever mentioning again.

ToList speed (10,000 calls): 2,089 ms
ToArray speed (10,000 calls): 1,986 ms

ToList size: 16,448 bytes
ToArray size: 16,024 bytes

Total guids created: 20,002,000

The code that generated this result is at the bottom of the post.

IEnumerable Gotcha

When I mentioned that I never return List<T> from a method and always choose T[], it was asked why I don't just return IEnumerable<T> instead of an array.

Do you happen to remember the phrase deferred execution? It basically means that any variable of type IEnumerable<T> might be a list of stuff and it might just be an execution plan that will return a list of stuff as the list is enumerated.

The gotcha in that last statement is that the execution plan will run every time the list is enumerated. Did you notice the last line of the above test result? I did that on purpose to show again that an IEnumerable<T> will execute over and over. This is why even though the size of the list of guids is 1000, the GetNewGuid() method was called over 20 million times during the test!

So, you certainly can be careful when returning IEnumerable by calling ToArray() or ToList() to force the execution plan and return a finalized list. The problem for my little brain is that I constantly forget to do that! I'll run the application or the unit test, see wacky things happen or poor performance, and then go back and fix things properly. Or at least I used to when I was on the kick where every list was an IEnumerable no matter what. Now by default I use an array and only return IEnumerable when deferred execution is my intention (remember, all code should be self-documenting).
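A stripped-down sketch of the gotcha, separate from the benchmark at the bottom of the post: the class and method names here are invented for the illustration. It counts how often the projection runs when a deferred sequence is enumerated twice, compared with an array that was materialized once.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
   // Counts how many times the Select projection actually runs.
   public static int Calls;

   private static int Next()
   {
      Calls++;
      return Calls;
   }

   // Returns the execution plan itself: nothing has run yet.
   public static IEnumerable<int> Deferred()
   {
      return Enumerable.Range(0, 3).Select(x => Next());
   }

   // Forces the plan once and hands back a finished array.
   public static int[] Materialized()
   {
      return Deferred().ToArray();
   }

   // Enumerates each result twice and reports the projection calls:
   // element 0 is the deferred cost, element 1 is the array cost.
   public static int[] Run()
   {
      Calls = 0;
      IEnumerable<int> lazy = Deferred();
      foreach (int value in lazy) { }
      foreach (int value in lazy) { }
      int lazyCalls = Calls;

      Calls = 0;
      int[] array = Materialized();
      foreach (int value in array) { }
      foreach (int value in array) { }
      int arrayCalls = Calls;

      return new[] { lazyCalls, arrayCalls };
   }
}
```

With three elements, the deferred version runs the projection six times across the two loops, while the array version runs it three times, all inside ToArray().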

Enough already!

Ya, ya... that's enough rambling. I really do hate long winded blog posts! Here's the code I mentioned earlier that generated that result. Happy coding!

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

internal class Program
{
   private static int _numGuids;

   private static void Main()
   {
      IEnumerable<Guid> guids = Enumerable
         .Range(0, 1000)
         .Select(x => GetNewGuid());

      MeasureSpeed(guids);
      MeasureSize(guids);

      Console.WriteLine("Total guids created: {0:n0}", _numGuids);
   }

   private static void MeasureSpeed(IEnumerable<Guid> guids)
   {
      const int iterations = 10000;

      var stopwatch = new Stopwatch();
      stopwatch.Start();
      for (int i = 0; i < iterations; i++) guids.ToList();
      stopwatch.Stop();
      Console.WriteLine("ToList speed ({0:n0} calls): {1:n0} ms", iterations, stopwatch.ElapsedMilliseconds);

      stopwatch.Reset();
      stopwatch.Start();
      for (int i = 0; i < iterations; i++) guids.ToArray();
      stopwatch.Stop();
      Console.WriteLine("ToArray speed ({0:n0} calls): {1:n0} ms", iterations, stopwatch.ElapsedMilliseconds);

      Console.WriteLine();
   }

   private static void MeasureSize(IEnumerable<Guid> guids)
   {
      var startingMemory = GC.GetTotalMemory(true);

      var list = guids.ToList();
      var memoryAfterList = GC.GetTotalMemory(true);
      Console.WriteLine("ToList size: {0:n0} bytes", memoryAfterList - startingMemory);

      var array = guids.ToArray();
      var memoryAfterArray = GC.GetTotalMemory(true);
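      Console.WriteLine("ToArray size: {0:n0} bytes", memoryAfterArray - memoryAfterList);

      // Keep both collections reachable until after the last measurement so an
      // optimized build cannot collect them before GetTotalMemory(true) runs.
      GC.KeepAlive(list);
      GC.KeepAlive(array);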

      Console.WriteLine();
   }

   private static Guid GetNewGuid()
   {
      _numGuids++;
      return Guid.NewGuid();
   }
}


Linq to NHibernate Repository

by August 31, 2009 04:50 AM

A colleague at work asked for some guidance on creating a generic repository that uses Linq to NHibernate, and I thought I'd reply here instead of directly in case anyone else might find the information useful. First things first, here is the repository I use on a project at work. I'll talk through the interesting pieces of it below the code.

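using System;
using System.Linq;
using System.Linq.Expressions;
using System.Web;
using FluentNHibernate.Cfg;
using FluentNHibernate.Cfg.Db;
using NHibernate;
using NHibernate.Linq; // provides the session.Linq<T>() extension used below

public interface IRepository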
{
   ISession NHSsession { get; }

   /// <summary>
   /// Loads a proxy object with nothing but the primary key set.  
   /// Other properties will be pulled from the DB the first time they are accessed.
   /// Generally only use when you know you will NOT be wanting the other properties though.
   /// </summary>
   T Load<T>(object primaryKey);

   T Get<T>(object primaryKey);
   T Get<T>(Expression<Func<T, bool>> predicate);

   IQueryable<T> Find<T>();
   IQueryable<T> Find<T>(Expression<Func<T, bool>> predicate);

   T Add<T>(T entity);
   T Remove<T>(T entity);
}

public class Repository : IRepository
{
   static Repository()
   {
      _sessionFactory = Fluently.Configure()
         .Database(OracleDataClientConfiguration
            .Oracle9
            .ConnectionString(c => c.FromConnectionStringWithKey("CPSDsn"))
            .Driver("NHibernate.Driver.OracleClientDriver")
            .ShowSql()
         )
         .Mappings(mapping => mapping.FluentMappings.AddFromAssemblyOf<Repository>())
         .ExposeConfiguration(config => config.SetInterceptor(new AppInterceptor()))
         .BuildSessionFactory();
   }

   private static readonly ISessionFactory _sessionFactory;

   private static ISession _testingSession;
   public static ISession NHSession
   {
      get { return HttpContext.Current == null ? _testingSession : HttpContext.Current.Items["_nhSession"] as ISession; }
      set
      {
         if (HttpContext.Current == null)
            _testingSession = value;
         else
            HttpContext.Current.Items["_nhSession"] = value;
      }
   }

   public static void BeginUnitOfWork()
   {
      if (NHSession != null)
         throw new ApplicationException("Unit of Work already started");

      NHSession = _sessionFactory.OpenSession();
      NHSession.FlushMode = FlushMode.Commit;
      NHSession.BeginTransaction();
   }

   public static void EndUnitOfWork()
   {
      if (NHSession == null) return;

      NHSession.Transaction.Rollback();
      NHSession.Dispose();
      NHSession = null;
   }

   public static void SubmitChanges()
   {
      try
      {
         NHSession.Transaction.Commit();
         NHSession.BeginTransaction();
      }
      catch
      {
         NHSession.Transaction.Rollback();
         throw;
      }
   }

   public static void CloseSessionFactory()
   {
      _sessionFactory.Dispose();
   }

   /*******************************************************************************/
   /*******************************************************************************/

   public Repository()
   {
      _session = NHSession;
   }

   private readonly ISession _session;

   public ISession NHSsession { get { return _session; } }

   public T Load<T>(object primaryKey)
   {
      return _session.Load<T>(primaryKey);
   }

   public T Get<T>(object primaryKey)
   {
      return _session.Get<T>(primaryKey);
   }

   public T Get<T>(Expression<Func<T, bool>> predicate)
   {
      return Find<T>().SingleOrDefault(predicate);
   }

   public IQueryable<T> Find<T>()
   {
      return _session.Linq<T>();
   }

   public IQueryable<T> Find<T>(Expression<Func<T, bool>> predicate)
   {
      return Find<T>().Where(predicate);
   }

   public T Add<T>(T entity)
   {
      _session.Save(entity);
      return entity;
   }

   public T Remove<T>(T entity)
   {
      _session.Delete(entity);
      return entity;
   }
}

If I were looking at this for the first time, I think these things would jump out at me:

  • Static constructor - we're not using an IoC container to manage the NH session or session factory, so I saw this as the best way to reliably initialize the session factory. Remember that static constructors are guaranteed to be thread safe and are only called once. Pretty much exactly what we need for the NH session factory.
  • All the other static members - since we're not managing the NH session with IoC either, I needed a way to get a session. Also, since this is from a web application, I wanted the unit of work to be per HTTP request. So in the Global.asax events BeginRequest and EndRequest, I call Repository.BeginUnitOfWork() and Repository.EndUnitOfWork(). And of course I didn't want any unexpected DB changes, so you have to explicitly tell the repository to submit changes; otherwise everything gets rolled back when ending the unit of work. The only thing remaining is the call to close the session factory. This is only used in unit tests.
  • All the NH specific stuff - yes, yes, I know. This isn't a repository you could plug into any ORM solution. I used to balk at such things that were implementation specific, but at some point I realized I was missing out on benefits of that chosen implementation. In fact, I'll talk to those benefits next.
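The static constructor guarantee from the first bullet can be seen in isolation with a small sketch. ExpensiveFactoryHolder and StaticConstructorDemo are invented names, and a plain object stands in for the session factory. Several threads touch the class at once, yet the constructor body runs a single time.

```csharp
using System;
using System.Threading;

// Stand-in for a class that owns an expensive, build-once resource.
public static class ExpensiveFactoryHolder
{
   // Incremented inside the static constructor so we can see how often it ran.
   public static int InitCount;

   public static readonly object Factory;

   static ExpensiveFactoryHolder()
   {
      Interlocked.Increment(ref InitCount);
      Factory = new object();
   }
}

public static class StaticConstructorDemo
{
   // Starts several threads that all touch the holder, then reports how many
   // times the static constructor executed.
   public static int Run()
   {
      var threads = new Thread[8];
      for (int i = 0; i < threads.Length; i++)
      {
         threads[i] = new Thread(() => GC.KeepAlive(ExpensiveFactoryHolder.Factory));
         threads[i].Start();
      }

      foreach (Thread thread in threads) thread.Join();

      return ExpensiveFactoryHolder.InitCount;
   }
}
```

The runtime blocks the other threads until the type initializer has finished, so no extra locking is needed around the one-time setup.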

Having talked to those things that JUMP out at you at first glance, the rest is pretty straightforward. The repository gives you what you need to get entities, get lists of entities, and of course add and remove entities from the DB.

The two methods that are specific to NH (I think) are the Load and Get methods (the Get that takes an object as primary key). Here are the reasons for their existence.

  • T Get<T>(object primaryKey) - if you use NHibernate's Get method, you'll get the added benefit of knowing NH might not have to go to the DB for that entity. If NH already has that entity loaded due to some previous call, then it will just return the one it already has. That's just freaking cool! If however, you use the other Get method that takes a predicate and uses the Linq method SingleOrDefault to get the entity, you'll hit the DB every time even if you're passing in the same predicate every time. Not cool.
  • T Load<T>(object primaryKey) - this one is very cool. If NH already has the requested object in memory, it will return to you the real deal. If not, however, NH does not go to the DB to get it. Instead a proxy object is returned with nothing but the primary key set. You can use that object just as you would the real thing (pass it to constructors, use it as a parameter, etc). The intention is to use it in situations where a reference to the entity is needed, but only for the sake of the relationship (FK in the DB usually), or to get to the primary key value. As a simple example, here is the body of one of the remove methods on one of my repositories: _repository.Remove(_repository.Load<MinorLine>(minorLineId));

I think that's about it. I hope you find it useful or that it at least sparks ideas for your own repository implementation.


NHibernate Interceptors

by August 23, 2009 09:32 AM

A long time ago I asked a question on Stack Overflow about table update events. The title of the question didn't really do it justice - I called it that because I assumed there would be events I could subscribe to that would let me know when NHibernate was about to perform a database operation. In the end I figured out that what I was after is called an Interceptor.

The business case is simple: there is some data considered important enough we want to know everything that happens to the data. Classic auditing situation. The solution to these auditing needs however wasn't your standard single audit table. Instead, for any table containing "auditable" data, there is a sister table suffixed with "_HIST" that contains every column in the master table plus three additional - user, date, and action (Insert, Update, or Delete).

So when I joined the team and eventually introduced NHibernate, I started looking for a slick way to handle the auditing needs with NHibernate. My hope was that I could do something in the mappings and thus not have to change the domain entities themselves in any way. Unfortunately I never found something that would allow that, so the below is the best thing I could think of.

First, create an interface to implement on any entities that have auditing needs. This interface will not only tell the NHibernate code I show later that this is an entity to audit, but will also return the auditor that will do the work.

public interface IEntityToAudit
{
   IAuditor Auditor { get; }
}
public interface IAuditor
{
   void AuditInsert();
   void AuditUpdate(object[] previousState, string[] propertyNames);
   void AuditDelete();
}

The reason I'm passing in the previous state on update is because one of our entities only audits a few of the columns in the table it maps to. This means every time an update happens, I have to check those fields specifically to see if I need to insert a row in the audit table.
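A minimal sketch of that check, under a few assumptions: AuditChangeDetector is a made-up helper, the two state arrays line up index-for-index with propertyNames, and the current values arrive as a second array to keep the example self-contained (the real auditor would read them from the entity it was constructed with).

```csharp
using System;
using System.Collections.Generic;

public static class AuditChangeDetector
{
   // Returns true when any audited property differs between the two snapshots.
   // propertyNames[i] names the value found at previousState[i] and currentState[i].
   public static bool HasAuditedChange(
      string[] propertyNames,
      object[] previousState,
      object[] currentState,
      ICollection<string> auditedProperties)
   {
      // Assumption for this sketch: with no previous snapshot we cannot prove
      // nothing changed, so we err on the side of writing an audit row.
      if (previousState == null) return true;

      for (int i = 0; i < propertyNames.Length; i++)
      {
         if (!auditedProperties.Contains(propertyNames[i])) continue;
         if (!object.Equals(previousState[i], currentState[i])) return true;
      }

      return false;
   }
}
```

An auditor that only cares about a few columns could call something like this from AuditUpdate and skip the insert when it returns false.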

With that in place, all I have to do on my entities is implement the interface on any objects that map to a table we have to audit. For example:

public class MinorLine : IEntityToAudit
{
   protected MinorLine()
   {
      Auditor = new MinorLineAuditor(this);
   }

   public MinorLine(string code, int costCenter) : this()
   {
      Code = code;
      CostCenter = costCenter;
   }

   public virtual IAuditor Auditor { get; private set; }

   public virtual int Id { get; protected set; }
   public virtual string Code { get; protected set; }
   public virtual int CostCenter { get; set; }
}

You might have noticed that we have to specify the actual implementation of the auditor here. This was for two reasons. First, I didn't want the NHibernate code to have to map from an object to all of the specific types that can be audited. Second, creating it here allowed me to pass in the entity under audit to the auditor's constructor, thus saving me from having to deal with the untyped data that I have in the NHibernate code.

The auditor itself unfortunately just does raw ADO.NET stuff. When asked to do an AuditInsert, it inserts a new row into its audit table with an action of 'Insert'. When asked to do an AuditUpdate, it does the same insert, but with an action of 'Update'. Each of my auditors has one method that does the actual insert. The three methods on the interface just delegate to that method, passing in the action. So below, you can see _minorLine is the entity that was passed into the constructor and actionId is the method parameter specifying which action we're auditing.

cmd.CommandText =
   "INSERT INTO MINOR_LINE_MASTER_HIST " +
   "(                                  " +
   "   MINOR_LINE_ID,                  " +
   "   MINOR_LINE_CODE,                " +
   "   COST_CENTER_NUM,                " +
   "   ACTN_USERID,                    " +
   "   ACTN_ID,                        " +
   "   ACTN_DATE                       " +
   ")                                  " +
   "VALUES                             " +
   "(                                  " +
   "   :MINOR_LINE_ID,                 " +
   "   :MINOR_LINE_CODE,               " +
   "   :COST_CENTER_NUM,               " +
   "   :ACTN_USERID,                   " +
   "   :ACTN_ID,                       " +
   "   SYSTIMESTAMP                    " +
   ")                                  ";

DbHelper.AddInParameter(cmd, "MINOR_LINE_ID", _minorLine.Id);
DbHelper.AddInParameter(cmd, "MINOR_LINE_CODE", _minorLine.Code);
DbHelper.AddInParameter(cmd, "COST_CENTER_NUM", _minorLine.CostCenter);

DbHelper.AddInParameter(cmd, "ACTN_USERID", Username());
DbHelper.AddInParameter(cmd, "ACTN_ID", actionId);

At this point, I've shown you how I flag my entities as an entity to audit and how I handle the actual auditing. The only thing left is to wire up the code that calls the auditors at the right times. I'm using Fluent NHibernate for both the mappings and the configuration. The configuration is where you tie together all this magic. Specifically, notice the call to "SetInterceptor":

_sessionFactory = Fluently.Configure()
    .Database(OracleDataClientConfiguration
       .Oracle9
       .ConnectionString(c => c.FromConnectionStringWithKey("CPSDsn"))
       .Driver("NHibernate.Driver.OracleClientDriver")
       .ShowSql()
    )
    .Mappings(mapping => mapping.FluentMappings.AddFromAssemblyOf<Repository>())
    .ExposeConfiguration(config => config.SetInterceptor(new AppInterceptor()))
    .BuildSessionFactory();

And the AppInterceptor:

using NHibernate;
using NHibernate.Type;

public class AppInterceptor : EmptyInterceptor
{
   public override bool OnSave(object entity, object id, object[] state, string[] propertyNames, IType[] types)
   {
      var entityToAudit = entity as IEntityToAudit;
      if (entityToAudit != null)
         entityToAudit.Auditor.AuditInsert();

      return base.OnSave(entity, id, state, propertyNames, types);
   }

   public override bool OnFlushDirty(object entity, object id, object[] currentState, object[] previousState, string[] propertyNames, IType[] types)
   {
      var entityToAudit = entity as IEntityToAudit;
      if (entityToAudit != null)
         entityToAudit.Auditor.AuditUpdate(previousState, propertyNames);

      return base.OnFlushDirty(entity, id, currentState, previousState, propertyNames, types);
   }

   public override void OnDelete(object entity, object id, object[] state, string[] propertyNames, IType[] types)
   {
      var entityToAudit = entity as IEntityToAudit;
      if (entityToAudit != null)
         entityToAudit.Auditor.AuditDelete();

      base.OnDelete(entity, id, state, propertyNames, types);
   }
}

And that's it! Once in place, I've found that adding new entities to the domain with auditing needs is extremely easy. Just create the entity and do the mappings like you've always done. Then when you're ready, just implement the interface and the auditor and you're done.


NWA DNUG - LINQ

by February 11, 2009 04:06 PM

At the February NWADNUG meeting, I gave the lightning round presentation and thought I'd share the code from the presentation. I received some excellent feedback and really appreciate it.

I also want to mention that Jay Smith did an excellent job leading the discussion on Agile development.

The code: LINQ.zip

You can see the presentation here. Thanks Zach!

Let me know if you have any questions. Enjoy!


View Responsibility

by November 9, 2008 02:56 PM

On my current side project, I'm writing an ASP.NET MVC application and have been loving it. Something I find myself doing is breaking up a particular view into the main view that gets asked for by a controller and a few sub views. It is these sub views that caused me to ponder the question of responsibility. I wondered if it was the responsibility of the controller to distinguish between the data needed by the main view and the data needed in each sub view, or was it the main view's responsibility to dole out what the sub view needed. I ultimately came to the conclusion that it is the main view's responsibility. Here is my reasoning.

I think a key ability here is to put yourself in the shoes of another.  Or in other words, be able to see something from different perspectives.

For this particular example, I have an action where the user has asked for a particular agenda by date. So I have an AgendaController and a method called Show. The controller gets the appropriate agenda from the domain and selects a view to display it.

From the perspective of the controller: So I've been asked for an agenda. I'll get the agenda and ask a view to display it. I don't care how it gets displayed, just that it does. So I'll pass the agenda object itself to the view.

From the perspective of the view (the one selected by the controller): So I've been asked to display an agenda. I'll make the date of the agenda be the title of the page and put it here. I want the attendance to go right here, but I'll let the "AttendanceView" render that piece so it will look like it does everywhere else. I'll put the blah blah here, and this thingy over there... etc.

You get the idea. The main view is deciding where to put all the pieces of an agenda. It also decides to delegate some of the more complex pieces to other sub views. When it does that, it needs to pass to that sub view what it wants the sub view to display. For example:

<div class="section" title="Attendance">
   <% Html.RenderPartial("AttendanceView", ViewData.Model.Voters); %>
</div>

So the controller gave the main view an agenda to render and the main view is taking part of that agenda and passing it off to a sub view to handle. This is where I think this separation belongs. I don't think the sub view should have to share the same model type as the main view, and I don't think the controller should have to put that collection of voters behind a special ViewData key for the sub view. I don't even think the controller should have to know there will be sub views. It's the controller's responsibility to get the domain object, and it's the main view's responsibility to make sure that domain object gets displayed.
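The same division of labor can be modeled outside ASP.NET MVC with plain methods. This is only a sketch: Agenda, Views, and the static AgendaController below are simplified stand-ins invented for the example, with strings in place of real views and no HTML encoding.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public class Agenda
{
   public DateTime Date { get; set; }
   public string[] Voters { get; set; }
}

public static class Views
{
   // Sub view: knows only about voters, nothing about agendas.
   public static string Attendance(IEnumerable<string> voters)
   {
      string[] items = voters.Select(v => "<li>" + v + "</li>").ToArray();
      return "<ul>" + string.Join("", items) + "</ul>";
   }

   // Main view: receives the whole agenda and hands the sub view just its slice.
   public static string AgendaPage(Agenda agenda)
   {
      string title = agenda.Date.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);
      return "<h1>" + title + "</h1>"
         + "<div class=\"section\" title=\"Attendance\">"
         + Attendance(agenda.Voters)
         + "</div>";
   }
}

public static class AgendaController
{
   // Controller: gets the domain object and picks a view; it never mentions voters.
   public static string Show(Agenda agenda)
   {
      return Views.AgendaPage(agenda);
   }
}
```

If the attendance markup changes, or a second sub view is added, only the Views class is touched; the controller signature stays the same.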

Of course, all this is just how I'm feeling about it today. I'm sure I could be persuaded to change my viewpoint fairly easily, however.

Happy coding!
