Arrays Lists and some IEnumerable Philosophy

by September 16, 2009 04:36 PM

So I had an "architecture moment" today at work. I was refactoring some code with Jay Smith and while modifying a few classes, we changed a few data types from List<T> to T[] (from generic lists to plain old arrays) at my request. I'll get to why in a moment. Just let it be noted first that doing this caused two things: first it caused a significant amount of work at merge time for one of my teammates, and then it caused a philosophy discussion about why we should or should not bother with plain arrays instead of their big brother List class. And then the conversation veered in the direction of using IEnumerable<T> as return types and the dangers therein.

Inflicted Pain

When we made all the changes, I thought we were doing all the grunt work for everyone else on the team. I hate when what I do causes a significant amount of work for someone else; especially when the change isn't "needed" but is more of a philosophy type change. It turned out that one of the classes heavily affected by our changes was being worked on heavily by someone else. Thus the heavy merge problems for him. Completely my bad.

Array vs List

I have a very strong belief that all code should be self documenting. Therefore, to me, if I make the data type of a property List<whatever>, I feel that is telling other developers that this is a property expecting to be modified (adding and removing things). An array, however, says to the world that here is the final un-modifiable list as it was at object creation. A side result of this belief is that I pretty much never return a List from a method call since there are very few contexts where that makes sense.

Well, there usually is a reason to use List as a data type or return type. Not a good reason, but a reason... I usually see this as a convenience because those lists do get modified by adding dummy objects (like the first item in a dropdown that says "Please Select One") before data binding.

There was one other point mentioned too - performance. Yes, I know, I know. A List is bigger than an array and calling ToList on an IEnumerable is slower than calling ToArray, but not by much. And when I say "not by much", I mean seriously, seriously small amounts. I actually did a bit of testing to be sure and this is the result I got, so this part of the argument isn't worth ever mentioning again.

ToList speed (10,000 calls): 2,089 ms
ToArray speed (10,000 calls): 1,986 ms

ToList size: 16,448 bytes
ToArray size: 16,024 bytes

Total guids created: 20,002,000

The code that generated this result is at the bottom of the post.

IEnumerable Gotcha

When I mentioned that I never return List<T> from a method and always choose T[], it was asked why I don't just return IEnumerable<T> instead of an array.

Do you happen to remember the phrase deferred execution? It basically means that any variable of type IEnumerable<T> might be a list of stuff and it might just be an execution plan that will return a list of stuff as the list is enumerated.

The gotcha in that last statement is that the execution plan will run every time the list is enumerated. Did you notice the last line of the above test result? I did that on purpose to show again that an IEnumerable<T> will execute over and over. This is why even though the size of the list of guids is 1000, the GetNewGuid() method was called over 20 million times during the test!

So, you certainly can be careful when returning IEnumerable by calling ToArray() or ToList() to force the execution plan and return a finalized list. The problem for my little brain is that I constantly forget to do that! I'll run the application or the unit test, see whacky things happen or poor performance, and then go back and fix things properly. Or at least I used to when I was on the kick where every list was an IEnumerable no matter what. Now by default I use an array and only return IEnumerable when deferred execution is my intention (remember all code is self documenting).

Enough already!

Ya, ya... that's enough rambling. I really do hate long winded blog posts! Here's the code I mentioned earlier that generated that result. Happy coding!

internal class Program
{
   private static int _numGuids;

   private static void Main()
   {
      IEnumerable<Guid> guids = Enumerable
         .Range(0, 1000)
         .Select(x => GetNewGuid());

      MeasureSpeed(guids);
      MeasureSize(guids);

      Console.WriteLine("Total guids created: {0:n0}", _numGuids);
   }

   private static void MeasureSpeed(IEnumerable<Guid> guids)
   {
      const int iterations = 10000;

      var stopwatch = new Stopwatch();
      stopwatch.Start();
      for (int i = 0; i < iterations; i++) guids.ToList();
      stopwatch.Stop();
      Console.WriteLine("ToList speed ({0:n0} calls): {1:n0} ms", iterations, stopwatch.ElapsedMilliseconds);

      stopwatch.Reset();
      stopwatch.Start();
      for (int i = 0; i < iterations; i++) guids.ToArray();
      stopwatch.Stop();
      Console.WriteLine("ToArray speed ({0:n0} calls): {1:n0} ms", iterations, stopwatch.ElapsedMilliseconds);

      Console.WriteLine();
   }

   private static void MeasureSize(IEnumerable<Guid> guids)
   {
      var startingMemory = GC.GetTotalMemory(true);

      var list = guids.ToList();
      var memoryAfterList = GC.GetTotalMemory(true);
      Console.WriteLine("ToList size: {0:n0} bytes", memoryAfterList - startingMemory);

      var array = guids.ToArray();
      var memoryAfterArray = GC.GetTotalMemory(true);
      Console.WriteLine("ToArray size: {0:n0} bytes", memoryAfterArray - memoryAfterList);

      Console.WriteLine();
   }

   private static Guid GetNewGuid()
   {
      _numGuids++;
      return Guid.NewGuid();
   }
}

Tags: , ,

View Responsibility

by November 9, 2008 02:56 PM

On my current side project, I'm writing an ASP.NET MVC application and have been loving it. Something I find myself doing is breaking up a particular view into the main view that gets asked for by a controller and a few sub views. It is these sub views that caused me to ponder the question of responsibility. I wondered if it was the responsibility of the controller to distinguish between the data needed by the main view and the data needed in each sub view, or was it the main view's responsibility to dole out what the sub view needed. I ultimately came to the conclusion that it is the main view's responsibility. Here is my reasoning.

I think a key ability here is to put yourself in the shoes of another.  Or in other words, be able to see something from different perspectives.

For this particular example, I have an action where the user has asked for a particular agenda by date. So I have an AgendaController and a method called Show. The controller gets the appropriate agenda from the domain and selects a view to display it.

From the perspective of the controller: So I've been asked for an agenda. I'll get the agenda and ask a view to display it. I don't care how it gets displayed, just that it does. So I'll pass the agenda object itself to the view.

From the perspective of the view (the one selected by the controller): So I've been asked to display an agenda. I'll make the date of the agenda be the title of the page and put it here. I want the attendance to go right here, but I'll let the "AttendanceView" render that piece so it will look like it does everywhere else. I'll put the blah blah here, and this thingy over there... etc.

You get the idea. The main view is deciding where to put all the pieced of an agenda. It also decides to delegate some of the more complex pieces to other sub views. When it does that, it needs to pass to that sub view what it wants the sub view to display. For example:

<div class="section" title="Attendance">
   <% Html.RenderPartial("AttendanceView", ViewData.Model.Voters); %>
</div>

So the controller gave the main view an agenda to render and the main view is taking part of that agenda and passing it off to a sub view to handle. This is where I think this separation belongs. I don't think the sub view should have to share the same model type as the main view, and I don't think the controller should have to put that collection of voters behind a special ViewData key for the sub view. I don't even think the controller should have to know there will be sub views. It's the controller's responsibility to get the domain object, and it's the main view's responsibility to make sure that domain object gets displayed.

Of course, all this is just how I'm feeling about it today. I'm sure I could be persuaded to change my viewpoint fairly easy however.

Happy coding!

Tags: , ,

Less Is More

by October 25, 2008 01:37 PM

As Bob Koss (on Object Mentor) puts it in Size Matters:

classes are just too damn big and methods are just too damn long

I couldn't agree more! I was recently tasked with adding functionality to an existing website at work. The point in the process where I was to add functionality was in the middle of this nasty page where there were multiple panels that were just hidden or shown depending on where you were in this mini process flow. The aspx was over 500 lines long and the code behind over 1,000! I cried inwardly when I realized the desired functionality was right in the middle of this mess. I did some soul searching and analysis and determined that splitting this mess into multiple pages wouldn't take much longer than figuring out how to add to the existing page without breaking anything.

As I was breaking the page into multiple pages, what Jeff Attwood said in Size Is The Enemy really hit home.

Lines of code are, and always have been, the enemy. More lines of code means more to read, more to understand, more to troubleshoot, more to debug.

The code was extremely difficult to decipher. Not only was so much functionality packed into one page, each method was crazy long and overly complex. At the end of this post is one such method along with how I refactored it after moving it into the new page I created for that piece of the puzzle. I hope you'll agree that the refactored version is much easier to read and understand quickly. And that's the key - to add to a code base without breaking it, you have to be able to understand it and I want to do that quickly. When an analyst or a user says "it's only one new button" or some such nonsense, I want them to be right! Nothing frustrates me more than when a task that seems like it should only take a day or two actually takes a week or two and it's because of the unnecessary complexity left by the yahoo coder before you.

A couple of things about the code below. First, don't bother trying to grok the pre-refactored code - it's just there to note. I assure you I tested and the shorter version captures exactly the same thing as the the longer. Second, one might be tempted to say the shorter version is only better if you understand lambdas and LINQ. To that I say, true. But consider this. A new developer coming on who doesn't know them will only have to learn them once and will then be capably of understanding all the rest at first glance. Doing it the old-school looping, if-else, and temp variables way means the poor developer will have to decipher the craziness over and over again. And each and every time is a lot of unnecessary time added to that simple button request.

So, to summarize: Less Is More!! Make every attempt to accomplish your programming tasks in as few lines of code as possible (short of obfuscation of course!). Always keep it in your mind how difficult it would be for the programmer coming along behind you (that coder might be you months or years later).

Happy coding!

 

The code:

Before I refactored:

private int GetDfuCount()
{
   int maxCount = 0;
   ArrayList systemDFUList = new ArrayList();

   foreach (DFULibrary.DFU userItem in GetUserDfuList())
   {
      if (userItem.Exists == false && userItem.Staged == false)
      {
         if (!userItem.BusinessUnit.Equals("CASE READY BEEF-PORK"))
         {
            maxCount++;
         }
         else
         {
            if (userItem.Reservation == true)
            {
               maxCount++;
            }
         }
      }

      foreach (DFULibrary.DFU systemItem in userItem.SystemDfuList)
      {
         if (systemItem.Exists == false && systemItem.Staged == false)
         {
            bool match = false;
            if (!systemItem.BusinessUnit.Equals("CASE READY BEEF-PORK"))
            {
               foreach (DFULibrary.DFU listItem in systemDFUList)
               {

                  if (systemItem.Product == listItem.Product &&
                     systemItem.Location == listItem.Location &&
                     systemItem.DfuLevel == listItem.DfuLevel &&
                     systemItem.SellingGroup == listItem.SellingGroup &&
                     systemItem.DriverLevelLoc == listItem.DriverLevelLoc)
                  {
                     match = true;
                     break;
                  }
               }
               if (!match)
                  systemDFUList.Add(systemItem);
            }
            else
            {
               if (userItem.Reservation == true)
               {
                  foreach (DFULibrary.DFU listItem in systemDFUList)
                  {
                     if (systemItem.Product == listItem.Product &&
                        systemItem.Location == listItem.Location &&
                        systemItem.DfuLevel == listItem.DfuLevel &&
                        systemItem.SellingGroup == listItem.SellingGroup &&
                        systemItem.DriverLevelLoc == listItem.DriverLevelLoc)
                     {
                        match = true;
                        break;
                     }
                  }
                  if (!match)
                     systemDFUList.Add(systemItem);
               }
            }
         }
      }
   }
   return maxCount + systemDFUList.Count;
}

After I refactored:

private int GetDfuCount()
{
   Func<DFULibrary.DFU, bool> userDfusToCount =
      x =>
         !x.Exists && !x.Staged &&
         (x.BusinessUnit != "CASE READY BEEF-PORK" || x.Reservation);

   Func<DFULibrary.DFU, bool> sysDfusToCount =
      x =>
         !x.Exists && !x.Staged &&
         (x.BusinessUnit != "CASE READY BEEF-PORK" || x.UserDfu.Reservation);

   int userDfuCount = UserDfuList.Count(userDfusToCount);
   int sysDfuCount = SystemDfuList.Count(sysDfusToCount);

   return userDfuCount + sysDfuCount;
}

Tags: ,

Project Architecture

by October 4, 2008 06:13 AM

I read and loved the series of posts called The Onion Architecture by Jeffrey Palermo. As he put it:

The main premise is that it controls coupling. The fundamental rule is that all code can depend on layers more central, but code cannot depend on layers further out from the core. In other words, all coupling is toward the center. This architecture is unashamedly biased toward object-oriented programming, and it puts objects before all others.

Sweet! Everything a good architecture astronaut such as myself would love. Well, I thought I’d see how well it performed down here where the oxygen is breathable. I’m currently working on a side project for a buddy at work and thought that it would be the perfect test bed as there isn’t a hard deadline for it.

So, here is our “Agenda Management System”. Basically it’s going to expose city council voting records in new and interesting ways.

VS Solution

This is the actual solution. Two of the projects you see exist in every project I create no matter what the chosen architecture is: _build and Tests. _build contains deploy scripts and all of my project dependencies that aren’t “out of the box”. The Tests project contains all of my unit tests (there aren’t many – yes, I suck at this still). The other three projects make up the “onion”.

image

The most interesting parts of the solution to me are the dependencies. You can see here that Core doesn’t reference anything. That means everything my domain objects and services could possibly need have to live within the Core project. Sounds reasonable until you realize they will at some point need data from the database and to log information. The details of those concerns aren’t “domain” type things of course so they live in the infrastructure project. But Core doesn’t reference Infrastructure! Oh my.

This is where the Dependency Inversion principle comes into play. Basically, anything the domain layer could need that isn’t a “domain” type of concern is split into a contract and implementation. The contract (interface) lives within Core and the implementation lives within Infrastructure. The two are tied together only once at application startup.

Full Solution

Hopefully the image to the left isn’t too tiny to see, but basically there are three players required to make this work smoothly. The class IoC in Core is a static container for the interface of an inversion of control container that will resolve interfaces for me. The InversionOfControlContainer in Infrastructure is the implementation of the container defined in Core. At application startup in the Global.asax, I tie the two together.

So in Global.asax.cs you’ll see this:

protected void Application_Start()
{
IoC.Initialize(new InversionOfControlContainer());

You can probably imagine that all I'm doing in "Initialize" is setting a private static variable that will be referenced throughout the rest of the application’s lifetime. For example, a method on IoC is:

public static T Resolve<T>() { return iocContainer.Resolve<T>(); }

And I use IoC like this:

IoC.Resolve<IUnitOfWork>().Begin();

Of course, note that in this situation, both the interface IUnitOfWork and it's implementation are in Infrastructure, but the use of IoC is the point :)

Well, hopefully that conveys my implementation of The Onion Architecture. I must say, this is something that I’m finding extremely pleasant to work with. Due to the fact that Core doesn’t reference Infrastructure, I’m forced to find solutions strictly in terms of my domain and the repositories. Very cool stuff indeed!

Happy coding!

Tags: , ,