reflection

Mapping DataTables and DataRows to Objects in C# and .NET

My group regularly uses DataSet, DataTable, and DataRow objects in many of our apps.

(What? Don't look at me like that. These apps are old.)

Anyway, we're simultaneously trying to implement good C# and object-oriented programming principles while maintaining these old apps, so we often end up having to map data from a data set to a C# object. We did this enough times that I and a coworker (we'll call her Marlena) decided to sit down and just make up a new mapping system for use with these DataTable and DataRow objects.

As always with my code-based posts, there's a GitHub project with a full working example app, so check that out too!

One Jump Ahead

So here's a basic problem with mapping from DataSet, DataTable, and DataRow objects: we don't know at compile time what columns and tables exist in the set, so mapping solutions like AutoMapper won't work for this scenario. Our mapping system will have to assume what columns exist. But, in order to make it more reusable, we will make the mapping system return default values for any values which it does not locate.

There's also another, more complex problem: the databases we are acquiring our data from use many different column names to represent the same data. 20 years of different maintainers and little in the way of cohesive naming standards will do that do a database. So, if we needed a person's first name, the different databases might use:

  • first_name
  • firstName
  • fname
  • name_first

This, as might be imagined, makes mapping anything rather difficult. So our system will also need to be able to map from many different column names.

Finally, this system wouldn't be worth much if it couldn't handle collections of objects as well as single objects, so we'll need to allow for that as well.

So, in short, our system needs to:

  1. Map from DataTable and DataRow to objects.
  2. Map from multiple different column names.
  3. Handle mapping to a collection of objects as well as a single object.

We'll need several pieces to accomplish this. But before we can even start building the mapping system, we must first acquire some sample data.

Mine, Mine, Mine

We're going to create some DataSet objects that we can test our system against. In the real world, you would use an actual database, but here (for simplicity's sake) we're just going to manually create some DataSet objects. Here's a sample class which will create two DataSet objects, Priests and Ranchers, each of which use different column names for the same data:

public static class DataSetGenerator  
{
    public static DataSet Priests()
    {
        DataTable priestsDataTable = new DataTable();
        priestsDataTable.Columns.Add(new DataColumn()
        {
            ColumnName = "first_name",
            DataType = typeof(string)
        });
        priestsDataTable.Columns.Add(new DataColumn()
        {
            ColumnName = "last_name",
            DataType = typeof(string)
        });
        priestsDataTable.Columns.Add(new DataColumn()
        {
            ColumnName = "dob",
            DataType = typeof(DateTime)
        });
        priestsDataTable.Columns.Add(new DataColumn()
        {
            ColumnName = "job_title",
            DataType = typeof(string)
        });
        priestsDataTable.Columns.Add(new DataColumn()
        {
            ColumnName = "taken_name",
            DataType = typeof(string)
        });
        priestsDataTable.Columns.Add(new DataColumn()
        {
            ColumnName = "is_american",
            DataType = typeof(string)
        });

        priestsDataTable.Rows.Add(new object[] { "Lenny", "Belardo", new DateTime(1971, 3, 24), "Pontiff", "Pius XIII", "yes" });
        priestsDataTable.Rows.Add(new object[] { "Angelo", "Voiello", new DateTime(1952, 11, 18), "Cardinal Secretary of State", "", "no" });
        priestsDataTable.Rows.Add(new object[] { "Michael", "Spencer", new DateTime(1942, 5, 12), "Archbishop of New York", "", "yes" });
        priestsDataTable.Rows.Add(new object[] { "Sofia", "(Unknown)", new DateTime(1974, 7, 2), "Director of Marketing", "", "no" });
        priestsDataTable.Rows.Add(new object[] { "Bernardo", "Gutierrez", new DateTime(1966, 9, 16), "Master of Ceremonies", "", "no" });

        DataSet priestsDataSet = new DataSet();
        priestsDataSet.Tables.Add(priestsDataTable);

        return priestsDataSet;
    }

    public static DataSet Ranchers()
    {
        DataTable ranchersTable = new DataTable();
        ranchersTable.Columns.Add(new DataColumn()
        {
            ColumnName = "firstName",
            DataType = typeof(string)
        });
        ranchersTable.Columns.Add(new DataColumn()
        {
            ColumnName = "lastName",
            DataType = typeof(string)
        });
        ranchersTable.Columns.Add(new DataColumn()
        {
            ColumnName = "dateOfBirth",
            DataType = typeof(DateTime)
        });
        ranchersTable.Columns.Add(new DataColumn()
        {
            ColumnName = "jobTitle",
            DataType = typeof(string)
        });
        ranchersTable.Columns.Add(new DataColumn()
        {
            ColumnName = "nickName",
            DataType = typeof(string)
        });
        ranchersTable.Columns.Add(new DataColumn()
        {
            ColumnName = "isAmerican",
            DataType = typeof(string)
        });

        ranchersTable.Rows.Add(new object[] { "Colt", "Bennett", new DateTime(1987, 1, 15), "Ranchhand", "", "y" });
        ranchersTable.Rows.Add(new object[] { "Jameson", "Bennett", new DateTime(1984, 10, 10), "Ranchhand", "Rooster", "y" });
        ranchersTable.Rows.Add(new object[] { "Beau", "Bennett", new DateTime(1944, 8, 9), "Rancher", "", "y" });
        ranchersTable.Rows.Add(new object[] { "Margaret", "Bennett", new DateTime(1974, 7, 2), "Bar Owner", "Maggie", "y" });
        ranchersTable.Rows.Add(new object[] { "Abigail", "Phillips", new DateTime(1987, 4, 24), "Teacher", "Abby", "y" });

        DataSet ranchersDataSet = new DataSet();
        ranchersDataSet.Tables.Add(ranchersTable);

        return ranchersDataSet;
    }
}

We'll test our system against this sample data.

Something There

Now we can build our actual mapping solution. First off, we need a way to decide what column names map to object properties. It was Marlena's idea to keep those things together, and so we came up with a class called DataNamesAttribute that looks like this:

[AttributeUsage(AttributeTargets.Property)]
public class DataNamesAttribute : Attribute  
{
    protected List<string> _valueNames { get; set; }

    public List<string> ValueNames
    {
        get
        {
            return _valueNames;
        }
        set
        {
            _valueNames = value;
        }
    }

    public DataNamesAttribute()
    {
        _valueNames = new List<string>();
    }

    public DataNamesAttribute(params string[] valueNames)
    {
        _valueNames = valueNames.ToList();
    }
}

This attribute can then be used (in fact, can only be used, due to the AttributeUsage(AttributeTargets.Property) declaration) on properties of other classes. Let's say we're going to map to a Person class. We would use DataNamesAttribute like so:

public class Person  
{
    [DataNames("first_name", "firstName")]
    public string FirstName { get; set; }

    [DataNames("last_name", "lastName")]
    public string LastName { get; set; }

    [DataNames("dob", "dateOfBirth")]
    public DateTime DateOfBirth { get; set; }

    [DataNames("job_title", "jobTitle")]
    public string JobTitle { get; set; }

    [DataNames("taken_name", "nickName")]
    public string TakenName { get; set; }

    [DataNames("is_american", "isAmerican")]
    public bool IsAmerican { get; set; }
}

Now that we know where the data needs to end up, let's start mapping out the mapper (heh).

Reflection

Our mapper class will be a generic class so that we can map from DataTable or DataRow objects to any kind of object. We'll need two methods to get different kinds of data:

public class DataNamesMapper<TEntity> where TEntity : class, new()  
{
    public TEntity Map(DataRow row) { ... }
    public IEnumerable<TEntity> Map(DataTable table) { ... }
}

Let's start with the Map(DataRow row) method. We need to do three things:

  1. Figure out what columns exist in this row.
  2. Determine if the TEntity we are mapping to has any properties with the same name as any of the columns (aka the Data Names) AND
  3. Map the value from the DataRow to the TEntity.

Here's how we do this, using just a bit of reflection:

public TEntity Map(DataRow row)  
{
    //Step 1 - Get the Column Names
    var columnNames = row.Table.Columns
                               .Cast<DataColumn>()
                               .Select(x => x.ColumnName)
                               .ToList();

    //Step 2 - Get the Property Data Names
    var properties = (typeof(TEntity)).GetProperties()
                                      .Where(x => x.GetCustomAttributes(typeof(DataNamesAttribute), true).Any())
                                      .ToList();

    //Step 3 - Map the data
    TEntity entity = new TEntity();
    foreach (var prop in properties)
    {
        PropertyMapHelper.Map(typeof(TEntity), row, prop, entity);
    }

    return entity;
}

Of course, we also need to handle the other method, the one where we can get a collection of TEntity:

public IEnumerable<TEntity> Map(DataTable table)  
{
    //Step 1 - Get the Column Names
    var columnNames = table.Columns.Cast<DataColumn>().Select(x => x.ColumnName).ToList();

    //Step 2 - Get the Property Data Names
    var properties = (typeof(TEntity)).GetProperties()
                                        .Where(x => x.GetCustomAttributes(typeof(DataNamesAttribute), true).Any())
                                        .ToList();

    //Step 3 - Map the data
    List<TEntity> entities = new List<TEntity>();
    foreach (DataRow row in table.Rows)
    {
        TEntity entity = new TEntity();
        foreach (var prop in properties)
        {
            PropertyMapHelper.Map(typeof(TEntity), row, prop, entity);
        }
        entities.Add(entity);
    }

    return entities;
}

You might be wondering just what the heck the PropertyMapHelper class is. If you are, you might also be about to regret it.

Dig a Little Deeper

The PropertyMapHelper, as suggested by the name, maps values to different primitive types (int, string, DateTime, etc.). Here's that Map() method we saw earlier:

public static void Map(Type type, DataRow row, PropertyInfo prop, object entity)  
{
    List<string> columnNames = AttributeHelper.GetDataNames(type, prop.Name);

    foreach (var columnName in columnNames)
    {
        if (!String.IsNullOrWhiteSpace(columnName) && row.Table.Columns.Contains(columnName))
        {
            var propertyValue = row[columnName];
            if (propertyValue != DBNull.Value)
            {
                ParsePrimitive(prop, entity, row[columnName]);
                break;
            }
        }
    }
}

There are two pieces in this method that we haven't defined yet: the AttributeHelper class and the ParsePrimitive() method. AttributeHelper is a rather simple class that merely gets the list of column names from the DataNamesAttribute:

public static List<string> GetDataNames(Type type, string propertyName)  
{
    var property = type.GetProperty(propertyName).GetCustomAttributes(false).Where(x => x.GetType().Name == "DataNamesAttribute").FirstOrDefault();
    if (property != null)
    {
        return ((DataNamesAttribute)property).ValueNames;
    }
    return new List<string>();
}

The other we need to define in ParsePrimitive(), which as its name suggests will parse the values into primitive types. Essentially what this class does is assign a value to a passed-in property reference (represented by the PropertyInfo class). I'm not going to post the full code on this post (you can see it over on GitHub), so here's a snippet of what this method does:

private static void ParsePrimitive(PropertyInfo prop, object entity, object value)  
{
    if (prop.PropertyType == typeof(string))
    {
        prop.SetValue(entity, value.ToString().Trim(), null);
    }
    else if (prop.PropertyType == typeof(int) || prop.PropertyType == typeof(int?))
    {
        if (value == null)
        {
            prop.SetValue(entity, null, null);
        }
        else
        {
            prop.SetValue(entity, int.Parse(value.ToString()), null);
        }
    }
    ...
}

That's the bottom of the rabbit hole, as it were. Now, we can use the DataSet objects we created earlier and our mapping system to see if we can map this data correctly.

Two Worlds

Here's a quick program that can test our new mapping system:

class Program  
{
    static void Main(string[] args)
    {
        var priestsDataSet = DataSetGenerator.Priests();
        DataNamesMapper<Person> mapper = new DataNamesMapper<Person>();
        List<Person> persons = mapper.Map(priestsDataSet.Tables[0]).ToList();

        var ranchersDataSet = DataSetGenerator.Ranchers();
        persons.AddRange(mapper.Map(ranchersDataSet.Tables[0]));

        foreach (var person in persons)
        {
            Console.WriteLine("First Name: " + person.FirstName + ", Last Name: " + person.LastName
                                + ", Date of Birth: " + person.DateOfBirth.ToShortDateString()
                                + ", Job Title: " + person.JobTitle + ", Nickname: " + person.TakenName
                                + ", Is American: " + person.IsAmerican);
        }

        Console.ReadLine();
    }
}

When we run this app (which you can do too), we will get the following output:

Which is exactly what we want!

(I mean, really, did you expect me to blog about something that didn't work?)

Go the Distance

It concerns me that this system is overly complicated, and I'd happily take suggestions on how to make it more straightforward. While I do like how all we need to do is place the DataNamesAttribute on the correct properties and then call an instance of DataNamesMapper<T>, I feel like the whole thing could be easier somehow. Believe it or not, this version is actually simpler than the one we're actually using in our internal apps.

Also, check out the sample project over on GitHub, fork it, test it, whatever. If it helped you out, or if you can improve it, let me know in the comments!

Finally, extra special bonus points will go to anyone who can figure out a) what the hell those odd section titles are about and b) where I got the sample data from.

Happy Coding!

Object Change Tracking via Reflection in .NET

We have this big project we're working on (which I have written about before) and one of the things we need to do on this project is automatic logging of changes made to model objects. I've worked out a way to do this generically, for any object, and I think others might find it just as useful as we have.

Requirements

We needed to be able to compare two objects of the same type, examine their properties, and log which properties have changed value. Further, we needed to be able to mark which property represents the primary key for the instance of the object (which is not expected to change value), and we needed to be able to specify fields whose changes won't be logged. Finally, we wanted to do this generically; for any two objects of the same type, we should be able to compare them and log changes.

In short, we needed this log generator to be:

  • Able to compare two objects of any type, provided they are the same type as each other
  • Able to exclude certain properties
  • Able to determine the primary key field and record that value

Let's set up an environment where we can achieve these goals.

Program Setup

Let's say we have a model object that looks like this:

public class RetailLocation  
{
    public int Id { get; set; }
    public DateTime DateOpened { get; set; }
    public string ManagerFirstName { get; set; }
    public string ManagerLastName { get; set; }
    public bool HasLimitedMenu { get; set; }
}

Every time a change is made to this object, we want to automatically record what changes occurred.

Further, we also have the log object itself:

public class ChangeLog  
{
    public string ClassName { get; set; }
    public string PropertyName { get; set; }
    public int PrimaryKey { get; set; }
    public string OldValue { get; set; }
    public string NewValue { get; set; }
    public DateTime DateChanged { get; set; }
}

What this means is that if multiple properties change (between two objects) then we'll get multiple records of ChangeLog.

Let's set up a command-line program that will test this.

class Program  
{
    static void Main(string[] args)
    {
        //Create the old entry
        RetailLocation oldLocation = new RetailLocation()
        {
            Id = 1,
            DateOpened = new DateTime(2009, 12, 3),
            ManagerFirstName = "Steve",
            ManagerLastName = "Harkonnen",
            HasLimitedMenu = true
        };

        //Create the new entry
        RetailLocation newLocation = new RetailLocation()
        {
            Id = 1,
            DateOpened = new DateTime(2009, 12, 3),
            ManagerFirstName = "Kelly",
            ManagerLastName = "Nishimura",
            HasLimitedMenu = false
        };

        ChangeLogService service = new ChangeLogService();
        List<ChangeLog> logs = service.GetChanges(oldLocation, newLocation); //What does this do?

        foreach(var log in logs)
        {
            Console.WriteLine("Primary Key: " + log.PrimaryKey.ToString() + ", Class Name:" + log.ClassName + ", Property Name: " + log.PropertyName + ", Old Value = " + log.OldValue + ", New Value = " + log.NewValue);
        }
    }
}

Whenever we instantiate ChangeLogService and call GetChanges, we should get back a List of ChangeLogs that will contain all the changes found between the two objects. But exactly how can we get these changes?

Making the Comparison

Let's start with our GetChanges method declaration:

public List<ChangeLog> GetChanges(object oldEntry, object newEntry)  
{
    List<ChangeLog> logs = new List<ChangeLog>();
}

We'll need several steps to get all of the changes.

Step 1: Do the Types Match?

First off, we won't compare objects if they are not of the same type. So let's check the type of the two objects and return a blank list of changes if the types are different.

var oldType = oldEntry.GetType();  
var newType = newEntry.GetType();  
if(oldType != newType)  
{
    return logs; //Types don't match, cannot log changes
}

Step 2: Find the Primary Key Property

Now we come to the first real issue we have to solve: we need to record the primary key value for the record that changed.

We can make the assumption that the object will have at least one of those fields represent a primary key value. (In the real world, of course, you could have multiple fields represent the key, but in this tutorial we're assuming that the database has a single-column primary key for all tables). But, even with this assumption, how can we identify which property in the object (i.e. which column in the database) is that primary key?

We will use an attribute:

public class LoggingPrimaryKeyAttribute : Attribute  
{
}

We can then use this to decorate the RetailLocation object from earlier:

public class RetailLocation  
{
    [LoggingPrimaryKey]
    public int Id { get; set; }
    public DateTime DateOpened { get; set; }
    public string ManagerFirstName { get; set; }
    public string ManagerLastName { get; set; }
    public bool HasLimitedMenu { get; set; }
}

Step 3: Initialize Change Log Data and Get All Properties

We now need some data about each change log that we're going to create. Specifically, we need the primary key value, the date changed, and the type name. We're using the LoggingPrimaryKeyAttribute to get the first item, and the other two can be gathered from other data, like so:

var oldProperties = oldType.GetProperties();  
var newProperties = newType.GetProperties();

var dateChanged = DateTime.Now;  
var primaryKey = (int)oldProperties.Where(x => Attribute.IsDefined(x, typeof(LoggingPrimaryKeyAttribute))).First().GetValue(oldEntry);  
var className = oldEntry.GetType().Name;  

Notice the call to Attribute.IsDefined. This returns true if the given property (represented by x in this call) has an attribute of the given type defined upon it. This is how we can check if a particular attribute is defined on a particular property (and we'll be using this again in just a bit).

Better still, we now have lists of properties from the two changed objects. We can use these lists to determine which of their properties have changed.

Step 4: Ignore Specified Properties

We have now reached the step where we can compare the properties of the two objects. However, we first need a way to exclude the properties that we don't want to log changes for. To do this, we create another attribute:

public class IgnoreLoggingAttribute : Attribute  
{
}

In our sample, we'll pretend that we don't want to record changes made to the Manager's First Name, so we'll decorate RetailLocation like so:

public class RetailLocation  
{
    [LoggingPrimaryKey]
    public int Id { get; set; }
    public DateTime DateOpened { get; set; }
    [IgnoreLogging]
    public string ManagerFirstName { get; set; }
    public string ManagerLastName { get; set; }
    public bool HasLimitedMenu { get; set; }

}

Step 5: Compare the Properties

We want to walk through each of the properties on one of the objects and to see if the other object's value for that property has changed.

foreach(var oldProperty in oldProperties)  
{

}

Inside this foreach loop, we check to see if the new instance has the property:

foreach(var oldProperty in oldProperties)  
{
    var matchingProperty = newProperties.Where(x => !Attribute.IsDefined(x, typeof(IgnoreLoggingAttribute)) 
                                                    && x.Name == oldProperty.Name 
                                                    && x.PropertyType == oldProperty.PropertyType)
                                        .FirstOrDefault();
    if(matchingProperty == null)
    {
        continue; //If we don't find a matching property, move on to the next property.
    }
}

Once again we have a call to Attribute.IsDefined, only this time we want the properties that don't have the attribute [IgnoreLogging] defined upon them.

Finally, we need to compare the values of the properties and, if they are different, create a new entry of ChangeLog to record the differences. You might think about doing this:

var oldValue = oldProperty.GetValue(oldEntry);  
var newValue = matchingProperty.GetValue(newEntry);  
if(matchingProperty != null && oldValue != newValue) { //Create ChangeLog }  

However, if you do this, you'll find that the IF clause is always true. In other words, oldValue will always not be equal to newValue. This is because, in C# the == and != operators perform a reference value check on the objects being compared, which will never be equal (see Jon Skeet's answer on StackOverflow). Instead, what we need to do is change the values to string to directly compare them:

var oldValue = oldProperty.GetValue(oldEntry).ToString();  
var newValue = matchingProperty.GetValue(newEntry).ToString();  
if(matchingProperty != null && oldValue != newValue) { //Create ChangeLog }  

The Complete Method

Here is the complete code for the GetChanges method:

public List<ChangeLog> GetChanges(object oldEntry, object newEntry)  
{
    List<ChangeLog> logs = new List<ChangeLog>();

    var oldType = oldEntry.GetType();
    var newType = newEntry.GetType();
    if(oldType != newType)
    {
        return logs; //Types don't match, cannot log changes
    }

    var oldProperties = oldType.GetProperties();
    var newProperties = newType.GetProperties();

    var dateChanged = DateTime.Now;
    var primaryKey = (int)oldProperties.Where(x => Attribute.IsDefined(x, typeof(LoggingPrimaryKeyAttribute))).First().GetValue(oldEntry);
    var className = oldEntry.GetType().Name;

    foreach(var oldProperty in oldProperties)
    {
        var matchingProperty = newProperties.Where(x => !Attribute.IsDefined(x, typeof(IgnoreLoggingAttribute)) 
                                                        && x.Name == oldProperty.Name 
                                                        && x.PropertyType == oldProperty.PropertyType)
                                            .FirstOrDefault();
        if(matchingProperty == null)
        {
            continue;
        }
        var oldValue = oldProperty.GetValue(oldEntry).ToString();
        var newValue = matchingProperty.GetValue(newEntry).ToString();
        if(matchingProperty != null && oldValue != newValue)
        {
            logs.Add(new ChangeLog()
            {
                PrimaryKey = primaryKey,
                DateChanged = dateChanged,
                ClassName = className,
                PropertyName = matchingProperty.Name,
                OldValue = oldProperty.GetValue(oldEntry).ToString(),
                NewValue = matchingProperty.GetValue(newEntry).ToString()
            });
        }
    }

    return logs;
}

The completed method fulfills all of our requirements:

  • It will get and record the primary key value.
  • It will ignore specified columns.
  • It will compare the properties of two objects of the same type and return the properties that have changed values.

Check out my sample project on Github, which has all of the code we've written in this post. As always, if you notice something I could've done better with this code, let me know in the comments!

Happy Coding!

Loading Entity Framework Related Entities with AutoMapper and Reflection

In a previous post I mentioned how we would be using AutoMapper to create Data-Transfer Objects out of Entity Framework entities; the DTOs would then be transmitted over a WCF service. Well, as often happens, a new requirement surfaced recently, and the solution we came up with turned out to be pretty useful, so now I'm sharing it with you all.

The Problem

The issue we had was this: because this app needs to keep performance in mind (and because I really, really hate clutter), we wanted to include a way for the system calling our service to specify if they wanted any additional entities related to the entities they asked for initially.

Let's set this up with a diagram:

A diagram of the Leagues database, showing that a League has many Teams, and a Team has many Players

So, a League has many Teams, and a Team has many players.

If the service requested to get a Team for a specific ID, we would return a Data-Transfer Object for the Team class that looks something like this:

[DataContract]
public class TeamDTO  
{
    [DataMember]
    public int Id { get; set; }

    [DataMember]
    public int LeagueId { get; set; }

    [DataMember]
    public string Name { get; set; }

    [DataMember]
    public DateTime FoundingDate { get; set; }

}

But what if the user also wanted the players on that team? We could make them do another call to the service, but then they are making two calls when they could just be making one. Plus, the calling user won't always want the players for the specific team, so we needed a way for them to "opt-in" to selecting the players in addition to the team object.

So, we had two competing problems:

  1. The user will want the related entities sometimes, but not always.
  2. The user does not want to have to make multiple calls to the service to retrieve related data.

That's the issue we tackled. How did we do it? Read on to find out!

Getting Active Navigation Properties

Before we started writing any code at all, we decided that for any query against an entity, we would only load "immediately-related" entities. That means that if the user queried for one or more Leagues, s/he could get the related Teams, but not those Teams' players (at least, not during the same call). We figured that if they really want that data, they'll have to make another call.

For reference, we were using Entity Framework Code First for our data layer, and our Team model class looked like this:

[Table("Team")]
public partial class Team  
{
    public Team()
    {
        Players = new HashSet<Player>();
    }

    public int Id { get; set; }

    public int LeagueID { get; set; }

    [Required]
    [StringLength(200)]
    public string Name { get; set; }

    public DateTime FoundingDate { get; set; }

    public virtual League League { get; set; }

    public virtual ICollection<Player> Players { get; set; }
}

We also modified our TeamDTO data-transfer object to include the optional Players and League, as well as a TeamOptions contract that would let the user choose if they wanted the related entities:

[DataContract]
public class TeamDTO  
{
    [DataMember]
    public int Id { get; set; }

    [DataMember]
    public string Name { get; set; }

    [DataMember]
    public DateTime FoundingDate { get; set; }

    [DataMember]
    public List<PlayerDTO> Players { get; set; }

    [DataMember]
    public LeagueDTO League { get; set; }
}

[DataContract]
public class TeamOptions  
{
    [DataMember]
    public bool IncludePlayers { get; set; }

    [DataMember]
    public bool IncludeLeague { get; set; }
}

The problem was, how would we translate the boolean values of the properties of TeamOptions into statements we could actually use in LINQ-to-Entities? Reflection turned out to be the answer.

First, we created an attribute called NavigationPropertyAttribute, which allowed us to decorate the properties in TeamOptions with the name of the navigation property they corresponded to:

[AttributeUsage(AttributeTargets.Property)]
public class NavigationPropertyAttribute : Attribute  
{
    protected string _navigationPropertyName { get; set; }

    public string NavigationPropertyName
    {
        get
        {
            return _navigationPropertyName;
        }
        set
        {
            _navigationPropertyName = value;
        }
    }

    public NavigationPropertyAttribute()
    {
        _navigationPropertyName = string.Empty;
    }

    public NavigationPropertyAttribute(string name)
    {
        _navigationPropertyName = name;
    }
}

We also needed an OptionsBase class that uses Reflection to return a list of the Navigation Property names attached to option properties that are active:

public abstract class OptionsBase  
{
    public List<string> GetSelectedOptions()
    {
        List<string> names = new List<string>();
        var type = GetType();
        var properties = type.GetProperties(BindingFlags.Public | BindingFlags.Instance);
        foreach(PropertyInfo info in properties)
        {
            bool value = (bool)info.GetValue(this);
            if (info.PropertyType == typeof(bool) && value)
            {
                NavigationPropertyAttribute attribute = (NavigationPropertyAttribute)info.GetCustomAttribute(typeof(NavigationPropertyAttribute));
                names.Add(attribute.NavigationPropertyName);
            }
        }
        return names;
    }
}

That enabled us to modify TeamOptions like so:

[DataContract]
public class TeamOptions : OptionsBase  
{
    [DataMember]
    [NavigationProperty("Players")]
    public bool IncludePlayers { get; set; }

    [DataMember]
    [NavigationProperty("League")]
    public bool IncludeLeague { get; set; }
}

If we had an instance of TeamOptions and wanted to get a list of the property names, we could do this:

var options = new TeamOptions()  
{
    IncludePlayers = true,
    IncludeLeague = true
}

var navProperties = options.GetSelectedOptions(); //returns new List<string>(){"Players", "League"}  

Now that we knew which navigation properties we needed to include, we wanted a way to do this simply, preferably in one line of code. That's where this extension method comes in:

public static IQueryable<T> AddIncludeStatements<T>(this IQueryable<T> query, OptionsBase options)  
{
    var names = options.GetSelectedOptions();
    var dbQuery = (DbQuery<T>)query;
    foreach (var name in names)
    {
        dbQuery = dbQuery.Include(name);
    }
    return dbQuery;
}

Let's break this method down:

  • An IQueryable<> is a representation of a query. In this case, the extension method is taking a query that it is going to modify as a this parameter.
  • A DbQuery<> is a non-generic instance of a query. Because we cannot use the Include() method on an IQueryable<>, we needed to cast to DbQuery<>

What this method does is take a LINQ-to-Entities query, insert the needed include statements into said query, then return the modified query that can be executed elsewhere.

We use it like this (this is our TeamService class, which provides data access for the Team objects):

public class TeamService  
{
    private LeaguesDataContext _context;

    public TeamService()
    {
        _context = new LeaguesDataContext();
    }

    public TeamDTO GetByID(int id, TeamOptions options)
    {
        return Mapper.Map<TeamDTO>(_context.Teams.Where(x => x.Id == id).AddIncludeStatements(options).First());
    }
}

Woohoo! We're done, now we just pull up WCFTestClient to test this and we'll be ready to rock.....

An error message from WCFTestClient, saying a generic error occurred.

Crap. That's not good. What went wrong?

Tracking Down the Bug

I may be an experienced bug hunter, but this thing stymied me. The error message didn't give any useful details, it just said that a "connection was forcibly closed by the remote host." That's as helpful as a slap to the face.

Eventually, though, I discovered the problem. I'm gonna lay out most of the code (call to create AutoMapper maps excluded); see if you see the issue.

[DataContract]
public class TeamDTO  
{
    [DataMember]
    public int Id { get; set; }

    [DataMember]
    public string Name { get; set; }

    [DataMember]
    public DateTime FoundingDate { get; set; }

    [DataMember]
    public List<PlayerDTO> Players { get; set; }

    [DataMember]
    public LeagueDTO League { get; set; }
}

[DataContract]
public class PlayerDTO : Player  
{
    [DataMember]
    public int Id { get; set; }

    [DataMember]
    public int TeamId { get; set; }

    [DataMember]
    public string FirstName { get; set; }

    [DataMember]
    public string LastName { get; set; }

    [DataMember]
    public DateTime DateOfBirth { get; set; }

    [DataMember]
    public TeamDTO Team { get; set; }
}

public static class AutoMapperConfiguration  
{
    public static void Configure()
    {
        Mapper.CreateMap<EntityFrameworkModel.League, Contracts.DataContracts.LeagueDTO>();
        Mapper.CreateMap<EntityFrameworkModel.Team, Contracts.DataContracts.TeamDTO>();
        Mapper.CreateMap<EntityFrameworkModel.Player, Contracts.DataContracts.PlayerDTO>();
    }
}

public class TeamService  
{
    private LeaguesDataContext _context;

    public TeamService()
    {
        _context = new LeaguesDataContext();
    }

    public TeamDTO GetByID(int id, TeamOptions options)
    {
        return Mapper.Map<TeamDTO>(_context.Teams.Where(x => x.Id == id).AddIncludeStatements(options).First());
    }
}

See the issue? I sure didn't; not the first ten times, anyway.

The problem was that we defined a circular reference. See how PlayerDTO and TeamDTO each have properties of the other's type? When AutoMapper encountered those properties, it attempted to map them by reading the corresponding EF navigation properties, and EF promptly loaded that data from the database. This caused an infinite loop of loading (load them Team, get the related Players, get their related Teams, get those Teams' related Players, etc.) and eventually the system crashed.

Well, that sucks. I had been hoping to keep the structure nice and clean, but this infinite loop wasn't going to let me. I had to find some way to eliminate the circular references.

We did this by creating two levels of DTOs. Here's the two classes for Team:

namespace WCFNavigationPropertyMapping.Contracts.DataContracts  
{
    [DataContract]
    public class Team
    {
        [DataMember]
        public int Id { get; set; }

        [DataMember]
        public string Name { get; set; }

        [DataMember]
        public DateTime FoundingDate { get; set; }
    }

    [DataContract]
    public class TeamDTO : Team
    {
        [DataMember]
        public League League { get; set; }

        [DataMember]
        public List<Player> Players { get; set; }
    }
}

Note that in TeamDTO, the reference for the two NavigationProperties are for the basic contract classes, not the complete DTO objects. This design will prevent the circular-reference infinite-loop errors. Now if we run this in WCF Test Client, it behaves exactly like we want it to.

Summary

This design allows us to:

  1. Let the user decide which related entities s/he wants on a given call.
  2. Let the developers use concise, clear code to describe what the LINQ-to-Entities queries should be doing.

I'm pretty satisfied with this solution; we'll see how it holds up under load. I have, however, noticed a couple of drawbacks:

  • I don't like needing two data contracts classes. I think there ought to be a cleaner way of designing this, but it's escaping me at the moment.
  • Reflection and Entity Framework are both reputed (even by me) to be slow compared to other equivalent solutions, so I'm really interested to see if my developer-friendly solution can pull its weight in a production environment.

There's a sample project that demos this solution over on GitHub. I'd be happy to hear any suggestions you dear readers have about how to improve this design.

Happy Coding!

Using .NET Reflection to Map DataTables to Strongly-Typed Models

We're in the process of converting an old ASP.NET WebForms application to MVC. For this particular project, we are accessing an old database, one that doesn't really play well with Entity Framework or other ORMs, so we're using ADO.NET statements to get the data back in the form of DataTables.

DataTables, however, are really difficult to work with in MVC, because:

  • They're not strongly typed; the type of any given object in a DataRow's ItemArray is object.
  • We can't do validation on individual data pieces because of the no-type issue.
  • The tables can contain any number of rows.

This bothered me for quite a while, as I really couldn't mesh the philosophies of MVC and DataTables together in any meaningful way. The lack of strong-typing was the most egregious issue; I was reasonably sure that the value of a given column in a given row was of a particular type, but I couldn't use the benefits of that type.

Taking some inspiration from AutoMapper, we decided to create our own mapping engine that would take those DataTables and map them into collections of strongly-typed objects. I've taken those ideas and distilled them down into a sample project. This post details how we ended up creating this mapper class, and how well it worked (spoiler alert: pretty damn well).

The Problem

We had old procedures in our database that we still had to use for this rewrite. Those procedures were written in our organization's cowboy coding days, and as such had no real consistency in their naming and structure. Plus, there were a lot of them (100+). So, if we were going to build this mapping engine, we needed a generic solution that we could use for a lot of different types and source data.

To further complicate matters, the procedures' structure and naming weren't predictable at all. For example, we had two different procedures, each of which returned data representing a collection of users for this application, but each of which was named entirely differently. So we had to account for the possibility that two columns in different procedure calls would have different names but represent the same data.

Finally, we wanted the ability to not have to explicitly map each property to a column. Implicit in this requirement is the ability to determine the type and parse the source data accordingly.

In short, we needed a mapping engine that:

  • Was generic and extensible
  • Could handle multiple column names
  • Could discover the type of the destination property and parse the source data

The Model Class

Here's the model class we will be using:

public class User  
{
    public int ID { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public DateTime? DateOfBirth { get; set; }
    public decimal CashOnHand { get; set; }
}

SourceNamesAttribute

The first part of the problem we had to solve was this: How does the code determine which columns in a DataTable map to which properties? We decided that an attribute was the best solution for this, and created the SourceNamesAttribute:

[AttributeUsage(AttributeTargets.Property)]
public class SourceNamesAttribute : Attribute  
{
    protected List<string> _columnNames { get; set; }
    public List<string> ColumnNames
    {
        get
        {
            return _columnNames;
        }
        set
        {
            _columnNames = value;
        }
    }

    public SourceNamesAttribute()
    {
        _columnNames = new List<string>();
    }

    public SourceNamesAttribute(params string[] columnNames)
    {
        _columnNames = columnNames.ToList();
    }
}

The AttributeUsage attribute allows us to restrict our SourceNamesAttribute to only be used on properties of a class, not classes themselves.

SourceNamesAttribute stores a list of column names for each property, so it supports our requirement that the mapping be tolerant of multiple column names. Now we can decorate the properties of User with this attribute:

public class User  
{
    [SourceNames("user_id", "userId")]
    public int ID { get; set; }

    [SourceNames("first_name", "first")]
    public string FirstName { get; set; }

    [SourceNames("last_name", "last")]
    public string LastName { get; set; }

    [SourceNames("date_of_birth", "birthDate")]
    public DateTime? DateOfBirth { get; set; }

    [SourceNames("cash_on_hand", "cash", "cashOnHand")]
    public decimal CashOnHand { get; set; }
}

A Generic Mapper Interface

For simplicity, we'll assume we only want to work with DataTables for now. The interface for our Mapper class looks like this:

public interface IMapper<T>  
{
    List<T> Map(DataTable table);
}

And our implementation looks like this:

public class Mapper<TEntity> where TEntity : class, new()  
{
    public List<TEntity> Map(DataTable table)
    {
    }
}

We needed to specify that TEntity must be a class, and will be newly created by this Mapper<>.

We still need to fill in that Map() function, but to do that we need some parts.

Getting the Source Names

We need a static method that we can use to get the source names for a given property:

public static class MappingHelper  
{
    private static List<string> GetSourceNames(Type type, string propertyName)
    {
        var property = type.GetProperty(propertyName).GetCustomAttributes(false).Where(x => x.GetType() == typeof(SourceNamesAttribute)).FirstOrDefault();
        if (property != null)
        {
            return ((SourceNamesAttribute)property).ColumnNames;
        }
        return new List<string>();
    }
}

The first parameter is the type of the destination model. We use propertyName to get the property of the object via Reflection, and then look for a SourceNamesAttribute on that property; if such an attribute exists, we return the source names.

Since we have proven that we can get the PropertyInfo for a given property, we can now tackle the other end of this problem: how to determine the property's type and parse accordingly.

Parsing a Given Property's Value

Here's the rest of Mapper and MappingHelper:

public class Mapper<TEntity> where TEntity : class, new()  
{
...
    public void Map(Type type, DataRow row, PropertyInfo prop, object entity)
    {
        List<string> columnNames = MappingHelper.GetSourceNames(type, prop.Name);
        //Handle .NET Primitives and Structs (e.g. DateTime) here.
        foreach (var columnName in columnNames)
        {
            if (!String.IsNullOrWhiteSpace(columnName) && row.Table.Columns.Contains(columnName))
            {
                var propertyValue = row[columnName];
                if (propertyValue != DBNull.Value)
                {
                    MappingHelper.ParsePrimitive(prop, entity, row[columnName]);
                    break; //Assumes that the first matching column contains the source data
                }
            }
        }
    }
}

public static class MappingHelper  
{
    ...
    public static void Map(Type type, DataRow row, PropertyInfo prop, object entity)
    {
        List<string> columnNames = GetSourceNames(type, prop.Name);
        foreach (var columnName in columnNames)
        {
            if (!String.IsNullOrWhiteSpace(columnName) && row.Table.Columns.Contains(columnName))
            {
                var propertyValue = row[columnName];
                if (propertyValue != DBNull.Value)
                {
                    ParsePrimitive(prop, entity, row[columnName]);
                    break; //Assumes that the first matching column contains the source data
                }
            }
        }
    }

    private static void ParsePrimitive(PropertyInfo prop, object entity, object value)
    {
        if (prop.PropertyType == typeof(string))
        {
            prop.SetValue(entity, value.ToString().Trim(), null);
        }
        else if (prop.PropertyType == typeof(int) || prop.PropertyType == typeof(int?))
        {
            if (value == null)
            {
                prop.SetValue(entity, null, null);
            }
            else
            {
                prop.SetValue(entity, int.Parse(value.ToString()), null);
            }
        }
        else if (prop.PropertyType == typeof(DateTime) || prop.PropertyType == typeof(Nullable<DateTime>))
        {
            DateTime date;
            bool isValid = DateTime.TryParse(value.ToString(), out date);
            if (isValid)
            {
                prop.SetValue(entity, date, null);
            }
            else
            {
                //Making an assumption here about the format of dates in the source data.
                isValid = DateTime.TryParseExact(value.ToString(), "yyyy-MM-dd", new CultureInfo("en-US"), DateTimeStyles.AssumeLocal, out date);
                if (isValid)
                {
                    prop.SetValue(entity, date, null);
                }
            }
        }
        else if (prop.PropertyType == typeof(decimal))
        {
            prop.SetValue(entity, decimal.Parse(value.ToString()), null);
        }
        else if (prop.PropertyType == typeof(double) || prop.PropertyType == typeof(double?))
        {
            double number;
            bool isValid = double.TryParse(value.ToString(), out number);
            if (isValid)
            {
                prop.SetValue(entity, double.Parse(value.ToString()), null);
            }
        }
    }
}

The Mapper.Map() function takes four inputs: the Type of the destination property, a PropertyInfo instance, the source DataRow, and destination object ("entity"). It grabs the source column names and cycles through them, assuming that the first one it finds in the DataRow is the one we want, and then takes that value out of the row and attempts to parse it using ParsePrimitive().

ParsePrimitive() is only set up at the moment to parse the types we need for this example, so feel free to add you own parsing scenarios.

Completing the Mapper

With all of this in place, we can finally complete Mapper<T>.Map():

public List<TEntity> Map(DataTable table)  
{
    List<TEntity> entities = new List<TEntity>();
    var columnNames = table.Columns.Cast<DataColumn>().Select(x => x.ColumnName).ToList();
    var properties = (typeof(TEntity)).GetProperties()
                                      .Where(x => x.GetCustomAttributes(typeof(SourceNamesAttribute), true).Any())
                                      .ToList(); //Only get properties that have the SourceNamesAttribute
    foreach (DataRow row in table.Rows) //Every DataRow represents a new T object
    {
        TEntity entity = new TEntity();
        foreach (var prop in properties)
        {
            MappingHelper.Map(typeof(TEntity), row, prop, entity);
        }
        entities.Add(entity);
    }
    return entities;
}

Now, our controller actions in the MVC project look like this:

[HttpGet]
public ActionResult AllUsers()  
{
    DataTable firstTable = DataTableGenerator.GetFirstTable();
    DataTable secondTable = DataTableGenerator.GetSecondTable();
    Mapper<User> mapper = new Mapper<User>();
    List<User> users = mapper.Map(firstTable);
    users.AddRange(mapper.Map(secondTable));
    return View(users);
}

Whew! That was a lot of steps. But now we have a generic, custom mapper that takes DataTables and maps them into strongly-typed objects, which are much easier to work with in MVC.

You can grab a sample project on GitHub that has all of this code and some simple usage scenarios.

Happy Coding!