Sunday, June 13, 2010

Practical MongoDB Part 3: Fine Tuning

In Part 1 of this series I briefly discussed setting up MongoDB to run as a service. In Part 2 I covered data access objects. In this installment I'd like to touch on embedded documents before reviewing a few configuration changes you should use to improve performance.


Embedded Documents

Few real-world domain objects consist entirely of primitive members. Most contain reference types, or collections of primitives or reference types. Mongo supports these too.

Let's modify our User class from Part 2. In addition to the email and password fields, we'd like to store a collection of UserStat objects:
public class User : ICollectable
{
    public User()
    {
        Stats = new List<UserStat>();
    }

    ...

    public List<UserStat> Stats { get; private set; }
}

public class UserStat
{
    public DateTime Posted { get; internal set; }
    ...
}
You may notice that UserStat does not implement ICollectable. This is because ICollectable is intended only for classes that map to a Collection of their own. When we save this new User object to Mongo, there will still be only one Collection: User. A UserStat exists only inside the context of a User and thus does not require a Collection of its own.
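To picture the nesting, here is a minimal sketch that serializes a user to JSON. It is an illustration only: the User and UserStat classes below are trimmed stand-ins for the ones above, System.Text.Json is used purely for display, and the EmbeddedShape helper is hypothetical. Mongo itself stores BSON and adds its own identifier field, so the real document will differ in detail.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Trimmed stand-ins for the article's classes (assumption: only these members matter here).
public class UserStat
{
    public DateTime Posted { get; set; }
}

public class User
{
    public string Email { get; set; }
    public List<UserStat> Stats { get; } = new List<UserStat>();
}

public static class EmbeddedShape
{
    // Serializes a user to indented JSON purely to visualize the nesting.
    // This is not how NoRM writes to Mongo; it only mirrors the document shape.
    public static string ToJson(User user)
    {
        return JsonSerializer.Serialize(user, new JsonSerializerOptions { WriteIndented = true });
    }

    public static void Main()
    {
        var user = new User { Email = "email" };
        user.Stats.Add(new UserStat { Posted = DateTime.UtcNow.AddMinutes(5) });
        user.Stats.Add(new UserStat { Posted = DateTime.UtcNow.AddMinutes(10) });

        // Prints one object whose "Stats" member is an array of nested objects.
        Console.WriteLine(ToJson(user));
    }
}
```

The point to take away is that the stats travel inside the user document as an array, rather than living in a separate collection joined by key.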
Let's go ahead and write a test for saving a UserStat.
public void Can_Save_User_With_Stats()
{
    UserStat stat1 = new UserStat();
    stat1.Posted = DateTime.UtcNow.AddMinutes(5);
    UserStat stat2 = new UserStat();
    stat2.Posted = DateTime.UtcNow.AddMinutes(10);

    User test = new User();
    test.Email = "email";
    test.Stats.Add(stat1);
    test.Stats.Add(stat2);

    using (var dao = new UserDao())
    {
        dao.Insert(test);

        User found = dao.Get(test.Email);

        Assert.NotNull(test.Id); // insert succeeded
        Assert.Equal(2, found.Stats.Count);
        Assert.True(found.Stats.Any(s => s.Posted.Minute == stat1.Posted.Minute));
        Assert.True(found.Stats.Any(s => s.Posted.Minute == stat2.Posted.Minute));
    }
}
After successfully passing this test, we can look at the contents of the collection to see how it was stored.


Indexing

Indexing in Mongo, just as in an RDBMS, is important for improving query performance. Indexes can be defined on a single field, a combination of fields, and even on fields in embedded documents.

I'm going to modify my DaoBase class with an EnsureIndex method:
public abstract class DaoBase<T>
    where T : ICollectable
{
    public DaoBase()
    {
        ...
        EnsureIndex(Mongo.GetCollection<T>());
    }

    ...
    protected abstract void EnsureIndex(MongoCollection<T> collection);
}
And since I'll be querying my users frequently by email, I'll add a unique index in UserDao.
public class UserDao : DaoBase<User>
{
    ...
    protected override void EnsureIndex(MongoCollection<User> collection)
    {
        collection.CreateIndex(u => u.Email, // the field to index
            "email",                         // index name
            true,                            // unique
            IndexOption.Ascending            // direction is irrelevant for a single-field index
        );
    }
}
NoRM also provides support for indexing multiple fields using anonymous types, but I won't go into that here.

We can run our tests again and confirm that the index exists using the command line.

Notice that an index already exists on the _id field (the Id property). Mongo creates this index on all collections by default, and it cannot be removed.

At this point I'm not entirely sure I want the EnsureIndex method to be part of my DAO, but I'll leave it for now.
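One C# detail worth keeping in mind with this design: a base constructor that calls an abstract method runs the derived override before the derived constructor body has executed. The sketch below shows the call order using stand-in types. FakeCollection, SketchDaoBase, and SketchUserDao are hypothetical names made up for the illustration and are not part of NoRM or of the DAO classes in this series.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a Mongo collection; it holds nothing and talks to no database.
public class FakeCollection
{
}

public abstract class SketchDaoBase
{
    // Records the order in which the pieces run.
    public static readonly List<string> Log = new List<string>();

    protected SketchDaoBase()
    {
        Log.Add("base constructor");
        EnsureIndex(new FakeCollection()); // dispatches to the derived override
    }

    protected abstract void EnsureIndex(FakeCollection collection);
}

public class SketchUserDao : SketchDaoBase
{
    private readonly string indexName;

    public SketchUserDao()
    {
        indexName = "email";
        Log.Add("derived constructor");
    }

    protected override void EnsureIndex(FakeCollection collection)
    {
        // indexName is still null here: the derived constructor body has not run yet.
        Log.Add("EnsureIndex, indexName = " + (indexName ?? "null"));
    }
}

public static class Program
{
    public static void Main()
    {
        new SketchUserDao();
        foreach (string entry in SketchDaoBase.Log)
        {
            Console.WriteLine(entry);
        }
    }
}
```

As long as the override relies only on its arguments, as the UserDao version does, this ordering is harmless. It becomes a trap once the override starts reading state that the derived constructor sets up.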

Configuring Property Aliases

NoRM provides a useful feature called configuration maps, which give you more fine-grained control over how data is stored in Mongo. As we've seen, we can set up and start using NoRM and Mongo together without even touching this feature. However, Mongo stores the field names inside every document, so long property names are repeated for each record. I'll be using aliasing to shorten them and reduce my database size.

To begin I'll need to create a new class which extends MongoConfigurationMap.
public class UserMap : MongoConfigurationMap
{
    public UserMap()
    {
        For<User>(config =>
        {
            config.ForProperty(u => u.Email).UseAlias("un");
            config.ForProperty(u => u.Password).UseAlias("pwd");
            config.ForProperty(u => u.Stats).UseAlias("st");
        });
    }
}
Then we need to execute this map before calling Mongo. For this I created another class to centralize mapping and unmapping.
internal class Configurations
{
    internal static void Map()
    {
        MongoConfiguration.Initialize(config => config.AddMap<UserMap>());
    }

    internal static void UnMap()
    {
        MongoConfiguration.RemoveMapFor<User>();
    }
}
And modified DaoBase once more.
public abstract class DaoBase<T>
    where T : ICollectable
{
    public DaoBase()
    {
        Configurations.Map();
        ...
    }
}
If we run our test again and check the console we can see that the field values are now stored with shortened property names.
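To get a rough feel for what the aliases buy, the sketch below estimates the bytes saved per document. It relies on the BSON rule that each field name is written as a null-terminated UTF-8 string, and it assumes the unaliased field names match the C# property names. The AliasSavings class and the one-million-document figure are hypothetical, and the estimate ignores field names inside the embedded stats.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class AliasSavings
{
    // A BSON field name costs its UTF-8 byte count plus one terminator byte.
    public static int NameBytes(string fieldName)
    {
        return Encoding.UTF8.GetByteCount(fieldName) + 1;
    }

    // Bytes saved per document when each property name (key) is replaced by its alias (value).
    public static int SavedPerDocument(IDictionary<string, string> aliases)
    {
        return aliases.Sum(pair => NameBytes(pair.Key) - NameBytes(pair.Value));
    }

    public static void Main()
    {
        // Property names and aliases taken from UserMap.
        var aliases = new Dictionary<string, string>
        {
            { "Email", "un" },
            { "Password", "pwd" },
            { "Stats", "st" },
        };

        int perDocument = SavedPerDocument(aliases);
        long documents = 1000000; // hypothetical collection size

        Console.WriteLine(perDocument + " bytes saved per document");
        Console.WriteLine((perDocument * documents) + " bytes saved across " + documents + " documents");
    }
}
```

For the three aliases in UserMap this works out to 11 bytes per document, which is small per record but adds up across a large collection.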

I leave it to the reader to alias the fields in UserStat.

1 comment:

Mau Sanchez said...

Hi Joel, thanks a lot for your examples. I do have a problem though that really looks like your UserStat class. Basically I have a class called Event which has a list of embedded Criterias, the class Criterias is an abstract class which may contain N types of criterias, so:
public abstract class Criteria
{
    public string Type { get; set; }
}
public class CriteriaA : Criteria
{
    public string Day { get; set; }
}
public class CriteriaB : Criteria
{
    public string Night { get; set; }
}

The problem is that when querying Events (with a repository) I can't access the child properties that I need, say:
_repostory.Find(p=>p.Criterias.Any(x=>x.Type).FirstOrDefault()

I can access the property "Type" because it belongs to the main abstract class, but how can I access the other child properties when needed?

Thanks.