Thursday, May 17, 2012

Developing for, but remaining decoupled from Azure

(This started out as a Yammer discussion post, but obviously got way too long for that, so I decided to blog it instead.)

A coworker had mentioned that it can be difficult to develop for Azure while maintaining independence from it by avoiding direct dependencies.  Having done this from the beginning of my almost two-year-old Azure project, I decided to describe how I did it.

Why decouple?


First of all, why do we need to do this?

Well, I have developed for a very wide range of platforms in my past, including dozens of flavors of *nix, tons of versions of Windows (3.0, 3.1, 3.11, 95, 98, 98 SE, NT 3.X, 2000, ME, XP, Vista, Win 7), QNX, pSOS, OS/2, Mac OS, etc.

Given that, I've always had issues with tying my code too tightly to a platform.  It always seems to come back and bite you. Hard.

Developing/Debugging


If you have ever tried to develop and debug with the Azure Dev Fabric, you probably didn't even ask this question in the first place.

Working with the Dev Fabric is painful, especially on the Web development side.  You can't simply edit the HTML and JavaScript files while you debug; each edit requires a rebuild and redeploy.

If you're not familiar with this process: what actually happens when you debug a "Cloud" app is a full deployment.  Visual Studio compiles the code and then, upon success, packages a mini-Azure deployment and deploys it to the Dev Fabric.  This involves spinning up temporary Web Sites on your IIS installation and spinning up what are effectively mini virtual machines.  So, in order to get any code changes over there, you must redeploy.

Bottom Line: no easy tweaks to your HTML, CSS or JavaScript while debugging.

Platform Independence


Sometimes you may not have 100% buy-in from a client or a manager for running in Azure, so to hedge your bets you need the ability to run on Azure without depending on it, in case someone decides to go the other way.  Maybe the client hires a new CIO who wants to run on Amazon because Jeff Bezos is an old college buddy.  Crazier things have happened.

In my case, my project was, I believe, the first Azure project our company deployed into Production.   Thus, we were hedging our bets a bit by allowing it to run as a regular on-premises deployment.  This had the added benefit of allowing us to do pilot/demo deployments on-site for our client so they could view the site during early development.

What to decouple?


First off, there are three main areas that need to be decoupled in order to maintain your freedom: Configuration Settings, Images/Static Files, and Data (SQL, etc.).

CONFIGURATION SETTINGS


Regular .NET apps use AppSettings from either web.config or app.config for simple key/value settings.  For this scenario, a simple interface will do:

public interface IConfigurationSettingsProvider
{
    string GetSettingValue(string key);
}
Non-Azure Implementation
using System.Configuration;

public class ConfigurationManagerSettingsProvider
    : IConfigurationSettingsProvider
{
    public string GetSettingValue(string key)
    {
        return ConfigurationManager.AppSettings.Get(key);
    }
}
Azure Implementation
using Microsoft.WindowsAzure.ServiceRuntime;

public class AzureRoleConfigurationSettingsProvider
    : IConfigurationSettingsProvider
{
    public string GetSettingValue(string key)
    {
        return RoleEnvironment.GetConfigurationSettingValue(key);
    }
}
As you can see, this isn't really all that complicated.
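
To pick the right provider at runtime, one option is to check whether the code is running under the Azure role environment and wire the result into whatever IoC container you use.  This is just a minimal sketch of that idea; the factory class here is my own invention, not part of the project:

using Microsoft.WindowsAzure.ServiceRuntime;

public static class ConfigurationSettingsProviderFactory
{
    public static IConfigurationSettingsProvider Create()
    {
        // RoleEnvironment.IsAvailable is true only when running under
        // the Azure fabric (real or Dev Fabric).
        return RoleEnvironment.IsAvailable
            ? (IConfigurationSettingsProvider)new AzureRoleConfigurationSettingsProvider()
            : new ConfigurationManagerSettingsProvider();
    }
}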

IMAGES


In my case, the images we serve up are generally based on an "Item Key" and can refer to either a "normal" image size or a "thumbnail".  We also needed the ability to store and retrieve the images.

The implementation of my IImageStorage interface also takes care of scaling the images to the appropriate sizes during upload.

The particulars of your situation may vary, but the pattern could be similar.
NOTE: I removed a few method overloads here (for instance, each "Save" method has an overload that takes a Stream, reads it in, and forwards the resulting byte[]).

Interface
public enum ImageSize
{
    Thumbnail,
    Normal,
    Original
}
 
public interface IImageStorage 
{ 
    void SaveImage(int imageId, string pathInfo, byte[] imageBytes); 
 
    // removed overloads 
    // ... 
 
    Uri GetImageUri(int imageId, ImageSize size); 
 
    IEnumerable<int> GetAllImageIds();
 
    void DeleteImage(int imageId); 
} 


Non-Azure Implementation

In this case, I chose to make a "FileSystemImageStorage" class that uses a directory structure of

Root->
      Thumbs
      Normal
      Original

Each file name is the ID with a ".jpg" extension.

The URLs returned are either "file:///" URLs or "data:" (base64-encoded) URLs, depending on the size.
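
As a rough sketch of what that can look like (this is not the actual class; the root-path constructor and the size-to-folder mapping are my own placeholders based on the description above):

using System;
using System.IO;

public class FileSystemImageStorage
{
    private readonly string _rootPath;

    public FileSystemImageStorage(string rootPath)
    {
        _rootPath = rootPath;
    }

    public Uri GetImageUri(int imageId, ImageSize size)
    {
        string folder = size == ImageSize.Thumbnail ? "Thumbs"
                      : size == ImageSize.Normal ? "Normal"
                      : "Original";
        string path = Path.Combine(_rootPath, folder, imageId + ".jpg");

        if (size == ImageSize.Thumbnail)
        {
            // Thumbnails are small enough to inline as base64 data: URIs.
            byte[] bytes = File.ReadAllBytes(path);
            return new Uri("data:image/jpeg;base64," + Convert.ToBase64String(bytes));
        }

        return new Uri("file:///" + path.Replace('\\', '/'));
    }
}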


Azure Implementation

For Azure, the pattern is very similar.  I created a "BlobStorageImageStorage" class.  I use 3 Blob Storage Containers called "normal", "thumbs" and "original" which each contain a blob with the image inside it.

In this case, I chose to make the blob name the ID zero-padded to 10 digits in order to allow for long-term expansion, so ID 1234 becomes "0000001234".  The URL for the thumbnail of 1234 is then:

http://[accountname].blob.core.windows.net/thumbs/0000001234
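
Here's a trimmed-down sketch of the Azure side using the 2012-era StorageClient library.  The real class does more (scaling, overloads, error handling), and the constructor shape is my own simplification:

using System;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

public class BlobStorageImageStorage
{
    private readonly CloudBlobClient _client;

    public BlobStorageImageStorage(CloudStorageAccount account)
    {
        _client = account.CreateCloudBlobClient();
    }

    private static string BlobName(int imageId)
    {
        // Zero-pad to 10 digits: 1234 -> "0000001234"
        return imageId.ToString("D10");
    }

    private static string ContainerFor(ImageSize size)
    {
        return size == ImageSize.Thumbnail ? "thumbs"
             : size == ImageSize.Normal ? "normal"
             : "original";
    }

    public void SaveImage(int imageId, ImageSize size, byte[] imageBytes)
    {
        var container = _client.GetContainerReference(ContainerFor(size));
        container.CreateIfNotExist();
        container.GetBlobReference(BlobName(imageId)).UploadByteArray(imageBytes);
    }

    public Uri GetImageUri(int imageId, ImageSize size)
    {
        var container = _client.GetContainerReference(ContainerFor(size));
        return container.GetBlobReference(BlobName(imageId)).Uri;
    }
}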

DATA


For a "typical" application, data usually means SQL.  In the case of SQL, the only difference as far as your application is concerned is a connection string, which is easy enough to abstract for yourself.

If, on the other hand, you are using Azure Table Storage, you will need some sort of Repository interface that will persist TableServiceEntity instances for you.

In our application, we decided to let the Entities subclass TableServiceEntity, regardless of whether or not we're storing them in Azure.

Given that, we have this class (keep in mind that I've removed error checking, comments, and a few other minor details):

using System;
using System.Data.Services.Client;
using System.Linq;
using Microsoft.WindowsAzure.StorageClient;

public class TableServiceEntityRepository<TEntity>
    where TEntity : TableServiceEntity
{
    readonly string _tableName;
    readonly TableServiceContext _context;

    public TableServiceEntityRepository(
        ITableServiceContextFactory ctxFactory)
    {
        _tableName = typeof(TEntity).Name;
        _context = ctxFactory.Create(_tableName);
    }

    public IQueryable<TEntity> Get(
        Specification<TEntity> whereClause = null,
        Func<IQueryable<TEntity>, IOrderedQueryable<TEntity>> orderBy = null)
    {
        IQueryable<TEntity> query =
            _context.CreateQuery<TEntity>(_tableName);

        if (whereClause != null)
        {
            query = query.Where(whereClause.IsSatisfiedBy());
        }

        if (orderBy != null)
        {
            return orderBy(query);
        }

        return query;
    }

    public virtual void Insert(TEntity entity)
    {
        _context.AddObject(_tableName, entity);
    }

    public virtual void Update(TEntity entity)
    {
        var desc = _context.GetEntityDescriptor(entity);
        if (desc == null || desc.State == EntityStates.Detached)
        {
            // Attach with a wildcard ETag so the update isn't rejected.
            _context.AttachTo(_tableName, entity, "*");
        }
        _context.UpdateObject(entity);
    }

    public virtual void Delete(TEntity entity)
    {
        var desc = _context.GetEntityDescriptor(entity);
        if (desc == null || desc.State == EntityStates.Detached)
        {
            _context.AttachTo(_tableName, entity, "*");
        }
        _context.DeleteObject(entity);
    }

    public virtual void SaveChanges()
    {
        _context.SaveChanges();
    }
}

As you may have noticed, this code uses a Specification for the "whereClause".  That is defined this way:

using System;
using System.Linq.Expressions;

public abstract class Specification<TEntity>
{
    private Func<TEntity, bool> _compiledFunc = null;

    private Func<TEntity, bool> CompiledFunc
    {
        get
        {
            // Compile the expression once and cache it for in-memory checks.
            if (_compiledFunc == null)
                _compiledFunc = this.IsSatisfiedBy().Compile();
            return _compiledFunc;
        }
    }

    public abstract Expression<Func<TEntity, bool>> IsSatisfiedBy();

    public bool IsSatisfiedBy(TEntity entity)
    {
        return this.CompiledFunc(entity);
    }
}
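
For illustration, here's what a concrete Specification and a repository call might look like.  The Customer entity, its Region property, and the context factory instance are all hypothetical, not from the actual project:

using System;
using System.Linq;
using System.Linq.Expressions;
using Microsoft.WindowsAzure.StorageClient;

public class Customer : TableServiceEntity
{
    public string Region { get; set; } // hypothetical property
}

public class CustomersInRegionSpecification : Specification<Customer>
{
    private readonly string _region;

    public CustomersInRegionSpecification(string region)
    {
        _region = region;
    }

    public override Expression<Func<Customer, bool>> IsSatisfiedBy()
    {
        return c => c.Region == _region;
    }
}

// Usage (ctxFactory is some ITableServiceContextFactory implementation):
// var repo = new TableServiceEntityRepository<Customer>(ctxFactory);
// var westCoast = repo.Get(new CustomersInRegionSpecification("West")).ToList();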

The non-Azure version of this just uses the TableServiceEntity's PartitionKey and RowKey fields as folder name and file name, and uses binary serialization to persist the entities.  It's pretty simple, but of course wouldn't work for very large numbers of entities.  You could, however, choose to do the same thing with SQL Server, using a SQL blob column for the serialized data, for a quick speed improvement.
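
Here is a minimal sketch of how that persistence might work.  The root folder and the use of BinaryFormatter are assumptions based on the description above, and the real class has the full repository surface, not just Insert:

using System.IO;
using System.Runtime.Serialization.Formatters.Binary;
using Microsoft.WindowsAzure.StorageClient;

// Simplified sketch; assumes TEntity is [Serializable], which plain
// TableServiceEntity subclasses would need to opt into.
public class FileSystemEntityRepository<TEntity>
    where TEntity : TableServiceEntity
{
    private readonly string _rootPath;

    public FileSystemEntityRepository(string rootPath)
    {
        _rootPath = rootPath;
    }

    public void Insert(TEntity entity)
    {
        // PartitionKey becomes the folder name, RowKey the file name.
        string folder = Path.Combine(_rootPath, entity.PartitionKey);
        Directory.CreateDirectory(folder);

        using (var stream = File.Create(Path.Combine(folder, entity.RowKey)))
        {
            new BinaryFormatter().Serialize(stream, entity);
        }
    }
}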

Though we have other implementation abstractions in our software for things like Queues, this should give you a good kick start on how things are done.

Summary


Abstracting away specific hardware/platform dependencies is usually a Good Thing, in my opinion.  Stuff Happens.  Platforms break, die, become uncool, etc.  We shouldn't have to refactor significant portions of our applications in order to be more platform-agnostic.

Feel free to comment or contact me for more details on how things like this can be done, or any questions on my implementation of the Specification pattern or anything at all :)