Single or multitenant for your SaaS?

Often we need to make a call between architecting for single or multitenant instances when designing Software as a Service (SaaS) products. Under a single tenant solution you deploy one instance per customer; under a multitenant solution many customers share a single deployed instance (where ‘an instance’ is all of the server resources you need to make the product available, e.g. web server, database, email server). It’s costly to change architecture once an application has been developed, deployed into production and has live users – so picking the most appropriate architectural approach from the outset is important to ensure longevity and maintainable costs. The following considerations should help you decide which approach is appropriate for you.

Single instance considerations:

  • Physically ‘more’ secure – as each customer has a separate instance (likely including database, web servers, and storage) the risk of leaking data between instances is significantly reduced
  • Performance can be tuned on a per customer basis. Have a client with heavy loads? Scale up, or out. A load intensive customer won’t affect the performance of other clients as the resources are not shared. Got a light workload? Scale down, reducing cost.
  • Deploying to an on-premise, or hybrid cloud location is an easier story if you’re already set up to deploy as a single instance model. The nature of single instances means that an instance can be created and deployed as a single package – assuming the target deployment environment supports the required services and any other dependencies, e.g. web server, database server, virtualisation and / or network services (SMTP, DNS routing), runtimes (.NET, PHP)
  • Instances can be deployed in different regions to reduce latency (increasing performance by moving the instance closer to users), or to meet local compliance laws, e.g. data sovereignty
  • Calculating per customer hosting costs is simple as it’s based on exact, actual resource usage
  • Per customer [instance] customisation can be achieved if required. For that customer who just has to have that specific feature. Note: this will complicate your source control and deployment models, and can lead to a fragmented nightmare if not managed effectively, or avoided altogether. Just because you *could* customise a solution per customer doesn’t necessarily mean you should!
  • Monitoring, maintenance, backups and new version and/or patch deployments and/or data migrations need to be managed on a per instance level, increasing costs in proportion to the number of instances. Need to back up instance databases, or roll out an update? Consider how this will be managed, and the cost for 1, 10, 100, 1000, 10,000 instances. This is a significant cost, and should be evaluated against the necessity and likely usage of the other benefits of this model
  • Have a clear picture of how new instance provisioning will be managed, and how long it will take. Most cloud platforms allow for automated provisioning via APIs, and you may need additional steps like registering or configuring external services like DNS and authentication mechanisms (e.g. AD accounts)
  • Some resources can be shared, e.g. worker queues for background processes and SMTP servers. Plan which resources need to be ‘per instance’, and which make sense to share across instances

Single instance is typically suitable for:

  • Applications that store highly commercially sensitive, or personally identifying information of a private nature
  • Customers with specific, and varying workloads
  • Customers who need to adhere to local compliance laws
  • Customers willing to pay a premium for these features as they incur a higher cost to provide and manage

 

Multitenant considerations:

  • Careful consideration needs to be made to ensure data is not leaked across clients. This isn’t difficult, but it must be managed with suitable design and sufficient testing
  • Deployments are much simpler, and everyone can always be on the latest version. This approach lends itself to providing an ‘evergreen’ product
  • Reporting across all your instances is much simpler when they are in the same place (or all the data is in the same database)
  • On-boarding a new customer is typically simple as it doesn’t require the provisioning of underlying resources (servers, database, storage)
  • An outage will affect *all* of your customers, so design for redundancy
  • Requires a single monitoring, maintenance, backup and deployment approach. Costs will increase with the number of tenants, though much lower than with a per instance approach – this is probably the biggest benefit of multitenancy
  • There could be a point where you need to scale up, or out – or where you have *very* large data and need to consider how to partition. This could be one multitenant instance per region
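To make the first point above concrete, the usual safeguard is to stamp every row with a tenant id and filter every query by it. A minimal sketch over an in-memory list (the `Invoice` and `TenantScopedInvoices` names are illustrative, not from any particular framework):

```csharp
using System.Collections.Generic;
using System.Linq;

// Illustrative row type: every record carries the tenant that owns it
public record Invoice(int TenantId, int Id, decimal Amount);

public class TenantScopedInvoices
{
    private readonly int _tenantId;
    private readonly List<Invoice> _table; // stands in for a shared, multitenant store

    public TenantScopedInvoices(int tenantId, List<Invoice> table)
    {
        _tenantId = tenantId;
        _table = table;
    }

    // Every read is filtered by the current tenant, so one tenant
    // can never see another tenant's rows
    public IEnumerable<Invoice> FetchAll() =>
        _table.Where(i => i.TenantId == _tenantId);
}
```

The same filter belongs in every write path too; centralising it in one repository type (rather than repeating the `Where` at each call site) is what makes it testable.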

 

Multitenancy is typically suitable for:

  • ‘Self Service’ products scaling to thousands of customers and beyond
  • Reducing costs of deploying, hosting and maintaining your service

Shared tenancy with individual database and/or storage

An alternative hybrid approach is to provide shared tenancy of the application, with separate storage per customer. For example, your app could be load balanced across a number of web servers which service all of your customers. Each customer, however, would have their own separate physical database for data storage. This gives you the physical separation of data while keeping a shared application tier, taking benefits from both single and multitenant architectures:

  • Physically ‘more’ secure
  • Storage can be located in different geographic regions (though be aware of any latency this might introduce in your application)
  • Retains ‘evergreen’ application versioning
  • Overall costs are lower than a single instance approach
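The moving part this hybrid adds is resolving which database a given tenant’s request should hit. A minimal sketch, assuming a simple tenant-to-connection-string lookup (the type and names here are illustrative):

```csharp
using System;
using System.Collections.Generic;

// Illustrative only: resolves each tenant to its own database connection string
public class TenantConnectionResolver
{
    private readonly IDictionary<string, string> _map;

    public TenantConnectionResolver(IDictionary<string, string> map)
    {
        _map = map;
    }

    public string Resolve(string tenantId)
    {
        if (_map.TryGetValue(tenantId, out var connectionString))
            return connectionString;

        // Failing loudly beats silently falling back to a default database
        throw new InvalidOperationException($"Unknown tenant '{tenantId}'");
    }
}
```

In practice the map would come from a small catalogue database or configuration store rather than being hard-coded, but the shape is the same: one lookup per request, before any data access.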

Calculated Columns using IF and IN operators in Zoho Reports

Filed under #SoIDontForgetHowIDidThis, #Zoho

In Zoho Reports you can’t add lookup or standard columns to data sourced from Zoho CRM. You can however add calculated columns, which is great if you want to transform a list of usernames into a department for future filtering or grouping.

The syntax for this isn’t well documented, and it took a wee bit of fiddling to figure it out. We combine the IF function with the IN operator where:

IF is a function defined as:

if(expr1, expr2, expr3)

Returns expr2 if expr1 is true else it returns expr3

Example: if(5 > 10,100,50) = 50

IN is a list operator defined as: 

<expression1> in <expression2>

It’s also useful to know that lists are enclosed in round brackets (), string literals are enclosed in single quotes '' and column references are enclosed in double quotes "".

E.g. to return the department name of a user based on the potential owner, the formula would be:

IF("Potential Owner" IN ('Anthony', 'Seth', 'Chris'), 'Technology Infrastructure', 'Software Development')

Which will result in a list similar to:

  • Seth – Technology Infrastructure
  • Bob – Software Development
  • Carol – Software Development
  • Anthony – Technology Infrastructure

For a more explicit result we can nest the IF function, which will be familiar to many Excel users:

IF("Potential Owner" IN ('Anthony', 'Seth', 'Chris'), 'Technology Infrastructure', IF("Potential Owner" IN ('Phillip', 'Bob'), 'Software Development', 'Other'))

Will result in:

  • Seth – Technology Infrastructure
  • Bob – Software Development
  • Carol – Other
  • Anthony – Technology Infrastructure

As Carol isn’t explicitly defined as being in the TI or SD departments, this is more correct.

Scoping for readability and automatic clean-up with IDisposable

This is a simple trick that I find makes my code more readable, and reduces the amount of code I need to write to explicitly scope *something*. It avoids the need to manually clean up, or reset some state. In the sample below I’ve used it to highlight the output of a console app in a user defined font colour. It’s really simple, and DRY:

Rather than writing this:

Console.WriteLine("Starting...");
           
Console.ForegroundColor = ConsoleColor.Yellow;

while(!_server.IsComplete)
{
     Console.WriteLine(_server.Progress);
     System.Threading.Thread.Sleep(1000);
}

Console.ResetColor();

I can now do this:

Console.WriteLine("Starting...");
using (new ScopedConsoleColour(ConsoleColor.Yellow))
{
    while (!_server.IsComplete)
    {
        Console.WriteLine(_server.Progress);
        System.Threading.Thread.Sleep(1000);
    }
}

Based on a defined class of:

class ScopedConsoleColour : IDisposable
{
    public ScopedConsoleColour(System.ConsoleColor colour)
    {
        // Apply the scoped colour on construction
        Console.ForegroundColor = colour;
    }

    public void Dispose()
    {
        // Restore the default colour when the using block exits
        Console.ResetColor();
    }
}

Implementing IDisposable ensures your state gets reset when execution leaves the scope. Another common use for this approach is setting a wait cursor in a Windows Forms application. I’m sure there are many other great uses!

While this example is trivial, and only really saves 2 lines of code – I find it helps maintain readability as my project size increases, and reduces bugs where I forget to reset state because of an unexpected execution path – like when an exception gets thrown.
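The same save-and-restore idea generalises to any settable state – the wait cursor mentioned above, a culture, a flag. A hypothetical sketch (ScopedValue is my own name, not a framework type):

```csharp
using System;

// Saves a value on construction, applies the scoped value,
// and restores the original on Dispose
class ScopedValue<T> : IDisposable
{
    private readonly T _previous;
    private readonly Action<T> _setter;

    public ScopedValue(T current, T scoped, Action<T> setter)
    {
        _previous = current;
        _setter = setter;
        _setter(scoped); // apply the scoped value
    }

    public void Dispose() => _setter(_previous); // restore on scope exit
}
```

Passing the current value in explicitly keeps the type generic; for a concrete case like the console colour you’d read `Console.ForegroundColor` in the constructor instead.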

Creating object hierarchies with LINQ

I’m constantly amazed by the power and simplicity of LINQ. I love that I can turn a flat table into an object graph with a couple of view models and a single LINQ query.

Given a repository method that returns a list of Tracks based on the following SQL:

Select Id, Title, Artist, Album, Year, TrackNumber, FilePathName From Track Order By Artist, Album, TrackNumber, Title

We can create a nice hierarchy of Artists, each with 0 or more Albums, each of which may have a number of Tracks – with a nice, neat LINQ query:

var artists = _libraryRepository.FetchLibrary()
    .AsParallel()
    .GroupBy(track => track.Artist)
    .Select(group => new Artist(group.Key,
        group.GroupBy(track2 => track2.Album)
            .Select(group2 => new Album(group2.Key, group2)))
    );

And the models:

public class Artist
{
    public string Name { get; set; }
    public IEnumerable<Album> Albums { get; private set; }

    public Artist() { }

    public Artist(string name, IEnumerable<Album> albums)
    {
        Name = name;
        Albums = albums;
    }
}

public class Album
{
    public string Name { get; set; }
    public IEnumerable<Track> Tracks { get; private set; }

    public Album() { }

    public Album(string name, IEnumerable<Track> tracks)
    {
        Name = name;
        Tracks = tracks;
    }
}

public class Track
{
    public Guid Id { get; private set;}
    public string Title { get; set; }
    public string Artist { get; set; }
    public string Album { get; set; }
    public string FilePathName { get; set; }
    public int? Year { get; set; }
    public int? TrackNumber { get; set; }

    public Track() {}

    public Track(Guid id, string filePathName)
    {
        Id = id;
        FilePathName = filePathName;
    }
}
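To see the shape of the result without the repository, here’s a self-contained version of the same double GroupBy over in-memory data (anonymous types and made-up track data stand in for the models above):

```csharp
using System;
using System.Linq;

// Flat rows, as they would come back from the Track query
var tracks = new[]
{
    new { Artist = "Artist A", Album = "Album 1", Title = "Song 1" },
    new { Artist = "Artist A", Album = "Album 1", Title = "Song 2" },
    new { Artist = "Artist B", Album = "Album 2", Title = "Song 3" },
};

// Group by artist, then group each artist's tracks by album
var artists = tracks
    .GroupBy(t => t.Artist)
    .Select(g => new
    {
        Name = g.Key,
        Albums = g.GroupBy(t => t.Album)
                  .Select(g2 => new { Name = g2.Key, Tracks = g2.ToList() })
                  .ToList()
    })
    .ToList();

foreach (var artist in artists)
    Console.WriteLine($"{artist.Name}: {artist.Albums.Count} album(s)");
```

Swapping the anonymous types for the `Artist` and `Album` constructors gives exactly the query in the post.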

Got comments?  Tweet me @padgettrowell

Many to One relationships with PetaPoco

One small ‘gotcha’ to be aware of with the excellent lightweight ORM PetaPoco is when mapping many to one object references to entity properties.
The following repository call returns a list of Schedule entities. A Schedule entity has two object properties: Category and Resource. So the call to db.Fetch accepts an SQL statement that will return all the entities we’re interested in while correctly mapping the Resource and Category objects.

/// Fetches all schedules with a start date greater than or equal to the supplied date for all active resources
public List<Schedule> FetchAllFutureSchedules(DateTime from)
{
    // Ensure columns are grouped by table!
    // The poco parser expects all poco fields to be sequential, after the poco id
    var db = new PetaPoco.Database("DatabaseConnection");

    var relator = new ScheduleRelator();

    return db.Fetch<Schedule, Resource, Category, Schedule>(
        relator.MapResources,
        @"SELECT [ScheduleId]
            ,[Title]
            ,[StartDate]
            ,[EndDate]
            ,[Utilisation]
            ,[Resource].[Name]
            ,[Resource].[ResourceId]
            ,[Resource].[IsActive]
            ,[Category].[Name]
            ,[Category].[CategoryId]
            ,[Category].[IsActive]
            ,[Category].[Colour]
        FROM [Schedule]
        LEFT JOIN [Resource] ON Schedule.ResourceId = [Resource].ResourceId
        LEFT JOIN [Category] ON Schedule.CategoryId = [Category].CategoryId
        WHERE [Resource].[IsActive] = 1
            AND (StartDate >= @0 OR EndDate >= @0)
        ORDER BY [Resource].[Name]",
        from);
}

I ran into an issue where my entities were not being correctly loaded – properties were either null, or being assigned the wrong value from the database. The issue is with the way PetaPoco delineates the returned data reader fields to construct your entities.
To help it out:

  • Construct your SQL to keep all the properties for each entity grouped together
  • When specifying the fields for a new entity, don’t specify the primary key first – PetaPoco can get confused as to whether it’s the foreign or primary key

As a little extra, I’ve included the code to resolve and map two many to one entities, as the PetaPoco example only shows how to achieve this with one.
internal class ScheduleRelator
{
    // Dictionaries of known resources and categories
    Dictionary<int, Resource> _resources = new Dictionary<int, Resource>();
    Dictionary<int, Category> _categories = new Dictionary<int, Category>();

    public Schedule MapResources(Schedule p, Resource r, Category c)
    {
        // Match resource
        Resource aResource;
        if (_resources.TryGetValue(r.ResourceId, out aResource))
            r = aResource;
        else
            _resources.Add(r.ResourceId, r);

        // Wire up objects
        p.Resource = r;

        // Match categories
        Category aCategory;
        if (_categories.TryGetValue(c.CategoryId, out aCategory))
            c = aCategory;
        else
            _categories.Add(c.CategoryId, c);

        // Wire up objects
        p.Category = c;

        return p;
    }
}

Returning a DTO from a one to many relationship

Wrote this neat query the other day. It queries a 1:M relationship (a make can have zero or more models) and returns a list of tuples, with the third item being a dictionary of models keyed by name. This neatly preserves the make/model relationship and returns the entire hierarchy in one database call. Thanks to Blair Davidson for the help on this!

public IEnumerable<System.Tuple<string, int, Dictionary<string, int>>> SelectAllMakeModelsAsReverseDictionary()
{
    var session = GetCurrentSession();

    return session.Query<Model>()
        .Where(x => !x.IsDeleted)
        .Select(x => new
        {
            MakeId = x.Make.Id,
            MakeName = x.Make.Name,
            ModelId = x.Id,
            ModelName = x.Name
        })
        .AsEnumerable() // switch to LINQ to Objects – the grouping below can't be translated to SQL
        .GroupBy(x => new { x.MakeId, x.MakeName })
        .Select(g => Tuple.Create(
            g.Key.MakeName,
            g.Key.MakeId,
            g.ToDictionary(z => z.ModelName, z => z.ModelId, StringComparer.OrdinalIgnoreCase)));
}
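Stripped of NHibernate, the grouping itself can be exercised over in-memory rows – a self-contained sketch with made-up makes and models:

```csharp
using System;
using System.Linq;

// In-memory stand-in for the flat make/model rows the query projects
var rows = new[]
{
    new { MakeId = 1, MakeName = "Ford", ModelId = 10, ModelName = "Falcon" },
    new { MakeId = 1, MakeName = "Ford", ModelId = 11, ModelName = "Territory" },
    new { MakeId = 2, MakeName = "Holden", ModelId = 20, ModelName = "Commodore" },
};

// One tuple per make; the third item maps model name -> model id,
// with a case-insensitive comparer so lookups are forgiving
var makes = rows
    .GroupBy(x => new { x.MakeId, x.MakeName })
    .Select(g => Tuple.Create(
        g.Key.MakeName,
        g.Key.MakeId,
        g.ToDictionary(m => m.ModelName, m => m.ModelId, StringComparer.OrdinalIgnoreCase)))
    .ToList();
```

The `StringComparer.OrdinalIgnoreCase` is what makes this a useful reverse lookup – consumers can resolve a model id from user-typed text without worrying about casing.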