High thread count in Azure Functions

09 October 2018

Recently the Azure Functions V2 runtime was released, and one of my team’s functions started consuming an abnormally high number of threads. When we dug into the details of the problem, the root cause was performing static initialization of logging on every function invocation. Specifically, we were using Serilog with the Application Insights sink and were recreating the sink each time the function ran. The sink was then assigned to a static variable (Log.Logger).

The code used to look something like this:

using System;
using Microsoft.ApplicationInsights.Extensibility;
using Microsoft.Azure.WebJobs;
using Serilog;

namespace ServiceBusThreadingProblem
{
   public static class HighThreadCountFunction
   {
      [FunctionName( "HighThreadCountFunction" )]
      public static void Run( [ServiceBusTrigger( "TestTopicName", "TestSubscriptionName", Connection = "TestSubscriptionConnectionString" )] string messageContents, ExecutionContext context, int deliveryCount, DateTime enqueuedTimeUtc, string messageId )
      {
         var logConfiguration = new LoggerConfiguration()
            .Enrich.WithProperty( "FunctionInvocationId", context.InvocationId )
            .Enrich.WithProperty( "QueueMessageMessageId", messageId )
            .Enrich.WithProperty( "MachineName", Environment.MachineName );

         logConfiguration.WriteTo.ApplicationInsightsTraces( new TelemetryConfiguration( Environment.GetEnvironmentVariable( "APPINSIGHTS_INSTRUMENTATIONKEY" ) ) );

         Log.Logger = logConfiguration.CreateLogger();
         Log.Information( $"C# ServiceBus topic trigger function processed message: {messageContents}" );
      }
   }
}

The issue is the re-assignment of the Log.Logger static variable each time the function runs. Looking at a memory dump, it was apparent that nearly every thread (~400 of them) was waiting somewhere in Application Insights code. That led us to examine the ApplicationInsightsTraces line, which is part of the Serilog initialization. A quick test confirmed that without that line, the thread count stayed at normal levels.

As a general principle in Azure Functions, if a class manages external connections and is thread-safe, it should be reused as a static variable. See the Azure Functions documentation on static client reuse. Since Serilog’s Log.Logger is a static variable and is thread-safe, the code above should not have been recreating the logger on each invocation.

Instead, the code above should be refactored to reuse the Log.Logger variable. One pattern that makes this easy is using the Lazy class to ensure only one instance of the logger is ever created as shown below.

using System;
using Microsoft.ApplicationInsights.Extensibility;
using Microsoft.Azure.WebJobs;
using Serilog;
using Serilog.Context;
 
namespace ServiceBusThreadingProblem
{
   public static class NormalThreadCountFunction
   {
      private static Lazy<ILogger> LoggingInitializer = new Lazy<ILogger>( () =>
      {
         var logConfiguration = new LoggerConfiguration()
                 .Enrich.FromLogContext()
                 .Enrich.WithProperty( "MachineName", Environment.MachineName );

         logConfiguration.WriteTo.ApplicationInsightsTraces( new TelemetryConfiguration( Environment.GetEnvironmentVariable( "APPINSIGHTS_INSTRUMENTATIONKEY" ) ) );
         Log.Logger = logConfiguration.CreateLogger();
         return Log.Logger;
      } );

      [FunctionName( "NormalThreadCountFunction" )]
      public static void Run( [ServiceBusTrigger( "TestTopicName", "TestSubscriptionName", Connection = "TestSubscriptionConnectionString" )] string messageContents, ExecutionContext context, int deliveryCount, DateTime enqueuedTimeUtc, string messageId )
      {
         ILogger logger = LoggingInitializer.Value;

         using ( LogContext.PushProperty( "FunctionInvocationId", context.InvocationId ) )
         using ( LogContext.PushProperty( "QueueMessageMessageId", messageId ) )
         {
            logger.Information( $"C# ServiceBus topic trigger function processed message: {messageContents}" );
         }
      }
   }
}

In the code above, the Application Insights Serilog sink is no longer reinitialized on each function invocation. The Lazy field ensures the logger is initialized only once, on first use. With this refactoring, the function consistently uses only ~40 threads, and the thread count is no longer affected by additional load.

Takeaways

To help your Azure Function scale and to avoid consuming extra resources, reuse as much as possible between invocations.

  • Use static clients to optimize connections. Examples of clients that should be reused are HttpClient, Azure Storage clients, and logging clients (a sketch of this pattern follows the list).
  • Avoid reinitializing classes that can be reused. In my case, it was logging with Serilog but configuration is another candidate to be initialized once for all invocations of a function.
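
For example, here is a minimal sketch of the static-client pattern applied to HttpClient. The trigger settings are reused from the earlier samples; the function name and target URL are made up for illustration.

using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;

namespace ServiceBusThreadingProblem
{
   public static class StaticClientFunction
   {
      // Created once and reused across invocations instead of being constructed inside Run on every call.
      private static readonly HttpClient _httpClient = new HttpClient();

      [FunctionName( "StaticClientFunction" )]
      public static async Task Run( [ServiceBusTrigger( "TestTopicName", "TestSubscriptionName", Connection = "TestSubscriptionConnectionString" )] string messageContents )
      {
         // Reusing the client avoids exhausting sockets and threads under load.
         await _httpClient.PostAsync( "https://example.com/api/messages", new StringContent( messageContents ) );
      }
   }
}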

Debugging Azure Functions with Windbg

05 October 2018

When all else fails troubleshooting a production issue, I often reach for Windbg to dig into the problem. For issues like high CPU or memory usage, Windbg is a great tool to pinpoint the cause. Windbg allows you to open a memory dump of a running application and inspect it. Anywhere you can run .NET code should support taking a memory dump, which you can then inspect using Windbg. I recently had to troubleshoot an issue with a .NET Core Azure Function application, and while the steps were a little different from a web application, it is possible. Note - this post assumes you have Windbg installed. If not, install it first; I recommend the version available through the Windows Store.

Take memory dump of running function

First, open the KUDU site for the problematic Azure Function that is running in Azure. The URL will look like https://<your-function-name-here>.scm.azurewebsites.net. Navigate to the Process Explorer tab and copy the PID of the process running your function. This might look odd because the process is W3WP.exe, but that is expected. There should only be two options, and one will be tagged with the scm tag, meaning it is the process running the scm site. Choose the W3WP.exe process not tagged scm and copy its PID.

Next, open the Debug Console tab. I usually choose the CMD option but either should work. Navigate to the LogFiles directory and make a new directory to hold the dump files; I typically use dumpfiles as the name. Navigate into the new directory and run the following command to generate the dump file - D:\devtools\sysinternals\procdump -accepteula -ma <your-PID-here>. That will generate the dump file and place it in the current directory. Finally, download the generated .dmp file to your local machine.

Determine platform details and load into debugging session

Open Windbg and load the .dmp file into your debugging session. First, we have to determine whether the application was running as x86 or x64 (this will be important later on). You should be able to find the bitness in the output when the file is loaded. See the image below for an example (this one was running as x86).

(Screenshot: Windbg output after loading the dump, showing the process bitness)

Now we are ready to inspect the memory dump. The next step is to load the appropriate SOS.dll (the debugging extension for .NET) for the .NET runtime your application runs under. The easiest way to ensure you have the right version is to download it from the KUDU site. To find where it’s located, run the following command in Windbg - .loadby sos clr for Full Framework applications or .loadby sos coreclr for .NET Core applications. That command attempts to load sos.dll from the same location where clr.dll/coreclr.dll lived on the machine running the application. Since we aren’t on that machine, the command fails, but it outputs the path to the dll in the error, as shown below.

(Screenshot: .loadby error output showing the path to sos.dll)

Take the path from the error message and navigate to that location in the KUDU site for your application. Note that you have to press the System Drive button in the Debug Console to navigate to the root of the D:\ drive. Then you can move to the path specified in the error message. In my case it was D:\Program Files (x86)\dotnet\shared\Microsoft.NETCore.App\2.1.4. Once you are there, find the SOS.dll in that directory and download it. Finally, load that dll via the .load command - .load C:\the\downloaded\file\path\SOS.dll.

Inspect the memory dump

At this point we have a memory dump of the application and the debug extension for .net is loaded. Providing concrete next steps is tough because it depends on your problem. A good place to start is by running !sos.threadpool to output statistics about the threads in the application and the CPU percentage. !sos.threads will show the details of each thread in the application.

Another useful debug extension to consider using is MEX. Go to the MEX page and download the appropriate dll for your application. This is where the x86/x64 inspection we did earlier is useful. Once you have the right version downloaded, load it via .load C:\path\to\MEX.dll. You can see all the .net related commands by running !mex.help -cat 'DotNet' but I often find !mex.sqlcn useful for looking at open sql connections and !mex.us is helpful to automatically look at all stack traces of running threads and group similar stacks together.

Other resources

Windbg commands are cryptic and difficult to remember so I have the best luck googling for my problem with “windbg” in the query to find blog posts that talk about how to debug my problem. Some especially useful articles I find myself going back to again and again are below. Good luck!

Other useful Windbg posts

Full Framework WSFederation to OWIN Conversion

14 May 2018

If you have been using WSFederation in a .NET web application for more than a year or two, chances are it is configured using the Microsoft.IdentityModel.Web or System.IdentityModel.Services libraries. Two HTTP modules, WSFederationAuthenticationModule and SessionAuthenticationModule, are added to the application to handle the WSFederation protocol, and configuration is done by inheriting from those classes or by configuring them at application start via the web.config. However, in newer versions of ASP.NET, middleware is preferred, using OWIN in both Full Framework applications and .NET Core. The purpose of OWIN is to abstract the underlying web server from the web application; HttpModules are tightly coupled to System.Web and therefore to the IIS web server. Using OWIN does require some configuration and setup changes, which I will detail in this post.

Basic OWIN setup

First, if you don’t already have OWIN configured for your application install the Microsoft.Owin and Microsoft.Owin.Host.SystemWeb nuget packages. Then add a startup class like the one below to your application:

using System;
using System.Threading.Tasks;
using Microsoft.Owin;
using Owin;

[assembly: OwinStartup(typeof(OwinApp.Startup))]
namespace OwinApp
{
    public class Startup
    {
        public void Configuration(IAppBuilder app)
        {
        }
    }
}

Convert SessionAuthenticationModule into OWIN configuration

Once OWIN is installed, we can begin configuring WSFederation. Previously a SessionAuthenticationModule would have been customized to set up properties for the cookies that will store session information:

public class CustomSessionAuthenticationModule : SessionAuthenticationModule
{
  protected override void InitializePropertiesFromConfiguration()
  {
      CookieHandler.RequireSsl = true;
      CookieHandler.Name = "FederatedAuthCookie";
  }
}

and configured as a HTTP module in the web.config:

<modules>
  <add name="SessionAuthenticationModule" type="MyApp.CustomSessionAuthenticationModule, MyApp" preCondition="managedHandler" />
</modules>

In the OWIN pipeline, we’ll configure the cookie using CookieAuthentication classes and helper methods.

public class Startup
{
    public void Configuration(IAppBuilder app)
    {
        app.UseCookieAuthentication( new CookieAuthenticationOptions
        {
          // converted from the CookieHandler.Name = "FederatedAuthCookie"; line in SessionAuthenticationModule
          CookieName = "FederatedAuthCookie",
          // converted from the CookieHandler.RequireSsl = true; line in SessionAuthenticationModule
          CookieSecure = CookieSecureOption.Always
        } );
    }
}

Convert WSFederationAuthenticationModule into OWIN configuration

Next, we’ll convert our custom WSFederationAuthenticationModule to use the WsFederationAuthenticationMiddleware from the OWIN pipeline.

public class CustomWsFederationAuthenticationModule : WSFederationAuthenticationModule
{
  protected override void InitializeModule( HttpApplication context )
  {
      base.InitializeModule( context );

      RedirectingToIdentityProvider += OnRedirectingToIdentityProvider;
  }

  protected override void InitializePropertiesFromConfiguration()
  {
      Issuer = InstanceWideSettings.BaseStsUrl;
  }

  private void OnRedirectingToIdentityProvider( object sender, RedirectingToIdentityProviderEventArgs args )
  {
      // setting the realm in the OnRedirecting event allows it to be dynamic for multi-tenant applications
      args.SignInRequestMessage.Realm = Settings.BaseUrl;
  }
}

The code above can be removed and replaced with the UseWsFederationAuthentication helper below:

public void Configuration(IAppBuilder app)
{
   ...
   app.UseWsFederationAuthentication( new WsFederationAuthenticationOptions
   {
     // Pulls in STS Url and other metadata (like signing certificates)
     MetadataAddress = Settings.StsMetadataUrl,
     Notifications = new WsFederationAuthenticationNotifications
     {
         // replaces the OnRedirectingToIdentityProvider event
         RedirectToIdentityProvider = notification =>
         {
           notification.ProtocolMessage.Wtrealm = Settings.PresentationUrlRoot;
           return Task.FromResult( 0 );
         }
     },
     // Name this authentication type (for WIF)
     AuthenticationType = WsFederationAuthenticationDefaults.AuthenticationType,
     // Tells the pipeline to use the cookie authentication we configured above to store the WIF session
     SignInAsAuthenticationType = CookieAuthenticationDefaults.AuthenticationType
   } );
}

Move Global.asax.cs WSFederation configuration into OWIN configuration

Now that we’ve converted the two WSFederation HttpModules, we can finish configuring the OWIN pipeline by converting the WSFederation configuration that lived in the web.config or was set up on application start. In my case, I preferred to set up WSFederation in code using the FederationConfigurationCreated event, like the code below:

FederatedAuthentication.FederationConfigurationCreated += ( sender, args ) =>
{
  args.FederationConfiguration.IdentityConfiguration.AudienceRestriction.AudienceMode = System.IdentityModel.Selectors.AudienceUriMode.Always;

  // this method loads the list of relying parties for a multi-tenant application.
  List<string> relyingParties = GetRelyingParties();
  relyingParties.ForEach( rp => args.FederationConfiguration.IdentityConfiguration.AudienceRestriction.AllowedAudienceUris.Add( new Uri( rp ) ) );

  // This code loads the metadata url, parses it and updates the configuration with details from it like the signing certificates
  args.FederationConfiguration.IdentityConfiguration.IssuerNameRegistry = new CustomMetadataParser( Settings.StsMetadataUrl );
};

The items configured above can be added to the UseWsFederationAuthentication configuration:

public void Configuration(IAppBuilder app)
{
   ...
   app.UseWsFederationAuthentication( new WsFederationAuthenticationOptions
   {
     ...
     TokenValidationParameters = new TokenValidationParameters()
     {
         // this replaces the IdentityConfiguration.AudienceRestriction setup
         ValidAudiences = GetRelyingParties(),
         ValidateAudience = true
     },
     // Pulls in STS Url and other metadata (like signing certificates) so we don't have to do custom metadata parsing
     MetadataAddress = Settings.StsMetadataUrl,
     ...
   } );
}

Additionally, if you wanted access to WSFederation events, you could declare specially named methods on your HttpApplication class in the Global.asax.cs file, and those would be invoked while the WSFederation protocol was executing. Two examples that I’ve used are shown below:

void WSFederationAuthenticationModule_SessionSecurityTokenCreated( object sender, SessionSecurityTokenCreatedEventArgs e )
{
   // extend the expiration of the session cookie to make it last 1 year
   DateTime now = DateTime.UtcNow;
   TimeSpan expiration = TimeSpan.FromDays( 365 );
   e.SessionToken = new SessionSecurityToken( e.SessionToken.ClaimsPrincipal, e.SessionToken.Context, now, now.Add( expiration ) ) { IsPersistent = true };

   e.WriteSessionCookie = true;
}

void WSFederationAuthenticationModule_RedirectingToIdentityProvider( object sender, RedirectingToIdentityProviderEventArgs e )
{
   // add client id parameter to outgoing wsfederation request
   e.SignInRequestMessage.Parameters.Add( "client_id", Settings.ClientId );
}

Again, these items can be replicated in the UseWsFederationAuthentication configuration:

public void Configuration(IAppBuilder app)
{
   ...
   app.UseWsFederationAuthentication( new WsFederationAuthenticationOptions
   {
     ...
     Notifications = new WsFederationAuthenticationNotifications
     {
         // replaces the WSFederationAuthenticationModule_RedirectingToIdentityProvider method
         RedirectToIdentityProvider = notification =>
         {
           notification.ProtocolMessage.Parameters.Add( "client_id", Settings.ClientId );
           return Task.FromResult( 0 );
         },
         // replaces the WSFederationAuthenticationModule_SessionSecurityTokenCreated method
         SecurityTokenValidated = notification =>
         {
           AuthenticationTicket ticket = notification.AuthenticationTicket;
           var newAuthenticationProperties = new AuthenticationProperties( ticket.Properties.Dictionary );

           // extend the expiration of the session to make it last 1 year
           DateTime now = DateTime.UtcNow;
           TimeSpan expiration = TimeSpan.FromDays( 365 );

           newAuthenticationProperties.IssuedUtc = now;
           newAuthenticationProperties.ExpiresUtc = now.Add( expiration );
           newAuthenticationProperties.IsPersistent = true;

           notification.AuthenticationTicket = new AuthenticationTicket( ticket.Identity, newAuthenticationProperties );
           return Task.FromResult( 0 );
         }
     },
     ...
   } );
}

Wrap up

At this point, all old WSFederation code is replaced and WSFederation actions are handled using the OWIN pipeline. One thing to note - we are not able to re-use existing sessions so existing user sessions will be invalidated by this code change. Once the user logs in again at the STS they’ll be issued a new cookie that will work with the OWIN pipeline cookie authentication code.

Setting up Code Analysis in Visual Studio 2017 projects

04 May 2018

In older versions of Visual Studio, FXCop was the standard for static analysis in .NET. In Visual Studio 2017, along with the release of the Roslyn compiler, the landscape is different. Static analysis is no longer something you install on the machine and configure in the project; it’s delivered via NuGet packages. In this post, I’ll lay out how I suggest setting up static analysis for C# projects. The setup should work equally well for Full Framework and .NET Core.

Which analyzers to use

The first question to answer is which code analyzers to use for your project. Previously the only real option was FXCop, but now you can google for “Roslyn analyzers” and find a plethora of options. I typically stick to the stock Microsoft options, but there are plenty of third-party options too, like StyleCop.

In this example, I’ll use the Microsoft.CodeAnalysis.FxCopAnalyzers, which is a meta package of four other analyzer packages:

  • Microsoft.CodeQuality.Analyzers
    • The bulk of the “classic” FXCop errors are here. For instance, implementing IDisposable properly and passing URIs instead of strings are both checked in this package.
  • Microsoft.NetCore.Analyzers
    • .NET Core specific warnings/errors appear here, but many are more generic, like requiring a CultureInfo/IFormatProvider to be passed to methods that accept one (a small example follows this list).
  • Microsoft.NetFramework.Analyzers
    • Full framework .net warnings are checked here like handling ISerializable correctly.
  • Text.Analyzers
    • This package provides some basic spell checking (disabled by default).
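
For a sense of what these rules look like in practice, here is a small example of mine (not from the analyzer documentation) of the culture-sensitivity rule mentioned above: parsing without an explicit culture is flagged, while passing one resolves the warning.

using System.Globalization;

public static class CultureExample
{
   public static decimal ParsePrice( string price )
   {
      // Flagged by the analyzers: relies on the current thread culture,
      // so "1.50" may parse differently (or fail) depending on regional settings.
      // decimal flagged = decimal.Parse( price );

      // Preferred: state the culture explicitly.
      return decimal.Parse( price, CultureInfo.InvariantCulture );
   }
}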

Install and configure analyzers

To get started, install the Microsoft.CodeAnalysis.FxCopAnalyzers NuGet package into all projects in your solution. This by itself will generate warnings and run the code analysis using the default ruleset. I want to take it one step further and use shared rulesets for all projects in the solution.

Add shared rulesets

To utilize shared rulesets, edit the csproj files to reference a shared analyzer ruleset. This file will be used to configure which rules are enabled/disabled in your solution. I prefer to have two ruleset files, one for production source code and one for tests so that I can be flexible on the rules I use to analyze test code.

  1. Edit the projects by adding the following lines to the csproj. They can go anywhere but I typically add them underneath the TargetFramework/RootNamespace property group. I use a tools folder at the root of my git repository but you can put the ruleset file anywhere.

     <PropertyGroup>
       <CodeAnalysisRuleSet>..\..\tools\Source.ruleset</CodeAnalysisRuleSet>
     </PropertyGroup>
    
  2. Add a ruleset file in the location specified above. The exact contents of the ruleset file will vary but if you use FxCop analyzers a good place to start is with this default file - https://gist.github.com/dontjee/4a151dea7bc1169f9dd051da70bec35e. It enables some of the most important rules as warnings.

  3. Repeat the process above for test projects, or any other projects you want to use different rulesets, using a different ruleset file.

     <PropertyGroup>
       <CodeAnalysisRuleSet>..\..\tools\Tests.ruleset</CodeAnalysisRuleSet>
     </PropertyGroup>
    

Build and fix/ignore warnings

Now that we’ve configured the ruleset files, the next step is to rebuild the solution and fix or ignore any warnings that pop up. One rule that I often disable is CA2007 - Do not directly await a Task without calling ConfigureAwait. The rule makes sense when building libraries to be consumed by other projects, but when building applications it isn’t necessary. A small example of what the rule flags is shown below, followed by the steps to disable it.
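
The snippet below is a minimal illustration of mine (not taken from the analyzer documentation) of the kind of await CA2007 complains about, with the library-style fix shown in a comment:

using System.Net.Http;
using System.Threading.Tasks;

public static class ConfigureAwaitExample
{
   private static readonly HttpClient _client = new HttpClient();

   public static async Task<string> DownloadAsync( string url )
   {
      // CA2007 flags this await because ConfigureAwait is never called on the task.
      // In library code the fix is:
      //    return await _client.GetStringAsync( url ).ConfigureAwait( false );
      // In application code the warning can usually be suppressed.
      return await _client.GetStringAsync( url );
   }
}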

  1. Find the rule in the analyzer list under Dependencies->Analyzers->[The-Analyzer-Name]. In the case of CA2007, the analyzer name is Microsoft.CodeQuality.Analyzers.
  2. Under that analyzer, find the rule you want to disable, right-click it, and set the Rule Set Severity to None.

    This adds the following block to the corresponding ruleset file to disable the rule:

     <Rules AnalyzerId="Microsoft.CodeQuality.Analyzers" RuleNamespace="Microsoft.CodeQuality.Analyzers">
       <Rule Id="CA2007" Action="None" />
     </Rules>
    

Repeat the process above for all rules you wish to disable or fix the warnings that show up. Once that’s done, the code analysis setup is complete for your solution.

Extra Credit - set up builds to fail on analyzer warnings

Now that you have a clean build with no warnings I suggest configuring the continuous build (I hope you have one!) to report warnings as errors so that the build will fail if any new code analysis violations show up. To do this, add the following MSBuild property to the compile step of your build - /p:TreatWarningsAsErrors="true".

Avoid dual writes with sql server and change tracking - Part 2

20 December 2017

In my last post I suggested using the database stream writer pattern to avoid writing to multiple data stores outside of a transaction from your application. This post will detail the implementation. For my example, the application is a C# application writing to SQL Server and tracking changes using the Change Tracking feature. The data model for this example is similar to YouTube: it contains users, channels, and media. A user has one or more channels, which in turn have one or more pieces of media associated with them. The full schema is contained in this gist.

At a high level, there are 3 pieces of the architecture to consider.

  1. The primary application which will write to the SQL database. It will not deal with writes to the downstream data stores and will largely be ignored by this post.
  2. The SQL Server database which will be configured to track changes to all necessary tables.
  3. The application to monitor the change tracking stream from the database and push updates to the downstream data stores (cache, search, etc).

Database Implementation

First, change tracking must be enabled on the SQL Server database and tables. The SQL below enables change tracking on the database, configured to retain changes for seven days with automatic cleanup of expired changes, and then enables it on each of the three tables.

ALTER DATABASE BlogPostExample
SET CHANGE_TRACKING = ON  
  (CHANGE_RETENTION = 7 DAYS, AUTO_CLEANUP = ON)  

ALTER TABLE dbo.UserAccount
ENABLE CHANGE_TRACKING  
WITH (TRACK_COLUMNS_UPDATED = ON)

ALTER TABLE dbo.Channel
ENABLE CHANGE_TRACKING  
WITH (TRACK_COLUMNS_UPDATED = ON)

ALTER TABLE dbo.Media
ENABLE CHANGE_TRACKING  
WITH (TRACK_COLUMNS_UPDATED = ON)

One final table is required to track the current position in the change tracking stream our monitor has consumed. The following SQL will create the table and initialize it with the minimum change tracking version currently in the database:

CREATE TABLE [dbo].[CacheChangeTrackingHistory] (
   [CacheChangeTrackingHistoryId]               INT   IDENTITY (1, 1) NOT NULL,
   [TableName]                             NVARCHAR (512)   NOT NULL,
   [LastSynchronizationVersion]            BIGINT   NOT NULL
);
ALTER TABLE [dbo].[CacheChangeTrackingHistory]
   ADD CONSTRAINT [PK_CacheChangeTrackingHistory] PRIMARY KEY CLUSTERED ([CacheChangeTrackingHistoryId] ASC);

-- Add default values for last sync version
INSERT INTO dbo.CacheChangeTrackingHistory( TableName, LastSynchronizationVersion )
VALUES ('dbo.UserAccount', CHANGE_TRACKING_MIN_VALID_VERSION(Object_ID('dbo.UserAccount')))
INSERT INTO dbo.CacheChangeTrackingHistory( TableName, LastSynchronizationVersion )
VALUES ('dbo.Channel', CHANGE_TRACKING_MIN_VALID_VERSION(Object_ID('dbo.Channel')))
INSERT INTO dbo.CacheChangeTrackingHistory( TableName, LastSynchronizationVersion )
VALUES ('dbo.Media', CHANGE_TRACKING_MIN_VALID_VERSION(Object_ID('dbo.Media')))

Change Monitor Implementation

With the database properly configured, we can start on the application that will consume the change tracking stream from SQL Server. The changes can be accessed via the CHANGETABLE( CHANGES <TABLE_NAME> ) function. I will focus on UserAccount changes, but the code applies equally to the Channel and Media tables. When our monitor application starts, a loop is started to process change tracking updates and push them to downstream data stores. In this case the only downstream data store is a cache, represented by the ICache interface (a sketch of the interface follows the monitor code below). If we had multiple downstream systems to update, the application would start one monitoring loop, with a distinct change tracking history table, for each system.

using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Dapper; // assumed: QueryAsync<T> below matches Dapper's IDbConnection extension method

public static class ChangeTracker
{
    // Connection string for the monitored database; populated from configuration at startup.
    private static string _connectionString;

    internal static async Task StartChangeTrackingMonitorLoopAsync( CancellationToken token, ICache userAccountCache )
    {
       while ( true )
       {
          if ( token.IsCancellationRequested )
          {
             break;
          }

          using ( ChangeTrackingBatch<UserAccountChangeModel> userAccountChangesBatch = GetLatestUserChanges() )
          {
             UserAccountChangeModel[] userAccountChanges = ( await userAccountChangesBatch.GetItemsAsync() ).ToArray();
              foreach( var userAccount in userAccountChanges )
              {
                userAccountCache.UpdateObject( "user_account", userAccount.OperationType, userAccount.UserAccountId, userAccount );
              }
              userAccountChangesBatch.Commit();
          }

          await Task.Delay( 1000 );
       }
    }

    private static ChangeTrackingBatch<UserAccountChangeModel> GetLatestUserChanges()
    {
         string cmd = @"
DECLARE @last_synchronization_version BIGINT = (SELECT LastSynchronizationVersion FROM dbo.CacheChangeTrackingHistory WHERE TableName = 'dbo.UserAccount')

DECLARE @current_synchronization_version BIGINT = CHANGE_TRACKING_CURRENT_VERSION();
SELECT ct.UserAccountId, ua.Email, ua.DisplayName, ua.CreateDate
		, CASE WHEN ct.SYS_CHANGE_OPERATION = 'I' THEN 'Insert' WHEN ct.SYS_CHANGE_OPERATION = 'U' THEN 'Update' ELSE 'Delete' END AS OperationType
FROM dbo.UserAccount AS ua
	RIGHT OUTER JOIN CHANGETABLE(CHANGES dbo.UserAccount, @last_synchronization_version) AS ct ON ua.UserAccountId = ct.UserAccountId

UPDATE dbo.CacheChangeTrackingHistory
SET LastSynchronizationVersion = @current_synchronization_version
WHERE TableName = 'dbo.UserAccount'
";
         return new ChangeTrackingBatch<UserAccountChangeModel>( _connectionString, cmd );
    }
}
internal class ChangeTrackingBatch<T> : IDisposable
{
  private readonly string _command;
  private SqlTransaction _transaction;
  private IEnumerable<T> _items;
  private SqlConnection _connection;
  private readonly object _param;

  public ChangeTrackingBatch( string connectionString, string command, object param = null )
  {
     _connection = new SqlConnection( connectionString );
     _command = command;
     _param = param;
  }

  public async Task<IEnumerable<T>> GetItemsAsync( )
  {
     if ( _items != null )
     {
        return _items;
     }

     _connection.Open();
     _transaction = _connection.BeginTransaction( System.Data.IsolationLevel.Snapshot );
     _items = await _connection.QueryAsync<T>( _command, _param, _transaction );
     return _items;
  }

  public void Commit()
  {
     _transaction?.Commit();
     _connection?.Close();
  }

  public void Dispose()
  {
     Dispose( true );
     GC.SuppressFinalize( this );
  }

  protected virtual void Dispose( bool disposing )
  {
     if ( disposing )
     {
        _transaction?.Dispose();

        _connection?.Dispose();
     }
  }
}
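
The ICache interface and UserAccountChangeModel type referenced above aren’t defined in the post; a minimal sketch consistent with how the monitor loop uses them might look like this:

using System;

// Hypothetical change model - the columns mirror the SELECT in GetLatestUserChanges
// plus the change tracking operation type.
public class UserAccountChangeModel
{
   public int UserAccountId { get; set; }
   public string Email { get; set; }
   public string DisplayName { get; set; }
   public DateTime CreateDate { get; set; }
   public string OperationType { get; set; } // "Insert", "Update" or "Delete"
}

// Hypothetical downstream cache abstraction consumed by the monitor loop.
public interface ICache
{
   void UpdateObject( string objectType, string operationType, int id, object value );
}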

One additional thing to note is the use of snapshot isolation (IsolationLevel.Snapshot) for the database transaction. This makes sure we’re working with a consistent view of the database and prevents collisions with the change tracking cleanup process. Note that snapshot isolation must be enabled on the database (ALLOW_SNAPSHOT_ISOLATION ON) for this to work.

Wrap Up

At this point the solution is complete. Any updates to the UserAccount table will be tracked and pushed into the cache by the change tracker class. If the update to the cache fails, the transaction is rolled back (it is disposed without being committed), so the LastSynchronizationVersion bookmark is never advanced. The monitor application will then retry applying the changes in order until they are pushed into the cache successfully or are cleaned up per the change tracking retention settings.

This solution is tied to the scalability of SQL Server, so for a write-heavy application a different architecture may be necessary. For example, SQL Server could be replaced by a log stream like Kafka. However, for moderate-scale applications this architecture will be more than adequate to handle the load. We’ve solved the resiliency and race condition issues from the dual write scenario by ensuring that any successful database write will be pushed to downstream systems, in order. We’ve also improved the overall architecture of the primary application by removing the writes to secondary data stores. Plus we haven’t introduced any new data stores to learn and manage. For reference, the full application source code is available on GitHub.

