Showing posts with label algorithms. Show all posts
Showing posts with label algorithms. Show all posts

Wednesday, 29 June 2016

Calling C# Methods from SQL Server

I was recently able to take advantage of a useful Microsoft SQL Server feature that enabled me to call a .NET library class method from a stored procedure.

Brief Context


I was working on a client project in which our team were required to write an ETL process. The ETL process was mostly written using SQL Server Integration Services (SSIS). In one part of the ETL process, we needed to do some fuzzy string similarity matching. We found that the Levenshtein string distance algorithm suited our requirements.

Aside: The Levenshtein algorithm takes two strings and computes the "edit distance" which indicates how many transformations are necessary to reach a given target string from a given source string. For example, the function Levenshtein("test", "tear") would return the value two.

In our case, we wanted to abstract the Levenshtein output into a percentage, so for example, the function GetTextSimilarity("test", "tear") would return 50% - indicating that the inputs are half similar.

Creating the Class Library


The full code for the class library is available on GitHub.

The first step is to create a .NET class library which targets a .NET framework version that the flavour of SQL Server you are working with supports. In my case, we created a C# class library targeting .NET framework 4.5 as we were going to use the class library in Microsoft SQL Server 2014.

Ensure that the method you want to call from SQL Server is marked static. You will need to decorate the method with the Microsoft.SqlServer.Server.SqlProcedure attribute. If you look at my class library code, you'll find two classes, namely, LevenshteinDistance and StringCompare. The LevenshteinDistance class contains the algorithm implementation in a static method marked with the SqlProcedure attribute. Similarly, the StringCompare method (shown below) contains one method, GetTextSimilarity, which is the method we want to call as a function from SQL server.

The GetTextSimilarity method calls the LevenshteinDistance.GetLevenshteinDistance method and returns the percentage similarity value that we required. Note that only the method you want to call as a function requires the SqlProcedure attribute. In this case, we could have only marked the GetTextSimilarity method with this attribute as this is the only method we want to call from SQL Server.

Registering the Class Library in SQL Server


Once you've finished writing the class library and you are ready to call your method(s) from SQL Server, you'll need to execute the following SQL script (obviously, you'll need to replace the database name, assembly file path, function name and method namespace).


Once the script has finished executing, you will find the registered assembly under the Database\Programmability\Assemblies node in SQL Server Management Studio.
Image

If you look under Database\Programmability\Functions, you'll find the method you registered earlier (see SQL script).

Image

You can now call your method as a standard T-SQL function call. Quite useful!

Image

Tuesday, 25 August 2015

Conway's Game of Life: An Exercise in WPF, MVVM and C#

This blog post was written for the Lockheed Martin Insight blog, sharing here for the external audience.

Last month I started the Pluralsight summer camp by watching an interesting video in which the presenter implements Conway's Game of Life using HTML, CSS and JavaScript. The video inspired me to write an MVVM based implementation of the game using Windows Presentation Foundation (WPF) and C#. There were two aims to this exercise. The primary was for me to reacquaint myself with WPF, which is a framework I've been wanting to spend some time with for a while. And the secondary was to implement something fun.

The purpose of this blog post is to document the exercise. We'll cover what Conway's Game of Life is, a bit about the design and then see how some WPF and .NET framework features help in implementing an MVVM solution. The source code for the implementation is available on GitHub.

So, what is Conway’s Game of Life?

Image

The Game of Life is a popular cell automaton algorithm that simulates the evolution of life. The algorithm was designed by British mathematician John Conway and has gained popularity for its simplicity. In its purest form, the universe for the Game of Life is represented by a grid which has an infinite number of cells. Each grid cell can contain a dead or living cell. The game starts with the user providing a seed generation of living cells. Based on the initial generation of cells, the universe evolves through generations to simulate evolution. For a generation to evolve, the following four rules are applied to the current generation:

  1. Any live cell with fewer than two live neighbours dies (due to under-population).
  2. Any live cell with two or three live neighbours lives on to the next generation.
  3. Any live cell with more than three live neighbours dies, (due to overcrowding).
  4. Any dead cell with exactly three live neighbours becomes a live cell (by reproduction).

And that's it. By applying these four rules, it is surprising to see how a universe can evolve to create complex symmetrical patterns, even when there is no symmetry or pattern in the seed generation.

Arriving at an Object-Oriented Design

The MVVM architectural design pattern was adopted to achieve a clear separation of concerns. I started by identifying the models:

  • Cell Model: Represents a single cell in the universe. Encapsulates data for the cell's position and whether it is dead or alive.
  • Generation Model: Represents a generation of life. Encapsulates data on the size of the universe, the cells within the universe and provides functionality to alter the living state of cells.

The EvolutionEngine class was identified to hold the current generation and take on the responsibilities of:

  • Evolving the current generation to the next generation, as per the rules of evolution.
  • Detecting if the current generation can no longer evolve.
  • Keeping track of the number of generations that the current generation has evolved through.
  • Resetting the generation back to an empty state.

A single view-model was identified, namely, GenerationViewModel. The view-model uses the EvolutionEngine class to:

  • React to actions that originate from the view.
  • Provide information for the view to display.
  • Provide notifications to the view if the observed information changes.

In summary, we have a couple of models, an engine class to work with the models and a view-model to keep the view in sync with the models.

Writing the MVVM WPF Implementation

The following diagram illustrates how the components of an MVVM solution are related in WPF (image source).

Image

A view is built using the eXtensible Application Markup Language (XAML). The association between a view and its view-model is accomplished through the DataContext property of the view. Controls in a view are bound to properties in the view-model using the data binding features of WPF. In a scenario where the view-model is only serving as a proxy between the view and model, you may consider the use of a view-model superfluous and let the view-model expose the underlying model (see the "Why Use a ViewModel?" section in an MSDN article). Actions performed on the user interface are encapsulated in commands that are exposed by the view-model (using the Command design pattern). We will see how a view-model exposes its supported commands using the ICommand interface. Finally, notifications are raised as C# events from the view-model. The INotifyPropertyChanged interface is used primarily for this purpose.

As I’ve alluded to, there are a number of useful features in WPF and .NET that assist in the implementation of an MVVM solution. The particular features that I’m referring to are:

  • The FrameworkElement.DataContext property.
  • Using the Binding class to associate parts of the view to the view-model.
  • Raising notifications through the INotifyPropertyChanged interface.
  • Converting incompatible data between bound properties using the IValueConverter interface.
  • Encapsulating and executing actions using the ICommand interface.

Each of these features were used in the implementation for the game. In the sections below, we will discuss what each of these features are and see examples of how they were adopted in the solution.

Setting the Data Context and specifying a Binding

The Grid control was a natural choice to visually represent the universe. Each cell of life within the universe is represented by a TextBlock control. The code snippet below shows how the TextBlock is created:

In WPF, user interface controls derive from the FrameworkElement class which contains an important property named DataContext. The DataContext property of a control is set to an object that the control uses to pull its data from and/or push its data to. This object is typically a view-model or model instance. By setting the TextBlock.DataContext property to an instance of the Cell model we can bind properties of the TextBlock control to properties of the Cell model.

Aside: As a WPF view is hierarchical, a nice feature of using the DataContext property is that it can be inherited from a parent user interface control.

In our Game of Life implementation, we use the Background (colour) property of the TextBlock control to indicate whether the cell is dead or alive. On lines 7-10 of the CreateCellTextBlock method, you’ll see that we call the SetBinding method. The first parameter to this method specifies which property we want to set the binding for (made possible with Dependency Properties) and the second parameter accepts a Binding instance.

The Binding instance is created by the following method:

There are a few things to note here. Firstly, the Binding object supports a Source property which we do not explicitly set. The reason for this is that if no Source property is set on a binding, the binding defaults to using the DataContext property on the control that the binding is set on. In this case, the TextBlock control’s DataContext property is already set to a Cell model instance. We can therefore set the Path property to point to a particular property within the model, in this case the boolean Cell.Alive property.

The Mode property is set to the enumeration value of BindingMode.TwoWay, meaning that any change to the TextBlock control through the user interface updates the model and conversely any change to the model updates the user interface. To support two-way binding, the Cell model will need to implement the INotifyPropertyChanged interface so that any change to the Alive property is notified to the view.

Implementing Notifications with INotifyPropertyChanged

The INotifyPropertyChanged interface is implemented by classes that need to notify clients (in our case, the view) that an observed property value has changed. Typically, this interface is implemented by view-models or models that want to broadcast changes to certain property values.

INotifyPropertyChanged supports one member, the PropertyChanged event. If you have a two-way binding setup between a user interface control property and a view-model property, programmatic changes to the view-model property will not be reflected in the user interface unless the view-model implements INotifyPropertyChanged and the observed property’s setter raises the PropertyChanged event.

In our solution, we wrote a reusable ObservableBase class that implements INotifyPropertyChanged. The Cell and GenerationViewModel classes both inherit from ObservableBase.

The PropertyChanged event is raised from the OnPropertyChanged method which accepts a single parameter – the name of the property that has changed. The CallerMemberName attribute is used so that we can call this method without having to specify a hardcoded string with the property name. The code below shows the implementation for the Cell model, which inherits from ObservableBase.

Notice how the Alive property setter calls the OnPropertyChanged event. An unfortunate side effect of implementing INotifyPropertyChanged is that you’re unable to use automatic properties for the properties that need to raise the notification event.

Converting Between Bound Properties Using IValueConverter

In our earlier code snippet for the CreateCellAliveBinding method, you may have noticed that we also set the Binding.Converter property. The Converter property is used when we bind two properties that have a type mismatch or expect data in different formats. To recap, we are using the Background (colour) property of the TextBlock to indicate if the cell represented by the TextBlock is dead or alive. The Alive property on our Cell model is of boolean type, therefore, we need our binding to convert the boolean value to a colour. This is where the IValueConverter interface is used to take on the role of an adapter.

The implementation of the IValueConverter is in the class LifeToColourConverter, shown below:

The IValueConverter interface supports two methods, namely, Convert and ConvertBack. The Convert method is used when binding the value from the source property (Cell.Alive) to the target property (TextBlock.Background). Conversely, the ConvertBack method is used when updating the source from the target. If you are only using one-way binding, then you are only required to provide an implementation for the Convert method (the ConvertBack method can safely throw the NotImplementedException).

Handling User Interface Actions Using the Command Design Pattern

A view-model exposes a set of actions that are encapsulated within commands. Commands are exposed as properties on the view-model so that user interface controls can bind to them. The logic for a command is encapsulated within a class that implements ICommand. The ICommand interface supports two methods; CanExecute and Execute. The CanExecute method returns a boolean value which determines whether the command can execute. The Execute method contains the command logic to run.

In our solution, the view's DataContext is set to an instance of the GenerationViewModel class. In this view-model, we expose the following three commands:

RelayCommand is a popular implementation of the ICommand interface which has been available as open source for some while. We are using a particular variant of RelayCommand which enables a single parameter to be sent to the method that ultimately executes the command. You can view how the three command properties are initialised and the implementations for the CanExecute/Execute methods in the GenerationViewModel class.

An advantage of the RelayCommand implementation for ICommand is that it achieves reusability by using the Predicate and Action delegates to reference the CanExecute and Execute methods respectively. This avoids the need for writing a separate ICommand implementation for every command that the game supports.

In the user interface for the game, we have two buttons. One button binds to the view-model EvolveCommand property and the other binds to the ResetCommand property. The XAML snippet below shows how the command bindings are created:

If the ICommand.CanExecute method returns false for a command that is bound to a button, the button is automatically disabled.

The ToggleCellLifeCommand is executed when the user is setting up the initial generation of living cells, the binding for this command is setup programmatically by the following method (called by the CreateCellTextBlock method shown earlier):

We are using the InputBinding class to specify that a left mouse click on the TextBlock should execute the ToggleCellLifeCommand and pass it a formatted string that contains the cell position. The ToggleCellLifeCommand method implementation in the GenerationViewModel class looks up the cell and toggles the Alive property boolean value. As we have an active binding between each TextBlock and its Cell.Alive property, the TextBlock colour will automatically change to a colour that represents the current value for the Alive property.

Summary

In this post, we’ve introduced and described some of the main features that WPF and the .NET framework provide for implementing MVVM based solutions. Conway’s Game of Life served as a fun exercise to use these features with.

We saw how the FrameworkElement.DataContext property enables seamless binding between user interface controls and properties on view-models. Two-way binding was shown to be possible using the INotifyPropertyChanged interface as a mechanism to notify the view of changes in the view-model. We introduced a useful ObservableBase class which implements INotifyPropertyChanged and can be reused in future WPF projects that need to support two-way binding. To bind two properties that work with different data types, we saw how an implementation of the IValueConverter interface can be used as an adapter. Finally, we saw how the command pattern is used to encapsulate logic in an ICommand implementation and how commands are exposed as properties on a view-model that can be bound-to in the view.

The animated graphic below shows the finished game. In the animation, we see the evolution for a simple L-shaped seed pattern in a universe of 25x25 cells. Surprisingly, this basic seed pattern lives through eighteen generations of life with some fascinating symmetrical patterns before it stops evolving.

Image

Full source code is available in a GitHub repository.

Sunday, 7 July 2013

C# How To: Implement the Soundex Algorithm

I caught the end of a conversation about the Soundex algorithm at work the other day which inspired me to write an implementation of it in C#. If you are not familiar with what Soundex is then the Wikipedia article on Soundex is a good place to start. I first came across this algorithm in a Natural Language Processing module during my university education. In a nutshell, when the Soundex algorithm is applied to a word, a Soundex Code is produced as output. Words that differ in spelling but sound the same (homophones) should produce the same Soundex Codes. For instance, "to" and "two" are spelt differently, but sound the same and therefore produce the same Soundex Code of "T000".

This is a useful little algorithm and I particularly like it for its simplicity and the fact that the heuristics used in the algorithm work well in most cases (one limitation of Soundex is that it falls short of covering words that sound the same but have a different first letter, e.g. "site" and "cite" produce different Soundex codes). Soundex is useful when writing search functionality where you want to account for misspellings in the users query. It's worth pointing out that SQL Server natively supports Soundex (see the Soundex function in T-SQL, for example).

My C# implementation is below - I opted to implement it in a static class that exposes one public method "For". The example source code is available on GitHub - https://github.com/rsingh85/SoundexExample

public static class Soundex
{
    public static string For(string word)
    {
        const int MaxSoundexCodeLength = 4;

        var soundexCode = new StringBuilder();
        var previousWasHOrW = false;

        word = Regex.Replace(
            word == null ? string.Empty : word.ToUpper(),
                @"[^\w\s]",
                    string.Empty);

        if (string.IsNullOrEmpty(word))
            return string.Empty.PadRight(MaxSoundexCodeLength, '0');

        soundexCode.Append(word.First());

        for (var i = 1; i < word.Length; i++)
        {
            var numberCharForCurrentLetter =
                GetCharNumberForLetter(word[i]);

            if (i == 1 &&
                    numberCharForCurrentLetter ==
                        GetCharNumberForLetter(soundexCode[0]))
                continue;

            if (soundexCode.Length > 2 && previousWasHOrW &&
                    numberCharForCurrentLetter ==
                        soundexCode[soundexCode.Length - 2])
                continue;

            if (soundexCode.Length > 0 &&
                    numberCharForCurrentLetter ==
                        soundexCode[soundexCode.Length - 1])
                continue;

            soundexCode.Append(numberCharForCurrentLetter);

            previousWasHOrW = "HW".Contains(word[i]);
        }

        return soundexCode
                .Replace("0"string.Empty)
                    .ToString()
                        .PadRight(MaxSoundexCodeLength, '0')
                            .Substring(0, MaxSoundexCodeLength);
    }

    private static char GetCharNumberForLetter(char letter)
    {
        if ("BFPV".Contains(letter)) return '1';
        if ("CGJKQSXZ".Contains(letter)) return '2';
        if ("DT".Contains(letter)) return '3';
        if ('L' == letter) return '4';
        if ("MN".Contains(letter)) return '5';
        if ('R' == letter) return '6';

        return '0';
    }
}
Example:

// Both lines below output R100
Console.WriteLine(Soundex.For("Ravi"));
Console.WriteLine(Soundex.For("Ravee"));

Friday, 5 October 2012

C# How To: Convert Integer to Base Binary, Octal, Denary or Hexadecimal

A few weeks ago I had a requirement to convert an Int32 value (base 10 or decimal) to its binary representation as a string. I encapsulated the logic to convert the decimal number to a binary string in a method. I used the base 10 to base 2 algorithm where the decimal number is continiously divided by two until it becomes zero. Within the iteration, the modulo two of the number is then appended to the output binary string. At the time of writing that code, it seemed like the best way to go about it, and because it was neatly encapsulated in a method, I knew I could swap out the implementation at a later date without much hassle.

Today, I came across an overload of the Convert.ToString method which surprisingly accepted a parameter called toBase. I wish I had come across this earlier... The overload accepts the bases 2, 8, 10 or 16 and throws an ArgumentException if one of these are not passed in. Examples are below:

Console.WriteLine("{0}", Convert.ToString(10,2));  // Outputs "1010
Console.WriteLine("{0}", Convert.ToString(10,8));  // Outputs "12"
Console.WriteLine("{0}", Convert.ToString(10,10)); // Outputs "10"
Console.WriteLine("{0}", Convert.ToString(10,16)); // Outputs "a"
Console.WriteLine("{0}", Convert.ToString(10,3));  // Throws ArgumentException

Sunday, 15 July 2012

Implementing Convert.ToInt32(string)

Earlier today I used the static helper method System.Convert.ToInt32(string). I thought it would be interesting to have a go at writing my own version of this method.
public static int ConvertToInt32(string integerAsString)
{
    var result = 0;
    var positiveSigned = integerAsString[0] == '+';
    var negativeSigned = integerAsString[0] == '-';

    for (var i = (positiveSigned || negativeSigned) ? 1 : 0; 
        i < integerAsString.Length; 
        i++)
    {
        var currentCharAsInt = (int) integerAsString[i] - (int) '0';

        if (currentCharAsInt < 0 || currentCharAsInt > 9)
            throw new ArgumentException(
                string.Format("The String value '{0}' " + 
                    "is not in a recognisable format.", 
                        integerAsString));

        result *= 10;
        result += currentCharAsInt;
    }

    return (negativeSigned) ? -result : result;
}