Facets of Immutability

Immutability, the cornerstone of functional programming, has many facets.

Not every (mainstream) language supports all the facets; at least not per what each facet stands for. That’s what I will talk about today. The various facets of immutability from a theoretical perspective, and briefly show how some of the mainstream languages have adopted and support these facets in their own way.

Continue reading Facets of Immutability

Writing sonnets in C++

Recently, I came across this post – Write a URL in a C++ program, one of those C’s tricks. If you have not already read the post, I will wait until you read and return …

waitingWaiting …

The crux of the trick is the protocol part of the URL – http:, becomes a goto label and the rest of the URL starting with // becomes a comment. Sweet 🤗

My turn My turn …

Continue reading Writing sonnets in C++

Publishing C++/CLI on LeanPub

I came across LeanPub a few months back. I believe it was through hanselman – blog, video or something. I liked LeanPub instantly because of a couple of reasons.

  • It supports writing in markdown and I love markdown.
  • There is a large collection and variety of books including technical books and material, some of which are free too.
  • In case you are not a professional writer, the publication process encourages you with the feeling as if you are one.

I hadn’t written/published any lengthy material in a long time except the C+/CLI Primer on CodeProject. Why not publish same, I thought, and actually published. I wasn’t even expecting any response from anyone since the material was on C++/CLI, a language that gave me the impression that I was the only one using it at the time I published on CodeProject. 😀 I am really impressed that the material topped more than 50 downloads in about three months since it was published. Heck, a couple of them even paid despite the fact that the material is free. Not only am I humbled by this encouraging gesture but I am also convinced that C++/CLI is still being pursued and will continue to live – production, academic or as a pet language. Go grab your copy of the booklet – C++/CLI Primer. It’s free!

final, const and beyond

What are your thoughts on the following piece of code?

public String someGibberishMethod() {
    int length = someMethodReturningLength();
    int sum = 0;

    for (int index = 0; index < length; ++index) {
       // some code that updates the sum variable
    }

    SomeClass someClass = new SomeClass(sum);
    int sumValueInsideSomeClass = someClass.getSumValue();
    // use someText, maybe log or something

    String someText = someClass.doSomeOperation(/*some parameters*/);
    // use someText, maybe log or something
    return someText;
}

Continue reading final, const and beyond

An Unfair World of Tuples, Anons., var and auto

It all began when I wanted to return more than one value from one of the methods. Although my attempts ended futile, it was fun exploring and musing how things could have been.

There are at least a couple of options to return multiple values from a method:-

  1. return an instance of a class that holds the values
  2. return a tuple

Continue reading An Unfair World of Tuples, Anons., var and auto

Crazy Braces – [](){}();

What does this cryptic sequence of braces mean? What programming language is it? Is it valid syntax? If there is even a weak chance of this syntax being valid, then what does it mean?

Alright, alright, alright…..It is C++. That would calm most people; with all their love (pun) for C++. Specifically, it is C++0x. Amongst many other features that we have been waiting for, C++0x gives us the power of lambdas.

The formal definition of a lambda in C++0x is as follows:-

[capture_mode] (parameters) mutable throw() -> return_type {
   body
}

So a lambda may capture one or more variables in scope by value or by reference, or it may capture none. Specifying return_type is not necessary if the type can be inferred or is void.

For instance, a std::for_each‘s functor based code could be inlined with a lambda as follows:-

std::for_each(v.begin(), v.end(), [](int x) {
   cout << x << std::endl;
});

A lambda definition could be assigned to a variable and then used or invoked later.

auto lessthan = [](int left, int right) {
    return left < right;
};

In the above code, lessthan represents a function that takes two int parameters, and returns a bool. And it can be invoked as lessthan(2, 3), which returns true. The cute thing about a lambda is that it can invoked directly right after its definition. The following code defines a lambda (which takes two ints and returns a bool) and invokes it right away.

[](int left, int right) {
   return left < right;
} (2, 3);

Coming back to our initial question. You should have guessed it by now. The sequence of braces – [](){}();, is nothing but a definition followed by a call (right away) to a lambda taking no arguments and returning nothing.

To end with a quote, C++ code is like calligraphy. In other words, it is beautiful to those who understand it, while cryptic to others.

Quiz – Where am I ?

The question is, in C++, how do detect if an object is allocated on the stack or heap.

On Windows, the stack address is in the range of 0x80000000. If the address of the variable is in this range, then you could say that the object is allocated on the stack; else it is allocated on the heap. This technique of detecting is not preferrable since it may not work on other operating systems (such as linux), and deals with the platform specific information making it a non-portable solution.

Let us try to use standard C++ stuff to solve the problem. Ok, we want to know where an object is allocated. In C++, new operator is responsible for allocating an object.

It was very thoughtful of Stroustoup to keep the object allocation and initialization\construction separate. new operator is responsible only for allocation. The compiler is responsible for woving up the code for allocation and calling the constructor. It was also very thoughtful of being able to specify the location where the object needs to be allocated, which of course does not necessarily require overloading the new operator.

C++ allows taking control of object allocation by overloading the new operator. By taking control, you would be able to detect where an object is allocated, and also keep a count of the objects allocated on the stack\heap. The following code snippet illustrates the same:-

//
// SomeClass.h
//

#include <iostream>
#include <stdlib.h>
#include <deque>
#include <algorithm>

using namespace std;

class SomeClass;
typedef std::deque<SomeClass*>      SomeClassDB;
typedef SomeClassDB::iterator    SomeClassDBIter;
typedef SomeClassDB::const_iterator SomeClassDBConstIter;

class SomeClass
{
private: static SomeClassDB heapObjects;
private: static SomeClassDB stackObjects;

private: double value;
private: bool isOnHeap;

public: SomeClass(double d) : value(d), isOnHeap(SomeClass::IsOnHeap(this))
        {
           if (!IsOnHeap())
           {
              SomeClass::stackObjects.push_back(this);
           }

           PrintLocationInfo();
        }

public: ~SomeClass()
        {
           SomeClassDBIter iter = std::find(SomeClass::heapObjects.begin(),
                    SomeClass::heapObjects.end(),
                    this);

           if (iter != SomeClass::heapObjects.end())
           {
              SomeClass::heapObjects.erase(iter);
           }
        }

public: double Value() const
        {
           return this->value;
        }

public: bool IsOnHeap() const
        {
           return this->isOnHeap;
        }

public: bool IsOnStack() const
        {
           return !IsOnHeap();
        }

public: std::string Location() const
        {
           return IsOnHeap() ? "Heap" : "Stack";
        }

public: void PrintLocationInfo() const
        {
           std::cout << Value() << " is on " << Location().c_str() << std::endl;
        }

private: static bool IsOnHeap(SomeClass* scPtr)
         {
            SomeClassDBConstIter iter = std::find(SomeClass::heapObjects.begin(), SomeClass::heapObjects.end(), scPtr);
            return iter != SomeClass::heapObjects.end();
         }

private: static bool IsOnStack(SomeClass* scPtr)
         {
            return !IsOnHeap(scPtr);
         }

public: static void* operator new(size_t size)
        {
           SomeClass* scPtr = (SomeClass*)malloc(size);
           SomeClass::heapObjects.push_back(scPtr);
           return scPtr;
        }

public: static void operator delete(void* ptr)
        {
           free(ptr);
        }

public: static size_t HeapCount()
        {
           return SomeClass::heapObjects.size();
        }

public: static size_t StackCount()
        {
           return SomeClass::stackObjects.size();
        }

public: static void PrintObjectCount()
        {
           std::cout << "OnHeap: " << HeapCount() << " .... OnStack: " << StackCount() << std::endl;
        }
};

//
// SomeClass.cpp
//

#include "SomeClass.h"

SomeClassDB SomeClass::heapObjects;
SomeClassDB SomeClass::stackObjects;

int main (int argc, char *argv[])
{
   SomeClass sc(0.123);
   SomeClass* scPtr = new SomeClass(1.234);

   SomeClass::PrintObjectCount();

   {
      SomeClass sc1(2.345);
      SomeClass::PrintObjectCount();
   }

   delete scPtr;

   SomeClass::PrintObjectCount();

   return 0;
}

You should be aware of implementing the custom delete if you have provided a custom new operator. It is logical because only a custom implementation that allocated memory for the object could possibly know how to deallocate. The above technique of overloading new\delete is portable and safe since it is standard C++. As always, writing standard C++ is fun.

But why would one care to know where an object is allocated or keep a count of objects. I don’t think it would something you would require for production purposes. It could be for development\debugging purposes; maybe you want to detect memory leaks or a general distribution of objects. You could take control of the allocation by overloading new for a particular class or for all classes by declaring a global new operator.

Thinking Currying !!!

Currying is a mathematical concept based on lambda calculus. It is a technique of operating on a function (taking multiple arguments) by splitting and capable of chaining into a series of single argument functions. It is very similar to what a human would attempt to do on paper. For example, if you have to add numbers 1 through 10, what would you do? Class II mathematics… 0 in hand, 1 in the mind, add 0 and 1, so 1 in the mind, then 2 in the hand, … up to 10. So we compute the addition with one argument at a time.

In the programming world, it is realized by transforming a n-arguments function into a (n-1) arguments function, which takes the remaining one argument. This transformation when applied recursively on each of the single argument functions is the chaining of single argument functions that I discussed earlier. Needless to say, currying is a gift of the functional programming world. In simple words, functional programming is about building functions from other functions, and so functions are treated as first class citizens (like data).

If currying is just transforming a n arguments function into a single argument function, then extension method too is an example of currying.

public static bool Replace(this IList srcList, int position, T item)
{
    // Imagine an implementation here.... 
}

So you would be using the Replace above without explicitly passing the source list to the method; one argument less.

list.Replace(2, newItem);
// instead of Replace(ilistObj, 2, newItem) if extension method was not invented. 

Isn’t that right? Yes, but that is not the currying from the conventional standpoint of functional programming.

For samples, we will not use the devil (any functional programming language) itself; instead we will use C#, which provides functional programming facilities. Alright, let us curry.

int Add(int x, int y)
{
    return x + y;
} 

Or to be functional, we may (re)write:

Func adderFunc = (x, y) => x + y;

To curry out an increment function, we would write as follows:-

Func<int, Func> Incrementer()
{
    return num => Add(num, 1);
} 

and we use it is as follows:-

var inc = Incrementer();
Console.WriteLine(inc(5));
Console.WriteLine(inc(12)); 

Is that not better and wise than writing Sum(5, 1)?

If currying is just transforming a n-arguments function into a single argument function, then we should be able to write a generic currying function.

Func<T1, Func> Curry(Func fn)
{
    return x => y => fn(x, y);
}

Func<int, Func> Adder = Curry(Add); 
var Incrementor = Adder(1);
var i = Incrementor(5); 

when we may not have actually encountered a compelling situations to use this in the past
But isn’t this all cryptic? So why would we want to do all such cryptic things when we have not encountered any such situation….in the past? Actually we have.

When simple principles are tough for us to understand, it is our grandma who helps us. Our grandma here is C++; although grandma called it binding.

In C++, many STL algorithms require a functor (roughly equivalent to a function definition) with one argument, where as the desired function was a two or more arguments function. Then we use std::bind1st or std::bind2nd to curry it into unary function.

For instance, the count\_if algorithm calculates the number of elements in a sequence that satisfy the predicate, which happens to be a unary function. Suppose we want to count the even numbers in a collection of whole numbers (and imagine this to be a tough calculation). If there was a function that could take a number and return true if it was an even number, it could be fed to count_if. But what if we had a function like the one below – a two argument function.

bool IsDivisible(int number, int divisor)
{
    return number % divisor == 0;
}

The functor we need is nothing but an IsDivisible function with the second argument as 2. We could write an IsEven function which in turn calls IsDivisible. But that could be tedious one we were write such wrappers for a function with 10 arguments. In situations like this, we curry. In C++, we (use) bind.

std::bind1st – A helper template function that creates an adaptor to convert a binary function object into a unary function object by binding the first argument of the binary function to a specified value.

std::bind2nd – A helper template function that creates an adaptor to convert a binary function object into a unary function object by binding the second argument of the binary function to a specified value.

So in our case, we will be usingbind2nd, as follows:-

std::vector wholeNumbers = { 1, 2, 4, 5, 6, 7, 9, 10, 12, 15, 17, 20 };

std::count_if(wholeNumbers.begin(),
    wholeNumbers.end(),
    std::bind2nd(IsDivisible, 2)); 

Unfortunately, C++ stayed with bind1st and bind2nd, and currying was not that evident or securely possible since C++ did not have required facilities, say something like the C# delegate. So the concept has been in vogue from long time ago. Like every paradigm in programming, functional programming requires (re)aligning our thought (process). Instead of treating functions as just reusable pieces of code (as considered in procedural programming), they have to be conceived as the input processors, which may return either data or whole new function.

So that is a quick thought on Currying. I will try to share a few other thoughts related to currying, slightly expanding to functional programming but later.

Quiz – (Journey through templates, SFINAE and specialization) !!!

template<typename A, typename B>
class TClass
{
public: TClass()
        {
        }

        // Overload #1
public: std::string SomeMethod(A a, B b)
        {
           std::ostringstream ostr;
           ostr << a << "-" << b;
           return ostr.str();
        }

        // Overload #2
public: std::string SomeMethod(B b, A a)
        {
           std::ostringstream ostr;
           ostr << b << "-" << a;
           return ostr.str();
        }
};

So that is a template class with SomeMethod overloads. Why would somebody write such a class? Imagine it is an adder class, and the method overloads could used to add with parameters specified in either order. Following is the way one could use the above (based on the adder example):-

int i = 45;
double d = 12.3f;

TClass<int, float> t1;
const std::string idText = t1.SomeMethod(i, d); // This calls Overload #1
const std::string diText = t2.SomeMethod(d, i); // This calls Overload #2

Seems cool until you try this simple thing

TClass<int, int> t2;
const std::string idText = t1.SomeMethod(i, i);

The mighty C++ compiler will complain

Error 1 error C2535:
‘std::string TClass<A,B>::SomeMethod(A,B)’ : member function already defined or
declared <YourFileName.cpp>

A well-tuned C++ programmer would have guessed already how to resolve the situation. But we are are going to discuss a few other things related to the problem, besides the solution.

Part 1: Non-member Functions

Let us say TClass was written for hosting SomeMethod overloads. By that I mean to say SomeMethod overloads are self-sufficient and could be made non-member functions (as global methods or methods inside a namespace).

namespace SF
{
   // Overload #1
   template<typename A, typename B> std::string SomeMethod(A a, B b)
   {
      std::ostringstream ostr;
      ostr << a << "-->" << b;
      return ostr.str();
   }

   // Overload #2
   template<typename A, typename B> std::string SomeMethod(B b, A a)
   {
      std::ostringstream ostr;
      ostr << b << "-->" << a;
      return ostr.str();
   }
}

Hang on. The above code will still result in a compilation error for SomeMethod<T, T>; same type. We are going to now resolve this situation using one of the tough and less-used powers of C++. We are going have the two overloads, make some changes to one overload using SFINAE principle, and allow instantiation\use of SomeMethod using like or different types. That is not a typo; I spelt it right – SFINAE, a.k.a Substitution Failure Is Not An Error.

SFINAE is a principle adopted by the C++ compiler to prevent resulting in a compilation error when preparing the set of candidate overloads for a particular template instantiation. It is widely used for deliberately excluding one or more template overloads based on some condition. The principle is applied only in the case of templates.

In our case, when SF::SomeMethod<AnyT, AnyT> is called, right now both the overloads are successful candidates. That is why the compiler complains about method redefinition. Suppose we wrote the code in a way so as to detect that the two template types passed are same (“the condition“), and we exclude one of the overloads (consider overload #2). This would help us bind to the one method (overload #1). Well, SFINAE is exactly for such purposes. Alright, let us (re)write code…

// Alternatively, you can use EnableIf facility from Boost library
template<bool Condition, typename T = void>
struct EnableIf
{
   typename T Type;
};

template<typename T>
struct EnableIf<false, T>
{
};

// Alternatively, you can use IsSameType facility from Boost or Loki library
template<typename A, typename B>
struct IsSameType
{
   enum { Result = false };
};

template<typename A>
struct IsSameType<A, A>
{
   enum { Result = true };
};

template<typename A, typename B>
struct AreDifferentTypes
{
   enum { Result = !(typename IsSameType<A, B>::Result) };
};

namespace SF
{
   // Overload #1
   template<typename A, typename B> std::string SomeMethod(A a, B b)
   {
      std::ostringstream ostr;
      ostr << a << "-->" << b;
      return ostr.str();
   }

   // Overload #2
   template<typename A, typename B>
   typename EnableIf<AreDifferentTypes<A, B>::Result, std::string>::Type // Evaluates to the return type
      SomeMethod(B b, A a)
   {
      std::ostringstream ostr;
      ostr << b << "-->" << a;
      return ostr.str();
   }
}

We have tweaked overload #2 such that while creating the candidate set for overload resolution, it produces an error when the template parameter types substituted are same. AreDifferentTypes<A, B>::Result will evaluate to false if the template parameters are of the same type, as a result of which the specialized version of EnableIf (EnableIf<false, T>) is chosen which does not define the Type member typedef. This is spotted as an error when selecting candidates for overload resolution but ignored (for a good cause). Now the compiler will be able to bind to the only candidate found (Overload #1). So now you can enjoy using SomeMethod:-

void main()
{
   int i = 12;
   float f = 34.56;

   const std::string dtText = SF::SomeMethod<int, float>(i, f);
   const std::string stText = SF::SomeMethod<int, int>(i, i);
}

Part 2 – Member Functions

I could feel the SFINAE hangover. Unfortunately, the SFINAE will not work with template member functions. So before we see how to solve our problem in discussion for member functions, let us see why SFINAE will not come for rescue.

For a template method instantiation\call, the C++ compiler first searches for methods (matching the name), which could be a potential candidate for binding to the method call. During this step, the methods are inspected for the signature (and not code). They are rejected if they are detected to produce an error. In the earlier case, AreDifferentTypes returning false thereby making EnableIf devoid of the typedef
Type was the error. So this principle is applied to methods (matching the name) with valid signature. In other words, there cannot be an ill-formed
method signature from a template instantiation.

In the case of non-member functions (above), the method decorated with EnableIf will always be considered to have a valid method signature. In other words, depending on the template parameters, the method will get an appropriate return type; although it may fail to be a successful candidate for overload resolution (using SFINAE) when the template parameters are of the same type. Irrespective of whether a template method or method inside a template class is used, the compiler will stop with errors if it has an invalid signature.

Aimed with the above information, let us rewrite the TClass as follows and understand why the principle will not work for template member functions.

template<typename A, typename B>
class TClass
{
public: TClass()
        {
        }

        // Overload #1
public: std::string SomeMethod(A a, B b)
        {
           std::ostringstream ostr;
           ostr << a << "-->" << b;
           return ostr.str();
        }

        // Overload #2
public: typename EnableIf<AreDifferentTypes<A, B>::Result, std::string>::Type
           SomeMethod(B b, A a)
        {
           std::ostringstream ostr;
           ostr << b << "-->" << a;
           return ostr.str();
        }
};

You will see a whole of bunch of errors, of which we should be interested in the following:-

Error 1 error C2039: 'Type' : is not a member of 'EnableIf<Condition,T>' <FileName> <Line>

Error 5 error C2556: 'int TClass<A,B>::SomeMethod(B,A)' : overloaded function differs
only by return type from 'std::string TClass<A,B>::SomeMethod(A,B)' <FileName> <Line>

A template is almost a dead piece of code, until instantiated. Try writing a template method (with valid signature) that has few favorite quotes from Romeo Juliet (instead of C++ code) as its body. Unless you don’t instantiate\call that method, the compiler will be tolerant. When a template is instantiated, that’s when the real code to bind to is cooked specifically for the parameter types. And when that is done, any invalid code will be flagged as an error. So when the SomeMethod (overload #2) is cooked for TClass<AnyT, AnyT> instantiation, it results in an invalid code. That is our Error 1 above – Trying to access a member that does not exist. This means that the SomeMethod is being cooked without a return type being specified. In C++ (like in C) if you don’t specify a return type, it is assumed to be int. Since there already is an other SomeMethod with the same method signature but just differing in the return type, Error 5 above is raised.

Another category of member functions are static template methods inside a non-template class, which are equivalent to non-members. Let us leave applying the above explanation to this category as an exercise.

Does that mean we don’t have a solution? No, not using SFINAE. But by another friend – Template Specialization.

template<typename A, typename B>
class TClass
{
public: TClass()
        {
        }

        // Overload #1
public: std::string SomeMethod(A a, B b)
        {
           std::ostringstream ostr;
           ostr << a << "-" << b;
           return ostr.str();
        }

        // Overload #2
public: std::string SomeMethod(B b, A a)
        {
           std::ostringstream ostr;
           ostr << b << "-" << a;
           return ostr.str();
        }
};

template<typename A>
class TClass<A, A>
{
public: TClass()
        {
        }

public: std::string SomeMethod(A a, A b)
        {
           std::ostringstream ostr;
           ostr << a << " - " << b;
           return ostr.str();
        }
};

So, we have solved the problem!

Type Safe Logger

Sanjeev and I have published an article – Type Safe Logger For C++ – at CodeProject. Every bit of work is tiresome or little ugly in C++. So is logging – writing application diagnostics to console, file etc. The printf style of outputting diagnostics is primitive and not type safe. The std::cout is type safe but does not have a format specification. Besides that, printf and std::cout know to write only to the console. So we need a logging mechanism that provides a format specification, is type safe and log destination transparent. So we came up with this new Logger to make C++ programmers happy.

Following is a short introduction excerpt of the article:-

Every application logs a whole bunch of diagnostic messages, primarily for (production) debugging, to the console or the standard error device or to files. There are so many other destinations where the logs can be written to. Irrespective of the destination that each application must be able to configure, the diagnostic log message and the way to generate the message is of our interest now. So we are in need of a Logger class that can behave transparent to the logging destination. That should not be a problem, it would be fun to design that…….Read more.

As always your comments are most valuable.

And oh, Happy Logging!