Functional Adam: February 2016

Tuesday, February 23, 2016

How to Format Code in a Blog Post?

There are quite a few ways to format code in HTML. I'll walk you through the steps I took.

Step 1 - Find a good CSS library for code formatting

The best HTML syntax highlighter library I found was PrismJs. It's free, incredibly thorough in the languages it supports, and it provides a professional look. You simply choose the language support you like and download the CSS and JavaScript files.

Here is a before look:

let items = [1; 3; 4]
items |> List.map(fun x -> x + 1)

Here is the code after formatting with PrismJs:

let items = [1; 3; 4]
items |> List.map(fun x -> x + 1)

Step 2 - Move from WordPress to Google Blogger

I originally had the blog on WordPress using a free template, but I decided to move it to Google Blogger because Blogger provides free template editing. WordPress has other advantages, but for my purpose I simply needed to tweak the template to include the Prism CSS and JavaScript files. Since I only had a few posts, this was not a difficult move.

Step 3 - Host PrismJs CSS and JavaScript files in Dropbox

I used the instructions in this article to host the files in Dropbox so they can be referenced publicly.

Step 4 - Update template HTML to include CSS and Javascript files

In Google Blogger I went to Template -> Edit HTML and added this piece in the header.

<link href='https://dl.dropboxusercontent.com/s/wvc7y6gckrru3ex/prism.css' rel='stylesheet'/>
<script src='https://dl.dropboxusercontent.com/s/sy0k7bc7ixl6xan/prism.js' type='text/javascript'></script>

It was that easy! Now my blogging code looks much nicer.

Saturday, February 20, 2016

CORS for security

What is it?

Cross-Origin Resource Sharing is a W3C spec that allows resource sharing across domains. Some resources are allowed to come from any domain, but web fonts and AJAX requests are limited to accessing the same domain as the parent web page. This presents a problem if an AJAX request from www.example1.com wants to request a resource from www.example2.com. In order to fulfill that request a CORS security header will have to be added to the response.

Why?

All modern browsers implement the same-origin policy as a preventative measure: it stops a script on one site from reading data from another site using the visitor's credentials. This is a bit unfortunate for development teams implementing APIs to be used across distributed environments. It can be perfectly valid for an API to serve AJAX requests to multiple domains. This is why the CORS spec was introduced and is necessary for teams with environments spanning multiple domains.

How to use it?

To successfully serve a cross-origin request, the server must send the Access-Control-Allow-Origin header in its response, and when the browser sends a preflight (OPTIONS) request it must also answer with the Access-Control-Allow-Methods header. The Allow-Origin header specifies the origin allowed to access the resource and the Allow-Methods header specifies the HTTP methods allowed (GET, PUT, etc). To allow all origins, the Allow-Origin header can simply specify '*'.
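To make the header logic concrete, here is a minimal sketch in F#. It is not tied to any web framework: corsHeaders, allowedOrigins and allowedMethods are hypothetical names of my own, and the function only computes the header pairs a server would attach to its response.

```fsharp
// Hypothetical whitelist of origins allowed to call the API.
let allowedOrigins = set [ "https://www.example1.com" ]

// Hypothetical list of HTTP methods the API accepts cross-origin.
let allowedMethods = [ "GET"; "PUT" ]

// Returns the CORS headers to add to a response for the given request Origin.
// An origin that is not whitelisted gets no CORS headers, so the browser
// blocks the calling script from reading the response.
let corsHeaders (requestOrigin: string) : (string * string) list =
    if Set.contains requestOrigin allowedOrigins then
        [ "Access-Control-Allow-Origin", requestOrigin
          "Access-Control-Allow-Methods", String.concat ", " allowedMethods
          // Tell caches that the response differs per Origin header.
          "Vary", "Origin" ]
    else
        []
```

Echoing back a whitelisted origin is usually a safer default than '*', since '*' cannot be combined with credentialed requests.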

To Vim or Not to Vim?

I try to learn something new about one of my tools every day. Slow, methodical, constant improvement is important to staying effective as a developer. We interact with numerous tools and we should constantly be re-evaluating them to try and become more efficient at our job.

While I have been keeping my commitment to learning about my tools, one of the tasks I've been putting off is going back to basics and learning Vim. You may be asking: with all the advances in editors over the years, why would I want to go back and learn Vim?

Well, the fact this editor is still widely used in the community after 25 years is reason enough. Besides, even though one can be extremely productive using a modern IDE I feel they can make you complacent. Too much magic behind the scenes. So, that's why I decided to start the Vim journey.

Now what?

I knew it would be frustrating at first, but not this frustrating! In the beginning I felt completely worthless in this crazy Vim land. I try to type a word and the next thing I know I've deleted half the text on the screen and my cursor is skipping around everywhere.

A co-worker introduced me to vimtutor and sanity was restored. Over time I became familiar with the commands and I'm starting to understand the logic behind its power. Small commands that can be combined to do really complicated things. I'm a huge fan of not using the mouse and Vim takes this to a whole new level. The only problem is now I want it everywhere! At least thanks to vimium I can happily vim in the browser :)

Why Would I Use a Monad?

In this post I am going to describe a simple case for when to use a monad. A monad is nothing to be frightened of. It's simply a structure that describes a chain of operations. Monads should be embraced because, used correctly, they can lead to more readable code. And monads are easy to understand! So, let's walk through a common scenario.

We'll start out with a standard pattern I've observed in a normal C# application. Let's say there is a function that should be called, but only if a number of other validation functions first succeed. The C# code could look as follows (this code could be written better but it is just an example):

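public static bool ValidateAndSave(int input) {
    // Run each check only if every previous check passed
    var valid = IsGreaterThanZero(input);
    if (valid) {
        valid = IsLessThanTen(input);
    }
    if (valid) {
        valid = EqualsFour(input);
    }
    if (valid) {
        // SaveRecord is the actual save operation, defined elsewhere
        SaveRecord(input);
    }
    return valid;
}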

In this example we want to call SaveRecord, but only depending on the results of some other functions. Instead of branching on the valid boolean we could use exceptions, or we could put those validation calls in one line, but I'm going to argue for a better way. First let's write some code in F#. I'm a fan of Scott Wlaschin's railway oriented programming, where program execution short-circuits if there is an error instead of using exceptions. To do this we need a type like so:

type Response =
 | Success of int
 | Failure of string

Instead of returning simple types, our function inputs and outputs can be wrapped with this Response type. By using this response type our functions will have better-defined inputs and outputs, which can lead to more readable programs as we will see below. So, let's re-implement the above function in F# using this response type (warning: ugly code below!):

let yuck input =
       match isGreaterThanZero input with
       | Failure f -> Failure f
       | Success a -> match isLessThanTen a with
                         | Failure f -> Failure f
                         | Success b -> match equalsFour b with
                                           | Failure f -> Failure f
                                           | Success c -> saveRecord c

As you can see, that looks terrible. After every response we have to check whether or not the response is valid before continuing, but luckily the language provides a way to clean it up. I'll show you step by step. First, let's create a function called bind that will help remove some duplicate logic in this function.

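let bind m f =
  match m with
  // Propagate the failure untouched ("e" avoids shadowing the function f)
  | Failure e -> Failure e
  // Unwrap the value and hand it to the next function
  | Success a -> f a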

This bind function takes a response type and a function, then simply propagates the failure if it's a failure or if it's a success it calls the function with the unwrapped value as its parameter. This serves the purpose of removing the nested match statements above and helps readability.

let withBind input =
  let result1 = bind (Success input) isGreaterThanZero
  let result2 = bind result1 isLessThanTen
  let result3 = bind result2 equalsFour
  bind result3 saveRecord

This is much more readable than the first function but we can take it a step further. We can use computation expressions in F# by building a type with the bind function we created that looks as follows:

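type ValidationBuilder() =
  // let! calls Bind: unwrap a Success or short-circuit on a Failure
  member this.Bind(m, f) = bind m f
  // return wraps a plain value in Success
  member this.Return(x) = Success x
  // return! passes an existing Response through unchanged
  member this.ReturnFrom(x) = x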

let validation = new ValidationBuilder()

With this type we can write a function that is cleaner than before.

let withContinuation input =
  validation {
    let! result1 = isGreaterThanZero input
    let! result2 = isLessThanTen result1
    let! result3 = equalsFour result2
    return! (saveRecord result3)
  }

The computation expression allows us to build in the continuation logic so it doesn't have to be repeated everywhere in our application. Each let! calls the Bind method, unwraps the value, and continues on unless there is a failure, in which case none of the remaining lines are executed. One other option we could use here is an infix operator instead of a computation expression. These can be confusing if used too often, but here is how we could change the function using an infix operator.

let (>>=) m f =
  match m with
  // Same logic as bind, written as an operator
  | Failure e -> Failure e
  | Success a -> f a

The infix operator works like the bind function and when we use it in our original function this is the result.

let withInfixOperator input =
  input
  |> isGreaterThanZero
  >>= isLessThanTen
  >>= equalsFour
  >>= saveRecord

This is the most readable of all the options and shows the power of F#. See, monads aren't scary! If used correctly your programs can contain mostly program logic and are very easy to read as long as the basic concepts are well understood.
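The snippets in this post call isGreaterThanZero, isLessThanTen, equalsFour and saveRecord without showing them. Here is a minimal, self-contained sketch you can paste into F# Interactive. The validator bodies, the error messages, and the stand-in saveRecord are assumptions of mine for illustration, not the original implementations.

```fsharp
type Response =
  | Success of int
  | Failure of string

// Hypothetical validators: return the input on success, a message on failure.
let isGreaterThanZero x =
  if x > 0 then Success x else Failure "must be greater than zero"

let isLessThanTen x =
  if x < 10 then Success x else Failure "must be less than ten"

let equalsFour x =
  if x = 4 then Success x else Failure "must equal four"

// Stand-in for the real save; a real version would write to a data store.
let saveRecord x = Success x

// Bind as an infix operator: stop at the first Failure, otherwise continue.
let (>>=) m f =
  match m with
  | Failure e -> Failure e
  | Success a -> f a

let validateAndSave input =
  input
  |> isGreaterThanZero
  >>= isLessThanTen
  >>= equalsFour
  >>= saveRecord

printfn "%A" (validateAndSave 4)     // Success 4
printfn "%A" (validateAndSave 12)    // Failure "must be less than ten"
printfn "%A" (validateAndSave (-1))  // Failure "must be greater than zero"
```

Notice that the failing cases never reach saveRecord: the first Failure travels straight to the end of the chain.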

What I'm Currently Reading - Release It!

It's an oldie but a goodie. If you look at any top 10 list of software engineering books, Release It! will be somewhere near the top. Release It! focuses on designing robust software applications and some of the pitfalls that come with complex distributed systems.

I like this book because it takes a deep dive on those nasty, crippling production bugs that can bring down even the most well thought out system. I've seen my fair share of these types of bugs, where the actual problem is an innocuous line of code that was successfully code reviewed, unit-tested, and automated, yet it wasn't enough to prevent the debilitating application crash. This book gives good advice on how to deal with this reality and design systems that can withstand the rigors of production.

Here are my main takeaways so far.

Be cynical!: This book teaches us to be cynical, pessimistic, and distrustful of our code. Danger lurks at every integration point and we must write our code expecting it to fail. Think deeply about the consequences of your code failing and design your architecture with that in mind.

Be anti-fragile: Unit tests and automation are wonderful, but we need our software to be anti-fragile. Beat up your software in testing. Stress your system with load tests and keep your application running in test for long periods of time to try and uncover those bugs usually only found in production.

Know your design requirements: Should the checkout process be frozen because we can't talk to a third party affiliate? Probably not. Know the requirements of your system and design your failure scenarios accordingly.

Deploy early and often: The deployment process is one of the first things you should iron out. Deployments should be simple and low risk. If deploying is too hard then you need to rethink your design. Having an easy deployment process will make the team more likely to refactor code and make changes, which is a good thing. I have seen codebases where the team is afraid to make changes because of the difficulty and risk of deploying. This is demoralizing, and a lesson that pushing our code needs to be easy.

I would recommend this book to a software engineer with any level of experience. It is a good lesson for all.

My Journey to Functional Programming

Over the last year I have transitioned from an imperative and object oriented style to a declarative and functional style of programming. Even though there was much gnashing of teeth along the way it was a worthy pursuit. I believe a declarative functional style is beneficial in a few ways I will describe below, but first I'd like to describe the journey in case you find yourself making the same transition. Last year a functional programming advocate on our team convinced us to use F# for our team's next project. What followed was a path to the ~~dark~~ functional side.

Dismissal: I was initially skeptical about this decision as functional programming reading materials were being passed around. They stated the style was easier to read and reason about, but this did not seem so at first. The code looked mathematical and cryptic. During these initial stages it was difficult to see how these differences would make life easier as a programmer.
Helplessness: Using a functional programming language for the first time felt like trying to program blindfolded without hands alone in a dark room. How do you loop? Why can't I change this record? Why is this not building!? What's a continuation? The programs that resulted from those initial iterations were a hybrid functional/object oriented style that took longer to create and didn't have a consistent feel. It wasn't until after building multiple real-world applications that it began to sink in. Then after many tears...
Bliss: It's all about the functions, stupid! For my entire software engineering career, building software consisted of imperative object mutation while using OO patterns as the main way of sharing code. Turning the thinking around to writing programs using mostly pure composable functional expressions was difficult, but the benefits are worth the initial struggle.

Why declarative and functional?

Expressiveness: Writing code in a mostly pure, declarative way using a functional language leads to easy to change, easy to understand, and testable code. If these practices are followed then it is rather easy to understand the flow of the program because most of the logic is contained in small blocks unaffected by state. Declarative code tells you its intent (a side benefit is fewer code comments), whereas imperative code tells you how.
Functional patterns: Many books have been written on OO patterns and a lot of those patterns are useful and elegant ways of solving specific problems. In functional programming the answer to most of these problems is to simply create more functions. With continuations and currying it's trivial to break apart functions and share code in elegant ways.
Interfaces: Even when using an OO style, best practice is to program to the interface instead of the implementation. An interface is simply a function definition, so functional programming takes the adage to heart.
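As a small illustration of the currying point above, here is a sketch with hypothetical names (applyDiscount, tenPercentOff and the prices are mine). Fixing the first argument of a curried function gives a new, more specific function that can be passed around like any other value.

```fsharp
// A curried function: takes the rate first, then the price.
let applyDiscount (rate: decimal) (price: decimal) = price - price * rate

// Partial application: supply only the rate to get a price -> price function.
let tenPercentOff = applyDiscount 0.10m

let prices = [ 100m; 250m ]

// The specialised function plugs straight into List.map.
let discounted = prices |> List.map tenPercentOff

printfn "%A" discounted
```

No strategy class or interface is needed here; the function signature plays that role.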

I never thought I would say it, but I'm firmly in the functional camp now and happy to be there! In later posts I will go over specific examples of why I enjoy this style.

"Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool. This may have occurred because all pooled connections were in use and max pool size was reached"

An email inbox full of this error message is not the best way to start a morning. Our service had been running in production smoothly for months, but software engineering is a cruel profession and bad things are bound to happen.

Luckily only one of our three production servers was down, so we quickly pulled the problem child out of the load balancer rotation and began to investigate. We used perfmon to monitor the open connections on the server and were easily able to fill up the connection pool with only a few hundred concurrent requests from our test scripts. After waiting a few minutes the connections were reclaimed and the service would become responsive again.

At this point it was rather obvious we were leaking connections, but how? This error message is an indication that connections were being taken from the pool but not released again. So the first obvious thing to do was check to make sure all our SQL clients were being properly disposed. This would be an easy fix if somehow we missed putting our SQL clients in using blocks, but after looking through the code all of our clients were in using blocks and should be disposed properly.

Connection pooling

SQL connections are expensive to create. Creating one consists of opening a socket, doing an initial handshake, authenticating the credentials, doing transaction checks, etc. Connection pooling is an optimization technique used to minimize the number of times a connection must be established. So, when a SQL client wants to connect to the database, a connection will be created or one will be taken from the existing pool of connections. Once the SQL client is finished with the connection it will close the connection when Dispose() is called and the connection will go back to the connection pool. Each connection pool is associated with a distinct connection string. If the maximum number of connections in the pool has been reached and there are no more available connections in the pool then a timeout will occur when a connection is requested.

Duplicating a production problem locally is the holy grail of fixing production bugs. Once the problem has been duplicated locally, fixing the bug is usually straightforward. In this case I was not able to reproduce this problem locally even when sending 10,000 concurrent requests at my local service. Am I hitting the same endpoint? Yes. Similar CPU/memory specs as production? Yes. After running through many of these questions we tried setting <gcServer enabled="true"/> on my local dev machine and YES!! Duplication. This server setting is used in production and was required to duplicate this problem.

Garbage collection

The gcServer element specifies whether to use server or workstation garbage collection. Workstation garbage collection collects on the same user thread that triggered the garbage collection, so it must compete with other threads when garbage collecting. Server garbage collection dedicates threads specifically to garbage collection at high priority, so this setting will perform better on multi-core CPU machines.
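When chasing a bug that only shows up under one GC mode, it helps to confirm which mode the process is really running in rather than trusting the config file. A minimal sketch, using the standard System.Runtime.GCSettings.IsServerGC property (the gcMode name and the message wording are my own):

```fsharp
open System.Runtime

// Ask the runtime which garbage collector flavour this process is using.
let gcMode = if GCSettings.IsServerGC then "server" else "workstation"

printfn "Running with %s garbage collection" gcMode
```

Logging this once at startup makes it easy to spot a dev machine and a production server running in different modes.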

After being able to reproduce the problem locally it was easy to narrow down the offending code (F#).

use cmd = new GetRecords(connectionString)
let! dbResult = cmd.AsyncExecute(contactID)
let records =
    dbResult
    |> Seq.map Convert.convertRecords

That doesn't look so bad, right? The SQL command is in a using and Dispose() should be called on it when it goes out of scope. Well, we had our suspicions about the fact that the database call was returning a sequence and we were curious how this code would perform:

use cmd = new GetRecords(connectionString)
let! dbResult = cmd.AsyncExecute(contactID)
let records =
    dbResult
    |> Seq.toArray
    |> Array.toSeq
    |> Seq.map Convert.convertRecords

Voila! Materializing the sequence to an array before returning actually fixed the problem. But why? A sequence is not computed until it is traversed, so returning a sequence of objects retrieved from a database will keep the database connection open until the sequence is fully traversed. This is bad because we want that database connection returned to the pool as quickly as possible so it can be used by other threads.
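The mechanism can be reproduced without a database. The sketch below is an illustration under my own assumptions, not our data access code: FakeConnection, openConnections, lazyRows and eagerRows are hypothetical stand-ins, and a simple counter plays the role of the connection pool.

```fsharp
open System

// Counts how many fake connections are currently checked out of the "pool".
let mutable openConnections = 0

// Hypothetical stand-in for a pooled SQL connection.
type FakeConnection() =
    do openConnections <- openConnections + 1
    member this.ReadRows() = seq { for i in 1 .. 3 -> i }
    interface IDisposable with
        member this.Dispose() = openConnections <- openConnections - 1

// Lazy: the connection is taken when traversal starts and is only
// released once the consumer finishes (or abandons) the traversal.
let lazyRows () =
    seq {
        use conn = new FakeConnection()
        yield! conn.ReadRows()
    }

// Eager: rows are copied into an array, so the connection is released
// before the function returns.
let eagerRows () =
    use conn = new FakeConnection()
    conn.ReadRows() |> Seq.toArray

// Simulates a slow consumer that has read one row and paused.
let openWhileTraversing () =
    use e = (lazyRows ()).GetEnumerator()
    e.MoveNext() |> ignore
    openConnections   // the connection is still held at this point

let rows = eagerRows ()
printfn "open after eager read: %d" openConnections
printfn "open during lazy traversal: %d" (openWhileTraversing ())
```

With the eager version the caller can take as long as it likes to process the rows, because the connection was already handed back.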

Summary

Be wary of lazily evaluated sequences of items returned from a database. It is best practice to materialize the sequence to an array before returning it out of the function. Also, make sure your load tests are run before EVERY release in a QA environment that exactly mirrors production. Hopefully that will save you headaches in the future.