sexta-feira, 31 de março de 2017

Running tests with stubs containerized

We've talked about tests in other posts, and I mentioned using stubs to run tests that have external dependencies.

A good approach is to use docker. I created some example stub services and pushed the images to docker hub. To run my tests, I added those images as dependencies in my docker-compose file and used them to run my tests with docker compose.

That led me to write this post and share this knowledge with you.

I've created a stub service with NodeJS (using hapi to start up a server and expose my routes). With this code in hand, we can start playing with docker: we'll follow some simple steps to publish our image to docker hub and use it in our tests.

Register an account on docker hub

  • Visit Docker Hub and fill the form with dockerhubid, email and password
  • A confirmation email will be sent, and you just have to click a link to activate your account
  • On the command line you can run docker login, and it'll ask for your credentials

Create your image

First of all, with our stub code, we need to create a local docker image so we can publish it later.

The first thing to do is to create a Dockerfile, which looks like this:

FROM node:6.0

RUN mkdir /usr/src/app
WORKDIR /usr/src/app

COPY package.json /usr/src/app
RUN npm install --dev

COPY . /usr/src/app


CMD npm start

Basically, I'm using the node:6.0 base image, copying my project into it, and running the commands to install dependencies and start my stub.

To create the image, we need to build from this Dockerfile, which can be done with a single command.

On the command line, follow the steps below:
  • Navigate to the folder where your Dockerfile is
  • Run docker build -t name .

It'll find a Dockerfile in the current folder and build the image with the given name.

The command "docker build" has other options, in this case I just wanted to keep it simple.

Pushing my image to docker hub

Once I've built my image from the Dockerfile, I need to push it to docker hub, which takes two steps:

Tag the image

It's a very simple step: we need to tag our image id. Tagging associates the image id you just generated with a name that will be pushed to docker hub.

docker tag imageId dockerHubId/imageName:version

Remember that dockerHubId must match the account you're logged in with.

Push the image

Once we've tagged our image, we can push it to docker hub.
The push command is very simple:

docker push dockerHubId/imageName:version

Running tests with docker compose

Once our image is published on docker hub, we can create a docker-compose.yml to run our tests using the stubs from the image we just published.

The example I'll show is a NodeJS project that runs its tests with the command "npm test".

Here is the docker-compose.yml file with the dependency on the image with our stub:

mystub:
  image: mcure/mystub:latest
  ports:
    - "3002"

tests:
  build: .
  command: npm test
  environment:
    EXTERNAL_URL: "http://mystub:3002"
  links:
    - mystub

My code depends on an environment variable that references an external API, which I'm calling EXTERNAL_URL.

My project's config file uses this environment variable to set the URL of the external API.
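The config file itself isn't shown in the post, but a sketch of it could look like the following (the localhost fallback and the loadConfig name are assumptions):

```javascript
// config.js sketch: read the external API's URL from the environment,
// falling back to localhost for runs outside docker-compose
function loadConfig(env) {
  return {
    externalUrl: env.EXTERNAL_URL || 'http://localhost:3002'
  };
}

module.exports = loadConfig(process.env);
```

Keeping the lookup in a small function makes the fallback behavior easy to test on its own.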

When you execute this docker-compose.yml, it builds the container from my project's Dockerfile and runs the "command" specified, with the environment variable set in the configuration.

For each test execution, it creates a new container from that image, which means my tests can run without any external dependency; the tests are now more isolated, not depending on any real service.

With this post, I hope I've helped you understand a bit more about docker and the strategy of running tests with stubs.

Tweet me if you wanna discuss more about it.

segunda-feira, 16 de janeiro de 2017

Contract tests

Today's post talks about contract tests, the value of having them, and strategies for building trustworthy test suites.

Everyone knows the value of tests: they ensure the system behaves as expected, whether you use BDD, TDD or whatever technique you prefer. Of course, each kind of system/problem may have a more appropriate kind of test.

In the last few years, I've been working with APIs. Each day brings a new challenge: sometimes we need to version an API, change contracts, build new integrations, include new rules, etc.

I've been developing an API with NodeJS, and we chose CucumberJS to write tests so the team would be able to do BDD. Of course, other libraries would allow us to do the same, anyway...

When writing APIs, one important thing is to have contract tests, so I believe BDD was a good choice.

Testing a contract is not straightforward; you need to think about many things and create strategies.

Treating external dependencies with contract tests

When you write an API, there is a big chance that you have external dependencies, be it a database, other APIs, messaging mechanisms, cache, etc. You don't want to rely on real dependencies when running contract tests; tests must be 100% independent and should not trust a real external dependency.

What are the chances of an external dependency being unavailable at the time you want to test your contract? Do you want your contract tests to fail if an external dependency is down? No, right?!

There are some good tips to avoid this kind of issue; depending on the kind of dependency, we can use different strategies.

External APIs

Having an external API as a dependency may be a problem, as mentioned before: the API may be down or slow, which can make your tests fail or take a long time. A solution for this problem is to create stubs (see Martin Fowler's post about mocks and stubs).

A stub is a simple interface that gives you canned responses for a given API. Let's say you depend on a Products API and your API needs to retrieve a list of products. The stub would always return a list of products; it could be started up with your suite of contract tests and stopped at the end of the execution.


Databases

Many times we depend on data, so a database dependency may be a big problem.

You cannot just trust your QA or DEV environment database, and not only because it may be down, under stress, in the middle of a dataload, empty, or unavailable in some other way. You may also need specific data in it, and you want to be sure it will always return the data you expect, no matter when the tests run.

You can easily start up a database with docker and insert some fake data into it before your tests execute. Remember that your tests must be 100% independent: a database change made in test 1 must not impact test 2, and test 2 must not depend on a state change made by test 1.
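One simple way to keep that independence is to rebuild the seed data before every test. A minimal sketch (the fixture shape is an assumption, not from the post):

```javascript
// Seed data to be loaded into the test database before each execution
var seedProducts = [
  { id: 1, description: 'Sample product', unitPrice: 9.99, available: true }
];

// Return a deep copy so mutations made by one test never leak into the next
function freshFixtures() {
  return JSON.parse(JSON.stringify(seedProducts));
}
```

Each test starts from freshFixtures(), so whatever test 1 changes, test 2 still sees the original state.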

These are just two good tips to make your contract tests faster and more trustworthy.

Containerize your tests

As mentioned before, containers are a good solution for some external dependencies: they're immutable and can be executed in any environment. Why not run your tests with docker-compose? You can start containers from scratch, seed some data, run the tests, and kill the containers when the execution is finished.

If you need to run the tests again, you can be sure the initial state is always the same; this way you have a good, trustworthy test suite. It even makes your life easier when you want to include the tests in a CI server: you just need docker installed and that's it.

Contract assertions

One very common mistake I see in some contract test suites is that people only check whether the request worked.

The contract is composed of many other things. Here are some things you can check when doing contract tests:

HTTP Status code

The HTTP status code must be checked; different requests should return different status codes. If you do a GET on a specific endpoint, it should return 200 (OK), or 404 (Not Found) if the given resource doesn't exist. If you are creating a record (doing a POST), the status code should be 201 (Created). These are just some examples.

Response body

The response body is also part of the contract: when you send a request, you expect a response body that always has the same shape. Let's say you have a products endpoint that returns the fields id, description, unitPrice and available. Every time you GET a product, you expect to receive these fields back.

This reminds me of Consumer-Driven Contracts; you should definitely consider them when you are consuming APIs (here is a great post from Martin Fowler about it).

I think that's all. I hope I helped you understand how a contract works and how it can be tested. Please feel free to tweet me to discuss this subject. Here is a great post about contract breaks, enjoy it.

segunda-feira, 24 de outubro de 2016

Benchmark npm vs yarn

A few weeks ago, Yarn was released to the node community as an alternative to npm and others.

I heard a co-worker saying it was faster than npm. I found it interesting, so I decided to do a quick benchmark of npm and yarn installing the project I work on.

I've done three tests:
  • Installation with yarn
  • Installation with npm
  • Installation with npm with cache

Yarn gets packages from the npm registry. When running yarn install, it uses a cache by default, unlike npm, where you have to pass an option to the install command, for example: npm --cache-min 9999999 install

I've used the lib rmdir to clean up the node_modules directory before each execution, and child_process.exec to run the install command.
Note: rmdir uses child_process.exec behind the scenes, and it could return a promise instead of requiring a callback :P.

Here is the code I ran to do the benchmark:

var rmdir = require('rmdir'),
    exec = require('child_process').exec;

// run the three installations one after the other so they don't race on node_modules
rmAndInstall('yarn', undefined, function () {
    rmAndInstall('npm', undefined, function () {
        rmAndInstall('npm', '--cache-min 9999999', function () {});
    });
});

function rmAndInstall(libName, options, done) {
    rmdir('node_modules', () => {
        const init = new Date();
        const label = options === undefined ? libName : libName + ' ' + options;
        console.log('installing stuff with ' + label);
        exec(label + ' install', () => {
            console.log(label + ' took: ' + (new Date() - init));
            done();
        });
    });
}

And here are the results in milliseconds. Interesting, huh?!

mcure@cure:~/github/project$ node app/benchmark_npm_yarn.js 
installing stuff with yarn
installing stuff with npm
installing stuff with npm --cache-min 9999999
yarn took: 8930
npm took: 35383
npm --cache-min 9999999 took: 28226

And that's really impressive, because yarn was more than 3x faster than npm using cache, and about 4x faster than npm without it.

To be honest, I don't know exactly how mature yarn is, nor about its bugs, but I'll definitely push my team to use yarn, at least to evaluate it. I also want to try all its features and learn more about the tool.

It was a very simple post, but I hope it helped you learn a bit more about yarn, and I hope you feel encouraged to try it as well.

The benchmark code is also on my github.

Tweet me if you wanna discuss more about it.

segunda-feira, 12 de setembro de 2016

REST anti-patterns

In the past few years I've been working with and studying RESTful APIs, and I have seen some common mistakes in different projects and on online forums, so I decided to write this post based on my experiences and things I've read on the internet.
Here are some anti-patterns, their explanation and examples.

URI not very RESTful

Your URI does not reflect the action that's happening on an existing resource.

RESTful APIs are about resources. When we build our URIs, we need to tell a story about the resource: looking at the URI, the consumer must understand everything about the given resource, where it came from, what its identifier is, and which options it has.

Let's say we have a resource called account and we need to close an account. What is the best way to represent this action?

Below I've divided some examples into wrong and correct. Which one makes more sense to you?

  • Wrong: POST /accounts/close
  • Wrong: POST /closeAccount
  • Correct: POST /accounts/4402278/close

The wrong options don't give this visibility when looking at the URI. We'd probably need to send query parameters or a body, but it's not clear what the query parameter is called, nor the format of the body, if one is needed.

The correct option shows that account 4402278 can be closed; we can deduce it just by looking at the URI.

Using wrong HTTP methods

The HTTP methods must be used to convey the intent of the action. If you are retrieving information, you must use GET, for example.

Below is a list of the most common HTTP methods and their respective actions.

GET - Retrieve records
POST - Create records
PUT - Update whole records
PATCH - Updates pieces of records
DELETE - Delete records

Having said that, mistakes like the following are often seen:
  • Wrong: POST /accounts/4402278/delete
  • Correct: DELETE /accounts/4402278

The wrong example does a POST on accounts for a given id, asking for the delete action explicitly in the URI.

It's somewhat clear; however, you are doing it wrong, because HTTP has an explicit method for deleting resources: DELETE.
There are many other cases that could make this post even bigger, but these are the most common ones.

Hurting Idempotency

No matter how many times you call GET on the same resource, the response should always be the same and no change in application state should occur.
  • Idempotent methods: GET, PUT, OPTIONS
  • Non Idempotent methods: POST

What about DELETE? If you DELETE /accounts/4402278 twice:
  • The account will not be deleted many times... thinking this way, it is idempotent
  • The second time, the resource will not be found and should return a 404 Not Found; this way, it's not idempotent anymore
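A toy in-memory store (not from the post; all names here are made up) makes the difference concrete:

```javascript
// Toy store to illustrate idempotency of the HTTP methods
var accounts = { '4402278': { balance: 100 } };
var nextId = 1;

// PUT: repeating the call leaves the store in the same state, so it is idempotent
function put(id, data) {
  accounts[id] = data;
}

// POST: each call creates a new record, so it is not idempotent
function post(data) {
  accounts['gen-' + (nextId++)] = data;
}

// DELETE: the state after one call and after two is identical,
// but the second response differs (404); hence the debate above
function del(id) {
  var existed = accounts.hasOwnProperty(id);
  delete accounts[id];
  return existed ? 204 : 404;
}
```

Idempotency in HTTP is usually judged by the server state, which is why DELETE is commonly listed as idempotent even though the second call answers 404.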

Ignoring status codes

If your API only returns 200 (OK) or 500 (Internal Server Error), you are misusing status codes.

Status codes were created to give the consumer an overall view of the final state of the request.

This means we need to be careful when choosing status codes to represent the state of each request; they need to reflect exactly what happened and the end result.

  • Wrong: GET /accounts/123456 (and there is no matching record) response: HTTP status 200 (OK) with a body saying it's not found
  • Correct: GET /accounts/123456 (and there is no matching record) response: HTTP status 404 (Not Found)

If you are getting a resource as in the examples above and there is no matching record, it should return 404, because the resource was not found. If you return a 200 (OK) with a message in the body saying "not found", you are doing it wrong: the status code says "ok" but the message says "not found", which is contradictory.

Status codes will also help to give more clarity on the responses of your API. In many cases consumers only have to parse the status code to know exactly what happened, much simpler than parse responses with big strings.

HTTP status codes

Here are some of the most used, in my humble opinion :)

  • 2xx - Success: 200 OK, 201 Created, 203 Partial Information, 204 No Response
  • 3xx - Redirection: 301 Moved, 302 Found, 304 Not Modified
  • 4xx / 5xx - Error: 400 Bad Request, 401 Unauthorized, 402 Payment Required, 403 Forbidden, 404 Not Found, 500 Internal Server Error, 503 Service Unavailable

Ignoring caching

It is easy to ignore caching by including the header "Cache-Control: no-cache" in the responses of your API calls.

HTTP defines a powerful caching mechanism that includes the ETag and If-Modified-Since headers and the 304 Not Modified response code.

It lets clients and servers avoid transferring a resource that hasn't changed and, through caches and proxy servers, increases your application's scalability and performance.

Ignoring hypermedia

If your API calls send representations that do not contain any links, you are most likely breaking the REST principle called HATEOAS.

Hypermedia is the concept of linking resources together allowing applications to move from one state to another by following links.

If you ignore hypermedia, URIs will likely have to be built on the client side using hard-coded knowledge.


Some key points about hypermedia:
  • The client interacts with an application through hypermedia provided dynamically by the API
  • The current state of the application is defined by your data and the links in your payloads
  • The client must have a generic understanding of hypermedia
  • Hypermedia allows the server functionality to evolve independently
  • Interaction is driven by hypermedia rather than out-of-band information


In the example below, we have a resource called "account" with a balance of 100.00.

    "accounts": [
            "accountNumber": "4502278",
            "balance": 100.00,
            "links": [
              {"rel": "deposit", href: "/account/4502278/deposit"},
              {"rel": "withdraw", href: "/account/4502278/withdraw"},
              {"rel": "transfer", href: "/account/4502278/transfer"},
              {"rel": "close", href: "/account/4502278/close"}

Let's say the owner of the account is "in the red" at the moment. The API should block some actions and return a payload such as:

    "accounts": [
            "accountNumber": "4502278",
            "balance": -60.55,
            "links": [
                {"rel": "deposit", href: "/account/4502278/deposit"}

Ignoring MIME types

If the resources returned by your API have only a single representation, you are probably able to serve only a limited number of clients that understand that representation.

If you want to increase the number of clients that can potentially use your API, you should use HTTP content negotiation.

It allows you to offer standard media types for representations of your resources, such as XML, JSON or YAML.


When building your APIs ...
  • Be coherent
  • Require headers
  • Use Standards (JSON-API)
  • Build well designed URIs
  • Return coherent status codes
  • Care about idempotency
  • Use correct HTTP methods

I really hope I helped you identify some common mistakes, and I hope the tips given here will help you when designing your APIs.

Tweet me if you wanna discuss more about it.

sábado, 28 de maio de 2016

Versioning APIs

I'm currently working on a project where my team is developing an API, and one subject that came to the table was versioning. It's a very confusing subject that generates a lot of discussion, which is why I had the idea of writing a post about it.

There are a few strategies for versioning an API, but let's take a step back: what happens in a project that requires a new version of the API?

Contract break

Let's say we have a resource called Person, whose contract includes id, name, birthDate, address, zipCode and city. At some point, a decision is made to separate the address information from the Person. This means the contract will change, because the address information will be moved to a separate resource.

All consumers will break, because they expect the address information inside the Person. This is a contract break. There are other cases where new information is added to the contract, which won't break any consumers, so we cannot consider those cases contract breaks.

Version it!

How do we avoid breaking the consumers? By versioning. At this point, the API team creates a version 2 of the contract. Consumers can still use the old contract while version 2 is published. The consumers and the API team then agree on when the API will stop supporting the old version, at which point consumers have to start using the new one.

Be cautious

Versioning contracts looks like a good solution, but it can get dangerous if the API starts supporting a lot of versions: it will make your code look like a mess, hard to understand, with too many branches. I won't even mention that it can cause bugs (just did it :P). I'd say a good practice is to accumulate contract breaks and release a new API version once you evaluate it's worth it. Also, I wouldn't have more than two versions in parallel, to avoid the issues mentioned above.

Versioning strategies

I've been researching solutions for API versioning; I will present and comment on the two strategies that most caught my attention.

Version as path/query parameter

This is the strategy I've seen most often in projects I've worked on and in my research. It consists of adding the version to the path (e.g. /v1/accounts).


Who is using this approach?
  • Twitter
  • Atlassian
  • Google Search

Version as a header

This is probably the least intrusive strategy: the version is informed in the Accept header, leaving the URL clean. See the example below:

Accept: application/json; version=1.0

There are other ways to inform the version in the Accept header, but I think this one is the clearest. I've also seen examples where people use a custom header, like X-Version: 1.0
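On the server side, extracting the version from such a header is a one-liner. A sketch (the regex, the null fallback and the function name are assumptions):

```javascript
// Pull the version out of a header like "application/json; version=1.0"
function versionFromAccept(accept) {
  var match = /version=([\d.]+)/.exec(accept || '');
  return match ? match[1] : null;
}
```

The server can then route the request to the matching contract implementation, or reject unknown versions.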

Who is using this approach?
  • Azure
  • Github API
  • Google Data API


A good contract design often avoids contract breaks, which avoids versioning. Always be careful with contracts: they define how the external world talks to your API. Contracts break, it's natural; however, always evaluate each change, think about your design, ask yourself if the change is really the right thing to do, weigh the pros and cons and make smart decisions.

quinta-feira, 26 de maio de 2016

Integrating node.js and Apache Kafka

In this post I will demonstrate how we can integrate node.js and Apache Kafka, producing and consuming messages in a very simple example.

First of all, let's get Apache Kafka up and running; you can see how to do that in the official Kafka site's tutorial.

Once it's up and running, we can set up the project and start playing with the lib no-kafka:

  • npm init
  • npm install no-kafka --save

I have used version 2.4.2 of no-kafka. If you want to pin the version when installing, just run "npm install no-kafka@2.4.2 --save".

Here is a producer example, which will connect to kafka and produce messages in a topic.

var Kafka = require('no-kafka');
var producer = new Kafka.Producer();

producer.init()
.then(function () {
    return producer.send({
        topic: 'kafka-test-topic',
        partition: 0,
        message: {
            value: 'Hello!'
        }
    });
})
.then(function (result) {
    console.log('topic sent');
});
If you are running a local instance, it connects to localhost automatically. To connect to an external instance, you can replace "localhost" with the external host, following the example below:

var Kafka = require('no-kafka');
var connString = 'kafka://localhost:9092,localhost:9092';
var producer = new Kafka.Producer({ connectionString: connString });

Here is a consumer example, which will connect to kafka, subscribe to a topic, and print received messages to the console.

var Kafka = require('no-kafka');
var consumer = new Kafka.SimpleConsumer();

// the data handler function can return a Promise
var dataHandler = function (messageSet, topic, partition) {
    messageSet.forEach(function (m) {
        console.log('topic received: ', {
            partition: partition,
            offset: m.offset,
            message: m.message.value.toString('utf8')
        });
    });
};

consumer.init()
.then(function () {
    return consumer.subscribe('kafka-test-topic', 0, dataHandler);
});

As you can see in the code above, all requests return promises. This example covers the very basic features of the lib when interacting with Kafka.
I put this project on my github, so you can play with the code and evolve it as needed.

Building APIs with HarvesterJS

HarvesterJS helps you create robust APIs on top of mongoDB and node.js. It is a fork of fortuneJS, is JSONAPI compliant, and runs on Express. It gives the developer the ability to define contracts and validations with Joi.

In this post you'll see how to set up a very basic API with schema validations, plus some features of HarvesterJS.

Once the resources are properly set up, HarvesterJS provides the GET/POST/PUT/DELETE operations persisting the data on MongoDB.

Initial project setup

  • npm init
  • npm install harvesterjs --save
  • npm install joi --save

Setting up the API with configs


var harvester = require('harvesterjs'),
    options = {
        adapter: 'mongodb',
        connectionString: 'mongodb://',
        inflect: true
    };

var harvesterApp = harvester(options);

function onListen() {
    console.log('listening on port 4567');
}

harvesterApp.listen(4567, onListen);

Setting up a resource

You can set up the resource fields and use Joi to describe them and add validations. In the examples below, we have a resource called customer with two fields, status and name:

  • Status is a string which only accepts two values: Active or Inactive
  • Name is a string which is required.

Basic resource customer.js

var Types = require('joi');

harvesterApp.resource('customer', {
    status: Types.string().valid('Active', 'Inactive'),
    name: Types.string().required()
});

Linking resources

Regular link

In the example below, we have a resource called customer that has a link to a resource called contact. This link is an array of contacts, but you can also have a single-resource link.

var Types = require('joi');

harvesterApp.resource('customer', {
    status: Types.string().valid('Active', 'Inactive'),
    name: Types.string().required(),
    links: {
        contacts: ['contact']
    }
});

External link

In the example below, we have a resource called customer which has an external link to a resource called contact.

var Types = require('joi'),
    contactURI = 'http://localhost:2426/contacts';

harvesterApp.resource('customer', {
    status: Types.string().valid('Active', 'Inactive'),
    name: Types.string().required(),
    links: {
        contact: { ref: 'contact', baseUri: contactURI }
    }
});

Manipulating resources manually

HarvesterJS gives you the ability to manipulate documents manually. Once you have the harvesterApp object in place, you can use the harvesterApp.adapter methods to interact with mongoDB: find, findMany, create, update, delete.

These are the very basic features of HarvesterJS; for more information, check its github.