Decoupling

Published at Apr 14, 2023

#backend
#api
#software architecture

Software should be grown, not built.

Brian Foote

Resources:

  • Fundamentals of Software Architecture: An Engineering Approach by Mark Richards and Neal Ford (O’Reilly)

Big ball of mud?

I made the first deploy. Wow, it is a very agile project! But the deployed code is a Big Ball of Mud, a term defined in a 1997 paper by Brian Foote and Joseph Yoder:

“A Big Ball of Mud is a haphazardly structured, sprawling, sloppy, duct-tape-and-baling-wire, spaghetti-code jungle… The overall structure of the system may never have been well defined. If it was, it may have eroded beyond recognition…”

In other words, my code is a simple script inside a handler that talks directly to the database. Many trivial applications start like this and then become unwieldy as they grow. Here is one such handler:

// Don't structure like this at home !
e.GET("/publicholidays", func(c echo.Context) error {
    fromDate := c.QueryParam("from")
    toDate := c.QueryParam("to")
    if fromDate == "" {
        return c.JSON(http.StatusBadRequest, "error param from")
    }
    if toDate == "" {
        return c.JSON(http.StatusBadRequest, "error param to")
    }

    sqlStmt := `
SELECT ph.id, ph.name, ph.description, ph.day, ph.country_code,
ph.language, ph.state_name, ph.state_code FROM public_holidays ph
WHERE ph.day BETWEEN $1 AND $2;
`

    rows, err := dbpool.Query(context.TODO(), sqlStmt, fromDate, toDate)
    if err != nil {
        return c.JSON(http.StatusInternalServerError, "internal server error")
    }
    defer rows.Close()

    phs := []models.PublicHoliday{}
    for rows.Next() {
        var ph models.PublicHoliday
        err = rows.Scan(&ph.ID, &ph.Name, &ph.Description, &ph.Day,
            &ph.CountryCode, &ph.Language, &ph.StateName, &ph.StateCode)
        if err != nil {
            return c.JSON(http.StatusInternalServerError, "internal server error")
        }
        phs = append(phs, ph)
    }
    if rows.Err() != nil {
        return c.JSON(http.StatusInternalServerError, "internal server error")
    }
    return c.JSON(http.StatusOK, phs)
})

Architecture style: Layered Architecture

We need a solution for the big ball of mud; my future self will be glad I gave the code some structure when he has to maintain or improve it. As seen above, a single handler holds the logic for the network request, input validation, database retrieval and network response. We can split it into at least two big chunks: one for network concerns and one for database concerns. What we are doing here is technical partitioning, and the resulting style is called layered architecture.

Layered architecture’s primary goal is to promote separation of concerns, so that changes in one layer don’t negatively affect the others. The scope of each layer should account for:

  • Encapsulation: each layer has a distinct responsibility.
  • Abstraction: each layer provides a service to the layer above it and depends on services below it.

Although there are no specific restrictions in the number and types of layers, most layered architectures consist of four standard layers:

  • Presentation: responsible for rendering the UI in the browser, handling user interactions and presenting data to the user. It is being developed with SvelteKit as an SPA.
  • Business: handles user requests, applies business logic and coordinates interactions between the layers above and below it. The backend framework used is Echo.
  • Persistence: communicates with the datastore layer to store, retrieve or manage data. I will use pgx because it offers flexibility, a wide range of types and the opportunity to learn raw SQL queries. The drawback is that managing raw SQL statements and scanning query results into structs can be cumbersome.
  • Datastore: can be any type of storage system (file based, in-memory, etc.). We will use a Postgres database to store our data.
[Figure: layered architecture]

Where do I put input validation? As seen in the code above, the handler checks that the request inputs are not empty strings. Input validation should be implemented in each layer, because each layer may have its own requirements for its input and cannot depend on the validation done by the layer above. Think of it this way: you don’t need credentials to enter a bank office, but if you work at the front desk you need one set of credentials, and to go deeper and access the vault (if vaults still exist) you need another. Input validation is like the bank credentials that let you cross each layer successfully.

Be aware: Sinkhole anti-pattern

The snippet shown above is a very simple request that goes from the presentation layer to the persistence layer; there is no “real” business logic. We are defining four layers (for now), and this request passes through the business layer untouched. The result is unnecessary object instantiation and processing, plus the cost of maintaining a layer that does essentially nothing for this request. This is the sinkhole anti-pattern, and it is something to watch out for in a layered architecture.

Every layered architecture will have at least some scenarios that fall into this anti-pattern. The 80-20 rule is usually a good guideline (source): it is acceptable if only 20% of requests are sinkholes. However, if 80% of requests are sinkholes, then layered architecture is probably not the right style. One possible approach is to split the system into domains and apply layered architecture to each domain, which may leave some domains with fewer layers than others.
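To make the 80-20 rule concrete, here is a minimal sketch that computes what share of traffic is pure pass-through. The Endpoint type, the routes other than /publicholidays and all traffic numbers are made up for illustration; in a real project the counts would come from your own metrics.

```go
package main

import "fmt"

// Endpoint is a hypothetical record of how much traffic one route receives
// and whether the business layer only forwards it to persistence.
type Endpoint struct {
	Route       string
	Requests    int
	PassThrough bool
}

// SinkholeShare returns the fraction (0 to 1) of all requests that cross
// the business layer without any logic being applied to them.
func SinkholeShare(endpoints []Endpoint) float64 {
	total, sink := 0, 0
	for _, e := range endpoints {
		total += e.Requests
		if e.PassThrough {
			sink += e.Requests
		}
	}
	if total == 0 {
		return 0
	}
	return float64(sink) / float64(total)
}

func main() {
	// Made-up traffic numbers, for illustration only.
	endpoints := []Endpoint{
		{Route: "GET /publicholidays", Requests: 150, PassThrough: true},
		{Route: "POST /publicholidays", Requests: 250, PassThrough: false},
		{Route: "GET /report", Requests: 600, PassThrough: false},
	}

	share := SinkholeShare(endpoints)
	fmt.Printf("sinkhole share: %.1f%%\n", share*100)

	switch {
	case share >= 0.8:
		fmt.Println("mostly pass-through: layered architecture may be the wrong style")
	case share <= 0.2:
		fmt.Println("within the 80-20 rule of thumb")
	default:
		fmt.Println("in between: worth keeping an eye on")
	}
}
```

The thresholds are a rule of thumb, not hard limits, so treat the output as a conversation starter rather than a verdict.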

Key concept: Closed and open layer

Each layer in this architecture is marked as closed. A closed layer cannot be bypassed: the request has to go through it. In a layered architecture it is essential that all layers are closed in order to meet the primary goal of separation of concerns. A change made inside one layer generally does not affect the other layers, although a change to the contract between two layers may affect the layer above.

Take the code above: the query (which belongs in the persistence layer) requires a start and an end date as filters. What if I also want to filter by country code? The contract between the business and persistence layers changes, so I have to change the business layer as well. But if the business layer were open, meaning the presentation layer could bypass it and call persistence directly, I would also have to change the presentation layer. I would be directly affecting two layers instead of one, hence the importance of keeping all layers closed.
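The country code change described above can be sketched with the contract expressed as a filter struct. This is only one possible shape, not the code used in this project: HolidayFilter, memoryPersistence and the trimmed PublicHoliday type are hypothetical stand-ins, and an in-memory slice replaces Postgres so the example runs on its own.

```go
package main

import (
	"fmt"
	"time"
)

// PublicHoliday is a trimmed, hypothetical stand-in for models.PublicHoliday.
type PublicHoliday struct {
	Name        string
	Day         time.Time
	CountryCode string
}

// HolidayFilter is the contract between business and persistence.
// CountryCode is optional: an empty string means "any country".
type HolidayFilter struct {
	From, To    time.Time
	CountryCode string
}

// Persistence is the only thing the business layer knows about storage.
type Persistence interface {
	GetPublicHolidays(f HolidayFilter) ([]PublicHoliday, error)
}

// memoryPersistence is an in-memory stand-in for the Postgres DAO.
type memoryPersistence struct {
	rows []PublicHoliday
}

func (m memoryPersistence) GetPublicHolidays(f HolidayFilter) ([]PublicHoliday, error) {
	out := []PublicHoliday{}
	for _, ph := range m.rows {
		if ph.Day.Before(f.From) || ph.Day.After(f.To) {
			continue // outside the requested date range (bounds are inclusive)
		}
		if f.CountryCode != "" && ph.CountryCode != f.CountryCode {
			continue // country filter is set and does not match
		}
		out = append(out, ph)
	}
	return out, nil
}

// date is a small helper to build a UTC midnight timestamp.
func date(y int, m time.Month, d int) time.Time {
	return time.Date(y, m, d, 0, 0, 0, 0, time.UTC)
}

func main() {
	var pl Persistence = memoryPersistence{rows: []PublicHoliday{
		{Name: "Labour Day", Day: date(2023, 5, 1), CountryCode: "DE"},
		{Name: "Labour Day", Day: date(2023, 5, 1), CountryCode: "PE"},
	}}

	// The only caller of persistence is the business layer, so this is the
	// single place that has to learn about the new CountryCode field.
	phs, err := pl.GetPublicHolidays(HolidayFilter{
		From: date(2023, 1, 1), To: date(2023, 12, 31), CountryCode: "PE",
	})
	fmt.Println(len(phs), err)
}
```

Because every caller goes through the business layer, the new field is absorbed there; with an open business layer, every presentation call site that talks to persistence directly would need the same update.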

I asked myself: if most requests are simple database retrievals, why doesn’t the presentation layer query the datastore layer directly? We would get a faster request with only two layers. In practice, you would be creating a big ball of mud in your presentation layer. For example, clicking “get holidays” would trigger the start/end date logic and the preparation of the query for the datastore. Would you do all that right there in your on:click handler? Most probably you would wrap it in a function so you can reuse it later, and inevitably you would end up recreating the business and persistence layers inside your presentation environment (SvelteKit in this case). This is possible with something like Firestore, a database service that is directly accessible from the client, but it restricts you to a NoSQL database.

A request going through the layers

Let’s follow a request flowing down to the datastore and back up to be displayed to the user. The black arrows show the request going towards the datastore, and the red arrows show the response flowing back to the screen. This example is based on our klokken app, and the diagram was inspired by a similar illustration.

The holiday screen is responsible for accepting user input and displaying the holiday information. When the user visits the page, it automatically makes a request to the backend, so the user does not need to click on anything. To initiate the request, the holiday screen delegates the job to a holiday delegate module. This module knows the contract (where and how to ask) between the presentation and business layers; in essence, it knows the URL and the inputs needed to make the request. The holiday object in the business layer receives the request, aggregates or prepares it if needed, and sends it to the holiday DAO (data access object) in the persistence layer. The DAO provides an abstraction over the datastore details, and each of its methods translates the input into a query sent to the database. The database executes the query and returns the result to the DAO, which returns it to the business layer. The business layer may again do some aggregation, and then the information goes back to the holiday delegate and is finally displayed on the holiday screen.

[Figure: a request travelling through the layers]

There are a few points of decoupling between layers that I want to explain further:

  • REST: the server responds with a representation of a resource (in JSON format). You ask, I respond: this is the most basic form of decoupled information exchange. The client communicates with the server through a network request, so the client must know how to call it correctly (parameter names, HTTP methods, required parameters, etc.). If the server changes the contract, client requests will fail.

  • Interface: known as an abstract class in some other languages, it is defined in the persistence layer and used in the business layer. Note that both the interface and its concrete implementation are defined together in the persistence layer. Why? Separation of concerns: one layer defines its distinct responsibility and the layer above uses it.

  • DB driver: a database call is just another network request. The database driver is the client to which we delegate that request.

[Figure: decoupling points in use]
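The REST contract from the first bullet can be made concrete with a small sketch of what a delegate has to know. In this project the delegate lives in the SvelteKit frontend; the version below is written in Go only to stay consistent with the rest of the post. The function name PublicHolidaysURL, the day helper and the base URL are assumptions; the path and the from/to parameter names come from the handler shown earlier.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
	"time"
)

// PublicHolidaysURL builds the request URL for the public holidays endpoint.
// Everything a client must know about the contract is captured here:
// the path, the parameter names and the expected date format.
func PublicHolidaysURL(baseURL string, from, to time.Time) (string, error) {
	u, err := url.Parse(baseURL)
	if err != nil {
		return "", fmt.Errorf("parse base url: %w", err)
	}
	u.Path = strings.TrimSuffix(u.Path, "/") + "/publicholidays"

	q := url.Values{}
	q.Set("from", from.Format(time.RFC3339))
	q.Set("to", to.Format(time.RFC3339))
	u.RawQuery = q.Encode() // keys are sorted, values are percent-encoded

	return u.String(), nil
}

// day is a small helper to build a UTC midnight timestamp.
func day(y int, m time.Month, d int) time.Time {
	return time.Date(y, m, d, 0, 0, 0, 0, time.UTC)
}

func main() {
	// "http://localhost:8080" is a placeholder base URL.
	u, err := PublicHolidaysURL("http://localhost:8080", day(2023, 1, 1), day(2023, 12, 31))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(u)
}
```

If the server renames a parameter or changes the date format, this is the one function on the client side that has to change, which is exactly the job of the delegate.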

TODO: add a link to delegation pattern notes with info from this https://en.wikipedia.org/wiki/Delegation_pattern

TODO: data access object to pattern notes (https://en.wikipedia.org/wiki/Data_access_object)

Implementing layered architecture

Well, time to refactor!

Persistence layer

Let’s start with the persistence layer by implementing a Data Access Object (DAO). First I define the abstraction, which in Go is an interface. Instead of strings, it makes sense to use time.Time for the from and to parameters.

type Persistence interface {
	GetPublicHolidays(from, to time.Time) ([]models.PublicHoliday, error)
	SavePublicHolidays(phs *[]models.PublicHoliday) error
}

Then I implement the concrete type, which in Go is a struct. Note the MustGetNewPostgresPersistence function, a helper that initializes the struct and returns a cleanup function to close the database connection pool.

// Note: Only shown here the implementation of GetPublicHolidays method
type PostgresPersistence struct {
	DB *pgxpool.Pool
}

// Initializer
func MustGetNewPostgresPersistence(ctx context.Context, dburl string) (Persistence, func()) {
	dbpool, err := pgxpool.New(ctx, dburl)
	if err != nil {
		log.Fatalf("Unable to create connection pool: %v\n", err)
	}
	// Reuse the caller's context and report the underlying error
	err = dbpool.Ping(ctx)
	if err != nil {
		log.Fatalf("Error testing db conn: %v\n", err)
	}
	//Return persistence and cleanup function
	return &PostgresPersistence{DB: dbpool}, func() { dbpool.Close() }
}

func (p *PostgresPersistence) GetPublicHolidays(fromDate, toDate time.Time) ([]models.PublicHoliday, error) {
	// Input validation in the persistence layer
	if toDate.Before(fromDate) {
		return []models.PublicHoliday{}, fmt.Errorf("to date must not be before from date")
	}
	sqlStmt := `
    SELECT ph.id, ph.name, ph.description, ph.day, ph.country_code,
    ph.language, ph.state_name, ph.state_code FROM public_holidays ph
    WHERE ph.day BETWEEN $1 AND $2;
    `
	rows, err := p.DB.Query(context.TODO(), sqlStmt, fromDate, toDate)
	if err != nil {
		return []models.PublicHoliday{}, err
	}
	defer rows.Close()

	phs := []models.PublicHoliday{}
	for rows.Next() {
		var ph models.PublicHoliday
		err = rows.Scan(&ph.ID, &ph.Name, &ph.Description, &ph.Day,
			&ph.CountryCode, &ph.Language, &ph.StateName, &ph.StateCode)
		if err != nil {
			return []models.PublicHoliday{}, err
		}
		phs = append(phs, ph)
	}
	// rows.Err reports any error hit while iterating; the outer err is nil at this point
	if err := rows.Err(); err != nil {
		return []models.PublicHoliday{}, err
	}
	return phs, nil
}

Business layer

Now an input transformation is required: the from and to inputs must be converted from string to time.Time before we can call the persistence layer from our business layer.

The get public holidays endpoint must validate and transform the input, call the persistence layer, and then return the result to the presentation layer.

e.GET("/publicholidays", func(c echo.Context) error {
    // Input validation to business layer
    fromDate, err := time.Parse(time.RFC3339, c.QueryParam("from"))
    if err != nil {
        return c.JSON(http.StatusBadRequest, "error param from not parsable")
    }
    toDate, err := time.Parse(time.RFC3339, c.QueryParam("to"))
    if err != nil {
        return c.JSON(http.StatusBadRequest, "error param to not parsable")
    }

    // Send to persistence layer
    phs, err := pL.GetPublicHolidays(fromDate, toDate)
    if err != nil {
        fmt.Printf("Error persistence layer call: %#v \n", err.Error())
        return c.JSON(http.StatusInternalServerError, "internal server error")
    }

    // Return to presentation layer
    return c.JSON(http.StatusOK, phs)
})

Now the call to the persistence layer is made from the business layer. We could call this done and move on to the presentation layer; many examples on the internet stop here as well. But it is worth considering the decoupling of the holiday object from the holiday server. Why?

  • Software should be grown, not built: a beautiful phrase that applies to this moment. It is no secret that this architecture is a basic starting point with its pros and cons. Trying to make it last forever in a growing startup can be hard to achieve and, even if achievable, at what cost? Other architectures, with their own pros and cons, will probably find their way into our solution; I am talking about microservices. If that move means swapping REST for gRPC, the impact would be huge, because the Echo REST server is coupled with the business logic in the code just above.

  • Bubbling errors: the holiday object is forced to organize its errors so that the server can communicate them to the client through HTTP status codes. At this point you can easily add a metric logger for errors when you return to the client (you can do it here or with a middleware).

  • Testing?

Holiday server:

e.GET("/publicholidays", func(c echo.Context) error {
    phs, err := h.GetPublicHolidays(c.QueryParam("from"), c.QueryParam("to"))
    if err != nil {
        return c.JSON(getStatusCode(err), err.Error())
    }
    return c.JSON(http.StatusOK, phs)
})

...
...
...

func getStatusCode(err error) int {
	if errors.Is(err, handler.ErrBadInput) {
		return http.StatusBadRequest
	} else if errors.Is(err, handler.ErrInternal) || err != nil {
		return http.StatusInternalServerError
	}
	return http.StatusOK
}
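The mapping above relies on errors.Is seeing through errors wrapped with the %w verb. Here is a minimal stdlib-only sketch of that mechanism. The sentinel errors mirror the ones used by the holiday object; statusFor and wrap are hypothetical names used only for this illustration.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// Sentinel errors exposed by the business layer.
var (
	ErrBadInput = errors.New("bad input")
	ErrInternal = errors.New("error process")
)

// statusFor maps a business-layer error to an HTTP status code.
// Anything that is not a known client error is treated as a server error.
func statusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, ErrBadInput):
		return http.StatusBadRequest
	default:
		return http.StatusInternalServerError
	}
}

// wrap adds context to a sentinel error while keeping it matchable by errors.Is.
func wrap(msg string, sentinel error) error {
	return fmt.Errorf("%s: %w", msg, sentinel)
}

func main() {
	fmt.Println(
		statusFor(nil),
		statusFor(wrap("param from not parsable", ErrBadInput)),
		statusFor(wrap("persistence layer call", ErrInternal)),
		statusFor(errors.New("something unexpected")),
	)
	// prints: 200 400 500 500
}
```

The business layer stays free of HTTP details: it only chooses a sentinel, and the server decides how that sentinel is presented to the client.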

Holiday object:

var (
	ErrBadInput = errors.New("bad input")
	ErrInternal = errors.New("error process")
)

type Business struct {
	PL persistence.Persistence
}

func (b *Business) GetPublicHolidays(from, to string) ([]models.PublicHoliday, error) {
	// Input validation to business layer
	fromDate, err := time.Parse(time.RFC3339, from)
	if err != nil {
		return []models.PublicHoliday{}, fmt.Errorf("param from not parsable: %w", ErrBadInput)
	}
	toDate, err := time.Parse(time.RFC3339, to)
	if err != nil {
		return []models.PublicHoliday{}, fmt.Errorf("param to not parsable: %w", ErrBadInput)
	}

	// Send to persistence layer
	phs, err := b.PL.GetPublicHolidays(fromDate, toDate)
	if err != nil {
		fmt.Printf("Error persistence layer call: %#v \n", err.Error())
		return []models.PublicHoliday{}, fmt.Errorf("error persistence layer call: %w", ErrInternal)
	}

	// Return to presentation layer
	return phs, nil
}
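As for the open "Testing?" question: once the holiday object depends only on the Persistence interface, it can be exercised without Echo or Postgres. The sketch below is one possible approach, not the project's actual test suite. It re-declares trimmed versions of the types above so it runs on its own, and fakePersistence is a hypothetical test double.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// PublicHoliday is a trimmed, hypothetical stand-in for models.PublicHoliday.
type PublicHoliday struct {
	Name string
	Day  time.Time
}

var (
	ErrBadInput = errors.New("bad input")
	ErrInternal = errors.New("error process")
)

// Persistence is the contract the business layer depends on.
type Persistence interface {
	GetPublicHolidays(from, to time.Time) ([]PublicHoliday, error)
}

// Business mirrors the holiday object: parse input, then call persistence.
type Business struct {
	PL Persistence
}

func (b *Business) GetPublicHolidays(from, to string) ([]PublicHoliday, error) {
	fromDate, err := time.Parse(time.RFC3339, from)
	if err != nil {
		return nil, fmt.Errorf("param from not parsable: %w", ErrBadInput)
	}
	toDate, err := time.Parse(time.RFC3339, to)
	if err != nil {
		return nil, fmt.Errorf("param to not parsable: %w", ErrBadInput)
	}
	phs, err := b.PL.GetPublicHolidays(fromDate, toDate)
	if err != nil {
		return nil, fmt.Errorf("persistence layer call: %w", ErrInternal)
	}
	return phs, nil
}

// fakePersistence returns canned data and counts how often it is called.
type fakePersistence struct {
	rows  []PublicHoliday
	err   error
	calls int
}

func (f *fakePersistence) GetPublicHolidays(from, to time.Time) ([]PublicHoliday, error) {
	f.calls++
	return f.rows, f.err
}

func main() {
	fake := &fakePersistence{rows: []PublicHoliday{{Name: "New Year's Day"}}}
	b := &Business{PL: fake}

	// Bad input must be rejected before persistence is touched.
	_, err := b.GetPublicHolidays("not-a-date", "2023-12-31T00:00:00Z")
	fmt.Println(errors.Is(err, ErrBadInput), fake.calls)

	// Valid input reaches persistence and returns its rows.
	phs, err := b.GetPublicHolidays("2023-01-01T00:00:00Z", "2023-12-31T00:00:00Z")
	fmt.Println(len(phs), err, fake.calls)
}
```

The same checks translate directly into a table-driven test with the standard testing package; the point here is only that no HTTP server and no database are needed to verify the business rules.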