Showing posts with label spring boot. Show all posts

From Microservices to Service Blocks using Spring Cloud Function and AWS Lambda

Thursday, July 6, 2017

This blog post will introduce you to building service block architectures using Spring Cloud Function and AWS Lambda.

What is Spring Cloud Function?

Spring Cloud Function is a project from Pivotal that brings the same popular fundamentals behind Spring Boot to serverless functions.

Service Block Architecture

One of the most important considerations in software design is modularity. If we think about modularity in the mechanical sense, components of a system are designed as modules that can be replaced in the event of a mechanical failure. In the engine of a car, for example, you do not need to replace the entire engine if a single spark plug fails.

In software, modularity allows you to design for change.

Modularity also gives developers a shared map that can be used to reason about the functionality of an application. By being able to visualize and map out the complex processes that are orchestrated by an application’s source code, developers and architects alike can more easily visualize where to make a change with surgical precision.

Changing software

In many ways, we should consider ourselves lucky to be building software instead of cars. Some of today’s most valuable companies are created using bits and bytes instead of plastic and metal. But despite these advances, the very best car company releases less often than the world’s very worst software company.

An application’s source code is a system of connected bits and bytes that is always evolving—one change after another. But, as the source code of a system expands or contracts, small changes require us to build and deploy entire applications.

To make one small code change to a production environment, we are required to deploy everything else we didn’t change.

When teams share a deployment pipeline for an application, teams become forced to plan around a schedule they have little or no control over. For this reason, innovation is stifled—as developers must wait for the next bus before they can get any feedback about their changes.

The result of building microservices is an ever increasing number of pathways to production. With more and more microservices, the amount of unchanged code per deployment decreases when measured across all applications. It’s the decomposition in microservices that ends up breeding lower unchanged code deployed over time—an important metric. Serverless functions can help to get this number even lower—as the unit of change becomes the function. But, how do microservices and serverless functions fit together?

Service Blocks

Service blocks are cloud-native applications that share many characteristics with microservices. The key difference with microservices is that a service block is a self-contained system that has multiple independently deployable units—mixing together serverless functions with containers.

While microservices can be created entirely as serverless functions, a service block focuses on a contextual model that combines together traditional "always-on" applications with portable on-demand functions.

The Patterns

The basic pattern of a service block combines a core application running in a container with a collection of serverless functions.

Service Block Patterns Spring Cloud Function

Tuesday, January 31, 2017

This blog series will introduce you to building event-driven microservices as cloud-native applications.

In this first post, we’ll explore how to implement the CQRS pattern in microservices. We’ll also dive into why serverless is a natural fit for these kinds of systems. Later in the series we’ll explore a reference application that uses Spring Cloud Stream to implement CQRS.

What is an event-driven architecture?

Event-driven architectures treat domain events as first-class citizens. This approach is as old as software itself.

One example we use every day is in front-end applications. In every web browser in use today, events are handled as a way to capture inputs of a user form. Events connected to page elements are handled by an explicitly mapped function, sometimes referred to as an action or command, which will apply state changes to a user interface when triggered.

Now, more recently, with the widespread adoption of microservices, there is renewed interest in how to take advantage of event-driven techniques in distributed back-end systems.

CQRS

One of the most popular practices in event-driven architectures today is called CQRS, which is short for Command Query Responsibility Segregation. CQRS is a style of architecture that allows you to use different models to update and read domain data.

The basic idea of CQRS is that it’s perfectly natural to need to separate the models you’re using to update and read data. The diagram above shows this basic idea.

Tuesday, August 30, 2016

It’s safe to say that any company who was writing software ten years ago—and is building microservices today—will need to integrate with legacy systems. In this article, we will explore building cloud-native microservices that still need to integrate with legacy systems. We’ll use practices from Martin Fowler’s Strangler Application to slowly strangle domain data away from a legacy system using microservices.

When building microservices, the general approach is to take existing monoliths and to decompose their components into new microservices. The most critical concerns in this method have much less to do with the application code and more to do with handling data. This article will focus on various methods of strangling a monolith’s ownership of domain data by transitioning the system of record over time.

Throughout this article, we’ll use a reference application built with Spring Boot and Spring Cloud. The example demonstrates techniques for integrating a cloud-native microservice architecture with legacy applications in an existing SOA.

Going Cloud Native

Many companies want to start taking advantage of the public cloud without having to migrate every line of business application at the same time. The reasons for this are numerous. The existing line of business applications can be thought of as the vital organs of a living organism. During the migration, think about the complex relationships between the existing components deployed to your infrastructure. Think about the dependencies of your applications and the connections between them. Think about every application that relies on a database or network file system. These are among the many considerations that will cause a migration to become an expensive and time-consuming project.

The unfathomable complexity of migrating applications to the cloud can delay a decision from being made until the business deems necessary, which is usually triggered by a major event that results in a loss of revenue. This ticking time bomb will result in a lift-and-shift migration of applications. The problem with the lift and shift approach is that any technical debt that you had on-premises finds new life in the cloud environment. The problem being, the architectural and infrastructure issues that triggered the cloud migration still may not be fixed.

The approach I discuss in this article focuses on addressing the underlying symptoms of legacy systems that decrease system resiliency and lead to costly failures.

Tuesday, April 19, 2016

When building applications in a microservice architecture, managing state becomes a distributed systems problem. Instead of being able to manage state as transactions inside the boundaries of a single monolithic application, a microservice must be able to manage consistency using transactions that are distributed across a network of many different applications and databases.

In this article we will explore the problems of data consistency and high availability in microservices. We will start by taking a look at some of the important concepts and themes behind handling data consistency in distributed systems.

Throughout this article we will use a reference application of an online store that is built with microservices using Spring Boot and Spring Cloud. We’ll then look at how to use reactive streams with Project Reactor to implement event sourcing in a microservice architecture. Finally, we’ll use Docker and Maven to build, run, and orchestrate the multi-container reference application.

Eventual Consistency

When building microservices, we are forced to start reasoning about state in an architecture where data is eventually consistent. This is because each microservice exclusively exposes resources from a database that it owns. Further, each of these databases would be configured for high availability, with different consistency guarantees for each type of database.

Eventual consistency is a model that is used to describe some operations on data in a distributed system—where state is replicated and stored across multiple nodes of a network. Typically, eventual consistency is talked about when running a database in high availability mode, where replicas are maintained by coordinating writes between multiple nodes of a database cluster. The challenge of the database cluster is that writes must be coordinated to all replicas in the exact order that they were received. When this happens, each replica is considered to be eventually consistent—that the state of all replicas are guaranteed to converge towards a consistent state at some point in the future.

When first building microservices, eventual consistency is a frequent point of contention between developers, DBAs, and architects. The head scratching starts to occur more frequently when the architecture design discussions begin to turn to the topic of data and handling state in a distributed system. The head scratching usually boils down to one question.

How can we guarantee high availability while also guaranteeing data consistency?

To answer this question we need to understand how to best handle transactions in a distributed system. It just so happens that most distributed databases have this problem nailed down with a healthy helping of science.

Transaction Logs

Mostly all databases today support some form of high availability clustering. Most database products will provide a list of easy to understand guarantees about a system’s consistency model. A first step to achieving safety guarantees for stronger consistency models is to maintain an ordered log of database transactions. This approach is pretty simple in theory. A transaction log is an ordered record of all updates that were transacted by the database. When transactions are replayed in the exact order they were recorded, an exact replica of a database can be generated.

The diagram above represents three databases in a cluster that are replicating data using a shared transaction log. The zipper labeled Primary is the authority in this case and has the most current view of the database. The difference between the zippers represent the consistency of each replica, and as the transactions are replayed, each replica converges to a consistent state with the Primary. The basic idea here is that with eventual consistency, all zippers will eventually be zipped all the way up.

Sunday, January 3, 2016

This article introduces you to a sample application that combines multiple microservices with a graph processing platform to rank communities of users on Twitter. We’re going to use a collection of popular tools as a part of this article’s sample application. The tools we’ll use, in the order of importance, will be:

Ranking Twitter Profiles

Let’s do an overview of the problem we will solve as a part of our sample application. The problem we’re going to solve is how to discover communities of influencers on Twitter using a set of seed profiles as inputs. To solve this problem without a background in machine learning or social network analytics might be a bit of a stretch, but we’re going to take a stab at it using a little bit of computer science history.

The PageRank algorithm, created by Google co-founder Larry Page, was first used by Google to rank website documents from analyzing the graph of backlinks between sites.

I dug up the original research paper on PageRank from Stanford for some inspiration. In the paper, the authors talk about the notion of approximating the "importance" of an academic publication by weighting the value of its citations.

The reason that PageRank is interesting is that there are many cases where simple citation counting does not correspond to our common sense notion of importance. For example, if a webpage has a link to the Yahoo home page, it may be just one link but it is a very important one. This page should be ranked higher than many pages with more links but from obscure places. PageRank is an attempt to see how good an approximation to "importance" can be obtained just from the link structure.

— Page, Lawrence and Brin, Sergey and Motwani, Rajeev and Winograd, Terry (1999)
The PageRank Citation Ranking: Bringing Order to the Web

Now let’s take the same definition that is described in the paper and apply it to our problem of discovering important profiles on Twitter. Twitter users typically follow other users to track their updates as a part of their stream. We can use the same reasoning behind using PageRank on citations to approximate the "importance" of profiles on Twitter. This reasoning would tell us that it’s not the number of followers that make a profile important, it is measured by how important those followers are.

That’s exactly what we’re going to build in this article, and we’ll end up with something that looks like the following table.

Rank	Name	Profile	Following	Followers	PageRank
1.	Paul Ford	@ftrain	175	31948	7368.2417
2.	harper	@harper	1024	32452	6754.455
3.	Bret Victor	@worrydream	35	37658	6747.585
4.	Lincoln Stoll	@lstoll	872	41067	5976.3555
5.	kate matsudaira	@katemats	20465	25799	5916.3843
6.	rands	@rands	741	35079	5888.145
7.	Alex Payne	@al3x	405	41099	5547.4307
8.	Chris Wanstrath	@defunkt	5645	45310	4787.9644
9.	SaraReyChipps	@SaraJChipps	1007	29617	4271.676
10.	Leah Culver	@leahculver	491	30723	3852.3728

The first thing we’re going to need to worry about when building this solution is how we’re going to calculate PageRank on potentially millions of users and links. To do this, we’re going to use something called a graph processing platform.

What is a graph processing platform?

A graph processing platform is an application architecture that provides a general-purpose job scheduling interface for analyzing graphs. The application we’ll build will make use of a graph processing platform to analyze and rank communities of users on Twitter. For this we’ll use Neo4j Mazerunner, an open source project that I started that connects Neo4j’s database server to Apache Spark.

The diagram below illustrates a graph processing platform similar to Neo4j Mazerunner.

Submitting PageRank Jobs to GraphX

The graph processing platform I’ve described will provide us with a general purpose API for submitting PageRank jobs to Apache Spark’s GraphX module from Neo4j. The PageRank results from GraphX will be automatically applied back to Neo4j without any additional work to manually handle data loading. The workflow for this is extremely simple for our purposes. From a backend service we will only need to make a simple HTTP request to Neo4j to begin a PageRank job.

I’ve also taken care of making sure that the graph processing platform is easily deployable to a cloud provider using Docker containers. In a previous article, I describe how to use Docker Compose to run Mazerunner as a multi-container application. We’ll do the same for this sample application but extend the Docker Compose file to include additional Spring Boot applications that will become our backend microservices.

By default, Docker Compose will orchestrate containers on a single virtual machine. If we were to build a truly fault tolerant and resilient cloud-based application, we’d need to be sure to scale our system to multiple virtual machines using a cloud platform. This is the subject of a later article.

Now that we understand how we will use a graph processing platform, let’s talk about how to build a microservice architecture using Spring Boot and Spring Cloud to rank profiles on Twitter.

Building Microservices

I’ve talked a lot about microservices in past articles. When we talk about microservices we are talking about developing software in the context of continuous delivery. Microservices are not just smaller services that scale horizontally. When we talk about microservices, we are talking about being able to create applications that are the product of many teams delivering continuously in independent release cycles. Josh Long and I describe at length how to untangle the patterns of building and operating JVM-based microservices in O’Reilly’s Cloud Native Java.

In this sample, we’ll build 4 microservices, each as a Spring Boot application. If we were to build this architecture as microservices in an authentic scenario, each microservice would be owned and managed by a different team. This is an important differentiation in this new practice, as there is much confusion around what a microservice is and what it is not. A microservice is not just a distributed system of small services. The practice of building microservices should never be without the discipline of continuous delivery.

For the purposes of this article, we’ll focus on scenarios that help us gain experience and familiarity with building distributed systems that resemble a microservice architecture.

Overview

Now let’s do a quick overview of the concepts we’re going to cover as a part of this sample application. We will apply the same recipe from previous articles on similar topics for building microservices with Spring Boot and Spring Cloud. The key difference from my previous articles is that we are going to create a data service that does both batch processing tasks as well as exposing data as HTTP resources to API consumers.

System Architecture Diagram

The diagram below shows each component and microservice that we will create as a part of this sample application. Notice how we’re connecting the Spring Boot applications to the graph processing platform we looked at earlier. Also, notice the connections between the services, these connections define communication points between each service and what protocol is used.

Microservice architecture with Spring Boot

The three applications that are colored in blue are stateless services. Stateless services will not attach a persistent backing service or need to worry about managing state locally. The application that is colored in green is the Twitter Crawler service. Components that are colored in green will typically have an attached backing service. These backing services are responsible for managing state locally, and will either persist state to disk or in-memory.

Tuesday, August 25, 2015

This series continues from the last blog post about building microservices using Spring Cloud. This post has two parts. The first part describes how to create cloud-native data services using Spring Boot. The second part is a companion example project that uses Docker Compose to run multiple microservices locally to simulate a polyglot persistence setup.

What is polyglot persistence?

Polyglot persistence is a term that describes an architecture that uses a collection of different database solutions as a part of a platform’s core design. More plainly, each backing service is managed from an exclusive connection to a Spring Boot service that exposes domain data as HTTP resources.

The central idea behind polyglot persistence is that service architectures should be able to utilize the best languages for the job at hand. There is no clear definition of how to do this well, and it tends to evolve organically as central databases become cumbersome when required to add new features.

Spring Boot Roles

When designing microservices that manage exclusive access to multiple data providers, it can be useful to think about the roles in which your microservices will play.

We can think of a Spring Boot application as the basic building block for our microservice architecture.

Figure 1. Each Spring Boot application plays a role when integrating with other services

The diagram above describes six Spring Boot applications that are color coded to describe the role they play when integrated using Spring Cloud.

Data Services

Each Spring Boot application in a microservices architecture will play a role to varying degrees of importance. The data service role is one of the most important roles in any setup. This role handles exposing the application’s domain data to other microservices in the platform.

Polyglot Data Services

The diagram below describes an example microservice architecture with multiple Spring Boot applications that expose data from multiple database providers.

Figure 2. Example Polyglot Persistence Architecture

Sunday, July 12, 2015

This blog series will introduce you to some of the foundational concepts of building a microservice-based platform using Spring Cloud and Docker.

What is Spring Cloud?

Spring Cloud is a collection of tools from Pivotal that provides solutions to some of the commonly encountered patterns when building distributed systems. If you’re familiar with building applications with Spring Framework, Spring Cloud builds upon some of its common building blocks.

Among the solutions provided by Spring Cloud, you will find tools for the following problems:

Spring Boot

The great part about working with Spring Cloud is that it builds on the concepts of Spring Boot.

For those of you who are new to Spring Boot, the name of the project means exactly what it says. You get all of the best things of the Spring Framework and ecosystem of projects, tuned to perfection, with minimal configuration, all ready for production.

Service Discovery and Intelligent Routing

Each service has a dedicated purpose in a microservices architecture. When building a microservices architecture on Spring Cloud, there are a few primary concerns to deal with first. The first two microservices you will want to create are the Configuration Service, and the Discovery Service.

Microservice Configuration

The graphic above illustrates a 4-microservice setup, with the connections between them indicating a dependency.

The configuration service sits at the top, in yellow, and is depended on by the other microservices. The discovery service sits at the bottom, in blue, and also is depended upon by the other microservices.

In green, we have two microservices that deal with a part of the domain of the example application I will use throughout this blog series: movies and recommendations.

Configuration Service

The configuration service is a vital component of any microservices architecture. Based on the twelve-factor app methodology, configurations for your microservice applications should be stored in the environment and not in the project.

The configuration service is essential because it handles the configurations for all of the services through a simple point-to-point service call to retrieve those configurations. The advantages of this are multi-purpose.

Let's assume that we have multiple deployment environments. If we have a staging environment and a production environment, configurations for those environments will be different. A configuration service might have a dedicated Git repository for the configurations of that environment. None of the other environments will be able to access this configuration, it is available only to the configuration service running in that environment.

Microservice Configuration 2

Thursday, May 14, 2015

This blog post will take some of my learnings in developing microservices and apply a graph processing technique to simulate the decomposition of service architectures into microservices.

What is a microservice?

Microservices are an extension of SOA principles that are better suited for agile software development. A microservice architecture usually starts from decomposing monolithic applications into services that are cheaper to evolve and easier to throw away. The guiding theme behind this movement is to decentralize change management and reduce conflicts that tend to cause roadblocks in an SOA-based platform.

Using Data to Design Better Technology Platforms

Microservices aren't new. The pattern has been adopted at many software companies.

When companies on an SOA add new features to their platform, there tends to be a fair amount of conflicts between service teams. Certain services in the SOA become more relied upon by other services or applications in the platform.

What I've seen is that services tend towards growth rather than decomposing into smaller units. It's far easier to add features to existing services than to create new services that require operational support. Every new service requires a focus on deployment and configurations. The complexity can be tough to support with rigid processes and a lack of focus on automation.

Jumping head first into microservices is a major commitment. A monolith will have highly centralized components that will gain more mass as new microservices are born, adding additional complexity with each service call to replace modules or add functionality. It's important to analyze these connections to understand which services in an SOA are becoming more depended on.

Measuring Service Centrality

My time spent using graphs to analyze data has given me a great tool for understanding how to use data to drive decisions on decomposing an SOA. The first metric that I will use is network centrality. This metric measures how centralized a service is within a network of dependencies.

The whole idea here is to determine what components in a service are good candidates for a microservice. This can be determined by finding a component that will be the highest contributor to decreasing the overall centrality of a service, once removed.

The graph metric for centrality is a great starting point to analyze how services are gaining mass and how best to decompose services.

Decomposition Strategy

The decomposition strategy that I would like to demonstrate is based on RESTful web services that manage a set of resources.

Each service will expose a set of REST API methods to interact with the resources of the domain. The graph data model that will be used to calculate centrality will be represented by relationships of service to service interactions.

Graphs are a great way to model the resources of a domain and their interactions. Below I've sketched out a domain model for an eCommerce website based on an example by Chris Richardson.

Store front domain resources

This domain model has a set of resources which are represented by their label. Those resources are:

Customer
Order
Account
Address
Product
Warehouse
Credit Card

In a monolithic architecture all of our services will be contained in a single project, for example a WAR, with modules representing each service.

From Chris's example, we have the following deployment model:

Deployment model

From this example deployment model I've mapped the calls from each module to resources in the domain. That ends up looking like this:

Service to resource mappings

As systems scale and dependencies grow, they become harder for us to understand. However, these mappings can be tremendously valuable to understand which service is best suited to first become a microservice.

Mapping Stories to Release Artifacts

Conway's law states that organizations are constrained to produce systems that mirror their communication structures. In order to make the jump to microservices we need to scale teams horizontally and not vertically. To do this well, we need to figure out how to split applications into independently releasable containers. One principle metric to be aware of is the number of business stories that are affected per release. Each of these stories have a certain level of functionality that drives revenue for the business. This can help determine which features are more valuable in terms of revenue than others.

Let's take for example the following story.

As a user, I want to be able to browse the product catalog so that I can find products I want to buy.

If the product catalog becomes unavailable to users of the website, there will be an impact to revenue. This shows that not all user stories have the same business value.

Ideally we want to find ways to empower single teams to be accountable for single stories. This way, if there is an outage that affects a story, teams will have more autonomy to bring that functionality back online.

Dependency Graph

Below you will find an example graph data model of the service dependencies shared between containers, services, resources, and user stories that describe product features.

Service Dependency Model

In order to generate a rich dataset to analyze, I chose to use the concept of a user story as an added dimension to the dependency graph. User stories do well to group together a set of features. These features act as a good boundary criteria for determining how to make components more modular from a business value perspective.

The relationships between concepts in this dependency graph are driven by the following rules:

User stories depend on domain resources
Domain resources are owned by services
Services are managed by teams
A service belongs to a deployment container

Interactive Neo4j GraphGist Example

I've put together a step by step walkthrough of how you can use Neo4j to do graph analysis to functionally decompose a monolithic application into microservices.

In the coming months I will be focusing a lot on this topic with demos that revolve around how to build great microservice architectures using Spring boot.

Subscribe to: Posts ( Atom )

Pages