Building Microservices with Polyglot Persistence Using Spring Cloud and Docker

Tuesday, August 25, 2015

This post continues the series that began with the last blog post about building microservices using Spring Cloud. It has two parts. The first part describes how to create cloud-native data services using Spring Boot. The second part is a companion example project that uses Docker Compose to run multiple microservices locally to simulate a polyglot persistence setup.

What is polyglot persistence?

Polyglot persistence is a term that describes an architecture that uses a collection of different database solutions as a part of a platform’s core design. More plainly, each backing service is managed from an exclusive connection to a Spring Boot service that exposes domain data as HTTP resources.

The central idea behind polyglot persistence is that service architectures should be able to use the best data store for the job at hand. There is no clear definition of how to do this well, and it tends to evolve organically as central databases become cumbersome when new features must be added.

Spring Boot Roles

When designing microservices that manage exclusive access to multiple data providers, it can be useful to think about the roles your microservices will play.

We can think of a Spring Boot application as the basic building block for our microservice architecture.

Microservice Roles
Figure 1. Each Spring Boot application plays a role when integrating with other services

The diagram above describes six Spring Boot applications that are color coded to describe the role they play when integrated using Spring Cloud.

Data Services

Each Spring Boot application in a microservices architecture will play a role to varying degrees of importance. The data service role is one of the most important roles in any setup. This role handles exposing the application’s domain data to other microservices in the platform.

Polyglot Data Services

The diagram below describes an example microservice architecture with multiple Spring Boot applications that expose data from multiple database providers.

Polyglot Persistence Microservices
Figure 2. Example Polyglot Persistence Architecture

We can see that our User Service connects to two databases: MySQL and Couchbase. We can also see that our Rating Service swaps out MySQL (RDBMS) for a Neo4j graph database.

One of the reasons why you might decide to use a polyglot persistence setup for a microservice architecture is that it gives you the benefit of using the best database model for the use case. For instance, I decided to use Neo4j for the Rating Service because the shape of the data for ratings can be used to generate recommendations using Apache Spark.

Configuring a Data Service

Let’s look at some common characteristics of a data service in Spring Boot when using Spring Cloud.

Spring Data

A Spring Boot application that acts as a data service is responsible for managing data access on behalf of other applications in the architecture. To do this, we can use another project in the Spring ecosystem, Spring Data.

What is Spring Data?

Spring Data is a project in the Spring Framework ecosystem of tools that provides a familiar abstraction for interacting with a data store while preserving the special traits of its database model.

Anyone who has worked with the Spring Framework over the years has a good idea how to use Spring Data. If you’re not familiar, please take a look at the Spring Data guides to get working examples.

Creating a Data Service

When deciding to create a new data service for a cloud-native application, it is helpful to first examine the domain model of the application.

Movie Domain
Figure 3. Graph model of the movie domain of our example application

In the graph data model above we can see the common entities that we need to expose from our services. The nodes represent the domain entities within our movie application.

  • User

  • Movie

  • Genre

The connections between these entities give us a good idea of our boundaries that we need to consider when designing our microservices. For instance, we may have a requirement to analyze the ratings data between movies and users to generate movie recommendations.

For this example project we will use three data services:

  • Rating Service (Neo4j)

  • Movie Service (MySQL)

  • User Service (MySQL)

Before we can get started creating our data services, let’s talk about what the anatomy of a Spring Boot data service looks like in a cloud-native application with Spring Cloud.

Anatomy of a Spring Boot Data Service

This section takes a deep dive into how Spring Boot applications are automatically configured to use data sources with Spring Data. Since each Spring Boot application is integrated using Spring Cloud, it is helpful to understand how these applications bootstrap their dependencies.

The dependencies that each of our data services will have in common are declared in the Spring Boot application’s pom.xml and are listed below.

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-rest</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-jpa</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-config-server</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-eureka</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-zuul</artifactId>
</dependency>
<dependency>
    <groupId>mysql</groupId>
    <artifactId>mysql-connector-java</artifactId>
</dependency>

Bootstrapping Datasource Dependencies

One of the core principles of Spring Boot is minimal configuration. Spring Boot will automatically scan the classpath of an application at startup and bootstrap dependencies. For example, if I add spring-boot-starter-data-jpa as one of my dependencies, the data source is automatically configured by looking for a compatible database driver on the classpath.

In the example snippet from the pom.xml, I’ve specified the mysql-connector-java dependency. Now when I start the Spring Boot application, this MySQL driver will automatically be used to configure our default data source for JPA.
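
To make the idea of classpath-driven configuration more concrete, here is a minimal sketch in plain Java. It is not how Spring Boot is implemented (Spring Boot expresses this with conditions such as @ConditionalOnClass); the class and method names are hypothetical and only illustrate the principle of checking whether a driver class can be loaded before configuring anything.

```java
// Simplified illustration of classpath-driven configuration.
// DriverDetection and its methods are hypothetical names, not Spring Boot APIs.
final class DriverDetection {

    // Returns true when the named class can be loaded from the classpath.
    static boolean isPresent(String className) {
        try {
            // 'false' avoids running static initializers; we only probe for presence.
            Class.forName(className, false, DriverDetection.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Returns the first candidate class found on the classpath, or null if none is present.
    static String firstAvailable(String... candidates) {
        for (String candidate : candidates) {
            if (isPresent(candidate)) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Candidate JDBC driver class names, checked in order of preference.
        String driver = firstAvailable("com.mysql.jdbc.Driver", "org.h2.Driver");
        System.out.println(driver == null
                ? "No JDBC driver on the classpath; no data source configured"
                : "Configuring data source with " + driver);
    }
}
```

Run without any driver JAR on the classpath, the sketch reports that no data source would be configured; adding the MySQL connector changes the outcome without any code change, which is the behavior described above.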

The data source connection details are retrieved from the configuration service for a specific environment. Those configurations are contained in application.yml. Below is an example application.yml for a Spring Data JPA application that connects to a MySQL database. This is similar to how the Movie Service in the example project is configured.

spring:
  profiles:
    active: development
---
spring:
  profiles: development
  jpa:
    show_sql: false
    database: MYSQL
    hibernate:
      ddl-auto: none
  datasource: # (1)
    url: jdbc:mysql://localhost/test
    username: dbuser
    password: dbpass
1 The spring.datasource property block is where you configure connection details.

Spring Cloud Dependencies

The Spring Cloud dependencies that I’ve specified in the pom.xml will be common and standard throughout our connected data services. The dependency we can expect to change per the requirements of the attached data store will be the Spring Data project we choose for that data service.

Config Server

The spring-cloud-config-server dependency listed in the pom.xml is used to tell our Spring Boot application to use a configuration server to retrieve environment configurations.

Spring Configurations
Figure 4. Config server enables automatic configuration per environment in a Git repository

By adding this dependency to our classpath, we can configure the service to retrieve a set of configurations for a specific Spring profile. A Spring profile could define configurations for an environment, for instance, staging and production profiles. Retrieving configurations for a profile is an important feature of a data service since we will connect to different databases in different environments.
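
The effect of profile-specific configuration can be sketched in plain Java as a set of default properties overlaid with the values for the active profile. This is only an illustration of the idea, not the config server's implementation; the ProfileConfig class, the staging and production URLs, and the host names are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: default properties overlaid with profile-specific overrides.
final class ProfileConfig {
    private final Map<String, String> defaults = new LinkedHashMap<>();
    private final Map<String, Map<String, String>> byProfile = new LinkedHashMap<>();

    ProfileConfig setDefault(String key, String value) {
        defaults.put(key, value);
        return this;
    }

    ProfileConfig set(String profile, String key, String value) {
        byProfile.computeIfAbsent(profile, p -> new LinkedHashMap<>()).put(key, value);
        return this;
    }

    // Starts from the defaults and applies the overrides of the active profile, if any.
    Map<String, String> resolve(String activeProfile) {
        Map<String, String> resolved = new LinkedHashMap<>(defaults);
        Map<String, String> overrides = byProfile.get(activeProfile);
        if (overrides != null) {
            resolved.putAll(overrides);
        }
        return resolved;
    }

    // Sample data; the staging and production URLs are made up for illustration.
    static ProfileConfig sample() {
        return new ProfileConfig()
                .setDefault("spring.jpa.database", "MYSQL")
                .setDefault("spring.datasource.url", "jdbc:mysql://localhost/test")
                .set("staging", "spring.datasource.url", "jdbc:mysql://staging-db.example.com/movies")
                .set("production", "spring.datasource.url", "jdbc:mysql://prod-db.example.com/movies");
    }

    public static void main(String[] args) {
        for (String profile : new String[] {"development", "staging", "production"}) {
            System.out.println(profile + " -> " + sample().resolve(profile).get("spring.datasource.url"));
        }
    }
}
```

The same service binary then connects to a different database in each environment simply by starting with a different active profile.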

Eureka Discovery

The spring-cloud-starter-eureka dependency is used to tell our Spring Boot application that it should register itself with the Eureka discovery service on startup.

Eureka is a service registry that provides us with a way to automatically discover and connect to other data services using the ID of a Spring Boot application. Further, as we scale up the number of instances of a data service, a client-side load balancer will automatically route requests to registered instances of the same service ID.

Discovery
Figure 5. Eureka client-side load balancing with Ribbon
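
As a rough illustration of client-side load balancing, the sketch below keeps a list of registered instances per service ID and hands them out in round-robin order. It is a simplification under stated assumptions: Ribbon's actual balancing rules are configurable and more sophisticated, and the RoundRobinRegistry class and the instance addresses are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a registry with round-robin selection per service ID.
final class RoundRobinRegistry {
    private final Map<String, List<String>> instances = new ConcurrentHashMap<>();
    private final Map<String, AtomicInteger> counters = new ConcurrentHashMap<>();

    // Called when an instance of a service starts up and registers itself.
    void register(String serviceId, String address) {
        instances.computeIfAbsent(serviceId, id -> new CopyOnWriteArrayList<>()).add(address);
    }

    // Picks the next instance for the service ID, cycling through all registered instances.
    String choose(String serviceId) {
        List<String> candidates = instances.get(serviceId);
        if (candidates == null || candidates.isEmpty()) {
            throw new IllegalStateException("No instances registered for " + serviceId);
        }
        AtomicInteger counter = counters.computeIfAbsent(serviceId, id -> new AtomicInteger());
        // floorMod keeps the index valid even if the counter overflows.
        int index = Math.floorMod(counter.getAndIncrement(), candidates.size());
        return candidates.get(index);
    }

    public static void main(String[] args) {
        RoundRobinRegistry registry = new RoundRobinRegistry();
        registry.register("movie", "10.0.0.1:8080"); // made-up addresses
        registry.register("movie", "10.0.0.2:8080");
        for (int i = 0; i < 4; i++) {
            System.out.println("movie -> " + registry.choose("movie"));
        }
    }
}
```

Because the caller only ever refers to the service ID, scaling a data service up or down changes the list of candidates without touching client code.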

Zuul Gateway

The spring-cloud-starter-zuul dependency is used to tell our Spring Boot application that it should advertise its HTTP routes to other services using a reverse proxy lookup. This technique is called a sidecar proxy, which is used to expose domain resources to applications that do not register with Eureka. The use of a sidecar on an API Gateway is helpful if you have applications written in languages that do not run on the JVM.

By adding the annotation @EnableZuulProxy on the Spring Boot application class, your service will automatically add HTTP routes advertised by other services through Eureka.

curl -X GET 'http://service.cfapps.io/routes'

By making a request to the /routes endpoint of a Zuul-enabled service, you will get back a manifest of services that have registered with Eureka and are exposing a REST API or HTTP route.

{
  "_links": {
    "self": {
      "href": "http://service.cfapps.io/routes",
      "templated": false
    }
  },
  "/rating/**": "rating",
  "/user/**": "user",
  "/movie/**": "movie",
  "/gateway/**": "gateway",
  "/moviesui/**": "moviesui"
}

The result shows that we have multiple services registered with their service ID as routes we can make requests to. Let’s see the result of calling the movie service’s route at /movie/**.

curl -X GET 'http://service.cfapps.io/movie'
{
  "_links": {
    "movies": {
      "href": "http://service.cfapps.io/movie/movies{?page,size,sort}",
      "templated": true
    },
    "profile": {
      "href": "http://service.cfapps.io/movie/alps",
      "templated": false
    }
  }
}

We can now see the list of links that are advertised by the movie service’s root. We can see that this service has a single repository exposed as a REST resource at /movie/movies and that it is a paging and sorting repository.
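
The way a gateway maps a request path onto a service ID can be sketched in a few lines of plain Java. This is not Zuul's implementation: the RouteTable class is hypothetical, it only understands patterns of the form /prefix/**, and it assumes the matched prefix is stripped before the request is forwarded.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of prefix-based route resolution, e.g. "/movie/**" -> "movie".
final class RouteTable {
    // Maps a path prefix such as "/movie" to a service ID such as "movie".
    private final Map<String, String> routes = new LinkedHashMap<>();

    RouteTable add(String pattern, String serviceId) {
        if (!pattern.endsWith("/**")) {
            throw new IllegalArgumentException("Only /prefix/** patterns are supported: " + pattern);
        }
        routes.put(pattern.substring(0, pattern.length() - 3), serviceId);
        return this;
    }

    // Returns "serviceId:/remaining/path" for the first matching route, or null if none matches.
    String resolve(String path) {
        for (Map.Entry<String, String> route : routes.entrySet()) {
            String prefix = route.getKey();
            // Require a full path segment so "/movie" does not match "/moviesui".
            if (path.equals(prefix) || path.startsWith(prefix + "/")) {
                String remainder = path.substring(prefix.length());
                return route.getValue() + ":" + (remainder.isEmpty() ? "/" : remainder);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        RouteTable table = new RouteTable()
                .add("/rating/**", "rating")
                .add("/user/**", "user")
                .add("/movie/**", "movie")
                .add("/moviesui/**", "moviesui");
        System.out.println(table.resolve("/movie/movies"));
        System.out.println(table.resolve("/moviesui/"));
    }
}
```

Under these assumptions, a request to /movie/movies on the gateway is forwarded to the path /movies on an instance of the movie service.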

The JSON format we are looking at is HAL (Hypertext Application Language), a hypermedia format for JSON and XML based on the principles of HATEOAS (Hypermedia as the Engine of Application State). This format provides a way for clients to traverse a REST API using embedded links.
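
Links marked as templated, such as /movie/movies{?page,size,sort}, have to be expanded by the client before they can be requested. The sketch below shows one way to do that in plain Java. It is a minimal illustration that handles only a single {?a,b,c} query expression, not the full URI Template specification, and the HalLinkTemplate class is a hypothetical helper.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that expands a single "{?a,b,c}" query expression in a HAL link.
final class HalLinkTemplate {

    // Variables without a supplied value are left out of the resulting query string.
    static String expand(String href, Map<String, String> values) {
        int start = href.indexOf("{?");
        int end = href.indexOf('}', Math.max(start, 0));
        if (start < 0 || end < 0) {
            return href; // not a templated link
        }
        StringBuilder query = new StringBuilder();
        for (String rawName : href.substring(start + 2, end).split(",")) {
            String name = rawName.trim();
            String value = values.get(name);
            if (value == null) {
                continue;
            }
            query.append(query.length() == 0 ? '?' : '&')
                 .append(name).append('=').append(encode(value));
        }
        return href.substring(0, start) + query + href.substring(end + 1);
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> values = new LinkedHashMap<>();
        values.put("page", "1");
        values.put("size", "20");
        System.out.println(expand("http://service.cfapps.io/movie/movies{?page,size,sort}", values));
    }
}
```

A client that follows links this way never has to hard-code paging parameters; it only supplies values for the variables the service advertises.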

We can now traverse into the movie service’s movies repository and take a look at the results.

curl -X GET 'http://service.cfapps.io/movie/movies'

The results of this request show a traversable page of items that are returned by the paging and sorting repository for this domain entity.

{
  "_links": {
    "first": {
      "href": "http://service.cfapps.io/movie/movies?page=0&size=20",
      "templated": false
    },
    "self": {
      "href": "http://service.cfapps.io/movie/movies",
      "templated": false
    },
    "next": {
      "href": "http://service.cfapps.io/movie/movies?page=1&size=20",
      "templated": false
    },
    "last": {
      "href": "http://service.cfapps.io/movie/movies?page=83&size=20",
      "templated": false
    },
    "search": {
      "href": "http://service.cfapps.io/movie/movies/search",
      "templated": false
    }
  },
  "_embedded": {
    "movies": [
      {
        "id": 1,
        "title": "Toy Story (1995)",
        "released": 788918400000,
        "url": "http://us.imdb.com/M/title-exact?Toy%20Story%20(1995)",
        "genres": [
          {
            "name": "Animation"
          },
          {
            "name": "Children's"
          },
          {
            "name": "Comedy"
          }
        ],
        "_links": {
          "self": {
            "href": "http://service.cfapps.io/movie/movies/1",
            "templated": false
          },
          "movie": {
            "href": "http://service.cfapps.io/movie/movies/1",
            "templated": false
          }
        }
      }
      ...

Adding a Neo4j Data Service

Now let’s take a look at what a Spring Boot data service would look like if it exposed data from a Neo4j database. Since Spring Data provides a project for Neo4j, we can use a set of features that take advantage of the specialized traits of a graph database.

Instead of needing to specify a database driver in my classpath like we did with MySQL, I can provide dependencies in my pom.xml for the Spring Data Neo4j project.

<dependency>
   <groupId>org.springframework.data</groupId>
   <artifactId>spring-data-neo4j</artifactId>
   <version>3.4.0.RC1</version>
</dependency>
<dependency>
   <groupId>org.springframework.data</groupId>
   <artifactId>spring-data-neo4j-rest</artifactId>
   <version>3.3.0.M1</version>
</dependency>

Now I can use the specific features of the Spring Data Neo4j project, which provides native graph features such as routing and graph traversals.

Rating Service

Going back to our domain model from earlier, I can see that users in my application can rate movies. We can use our Neo4j graph store as a way to index the connections between users and movies and later use that data to generate recommendations, a leading use case for graph databases.

Movie Domain
Figure 6. The domain model shows how we can use a graph database to analyze the connections between movies and users

Using the Zuul enabled reverse proxy, let’s take a look at what the Rating Service exposes from Neo4j.

curl -X GET 'http://service.cfapps.io/rating'
{
  "_links": {
    "products": {
      "href": "http://service.cfapps.io/rating/products{?page,size,sort}",
      "templated": true
    },
    "ratings": {
      "href": "http://service.cfapps.io/rating/ratings{?page,size,sort}",
      "templated": true
    },
    "users": {
      "href": "http://service.cfapps.io/rating/users{?page,size,sort}",
      "templated": true
    },
    "profile": {
      "href": "http://service.cfapps.io/rating/alps",
      "templated": false
    }
  }
}

We see from the results that we have three repositories that are exposed through REST and HATEOAS. The /rating/products endpoint is a generic form of the movie domain entity from our other service. Later we may want to offer things other than movies. This generic term saves us from having to change the semantics if we enter a new line of business and still need recommendations.

One key difference from our Spring Data JPA MySQL repository for movies is that a graph model has different underlying entity types that describe our data: nodes and relationships.

Let’s take a look at the /rating/ratings endpoint, which exposes the ratings of a user and movie.

...
"ratings": [
{
  "id": 87863,
  "timestamp": 881252305,
  "rating": 1,
  "_links": {
    "self": {
      "href": "http://service.cfapps.io/ratings/87863",
      "templated": false
    },
    "rating": {
      "href": "http://service.cfapps.io/ratings/87863",
      "templated": false
    },
    "user": {
      "href": "http://service.cfapps.io/ratings/87863/user",
      "templated": false
    },
    "product": {
      "href": "http://service.cfapps.io/ratings/87863/product",
      "templated": false
    }
  }
},
...

The rating repository shows each relationship that connects a user to a product, and what they rated the product. The ID used for the user and the product relates back to the unique ID used by our other services that manage parts of our domain data, such as Movie Service and User Service.
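
Because the rating relationships only carry IDs, a consumer has to join them with data owned by another service. The sketch below shows that join in plain Java using in-memory stand-ins. The RatedMovies class, the Rating value type, and the sample data are hypothetical; in the real application the ratings and titles would come from calls to the Rating Service and the Movie Service.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: joining rating relationships with movie data by shared ID.
final class RatedMovies {

    // Stand-in for a rating relationship returned by the Rating Service.
    static final class Rating {
        final String userId;
        final String productId;
        final int rating;

        Rating(String userId, String productId, int rating) {
            this.userId = userId;
            this.productId = productId;
            this.rating = rating;
        }
    }

    // Returns "title: rating" for every rating by the user whose product ID is a known movie.
    static List<String> forUser(String userId, List<Rating> ratings, Map<String, String> movieTitlesById) {
        List<String> result = new ArrayList<>();
        for (Rating rating : ratings) {
            if (!rating.userId.equals(userId)) {
                continue;
            }
            String title = movieTitlesById.get(rating.productId);
            if (title != null) {
                result.add(title + ": " + rating.rating);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for data owned by the Movie Service, keyed by the shared ID.
        Map<String, String> titles = new LinkedHashMap<>();
        titles.put("1", "Toy Story (1995)");

        List<Rating> ratings = Arrays.asList(
                new Rating("1", "1", 5),   // made-up sample ratings
                new Rating("2", "1", 3));

        System.out.println(forUser("1", ratings, titles));
    }
}
```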

Custom Graph Queries

Depending on how our connected data is used, we can create repository endpoints that allow us to bind certain REST API endpoints to tailored queries that use Neo4j’s Cypher query language.

One such example is the requirement to find all ratings for a user. The Cypher query to do this could be:

MATCH (n:User)-[r:Rating]->() WHERE n.knownId = {id} RETURN r

Here we are matching the pattern where a user has rated something, starting at the user’s ID and returning a list of the relationship entities containing the attributes of the rating entity.

To bind this query to our Spring Data REST repository we would describe it as follows:

@RepositoryRestResource
public interface RatingRepository extends PagingAndSortingRepository<Rating, Long> {
    @Query(value = "MATCH (n:User)-[r:Rating]->() WHERE n.knownId = {id} RETURN r")
    Iterable<Rating> findByUserId(@Param(value = "id") String id);
}

By registering this custom repository method, Spring Data REST will automatically register it as an embedded link in the rating’s REST repository. Let’s take a look.

curl -X GET "http://service.cfapps.io/rating/ratings/search/findByUserId?id=1"

The custom repository method will be added to the rating service’s search links. We can now call this new method by its name, as shown above.

{
  "_links": {
    "self": {
      "href": "http://service.cfapps.io/rating/ratings/search/findByUserId?id=1",
      "templated": false
    }
  },
  "_embedded": {
    "ratings": [
      {
        "id": 87863,
        "timestamp": 881252305,
        "rating": 1
      },
      ...

Binding REST Clients

The next thing we will want to do is consume the data from our different polyglot persistence data services. Doing this in Java is remarkably simple with the Netflix Feign client, as described in my last blog post. Let’s take a look at what these client contracts might look like in a UI application.

Movie Client

@FeignClient("movie")
public interface MovieClient {

  @RequestMapping(method = RequestMethod.GET, value = "/movies")
  PagedResources<Movie> findAll();

  @RequestMapping(method = RequestMethod.GET,
      value = "/movies/search/findByTitleContainingIgnoreCase?title={title}")
  PagedResources<Movie> findByTitleContainingIgnoreCase(@PathVariable("title") String title);

  @RequestMapping(method = RequestMethod.GET, value = "/movies/{id}")
  Movie findById(@PathVariable("id") String id);

  @RequestMapping(method = RequestMethod.GET,
      value = "/movies/search/findByIdIn?ids={ids}")
  PagedResources<Movie> findByIds(@PathVariable("ids") String ids);
}

The above interface declares that I would like to bind a method signature to the REST API route of the movie service, as configured by the @FeignClient("movie") annotation. This interface will be registered as a bean when the application starts up and can be autowired in other beans in the application.
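
For readers curious how an annotated interface can turn into a working client, the sketch below uses a JDK dynamic proxy to translate method calls into request descriptions. It is a deliberately simplified illustration and not Feign's implementation: the @Get annotation, the MovieRoutes interface, and the positional {0} placeholders are all hypothetical, and the proxy returns a description of the request instead of performing an HTTP call.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

// Hypothetical sketch of a declarative client built on a JDK dynamic proxy.
final class DeclarativeClientSketch {

    // Marks a method with the route template it should call.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Get {
        String value();
    }

    // A client contract; {0} is replaced with the first method argument.
    interface MovieRoutes {
        @Get("/movies/{0}")
        String findById(String id);

        @Get("/movies/search/findByTitleContainingIgnoreCase?title={0}")
        String findByTitle(String title);
    }

    // Creates an implementation of the contract that resolves each call to a request line.
    @SuppressWarnings("unchecked")
    static <T> T bind(Class<T> contract, String serviceId) {
        InvocationHandler handler = (proxy, method, args) -> {
            Get route = method.getAnnotation(Get.class);
            if (route == null) {
                throw new UnsupportedOperationException(method.getName());
            }
            String path = route.value();
            for (int i = 0; args != null && i < args.length; i++) {
                path = path.replace("{" + i + "}", String.valueOf(args[i]));
            }
            // A real client would resolve the service ID through discovery and send the request.
            return "GET http://" + serviceId + path;
        };
        return (T) Proxy.newProxyInstance(contract.getClassLoader(), new Class<?>[] {contract}, handler);
    }

    public static void main(String[] args) {
        MovieRoutes movies = bind(MovieRoutes.class, "movie");
        System.out.println(movies.findById("1"));
        System.out.println(movies.findByTitle("toy"));
    }
}
```

The calling code depends only on the interface, which is what makes such a client feel like a local repository.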

If you’re like me, when you think about how powerful this can be in a large operation with many microservices, it gets you excited about the future of developing cloud-native Java applications using Spring.

Autowire a Feign Client

The snippet below shows how we would autowire our Feign client interface for our movie service.

@SpringUI(path = "/movies")
@Title("Movies")
@Theme("valo")
public class MovieUI extends UI {

    private static final long serialVersionUID = -3540851800967573466L;

    TextField filter = new TextField();
    Grid movieList = new Grid();

    @Autowired
    MovieClient movieClient;

    ...

    private void refreshMovies(String stringFilter) {
        if(!Objects.equals(stringFilter.trim(), "")) {
            movieList.setContainerDataSource(new BeanItemContainer<>(
                    Movie.class, movieClient
                        .findByTitleContainingIgnoreCase(stringFilter) // (1)
                        .getContent()));
        }
    }
1 We can call client APIs just as if they were Autowired repositories hosted within our application.

Docker Demo

The example project uses Docker to build a container image of each of our microservices as a part of the Maven build process. We can easily orchestrate the full microservice cluster on our machine using Docker Compose.

Getting Started

To get started, visit the GitHub repository for this example project.

Clone or fork the project and download the repository to your machine. After downloading, you will need to use both Maven and Docker to compile and build the images locally.

Download Docker

First, download Docker if you haven’t already. Follow the instructions found here to get Docker up and running on your development machine.

You will also need to install Docker Compose; the installation guide can be found here.

Requirements

The requirements for running this demo on your machine are found below.

  • Maven 3

  • Java 8

  • Docker

  • Docker Compose

Building the project

To build the project, from the terminal, run the following command at the root of the project.

$ mvn clean install

The project will then download all of the needed dependencies and compile each of the project artifacts. Each service will be built, and then a Maven Docker plugin will automatically build each of the images into your local Docker registry. Docker must be running and available from the command line where you run the mvn clean install command for the build to succeed.

After the project successfully builds, you’ll see the following output:

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] spring-cloud-microservice-example-parent ........... SUCCESS [  0.478 s]
[INFO] user-microservice .................................. SUCCESS [ 36.055 s]
[INFO] discovery-microservice ............................. SUCCESS [ 15.911 s]
[INFO] api-gateway-microservice ........................... SUCCESS [ 17.904 s]
[INFO] config-microservice ................................ SUCCESS [ 11.513 s]
[INFO] movie-microservice ................................. SUCCESS [ 13.818 s]
[INFO] ui-search .......................................... SUCCESS [ 31.328 s]
[INFO] rating-microservice ................................ SUCCESS [ 22.910 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------

Start the Cluster with Docker Compose

Now that each of the images has been built successfully, we can use Docker Compose to spin up our cluster. I’ve included a pre-configured Docker Compose YAML file with the project.

From the project root, navigate to the spring-cloud-polyglot-persistence-example/docker directory.

Now, to start up the microservice cluster, run the following command:

$ docker-compose up

If everything is configured correctly, each of the container images we built earlier will be launched in its own container on Docker and networked for automatic service discovery. You will see a flurry of log output from each of the services as they begin their startup sequence. This might take a few minutes to complete, depending on the performance of the machine you’re running this demo on.

Once the startup sequence is completed, you can navigate to the Eureka host and see which services have registered with the discovery service.

Copy and paste the following command into the terminal where Docker can be accessed using the $DOCKER_HOST environment variable.

$ open $(echo \"$(echo $DOCKER_HOST)\"|
            \sed 's/tcp:\/\//http:\/\//g'|
            \sed 's/[0-9]\{4,\}/8761/g'|
            \sed 's/\"//g')

If Eureka correctly started up, a browser window will open to the location of the Eureka service’s dashboard.

You’ll need to wait for Eureka to start and see that all the other services are registered before proceeding. If Eureka is not yet available, give Docker Compose a few more minutes to get all the services fully started.
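
The shell pipeline used to open the Eureka dashboard rewrites the value of $DOCKER_HOST into an HTTP URL by swapping the scheme and replacing the port. For readers less familiar with sed, the same transformation can be sketched in Java. The DockerHostUrl class is a hypothetical helper, and the fallback Docker port 2376 in the example is an assumption about a typical local setup.

```java
// Hypothetical helper mirroring the sed pipeline: tcp:// becomes http://,
// and any run of four or more digits (the port) becomes the target port.
final class DockerHostUrl {

    static String toHttpUrl(String dockerHost, int port) {
        return dockerHost
                .replace("tcp://", "http://")
                .replaceAll("[0-9]{4,}", String.valueOf(port));
    }

    public static void main(String[] args) {
        String dockerHost = System.getenv("DOCKER_HOST");
        if (dockerHost == null) {
            dockerHost = "tcp://192.168.59.103:2376"; // assumed example value
        }
        System.out.println(toHttpUrl(dockerHost, 8761));
    }
}
```

The octets of an IPv4 address have at most three digits, so only the port matches the four-digit pattern.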

Sidecar Routes

After all the services have started up and registered with Eureka, we can see each of the service instances that are running and their status. We can then access one of the data-driven services, for example the movie service. The following command will open a browser window and display the routes that have been bootstrapped on the API Gateway using the @EnableZuulProxy and @EnableSidecar annotations.

$ open $(echo \"$(echo $DOCKER_HOST)/routes\"|
            \sed 's/tcp:\/\//http:\/\//g'|
            \sed 's/[0-9]\{4,\}/10000/g'|
            \sed 's/\"//g')

This command will navigate to the API gateway’s /routes endpoint and display each route that has been discovered through the Zuul Sidecar.

{
  "_links" : {
    "self" : {
      "href" : "http://192.168.59.103:10000/routes",
      "templated" : false
    }
  },
  "/gateway/**" : "gateway",
  "/movie/**" : "movie",
  "/rating/**" : "rating",
  "/moviesui/**" : "moviesui",
  "/user/**" : "user",
  "/discovery/**" : "discovery"
}

Movies UI

I’ve created a simple Spring Boot Vaadin application to allow us to consume our data services and perform simple searches.

$ open $(echo \"$(echo $DOCKER_HOST)/movies\"|
            \sed 's/tcp:\/\//http:\/\//g'|
            \sed 's/[0-9]\{4,\}/1111/g'|
            \sed 's/\"//g')

This command will open the search-ui application and allow us to search for movies. Try typing in one of your favorite movies from the early 90s, as I’ve done in the screenshot below.

Vaadin Movies
Figure 7. Search Movies UI

Users UI

Paste the following command into your terminal to open the user application.

$ open $(echo \"$(echo $DOCKER_HOST)/users\"|
            \sed 's/tcp:\/\//http:\/\//g'|
            \sed 's/[0-9]\{4,\}/1111/g'|
            \sed 's/\"//g')

I created this view to display the results from querying the User Service, the Rating Service, and the Movie Service. This example demonstrates how fast Spring Boot can handle queries that span multiple data services. There are two views on this page. The table view on the left displays users that are returned from our User Service. The table on the right displays the movies that a user has previously rated. To activate the view on the right, click on one of the table rows containing a user record, as shown below.

Vaadin Users
Figure 8. Search Users UI