Relational DB in microservices

我的未来我决定 提交于 2021-02-10 15:07:03

问题


I have a monolithic application that currently uses a PostgreSQL DB and the schemas are set up as you would expect for most relational databases with various table data being linked back to the user via FKs on the user_id.

I'm trying to learn more about microservices am trying to migrate my python API to a microservice architecture. I have a reasonable understanding of how I'm going to break up the larger app into smaller parts, however, I'm not entirely clear on how I'm supposed to deal with the data side of things.

I understand that one single large DB is against general design principles of microservices but I'm not clear on what the alternative would be.

My biggest concern is cascading across individual databases that would hold microservice data. In a simple rdb, I can just cascade on delete and the DB will handle the work across the various tables. In the case of microservices, how would that work? Would I need to have a separate service that handles deleting user data across the other service DBs?

I don't really understand how I would migrate a traditional application with a relational DB to a microservice architecture?

EDIT:

To clarify - a specific architectural/design problem I'm facing is as follows:

I have split up my application into a few microservices. The ones that are in my mind still relational are:

Geolocation - A service that checks geometry data, records in PostGIS, and returns certain information. A primary purpose is to record the location of a particular user for referencing later

Image - A simple upload service to upload images and store meta data in the db.

Load-Image - A simple service that returns a random set of images based on parameters such as location, and user profile data such as Age, Gender, etc

Profile - A service that simply manages user data such as Age, Gender, etc

Normally, these three items would have a table each in a larger db rather than their own individual dbs. Filtering images by say location and age is a very simple JOIN and filter.

How would something like that work in a microservice architecture? If the data is held in different dbs entirely how would I setup the logic to filter the data? I could duplicate data that doesn't change often like profile info and add it to a MongoDB document that would contain image data including user_id and profile data - however, location data can change regularly and constant updates doesn't sound practical.

What would be the best approach? Or should I stick with a shared RDBMS for just those few services?


回答1:


It comes down to the duplication of data, why we want it, and how we manage it.

Early in our careers we were taught about the the duplicating data in the context of duplication for the purpose of redundancy, for example, in database replication, or backups. We were also taught that data can be modelled in a relational manner, with constraints enforcing the integrity of the model. In fact, the integrity of the model is sacrosanct. Without integrity, how can you have consistency? The answer is that you can't. Kinda.

When you work with distributed systems and service orientation, you do so because you want to minimise interactions thereby reducing coupling between components. However, there is a cost to this. The more distributed your architecture, the less coupling it has, and the more duplication of data will be necessary. This is taken to an extreme with microservices, where effectively the same data may be present in many different places, in varying degrees of consistency.

Instead of being bad, however, in this context data duplication is an essential feature of your system. It is an enabler of an architectural style with many great benefits. Put another way, without duplication of data, you get less distribution, you get more coupling, which makes your system more expensive to build, own, and change.

So, now we understand duplication of data and why we want it, let's move onto how we manage having lots of duplication. Let's try an example:

In a relational database, let's say we have a table called Customers, which contains a customer ID, and customer details, and another table called Orders which contains the order ID, customer ID, and the order details. Let's say we also have an ordering application, which needs to delete all the customer's orders if the customer is deleted for GDPR.

Because we are migrating our system to microservices, we decide to create a service called Customers.

So we create a service with the following operation:

  • DELETE /customers/{customerId} - deletes a customer

We create another service called Orders with the following operations:

  • GET /orders/customers/{customerId} - gets all the orders for a customer
  • DELETE /orders/{orderId} - deletes an order

We build a UX screen for deleting a customer. The UX first calls the orders service to get all the orders for the customer. Then it iterates over the list of orders, calling the orders service to delete the order. Then it calls the customers service to delete the user.

This example is very simplistic, but as you can see, there is no option but to orchestrate the "Delete Customer" operation from the caller, which in this case is the user interface. Of course, what would be a single atomic transaction in a database does not translate to multiple HTTP/s calls, so it is possible that some of the calls may not succeed, leaving the system as a whole in an inconsistent state. In this instance the inconsistency would need to be resolved via some recovery mechanism.




回答2:


In a microservice architecture, we have both the option, either use database per service or a shared database. There are advantages and disadvantages to both the pattern. Database per service architecture is the best practice but when the monolithic application has lots of function, procedure or database-specific feature on database level then we can use the Shared database approach, I know this is not the best practice if you have time and bandwidth then you should go for database per service. As your concern is cascading over individual databases, you need to remove cascading from the database and implement global transaction handling in your application and execute all cascading related queries from that transaction.



来源:https://stackoverflow.com/questions/61175534/relational-db-in-microservices

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!