Distributed Transaction Performance for Scalability & Response
To make a distributed database fast, especially a transactional RDBMS (relational database management system) that supports remote transactions, several points have to be addressed.
- Why a Distributed Database Is Fast
- Weak Points of a Distributed Database
- What Makes Remote SQL Slow
- How to Avoid a Slow Architecture
Generally, a distributed database is good at scalability and throughput. That is because it consists of multiple network nodes whose CPUs work simultaneously.
Query benchmarks run very fast when the tasks are independent of each other. As nodes are added, the number of transactions executed increases linearly.
But in actual use, it does not behave like the benchmark. It has some weak points.
A distributed database is good at throughput but weak at response speed, because every operation that crosses the network is slow.
The following factors make transactions slow.
Network access is very slow. A CPU works on the order of nanoseconds, and most functions execute on the order of microseconds.
But a network access costs about 100 to 300 microseconds. Compared with this, the cost of most local code is very small.
Therefore, reducing network access is the most important goal of a distributed database architecture.
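To make the gap concrete, here is a rough back-of-the-envelope model in Python. The constants are assumptions chosen for illustration: 1 microsecond per local call, and 200 microseconds per network round trip (the middle of the 100 to 300 microsecond range above). The function name is hypothetical.

```python
# Rough latency model; the constants are illustrative assumptions, not measurements.
LOCAL_CALL_US = 1.0            # assumed cost of one local function call (microseconds)
NETWORK_ROUND_TRIP_US = 200.0  # assumed round trip, within the 100-300 microsecond range


def transaction_latency_us(local_calls: int, network_round_trips: int) -> float:
    """Estimate the latency of one transaction in microseconds."""
    return local_calls * LOCAL_CALL_US + network_round_trips * NETWORK_ROUND_TRIP_US


if __name__ == "__main__":
    # Same amount of local work, different numbers of network round trips.
    for trips in (0, 1, 10, 100):
        latency = transaction_latency_us(local_calls=50, network_round_trips=trips)
        print(f"{trips:>3} round trips -> {latency:.0f} microseconds")
```

Under these assumptions, a single round trip already costs more than all 50 local calls combined, which is why the number of round trips dominates response time.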
Locking also makes transactions very slow, and when it is combined with the network, it becomes extremely slow.
In particular, if row locks are implemented as distributed locks over the network, scanning a big table takes practically forever.
Table partitioning is a very powerful method for handling big data. It makes it possible to query records in parallel.
But if the location of records is biased, meaning that a lot of records sit in a few remote partition nodes, queries become very slow.
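The effect of biased record placement can be sketched with a simple model: when every node scans its own partition in parallel, the elapsed time is decided by the node holding the most records. The function name and the scan rate below are hypothetical, chosen only for illustration.

```python
from typing import Sequence


def parallel_scan_seconds(rows_per_node: Sequence[int], rows_per_second: float) -> float:
    """Elapsed time of a parallel scan: the largest partition finishes last."""
    return max(rows_per_node) / rows_per_second


if __name__ == "__main__":
    rate = 1_000_000.0  # assumed scan rate per node (rows per second)
    balanced = [250_000, 250_000, 250_000, 250_000]
    skewed = [700_000, 100_000, 100_000, 100_000]
    print("balanced:", parallel_scan_seconds(balanced, rate), "s")
    print("skewed:  ", parallel_scan_seconds(skewed, rate), "s")
```

Both layouts hold one million rows on four nodes, but in this model the skewed layout takes almost three times as long because three of the nodes sit idle while the fourth finishes.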
A distributed database management system is a complex system built on a network. To make it fast, we have to avoid slow architecture.
Row-level locks are the most frequently executed locks. They are taken while scanning a table. If these locks are distributed, scanning becomes very slow.
Table locks are also taken frequently, but row locks are executed much more frequently.
Whether to use distributed table locks is a tuning point in the trade-off between scalability and response speed.
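A simplified cost model illustrates why the lock granularity matters for a scan. It assumes that each distributed lock request costs one network round trip of 200 microseconds (an assumed value inside the 100 to 300 microsecond range mentioned earlier) and ignores everything else. The function name is hypothetical, and this is a sketch of the reasoning, not of any particular product's lock manager.

```python
ROUND_TRIP_US = 200.0  # assumed cost of one distributed lock request (microseconds)


def scan_lock_overhead_us(rows: int, distributed_row_locks: bool) -> float:
    """Network overhead spent on locks while scanning `rows` rows.

    With distributed row locks, every row costs one round trip.
    Otherwise a single distributed table lock covers the whole scan.
    """
    lock_requests = rows if distributed_row_locks else 1
    return lock_requests * ROUND_TRIP_US


if __name__ == "__main__":
    rows = 1_000_000
    print("row locks:  ", scan_lock_overhead_us(rows, True) / 1_000_000, "s")
    print("table lock: ", scan_lock_overhead_us(rows, False) / 1_000_000, "s")
```

In this model, scanning one million rows spends 200 seconds on distributed row locks but only 200 microseconds on one table lock. The price of the coarser lock is that concurrent writers to the same table have to wait, which is the scalability side of the trade-off.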
If the database system checks for updates to the schema and the network structure on every operation, it needs a network access every time.
Alinous Elastic DB keeps a version of the schema and of the cluster network, so that it accesses the network only when they have changed.
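The general idea of version-based caching can be sketched as follows. This is not Alinous Elastic DB's actual code: the class, the `fetch_schema` callable, and the assumption that the current version number reaches the node for free (for example, attached to responses it receives anyway) are all hypothetical.

```python
class VersionedSchemaCache:
    """Caches schema metadata and refetches it only when the version changes."""

    def __init__(self, fetch_schema):
        # fetch_schema is a hypothetical callable standing in for one network access.
        self._fetch_schema = fetch_schema
        self._version = None
        self._schema = None
        self.network_fetches = 0  # counts how often the network was actually used

    def get(self, current_version: int):
        """Return the schema, fetching it only if `current_version` is new."""
        if current_version != self._version:
            self._schema = self._fetch_schema()
            self._version = current_version
            self.network_fetches += 1
        return self._schema


if __name__ == "__main__":
    cache = VersionedSchemaCache(lambda: {"tables": ["orders", "customers"]})
    for _ in range(1000):
        cache.get(current_version=1)   # served from the local copy after the first call
    cache.get(current_version=2)       # version changed: one more fetch
    print(cache.network_fetches)
```

A thousand lookups at the same version cost one fetch in this sketch; only the version bump triggers a second one.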
Remote table partitioning is an effective way to deal with big data. But if the partition key's value is not set properly, it causes too many network accesses and table scans.
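A minimal sketch of hash-based routing shows why the partition key matters. The function names and the choice of CRC32 as the hash are assumptions made for illustration; a real system may partition by range or use a different hash.

```python
import zlib
from typing import List, Optional


def node_for_key(key: str, node_count: int) -> int:
    """Map a partition key value to a node index with a stable hash (CRC32 here)."""
    return zlib.crc32(key.encode("utf-8")) % node_count


def nodes_to_query(node_count: int, partition_key_value: Optional[str] = None) -> List[int]:
    """Return the nodes a query must contact.

    If the query supplies the partition key, one node is enough.
    If it does not, every node has to be contacted and scanned.
    """
    if partition_key_value is None:
        return list(range(node_count))
    return [node_for_key(partition_key_value, node_count)]


if __name__ == "__main__":
    print(nodes_to_query(8, "customer-42"))  # a single node
    print(nodes_to_query(8))                 # all eight nodes
```

A query that cannot be routed by the partition key pays one network access per node and a scan on each of them, which is the cost the paragraph above warns about.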
Managing the data in the database continuously is necessary to keep the database healthy.