Created: 2017-02-07

Hot Replication among Remote Table Partition Nodes

When the partition key value of a table partition on a remote storage node is updated, the node starts transferring data to the next remote node. The database does not stop during the transfer and continues to process SQL statements.

This article describes how the database replicates data and deletes the old copies without stopping.

Architecture of Storage Engine

This database is designed for dynamic table partitioning.

Its Storage Records Data Format and Distributed Storage Engine Architecture are suited to online data management.

Replication & Synchronizing Process on Table Partitioning

When table partitions are rebuilt, the database transfers data to match the next partitioning structure. This does not require stopping or locking the database.

The actual replication and synchronization process runs in the following order.

Copy Data to the Next Partition Node

After calculating the range of data to copy and determining which remote nodes are the source and the destination, the database starts copying the data.

It then performs the following operations.

  1. Lock the table in Update Mode
  2. Get the current commit ID as the data version
  3. Start logging updates, then release the lock
  4. Read a certain amount of data under a shared lock
  5. Repeat until no data remains to be copied
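
The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the database's actual code: SourceTable, copy_range and batch_size are hypothetical names, a single in-process lock stands in for both the Update Mode lock and the shared lock, and partition keys are assumed to be integers in a half-open range.

```python
import threading

class SourceTable:
    """Hypothetical in-memory stand-in for one node's table storage."""
    def __init__(self, rows):
        self.rows = dict(rows)        # partition key -> value
        self.commit_id = 0
        self.lock = threading.Lock()  # stands in for the table lock
        self.update_log = None        # None means logging is off

    def write(self, key, value):
        with self.lock:
            self.commit_id += 1
            self.rows[key] = value
            if self.update_log is not None:
                self.update_log.append((self.commit_id, key, value))

def copy_range(source, dest, lo, hi, batch_size=2):
    """Copy rows with lo <= key < hi into dest, batch by batch."""
    # Steps 1-3: lock, record the commit id, start logging, release.
    with source.lock:
        version = source.commit_id
        source.update_log = []
    # Steps 4-5: read one batch at a time until nothing remains.
    last_key = None
    while True:
        with source.lock:  # stand-in for the shared lock
            keys = sorted(k for k in source.rows
                          if lo <= k < hi
                          and (last_key is None or k > last_key))
            batch = [(k, source.rows[k]) for k in keys[:batch_size]]
        if not batch:
            break
        dest.update(batch)  # "transfer" to the destination node
        last_key = batch[-1][0]
    return version

src = SourceTable({1: "a", 5: "b", 7: "c", 12: "d"})
dest = {}
version = copy_range(src, dest, 5, 10)
```

Because the lock is held only while a batch is read, writes such as src.write(6, "x") can run between batches; they end up in update_log rather than blocking the copy.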

Start Logging Update & Delete Operations

When the data copy starts, the database also starts logging update and delete operations. Logging continues for as long as the copy runs.

Apply Log to the Storage

When the data copy is done, the log data is transferred from the source node to the destination nodes. The logged updates are then applied to the destination remote nodes.

Lock Table in Update Mode

When the log has been applied, the database locks the table in Update Mode and checks that no additional log entries remain. If any remain, it releases the lock and applies the log again.

Update Partition Keys & Release Lock

If no additional update log remains, the database updates the partition key values on the nodes and releases the lock. At this point, the replication process is finished.
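
The apply, lock, check and switch sequence can be sketched as a small loop. This is a minimal illustration, not the engine's implementation: finish_replication, the (op, key, value) log entry shape and the switch_partition_keys callback are all hypothetical, and it assumes writers append to the log only while holding the same table lock.

```python
import threading

def finish_replication(source_log, dest_rows, table_lock, switch_partition_keys):
    """Apply logged updates until none remain, then switch keys under the lock."""
    applied = 0
    while True:
        # Apply pending log entries without holding the table lock.
        while applied < len(source_log):
            op, key, value = source_log[applied]
            if op == "delete":
                dest_rows.pop(key, None)
            else:
                dest_rows[key] = value
            applied += 1
        # Lock the table and check that no new entries arrived meanwhile.
        with table_lock:
            if applied == len(source_log):
                switch_partition_keys()  # update partition key values
                return applied           # lock is released on exit
        # New entries arrived: the lock is released, so apply them and retry.

log = [("update", 1, "a"), ("update", 2, "b"), ("delete", 1, None)]
dest = {}
switched = []
count = finish_replication(log, dest, threading.Lock(),
                           lambda: switched.append(True))
```

The table is locked only for the final check and the key switch, so the time during which writers are blocked does not depend on how much log had to be replayed.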

Vacuum Source Records

After the data is copied, the source nodes still hold the records from before the transfer. A vacuum operation executed later erases them.
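
A vacuum pass of this kind might look like the sketch below. The article does not say how the vacuum identifies stale records, so this assumes they are recognized by falling outside the node's current partition key range; vacuum and key_range are hypothetical names, and the range is assumed to be half-open.

```python
def vacuum(rows, key_range):
    """Delete rows whose partition key is outside [lo, hi); return the count."""
    lo, hi = key_range
    # Collect the keys first so the dict is not modified while iterating.
    stale = [k for k in rows if not (lo <= k < hi)]
    for k in stale:
        del rows[k]
    return len(stale)

# The source node kept keys 5 and 9 after they were copied elsewhere.
source_rows = {1: "a", 5: "b", 9: "c"}
removed = vacuum(source_rows, (0, 5))
```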

SELECT Queries On Copying Data

Non Stop Table Partition Copy

While data is being copied, the table storage contains many duplicated records. Even after the copy finishes, the source storage still holds them.

However, SELECT queries and the table scans performed by UPDATE and DELETE still work correctly.

That is because the table storage on each remote node has its own range of partition keys. If a record's partition key value is outside that range, the storage ignores the record.
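
The effect of the range check can be shown with a small sketch. It is illustrative only: NodeStorage, scan and select_all are hypothetical names, and integer keys with half-open ranges are an assumption made for the example.

```python
from dataclasses import dataclass

@dataclass
class NodeStorage:
    key_range: tuple  # assumed half-open range (lo, hi) of partition keys
    rows: dict        # partition key -> value

    def scan(self):
        """Yield only the rows whose key is inside this node's range."""
        lo, hi = self.key_range
        for key, value in sorted(self.rows.items()):
            if lo <= key < hi:
                yield key, value

def select_all(nodes):
    """Scan every node; duplicates outside a node's range are skipped."""
    return [row for node in nodes for row in node.scan()]

# After the key switch: key 7 exists on both nodes, but the source
# node's range no longer covers it, so it is returned only once.
source = NodeStorage((0, 5), {1: "a", 3: "b", 7: "c"})
dest = NodeStorage((5, 10), {7: "c"})
result = select_all([source, dest])
```

The same filter covers the period before the switch: the destination's range does not yet include the copied keys, so only the source's copy is visible.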
