Created time: 2017-02-07

How to Reduce Network Transfer between Storage & SQL Engine

A distributed database handles big data across remote network nodes, so reducing the amount of data sent over the network is a necessary concern.

This is especially essential when the database is used for OLTP.

Distributed Database Engine & Storage Engine

A distributed database has a storage engine architecture. Alinous Elastic DB uses the following one.

[Figure: Reduce network transfer of big data]

The storage engine is accessed by the SQL table access coordinator, which is called the Region Manager in this database. It requests a table scan from the storage engine, which then returns the result.

This exchange causes network transfer. SELECT, UPDATE, and DELETE statements all trigger it.

If the result data is very large, it places a heavy burden on the network. Therefore the result has to be made as small as possible.

How to Reduce Network Transfer

Alinous Elastic DB has a distributed algorithm to reduce the quantity of result data. It works in the following way.

Filtering Result in Storage Engine

Before executing SQL, the Transaction Engine calculates an execution plan. In the SQL optimization phase, it plans how to scan each table.

It then determines the following:

  • Which index key to use (or a full scan)
  • Essential additional conditions to apply to the scanned result

The additional conditions are sent to the storage engine with the scan request, and the storage engine applies them, so the result it returns is smaller.

This method is effective for SELECT statements that have conditions in JOIN and WHERE clauses.
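The idea of filtering in the storage engine can be sketched roughly as follows. This is only an illustration, not the actual Alinous Elastic DB code: the names `ScanRequest`, `StorageEngine`, and `OPS`, and the representation of a condition as a (column, operator, value) triple, are all assumptions made for the example.

```python
import operator
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Row = Dict[str, object]
Condition = Tuple[str, str, object]  # (column, operator, value); hypothetical wire format

# Comparison operators the hypothetical storage engine understands.
OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt,
       "<=": operator.le, ">=": operator.ge}


@dataclass
class ScanRequest:
    """Hypothetical scan request sent by the coordinator to a storage node."""
    table: str
    index_key: Optional[str] = None  # None stands for a full scan; unused in this sketch
    conditions: List[Condition] = field(default_factory=list)


class StorageEngine:
    """Toy storage node that filters rows before returning them."""

    def __init__(self, tables: Dict[str, List[Row]]):
        self.tables = tables

    def scan(self, request: ScanRequest) -> List[Row]:
        def matches(row: Row) -> bool:
            # A row is returned only if it satisfies every pushed-down condition.
            return all(OPS[op](row[col], val) for col, op, val in request.conditions)

        return [row for row in self.tables[request.table] if matches(row)]


storage = StorageEngine({"orders": [
    {"id": 1, "customer_id": 10, "amount": 50},
    {"id": 2, "customer_id": 11, "amount": 500},
    {"id": 3, "customer_id": 10, "amount": 700},
]})

# Without conditions, every row would cross the network.
unfiltered = storage.scan(ScanRequest("orders"))

# With conditions attached to the request, only matching rows are returned.
filtered = storage.scan(ScanRequest(
    "orders", conditions=[("customer_id", "=", 10), ("amount", ">", 100)]))
```

In this toy data set the unfiltered scan returns three rows and the filtered scan returns one, so the saving grows with the selectivity of the conditions.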

Use Table Partition Key

If a table partition key is included in the condition used to filter the scanned result, the Region Manager does not send a scan request to remote storage nodes that cannot hold any matching rows.

That makes network transfer slightly smaller. Filtering the result in the storage engine has a much larger effect.

The reason I introduce it anyway is that it greatly reduces the CPU cost of the storage engine, rather than the network cost.

If the storage node is a replicated cluster, the number of packets transferred over the network decreases further, but only slightly.
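A minimal sketch of skipping storage nodes by partition key might look like this. It is an assumption-laden illustration: the class names, the `select_equals` method, and the modulo-based placement of rows are invented for the example, and the real database may partition its tables differently.

```python
from typing import Dict, List

Row = Dict[str, int]


class StorageNode:
    """Toy storage node that counts how many scan requests it receives."""

    def __init__(self, name: str):
        self.name = name
        self.rows: List[Row] = []
        self.scan_count = 0

    def scan(self, column: str, value: int) -> List[Row]:
        self.scan_count += 1  # each scan costs CPU on the storage node
        return [row for row in self.rows if row[column] == value]


class RegionManager:
    """Toy coordinator; rows are placed by partition key modulo node count (assumed)."""

    def __init__(self, nodes: List[StorageNode], partition_key: str):
        self.nodes = nodes
        self.partition_key = partition_key

    def _node_for(self, key_value: int) -> StorageNode:
        return self.nodes[key_value % len(self.nodes)]

    def insert(self, row: Row) -> None:
        self._node_for(row[self.partition_key]).rows.append(row)

    def select_equals(self, column: str, value: int) -> List[Row]:
        if column == self.partition_key:
            # Only one node can hold matching rows, so the others are skipped.
            targets = [self._node_for(value)]
        else:
            # No partition key in the condition: every node has to be asked.
            targets = self.nodes
        result: List[Row] = []
        for node in targets:
            result.extend(node.scan(column, value))
        return result


nodes = [StorageNode(f"node{i}") for i in range(3)]
manager = RegionManager(nodes, partition_key="customer_id")
for order_id, customer_id in enumerate([10, 11, 12, 10, 13]):
    manager.insert({"id": order_id, "customer_id": customer_id})

pruned = manager.select_equals("customer_id", 10)      # touches a single node
scans_after_pruned = [n.scan_count for n in nodes]
broadcast = manager.select_equals("id", 4)             # touches every node
scans_after_broadcast = [n.scan_count for n in nodes]
```

The scan counters show where the saving comes from: the query on the partition key runs a scan on one node only, while the query on another column makes all three nodes do work even though two of them return nothing.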
