Traversal and Batch Operation Related Questions

1. Response Package Splitting

1.1 What kind of response will involve package splitting?

Package splitting can occur whenever a TcaplusDB response carries multiple records. The requests involved are:

  • BatchGet: batch get
  • BatchUpdate or BatchDelete: batch modify or delete. If the flag for returning old records is set, the old records are returned in batch
  • GetByPartKey (corresponding to IndexGetRequest in the PB API): get all records under the same index key through a local index query
  • UpdateByPartKey or DeleteByPartKey: batch modify or delete all records under the same index key. If the flag for returning old records is set, the old records are returned in batch
  • ListGetAll: get all records under a list key in a list table
  • Traverse: traverse full table
  • Global index query: When the table is configured with a global index, batch records can be returned through SQL query

1.2 What are the conditions for triggering package splitting?

Package splitting is triggered when the response package exceeds 256 KB. A single record is never split, no matter how large it is. For example, if the three records of a response are 10 KB, 251 KB, and 1 MB respectively, they are returned in two packages: (10 KB, 251 KB) and (1 MB).
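
The grouping in this example can be reproduced with a small simulation. The sketch below is not TcaplusDB code: SplitIntoPackages is a hypothetical helper, and it assumes that whole records are appended to the current package and that the package is closed as soon as its size exceeds the limit. That is one reading of the rule that is consistent with the example; the actual server logic may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the TcaplusDB API).
// Groups record sizes (in KB) into response packages, assuming a package is
// closed once its accumulated size exceeds limit_kb. A record is never split,
// so a package can end up larger than the limit.
std::vector<std::vector<std::size_t>> SplitIntoPackages(
    const std::vector<std::size_t>& record_sizes_kb,
    std::size_t limit_kb = 256) {
    assert(limit_kb > 0);
    std::vector<std::vector<std::size_t>> packages;
    std::vector<std::size_t> current;
    std::size_t current_size_kb = 0;
    for (std::size_t size_kb : record_sizes_kb) {
        // Always append the whole record first.
        current.push_back(size_kb);
        current_size_kb += size_kb;
        // Close the package once the threshold has been exceeded.
        if (current_size_kb > limit_kb) {
            packages.push_back(current);
            current.clear();
            current_size_kb = 0;
        }
    }
    // Flush the trailing package that never reached the threshold.
    if (!current.empty()) {
        packages.push_back(current);
    }
    return packages;
}
```

With record sizes of 10, 251, and 1024 KB, this grouping yields (10, 251) and (1024): the first package closes at 261 KB, and the 1 MB record travels alone.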

1.3 How to determine the last package when a response is split?

  • C++ TDR API: call TcaplusServiceResponse::HaveMoreResPkgs() to check whether more response packages follow
  • C++ PB API: the error code API_ERR_NO_MORE_RECORD is returned in the callback once there are no more records
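
On the receiving side, the usual pattern is to keep collecting records until a package reports that nothing follows. The sketch below uses a mock, not the real TcaplusServiceResponse: MockResponse and CollectRecords are hypothetical, and the mock borrows only the method name HaveMoreResPkgs from the text (the real method's signature and return type are not shown in this document).

```cpp
#include <string>
#include <vector>

// Mock stand-in for one response package. This is NOT the real
// TcaplusServiceResponse; only the method name is taken from the text.
struct MockResponse {
    std::vector<std::string> records;  // records carried by this package
    bool more_packages = false;        // true if further packages follow

    bool HaveMoreResPkgs() const { return more_packages; }
};

// Hypothetical helper: appends the records of each package in arrival order
// and stops after the first package that reports no further packages.
std::vector<std::string> CollectRecords(
    const std::vector<MockResponse>& incoming) {
    std::vector<std::string> all;
    for (const MockResponse& rsp : incoming) {
        all.insert(all.end(), rsp.records.begin(), rsp.records.end());
        if (!rsp.HaveMoreResPkgs()) {
            break;  // last package of this request
        }
    }
    return all;
}
```

In a real client, the loop body would sit in the response handling path, and the request would be treated as complete only after the last package has been processed.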

1.4 How to enable package splitting?

For the PB API, package splitting is enabled by default.

For the TDR API, package splitting is controlled through the interface TcaplusServiceRequest::SetMultiResponseFlag(). When flag = 0 (the default), a single request is not allowed to automatically produce multiple response packages. In this case, if a response involves multiple records totaling more than 256 KB, only the first package (a subset of the records) is returned. In other words, if package splitting is not enabled, the records in the returned response may be empty or fewer than the actual number.

Note: TcaplusServiceRequest::SetMultiResponseFlag() takes effect only for some request command words, specifically:

  • Versions up to and including 3.46.0:
      • TCAPLUS_API_LIST_GETALL_REQ
      • TCAPLUS_API_LIST_DELETE_BATCH_REQ
      • TCAPLUS_API_LIST_GET_BATCH_REQ
  • Version 3.55.0 and later:
      • TCAPLUS_API_LIST_GETALL_REQ
      • TCAPLUS_API_LIST_DELETE_BATCH_REQ
      • TCAPLUS_API_LIST_GET_BATCH_REQ
      • TCAPLUS_API_LIST_ADDAFTER_BATCH_REQ
      • TCAPLUS_API_LIST_REPLACE_BATCH_REQ
      • TCAPLUS_API_BATCH_GET_REQ
      • TCAPLUS_API_BATCH_DELETE_REQ
      • TCAPLUS_API_BATCH_REPLACE_REQ
      • TCAPLUS_API_BATCH_UPDATE_REQ
      • TCAPLUS_API_BATCH_INSERT_REQ

1.5 How to customize package splitting?

For XxxByPartKey and ListGetAll requests, set the Limit and Offset attributes in the request to control how many records are returned, and obtain subsequent records by increasing the Offset. Related interfaces:

  • C++ TDR API: TcaplusServiceRequest::SetResultLimit() sets the batch size; TcaplusServiceResponse::GetRecordNextOffset() gets the offset of the next batch
  • C++ PB API: IndexGetRequest sets the batch size; IndexGetResponse.m_nRemainNum gets the number of remaining records

For full table traversal, similarly set the Limit and Range attributes. Related interfaces:

  • C++ TDR API: TcaplusServiceTraverser::SetResNumPerReq() and TcaplusServiceTraverser::SetRange()
  • C++ PB API: no interface currently

However, the rule that a response larger than 256 KB triggers package splitting remains unchanged. For example, if a request is set with Limit = 3 and Offset = 0, and the three records of the response are 10 KB, 251 KB, and 1 MB respectively, they are still returned in two packages: (10 KB, 251 KB) and (1 MB).
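
The Limit/Offset pattern in this subsection amounts to a paging loop. The sketch below runs against an in-memory mock rather than TcaplusDB: Page, FetchPage, and FetchAll are hypothetical names, and the next_offset and remaining fields only loosely mirror what GetRecordNextOffset() and m_nRemainNum are described as returning.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the response to one Limit/Offset request.
struct Page {
    std::vector<std::string> records;  // records returned for this request
    std::size_t next_offset = 0;       // offset to use in the next request
    std::size_t remaining = 0;         // records left after this page
};

// Mock server side: returns at most `limit` records starting at `offset`.
Page FetchPage(const std::vector<std::string>& all,
               std::size_t offset, std::size_t limit) {
    Page page;
    const std::size_t first = std::min(offset, all.size());
    const std::size_t last = std::min(first + limit, all.size());
    page.records.assign(all.begin() + first, all.begin() + last);
    page.next_offset = last;
    page.remaining = all.size() - last;
    return page;
}

// Client side: keeps requesting with an increasing offset until the
// response reports that no records remain.
std::vector<std::string> FetchAll(const std::vector<std::string>& all,
                                  std::size_t limit) {
    assert(limit > 0);  // a zero limit would never make progress
    std::vector<std::string> result;
    std::size_t offset = 0;
    while (true) {
        const Page page = FetchPage(all, offset, limit);
        result.insert(result.end(), page.records.begin(), page.records.end());
        if (page.remaining == 0) {
            break;
        }
        offset = page.next_offset;
    }
    return result;
}
```

Each call to FetchPage stands for one request. Against the real service, a single such request may still arrive as several packages when the selected records exceed 256 KB.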

1.6 Is the order of the returned batch records guaranteed?

No.

1.7 Traversal Principle and FAQ

Tcaplus supports full table traversal for both generic tables and list tables. A traversal task is initiated on every tcapsvr across which the table is distributed by the sharding logic, so the performance of a traversal request as a whole is relatively low. It is recommended to traverse data from the tcapsvr slave, which does not affect the external services provided by the tcapsvr master. The interface is SetOnlyReadFromSlave(bool flag).

If only a small amount of data needs to be fetched each time, prefer a local index query instead of traversal.

Once the traverser is started, the tcapsvr starts an internal iterator that traverses and reads the node data on the tree in time slices: it first finds the key, then the value, and then packs the traversed data into a buffer and returns it to the service API. The traversal logic inside the service API compares the expected seq with the received seq to detect lost or duplicate packets. Once a received package is confirmed to match the expectation, the main thread gets the complete package by calling RecvResponse and then calls FetchRecord to parse out a business record. A traversal can be interrupted or terminated at any time, and it can be resumed under certain conditions.

Overall, traversal uses a ping-pong automatic trigger mechanism inside the service API. After receiving a package that matches the expectation, the API automatically decides whether to initiate the next traversal request (for example, if the backend storage svr explicitly returns a BUSY error, the traversal is interrupted). This continues until the traversal explicitly completes or the app stops it.
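
The seq comparison described above can be sketched as a small classifier. This is an illustration only, not the service API's implementation: SeqTracker and SeqCheck are hypothetical names, and the sketch assumes that seq starts at a known value and increases by one per response package.

```cpp
#include <cstdint>

// Result of comparing a received seq with the expected one.
enum class SeqCheck {
    kExpected,   // package is the next one in order; deliver it
    kDuplicate,  // package was already received; drop it
    kLost        // a gap was detected; at least one package is missing
};

// Hypothetical tracker, assuming seq increases by one per response package.
class SeqTracker {
public:
    explicit SeqTracker(std::uint64_t first_expected = 0)
        : expected_(first_expected) {}

    // Classifies a received seq. The expected seq advances only when the
    // package matches, so duplicates and gaps leave the state unchanged.
    SeqCheck OnPacket(std::uint64_t received_seq) {
        if (received_seq == expected_) {
            ++expected_;
            return SeqCheck::kExpected;
        }
        if (received_seq < expected_) {
            return SeqCheck::kDuplicate;
        }
        return SeqCheck::kLost;
    }

    std::uint64_t expected() const { return expected_; }

private:
    std::uint64_t expected_;
};
```

In this model, only a kExpected result would trigger the next request of the ping-pong cycle. How the real API reacts to a duplicate or a gap is not specified here.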

Practical Tutorial

Each table can only have one traverser running at the same time. A single gamesvr supports up to 8 traversers at the same time. It is recommended to reduce the number of traversers, otherwise the normal read/write access delay will be affected.

Traversal through the Tcaplus client is serial and is only suitable for small amounts of data. For large amounts of data, use the parsed-text export function, which processes cold backup files in parallel.

FAQ

  • Q: Can Tcaplus traversal and other read/write operations be performed at the same time?

  • A: Yes, they can be performed at the same time. When traversal and update operations run together, two cases exist: if a record is updated before the traverser reaches it, the traversed data is the updated data rather than the data as of the start of the traversal (there is no data snapshot); if the traverser has already passed the record when it is updated, the newly modified data is not traversed.

  • Q: How to determine whether the traversal has ended?

  • A: Check the traverser state: the traversal is complete when the GetState interface returns ST_IDLE.

  • Q: Under what circumstances is a traversed cursor invalid, causing the traversal interface to return an error?

  • A: Master/slave switchover and data relocation scenarios cause traversal failure. Normal data reading and writing does not cause errors.

  • Q: Can a traversal be stopped? Can an interrupted traversal be resumed?

  • A: The traversal can be stopped in any state by calling the traverser's Stop interface. To recover an interrupted traversal, call the Resume interface while the traverser is in the NORMAL or RECOVERABLE state. After Resume succeeds, call ContinueTraverse to continue the interrupted traversal.

  • Q: How to save traffic during traversal?

  • A: Call the SetFieldNames interface to specify the value fields to be returned during traversal.

  • Q: Is conditional filtering traversal supported?

  • A: Version 3.55.0 and above supports conditional filtering traversal.

  • Q: Can package splitting still occur with SetResNumPerReq(1)?

  • A: The Tcaplus backend packs responses in units of 256 KB, and a response that exceeds the limit is split. Calling SetResNumPerReq(1) only means that each request packet corresponds to one response packet; one response packet may still contain multiple records. The project team only needs to keep receiving packets and check GetState to judge whether the traversal is over, without caring about the package splitting details. ServiceApi always delivers a complete package to the app.

  • Q: Does Tcaplus support full key traversal?

  • A: There is no support yet. There are plans to support it.

  • Q: How to organize functions to complete traversal?

  • A: See the following traversal examples:

    generic table: C++_tdr1.0_asyncmode_generic_simpletable/SingleOperation/travers/main.cpp

    list table: C++_tdr1.0_asyncmode_list_simpletable/SingleOperation/listtravers/main.cpp

    Conditional filter: C++_tdr1.0_asyncmode_generic_simpletable/SingleOperation/condition_operation/condition_traverse.cpp

  • Q: During traversal, can the newly inserted data be traversed?

  • A: Newly inserted data can be traversed if it is inserted at a location the traverser has not yet reached. Traversal does not generate a snapshot; it proceeds by file offset. If the location of an update, delete, or insert has already been traversed, the change is not read.
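
The effect of traversing by offset without a snapshot can be illustrated with a toy model. The sketch below is not Tcaplus code: Slot and OffsetTraverser are hypothetical names, and the model assumes storage is a fixed array of slots whose index plays the role of the file offset.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One storage location; the slot index stands in for a file offset.
struct Slot {
    bool used = false;
    std::string value;
};

// Hypothetical traverser that reads live storage in offset order.
// It keeps a reference, not a copy, so there is no snapshot. The storage
// vector must outlive the traverser and must not be resized meanwhile.
class OffsetTraverser {
public:
    explicit OffsetTraverser(const std::vector<Slot>& storage)
        : storage_(storage) {}

    // Copies the next used slot at or after the cursor into *out.
    // Returns false once the end of storage is reached.
    bool Next(std::string* out) {
        assert(out != nullptr);
        while (cursor_ < storage_.size()) {
            const Slot& slot = storage_[cursor_++];
            if (slot.used) {
                *out = slot.value;
                return true;
            }
        }
        return false;
    }

    std::size_t cursor() const { return cursor_; }

private:
    const std::vector<Slot>& storage_;
    std::size_t cursor_ = 0;
};
```

In this model, a write to a slot behind the cursor is never observed, while a write to a slot at or ahead of the cursor is returned with its new content, which matches the behavior described in the answers above.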
