HBase Source Code Analysis

Server-Side Read

Get org.apache.hadoop.hbase.protobuf.ProtobufUtil.toGet(ClientProtos.Get proto) throws IOException

Creates a client Get based on a protocol buffer Get. This is the overload used in the snippet below (wire form in, client object out); ProtobufUtil also provides the inverse, which builds a protocol buffer Get from a client Get.

Result r = null;
if (action.hasGet()) {
  // Convert the wire-form (protobuf) Get into a client Get, then read from the region.
  Get get = ProtobufUtil.toGet(action.getGet());
  r = region.get(get);
}
// Convert the client Result back into its protocol buffer form for the RPC response.
ClientProtos.Result pbResult = ProtobufUtil.toResult(r);
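The shape of this read path can be sketched in a self-contained way. All the classes and methods below (WireGet, ClientGet, toGet, regionGet) are hypothetical stand-ins for the HBase types, used only to show the flow: the server receives the wire form, converts it to a client-side object, executes the read, and converts the result back.

```java
import java.util.*;

public class GetRoundTrip {
    // Stand-in for ClientProtos.Get (the wire form).
    record WireGet(String row, List<String> columns) {}

    // Stand-in for the client-side org.apache.hadoop.hbase.client.Get.
    record ClientGet(byte[] row, List<String> columns) {}

    // Analogous to ProtobufUtil.toGet(ClientProtos.Get): wire form -> client object.
    static ClientGet toGet(WireGet proto) {
        return new ClientGet(proto.row().getBytes(), new ArrayList<>(proto.columns()));
    }

    // Analogous to region.get(get): look the requested columns up in a toy row store.
    static Map<String, String> regionGet(Map<String, Map<String, String>> region, ClientGet get) {
        Map<String, String> row = region.getOrDefault(new String(get.row()), Map.of());
        Map<String, String> result = new TreeMap<>();
        for (String col : get.columns()) {
            if (row.containsKey(col)) result.put(col, row.get(col));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> region =
            Map.of("row1", Map.of("cf:a", "1", "cf:b", "2"));
        WireGet proto = new WireGet("row1", List.of("cf:a")); // what arrives over RPC
        ClientGet get = toGet(proto);                          // wire form -> client Get
        System.out.println(regionGet(region, get));            // {cf:a=1}
    }
}
```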

Server-Side Write

Some open questions remain???

  1. STEP 1. Try to acquire as many locks as we can, and ensure we acquire at least one. We should record the timestamp only after we have acquired the rowLock; otherwise, newer puts/deletes are not guaranteed to have a newer timestamp.
  2. STEP 2. Update any LATEST_TIMESTAMP timestamps. Acquire the latest mvcc number.
  3. STEP 3. Write back to the memstore. It is OK to write to the memstore first without updating the HLog, because we do not roll forward the memstore MVCC: the MVCC is moved up only when the complete operation is done, so these changes are not visible to scanners until we update the MVCC, and the MVCC is moved only after the sync is complete.
  4. STEP 4. Build the WAL edit.
  5. STEP 5. Append the final edit to the WAL. Do not sync the wal.
  6. STEP 6. Release row locks, etc.
  7. STEP 7. Sync the wal.
  8. STEP 8. Advance the mvcc. This makes this put visible to scanners and getters.
  9. STEP 9. Run coprocessor post hooks. This should be done after the wal is synced so that the coprocessor contract is adhered to. If the wal sync was unsuccessful, remove the keys from the memstore.
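The ordering that these steps enforce (memstore write before WAL sync, visibility only after sync, rollback on sync failure) can be sketched as a tiny self-contained simulation. The memstore, WAL, and MVCC below are toy stand-ins, not the HBase classes, and row locks and coprocessor hooks are elided:

```java
import java.util.*;
import java.util.concurrent.atomic.AtomicLong;

public class MiniBatchSketch {
    // memstore: row -> {value, mvcc write number}
    static final NavigableMap<String, long[]> memstore = new TreeMap<>();
    static final List<String> wal = new ArrayList<>();    // appended, possibly unsynced edits
    static final AtomicLong mvccWrite = new AtomicLong(); // next write number to hand out
    static final AtomicLong mvccRead  = new AtomicLong(); // readers see writeNumber <= mvccRead

    static void put(String row, long value, boolean syncSucceeds) {
        // STEP 1/2: (row locks and timestamps elided) take the next mvcc write number.
        long writeNum = mvccWrite.incrementAndGet();
        // STEP 3: write to the memstore first; invisible until the mvcc advances.
        memstore.put(row, new long[]{value, writeNum});
        // STEP 4/5: build the WAL edit and append it, without syncing yet.
        wal.add(row + "=" + value);
        // STEP 6: row locks would be released here.
        // STEP 7: sync the WAL; on failure, roll the memstore write back (STEP 9's cleanup).
        if (!syncSucceeds) {
            memstore.remove(row);
            return;
        }
        // STEP 8: advance the mvcc, making the write visible to readers.
        mvccRead.set(writeNum);
    }

    // A reader only sees cells whose write number has been published.
    static Long get(String row) {
        long[] cell = memstore.get(row);
        return (cell != null && cell[1] <= mvccRead.get()) ? cell[0] : null;
    }

    public static void main(String[] args) {
        put("row1", 42, true);
        put("row2", 7, false);           // simulated WAL sync failure
        System.out.println(get("row1")); // 42
        System.out.println(get("row2")); // null (rolled back)
    }
}
```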

1. Do the preparation work and instantiate the variables.

2. Check that the column families referenced by each Put and Delete match the column families defined on the Region.

3. Lock the rows: hash each row key to get a lock key; if that key is not locked yet, take a lock on it. Then count how many actions are ready to be written and record that in numReadyToWrite.
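The hash-keyed row locking described above can be sketched as follows. The map, method, and class names here are hypothetical stand-ins, not HBase's actual lock implementation; the point is that rows are locked by their hash and that acquireRowLocks returns numReadyToWrite, the number of actions whose locks were obtained:

```java
import java.util.*;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

public class RowLockSketch {
    // One lock per row-key hash; rows hashing to the same key share a lock.
    static final ConcurrentHashMap<Integer, ReentrantLock> lockedRows = new ConcurrentHashMap<>();

    // Try to lock every row in the batch; return numReadyToWrite,
    // the count of actions whose row lock this caller now holds.
    static int acquireRowLocks(List<String> rows) {
        int numReadyToWrite = 0;
        Set<Integer> held = new HashSet<>();
        for (String row : rows) {
            int key = row.hashCode(); // the hash is the lock key
            ReentrantLock lock = lockedRows.computeIfAbsent(key, k -> new ReentrantLock());
            if (held.contains(key) || lock.tryLock()) { // already ours, or newly acquired
                held.add(key);
                numReadyToWrite++;
            }
        }
        return numReadyToWrite;
    }

    public static void main(String[] args) {
        // The duplicate "row1" shares the already-held lock, so all three are ready.
        System.out.println(acquireRowLocks(List.of("row1", "row2", "row1"))); // 3
    }
}
```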

4. Update the timestamps: set every KeyValue in these actions to the latest timestamp; earlier actions that have not yet been applied are updated here together as well.

5. Lock the region (the updatesLock): from this point on, conflicting region-level operations such as flushes are held off until the batch completes. The lock wait time is scaled by the value of numReadyToWrite.
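In the HBase versions these notes appear to cover, this region-level lock is a ReentrantReadWriteLock (updatesLock): each mutation batch takes the read side, so concurrent writers do not block each other, while a flush takes the write side and must wait for in-flight writes to drain. A minimal stand-alone demonstration of that locking pattern (not HBase code):

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class UpdatesLockSketch {
    static final ReentrantReadWriteLock updatesLock = new ReentrantReadWriteLock();

    public static void main(String[] args) throws InterruptedException {
        // A mutation batch holds the read side...
        updatesLock.readLock().lock();

        // ...and a second "writer" thread can still take the read side concurrently.
        Thread writer2 = new Thread(() -> {
            boolean ok = updatesLock.readLock().tryLock();
            System.out.println("second writer got read lock: " + ok); // true
            if (ok) updatesLock.readLock().unlock();
        });
        writer2.start();
        writer2.join();

        // A "flush" needs the write side, so it is blocked while any writer holds a read lock.
        System.out.println("flush can lock now: " + updatesLock.writeLock().tryLock()); // false
        updatesLock.readLock().unlock();
        System.out.println("flush can lock now: " + updatesLock.writeLock().tryLock()); // true
    }
}
```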

6. With the locks held, the main work begins, the core of Put, Delete, and the rest: create a batch number for the data about to be written into the memstore.

7. Write the KeyValues into the memstore, then compute addedSize, the new MemStore size after the data has been added.

8. Add the KeyValues to the log edit and mark the status as successful; if the user configured the write to skip the log, the log is not written.

9. Append to the log asynchronously first (without syncing).

10. Release the locks created earlier.

11. Sync the log.

12. Finish the batch operation.

Final: for any batch whose log sync did not succeed, roll back that batch's operations in the MemStore.
