Add rowset id generator to FE and BE#1678
Conversation
| } | ||
|
|
||
| OLAPStatus FeBasedRowsetIdGenerator::get_next_id(RowsetId* gen_rowset_id) { | ||
| std::lock_guard<std::mutex> l(_lock); |
There was a problem hiding this comment.
std::mutex is too heavy for get next id. In most cases, it is a only add operation
|
|
||
| import com.google.common.collect.Maps; | ||
|
|
||
| public class RowsetIdGenerator { |
There was a problem hiding this comment.
Can we name it a PersistedIdGernator? I think it can not only be used for RowsetId, but also be used for other id
There was a problem hiding this comment.
Maybe it could not. because rowsetid generator could have many implementations, for example, we could scan all rowset id in storage engine at start time and then generate rowset it one by one. In this case it is not persistent.
There was a problem hiding this comment.
For Id generator logic, we can use a class.
Then you can let RowsetIdGenerator contain it.
gensrc/thrift/FrontendService.thrift
Outdated
| // (start_rowset_id, end_rowset_id) not include start and end | ||
| struct TGetRowsetIdResponse { | ||
| 1: optional i64 start_rowset_id = -1 | ||
| 2: optional i64 end_rowset_id = -1 |
There was a problem hiding this comment.
I prefer [start, end) with start_id and id_count
f94cb37 to
7f14f10
Compare
|
Can anyone explain what's the purpose of this PR and give a brief description of what's changed? |
efbc29d to
0d3988a
Compare
| vector<string> error_msgs; | ||
| TStatusCode::type status_code = TStatusCode::OK; | ||
| return_value.__set_snapshot_version(PREFERRED_SNAPSHOT_VERSION); | ||
| int32_t return_snapshot_version = PREFERRED_SNAPSHOT_VERSION; |
There was a problem hiding this comment.
add some comment here to describe the version meaning.
be/src/olap/storage_engine.h
Outdated
|
|
||
| bool rowset_id_in_use(RowsetId& rowset_id) { return _rowset_id_generator->id_in_use(rowset_id); }; | ||
|
|
||
| void release_rowset_id(RowsetId& rowset_id) { return _rowset_id_generator->release_id(rowset_id); }; |
There was a problem hiding this comment.
add some comments here.
| std::mutex _lock; | ||
| RowsetId _next_id = -1; | ||
| RowsetId _id_batch_end = -1; | ||
| virtual void release_id(RowsetId& rowset_id) = 0; |
There was a problem hiding this comment.
add comment to describe the api
| // generator a id according to data dir | ||
| // rowsetid is not globally unique, it is dir level | ||
| // it saves the batch end id into meta env | ||
| OLAPStatus next_id(RowsetId* rowset_id); |
There was a problem hiding this comment.
| OLAPStatus next_id(RowsetId* rowset_id); | |
| OLAPStatus next_id(RowsetId* rowset_id) override; |
| // it saves the batch end id into meta env | ||
| OLAPStatus next_id(RowsetId* rowset_id); | ||
|
|
||
| bool id_in_use(RowsetId& rowset_id); |
There was a problem hiding this comment.
| bool id_in_use(RowsetId& rowset_id); | |
| bool id_in_use(const RowsetId& rowset_id) override; |
| lo = (id_version << 56) + (low & LOW_56_BITS); | ||
| } | ||
|
|
||
| std::string to_string() const { |
There was a problem hiding this comment.
I think you need to_string and serialize and deserialize api
There was a problem hiding this comment.
to_string is ok, because it is a string value in pb.
be/src/olap/rowset/beta_rowset.cpp
Outdated
| @@ -26,7 +26,7 @@ | |||
| namespace doris { | |||
|
|
|||
| std::string BetaRowset::segment_file_path(const std::string& dir, RowsetId rowset_id, int segment_id) { | |||
There was a problem hiding this comment.
| std::string BetaRowset::segment_file_path(const std::string& dir, RowsetId rowset_id, int segment_id) { | |
| std::string BetaRowset::segment_file_path(const std::string& dir, const RowsetId& rowset_id, const int segment_id) { |
| return _valid_rowset_ids.find(rowset_id) != _valid_rowset_ids.end(); | ||
| } | ||
|
|
||
| void UniqueRowsetIdGenerator::release_id(RowsetId& rowset_id) { |
There was a problem hiding this comment.
| void UniqueRowsetIdGenerator::release_id(RowsetId& rowset_id) { | |
| void UniqueRowsetIdGenerator::release_id(const RowsetId& rowset_id) { |
| } | ||
|
|
||
|
|
||
| TEST_F(UniqueRowsetIdGeneratorTest, GenerateIdTest) { |
There was a problem hiding this comment.
add a case to test BE restart situation.
| // options | ||
| doris::EngineOptions options; | ||
| options.store_paths = paths; | ||
| options.backend_uid = doris::UniqueId(); |
There was a problem hiding this comment.
Is backend_uid act like a random seed that will be generated every time BE started?
If so, I think the name is confused that make people think of it as the ID of a BE?
Maybe random_seed is better?
| static const uint16_t OLAP_STRING_MAX_LENGTH = 65535; | ||
|
|
||
| static const int32_t PREFERRED_SNAPSHOT_VERSION = 2; | ||
| static const int32_t PREFERRED_SNAPSHOT_VERSION = 3; |
There was a problem hiding this comment.
add some comment to explain the version 1 , 2, and 3?
| request.setVersion_hash(versionHash); | ||
| request.setList_files(true); | ||
| request.setPreferred_snapshot_version(2); | ||
| request.setPreferred_snapshot_version(3); |
There was a problem hiding this comment.
Can make this a global variable? Its better not to write magic number
| @@ -0,0 +1,111 @@ | |||
| // Licensed to the Apache Software Foundation (ASF) under one | |||
There was a problem hiding this comment.
you should add this file to run-ut.sh
be/src/olap/storage_engine.h
Outdated
| std::unique_ptr<TabletManager> _tablet_manager; | ||
| std::unique_ptr<TxnManager> _txn_manager; | ||
|
|
||
| RowsetIdGenerator* _rowset_id_generator; |
There was a problem hiding this comment.
should delete it in destructor or use std::unique_ptr
|
|
||
| namespace doris { | ||
|
|
||
| UniqueRowsetIdGenerator::UniqueRowsetIdGenerator(UniqueId backend_uid) : |
There was a problem hiding this comment.
| UniqueRowsetIdGenerator::UniqueRowsetIdGenerator(UniqueId backend_uid) : | |
| UniqueRowsetIdGenerator::UniqueRowsetIdGenerator(const UniqueId& backend_uid) : |
| _backend_uid(backend_uid), _inc_id(1) { | ||
| } | ||
| // generator a id according to data dir | ||
| // rowsetid is not globally unique, it is dir level |
There was a problem hiding this comment.
I think rowset id is global unique
594ee87 to
218e07e
Compare
Fix bug Compile success Build unit test successfully Rowsetid version error Add rowsetid gc logic Fix bug Add unique rowset id unit test
* Fix BE UT failure introduced by apache#1526 in TabletSchemaCache (apache#1661) * Fix be ut * Revert "Fix BE UT failure introduced by apache#1526 in TabletSchemaCache (apache#1647)" This reverts commit bca5d32. * (selectdb-cloud) Modify debug checks for detached schema kv (apache#1678)
No description provided.