[feat](storage) Support Azure Blob Storage #56861
TPC-DS: Total hot run time: 190544 ms
ClickBench: Total hot run time: 30.67 s
[feat](storage)Support Azure Blob Storage (7157ce0 to f5dd462)
ClickBench: Total hot run time: 31.19 s
ClickBench: Total hot run time: 30.06 s
fe/fe-core/src/main/java/org/apache/doris/common/util/LocationPath.java
      s3Props.put("AWS_SECRET_KEY", secretKey);
      s3Props.put("AWS_NEED_OVERRIDE_ENDPOINT", "true");
    - s3Props.put("provider", "azure");
    + s3Props.put("PROVIDER", "AZURE");
      @Override
      public String getStorageName() {
    -     return "Azure";
    +     return "AZURE";
We've updated the logic to keep storage names fully uppercase for consistency, since both HDFS and S3 follow that convention. All callers already perform case-insensitive matching, so the change makes the style uniform without affecting compatibility.
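A minimal sketch of why the rename should be compatible, assuming callers compare the storage name with `equalsIgnoreCase`. The class and method names below are hypothetical and are not taken from the Doris code base.

```java
final class StorageNameMatchSketch {
    // Hypothetical caller-side check; assumes a case-insensitive comparison.
    static boolean isAzure(String storageName) {
        return "azure".equalsIgnoreCase(storageName);
    }

    public static void main(String[] args) {
        // Both the old and the new return value of getStorageName() match.
        System.out.println(isAzure("Azure")); // true
        System.out.println(isAzure("AZURE")); // true
    }
}
```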
    @ConnectorProperty(names = {"azure.account_name", "azure.access_key", "s3.access_key",
            "AWS_ACCESS_KEY", "ACCESS_KEY", "access_key"},
            description = "The access key of S3.")
    protected String accessKey = "";
How about changing this to `accountName`?
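For context, the annotation above binds one field to several property names. A rough sketch of how such an alias list could be resolved, assuming the first alias present in the user's properties wins. That precedence rule and the `resolve` helper are assumptions for illustration, not the actual `@ConnectorProperty` handling.

```java
import java.util.List;
import java.util.Map;

final class AliasLookupSketch {
    // Alias list copied from the annotation above; precedence by list order is an assumption.
    static final List<String> ACCESS_KEY_ALIASES = List.of(
            "azure.account_name", "azure.access_key", "s3.access_key",
            "AWS_ACCESS_KEY", "ACCESS_KEY", "access_key");

    // Hypothetical helper: returns the value of the first alias that is set and non-empty.
    static String resolve(Map<String, String> props, List<String> aliases, String defaultValue) {
        for (String alias : aliases) {
            String value = props.get(alias);
            if (value != null && !value.isEmpty()) {
                return value;
            }
        }
        return defaultValue;
    }
}
```

Under this assumption, a property map that sets both `s3.access_key` and `azure.account_name` resolves to the Azure-native value, which is one argument for naming the field after the account name.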
            "AWS_SECRET_KEY", "secret_key"},
            sensitive = true,
            description = "The secret key of S3.")
    protected String secretKey = "";
      boolean isPrefix = false;
    - while (blobPath.normalize().toString().startsWith(listPrefix)) {
    + while (null != blobPath && blobPath.normalize().toString().startsWith(listPrefix)) {
Why add `null != blobPath`? Is this a bug?
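One way such a guard can matter, sketched under two assumptions that the snippet above does not confirm: that `blobPath` is a `java.nio.file.Path` and that the loop body advances with `getParent()`. `Path.getParent()` returns `null` once the path has no parent, so a loop that keeps matching the prefix would otherwise fail with a `NullPointerException`. The class and method names are hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class PrefixWalkSketch {
    // Counts how many path levels (the blob itself and its ancestors) start with listPrefix.
    static int countLevelsUnderPrefix(String blob, String listPrefix) {
        Path blobPath = Paths.get(blob);
        int levels = 0;
        // Without the null check, an empty prefix (which every string starts with)
        // would reach getParent() == null and then dereference it.
        while (null != blobPath && blobPath.normalize().toString().startsWith(listPrefix)) {
            levels++;
            blobPath = blobPath.getParent(); // null once no parent remains
        }
        return levels;
    }
}
```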
fe/fe-core/src/main/java/org/apache/doris/fs/obj/AzureObjStorage.java
Change the name to objCommit?
     */
    public static String encodeToBase64(int id) {
        ByteBuffer buf = ByteBuffer.allocate(4)
                .order(ByteOrder.BIG_ENDIAN);
Use little-endian (LE) to align with the BE (backend) side.
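To illustrate why the byte order matters, here is a sketch that encodes the same integer with both orders. The rest of `encodeToBase64` is not shown above, so the `putInt` call and the use of `java.util.Base64` are assumptions, and the class name is hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Base64;

final class BlockIdEncodingSketch {
    // Encodes a 4-byte integer id as Base64 using the given byte order.
    static String encodeToBase64(int id, ByteOrder order) {
        ByteBuffer buf = ByteBuffer.allocate(4).order(order);
        buf.putInt(id);
        return Base64.getEncoder().encodeToString(buf.array());
    }

    public static void main(String[] args) {
        // The same id yields different strings, so both sides must agree on the order.
        System.out.println(encodeToBase64(1, ByteOrder.BIG_ENDIAN));    // AAAAAQ==
        System.out.println(encodeToBase64(1, ByteOrder.LITTLE_ENDIAN)); // AQAAAA==
    }
}
```

If the FE and the backend encode ids with different byte orders, the resulting strings will not match for the same id, which is presumably the concern behind the comment.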
PR approved by at least one committer and no changes requested.
ClickBench: Total hot run time: 29.09 s
## What's Changed
1. **Refined Azure Blob Configuration Naming**
   - Adopted Azure-native property names for better consistency with Azure SDK conventions:
     - `account_name` → Azure Storage Account Name
     - `account_key` → Azure Storage Account Key
   - Ensures compatibility, clarity, and alignment with Azure Blob attribute definitions.
2. **Full Feature Support for Azure Blob Storage**
   - Added comprehensive integration for the following modules:
     - **TVF (Table-Valued Function)**
     - **LOAD (Data Loading)**
     - **CATALOG (Metadata Querying)**
   - Azure Blob can now be used as both a data source and destination across all modules.
3. **Protocol Compatibility**
   - Added full support for multiple Azure storage access protocols:
     - `abfs://`
     - `abfss://`
     - `wasb://`
     - `wasbs://`
   - Automatically recognizes protocol prefixes and maps them to the correct Azure storage client implementation.
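
The prefix recognition described in item 3 can be sketched roughly as follows. This is an illustration only: `AzureSchemeSketch` and `isAzureLocation` are hypothetical names, the example URIs use placeholder container and account names, and the actual mapping to a storage client in Doris is not shown.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Set;

final class AzureSchemeSketch {
    // The four schemes listed in the description above.
    static final Set<String> AZURE_SCHEMES = Set.of("abfs", "abfss", "wasb", "wasbs");

    // Returns true when the location's URI scheme is one of the Azure schemes.
    static boolean isAzureLocation(String location) {
        String scheme = URI.create(location).getScheme();
        return scheme != null && AZURE_SCHEMES.contains(scheme.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(isAzureLocation("abfss://container@account.dfs.core.windows.net/data/file.parquet")); // true
        System.out.println(isAzureLocation("s3://bucket/key")); // false
    }
}
```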
## todo
**Unified Connectivity Testing Framework**
- Refactored the connectivity test logic into a unified implementation
shared across all object storage backends (S3, OSS, COS, OBS, BOS, and
Azure).
- Improves code reusability and simplifies the process of adding new
storage providers.
cherry pick apache#56861 (cherry picked from commit 9177047)