New Feature
AI & Search
- Inverted index supports custom analyzers, including Pinyin tokenizer and Pinyin filter ([feature](inverted index) add custom analyzer support with pinyin tokenzer and pinyin filter #57097)
- Added support for multi-position PhraseQuery in inverted index search functions ([feature](inverted index) add multi position PhraseQuery support to search function #57588)
- Added ANN index only-scan capability ([ann] Ann index only scan #57243)
Function
- Added the sem aggregate function ([Feature](agg) add sem agg function #57545)
- Added the factorial function, a simple SQL function ported from Hive ([Enhancement] support simple sql function for factorial(from Hive) #57144)
- Added support for zero-width assertions in some regular expression functions ([Enhancement](regexp) Support zero-width assertions in some regexp functions #57643)
- Enabled GROUP BY and DISTINCT operations for JSON type ([feature](jsonb) json type support group by and distinct #57679)
- Added the add_time/sub_time time functions ([feature](function) support add/sub_time functions #56200)
- Added the deduplicate_map function ([feat](function) Add function of deduplicate_map #58403)
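
Two of the items above can be illustrated outside the database. The following is a minimal Python sketch, not Doris code: it assumes that `sem` follows the conventional definition of the standard error of the mean (sample standard deviation divided by the square root of the row count), and it uses Python's `re` module to show what zero-width assertions (lookahead and lookbehind) do. The function names are hypothetical, and the exact semantics and supported regexp functions in Doris should be checked against the linked PRs.

```python
import math
import re
import statistics


def standard_error_of_mean(values):
    """Conventional SEM: sample standard deviation / sqrt(n).

    Assumption: this mirrors what a `sem` aggregate usually computes;
    it is not taken from the Doris implementation.
    """
    if len(values) < 2:
        raise ValueError("need at least two values")
    return statistics.stdev(values) / math.sqrt(len(values))


def amounts_in_usd(text):
    # Lookahead (?= USD): match digits only when " USD" follows,
    # without including " USD" in the match itself.
    return re.findall(r"\d+(?= USD)", text)


def hashtags(text):
    # Lookbehind (?<=#): match a word only when "#" precedes it,
    # without including "#" in the match itself.
    return re.findall(r"(?<=#)\w+", text)


if __name__ == "__main__":
    print(standard_error_of_mean([1, 2, 3, 4, 5]))
    print(amounts_in_usd("10 USD, 20 EUR, 30 USD"))
    print(hashtags("#doris and #iceberg"))
```

Zero-width assertions constrain where a match may occur without consuming characters, which is why the currency suffix and the `#` prefix are absent from the results.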
Materialized View (MTMV)
- Materialized views can still participate in transparent query rewrite when data changes occur in their non-partitioned base tables ([feature](mtmv) Materialized view can participate in transparent query rewrite even when data changes occur in its non-partitioned base tables #56745)
- Supported creating MTMV based on views ([feature](mtmv)create mtmv can use view #56423)
- MTMV refresh supports multiple PCT tables (branch-4.0:[feature](mtmv)MTMV refresh support multi pct tables #58140)
- Supported window function rewrite when materialized views contain window functions ([opt](mtmv) Support window function rewrite when materialized view contains window function #55066)
Data Lake
- Implemented the Iceberg rewrite_data_files action to support table optimization and compaction ([feat](iceberg) Implement Iceberg rewrite_data_files action for table optimization and compaction #56413, [feat](iceberg) change OPTIMIZE TABLE to ALTER TABLE EXECUTE syntax #56638)
- Supported VARBINARY type mapping for Hive/Iceberg/Paimon/JDBC tables ([feature](catalog) support varbinary type mapping in hive/iceberg/paimon table #57821, [feature](jdbc) support mapping varbinary type in JBDC catalog #58215)
- Supported Partition Evolution DDL for Iceberg tables ([feature](iceberg) Support Partition Evolution DDL for Iceberg Tables #57972)
Optimizations
- Optimized the performance of the FROM_UNIXTIME function ([Opt](function) Optimize the performance of FROM_UNIXTIME #57423)
- Removed the castTo conversion in PartitionKey comparison to improve partition processing efficiency ([opt](partition) remove castTo in PartitionKey comparison #57518)
- Optimized the performance of Parquet reader when decoding RLE_DICTIONARY encoding ([enhancement](parquet)Optimize the performance of parquet reader when decode RLE_DICTIONARY encoding #57208)
- Reduced the memory footprint of the Column class in Catalog ([opt](catalog) Reduce the memory footprint of Column #57401)
- Accumulated multiple small batches before ANN index training to improve training efficiency ([improve](ann index)Accumulate multiple small batches before training #57623)
Bug Fixes
Query
- Fixed the issue where the utc_time function returned incorrect results when the input was null ([Fix](funtion) Fix utc_time result when input null #57716)
- Fixed the exception thrown when UNION ALL is combined with TVF ((fix)(nerieds) fix union all with tvf throw exception #57889)
- Fixed the issue where the WHERE clause could contain non-key columns when creating a materialized view on a unique key table ([fix](nereids)where clause should only contains key column when create mv on unique table #57915)
- Fixed window functions: enabled constant expression evaluation for the offset parameter of LAG/LEAD ([fix](window) allow constant expression evaluation in LAG/LEAD offset parameter #58200)
- Fixed aggregate functions: abnormal push-down of aggregate operations before projection on nullable columns; count push-down aggregation issue on non-null columns ([Fix](agg) fix push agg op in nullable column before projection #58234)
- Fixed time functions: the second/microsecond functions did not handle time literals; time_to_sec reported errors due to garbage values when processing null values ([fix](function)Let second and microsecond functions deal time literal #56659, [fix](time) fix time_to_sec error caused by gabage value of null value #58410)
- Fixed AI functions: unknown error occurred when _exec_plan_fragment_impl called AI functions (branch-4.0: [Fix](ai) Fix _exec_plan_fragment_impl meet unknown error when call AI_Functions #58521)
- Fixed geo module: memory leak in the geo module ([Fix](geo) fix memory leak in geo #58004)
- Fixed information_schema: timezone format incompatibility when using offset timezone ([fix](information_schema) Fix timezone format incompatibility when using offset timezone #58412)
Materialized View and Schema Change
- Fixed the failure of rewrite when materialized views contain group sets and filters above scan ([fix](mtmv) Fix mv rewrite fail when mv contains group sets and filter above scan #57343)
- Fixed the coredump issue caused by reading non-overlapping segments from a single rowset during heavy schema change ([fix](schema-change) Prevent coredump when reading non-overlapping segments from a single rowset during heavy schema change #57191)
Storage-Compute Separation
- Fixed the issue of broadcast remote read in TopN queries ([fix](cloud) avoid broadcast remote read in topn query #58044)
- Fixed the accumulation of tablet deletion tasks in the cloud environment ([fix](cloud) Fix cloud drop tablet tasks pile up #58131)
- Fixed the problem of long service startup time during the first boot in the cloud environment ([fix](cloud) Fix the issue where it takes a long time to come alive on first boot #58152)
Data Lake
- Fixed Iceberg: enabled dynamic partition pruning only for identity partitions ([fix](iceberg) only enable dynamic partition pruning for identity partitions in Iceberg #58033)
- Fixed the permission authentication issue when loading Iceberg partitions ([fix]handle loading iceberg partitions by auth #57988)
- Fixed the partition path scheme mismatch when inserting into Hive partitioned tables on object storage ([fix](hive) Fix partition path scheme mismatch when inserting into Hive partitioned tables on object storage #57973)
- Fixed the issue where Hive cache was not refreshed when inconsistent ([fix](hive) refresh hive cache when inconsistency #58074)
- Fixed Paimon Catalog: OSS access failure when using DLS endpoint ([fix](paimon-catalog)Fix OSS access when using DLS endpoint #58099)
- Fixed Iceberg: FE did not refresh logs after ALTER TABLE ... EXECUTE; enabled dynamic partition pruning only for identity partitions; added auth regression tests for Iceberg system tables ([fix](iceberg) Add FE refresh logging after ALTER TABLE … EXECUTE #58355, [fix](iceberg) only enable dynamic partition pruning for identity partitions in Iceberg #58033, [fix](regression) Add auth regression tests for Iceberg/Paimon system tables #58298)
- Fixed Hive: StackOverflowError caused by insert overwrite on S3-compatible storage; Hive cache not refreshed when inconsistent; partition path scheme mismatch ([fix](hive) Fix StackOverflowError in insert overwrite on S3-compatible storage #58504, [fix](hive) refresh hive cache when inconsistency #58074, [fix](hive) Fix partition path scheme mismatch when inserting into Hive partitioned tables on object storage #57973)
- Fixed Paimon: OSS access failure under DLS endpoint; supported user-defined S3 config prefixes and unified to HDFS S3A protocol ([fix](paimon-catalog)Fix OSS access when using DLS endpoint #58099, [fix](paimon)Support user-defined S3 config prefixes and unify to HDFS S3A protocol #57116)