OpenMathLib
OpenBLAS
Latest Results
Enable RVV-optimized TRSM kernels for RISCV64_ZVL128B
Use existing RVV-optimized TRSM kernel implementations instead of the generic C versions for the RISCV64_ZVL128B target. The RVV kernels (trsm_kernel_{LN,LT,RN,RT}_rvv_v1.c) are already present in the repository and used by the x280 target, but were not enabled for ZVL128B.
Note: ZVL256B is not included because the RVV TRSM kernel has correctness issues with VLEN=256.
Signed-off-by: Felix-Gong <gongxiaofei24@iscas.ac.cn>
Felix-Gong:feature/rvv-trsm-zvl128b
17 hours ago
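For readers unfamiliar with TRSM, the following is a minimal scalar sketch of one case such a kernel handles: solving L * X = B in place for a lower-triangular L applied from the left. The function name `ref_trsm_lower_left`, the row-major layout, and the in-place convention are assumptions made for illustration. This is not the code in trsm_kernel_{LN,LT,RN,RT}_rvv_v1.c, which is vectorized with RVV and follows OpenBLAS's own packed-panel conventions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference routine (not OpenBLAS code).
 * Solves L * X = B by forward substitution.
 * L: n x n lower-triangular, row-major; entries above the diagonal are ignored.
 * B: n x m, row-major; overwritten with the solution X. */
void ref_trsm_lower_left(size_t n, size_t m, const double *L, double *B)
{
    for (size_t i = 0; i < n; ++i) {
        double diag = L[i * n + i];
        assert(diag != 0.0); /* a singular triangle has no unique solution */

        for (size_t j = 0; j < m; ++j) {
            double sum = B[i * m + j];
            /* Subtract contributions of the rows of X solved so far. */
            for (size_t k = 0; k < i; ++k)
                sum -= L[i * n + k] * B[k * m + j];
            B[i * m + j] = sum / diag;
        }
    }
}
```

Row i of X depends on every earlier row, so the outer loop is inherently sequential. Any vectorization therefore has to come from the inner loops, which is presumably where an RVV kernel gains over generic C.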
Enable RVV-optimized TRSM kernels for RISCV64_ZVL128B
Use existing RVV-optimized TRSM kernel implementations instead of the generic C versions for the RISCV64_ZVL128B target. The RVV kernels (trsm_kernel_{LN,LT,RN,RT}_rvv_v1.c) are already present in the repository and used by the x280 target, but were not enabled for ZVL128B.
Note: ZVL256B is not included because the RVV TRSM kernel has correctness issues with VLEN=256.
Signed-off-by: Felix-Gong <gongxiaofei24@iscas.ac.cn>
Felix-Gong:feature/rvv-trsm-zvl128b
22 hours ago
Enable RVV-optimized TRSM kernels for RISCV64_ZVL128B and ZVL256B
Use existing RVV-optimized TRSM kernel implementations instead of the generic C versions for the RISCV64_ZVL128B and RISCV64_ZVL256B targets. The RVV kernels (trsm_kernel_{LN,LT,RN,RT}_rvv_v1.c) are already present in the repository and used by the x280 target, but were not enabled for these two configurations.
Signed-off-by: Felix-Gong <gongxiaofei24@iscas.ac.cn>
Felix-Gong:feature/rvv-trsm-zvl128b
23 hours ago
Merge pull request #5829 from martin-frbg/issue5825
Fix OpenMP reentrancy issues in LLVM compilations with gmake on ARM64
develop
2 days ago
Merge pull request #5826 from ChipKerchner/fasterRVVGEMV
Faster GEMV for RVV
develop
2 days ago
Comment out the libclang_rt.builtins kludge in preparation for removal
martin-frbg:issue5825
2 days ago
Fix incorrect inline assembly constraints in dcbt prefetch instructions
Corrected the register constraints for the PowerPC dcbt (Data Cache Block Touch) instruction in Power10 kernel implementations. The dcbt instruction has special behavior where if the first operand (RA) is r0, it uses the value 0 instead of the register contents. Therefore, RA must use the "b" constraint (any GPR except r0), while RB can use "r" (any GPR including r0).
Changes:
- Changed first operand constraint from "r" to "b" to exclude r0
- Changed second operand constraint from "b" to "r" for flexibility
This ensures correct prefetch behavior and compliance with PowerPC ISA specifications, preventing potential issues where r0 might be incorrectly used as the base address register.
Signed-off-by: Amrita H S <amritahs@linux.vnet.ibm.com>
amritahs-ibm:fix_dcbt_constraints
4 days ago
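The constraint choice described in that commit can be sketched as follows. This is an illustrative example under stated assumptions, not the code from the Power10 kernels: the helper names `prefetch_block` and `sum_with_prefetch` and the prefetch distance are hypothetical, and the non-PowerPC fallback to `__builtin_prefetch` exists only so the sketch builds on other targets with GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not taken from the OpenBLAS sources. */
static inline void prefetch_block(const void *base, size_t offset)
{
#if defined(__powerpc__) || defined(__powerpc64__)
    /* dcbt RA,RB touches the cache block at (RA|0) + RB.
     * If RA were r0, the hardware would use the value 0 rather than the
     * register contents, so the base operand uses "b" (any GPR except r0).
     * RB has no such restriction and can use "r". */
    __asm__ volatile("dcbt %0, %1" : : "b"(base), "r"(offset) : "memory");
#else
    /* Portable stand-in for other architectures. */
    __builtin_prefetch((const char *)base + offset);
#endif
}

/* Sums n doubles, issuing a prefetch hint a fixed distance ahead. */
double sum_with_prefetch(const double *x, size_t n)
{
    const size_t ahead = 64; /* assumed prefetch distance, in elements */
    double sum = 0.0;

    assert(x != NULL || n == 0);
    for (size_t i = 0; i < n; ++i) {
        /* Hint once per 8 elements, and only for addresses inside the array. */
        if (i % 8 == 0 && i + ahead < n)
            prefetch_block(x, (i + ahead) * sizeof(double));
        sum += x[i];
    }
    return sum;
}
```

With "r" on the base operand, the compiler would be free to allocate r0 for it, and the prefetch would then silently target the wrong address. Because dcbt is only a hint, such a mistake would show up as lost performance rather than as a fault.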
Power10: Replace vector pair loads with __builtin_vsx_lxvp
Replace normal vector pair pointer dereferences with the optimized __builtin_vsx_lxvp builtin across DGEMM, ZGEMM, and DGEMV kernels. Also made some indentation corrections in dgemm_kernel_power10.c. This is done as part of POWER code cleanup and may not have any performance impact.
Signed-off-by: Amrita H S <amritahs@linux.vnet.ibm.com>
amritahs-ibm:use_lxvp_builtins
4 days ago
Latest Branches
CodSpeed Performance Gauge: 0%
Enable RVV-optimized TRSM kernels for RISCV64_ZVL128B
#5830
22 hours ago
4fb51e3
Felix-Gong:feature/rvv-trsm-zvl128b
CodSpeed Performance Gauge: 0%
Fix OpenMP reentrancy issues in LLVM compilations with gmake on ARM64
#5829
2 days ago
1145c75
martin-frbg:issue5825
CodSpeed Performance Gauge: 0%
Draft PR: Fix incorrect inline assembly constraints in dcbt prefetch instructions
#5828
4 days ago
831b822
amritahs-ibm:fix_dcbt_constraints
© 2026 CodSpeed Technology