3,576 questions
-3
votes
0
answers
35
views
Dell laptop with Intel Iris Xe cannot set custom resolution 2560x1080 on external ultrawide monitor via HDMI (xrandr Configure crtc failed) [closed]
I have a Dell laptop with Intel Iris Xe graphics (Raptor Lake / RPL-U) running **Ubuntu 24.04** (kernel 6.8.0-100-generic, Mesa 25.2.8).
My external monitor is a **2560×1080** (21:9 ultrawide).
The ...
2
votes
0
answers
67
views
Difference between db string and other data sizes in assembly for strings [duplicate]
Assume this code in x86_64 assembly:
section .data
msg db "Hello, world!"
section .text
global _start
_start:
;; system call 1 is sys_write
mov rax, 1
...
0
votes
0
answers
65
views
Bootloader stopped working after I changed the syntax from gas to nasm
I have this bootloader I made a while ago and I would like it to be in nasm:
.intel_syntax noprefix
.code16
.equ STACK_TOP, 0x7C00
.equ SELF_LOAD, 0x7C00
.equ ELF_HDR_LOAD, 0x7E00
.equ SECT_SIZE, ...
6
votes
1
answer
182
views
What is the performance effect (on x64) of __atomic_fetch_add that ignores its result?
My code is
...
fragment1 // compares several regions in D1$ to D1$/D3$
__atomic_fetch_add(&lock,-1,__ATOMIC_ACQ_REL); // stmt A
fragment2 // moves several regions from D1$/D3$ to D1$
...
1
vote
1
answer
126
views
Does Intel CPU have instruction for paging translation result
I wonder if Intel (and Intel compatible) CPUs have an instruction (for diagnostic/debugging purposes) which, for a given linear address, returns the result of paging translation (i.e. the ...
1
vote
0
answers
86
views
Intuition over TBB parallel scan/parallel prefix requirements
I am reading a paragraph about the tbb::parallel_scan algorithm in the book Intel Threading Building Blocks. I understand what the operation does serially, but I do not understand what are ...
Best practices
1
vote
2
replies
123
views
Loading a byte: Partial register stall for intel cpus (r8 vs r64)
My assembly program reads characters from a text file by loading them one by one into register 'al'. However, I sometimes need to use the full rax, and I think this causes a partial register stall. Now I think ...
0
votes
1
answer
92
views
Cache Allocation Technology in 13th Generation Core i9 13900E Intel CPU [closed]
I am trying to evaluate the impact of Cache Allocation Technology on my CPU. However, when I use either lscpu or cpuid -l 0x10 to check whether my CPU supports it, the output is false.
How is this possible?
How ...
7
votes
1
answer
235
views
Why are all IMUL µOPs dispatched to Port 1 only (on Haswell), even when multiple IMULs are executed in parallel?
I'm experimenting with the IMUL r64, r64 instruction on an Intel Xeon E5-1620 v3 (Haswell architecture, base clock 3.5 GHz, turbo boost up to 3.6 GHz, Hyper Threading is enabled).
My test loop is ...
3
votes
1
answer
123
views
JavaFX app freezes or flickers after Intel Iris Xe driver update [closed]
I have a JavaFX desktop application that started having rendering issues after updating the Intel Iris Xe graphics driver.
On Java 11 + JavaFX (Zulu distribution):
openjdk version "11.0.25" ...
2
votes
0
answers
78
views
What is the relationship between Intel Extension for PyTorch and PyTorch XPU versions?
A while ago, I was training a deep learning model on a computer without an NVIDIA GPU but with an Intel GPU. I only used the CPU for training, which was painfully slow. It suddenly occurred to me: can ...
3
votes
1
answer
161
views
L1-dcache-stores, LLC-stores, cache-references and uncore memory counter don't add up in Linux perf?
I am trying to measure memory bus related performance of a simple test program on an Intel N150 (Twin Lake, which has four Gracemont cores, like Alder Lake E-cores).
PMU counters from perf stat don't ...
0
votes
1
answer
140
views
Why does PERF_COUNT_HW_REF_CPU_CYCLES have much higher variance on Zen5 cpus than PERF_COUNT_HW_CPU_CYCLES?
My understanding is that PERF_COUNT_HW_REF_CPU_CYCLES should map to some counter that counts at a constant rate, as opposed to PERF_COUNT_HW_CPU_CYCLES which is affected by frequency scaling. I'd ...
4
votes
0
answers
72
views
Why are Elevated Permissions Needed for Efficient CPU Utilization in a Multithreaded Application
On an Intel i7-13620H based computer with Windows 11, a 10-thread, compute-intensive application written in C# uses only 4 of 10 CPU cores, seems to use only the “efficiency” (and not “...
-1
votes
1
answer
192
views
Memory violation at execution of multithreaded code compiled with Intel C++ 2025 compiler
I am using the basic threadpool found at
https://codereview.stackexchange.com/questions/288042/c-implementation-of-a-concurrent-queue-and-of-a-thread-pool/288044?noredirect=1#comment593398_288044
to ...