Block Devices In User Space

January 20, 2026 by Al Williams 15 Comments

Your new project really could use a block device for Linux. File systems are easy to do with FUSE, but that’s sometimes too high-level. But a block driver can be tough to write and debug, especially since bugs in the kernel’s space can be catastrophic. [Jiri Pospisil] suggests Ublk, a framework for writing block devices in user space. This works using the io_uring facility in recent kernels.

This opens the block device field up. You can use any language you want (we’ve seen FUSE used with some very strange languages). You can use libraries that would not work in the kernel. Debugging is simple, and crashing is a minor inconvenience.

Another advantage? Your driver won’t depend on the kernel code. There is a kernel driver, of course, named ublk_drv, but that’s not your code. That’s what your code talks to.

Continue reading “Block Devices In User Space” →

BASIC On A Calculator Again

January 19, 2026 by Al Williams 7 Comments

We are always amused that we can run emulations or virtual copies of yesterday’s computers on our modern computers. In fact, there is so much power at your command now that you can run, say, a DOS emulator on a Windows virtual machine under Linux, even though the resulting DOS prompt would probably still perform better than an old 4.77 MHz PC. Remember when you could get calculators that ran BASIC? Well, [Calculator Clique] shows off BASIC running on a decidedly modern HP Prime calculator. The trick? It’s running under Python. Check it out in the video below.

Think about it. The HP Prime has an ARM processor inside. In addition to its normal programming system, it has Micropython as an option. So that’s one interpreter. Then PyBasic has a nice classic Basic interpreter that runs on Python. We’ve even ported it to one or two of the Hackaday Superconference badges.

Continue reading “BASIC On A Calculator Again” →

Optimizing Software With Zero-Copy And Other Techniques

January 16, 2026 by Maya Posch 21 Comments

An important aspect in software engineering is the ability to distinguish between premature, unnecessary, and necessary optimizations. A strong case can be made that the initial design benefits massively from optimizations that prevent well-known issues later on, while unnecessary optimizations are those simply do not make any significant difference either way. Meanwhile ‘premature’ optimizations are harder to define, with Knuth’s often quoted-out-of-context statement about these being ‘the root of all evil’ causing significant confusion.

We can find Donald Knuth’s full quote deep in the 1974 article Structured Programming with go to Statements, which at the time was a contentious optimization topic. On page 268, along with the cited quote, we see that it’s a reference to making presumed optimizations without understanding their effect, and without a clear picture of which parts of the program really take up most processing time. Definitely sound advice.

And unlike back in the 1970s we have today many easy ways to analyze application performance and to quantize bottlenecks. This makes it rather inexcusable to spend more time today vilifying the goto statement than to optimize one’s code with simple techniques like zero-copy and binary message formats.

Continue reading “Optimizing Software With Zero-Copy And Other Techniques” →

Making Code A Hundred Times Slower With False Sharing

January 14, 2026 by Maya Posch 6 Comments

The cache hierarchy of the 2008 Intel Nehalem x86 architecture. (Source: Intel)

Writing good, performant code depends strongly on an understanding of the underlying hardware. This is especially the case in scenarios like those involving embarrassingly parallel processing, which at first glance ought to be a cakewalk. With multiple threads doing their own thing without having to nag the other threads about anything it seems highly doubtful that even a novice could screw this up. Yet as [Keifer] details in a recent video on so-called false sharing, this is actually very easy, for a variety of reasons.

With a multi-core and/or multi-processor system each core has its own local cache that contains a reflection of the current values in system RAM. If any core modifies its cached data, this automatically invalidates the other cache lines, resulting in a cache miss for those cores and forcing a refresh from system RAM. This is the case even if the accessed data isn’t one that another core was going to use, with an obvious impact on performance. As cache lines are a contiguous block of data with a size and source alignment of 64 bytes on x86, it’s easy enough to get some kind of overlap here.

The worst case scenario as detailed and demonstrated using the Google Benchmark sample projects, involves a shared global data structure, with a recorded hundred times reduction in performance. Also noticeable is the impact on scaling performance, with the cache misses becoming more severe with more threads running.

A less obvious cause of performance loss here is due to memory alignment and how data fits in the cache lines. Making sure that your data is aligned in e.g. data structures can prevent more unwanted cache invalidation events. With most applications being multi-threaded these days, it’s a good thing to not only know how to diagnose false sharing issues, but also how to prevent them.

Continue reading “Making Code A Hundred Times Slower With False Sharing” →

A UI-Focused Display Library For The ESP32

January 9, 2026 by Lewin Day 13 Comments

If you’re building a project on your ESP32, you might want to give it a fancy graphical interface. If so, you might find a display library from [dejwk] to be particularly useful.

Named roo_display for unclear reasons, the library is Arduino-compatible, and suits a wide range of ESP32 boards out in the wild. It’s intended for use with common SPI-attached display controllers, like the ILI9341, SSD1327, ST7789, and more. It’s performance-oriented, without skimping on feature set. It’s got all kinds of fonts in different weights and sizes, and a tool for importing more. It can do all kinds of shapes if you want to manually draw your UI elements, or you can simply have it display JPEGs, PNGs, or raw image data from PROGMEM if you so desire. If you’re hoping to create a touch interface, it can handle that too. There’s even a companion library for doing more complex work under the name roo_windows.

If you’re looking to create a simple and responsive interface, this might be the library for you. Of course, there are others out there too, like the Adafruit GFX library which we’ve featured before. You could even go full VGA if you wanted, and end up with something that looks straight out of Windows 3.1. Meanwhile, if you’re cooking up your own graphics code for the popular microcontroller platform, you should probably let us know on the tipsline!

Thanks to [Daniel] for the tip!

NPAPI And The Hot-Pluggable World Wide Web

January 9, 2026 by Maya Posch 16 Comments

In today’s Chromed-up world it can be hard to remember an era where browsers could be extended with not just extensions, but also with plugins. Although for those of us who use traditional Netscape-based browsers like Pale Moon the use of plugins has never gone away, for the rest of the WWW’s users their choice has been limited to increasingly more restrictive browser extensions, with Google’s Manifest V3 taking the cake.

Although most browsers stopped supporting plugins due to “security concerns”, this did nothing to address the need for executing code in the browser faster than the sedate snail’s pace possible with JavaScript, or the convenience of not having to port native code to JavaScript in the first place. This led to various approaches that ultimately have culminated in the WebAssembly (WASM) standard, which comes with its own set of issues and security criticisms.

Other than Netscape’s Plugin API (NPAPI) being great for making even 1990s browsers ready for 2026, there are also very practical reasons why WASM and JavaScript-based approaches simply cannot do certain basic things.

Continue reading “NPAPI And The Hot-Pluggable World Wide Web” →

Detail of Horus's face, from a statue of Horus and Set placing the crown of Upper Egypt on the head of Ramesses III. Twentieth Dynasty, early 12th century BC.

HORUS Framework: A Rust Robotics Library

January 9, 2026 by John Elliot V 7 Comments

[neos-builder] wrote in to let us know about their innovation: the HORUS Framework — Hybrid Optimized Robotics Unified System — a production-grade robotics framework built in Rust for real-time performance and memory safety.

This is a batteries included system which aims to have everything you might need available out of the box. [neos-builder] said their vision is to create a robotics framework that is “thick” as a whole (we can’t avoid this as the tools, drivers, etc. make it impossible to be slim and fit everyone’s needs), but modular by choice.

[neos-builder] goes on to say that HORUS aims to provide developers an interface where they can focus on writing algorithms and logic, not on setting up their environments and solving configuration issues and resolving DLL hell. With HORUS instead of writing one monolithic program, you build independent nodes, connected by topics, which are run by a scheduler. If you’d like to know more the documentation is extensive.

The list of features is far too long for us to repeat here, but one cool feature in addition to the real-time performance and modular design that jumped out at us was this system’s ability to process six million messages per second, sustained. That’s a lot of messages! Another neat feature is the system’s ability to “freeze” the environment, thereby assuring everyone on the team is using the same version of included components, no more “but it works on my machine!” And we should probably let you know that Python integration is a feature, connected by shared-memory inter-process communication (IPC).

If you’re interested in robotics and/or real-time systems you should definitely be aware of HORUS. Thanks to [neos-builder] for writing in about it. If you’re interested in real-time systems you might like to read Real-Time BART In A Box Smaller Than Your Coffee Mug and Real-Time Beamforming With Software-Defined Radio.