1. 71
    I Hate Programming Wayland Applications graphics rant p4m.dev

    1. 57

      > Don't get me wrong: I'm not expecting that e.g. DPI aware multi-monitor applications using several input devices, mixed refresh rates and hot-plugging of devices "just works"

      In my opinion this is exactly what you should expect. What’s the point of an abstraction if it cannot handle complex cases like that?

      1. 7

        I agree. I wonder if the design of these abstractions works well for creating the Wayland backends for Qt and GTK or SDL (since GUI toolkits and libraries like SDL need to handle those complex cases), at the expense of making simple applications harder to write without a toolkit.

      2. 26

        I share the author's frustration.

        It seems like Wayland was never designed to build a desktop experience but as the most modular composition system, and then the desktop part was added as an afterthought and on an ad-hoc basis. Thus you end up with all these idiosyncrasies. I can think of at least three ways to handle scaling, and you'll probably need to implement them all as fallbacks. Let's not even get started on the fact that different compositors implement different sets of protocols and sometimes behave differently. At this point, I'm not even sure there's a good way to fix this mess.

        To be fair, if you write your application on top of a toolkit (e.g. GTK, Qt) or a library (e.g. SDL, winit), it hides all of that. The downside is that you're then reliant on them to support all the Wayland features you need.
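
        As a rough illustration of the scaling fallbacks, here is a minimal sketch in C. It assumes the three mechanisms are the `wp_fractional_scale_v1` preferred scale (sent as a numerator over 120), the `wl_surface.preferred_buffer_scale` event, and the per-output `wl_output.scale` event. The comment above does not name them, so that mapping is an assumption, and `scale_hints` and `pick_scale_120` are hypothetical names, not part of any Wayland library.

        ```c
        #include <stdint.h>

        /* Hypothetical container for whatever scale hints the compositor sent.
         * A value of 0 means "this mechanism was not available". */
        struct scale_hints {
            uint32_t fractional_120;   /* wp_fractional_scale_v1: scale * 120   */
            int32_t  preferred_buffer; /* wl_surface.preferred_buffer_scale     */
            int32_t  output_scale;     /* wl_output.scale of the entered output */
        };

        /* Return the scale to render at, expressed in 120ths (120 == 1.0x).
         * Tries the most precise mechanism first and falls back step by step. */
        static uint32_t pick_scale_120(const struct scale_hints *h)
        {
            if (h->fractional_120 > 0)
                return h->fractional_120;
            if (h->preferred_buffer > 0)
                return (uint32_t)h->preferred_buffer * 120u;
            if (h->output_scale > 0)
                return (uint32_t)h->output_scale * 120u;
            return 120u; /* nothing advertised: assume 1.0x */
        }
        ```

        A client would fill the struct from its event handlers and call `pick_scale_120` whenever a hint changes. The real work, resizing buffers and setting the viewport or buffer scale, is left out here.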

        1. 6

          It indeed wasn't. Wayland started as a thing for car dashboards. Weston and xdg-shell were more of a demo. GNOME did all the desktop stuff on top of D-Bus. The desktop was always an afterthought.

          1. 4

            Are there any sources for that claim? I'm quite curious but wasn't able to find anything meaningful.

            1. 2

              I'm curious as well, since all the earliest articles and talks by the developers that I've seen over the years were always focused on replacing X11 and shitting on it. I haven't found any evidence to show that their objective was to develop it for car dashboards (or something of that scale) and then later extend it to desktops. Though, admittedly, it was one of the earliest uses.

        2. 26

          So, the author uses a super-high-level, super-primitive SDL1-esque immediate mode drawing library under X11, and then complains that he cannot replicate the simplicity when using the lowest-level API possible under Wayland (roughly equivalent to raw xcb)?

          Color me unimpressed.

          1. 6

            It makes sense though when you compare novelty and modern requirements; one could compare raw X11 and Raylib and the latter is obviously much easier to use, but that's why the article states that X11 is 40 years old at this point and was created for completely different desktop requirements.

            So it actually makes sense to compare a modern framework for displaying graphics on your screen (raylib) with a modern approach to displaying graphics on your screen (Wayland). You cannot write off the sheer unneeded complexity of Wayland just because "it is old" or anything; it doesn't have excuses. It is just made badly.

            1. 12

              > So it actually makes sense to compare

              No, it doesn't, and I'm afraid you have missed the point. It makes no sense to compare programming models at two different extremes of the abstraction scale, and draw conclusions about the complexity based on that comparison.

              One is an ultra-high-level library for immediate mode rendering (which is a model that makes every possible tradeoff towards interface simplicity and optimizes for a very narrow set of applications, such as videogames and other programs that need to continuously render changing frames on a window-as-canvas), and the other is an extremely low-level event-driven protocol for window management, designed to be useful in the widest possible variety of scenarios. It makes no sense to compare the two.

              > [...] sheer unneeded complexity of Wayland [...] It is just made badly

              I'm very sorry, but none of this was demonstrated.

              1. 1

                Okay, touché.

            2. 3

              To be fair, Wayland doesn't provide any higher level protocols unlike X11 (for better or worse) and pushes a lot of the complexity onto the clients.

              The X server can handle decorations, and understands fonts and basic drawing functions, and these features are always available. On the other hand, the Wayland ecosystem is pretty fragmented, with compositors implementing different sets of features (e.g. even basic things such as decorations) or outright using different means altogether (e.g. DBus). Thus, you can't expect a given set of features to work, and you have to implement fallbacks.

              But then you can argue that you can use a higher-level library that can hide all this, but then it also limits what you can do. This is why some popular Wayland programs just deal with it.
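
              To make the fallback point concrete, here is a small sketch in C of the kind of feature detection a client ends up doing. It assumes the client has already collected the interface names announced by `wl_registry` `global` events into an array. `zxdg_decoration_manager_v1` is the interface name of the xdg-decoration protocol; `has_global` and `pick_decorations` are hypothetical helper names used only for illustration.

              ```c
              #include <stddef.h>
              #include <string.h>

              enum decoration_mode { DECOR_SERVER_SIDE, DECOR_CLIENT_SIDE };

              /* Return 1 if `name` is among the interface names the compositor advertised. */
              static int has_global(const char *const *globals, size_t count, const char *name)
              {
                  for (size_t i = 0; i < count; i++)
                      if (strcmp(globals[i], name) == 0)
                          return 1;
                  return 0;
              }

              /* Ask for server-side decorations only when the compositor offers the
               * xdg-decoration protocol; otherwise the client must draw its own. */
              static enum decoration_mode pick_decorations(const char *const *globals, size_t count)
              {
                  if (has_global(globals, count, "zxdg_decoration_manager_v1"))
                      return DECOR_SERVER_SIDE;
                  return DECOR_CLIENT_SIDE;
              }
              ```

              Even when the global is present, the compositor can still answer the request with client-side mode, so the client-side drawing path cannot simply be dropped.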

              1. 2

                > To be fair, Wayland doesn't provide any higher level protocols unlike X11 (for better or worse) and pushes a lot of the complexity onto the clients.

                That would have been a fair argument. But it isn't the one that was made.

                > The X server [...] understands fonts and basic drawing functions and these features are always available

                Yes, but nobody uses them outside of 90's software and personal toy projects. That may sound harsh, but that's how it is.

                Wayland's major point — like you said, for better or worse — is that all of this is pointless cruft, hopelessly outdated with respect to both modern hardware, modern UI expectations, and consequently, the way modern software is operating de facto.

                More crucially, yes, it could have been possible to modernize that cruft and add it back, but 1) no one was going to go over all the toolkits and all the applications and rewrite and rearchitect(!) them back to use the "new" server-side drawing (especially given that it is essentially a dead paradigm and has been for ≈20 years), and 2) it would have become outdated in a few more years anyway.

                Sure, it is a "neat" solution to a niche problem, but as they say, every problem has a solution that's simple, elegant, and dead wrong.


                > But then you can argue that you can use a higher-level library that can hide all this but then it also limits what you can do.

                I'm pretty sure it is possible to write a library implementing a subset of Xlib atop Wayland. Of course, it would be less flexible and give you less raw power than working in terms of Wayland, surfaces, buffers, and presentations — but that's how it is. Classic Xlib is objectively more rigid than Wayland.

                1. 3

                  > That would have been a fair argument. But it isn't the one that was made.

                  Their thesis is that Wayland development is not great; however, I will agree with you that they didn't make the most compelling case with their X11 example.

                  They still did make some good arguments sprinkled around, e.g. with the input handling (which causes apps to behave differently depending on the toolkit or the developer's sauce), the lack of standardisation between compositors, that god awful XDG portal (though they missed the point of having multiple implementations of the same interface).

                  That's why I'm not being dismissive the way you are.

                  On the Xlib part, more precisely the server-side rendering, I'm not attached to it. Having said that, the lack of proper window manager support (i.e. not having it separate from the compositor) and of basic stuff like not having to draw your own decorations is quite sad.

                  1. 3

                    > Their thesis is that Wayland development is not great

                    This is where we start to disagree. Wayland development is fine, for one simple reason: you are not expected to use raw libwayland API any more than you are expected to target raw xcb.

                    Unless you are writing a toolkit, in which case all that flexibility would come in handy, because you wouldn't want to use "high-level" pixel-banging APIs anyway. Even under X11, you'd be using SHM, DRI, DMA-BUFs and Present to have any reasonable efficiency — and now X11 programming looks exactly like Wayland programming would have looked like, just with more idiosyncrasies.

                    > They still did make some good arguments sprinkled around, e.g. with the input handling (which causes apps to behave differently depending on the toolkit or the developer's sauce), [...] that god awful XDG portal

                    What exactly is wrong with input handling, and why is the XDG portal god-awful? The blog post does not provide any details on either claim, except for general ranting to the effect of "I don't like that life is hard and hardware is complicated".

                    (For one, the part of the author's rant on refresh rates and wl_outputs is literally a case of "if you do not understand why this is the way it is, you are part of the problem".)

                    The lack of "standardization" (more correct term would be "uniformness") between compositors is the only thing that I'm willing to concede. But even then, all the "standardization" in X11 was of the "lowest common ground" kind. X11 appears standardized simply because we are in an extremely late phase in the lifecycle of that technology.

                    > That's why I'm not being dismissive the way you are.

                    Let's just say I'm allergic to logically incoherent arguments and low-quality rants.

            3. 22

              I share the author's frustration, because I've been at this exact place.

              I have this tiny C/X11 program, xvisualbell, that simply blinks your screen (all displays) to get your attention. Notice how the whole program fits on a single page, or under 50 lines of code including blanks and comments.

              I attempted to port this to libwayland twice (using hello-wayland as a starting point, and - embarrassingly - even an LLM). Not only did I not get it to work, the code to draw a simple window easily blows up to 300-400 lines (while missing any of the xvisbell "features").

              Since it doesn't hurt to ask: If someone could port this thing for me, I'd love to pay you 100€, or a charity of your choice double that.

              1. 11

                I was just wanting something like this a few months ago. You inspired me to throw Claude at the problem and we got something that seems to work fine. I am no C programmer, but I checked it for all the footguns I know and linted it. YMMV though, don't blame me if your monitors catch on fire or something :).

                It really proves the article true though, even the simplest implementation is way way way more complicated. Not to mention all the generation stuff that would have taken me forever to figure out myself.

                https://github.com/giodamelio/wvisbell

                1. 3

                  wow - thank you! i have just two½ requests:

                  • 1: please add a license, so i'm actually allowed to use (and share) it. i personally prefer GPLv3, but any free software license is fine.
                  • 2: you should probably remove the xvisbell.c file, since it's not needed (and right now, missing its license file)
                  • ½: i don't see the point in the second argument (n times) - just start the program twice

                  then, please get in touch to claim your reward. my email is in the last sentence on https://gir.st, or a PM here is fine too.

                  1. 3

                    Totally fair on all points, just fixed 1 and 2, and probably won't bother to remove ½ unless you really want, though it's a good reminder to think through other options before adding features.

                    1. 7

                      Just want to confirm that @gir has sent payment successfully.

                2. 8

                  the one I used that got me somewhere is : https://gaultier.github.io/blog/wayland_from_scratch.html#

                  it basically just draws text in a way that does not depend on any toolkit: https://github.com/ossia/score/blob/master/src/linuxcheck/wayland.cpp (to warn users of missing dependencies that their toolkits may have before my app runs). Almost 800 lines (although not all are necessary for this use case).

                  X11 version: https://github.com/ossia/score/blob/master/src/linuxcheck/x11.cpp

                  1. 5

                    For this you'd be looking at using the wlr-layer-shell protocol to have a surface overlaid over the whole screen. That's if you're using a wlroots-based compositor or KWin (KDE). Mutter (GNOME) doesn't support it so you'd have to go for an extension (I've never tried though).

                    For examples you can base this on, the simplest project I can think of would be slurp. You can also look at a project of mine that also makes use of it, wl-kbptr, but it's more complex, though most of the Wayland stuff you'd need is in the main file. I'd also advise you to look at the Wayland Book, which explains the different concepts.

                  2. 17

                    The internal programming model is not a big problem as long as it has no issues preventing better higher level primitives. Lots of people complain about Vulkan in a similar fashion: "so much work to do things that were simpler in OpenGL".

                    The thing is - you typically don't need to use the lowest layer available. If it's possible to build nice higher level libraries, use these and be happy. Just like we all don't need to write machine code by hand.

                    Lots of tech is at its best when it has inconvenient but precise and efficient lowest-layer plumbing and then many choices of higher level layers with good UX.

                    1. 8

                      The thing is, using higher level abstractions has a cost.

                      I've made some software which uses OpenGL. I won't pretend OpenGL is a great API, but it's fairly simple to work with, and historically, it has given me access to all features of all graphics hardware; and if some driver or hardware has some quirks, it's easyish to work around those quirks because it's all my code.

                      I could never do the same with Vulkan. So, I'd have to pick a higher level abstraction. Tell me: which higher level abstraction over Vulkan should I use, which provides access to all available features including vendor-specific extensions, and gives me enough freedom to work around hardware/driver quirks without patching the abstraction layer? I'm not sure such a thing exists. The closest thing is probably the Rust WebGPU stuff but that's a gigantic abstraction layer (just cargo add wgpu adds 124 dependencies!) which hides all the details from you (it necessarily has to, since it supports Vulkan, Metal, DX12 and OpenGL back-ends). And it's naturally Rust-only, so it's not very applicable when I'm writing C++.

                      1. 5

                        Oftentimes higher level abstractions don't have a cost, e.g. because the compiler can compile them out. Or the cost is negligible.

                        https://www.youtube.com/watch?v=7bSzp-QildA is a good explanation of why Vulkan is a more powerful and clean API.

                        The biggest problem with non-OpenGL is the fragmentation, as all platforms (kind of) could support OpenGL back in the day, but now there's still OpenGL, and also Vulkan, Metal, WebGL, and DX. And most people want to write code once and run it everywhere, so you need not just a wrapper, but an abstraction layer over all of the backends. So no wonder wgpu brings in a bunch of dependencies. Also - counting dependencies is a bad way to judge how "heavy" an abstraction is. Most of these dependencies will not be compiled and/or executed. The actual runtime overhead might be near-zero.

                        Anyway in principle it is possible to write a simple wrapper that simplifies the API. E.g. https://github.com/charles-lunarg/vk-bootstrap . But I have never used it - I'm not even a C++ dev (for a long while).

                        1. 1

                          > Oftentimes higher level abstractions don't have a cost, e.g. because the compiler can compile them out. Or the cost is negligible.

                          That's why wgpu adds several seconds of compilation time to your project. Arguably a worse outcome than slightly slower runtime performance.

                          1. 6

                            Strong disagree. Apps are used orders of magnitude more than they are compiled; runtime performance should always be the priority if you care about your users.

                            1. 1

                              That doesn't mean I should make my own life miserable by having to wait several seconds between iterations. I'm pretty sure that will lead to me building a worse, less performant product overall.

                              1. 4

                                I mean, yes, it does. The user's experience is more important than the developer's. Developers thinking their experience matters more (not saying it does not matter at all) is why we have so many absolutely terrible products despite dev UX constantly improving.

                            2. 1

                              As I said, lots of the extra compilation is due to multiple backends.

                            3. 1

                              > Oftentimes higher level abstractions don't have a cost, e.g. because the compiler can compile them out. Or the cost is negligible.

                              I wasn't talking about a runtime performance cost. I was talking about the things I laid out.

                              > Anyway in principle it is possible to write a simple wrapper that simplifies the API.

                              Of course it's possible in principle.

                        2. 12

                          Entertaining and depressing in equal measure

                          1. 7

                            This is your regularly scheduled reminder: one thing being bad doesn't imply something else is good.

                            It's entirely possible that both Wayland and X11 are awful.

                            1. 2

                              indeed, and in my case I am considering "just writing something myself" - this solution will also be awful, but it is a road I am willing to explore. Can "just" replace Qt's mechanism from eglfs-kms to only create the dmabuf (and not do the modeset), then pass the descriptor for this buf to my yet-to-be-made compositor which will render this in a texture. Only difficulty I see (famous last words) is handling geometry/window resizes, and the fact that only Qt programs will work with this system, but that is fine.
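
                              For the "pass the descriptor" step, the usual Unix mechanism is `SCM_RIGHTS` ancillary data over a Unix domain socket, which is also how Wayland itself moves file descriptors between client and compositor. Below is a minimal sketch in C, assuming a connected `AF_UNIX` socket already exists. `send_fd` and `recv_fd` are hypothetical helper names, and a real dmabuf would also need its size, stride, format and modifier sent alongside.

                              ```c
                              #include <string.h>
                              #include <sys/socket.h>
                              #include <sys/uio.h>

                              /* Send one file descriptor over a connected AF_UNIX socket.
                               * Returns 0 on success, -1 on error. */
                              static int send_fd(int sock, int fd)
                              {
                                  char byte = 'F'; /* one byte of real data travels with the fd */
                                  struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
                                  union { char buf[CMSG_SPACE(sizeof(int))]; struct cmsghdr align; } ctrl;
                                  struct msghdr msg = { 0 };

                                  memset(&ctrl, 0, sizeof ctrl);
                                  msg.msg_iov = &iov;
                                  msg.msg_iovlen = 1;
                                  msg.msg_control = ctrl.buf;
                                  msg.msg_controllen = sizeof ctrl.buf;

                                  struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
                                  cmsg->cmsg_level = SOL_SOCKET;
                                  cmsg->cmsg_type = SCM_RIGHTS;
                                  cmsg->cmsg_len = CMSG_LEN(sizeof(int));
                                  memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

                                  return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
                              }

                              /* Receive one file descriptor; returns the new fd, or -1 on error. */
                              static int recv_fd(int sock)
                              {
                                  char byte;
                                  struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
                                  union { char buf[CMSG_SPACE(sizeof(int))]; struct cmsghdr align; } ctrl;
                                  struct msghdr msg = { 0 };
                                  int fd = -1;

                                  msg.msg_iov = &iov;
                                  msg.msg_iovlen = 1;
                                  msg.msg_control = ctrl.buf;
                                  msg.msg_controllen = sizeof ctrl.buf;

                                  if (recvmsg(sock, &msg, 0) != 1)
                                      return -1;

                                  struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
                                  if (!cmsg || cmsg->cmsg_level != SOL_SOCKET || cmsg->cmsg_type != SCM_RIGHTS)
                                      return -1;
                                  memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
                                  return fd;
                              }
                              ```

                              The receiving process gets a new descriptor number that refers to the same underlying open file, so the compositor side could import it as a texture without copying the pixels.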

                              1. 4

                                > indeed, and in my case I am considering "just writing something myself" - this solution will also be awful, but it is a road I am willing to explore.

                                I'm not convinced it will be awful.

                                If you start from "Vulkan abstractions exist" (Mesa implements a pure software Vulkan rendering via llvmpipe!), you can elide a LOT of complexity. Vulkan handles computations on bitmaps really well via compute shaders. The "existence proof" would be how much smaller the X11 codebase would be if you do this.

                                However, you need to think very seriously about how to handle input up front. You can't divorce input handling from window handling. Wayland devs understood how icky input handling is and tried very hard to punt it from their codebase, but that decision has caused them no end of grief. Better to architect it in at the start.

                                Thomas Leonard did a good delve in trying to get down to the actual system calls for this kind of thing via OCaml: https://roscidus.com/blog/

                                1. 3

                                  You might take a look at the Arcan desktop. Here is an article about the low-level APIs available: https://arcan-fe.com/2019/03/03/writing-a-low-level-arcan-client/

                                  At the very worst, it might give you ideas for your own!

                                  1. 2

                                    cool, it's why I wrote the comment - for other people to either say I'm silly or perhaps get linked to some nice blogposts like these, thanks!

                              2. 3

                                Should we classify Wayland as an example of the second-system effect in a different form?

                                The creators actually threw away a lot of the ideas from X as not being in scope, and made something simple but largely inadequate. Hence the slow adoption. Then, in having to add extensions to the protocol, they got the second-system effect in variant form?

                                Possibly Wayland should have been the 'throw one away' instance, and the learning from it used to produce a v2 with the missing pieces, rather than chuck everything in the extensions?

                                1. 2

                                  This seems like a failure of the type system or binding structure. Why are all these illegal states representable?