Offline to Realtime: Gesture Graphics

In a previous post I introduced my reasoning about user input. The key takeaway was that it’s not a good idea to deal with raw pointer input directly, and that providing infrastructure to decode common user actions is simple and worthwhile. Today I am writing about the second part of Gesture, the graphics component that lets me draw on screen. Gesture::Graphics is not meant to draw a 3d scene; rather, it is infrastructure for an immediate mode GUI.

[Image: 3d manipulator and viewport toolbars created using Gesture::Graphics]

I am not an expert on graphical user interfaces. In the past I made use of popular traditional GUI frameworks, like most people do. But if there is one thing they all have in common, it is how far they fall short when doing anything just above basic, and how poorly the GUI refreshes. Traditional GUI systems are designed around a concept of minimal screen redraw: a button is some object-oriented window widget to position on some parent window context and draw. The GUI is a deep graph that is processed recursively. The redraw is CPU intensive and sometimes involves sending messages through the OS to different sub-windows… That is why when I resize the window of a graphical application, everything strobes. I am old enough to remember when the year 2000 was the year of the future!! Well, it’s 2021 as I am writing this… GPUs can draw billions of polygons at 60 Hz. No GUI should strobe, ever!

Why should I care? Well, I spent many years trying to make offline renderers deliver near real-time rendering performance. That is really hard work. If I can get a renderer to path-trace at 60 Hz, I am not going to settle for a GUI that strobes when I resize a window or drag a divider bar.

My first goal is not to design a fully featured general purpose GUI system though. If I get to it, I will; otherwise I may adopt dear imgui instead. For now I aim at a minimalist system where I can draw dynamic 2d and 3d elements I can interact with. The initial goal is illustrated here:

Gesture Graphics

The structure of Gesture::Graphics is outlined in the next listing. It is a minimalist container for vertices in space, and a list of commands to draw them. If you are not familiar with the concept of immediate mode GUI, it is not as immediate as one may think. Immediate mode means that the GUI is not persistent: it is fully regenerated and redrawn at each update. The GUI generation can be embedded within other portions of the code that are not meant to represent GUI. Updates can be continuous, as in a videogame, or they can be triggered by specific events when something actually changes, and this is reflected in how the app message loop polls. However, the immediate GUI is not so immediate as to draw right on the spot where its code executes; instead it buffers draw commands that will draw windows and widgets later on, when ready. In this sense “immediate” is “deferred”. What is “immediate” about it is that the system often does not need to store specific state for the UI: state and data model can be derived from the application data itself. In practice some data has to be stored somewhere, some information needs to be retained… I may have activated a manipulator by clicking on it… Some refer to this as “partially retained immediate mode GUI”. This may not become apparent until the next blog post, but that is the design I am exploring here.

struct Gesture
{
    struct Input
    {
        [...]
    };
    Input input;

    // A minimalist graphics system to draw interactive GUI elements on screen
    struct Graphics
    {
        static constexpr int kInvalidVertexIndex = -1;

        // We support multiple command lists. Each list will draw with different
        // settings. The number of independent lists is currently hardcoded but
        // it could be made configurable if we would store an array of function
        // pointers to setup each draw pass.
        enum class CommandSequence : int
        {
            k3dDepthTested = 0, //< for viewport depth compositing 3d GUI elements
            k3dStacked = 1,     //< for 3d GUI elements overlay
            k2dScreen = 2       //< for topmost 2d elements (widgets and buttons)
        };
        static constexpr int kNumCommandsLists = 3;

        struct Command
        {
            Command() = default;
            Command(GLenum command, float thickness = 1)
                :command(command), thickness(thickness)
            {}

            GLenum command;  //< Any of GL_POINTS, GL_LINES, GL_TRIANGLES, etc...
            float thickness; //< Line thickness or point radius.
        };
        struct CommandRange
        {
            Command cmd;
            int begin, end;
        };

        // Gesture draw
        double timeDelta;     //< A frame time for GUI interaction animation
        const CommandRange* currentCommand = nullptr;
        int lineLoopBegin = kInvalidVertexIndex;
        std::vector<VertsCode> verts;
        std::vector<CommandRange> commands[kNumCommandsLists];

        // A texture atlas for GUI elements
        // Todo: switch to bindless textures
        uint32_t glTextureId = 0;

        // Empty the commands/verts buffers, typically done after drawing the GUI.
        void clearCommands()
        {
            lineLoopBegin = kInvalidVertexIndex;
            currentCommand = nullptr;

            verts.clear();
            for (int i = 0; i < kNumCommandsLists; ++i)
                commands[i].clear();
        }

        // Add a draw command. There are multiple command sequences you can add to,
        // to handle 3d objects, 2d overlays, etc...
        // Commands are added *before* defining their geometry
        inline void addCommand(Command c, CommandSequence index = CommandSequence::k3dStacked)
        {
            // Terminate any previous command by recording the index of the last vertex
            // before starting a new command
            for (int i = 0; i < kNumCommandsLists; ++i)
            {
                if (!commands[i].empty())
                {
                    if (commands[i].back().end == kInvalidVertexIndex)
                        commands[i].back().end = verts.size();
                }
            }

            int begin = verts.size();
            int end = kInvalidVertexIndex; //< leave the command buffer open to any new vertex

            commands[static_cast<int>(index)].push_back({ c, begin, end });
            currentCommand = &commands[static_cast<int>(index)].back();
        }

        // Any of GL_POINTS, GL_LINES, GL_TRIANGLES, etc...
        inline void addCommand(GLenum command, CommandSequence index = CommandSequence::k3dStacked)
        {
            addCommand(Command(command), index);
        }

        // Add one vertex to the current command geometry
        inline void addVert(const VertsCode& v)
        {
            assert(currentCommand != nullptr);
            verts.push_back(v);
        }

        // Add two points to form a line, or the first line of a line loop.
        // To extend the line to form a polyline (open or closed) use extLine.
        // It must be used with a GL_LINES command only.
        inline void addLine(const VertsCode& v0, const VertsCode& v1)
        {
            assert(currentCommand != nullptr);
            assert(currentCommand->cmd.command == GL_LINES);

            // Mark the spot for the first line vertex so that we can close a line
            // loop if we want to.
            lineLoopBegin = verts.size();
            verts.push_back(v0);
            verts.push_back(v1);
        }
 
        enum class LoopEntry : int
        {
            kContinue = 0,
            kClose = 1
        };
        // Create a polyline (open or closed) by extending a line created with
        // addLine.
        inline void extLine(const VertsCode& v, LoopEntry loop = LoopEntry::kContinue)
        {
            assert(currentCommand != nullptr);
            assert(currentCommand->cmd.command == GL_LINES);

            // Since we draw lines, each line has two points. Repeat last vertex as the
            // first of this new line. We could make this more efficient by not repeating
            // points and draw with indices. But for the tests I have done so far this
            // seems to be plenty fast, so I'll keep this simple.
            verts.push_back(verts.back());
            verts.push_back(v);

            // Check if we need and can close a line loop.
            if (loop == LoopEntry::kClose && lineLoopBegin != kInvalidVertexIndex)
            {
                verts.push_back(v);
                verts.push_back(verts[lineLoopBegin]);
                lineLoopBegin = kInvalidVertexIndex;
            }
        }

        // Gesture draw, called once per window update (frame) when the GUI draw commands
        // have been described in full.
        void draw(struct SceneView& sceneView, const struct SelectionBuffer& selection);

        // Pick a GUI element using the cursor position in Input.
        // Returns a valid GUI selection code, or SelectionBuffer::k_noSelectionCode
        // otherwise.
        uint32_t pick(struct SelectionBuffer& selection, const Input& input,
                      const Viewport& viewport);
    };
    Graphics graphics;

Elaborating further, what Gesture::Graphics draws is really basic: lines, points, and triangles. That’s it! Anything richer than that can be built on top with helper functions. Well, I should really add the ability to draw text as a base primitive, but not in this first round. Each bit of geometry that is drawn can optionally be associated with a selection code. Some of the UI is passive, such as graphics support elements, while other bits are active and can be interacted with. Gesture::Graphics doesn’t know what type of interaction it is; it only provides infrastructure to track the proximity of the cursor to the active elements. It is then up to some other implementation to harness that.
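To give an idea of what “built on top” means, here is a sketch of how a circle helper could generate the segment endpoints that would then be fed to addLine/extLine. The function name and the segment count are my own choices, not part of Gesture::Graphics; I only generate the raw points here so the sketch is self-contained:

```cpp
#include <array>
#include <cmath>
#include <vector>

using P3 = std::array<float, 3>;

// Generate the per-segment endpoints of a closed circle polyline, two
// points per GL_LINES segment: the same layout addLine/extLine build up.
std::vector<P3> circleSegments(const P3& c, float radius, int segments)
{
    auto point = [&](int i) {
        float a = 6.2831853f * float(i % segments) / float(segments);
        return P3{ c[0] + radius * std::cos(a),
                   c[1] + radius * std::sin(a), c[2] };
    };
    std::vector<P3> pts;
    for (int i = 0; i < segments; ++i)
    {
        pts.push_back(point(i));     // like extLine repeating the previous
        pts.push_back(point(i + 1)); // vertex as the start of each new line
    }
    return pts;
}
```

In the real system such a helper would instead call addCommand(GL_LINES), then one addLine followed by a chain of extLine calls ending with LoopEntry::kClose, each taking a VertsCode with color, opacity, and selection code.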

Commands

In the spirit of minimalism, a Command is, in the most traditional sense, something the graphics API will draw on screen, nothing more. As it stands, a command is one of the GL draw primitives, such as GL_POINTS, GL_LINES, or GL_TRIANGLES. I didn’t spend time abstracting away the OpenGL enumerator, but I should.

        struct Command
        {
            Command() = default;
            Command(GLenum command, float thickness = 1)
                :command(command), thickness(thickness)
            {}

            GLenum command;  //< Any of GL_POINTS, GL_LINES, GL_TRIANGLES, etc...
            float thickness; //< Line thickness or point radius.
        };

A command defines what we are about to draw, and it is followed by a number of statements adding vertices to the command. struct CommandRange is used internally to keep track of which range of consecutive vertices belongs to that command.

Here is a listing with some ways to add commands and draw something.

// Begin a new command, to draw a bunch of lines in 2d
// There is no need to terminate a command, adding a new
// command or calling Gesture::Graphics::draw() will do
// that for us.
gesture.graphics.addCommand(GL_LINES, 
                                    Gesture::Graphics::CommandSequence::k2dScreen);
gesture.graphics.addLine(...);
gesture.graphics.addLine(...);
gesture.graphics.addLine(...);
[...]


// For convenience I make it possible to add a polyline by
// extending the first line:
gesture.graphics.addLine(...); //< draw the first line
gesture.graphics.extLine(...);
gesture.graphics.extLine(...);
gesture.graphics.extLine(...);

// And optionally closing a loop to form a closed shape
gesture.graphics.extLine(..., Gesture::Graphics::LoopEntry::kClose);


// Polygons are drawn as individual triangles, here an example to draw a rectangle
static void drawRect(Gesture& gesture, const Vec3f& p, const Vec3f& vW, const Vec3f& vH,
                     const Vec3f& color, float opacity, const uint32_t code = 0)
{
    gesture.graphics.addVert(VertsCode(p          , color, opacity, code));
    gesture.graphics.addVert(VertsCode(p + vW     , color, opacity, code));
    gesture.graphics.addVert(VertsCode(p + vW + vH, color, opacity, code));
    gesture.graphics.addVert(VertsCode(p          , color, opacity, code));
    gesture.graphics.addVert(VertsCode(p + vW + vH, color, opacity, code));
    gesture.graphics.addVert(VertsCode(p +      vH, color, opacity, code));
}

gesture.graphics.addCommand(GL_TRIANGLES);
drawRect(gesture, ...); 


Since I don’t have any prior experience designing a system like this, I keep it basic and flexible. I may decide to tighten it up in the future if I see good or bad patterns emerge. For example, I may decide to remove (or convert) support for lines and points, so that everything is a list of triangles and the entire GUI geometry can be drawn with a single draw call. For now I only have some really basic methods to add geometry, most of which draw the lines for 3d manipulators.
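To make the everything-is-triangles idea concrete, here is a sketch (my own illustration, not code from Gesture) of expanding a 2d line segment into a quad of two triangles with the command’s thickness:

```cpp
#include <array>
#include <cmath>
#include <vector>

using V2 = std::array<float, 2>;

// Expand one 2d line segment into two triangles (6 vertices) of the given
// thickness, by offsetting both endpoints along the segment normal.
std::vector<V2> segmentToTriangles(const V2& a, const V2& b, float thickness)
{
    float dx = b[0] - a[0], dy = b[1] - a[1];
    float len = std::sqrt(dx * dx + dy * dy);
    float nx = -dy / len * 0.5f * thickness; // unit normal scaled to
    float ny =  dx / len * 0.5f * thickness; // half the thickness
    V2 a0{ a[0] + nx, a[1] + ny }, a1{ a[0] - nx, a[1] - ny };
    V2 b0{ b[0] + nx, b[1] + ny }, b1{ b[0] - nx, b[1] - ny };
    return { a0, b0, b1,   a0, b1, a1 }; // two triangles per segment
}
```

This trades a few more vertices for uniform state: no glLineWidth calls, one primitive type, one draw call.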

[Image: Example of 3d manipulators for object placement.]

Draw Geometry

What I draw is basic geometry with color, opacity, texture coordinates, and an optional selection code. This is the struct for a vertex. There is nothing special about it; it is rather prosaic and unoptimized, actually. But so far I have not seen any reason to make it better. Besides leaving some comments for my future self, I let it be.

// A struct to express a vertex as used to draw GUI geometry.
struct VertsCode
{
    // Some handshakes with the shader code about texture coords
    static constexpr float k_noTexture = -64.f;
    static constexpr float k_marqueePattern = -128.f;

    VertsCode() {}

    // Constructor commonly used for non textured elements
    VertsCode(const Vec3f& v, Vec3f c, float opacity = 1.0f, uint32_t selectionCode = 0)
        : x(v.x), y(v.y), z(v.z),
          u(k_noTexture), v(k_noTexture),
          r(c.x), g(c.y), b(c.z), a(opacity),
          s(selectionCode)
    {}

    // Comprehensive constructor 
    VertsCode(const Vec3f & v, Vec2f uv, Vec3f c, float opacity = 1.0f, uint32_t selectionCode = 0)
        : x(v.x), y(v.y), z(v.z),
          u(uv.x), v(uv.y),
          r(c.x), g(c.y), b(c.z), a(opacity),
          s(selectionCode)
    {}

    float x, y, z;      //< position
    float u, v;         //< texture coordinates
    float r, g, b, a;   //< draw color (multiplies texture)

    // A selection code used for UI interactions through the pointer cursor.
    // The selection code is drawn to a separate frame buffer for picking and it
    // is not affected by opacity.
    uint32_t s;

    // Note, this data layout is not optimal. We should go for a multiple
    // of 16 bytes, which could be achieved by replacing the rgba floating point
    // values with a single 32 bit int, plus 4 bytes of padding.
    // Without any such optimization the GUI seems to draw in a fraction of a
    // millisecond, so we are going to keep it simple for now.
};

A note about performance: current GPUs are optimized to read words of 16 bytes. VertsCode is not 16 bytes, nor is it aligned to 16-byte boundaries. If I started drawing a complex GUI with an extensive amount of geometry, there are a variety of improvements I could attempt:

  1. I may want to shrink VertsCode as suggested in the comments.
  2. I may try to eliminate repeated vertices by using indexed geometry instead.
  3. I may try to coalesce subsequent draw commands that happen to be compatible, or I may switch to drawing only triangles instead.
  4. I may use more advanced draw calls that are able to batch many draw calls into one…

That rabbit hole goes deep, but this time I am not going in.
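For the record, item 1 could look like the following. This is a what-if layout sketch, assuming an 8-bit-per-channel color is acceptable for GUI shading; it is not the layout the post actually uses:

```cpp
#include <cstdint>

// A what-if VertsCode layout padded to a multiple of 16 bytes, with the
// four rgba floats replaced by a single packed 32-bit color.
struct PackedVertsCode
{
    float x, y, z;      //< position
    float u, v;         //< texture coordinates
    uint32_t rgba;      //< 4x8-bit color, to be unpacked in the shader
    uint32_t s;         //< selection code
    uint32_t pad;       //< padding up to 32 bytes, two 16-byte words
};
static_assert(sizeof(PackedVertsCode) == 32, "expected two 16-byte words");

// Pack a 0..1 float color into 8 bits per channel.
inline uint32_t packColor(float r, float g, float b, float a)
{
    auto to8 = [](float c) { return uint32_t(c * 255.0f + 0.5f); };
    return to8(r) | (to8(g) << 8) | (to8(b) << 16) | (to8(a) << 24);
}
```

This shrinks the vertex from 40 to 32 bytes and aligns it to two 16-byte words, at the cost of unpacking the color in the vertex shader.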

The Shader

Of course this stuff needs to be drawn, so I need a minimalist shader to pair with the geometry. I’d like to assume you know how to compile shaders; however, a lot of the example code I have seen is too naïve to be useful. One slap in the face I get coming from offline rendering is seeing how shader code is somehow a “second class citizen” in a program. Consider how HLSL in Direct3D is Microsoft’s own product, and nonetheless Visual Studio offers no syntax highlighting for shader code… Besides, developers need to go out of their way to figure out how to integrate shader code into their binaries, or else stick to ugly strings of text embedded in the code. Doing anything more elegant than what I do here seems to require non-negligible infrastructure in the build system, so for now I am left baffled and keep on using strings of text. Along the same lines: shader compilation is done at runtime by the driver, replacing build-time compiler errors with runtime errors, which I need to query and print to the terminal so as not to be completely in the dark. This listing is primordial, but it helps.

static bool checkShaderError(uint32_t resource, uint32_t code)
{
    int success;
    char infoLog[4096];
    glGetShaderiv(resource, code, &success);
    if (!success)
    {
        glGetShaderInfoLog(resource, 4096, nullptr, infoLog);

        fprintf(stderr, "Error: Shader compilation failed!\n"
                        "Error: %s\n", infoLog);

        return false;
    }
    return true;
}

static bool checkProgramError(uint32_t resource, uint32_t code)
{
    int success;
    char infoLog[4096];
    glGetProgramiv(resource, code, &success);
    if (!success)
    {
        glGetProgramInfoLog(resource, 4096, nullptr, infoLog);

        fprintf(stderr, "Error: Program link failed!\n"
                "Error: %s\n", infoLog);
        return false;
    }
    return true;
}

bool GlProgram::makeProgram(const char* vertex_shader_text, const char* fragment_shader_text)
{
    uint32_t vertex_shader = glCreateShader(GL_VERTEX_SHADER);
    glShaderSource(vertex_shader, 1, &vertex_shader_text, NULL);
    glCompileShader(vertex_shader);
    if (!checkShaderError(vertex_shader, GL_COMPILE_STATUS))
        std::abort();

    uint32_t fragment_shader = glCreateShader(GL_FRAGMENT_SHADER);
    glShaderSource(fragment_shader, 1, &fragment_shader_text, NULL);
    glCompileShader(fragment_shader);
    if (!checkShaderError(fragment_shader, GL_COMPILE_STATUS))
        std::abort();

    program = glCreateProgram();
    glAttachShader(program, vertex_shader);
    glAttachShader(program, fragment_shader);
    glLinkProgram(program);
    if (!checkProgramError(program, GL_LINK_STATUS))
        std::abort();

    // Once linked, the shader objects are no longer needed; delete them to
    // avoid leaking GL resources.
    glDeleteShader(vertex_shader);
    glDeleteShader(fragment_shader);

    return true;
}

The shader code that follows is equally primordial. After all, GUI element shading tends to be trivial. The only notes here are:

  1. The fragment shader runs in two modes: picking=0 draws the visible GUI, with some anti-aliasing, while picking=1 draws selection codes, and in doing so disables any texturing, opacity, blending, and AA. Technically, I guess I could draw everything in a single pass and paint the selection code into an auxiliary buffer. In practice I failed to figure out how to configure the main OpenGL context to do that and gave up out of frustration… so, two passes it is…
  2. You may have noticed in struct VertsCode the definition of a pair of constants used as a handshake between the struct and the shader code. The uv coordinates have some special values that enable special draw modes (no texture, and a marquee checker pattern). Those should be defined in a common header file for better maintainability, but I was lazy and only left breadcrumbs.
  3. The marquee drawing mode is specifically for selection-marquee dotted lines, or other lines that I want to stand out against any background element no matter its color. Typical 3d applications, such as Maya, seem to draw a marquee as a dashed white line, which becomes invisible against bright content. By drawing a line made of alternating white and black dots, the line is pretty much always visible, even against mid-tone shades of gray, without messing with blend modes. It may not be the best, but so far it is the least-bad idea I could come up with.
    const char* vertex_shader_text =
        R"(
        #version 150
        uniform mat4 projection;
        in vec3 vPos;
        in vec2 vUV;
        in vec4 vCol;
        out vec4 Frag_color;
        out vec2 Frag_UV;

        void main()
        {
            Frag_UV = vUV;
            Frag_color = vCol;
            gl_Position = projection * vec4(vPos, 1.0);
        }
        )";

    const char* fragment_shader_text =
        R"(
        #version 150
        in vec4 Frag_color;
        in vec2 Frag_UV;
        uniform int picking;  //< draw for display or for picking? Picking has no texture.
        uniform sampler2D Texture;
        out vec4 outputF;

        void main()
        {
            vec4 result = Frag_color;

            // When drawing selection codes, everything is opaque.
            if (picking == 1)
                result.w = 1.0;

            // Gesture geometry handshake: any uv value below -64 means
            // no texture lookup. Check VertsCode::k_noTexture
            if (picking == 0 && Frag_UV.s > -64)
                result *= texture(Texture, Frag_UV.st);

            // Gesture geometry handshake: any uv equal to -128 means
            // overlay a checkerboard pattern. Check VertsCode::k_marqueePattern
            if (Frag_UV.s == -128)
            {
                // Create a pixel checkerboard pattern used for marquee
                // selection
                int x = int(gl_FragCoord.x); int y = int(gl_FragCoord.y);
                if (((x+y) & 1) == 0) result = vec4(0,0,0,1);
            }
            outputF = result;
        }
        )";

The Draw

Draw is conceptually straightforward. Gesture::Graphics has three hardcoded command lists, which need to be configured and drawn one after the other. First I may draw any GUI geometry that I want depth-composited with the rest of the scene in the viewport. This may be any supporting guide that needs to appear for the duration of some action and reveal intersections against the scene geometry. Temporarily adding and removing scene geometry to do so tends to be a bad idea.

    // Draw something "in the scene". This has a limitation that we assume there is a
    // single viewport.
    static void configure_3dDepthTested(SceneView& sceneView)
    {
        Shaders& shaders = sceneView.shaders;

        glUniformMatrix4fv(shaders.gui.loc_proj, 1, GL_FALSE,
                           (const GLfloat*)sceneView.camera.xform.m); CHECK_ERROR(__LINE__);

        glEnable(GL_DEPTH_TEST); CHECK_ERROR(__LINE__);
    }

The second pass is still 3d geometry, only this time I want it drawn on top, without depth test. These two passes share the same projection matrix as the rest of the scene. The 3d manipulators shown earlier are an example.

    // Overlay something "in the scene". This has a limitation that we assume there
    // is a single viewport.
    static void configure_3dStacked(SceneView& sceneView)
    {
        Shaders& shaders = sceneView.shaders;

        glUniformMatrix4fv(shaders.gui.loc_proj, 1, GL_FALSE,
                          (const GLfloat*)sceneView.camera.xform.m); CHECK_ERROR(__LINE__);

        glDisable(GL_DEPTH_TEST); CHECK_ERROR(__LINE__);
    }

The third pass is a 2d orthographic projection of screen space, where coordinates are measured in pixels starting at the lower left corner of the screen. This is where I draw buttons and other traditional GUI elements, if you wish.

    // Draw something in screen space without zbuffer.
    static void configure_2dScreen(SceneView& sceneView)
    {
        Shaders& shaders = sceneView.shaders;

        HomogeneousSpace4f p = HomogeneousSpace4f::ortho(
            sceneView.viewport.region.lower.x,
            sceneView.viewport.region.upper.x,
            sceneView.viewport.region.lower.y,
            sceneView.viewport.region.upper.y,
            1.0f, -1.f);
        glUniformMatrix4fv(shaders.gui.loc_proj, 1, GL_FALSE, (const float*)p.m); CHECK_ERROR(__LINE__);

        glDisable(GL_DEPTH_TEST); CHECK_ERROR(__LINE__);
    }

Each pass has its own configuration routine; these could become function pointers in case I’d like to add more than three passes.

Draw is as straightforward as looping through the command lists and their commands. I use a lambda to implement the draw because I have to run it twice: once for display and once more for selection codes. After drawing, commands and geometry are cleared, ready to be regenerated once more. Draw is called after scene rendering, right before swapping frame buffers to update the screen. This mechanism guarantees that the UI is always up to date, with zero delay and no jittering.

void Gesture::Graphics::draw(SceneView& sceneView, const SelectionBuffer& selection)
{
    // Gesture draw spans the entire window and is not restricted to a single
    // viewport.
    if (this->verts.empty())
    {
        clearCommands();
        return;
    }

    // YAGNI: With a small effort we could create dynamic passes that are
    //        fully user configurable...
    // 
    // Configure command lists
    void (*pipelineConfig[3])(SceneView&);
    // Step 1: we draw any command that is depth-composited with the scene
    pipelineConfig[static_cast<int>(Graphics::CommandSequence::k3dDepthTested)] = Pipeline::configure_3dDepthTested;
    // Step 2: we draw any command that is not depth composited but is otherwise using
    //         the same perspective projection
    pipelineConfig[static_cast<int>(Graphics::CommandSequence::k3dStacked    )] = Pipeline::configure_3dStacked;
    // Step 3: we draw anything that is just an overlay in screen space. Most of the UI
    //         elements go here.
    pipelineConfig[static_cast<int>(Graphics::CommandSequence::k2dScreen     )] = Pipeline::configure_2dScreen;

    // Backup state
    float lineWidth; glGetFloatv(GL_LINE_WIDTH, &lineWidth); CHECK_ERROR(__LINE__);
    float pointSize; glGetFloatv(GL_POINT_SIZE, &pointSize); CHECK_ERROR(__LINE__);
    bool depthTest = glIsEnabled(GL_DEPTH_TEST); CHECK_ERROR(__LINE__);

    // Draw UI and viewport manipulators
    {
        ScopedGlVertexBuffer vertex_buffer(this->verts.data(), this->verts.size() * sizeof(VertsCode));

        // Prepare a lambda to draw the Gesture commands. We'll run the lambda twice, once to
        // draw the GUI and once to draw the selection buffer data.
        auto drawGesture = [&](bool display)
        {
            Shaders& shaders = sceneView.shaders;
            shaders.gui.configure(display, this->glTextureId);

            for (int sequence = 0; sequence < Graphics::kNumCommandsLists; ++sequence)
            {
                if (!this->commands[sequence].empty())
                {
                    pipelineConfig[sequence](sceneView);

                    // YAGNI: Commands could be coalesced, setting state could be avoided
                    //        if not changing... For now it seems we can draw at over 2000 Hz
                    //        and no further optimization is required.
                    for (Graphics::CommandRange cmdr : this->commands[sequence])
                    {
                        Graphics::Command& cmd = cmdr.cmd;
                        if (cmdr.end == Graphics::kInvalidVertexIndex) cmdr.end = this->verts.size();
                        if (cmdr.begin >= cmdr.end) continue;

                        if (cmd.command == GL_LINES)  { glLineWidth(cmd.thickness); CHECK_ERROR(__LINE__); }
                        if (cmd.command == GL_POINTS) { glPointSize(cmd.thickness); CHECK_ERROR(__LINE__); }

                        glDrawArrays(cmd.command, cmdr.begin, cmdr.end - cmdr.begin); CHECK_ERROR(__LINE__);
                    }
                }
            }

            shaders.gui.cleanup();
            glDisableVertexAttribArray(shaders.gui.loc_vpos); CHECK_ERROR(__LINE__);
        };
        drawGesture(/*display*/ true);

        // The last thing we draw is the selection codes for the next frame. This
        // allows us to know what is under the pointer cursor.
        drawGestureCodes(selection, sceneView.viewport, [&]()
        {
            drawGesture(/*display*/ false);
        });
    }

    // Restore state
    glLineWidth(lineWidth); CHECK_ERROR(__LINE__);
    glPointSize(pointSize); CHECK_ERROR(__LINE__);
    glToggle(GL_DEPTH_TEST, depthTest); CHECK_ERROR(__LINE__);

    clearCommands();
}


template<typename DrawBlock>
void drawGestureCodes(const SelectionBuffer& selection, const Viewport& viewport, DrawBlock drawSceneGeometry)
{
    // Backup
    GLenum last_framebuffer; glGetIntegerv(GL_DRAW_FRAMEBUFFER_BINDING, (GLint*)&last_framebuffer); CHECK_ERROR(__LINE__);
    GLboolean last_enable_depth_test = glIsEnabled(GL_DEPTH_TEST); CHECK_ERROR(__LINE__);
    GLboolean last_enable_blend = glIsEnabled(GL_BLEND); CHECK_ERROR(__LINE__);

    // Render to texture
    glBindFramebuffer(GL_FRAMEBUFFER, selection.frameBuffer); CHECK_ERROR(__LINE__);
    {
        glViewport(viewport.region.lower.x, viewport.region.lower.y, viewport.region.size().x, viewport.region.size().y);
        glEnable(GL_BLEND);
        glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

        drawSceneGeometry();
    }

    // Restore
    glBindFramebuffer(GL_FRAMEBUFFER, last_framebuffer); CHECK_ERROR(__LINE__);//< Restore default framebuffer
    if (last_enable_depth_test) glEnable(GL_DEPTH_TEST); else glDisable(GL_DEPTH_TEST); CHECK_ERROR(__LINE__);
    if (last_enable_blend) glEnable(GL_BLEND); else glDisable(GL_BLEND); CHECK_ERROR(__LINE__);
}

Selection Buffer

In order to draw selection codes, I need a frame buffer. The same buffer is shared between GUI interaction and scene object selection (not the subject of this article). Most of the time the selection frame buffer contains Gesture selection codes. If the pointer interaction is not towards a GUI element but on otherwise “neutral” viewport space, and the action is that of object selection, then and only then do I overwrite the selection buffer with scene selection codes. At frame end, the GUI paints itself again, filling the frame buffer with new selection codes… and the cycle continues. More on this in a future post.
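Although pick() is only declared above, its core can be sketched: read the one pixel under the cursor back from the selection frame buffer and reassemble the code. The glReadPixels call is shown in a comment because it needs a live GL context, and the byte packing here is my assumption; it must match however the shader writes the codes:

```cpp
#include <cstdint>

// Reassemble a selection code from the 4 bytes of a read-back pixel and
// mask off the bit reserved for component flags (see SelectionBuffer).
// The little-endian byte order is an assumption.
inline uint32_t decodeSelectionCode(const uint8_t px[4])
{
    uint32_t code = uint32_t(px[0]) | (uint32_t(px[1]) << 8)
                  | (uint32_t(px[2]) << 16) | (uint32_t(px[3]) << 24);
    return code & 0x7fffffffu;
}

// Inside pick(), something along these lines would do the read-back:
//   uint8_t px[4] = { 0, 0, 0, 0 };
//   glBindFramebuffer(GL_READ_FRAMEBUFFER, selection.frameBuffer);
//   glReadPixels(cursorX, cursorY, 1, 1, GL_RGBA, GL_UNSIGNED_BYTE, px);
//   uint32_t code = decodeSelectionCode(px);
```

Reading a single pixel this way stalls the pipeline; for one pick per frame that is negligible, otherwise an asynchronous read through a pixel buffer object would be the usual escape hatch.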

// Some base RenderBuffer struct, in common between viewport rendering and
// other stuff... I am doubtful this is the best practice, for now I am dealing
// with gobbledygook one bit at a time.
struct RenderBuffer
{
    Vec2i resolution = Vec2i(0);
    int samples = 0;
    uint32_t frameBuffer = 0;
    uint32_t depthRenderBuffer = 0;
    uint32_t renderedTexture = 0;
    uint32_t depthTexture = 0;

    // Call update at the beginning of scene/frame update. We need to make sure
    // we do have a frame buffer (created lazily) at the right resolution.
    // @param samples is for MSAA, should be zero for selection buffers.
    bool update(Vec2i resolution, int samples = 0)
    {
        if (resolution == this->resolution && samples == this->samples)
            return true;

        // To prevent buffer allocation issues, ignore zero-size resizes; these
        // happen when the app is minimized.
        if (resolution == Vec2i(0))
            return true;

        destroy();
        return create(resolution, samples);
    }
    void destroy();
    bool create(Vec2i resolution, int samples = 0);
};

struct SelectionBuffer : RenderBuffer
{
    // 1 bit is reserved for component flags.
    static constexpr uint32_t k_noSelectionCode = 0x7fffffffu;

    // There is more stuff here for scene content selection, but it is not
    // relevant for now.
    [...]
};

The implementation is more copy-paste gobbledygook from Khronos reference guide pages. For clarity I removed the state error checking, which, at least in debug mode, should be done at each and every line. Fingers crossed this code is not leaking resources; if there is a way to check, it’s not trivial to find!

void RenderBuffer::destroy()
{
    if (frameBuffer == 0) return;
    glDeleteFramebuffers(1, &frameBuffer);
    glDeleteRenderbuffers(1, &depthRenderBuffer);
    glDeleteTextures(1, &renderedTexture);
    glDeleteTextures(1, &depthTexture);
    frameBuffer = 0;
    depthRenderBuffer = 0;
    renderedTexture = 0;
    depthTexture = 0;
    resolution = Vec2i(0);
}

bool RenderBuffer::create(Vec2i resolution, int samples)
{
    this->resolution = resolution;
    this->samples = samples;

    glGenFramebuffers(1, &frameBuffer);
    glBindFramebuffer(GL_FRAMEBUFFER, frameBuffer);

    if (samples == 0)
    {
        glCreateTextures(GL_TEXTURE_2D, 1, &renderedTexture);

        // "Bind" the newly created texture: all future texture functions will modify this texture
        glBindTexture(GL_TEXTURE_2D, renderedTexture);

        // Allocate the texture storage; passing a null pointer means its
        // contents are undefined until we render into it
        glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, resolution.x, resolution.y, 0, GL_RGB, GL_UNSIGNED_BYTE, nullptr);

        // We don't need texture filtering, but we need to specify some.
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);

        // Set "renderedTexture" as our colour attachment #0
        glFramebufferTexture(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, renderedTexture, 0);

        // The depth buffer
        glGenRenderbuffers(1, &depthRenderBuffer); 
        glBindRenderbuffer(GL_RENDERBUFFER, depthRenderBuffer);
        glRenderbufferStorage(GL_RENDERBUFFER, GL_DEPTH_COMPONENT, resolution.x, resolution.y);
        glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_RENDERBUFFER, depthRenderBuffer); 
    }
    else
    {
        glCreateTextures(GL_TEXTURE_2D_MULTISAMPLE, 1, &renderedTexture);
        glCreateTextures(GL_TEXTURE_2D_MULTISAMPLE, 1, &depthTexture);

        // "Bind" the newly created texture: all future texture functions will modify this texture
        glBindTexture(GL_TEXTURE_2D_MULTISAMPLE, renderedTexture);
        glTexImage2DMultisample(GL_TEXTURE_2D_MULTISAMPLE, samples, GL_RGBA, resolution.x, resolution.y, GL_TRUE);
        glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D_MULTISAMPLE, renderedTexture, 0);

        // The depth buffer
        glBindTexture(GL_TEXTURE_2D_MULTISAMPLE, depthTexture);
        glTexImage2DMultisample(GL_TEXTURE_2D_MULTISAMPLE, samples, GL_DEPTH32F_STENCIL8, resolution.x, resolution.y, GL_TRUE);
        glFramebufferTexture2D(GL_FRAMEBUFFER, GL_DEPTH_STENCIL_ATTACHMENT, GL_TEXTURE_2D_MULTISAMPLE, depthTexture, 0);
    }

    // Always check that our framebuffer is ok
    bool status = (glCheckFramebufferStatus(GL_FRAMEBUFFER) == GL_FRAMEBUFFER_COMPLETE);

    glBindFramebuffer(GL_FRAMEBUFFER, 0);
    return status;
}
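About the error checking I stripped: here is a hedged sketch of what a minimal CHECK_ERROR, like the one used in the picking listing further down, could look like. The g_fakeError variable and the glGetError stub are stand-ins I added so the sketch compiles on its own; real code calls into the driver instead:

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

// Fake GL error state standing in for the driver, so this sketch is
// self-contained. Real code would include the GL headers instead.
static uint32_t g_fakeError = 0;
static uint32_t glGetError() { uint32_t e = g_fakeError; g_fakeError = 0; return e; }

// Drain and report one pending GL error, tagged with the source line.
// Returns the error code so callers can react to it if they want.
static uint32_t checkError(int line)
{
    uint32_t err = glGetError();
    if (err != 0)
        fprintf(stderr, "GL error 0x%x near line %d\n", unsigned(err), line);
    return err;
}
#define CHECK_ERROR(line) checkError(line)
```

In a release build the macro could compile down to nothing; in debug builds it should follow every GL call.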

Selection

Selection, also referred to as frame-buffer picking, is the action of looking up values in a region of the frame buffer instead of displaying it. The relevant part here is how I deal with the region I want to pick values from, and what I do with those values. Everything else is GL boilerplate.

From a picking point of view, GUI interactions happen at discrete positions over the frame buffer; they are not continuous gestures such as click-drag. Basically, I need to know what the cursor is hovering over; it is up to the program to decide what to do with it. Typically the press event may activate some state in the app, such as enabling a manipulator. Dragging outside of the active manipulation area may be an action, or the canceling of one… it is up to the implementation. In other words, click-drag on a manipulator may mean “do something”, while click-drag on a button may mean “do-something wait… I-changed-my-mind… don’t-do-anything”. Here we don’t care about any of that; we only pass the information up to the caller.

Each selection code has a priority: lower values mean higher priority. I pick a region around the cursor, the size of which is arbitrary. Depending on the purpose of an app, the size of the region should be dictated by accessibility guidelines. The purpose of the region is to allow selecting thin elements without having to be precise. I wouldn’t want to draw thick “Lego Duplo”-like lines just to be able to select them; I do like the visual elegance of thin lines. At the same time, selection codes do need a priority: I cannot just pick the code in the nearest non-empty pixel. Check this example:

The rotation manipulator has a large “free rotation” disk covering most of its area. It shows as a transparent surface, gray when inactive and turning yellow when active. It is a background to the axis rotation handles though. The selection code for the free rotation must be lower in priority, otherwise proximity-based selection would not allow me to easily pick the axis rotation handles.
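The priority rule itself is simple. This CPU-side sketch scans a square region around the cursor and keeps the lowest, i.e. highest priority, code; pickRegion is a hypothetical stand-in for the glReadPixels path in the actual implementation below:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

static constexpr uint32_t k_noSelectionCode = 0x7fffffffu;

// Scan a square region of half-size `radius` around (cx, cy) in a
// width*height grid of selection codes, returning the lowest (highest
// priority) code found, or k_noSelectionCode if the region is empty.
static uint32_t pickRegion(const std::vector<uint32_t>& codes,
                           int width, int height,
                           int cx, int cy, int radius)
{
    uint32_t best = k_noSelectionCode;
    for (int y = std::max(0, cy - radius); y <= std::min(height - 1, cy + radius); ++y)
        for (int x = std::max(0, cx - radius); x <= std::min(width - 1, cx + radius); ++x)
            best = std::min(best, codes[std::size_t(y) * width + x]);
    return best;
}
```

With this rule, a thin axis handle drawn with a low code still wins over the large free-rotation disk behind it, as long as the disk carries a higher code.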

Here is the listing with the picking implementation.

uint32_t Gesture::Graphics::pick(SelectionBuffer& selection, const Gesture::Input& input, const Viewport& viewport)
{
    // Todo: the choice of pointer button should not be hardcoded here
    const Input::Button& button = input.mbs[Gesture::Input::kButtonLeft];
    int clickEnded = (button.action == Input::Action::kRelease);
    int clickDrag = (button.action == Input::Action::kDrag);
    int32_t buttonModifier = button.modifier;

    if (clickEnded || clickDrag)
        return SelectionBuffer::k_noSelectionCode; //< not a selection event

    // Prepare a region in raster space
    BBox2i region(wb::empty);
    {
        Vec2i pixel = viewport.toRaster(input.cursorPos);

        // Grow the click position by some pixels to improve usability. Ideally
        // this should be a configurable parameter to improve accessibility.
        constexpr int kClickRadius = 7; //< in pixels
        region.extend(pixel - Vec2i(kClickRadius));
        region.extend(pixel + Vec2i(kClickRadius));
    }

    // Render on the whole framebuffer, complete from the lower left corner to the upper right
    BBox2i viewRegion(viewport.region.lower, viewport.region.upper - 1);

    // Crop selection with view in order to feed GL draw a valid region.
    region = intersect(region, viewRegion);

    // Frame buffer resolution should be correct, check just in case.
    if (selection.resolution != viewport.resolution)
        return SelectionBuffer::k_noSelectionCode;

    uint32_t entry = SelectionBuffer::k_noSelectionCode; 

    // Render to texture
    GLint last_framebuffer; glGetIntegerv(GL_DRAW_FRAMEBUFFER_BINDING, &last_framebuffer); CHECK_ERROR(__LINE__);
    glBindFramebuffer(GL_FRAMEBUFFER, selection.frameBuffer);
    {
        // glViewport takes (x, y, width, height), not an upper corner
        glViewport(viewRegion.lower.x, viewRegion.lower.y,
                   viewRegion.upper.x - viewRegion.lower.x + 1,
                   viewRegion.upper.y - viewRegion.lower.y + 1);

        Vec2i regionSize = region.size() + Vec2i(1);
        size_t size = size_t(regionSize.x) * size_t(regionSize.y);
        if (size)
        {
            // Read pixels over a region. What we read is a 32-bit unsigned value
            // partitioned into 8-bit color components... at least until we figure
            // out how to do it better.
            // If the selection region is small, work on stack memory, otherwise allocate.
            uint8_t  valuesLocalBuffer[1024 * 4];
            uint8_t* values = (size <= 1024 ? valuesLocalBuffer : (uint8_t*)malloc(size * 4));
            glReadPixels(region.lower.x, region.lower.y, regionSize.x, regionSize.y,
                            GL_RGBA, GL_UNSIGNED_BYTE, values); CHECK_ERROR(__LINE__);
     
            // Search the click area for the lowest selection code. Lower code means
            // higher selection priority.
            for (uint8_t* rgba = values; rgba < (values + size * 4); rgba += 4)
            {
                uint32_t code = selectionRGB8ToCode(rgba);
                if (code != SelectionBuffer::k_noSelectionCode)
                {
                    if (code < entry)
                        entry = code;
                }
            }

            if (values != valuesLocalBuffer)
                free(values);
        }
    }
    // Restore previous framebuffer
    glBindFramebuffer(GL_FRAMEBUFFER, last_framebuffer); CHECK_ERROR(__LINE__);

    return entry;
}
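The BBox2i helpers the listing relies on (extend, intersect, size) come from my math library and are not shown here. This is a minimal sketch of the semantics pick() assumes, namely a box that starts empty (what the wb::empty tag expresses in my code) and uses inclusive bounds, hence the region.size() + Vec2i(1) above:

```cpp
#include <algorithm>
#include <cassert>

struct Vec2i { int x = 0, y = 0; };

// Minimal integer bounding box with inclusive bounds. A sketch of the
// behavior pick() relies on; my actual math types differ in detail.
struct BBox2i
{
    Vec2i lower {  1 << 30,  1 << 30 }; // starts empty (inverted bounds)
    Vec2i upper { -(1 << 30), -(1 << 30) };

    // Grow the box to contain point p.
    void extend(Vec2i p)
    {
        lower.x = std::min(lower.x, p.x); lower.y = std::min(lower.y, p.y);
        upper.x = std::max(upper.x, p.x); upper.y = std::max(upper.y, p.y);
    }
    // With inclusive bounds, the pixel count per axis is size() + 1.
    Vec2i size() const { return { upper.x - lower.x, upper.y - lower.y }; }
};

// Overlap of two boxes; used to crop the click region to the viewport.
static BBox2i intersect(const BBox2i& a, const BBox2i& b)
{
    BBox2i r;
    r.lower = { std::max(a.lower.x, b.lower.x), std::max(a.lower.y, b.lower.y) };
    r.upper = { std::min(a.upper.x, b.upper.x), std::min(a.upper.y, b.upper.y) };
    return r;
}
```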

Conclusion

I have yet to show a concrete example of drawing a manipulator on screen. So far, this was just infrastructure. But this post is already long enough, and I don’t want to drag it any further.

I feel there are some rough corners here, and this post should give you a sense of my discomfort around the use of graphics APIs. You either know exactly what you are doing, or you are left with a lot to guess. In the end you have something that works, but any one of those calls may be making the system wholly inefficient, or round-trippy inside the GPU driver. The fact that I couldn’t figure out how to draw Gesture selection codes in a single pass along with the MSAA main frame buffer is not something to write home about… so I write to the rest of the world instead, as an attempt to say: don’t feel down if you are struggling. This stuff can be weird and impenetrable. Hopefully you’ll find a mentor who knows it and can guide you down the right path. I am doing this the stubborn way, not asking for help, to see how far I’ll go. I am also not reading tutorials, as I am trying to avoid any imprinting onto some particular way of doing things. I only let my past experience on the other end of graphics guide me, which is the essence of this series of posts. I hope you find this useful, and in exchange I ask the experts for forgiveness.
