9

Take a look at the function SHA1Transform taken from an SHA1 algorithm on Github. Assuming SHA1HANDSOFF is defined, the function looks like this:

void SHA1Transform(
    uint32_t state[5],
    const unsigned char buffer[64]
)
{
    uint32_t a, b, c, d, e;

    typedef union
    {
        unsigned char c[64];
        uint32_t l[16];
    } CHAR64LONG16;

    CHAR64LONG16 block[1];      /* use array to appear as a pointer */

    memcpy(block, buffer, 64);

    /* Copy context->state[] to working vars */
    a = state[0];
    b = state[1];
    c = state[2];
    d = state[3];
    e = state[4];
    /* 4 rounds of 20 operations each. Loop unrolled. */
    R0(a, b, c, d, e, 0);
    R0(e, a, b, c, d, 1);

    [LOTS OF SIMILAR STATEMENTS REMOVED FOR READABILITY]

    /* Add the working vars back into context.state[] */
    state[0] += a;
    state[1] += b;
    state[2] += c;
    state[3] += d;
    state[4] += e;
    /* Wipe variables */
    a = b = c = d = e = 0;
    memset(block, '\0', sizeof(block));
}

As you can see, before the function exits, the variables a,b,c,d,e are reset to zero and also the contents of block is zeroed. Since these are local variables this looks like unnecessary overhead to me because they'll go out of scope as soon as the function returns anyway.

Nevertheless, I'm wondering if there's any reason why the author of the code chose to reset those locals? Is there any or do you agree that it's unnecessary overhead to reset these local variables?

6

3 Answers 3

22

The purpose of the wipe code is to avoid leaving a copy of the original block present in the stack memory. The original block may contain secrets such as a password so erasing this from memory is good practice.

It is rather easy to scan stack memory should an attacker gain control of the process after the secret was encrypted. Chances are the buffer will have been at least partially overwritten but playing it safe seems palatable.

The problem is this code may be optimized out by the compiler as modifying local data that goes out of scope seem unnecessary. The compiler inlines the memset call and will elide it too.

A new function memset_explicit was added to the C library as of the C23 Standard to prevent this optimisation and tell the compiler that such an erasure is required by the programmer and should not be elided.

Wiping the local variables a through e seems overkill, but forcing erasure of the copy of the argument block makes sense and has minimal cost compared to the SHA computations.

Sign up to request clarification or add additional context in comments.

4 Comments

Image
Also, the optional Annex K actually has had one useful function all the time: memset_s(): "Unlike memset, any call to the memset_s function shall be evaluated strictly according to the rules of the abstract machine as described in (5.1.2.3). That is, any call to the memset_s function shall assume that the memory indicated by s and n may be accessible in the future and thus must contain the values indicated by c."
@chqrlie, Idea: All top level local objects a through e, block, ... could be instead made part of ls, a structure object ls with members .a, .b ... .block. Then the final zeroing with one memset_explicit(&ls, 0, sizeof ls) call is clean to code.
That's a possibility, but it might complicate the work of the compiler optimizing the code that uses these variables.
Image
@chux: Wiping out the local variables a to e may not be that helpful, unfortunately. The compiler will put these variables into registers and registers are routinely spilled to the stack when calling functions -- such as memset_explicit. A struct doesn't help, optimizers are smart, and will put the individual struct fields of a stack-allocated value into registers just as well. And avoiding putting those variables in registers would likely impact performance...
3

Apparently you are right, if you have a function, a local variable is created and the function ends, then your variables are indeed getting out of scope. However, it looks like the author worried about their values still being unchanged someplace in memory and even though the variables no longer exist, their values are still unchanged at the memory location they were located at. So if somehow a hacker finds a reliable way to localize those unreferenced, but unchanged locations in memory, then referring to them later on may get the information leaked. It is also important what the function actually does, at the section you marked at

[LOTS OF SIMILAR STATEMENTS REMOVED FOR READABILITY]

and before and after that. If these variables are being passed by address someplace, then it is possible that helper functions are storing their address someplace that can be reached even after the function returns, in which case, reaching out to those goodies one may be able to get the values at that address, even though there is no variable addressing them anymore.

So the questions you need to ask yourself of are:

  1. Do I need to do this micro-optimization, do I really have a performance issue whose fix would be this? (removing the resets)
  2. How secure the values should be? Are they top-secret I am willing to protect from eager hackers, what is the security implication if that sha1 gets partially or entirely leaked?

Likely your answer to 1 is "no" and to 2 is "I need security". Hence, you need protection even of the no longer referenced values and hence it is smart to remove it, as the best security that can be offered here for those values is their nonexistent not only as variables, but as address too.

Comments

0

Deleting (wiping) local data is done for security reasons, as you have also been told.

But you need also to be told that, as those variables are not observable output of the routine, you will need special compilation options (concretely, don’t optimize the code generator) for the compiler not to evict completely that code. So always compile that compilation unit with -O0 preferably.

1 Comment

Image
As others have noted, there are ways to not having to rely on no optimisation during compilation (which is insane).

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.