My blog is now at andrea.corbellini.name.
The URL for the feed has changed, update it if you need to.
A quick notice to everyone who follows my blog via RSS: my feeds will soon change URLs. This means that your feed reader will give you an error when trying to fetch new articles.
I’m currently switching to another content management system and, if everything goes well, I should be able to complete the migration this weekend. As soon as the migration is complete, my RSS feeds will have different URLs.
So, if you receive an error these days, just delete the old feed and add the new one.
Sorry for the disruption.
Let’s Encrypt (the free, automated and open certificate authority) has just announced its launch schedule. According to it, certificates will be released to the public starting from the week of September 14, 2015.
Their intermediate certificates, which were generated a few days ago, will be signed by IdenTrust. What this means is that if you browse a web page secured by Let’s Encrypt, you won’t get any scary message, but the usual green lock.


In case you are curious: the root certificate is a 4096-bit RSA key, and the two intermediate certificates are both 2048-bit RSA keys. They are also planning to generate ECDSA keys later this year.
Technical aspects aside, this will be a great opportunity for the entire web. As I have already written, I always dreamed of an encrypted web, and I truly believe that Let’s Encrypt — or at least its approach to the problem — is the way to go.
So, will you get a Let’s Encrypt certificate when the time comes? I will. Not for this blog (I can’t install a certificate here without paying), but for other websites I manage.
Perhaps I’ll also show a “Proudly secured by Let’s Encrypt” badge.
This post is the fourth and last in the series ECC: a gentle introduction.
In the last post we saw two algorithms, ECDH and ECDSA, and we saw how the discrete logarithm problem for elliptic curves plays an important role in their security. But, if you remember, we said that we have no mathematical proofs for the complexity of the discrete logarithm problem: we believe it to be “hard”, but we can’t be sure. In the first part of this post, we’ll try to get an idea of how “hard” it is in practice with today’s techniques.
Then, in the second part, we will try to answer the question: why do we need elliptic curve cryptography if RSA (and the other cryptosystems based on modular arithmetic) work well?
We will now see the two most efficient algorithms for computing discrete logarithms on elliptic curves: the baby-step giant-step algorithm, and Pollard’s rho method.
Before starting, as a reminder, here is what the discrete logarithm problem is about: given two points P and Q, find the integer x that satisfies the equation Q = xP. The points belong to a subgroup of an elliptic curve, with base point G and order n.
Before entering the details of the algorithm, a quick consideration: we can always write any integer x as x = am + b, where a, m and b are three arbitrary integers. For example, we can write 10 = 2 · 3 + 4.
With this in mind, we can rewrite the equation for the discrete logarithm problem as follows:

Q = xP
Q = (am + b)P
Q = amP + bP
Q − amP = bP
The baby-step giant-step is a “meet in the middle” algorithm. Contrary to the brute-force attack (which forces us to calculate all the points xP for every x until we find Q), we will calculate “few” values for bP and “few” values for Q − amP until we find a correspondence. The algorithm works as follows:

1. Calculate m = ⌈√n⌉.
2. For every b in {0, …, m}, calculate bP and store the results in a hash table (the “baby steps”).
3. For every a in {0, …, m}: calculate amP and then Q − amP, and check whether Q − amP appears in the hash table. If a point bP such that bP = Q − amP exists, then we have found x = am + b (the “giant steps”).
As you can see, initially we calculate the points bP with little (i.e. “baby”) increments for the coefficient b (1P, 2P, 3P, …). Then, in the second part of the algorithm, we calculate the points amP with huge (i.e. “giant”) increments for am (1mP, 2mP, 3mP, …, where m is a huge number).

To understand why this algorithm works, forget for a moment that the points bP are cached and take the equation Q = amP + bP. Consider what follows: when a = 0, we are checking whether Q is equal to one of the points 0P, 1P, …, mP; when a = 1, we are checking whether Q is equal to one of the points mP, (m + 1)P, …, 2mP; and so on, up to a = m. Since m^2 ≥ n, this sweep covers every multiple of P from 0P to nP.
In conclusion, we are checking all points from 0P to nP (that is, all the possible points) performing at most 2m additions and multiplications (exactly m for the baby steps, at most m for the giant steps).
If you consider that a lookup on a hash table takes O(1) time, it’s easy to see that this algorithm has both time and space complexity O(√n) (or O(2^(k/2)), if you consider the bit length k of n). It’s still exponential time, but much better than a brute-force attack.
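The same idea can be sketched in a few lines of Python. To keep the example self-contained (no curve arithmetic), this sketch solves g^x = h in the multiplicative group of integers modulo a prime instead; the structure is identical for curve points, with point additions in place of multiplications:

```python
from math import isqrt

def bsgs(g, h, p):
    """Solve g^x = h (mod p) for x with baby-step giant-step.
    Here the group is the integers mod p under multiplication;
    on a curve, multiplications become point additions."""
    n = p - 1                                   # group order (g a generator)
    m = isqrt(n) + 1
    baby = {pow(g, b, p): b for b in range(m)}  # baby steps: g^b for b < m
    factor = pow(g, -m, p)                      # g^(-m): one "giant" step back
    gamma = h                                   # gamma = h * g^(-am)
    for a in range(m):
        if gamma in baby:                       # h * g^(-am) = g^b
            return a * m + baby[gamma]          # => x = am + b
        gamma = gamma * factor % p
    return None

print(bsgs(3, 13, 17))   # -> 4, since 3^4 = 81 = 13 (mod 17)
```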
It may make sense to see what the complexity O(√n) means in practice. Let’s take a standardized curve: prime192v1 (aka secp192r1, ansiX9p192r1). This curve has order n = 0xffffffffffffffffffffffff99def836146bc9b1b4d22831. The square root of n is approximately 7.922816251426434 · 10^28 (almost eighty octillions).
Now imagine storing √n points in a hash table. Suppose that each point requires exactly 32 bytes: our hash table would need approximately 2.5 · 10^30 bytes of memory. Looking on the web, it seems that the total world storage capacity is in the order of the zettabyte (10^21 bytes). This is almost ten orders of magnitude lower than the memory required by our hash table! Even if our points took 1 byte each, we would still be very far from being able to store all of them.
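The arithmetic is easy to check:

```python
from math import isqrt

# Order of prime192v1, from above.
n = 0xffffffffffffffffffffffff99def836146bc9b1b4d22831

table = isqrt(n) * 32    # bytes needed at 32 bytes per stored point
world = 10 ** 21         # ~1 zettabyte: total world storage capacity
print(table // world)    # how many "worlds" of storage we would need
```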
This is impressive, and it’s even more impressive if you consider that prime192v1 is one of the curves with the lowest order. The order of secp521r1 (another standard curve from NIST) is approximately 6.9 · 10^156!
I made a Python script that computes discrete logarithms using the baby-step giant-step algorithm. Obviously it only works with curves with small orders: don’t try it with secp521r1, unless you want to receive a MemoryError.
It should produce an output like this:
Curve: y^2 = (x^3 + 1x - 1) mod 10177
Curve order: 10331
p = (0x1, 0x1)
q = (0x1a28, 0x8fb)
325 * p = q
log(p, q) = 325
Took 105 steps
Pollard’s rho is another algorithm for computing discrete logarithms. It has the same asymptotic time complexity O(√n) as the baby-step giant-step algorithm, but its space complexity is just O(1). If baby-step giant-step can’t solve discrete logarithms because of the huge memory requirements, will Pollard’s rho make it? Let’s see…
First of all, another reminder of the discrete logarithm problem: given P and Q, find x such that Q = xP. With Pollard’s rho, we will solve a slightly different problem: given P and Q, find the integers a, b, A and B such that aP + bQ = AP + BQ.
Once the four integers are found, we can use the equation Q = xP to find out x:

aP + bQ = AP + BQ
aP + bxP = AP + BxP
(a − A)P = (B − b)xP
Now we can get rid of P. But before doing so, remember that our subgroup is cyclic with order n, therefore the coefficients used in point multiplication are taken modulo n:

x = (a − A)(B − b)^-1 mod n
The principle of operation of Pollard’s rho is simple: we define a pseudo-random sequence of (a, b) pairs. This sequence of pairs can be used to generate the sequence of points aP + bQ. Because both P and Q are elements of the same cyclic subgroup, the sequence of points aP + bQ is cyclic too.
This means that if we walk our pseudo-random sequence of (a, b) pairs, sooner or later we will detect a cycle. That is: we will find a pair (a, b) and another distinct pair (A, B) such that aP + bQ = AP + BQ. Same points, distinct pairs: we can apply the equation above to find the logarithm.
The problem is: how do we detect the cycle in an efficient way?
In order to detect our cycle, we could try all the possible values for a and b using a pairing function, but given that there are n^2 such pairs, our algorithm would be O(n^2), much worse than a brute-force attack.
But there exists a faster method: the tortoise and hare algorithm (also known as Floyd’s cycle-finding algorithm). The picture below shows the principle of operation of the tortoise and hare method, which is at the core of Pollard’s rho.

We have the curve y^2 = (x^3 + 2x + 3) mod 97 and the points P = (3, 6) and Q = (80, 87). The points belong to a cyclic subgroup of order 5.
We walk a sequence of pairs at different speeds until we find two different pairs (a, b) and (A, B) that produce the same point. In this case, we have found the pairs (3, 3) and (2, 0) that allow us to calculate the logarithm as x = (3 − 2)(0 − 3)^-1 mod 5 = 3. And in fact we correctly have Q = 3P.
Basically, we take our pseudo-random sequence of (a, b) pairs, together with the corresponding sequence of aP + bQ points. The sequence of (a, b) pairs may or may not be cyclic, but the sequence of points is, because both P and Q were generated from the same base point, and from the properties of subgroups we know that we can’t “escape” from the subgroup using just scalar multiplication and addition.
Now we take our two pets, the tortoise and the hare, and make them walk our sequence from left to right. The tortoise (the green spot in the picture) is slow and reads each point one by one; the hare (represented in red) is fast and skips a point at every step.
After some time both the tortoise and the hare will have found the same point, but with different coefficient pairs. Or, to express that with equations, the tortoise will have found a pair (a, b) and the hare will have found a pair (A, B) such that aP + bQ = AP + BQ.
If our random sequence is defined through an algorithm (as opposed to being stored statically), it’s easy to see how this principle of operation requires just O(log n) space. Calculating the asymptotic time complexity is not that easy, but we can build a probabilistic proof that shows how the time complexity is O(√n), as we have already said.
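The walk can be sketched in Python. As with the baby-step giant-step sketch, this version works in the multiplicative group of integers modulo a prime to stay self-contained (the curve version replaces multiplications with point additions); the three-branch update rule is Pollard’s classic way of defining the pseudo-random sequence:

```python
import random
from math import gcd

def pollard_rho_dlog(g, h, p, n):
    """Find x with g^x = h (mod p), where g has prime order n,
    using a pseudo-random walk and Floyd's tortoise and hare."""
    def step(y, a, b):
        # Partition the group into three sets to define the walk; each
        # step updates the element y = g^a * h^b and its coefficients.
        if y % 3 == 0:
            return y * y % p, 2 * a % n, 2 * b % n   # y -> y^2
        elif y % 3 == 1:
            return y * g % p, (a + 1) % n, b         # y -> y*g
        else:
            return y * h % p, a, (b + 1) % n         # y -> y*h

    while True:
        a, b = random.randrange(n), random.randrange(n)
        y = pow(g, a, p) * pow(h, b, p) % p          # random start point
        Y, A, B = y, a, b
        while True:
            y, a, b = step(y, a, b)                  # tortoise: one step
            Y, A, B = step(*step(Y, A, B))           # hare: two steps
            if y == Y:
                break
        # g^a * h^b = g^A * h^B  =>  x * (b - B) = A - a (mod n)
        d = (b - B) % n
        if gcd(d, n) == 1:
            return (A - a) * pow(d, -1, n) % n
        # degenerate collision (b = B mod n): restart the walk

# 4 generates the subgroup of prime order 509 modulo the prime 1019.
x = pollard_rho_dlog(4, pow(4, 123, 1019), 1019, 509)
```

The random restarts handle the rare collisions where the two pairs coincide and nothing can be inverted.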
I’ve built a Python script that computes discrete logarithms using Pollard’s rho. It is not the implementation of the original Pollard’s rho, but a slight variation of it (I’ve used a more efficient method for generating the pseudo-random sequence of pairs). The script contains some useful comments, so read it if you are interested in the details of the algorithm.
This script, like the baby-step giant-step one, works on a tiny curve, and produces the same kind of output.
We said that baby-step giant-step can’t be used in practice, because of the huge memory requirements. Pollard’s rho, on the other hand, requires very little memory. So, how practical is it?
Certicom launched a challenge in 1998 to compute discrete logarithms on elliptic curves with bit lengths ranging from 109 to 359. As of today, only 109-bit long curves have been successfully broken. The latest successful attempt was made in 2004. Quoting Wikipedia:
The prize was awarded on 8 April 2004 to a group of about 2600 people represented by Chris Monico. They also used a version of a parallelized Pollard rho method, taking 17 months of calendar time.
As we have already said, prime192v1 is one of the “smallest” elliptic curves. We also said that Pollard’s rho has O(√n) time complexity. If we used the same technique as Chris Monico (the same algorithm, on the same hardware, with the same number of machines), how much would it take to compute a logarithm on prime192v1?
This number is pretty self-explanatory and gives a clear idea of how hard it can be to break a discrete logarithm using such techniques.
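We can extrapolate it ourselves: Pollard’s rho takes O(√n) steps, so going from a roughly 2^109 subgroup order to the roughly 2^192 order of prime192v1 multiplies the work by √(2^192 / 2^109). Assuming the same hardware and Monico’s 17 months of calendar time:

```python
import math

# The broken challenge curve had a ~109-bit order; prime192v1 has a
# ~192-bit order. Pollard's rho work scales with the square root.
months = 17 * math.sqrt(2 ** (192 - 109))
print(f"about {months / 12:.1e} years")   # prints: about 4.4e+12 years
```

Trillions of years, with the hardware of 2004 and one of the smallest standardized curves.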
I decided to put the baby-step giant-step script and the Pollard’s rho script, together with a brute-force script, into a fourth script to compare their performance.
This fourth script computes all the logarithms for all the points on the “tiny” curve using the different algorithms and reports how long each one took:
Curve order: 10331
Using bruteforce
Computing all logarithms: 100.00% done
Took 2m 31s (5193 steps on average)
Using babygiantstep
Computing all logarithms: 100.00% done
Took 0m 6s (152 steps on average)
Using pollardsrho
Computing all logarithms: 100.00% done
Took 0m 21s (138 steps on average)
As we could expect, the brute-force method is tremendously slow compared to the other two. Baby-step giant-step is the fastest, while Pollard’s rho is more than three times slower than baby-step giant-step (although it uses far less memory and fewer steps on average).
Also look at the number of steps: brute force used 5193 steps on average for computing each logarithm. 5193 is very close to 10331 / 2 (half the curve order). Baby-step giant-step and Pollard’s rho used 152 steps and 138 steps respectively, two numbers very close to the square root of 10331 (≈ 101.64).
While discussing these algorithms, I have presented many numbers. It’s important to be cautious when reading them: algorithms can be greatly optimized in many ways. Hardware can improve. Specialized hardware can be built.
The fact that an approach today seems impractical does not imply that the approach can’t be improved. Nor does it imply that other, better approaches don’t exist (remember, once again, that we have no proofs for the complexity of the discrete logarithm problem).
If today’s techniques are unsuitable, what about tomorrow’s techniques? Well, things are a bit more worrisome: there exists a quantum algorithm capable of computing discrete logarithms in polynomial time: Shor’s algorithm, which has time complexity O((log n)^3) and space complexity O(log n).
The efficiency of quantum algorithms lies in state superposition. On classical computers, the memory cells (i.e. the bits) may be either 1 or 0, with no intermediate states between the two. The memory cells of quantum computers (known as qubits), on the other hand, are subject to the uncertainty principle: they do not have a truly defined state until they are measured. State superposition does not mean that each qubit may be 0 and 1 at the same time (as it is often said on the web): it means that when we measure the qubit, we have a certain probability of observing 0, and another probability of observing 1. Quantum algorithms work by modifying the probability of each qubit.
This peculiarity implies that with a limited number of qubits, we can deal with lots of possible inputs at the same time. So, for example, we can tell a quantum computer that there’s a number x uniformly distributed between 0 and n – 1. This requires just log n qubits instead of n log n bits. Then, we can tell the quantum computer to perform scalar multiplication xP. This will result in the superposition of states given by all the points from 0P to (n – 1)P — that is, if we measured our qubits now, we would obtain one of the points from 0P to (n – 1)P with probability 1 / n.
This was to give you an idea of how powerful state superposition is. Shor’s algorithm does not work exactly this way; it is actually more complicated. What makes it complicated is that, while we can “simulate” n states at the same time, at some point we have to reduce those many states to just a few, because we want a single answer as output, not many (i.e. we want to know one single logarithm, not many probably wrong logarithms).
Now let’s forget about quantum computing, which is still far from being a serious problem. The question I’ll answer now is: why bother with elliptic curves if RSA works well?
A quick answer is given by NIST, which provides a table comparing the RSA and ECC key sizes required to achieve the same level of security.
| RSA key size (bits) | ECC key size (bits) |
|---|---|
| 1024 | 160 |
| 2048 | 224 |
| 3072 | 256 |
| 7680 | 384 |
| 15360 | 521 |
Note that there is no linear relationship between the RSA key sizes and the ECC key sizes (in other words: if we double the RSA key size, we don’t have to double the ECC key size). This table tells us not only that ECC uses less memory, but also that key generation and signing are considerably faster.
But why is it so? The answer is that the fastest algorithms known for computing discrete logarithms on elliptic curves are Pollard’s rho and baby-step giant-step, while for RSA we have faster algorithms. One in particular is the general number field sieve: an algorithm for integer factorization that can also be used to compute discrete logarithms. The general number field sieve is the fastest known algorithm for integer factorization to date.
All of this applies to other cryptosystems based on modular arithmetic as well, including DSA, D-H and ElGamal.
And now the hard part. So far we have discussed algorithms and mathematics. Now it’s time to discuss people, and things get more complicated.
If you remember, in the last post we said that certain classes of elliptic curves are weak, and to solve the problem of trusting curves from dubious sources we added a random seed to our domain parameters. And if we look at standard curves from NIST we can see that they are all verifiably random.
If we read the Wikipedia page for “nothing up my sleeve”, we can see that:
These numbers are random because their digits are uniformly distributed. And they are also unsuspicious, because they have a justification.
Now the question is: where do the random seeds for NIST curves come from? The answer is, sadly: we don’t know. Those seeds have no justification at all.
Is it possible that NIST has discovered a “sufficiently large” class of weak elliptic curves and has tried many possible seeds until they found a vulnerable curve? I can’t answer this question, but it is a legitimate and important one. We know that NIST has succeeded in standardizing at least one vulnerable random number generator (a generator which, oddly enough, is based on elliptic curves). Perhaps they also succeeded in standardizing a set of weak elliptic curves. How do we know? We can’t.
What’s important to understand is that “verifiably random” and “secure” are not synonyms. And it doesn’t matter how hard the logarithm problem is, or how long our keys are, if our algorithms are broken, there’s nothing we can do.
With respect to this, RSA wins, as it does not require special domain parameters that can be tampered with. RSA (as well as other modular arithmetic systems) may be a good alternative if we can’t trust authorities and if we can’t construct our own domain parameters. And in case you are asking: yes, TLS may use NIST curves. If you check https://google.com, you’ll see that the connection uses ECDHE and ECDSA, with a certificate based on prime256v1 (aka secp256r1).
I hope you have enjoyed this series. My aim was to give you the basic knowledge, terminology and conventions to understand what elliptic curve cryptography today is. If I reached my aim, you should now be able to understand existing ECC-based cryptosystems and to expand your knowledge by reading “not so gentle” documentation. When writing this series, I could have skipped many details and used simpler terminology, but I felt that by doing so you would not have been able to understand what the web has to offer. I believe I have found a good compromise between simplicity and completeness.
Note though that by reading just this series, you will not be able to implement secure ECC cryptosystems: security requires us to know many subtle but important details. Remember the requirements for Smart’s attack and Sony’s mistake — these are just two examples that should teach you how easy it is to produce insecure algorithms and how easy it is to exploit them.
So, if you are interested in diving deeper into the world of ECC, where to go from here?
First off, so far we have seen Weierstrass curves over prime fields, but you must know that there exist other kinds of curves and fields, in particular:
- nistk163, nistk283 and nistk571 (three curves defined over fields of 163, 283 and 571 bits);
- nistb163, nistb283 and nistb571.

If you are interested in the implementation details of ECC, then I suggest you read the sources of OpenSSL and GnuTLS.
Finally, if you are interested in the mathematical details, rather than the security and efficiency of the algorithms, you must know that:
And don’t forget to study finite fields and field theory.
These are the keywords that you should look up if you’re interested in the topics.
Now the series is officially concluded. Thank you for all your friendly comments, tweets and mails. Many have asked me if I’m going to write other series on other closely related topics. The answer is: maybe. I accept suggestions, but I can’t promise anything.
Thanks for reading and see you next time!
This post is the third in the series ECC: a gentle introduction.
In the previous posts, we have seen what an elliptic curve is and we have defined a group law in order to do some math with the points of elliptic curves. Then we have restricted elliptic curves to finite fields of integers modulo a prime. With this restriction, we have seen that the points of elliptic curves generate cyclic subgroups and we have introduced the terms base point, order and cofactor.
Finally, we have seen that scalar multiplication in finite fields is an “easy” problem, while the discrete logarithm problem seems to be “hard”. Now we’ll see how all of this applies to cryptography.
Our elliptic curve algorithms will work in a cyclic subgroup of an elliptic curve over a finite field. Therefore, our algorithms will need the following parameters:

- The prime p that specifies the size of the finite field.
- The coefficients a and b of the elliptic curve equation.
- The base point G that generates our subgroup.
- The order n of the subgroup.
- The cofactor h of the subgroup.
In conclusion, the domain parameters for our algorithms are the sextuple (p, a, b, G, n, h).
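In code, the sextuple could be represented like this. The concrete values below are the tiny curve y^2 = x^3 + x − 1 (mod 10177), base point (1, 1), order 10331, cofactor 1, used by the example scripts in this series:

```python
from collections import namedtuple

# The sextuple of domain parameters (p, a, b, G, n, h).
DomainParams = namedtuple("DomainParams", "p a b G n h")

# Tiny demo curve: y^2 = x^3 + x - 1 (mod 10177), order 10331, cofactor 1.
tiny = DomainParams(p=10177, a=1, b=-1, G=(1, 1), n=10331, h=1)

# Sanity check: the base point satisfies the curve equation.
x, y = tiny.G
assert (y * y - x ** 3 - tiny.a * x - tiny.b) % tiny.p == 0
```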
When I said that the discrete logarithm problem was “hard”, I wasn’t entirely right. There are some classes of elliptic curves that are particularly weak and allow the use of special purpose algorithms to solve the discrete logarithm problem efficiently. For example, all the curves that have p = hn (that is, the order of the finite field is equal to the order of the elliptic curve) are vulnerable to Smart’s attack, which can be used to solve discrete logarithms in polynomial time on a classical computer.
Now, suppose that I give you the domain parameters of a curve. There’s the possibility that I’ve discovered a new class of weak curves that nobody knows about, and that I have built a “fast” algorithm for computing discrete logarithms on the curve I gave you. How can I convince you of the contrary, i.e. that I’m not aware of any vulnerability? How can I assure you that the curve is “safe” (in the sense that it can’t be used for special purpose attacks by me)?
In an attempt to solve this kind of problem, sometimes we have an additional domain parameter: the seed S. This is a random number used to generate the coefficients a and b, or the base point G, or both. These parameters are generated by computing the hash of the seed S. Hashes, as we know, are “easy” to compute, but “hard” to reverse.


A curve generated through a seed is said to be verifiably random. The principle of using hashes to generate parameters is known as “nothing up my sleeve”, and is commonly used in cryptography.
This trick should give some sort of assurance that the curve has not been specially crafted to expose vulnerabilities known to the author. In fact, if I give you a curve together with a seed, it means I was not free to arbitrarily choose the parameters a and b, and you should be relatively sure that the curve cannot be used for special purpose attacks by me. The reason why I say “relatively” will be explained in the next post.
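As a toy illustration of the idea (this is not the ANSI X9.62 procedure, which is based on SHA-1 and a more involved bit-level encoding), one could derive a curve coefficient from a seed like this:

```python
import hashlib

# Toy sketch only: derive the coefficient b of y^2 = x^3 + ax + b from a
# seed. Anyone can recompute b from the seed, so b is "verifiably random":
# the author could not have chosen it freely.
def curve_from_seed(seed: bytes, p: int, a: int) -> int:
    b = int.from_bytes(hashlib.sha256(seed).digest(), "big") % p
    if (4 * a ** 3 + 27 * b ** 2) % p == 0:
        raise ValueError("singular curve: pick another seed")
    return b

b = curve_from_seed(b"nothing up my sleeve", 2 ** 127 - 1, 1)
```

Reversing this (choosing a weak b first, then finding a seed that hashes to it) would require inverting the hash function.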
A standardized algorithm for generating and checking random curves is described in ANSI X9.62 and is based on SHA-1. If you are curious, you can read the algorithms for generating verifiable random curves on a specification by SECG (look for “Verifiably Random Curves and Base Point Generators”).
I’ve created a tiny Python script that verifies all the random curves currently shipped with OpenSSL. I strongly recommend you check it out!
It took us a long time, but finally here we are! Therefore, pure and simple: the private key is a random integer d chosen from {1, …, n − 1} (where n is the order of the subgroup); the public key is the point H = dG (where G is the base point of the subgroup).
You see? If we know d and G (along with the other domain parameters), finding H is “easy”. But if we know H and G, finding the private key d is “hard”, because it requires us to solve the discrete logarithm problem.
Now we are going to describe two public-key algorithms based on that: ECDH (Elliptic curve Diffie-Hellman), which is used for encryption, and ECDSA (Elliptic Curve Digital Signature Algorithm), used for digital signing.
ECDH is a variant of the Diffie-Hellman algorithm for elliptic curves. It is actually a key-agreement protocol, more than an encryption algorithm. This basically means that ECDH defines (to some extent) how keys should be generated and exchanged between parties. How to actually encrypt data using such keys is up to us.
The problem it solves is the following: two parties (the usual Alice and Bob) want to exchange information securely, so that a third party (the Man In the Middle) may intercept them, but may not decode them. This is one of the principles behind TLS, just to give you an example.
Here’s how it works: first, Alice and Bob each generate their own private and public key. Alice’s private key is dA and her public key is HA = dAG; similarly, Bob’s keys are dB and HB = dBG. Note that both Alice and Bob are using the same domain parameters: the same base point G on the same curve.
Alice and Bob exchange their public keys HA and HB over an insecure channel. The Man In the Middle would intercept HA and HB, but won’t be able to find out either dA or dB without solving the discrete logarithm problem.
Alice calculates S = dAHB (using her own private key and Bob’s public key), and Bob calculates S = dBHA (using his own private key and Alice’s public key). Note that S is the same for both Alice and Bob, in fact:

S = dAHB = dA(dBG) = dB(dAG) = dBHA
The Man In the Middle, however, only knows HA and HB (together with the other domain parameters) and would not be able to find out the shared secret S. This is known as the Diffie-Hellman problem, which can be stated as follows:
Given three points P, aP and bP, what is the result of abP?
Or, equivalently:
Given three integers k, k^x and k^y, what is the result of k^(xy)?
(The latter form is used in the original Diffie-Hellman algorithm, based on modular arithmetic.)

The principle behind the Diffie-Hellman problem is also explained in a great YouTube video by Khan Academy, which later explains the Diffie-Hellman algorithm applied to modular arithmetic (not to elliptic curves).
The Diffie-Hellman problem for elliptic curves is assumed to be a “hard” problem. It is believed to be as “hard” as the discrete logarithm problem, although no mathematical proofs are available. What we can tell for sure is that it can’t be “harder”, because solving the logarithm problem is a way of solving the Diffie-Hellman problem.
Now that Alice and Bob have obtained the shared secret, they can exchange data with symmetric encryption.
For example, they can use the x coordinate of S as the key to encrypt messages using secure ciphers like AES or 3DES. This is more or less what TLS does; the difference is that TLS concatenates the x coordinate with other numbers related to the connection and then computes a hash of the resulting byte string.
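As a minimal, self-contained sketch, here is the whole exchange on a tiny curve: y^2 = x^3 + 2x + 3 (mod 97) with base point G = (3, 6) generating a subgroup of order 5 (the same toy curve used for the tortoise-and-hare picture). The point arithmetic is the textbook group law, not production code:

```python
import random

# Toy curve: y^2 = x^3 + 2x + 3 (mod 97), G = (3, 6), subgroup order n = 5.
p, a = 97, 2
G, n = (3, 6), 5

def add(P, Q):
    """Group law; None represents the point at infinity."""
    if P is None: return Q
    if Q is None: return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                                   # P + (-P) = 0
    if P == Q:
        m = (3 * x1 * x1 + a) * pow(2 * y1, -1, p)    # tangent slope
    else:
        m = (y2 - y1) * pow(x2 - x1, -1, p)           # chord slope
    x3 = (m * m - x1 - x2) % p
    return x3, (m * (x1 - x3) - y1) % p

def mul(k, P):
    """Scalar multiplication by double-and-add."""
    R = None
    while k:
        if k & 1:
            R = add(R, P)
        P = add(P, P)
        k >>= 1
    return R

# Alice and Bob each pick a private key and publish the matching public key.
dA = random.randrange(1, n); HA = mul(dA, G)
dB = random.randrange(1, n); HB = mul(dB, G)

# Both sides derive the same shared secret: dA(dB G) = dB(dA G).
assert mul(dA, HB) == mul(dB, HA)
```

The final assertion is exactly the identity S = dAHB = dBHA described above.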
I’ve created another Python script for computing public/private keys and shared secrets over an elliptic curve.
Unlike all the examples we have seen till now, this script makes use of a standardized curve, rather than a simple curve on a small field. The curve I’ve chosen is secp256k1, from SECG (the “Standards for Efficient Cryptography Group”, founded by Certicom). This same curve is also used by Bitcoin for digital signatures. Here are the domain parameters:
(These numbers were taken from OpenSSL source code.)
Of course, you are free to modify the script to use other curves and domain parameters, just be sure to use prime fields and curves in Weierstrass normal form, otherwise the script won’t work.
The script is really simple and includes some of the algorithms we have described so far: point addition, double and add, ECDH. I recommend reading and running it. It will produce output like this:
Curve: secp256k1
Alice's private key: 0xe32868331fa8ef0138de0de85478346aec5e3912b6029ae71691c384237a3eeb
Alice's public key: (0x86b1aa5120f079594348c67647679e7ac4c365b2c01330db782b0ba611c1d677, 0x5f4376a23eed633657a90f385ba21068ed7e29859a7fab09e953cc5b3e89beba)
Bob's private key: 0xcef147652aa90162e1fff9cf07f2605ea05529ca215a04350a98ecc24aa34342
Bob's public key: (0x4034127647bb7fdab7f1526c7d10be8b28174e2bba35b06ffd8a26fc2c20134a, 0x9e773199edc1ea792b150270ea3317689286c9fe239dd5b9c5cfd9e81b4b632)
Shared secret: (0x3e2ffbc3aa8a2836c1689e55cd169ba638b58a3a18803fcf7de153525b28c3cd, 0x43ca148c92af58ebdb525542488a4fe6397809200fe8c61b41a105449507083)
Some of you may have heard of ECDHE instead of ECDH. The “E” in ECDHE stands for “Ephemeral” and refers to the fact that the keys exchanged are temporary, rather than static.
ECDHE is used, for example, in TLS, where both the client and the server generate their public-private key pair on the fly, when the connection is established. The keys are then signed with the TLS certificate (for authentication) and exchanged between the parties.
The scenario is the following: Alice wants to sign a message with her private key (dA), and Bob wants to validate the signature using Alice’s public key (HA). Nobody but Alice should be able to produce valid signatures. Everyone should be able to check signatures.
Again, Alice and Bob are using the same domain parameters. The algorithm we are going to see is ECDSA, a variant of the Digital Signature Algorithm applied to elliptic curves.
ECDSA works on the hash of the message, rather than on the message itself. The choice of the hash function is up to us, but it should be obvious that a cryptographically-secure hash function should be chosen. The hash of the message ought to be truncated so that the bit length of the hash is the same as the bit length of n (the order of the subgroup). The truncated hash is an integer and will be denoted as z.
The algorithm performed by Alice to sign the message works as follows:

1. Take a random integer k chosen from {1, …, n − 1} (where n is still the subgroup order).
2. Calculate the point P = kG (where G is the base point of the subgroup).
3. Calculate the number r = xP mod n (where xP is the x coordinate of P). If r = 0, choose another k and try again.
4. Calculate s = k^-1 (z + rdA) mod n (where dA is Alice’s private key). If s = 0, choose another k and try again.
The pair (r, s) is the signature.

In plain words, this algorithm first generates a secret (k). This secret is hidden in r thanks to point multiplication (that, as we know, is “easy” one way, and “hard” the other way round). r is then bound to the message hash by the equation s = k^-1 (z + rdA) mod n.
Note that in order to calculate s, we have computed the inverse of k modulo n. We have already said in the previous post that this is guaranteed to work only if n is a prime number. If a subgroup has a non-prime order, ECDSA can’t be used. It’s not by chance that almost all standardized curves have a prime order, and those that have a non-prime order are unsuitable for ECDSA.
In order to verify the signature we’ll need Alice’s public key HA, the (truncated) hash z and, obviously, the signature (r, s). The algorithm is as follows:

1. Calculate the integer u1 = s^-1 z mod n.
2. Calculate the integer u2 = s^-1 r mod n.
3. Calculate the point P = u1G + u2HA.

The signature is valid only if r = xP mod n (where xP is the x coordinate of P).
The logic behind this algorithm may not seem obvious at first sight; however, if we put together all the equations we have written so far, things will be clearer.
Let’s start from P = u1G + u2HA. We know, from the definition of public key, that HA = dAG (where dA is the private key). We can write:

P = u1G + u2HA = u1G + u2dAG = (u1 + u2dA)G
Using the definitions of u1 and u2, we can write:

P = (u1 + u2dA)G = (s^-1 z + s^-1 rdA)G = s^-1 (z + rdA)G
Here we have omitted “mod n” both for brevity, and because the cyclic subgroup generated by G has order n, hence “mod n” is superfluous.
Previously, we had defined s = k^-1 (z + rdA) mod n. Multiplying each side of the equation by k and dividing by s, we get: k = s^-1 (z + rdA) mod n. Substituting this result into our equation for P, we get:

P = s^-1 (z + rdA)G = kG
This is the same equation for P we had at step 2 of the signature generation algorithm! When generating signatures and when verifying them, we are calculating the same point P, just with a different set of equations. This is why the algorithm works.
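To make both algorithms concrete, here is a compressed toy implementation on the same tiny curve used earlier in the series (y^2 = x^3 + 2x + 3 mod 97, G = (3, 6), n = 5): absurdly small and insecure, but it exercises every step above. The fixed private key d = 2 and the “hash” z = 3 are arbitrary demo choices:

```python
import random

p, a = 97, 2
G, n = (3, 6), 5        # tiny subgroup: fine for a demo, useless for security

def add(P, Q):
    if P is None: return Q
    if Q is None: return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                                   # point at infinity
    if P == Q:
        m = (3 * x1 * x1 + a) * pow(2 * y1, -1, p)
    else:
        m = (y2 - y1) * pow(x2 - x1, -1, p)
    x3 = (m * m - x1 - x2) % p
    return x3, (m * (x1 - x3) - y1) % p

def mul(k, P):
    R = None
    while k:
        if k & 1: R = add(R, P)
        P = add(P, P)
        k >>= 1
    return R

def sign(z, d):
    while True:
        k = random.randrange(1, n)          # the secret nonce
        x, _ = mul(k, G)
        r = x % n
        if r == 0: continue                 # bad k: retry
        s = pow(k, -1, n) * (z + r * d) % n
        if s: return r, s                   # s == 0 would also force a retry

def verify(z, sig, H):
    r, s = sig
    u1, u2 = pow(s, -1, n) * z % n, pow(s, -1, n) * r % n
    P = add(mul(u1, G), mul(u2, H))         # P = u1*G + u2*H
    return P is not None and P[0] % n == r

d = 2                       # demo private key
H = mul(d, G)
sig = sign(3, d)            # z = 3 stands in for the truncated hash
assert verify(3, sig, H)
assert not verify(2, sig, H)    # a different message fails
```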
Of course, I’ve created a Python script for signature generation and verification. The code shares some parts with the ECDH script, in particular the domain parameters and the public/private key pair generation algorithm.
Here is the kind of output produced by the script:
Curve: secp256k1
Private key: 0x9f4c9eb899bd86e0e83ecca659602a15b2edb648e2ae4ee4a256b17bb29a1a1e
Public key: (0xabd9791437093d377ca25ea974ddc099eafa3d97c7250d2ea32af6a1556f92a, 0x3fe60f6150b6d87ae8d64b78199b13f26977407c801f233288c97ddc4acca326)
Message: b'Hello!'
Signature: (0xddcb8b5abfe46902f2ac54ab9cd5cf205e359c03fdf66ead1130826f79d45478, 0x551a5b2cd8465db43254df998ba577cb28e1ee73c5530430395e4fba96610151)
Verification: signature matches
Message: b'Hi there!'
Verification: invalid signature
Message: b'Hello!'
Public key: (0xc40572bb38dec72b82b3efb1efc8552588b8774149a32e546fb703021cf3b78a, 0x8c6e5c5a9c1ea4cad778072fe955ed1c6a2a92f516f02cab57e0ba7d0765f8bb)
Verification: invalid signature
As you can see, the script first signs a message (the byte string “Hello!”), then verifies the signature. Afterwards, it tries to verify the same signature against another message (“Hi there!”) and verification fails. Lastly, it tries to verify the signature against the correct message, but using another random public key and verification fails again.
When generating ECDSA signatures, it is important to keep the secret k really secret. If we used the same k for all signatures, or if our random number generator were somewhat predictable, an attacker would be able to find out the private key!
This is the kind of mistake made by Sony a few years ago. Basically, the PlayStation 3 game console can run only games signed by Sony with ECDSA. This way, if I wanted to create a new game for PlayStation 3, I couldn’t distribute it to the public without a signature from Sony. The problem is: all the signatures made by Sony were generated using a static k.
(Apparently, Sony’s random number generator was inspired by either XKCD or Dilbert.)
In this situation, we could easily recover Sony’s private key dS by buying just two signed games, extracting their hashes (z1 and z2) and their signatures ((r1, s1) and (r2, s2)), together with the domain parameters. Here’s how:

Note first that r1 = r2 (let’s call it r): the same static k produces the same point kG, hence the same r. Then:

s1 = k^-1 (z1 + rdS) mod n
s2 = k^-1 (z2 + rdS) mod n
s1 − s2 = k^-1 (z1 − z2) mod n
k = (z1 − z2)(s1 − s2)^-1 mod n
The last equation lets us calculate k using only two hashes and their corresponding signatures. Now we can extract the private key using the equation for s:

dS = r^-1 (s1 k − z1) mod n
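The recovery can be checked with plain modular arithmetic, no curve needed. All the numbers below (n, dS, k, r, z1, z2) are made up for the demo; r stands for the x coordinate of kG mod n, identical in both signatures precisely because k is static:

```python
# n plays the subgroup order, dS the private key, k the static nonce.
n = 7919                       # a prime, standing in for the subgroup order
dS, k, r = 1234, 567, 999
z1, z2 = 111, 222              # the hashes of the two signed games

# Sony's side: two signatures generated with the same k (hence the same r).
s1 = pow(k, -1, n) * (z1 + r * dS) % n
s2 = pow(k, -1, n) * (z2 + r * dS) % n

# Attacker's side: recover k from the difference, then the private key.
k_rec = (z1 - z2) * pow(s1 - s2, -1, n) % n
d_rec = pow(r, -1, n) * (s1 * k_rec - z1) % n
assert (k_rec, d_rec) == (k, dS)
```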
Similar techniques may be employed if k is not static but predictable in some way.
I really hope you enjoyed what I’ve written here. As usual, don’t hesitate to leave a comment or send me a poke if you need help with something.
Next week I’ll publish the fourth and last article of this series. It’ll be about techniques for solving discrete logarithms, some important problems of Elliptic Curve cryptography, and how ECC compares with RSA. Don’t miss it!