The previous post on integrity left open the problem of checking whether a message came from its original source, in other words, whether it was sent by an entity that we treat as trusted. To determine this we need authentication, which is the process of confirming that a declared characteristic of an entity is true. In cryptography this is implemented by, among others, digital signatures and message authentication codes (MAC). This post discusses the second of these techniques in more detail.
Message authentication code
A MAC is a function that takes a message and a key as input and produces a value that depends on both of them. This means that, for a given message, it can only be computed by parties who hold the key. Security therefore rests on keeping the key secret: it must be agreed or exchanged in a way that guarantees its confidentiality. I will write about how to meet this requirement in a future post.
How to use MAC
A message secured with a MAC (colloquially, a "MACed" message) is sent to the recipient in plaintext, with the message authentication code computed under the key attached to it. The recipient independently calculates the MAC from the received message and their own copy of the key, and then compares the result with the received value. If the two are equal, the message was prepared by a party holding the key and can be passed on for further processing. If they differ, either the message was altered in transit or the MAC was calculated with a different key. Regardless of the reason, we do not process such a message further, because we cannot verify its authenticity. It can be said that the sender uses the key to sign the message, while the receiver uses the same key to verify it.
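The receiver's side of this exchange can be sketched as follows. This is a minimal illustration, not a complete protocol: the class name MacVerifier and its methods are hypothetical, HMAC-SHA256 is assumed as the MAC, and the comparison through MessageDigest.isEqual (instead of a plain array comparison) is an addition of this sketch, intended to avoid leaking information through timing.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper; names and signatures are illustrative only.
class MacVerifier {

    // Computes HMAC-SHA256 of the message under the shared key.
    static byte[] computeMac(byte[] key, byte[] message) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(message);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable or key rejected", e);
        }
    }

    // Receiver side: recompute the MAC and compare it with the received value.
    static boolean verify(byte[] key, byte[] message, byte[] receivedMac) {
        byte[] expected = computeMac(key, message);
        return MessageDigest.isEqual(expected, receivedMac);
    }

    public static void main(String[] args) {
        byte[] key = "example-shared-key".getBytes(StandardCharsets.UTF_8);
        byte[] message = "transfer 100 to Alice".getBytes(StandardCharsets.UTF_8);
        byte[] sentMac = computeMac(key, message);

        // An unmodified message verifies; a modified one does not.
        byte[] tampered = "transfer 900 to Alice".getBytes(StandardCharsets.UTF_8);
        System.out.println("original accepted: " + verify(key, message, sentMac));
        System.out.println("tampered accepted: " + verify(key, tampered, sentMac));
    }
}
```

The key is hard-coded here only to keep the sketch short; a real system would obtain it from a protected source.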
This alone is not enough to make a solution based on message authentication codes secure. An attacker can intercept a message along with its calculated code and... send it again without any changes. Such an attack is called a replay attack. To prevent it, the system must not accept the same message twice. Each message should be different, which can be achieved by adding, for example, a unique message identifier covered by the MAC. This can be an ascending counter, such as the next message number, or the current time. In this case, the system refuses messages marked with a value less than or equal to that of the last processed one. Messages can also be given random, unique identifiers. With this solution, it is necessary to keep a database of the identifiers of messages already processed. One-time keys can also be used, so that the same message never yields the same MAC twice. Implementing this is more complicated, because the keys must be securely exchanged or agreed beforehand, or appropriate techniques must be used to derive them.
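The counter-based variant can be sketched as below. The class ReplayGuard and its methods are hypothetical names chosen for this illustration, and HMAC-SHA256 is assumed as the MAC. The important detail is that the counter is included in the MAC input, so an attacker cannot simply raise the counter on an intercepted message.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical receiver-side guard; an illustration, not a complete protocol.
class ReplayGuard {
    private final byte[] key;
    private long lastAccepted = -1; // highest counter processed so far

    ReplayGuard(byte[] key) {
        this.key = key.clone();
    }

    // MAC over (counter || payload): the counter is authenticated with the data.
    static byte[] tag(byte[] key, long counter, byte[] payload) {
        try {
            ByteBuffer input = ByteBuffer.allocate(Long.BYTES + payload.length);
            input.putLong(counter).put(payload);
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(input.array());
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable or key rejected", e);
        }
    }

    // Accepts a message only if its MAC is valid and its counter is strictly newer.
    synchronized boolean accept(long counter, byte[] payload, byte[] receivedTag) {
        if (!MessageDigest.isEqual(tag(key, counter, payload), receivedTag)) {
            return false; // altered message or wrong key
        }
        if (counter <= lastAccepted) {
            return false; // replay of an already processed message
        }
        lastAccepted = counter;
        return true;
    }

    public static void main(String[] args) {
        byte[] key = "example-shared-key".getBytes(StandardCharsets.UTF_8);
        byte[] payload = "open the door".getBytes(StandardCharsets.UTF_8);
        ReplayGuard receiver = new ReplayGuard(key);

        byte[] mac = tag(key, 1L, payload);
        System.out.println("first delivery accepted: " + receiver.accept(1L, payload, mac));
        System.out.println("replayed copy accepted:  " + receiver.accept(1L, payload, mac));
    }
}
```

A counter needs only one stored number per sender, whereas random identifiers require remembering every identifier already seen.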
An attacker can reuse intercepted information not only by sending it again to the original recipient, but also by sending it back to the sender as if it were a reply. This scenario can be avoided with the techniques discussed above, or by using two different keys to calculate the MAC, one for each direction of communication.
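The two-key approach can be sketched as follows. The class DirectionalMac, the client and server roles, and the method names are all hypothetical; the sketch assumes HMAC-SHA256 and that both keys were agreed securely beforehand.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical illustration: one MAC key per direction of communication.
class DirectionalMac {
    private final byte[] clientToServerKey;
    private final byte[] serverToClientKey;

    DirectionalMac(byte[] clientToServerKey, byte[] serverToClientKey) {
        this.clientToServerKey = clientToServerKey.clone();
        this.serverToClientKey = serverToClientKey.clone();
    }

    private static byte[] hmac(byte[] key, byte[] message) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(message);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable or key rejected", e);
        }
    }

    // The client authenticates outgoing messages with the client-to-server key.
    byte[] tagAsClient(byte[] message) {
        return hmac(clientToServerKey, message);
    }

    // The server verifies incoming messages with the client-to-server key.
    boolean serverAccepts(byte[] message, byte[] tag) {
        return MessageDigest.isEqual(hmac(clientToServerKey, message), tag);
    }

    // The client verifies incoming messages with the server-to-client key only.
    boolean clientAccepts(byte[] message, byte[] tag) {
        return MessageDigest.isEqual(hmac(serverToClientKey, message), tag);
    }

    public static void main(String[] args) {
        DirectionalMac channel = new DirectionalMac(
                "key-for-client-to-server".getBytes(StandardCharsets.UTF_8),
                "key-for-server-to-client".getBytes(StandardCharsets.UTF_8));
        byte[] message = "balance?".getBytes(StandardCharsets.UTF_8);
        byte[] tag = channel.tagAsClient(message);

        System.out.println("server accepts: " + channel.serverAccepts(message, tag));
        // The same message reflected back to the client fails verification.
        System.out.println("client accepts reflection: " + channel.clientAccepts(message, tag));
    }
}
```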
HMAC algorithm
To calculate the message authentication code, we can use the hash functions we already know. Given their properties, one may be tempted to calculate the MAC as $$MAC_K(M) = H(K | M)$$ (the operator $$|$$ here denotes the concatenation of two strings). However, depending on the details of the hash function used, this can allow an attacker who does not know the key to calculate a valid MAC starting from the message $$M$$. This is because many hash functions, including SHA-256 and SHA-512 from the SHA-2 family, work in an iterative manner. This means that the value calculated for a given $$M$$ is the starting point for calculating the hash of an extended message $$M | M_x$$ (in practice the extension also has to include the padding of the original input). Since an attacker can intercept a message $$M$$ along with $$H(K | M)$$, they can construct other messages of the form $$M | M_x$$ and determine $$H(K | M | M_x)$$ from the known $$H(K | M)$$. In this way, without knowing the secret key, they can manipulate messages. Not all hash functions are susceptible to this type of attack; SHA-3, for example, is not.
Instead of coming up with your own solution, use the dedicated HMAC (Keyed-Hash Message Authentication Code) algorithm. It is built on a hash function, but in a way that prevents the attack described above. HMAC derives two different keys, $$K_1$$ and $$K_2$$, from the supplied key $$K$$. It then calculates the MAC value as $$HMAC_K(M) = H(K_1 | H(K_2 | M))$$. The algorithm is described in FIPS 198-1 and in RFC 2104. The current recommendation for HMAC is to use a key of at least 112 bits and a hash function of at least SHA-256, which yields a 256-bit MAC value.
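To make the formula concrete, the sketch below builds HMAC-SHA256 by hand from a plain SHA-256 digest and compares the result with the JCA implementation. It follows RFC 2104, where $$K_1$$ is the block-sized key XORed with the byte 0x5c (opad) and $$K_2$$ is the same key XORed with 0x36 (ipad). The class name HmacByHand and its methods are hypothetical, and the code is for illustration only; in real applications the library implementation should be used.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Arrays;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical class for illustration; prefer javax.crypto.Mac in real code.
class HmacByHand {
    private static final int BLOCK_SIZE = 64; // SHA-256 input block size in bytes

    // SHA-256 over the concatenation of the given parts.
    static byte[] sha256(byte[]... parts) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            for (byte[] part : parts) {
                digest.update(part);
            }
            return digest.digest();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }

    // HMAC_K(M) = H(K1 | H(K2 | M)) with K1 = K xor opad, K2 = K xor ipad.
    static byte[] hmacSha256(byte[] key, byte[] message) {
        // Keys longer than a block are hashed first; shorter ones are zero-padded.
        byte[] base = key.length > BLOCK_SIZE ? sha256(key) : key;
        byte[] padded = Arrays.copyOf(base, BLOCK_SIZE);
        byte[] k1 = new byte[BLOCK_SIZE];
        byte[] k2 = new byte[BLOCK_SIZE];
        for (int i = 0; i < BLOCK_SIZE; i++) {
            k1[i] = (byte) (padded[i] ^ 0x5c); // outer key (opad)
            k2[i] = (byte) (padded[i] ^ 0x36); // inner key (ipad)
        }
        return sha256(k1, sha256(k2, message));
    }

    // Reference value from the JCA provider.
    static byte[] jcaHmacSha256(byte[] key, byte[] message) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(message);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable or key rejected", e);
        }
    }

    public static void main(String[] args) {
        byte[] key = "example-shared-key".getBytes(StandardCharsets.UTF_8);
        byte[] message = "example message".getBytes(StandardCharsets.UTF_8);
        boolean same = Arrays.equals(hmacSha256(key, message), jcaHmacSha256(key, message));
        System.out.println("hand-built HMAC matches JCA: " + same);
    }
}
```

Because the secret key enters both the inner and the outer hash, knowing the final value does not let an attacker continue the hash computation for an extended message.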
In Java, we can calculate the value of a MAC function using the Mac class, which acts as a factory: the name of the chosen MAC algorithm is passed as a string to Mac.getInstance. As with other algorithms, the available names are listed in the Standard Algorithm Names documentation.
The following program calculates the value of the HMAC function based on SHA-256 for an example message and key.
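```java
import java.nio.charset.StandardCharsets;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class MACTest {
    public static void main(String[] args) throws Exception {
        // Encode explicitly so the bytes do not depend on the platform default charset
        byte[] data = "abcdefghijklmnoprstuwxyz1234567890!".getBytes(StandardCharsets.UTF_8);
        byte[] key = "to-jest-tajny-klucz".getBytes(StandardCharsets.UTF_8);

        // Wrap the raw key bytes and initialize an HMAC-SHA256 instance with them
        SecretKeySpec macKey = new SecretKeySpec(key, "HmacSHA256");
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(macKey);
        // hmac holds the resulting 32-byte MAC value
        byte[] hmac = mac.doFinal(data);
    }
}
```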
The above example has a fundamental flaw: the secret key is stored in the application's source code. This is a bad solution, because it does not keep the key secure. It would require keeping the entire source code and its compiled form secret. Problems related to secure key storage will come up again in later posts and will need a proper solution.
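As one possible stopgap, and not necessarily the solution later posts will settle on, the key can be generated randomly and supplied to the application from outside the source code. The sketch below assumes the key is delivered Base64-encoded in an environment variable; the variable name MAC_KEY_BASE64 and the class MacKeySource are hypothetical. An environment variable only moves the secret out of the code; it does not by itself protect it.

```java
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper: keeps the key out of the source code.
class MacKeySource {
    static final String ENV_NAME = "MAC_KEY_BASE64"; // assumed variable name

    // Generates a random 256-bit HMAC key and returns it Base64-encoded.
    static String generateKeyBase64() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("HmacSHA256");
            generator.init(256);
            return Base64.getEncoder().encodeToString(generator.generateKey().getEncoded());
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("HmacSHA256 key generator unavailable", e);
        }
    }

    // Rebuilds the key object from its Base64 form.
    static SecretKey loadKey(String base64) {
        byte[] raw = Base64.getDecoder().decode(base64);
        return new SecretKeySpec(raw, "HmacSHA256");
    }

    public static void main(String[] args) {
        String encoded = System.getenv(ENV_NAME);
        if (encoded == null) {
            System.out.println(ENV_NAME + " is not set; generate a key once and provide it there.");
            return;
        }
        SecretKey key = loadKey(encoded);
        System.out.println("Loaded a " + (key.getEncoded().length * 8) + "-bit MAC key");
    }
}
```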
Summary
Message authentication codes provide integrity and authentication for the information being sent. However, they must be used appropriately. In particular, you should not invent your own algorithms, and you should design the communication protocol so that it prevents the typical attacks in which intercepted information is reused or extended by an unauthorized sender. The secured message is sent in plaintext, but we can be sure that no one has manipulated it. If we also want to keep it secret, we need to implement another service: confidentiality. In the next post I will present algorithms that allow us to achieve this goal.