Wednesday, 20 July 2016

Understanding the Advanced Encryption Standard (AES)

Advanced Encryption Standard (AES)

With concerns over DES and 3DES not providing a secure basis for the future, the US government opened a competition for a successor, and chose the variable block size Rijndael algorithm from Vincent Rijmen and Joan Daemen as the basis of the Advanced Encryption Standard (AES). In the AES standard a fixed block size of 128-bits is chosen with 128-bit, 192-bit and 256-bit key lengths, and the number of rounds to produce a result varies with the key size (i.e. 4 rounds for 128-bits, 6 rounds for 192 bits, and 8 rounds for 256-bits). Blocks are split into a matrix of four rows and four to eight columns (depending on key size) of 8-bit values called the state. The encryption algorithm consists of a number of rounds of the following four named steps, and is heavily dependent on matrix operations and the XOR operation:
1)   SubBytes – In this step, each element of the state matrix is run through a substitution box (S-box).
2)   ShiftRows  - In this step, each of the row elements is cyclically shifted via a shifting permutation box (P-box).
3)   MixColumns – In this step, another permutation box (P-box) is used to mix up the bits of each column using a function.
4)   AddRoundKey – In this step, a portion of the key schedule uses an exclusive OR (XOR) operation with the state matrix. This is the only step that makes use of the key.

The encryption steps are executed as follows:
a)    Perform AddRoundKey on the state matrix values
b)   Perform the following steps n - 1 times, where n is the number of rounds for the given key size
a.     Run SubBytes on the current state matrix
b.     Run ShiftRows on the current state matrix
c.      Run MixColumns on the current state matrix
d.     Run AddRoundKey on the current state matrix
End Round Loop
c)    Run SubBytes on the current state matrix
d)   Run ShiftRows on the current state matrix
e)    Run AddRoundKey on the current state matric

So, the first round just runs AddRoundKey, and the last round misses out the MixColumns step.

Decryption uses the some of same process steps, and some inverse process steps.
a)    Perform AddRoundKey on the state matrix values
b)   Perform the following steps n - 1 times, where n is the number of rounds for the given key size
a.     Run InverseSubBytes on the current state matrix
b.     Run InverseShiftRows on the current state matrix
c.      Run InverseMaxColumns on the current state matrix
d.     Run AddRoundKey on the current state matrix
End Round Loop
c)    Run InverseSubBytes on the current state matrix
d)   Run InverseShiftRows on the current state matrix
e)    Run AddRoundKey on the current state matric

SubBytes

The S-box for the SubBytes step for the Rijndael algorithm, essentially a lookup table, works on an inverse multiplication algorithm that transforms every one of the 256 possible byte values to another individual byte value. It can be described in many different ways, but here we will describe it with matrix multiplication at the bit level followed by an XOR of the result. Often it is implemented as a simple lookup table.

Consider the input matrix a consisting of bits 0 to 7 and the output matrix b for encryption purposes in big endian format for the forward S-box:

b7

1
1
1
1
1
0
0
0

a7

0
b6

0
1
1
1
1
1
0
0

a6

1
b5
=
0
0
1
1
1
1
1
0
x
a5
XOR
1
b4

0
0
0
1
1
1
1
1

a4

0
b3

1
0
0
0
1
1
1
1

a3

0
b2

1
1
0
0
0
1
1
1

a2

0
b1

1
1
1
0
0
0
1
1

a1

1
b0

1
1
1
1
0
0
0
1

a0

1

Note that in the core transformation matric, the values across the rows consist of five 1s followed by three zeros, shifted in each row. So, the rows at the byte level consist of the bytes 248 (0xF8), 124 (0x7C), 62 (0x3E), 31 (0x1F), 143 (0x8F), 199 (0xC7), 227 (0xE3), and 241 (0xF1); which can be implemented as a circular bit shifter, so after half way through the byte the values have 128 or 0x80 added and it should be evident the hexadecimal digits swap in the pattern half way through as a result. The values in the core transformation matrix are multiplied with the bit values from the input and then an XOR operation with [0, 1, 1, 0, 0, 0, 1, 1] is performed.

ShiftRows

The ShiftRows step performs a circular shift of the bytes at each row of the state matrix, with the four columns for 128-bit blocks, six columns for 192-bit blocks, and eight columns for 256-bit blocks. So, the first row is left as it is unchanged, the second row is rotated once to the left, the third row is rotated twice to the left, and the fourth row is shifted three times to the left. So, for 128-bit and 192-bit blocks each row is circular shifted left by n – 1 bytes, but for 256-bit blocks the offsets are zero for the first row, but one byte for the second row, three bytes for the third row, and four bytes for the fourth row in Rijndael, but this is not relevant to the AES implementation which does not use 256-bit blocks. ShiftRows is one of the steps that contributes to the statistical randomization of the ciphertext.

MixColumns

This is the most complicated part of the Rijndael/AES algorithm, and in it the four bytes of each column of the state are combined via a linear transformation such that the four bytes input result in four bytes output, each of which is dependent on each of the four input bytes. It is the matrix multiplication in the Galois Field (8) that gives this effect. The matrix is as follows to give the new entries bic from the original entries aic for each column where c is 1,2,3 and 4:

b0c

2
3
1
1

a0c
b1c
=
1
2
3
1
XOR
a1c
b2c

1
1
2
3

a2c
b3c

3
1
1
2

a3c

Essentially, we are multiplying two eight bit numbers and taking the remainder modulo 283, which is binary 100011011, and we use a binary XOR long division. We have shown the operation for one column above, but it is repeated for all four columns.

AddRoundKey

The final step, AddRoundKey, takes each 8-bit value within the state matrix with an 8-bit value of the key schedule, i.e. a bitwise XOR of the current block with a portion of the expanded key. It consists of a number of sub operations.

Block sizes are computed in terms of 32-bits that are split into 4 8-bit bytes, so a 128-bit cipher has block size of 4, a 192-bit block cipher has a block size of 6, and a 256-bit block cipher has a block size of 8.  Each of the 8-bit bytes is run through the SubBytes function and concatenated back together, and then each is cyclically rotated left 8-bits to replace each value with that on its right. A constant matrix is then used for multiplication with values [1,2,4,8,16,32,64,128,27,54,108,216]; each less than 256 to avoid overflowing a byte value. 

InvSubBytes

This merely uses an inverted version of the S-Box matrix from the SubBytes operation to reverse the operation.

For decryption purposes, an inverse S-box is used, in little endian form, and again we use matrix multiplication although usually a simple lookup table is used for implementation to give the new entries b from the original entries a:

b7

0
1
0
1
0
0
1
0

a7

0
b6

0
0
1
0
1
0
0
1

a6

0
b5
=
1
0
0
1
0
1
0
0
x
a5
XOR
0
b4

0
1
0
0
1
0
1
0

a4

0
b3

0
0
1
0
0
1
0
1

a3

0
b2

1
0
0
1
0
0
1
0

a2

1
b1

0
1
0
0
1
0
0
1

a1

0
b0

1
0
1
0
0
1
0
0

a0

1

Thus, the rows at byte level consist of the bytes 82 (0x52), 41 (0x29), 148 (0x94), 74 (0x4A), 37 (0x25), 146 (0x92), 73 (0x49), and 164 (0xA4). Again, these can be implemented as a circular bit shifter, and it should be evident the hexadecimal digits swap in the pattern half way through as a result. The values in the core transformation matrix are multiplied with the bit values from the ciphertext input and then an XOR operation with [0, 0, 0, 0, 0, 1, 0, 1] is performed.

InvShiftRows

This InvShiftRows operation simply rotates each row in the opposite direction to that in the ShiftRows, so the shift is to the right instead of to the left.

InvMixColumns

For decryption, the inverse mix columns operation just uses a different matrix to get the result:


b0c

15
11
14
9

a0c
b1c
=
9
15
11
14
XOR
a1c
b2c

14
9
15
11

a2c
b3c

11
14
9
15

a3c