Brute Force Search with Linear-Encoding Cryptography
Brute force search involves trying every single possibility.
This post is part of a series.
An encoding function maps a string to a sequence of numbers. For example, you have already implemented the trivial encoding function, defined as follows:
' ' --> 0
'a' --> 1
'b' --> 2
...
'z' --> 26
There are many different types of encoding functions, but here we will focus on linear encoding functions. For example, using a linear encoding function $f(x) = 2x+3,$ we can encode the message 'a cat'
as follows:
Original message: 'a cat'
Trivial encoding: [1, 0, 3, 1, 20]
Linear encoding: [2*1+3, 2*0+3, 2*3+3, 2*1+3, 2*20+3]
= [5, 3, 9, 5, 43]
Recovering a Message
If we want to recover our message from the linear encoding, we first solve for the inverse of the encoding function, i.e. the decoding function:
Then, we apply the decoding function to the encoded message:
Linear encoding: [5, 3, 9, 5, 43]
Trivial encoding: [(5-3)/2, (3-3)/2, (9-3)/2, (5-3)/2, (43-3)/2]
= [1, 0, 3, 1, 20]
Original message: 'a cat'
Recovering a Message with Multiple Possible Encoding Functions
Now, suppose we have a message [3, 1, 9, 31, 15]
and we know that this message was encoded using a linear encoding function $f(x) = ax+b$ with $a \in \lbrace 1, 2 \rbrace$ and $b \in \lbrace 1, 2 \rbrace.$ Can we break the code and recover the initial message?
The simplest way to do this is through brute-force search, which involves trying every single possibility. Here, there are $4$ possible encoding functions:
By inverting these encoding functions, we obtain the following decoding functions:
Let’s apply each of these decoding functions to our encoded message [3, 1, 9, 31, 15]
and see what they come up with. Remember that in order to represent a message, the results must all be integers between $0$ and $26$ inclusive.
[3, 1, 9, 31, 15]
a, b
[(3-b)/a, (1-b)/a, (9-b)/a, (31-b)/a, (15-b)/a]
a=1, b=1
[(3-1)/1, (1-1)/1, (9-1)/1, (31-1)/1, (15-1)/1]
[2, 0, 8, 30, 14]
30 is too big
does not represent a message
a=1, b=2
[(3-2)/1, (1-2)/1, (9-2)/1, (31-2)/1, (15-2)/1]
[1, -1, 7, 29, 13]
-1 is too small, 29 is too big
does not represent a message
a=2, b=1
[(3-1)/2, (1-1)/2, (9-1)/2, (31-1)/2, (15-1)/2]
[1, 0, 4, 15, 7]
message: $\'$a dog$\'$
a=2, b=2
[(3-2)/2, (1-2)/2, (9-2)/2, (31-2)/2, (15-2)/2]
[0.5, -0.5, 3.5, 14.5, 6.5]
contains non-integer entries
does not represent a message
We conclude that the original message was 'a dog'
and that it was encoded using $a=2$ and $b=1.$
Exercises
As usual, be sure to include a variety of tests.
- Write a function
encode_string(string, a, b)
that encodes thestring
using the linear encoding function $f(x) = ax+b$ and returns the resulting array of numbers. - Write a function
decode_numbers(numbers, a, b)
that attempts to decode thenumbers
array under the assumption that the encoding function was $f(x) = ax+b.$ If the numbers represent a message, then return the string corresponding to that message. Otherwise, returnFalse
. - Write a script to decode the message
[377, 717, 71, 513, 105, 921, 581, 547, 547, 105, 377, 717, 241, 71, 105, 547, 71, 377, 547, 717, 751, 683, 785, 513, 241, 547, 751]
, given that it was encoded with a linear encoding function $f(x)=ax+b$ where $a$ and $b$ are both integers between $0$ and $100$ inclusive. Be sure to print out all valid messages along with the values of $a$ and $b$ that generated them. Note that although there may be more than one valid message, only one will contain real words.
This post is part of a series.