Brute Force Search with Linear-Encoding Cryptography

by Justin Skycak (@justinskycak) on February 01, 2021

Brute force search involves trying every single possibility.

This post is part of the book Introduction to Algorithms and Machine Learning: from Sorting to Strategic Agents. Suggested citation: Skycak, J. (2021). Brute Force Search with Linear-Encoding Cryptography. In Introduction to Algorithms and Machine Learning: from Sorting to Strategic Agents. https://justinmath.com/brute-force-search-with-linear-encoding-cryptography/

Want to get notified about new posts? Join the mailing list and follow on X/Twitter.

An encoding function maps a string to a sequence of numbers. For example, you have already implemented the trivial encoding function, defined as follows:


' ' --> 0
'a' --> 1
'b' --> 2
...
'z' --> 26

There are many different types of encoding functions, but here we will focus on linear encoding functions. For example, using a linear encoding function $f(x) = 2x+3,$ we can encode the message 'a cat' as follows:


Original message: 'a cat'
Trivial encoding: [1, 0, 3, 1, 20]
Linear encoding:  [2*1+3, 2*0+3, 2*3+3, 2*1+3, 2*20+3]
                  = [5, 3, 9, 5, 43]

Recovering a Message

If we want to recover our message from the linear encoding, we first solve for the inverse of the encoding function, i.e. the decoding function:

$\begin{align*} f(x) &= 2x+3 \\[5pt] x &= 2 f^{-1}(x)+3 \\[5pt] f^{-1}(x) &= \dfrac{x-3}{2} \end{align*}$

Then, we apply the decoding function to the encoded message:


Linear encoding:  [5, 3, 9, 5, 43]
Trivial encoding: [(5-3)/2, (3-3)/2, (9-3)/2, (5-3)/2, (43-3)/2]
                  = [1, 0, 3, 1, 20]
Original message: 'a cat'

Recovering a Message with Multiple Possible Encoding Functions

Now, suppose we have a message [3, 1, 9, 31, 15] and we know that this message was encoded using a linear encoding function $f(x) = ax+b$ with $a \in \lbrace 1, 2 \rbrace$ and $b \in \lbrace 1, 2 \rbrace.$ Can we break the code and recover the initial message?

The simplest way to do this is through brute-force search, which involves trying every single possibility. Here, there are $4$ possible encoding functions:

$\begin{align*} f_{ab}(x) &= ax + b \\ \\ f_{11}(x) &= 1x + 1 \\ f_{12}(x) &= 1x + 2 \\ f_{21}(x) &= 2x + 1 \\ f_{22}(x) &= 2x + 2 \end{align*}$

By inverting these encoding functions, we obtain the following decoding functions:

$\begin{align*} f_{ab}^{-1}(x) &= \dfrac{x - b}{a} \\ \\ f_{11}^{-1}(x) &= \dfrac{x - 1}{1} \\[5pt] f_{12}^{-1}(x) &= \dfrac{x - 2}{1} \\[5pt] f_{21}^{-1}(x) &= \dfrac{x - 1}{2} \\[5pt] f_{22}^{-1}(x) &= \dfrac{x - 2}{2} \end{align*}$

Let’s apply each of these decoding functions to our encoded message [3, 1, 9, 31, 15] and see what they come up with. Remember that in order to represent a message, the results must all be integers between $0$ and $26$ inclusive.


[3, 1, 9, 31, 15]

a, b
    [(3-b)/a, (1-b)/a, (9-b)/a, (31-b)/a, (15-b)/a]

a=1, b=1
    [(3-1)/1, (1-1)/1, (9-1)/1, (31-1)/1, (15-1)/1]
    [2, 0, 8, 30, 14]
    30 is too big
    does not represent a message
 
a=1, b=2
    [(3-2)/1, (1-2)/1, (9-2)/1, (31-2)/1, (15-2)/1]
    [1, -1, 7, 29, 13]
    -1 is too small, 29 is too big
    does not represent a message

a=2, b=1
    [(3-1)/2, (1-1)/2, (9-1)/2, (31-1)/2, (15-1)/2]
    [1, 0, 4, 15, 7]
    message: $\'$a dog$\'$

a=2, b=2
    [(3-2)/2, (1-2)/2, (9-2)/2, (31-2)/2, (15-2)/2]
    [0.5, -0.5, 3.5, 14.5, 6.5]
    contains non-integer entries
    does not represent a message

We conclude that the original message was 'a dog' and that it was encoded using $a=2$ and $b=1.$

Exercises

As usual, be sure to include a variety of tests.

Write a function encode_string(string, a, b) that encodes the string using the linear encoding function $f(x) = ax+b$ and returns the resulting array of numbers.
Write a function decode_numbers(numbers, a, b) that attempts to decode the numbers array under the assumption that the encoding function was $f(x) = ax+b.$ If the numbers represent a message, then return the string corresponding to that message. Otherwise, return False.
Write a script to decode the message [377, 717, 71, 513, 105, 921, 581, 547, 547, 105, 377, 717, 241, 71, 105, 547, 71, 377, 547, 717, 751, 683, 785, 513, 241, 547, 751], given that it was encoded with a linear encoding function $f(x)=ax+b$ where $a$ and $b$ are both integers between $0$ and $100$ inclusive. Be sure to print out all valid messages along with the values of $a$ and $b$ that generated them. Note that although there may be more than one valid message, only one will contain real words.

Want to get notified about new posts? Join the mailing list and follow on X/Twitter.