2011-03-09

XOR enciphering data for medium security

Sometimes, when developing systems that send or store semi-sensitive data as strings, there is neither an easy option nor enough time to implement conventional encryption, even though the contents still need to be hidden in some way. Typical situations are when you want to send scrambled messages between servers over the internet, store data in places where others may have access to it (shared environments with nosy admins, etc.), or easily unscramble data in something like JavaScript or another language that lacks libraries for conventional encryption.



How Exclusive OR, XOR, is useful for enciphering

XOR is an operation that takes two Boolean operands (true/false, 1/0, on/off etc) and returns true if one of the operands has the value of true/1/on and the other does not:

1 XOR 0 = 1
0 XOR 1 = 1
1 XOR 1 = 0
0 XOR 0 = 0

In pseudocode we could translate a XOR b to something like (a OR b) AND NOT(a AND b). Let’s take an example where we apply XOR to a couple of bytes: 01100001 (97, the char code for ‘a’) and 01111000 (120, the char code for ‘x’). For the sake of clarity, XOR is applied once for each bit (0/1) in the byte: the first bit of the first byte is XORed with the first bit of the second byte and so on, rendering a resulting byte:

01100001 (97)
01111000 (120)
========
00011001 (25)
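
The same calculation can be checked in C#. The following is only a small sketch: the class name XorBasics and its two helper methods are made up for this illustration and are not part of .NET. It builds XOR from OR, AND and NOT as in the pseudocode, and prints a value as eight zeroes and ones:

```csharp
using System;

// Hypothetical helper class, named only for this illustration.
public static class XorBasics
{
    // XOR built from OR, AND and NOT: (a OR b) AND NOT(a AND b).
    public static int XorFromBasicOps(int a, int b)
    {
        return (a | b) & ~(a & b);
    }

    // Renders the low eight bits of a value as zeroes and ones.
    public static string ToBits(int value)
    {
        return Convert.ToString(value & 0xFF, 2).PadLeft(8, '0');
    }
}
```

Calling XorBasics.ToBits('a' ^ 'x') gives 00011001, and XorBasics.XorFromBasicOps(97, 120) gives the same 25 as the built-in ^ operator.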

The result of 97 XOR 120 is 25. Now let’s see what makes XOR so useful for enciphering strings: let’s try 25 XOR 120 and think of these zeroes and ones as an enciphered message, a password and the deciphered message:

00011001 <– enciphered message (25)
01111000 <– password (120)
========
01100001 <– message (97)

The result of 25 XOR 120 is 97. In other words, with XOR it is possible to take a bunch of bits, flip them according to another bunch of bits and then flip them back to the original value, as long as we use the same “key”!
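
The reason this works for any value is that a key XORed with itself is 0, and anything XORed with 0 is unchanged, so (m XOR k) XOR k = m. As a quick sanity check, here is a sketch that tries the round trip for every possible byte value; the class and method names are invented for this example:

```csharp
// Hypothetical helper class, named only for this illustration.
public static class XorRoundTrip
{
    // Returns true if XOR-ing with the same key twice restores
    // every possible byte value (0 to 255).
    public static bool RestoresAllBytes(byte key)
    {
        for (int value = 0; value <= 255; value++)
        {
            int enciphered = value ^ key;      // flip bits according to the key
            int deciphered = enciphered ^ key; // flip the same bits back
            if (deciphered != value)
            {
                return false;
            }
        }

        return true;
    }
}
```

XorRoundTrip.RestoresAllBytes(120) should return true, and so should any other key.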

Implementing simple XOR enciphering for strings in C#


In C# the bitwise XOR operator is denoted by the ^ character and is predefined for all integral types and Boolean expressions. The .NET String class has no ^ XOR operator defined, since a string is neither a Boolean expression nor an integral type. However, a string can be thought of as a list or an array of char, and char is an integral type. It is not possible to add or overload operators on classes like String in .NET (though it would have been nice if it were), so for this example I have chosen to implement Xor as an extension method on String:
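using System.Text;

// Extension methods must live in a static class; the class name here is arbitrary.
public static class StringXorExtensions
{
    // XORs each char of thisString with the matching char of key,
    // repeating the key when it is shorter than the string.
    // Note: a null or empty key yields an empty result.
    public static string Xor(this string thisString, string key)
    {
        StringBuilder result = new StringBuilder();

        if (!string.IsNullOrEmpty(key))
        {
            int thisLen = thisString.Length;
            int keyLen = key.Length;

            for (int i = 0; i < thisLen; ++i)
            {
                // char ^ char is evaluated as int, so cast back to char.
                result.Append((char)(thisString[i] ^ key[i % keyLen]));
            }
        }

        return result.ToString();
    }
}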
With this simple extension in place (there is probably room for improvement), a string can be XOR enciphered with another string as the key using the following syntax:
string message = "Message in a bottle";
string key = "a secret key";
string enciphered = message.Xor(key);
string decipheredMessage = enciphered.Xor(key);
As you can tell, deciphering the message is as easy as running the Xor method on the enciphered string with the same key that was used when enciphering. However, I really must stress that this cipher has several weaknesses and must never be regarded as completely safe. The two most important weaknesses are the following:
  • An attacker can try a lot of keys on the enciphered message (brute force). The shorter the key the faster the enciphered message will be cracked.
  • An enciphered message can be deciphered using frequency analysis, especially if the attacker knows what language the original message is written in. Simple statistics about how often each letter of the alphabet is used are compared to the enciphered text to figure out the length of the key; from there the attacker can work backwards to learn the key as well as the contents of the message.
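
To get a feeling for how little effort both attacks take, here is a rough sketch for the simplest case, a key that is a single ASCII character. Everything in it is an assumption made for illustration: the class name, the crude scoring rule (count spaces and a handful of common English letters) and the limit to 128 candidate keys. Against a longer key an attacker would typically first guess the key length and then run the same idea on every n-th character.

```csharp
using System;
using System.Text;

// Hypothetical attack sketch; all names and the scoring rule are illustrative.
public static class XorAttackSketch
{
    // Repeating-key XOR, reimplemented here so the sketch is self-contained.
    public static string Xor(string text, string key)
    {
        var result = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            result.Append((char)(text[i] ^ key[i % key.Length]));
        }
        return result.ToString();
    }

    // Crude "looks like English" score: counts spaces and common letters.
    public static int Score(string candidate)
    {
        int score = 0;
        foreach (char c in candidate)
        {
            if (" etaoinshr".IndexOf(char.ToLowerInvariant(c)) >= 0)
            {
                score++;
            }
        }
        return score;
    }

    // Brute force: tries every one-character ASCII key and keeps the
    // first one whose deciphered text scores highest.
    public static char GuessSingleCharKey(string enciphered)
    {
        char bestKey = '\0';
        int bestScore = -1;
        for (int k = 0; k < 128; k++)
        {
            int score = Score(Xor(enciphered, ((char)k).ToString()));
            if (score > bestScore)
            {
                bestScore = score;
                bestKey = (char)k;
            }
        }
        return bestKey;
    }
}
```

For example, XorAttackSketch.GuessSingleCharKey(XorAttackSketch.Xor("Message in a bottle", "x")) recovers the key 'x' after at most 128 attempts. A scoring rule this crude can of course guess wrong on very short or unusual messages.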
Luckily there are easy ways of making the above at least a little bit harder. For the key, like any password, make it long and include at least a couple of non-alphanumeric characters. Frequency analysis relies on the enciphered message containing actual language. If the message is encoded in something like Base64 before being enciphered, the most basic forms of frequency analysis become less useful, although Base64 has a small, well-known alphabet of its own, so this is obfuscation rather than real protection. As a side note, keep in mind that Base64 makes the message grow by roughly a third, since every three bytes are encoded as four characters. Encoding the message in a format that is not natural language also makes it harder for a brute-force attacker to recognize a correct guess:

string message = "Message in a bottle";
string key = "a secret key";
 
// ToBase64() and FromBase64() are imaginary String extension methods.
// The result is Base64 encoded once more after enciphering, since
// the enciphered string will contain a lot of odd chars that might
// not work well in HTTP requests, XML storage etc.
string enciphered = message.ToBase64().Xor(key).ToBase64();
string decipheredMessage = enciphered.FromBase64().Xor(key).FromBase64();
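
The imaginary ToBase64() and FromBase64() methods could look something like the sketch below. The class name is made up, and the choice of UTF-8 is an assumption. Note also that this only round-trips reliably when the XOR result contains valid characters, which holds when both the Base64 text and the key are plain ASCII; a key with unusual Unicode characters could produce chars that UTF-8 cannot represent.

```csharp
using System;
using System.Text;

// Hypothetical extension class; UTF-8 is an assumed encoding choice.
public static class Base64StringExtensions
{
    // Encodes the string's UTF-8 bytes as Base64 text.
    public static string ToBase64(this string text)
    {
        return Convert.ToBase64String(Encoding.UTF8.GetBytes(text));
    }

    // Decodes Base64 text back to a string, again assuming UTF-8.
    public static string FromBase64(this string base64)
    {
        return Encoding.UTF8.GetString(Convert.FromBase64String(base64));
    }
}
```

With these in place, "Message in a bottle".ToBase64() turns the 19 character message into 28 characters of Base64, and FromBase64() turns it back.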

There are of course a lot of other things that can be done to keep prying eyes from reading your XOR enciphered messages, but keep in mind that if you plan to put time and effort into securing your data you should probably look at implementing real encryption such as AES, or RSA if you need public-key encryption.
