In the interests of keeping people entertained, here's a small code challenge. It's of moderate difficulty and will probably take between 1 and 6 hours, depending on your skill and the language you choose.
If people like this idea I might post more in the future. I'll attempt to use problems that are not too CPU-bound so those of us who favor slower languages won't be hindered.
The first(?) problem: http://www.wakachan.org/challenge/crypt.xor
Post a code snippet that can decrypt it. I may give a hint or two if people are stuck. Take your time, this isn't a race, it's just for fun.
BTW, props to http://www.arcanum.co.nz/ where I crossed paths with the problem. I highly recommend visiting.
Give us a hint... What is the output supposed to be? Text? A number sequence?
Text (including a space).
Quick preliminary analysis of the file (taking the name as a hint) suggests one ASCII text, XORed with another ASCII string.
Furthermore, the key string is probably six characters in length.
WAHa's correct on both counts.
SPOILERS FOLLOW
.
.
.
.
.
.
.
.
.
.
.
.
.
.
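#!perl
use strict;
# Read the whole ciphertext as raw bytes.
my $data;
open FILE,"crypt.xor" or die;
binmode FILE;
$data.=$_ while(<FILE>);
close FILE;
# Convert message and key to byte values, then XOR each message byte
# with the corresponding byte of the repeating key.
my @message=map { ord $_ } split //,$data;
my @key=map { ord $_ } split //,"xorkey";
print chr ($message[$_]^$key[$_%@key]) for(0..$#message);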
I guessed it was XOR encryption from the file name. This was further confirmed by looking at a histogram of the bytes, where the majority were in the 0-31 range (I actually used Photoshop for this).
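The histogram check doesn't need an image editor. Here is a minimal sketch in Python (not the tool used above; the function names `byte_histogram` and `low_byte_fraction` are made up for illustration). The reasoning it relies on: XORing two lowercase ASCII letters cancels their shared high bits, so the result always falls in the 0-31 range.

```python
from collections import Counter

def byte_histogram(data: bytes) -> Counter:
    """Count how often each byte value occurs in the data."""
    return Counter(data)

def low_byte_fraction(data: bytes, limit: int = 32) -> float:
    """Return the fraction of bytes whose value is below `limit`."""
    if not data:
        return 0.0
    return sum(1 for b in data if b < limit) / len(data)

# Small in-memory demonstration: lowercase text XORed with lowercase text.
demo = bytes(a ^ b for a, b in zip(b"hello", b"xorke"))
print(sorted(byte_histogram(demo).items()))
print(low_byte_fraction(demo))
```

To try it on the challenge file, read it with `open("crypt.xor", "rb").read()` and pass the bytes to `low_byte_fraction`.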
The next step was to find the key length, so I read up on the Vigenere cipher because I recalled there was a good method to find the key length of those, and they're closely related to this kind of XOR encryption. I constructed a small program to do coincidence counting, which is what you use to find the key length of Vigenere ciphers.
#!perl
use strict;
my $data;
open FILE,"crypt.xor" or die;
binmode FILE;
$data.=$_ while(<FILE>);
close FILE;
for(my $i=1;$i<length $data;$i++)
{
    my $coincidences=0;
    for(my $j=0;$j<(length $data)-$i;$j++)
    {
        $coincidences++ if ((substr $data,$j,1) eq (substr $data,$j+$i,1));
    }
    print "$i: $coincidences\n";
}
The rate of coincidences was very high for all multiple-of-6 offsets, which suggested a 6-character key. I proceeded to try a dictionary attack. I decoded the text using all six-letter words in the dictionary, and searched for the string "the" in the output. This gave me a list of possible candidates, none of which were correct. However, some were close enough for me to figure out the first letters of the message from context, and that gave me the key right away.
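For anyone who wants to try the dictionary step themselves, here is a rough sketch in Python. It is not the program used above, just one way the described attack could look. The names `xor_with_key` and `dictionary_attack` are invented for this example, and the word list is whatever iterable of strings you have on hand.

```python
from itertools import cycle

def xor_with_key(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the repeating key."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

def dictionary_attack(data, words, key_len=6, crib=b"the"):
    """Try every word of the right length as the key.

    Returns (word, decoded) pairs whose decoded text contains the crib.
    """
    candidates = []
    for word in words:
        word = word.strip().lower()
        # Skip words of the wrong length and anything that is not plain ASCII.
        if len(word) != key_len or not word.isascii():
            continue
        decoded = xor_with_key(data, word.encode("ascii"))
        if crib in decoded:
            candidates.append((word, decoded))
    return candidates

# In-memory demonstration with a made-up message and key.
secret = xor_with_key(b"the quick brown fox jumps over the lazy dog", b"banana")
for word, decoded in dictionary_attack(secret, ["apple", "banana", "cherry"]):
    print(word, decoded)
```

A crib as short as "the" will also match by accident, so expect false positives that need checking by eye. A word that shares letters with the real key decodes the matching positions of every six-byte block correctly, which is presumably why near misses can still be readable enough to guess from.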
/me weeps at WAHa's geeky excellence
Well, the problem itself would be an interesting challenge even to a CS major.
WAHa did a very thorough job. o.o-b
But he's in physics, and as we know physics major > CS major (as much as I hate admitting that).
Do you think doing more of these in the future is worth it? If so, I might make a multi-layer problem that would be interesting to people like WAHa but also accessible to others. Only if there's interest though, and it probably would be next week.
BTW, this was my solution. Perl does a lot better at this it seems:
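#include <stdio.h>
#include <stdlib.h>
#define NUM 6
int
main()
{
    char t[NUM];
    char cmp[] = {'x','o','r','k','e','y'};
    FILE *f;
    size_t i, n;
    f = fopen("crypt.xor","rb");
    if (f == NULL) {
        perror("crypt.xor");
        return EXIT_FAILURE;
    }
    /* fread returns the number of bytes read; the last block may be shorter than NUM. */
    while ((n = fread(t, sizeof(char), NUM, f)) > 0) {
        for (i=0; i!=n; i++) {
            t[i] = t[i]^cmp[i];
            putchar(t[i]);
        }
    }
    fclose(f);
    return 0;
}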
i'm interested in more challenges - while i wasn't able to solve this one, waha's detailed description aroused my interest for more of this stuff, so i ended up spending some time looking up several things related to encryption. gaining more knowledge is definitely relevant to my interests, to quote whoever came up with that catchphrase :o.
WAHa -
I would have been more impressed if you had used a chi-square test to deduce the "shifts" for each character of the ciphertext (that is, to deduce the XOR'd key).
I think you can do it in reasonable time if you artificially limit the size of your alphabet by only considering lower and upper case letters.
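A sketch of how the chi-square idea could work, in Python. Nobody in the thread actually ran this, so treat it as an illustration under assumptions: the letter frequencies are approximate textbook values, `SPACE_SHARE` and `OTHER_SHARE` are guesses, candidate key bytes are limited to ASCII letters as suggested, and the key length is taken as already known. All names are made up for the example.

```python
import string

# Approximate English letter frequencies in percent (rough textbook values).
LETTER_PERCENT = {
    "a": 8.2, "b": 1.5, "c": 2.8, "d": 4.3, "e": 12.7, "f": 2.2, "g": 2.0,
    "h": 6.1, "i": 7.0, "j": 0.15, "k": 0.8, "l": 4.0, "m": 2.4, "n": 6.7,
    "o": 7.5, "p": 1.9, "q": 0.1, "r": 6.0, "s": 6.3, "t": 9.1, "u": 2.8,
    "v": 1.0, "w": 2.4, "x": 0.15, "y": 2.0, "z": 0.07,
}
SPACE_SHARE = 0.18  # assumed share of spaces in ordinary English text
OTHER_SHARE = 0.02  # assumed share of punctuation, digits and the like

def expected_shares():
    """Expected share of each category: a-z, space, and 'other'."""
    letter_share = 1.0 - SPACE_SHARE - OTHER_SHARE
    total = sum(LETTER_PERCENT.values())
    shares = {c: letter_share * p / total for c, p in LETTER_PERCENT.items()}
    shares[" "] = SPACE_SHARE
    shares["other"] = OTHER_SHARE
    return shares

def chi_square(plain: bytes, shares) -> float:
    """Chi-square statistic of the observed categories against `shares`."""
    n = len(plain)
    if n == 0:
        return 0.0
    counts = dict.fromkeys(shares, 0)
    for b in plain:
        c = chr(b).lower()  # fold case so 'T' and 't' count together
        counts[c if c in LETTER_PERCENT or c == " " else "other"] += 1
    return sum((counts[c] - n * s) ** 2 / (n * s) for c, s in shares.items())

def recover_key(data: bytes, key_len: int, alphabet: str = string.ascii_letters) -> bytes:
    """Pick, for each key position, the byte whose column looks most English."""
    shares = expected_shares()
    key = bytearray()
    for pos in range(key_len):
        column = data[pos::key_len]  # every byte encrypted with key[pos]
        best = min(
            alphabet.encode("ascii"),
            key=lambda k: chi_square(bytes(b ^ k for b in column), shares),
        )
        key.append(best)
    return bytes(key)
```

Each column is scored independently, so the work is only key length times alphabet size passes over the data rather than a search over whole keys. Calling `recover_key(data, 6)` on the ciphertext bytes would return the best-scoring six-byte key under these assumptions.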
Oh, there were many better methods to find the key, but I tried the easy way first, and it was enough.
The index of coincidence method to find the key length really surprised me in how easy it was and how well it worked, though. Try running the code - the data is really clear.