Saturday 3 September 2016

Sometimes you win, sometimes you learn...

I recently participated in a competition (marketing stunt?) hosted by GE, called GEeks Go for #CC9900. There were three challenges, each one only opening once you had cracked the previous one. As was to be expected, the first two were reasonably easy. The third was not. There was only one prize, for the first person to crack all three challenges. But the real prize was trying to solve the puzzles, having some fun along the way and learning a few things. Coding helped but, as far as I can see, it was not strictly required, definitely not for the first two.

Since the winner has already been announced, I guess that was it.

First challenge: Talking Machines

A number of Twitter accounts were set up, impersonating Things which were having conversations. The only twist was that they were talking in binary, which on Twitter only allows for very short conversations. It doesn't take long to notice that this must be some sort of ASCII code. Now, there were ten of these accounts, so although the instructions said that the tip would be in a conversation between two of them, I thought it would be best to have a script to translate all of them automatically rather than one character at a time. I have done this a few times before while scraping websites for content, like in this example.


The problem I had was that the conversations are added to the page by a script, so I could not save the HTML directly and use BeautifulSoup to extract the text. Something like this wouldn't work:
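import requests
# This only fetches the static HTML; the tweets are injected later by a script
res = requests.get('https://www.geeksgoforgold.com')
print(res.text)  # .text is an attribute, not a method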

Same for urllib.urlopen(url)

Maybe because the page does not allow robots? No idea. But if you know why, please leave a message with the explanation.

Eventually, I just pressed F12 to see the html, navigated to the class with the list of tweets and manually saved them in a text file. From there, extracting each tweet was simple, since there was a timeline-Tweet-text class which was easy to pick up with the code, and then translating the binary tweets to something readable was a breeze. 

But of course, it couldn't be so simple. One of the tweets did not seem to contain ASCII characters. So what do you do next? Well, you try to see if they are Unicode characters. Which they were, and so the first code was cracked.

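import bs4

# HTML of the tweet list, saved by hand from the browser's developer tools
with open(r'.\GE_BULB.txt', 'r') as filein:
    myhtml = filein.read()

# Name the parser explicitly so that bs4 does not have to guess one
GEsoup = bs4.BeautifulSoup(myhtml, 'html.parser')
GE_Tweets = GEsoup.select('p[class="timeline-Tweet-text"]')

# Walk the tweets in reverse page order
for tweet in reversed(GE_Tweets):
    # Each whitespace-separated group of bits is one character code
    # (unichr is Python 2; on Python 3 use chr instead)
    s = ''.join(unichr(int(letters, 2)) for letters in tweet.getText().split() if letters.isdigit())
    print(s)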
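To try the decoding step without the saved HTML, here is a minimal Python 3 sketch of the same idea. The function name `decode_binary_tweet` and the sample string are made up for illustration; only the technique (split on whitespace, read each group as base 2, map the number to a character) comes from the script above. It is a bit stricter than `isdigit()`, since it only accepts groups made of 0s and 1s.

```python
def decode_binary_tweet(text):
    """Turn whitespace-separated groups of bits into a readable string."""
    chars = []
    for token in text.split():
        # Skip anything that is not purely 0s and 1s (mentions, hashtags...)
        if set(token) <= {"0", "1"}:
            # chr() covers ASCII and the rest of Unicode alike
            chars.append(chr(int(token, 2)))
    return "".join(chars)


# Hypothetical sample, not one of the actual tweets
sample = "01001000 01101001 00100001"
print(decode_binary_tweet(sample))
```

A 16-bit group such as `0010011000111010` goes through the same path and comes out as a single non-ASCII character, which is the "try Unicode" step in a nutshell.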

Funny thing is that by the time I joined the challenge, GE had already posted a clue in the form of an ASCII converter! So I could have saved all the coding and just spent a few minutes copying and pasting tweets. OK, I think it was more fun like this.

Second challenge: Hidden word

The second challenge did not seem to require much coding skill. The task was to find a hidden message somewhere in the image. However, the image was actually like a Google map, with several levels of zoom, and of course the more you zoom in, the more tiles you have to inspect. At the highest zoom level there were some 16 thousand tiles (a 128 by 128 grid) to inspect, which, with loading times and whatnot, becomes a really tedious task. So, web scraping to the rescue again: I wrote a script that would download all the tiles in one go so that I could inspect them. I was even thinking of running an OCR tool on each of them. Now, "in one go" here means in a long, long go. It takes quite a few hours to download all the tiles (there must be better ways to do it). In the meantime I got talking with one of the guys doing the challenge and he kindly gave me a dump of all the tiles. He had even stitched them all together into two huge halves. I still let the script run and keep downloading the images, just for the sake of it.

Because the image was using a Google map class, I ended up clicking on the Google logo as well, which led me to https://beyondthemap.withgoogle.com/en-us/. This has nothing to do with the challenge, but it is interesting in itself.


import requests

# The tiles/7/ level is addressed as a 128 x 128 grid
for n in range(128):
    for m in range(128):
        res = requests.get('https://www.geeksgoforgold.com/version/src/inc/image/tiles/7/'+str(n)+'/'+str(m)+'.jpg')
        # Save each tile as <n>x<m>.jpg in the current directory
        with open(str(n)+'x'+str(m)+'.jpg', 'wb') as imageFile:
            imageFile.write(res.content)
        print('%d %d' % (n, m))  # progress indicator
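On the "better ways to do it" point: the loop above fetches one tile at a time and opens a new connection for each. One common alternative is to reuse connections and fetch several tiles in parallel. The following Python 3 sketch assumes that `requests` is installed and that the server tolerates a handful of concurrent requests; `tile_url`, `pending_tiles`, `download_tile`, `download_all` and the worker count are hypothetical names and values, not part of the original script.

```python
import os
import threading
from concurrent.futures import ThreadPoolExecutor

BASE_URL = 'https://www.geeksgoforgold.com/version/src/inc/image/tiles/7/'
_local = threading.local()


def tile_url(n, m):
    """URL of tile (n, m), same layout as in the loop above."""
    return '{}{}/{}.jpg'.format(BASE_URL, n, m)


def pending_tiles(size=128, out_dir='.'):
    """List (url, path) pairs for tiles that are not on disk yet."""
    jobs = []
    for n in range(size):
        for m in range(size):
            path = os.path.join(out_dir, '{}x{}.jpg'.format(n, m))
            if not os.path.exists(path):  # lets an interrupted run resume
                jobs.append((tile_url(n, m), path))
    return jobs


def _session():
    """One requests.Session per worker thread, so connections are reused."""
    import requests  # third-party; only needed once downloading starts
    if not hasattr(_local, 'session'):
        _local.session = requests.Session()
    return _local.session


def download_tile(job):
    url, path = job
    res = _session().get(url, timeout=30)
    res.raise_for_status()  # stop on HTTP errors instead of saving an error page
    with open(path, 'wb') as image_file:
        image_file.write(res.content)
    return path


def download_all(jobs, workers=8):
    """Download all jobs with a small thread pool (worker count is a guess)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(download_tile, jobs))
```

Calling `download_all(pending_tiles())` would start the download. Skipping files that already exist means the script can be stopped and restarted without beginning from tile 0.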

OCR is reasonably straightforward but, like everything, theory and practice are two separate things. Luckily I realized that I am a reasonably good scanning device myself. I loaded all the photos into Picasa and had them displayed at a sufficient size to scan through them quickly. Halfway through, a discrepancy caught my eye. And bingo, two letters in ASCII code. On to the third challenge!


Third challenge: WTF!

The third challenge was the final one, so this had to be the one separating the wannabes and script kiddies like myself from the pros. But since I was there and had made it by my own means, I might as well give it a go. First problem: this was not a coding puzzle, or not that I could see at first sight. There was a riddle instead of instructions and one image to download, but no submission button. What am I supposed to do with this? As it happens, quite a lot of things. They didn't work (as in, I didn't win), but it was fun trying them.



Gimping the image to death with filters.

Nothing obvious

Unwarping the image.

You notice that huge vertical line from the center, don't you? So did I.
I started by writing my own script to transform a round image into a rectangular one, and finally used a variation of this.
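For reference, the round-to-rectangular transform can be sketched in a few lines of plain Python 3. This is neither my original script nor the variation mentioned above, just an illustration of the general idea: each output column is an angle around the center, each output row is a distance from it. The function name `unwarp`, its parameters and the nearest-neighbour sampling are all choices made for this example.

```python
import math


def unwarp(pixels, cx, cy, radius, out_w, out_h):
    """Map a disc of `pixels` (a list of rows) onto an out_h x out_w grid.

    Column = angle around (cx, cy), row = distance from the center.
    Nearest-neighbour sampling; points outside the source become None.
    """
    height, width = len(pixels), len(pixels[0])
    out = []
    for row in range(out_h):
        # Row 0 sits on the center, the last row on the circle of given radius
        r = radius * row / (out_h - 1) if out_h > 1 else 0.0
        line = []
        for col in range(out_w):
            theta = 2 * math.pi * col / out_w
            x = int(round(cx + r * math.cos(theta)))
            y = int(round(cy + r * math.sin(theta)))
            inside = 0 <= x < width and 0 <= y < height
            line.append(pixels[y][x] if inside else None)
        out.append(line)
    return out


# Tiny 5 x 5 test image where every pixel stores its own (x, y) position
image = [[(x, y) for x in range(5)] for y in range(5)]
flat = unwarp(image, cx=2, cy=2, radius=2, out_w=4, out_h=3)
print(flat[2])  # the outermost ring, one sample every 90 degrees
```

With a real image the pixel grid would come from an imaging library, and interpolating between neighbouring pixels would give a smoother result than nearest-neighbour sampling.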

I also tried other things like flattening the unwarped image, but nothing obvious there. Hold on, is that not a kind of @ in there? The first challenge had to do with Twitter accounts, so this could be the hidden clue. Nope.

Translate image name to unicode

The image to be downloaded had a somewhat interesting name: "dc046bc021a7d3b818958e3a524cec68.png". I thought that a clue could be in the name, so I split it into dc04 6bc0 21a7 d3b8 1895 8e3a 524c ec68 and translated those groups to Unicode. When I realized that this resulted in some Chinese characters I thought, wow, I may win this one! But when I translated those I thought, OK, either an easter egg from a disgruntled programmer or a really strange coincidence. No idea, but I could not really do anything with this.
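The splitting and lookup can be reproduced with the standard library. The sketch below (Python 3) reads each group of four hex digits as one code point and prints its official character name instead of the character itself, because not every 16-bit value is a printable character (values from D800 to DFFF, for instance, are surrogate code points). The helper names `hex_groups` and `describe` are made up for this example.

```python
import unicodedata

name = "dc046bc021a7d3b818958e3a524cec68"


def hex_groups(hex_string, width=4):
    """Cut a hex string into consecutive groups of `width` digits."""
    return [hex_string[i:i + width] for i in range(0, len(hex_string), width)]


def describe(group):
    """Return 'U+XXXX NAME' for one group of hex digits."""
    code_point = int(group, 16)
    # Fall back to a label when the code point has no character name
    label = unicodedata.name(chr(code_point), "(no character name)")
    return "U+{:04X} {}".format(code_point, label)


for group in hex_groups(name):
    print(describe(group))
```

Printing names rather than glyphs also avoids depending on which fonts the terminal happens to have.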



Text in PNG file

Next idea: what about reading the binary file? I had never poked into a PNG file, but this led me to this article, which clearly says that it is possible to embed text in a PNG file. And there was indeed some text and metadata, but nothing that seemed like a clue.
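A PNG file is a fixed 8-byte signature followed by chunks, each made of a 4-byte length, a 4-byte type, the payload and a 4-byte CRC. Text lives in chunks of type tEXt (plain) or zTXt (deflate-compressed). Here is a minimal Python 3 sketch for listing them, assuming the whole file fits in memory; `iter_chunks` and `text_entries` are names invented for this example, and the international iTXt chunk type is left out to keep it short.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def iter_chunks(data):
    """Yield (chunk_type, payload) for every chunk in the PNG bytes."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        payload = data[pos + 8:pos + 8 + length]
        yield ctype.decode("ascii"), payload
        pos += 12 + length  # length + type + payload + CRC (CRC not verified)


def text_entries(data):
    """Return (keyword, text) pairs found in tEXt and zTXt chunks."""
    entries = []
    for ctype, payload in iter_chunks(data):
        if ctype == "tEXt":
            keyword, _, text = payload.partition(b"\x00")
        elif ctype == "zTXt":
            keyword, _, rest = payload.partition(b"\x00")
            # rest[0] is the compression method byte, the rest is zlib data
            text = zlib.decompress(rest[1:])
        else:
            continue
        entries.append((keyword.decode("latin-1"), text.decode("latin-1")))
    return entries
```

To use it on the challenge image, read the file in binary mode, for example `text_entries(open('dc046bc021a7d3b818958e3a524cec68.png', 'rb').read())`.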

IPv6 address

What if the name of the file is a hidden IPv6 address? That got me tinkering with my router to see how IPv6 addresses can be enabled. Unfortunately http://[dc04:6bc0:21a7:d3b8:1895:8e3a:524c:ec68] did not work. In the meantime, I have read a couple of interesting articles on IPv6, so I guess it was time well spent.
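The idea is at least well-formed: 32 hex digits are exactly 128 bits, the size of an IPv6 address. A quick check with Python 3's standard `ipaddress` module shows how the name reads as an address and whether it falls inside 2000::/3, the range from which global unicast addresses are currently allocated. This is only a sanity check under that assumption, not an explanation of what the file name really is.

```python
import ipaddress

name = "dc046bc021a7d3b818958e3a524cec68"

# Interpret the 128-bit number as an IPv6 address
address = ipaddress.IPv6Address(int(name, 16))

# Range from which global unicast addresses are currently handed out
global_unicast = ipaddress.ip_network("2000::/3")

print(address)
print(address in global_unicast)
```

The address starts with the bits 110, while everything in 2000::/3 starts with 001, so the membership test comes out false. That is consistent with the URL leading nowhere, although it does not prove anything about the puzzle.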

Long-Lat coordinates

What if the name of the file contains some hidden longitude and latitude coordinates? I ended up in weird places in Google Maps. In one case, this was pretty much on the border of Russia, not really what I was expecting. But this led to another idea. There were a couple of addresses mentioned in the terms and conditions of the challenge, one in Connecticut and one in Australia. I went straight into Street View to see if there were any clues posted in or around those locations. No luck.

GE Clues

I have to say that I didn't find the clues very enlightening. One of them was about realizing that the point of view was in the center. After thoroughly checking the center of the image, I decided to check some of the latest 360 videos posted by GE on YouTube. Again, interesting but no obvious clues there.

Slicing the ripples and checking the histogram


I remembered reading once about hidden messages in histograms. This could be the one! Not quite.


And I guess that was it. Congratulations to the winner, Zoltán Szabó.

Update:
There was another clue later on that had something to do with listening to the crowds. This made me think of signal processing (admittedly not my forte) and I ended up here. I already knew the Think Python book, but finding a whole series of free books is even better. Unfortunately I ran out of time to try new things due to private issues. All in all, a thoroughly enjoyable experience.