I was there, at Skroutz Awesome Factory, the heart of the popular Greek price comparison engine, where ThinkBiz's Datathon took place. Here's how I experienced it.

This article is a translated, improved version of a Greek one posted at pandorian.github.io back in the day. (Wayback Machine Link)

Once again it was Stavros who suggested we take part in the event. The memories of Battlehack were still fresh, so I said yes immediately.

On top of that, I had just finished an introductory edX course on data analysis with R. The course was enough to make me want to at least experiment with big data, data analysis and data mining.

When I read the event's theme - it was about big-data analysis - I instantly knew I wanted in. We managed to persuade Dimitris to come with us, creating a team of three.

Stavros and I were already familiar with each other from our work together at DSG and the previous hackathon. Dimitris, however, was new to the team, although we had worked together remotely, developing Direct Solutions' Hermes-V platform. As it turned out, the three of us worked perfectly together, creating one of the best teams I have ever worked with.

About the hackathon

Datathon, obviously a blend of the words data and hackathon, was, as mentioned above, an event about data analysis.

It was organized by ThinkBiz on the premises of the beautiful Skroutz Awesome Factory, the headquarters of Skroutz, the most popular Greek price comparison engine.

The challenge of the event was to analyse, in any way we wanted, an anonymized 4GB plain-text dump of three months of real Skroutz data.

Our view

None of us had any previous experience in data analysis. We had never worked with such massive datasets either. That's why we started by experimenting.

Dimitris, true to his Electronic Engineer blood, tried to load the dataset into Matlab; Stavros, as a web developer, tried to load it into MySQL; and I, thanks to the edX course I had taken, tried to load it into R-Studio.

It took me some time to understand how R-Studio works. You see, the edX course used a web application where you could write code, which is completely different from setting up R, R-Studio and R packages from scratch on your own PC.

Stavros, on the other hand, managed to write a PHP script that successfully loaded the dataset into MySQL. A quick calculation, however, made it clear that we didn't have enough time to wait for the load to finish, and even if we did, processing the data in MySQL would take a long time. We only had 6 hours left.

That's why we decided to work with R. Stavros and Dimitris downloaded R-Studio, and we spent the next 4 hours learning R. Even I, having finished the edX course, had a lot more to learn.

At last, with one hour remaining, Stavros started building a web-page presentation, instead of a PowerPoint, using sexy interactive diagrams.

However, with only one hour remaining we couldn't do a lot, so the presentation was incomplete in the end and was our biggest weakness. It was a weakness that Dimitris neutralized with an excellent speech, earning us the (informal) 3rd place in the event.

More on the winners and the event at the ThinkBiz and Skroutz blogs.