This article is a translated, improved version of a Greek one posted at pandorian.github.io back in the day. (Wayback Machine Link)
On top of that, I had just finished an introductory edX course on data analysis with R. The course was enough to make me want to at least experiment with big data, data analysis and data mining.
When I read the event's theme - it was about big data analysis - I instantly knew I wanted in. We managed to persuade Dimitris to come with us, creating a team of three.
Stavros and I were already familiar with each other from our work together at DSG and the previous hackathon. Dimitris, however, was new to the team, although we had worked together remotely, developing Direct Solutions' Hermes-V platform. As it turned out, the three of us worked perfectly together, creating one of the best teams I have ever worked with.
About the hackathon
Datathon, a name obviously formed by combining the words data + hackathon, was, as said above, an event about data analysis.
It was organized by ThinkBiz on the premises of the beautiful Skroutz Awesome Factory, the headquarters of Skroutz, the most popular Greek price comparison engine.
The challenge of the event was to analyse, in any way we wanted, an anonymized 4GB plain-text dump of three months of Skroutz's real data.
The three of us had no previous experience in data analysis. We had never worked with such massive datasets before either. That's why we started by experimenting.
Dimitris, true to his Electronic Engineer blood, tried to load the dataset into Matlab; Stavros, as a web developer, tried to load it into MySQL; and I, thanks to the edX course I had taken, tried to load it into R-Studio.
It took me some time to understand how R-Studio works. You see, the edX course used a web application where you could write code, something completely different from setting up R, R-Studio and R packages from scratch on your own PC.
Stavros, on the other hand, managed to write a PHP script that successfully loaded the dataset into MySQL. A quick calculation, however, made it clear that we didn't have enough time to wait for the load to finish: only 6 hours remained, and even if the load had completed in time, working on the data in MySQL would have taken a long time too.
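A back-of-the-envelope estimate of this kind can be sketched as follows. This is only an illustration in Python, not something we wrote at the event: the function names and the sample measurement (10 MiB loaded in 2 minutes) are hypothetical, and the sketch assumes the load speed stays constant. Only the 4GB dump size and the 6-hour budget come from the story.

```python
def estimate_load_seconds(total_bytes, sample_bytes, sample_seconds):
    """Extrapolate the total load time from a timed sample,
    assuming the throughput stays constant for the whole file."""
    if sample_bytes <= 0 or sample_seconds <= 0:
        raise ValueError("sample size and duration must be positive")
    throughput = sample_bytes / sample_seconds  # bytes per second
    return total_bytes / throughput


def fits_in_budget(total_bytes, sample_bytes, sample_seconds, budget_seconds):
    """True if the extrapolated load time fits in the remaining time."""
    estimate = estimate_load_seconds(total_bytes, sample_bytes, sample_seconds)
    return estimate <= budget_seconds


if __name__ == "__main__":
    dump_bytes = 4 * 1024**3      # the 4GB dump
    budget_seconds = 6 * 3600     # the 6 hours we had left
    # Hypothetical measurement: the first 10 MiB took 2 minutes to load.
    sample_bytes, sample_seconds = 10 * 1024**2, 120

    hours = estimate_load_seconds(dump_bytes, sample_bytes, sample_seconds) / 3600
    print(f"estimated load time: {hours:.1f} hours")
    print("fits in budget:",
          fits_in_budget(dump_bytes, sample_bytes, sample_seconds, budget_seconds))
```

With a measurement anywhere near the hypothetical one, the estimate lands well beyond the time available, which is the kind of result that makes you drop an approach early.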
That's why we decided to work with R. Stavros and Dimitris downloaded R-Studio, and we spent the next 4 hours learning R. Even I, having finished the edX course, had a lot more to learn.
Finally, with one hour remaining, Stavros started building a web-page presentation - instead of a PowerPoint - using sexy interactive diagrams.
However, with only one hour remaining we couldn't do a lot, so the presentation ended up incomplete and became our biggest weakness. A weakness that Dimitris neutralized with an excellent speech, earning us the (informal) 3rd place in the event.