I was lucky to attend PyData conference in Barcelona this year, hosted in ESADE.
Although I’m basically a PHP developer, I’ve been playing with data science tools lately with python’s stack. I have no real experience in data science, apart from a couple of prediction coding using linear regression, but I was curious.
With a novice spirit, I set some clear objectives: find out if data science is like teenager sex, or companies are really using it; get a feeling of the community; and try to learn as much as I could.
First of all, the community is vibrant, actually far more than PHP’s one in Barcelona. The organization was smooth too, and all the people I talked with was really nice. Everybody had things to learn, so came with an open mind.
It was funny to see that I was on the “data owners” side, while most people were in the “looking for datasets” side. This led to several conversations asking me how we use the data in our company.
Regarding the talks, there were quite a lot about tools. Python science stack have a wide range of evolving tools, and this somehow reminds me of PHP circa 2008, when basic tools (PHPUnit, for example) were becoming popular. It’s good to polish your tools and master them, so I welcomed those talks.
There were also some talks on theory, which surprised me, as I haven’t never seen university professors in software conferences. Mathematical and computer science concepts were explained, for instance on optimization. This contrasts with the common industry solution: if some code is slow, just use more machine instances, which is far cheaper that spend time trying to optimize things (at least 99% of the time). I don’t mean I didn’t like those talks (actually one was really mind blowing), but I would love to see more professors in some other conferences, getting a real feel of some industry practices.
I was looking for talks showing “real fire”, real examples in companies. We heard about hotels trying to predict cancellations (in order to do overbooking); we saw IBM’s Watson analyzing the personality of customers; predict which employees will leave a big company; ideas to react knowing bad weather will arrive; best weekday to publish job offers and set interviews; and some other extremely interesting stuff… but I do want more!
My overall feeling is that I learned a lot. Python is not really used as a language but more as an interface for some amazing libraries. It looks like I have no option but to start exploring the data in ulabox!
I’d like to thank ulabox (my employer) that paid the ticket, and all the people in the organization that did a great job!
I published some of my (unedited) notes too.