Category: Development

Teaching students about real industry work

Some months ago I had the chance to teach University students about how we develop in the real world, as part of a “companies’ seminars” event.

There is an ongoing discussion in our industry: do you need a major in Computer Science to become a successful developer? Some people say that the subjects taught at University become outdated quickly, basically due to the lightning speed of technology. Some say that nowadays taking a course on JavaScript is enough to learn to program. Other people say that you must spend 4~5 years at University.

I’m on the side of formal University education. Students need foundations to understand how things really work. But it’s true that they also need to know how the industry really works. Virtualization, code versioning, code quality (“clean” code), tradeoffs, etc. are subjects that are unfortunately not taught at University.

During the seminar I taught students about general subjects like the tradeoffs we have to make in our company, but also about the latest trending technologies like Docker. Still, the subject they loved most was my introduction to clean code, which opened their eyes. Let’s hope it inspires them.

Here are the links to the slides I used:
Professional development
Clean Code
OOP and SOLID principles
Introduction to docker
Seminar conclusion

The best advice I gave them: Find a job in a company where you can learn.


PyData conference in Barcelona

I was lucky to attend the PyData conference in Barcelona this year, hosted at ESADE.

Although I’m basically a PHP developer, I’ve been playing with data science tools lately, using the Python stack. I have no real experience in data science, apart from coding a couple of predictions using linear regression, but I was curious.

With a novice’s spirit, I set some clear objectives: find out whether data science is like teenage sex or companies are really using it; get a feel for the community; and try to learn as much as I could.

First of all, the community is vibrant, actually far more so than the PHP one in Barcelona. The organization was smooth too, and all the people I talked with were really nice. Everybody had things to learn, so they came with an open mind.

It was funny to see that I was on the “data owners” side, while most people were on the “looking for datasets” side. This led to several conversations in which I was asked how we use the data in our company.

Regarding the talks, there were quite a lot about tools. The Python science stack has a wide range of evolving tools, and this somehow reminds me of PHP circa 2008, when basic tools (PHPUnit, for example) were becoming popular. It’s good to polish your tools and master them, so I welcomed those talks.

There were also some talks on theory, which surprised me, as I have never seen university professors at software conferences. Mathematical and computer science concepts were explained, for instance on optimization. This contrasts with the common industry solution: if some code is slow, just use more machine instances, which is far cheaper than spending time trying to optimize things (at least 99% of the time). I don’t mean I didn’t like those talks (actually one was really mind-blowing), but I would love to see more professors at other conferences, getting a real feel for industry practices.

I was looking for talks showing “real fire”, real examples from companies. We heard about hotels trying to predict cancellations (in order to overbook); we saw IBM’s Watson analyzing the personality of customers; we heard about predicting which employees will leave a big company, ideas for reacting when bad weather is on its way, the best weekday to publish job offers and set interviews, and some other extremely interesting stuff… but I do want more!

My overall feeling is that I learned a lot. Python is not really used as a language but more as an interface for some amazing libraries. It looks like I have no option but to start exploring the data in ulabox!

I’d like to thank ulabox (my employer), which paid for the ticket, and all the people in the organization, who did a great job!

I published some of my (unedited) notes too.


Virtual disk design kata

In my current job (ulabox) we hold an internal training session every Thursday, usually prepared by one of our department members. Some months ago I prepared a code kata on design patterns, consisting of 5 steps with instructions. The idea was to push the team to debate different approaches to a common problem, and to show them some classical design patterns, as a way to polish our weapons. The result was good, but the discussion only really happened at the end, when I showed them those patterns.

Some weeks later I heard about a code conference in Barcelona, organized by the Barcelona Software Craftsmanship group, so I took the chance to polish my kata and proposed running it at the event. It was rejected for the main event.

Later I heard about Monday’s katas: this group organizes a code kata every Monday with up to 20 developers. I offered my kata and our office to host it, and on December 12th we did it! All participants agreed: the kata is smooth and leads you to think about the subjects it later presents.

I published my kata on github. Have fun!


General programming principles

This is just a list of programming principles that I’m making for myself. These should be instinctive to any developer.


Links about quality website development

Just for reference, here I’m writing down some interesting links I’ve recently seen about quality while developing and maintaining websites.

That’s enough to keep me busy for months. But these days I’m also trying out ideas with CSS3 and HTML5 features, so I’m making heavy use of what could be the best website on feature compatibility.


Telecommuting: when good forms are forced

Currently I’m working for a company from home. It’s a small company: my boss is located in Mexico, I (the project leader) am here in Spain, and there are 3 developers in India. We just use Skype, Google Docs and email, and we have no big problems, just the ones related to the time zones.

“When you work remotely you work less”, they say. Actually this is the usual excuse for bosses to avoid telecommuting, because they feel they can control you less. But this idea is totally wrong!

Let me explain an interesting effect. Imagine a man working in an office with fixed hours (like 9am to 5pm); when he sees the clock strike 5pm he stops working and runs home (except when there is a final deadline for a project). That is, he doesn’t really care about finishing the job. Perhaps he has not done a lot of it, and instead spent a couple of hours playing Minesweeper! Meanwhile, I don’t have a real daily schedule, but I usually do something like 9am to 6pm, with some breaks. The thing is that when I see 6pm on the clock, I do NOT stop; I just think “is the task (scheduled for today) done?”. Usually it’s not done, so I work a couple of hours more. Of course it depends on each person, but remote working usually makes you more committed to the job.

On the other hand, the company is forced to have a really good system to schedule tasks. Other methods to track the work that everyone is doing are welcome as well: an up-to-date task calendar, a good version control system and code reviews, ways to verify the quality of the software, job reports, etc. The company has to apply these control methods firmly in order to survive. But this is a good thing: the work is better organized! In my case, we usually have a meeting on Friday (evening or morning, depending on the time zone) to set the tasks for each single day of the next 2 weeks, and to review the tasks done during the current week.

So, the real benefits of telecommuting, from the company’s perspective: no need to rent an office (and no need to pay its bills), better organization of the work, and happier employees. Sounds good, especially nowadays with an economic crisis around. The only real problem is the communication with the team, but with video conferences you can reach a good enough level of interaction. This doesn’t work for all kinds of people, but you can train them (even remotely!) to work with the discipline needed to telecommute effectively.


Software metrics (PHP focused) part 2

In part 1 I spoke about lines of code and average lines per day, two indexes that are quite naive. Let’s see some better metrics, most of them focused on object oriented programming.

Third stop: Tests and code coverage
If you have unit tests you can easily check that nothing vital is broken by each contribution. A unit test exercises a class, so if something changes in it, you can verify that the expected behavior remains the same. Moreover, you can get an analysis of which lines of the code are actually executed by a test suite: that is, the code coverage of the tests. You will see high values in well tested classes and low values in classes that need more tests.

Tools? Of course, PHPUnit, with the help of xdebug to get the code coverage.

Fourth stop: Cyclomatic complexity
Cyclomatic complexity is, roughly, the number of independent paths that execution can take through the code: every branch (an if, a loop, a case) adds one. For example, in our main project we have a total of 3048 paths. This value is useful to detect places in the code that have become too complex and maybe need some cleanup.
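
To make the idea concrete, here is a minimal sketch of how such a count can be approximated. It is written in Python and analyses Python source through the standard ast module, so it is only an illustration of the principle, not how pdepend computes the value for PHP. The function name, the list of branching nodes and the sample code are my own assumptions.

```python
import ast

# Node types that open an extra branch in the control flow (a simplification).
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe complexity: 1 + number of decision points."""
    tree = ast.parse(source)
    complexity = 1  # straight-line code has exactly one path
    for node in ast.walk(tree):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two short-circuit decisions
            complexity += len(node.values) - 1
    return complexity


# Hypothetical sample: two ifs, one elif and one "and" operator.
SAMPLE = '''
def shipping_cost(weight, express):
    if weight <= 0:
        raise ValueError("weight must be positive")
    cost = 5
    if weight > 10 and express:
        cost += 20
    elif express:
        cost += 10
    return cost
'''

if __name__ == "__main__":
    print(cyclomatic_complexity(SAMPLE))
```

Summing this value over every function gives a project-wide figure comparable in spirit to the one above; a function that scores far higher than its neighbours is a good candidate for cleanup.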

Fifth stop: Pure OO software metrics
There are some interesting software metrics for object oriented code organized in packages, used basically to assess which packages (groups of classes) are of better quality than others. These values show the relations among classes, their dependencies, their resilience to change, etc.

Last stop: Ratios
Combining some of the previous values you can get interesting ratios. For example:
– Cyclomatic complexity per line of code: 0.2, a very good value.
– Lines of code per method (class function): 22.02, a normal value that we have to lower.
– Methods per class: 8.25, a good value.
– Average number of extended classes: 0.4, a good value.
– etc…
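
These ratios are plain divisions of the raw totals. A small sketch in Python, assuming made-up totals (they are NOT our project’s real figures, and the class and field names are hypothetical):

```python
from dataclasses import dataclass


@dataclass
class ProjectTotals:
    """Raw totals as a metrics tool might report them (hypothetical values)."""
    lines_of_code: int
    cyclomatic_complexity: int
    methods: int
    classes: int


def ratios(totals: ProjectTotals) -> dict:
    """Derive the ratios from the raw totals."""
    return {
        "complexity_per_line": totals.cyclomatic_complexity / totals.lines_of_code,
        "lines_per_method": totals.lines_of_code / totals.methods,
        "methods_per_class": totals.methods / totals.classes,
    }


if __name__ == "__main__":
    # Invented numbers, chosen only to make the arithmetic easy to follow.
    example = ProjectTotals(
        lines_of_code=20000,
        cyclomatic_complexity=4000,
        methods=1000,
        classes=125,
    )
    for name, value in ratios(example).items():
        print(f"{name}: {value:.2f}")
```

Tracking how these ratios move from one commit to the next is usually more informative than their absolute values.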

Tools? The excellent pdepend is used here. Have a look at the end of its example page to see the amount of data (and funny but interesting diagrams) you can get.

Finally, all those values and tests are computed every single time a programmer sends a commit to our code repository, and I get a mail with all the details, including the lines added, the author’s name, and all the values that changed. So with a quick look I can check whether the contribution is ok or there is some code to improve. I wonder how many companies (which develop in PHP) use something like this. I bet fewer than 100 in the world!


Software metrics (PHP focused) part 1

Managing a software project with various programmers and around 10 contributions per day is a complex thing.
How can we measure the quality of every single contribution?
How can we keep track of the programmers’ work?

First stop: Lines of code

The easiest thing to measure in a project is the number of source lines of code it has. For example, the main project we are developing at work has ~30k lines (k=1000). That is, 24k of pure code and 7k of comments (and I’m counting only pure PHP OO code, without HTML, CSS or JavaScript).

Is it that simple? No, it’s not. The main problem is that programming is brain work, where creativity, problem-solving skills, and smartness come into play. It’s like writing a novel: can you say that one novel is better than another just by counting its pages? Can you say a Boeing 717 is worse than a 747 because it weighs less? As the Wikipedia article says, this metric is only useful when comparing 2 projects that differ by an order of magnitude. However, the ratio of comments to lines of code (in my case 1 comment every 4 lines) is usually a good index of quality.
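
As a rough illustration of what counting code and comment lines involves, here is a naive line classifier sketched in Python. It is an assumption-laden toy, not how phploc works: it only recognises lines that start with //, # or /*, and it would misclassify a comment that shares a line with code. The function name and the sample source are hypothetical.

```python
def count_lines(source: str) -> dict:
    """Classify each line as code, comment or blank (naive, line-based)."""
    counts = {"code": 0, "comment": 0, "blank": 0}
    in_block = False  # True while inside a /* ... */ comment
    for raw in source.splitlines():
        line = raw.strip()
        if in_block:
            counts["comment"] += 1
            if "*/" in line:
                in_block = False
        elif not line:
            counts["blank"] += 1
        elif line.startswith(("//", "#")):
            counts["comment"] += 1
        elif line.startswith("/*"):
            counts["comment"] += 1
            in_block = "*/" not in line
        else:
            counts["code"] += 1
    return counts


# Hypothetical PHP snippet used only to exercise the classifier.
SAMPLE = """<?php
// Returns the cart total.
function total(array $prices) {
    /* Sum all prices,
       ignoring discounts. */
    return array_sum($prices);
}

# legacy comment style
echo total([1, 2]);
"""

if __name__ == "__main__":
    counts = count_lines(SAMPLE)
    print(counts)
    print("comments per line of code:", counts["comment"] / counts["code"])
```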

Second stop: Average lines per day

Calculating the average lines of code per day can be tricky as well. At my job, a senior developer usually writes ~50 lines per day, but a junior programmer writes ~130 lines. Is that a lot? Well, looking at the metrics for a similar project, phpMyAdmin, on Ohloh (a website with metrics for open source software), and doing some maths, the figure it suggests is just 17 lines per day! On the other hand, if you google for “lines of code per day” you will get a really wide range of values, from highs of 1k (using code generators: cheating) to normal less-than-100 values. Moreover, the deviation from this average value is actually huge: one day I can write 2 lines, another day 200.
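
To see how misleading the average alone can be, here is a tiny Python sketch that computes both the mean and the standard deviation of a developer’s daily output. The daily counts are invented for illustration and the function name is hypothetical.

```python
from statistics import mean, pstdev


def summarize(daily_lines):
    """Return (average, standard deviation) of lines written per day."""
    return mean(daily_lines), pstdev(daily_lines)


if __name__ == "__main__":
    # Invented week: a near-empty day, a very productive one, three normal ones.
    week = [2, 200, 50, 30, 18]
    average, deviation = summarize(week)
    print(f"average: {average:.1f} lines/day, deviation: {deviation:.1f}")
```

When the deviation is larger than the average itself, as in this made-up week, a single day’s count says very little about the developer.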

The senior writes 50 lines, the junior 130… is the senior one a slacker? Of course not… usually the senior’s code is better: less prone to errors, more adaptable, with more precise function names, more elegant, and it does more things in fewer lines. The more the better? Actually, the less code needed to solve a problem, the better. On this topic, I recommend reading the article “Code is your enemy!”.

Tools? We use phploc for counting lines of code, and phpcpd for detecting cases of duplicated code. Both tools are developed by Sebastian Bergmann, the author of PHPUnit (the most popular testing framework for PHP).

Next, in part 2 of this post, I’ll be speaking about other metrics, like code coverage, cyclomatic complexity and some interesting ratios based on software package metrics.


Looking for quality content in the web 2.0

How can we induce users to participate more in our website?
Surely a lot of people have this question on their minds. Since the arrival of the Web 2.0, the value of a website is based on its users and the content they create. The greater the quantity and quality of the user-created content, the more valuable the website.

The first step is to simplify the UI as much as possible, to help users overcome their laziness and participate. The state of the art includes clever AJAX tools, browser plugins, and desktop applications. Some websites go one step further and somehow reward their most valuable users, like the badges on stackoverflow.com (a website for programmers), where you get medals for doing things (like a “silver medal for good answer”: voted up 25 times).

But what about the quality of the content?
Helping users to add content doesn’t mean you will have great content, just a lot of content. In some cases you can end up with a website flooded by low quality content (read “Facebook”). This is not a bad thing per se, as happens with quite a lot of TV channels: despite their low quality, people continue watching them. But it seems that specific (or thematic) websites have better quality than generic websites (this also applies to TV channels). Just compare the ratio of interesting content to total content on Flickr vs. Facebook: of course you can find some bad pictures on Flickr, but you can find tons of uninteresting content on Facebook. On Flickr you are somehow induced to publish only good pictures; on Facebook you are just tempted to publish a lot.

So, the balance between quantity and quality rules the net as it does in other places. The challenge, as a website creator, is to find the most profitable ratio (regarding personal satisfaction and/or monetary ROI).

Lately I’ve been thinking about resurrecting an old pet project, a website for creating and playing games. Is that specific group (the gamers) enough to pay the bills, or just to pay for a few whims? Is the “create game” part too specific, or just what I need to stand out? How could I work effectively on this project while keeping my day job?… too sunny to think!


Feeling like a senior programmer

Here is a morning conversation with my new junior coworker from India. He is in New Delhi, working for us remotely, and I’m teaching him quite a lot of things, and I love discussing programming problems with him…

Me – have you read about Composite Pattern?
He – i use to say to my friends that am good at programming, then I found you :(
Me – hahaha
He – right now reading Pragmatic Programmer, :)
Me – really? really good
He – planning to apply for zend certification so I’ve so much to study
Me – This book changed the way I think about programming