Category: Work

Mentoring juniors and interns remotely


My background

In my previous job I worked remotely for 5 years, and now, due to COVID-19, we have to work from home again. From time to time people ask me about the lessons learned during my remote position: I mostly answer by explaining the common problems and ways to solve them. I even did a presentation on the subject.

However, there is a topic that rarely appears in those questions and that, given this pandemic situation, I consider vital to keep our IT industry in good shape: mentoring juniors and interns. In my experience, I have mentored people from Mexico, India and Spain, remotely.

Juniors in the remote world

The main problem when working from home is the lack of communication. It affects all employees, but it's far more severe in the case of juniors.

Juniors are basically novices in the Dreyfus model of skill acquisition, which means they should be learning by following rules and examples. Novices don't know why an error arises. They look for constant feedback. Feedback is key to improvement, which is what you want a junior to focus on. Feedback is communication.

Juniors can't work autonomously. They need the guidance of a senior. They may not need the big picture, but they do need a clear task, challenging yet understandable: they need to keep momentum. They need a plan with examples. In some cases, like working in a fast-moving startup, the lack of a plan can be very confusing.

Juniors bring freshness and enthusiasm. But if they don't feel on track, their excitement starts to fade; and believe me, that's the worst thing that can happen. It's important to detect this situation as soon as possible, because depending on the junior's personality (openness, sincerity, etc.), the problem can remain hidden for weeks or even months and lead to fatal consequences.

There is a minor problem that also tends to appear in remote setups: the lack of "team" feeling. In the case of juniors this can be an issue too.

Communication

Communication is the key to keeping a junior on track: it supports their learning process and their daily rhythm. But working from home makes communication less casual, so we need to patch this problem with some processes. Below I present some possible solutions.


Daily Learning

A senior should always be available to give feedback to a junior. Moreover, the senior should make clear that the junior can freely ask about any doubt, even if the question might sound silly. This also helps to give advice just in time, keeping the feedback loop really short.

It is vital that each conversation with a junior ends with an explicit "I understand" or some other form of ACK. A silent answer is no answer.

A senior should give examples to learn from, and always take the chance to share lessons learned while doing a task similar to the one the junior is currently working on.

Finally, a senior should challenge the junior's mind. A common way is to ask for alternative solutions: try to find 2 solutions to the problem (or at least a variation of the solution), and later discuss the trade-offs together.

Planning

It's mandatory to set up 1:1s frequently. In these meetings there should be a follow-up of the junior's career evolution. Your company may have a career track, like this one by Patreon, or just a simple list of concepts to cross out. Either way, the junior should clearly understand their strong points and the ones to improve.

Junior members should know the general idea behind the upcoming tasks, and have a clear path: a list of tasks with easy explanations.

Team building

There are lots of different team-building activities, even for the remote case. These days it's quite common to see online meetups and tech events, some of them cheap or free to attend. It's a great chance to invite juniors to watch the event together and discuss the subjects afterwards.

Another similar approach is a reading club: find some must-read tech book, and set a time to discuss each chapter together.

Final words

This new work-from-home world is here to stay. Hiring and mentoring juniors can be more difficult, but the energy and momentum they bring to the team are worth it. And, frankly, they are cheaper to hire too. The industry can't live without them, and now is not the time to look away.

Be the mentor you wanted to meet when you were a novice.

This text was proofread by some juniors.


4 things I would’ve liked to know about Cloud Functions

We had an important project to do: build, and keep updated, some transformed tables in BigQuery with data coming from a transactional system. We needed 3 pieces: the code that builds the tables, a place to deploy this code, and a scheduler to call it. Given that the code was in Python, we had to evaluate different platforms to deploy it on.

The platform chosen was Google Cloud Functions. There is actually a nice diagram that helps you choose among Google services (though for some reason it was not easy to find). We could have deployed the code on our own hardware, but a side goal of this project was testing the cloud.

Cloud Functions are perhaps the easiest way to deploy your code, as long as it's Python, Go or Node.js; for other languages you could try the new Cloud Run, which is basically a Docker container executed on an external event. A Cloud Function can be triggered by an HTTP call, a Google Pub/Sub message, or some internal platform events (like when a file is uploaded). In our case, we use Cloud Scheduler (a basic cron) to send a Pub/Sub message.
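For context, the skeleton of such a function is tiny. This is not our production code, just a rough sketch of a Pub/Sub-triggered function where the function name, dataset and query are made up:

# main.py -- hypothetical Pub/Sub-triggered Cloud Function (Python runtime)
import base64
from google.cloud import bigquery

def rebuild_tables(event, context):
    """Entry point: Cloud Scheduler publishes a Pub/Sub message that lands here."""
    payload = base64.b64decode(event["data"]).decode("utf-8") if "data" in event else ""
    print(f"Triggered with payload: {payload}")

    client = bigquery.Client()
    # Placeholder transformation: rebuild a derived table from the raw one.
    client.query(
        "CREATE OR REPLACE TABLE analytics.orders_daily AS "
        "SELECT DATE(created_at) AS day, COUNT(*) AS orders "
        "FROM raw.transactions GROUP BY day"
    ).result()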


Up to here everything sounds perfect, but we ran into 4 minor problems due to our lack of knowledge of the platform:

DEPLOY AND LOGS WITH DELAY
As you deploy and use the Function in the cloud, the results come with a delay. This is not a real problem, but if you're used to working with "tail" or other console scripts to explore the logs, you have to relax and wait a couple of minutes before concluding that your code is not working.

TIMEOUT
Cloud Functions' documentation says the product is intended for short-running scripts, so it comes with a 1-minute timeout. We missed this detail at first, and actual usage misled us into thinking there was no timeout: if you launch the Function manually, it apparently runs for several minutes without being cut off. But later, having a look at the logs, we found the timeout error (which is actually logged as INFO instead of ERROR!). It seems that when our script runs for more than 1 minute, it continues until somebody else asks for resources: so if we are in a relaxed pool, we can be lucky and run longer.

However, you can configure the timeout limit at deploy time, up to 9 minutes. Unluckily, our code runs for 11 minutes, so we had to split it.
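For reference, the timeout is a deploy-time flag; a deployment command along these lines (the function and topic names are invented) raises it to the 9-minute maximum:

gcloud functions deploy rebuild_tables \
    --runtime python37 \
    --trigger-topic rebuild-tables-topic \
    --timeout 540s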

NO WRITE PERMISSIONS
The library we use does some disk writing internally, and we didn't realize it until we saw the problem in the logs. The good thing is that you can still write to /tmp, so we only had to reconfigure the library to write there. The weird thing is that anything your code writes to /tmp is also written to the function log, so the logs can become difficult to follow.
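The library itself is not the point here; as a generic illustration, the fix is simply to point every temporary write at /tmp (for instance via tempfile). The configure() call below is hypothetical:

import os
import tempfile

scratch_dir = tempfile.gettempdir()                     # "/tmp" on Cloud Functions
output_path = os.path.join(scratch_dir, "intermediate.csv")
# some_library.configure(cache_dir=scratch_dir)         # whatever knob the library exposes
with open(output_path, "w") as fh:
    fh.write("only /tmp is writable in the function's filesystem\n")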

SINGLE THREAD
This was the trickiest one! We are using a Python library that, by default, creates 4 threads of execution. For some reason this doesn't work well on Cloud Functions, and sometimes the connection with BigQuery is closed before all the threads have finished. So we had to use an undocumented feature of the library to work with a single thread only.


Summing up, Google Cloud Functions is a lightweight way to execute your Python scripts, with a really easy way to deploy and use them. But sometimes things go wrong under the hood, so you should READ THE LOGS to find out whether everything is OK. Checking that the final results match what you expect will help too (for instance, running automated test queries that try to find mismatching numbers).

Disclaimer: we chose Google Cloud Functions due to the particular background of the company (team, knowledge, etc.). Depending on the task at hand, you might want to have a look at other, more specific products that could work better for ETL or data processing (for instance, Dataflow/Beam on the Google platform).


My 2018 in review

My main objective in 2018 was to go deep into Machine Learning (continuing 2017's focus). When the year started, I decided to organize my free time into small 1-month projects. The original idea was to start with Deep Learning too, but I ended up exploring other fields, like data engineering.

In February I tried different approaches to develop an ML model for the famous Titanic Kaggle competition, where you have to predict the survival of passengers given some data. It was really fun, because I explored different ideas, but I ended up with a quite over-engineered notebook. Later I realized how important it is to identify the noisy features you have to ignore.

In March I decided to improve my Python skills, so I set myself a challenge based on intuition: try to group products that are bought together, using real data from work (Ulabox, an online supermarket). I enjoyed creating sparse matrices with scipy and doing matrix operations with numpy, which was a good refresher of the maths. The result was a nice dendrogram showing that some vegetables are bought together, as well as some types of yogurt.
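The original notebook is not reproduced here, but the rough shape of the approach was something like this (the CSV layout, the distance formula and the linkage method are just assumptions for the sketch):

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.sparse import csr_matrix
from scipy.cluster.hierarchy import linkage, dendrogram
from scipy.spatial.distance import squareform

# orders.csv: one row per (order_id, product_id) pair -- hypothetical layout
df = pd.read_csv("orders.csv")
orders = df["order_id"].astype("category").cat.codes
products = df["product_id"].astype("category").cat.codes
product_names = df["product_id"].astype("category").cat.categories

# Sparse products x orders incidence matrix
incidence = csr_matrix((np.ones(len(df)), (products, orders)))

# Product-product co-purchase counts, then hierarchical clustering
co_purchase = (incidence @ incidence.T).toarray()
distance = 1.0 / (1.0 + co_purchase)          # crude "bought together" distance
np.fill_diagonal(distance, 0.0)
Z = linkage(squareform(distance, checks=False), method="average")
dendrogram(Z, labels=list(product_names))
plt.show()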

In May I created a simple notebook to solve the Titanic competition, but with one idea in mind: help my workmates join a competition and get excited about ML. So I wrote the simplest code that worked, while still trying to show an eye-catching result. I tried plotting a simple decision tree, with great results: both coworkers joined the session, and other Kaggle users upvoted my notebook.
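That notebook isn't shown here either; a minimal sketch of the "simplest thing that works, with an eye-catching plot" idea, using scikit-learn's tree plotting (the feature selection and preprocessing are assumptions), would be something like:

import pandas as pd
import matplotlib.pyplot as plt
from sklearn.tree import DecisionTreeClassifier, plot_tree

train = pd.read_csv("train.csv")                        # Kaggle Titanic training data
X = train[["Pclass", "Sex", "Age", "Fare"]].copy()
X["Sex"] = (X["Sex"] == "female").astype(int)           # simple encoding
X["Age"] = X["Age"].fillna(X["Age"].median())
X["Fare"] = X["Fare"].fillna(X["Fare"].median())
y = train["Survived"]

model = DecisionTreeClassifier(max_depth=3).fit(X, y)   # shallow tree: easy to read
plt.figure(figsize=(14, 6))
plot_tree(model, feature_names=list(X.columns), class_names=["died", "survived"], filled=True)
plt.show()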

In June I bought a new computer with a GTX 1080, getting ready to jump into Deep Learning. I tried some tutorials (TensorFlow and PyTorch), but I didn't like starting from level 0, that is, creating my own neurons from scratch; I had already learned about neural networks years ago, at university. Almost at the end of 2018 I finally found a book with the level I was looking for: Advanced Deep Learning with Keras.

Regarding conferences: in July I attended PyData Berlin thanks to my employer (who paid for the tickets). In September I also attended DataEngConf in Barcelona, which really matched my company's needs: making a data engineering plan. In October I took a train to Paris and then another to Karlsruhe to attend PyConDE; this conference was really well organized, with talks on a wide range of concepts and an incredible venue: a digital art museum with thought-provoking exhibitions about the future we are building.

The most interesting books I read this year came as suggestions from conference sessions: one is Lean Analytics and the other is Data Engineering Teams. During 2018 I read some non-work-related books too, most of them sci-fi novels (like The Expanse, books 3 and 4).

During the summer I continued improving my knowledge of Python, using libraries to create images and videos. I also joined a MOOC about Google Cloud Platform (a need from work).

I sent 3 proposals to different Calls for Papers during the year, and was lucky enough to get selected to run a workshop in November in Barcelona, during the unforgettable PyDay. I prepared a practical introduction to NLP, using classic and modern methods to classify texts. I chose Spanish jokes as the corpus to work on, and the result was amazing: both the audience and I enjoyed the workshop a lot.

Finally, in December I took a rest from tech stuff… and got married 🙂


Teaching students about real industry work

Some months ago I had the chance to teach university students about how we develop in the real world, as part of a "companies' seminars" event.

There is an ongoing discussion in our industry: do you need a Computer Science degree to become a successful developer? Some people say that the subjects taught at university become outdated quickly, basically due to the lightning speed of technology, and that nowadays a JavaScript course is enough to learn to program. Other people say that you must spend 4-5 years at university.

I'm on the side of formal university education. Students need foundations to truly understand how things work. But it's also true that they need to know how the industry really works. Virtualization, code versioning, code quality ("clean code"), trade-offs, etc. are subjects that, unfortunately, are not taught at university.

During the seminar I taught the students about general subjects, like the trade-offs we have to choose in our company, but also about the latest trending technologies, like Docker. Anyhow, their favorite subject was my introduction to clean code, which opened their eyes. Let's hope it will inspire them.

Here are the links to the slides I used:
Professional development
Clean Code
OOP and SOLID principles
Introduction to docker
Seminar conclusion

The best advice I gave them: Find a job in a company where you can learn.


Remote working effectively

Some months ago my coworkers asked me to share my experience working remotely. We work in a normal office, but I had worked from home for 5 years, from Barcelona, Seoul and Mexico DF. So I prepared a simple presentation about the main issues to consider if you want to try working from home.

After the presentation an interesting discussion followed. Some of my mates had worked as freelancers in the past, and had arrived at similar ways of managing their working time. That's the key point when working from home: controlling, by yourself, how and for how long you are productive.


Do you test your tests?

The first time I read about serious testing was in The Pragmatic Programmer. The book explains the usual (boring) benefits of testing, but a twisted detail caught my eye: also test your tests. Testing is a net that helps you change the code without breaking the logic, and as with a real-life net, you should verify it works as expected. Tests should be in a tight relationship with the code.

When is a test good? Finding the differences between a good test and a bad one is not obvious. Looking for gaps or anti-patterns in our tests is a good way to improve them.

Thanks to PHPUnit and Xdebug, the PHP community started to care about testing years ago. Since then, the easiest way to show the quality of a test suite has been code coverage, that is, the percentage of the code the tests exercise. That worked until programmers started to chase 100% coverage, creating artificial tests that don't properly exercise the logic but instead produce a fake 100% line coverage. If a line is executed once, even if the subject was a different test unit that happens to use that class, the line "is tested".

Are you really testing each class? Following its logic? Even if you use proper unit tests, you may be missing things.

Let's start with a stupid example: a function that does an "AND", and a test that gets 100% coverage:
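The snippet was originally embedded from the repo and is not reproduced here; as a stand-in, here is the same idea sketched in Python (the post's real examples were PHP/PHPUnit):

def logical_and(a, b):
    if a and b:
        return True
    return False

def test_logical_and():
    # Two assertions are enough to execute every line above at least once.
    assert logical_and(True, True) is True
    assert logical_and(True, False) is False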

The test is only exercising 2 cases! An "AND" actually has 4 possible input combinations, so the 2 missing ones (false-true, false-false) were totally ignored, despite the 100% line coverage.

This was a basic example to show the difference between line coverage and path coverage (in this case, 4 possible paths). The good news is that Derick Rethans is working on it. I wonder how many programmers will be surprised to see how low their code's path coverage is.

Another way to test your tests is to change the source code and see whether the tests fail (they should!). This is called Mutation Testing, and it helps to detect when a test is not working perfectly. For instance:
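Again the embedded snippet is missing; keeping the original function name, a Python stand-in for the PHP example could be:

def biggerThat5(number):
    return number > 5

def test_biggerThat5():
    assert biggerThat5(10) is True
    assert biggerThat5(2) is False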

This test looks complete, but there is no test for the boundary case, biggerThat5(5). The test is not really thorough.

In the PHP ecosystem there are only 2 available Mutation Testing tools: Humbug and Mutatesting. The second one, despite its author also being the creator of the excellent PHP-metrics, seems abandoned.

So the only real option is Humbug, developed by the author of Mockery. Unluckily, it only works with PHPUnit for the moment. It basically finds places where the code can easily be changed, like a true for a false, or a number N for N+1, and runs the tests to see if that mutation is killed (that is, the tests fail). For instance, in the previous example it changes 5 to 6, and the tests still pass, so the mutation was not killed.

I just hope these tools become more popular, in order to improve the quality of our industry. And let's hope Humbug will soon work with PHPspec too, as many companies are moving from PHPUnit to Behat and PHPspec.

The code for this post can be found in its GitHub repo.


How I looked for a new tech job

Back in Barcelona after 3 years working remotely, I decided to look for a new in-office job. But I followed an uncommon way of searching for one.

First I had a look at job websites, but only to get an idea of which technologies are popular in Barcelona. Symfony was the most remarkable one. But I didn't apply to any of the offers I saw there.

I don't want to work at a company just because they opened a job offer. I want to work at a company with great developers to learn from, and a product I'm passionate about. Actually, I discovered that sometimes good companies need more developers but have no time to publish openings.

So I looked for a list of local companies. For Barcelona, I found http://internetmadeinbcn.org/ (¹), and started to make a list of the interesting ones.

I also started to join programming events like conferences and talks, meeting people there. The idea was to find companies whose technical level is a bit above my skills (²). If you have the chance to join one of them, you'll improve greatly.

Send them your CV with a well-prepared cover letter (email). Some of them will contact you back, and the real fun begins: TECH interviews! Usually the process starts with a tech test, where you have to program something in a short time (between 1 and 4 hours). By the way, if the company does not ask you to do a test, run away (read the reasons in the Joel Test); I once joined a company that didn't ask for one, and 3 weeks later I quit because their code was not good to learn from.

True fact: you will do horribly in your first interview.

However, you will learn a lot from interviews and tech tests, especially if you ask for FEEDBACK! In my experience, only half of them will send you some feedback (following Sergey Brin's style: "make the candidate learn something"). Feedback is pure gold. It's the best way to learn from other developers working in the industry.

In my case, I learned some new code design ideas through the interviews. Moreover, I ended up reading about DDD and BDD. For instance, in my first interview (3 months ago) they asked me about the meaning of BDD and I had no idea; but in the last tech test, totally based on Behat, I was able to code comfortably.

I can only say thank you to the few companies that gave me valuable feedback, even if they didn't hire me. I'm better now thanks to them. Somehow they helped me sharpen my skills!

Summing up, the “always learn” mantra should be applied to the process of looking for a new job too.

AND I GOT A NEW JOB, yeah!

Notes:
(¹) For other cities in Europe, you may want to have a look at tyba.com.
(²) This is actually an idea borrowed from my Korean classes in Seoul. The teacher always speaks using a few more words and expressions than the students know, so the students keep pushing all the time to reach that level. However, students can get exhausted by that drowning feeling.


VIM plugins for web development

Recently I bought a new laptop, and while configuring my tools I noticed I hadn't cared about VIM plugins for years. It was the perfect time to have a look at the most interesting ones.

First I installed Pathogen to manage VIM plugins. It allows you to install other VIM plugins in separate directories, so you avoid the mess the .vim directory can become.
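If I recall Pathogen's README correctly, the setup boils down to cloning each plugin under ~/.vim/bundle/ and adding a couple of lines to ~/.vimrc:

execute pathogen#infect()
syntax on
filetype plugin indent on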

Regarding general plugins, I installed NERDtree, a directory tree explorer, ideal for keeping a general view of your project. I also installed Airline, which shows an improved status bar with lots of information about the current file. However, I couldn't see it at first, and I had to add the following line to my ~/.vimrc file:
set laststatus=2

When it comes to writing code, the most useful plugin you can install is SuperTab, which improves auto-completion with the tab key. It works really well paired with Ultisnips, which (as its name suggests) allows you to save snippets of code and recall them later. I also added the transparent Skeletons plugin: when you create a new file, it gives you a template to start with, depending on the type of file you created.

Time to have a look at web development. To work with HTML, I installed matchit (a classic) but also HTML5, which adds an omnicomplete function, indentation and syntax for HTML5 and SVG. As I sometimes use Less to write improved CSS, I added the Less plugin, a single file with its syntax.

The PHP programming section is based on PIV (PHP Integration for VIM), which includes various plugins. Actually, I disabled some of them to fit my needs. Finally, I added Syntastic: every time you save a PHP file, it automatically checks it with PHP's lint (which finds syntax errors), PHP Code Sniffer (which flags style errors) and PHP Mess Detector (which suggests improvements), showing the results inline.

Am I missing any other essential plugin? I guess not; with these tools I feel well equipped to develop any project.

Update: I forgot to list xDebug (an interface for PHP’s xDebug).


Adminer, a compact database manager

tl;dr If you use phpMyAdmin, stop right now and give Adminer a try.

If you have several customers with different hostings, you have surely experienced problems accessing their MySQL databases. The situation goes like this: you get a new support ticket and ask for hosting access, but usually you just get FTP access. So now you have to ask again for the phpMyAdmin URL. Or you get access to their hosting panel, but there is just a horrible DB manager.

What can you do? Deploying phpMyAdmin yourself on a customer's hosting is a nightmare and can take several minutes. That makes no sense, especially if you just need to check a tiny detail.

Adminer is the solution, as it is just one file. That means uploading it takes just a couple of seconds, and in a moment you are operating on the database. Magic.

I don't really remember how I discovered Adminer, but as soon as I started using it, it became an everyday tool. Actually, I don't remember the last time I used phpMyAdmin. Adminer being a single file is perfect, but it also comes with a lot of features that surpass phpMyAdmin. And it's under active development, so each new version brings new features, like support for more databases (PostgreSQL, SQLite, MS SQL, Oracle, SimpleDB, Elasticsearch and MongoDB).

Finally, there is a cut-down version, Adminer Editor, which only allows simple CRUD operations. That's perfect if you need to quickly give your customer a way to edit the content while you develop a proper back office.


Holidays

Which one is better?
a) Use all (or most) of your free days in a row. You totally disconnect from work issues, but when you come back you'll have a flooded inbox.
b) Use them little by little, taking small breaks during the year. You never totally forget about work, but you recover some energy several times.

Basically it depends on how many days you need to reach a real state of disconnection. And this also depends on what you are going to do during your holidays: if they include a lot of activities, you won't need much time to remove any trace of work from your mind, but if you're going to spend your time at home or close to your office, you'll surely need more days.

So the key is to find the average number of days you need to disconnect, and take as many breaks of that length as you can… if your employer lets you.