Surviving the Rocky Road of Data Engineering

When you wake up one day and decide to be a Data Engineer, the obvious first step is to learn your SQL and your Python, like a good student (and yes, you should do this). Perhaps the adventurous ones out there will try their hand at data modelling (some of you will get good at […]

Why Deadlines Make This Data Engineer Tick (Here’s Why)

Deadlines are all about helping yourself. Deadlines are excuse-busters. Deadlines keep you on point and in focus. Well, that’s how I see them, anyway. Not everyone will agree, but I love deadlines. Both for work and for my personal side projects. They keep me in check. They keep me honest. With any project, there’s one thing you […]

The 3 Biggest Realisations I’ve Had As a Data Engineer

Self-Belief, Self-Improvement, and Self-Education. One of my favourite proverbs is a Haitian one about overcoming obstacles: Beyond mountains, more mountains. If you work in data engineering, I think you can agree that you can easily apply that wisdom to a data engineer’s journey: when you think you’ve grasped one thing or solved one problem, something […]

Breaking Free from the Data Engineering Learning Loop

Seasoned data engineers know something you don’t, but I’m going to let you in on the secret. You learn more by doing. That’s it, that’s the secret. But the trouble is, there is a lot of bad advice out there. Advice that says you just need to follow this 3-step plan, do this “step by step”, […]

3 Lessons for Junior Data Engineers

It’s been a few years since I began my Data Engineering journey. There have been a lot of ups and downs along the way. Here are 3 things I wish I’d known from day one: 1 — Leverage your strengths to gain an edge We are all different, and we all have our own strengths […]

Why Data Engineers Shouldn’t Be Afraid to Say “I Don’t Know”

The three words that will serve you well are: “I don’t know.” We can’t know everything. We just can’t; it’s not possible. The sooner you make peace with this fact, the better. This has never been more true in the field of Data Engineering. The data industry is exploding. Every other day, some new […]

A Data Engineer’s Switch from AWS to GCP: First Thoughts

I’ve always been an AWS kind of guy, not by choice, but because everywhere I worked had AWS. Sure, there were a few places that had Azure (cue the developers in the room grumbling about Microsoft). Azure was doing something or other, but I just didn’t have any exposure to Azure […]

Useful Git commands every Data Engineer should know

There is no getting around it. If you are working as a Data Engineer, you will be using some form of source control to manage and track your pipeline code changes, handle your deployments (CI/CD), collaborate with your team, push infrastructure changes, and document your pipelines. If you are not using source control, stop reading […]

How To Pin Public Datasets to a Project in Google BigQuery

I recently started spending more time in GCP, so this is a short and sweet post on how to pin the bigquery-public-data dataset in Google BigQuery in three easy steps. I clicked around for ages trying to figure this out, but a little bit of googling and some dumb luck got me there. […]