Practical XGBoost in Python

It's doing great, but … can it do better?


I bet you have all heard that more than half of Kaggle competitions were won using a single algorithm [source].

You probably even gave it a try. It’s so easy to get excited about the dream of getting to the top of the leaderboard. Imagine all that fame and fortune, ahh.

Let’s get real. You made it work. The submitted results are good but definitely not the best. Then motivation falls.

Don’t give up! You know that it is possible to get more out of it. Start digging deeper.

Time (and web pages) pass by. You are overwhelmed and even more confused than at the beginning. There are a lot of detailed, laser-focused guides, but it’s hard to find a more general, easy-to-follow one.

It starts to resemble a Dilbert story.

On top of that, more and more related questions start to form:

  • "I’m dealing with an imbalanced dataset, which parameters to tune and how?"
  • "My data contains missing values - does XGBoost handle them?"
  • "Gradient Boosted Tree? What the hell is it?"
  • "How can I evaluate my results to be more confident that I’m not overfitting?"
  • "The XGBoost version in the repo hasn’t been updated for a year. Maybe I should install the latest version from source?"

I have been there.

I saw a lot of these questions asked by other people in different groups and forums (phew, I was not alone). These are common issues. My “Read later” browser bookmark list was getting longer and longer.

And guess what.

I read it all.

Some of them were super boring, some very inspiring. I was determined, and the time spent should not go to waste.

A guy named Seneca (a Roman philosopher) once said: "While we teach, we learn".

So, after spending 100+ hours exploring all the possible pitfalls, I present to you…

Practical XGBoost in Python

A 100% free online course that will show you how to use one of the hottest algorithms in 2016. You will learn things like:

  • how the algorithm works, explained in layman's terms,
  • using it both with a native and scikit-learn interface,
  • figuring out which features in your data are most important,
  • dealing with the bias/variance tradeoff (the overfitting problem),
  • evaluating algorithm performance,
  • dealing with missing data,
  • handling imbalanced datasets
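
To give a taste of the last topic: a common starting point for an imbalanced binary dataset is XGBoost's `scale_pos_weight` parameter, which the XGBoost documentation suggests setting to roughly the ratio of negative to positive examples. The sketch below is not taken from the course materials. The helper name `suggest_scale_pos_weight` and the toy labels are made up for illustration, and the code only builds a parameter dictionary without training anything.

```python
from collections import Counter


def suggest_scale_pos_weight(labels):
    """Return the negative-to-positive ratio for binary labels (0 = negative, 1 = positive)."""
    counts = Counter(labels)
    negatives, positives = counts[0], counts[1]
    if positives == 0:
        raise ValueError("at least one positive example is needed")
    return negatives / positives


# Hypothetical toy labels: 90 negatives and 10 positives.
labels = [0] * 90 + [1] * 10

# Parameter dictionary in the style used by XGBoost's native interface.
params = {
    "objective": "binary:logistic",
    "scale_pos_weight": suggest_scale_pos_weight(labels),
}
print(params)
```

A dictionary like this could then be passed to the native interface (for example `xgboost.train`) together with your own data. Treat the ratio as a starting value to tune, not a final answer.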

Each topic is described from A to Z in a fully reproducible way. It starts with loading a data set and takes you through all the steps. By the end, you will have a clear picture and be able to apply the technique to your own cases.

Go through the video materials and learn how to harness the algorithm to make it work for your data.

Your Instructor

Norbert Kozłowski

Hi, I'm Norbert. I have worked with code since my early days. I started with Turbo Pascal when I was about 10 years old and haven't stopped since.

Combining software craftsmanship and data engineering skills results in clean and understandable code.

Want to hear more? Ping me on Twitter or connect on LinkedIn.

XGBoost has proven its power in many competitions. It might be tempting to jump in right away, but please take the time to read the recommended prerequisites first.

This course is for you if:

  • you want to understand the mechanics of the methods used,
  • you don’t want to get buried in math equations,
  • you respect your time (stop wasting it on side things like compiling sources),
  • you are focused on getting the job done,
  • you want to know the proper approach when dealing with common machine learning issues specific to XGBoost

You shouldn’t take it if you:

  • don’t have elementary computer skills (you can install Git and Docker on your machine, can't you?),
  • expect immediate results (remember that all great skills come with practice),
  • have never seen the Python language before

"Very clear, well-structured and informative, even with a brief read through I can already pick up some helpful knowledge - i.e. how to handle imbalanced dataset"

Yifan Xie
Project Manager at Airbus

Frequently Asked Questions

What software do I need?
For the sake of reproducibility, I'm giving you access to a personalized Docker image for provisioning the environment. You should be able to run it on your operating system. If you don't want to (or can't), you will have to install all the required libraries manually. You should also have Git installed to download the necessary course materials.
When does the course start and finish?
The course starts now and never ends! It is a completely self-paced online course - you decide when you start and when you finish.
What background is required to join the course?
It would be nice if you had some basic Python skills.
How long do I have access to the course?
How does lifetime access sound? After enrolling, you have unlimited access to this course for as long as you like - across any and all devices you own.
For the things we have to learn before we can do them, we learn by doing them.


“Practical XGBoost in Python” is part of Parrot Prediction’s ESCO Courses, a collection of online data-science courses guided in an innovative way.

The main point is to gain experience from empirical processes. From there we can build the right intuition that can be reused everywhere.

Remember that knowledge without action is useless.


Get started now!