Image credit: Unsplash

Machine Learning — Don’t Just Rely on Your University

The drastic contrast between formal and online courses.

The Problem with Learning in a University

Machine learning for predictive analytics is in high demand because it gives businesses a competitive edge. The topic is heavily subscribed by undergraduates all over the world. However, being formally introduced to the concepts and techniques of machine learning at a university may prove extremely daunting for the average undergraduate.

During my undergraduate winter exchange at McGill University, I enrolled in their Applied Machine Learning course. Yes, it was foolish of me to enroll in a graduate-level course! It started with an introduction to linear regression, as it should, since it is the most basic machine learning algorithm out there. The tutor did a great job attempting to explain the intuition behind the algorithm.

But after a ton of heavy math behind the algorithm had been introduced, the finishing blow came: “Oh, by the way, assignment 1 is out, it’s due in 10 days’ time, and you are encouraged to complete it in Python. See you next lecture.” How was I supposed to learn the concepts of L2 regularization and fitting regression models with stochastic gradient descent in 10 days? I didn’t even know how to code in Python!

I was so overwhelmed by the second lecture that I immediately dropped out. The course assumed solid background knowledge in statistics, calculus, linear algebra and programming. And sure enough, all 4 elements appeared in the very first lecture.

It doesn’t help that I was one of the many students who learn mathematical concepts and then forget the majority of them over the summer/winter break!

The Discovery of Online Courses

Not allowing a little setback to crush my interest in data science, I searched online for resources that teach machine learning in a way that is friendly to members of the general public who are “interested enough”. There are a lot of great online courses out there: some are free, some are paid, and some are formally certified. I went for the paid route because I find genuine passion in instructors who provide paid courses.

The online course I enrolled in was Kirill Eremenko’s Machine Learning A-Z™: Hands-On Python & R In Data Science. Kirill did an absolutely fantastic job of providing easy-to-understand intuition for every algorithm introduced in the course, at a level that even a high school student could follow. Each algorithm is supplemented with a hands-on coding session on close-to-real-life data, which helped me visualize and understand its significance.

Motivated, I enrolled in another machine learning course at my home university and found it a lot easier now that I had built an intuitive foundation.

In the following section, I would like to contrast what I learnt online with what I was taught at my university as an undergraduate. I hope this article will inspire students who are considering extending their learning beyond the walls of their universities, as well as working professionals who seek further knowledge in the realm of machine learning.

Disclaimer: The following comparison is limited to learning materials from 2 sources: my university and Kirill’s Machine Learning course. I therefore apologize for the biased view presented in this article; it should not be taken as a general phenomenon that occurs in all formal versus online learning.

So What’s Different?

To illustrate the difference between formal learning and online courses, let’s look at a few common machine learning concepts and see how they are explained at my university versus how they are approached in Kirill’s course. The following list is a snapshot of what I found challenging to understand at university but that was completely demystified the moment I watched Kirill’s videos.

Linear Regression

This is how my university attempts to explain linear regression in their lecture slides:


Throwing in mathematical notations as a general introduction to linear regression makes it difficult to understand (Credits: Pun Chi Seng Patrick)

To the experienced, this is child’s play. But to anyone who just started with some level of statistics and calculus, it takes quite a while to digest this information, especially with the lack of good visuals.

On the other hand, the online course dived straight into a business problem with a step-by-step animated explanation:

Explaining the relationship of working experience and salary makes the intuitive understanding of simple linear regression so much easier (Credits: Kirill Eremenko)

A simple 3-minute video explaining the intuition behind ordinary least squares is enough to get anyone to understand its significance.
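To make that intuition concrete, here is a minimal sketch of ordinary least squares for a single predictor. The experience and salary figures are made up purely for illustration, and `fit_simple_ols` is a hypothetical helper name, not something taken from either course.

```python
# Hypothetical data: years of experience vs. annual salary (in thousands).
experience = [1.0, 2.0, 3.0, 4.0, 5.0]
salary = [40.0, 45.0, 52.0, 55.0, 63.0]


def fit_simple_ols(x, y):
    """Return (intercept, slope) of the line minimizing the sum of squared residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Closed-form solution for one predictor:
    # slope = covariance(x, y) / variance(x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_simple_ols(experience, salary)
print(f"predicted salary = {intercept:.2f} + {slope:.2f} * experience")
```

On this toy data the slope reads as "each extra year of experience adds about 5.6 thousand to the predicted salary", which is the kind of statement the salary example in the video builds towards.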

Evaluating Linear Regression Models Using the R-Squared Statistic

The R-squared statistic, a measure of goodness of fit, is widely used in the evaluation of linear regression models. The sentence in red below perfectly sums up the use of the R-squared statistic:


Explaining the use of R-squared statistics to evaluate generalized linear regression models (Credits: Pun Chi Seng Patrick)

However, I find it a lot easier to understand from the visuals presented in Kirill’s course:

A visual representation of R-squared statistics presented by Kirill (Credits: Kirill Eremenko)

In layman’s terms, the R-squared statistic simply compares the best-fit line with the “worst-fit line”, which is the constant average of the data. The lower the residual sum of squares relative to that baseline, the higher the R-squared value. The adjusted R-squared adds a penalty for additional predictors that are irrelevant to the model, which makes it suitable for multiple linear regression models.
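The comparison can be written out in a few lines. This is only a sketch using the standard textbook definitions; the observed and predicted values are invented for illustration, and the function names are my own.

```python
def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot, where SS_tot is the error of the constant-average line."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n_samples, n_predictors):
    """Penalize R^2 for the number of predictors used by the model."""
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


# Hypothetical observed values and predictions from a fitted line.
y_true = [40.0, 45.0, 52.0, 55.0, 63.0]
y_pred = [39.8, 45.4, 51.0, 56.6, 62.2]

r2 = r_squared(y_true, y_pred)
adj_r2 = adjusted_r_squared(r2, n_samples=len(y_true), n_predictors=1)

# Predicting the average for every point is the "worst-fit line": R^2 is 0.
mean_only = [sum(y_true) / len(y_true)] * len(y_true)
baseline_r2 = r_squared(y_true, mean_only)

print(f"R-squared: {r2:.4f}, adjusted: {adj_r2:.4f}, baseline: {baseline_r2:.1f}")
```

A model that does no better than the average scores 0, a perfect fit scores 1, and the adjusted value is always at or below the plain R-squared when the model has at least one predictor.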

Support Vector Margin Classifier

Understanding the concept of support vector machines was easily the hardest task in my university’s machine learning course. The bombardment of heavy mathematics made it seem impossible to sit through and understand the concept in one go. Take a look at an extract on the maximal margin classifier below:


The heavy math behind the computation of a maximal margin classifier (Credits: Pun Chi Seng Patrick)

Now, in contrast, we see the beauty of how a simple illustration by Kirill nailed the intuition of maximal margin classifiers:

Credits: Kirill Eremenko
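The idea behind that picture can also be sketched in code: among the lines that separate the two classes, the maximal margin classifier prefers the one whose closest point is furthest away. The snippet below only compares two hand-picked candidate lines on made-up 2-D points; it does not solve the optimization problem from the lecture slides, and all names in it are hypothetical.

```python
import math

# Hypothetical 2-D points with class labels +1 and -1.
points = [(2.0, 2.0), (3.0, 3.0), (0.0, 0.0), (-1.0, 0.0)]
labels = [1, 1, -1, -1]


def geometric_margin(w, b, points, labels):
    """Smallest signed distance from any point to the line w . x + b = 0.

    A positive result means every point is on the correct side of the line.
    """
    norm = math.hypot(w[0], w[1])
    return min(
        y * (w[0] * x1 + w[1] * x2 + b) / norm
        for (x1, x2), y in zip(points, labels)
    )


# Two candidate separating lines, chosen by hand for illustration.
margin_a = geometric_margin((1.0, 1.0), -2.0, points, labels)  # x1 + x2 = 2
margin_b = geometric_margin((1.0, 0.0), -1.0, points, labels)  # x1 = 1

print(f"margin of line A: {margin_a:.3f}")
print(f"margin of line B: {margin_b:.3f}")
```

Both lines classify every point correctly, but line A leaves more room on either side, so it is the better of the two by the maximal margin criterion. The points that sit exactly at that minimum distance are the support vectors.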

Alright, you get the picture. There are plenty more comparisons I could make, but I will spare you the boredom! By now you might think that you would benefit more from learning online than from pursuing formal education at a university. But personally, I feel that completing my university’s machine learning course made me far more employable than merely going through Kirill’s course.

But Formal Learning Can’t be THAT Bad, Right?

A machine learning course is not complete without a final project to work on. Kirill’s course provides easy-to-understand intuition and sample code in both Python and R to follow along with. Completing his tutorials gave me the satisfaction of knowing that I could make predictions with an arsenal of machine learning algorithms.

But it’s not enough.

The Importance of a Graded Final Project

A graded final project is what differentiates formal and online learning. We were tasked to design a machine learning problem from publicly available datasets. The following flowchart highlights the process of dealing with a good machine learning project that closely resembles real world application:


Our hands-on experience in modelling a public dataset was supplemented with project consultation and support from the professor and TAs, which enriched our learning with a personalized touch. During the project, I was glad that I had a collection of machine learning scripts from Kirill’s course ready to be used for data pre-processing and modelling.

The Importance of Getting Used to Heavy Math

From statistical inference to hyperparameter tuning, the magic of a high-performing model lies in the deep understanding of the complex mathematics behind it.

Questions such as: “What is the optimal learning rate and weight decay for my ANN model?” and “What is the optimal number of trees required for a good random forest regression model without overfitting?” can only be answered with an adequate amount of mathematical knowledge.

In a Nutshell

The demand for data scientists to provide business and product intelligence is on the rise (Source: PredictiveAnalyticsWorld)

From building recommender systems to implementing computer vision in AI-enabled products and services, the use of machine learning algorithms continues to rise. The lucrative AI industry has piqued the interest of universities, and specialized degrees in data science have started popping up. Take my university’s recently offered data science degree, for instance.

As people become more aware of this demand, they start to venture into the unknown. For an average undergraduate unaware of the technicalities behind this blend of math and computer science, struggling through the course is only natural. Many available online courses can demystify the application of machine learning techniques. Unfortunately, the hard truth is that hardly any employer cares whether you hold an online course certification that is barely recognized. Your formal qualifications will always be the priority of an HR executive, especially one who is not technically inclined.

Another hard truth is that most data science job postings require higher education for a candidate to even be considered. Powering through an undergraduate course that involves machine learning has already proved to be a challenge for many. But I hope this article can inspire readers to rely less exclusively on formal education and to know that there are tons of very useful resources out there to supplement their learning in data science and machine learning.

Bobby Muljono
Data Analyst

Just an average Joe with a passion in data science