Top 35 Interview Questions for Machine Learning

Clearing a machine learning interview is not as tough as it might seem. All you need is a strong foundation in Machine Learning. But sometimes, in spite of having a strong base, you might find yourself in trouble during the interview.

The interviewer will focus on the basics first and, as the interview proceeds, will challenge you with more and more difficult questions. The interviewer may also try to put you under pressure. But you need not worry.

To help you tackle this, I have compiled a list of Machine Learning interview questions.

I have divided this list of questions into 2 sections: algorithm and theory based, and practical based.

I hope by the end of this blog you will have answers to all the important interview questions on Machine Learning which will help you in landing that dream job.

Algorithm and Theory based Machine Learning interview questions

Q1. What are the different types of Machine Learning Algorithms?

There are three different types of Machine Learning algorithms:

i. Supervised Learning: The machine learns under the guidance of labelled data, that is, data in which each example is tagged with the correct answer. The model makes predictions and decisions based on this past data. It can be further divided into two types: Classification and Regression.

ii. Unsupervised Learning: No labelled data is provided. The model is given only the input data and has to find structure in it on its own. Clustering algorithms are the most common example.

iii. Reinforcement Learning: The machine learns by trial and error, using the rewards or penalties it receives for its previous actions.

Q2. What is overfitting and how can you avoid it?

Overfitting is when the model learns the training data too well. It happens when the model learns the details and noise in the training data to an extent that it negatively impacts the performance of the model on new data. The most popular solutions to prevent overfitting are:

  • Cross-validation
  • Training with more data
  • Removing features
  • Early stopping: halting training once performance on validation data stops improving
  • Regularization, so that your model stays simpler
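To make the idea concrete, here is a minimal sketch that compares an unrestricted decision tree with a depth-limited one. The synthetic dataset, the noise level and the depth limit of 3 are my own assumptions for illustration, not recommended settings.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data with 20% of labels flipped, so there is noise to overfit (assumed setup).
X, y = make_classification(n_samples=400, n_features=20, n_informative=5,
                           flip_y=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# An unrestricted tree can memorise the training set, noise included.
deep_tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
# Limiting the depth keeps the model simpler, a basic form of regularization.
shallow_tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)

for name, model in [("deep", deep_tree), ("shallow", shallow_tree)]:
    print(name,
          "train accuracy:", round(model.score(X_train, y_train), 2),
          "test accuracy:", round(model.score(X_test, y_test), 2))
```

A large drop from training accuracy to test accuracy for the deep tree is the typical sign of overfitting.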

Q3. What is the difference between classification and regression in Machine Learning?

Classification and regression are the two main prediction problems which are most commonly faced while using Machine Learning.

Classification:

  • It is the process of finding a model or function which helps in separating the data into multiple categorical classes, i.e. discrete values.
  • The nature of the predicted data is unordered.
  • It is evaluated by measuring accuracy.
  • Example: Decision tree, logistic regression etc.

Regression:

  • It is the process of finding a model or function which maps the data to continuous real values instead of classes or discrete values.
  • The nature of the predicted data is ordered.
  • It is evaluated by measuring the root mean square error.
  • Example: Regression tree, linear regression

Q4. What is “Training set” and “Test set” in Machine Learning?

Training Set:

  • It is the set of examples given to the model to analyze and learn from.
  • A common choice is to take about 70% of the total data as the training dataset.
  • This is the labelled data we use to train the model.

Test Set:

  • It is used to test the accuracy of the hypothesis generated by the model.
  • The remaining 30% is taken as the testing dataset.
  • The model makes predictions without seeing the labels, and we then verify the predictions against the labels.
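A minimal sketch of a 70/30 split with scikit-learn is shown here. The Iris dataset and the fixed random_state are assumptions chosen only to keep the example reproducible.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)   # 150 labelled examples

# Hold out 30% of the data for testing; the remaining 70% is used for training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

print(len(X_train), "training examples,", len(X_test), "test examples")
```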

Q5. What is Linear Regression?

Linear Regression is a supervised Machine Learning algorithm used to find the linear relationship between the dependent and the independent variables for predictive analysis. The Linear Regression equation is given as:

Y = A + B.X

where:

  • X is the input or the independent variable
  • Y is the output or the dependent variable
  • A is the intercept and,
  • B is the coefficient of X
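As a quick illustration, the sketch below recovers A and B from a handful of points with NumPy. The data is made up so that the true values are A = 2 and B = 3.

```python
import numpy as np

X = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
Y = 2.0 + 3.0 * X          # generated with A = 2 and B = 3, no noise

# polyfit returns the coefficients from the highest degree down: [B, A].
B, A = np.polyfit(X, Y, deg=1)
print("intercept A:", round(A, 2), "coefficient B:", round(B, 2))
```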

Q6. What are Bias and Variance?

Bias is the error that comes from overly simple assumptions in the learning algorithm.

“Bias is the algorithm’s tendency to consistently learn the wrong thing by not taking into account all the information in the data (underfitting).”

A high bias means that the predictions will be inaccurate. Hence the bias should be as low as possible to make accurate predictions.

Variance is the sensitivity of the model to the particular training data it has seen. In practice, it shows up as a change in the prediction accuracy of the Machine Learning model between training data and test data.

Simply put, if the ML model prediction accuracy is “X” on training data and its prediction accuracy on test data is “Y”, then a rough indicator of variance is the gap

X - Y

A large gap suggests high variance (overfitting).

Q7. What is the difference between inductive and deductive learning?

Inductive learning uses observations to draw conclusions: it generalizes from specific examples to a rule.

Deductive learning starts from conclusions (general rules) and applies them to specific observations.

Q8. What is Variance Inflation Factor?

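Variance Inflation Factor (VIF) is an estimate of the amount of multicollinearity in a set of regression variables.

It is given as:
VIF = Variance of a coefficient in the full model / Variance of that coefficient in a model with that single independent variable.

Equivalently, for each independent variable, VIF = 1 / (1 - R²), where R² is obtained by regressing that variable on all the other independent variables.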
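The sketch below computes VIF with plain NumPy using the 1 / (1 - R²) form. The helper name vif and the synthetic columns are hypothetical, chosen so that one column is nearly a copy of another.

```python
import numpy as np

def vif(X, j):
    """VIF of column j: regress it on the other columns and return 1 / (1 - R^2)."""
    y = X[:, j]
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(y)), others])   # add an intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ coef
    r2 = 1.0 - residuals.var() / y.var()
    return 1.0 / (1.0 - r2)

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)                # unrelated to x1
x3 = x1 + 0.1 * rng.normal(size=200)     # nearly a copy of x1
X = np.column_stack([x1, x2, x3])

vifs = [vif(X, j) for j in range(X.shape[1])]
print([round(v, 1) for v in vifs])
```

A value close to 1 means the variable is almost independent of the others, while a large value signals multicollinearity.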

Q9. How do you handle missing data or corrupted data in the dataset?

You can use the Pandas library in Python to handle the missing data. Two commonly used methods are:

  • isnull(): For detecting the missing values.
  • dropna(): For removing the rows/columns with null values.
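Here is a small sketch of both methods on a made-up DataFrame. The column names and values are assumptions, and fillna() is an extra option that is not covered in the list above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "age": [25, np.nan, 31, 40],
    "city": ["Pune", "Delhi", None, "Mumbai"],
})

missing_per_column = df.isnull().sum()   # number of missing values in each column
rows_kept = df.dropna()                  # drop every row that has a missing value
cols_kept = df.dropna(axis=1)            # drop every column that has a missing value

# fillna() fills the gaps instead of dropping data (not mentioned above).
filled = df.fillna({"age": df["age"].mean(), "city": "unknown"})
```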

Q10. Explain the Confusion Matrix with Respect to Machine Learning Algorithms.

To measure the performance of a classification algorithm we use a confusion matrix. In supervised learning, it is called a confusion matrix. In unsupervised learning, it is called a matching matrix. The confusion matrix has two dimensions:

  • Actual
  • Predicted

The confusion matrix visualizes the accuracy by comparing the actual and predicted classes. For a binary classifier, the four cells of the matrix are:

  • TP: True Positive: Positive values correctly predicted as positive
  • FP: False Positive: Negative values incorrectly predicted as positive
  • FN: False Negative: Positive values incorrectly predicted as negative
  • TN: True Negative: Negative values correctly predicted as negative

You can use the confusion matrix to compute the accuracy: Accuracy = (TP + TN) / (TP + TN + FP + FN).
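A short sketch with scikit-learn, using made-up actual and predicted labels, shows how the four cells and the accuracy can be obtained.

```python
from sklearn.metrics import confusion_matrix

actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

# With labels=[0, 1], ravel() yields the cells in the order TN, FP, FN, TP.
tn, fp, fn, tp = confusion_matrix(actual, predicted, labels=[0, 1]).ravel()

accuracy = (tp + tn) / (tp + tn + fp + fn)
print("TP:", tp, "FP:", fp, "FN:", fn, "TN:", tn, "accuracy:", accuracy)
```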

Q11. Compare K-means and KNN algorithms.

K-means:

  • It is unsupervised in nature
  • It is a clustering algorithm
  • It needs unlabelled data to train

KNN:

  • It is supervised in nature
  • It is a classification algorithm
  • It needs labelled data to train
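The difference is easy to see in code. In this sketch (toy points of my own choosing), K-means is fitted on the features alone, while KNN also needs the labels.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

X = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
              [8.0, 8.0], [8.2, 7.9], [7.9, 8.1]])
y = np.array([0, 0, 0, 1, 1, 1])   # labels, used only by KNN

# K-means: no labels given, it groups the points into 2 clusters on its own.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
cluster_ids = kmeans.labels_

# KNN: learns from labelled points and classifies a new one by its neighbours.
knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
predicted_class = knn.predict([[7.5, 8.5]])[0]
```

Note that the cluster ids from K-means are arbitrary numbers, whereas KNN returns one of the class labels it was trained on.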

Q12. What is ROC curve? What does it represent?

Receiver Operating Characteristic curve (or ROC curve) is a fundamental tool which is used for diagnostic test evaluation. It is a plot of Sensitivity against (1 - Specificity), i.e. a plot of the true positive rate against the false positive rate at different classification thresholds.
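As a small sketch, scikit-learn can compute the points of the curve and the area under it. The four labels and scores below are made up for illustration.

```python
from sklearn.metrics import roc_curve, roc_auc_score

y_true  = [0, 0, 1, 1]             # actual classes
y_score = [0.1, 0.4, 0.35, 0.8]    # predicted probability of class 1

# One (false positive rate, true positive rate) point per threshold.
fpr, tpr, thresholds = roc_curve(y_true, y_score)
auc = roc_auc_score(y_true, y_score)

for f, t in zip(fpr, tpr):
    print("FPR:", f, "TPR:", t)
print("Area under the curve:", auc)
```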

Q13. What is the difference between type I and type II error?

Type I error is a false positive. Type I error is claiming something has happened when it hasn’t.

Type II error is a false negative. Type II error is claiming nothing has happened when in fact something has.

Q14. What are collinearity and multicollinearity?

Collinearity is when two predictor variables in a multiple regression are linearly related to each other.

Multicollinearity occurs when more than two predictor variables are inter-correlated.

Q15. What Is a Random Forest?

It is a supervised machine learning algorithm which is generally used for classification problems, although it can also handle regression. It creates multiple decision trees during its training phase, each trained on a random sample of the data.

For classification, the random forest takes the decision of the majority of trees as its final decision.
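The sketch below looks inside a small forest trained on the Iris dataset (my choice for illustration). One caveat: scikit-learn combines the trees by averaging their predicted probabilities rather than counting hard votes, which usually leads to the same answer.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=25, random_state=0).fit(X, y)

sample = X[:1]   # the first flower in the dataset

# Each tree in the forest gives its own prediction for the sample...
votes = [int(tree.predict(sample)[0]) for tree in forest.estimators_]
majority = max(set(votes), key=votes.count)

# ...and the forest combines them into a single final decision.
final = int(forest.predict(sample)[0])
print("votes:", votes, "final decision:", final)
```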

Q16. When Will You Use Classification over Regression?

When the target variable is categorical we use classification, whereas when the target variable is continuous we use regression. Both belong to supervised machine learning.

Examples of Classification problem include predicting:

  • Type of colour
  • Breed of animal
  • Gender of person
  • A statement is true or false
  • Yes or No
  • Type of flower

Whereas examples of regression problems include predicting:

  • Score of team
  • Amount of rainfall
  • Amount of revenue generated
  • Price of a product

Q17. What are Eigenvectors and Eigenvalues?

Eigenvectors: Vectors whose direction remains the same even when a linear transformation is performed on them.

Eigenvalues: The scalar by which an eigenvector is scaled under that transformation.

The eigenvector of a square matrix A is a non-zero vector x such that, for some number λ, we have the following:

Ax = λx

where λ is an eigenvalue.
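You can check this relation numerically with NumPy. The 2x2 matrix here is an arbitrary example.

```python
import numpy as np

A = np.array([[2.0, 0.0],
              [0.0, 3.0]])

# eig returns the eigenvalues and a matrix whose columns are the eigenvectors.
eigenvalues, eigenvectors = np.linalg.eig(A)

for k in range(len(eigenvalues)):
    lam = eigenvalues[k]
    x = eigenvectors[:, k]
    # A x and lambda x should be the same vector.
    print("lambda =", lam, "holds:", np.allclose(A @ x, lam * x))
```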

Q18. What is SVM (Support Vector Machines)?

SVM is a supervised machine learning algorithm that is mainly used for classification, although it can also be used for regression analysis.

In SVM each data item is plotted in n-dimensional space with the value of each feature being the value of a particular coordinate. After this, we perform classification by finding the hyper-plane that best separates the two classes.

Support vectors are the individual observations that lie closest to the hyper-plane.
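A minimal sketch with scikit-learn's SVC on six hand-picked 2D points (an assumed toy dataset) shows the hyper-plane being learned and the support vectors it rests on.

```python
import numpy as np
from sklearn.svm import SVC

X = np.array([[1.0, 1.0], [2.0, 1.0], [1.0, 2.0],
              [5.0, 5.0], [6.0, 5.0], [5.0, 6.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# A linear kernel looks for a straight separating hyper-plane.
clf = SVC(kernel="linear").fit(X, y)

predictions = clf.predict([[1.5, 1.5], [5.5, 5.5]])
support_vectors = clf.support_vectors_   # the training points nearest the boundary
print(predictions)
print(support_vectors)
```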

Q19. Implement the KNN classification algorithm.

In the following code snippet, we are using the Iris dataset to implement the KNN classification algorithm.

# KNN classification algorithm
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
import numpy as np
from sklearn.model_selection import train_test_split

# Load the data and hold out a test set
iris_dataset = load_iris()
X_train, X_test, Y_train, Y_test = train_test_split(iris_dataset["data"], iris_dataset["target"], random_state=0)

# Fit a 1-nearest-neighbour classifier on the training data
kn = KNeighborsClassifier(n_neighbors=1)
kn.fit(X_train, Y_train)

# Classify a new, unseen flower
X_new = np.array([[8, 2.5, 1, 1.2]])
prediction = kn.predict(X_new)

print("Predicted target value: {}\n".format(prediction))
print("Predicted target name: {}\n".format(iris_dataset["target_names"][prediction]))
print("Test score: {:.2f}".format(kn.score(X_test, Y_test)))

Output (the exact test score may differ on your setup):

Predicted target value: [0]
Predicted target name: ['setosa']
Test score: 0.92

Q20. What is cluster sampling?

Cluster sampling is a process of randomly selecting intact groups within a defined population sharing similar characteristics. It is a probability sample where each sampling unit is a cluster of elements.

In this, the total population is divided into groups known as clusters, and a random set of clusters is selected. If all the elements in the selected clusters are sampled, this is referred to as a “one-stage” cluster sampling plan. If a random subset of elements is sampled from each selected cluster, it is called a “two-stage” cluster sampling plan.
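Both plans can be sketched with Python's standard library. The population of 6 schools with 10 students each is hypothetical.

```python
import random

rng = random.Random(0)

# Hypothetical population: 6 clusters (schools) with 10 students each.
population = {f"school_{c}": [f"s{c}_{i}" for i in range(10)] for c in range(6)}

# Randomly select whole clusters.
chosen = rng.sample(sorted(population), k=2)

# One-stage: take every element of each chosen cluster.
one_stage = [student for c in chosen for student in population[c]]

# Two-stage: take a random subset of elements from each chosen cluster.
two_stage = [student for c in chosen for student in rng.sample(population[c], k=3)]
```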

The common aim of cluster sampling is to reduce the cost while attaining a desired level of accuracy.

Now that we have discussed various Machine Learning interview questions based on theory and algorithms, we will step up a bit and discuss certain machine learning questions based on real-life applications.

Read this section carefully because questions like these are asked very often. So, let’s get started.

Q21. How will you design an Email spam filter?

To build an email spam filter, we follow these steps:

1. The email spam filter is fed with hundreds of emails every day.

2. Each of these emails is labelled ‘spam’ or ‘not spam’.

3. The supervised machine learning algorithm then learns which emails are spam, based on spam words like lottery, free offer, full refund, 100% off etc.

4. The next time an email is about to land in your mailbox, the spam filter uses algorithms like Decision Trees and SVM to determine whether the mail is spam or not.

5. If the likelihood of spam is high, the email is labelled as spam and won’t land in your mailbox.

6. We then send a certain number of test e-mails and check the accuracy of the model.

7. The accuracy should be as close to the desired level as possible.

8. We test various models with different algorithms.

9. The model with the highest accuracy is used.
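The core of these steps can be sketched in a few lines. This is only a toy version: the six emails are made up, and a linear SVM on simple word counts is one of many possible choices.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Steps 1-2: a (tiny, made-up) set of labelled emails.
emails = [
    "win free lottery now",
    "free offer full refund",
    "100% off free offer",
    "meeting agenda for monday",
    "please review the attached report",
    "lunch tomorrow with the team",
]
labels = ["spam", "spam", "spam", "not spam", "not spam", "not spam"]

# Step 3: turn each email into word counts and train a linear SVM on them.
spam_filter = make_pipeline(CountVectorizer(), LinearSVC())
spam_filter.fit(emails, labels)

# Steps 4-5: classify incoming emails.
predictions = spam_filter.predict(["free lottery offer", "team meeting report"])
print(predictions)
```

In a real filter you would hold out labelled test emails and compare several models on them, as steps 6 to 9 describe.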

Q22. How does the recommendation engine work on e-commerce websites?

Once a user buys something from an e-commerce website, it stores the purchase data for future reference and finds products that the user is most likely to buy in future. This is possible because of machine learning algorithms, which can identify patterns in a given dataset.

Consider this example:

1. Let us suppose you buy a cycle from an online store.

2. The bot will gather this information and place your user id in a group of users who also bought items related to cycles.

3. Let us call this group a cluster.

4. Now suppose another user from this cluster buys a cycle seat cover.

5. The bot will then crawl the online store and show all the users in this cluster items related to the cycle seat cover in their recommendation list.

As another example, suppose you buy an iPhone from an e-commerce website.

The bot will gather this information and place your user id in the group of users who also bought an iPhone. Let us call this group a cluster.

The bot will search the entire online store and accordingly show products related to the iPhone, such as chargers, earphone jacks, screen guards etc., to the particular group in which you are placed.

This is how a recommendation engine works on an e-commerce website.
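The cycle example can be sketched as follows. The purchase history and the recommend function are hypothetical, and real engines use far richer signals than shared purchases.

```python
from collections import Counter

# Hypothetical purchase history: user id -> set of items bought.
purchases = {
    "u1": {"cycle"},
    "u2": {"cycle", "seat cover"},
    "u3": {"cycle", "helmet", "seat cover"},
    "u4": {"iphone", "charger"},
}

def recommend(user, purchases, top_n=2):
    """Suggest items bought by users who share at least one purchase with `user`."""
    own = purchases[user]
    counts = Counter()
    for other, items in purchases.items():
        if other != user and own & items:   # same "cluster": overlapping purchases
            counts.update(items - own)      # count items the user does not have yet
    # Most popular first; ties are broken alphabetically to keep the order stable.
    ranked = sorted(counts, key=lambda item: (-counts[item], item))
    return ranked[:top_n]

suggestions = recommend("u1", purchases)
print(suggestions)
```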

Q23. How can you help our marketing team be more efficient?

The answer to this question depends upon the type of company. The below-mentioned examples will help you:

1. Clustering algorithms to build custom customer segments for each type of marketing campaign.

2. Natural language processing for headlines to predict performance before running the ad spend.

3. Predicting conversion probability based on a user’s website behaviour in order to create better re-targeting campaigns for potential customers.

This type of machine learning interview question is asked very frequently. You can expect many variations of it.

Q24. You’ve built a random forest model with 10000 trees. You got delighted after getting a training error of 0.00. But, the validation error is 34.23. What is going on? Haven’t you trained your model perfectly?

It implies that the model has overfitted. A training error of 0.00 implies that the classifier has memorised the patterns in the training data so closely that they do not carry over to unseen data. Hence when this classifier was run on an unseen sample, it couldn’t find those patterns and returned predictions with a high error.

Q25. Comment on the statement. Treating a categorical variable as a continuous variable would result in a better predictive model?

A categorical variable can be treated as a continuous variable only when it is ordinal in nature, that is, when its categories have a natural order.

Q26. You are given a data set in which some variables have more than 20% missing values. Let’s say, out of 50 variables, 8 variables have more than 20% of their values missing. How will you deal with them?

The problem can be dealt with in the following ways:

1. Assign a unique category to the missing values.

2. We can remove those variables outright.

3. We can check how the missing values are distributed against the target variable. If we find a pattern, we keep those missing values and assign them a new category, while removing the others.

Q27. How would you approach the “Netflix Prize” competition?

Netflix prize was a competition organised by Netflix which offered $1,000,000 for a better collaborative filtering algorithm.

The problem statement says:

Predict user ratings for films based on previous ratings without any other information about the users or films, i.e. without the users or the films being identified except by numbers assigned for the contest.

It’s an open problem statement. Let me know what your solution to this problem would be.

Q28. Explain How a System Can Play a Game of Chess Using Reinforcement Learning.

Reinforcement learning consists of an environment and an agent. The agent performs certain actions to achieve a specific goal. Every time the agent performs a task that is in relation to the goal, it is rewarded. And, every time it takes a step which goes against that goal or in the reverse direction, it is penalized.

Earlier, chess programs had to determine the best moves after research on numerous factors. Building a machine designed to play such games would require many rules to be specified.

With reinforcement learning, we don’t have to deal with this problem as the learning agent learns by playing the game. It will make a move (decision), check if it’s the right move (feedback), and keep the outcomes in memory for the next step it takes (learning). There is a reward for every correct decision the system takes and a punishment for every wrong one.
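Chess is far too large for a short example, so the sketch below uses a toy game instead: the agent stands on a line of 5 squares and wins by reaching the last one. It uses tabular Q-learning, which is my choice of algorithm for illustration, and all the numbers are assumptions. The decision, feedback and learning loop is the same idea.

```python
import random

N_STATES = 5             # squares 0..4; reaching square 4 wins the game
ACTIONS = [-1, +1]       # move left or move right
ALPHA, GAMMA = 0.5, 0.9  # learning rate and discount factor (assumed values)

rng = random.Random(0)
Q = [[0.0, 0.0] for _ in range(N_STATES)]   # Q[state][action index]

for _ in range(500):                         # play 500 games
    state = 0
    for _ in range(100):                     # cap on moves per game
        a = rng.randrange(2)                 # decision: explore with a random move
        nxt = min(max(state + ACTIONS[a], 0), N_STATES - 1)
        reward = 1.0 if nxt == N_STATES - 1 else 0.0   # feedback
        # Learning: nudge the estimate towards reward + discounted best future value.
        Q[state][a] += ALPHA * (reward + GAMMA * max(Q[nxt]) - Q[state][a])
        state = nxt
        if state == N_STATES - 1:
            break

# Best known move for each non-terminal square: 1 means "move right".
policy = [max(range(2), key=lambda a: Q[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

After enough games the agent prefers moving right from every square, even though nobody wrote that rule down.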

Machine learning interview questions based on real-life scenarios can be asked at any point during the interview. So, you need to be updated with the various advancements in this industry.

Q29. How Will You decide which Machine Learning Algorithm to choose for your classification problem?

You can follow the below-mentioned guidelines to choose an algorithm for your problem:

1. For accuracy, test the different algorithms and cross-validate them.

2. If the training dataset is small, you can use models that have low variance and high bias.

3. If the training dataset is large, you can use the models which have high variance and low bias.
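The first guideline can be sketched with cross_val_score. The two candidate models and the Iris dataset are assumptions for the sake of the example.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

candidates = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(random_state=0),
}

# Mean accuracy over 5 cross-validation folds for each candidate.
mean_scores = {name: cross_val_score(model, X, y, cv=5).mean()
               for name, model in candidates.items()}

best = max(mean_scores, key=mean_scores.get)
print(mean_scores, "best:", best)
```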

Q30. How will you implement facebook’s “people you may know” using machine learning?

Facebook’s “people you may know” works by inspecting the activity of users on its platform. The machine learning model evaluates the data and the activity of the users and keeps storing it. The model also inspects the user’s existing friend list, checks the friend lists of those friends as well, and gives suggestions based on them.

An unsupervised machine learning algorithm can be used for this. Once started, the model keeps learning on its own. A feedback system is used so that the model can evaluate the feedback and keep learning from it to give more accurate and relevant results.
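The friends-of-friends part can be sketched in plain Python. This is only a simple heuristic on a made-up friend graph, not how Facebook actually implements the feature.

```python
from collections import Counter

# Hypothetical friend lists: user -> set of friends.
friends = {
    "asha": {"ben", "chen"},
    "ben": {"asha", "dev", "esha"},
    "chen": {"asha", "dev"},
    "dev": {"ben", "chen"},
    "esha": {"ben"},
}

def people_you_may_know(user, friends):
    """Rank non-friends by the number of mutual friends they share with `user`."""
    mutual = Counter()
    for friend in friends[user]:
        for candidate in friends[friend]:
            if candidate != user and candidate not in friends[user]:
                mutual[candidate] += 1
    # Most mutual friends first; ties are broken alphabetically.
    return sorted(mutual, key=lambda c: (-mutual[c], c))

suggestions = people_you_may_know("asha", friends)
print(suggestions)
```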

Q31. How does machine learning power targeted advertising?

Targeted advertising can be implemented using an unsupervised machine learning model. To do so, the bot gathers certain information about the user, such as location, mouse movements and connected applications, to show precisely targeted ads to the user. Google uses selective filtering features in which the ads shown are biased towards the client’s products.

Q32. Give a brief overview of sentiment analysis using machine learning.

Sentiment analysis involves the use of natural language processing to categorise opinions expressed, usually in a piece of text, in order to determine whether the writer’s attitude is positive, negative or neutral. Machine learning algorithms can be used to ease the learning process of the bot. These learning algorithms are constantly fed with massive amounts of data so that they can adjust themselves and continually improve.

Q33. How many trigrams phrases can be generated from the given sentence, after performing the following text cleaning steps:

1. Stopword Removal

2. Replacing punctuations by a single space

“#Verzeo is a great source to learn @data_science.”

Ans. After performing stopword removal and punctuation replacement the text becomes: “Verzeo great source learn data science”.

Trigrams – Verzeo great source, great source learn, source learn data, learn data science.

So, 4 trigrams can be generated.
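The same answer can be reproduced in a few lines of Python. The stopword list here is an assumption, just large enough for this sentence.

```python
import re

STOPWORDS = {"is", "a", "to"}   # assumed stopword list for this sentence

sentence = "#Verzeo is a great source to learn @data_science."

# Replace every punctuation character with a single space.
# "_" is listed explicitly because \w treats it as a word character.
cleaned = re.sub(r"[^\w\s]|_", " ", sentence)

# Remove the stopwords.
tokens = [t for t in cleaned.split() if t.lower() not in STOPWORDS]

# Slide a window of 3 words over the tokens.
trigrams = [" ".join(tokens[i:i + 3]) for i in range(len(tokens) - 2)]
print(len(trigrams), trigrams)
```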

Q34. How would you implement a recommendation system for our company’s users?

For this particular problem, you need to have complete knowledge about what the company does. You need to research the company thoroughly before you can answer this question. You need to know who its customers are, what its revenue channels are etc. so that you can answer this question properly.

Q35. How do you think Google is training data for self-driving cars?

Machine learning questions like this test whether you are up to date with the current industry standards. Google currently uses reCAPTCHA to source labelled data on storefronts and traffic signs. They are also building on training data collected by Sebastian Thrun at Google X to train self-driving cars.

I think that I have covered all the relevant and important questions which you could encounter while appearing for a machine learning interview.

I recommend that you study the practical based questions carefully before appearing for the interview. Also, feel free to suggest more scenario-based questions on machine learning which you think are important and could be asked in the interview.

You can follow our website to explore various internship and certification programs offered by Verzeo. If you are interested in Machine Learning certification, check out the course curriculum.
