Knowledge Check

Question 1 of 15

Which of these is the last step of an iteration within the CRISP-DM process?

Select one of the following:

Deployment
Evaluation
Modelling
Analysis

Explanation

Question 2 of 15

If a model predicts the number of days a customer takes to come back to a company's website, which metric is the most appropriate for assessing the performance of the model?

Select one of the following:

Recall
Precision
Mean absolute error
Accuracy

Explanation
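
The target here is a number of days, a numeric quantity, so a regression error metric is the natural fit; recall, precision, and accuracy are defined for classification labels. A minimal sketch of how mean absolute error is computed, using made-up values (the helper name and the data are illustrative assumptions, not part of the quiz):

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("inputs must have the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical data: days until each customer returned vs. the model's guess.
actual_days = [3, 10, 7, 1]
predicted_days = [5, 8, 7, 4]

mae = mean_absolute_error(actual_days, predicted_days)
print(mae)  # 1.75
```

The result reads directly in the unit of the target: on this toy data the model is off by 1.75 days on average.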

Question 3 of 15

Machine learning steps generally followed:

1) ____
2) ____
3) ____
4) ____
5) ____

Drag and drop to complete the text.

Model Deployment

Preparing Data

Build and Train Models

Data Ingestion

Monitoring Models

Explanation

Question 4 of 15

A face recognition model for accessing a gym is posing issues because it denies access to too many customers. Why might this be?

Select one of the following:

The model is too sensitive and a higher threshold would solve the problem.
The model is too specific and a higher threshold would solve the problem.
The model is too sensitive and a lower threshold would solve the problem.
The model is too specific and a lower threshold would solve the problem.

Explanation
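
A face recognition system typically accepts a person when a match score clears a decision threshold, so the threshold controls how many genuine customers are turned away. The sketch below assumes a similarity score between 0 and 1 and uses invented scores; `denied_count` is a hypothetical helper, not part of any library:

```python
def denied_count(scores, threshold):
    """Count how many people would be rejected at a given threshold."""
    return sum(1 for score in scores if score < threshold)


# Hypothetical match scores for six genuine gym members.
genuine_scores = [0.62, 0.71, 0.78, 0.85, 0.91, 0.95]

denied_strict = denied_count(genuine_scores, threshold=0.9)
denied_lenient = denied_count(genuine_scores, threshold=0.7)

print(denied_strict)   # 4
print(denied_lenient)  # 1
```

Lowering the threshold admits more genuine members, at the cost of also admitting more impostors, so the choice is a trade-off rather than a free improvement.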

Question 5 of 15

What is PII?

Select one of the following:

Personally Identifiable Index
Publicly Incriminating Index
Publicly Identifiable Information
Personally Identifiable Information

Explanation

Question 6 of 15

What is used to compare the predicted and actual values of a binary classification model?

Select one of the following:

R-squared value
BLEU score
Confusion matrix
Correlation coefficient

Explanation
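
A confusion matrix tabulates predicted labels against actual labels. For a binary classifier that gives four counts: true positives, false positives, false negatives, and true negatives. A minimal sketch with invented labels (the function and the data are illustrative assumptions):

```python
def confusion_counts(y_true, y_pred):
    """Count the four outcomes of a binary classifier with 0/1 labels."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            counts["tp"] += 1
        elif actual == 0 and predicted == 1:
            counts["fp"] += 1
        elif actual == 1 and predicted == 0:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


# Hypothetical actual and predicted labels.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

counts = confusion_counts(y_true, y_pred)
print(counts)  # {'tp': 3, 'fp': 1, 'fn': 1, 'tn': 3}
```

Metrics such as precision and recall can be derived from these four counts.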

Question 7 of 15

A weather forecast model must update its predictions first thing in the morning, every day. It is trained on daily historical data that is publicly available, but that data is only refreshed at noon each day, so it is not available at the time the model's predictions need to be updated. How could this problem be solved?

Select one of the following:

Replicate the dataset and run the predictions in production with that.
Do not use the dataset when running the model in production.
Train the model with different data because the model must make inferences using the same type of input data it saw during training.
Replicate the dataset and train the model with that.

Explanation

Question 8 of 15

An online retail company wants to improve the speed at which it analyzes how users interact with its website. What is the most pressing architectural question you would address first?

Select one of the following:

Whether the data is being put into a data lake before analysis
Whether processes are in place to clean and preprocess data before storage
Whether the data ingestion processes are event-driven and real-time, or nightly batch
Whether business rules are applied on data in transit or in-situ

Explanation

Question 9 of 15

Which scikit-learn function would you use to evaluate the performance of a binary classifier?

Select one of the following:

sklearn.metrics.median_absolute_error
sklearn.metrics.auc
sklearn.metrics.mean_absolute_error
sklearn.metrics.r2_score

Explanation
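
Three of the listed options are regression metrics. `sklearn.metrics.auc` computes the area under a curve with the trapezoidal rule and is commonly applied to the points of a ROC curve. The sketch below reproduces that computation in plain Python to show what the number means; `roc_points` and `trapezoid_area` are hypothetical helpers written for this illustration, not scikit-learn functions, and the labels and scores are invented:

```python
def roc_points(y_true, y_score):
    """Return (fpr, tpr) lists, sweeping the threshold from high to low.

    Assumes 0/1 labels, at least one of each class, and no tied scores.
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    tp = fp = 0
    fpr, tpr = [0.0], [0.0]
    for _, label in sorted(zip(y_score, y_true), reverse=True):
        if label == 1:
            tp += 1
        else:
            fp += 1
        fpr.append(fp / negatives)
        tpr.append(tp / positives)
    return fpr, tpr


def trapezoid_area(x, y):
    """Area under the piecewise-linear curve through the points (x, y)."""
    return sum((x[i] - x[i - 1]) * (y[i] + y[i - 1]) / 2 for i in range(1, len(x)))


y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

fpr, tpr = roc_points(y_true, y_score)
roc_auc = trapezoid_area(fpr, tpr)
print(roc_auc)  # 0.75
```

A value of 1.0 would mean every positive example is scored above every negative one; 0.5 corresponds to random ranking.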

Question 10 of 15

After sampling 10,000 values of a random variable you observe that the mode, median, and mean are the same. What is the most likely variable distribution?

Select one of the following:

Logarithmic distribution
Uniform distribution
Normal distribution
Poisson distribution

Explanation
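
In a symmetric, single-peaked distribution the mean, median, and mode coincide, while in a skewed distribution they drift apart. The sketch below compares the mean and median of a normal sample with those of a right-skewed (exponential) sample. The parameters and seed are arbitrary assumptions, and the mode is left out because continuous draws rarely repeat exactly:

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# 10,000 draws from a normal and from an exponential distribution.
normal_sample = [rng.gauss(50, 10) for _ in range(10_000)]
skewed_sample = [rng.expovariate(1.0) for _ in range(10_000)]

normal_gap = abs(statistics.mean(normal_sample) - statistics.median(normal_sample))
skewed_gap = abs(statistics.mean(skewed_sample) - statistics.median(skewed_sample))

print(f"normal: mean - median gap = {normal_gap:.3f}")
print(f"skewed: mean - median gap = {skewed_gap:.3f}")
```

For the normal sample the gap should be close to zero relative to its spread; for the exponential sample the mean sits noticeably above the median.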

Question 11 of 15

Which technique can be useful to handle highly imbalanced true/false labels?

Select one of the following:

Simple random sampling
Convenience sampling
Systematic sampling
Stratified sampling

Explanation
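
Stratified sampling draws from each label group separately, so a rare class keeps its share of the sample instead of being left to chance. A minimal sketch, assuming a 5% positive rate and a 20% sample; `stratified_sample` is a hypothetical helper written for this illustration:

```python
import random
from collections import defaultdict


def stratified_sample(labels, fraction, seed=0):
    """Return sorted indices sampled at the same fraction within each label group."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    chosen = []
    for indices in by_label.values():
        # Keep at least one example per group so no label disappears.
        k = max(1, round(len(indices) * fraction))
        chosen.extend(rng.sample(indices, k))
    return sorted(chosen)


# Hypothetical imbalanced labels: 5 True, 95 False.
labels = [True] * 5 + [False] * 95

sample_indices = stratified_sample(labels, fraction=0.2)
sampled_labels = [labels[i] for i in sample_indices]

print(len(sampled_labels))         # 20
print(sampled_labels.count(True))  # 1
```

With simple random sampling of 20 rows from this data, drawing no True labels at all would be a fairly common outcome; sampling per group rules that out.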

Question 12 of 15

You have a dataset with Female and Male features. What will be the feature names when the code below is executed?

import pandas as pd

def azureml_main(dataframe1 = None, dataframe2 = None):
    pd.get_dummies(dataframe1)
    return dataframe1,

Select one of the following:

Female, Male
Female_Yes, Female_No, Male_Yes, Male_No
Female_Yes, Male_No
Female_No, Male_Yes

Explanation
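
`pd.get_dummies` returns a new, one-hot encoded DataFrame and leaves its input untouched. The sketch below shows both the column names of the encoded frame and those of the original. It assumes the Female and Male columns hold the strings "Yes" and "No", which is inferred from the answer options rather than stated in the question:

```python
import pandas as pd

# Hypothetical data: the quiz does not show the actual values.
df = pd.DataFrame({
    "Female": ["Yes", "No", "No"],
    "Male": ["No", "Yes", "Yes"],
})

encoded = pd.get_dummies(df)

print(list(encoded.columns))  # ['Female_No', 'Female_Yes', 'Male_No', 'Male_Yes']
print(list(df.columns))       # ['Female', 'Male']
```

Note that the function in the question discards the return value of `pd.get_dummies` and returns `dataframe1` itself, so the encoded columns only reach the output if the result is assigned and returned.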

Question 13 of 15

You have a dataset that you want to use for your company's ML algorithm. It has 30 dimensions and you want to reduce the size to 3 dimensions to decrease memory usage and computation time. Which method should you choose?

Select one of the following:

t-Distributed Stochastic Neighbor Embedding
Linear Discriminant Analysis
Principal Component Analysis
K-means model stacking

Explanation
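
Principal Component Analysis projects the data onto the directions of greatest variance and needs no labels, which suits a general-purpose reduction from 30 features to 3. The sketch below implements it with NumPy's singular value decomposition on random data; `pca_reduce` is a hypothetical helper, and the data shape and seed are arbitrary assumptions:

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30))  # 200 samples, 30 dimensions

Z = pca_reduce(X, n_components=3)
print(Z.shape)  # (200, 3)
```

In practice a library implementation such as scikit-learn's `PCA` would usually be preferred; the hand-written version is only meant to show the mechanics.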

Question 14 of 15

What is Kubernetes?

Select one of the following:

A serverless platform to build and manage your apps.
A proprietary platform built by Google and Docker to run and manage your applications.
An open-source system to deploy, manage, and run Cloud Foundry apps.
A container orchestrator to provision, manage, and scale applications.

Explanation

Question 15 of 15

Your team is building a data engineering and data science development environment.
The environment must support the following requirements:
✑ support Python and Scala
✑ compose data storage, movement, and processing services into automated data pipelines
✑ the same tool should be used for the orchestration of both data engineering and data science
✑ support workload isolation and interactive workloads
✑ enable scaling across a cluster of machines

You need to create the environment.
What should you do?

Select one of the following:

Build the environment in Apache Hive for HDInsight and use Azure Data Factory for orchestration.
Build the environment in Azure Databricks and use Azure Data Factory for orchestration.
Build the environment in Azure Databricks and use Azure Container Instances for orchestration.
Build the environment in Apache Spark for HDInsight and use Azure Container Instances for orchestration.

	Created by Mohammed Arif Mazumder about 5 years ago

Quiz on Knowledge Check, created by Mohammed Arif Mazumder on 04/25/2020.
