Feature Engineering

Supervised

Difficulty: 6 | Problem written by ankita
Problem reported in interviews at: Amazon, Apple, Facebook, Google, Netflix

Sometimes the classes in a dataset cannot be separated by a linear decision boundary, and a linear model fitted to the raw features gives poor results. A linear model can still produce complex decision boundaries, such as circular ones, if we enrich the dataset with new features computed from the existing ones, e.g., polynomial features.
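As a rough illustration of why this works, the sketch below builds a synthetic dataset whose true boundary is the unit circle and shows that a rule that is linear in the squared features separates it. The data, the weights w and the bias b are made up for this example and are not part of the problem.

```python
import numpy as np

# Synthetic 2-D points; label 1 means "inside the unit circle".
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=(200, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 < 1.0).astype(int)

# Engineered features: the squares of the two original columns.
X_sq = X ** 2

# Hand-picked linear rule in the squared feature space:
# 1 - x1^2 - x2^2 > 0 is exactly the inside of the unit circle.
w = np.array([-1.0, -1.0])
b = 1.0
pred = (X_sq @ w + b > 0).astype(int)

print((pred == y).all())  # True: a linear rule on squared features is circular in the original space
```

No straight line in the original (x1, x2) plane can reproduce these labels, but in the (x1^2, x2^2) plane the boundary is a straight line.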

Feature engineering is an important tool in a data scientist's toolkit. Engineering valid and useful features usually requires some domain knowledge. Here, we will engineer features by applying a few mathematical functions to the existing ones.

Apply the following feature engineering to both X_train and X_test:

Add two columns of squares of the two features

Add two columns of log of the two features

Add two columns of exp of the two features

For both X_train and X_test, the final feature matrix should be the concatenation of the two original features with the squares, log, and exp features, in that order, giving eight columns in total.
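The three transformations and the required column order can be sketched as follows. The helper name add_engineered_features is hypothetical, and the sketch assumes that "log" means the natural logarithm (np.log), which the problem statement does not spell out.

```python
import numpy as np

def add_engineered_features(X):
    """Return [original | squares | log | exp] as one 2-D array.

    Hypothetical helper; assumes natural log and strictly positive inputs.
    """
    X = np.asarray(X, dtype=float)  # the inputs arrive as Python lists
    return np.hstack([X, X ** 2, np.log(X), np.exp(X)])

# Two rows taken from the sample X_train.
X_demo = [[5.1, 3.5], [4.9, 3.0]]
features = add_engineered_features(X_demo)
print(features.shape)  # (2, 8): 2 original + 2 squared + 2 log + 2 exp columns
```

Applying the same helper to X_train and X_test keeps the column layout identical for training and prediction.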

Input:

The data is a sample of the Iris dataset.

You are given as input:

X_train: Two numerical features (sepal length and sepal width in cm)

Y_train: labels for X_train (3 classes)

X_test: Two numerical features (sepal length and sepal width in cm)

Output:

Y_test: prediction on X_test after applying the above-mentioned feature engineering

Complete the function Prediction(X_train, Y_train, X_test), which returns Y_test as a NumPy array for the given X_test.

Hints

Use LogisticRegression(solver='liblinear') to train the model on X_train with engineered features.

The output is a NumPy array.
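Putting the hints together, one possible shape for the function is sketched below. It is a sketch under assumptions, not the reference solution: the private helper _engineer is hypothetical, "log" is taken to be the natural logarithm, and the liblinear model is wrapped in OneVsRestClassifier because newer scikit-learn releases may warn about, or refuse, fitting liblinear directly on more than two classes. The predictions can differ between library versions, so no expected output is claimed here.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

def _engineer(X):
    # Hypothetical helper: [original | squares | natural log | exp].
    X = np.asarray(X, dtype=float)
    return np.hstack([X, X ** 2, np.log(X), np.exp(X)])

def Prediction(X_train, Y_train, X_test):
    # Apply the same feature engineering to the training and test inputs.
    X_train_fe = _engineer(X_train)
    X_test_fe = _engineer(X_test)

    # One-vs-rest logistic regression with the liblinear solver (assumption:
    # the explicit wrapper stands in for liblinear's built-in one-vs-rest).
    model = OneVsRestClassifier(LogisticRegression(solver="liblinear"))
    model.fit(X_train_fe, np.asarray(Y_train))

    # predict returns a NumPy array of class labels.
    return model.predict(X_test_fe)

# Usage with the sample data from the problem statement.
X_train = [[5.1, 3.5], [4.9, 3.0], [4.7, 3.2], [4.6, 3.1], [5.0, 3.6],
           [7.0, 3.2], [6.4, 3.2], [6.9, 3.1], [5.5, 2.3], [6.5, 2.8],
           [6.3, 3.3], [5.8, 2.7], [7.1, 3.0], [6.3, 2.9], [6.5, 3.0]]
Y_train = [0.0] * 5 + [1.0] * 5 + [2.0] * 5
X_test = [[5.4, 3.9], [4.6, 3.4], [5.7, 2.8], [6.3, 3.3], [7.6, 3.0], [4.9, 2.5]]

Y_test = Prediction(X_train, Y_train, X_test)
print(type(Y_test), Y_test)
```

On older scikit-learn versions, calling LogisticRegression(solver='liblinear') directly, as the hint suggests, also trains one binary classifier per class.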

 

Sample Input:
<class 'list'>
X_train: [[5.1, 3.5], [4.9, 3.0], [4.7, 3.2], [4.6, 3.1], [5.0, 3.6], [7.0, 3.2], [6.4, 3.2], [6.9, 3.1], [5.5, 2.3], [6.5, 2.8], [6.3, 3.3], [5.8, 2.7], [7.1, 3.0], [6.3, 2.9], [6.5, 3.0]]
<class 'list'>
Y_train: [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 2.0]
<class 'list'>
X_test: [[5.4, 3.9], [4.6, 3.4], [5.7, 2.8], [6.3, 3.3], [7.6, 3.0], [4.9, 2.5]]

Expected Output:
<class 'numpy.ndarray'>
[0. 0. 2. 2. 1. 0.]

Note: NumPy has already been imported as np (import numpy as np).