N Days With Machine Learning (Part 1)

In this blog, I will write about my journey to learn and master the field of machine learning. It will take me N days, and it will be divided into 6 parts (Data Preprocessing, Classification, Regression, Clustering, Artificial Neural Networks, Reinforcement Learning).

What is Machine Learning?

Let’s start with data preprocessing

The evolution of artificial intelligence and machine learning is tied to the availability of data, which is the critical ingredient for developing machine learning models with high accuracy.

Our mission is to give the machines access to the data and let them learn by themselves.

But even when we have good data, we need to check that it is in a useful scale and format, and that meaningful features are included.

That is what we call “data preprocessing”.

Before we start coding, you need to install these necessary Python libraries:



Matplotlib is a Python 2D plotting library which produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms

scikit-learn

Try to install these packages by running the following commands

(Linux users only)

sudo apt-get update
sudo apt-get -y install python3-pip
sudo apt-get -y install python3-matplotlib
sudo pip3 install numpy
sudo pip3 install pandas
sudo pip3 install scipy
sudo pip3 install -U scikit-learn
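
Once the commands finish, a quick way to check that everything is in place is to import each package and print its version. This is only a minimal sketch: the `required` list and the `versions` dictionary are names I made up for the check, and note that scikit-learn is imported under the name `sklearn`.

```python
import importlib

# Import names of the packages installed above (scikit-learn imports as "sklearn").
required = ["numpy", "pandas", "scipy", "matplotlib", "sklearn"]

versions = {}
for name in required:
    try:
        versions[name] = importlib.import_module(name).__version__
    except ImportError:
        versions[name] = None  # the package is missing

for name, version in versions.items():
    print(name, version if version else "NOT INSTALLED")
```

If a package shows up as NOT INSTALLED, run its install command again before moving on.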

Now let’s start coding

The first step is to import the libraries

# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

Import the Dataset

# Importing the dataset
dataset = pd.read_csv('Data.csv')
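
If you don't have Data.csv at hand, here is a small sketch of the same step that reads an inline CSV instead. The column names and values below are assumptions made up for illustration (they are not the real file), but they show how missing cells end up as NaN after loading.

```python
import io
import pandas as pd

# Hypothetical stand-in for Data.csv; column names and values are assumptions.
csv_text = """Country,Age,Salary,Purchased
France,44,72000,No
Spain,27,48000,Yes
Germany,40,,Yes
Spain,,52000,No
"""

sample = pd.read_csv(io.StringIO(csv_text))

# Count the missing values in each column
missing_per_column = sample.isna().sum()

print(sample)
print(missing_per_column)
```

Empty cells in the CSV are read as NaN, which is what the missing-data step later has to handle.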

Choose which columns we will work with

# Features: every column except the last one
X = dataset.iloc[:, :-1].values
# Target: the column at index 3
Y = dataset.iloc[:, 3].values
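
To see what these two slices return, here is a small sketch on a hand-made frame. The frame and its column names are hypothetical; I only assume four columns with the target in the last one, so that index 3 and index -1 point at the same column.

```python
import pandas as pd

# Hypothetical four-column frame; the names are assumptions, not the real Data.csv.
sample = pd.DataFrame({
    "Country": ["France", "Spain", "Germany"],
    "Age": [44.0, 27.0, 40.0],
    "Salary": [72000.0, 48000.0, 61000.0],
    "Purchased": ["No", "Yes", "Yes"],
})

features = sample.iloc[:, :-1].values  # every column except the last
target = sample.iloc[:, 3].values      # the column at index 3 (here also the last)

print(features.shape)
print(target)
```

`.values` turns the selection into a NumPy array, which is the form the scikit-learn classes used later work with.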

Taking Care of Missing Data

Data can have missing values. In our example, the row [‘Germany’ 40.0 nan] is missing a value for the customer from Germany, so we need to deal with it by using:

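sklearn.impute.SimpleImputer

A Python class that fills in missing values; you can read about it in the scikit-learn documentation. It was introduced in scikit-learn 0.20 and replaces the older sklearn.preprocessing.Imputer, which was removed in version 0.22.

# Replace each missing value with the mean of its column
from sklearn.impute import SimpleImputer
imputer = SimpleImputer(missing_values=np.nan, strategy='mean')
imputer = imputer.fit(X[:, 1:3])
X[:, 1:3] = imputer.transform(X[:, 1:3])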
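
Here is a small sketch of what mean imputation does to a single missing cell. It assumes scikit-learn 0.20 or later, where the class lives at `sklearn.impute.SimpleImputer`, and the numbers are made up for illustration.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Made-up age and salary columns with one missing salary.
ages_salaries = np.array([
    [44.0, 72000.0],
    [27.0, 48000.0],
    [40.0, np.nan],
])

# Replace every NaN with the mean of the known values in its column.
imputer = SimpleImputer(missing_values=np.nan, strategy="mean")
filled = imputer.fit_transform(ages_salaries)

print(filled)
```

The missing salary becomes the mean of the two known salaries, and the other cells are left unchanged.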

Encoding Categorical Data

sklearn.preprocessing.OneHotEncoder

A Python class that encodes categorical features using a one-hot (also known as one-of-K) scheme; you can read about it in the scikit-learn documentation.

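from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
# One-hot encode column 0 and pass the other columns through unchanged.
# The categorical_features argument was removed in scikit-learn 0.22,
# so ColumnTransformer selects the column instead.
ct = ColumnTransformer([('encoder', OneHotEncoder(), [0])], remainder='passthrough', sparse_threshold=0)
X = ct.fit_transform(X).astype(float)
# Encode the target labels as integers
le = LabelEncoder()
Y = le.fit_transform(Y)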
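
To make the one-hot idea concrete, here is a minimal sketch on a made-up country column and a made-up yes/no target. It assumes a scikit-learn version (0.20 or later) in which OneHotEncoder accepts string categories directly; the variable names are mine.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

# Made-up categorical column: one row per customer.
countries = np.array([["France"], ["Spain"], ["Germany"], ["Spain"]])

encoder = OneHotEncoder()
one_hot = encoder.fit_transform(countries).toarray()

print(encoder.categories_[0])  # categories are sorted alphabetically
print(one_hot)                 # one column per category, a single 1 per row

# The target only needs integer labels, so LabelEncoder is enough.
purchased = np.array(["No", "Yes", "Yes", "No"])
labels = LabelEncoder().fit_transform(purchased)
print(labels)
```

Each country gets its own column, so the model does not read an artificial order into the categories the way it could with plain integer codes.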

Splitting the Data into Training Set and Testing Set

from sklearn.model_selection import train_test_split
X_Train, X_Test, Y_Train, Y_Test = train_test_split(X, Y, test_size=0.2, random_state=0)
print('**************testing data**********')

We need to split the data into a “training set” and a “testing set”.


“Testing data set helps you to validate that the training has happened efficiently in terms of either accuracy, or precision so on”.
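
As a rough sketch of what the split produces, here is the same call on ten made-up rows. The arrays are placeholders I invented; with test_size=0.2, two of the ten rows go to the testing set and the other eight to the training set.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Ten made-up samples with two features each, plus one target value per sample.
features = np.arange(20).reshape(10, 2)
target = np.arange(10)

X_tr, X_te, y_tr, y_te = train_test_split(
    features, target, test_size=0.2, random_state=0
)

print(X_tr.shape, X_te.shape)
print(y_tr, y_te)
```

Fixing random_state makes the shuffle repeatable, so you get the same split every time you run the code.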

Feature Scaling

Feature scaling is a general trick applied to optimization problems. In our case, it brings the values of all features onto a comparable scale, which can speed up training because many optimization algorithms converge in fewer steps on scaled features.

from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
# Fit the scaler on the training set only, then reuse it on the test set
X_Train = scaler.fit_transform(X_Train)
X_Test = scaler.transform(X_Test)
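
Here is a small sketch of what StandardScaler does, on made-up numbers. It subtracts each column's mean and divides by its standard deviation, both learned from the training rows, and then applies the same shift and scale to the test rows.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up age and salary values on very different scales.
train = np.array([[20.0, 30000.0], [40.0, 60000.0], [60.0, 90000.0]])
test = np.array([[40.0, 60000.0]])

scaler = StandardScaler()
train_scaled = scaler.fit_transform(train)  # learns mean and std from the training rows
test_scaled = scaler.transform(test)        # reuses the training mean and std

print(train_scaled)
print(test_scaled)
```

After scaling, every training column has mean 0 and standard deviation 1. The test row here sits exactly at the training mean, so it maps to zeros.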

I have just finished my first step, “Data Preprocessing”. I hope you understood this step before we start developing our machine learning models. And remember:

Develop a passion for learning. If you do, you will never cease to grow.

You can find the full source code here:

Follow me on Twitter for code updates

Thanks for your feedback
