N Days With Machine Learning (Part 1)

In this blog, I will write about my journey to learn and master the machine learning field. It will take me N days, and it will be divided into 6 parts (Data Preprocessing, Classification, Regression, Clustering, Artificial Neural Networks, Reinforcement Learning).

What is Machine Learning?

“Machine Learning is an application of Artificial Intelligence and is revolutionizing the way companies do business”

Let’s start with data preprocessing

The evolution of artificial intelligence and machine learning is tied to the availability of data, which is the critical ingredient: with it we can develop machine learning models with high accuracy.

Our mission is to give the machines access to the data and let them learn for themselves.

But even when we have good data, we need to check that it is in a useful scale and format, and that meaningful features are included.

That’s what we call “Data Preprocessing”.

Before we start coding, you need to install these necessary Python libraries:

NumPy is the fundamental package for scientific computing in Python. It is a Python library that provides a multidimensional array object…

pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.

Matplotlib is a Python 2D plotting library which produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms.

scikit-learn is a Python module for machine learning.

Try to install these packages by running these commands

(for Linux users only)

Now let’s start coding.

The first step is to import the libraries

Import the Dataset

Choose Which Columns We Will Work With
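The original code for these three steps is not shown here, so here is a minimal sketch of how they could look. It is only an assumption about the dataset: the column names (Country, Age, Salary, Purchased) and every row except the Germany one are invented for illustration, and the CSV is read from a string instead of a real file. The imports are limited to what the sketch itself uses.

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV file used in the post. The column names
# and all rows except the Germany row (40, missing salary) are invented.
csv_text = """Country,Age,Salary,Purchased
France,44,72000,No
Spain,27,48000,Yes
Germany,40,,Yes
Spain,38,61000,No
"""

# Import the dataset (for a real file, pass its path to read_csv instead).
dataset = pd.read_csv(io.StringIO(csv_text))

# Choose the columns: here we assume every column but the last is a feature
# and the last column is the target we want to predict.
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1].values

print(X)
print(y)
```

The empty Salary cell in the Germany row is read as nan, which is the kind of missing value handled in the next step.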

Taking Care of Missing Data

Data can have missing values. In our example, the row [‘Germany’ 40.0 nan] has the value nan for the Germany customer, so we need to deal with it by using:

class sklearn.preprocessing.Imputer(missing_values='NaN', strategy='mean', axis=0, verbose=0, copy=True)

This Python class completes the missing values; you can read more about it in its documentation.
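One thing to keep in mind: the Imputer class was removed from scikit-learn in version 0.22, and its replacement is sklearn.impute.SimpleImputer. The sketch below uses that newer class on a small numeric array. The numbers are made up for illustration, except the (40.0, nan) row, which is the Germany customer from the example above.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Numeric columns (age, salary). Values are illustrative, except the
# (40.0, nan) row taken from the Germany example in the text.
X_num = np.array([
    [44.0, 72000.0],
    [27.0, 48000.0],
    [40.0, np.nan],
    [38.0, 61000.0],
])

# Replace each NaN with the mean of the other values in its column.
imputer = SimpleImputer(missing_values=np.nan, strategy="mean")
X_filled = imputer.fit_transform(X_num)

print(X_filled[2])
```

With strategy="mean", the missing salary becomes the average of the three known salaries. Other strategies such as "median" or "most_frequent" are available if the mean is not a good fit for your data.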

Encoding Categorical Data

class sklearn.preprocessing.OneHotEncoder(n_values='auto', categorical_features='all', dtype=<class 'numpy.float64'>, sparse=True, handle_unknown='error')

This Python class encodes categorical integer features using a one-hot (aka one-of-K) scheme; you can read more about it in its documentation.
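Here is a small sketch of one-hot encoding a country column. It assumes a recent scikit-learn release, where the n_values and categorical_features parameters from the signature above no longer exist and the encoder accepts string categories directly. The country values are only illustrative.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# One categorical column, shaped (n_samples, 1) as the encoder expects.
countries = np.array([["France"], ["Spain"], ["Germany"], ["Spain"]])

encoder = OneHotEncoder(handle_unknown="error")

# fit_transform returns a sparse matrix by default; toarray() makes it dense
# so it is easy to print and inspect.
encoded = encoder.fit_transform(countries).toarray()

print(encoder.categories_[0])  # the categories, in column order
print(encoded)                 # one column per category, a single 1 per row
```

Each country becomes its own 0/1 column, so the model does not mistake the categories for ordered numbers.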

Splitting the Data into Training Set and Testing Set

We need to split the data into a “Training set” and a “Testing set”.

“Training data sets are sets on which you train your machine i.e algorithm to form relationships between variables”.

“Testing data set helps you to validate that the training has happened efficiently in terms of either accuracy, or precision so on”.
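A minimal sketch of the split, using train_test_split from scikit-learn. The arrays are dummy data, and the 80/20 ratio and random_state=0 are my own assumptions, not values taken from the original code.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Dummy data: 10 samples with 2 features each, and one label per sample.
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# Hold out 20% of the samples for testing; random_state makes the
# shuffle reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

print(X_train.shape, X_test.shape)
```

The model is trained only on X_train and y_train, and the held-out X_test and y_test are used afterwards to check how well it does on data it has never seen.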

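Feature Scaling

Feature scaling is a general trick applied to optimization problems. In our case it brings all the features into a comparable range, so that a feature with large values (like a salary) does not dominate one with small values (like an age). As a result it can speed up training, because optimization-based algorithms usually need fewer iterations to converge on scaled data.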
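A short sketch of feature scaling with StandardScaler, which is one common choice (I am assuming standardization here; MinMaxScaler is another option if you want values in a fixed range like 0 to 1). The numbers are illustrative only.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Illustrative (age, salary) rows: the two columns live on very
# different scales.
X_train = np.array([[44.0, 72000.0], [27.0, 48000.0], [38.0, 61000.0]])
X_test = np.array([[40.0, 60000.0]])

scaler = StandardScaler()

# Learn the mean and standard deviation from the training set only...
X_train_scaled = scaler.fit_transform(X_train)
# ...and reuse those same statistics on the test set.
X_test_scaled = scaler.transform(X_test)

print(X_train_scaled)
print(X_test_scaled)
```

Fitting the scaler on the training set only keeps information from the test set out of the preprocessing step.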

I just finished my first step, “Data Preprocessing”. I hope that you understood this step before we start developing our machine learning models, and remember:

Develop a passion for learning. If you do, you will never cease to grow.

You can find the full source code here:

Follow me on Twitter for code updates

Thanks for your feedback
