Pandas: Library heavily used in machine learning

Pandas By Sagar Jaybhay

Boolean Indexing:

Boolean vectors can be used to filter data. If you use multiple conditions, wrap each condition in parentheses and combine them with & (and) or | (or).

To find who won a gold medal, you can write head.Medal == 'Gold'. This returns a series of True and False values, but you want the actual rows, so pass that series back to the data frame like this:

head[head.Medal == 'Gold']

This returns a data frame as the result. The line above uses a single condition; for multiple conditions, use the form below:

head[(head.Medal=='Gold') &(head.Gender=='Men')]
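As a minimal sketch of the filtering above, here is a small made-up data frame standing in for the medals data (the rows are invented for illustration and are not from the real dataset):

```python
import pandas as pd

# Hypothetical stand-in for the medals data frame used in this post.
head = pd.DataFrame({
    'Athlete': ['FLACK, Edwin', 'HAJOS, Alfred', 'SMITH, Anna'],
    'Gender': ['Men', 'Men', 'Women'],
    'Medal': ['Gold', 'Silver', 'Gold'],
})

# A boolean series: one True/False value per row.
mask = head.Medal == 'Gold'

# Indexing with the mask keeps only the rows where it is True.
gold = head[mask]

# Each condition is wrapped in parentheses and combined with &.
gold_men = head[(head.Medal == 'Gold') & (head.Gender == 'Men')]
```

The parentheses matter because & binds more tightly than == in Python, so leaving them out changes how the expression is evaluated.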

String Handling:

Through Series.str you can access the string methods and apply them to every value in a series, for example:

Series.str.contains()
Series.str.startswith()

The line below finds the athletes whose name contains FLACK. To match only names that start with FLACK, use str.startswith('FLACK') instead.

head[head.Athlete.str.contains('FLACK')]
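To make the difference between the two methods concrete, here is a small sketch on an invented series of names (the names are assumptions for illustration only):

```python
import pandas as pd

# Hypothetical athlete names.
athletes = pd.Series(['FLACK, Edwin', 'HAJOS, Alfred', 'VAN FLACK, Jo'])

# True wherever 'FLACK' appears anywhere in the name.
contains_flack = athletes.str.contains('FLACK')

# True only where the name begins with 'FLACK'.
starts_flack = athletes.str.startswith('FLACK')

# Either boolean series can be used to filter the original series.
matches = athletes[starts_flack]
```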

Indexing:

head.index #o/p RangeIndex(start=0, stop=29216, step=1)

pandas.DataFrame.set_index sets the index of a data frame using an existing column.

Link-https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.set_index.html

head.set_index('City', inplace=True)

The inplace parameter applies the change to the data frame itself instead of returning a new data frame.

Once you have set the index using inplace=True, calling set_index again with the same column name raises a KeyError, because that column is now the index and no longer a regular column.

To remove the index and move it back into a regular column, use the line below:

head.reset_index(inplace=True) 
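The round trip can be sketched as follows, using a tiny made-up data frame (the city and medal values are invented for illustration):

```python
import pandas as pd

# Hypothetical two-row data frame.
head = pd.DataFrame({'City': ['Athens', 'Paris'],
                     'Medal': ['Gold', 'Silver']})

# 'City' moves out of the columns and becomes the index.
head.set_index('City', inplace=True)
columns_after_set = list(head.columns)

# A second call fails because 'City' is no longer a column.
try:
    head.set_index('City', inplace=True)
    second_call_failed = False
except KeyError:
    second_call_failed = True

# reset_index restores the default integer index and
# turns 'City' back into a regular column.
head.reset_index(inplace=True)
columns_after_reset = list(head.columns)
```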

sort_index():

This sorts the rows by index label, or the columns by column name when you pass axis=1.
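A short sketch of sorting along each axis, with invented labels and values:

```python
import pandas as pd

# Hypothetical data frame with an unsorted string index.
df = pd.DataFrame({'Medal': ['Gold', 'Silver', 'Bronze']},
                  index=['Paris', 'Athens', 'London'])

# Sort rows by index label (ascending by default).
by_label = df.sort_index()

# Sort rows by index label in descending order.
by_label_desc = df.sort_index(ascending=False)

# axis=1 sorts the columns by their names instead of the rows.
by_column = pd.DataFrame({'b': [1], 'a': [2]}).sort_index(axis=1)
```

Note that sort_index returns a new data frame here; df itself keeps its original order.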

loc[..]:

loc looks up rows by index label and raises a KeyError if the label is not found. iloc looks up rows by integer position instead.

head.loc[1]
head.iloc[2000]
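The difference between the two lookups is easiest to see when the index labels do not match the row positions. A minimal sketch with made-up labels:

```python
import pandas as pd

# Hypothetical data frame whose labels (10, 20, 30) differ from positions (0, 1, 2).
df = pd.DataFrame({'Medal': ['Gold', 'Silver', 'Bronze']},
                  index=[10, 20, 30])

# loc uses the index label: the row labelled 20.
by_label = df.loc[20]

# iloc uses the integer position: the third row.
by_position = df.iloc[2]

# There is no row labelled 1, so loc raises a KeyError.
try:
    df.loc[1]
    missing_label_raised = False
except KeyError:
    missing_label_raised = True
```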

Group By:

groupby splits a data frame into groups based on some criteria, so that you can compute and combine results per group. A GroupBy object is not a data frame; it is a dictionary-like collection of data frames, one per group.

head.groupby('Sport')

It returns a GroupBy object, which you can confirm by checking its type (the module path shown in the output depends on the pandas version):

type(head.groupby('Sport')) #o/p is pandas.core.groupby.groupby.DataFrameGroupBy
list(head.groupby('Sport'))

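Converting the GroupBy object to a list, as above, iterates through the groups and gives one (group name, data frame) pair per group.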
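You can also loop over the groups directly. The sketch below uses a small invented data frame; the sport and medal values are assumptions for illustration:

```python
import pandas as pd

# Hypothetical stand-in for the medals data frame.
head = pd.DataFrame({
    'Sport': ['Aquatics', 'Athletics', 'Aquatics'],
    'Medal': ['Gold', 'Silver', 'Bronze'],
})

grouped = head.groupby('Sport')

# Each iteration yields the group name and that group's data frame.
group_sizes = {}
for sport, group in grouped:
    group_sizes[sport] = len(group)

# A single group can be fetched by name.
aquatics = grouped.get_group('Aquatics')
```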

df.count()

This returns the number of non-null values in each column. If a column has missing values, its count is lower than the total number of rows.
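A small sketch showing how a missing value affects the count, using made-up data:

```python
import pandas as pd

# Hypothetical data frame with one missing medal.
df = pd.DataFrame({
    'Athlete': ['A', 'B', 'C'],
    'Medal': ['Gold', None, 'Silver'],
})

# One entry per column: the number of non-null values.
counts = df.count()

# The total number of rows, regardless of missing values.
total_rows = len(df)
```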

df[1:100]

This slices rows by position. The start of the slice is included and the end is excluded, so this code returns the rows from position 1 to position 99, that is, up to the end number minus 1. Because positions start at 0, the very first row is not included.
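The slice boundaries can be checked on a throwaway data frame (the 200 rows below are generated just for this sketch):

```python
import pandas as pd

# Hypothetical data frame with 200 rows and the default integer index.
df = pd.DataFrame({'value': range(200)})

# Rows at positions 1 through 99; position 100 is excluded.
subset = df[1:100]

first_label = subset.index[0]
last_label = subset.index[-1]
row_count = len(subset)
```

Writing df.iloc[1:100] selects the same rows and makes it explicit that the slice is by position.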

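Operations on data frames: To import data from an xlsx file, older pandas versions need the xlrd package. xlrd 2.0 and later only reads the old .xls format, so recent pandas versions use openpyxl for xlsx files instead.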

df.to_csv(r'c:\data\mydata.csv'):

This method saves the data from a data frame to a csv file at the given location. By default it also saves the index of the data frame. If you don't want to save the index in the csv, pass index=False: df.to_csv(r'c:\data\mydata.csv', index=False)
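To see what the index flag changes without touching the disk, you can call to_csv with no path, in which case it returns the csv text as a string. A minimal sketch on invented data:

```python
import pandas as pd

# Hypothetical two-row data frame.
df = pd.DataFrame({'City': ['Athens', 'Paris'],
                   'Medal': ['Gold', 'Silver']})

# With no path, to_csv returns the csv content as a string.
with_index = df.to_csv()
without_index = df.to_csv(index=False)

# The header line shows whether an index column was written.
header_with_index = with_index.splitlines()[0]
header_without_index = without_index.splitlines()[0]
```

With the index included, the header starts with an empty field for the unnamed index column; with index=False it contains only the real column names.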

To store data in an Excel file using pandas, you need to install the openpyxl package.

df.to_excel('demo.xlsx', sheet_name='demo')

If you don't want to save the index in the file, use the line below:

df.to_excel('demo.xlsx', sheet_name='demo', index=False)

This creates the xlsx file in the folder where you created the Jupyter notebook file, for example a folder under

C:\Users\UserName\YourFolderName
