Sunday, March 1, 2015

Basic Functionality of Series or DataFrame in Pandas

In this post I will walk you through the fundamental mechanics of interacting with the data contained in a Series or DataFrame in pandas (Python).

Reindexing

Reindexing is a critical method on pandas objects. 'Reindexing' means creating a new object with the data conformed to a new index. Here is the object that I will be using for this post:

import numpy as np
import pandas as pd

obj = pd.Series([4.1, 2.6, 1.1, 3.7], index=['d', 'b', 'a', 'c'])


Calling reindex on this Series rearranges the data according to the new index, introducing missing values (NaN) for any index labels that were not already present in the original Series.

obj2 = obj.reindex(['a', 'b', 'c', 'd', 'e'])
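Here is a small self-contained sketch of the same step, so you can see what ends up in obj2. It only uses the obj defined in this post; the checks at the end are mine and are just there to illustrate the behaviour.

```python
import numpy as np
import pandas as pd

obj = pd.Series([4.1, 2.6, 1.1, 3.7], index=['d', 'b', 'a', 'c'])

# reindex returns a new Series ordered by the labels we pass in
obj2 = obj.reindex(['a', 'b', 'c', 'd', 'e'])
print(obj2)

# 'e' was not in the original index, so it shows up as NaN
print(np.isnan(obj2['e']))

# the original Series is left untouched
print(list(obj.index))
```

Notice that reindex does not modify obj in place, so you need to keep the returned object if you want the new ordering.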

You can fill the missing values by passing fill_value, as below:

obj.reindex(['a', 'b', 'c', 'd', 'e'], fill_value=0)

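For ordered data you can instead fill with the previous value (method='ffill') or the next value (method='bfill'). This needs a sorted index, so the example uses obj3, an illustrative Series with an integer index:

obj3 = pd.Series(['blue', 'purple', 'yellow'], index=[0, 2, 4])
obj3.reindex(range(6), method='ffill')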
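To make the difference between forward and backward filling concrete, here is a minimal sketch. The colors Series is a made-up example of mine (it is not one of the objects from this post); it just has gaps at the odd positions.

```python
import pandas as pd

# Hypothetical ordered Series with gaps at labels 1, 3 and 5
colors = pd.Series(['blue', 'purple', 'yellow'], index=[0, 2, 4])

# ffill carries the last seen value forward into the gaps
forward = colors.reindex(range(6), method='ffill')

# bfill pulls the next value backward; label 5 has nothing after it
backward = colors.reindex(range(6), method='bfill')

print(forward)
print(backward)
```

With ffill every new label gets a value here, while with bfill the last label stays missing because there is no later value to copy from.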

In a DataFrame, reindex can alter the (row) index, the columns, or both.
When you pass just a sequence, the rows are reindexed in the result.

frame = pd.DataFrame(np.arange(27.0, 31.5, 0.5).reshape((3, 3)), index=['a', 'c', 'd'], columns=['Colombo', 'Negombo', 'Gampaha'])


Reindexing the rows:
frame2 = frame.reindex(['a', 'b', 'c', 'd'])

The columns can be reindexed with the columns keyword:
cities= ['Colombo', 'Negombo', 'Kandy']
frame.reindex(columns=cities)

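Both can be reindexed together. Note that method='ffill' needs a sorted index, and in recent pandas versions passing it along with columns=cities can raise an error because the existing column labels are not sorted. It is safer to forward-fill the rows first and then reindex the columns:

frame.reindex(index=['a', 'b', 'c', 'd'], method='ffill').reindex(columns=cities)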
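The three DataFrame cases can be put side by side in one runnable sketch. It reuses frame and cities from this post; the variable names rows, cols and both are mine. I am assuming a recent pandas here, which is why the forward fill is done on the rows alone before the columns are changed.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(
    np.arange(27.0, 31.5, 0.5).reshape((3, 3)),
    index=['a', 'c', 'd'],
    columns=['Colombo', 'Negombo', 'Gampaha'],
)
cities = ['Colombo', 'Negombo', 'Kandy']

# Rows only: the new row 'b' is filled with NaN
rows = frame.reindex(['a', 'b', 'c', 'd'])

# Columns only: 'Gampaha' is dropped, the new column 'Kandy' is NaN
cols = frame.reindex(columns=cities)

# Rows with forward fill, then columns: 'b' copies row 'a'
both = frame.reindex(index=['a', 'b', 'c', 'd'], method='ffill').reindex(columns=cities)

print(rows)
print(cols)
print(both)
```

In both, the 'Kandy' column is still all NaN, because the forward fill only ran along the rows.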


Dropping entries

With a Series, we can drop one or more entries from an axis. drop returns a new object and leaves the original unchanged:
new_obj = obj.drop('c')


new_obj2 = obj.drop(['d', 'c'])
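Here is a short sketch of what those two calls return, using the obj from the start of this post. The last part, about a label that does not exist, is an extra illustration of mine using the errors parameter of drop.

```python
import pandas as pd

obj = pd.Series([4.1, 2.6, 1.1, 3.7], index=['d', 'b', 'a', 'c'])

new_obj = obj.drop('c')           # drop a single label
new_obj2 = obj.drop(['d', 'c'])   # drop several labels at once

print(list(new_obj.index))
print(list(new_obj2.index))

# Dropping a label that is not present raises KeyError by default
try:
    obj.drop('z')
    missing_raises = False
except KeyError:
    missing_raises = True

# errors='ignore' skips labels that are not found
same = obj.drop('z', errors='ignore')
```

The original obj still has all four entries after these calls.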

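With a DataFrame, index values can be deleted from either axis. Here data is assumed to be a DataFrame with row labels 'a' to 'd' and columns 'Colombo', 'Negombo' and 'Kandy', such as the reindexed frame from the previous section. By default, drop removes rows: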
data.drop('a')
data.drop(['a','c'])

To delete columns instead, pass axis=1:

data.drop('Negombo', axis=1)

data.drop(['Negombo', 'Kandy'], axis=1)
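To try the DataFrame calls end to end, here is a minimal sketch. The data frame below is a stand-in I made up with the same row and column labels used in the calls above; its values are arbitrary.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for data: 4 rows, 3 city columns
data = pd.DataFrame(
    np.arange(12).reshape((4, 3)),
    index=['a', 'b', 'c', 'd'],
    columns=['Colombo', 'Negombo', 'Kandy'],
)

no_a = data.drop('a')                  # rows are the default axis
no_ac = data.drop(['a', 'c'])

no_negombo = data.drop('Negombo', axis=1)            # axis=1 means columns
only_colombo = data.drop(['Negombo', 'Kandy'], axis=1)

# The columns keyword is an equivalent way to say axis=1
also_only_colombo = data.drop(columns=['Negombo', 'Kandy'])

print(no_ac)
print(only_colombo)
```

As with a Series, each call returns a new DataFrame, so data itself keeps all of its rows and columns.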
