How to get and check the data types of DataFrame columns in Python pandas
DataFrame data types: In this article we will discuss different ways to get the data type of a single column or of multiple columns in a pandas DataFrame.
Use DataFrame.dtypes to get data types of columns in a DataFrame :
Python's pandas module provides the DataFrame class as a container for storing and manipulating two-dimensional data. The class has an attribute, DataFrame.dtypes, that holds the data type information of each column.
DataFrame.dtypes returns a Series in which the index holds the column names and the values hold the data type of each column.
Let’s try with an example:
#Program :
import pandas as pd
import numpy as np
#list of tuples
game = [('riya',37,'delhi','cat','rose'),
('anjali',28,'agra','dog','lily'),
('tia',42,'jaipur','elephant','lotus'),
('kapil',51,'patna','cow','tulip'),
('raj',30,'banglore','lion','orchid')]
#Create a dataframe object
df = pd.DataFrame(game, columns=['Name','Age','Place','Animal','Flower'], index=['a','b','c','d','e'])
print(df)
Output:
     Name  Age     Place    Animal  Flower
a    riya   37     delhi       cat    rose
b  anjali   28      agra       dog    lily
c     tia   42    jaipur  elephant   lotus
d   kapil   51     patna       cow   tulip
e     raj   30  banglore      lion  orchid
These are the contents of the dataframe. Now let's fetch the data type of each column in the dataframe.
#Program :
import pandas as pd
import numpy as np
#list of tuples
game = [('riya',37,'delhi','cat','rose'),
('anjali',28,'agra','dog','lily'),
('tia',42,'jaipur','elephant','lotus'),
('kapil',51,'patna','cow','tulip'),
('raj',30,'banglore','lion','orchid')]
#Create a dataframe object
df = pd.DataFrame(game, columns=['Name','Age','Place','Animal','Flower'], index=['a','b','c','d','e'])
DataType = df.dtypes
print('Data type of each column:')
print(DataType)
Output:
Data type of each column:
Name      object
Age        int64
Place     object
Animal    object
Flower    object
dtype: object
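Because the result of df.dtypes is itself a pandas Series, ordinary Series operations apply to it. The following is a minimal sketch that reuses the dataframe from the example above; the variable name dtypes is only illustrative.

```python
import pandas as pd

game = [('riya', 37, 'delhi', 'cat', 'rose'),
        ('anjali', 28, 'agra', 'dog', 'lily'),
        ('tia', 42, 'jaipur', 'elephant', 'lotus'),
        ('kapil', 51, 'patna', 'cow', 'tulip'),
        ('raj', 30, 'banglore', 'lion', 'orchid')]
df = pd.DataFrame(game,
                  columns=['Name', 'Age', 'Place', 'Animal', 'Flower'],
                  index=['a', 'b', 'c', 'd', 'e'])

dtypes = df.dtypes

# The result is a Series whose index holds the column names
print(type(dtypes).__name__)   # Series
print(list(dtypes.index))      # ['Name', 'Age', 'Place', 'Animal', 'Flower']

# Each value is a dtype object, which can be compared with its string name
print(dtypes['Age'] == 'int64')  # True
```

The data type reported for the string columns can depend on the pandas version and configuration, so the sketch only compares the integer column.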
Get Data types of dataframe columns as dictionary :
#Program :
import pandas as pd
import numpy as np
#list of tuples
game = [('riya',37,'delhi','cat','rose'),
('anjali',28,'agra','dog','lily'),
('tia',42,'jaipur','elephant','lotus'),
('kapil',51,'patna','cow','tulip'),
('raj',30,'banglore','lion','orchid')]
#Create a dataframe object
df = pd.DataFrame(game, columns=['Name','Age','Place','Animal','Flower'], index=['a','b','c','d','e'])
#get a dictionary that maps each column name to its data type object
DataTypeDict = dict(df.dtypes)
print('Data type of each column :')
print(DataTypeDict)
Output:
Data type of each column :
{'Name': dtype('O'), 'Age': dtype('int64'), 'Place': dtype('O'), 'Animal': dtype('O'), 'Flower': dtype('O')}
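As a possible alternative to dict(df.dtypes), the Series method to_dict() builds the same mapping, and converting the dtypes to strings first gives plain names such as 'int64' instead of dtype objects. This is a sketch under the same data as above; the names dtype_dict and dtype_names are illustrative only.

```python
import pandas as pd

game = [('riya', 37, 'delhi', 'cat', 'rose'),
        ('anjali', 28, 'agra', 'dog', 'lily'),
        ('tia', 42, 'jaipur', 'elephant', 'lotus'),
        ('kapil', 51, 'patna', 'cow', 'tulip'),
        ('raj', 30, 'banglore', 'lion', 'orchid')]
df = pd.DataFrame(game,
                  columns=['Name', 'Age', 'Place', 'Animal', 'Flower'],
                  index=['a', 'b', 'c', 'd', 'e'])

# Series.to_dict() maps column names to dtype objects, like dict(df.dtypes)
dtype_dict = df.dtypes.to_dict()

# Converting to str first maps column names to plain string names
dtype_names = df.dtypes.astype(str).to_dict()

print(dtype_names['Age'])  # int64
```

The string form can be convenient when the mapping has to be logged or written to a text format such as JSON.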
Get the data type of a single column in dataframe :
Because DataFrame.dtypes returns a Series indexed by column name, we can look up the data type of a single column by its name.
#Program :
import pandas as pd
import numpy as np
#list of tuples
game = [('riya',37,'delhi','cat','rose'),
('anjali',28,'agra','dog','lily'),
('tia',42,'jaipur','elephant','lotus'),
('kapil',51,'patna','cow','tulip'),
('raj',30,'banglore','lion','orchid')]
#Create a dataframe object
df = pd.DataFrame(game, columns=['Name','Age','Place','Animal','Flower'], index=['a','b','c','d','e'])
#look up the data type of the 'Age' column in the dtypes Series
DataTypeObj = df.dtypes['Age']
print('Data type of column Age :')
print(DataTypeObj)
Output:
Data type of column Age :
int64
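To check a column's type rather than just print it, one option is the set of helper predicates in pandas.api.types. The sketch below also shows that a single column (a Series) exposes its type through .dtype in the singular. It assumes the same data as above; age_dtype is an illustrative name.

```python
import pandas as pd
from pandas.api.types import is_integer_dtype, is_numeric_dtype

game = [('riya', 37, 'delhi', 'cat', 'rose'),
        ('anjali', 28, 'agra', 'dog', 'lily'),
        ('tia', 42, 'jaipur', 'elephant', 'lotus'),
        ('kapil', 51, 'patna', 'cow', 'tulip'),
        ('raj', 30, 'banglore', 'lion', 'orchid')]
df = pd.DataFrame(game,
                  columns=['Name', 'Age', 'Place', 'Animal', 'Flower'],
                  index=['a', 'b', 'c', 'd', 'e'])

# A single column is a Series, which has .dtype (singular)
age_dtype = df['Age'].dtype
print(age_dtype)                      # int64

# Predicates that test which family a column's type belongs to
print(is_integer_dtype(df['Age']))    # True
print(is_numeric_dtype(df['Name']))   # False
```

The predicates accept either a Series or a dtype object, so is_integer_dtype(age_dtype) would give the same answer as the call on the column.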
Get list of pandas dataframe column names based on data types :
Suppose we want a list of the column names that have a particular data type. The following example program collects the names of the columns whose data type is object (the type used here for strings).
import pandas as pd
import numpy as np
#list of tuples
game = [('riya',37,'delhi','cat','rose'),
('anjali',28,'agra','dog','lily'),
('tia',42,'jaipur','elephant','lotus'),
('kapil',51,'patna','cow','tulip'),
('raj',30,'banglore','lion','orchid')]
#Create a dataframe object
df = pd.DataFrame(game, columns=['Name','Age','Place','Animal','Flower'], index=['a','b','c','d','e'])
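# Keep only the entries whose data type is object (used here for strings).
# np.object_ is used because the old np.object alias was removed in NumPy 1.24.
filteredColumns = df.dtypes[df.dtypes == np.object_]
# The index of the filtered Series holds the matching column names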
listOfColumnNames = list(filteredColumns.index)
print(listOfColumnNames)
Output: ['Name', 'Place', 'Animal', 'Flower']
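Another way to pick columns by type is DataFrame.select_dtypes, which returns a sub-frame containing only the matching columns. The sketch below is an alternative to filtering df.dtypes by hand, not a replacement for it; it selects by the generic 'number' category so that the result does not depend on how string columns are typed. The names numeric_columns and other_columns are illustrative.

```python
import pandas as pd

game = [('riya', 37, 'delhi', 'cat', 'rose'),
        ('anjali', 28, 'agra', 'dog', 'lily'),
        ('tia', 42, 'jaipur', 'elephant', 'lotus'),
        ('kapil', 51, 'patna', 'cow', 'tulip'),
        ('raj', 30, 'banglore', 'lion', 'orchid')]
df = pd.DataFrame(game,
                  columns=['Name', 'Age', 'Place', 'Animal', 'Flower'],
                  index=['a', 'b', 'c', 'd', 'e'])

# Columns with any numeric data type
numeric_columns = df.select_dtypes(include='number').columns.tolist()

# Every column that is not numeric
other_columns = df.select_dtypes(exclude='number').columns.tolist()

print(numeric_columns)  # ['Age']
print(other_columns)    # ['Name', 'Place', 'Animal', 'Flower']
```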
Get data types of a dataframe using DataFrame.info() :
The DataFrame.info() method prints a concise summary of a dataframe. The summary includes the index type and range, the name and data type of each column, the number of non-null values per column, and the memory usage.
#program :
import pandas as pd
import numpy as np
#list of tuples
game = [('riya',37,'delhi','cat','rose'),
('anjali',28,'agra','dog','lily'),
('tia',42,'jaipur','elephant','lotus'),
('kapil',51,'patna','cow','tulip'),
('raj',30,'banglore','lion','orchid')]
#Create a dataframe object
df = pd.DataFrame(game, columns=['Name','Age','Place','Animal','Flower'], index=['a','b','c','d','e'])
df.info()
Output:
<class 'pandas.core.frame.DataFrame'>
Index: 5 entries, a to e
Data columns (total 5 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   Name    5 non-null      object
 1   Age     5 non-null      int64
 2   Place   5 non-null      object
 3   Animal  5 non-null      object
 4   Flower  5 non-null      object
dtypes: int64(1), object(4)
memory usage: 240.0+ bytes
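DataFrame.info() prints its summary and returns None, so the text cannot be assigned directly to a variable. If the summary is needed as a string, for example for logging, one approach is to pass a text buffer through the buf parameter. This is a minimal sketch with the same data; buffer and summary are illustrative names.

```python
import io
import pandas as pd

game = [('riya', 37, 'delhi', 'cat', 'rose'),
        ('anjali', 28, 'agra', 'dog', 'lily'),
        ('tia', 42, 'jaipur', 'elephant', 'lotus'),
        ('kapil', 51, 'patna', 'cow', 'tulip'),
        ('raj', 30, 'banglore', 'lion', 'orchid')]
df = pd.DataFrame(game,
                  columns=['Name', 'Age', 'Place', 'Animal', 'Flower'],
                  index=['a', 'b', 'c', 'd', 'e'])

# Write the summary into an in-memory text buffer instead of stdout
buffer = io.StringIO()
df.info(buf=buffer)

# The captured summary is now an ordinary string
summary = buffer.getvalue()
print(summary)
```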