Machine Learning & Big Data Blog

How To Create a Pandas Dataframe from a Dictionary

Walker Rowe

Here is yet another example of how useful and powerful Pandas is. Pandas can create dataframes from many kinds of data structures—without you having to write lots of lengthy code. One of those data structures is a dictionary.

In this tutorial, we show you two approaches to doing that.

(This tutorial is part of our Pandas Guide.)

A word on Pandas versions

Before you start, upgrade Python to at least 3.7. With Python 3.4, the highest version of Pandas available is 0.22, which does not support specifying column names in all cases when creating a dataframe from a dictionary.

If you are running virtualenv, create a new Python environment and install Pandas like this:

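virtualenv py37 --python=python3.7
# Activate the new environment so pip installs into it (Linux/macOS shell)
source py37/bin/activate
pip install pandas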

You can check the Pandas version with:

import pandas as pd
pd.__version__
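
If a script depends on the columns parameter used later in this tutorial, the version check can be done in code. This is only a minimal sketch: it assumes the version string starts with numeric major and minor parts (such as "1.5.3"), and the name supports_columns is just an illustrative variable.

```python
import pandas as pd

# Take the major and minor parts of a version string such as "1.5.3".
major, minor = (int(part) for part in pd.__version__.split(".")[:2])

# The columns parameter of from_dict() used later needs Pandas 0.23 or newer.
supports_columns = (major, minor) >= (0, 23)
print(pd.__version__, supports_columns)
```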

Create dataframe with Pandas DataFrame constructor

Here we construct a Pandas dataframe from a dictionary. We use the Pandas constructor, since it can handle different types of data structures.

The dictionary below has two keys, scene and facade. Each value is a list of four elements, so it naturally fits into what you can think of as a table with 2 columns and 4 rows.

Pandas is designed to work with row and column data. Each row has a row index. By default, the index is the numbers 0, 1, 2, 3, … but Pandas also lets you use names.

So, let's supply a name for each row in the list idx.

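import pandas as pd

# Note: the name dict shadows Python's built-in dict type.
# It is kept here because the later examples refer to it.
dict = {'scene': ["foul", "murder", "drunken", "intrigue"],
        'facade': ["fair", "beaten", "fat", "elf"]}

# One label per row, in the same order as the list elements
idx = ['hamlet', 'lear', 'falstaff', 'puck']

dp = pd.DataFrame(dict, index=idx)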

Here is the resulting dataframe:

             scene  facade
hamlet        foul    fair
lear        murder  beaten
falstaff   drunken     fat
puck      intrigue     elf
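
Printing the dataframe shows the row names in place of 0, 1, 2, 3, and those names can then be used for lookups. The sketch below rebuilds the same dataframe and selects by label with .loc. The names data and df are illustrative; data is used instead of dict to avoid shadowing the built-in.

```python
import pandas as pd

# Same data as in the tutorial; "data" and "df" are illustrative names.
data = {'scene': ["foul", "murder", "drunken", "intrigue"],
        'facade': ["fair", "beaten", "fat", "elf"]}
idx = ['hamlet', 'lear', 'falstaff', 'puck']
df = pd.DataFrame(data, index=idx)

# Select a whole row by its label.
lear_row = df.loc['lear']

# Select a single cell by row label and column name.
puck_facade = df.loc['puck', 'facade']

print(df)
print(puck_facade)
```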

Create dataframe with Pandas from_dict() Method

Pandas also has a pandas.DataFrame.from_dict() method. That may sound redundant, since the regular constructor already works with dictionaries, but as the examples below show, from_dict() supports parameters specific to dictionaries, such as orient.

In the code, the keys of the dictionary are columns. The row indexes are numbers. That is the default orientation, orient='columns', which means: take the dictionary keys as columns and put the values in rows.

pd.DataFrame.from_dict(dict)
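
To see what the default orientation produces, it can help to inspect the column names and the row index directly. A minimal sketch, using the illustrative names data and by_columns (not from the original code):

```python
import pandas as pd

# Same data as in the tutorial; "data" avoids shadowing the built-in dict.
data = {'scene': ["foul", "murder", "drunken", "intrigue"],
        'facade': ["fair", "beaten", "fat", "elf"]}

# orient='columns' is the default, so it does not need to be passed.
by_columns = pd.DataFrame.from_dict(data)

print(list(by_columns.columns))  # ['scene', 'facade']
print(list(by_columns.index))    # [0, 1, 2, 3]
```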


Now we flip that on its side. We will make the dictionary keys the rows.

pd.DataFrame.from_dict(dict,orient='index')
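
The effect of orient='index' can be checked by looking at the row index and the column labels of the result. This is a small sketch with the illustrative names data and by_rows:

```python
import pandas as pd

# Same data as in the tutorial; "data" avoids shadowing the built-in dict.
data = {'scene': ["foul", "murder", "drunken", "intrigue"],
        'facade': ["fair", "beaten", "fat", "elf"]}

# Each dictionary key becomes a row; the list elements spread across columns.
by_rows = pd.DataFrame.from_dict(data, orient='index')

print(list(by_rows.index))    # ['scene', 'facade']
print(list(by_rows.columns))  # [0, 1, 2, 3]
```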

Notice that the columns have no names, only numbers. That’s not very useful, so below we use the columns parameter, which was introduced in Pandas 0.23.

It’s as simple as putting the column names in an array and passing it as the columns parameter. One wonders why the earlier versions of Pandas did not have that.

pd.DataFrame.from_dict(dict,orient='index',columns=idx)

       hamlet    lear falstaff      puck
scene    foul  murder  drunken  intrigue
facade   fair  beaten      fat       elf
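
One caveat worth knowing: the columns parameter is meant to go with orient='index'. To my knowledge, passing it with the default orient='columns' raises a ValueError. The sketch below shows both cases; data, named, and error_message are illustrative names.

```python
import pandas as pd

# Same data as in the tutorial; "data" avoids shadowing the built-in dict.
data = {'scene': ["foul", "murder", "drunken", "intrigue"],
        'facade': ["fair", "beaten", "fat", "elf"]}
idx = ['hamlet', 'lear', 'falstaff', 'puck']

# With orient='index', columns supplies the column names.
named = pd.DataFrame.from_dict(data, orient='index', columns=idx)
print(named.loc['facade', 'puck'])  # elf

# With the default orient='columns', the same parameter is rejected.
error_message = None
try:
    pd.DataFrame.from_dict(data, columns=idx)
except ValueError as err:
    error_message = str(err)
    print(error_message)
```

When the keys should stay as columns and only the rows need names, the regular constructor with index=idx, shown earlier, is the simpler route.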

About the author

Walker Rowe

Walker Rowe is an American freelance tech writer and programmer living in Cyprus. He writes tutorials on analytics and big data and specializes in documenting SDKs and APIs. He is the founder of the Hypatia Academy Cyprus, an online school to teach secondary school children programming.