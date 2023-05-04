PandasAI is a free and open-source Python library that adds conversational, generative-AI capabilities to pandas, the data manipulation and analysis package. Pandas itself includes a variety of tools for working with structured data, including data frames and series, and it is extremely popular among data scientists and analysts because of its simplicity and adaptability.

Python Pandas, as we all know, is an open-source toolkit that provides data manipulation and analysis capabilities for Python programming. This versatile library has become a must-have for data scientists and analysts.

With its basic yet powerful data structures such as Series and DataFrame, it provides an effective way to manage structured data.

Pandas is often used in the preprocessing stage of machine learning and deep learning workflows in the field of artificial intelligence. It helps turn raw datasets into organized, ready-to-use forms that can be fed into AI algorithms by offering seamless data cleaning, reshaping, merging, and aggregation.
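To make the preprocessing steps concrete, here is a minimal sketch in plain pandas that cleans, merges, and aggregates a tiny dataset. The `sales` and `regions` tables and their values are made up for illustration and are not part of the original example.

```python
import pandas as pd

# Hypothetical raw data: one duplicated row and one missing value.
sales = pd.DataFrame({
    "store": ["A", "A", "B", "B", "B"],
    "units": [10, 10, None, 7, 5],
})
regions = pd.DataFrame({"store": ["A", "B"], "region": ["North", "South"]})

# Cleaning: drop exact duplicate rows, then rows with no unit count.
clean = sales.drop_duplicates().dropna(subset=["units"])

# Merging: attach the region of each store.
merged = clean.merge(regions, on="store", how="left")

# Aggregation: total units per region.
totals = merged.groupby("region", as_index=False)["units"].sum()
print(totals)
```

Each step returns a new DataFrame, so the intermediate results can be inspected before the data is handed to a model.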

As a result, it is crucial in reducing data preparation time and speeding up the AI development process. I’m guessing that’s why “PandasAI” was created.

What is PandasAI?

PandasAI is meant to be used in conjunction with Pandas. It turns Pandas into a conversational tool that allows you to ask questions about your data and receive answers in the form of Pandas DataFrames.

Installation

pip install pandasai

Now import the dependencies:

import pandas as pd
from pandasai import PandasAI
from pandasai.llm.openai import OpenAI

We create a dataframe using pandas:

df = pd.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy", "Spain", "Canada", "Australia", "Japan", "China"],
    "gdp": [21400000, 2940000, 2830000, 3870000, 2160000, 1350000, 1780000, 1320000, 516000, 14000000],
    "happiness_index": [7.3, 7.2, 6.5, 7.0, 6.0, 6.3, 7.3, 7.3, 5.9, 5.0]
})

You can then ask PandasAI, for example, to find all the rows in a DataFrame where a column value is greater than 5, and it will return a DataFrame containing just those rows.
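For comparison, a "greater than 5" request corresponds to a boolean filter in plain pandas. The sketch below uses three rows taken from the article's data. It shows hand-written pandas code, not what PandasAI itself generates.

```python
import pandas as pd

# Three rows from the article's data, kept small for readability.
sample = pd.DataFrame({
    "country": ["Germany", "Japan", "China"],
    "happiness_index": [7.0, 5.9, 5.0],
})

# Boolean mask: True for rows whose happiness_index is strictly above 5.
above_five = sample[sample["happiness_index"] > 5]
print(above_five)
```

The comparison is strict, so China, with a value of exactly 5.0, is left out.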

Next, set up the LLM (in this case, OpenAI). Make sure to replace the API key with your own OpenAI API key.

Note that to use this new library you will need an OpenAI API key, and each request incurs a small charge on your OpenAI account.

OPENAI_API_KEY = "YOUR API KEY"
llm = OpenAI(api_token=OPENAI_API_KEY)
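Because every request is billed to your key, you may prefer not to paste it into the script. One possible approach is to read it from an environment variable. This is a sketch: the helper `load_api_key` is hypothetical, and the variable name `OPENAI_API_KEY` is an assumed convention, not something PandasAI requires.

```python
import os


def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the key stored in the given environment variable, or raise if unset."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key


# Usage with the article's setup (not executed here):
# llm = OpenAI(api_token=load_api_key())
```

This keeps the key out of source control, and the script fails with a clear message if the variable is missing.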

Then we instantiate Pandas AI with the provided large language model and we run it, passing the data frame and the prompt.

pandas_ai = PandasAI(llm)
pandas_ai.run(df, prompt='Which are the 5 happiest countries?')

PandasAI responds with something like:

the top 5 happiest countries are the United States, Canada, Australia, United Kingdom, and Germany.
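If you want to check an answer like this by hand, the same question can be written directly in pandas. This is a minimal sketch using the article's data and `DataFrame.nlargest`. It is the manual equivalent, not the code PandasAI produces internally.

```python
import pandas as pd

df = pd.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy",
                "Spain", "Canada", "Australia", "Japan", "China"],
    "happiness_index": [7.3, 7.2, 6.5, 7.0, 6.0, 6.3, 7.3, 7.3, 5.9, 5.0],
})

# Keep the five rows with the highest happiness_index.
top5 = df.nlargest(5, "happiness_index")
print(top5["country"].tolist())
```

The five countries selected match the answer quoted above.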

So, for those who are unfamiliar with Python or pandas manipulations/transformations, this is a new way of programming with dataframes.

Imagine a world in which, instead of programming the task at hand, you simply converse with the machine and tell it what you want the outcome to be. The computer converts this message into executable code and returns the output to you.
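The idea of turning a question into code can be illustrated with a toy sketch. It is only a conceptual illustration under stated assumptions, not PandasAI's actual implementation: `fake_llm` is a hypothetical stand-in that returns canned code instead of calling a real model, and `ask` is an invented helper name.

```python
import pandas as pd


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call; always returns the same code."""
    return "result = df.nlargest(2, 'happiness_index')['country'].tolist()"


def ask(df: pd.DataFrame, question: str):
    # Describe the data and the question in a single text prompt.
    prompt = (
        f"Columns: {list(df.columns)}\n"
        f"Question: {question}\n"
        "Reply with pandas code that stores the answer in `result`."
    )
    code = fake_llm(prompt)
    # Run the generated code with `df` in scope, then read back `result`.
    # Executing model-written code is risky; a real tool should sandbox it.
    scope = {"df": df}
    exec(code, scope)
    return scope["result"]


df = pd.DataFrame({
    "country": ["United States", "France", "China"],
    "happiness_index": [7.3, 6.5, 5.0],
})
answer = ask(df, "Which are the 2 happiest countries?")
print(answer)
```

The important design point is the round trip: a text prompt goes out, code comes back, and that code is run against the DataFrame you supplied.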

You can also show a chart, for example:

pandas_ai.run(df, "Plot the histogram of countries showing for each the gdp, using different colors for each bar")
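For reference, a chart of this kind can be drawn by hand with matplotlib. The sketch below assumes matplotlib is installed and draws a bar chart with one bar per country, since that is what the prompt describes despite using the word "histogram". It is not the code PandasAI generates.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy",
                "Spain", "Canada", "Australia", "Japan", "China"],
    "gdp": [21400000, 2940000, 2830000, 3870000, 2160000,
            1350000, 1780000, 1320000, 516000, 14000000],
})

# "C0".."C9" are matplotlib's default cycle colors, one per bar.
colors = [f"C{i}" for i in range(len(df))]

fig, ax = plt.subplots(figsize=(10, 4))
ax.bar(df["country"], df["gdp"], color=colors)
ax.set_xlabel("Country")
ax.set_ylabel("GDP")
ax.tick_params(axis="x", rotation=45)
fig.tight_layout()
# fig.savefig("gdp_by_country.png")  # uncomment to write the chart to disk
```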

This article helps you learn about Pandas AI. We trust that it has been helpful to you. Please feel free to share your thoughts and feedback in the comment section below.