Building a Data Science Agent with the Gemini 2.0 Flash-Lite Model: A Code Implementation for Interactive Data Analysis Using the Google API, google.generativeai, Pandas, and IPython.display
In this tutorial, we demonstrate how to combine Pandas, Python's data manipulation library, with Google's generative AI capabilities through the google.generativeai package and a Gemini model (here, gemini-2.0-flash-lite). After setting up the environment with the necessary libraries, configuring a Google API key, and importing IPython's display utilities, the code walks step by step through building a data science agent that analyzes a sample sales dataset. The example shows how to convert a DataFrame to Markdown and then use natural language queries to generate insights about the data, highlighting the potential of combining traditional data analysis tools with modern AI-driven approaches.
!pip install pandas google-generativeai --quiet
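First, we quietly install the Pandas and google-generativeai libraries to set up the environment for data manipulation and AI-driven analysis. Note that DataFrame.to_markdown, which is used later, relies on the optional tabulate package; if it is missing from your environment, install it as well.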
import pandas as pd
import google.generativeai as genai
from IPython.display import Markdown
We import Pandas for data manipulation, google.generativeai for access to Google's generative AI capabilities, and Markdown from IPython.display to render Markdown-formatted output.
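The later cells only print the model's replies, so the imported Markdown class is never actually exercised. A minimal sketch of how it might be used follows. The helper name show_answer is hypothetical and not part of the original notebook, and the fallback to plain print outside a notebook is an assumption made so the snippet also runs as a script.

```python
def show_answer(text):
    """Render text as Markdown in a notebook; fall back to print elsewhere.

    Hypothetical helper, not part of the original notebook.
    Returns "markdown" or "plain" to indicate which path was taken.
    """
    try:
        from IPython.display import Markdown, display
    except ImportError:
        # IPython is not installed (e.g. a plain script), so print the raw text.
        print(text)
        return "plain"
    # Inside Jupyter or Colab this renders headings, tables, and emphasis.
    display(Markdown(text))
    return "markdown"


mode = show_answer("**Example:** a *Markdown* formatted reply")
```

In a notebook, passing a model reply to a helper like this would render any table the model produces instead of showing raw pipe characters.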
GOOGLE_API_KEY = "Use Your API Key Here"
genai.configure(api_key=GOOGLE_API_KEY)
model = genai.GenerativeModel('gemini-2.0-flash-lite')
We assign a placeholder API key, use it to configure the google.generativeai client, and then initialize a GenerativeModel for gemini-2.0-flash-lite to generate content.
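Hardcoding a key in a notebook makes it easy to leak when the notebook is shared. One possible alternative, sketched here under the assumption that the key has been exported as an environment variable named GOOGLE_API_KEY, is to read it at runtime. The helper load_api_key is hypothetical and not part of the original code.

```python
import os


def load_api_key(env_var="GOOGLE_API_KEY"):
    """Return the API key stored in the given environment variable.

    Hypothetical helper. Raises RuntimeError with a readable message if the
    variable is unset or empty, rather than passing an empty key to the client.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key


# In the notebook this would feed the existing configure call, e.g.:
# genai.configure(api_key=load_api_key())
```

Failing early with a clear message tends to be easier to debug than an authentication error raised later by the first model call.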
data = {'Product': ['Laptop', 'Mouse', 'Keyboard', 'Monitor', 'Webcam', 'Headphones'],
        'Category': ['Electronics', 'Electronics', 'Electronics', 'Electronics', 'Electronics', 'Electronics'],
        'Region': ['North', 'South', 'East', 'West', 'North', 'South'],
        'Units Sold': [150, 200, 180, 120, 90, 250],
        'Price': [1200, 25, 75, 300, 50, 100]}
sales_df = pd.DataFrame(data)
print("Sample Sales Data:")
print(sales_df)
print("-" * 30)
Here we create a Pandas DataFrame called sales_df with sample sales data for several products, then print the DataFrame followed by a divider line to visually separate the output.
def ask_gemini_about_data(dataframe, query):
    """
    Asks the Gemini model a question about the given Pandas DataFrame.
    Args:
        dataframe: The Pandas DataFrame to analyze.
        query: The natural language question about the DataFrame.
    Returns:
        The response from the Gemini model as a string.
    """
    # The prompt lines stay flush left so no extra indentation is sent to the model.
    prompt = f"""You are a data analysis agent. Analyze the following pandas DataFrame and answer the question.
DataFrame:
```
{dataframe.to_markdown(index=False)}
```
Question: {query}
Answer:
"""
    response = model.generate_content(prompt)
    return response.text
Here, we construct a Markdown-formatted prompt from the Pandas DataFrame and the natural language query, then generate and return the analysis response using the configured Gemini model.
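Because the function above builds the prompt and calls the model in one step, the prompt itself is never visible. The sketch below separates out the prompt construction so it can be inspected without an API key. The helper build_prompt is hypothetical, and the fallback to DataFrame.to_string when the optional tabulate package is missing is an assumption, not something the original code does.

```python
import pandas as pd

FENCE = "`" * 3  # the triple-backtick delimiter used around the table


def build_prompt(dataframe, query):
    """Build the text prompt without calling the model (hypothetical helper)."""
    try:
        # to_markdown depends on the optional 'tabulate' package.
        table = dataframe.to_markdown(index=False)
    except ImportError:
        # Assumed fallback: a plain-text table if 'tabulate' is not installed.
        table = dataframe.to_string(index=False)
    return (
        "You are a data analysis agent. Analyze the following pandas DataFrame "
        "and answer the question.\n"
        f"DataFrame:\n{FENCE}\n{table}\n{FENCE}\n"
        f"Question: {query}\nAnswer:\n"
    )


# Small demo frame so the snippet runs on its own.
demo_df = pd.DataFrame({"Product": ["Laptop", "Mouse"], "Units Sold": [150, 200]})
demo_prompt = build_prompt(demo_df, "Which product sold more units?")
print(demo_prompt)
```

Printing the prompt this way shows exactly what text the model receives. Note that the whole table is embedded in the prompt, so this pattern suits small DataFrames best.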
# Query 1: What is the total number of units sold across all products?
query1 = "What is the total number of units sold across all products?"
response1 = ask_gemini_about_data(sales_df, query1)
print(f"Question 1: {query1}")
print(f"Answer 1:\n{response1}")
print("-" * 30)
# Query 2: Which product had the highest number of units sold?
query2 = "Which product had the highest number of units sold?"
response2 = ask_gemini_about_data(sales_df, query2)
print(f"Question 2: {query2}")
print(f"Answer 2:\n{response2}")
print("-" * 30)
# Query 3: What is the average price of the products?
query3 = "What is the average price of the products?"
response3 = ask_gemini_about_data(sales_df, query3)
print(f"Question 3: {query3}")
print(f"Answer 3:\n{response3}")
print("-" * 30)
# Query 4: Show me the products sold in the 'North' region.
query4 = "Show me the products sold in the 'North' region."
response4 = ask_gemini_about_data(sales_df, query4)
print(f"Question 4: {query4}")
print(f"Answer 4:\n{response4}")
print("-" * 30)
# Query 5. More complex query: Calculate the total revenue for each product.
query5 = "Calculate the total revenue (Units Sold * Price) for each product and present it in a table."
response5 = ask_gemini_about_data(sales_df, query5)
print(f"Question 5: {query5}")
print(f"Answer 5:\n{response5}")
print("-" * 30)
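Language models can make arithmetic mistakes, so it may be worth comparing the answers to the five queries against values computed directly with Pandas. The following is a sketch of such a cross-check, not part of the original notebook. The variable names are illustrative, and the Category column is omitted because none of the queries use it.

```python
import pandas as pd

# Same sample data as in the tutorial (Category column omitted; no query uses it).
sales_df = pd.DataFrame({
    "Product": ["Laptop", "Mouse", "Keyboard", "Monitor", "Webcam", "Headphones"],
    "Region": ["North", "South", "East", "West", "North", "South"],
    "Units Sold": [150, 200, 180, 120, 90, 250],
    "Price": [1200, 25, 75, 300, 50, 100],
})

# Query 1: total units sold across all products.
total_units = int(sales_df["Units Sold"].sum())
# Query 2: product with the highest number of units sold.
top_product = sales_df.loc[sales_df["Units Sold"].idxmax(), "Product"]
# Query 3: average price of the products.
average_price = float(sales_df["Price"].mean())
# Query 4: products sold in the 'North' region.
north_products = sales_df.loc[sales_df["Region"] == "North", "Product"].tolist()
# Query 5: revenue (Units Sold * Price) per product, in row order.
revenue = (sales_df["Units Sold"] * sales_df["Price"]).tolist()

print(total_units, top_product, round(average_price, 2), north_products, revenue)
```

Any disagreement between these values and the model's reply is a signal to treat that answer with caution.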
In summary, the tutorial illustrates how combining Pandas, the google.generativeai package, and a Gemini model can make data analysis tasks more interactive and insightful. This approach simplifies querying and interpreting data and opens avenues for more advanced use cases such as data cleaning, feature engineering, and exploratory data analysis. By using these tools within the familiar Python ecosystem, data scientists can work more productively and more easily extract meaningful insights from complex datasets.