AI Agents and Tabular Data
Have you ever wondered how AI agents understand tabular data, such as that in CSV or Excel files? Or how a file loaded into a platform like ChatGPT can be instantly understood and processed? This insight explores the creation of a custom AI agent capable of achieving these tasks from scratch. AI Agents and Tabular Data. Context Jen, an AI Engineer at AI United, leads a team developing an AI agent within a 30-day timeline. This agent will generate tailored interactive charts based on uploaded data files, enabling users to better visualize and interpret the data. To achieve this, the team needed to ensure the AI agent could analyze the file’s data context and autonomously recommend the most appropriate chart types. The agent development was broken down into four main steps: Here’s a look at how the team developed the AI system to ingest CSV data and aggregate it into an actionable format. Setup The development began by configuring package installations and defining the language model to be used: pythonCopy code%pip install langchain_openai %pip install langchain_core %pip install langchain_community %pip install langchain_experimental from langchain_openai.chat_models import ChatOpenAI openai_key = os.environ.get(“OPENAI_API”) gpt4o = ChatOpenAI(temperature=0.0, model=”gpt-4o”, openai_api_key=openai_key) Step 1: Context Creation Before generating the code to process raw data, the team created context around the dataset to enhance the AI’s response accuracy. Metadata extraction included: For demonstration, a wine reviews dataset was used, and metadata was extracted as follows: pythonCopy codeimport pandas as pd def extract_metadata(df): metadata = { ‘Number of Columns’: df.shape[1], ‘Schema’: df.columns.tolist(), ‘Data Types’: str(df.dtypes), ‘Sample’: df.head(1).to_dict(orient=”records”) } return metadata df = pd.read_csv(“wine_reviews.csv”) metadata = extract_metadata(df) Step 2: Prompt Augmentation To help the AI model interpret the dataset, prompts were augmented with extracted metadata using a template: pythonCopy codeprompt_template = ”’ Assistant is an AI model that suggests charts to visualize data based on the following metadata. SCHEMA: ——– {schema} DATA TYPES: ——– {data_types} SAMPLE: ——– {sample} ”’.format(schema=metadata[“Schema”], data_types=metadata[“Data Types”], sample=metadata[“Sample”]) gpt40.invoke(prompt_template) Step 3: Simple Agent Code Generation & Execution With the prompt augmented, the model was able to suggest suitable charts. For data transformation, an agentic workflow with a Python REPL tool was used, where the AI generated code for aggregating data and then executed it to provide the necessary structure for plotting. A REPL instance was created to pass data into Python functions, enabling the AI to perform aggregation. pythonCopy codefrom langchain_experimental.utilities import PythonREPL repl = PythonREPL() repl.globals[‘df’] = df from langchain_core.tools import tool @tool def python_repl(code: str): try: result = repl.run(code) except BaseException as e: return f”Failed to execute. Error: {repr(e)}” result_str = f”Successfully executed:n“`pythonn{code}n“`nStdout: {result}” return result_str tools = [python_repl] Step 4: Final Data Aggregation and Charting Finally, the AI suggested the Bar Chart type for plotting the top 10 wineries by average points, and the REPL instance executed the code to transform the data for the chart: pythonCopy code# Code to aggregate and convert data into dictionary format top_wineries = df.groupby(‘winery’)[‘points’].mean().sort_values(ascending=False).head(10) top_wineries_dict = top_wineries.to_dict() print(top_wineries_dict) The aggregated data was output as: jsonCopy code{‘Macauley’: 96.0, ‘Heitz’: 95.5, ‘Bodega Carmen Rodríguez’: 95.5, ‘Maurodos’: 95.0, ‘Blue Farm’: 95.0, ‘Numanthia’: 95.0, ‘Château Lagrézette’: 95.0, ‘Patricia Green Cellars’: 95.0, ‘Ponzi’: 95.0, ‘Muga’: 95.0} With this approach, the AI agent was capable of not only understanding the data in the uploaded file but also generating interactive visualizations, making complex datasets more accessible and insightful. Like Related Posts Salesforce OEM AppExchange Expanding its reach beyond CRM, Salesforce.com has launched a new service called AppExchange OEM Edition, aimed at non-CRM service providers. Read more Salesforce Jigsaw Salesforce.com, a prominent figure in cloud computing, has finalized a deal to acquire Jigsaw, a wiki-style business contact database, for Read more Health Cloud Brings Healthcare Transformation Following swiftly after last week’s successful launch of Financial Services Cloud, Salesforce has announced the second installment in its series Read more Top Ten Reasons Why Tectonic Loves the Cloud The Cloud is Good for Everyone – Why Tectonic loves the cloud You don’t need to worry about tracking licenses. Read more