Function Calling Archives - gettectonic.com
Fivetran's Hybrid Deployment

Fivetran's Hybrid Deployment

Fivetran's Hybrid Deployment: A Breakthrough in Data Engineering

In the data engineering world, balancing efficiency with security has long been a challenge. Fivetran aims to shift this dynamic with its Hybrid Deployment solution, designed to move data seamlessly across any environment while maintaining control and flexibility.

The Hybrid Advantage: Flexibility Meets Control

Fivetran's Hybrid Deployment offers a new approach for enterprises, particularly those handling sensitive data or operating in regulated sectors. These businesses often struggle to adopt data-driven practices because of security concerns. Hybrid Deployment changes this by enabling the secure movement of data across cloud and on-premises environments, giving businesses full control over their data while retaining the agility of the cloud. As George Fraser, Fivetran's CEO, notes, "Businesses no longer have to choose between managed automation and data control. They can now securely move data from all their critical sources—like Salesforce, Workday, Oracle, SAP—into a data warehouse or data lake, while keeping that data under their own control."

How It Works: A Secure, Streamlined Approach

Fivetran's Hybrid Deployment relies on a lightweight local agent that moves data securely within the customer's environment, while the Fivetran platform handles management and monitoring. This separation of the control plane and the data plane ensures that sensitive information stays within the customer's secure perimeter. Vinay Kumar Katta, a managing delivery architect at Capgemini, highlights the flexibility this provides, enabling businesses to design pipelines without sacrificing security.

Beyond Security: Additional Benefits

Hybrid Deployment's benefits go beyond security, and early adopters are already seeing its value. Troy Fokken, chief architect at phData, praises how it "streamlines data pipeline processes," especially for customers in regulated industries.

AI Agent Architectures: Defining the Future of Autonomous Systems

In the rapidly evolving world of AI, a new framework is emerging: AI agents designed to act autonomously, adapt dynamically, and explore digital environments. These agents are built on core architectural principles that bring the next generation of autonomy to AI-driven tasks.

What Are AI Agents?

AI agents are systems designed to perform tasks autonomously or semi-autonomously, leveraging tools to achieve their objectives. These agents may call APIs, perform web searches, or interact with digital environments. At their core, AI agents use Large Language Models (LLMs) and Foundation Models (FMs) to break down complex tasks, similar to human reasoning.

Large Action Models (LAMs)

Just as LLMs transformed natural language processing, Large Action Models (LAMs) are revolutionizing how AI agents interact with their environments. These models excel at function calling, turning natural language into structured, executable actions and enabling agents to perform real-world tasks such as scheduling meetings or triggering API calls. Salesforce AI Research, for instance, has open-sourced several LAMs designed to facilitate meaningful actions. LAMs bridge the gap between unstructured inputs and structured outputs, making AI agents more effective in complex environments.

Model Orchestration and Small Language Models (SLMs)

Model orchestration complements LAMs by using smaller, specialized models (SLMs) for niche tasks.
Instead of relying on resource-heavy models for every step, AI agents can call on these smaller models for specific functions, such as summarizing data or executing commands, creating a more efficient system. Combined with techniques like Retrieval-Augmented Generation (RAG), SLMs can perform comparably to their larger counterparts on knowledge-intensive tasks.

Vision-Enabled Language Models for Digital Exploration

AI agents are becoming even more capable with vision-enabled language models, which allow them to interact with digital environments directly. Projects such as Apple's Ferret-UI and WebVoyager exemplify this: agents navigate user interfaces, recognize on-screen elements via OCR, and explore websites autonomously.

Function Calling: Structured, Actionable Outputs

A fundamental shift is happening with function calling in AI agents: a move from unstructured text to structured, actionable outputs. This allows agents to interact with systems more reliably, triggering specific actions such as booking meetings or executing API calls.

The Role of Tools and Human-in-the-Loop

AI agents rely on tools, whether algorithms, scripts, or humans in the loop, to perform tasks and guide their actions. This approach is particularly valuable in high-stakes industries like healthcare and finance, where precision is crucial.

The Future of AI Agents

With the advent of Large Action Models, model orchestration, and function calling, AI agents are becoming powerful problem solvers. They are evolving to explore, learn, and act within digital ecosystems, bringing us closer to a future where AI mimics human problem-solving processes. As AI agents become more sophisticated, they will redefine how we approach digital tasks and interactions.
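To make the model-orchestration idea above concrete, here is a minimal sketch in Python of a router that sends lightweight requests to a small, specialized model and reserves a larger model for complex reasoning. The model names, the keyword-based routing heuristic, and the call_model stub are illustrative assumptions, not part of any specific vendor's API.

```python
# Minimal model-orchestration sketch: route each task to a small or large model.
# The model names and routing heuristic below are illustrative assumptions.

def call_model(model_name: str, prompt: str) -> str:
    """Stand-in for a real inference call (API or local runtime)."""
    return f"[{model_name}] response to: {prompt[:40]}..."

def route_task(task: str) -> str:
    """Send cheap, narrow tasks to a small model; everything else to a large one."""
    narrow_keywords = ("summarize", "extract", "classify")
    if any(keyword in task.lower() for keyword in narrow_keywords):
        model = "small-summarizer-1b"   # hypothetical SLM
    else:
        model = "general-reasoner-70b"  # hypothetical larger model
    return call_model(model, task)

if __name__ == "__main__":
    print(route_task("Summarize this quarterly sales report."))
    print(route_task("Plan a three-step data migration with rollback."))
```

In a production system the routing decision would more likely be made by a lightweight classifier or by the orchestrating model itself rather than by keyword matching, but the division of labor is the same.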

Read More
AI Agents Connect Tool Calling and Reasoning

AI Agents Connect Tool Calling and Reasoning

AI Agents: Bridging Tool Calling and Reasoning in Generative AI

Exploring Problem Solving and Tool-Driven Decision Making in AI

Introduction: The Emergence of Agentic AI

Recent advancements in libraries and low-code platforms have simplified the creation of AI agents, often referred to as digital workers. Tool calling stands out as a key capability that enhances the "agentic" nature of Generative AI models, enabling them to move beyond mere conversational tasks. By executing tools (functions), these agents can act on your behalf and tackle intricate, multi-step problems requiring sound decision-making and interaction with diverse external data sources. This insight explores the role of reasoning in tool calling, examines the challenges associated with tool usage, discusses common evaluation methods for tool-calling proficiency, and provides examples of how various models and agents engage with tools.

Reasoning as a Means of Problem-Solving

Successful agents rely on two fundamental expressions of reasoning: reasoning through evaluation and planning, and reasoning through tool use. While both reasoning expressions are vital, they don't always need to be combined to yield powerful solutions. For instance, OpenAI's o1 model excels in reasoning through evaluation and planning, having been trained to utilize chain of thought effectively. This has notably enhanced its ability to address complex challenges, achieving human PhD-level accuracy on benchmarks like GPQA across physics, biology, and chemistry, and ranking in the 86th to 93rd percentile on Codeforces contests. However, the o1 model currently lacks explicit tool-calling capabilities.

Conversely, many models are specifically fine-tuned for reasoning through tool use, allowing them to generate function calls and interact with APIs effectively. These models focus on executing the right tool at the right moment but may not evaluate their results as thoroughly as the o1 model. The Berkeley Function Calling Leaderboard (BFCL) serves as an excellent resource for comparing the performance of various models on tool-calling tasks and provides an evaluation suite for assessing fine-tuned models against challenging scenarios. The recently released BFCL v3 now includes multi-step, multi-turn function calling, raising the standards for tool-based reasoning tasks.

Both reasoning types are powerful in their own right, and their combination holds the potential to develop agents that can effectively deconstruct complex tasks and autonomously interact with their environments. For more insights into AI agent architectures for reasoning, planning, and tool calling, check out my team's survey paper on arXiv.

Challenges in Tool Calling: Navigating Complex Agent Behaviors

Creating robust and reliable agents necessitates overcoming various challenges. In tackling complex problems, an agent often must juggle multiple tasks simultaneously, including planning, timely tool interactions, accurate formatting of tool calls, retaining outputs from prior steps, avoiding repetitive loops, and adhering to guidelines that safeguard the system against jailbreaks and prompt injections. Such demands can easily overwhelm a single agent. This has led to a trend where what appears to an end user as a single agent is actually a coordinated effort of multiple agents and prompts working in unison to divide and conquer the task. This division enables tasks to be segmented and addressed concurrently by distinct models and agents, each tailored to tackle specific components of the problem.
This is where models with exceptional tool-calling capabilities come into play. While tool calling is a potent method for empowering productive agents, it introduces its own set of challenges. Agents must grasp the available tools, choose the appropriate one from a potentially similar set, accurately format the inputs, execute calls in the correct sequence, and potentially integrate feedback or instructions from other agents or humans. Many models are fine-tuned specifically for tool calling, allowing them to specialize in selecting the right function at the right time, and a number of considerations come into play when fine-tuning a model for this purpose.

Common Benchmarks for Evaluating Tool Calling

As tool usage in language models becomes increasingly significant, numerous datasets have emerged to facilitate the evaluation and enhancement of model tool-calling capabilities. Two prominent benchmarks are the Berkeley Function Calling Leaderboard and the Nexus Function Calling Benchmark, both used by Meta to assess the performance of their Llama 3.1 model series. The recent ToolACE paper illustrates how agents can generate a diverse dataset for fine-tuning and evaluating model tool use. Each of these benchmarks improves our ability to evaluate model reasoning expressed through tool calling, and together they reflect a growing trend toward developing specialized models for specific tasks and extending the capabilities of LLMs to interact with the real world.

Practical Applications of Tool Calling

If you're interested in observing tool calling in action, here are some examples, ordered by ease of use, from simple built-in tools to fine-tuned models and agents with tool-calling capabilities. While a built-in web search feature is convenient, most applications require defining custom tools that can be integrated into your model workflows, which leads to the next level of complexity.

To observe how models articulate tool calls, you can use the Databricks Playground. For example, select the Llama 3.1 405B model and grant access to sample tools like get_distance_between_locations and get_current_weather. When prompted with, "I am going on a trip from LA to New York. How far are these two cities? And what's the weather like in New York? I want to be prepared for when I get there," the model decides which tools to call and what parameters to provide for an effective response. In this scenario, the model suggests two tool calls. Since the model cannot execute the tools itself, the user must enter a sample result to simulate each tool's output.

Suppose you instead employ a model fine-tuned on the Berkeley Function Calling Leaderboard dataset. When prompted with, "How many times has the word 'freedom' appeared in the entire works of Shakespeare?" the model will retrieve and return the answer, executing the required tool calls without the user needing to define any input or manage the output format. Such models handle multi-turn interactions adeptly, processing past user messages, managing context, and generating coherent, task-specific outputs. As AI agents evolve to encompass advanced reasoning and problem-solving capabilities, they will become increasingly adept at managing complex, multi-step tasks on our behalf.
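To ground the Databricks Playground example above, here is a minimal sketch of what OpenAI-style tool definitions for get_distance_between_locations and get_current_weather might look like, together with a local dispatcher that simulates the two tool calls the model would request for the LA-to-New-York prompt. The JSON schema layout follows the common chat-completions "tools" convention, and the stub return values are invented for illustration.

```python
import json

# Tool schemas in the common chat-completions "tools" format (illustrative).
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_distance_between_locations",
            "description": "Return the distance in miles between two cities.",
            "parameters": {
                "type": "object",
                "properties": {
                    "origin": {"type": "string"},
                    "destination": {"type": "string"},
                },
                "required": ["origin", "destination"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Return current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"location": {"type": "string"}},
                "required": ["location"],
            },
        },
    },
]

# Stub implementations standing in for real distance and weather services.
def get_distance_between_locations(origin: str, destination: str) -> str:
    return json.dumps({"origin": origin, "destination": destination, "miles": 2450})

def get_current_weather(location: str) -> str:
    return json.dumps({"location": location, "forecast": "62F, partly cloudy"})

DISPATCH = {
    "get_distance_between_locations": get_distance_between_locations,
    "get_current_weather": get_current_weather,
}

# The two tool calls a model might emit for the trip-planning prompt.
model_tool_calls = [
    {"name": "get_distance_between_locations",
     "arguments": {"origin": "Los Angeles", "destination": "New York"}},
    {"name": "get_current_weather", "arguments": {"location": "New York"}},
]

for call in model_tool_calls:
    result = DISPATCH[call["name"]](**call["arguments"])
    print(call["name"], "->", result)
```

In a real workflow the tool results would be appended to the conversation and sent back to the model so it can compose the final answer; here they are simply printed.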

Read More
OpenAI Update

OpenAI Update

OpenAI has established itself as a leading force in the generative AI space, with ChatGPT being one of the most widely recognized AI tools. Powered by the GPT series of large language models (LLMs), ChatGPT primarily used GPT-4o and GPT-3.5 as of September 2024. This insight provides an OpenAI update.

In August and September 2024, rumors circulated about a new model from OpenAI, codenamed "Strawberry." Initially, it was unclear whether this model would be a successor to GPT-4o or something entirely different. On September 12, 2024, the mystery was resolved with the official launch of OpenAI's o1 models, including o1-preview and o1-mini.

What is OpenAI o1?

OpenAI o1 is a new family of LLMs optimized for advanced reasoning tasks. Unlike earlier models, o1 is designed to improve problem-solving by reasoning through queries rather than just generating quick responses. This deeper processing aims to produce more accurate answers to complex questions, particularly in STEM fields (science, technology, engineering, and mathematics). The o1 models, currently available in preview form, are intended to provide a new type of LLM experience beyond what GPT-4o offers. Like all OpenAI LLMs, the o1 series is built on the transformer architecture and can be used for tasks such as content summarization, new content generation, question answering, and writing code.

Key Features of OpenAI o1

The standout feature of the o1 models is their ability to engage in multistep reasoning. By adopting a chain-of-thought approach, o1 models break down complex problems and reason through them iteratively. This makes them particularly adept at handling intricate queries that require a more thoughtful response. The initial September 2024 launch included two models: o1-preview and o1-mini.

Use Cases for OpenAI o1

The o1 models can perform many of the same functions as GPT-4o, such as answering questions, summarizing content, and generating text. However, they are particularly suited for tasks that benefit from enhanced reasoning.

Availability and Access

The o1-preview and o1-mini models are available to ChatGPT Plus and Team users as of September 12, 2024. OpenAI plans to extend access to ChatGPT Enterprise and Education users starting September 19, 2024. While free ChatGPT users do not have access to these models at launch, OpenAI intends to bring o1-mini to free users in the future. Developers can also access the models through OpenAI's API, and third-party platforms such as Microsoft Azure AI Studio and GitHub Models offer integration.

Limitations of OpenAI o1

As preview models, the o1 series comes with certain limitations.

Enhancing Safety with OpenAI o1

OpenAI released a System Card that outlines how the o1 models were evaluated for risks such as cybersecurity threats, persuasion, and model autonomy, and the o1 models incorporate improved safety measures.

GPT-4o vs. OpenAI o1

Here's a quick comparison between GPT-4o and OpenAI's new o1 models:

| Feature | GPT-4o | o1 Models |
| --- | --- | --- |
| Release Date | May 13, 2024 | Sept. 12, 2024 |
| Model Variants | Single model | Two variants: o1-preview and o1-mini |
| Reasoning Capabilities | Good | Enhanced, especially for STEM fields |
| Mathematics Olympiad Score | 13% | 83% |
| Context Window | 128K tokens | 128K tokens |
| Speed | Faster | Slower due to in-depth reasoning |
| Cost (per million tokens) | Input: $5; Output: $15 | o1-preview: $15 input, $60 output; o1-mini: $3 input, $12 output |
| Safety and Alignment | Standard | Enhanced safety, better jailbreak resistance |

OpenAI's o1 models bring a new level of reasoning and accuracy, making them a promising advancement in generative AI.
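As a quick worked example of the pricing in the comparison table, the sketch below estimates the cost of one workload (1 million input tokens and 200,000 output tokens) on GPT-4o, o1-preview, and o1-mini using the listed per-million-token rates. The workload size is an arbitrary illustration.

```python
# Per-million-token prices taken from the comparison table above (USD).
PRICES = {
    "gpt-4o":     {"input": 5.00,  "output": 15.00},
    "o1-preview": {"input": 15.00, "output": 60.00},
    "o1-mini":    {"input": 3.00,  "output": 12.00},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate USD cost for a workload at the listed per-million-token rates."""
    price = PRICES[model]
    return (input_tokens / 1_000_000) * price["input"] + \
           (output_tokens / 1_000_000) * price["output"]

# Example workload: 1M input tokens, 200K output tokens.
for model in PRICES:
    print(f"{model}: ${estimate_cost(model, 1_000_000, 200_000):.2f}")
# gpt-4o: $8.00, o1-preview: $27.00, o1-mini: $5.40
```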

Read More
ChatGPT OpenAI o1

ChatGPT OpenAI o1

OpenAI has firmly established itself as a leader in the generative AI space, with ChatGPT being one of the most well-known applications of AI today. Powered by the GPT family of large language models (LLMs), ChatGPT's primary models, as of September 2024, are GPT-4o and GPT-3.5.

In August and September 2024, rumors surfaced about a new model from OpenAI, codenamed "Strawberry." Speculation grew as to whether this was a successor to GPT-4o or something else entirely. The mystery was resolved on September 12, 2024, when OpenAI launched its new o1 models, including o1-preview and o1-mini.

What Is OpenAI o1?

The OpenAI o1 family is a series of large language models optimized for enhanced reasoning capabilities. Unlike GPT-4o, the o1 models are designed to offer a different type of user experience, focusing more on multistep reasoning and complex problem-solving. As with all OpenAI models, o1 uses a transformer-based architecture and excels in tasks such as content summarization, content generation, coding, and answering questions. What sets o1 apart is its improved reasoning ability: instead of prioritizing speed, the o1 models spend more time "thinking" about the best approach to a problem, making them better suited to complex queries. The o1 models use chain-of-thought prompting, reasoning step by step through a problem, and employ reinforcement learning techniques to enhance performance.

Initial Launch

On September 12, 2024, OpenAI introduced two versions of the o1 models: o1-preview and o1-mini.

Key Capabilities of OpenAI o1

OpenAI o1 can handle a variety of tasks, but it is particularly well suited to use cases that benefit from its advanced reasoning functionality.

How to Use OpenAI o1

There are several ways to access the o1 models.

Limitations of OpenAI o1

As an early iteration, the o1 models have several limitations.

How OpenAI o1 Enhances Safety

OpenAI released a System Card alongside the o1 models, detailing the safety and risk assessments conducted during their development, including evaluations in areas like cybersecurity, persuasion, and model autonomy. The o1 models also incorporate several key safety features.

GPT-4o vs. OpenAI o1: A Comparison

Here's a side-by-side comparison of GPT-4o and OpenAI o1:

| Feature | GPT-4o | o1 Models |
| --- | --- | --- |
| Release Date | May 13, 2024 | Sept. 12, 2024 |
| Model Variants | Single model | Two: o1-preview and o1-mini |
| Reasoning Capabilities | Good | Enhanced, especially in STEM fields |
| Performance Benchmarks | 13% on Math Olympiad | 83% on Math Olympiad, PhD-level accuracy in STEM |
| Multimodal Capabilities | Text, images, audio, video | Primarily text, with developing image capabilities |
| Context Window | 128K tokens | 128K tokens |
| Speed | Fast | Slower due to more reasoning processes |
| Cost (per million tokens) | Input: $5; Output: $15 | o1-preview: $15 input, $60 output; o1-mini: $3 input, $12 output |
| Availability | Widely available | Limited to specific users |
| Features | Includes web browsing, file uploads | Lacks some GPT-4o features, like web browsing |
| Safety and Alignment | Focus on safety | Improved safety, better resistance to jailbreaking |

ChatGPT OpenAI o1

OpenAI o1 marks a significant advancement in reasoning capabilities, setting a new standard for complex problem-solving with LLMs. With enhanced safety features and the ability to tackle intricate tasks, the o1 models offer a distinct upgrade over their predecessors.
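For developers accessing o1 through the API, a minimal sketch using the official OpenAI Python SDK might look like the following. The exact model identifier, availability, and parameter restrictions depend on your account's access tier, so treat those details as assumptions to verify against OpenAI's current documentation.

```python
# Minimal sketch of calling an o1-series model via the OpenAI Python SDK.
# Requires the `openai` package and an OPENAI_API_KEY in the environment;
# model availability and parameter support may differ for your account.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="o1-preview",  # assumed identifier; check the models available to you
    messages=[
        {
            "role": "user",
            "content": "A train travels 300 miles in 4 hours, then 150 miles "
                       "in 2 hours. What is its average speed overall?",
        }
    ],
)

print(response.choices[0].message.content)
```

Note that only a user message is sent here; at launch the o1 preview models were more restrictive than GPT-4o about parameters such as system messages, so keep requests simple and consult the documentation for what is currently supported.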

Read More
Large Action Models and AI Agents

Large Action Models and AI Agents

The introduction of LAMs marks a significant advancement in AI, focusing on actionable intelligence. By enabling robust, dynamic interactions through function calling and structured output generation, LAMs are set to redefine the capabilities of AI agents across industries.
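As an illustration of the structured output generation mentioned above, the sketch below validates a model-produced action against a small schema before anything is executed. The action format and the schedule_meeting action are invented for illustration and are not tied to any particular LAM.

```python
import json

# Invented example of a structured action a LAM-style model might emit.
raw_model_output = (
    '{"action": "schedule_meeting", '
    '"arguments": {"attendee": "dana@example.com", "time": "2024-10-01T15:00"}}'
)

# Minimal schema: which actions exist and which argument keys each requires.
ALLOWED_ACTIONS = {
    "schedule_meeting": {"attendee", "time"},
    "send_email": {"to", "subject", "body"},
}

def validate_action(raw: str) -> dict:
    """Parse and validate a structured action before it is executed."""
    parsed = json.loads(raw)
    action, args = parsed["action"], parsed["arguments"]
    required = ALLOWED_ACTIONS.get(action)
    if required is None:
        raise ValueError(f"Unknown action: {action}")
    missing = required - args.keys()
    if missing:
        raise ValueError(f"Missing arguments for {action}: {missing}")
    return parsed

print(validate_action(raw_model_output))
```

Validating structured output against an allow-list like this is one simple way to keep an agent's actions predictable before they reach real systems.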

Read More
GPT-o1 GPT5 Review

GPT-o1 GPT5 Review

OpenAI has released its latest model series, OpenAI o1 (codenamed Project Strawberry and widely speculated beforehand to be "GPT-5"), positioning it as a significant advancement in AI with PhD-level reasoning capabilities. The o1 series is designed to enhance problem-solving in fields such as science, coding, and mathematics, and initial results indicate that it lives up to the anticipation.

Key Features of OpenAI o1

The release spans enhanced reasoning capabilities, safety and alignment work, targeted applications, and multiple model variants.

Access and Availability

The o1 models are available to ChatGPT Plus and Team users, with broader access expected soon for ChatGPT Enterprise users. Developers can access the models through the API, although certain features, such as function calling, are still in development. Free access to o1-mini is expected in the near future.

Reinforcement Learning at the Core

The o1 models use reinforcement learning to improve their reasoning abilities. This approach trains the models to think more effectively, and their performance improves with additional time spent on a task. OpenAI continues to explore how to scale this approach, though details remain limited.

Major Milestones

The o1 models have achieved impressive results on several competitive benchmarks.

Chain of Thought Reasoning

OpenAI's o1 models employ the chain-of-thought technique, which allows the model to work through problems step by step. This method helps the model approach complex problems in a structured way, similar to human reasoning.

While the o1 models show immense promise, they still have limitations, which have been covered in detail elsewhere. Based on early tests, however, the model performs impressively, and users are hopeful that these capabilities prove as robust as advertised rather than overhyped, as some felt earlier OpenAI projects such as Sora and SearchGPT were.
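To illustrate the chain-of-thought idea in practice, here is a small sketch contrasting a direct prompt with a step-by-step prompt. The wording of the instructions and the send_to_model stub are illustrative; o1 performs this kind of reasoning internally, while the explicit prompting pattern shown here is what users typically apply to general-purpose chat models.

```python
# Illustrative contrast between a direct prompt and a chain-of-thought prompt.
# send_to_model is a stand-in for a real chat-completions call.

QUESTION = "A store sells pens at 3 for $2. How much do 18 pens cost?"

direct_prompt = f"{QUESTION}\nAnswer with just the final number."

chain_of_thought_prompt = (
    f"{QUESTION}\n"
    "Think through the problem step by step:\n"
    "1. Work out the price of a single group of pens.\n"
    "2. Count how many groups are needed and scale the price.\n"
    "3. State the final answer on its own line."
)

def send_to_model(prompt: str) -> str:
    """Stand-in for a real model call; returns a placeholder string."""
    return f"[model response to a {len(prompt)}-character prompt]"

print(send_to_model(direct_prompt))
print(send_to_model(chain_of_thought_prompt))
```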

Read More
Salesforce Tiny Giant LLM

Salesforce Tiny Giant LLM

'On-device Agentic AI is Here!': Salesforce Announces the 'Tiny Giant' LLM

Salesforce CEO Marc Benioff is excited about the company's latest innovation in AI, introducing the 'Tiny Giant' LLM, which he claims is the world's top-performing "micro-model" for function calling. Salesforce's new slimline "Tiny Giant" LLM reportedly outperforms much larger models, marking a significant advancement in on-device AI.

According to a paper published on arXiv by Salesforce's AI Research department, the xLAM-7B model ranked sixth among 46 models, including those from OpenAI and Google, in a competition testing function calling (the execution of tasks or functions through API calls). The xLAM-7B model has just seven billion parameters, a small fraction of the 1.7 trillion parameters rumored to be used by GPT-4. However, Salesforce highlights the smaller xLAM-1B model as its true star. Despite having just one billion parameters, xLAM-1B finished 24th, surpassing GPT-3.5-Turbo and Claude-3 Haiku in performance.

CEO Marc Benioff proudly shared these results on X (formerly Twitter), stating: "Meet Salesforce Einstein 'Tiny Giant.' Our 1B parameter model xLAM-1B is now the best micro-model for function-calling, outperforming models 7x its size… On-device agentic AI is here. Congrats Salesforce Research!"

Salesforce's research emphasizes that function-calling agents represent a significant advancement in AI and LLMs. Models like GPT-4, Gemini, and Mistral already execute API calls based on natural language prompts, enabling dynamic interactions with various digital services and applications. While many popular models are large and resource-intensive, requiring cloud data centers and extensive infrastructure, Salesforce's new models demonstrate that smaller, more efficient models can achieve state-of-the-art performance.

To train and test function-calling LLMs, Salesforce developed APIGen, an "Automated Pipeline for Generating verifiable and diverse function-calling datasets," to synthesize data for AI training. Salesforce's findings indicate that models trained on relatively small, carefully curated datasets can outperform those trained on larger ones. "Models trained with our curated datasets, even with only seven billion parameters, can achieve state-of-the-art performance… outperforming multiple GPT-4 models," the paper states.

The ultimate goal is to create agentic AI models capable of function calling and task execution on devices, minimizing the need for extensive external infrastructure and enabling self-sufficient operation. Dr. Eli David, co-founder of the cybersecurity firm Deep Instinct, commented on X: "Smaller, more efficient models are the way to go for widespread deployment of LLMs."
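For readers who want to experiment with a small function-calling model locally, a rough sketch using the Hugging Face transformers library is shown below. The repository name Salesforce/xLAM-1b-fc-r and the plain-text prompt layout are assumptions; check the actual model card for the exact identifier and the tool/prompt format the model expects.

```python
# Rough sketch of running a small function-calling model locally with transformers.
# The model id and prompt format are assumptions; consult the model card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Salesforce/xLAM-1b-fc-r"  # assumed Hugging Face repository name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Describe the available tool and the user request in a simple prompt.
prompt = (
    "Available tools:\n"
    "get_current_weather(location: str) -> str\n\n"
    "User: What's the weather like in New York right now?\n"
    "Assistant (respond with a JSON function call):"
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
generated = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(generated, skip_special_tokens=True))
```

A one-billion-parameter model of this kind can run on a single modern GPU or even CPU, which is the point of the "on-device agentic AI" claim: the function-call decision is made locally and only the resulting API call needs to leave the device.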

Read More