Advanced RAG Archives - gettectonic.com

29Oct

More AI Tools to Use

Additionally, Arc’s collaboration with Perplexity elevates browsing by transforming search experiences. Perplexity functions as a personal AI research assistant, fetching and summarizing information along with sources, visuals, and follow-up questions. Premium users even have access to advanced large language models like GPT-4 and Claude. Together, Arc and Perplexity revolutionize how users navigate the web. Like Related Posts Salesforce OEM AppExchange Expanding its reach beyond CRM, Salesforce.com has launched a new service called AppExchange OEM Edition, aimed at non-CRM service providers. Read more Salesforce Jigsaw Salesforce.com, a prominent figure in cloud computing, has finalized a deal to acquire Jigsaw, a wiki-style business contact database, for Read more Health Cloud Brings Healthcare Transformation Following swiftly after last week’s successful launch of Financial Services Cloud, Salesforce has announced the second installment in its series Read more Top Ten Reasons Why Tectonic Loves the Cloud The Cloud is Good for Everyone – Why Tectonic loves the cloud You don’t need to worry about tracking licenses. Read more

October 29, 2024in AI Tools, Data, Google

24Oct

AI Innovation at Salesforce

AI innovation is advancing at an unprecedented pace, unlike anything I’ve seen in nearly 25 years at Salesforce. It’s now a top priority for every CEO, CTO, and CIO I speak with. As a trusted partner, we help customers innovate, iterate, and navigate the evolving AI landscape. They recognize AI’s immense potential to revolutionize every aspect of business, across all industries. While they’re already seeing significant advancements, we are still just scratching the surface of AI’s full transformational promise. They seek AI technologies that will enhance productivity, augment employee performance at scale, improve customer relationships, and ultimately drive rapid time to value and higher margins. That’s where our new Agentforce Platform comes in. Agentforce represents a breakthrough in AI, delivering on the promise of autonomous AI agents. These agents perform advanced planning and decision-making with minimal human input, automating entire workflows, making real-time decisions, and adapting to new information—all without requiring human intervention. Salesforce customers are embracing Agentforce and integrating it with other products, including Einstein AI, Data Cloud, Sales Cloud, and Service Cloud. Here are some exciting ways our customers are utilizing these tools: Strengthening Customer Relationships with AI Agents OpenTable is leveraging autonomous AI agents to handle the massive scale of its operations, supporting 60,000 restaurants and millions of diners. By piloting Agentforce for Service, they’ve automated common tasks like account reactivations, reservation management, and loyalty point expiration. The AI agents even answer complex follow-up questions, such as “when do my points expire in Mexico?”—a real “wow” moment for OpenTable. These agents are redefining how customers engage with companies. Wiley, an educational publisher, faces a seasonal surge in service requests each school year. By piloting Agentforce Service Agent, they increased case resolution by 40-50% and sped up new agent onboarding by 50%, outperforming their previous systems. Harnessing Data Insights The Adecco Group, a global leader in talent solutions, wanted to unlock insights from its vast data reserves. Using Data Cloud, they’re connecting multiple Salesforce instances to give 27,000 recruiters and sales staff real-time, 360-degree views of their operations. This empowers Adecco to improve job fill rates and streamline operations for some of the world’s largest companies. Workday, a Salesforce customer for nearly two decades, uses Service Cloud to power customer service and Slack for internal collaboration. Our new partnership with Workday will integrate Agentforce with their platform, creating a seamless employee experience across Salesforce, Slack, and Workday. This includes AI-powered employee service agents accessible across all platforms. Wyndham Resorts is transforming its guest experience by using Data Cloud to harmonize CRM data across Sales Cloud, Marketing Cloud, and Service Cloud. By consolidating their systems, Wyndham anticipates a 30% reduction in call resolution time and an overall enhanced customer experience through better access to accurate guest and property data. Empowering Employees Air India, with ambitions to capture 30% of India’s airline market, is using Data Cloud, Service Cloud, and Einstein AI to unify data across merged airlines and enhance customer service. Now, human agents spend more time with customers while AI handles routine tasks, resulting in faster resolution of 550,000 monthly service calls. Heathrow Airport is focused on improving employee efficiency and personalizing passenger experiences. Service Cloud and Einstein chatbots have significantly reduced call volumes, with chatbots answering 4,000 questions monthly. Since launching, live chat usage has surged 450%, and average call times have dropped 27%. These improvements have boosted Heathrow’s digital revenue by 30% since 2019. Driving Productivity and Margins Aston Martin sought to improve customer understanding and dealer collaboration. By adopting Data Cloud, they unified their customer data, reducing redundancy by 52% and transitioning from six data systems to one, streamlining operations. Autodesk, a leader in 3D design and engineering software, uses Einstein for Service to generate AI-driven case summaries, cutting the time spent summarizing customer chats by 63%. They also use Salesforce to enhance data security, reducing ongoing maintenance by 30%. Creating a Bright Future for Our Customers For over 25 years, Salesforce has guided customers through transformative technological shifts. The fusion of AI and human intelligence is the most profound shift we’ve seen, unlocking limitless potential for business success. Join them at Dreamforce next month, where we’ll celebrate customer achievements and share the latest innovations. Like Related Posts Salesforce OEM AppExchange Expanding its reach beyond CRM, Salesforce.com has launched a new service called AppExchange OEM Edition, aimed at non-CRM service providers. Read more The Salesforce Story In Marc Benioff’s own words How did salesforce.com grow from a start up in a rented apartment into the world’s Read more Salesforce Jigsaw Salesforce.com, a prominent figure in cloud computing, has finalized a deal to acquire Jigsaw, a wiki-style business contact database, for Read more Health Cloud Brings Healthcare Transformation Following swiftly after last week’s successful launch of Financial Services Cloud, Salesforce has announced the second installment in its series Read more

October 24, 2024in Agentforce Platform

19Oct

LLMs and AI

Large Language Models (LLMs): Revolutionizing AI and Custom Solutions Large Language Models (LLMs) are transforming artificial intelligence by enabling machines to generate and comprehend human-like text, making them indispensable across numerous industries. The global LLM market is experiencing explosive growth, projected to rise from $1.59 billion in 2023 to $259.8 billion by 2030. This surge is driven by the increasing demand for automated content creation, advances in AI technology, and the need for improved human-machine communication. Several factors are propelling this growth, including advancements in AI and Natural Language Processing (NLP), large datasets, and the rising importance of seamless human-machine interaction. Additionally, private LLMs are gaining traction as businesses seek more control over their data and customization. These private models provide tailored solutions, reduce dependency on third-party providers, and enhance data privacy. This guide will walk you through building your own private LLM, offering valuable insights for both newcomers and seasoned professionals. What are Large Language Models? Large Language Models (LLMs) are advanced AI systems that generate human-like text by processing vast amounts of data using sophisticated neural networks, such as transformers. These models excel in tasks such as content creation, language translation, question answering, and conversation, making them valuable across industries, from customer service to data analysis. LLMs are generally classified into three types: LLMs learn language rules by analyzing vast text datasets, similar to how reading numerous books helps someone understand a language. Once trained, these models can generate content, answer questions, and engage in meaningful conversations. For example, an LLM can write a story about a space mission based on knowledge gained from reading space adventure stories, or it can explain photosynthesis using information drawn from biology texts. Building a Private LLM Data Curation for LLMs Recent LLMs, such as Llama 3 and GPT-4, are trained on massive datasets—Llama 3 on 15 trillion tokens and GPT-4 on 6.5 trillion tokens. These datasets are drawn from diverse sources, including social media (140 trillion tokens), academic texts, and private data, with sizes ranging from hundreds of terabytes to multiple petabytes. This breadth of training enables LLMs to develop a deep understanding of language, covering diverse patterns, vocabularies, and contexts. Common data sources for LLMs include: Data Preprocessing After data collection, the data must be cleaned and structured. Key steps include: LLM Training Loop Key training stages include: Evaluating Your LLM After training, it is crucial to assess the LLM’s performance using industry-standard benchmarks: When fine-tuning LLMs for specific applications, tailor your evaluation metrics to the task. For instance, in healthcare, matching disease descriptions with appropriate codes may be a top priority. Conclusion Building a private LLM provides unmatched customization, enhanced data privacy, and optimized performance. From data curation to model evaluation, this guide has outlined the essential steps to create an LLM tailored to your specific needs. Whether you’re just starting or seeking to refine your skills, building a private LLM can empower your organization with state-of-the-art AI capabilities. For expert guidance or to kickstart your LLM journey, feel free to contact us for a free consultation. Like Related Posts Salesforce OEM AppExchange Expanding its reach beyond CRM, Salesforce.com has launched a new service called AppExchange OEM Edition, aimed at non-CRM service providers. Read more The Salesforce Story In Marc Benioff’s own words How did salesforce.com grow from a start up in a rented apartment into the world’s Read more Salesforce Jigsaw Salesforce.com, a prominent figure in cloud computing, has finalized a deal to acquire Jigsaw, a wiki-style business contact database, for Read more Health Cloud Brings Healthcare Transformation Following swiftly after last week’s successful launch of Financial Services Cloud, Salesforce has announced the second installment in its series Read more

October 19, 2024in Artificial Intelligence, Data, Google, Salesforce, Technology

17Oct

RIG and RAG

Imagine you’re a financial analyst tasked with comparing the GDP of France and Italy over the last five years. You query a language model, asking: “What are the current GDP figures of France and Italy, and how have they changed over the last five years?” Using Retrieval-Augmented Generation (RAG), the model first retrieves relevant information from external sources, then generates this response: “France’s current GDP is approximately $2.9 trillion, while Italy’s is around $2.1 trillion. Over the past five years, France’s GDP has grown by an average of 1.5%, whereas Italy’s GDP has seen slower growth, averaging just 0.6%.” In this case, RAG improves the model’s accuracy by incorporating real-world data through a single retrieval step. While effective, this method can struggle with more complex queries that require multiple, dynamic pieces of real-time data. Enter Retrieval Interleaved Generation (RIG)! Now, you submit a more complex query: “What are the GDP growth rates of France and Italy in the past five years, and how do these compare to their employment rates during the same period?” With RIG, the model generates a partial response, drawing from its internal knowledge about GDP. However, it simultaneously retrieves relevant employment data in real time. For example: “France’s current GDP is $2.9 trillion, and Italy’s is $2.1 trillion. Over the past five years, France’s GDP has grown at an average rate of 1.5%, while Italy’s growth has been slower at 0.6%. Meanwhile, France’s employment rate increased by 2%, and Italy’s employment rate rose slightly by 0.5%.” Here’s what happened: RIG allowed the model to interleave data retrieval with response generation, ensuring the information is up-to-date and comprehensive. It fetched employment statistics while continuing to generate GDP figures, ensuring the final output was both accurate and complete for a multi-faceted query. What is Retrieval Interleaved Generation (RIG)? RIG is an advanced technique that integrates real-time data retrieval into the process of generating responses. Unlike RAG, which retrieves information once before generating the response, RIG continuously alternates between generating text and querying external data sources. This ensures each piece of the response is dynamically grounded in the most accurate, up-to-date information. How RIG Works: For example, when asked for GDP figures of two countries, RIG first retrieves one country’s data while generating an initial response and simultaneously fetches the second country’s data for a complete comparison. Why Use RIG? Real-World Applications of RIG RIG’s versatility makes it ideal for handling complex, real-time data across various sectors, such as: Challenges of RIG While promising, RIG faces a few challenges: As AI evolves, RIG is poised to become a foundational tool for complex, data-driven tasks, empowering industries with more accurate, real-time insights for decision-making. Like Related Posts Salesforce OEM AppExchange Expanding its reach beyond CRM, Salesforce.com has launched a new service called AppExchange OEM Edition, aimed at non-CRM service providers. Read more Salesforce Jigsaw Salesforce.com, a prominent figure in cloud computing, has finalized a deal to acquire Jigsaw, a wiki-style business contact database, for Read more Health Cloud Brings Healthcare Transformation Following swiftly after last week’s successful launch of Financial Services Cloud, Salesforce has announced the second installment in its series Read more Top Ten Reasons Why Tectonic Loves the Cloud The Cloud is Good for Everyone – Why Tectonic loves the cloud You don’t need to worry about tracking licenses. Read more

October 17, 2024in Data

05Oct

AI Era Demands an Intelligent Data Infrastructure

Why the AI Era Demands an Intelligent Data Infrastructure NetApp aims to bridge the gap between cloud-based AI models and on-premises data management systems, addressing a major challenge many organizations face as they explore emerging AI capabilities, such as generative AI. AI Era Demands an Intelligent Data Infrastructure. While the most advanced AI tools often reside in hyperscale clouds, the key data of most organizations remains on-premises. As companies move from AI experimentation to scaling their initiatives, they face the complex task of closing the divide between their data and AI. At first glance, the solution seems simple: Organizations can integrate their proprietary data with cloud-based AI services to enhance large language models (LLMs) and other AI tools, leveraging their industry-specific expertise to generate unique insights and drive value. In fact, a recent Enterprise Strategy Group (ESG) survey found that 84% of respondents believed incorporating enterprise data was critical to supporting their generative AI efforts. However, achieving this at scale is far more difficult than it appears. Many organizations are reluctant to send their valuable data to public clouds due to concerns about security, such as exposing sensitive intellectual property or personal information. Additionally, managing the process of transferring large datasets, creating data copies, updating models with fresh data, and keeping track of everything over time adds cost and complexity. Beyond these logistical issues, many organizations struggle with a fundamental challenge: poor data management. Rapid data growth, fragmentation, and a lack of understanding of their own data forces business users and data scientists to spend excessive time “data wrangling”—the process of identifying, gathering, and preparing data for AI models. This inefficiency delays training and deploying models, pushing back the point of inference, which is the key to unlocking value from AI. Increased latency also raises the risk that inferencing is based on outdated data, leading to repeated cycles and further inefficiency. This challenge has made data management a critical obstacle in AI adoption. ESG research identified poor data quality as the top challenge organizations face when implementing AI. Inadequate data is rapidly becoming a roadblock that hampers AI initiatives across industries. To address these challenges, NetApp has shifted its focus toward building an “intelligent data infrastructure” for the AI era. At the company’s Insight 2024 conference, NetApp outlined a strategy to support AI success by bridging the gap between cloud-based AI models and on-premises data environments. The company’s vision includes simplifying, automating, and mitigating risks in the data management workflow needed to scale AI in the enterprise. NetApp’s approach includes innovations like a global metadata namespace and enhancements to its OnTap software, which enable exploration, classification, and management of data across the NetApp ecosystem. These capabilities integrate directly into AI data pipelines, enabling scalable searches and retrieval-augmented generation inferencing. Additionally, NetApp’s new “disaggregated” storage architecture will support more cost-effective scaling for compute-heavy AI tasks such as model training. NetApp’s strategy also extends to the public cloud, where it already has strong partnerships with all three hyperscalers, offering NetApp services as first-party options in their clouds. The company is expanding this by developing additional cloud-native capabilities for AI, such as integrating Azure NetApp Files with Microsoft Azure AI services, FSx for NetApp OnTap with Amazon Bedrock, and Google Cloud NetApp Volumes with Google Vertex AI and BigQuery. These integrations will allow organizations to securely enrich public cloud-based AI models with their on-premises data, creating a scalable and secure environment for AI development. While some aspects of NetApp’s strategy will be delivered over the next year, many cloud integrations will be available sooner. NetApp is also building a partner ecosystem, including hardware partners like Lenovo and Nvidia, software ISVs, and service providers like Domino Data Labs, in addition to its cloud alliances. With this vision for intelligent data infrastructure, NetApp is positioning itself as a key enabler of AI-driven innovation, and it will be intriguing to watch how this strategy unfolds in the coming months. Like Related Posts Salesforce OEM AppExchange Expanding its reach beyond CRM, Salesforce.com has launched a new service called AppExchange OEM Edition, aimed at non-CRM service providers. Read more The Salesforce Story In Marc Benioff’s own words How did salesforce.com grow from a start up in a rented apartment into the world’s Read more Salesforce Jigsaw Salesforce.com, a prominent figure in cloud computing, has finalized a deal to acquire Jigsaw, a wiki-style business contact database, for Read more Health Cloud Brings Healthcare Transformation Following swiftly after last week’s successful launch of Financial Services Cloud, Salesforce has announced the second installment in its series Read more

October 5, 2024in AI Tools, Data, Enterprise, Generative AI, Google

02Oct

Anthropic’s New Approach to RAG

advanced RAG methodology demonstrates how AI can overcome traditional challenges, delivering more precise, context-aware responses while maintaining efficiency and scalability.

October 2, 2024in Data

24Sep

Introducing Marketing Cloud Advanced

Salesforce has unveiled a series of innovations in its Marketing Cloud, (Introducing Marketing Cloud Advanced) designed to empower businesses with AI-driven tools and enhanced data capabilities to elevate customer engagement. These new features aim to deepen customer relationships, improve team productivity, and boost operational efficiency. Introducing Marketing Cloud Advanced One of the standout innovations is Marketing Cloud Advanced, an upcoming edition that integrates advanced automation and AI. This edition is designed to connect marketing journeys with sales, service, and commerce workflows, offering a more personalized experience across multiple customer touchpoints. Additionally, the introduction of Agentforce for Marketing will bring generative and predictive AI into the marketing realm, helping marketers create comprehensive, end-to-end campaign experiences. Steve Hammond, Executive Vice President and General Manager of Marketing Cloud at Salesforce, commented: “Today’s most successful marketers engage customers on their terms and act as value multipliers across the entire customer experience—whether helping sales or service have more personalized conversations or re-engaging inactive customers. Built on Data Cloud, Marketing Cloud is the only solution that unifies data across every department and moment in the customer lifecycle, powered by Agentforce Agents and automation, driving growth, loyalty, and optimizing ROI.” Agentforce for Marketing introduces several capabilities that streamline marketing processes. Marketers can now plan, launch, and optimize campaigns with ease. Agentforce allows marketers to set campaign goals and brand guidelines, after which the AI generates campaign briefs, identifies target audience segments, and drafts initial emails and landing pages. The system continuously monitors performance and provides data-driven optimization suggestions based on key performance indicators (KPIs). A key addition is Einstein Marketing Intelligence (EMI), which helps marketers manage and optimize cross-channel campaign performance. EMI automates the process of data preparation, enrichment, harmonization, and visualization, enabling marketers to measure campaign effectiveness and make informed decisions to improve return on investment. Furthermore, Salesforce introduced Einstein Personalization, an AI-powered decision engine that delivers tailored customer experiences. This tool allows sales, service, and commerce teams to engage customers in real time based on live interactions and data. Using Flow’s A/B split testing feature, marketers can select dynamic email content for different audience segments and track performance to adjust strategies effectively. Sarah Lukins, General Manager of Digital at Fisher & Paykel Appliances, praised the new functionality: “Salesforce enables us to seamlessly access all of our marketing, commerce, service, sales, and external data in one place and leverage AI for more targeted audience engagement. We can now deliver more relevant and consistent personalized experiences across email, ads, web, social, and service engagements.” The Marketing Cloud Advanced Edition will roll out to customers in North America, Europe, and Latin America, while Agentforce Personalization is expected to become generally available by next summer. Additional releases include expanded Einstein multi-language support and unified SMS conversation capabilities. These innovations are part of Salesforce’s ongoing efforts to equip marketers with unified and actionable data, enhancing the performance of marketing teams and fostering deeper integration across organizations. Through AI and automation, Salesforce is helping businesses deliver more personalized, connected, and seamless customer experiences. Like Related Posts Salesforce OEM AppExchange Expanding its reach beyond CRM, Salesforce.com has launched a new service called AppExchange OEM Edition, aimed at non-CRM service providers. Read more The Salesforce Story In Marc Benioff’s own words How did salesforce.com grow from a start up in a rented apartment into the world’s Read more Salesforce Jigsaw Salesforce.com, a prominent figure in cloud computing, has finalized a deal to acquire Jigsaw, a wiki-style business contact database, for Read more Health Cloud Brings Healthcare Transformation Following swiftly after last week’s successful launch of Financial Services Cloud, Salesforce has announced the second installment in its series Read more

September 24, 2024in Data, Salesforce Marketing Cloud

14Sep

Salesforce Advanced AI Models

Salesforce has introduced two advanced AI models—xGen-Sales and xLAM—designed to enhance its Agentforce platform, which seamlessly integrates human agents with autonomous AI for greater business efficiency. xGen-Sales, a proprietary AI model, is tailored for sales tasks such as generating customer insights, summarizing calls, and managing pipelines. By automating routine sales activities, it enables sales teams to focus on strategic priorities. This model enhances Agentforce’s capacity to autonomously handle customer interactions, nurture leads, and support sales teams with increased speed and precision. The xLAM (Large Action Model) family introduces AI models designed to perform complex tasks and trigger actions within business systems. Unlike traditional Large Language Models (LLMs), which focus on content generation, xLAM models excel in function-calling, enabling AI agents to autonomously execute tasks like initiating workflows or processing data without human input. These models vary in size and capability, from smaller, on-device versions to large-scale models suitable for industrial applications. Salesforce AI Research developed the xLAM models using APIGen, a proprietary data-generation pipeline that significantly improves model performance. Early xLAM models have already outperformed other large models in key benchmarks. For example, the xLAM-8x22B model ranked first in function-calling tasks on the Berkeley Leaderboards, surpassing even larger models like GPT-4. These AI innovations are designed to help businesses scale AI-driven workflows efficiently. Organizations adopting these models can automate complex tasks, improve sales operations, and optimize resource allocation. The non-commercial xLAM models are available for community review on Hugging Face, while proprietary versions will power Agentforce. xGen-Sales has completed its pilot phase and will soon be available for general use. Like Related Posts Salesforce OEM AppExchange Expanding its reach beyond CRM, Salesforce.com has launched a new service called AppExchange OEM Edition, aimed at non-CRM service providers. Read more The Salesforce Story In Marc Benioff’s own words How did salesforce.com grow from a start up in a rented apartment into the world’s Read more Salesforce Jigsaw Salesforce.com, a prominent figure in cloud computing, has finalized a deal to acquire Jigsaw, a wiki-style business contact database, for Read more Health Cloud Brings Healthcare Transformation Following swiftly after last week’s successful launch of Financial Services Cloud, Salesforce has announced the second installment in its series Read more

September 14, 2024in Agentforce Platform

30Aug

Tectonic Guide to Rag Part 2

The first notable change in the field of language models is the significant expansion of context window sizes and a reduction in token costs. For instance, Anthropic’s largest model, Claude, has a context window exceeding 200,000 tokens, while recent reports indicate that Gemini’s context window can reach up to 10 million tokens. Under such circumstances, Retrieval-Augmented Generation (RAG) may no longer be necessary for many tasks, as all required data can be accommodated within the expanded context window. Several financial and analytical projects have already demonstrated that tasks can be solved without needing a vector database as intermediate storage. This trend of reducing token costs and increasing context window sizes is likely to continue, potentially decreasing the need for external mechanisms in LLMs, although they are still relevant for the time being. If the context window remains insufficient, methods for summarization and context compression have been introduced. LangChain, for example, offers a class called ConversationSummaryMemory to address this challenge. pythonCopy codellm = OpenAI(temperature=0) conversation_with_summary = ConversationChain( llm=llm, memory=ConversationSummaryMemory(llm=OpenAI()), verbose=True ) conversation_with_summary.predict(input=”Hi, what’s up?”) Knowledge Graphs As the volume of data continues to grow, navigating through it efficiently becomes increasingly critical. In certain cases, understanding the structure and attributes of data is essential for effective use. For example, if the data source is a company’s wiki, an LLM might not recognize a phone number unless the structure or metadata indicates that it’s the company’s contact information. Humans can infer meaning from conventions, such as the subdirectory “Company Information,” but standard RAG may miss such connections. This challenge can be addressed by Knowledge Graphs, also known as Knowledge Maps, which provide both raw data and metadata that illustrates how different entities are interconnected. This method is referred to as Graph Retrieval-Augmented Generation (GraphRAG). Graphs are excellent for representing and managing structured, interconnected information. Unlike vector databases, they excel at capturing complex relationships and attributes among diverse data types. Creating a Knowledge Graph The process of creating a knowledge graph typically involves collecting and structuring data, which requires expertise in both the subject matter and graph modeling. However, LLMs can automate a significant portion of this process by analyzing textual data, identifying entities, and recognizing their relationships, which can then be represented in a graph structure. In many cases, an ensemble of vector databases and knowledge graphs can improve accuracy, as discussed previously. For example, search functionality might combine keyword search through a regular database (e.g., Elasticsearch) and graph-based queries. LangChain can also assist in extracting structured data from entities, as demonstrated in this code example: pythonCopy codedocuments = parse_and_load_data_from_wiki_including_metadata() graph_store = NebulaGraphStore( space_name=”Company Wiki”, tags=[“entity”] ) storage_context = StorageContext.from_defaults(graph_store=graph_store) index = KnowledgeGraphIndex.from_documents( documents, max_triplets_per_chunk=2, space_name=space_name, tags=[“entity”] ) query_engine = index.as_query_engine() response = query_engine.query(“Tell me more about our Company”) Here, searching is conducted based on attributes and related entities, instead of similar vectors. If set up correctly, metadata from the company’s wiki, such as its phone number, would be accessible through the graph. Access Control One challenge with this system is that data access may not be uniform. For instance, in a wiki, access could depend on roles and permissions. Similar issues exist in vector databases, leading to the need for access management mechanisms such as Role-Based Access Control (RBAC), Attribute-Based Access Control (ABAC), and Relationship-Based Access Control (ReBAC). These access control methods function by evaluating paths between users and resources within graphs, such as in systems like Active Directory. To ensure the integrity of data during the ingestion phase, metadata related to permissions must be preserved in both the knowledge graph and vector database. Some commercial vector databases already have this functionality built in. Ingestion and Parsing Data needs to be ingested into both graphs and vector databases, but for graphs, formatting is especially critical since it reflects the data’s structure and serves as metadata. One particular challenge is handling complex formats like PDFs, which can contain diverse elements like tables, images, and text. Extracting structured data from such formats can be difficult, and while frameworks like LLama Parse exist, they are not always foolproof. In some cases, Optical Character Recognition (OCR) may be more effective than parsing. Enhancing Answer Quality Several new approaches are emerging to improve the quality of LLM-generated answers: While these advancements in knowledge graphs, access control, and retrieval mechanisms are promising, challenges remain, particularly around data formatting and parsing. However, these methods continue to evolve, enhancing LLM capabilities and efficiency. Like Related Posts Salesforce OEM AppExchange Expanding its reach beyond CRM, Salesforce.com has launched a new service called AppExchange OEM Edition, aimed at non-CRM service providers. Read more Salesforce Jigsaw Salesforce.com, a prominent figure in cloud computing, has finalized a deal to acquire Jigsaw, a wiki-style business contact database, for Read more Health Cloud Brings Healthcare Transformation Following swiftly after last week’s successful launch of Financial Services Cloud, Salesforce has announced the second installment in its series Read more Alphabet Soup of Cloud Terminology As with any technology, the cloud brings its own alphabet soup of terms. This insight will hopefully help you navigate Read more

August 30, 2024in Data

21Jul

Advanced RAG

Significant Changes in Context Windows and Token Costs The first significant change in the AI world is the substantial increase in the context window size and the decrease in token costs. For example, the context window size of the largest model, Claude from Anthropic, exceeds 200,000 tokens. According to the latest news, Gemini’s context window can reach up to 10 million tokens. Under these conditions, Retrieval-Augmented Generation (RAG) may not be required for many tasks, as all necessary data can fit into the context window. This shift has been observed in several financial and analytical projects where tasks were completely solved without using a vector database as intermediate storage. The trend of reducing token costs and increasing context window sizes is likely to continue, diminishing the need for external mechanisms for large language models (LLMs). However, they remain necessary for the time being. Advanced RAG. If the context size is still insufficient, different methods of summarization and context compression have been devised. LangChain has introduced a class aimed at this: ConversationSummaryMemory. pythonCopy codellm = OpenAI(temperature=0) conversation_with_summary = ConversationChain( llm=llm, memory=ConversationSummaryMemory(llm=OpenAI()), verbose=True ) conversation_with_summary.predict(input=”Hi, what’s up?”) Knowledge Graphs As the amount of data LLMs must navigate grows, the ability to navigate this data becomes increasingly important. Without the ability to analyze the data structure and other attributes, it’s impossible to use them effectively. For example, suppose the data source is a company’s wiki with a page containing the company’s phone number, but this isn’t explicitly indicated anywhere. How does the LLM understand that this is the company’s phone number? It doesn’t, which is why standard RAG won’t provide any information about the company’s phone number (as it sees no connection). A person can understand that this is the company’s phone number from the convention of how the data is stored (i.e., from the structure or metadata). For LLMs, this problem is solved with Knowledge Graphs with metadata (also known as Knowledge Maps), which means the LLM has not only the raw data but also information about the storage structure and the connections between different data entities. This approach is also known as Graph Retrieval-Augmented Generation (GraphRAG). Graphs are excellent for representing and storing heterogeneous and interconnected information in a structured form, easily capturing complex relationships and attributes among different types of data, which vector databases struggle with. Example of a Knowledge Graph Creating a Knowledge Graph typically involves collecting and structuring data, requiring a deep understanding of both the subject area and graph modeling. This process can largely be automated with LLMs. Thanks to their understanding of language and context, LLMs can automate significant parts of the Knowledge Graph creation process. By analyzing textual data, these models can identify entities, understand their relationships, and suggest how best to represent them in a graph structure. Advanced RAG This ensemble of a vector database and a knowledge graph generally improves accuracy and often includes a search through a regular database or by keywords (e.g., Elasticsearch). Knowledge Graph Retriever Example For example, a user asks a question about the company’s phone number. If this is done in code, the entities from the question can be formatted in JSON or using with_structured_output from LangChain. These entities are then searched for in the Knowledge Graph. How this is done depends on where the graph is stored. pythonCopy codedocuments = parse_and_load_data_from_wiki_including_metadata() graph_store = NebulaGraphStore( space_name=”Company Wiki”, tags=[“entity”] ) storage_context = StorageContext.from_defaults(graph_store=graph_store) index = KnowledgeGraphIndex.from_documents( documents, max_triplets_per_chunk=2, space_name=”Company Wiki”, tags=[“entity”] ) query_engine = index.as_query_engine() response = query_engine.query(“Tell me more about our Company”) The search differs from a vector database search in that it searches for attributes and related entities, not similar vectors. Returning to the initial question, if the wiki structure was transferred to the graph correctly, the company’s phone number would be added as a related entity in the graph. The data from the graph and the vector database search are then passed to the LLM to generate a complete answer. Challenges and Solutions Access Control Access to data may not be uniform. In the same wiki, there may be roles and permissions, and not every user can see all information. This problem exists for both graph and vector database searches, requiring access management. Role-Based Access Control (RBAC), Attribute-Based Access Control (ABAC), and Relationship-Based Access Control (ReBAC) are common methods. Permissions and categories are also forms of metadata, which must be preserved at the data ingestion stage in the knowledge graph and vector database. When searching in the vector database, it is necessary to check whether the role or other access attributes match what the user has access to. Some commercial vector databases already include this functionality. Data embedded in the LLM during training relies on the LLM’s reasonableness, which is not recommended. Ingestion and Parsing Data needs to be inserted into the graph and the vector database. For the graph, the format is critical as it reflects the data structure and serves as metadata. Parsing data, especially from PDFs, can be challenging. Frameworks like LLama Parse attempt this with varying degrees of success. However, OCR or recognizing a document image can sometimes be easier. Improving Answer Advanced RAG Several approaches aim to improve answer quality beyond using knowledge graphs: Corrective Retrieval Augmented Generation (CRAG) CRAG addresses incorrect RAG results by automating correction processes. LangGraph can implement this approach, which essentially forms a state machine. Self-RAG Self-reflective RAG fine-tunes the LLM to generate self-reflection tokens in addition to regular ones, helping build a state machine for better results. HyDe HyDe (Hypothetical Document Embeddings) modifies the usual RAG retrieval process by using the LLM to generate a response and then searching the vector database with that response. This is useful when users’ questions are too abstract and require more context. These methods, including CRAG, Self-RAG, and HyDe, provide various ways to enhance the performance of LLMs and improve the quality of their answers. Like Related Posts Salesforce OEM AppExchange Expanding its reach beyond CRM, Salesforce.com has launched a new service called AppExchange OEM Edition, aimed at non-CRM

July 21, 2024in Salesforce

01Jun

RAG Chunking Method

Enhancing Retrieval-Augmented Generation (RAG) Systems with Topic-Based Document Segmentation Dividing large documents into smaller, meaningful parts is crucial for the performance of Retrieval-Augmented Generation (RAG) systems. RAG Chunking Method. These systems benefit from frameworks that offer multiple document-splitting options. This Tectonic insight introduces an innovative approach that identifies topic changes using sentence embeddings, improving the subdivision process to create coherent topic-based sections. RAG Systems: An Overview A Retrieval-Augmented Generation (RAG) system combines retrieval-based and generation-based models to enhance output quality and relevance. It first retrieves relevant information from a large dataset based on an input query, then uses a transformer-based language model to generate a coherent and contextually appropriate response. This hybrid approach is particularly effective in complex or knowledge-intensive tasks. Standard Document Splitting Options Before diving into the new approach, let’s explore some standard document splitting methods using the LangChain framework, known for its robust support of various natural language processing (NLP) tasks. LangChain Framework: LangChain assists developers in applying large language models across NLP tasks, including document splitting. Here are key splitting methods available: Introducing a New Approach: Topic-Based Segmentation Segmenting large-scale documents into coherent topic-based sections poses significant challenges. Traditional methods often fail to detect subtle topic shifts accurately. This innovative approach, presented at the International Conference on Artificial Intelligence, Computer, Data Sciences, and Applications (ACDSA 2024), addresses this issue using sentence embeddings. The Core Challenge Large documents often contain multiple topics. Conventional segmentation techniques struggle to identify precise topic transitions, leading to fragmented or overlapping sections. This method leverages Sentence-BERT (SBERT) to generate embeddings for individual sentences, which reflect changes in the vector space as topics shift. Approach Breakdown 1. Using Sentence Embeddings: 2. Calculating Gap Scores: 3. Smoothing: 4. Boundary Detection: 5. Clustering Segments: Algorithm Pseudocode Gap Score Calculation: pythonCopy code# Example pseudocode for gap score calculation def calculate_gap_scores(sentences, n): embeddings = [sbert.encode(sentence) for sentence in sentences] gap_scores = [] for i in range(len(sentences) – n): before = embeddings[i:i+n] after = embeddings[i+n:i+2*n] score = cosine_similarity(before, after) gap_scores.append(score) return gap_scores Gap Score Smoothing: pythonCopy code# Example pseudocode for smoothing gap scores def smooth_gap_scores(gap_scores, k): smoothed_scores = [] for i in range(len(gap_scores)): start = max(0, i – k) end = min(len(gap_scores), i + k + 1) smoothed_score = sum(gap_scores[start:end]) / (end – start) smoothed_scores.append(smoothed_score) return smoothed_scores Boundary Detection: pythonCopy code# Example pseudocode for boundary detection def detect_boundaries(smoothed_scores, c): boundaries = [] mean_score = sum(smoothed_scores) / len(smoothed_scores) std_dev = (sum((x – mean_score) ** 2 for x in smoothed_scores) / len(smoothed_scores)) ** 0.5 for i, score in enumerate(smoothed_scores): if score < mean_score – c * std_dev: boundaries.append(i) return boundaries Future Directions Potential areas for further research include: Conclusion This method combines traditional principles with advanced sentence embeddings, leveraging SBERT and sophisticated smoothing and clustering techniques. This approach offers a robust and efficient solution for accurate topic modeling in large documents, enhancing the performance of RAG systems by providing coherent and contextually relevant text sections. Like Related Posts Salesforce OEM AppExchange Expanding its reach beyond CRM, Salesforce.com has launched a new service called AppExchange OEM Edition, aimed at non-CRM service providers. Read more Salesforce Jigsaw Salesforce.com, a prominent figure in cloud computing, has finalized a deal to acquire Jigsaw, a wiki-style business contact database, for Read more Health Cloud Brings Healthcare Transformation Following swiftly after last week’s successful launch of Financial Services Cloud, Salesforce has announced the second installment in its series Read more Top Ten Reasons Why Tectonic Loves the Cloud The Cloud is Good for Everyone – Why Tectonic loves the cloud You don’t need to worry about tracking licenses. Read more

June 1, 2024in Salesforce Einstein

13Dec

Retrieval Augmented Generation Techniques

A comprehensive study has been conducted on advanced retrieval augmented generation techniques and algorithms, systematically organizing various approaches. This insight includes a collection of links referencing various implementations and studies mentioned in the author’s knowledge base. If you’re familiar with the RAG concept, skip to the Advanced RAG section. Retrieval Augmented Generation, known as RAG, equips Large Language Models (LLMs) with retrieved information from a data source to ground their generated answers. Essentially, RAG combines Search with LLM prompting, where the model is asked to answer a query provided with information retrieved by a search algorithm as context. Both the query and the retrieved context are injected into the prompt sent to the LLM. RAG emerged as the most popular architecture for LLM-based systems in 2023, with numerous products built almost exclusively on RAG. These range from Question Answering services that combine web search engines with LLMs to hundreds of apps allowing users to interact with their data. Even the vector search domain experienced a surge in interest, despite embedding-based search engines being developed as early as 2019. Vector database startups such as Chroma, Weavaite.io, and Pinecone have leveraged existing open-source search indices, mainly Faiss and Nmslib, and added extra storage for input texts and other tooling. Two prominent open-source libraries for LLM-based pipelines and applications are LangChain and LlamaIndex, both founded within a month of each other in October and November 2022, respectively. These were inspired by the launch of ChatGPT and gained massive adoption in 2023. The purpose of this Tectonic insight is to systemize key advanced RAG techniques with references to their implementations, mostly in LlamaIndex, to facilitate other developers’ exploration of the technology. The problem addressed is that most tutorials focus on individual techniques, explaining in detail how to implement them, rather than providing an overview of the available tools. Naive RAG The starting point of the RAG pipeline described in this article is a corpus of text documents. The process begins with splitting the texts into chunks, followed by embedding these chunks into vectors using a Transformer Encoder model. These vectors are then indexed, and a prompt is created for an LLM to answer the user’s query given the context retrieved during the search step. In runtime, the user’s query is vectorized with the same Encoder model, and a search is executed against the index. The top-k results are retrieved, corresponding text chunks are fetched from the database, and they are fed into the LLM prompt as context. An overview of advanced RAG techniques, illustrated with core steps and algorithms. 1.1 Chunking Texts are split into chunks of a certain size without losing their meaning. Various text splitter implementations capable of this task exist. 1.2 Vectorization A model is chosen to embed the chunks, with options including search-optimized models like bge-large or E5 embeddings family. 2.1 Vector Store Index Various indices are supported, including flat indices and vector indices like Faiss, Nmslib, or Annoy. 2.2 Hierarchical Indices Efficient search within large databases is facilitated by creating two indices: one composed of summaries and another composed of document chunks. 2.3 Hypothetical Questions and HyDE An alternative approach involves asking an LLM to generate a question for each chunk, embedding these questions in vectors, and performing query search against this index of question vectors. 2.4 Context Enrichment Smaller chunks are retrieved for better search quality, with surrounding context added for the LLM to reason upon. 2.4.1 Sentence Window Retrieval Each sentence in a document is embedded separately to provide accurate search results. 2.4.2 Auto-merging Retriever Documents are split into smaller child chunks referring to larger parent chunks to enhance context retrieval. 2.5 Fusion Retrieval or Hybrid Search Keyword-based old school search algorithms are combined with modern semantic or vector search to improve retrieval results. Encoder and LLM Fine-tuning Fine-tuning of Transformer Encoders or LLMs can further enhance the RAG pipeline’s performance, improving context retrieval quality or answer relevance. Evaluation Various frameworks exist for evaluating RAG systems, with metrics focusing on retrieved context relevance, answer groundedness, and overall answer relevance. The next big thing about building a nice RAG system that can work more than once for a single query is the chat logic, taking into account the dialogue context, same as in the classic chat bots in the pre-LLM era.This is needed to support follow up questions, anaphora, or arbitrary user commands relating to the previous dialogue context. It is solved by query compression technique, taking chat context into account along with the user query. Query routing is the step of LLM-powered decision making upon what to do next given the user query — the options usually are to summarise, to perform search against some data index or to try a number of different routes and then to synthesise their output in a single answer. Query routers are also used to select an index, or, broader, data store, where to send user query — either you have multiple sources of data, for example, a classic vector store and a graph database or a relational DB, or you have an hierarchy of indices — for a multi-document storage a pretty classic case would be an index of summaries and another index of document chunks vectors for example. This insight aims to provide an overview of core algorithmic approaches to RAG, offering insights into techniques and technologies developed in 2023. It emphasizes the importance of speed in RAG systems and suggests potential future directions, including exploration of web search-based RAG and advancements in agentic architectures. Like Related Posts Salesforce OEM AppExchange Expanding its reach beyond CRM, Salesforce.com has launched a new service called AppExchange OEM Edition, aimed at non-CRM service providers. Read more Salesforce Jigsaw Salesforce.com, a prominent figure in cloud computing, has finalized a deal to acquire Jigsaw, a wiki-style business contact database, for Read more Health Cloud Brings Healthcare Transformation Following swiftly after last week’s successful launch of Financial Services Cloud, Salesforce has announced the second installment in its series Read more Top Ten Reasons Why Tectonic Loves the

December 13, 2023in Salesforce Einstein

26Oct

RAG – Retrieval Augmented Generation in Artificial Intelligence

Salesforce has introduced advanced capabilities for unstructured data in Data Cloud and Einstein Copilot Search. By leveraging semantic search and prompts in Einstein Copilot, Large Language Models (LLMs) now generate more accurate, up-to-date, and transparent responses, ensuring the security of company data through the Einstein Trust Layer. Retrieval Augmented Generation in Artificial Intelligence has taken Salesforce’s Einstein and Data Cloud to new heights. These features are supported by the AI framework called Retrieval Augmented Generation (RAG), allowing companies to enhance trust and relevance in generative AI using both structured and unstructured proprietary data. RAG Defined: RAG assists companies in retrieving and utilizing their data, regardless of its location, to achieve superior AI outcomes. The RAG pattern coordinates queries and responses between a search engine and an LLM, specifically working on unstructured data such as emails, call transcripts, and knowledge articles. How RAG Works: Salesforce’s Implementation of RAG: RAG begins with Salesforce Data Cloud, expanding to support storage of unstructured data like PDFs and emails. A new unstructured data pipeline enables teams to select and utilize unstructured data across the Einstein 1 Platform. The Data Cloud Vector Database combines structured and unstructured data, facilitating efficient processing. RAG in Action with Einstein Copilot Search: RAG for Enterprise Use: RAG aids in processing internal documents securely. Its four-step process involves ingestion, natural language query, augmentation, and response generation. RAG prevents arbitrary answers, known as “hallucinations,” and ensures relevant, accurate responses. Applications of RAG: RAG offers a pragmatic and effective approach to using LLMs in the enterprise, combining internal or external knowledge bases to create a range of assistants that enhance employee and customer interactions. Retrieval-augmented generation (RAG) is an AI technique for improving the quality of LLM-generated responses by including trusted sources of knowledge, outside of the original training set, to improve the accuracy of the LLM’s output. Implementing RAG in an LLM-based question answering system has benefits: 1) assurance that an LLM has access to the most current, reliable facts, 2) reduce hallucinations rates, and 3) provide source attribution to increase user trust in the output. Retrieval Augmented Generation in Artificial Intelligence Content updated July 2024. Like2 Related Posts Salesforce OEM AppExchange Expanding its reach beyond CRM, Salesforce.com has launched a new service called AppExchange OEM Edition, aimed at non-CRM service providers. Read more The Salesforce Story In Marc Benioff’s own words How did salesforce.com grow from a start up in a rented apartment into the world’s Read more Salesforce Jigsaw Salesforce.com, a prominent figure in cloud computing, has finalized a deal to acquire Jigsaw, a wiki-style business contact database, for Read more Health Cloud Brings Healthcare Transformation Following swiftly after last week’s successful launch of Financial Services Cloud, Salesforce has announced the second installment in its series Read more

October 26, 2023in Einstein 1 Platform, Generative AI, Salesforce, Salesforce Data Cloud

Advanced RAG

More AI Tools to Use

RIG and RAG

Salesforce Advanced AI Models

Tectonic Guide to Rag Part 2

Advanced RAG

RAG Chunking Method

Retrieval Augmented Generation Techniques

RAG – Retrieval Augmented Generation in Artificial Intelligence

Recent Posts

Benefits of Linking CloudSign with Salesforce

Agentforce: Your Partner in Seamless Customer Experiences

On Premise Gen AI

Statement Accuracy Prediction based on Language Model Activations

Salesforce Marketing Cloud (SFMC) Connector

Contact Us

Be in touch today — and start your business on a path to success.

Category

Archives

Advanced RAG

Recent Posts

Contact Us

Be in touch today — and start your business on a path to success.

Category

Tags

Archives

Subscribe to our mailing list. Join our mail list to receive our newsletter