LLM Performance - gettectonic.com
Where LLMs Fall Short

LLM Economies

Throughout history, disruptive technologies have been the catalyst for major social and economic revolutions. The invention of the plow and irrigation systems 12,000 years ago sparked the Agricultural Revolution, while Johannes Gutenberg’s 15th-century printing press fueled the Protestant Reformation and helped propel Europe out of the Middle Ages into the Renaissance. In the 18th century, James Watt’s steam engine ushered in the Industrial Revolution. More recently, the internet has revolutionized communication, commerce, and information access, shrinking the world into a global village. Similarly, smartphones have transformed how people interact with their surroundings.

Now, we stand at the dawn of the AI revolution. Large Language Models (LLMs) represent a monumental leap forward, with significant economic implications at both macro and micro levels. These models are reshaping global markets, driving new forms of currency, and creating a novel economic landscape.

The reason LLMs are transforming industries and redefining economies is simple: they automate both routine and complex tasks that traditionally require human intelligence. They enhance decision-making, boost productivity, and reduce costs across sectors. This lets organizations redirect human effort toward more creative and strategic work, resulting in new products and services. From healthcare to finance to customer service, LLMs are creating new markets and bringing AI-driven services like content generation and conversational assistants into the mainstream.

To truly grasp the engine driving this new global economy, it’s essential to understand the inner workings of this disruptive technology. These posts will provide both a macro-level overview of the economic forces at play and a deep dive into the technical mechanics of LLMs, equipping you with a comprehensive understanding of the revolution happening now.

Why Now?

The Connection Between Language and Human Intelligence

AI did not begin with ChatGPT’s arrival in November 2022. Many people were developing machine learning classification models in 1999, and the roots of AI go back even further. Artificial Intelligence was formally born in 1950, when Alan Turing, considered the father of theoretical computer science and famed for cracking the Nazi Enigma code during World War II, proposed the first formal test of machine intelligence. Known as the Turing Test, it framed machine intelligence in terms of natural language conversation. The test involves a human evaluator who converses with both a human and a machine. If the evaluator cannot reliably distinguish between the two, the machine is considered to have passed the test. Remarkably, after 72 years of gradual AI development, ChatGPT simulated this very interaction convincingly enough to be widely described as passing the Turing Test, igniting the current AI explosion.

But why is language so closely tied to human intelligence, rather than, for example, vision? While 70% of our brain’s neurons are devoted to vision, OpenAI’s pioneering image generation model, DALL-E, did not trigger the same level of excitement as ChatGPT. The answer lies in the profound role language has played in human evolution.

The Evolution of Language

The development of language was the turning point in humanity’s rise to dominance on Earth. As Yuval Noah Harari points out in his book Sapiens: A Brief History of Humankind, it was the ability to gossip and discuss abstract concepts that set humans apart from other species. Complex communication, such as gossip, requires a shared, sophisticated language. Human language evolved from primitive cave signs to structured alphabets, which, along with grammar rules, created languages capable of expressing thousands of words.
In today’s digital age, language has further evolved with the inclusion of emojis, and now with the advent of GenAI, tokens have become the latest cornerstone in this progression. These shifts highlight the extraordinary journey of human language, from simple symbols to intricate digital representations. In the next post, we will explore the intricacies of LLMs, focusing specifically on tokens. But before that, let’s delve into the economic forces shaping the LLM-driven world.

The Forces Shaping the LLM Economy

AI Giants in Competition

Karl Marx and Friedrich Engels argued that those who control the means of production hold power. The tech giants of today understand that AI is the future means of production, and the race to dominate the LLM market is well underway. This competition is fierce, with industry leaders like OpenAI, Google, Microsoft, and Meta (formerly Facebook) battling for supremacy. Newer challengers such as Mistral (France), AI21 (Israel), Anthropic, and Elon Musk’s xAI are also entering the fray. The LLM industry is expanding rapidly, with billions of dollars of investment pouring in. For example, Anthropic has raised $4.5 billion from 43 investors, including major players like Amazon, Google, and Microsoft.

The Scarcity of GPUs

Just as Bitcoin mining requires vast computational resources, training LLMs demands immense computing power, driving a search for new energy sources. Microsoft’s recent investment in nuclear energy underscores this urgency. At the heart of LLM technology are Graphics Processing Units (GPUs), essential for powering deep neural networks. These GPUs have become scarce and expensive, adding to the competitive tension.

Tokens: The New Currency of the LLM Economy

Tokens are the currency driving the emerging AI economy. Just as money facilitates transactions in traditional markets, tokens are the foundation of LLM economics. But what exactly are tokens? Tokens are the basic units of text that LLMs process. They can be single characters, parts of words, or entire words. For example, the word “Oscar” might be split into two tokens, “os” and “car.” The performance of LLMs (quality, speed, and cost) hinges on how efficiently they generate these tokens.

LLM providers price their services based on token usage, with different rates for input (prompt) and output (completion) tokens. As companies rely more on LLMs, especially for complex tasks like agentic applications, token usage will significantly impact operational costs. With fierce competition and the rise of open-source models like Llama 3.1, the cost of tokens is rapidly decreasing. For instance, OpenAI reduced its GPT-4 pricing by about 80% over the past year and a half. This trend enables companies to expand their portfolio of AI-powered products, further fueling the LLM economy.

Context Windows: Expanding Capabilities
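The token-based pricing described above comes down to simple arithmetic, and a small sketch may help make it concrete. The prices below are invented for illustration and are not any provider's actual rates; `TokenPricing` and `request_cost` are hypothetical names, and the 80% reduction simply mirrors the scale of the price drop mentioned in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPricing:
    """Hypothetical prices in US dollars per one million tokens."""
    input_per_million: float
    output_per_million: float

    def scaled(self, factor: float) -> "TokenPricing":
        """Return a copy with both rates multiplied by `factor`."""
        return TokenPricing(
            self.input_per_million * factor,
            self.output_per_million * factor,
        )


def request_cost(pricing: TokenPricing, input_tokens: int, output_tokens: int) -> float:
    """Cost of one request: input and output tokens are billed at separate rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        input_tokens * pricing.input_per_million
        + output_tokens * pricing.output_per_million
    )
    return total / 1_000_000


# Illustrative rates only (assumption): $10 per million input tokens,
# $30 per million output tokens.
pricing = TokenPricing(input_per_million=10.0, output_per_million=30.0)

# One request with a 2,000-token prompt and a 500-token completion.
cost = request_cost(pricing, input_tokens=2_000, output_tokens=500)

# The same request after an 80% price reduction (rates scaled to 20%).
discounted_cost = request_cost(pricing.scaled(0.2), input_tokens=2_000, output_tokens=500)

print(f"before: ${cost:.4f}  after: ${discounted_cost:.4f}")
```

Because output tokens are typically priced higher than input tokens, applications that generate long completions, such as agentic workflows that call the model many times, tend to see costs dominated by the output side.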

Gen AI Role in Healthcare

Generative AI’s Growing Role in Healthcare: Potential and Challenges

The rapid advancements in large language models (LLMs) have introduced generative AI tools into nearly every business sector, including healthcare. As defined by the Government Accountability Office, generative AI is “a technology that can create content, including text, images, audio, or video, when prompted by a user.” These systems learn patterns and relationships from vast datasets, enabling them to generate new content that resembles but is not identical to the original training data. This capability is powered by machine learning algorithms and statistical models. In healthcare, generative AI is being utilized for various applications, including clinical documentation, patient communication, and clinical text summarization.

Streamlining Clinical Documentation

Excessive documentation is a leading cause of clinician burnout, as highlighted by a 2022 athenahealth survey conducted by the Harris Poll. Generative AI shows promise in easing these documentation burdens, potentially improving clinician satisfaction and reducing burnout.

A 2024 study published in NEJM Catalyst explored the use of ambient AI scribes within The Permanente Medical Group (TPMG). This technology employs smartphone microphones and generative AI to transcribe patient encounters in real time, providing clinicians with draft documentation for review. In October 2023, TPMG deployed this ambient AI technology across various settings, benefiting 10,000 physicians and staff. Physicians who used the ambient AI scribe reported positive outcomes, including more personal and meaningful patient interactions and reduced after-hours electronic health record (EHR) documentation. Early patient feedback was also favorable, with improved provider interactions noted. Additionally, ambient AI produced high-quality clinical documentation for clinician review.

However, a 2023 study in the Journal of the American Medical Informatics Association (JAMIA) cautioned that ambient AI might struggle with non-lexical conversational sounds (NLCSes), such as “mm-hm” or “uh-uh,” which can convey clinically relevant information. The study found that while the ambient AI tools had a word error rate of about 12% for all words, the error rate for NLCSes was significantly higher, reaching up to 98.7% for those conveying critical information. Misinterpretation of these sounds could lead to inaccuracies in clinical documentation and potential patient safety issues.

Enhancing Patient Communication

With the digital transformation in healthcare, patient portal messages have surged. A 2021 study in JAMIA reported a 157% increase in patient portal inbox messages since 2020. In response, some healthcare organizations are exploring the use of generative AI to draft replies to these messages.

A 2024 study published in JAMA Network Open evaluated the adoption of AI-generated draft replies to patient messages at an academic medical center. After five weeks, clinicians used the AI-generated drafts 20% of the time, a notable rate considering the LLMs were not fine-tuned for patient communication. Clinicians reported reduced task load and emotional exhaustion, suggesting that AI-generated replies could help alleviate burnout. However, the study found no significant changes in reply time, read time, or write time between the pre-pilot and pilot periods. Despite this, clinicians expressed optimism about time savings, indicating that the cognitive ease of editing drafts rather than writing from scratch might not be fully captured by time metrics.

Summarizing Clinical Data

Summarizing information within patient records is a time-consuming task for clinicians, and errors in this process can negatively impact clinical decision support.
Generative AI has shown potential in this area, with a 2023 study finding that LLM-generated summaries could outperform human expert summaries in terms of conciseness, completeness, and correctness.

However, using generative AI for clinical data summarization presents risks. A viewpoint in JAMA argued that LLMs performing summarization tasks might not fall under FDA medical device oversight, as they provide language-based outputs rather than disease predictions or numerical estimates. Without statutory changes, the FDA’s authority to regulate these LLMs remains unclear. The authors also noted that differences in summary length, organization, and tone could influence clinician interpretations and subsequent decision-making. Furthermore, LLMs might exhibit biases, such as sycophancy, where responses are tailored to user expectations. To address these concerns, the authors called for comprehensive standards for LLM-generated summaries, including testing for biases and errors, as well as clinical trials to quantify potential harms and benefits.

The Path Forward

Generative AI holds significant promise for transforming healthcare and reducing clinician burnout, but realizing this potential requires comprehensive standards and regulatory clarity. A 2024 study published in npj Digital Medicine emphasized the need for defined leadership, adoption incentives, and ongoing regulation to deliver on the promise of generative AI in healthcare. Leadership should focus on establishing guidelines for LLM performance and identifying optimal clinical settings for AI tool trials. The study suggested that a subcommittee within the FDA, comprising physicians, healthcare administrators, developers, and investors, could effectively lead this effort. Additionally, widespread deployment of generative AI will likely require payer incentives, as most providers view these tools as capital expenses.
With the right leadership, incentives, and regulatory framework, generative AI can be effectively implemented across the healthcare continuum to streamline clinical workflows and improve patient care.
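
For readers curious how a figure like the word error rate cited in the clinical documentation section is computed, the standard definition is the number of word-level substitutions, deletions, and insertions needed to turn the transcript into the reference, divided by the number of reference words. The following is a minimal sketch of that calculation; the function name and the two transcripts are invented for illustration, and nothing here reproduces the JAMIA study's own tooling or data.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between the two texts, divided by reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words (classic Levenshtein dynamic programming).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from the hypothesis
            insertion = curr[j - 1] + 1  # extra word in the hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Invented example: the transcript drops the patient's "mm-hm".
reference = "patient says mm-hm the pain is worse"
hypothesis = "patient says the pain is worse"

wer = word_error_rate(reference, hypothesis)
print(f"WER: {wer:.1%}")
```

In this made-up case a single dropped word out of seven yields a modest overall error rate, yet the dropped word is the patient's affirmation. That is the kind of gap the study points to: an aggregate word error rate can look acceptable while the sounds carrying clinical meaning are lost almost entirely.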
