Hyperscalers - gettectonic.com
Generative ai energy consumption

AI Energy Consumption

At the Gartner IT Symposium/Xpo 2024, industry leaders emphasized that rising energy consumption and costs are fast becoming constraints on IT capabilities. Solutions discussed include adopting acceleration technologies, exploring microgrids, and keeping an eye on emerging energy-efficient technologies.

With enterprise AI applications expanding, computing demands, and the energy needed to support them, are rapidly increasing. Nvidia’s CEO, Jensen Huang, highlighted this challenge, noting that advancements in traditional computing are failing to keep pace with data processing needs. “If compute demand grows exponentially while general-purpose performance stagnates, you’ll face not just cost inflation but significant energy inflation,” he said. Huang suggested that leveraging accelerated computing can mitigate some of these impacts, improving energy efficiency.

Another approach highlighted was the use of microgrids, with Gartner predicting that Fortune 500 companies will shift up to $500 billion toward such systems by 2027 to manage ongoing energy risks and AI demand. Gartner’s Daryl Plummer noted that these independent energy networks could help energy-intensive enterprises avoid dependence on strained public power grids.

Hyperscalers, including major cloud providers, are already exploring alternative power sources, such as nuclear energy, to meet escalating demands. For instance, Microsoft has announced plans to source energy from the Three Mile Island nuclear plant.

While emerging technologies like quantum, neuromorphic, and photonic computing offer the promise of significant energy efficiency, they’re still years away from maturity. Gartner analyst Frank Buytendijk predicted it will take five to ten years before these options become viable solutions. “Energy-efficient computing is on the horizon, but we have a ways to go,” he said.
Until then, enterprises will need to consider proactive strategies to manage energy risks and costs as part of their AI and IT planning.

Generative AI Energy Consumption Rises


Generative AI Energy Consumption Rises, but Impact on ROI Unclear

The energy costs associated with generative AI (GenAI) are often overlooked in enterprise financial planning. However, industry experts suggest that IT leaders should account for the power consumption that comes with adopting this technology.

When building a business case for generative AI, some costs are evident, like large language model (LLM) fees and SaaS subscriptions. Other costs, such as preparing data, upgrading cloud infrastructure, and managing organizational changes, are less visible but significant.

One often overlooked cost is the energy consumption of generative AI. Training LLMs and responding to user requests, whether answering questions or generating images, demands considerable computing power. These tasks generate heat and necessitate sophisticated cooling systems in data centers, which, in turn, consume additional energy.

Despite this, most enterprises have not focused on the energy requirements of GenAI. However, the issue is gaining more attention at a broader level. The International Energy Agency (IEA), for instance, has forecasted that electricity consumption from data centers, AI, and cryptocurrency could double by 2026. By that time, data centers’ electricity use could exceed 1,000 terawatt-hours, equivalent to Japan’s total electricity consumption. Goldman Sachs also flagged the growing energy demand, attributing it partly to AI. The firm projects that global data center electricity use could more than double by 2030, fueled by AI and other factors.

ROI Implications of Energy Costs

The extent to which rising energy consumption will affect GenAI’s return on investment (ROI) remains unclear. For now, the perceived benefits of GenAI seem to outweigh concerns about energy costs. Most businesses have not been directly impacted, as these costs tend to affect hyperscalers more.
For instance, Google reported a 13% increase in greenhouse gas emissions in 2023, largely due to AI-related energy demands in its data centers. Scott Likens, PwC’s global chief AI engineering officer, noted that while energy consumption isn’t a barrier to adoption, it should still be factored into long-term strategies. “You don’t take it for granted. There’s a cost somewhere for the enterprise,” he said.

Energy Costs: Hidden but Present

Although energy expenses may not appear on an enterprise’s invoice, they are still present. Generative AI’s energy consumption is tied to both model training and inference: each time a user makes a query, the system expends energy to generate a response. While the energy used for individual queries is minor, the cumulative effect across millions of users can add up.

How these costs are passed to customers is somewhat opaque. Licensing fees for enterprise versions of GenAI products likely include energy costs, spread across the user base. According to PwC’s Likens, the costs associated with training models are shared among many users, reducing the burden on individual enterprises.

On the inference side, GenAI vendors charge for tokens, which correspond to computational power. Although increased token usage signals higher energy consumption, the financial impact on enterprises has so far been minimal, especially as token costs have decreased. This may be similar to buying an EV to save on gas but then spending hundreds of dollars and losing hours at charging stations.

Energy as an Indirect Concern

While energy costs haven’t been top-of-mind for GenAI adopters, those adopters could address the issue indirectly by focusing on other deployment challenges, such as reducing latency and improving cost efficiency. Newer models, such as OpenAI’s GPT-4o mini, are more economical and have helped organizations scale GenAI without prohibitive costs. Organizations may also use smaller, fine-tuned models to decrease latency and energy consumption.
By adopting multimodel approaches, enterprises can choose models based on the complexity of a task, optimizing for both speed and energy efficiency.

The Data Center Dilemma

As enterprises consider GenAI’s energy demands, data centers face the challenge head-on, investing in more sophisticated cooling systems to handle the heat generated by AI workloads. According to the Dell’Oro Group, the data center physical infrastructure market grew in the second quarter of 2024, signaling the start of the “AI growth cycle” for infrastructure sales, particularly thermal management systems.

Liquid cooling, more efficient than air cooling, is gaining traction as a way to manage the heat from high-performance computing. This method is expected to see rapid growth in the coming years as demand for AI workloads continues to increase.

Nuclear Power and AI Energy Demands

To meet AI’s growing energy demands, some hyperscalers are exploring nuclear energy for their data centers. AWS, Google, and Microsoft are among the companies exploring this option, with AWS acquiring a nuclear-powered data center campus earlier this year. Nuclear power could help these tech giants keep pace with AI’s energy requirements while also meeting sustainability goals.

I don’t know. It seems like if you tie AI accessibility to more nuclear power plants, you would lose a lot of fans.

As GenAI continues to evolve, both energy costs and efficiency are likely to play a greater role in decision-making. PwC has already begun including carbon impact as part of its GenAI value framework, which assesses the full scope of generative AI deployments. “The cost of carbon is in there, so we shouldn’t ignore it,” Likens said.
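The article's point that individually minor queries add up across millions of users comes down to simple arithmetic, and a rough sketch may help. The function name and every figure below are hypothetical placeholders, not measurements from the article or from any vendor. The `pue` parameter stands for power usage effectiveness, the ratio of total facility energy (including cooling) to IT equipment energy.

```python
def annual_energy_kwh(wh_per_query: float,
                      queries_per_day: int,
                      pue: float = 1.0,
                      days: int = 365) -> float:
    """Estimate yearly facility energy (kWh) for an inference workload.

    All inputs are assumptions supplied by the caller; nothing here is
    a measured value.
    """
    if wh_per_query < 0 or queries_per_day < 0 or days < 0:
        raise ValueError("inputs must be non-negative")
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    # Energy drawn by the servers themselves, in watt-hours.
    it_energy_wh = wh_per_query * queries_per_day * days
    # Scale by PUE to include cooling and other facility overhead,
    # then convert watt-hours to kilowatt-hours.
    return it_energy_wh * pue / 1000.0


if __name__ == "__main__":
    # Hypothetical workload: 1 Wh per query, 1,000,000 queries per day.
    print(annual_energy_kwh(1.0, 1_000_000, pue=1.5))
```

Plugging in an organization's own estimates for per-query energy and query volume gives an order-of-magnitude figure that can sit next to token fees in a business case.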

GPUs and AI Development


Graphics processing units (GPUs) have become widely recognized due to their growing role in AI development. However, a lesser-known but critical technology is also gaining attention: high-bandwidth memory (HBM). HBM is a high-density memory designed to overcome bottlenecks and maximize data transfer speeds between storage and processors. AI chipmakers like Nvidia rely on HBM for its superior bandwidth and energy efficiency. Its placement next to the GPU’s processor chip gives it a performance edge over traditional server RAM, which resides between storage and the processing unit. HBM’s ability to consume less power makes it ideal for AI model training, which demands significant energy resources.

However, as the AI landscape transitions from model training to AI inferencing, HBM’s widespread adoption may slow. According to Gartner’s 2023 forecast, the use of accelerator chips incorporating HBM for AI model training is expected to decline from 65% in 2022 to 30% by 2027, as inferencing becomes more cost-effective with traditional technologies.

How HBM Differs from Other Memory

HBM shares similarities with other memory technologies, such as graphics double data rate (GDDR), in delivering high bandwidth for graphics-intensive tasks. But HBM stands out due to its unique positioning. Unlike GDDR, which sits on the printed circuit board of the GPU, HBM is placed directly beside the processor, enhancing speed by reducing signal delays caused by longer interconnections. This proximity, combined with its stacked DRAM architecture, boosts performance compared to GDDR’s side-by-side chip design.

However, this stacked approach adds complexity. HBM relies on through-silicon via (TSV), a process that connects DRAM chips using electrical wires drilled through them, requiring larger die sizes and increasing production costs. According to analysts, this makes HBM more expensive and less efficient to manufacture than server DRAM, leading to higher yield losses during production.
AI’s Demand for HBM

Despite its manufacturing challenges, demand for HBM is surging due to its importance in AI model training. Major suppliers like SK Hynix, Samsung, and Micron have expanded production to meet this demand, with Micron reporting that its HBM is sold out through 2025. In fact, TrendForce predicts that HBM will contribute to record revenues for the memory industry in 2025.

The high demand for GPUs, especially from Nvidia, drives the need for HBM as AI companies focus on accelerating model training. Hyperscalers, looking to monetize AI, are investing heavily in HBM to speed up the process.

HBM’s Future in AI

While HBM has proven essential for AI training, its future may be uncertain as the focus shifts to AI inferencing, which requires less intensive memory resources. As inferencing becomes more prevalent, companies may opt for more affordable and widely available memory solutions.

Experts also see HBM following the same trajectory as other memory technologies, with continuous efforts to increase bandwidth and density. The next generation, HBM3E, is already in production, with HBM4 planned for release in 2026, promising even higher speeds.

Ultimately, the adoption of HBM will depend on market demand, especially from hyperscalers. If AI continues to push the limits of GPU performance, HBM could remain a critical component. However, if businesses prioritize cost efficiency over peak performance, HBM’s growth may level off.

Scale and AI Influence Shape Partner Ecosystems


Hyperscalers’ Scale and AI Influence Shape Partner Ecosystems

Despite their seemingly saturated networks, the largest cloud vendors continue to dominate as top ecosystems for service providers, according to a recent survey. Hyperscalers are playing a critical role in partner alliances, a trend that has only intensified in recent years.

A study released by Tercera, an investment firm specializing in IT services, highlights the dominance of cloud giants AWS, Google Cloud, and Microsoft Azure in the partner ecosystem landscape. More than 50% of the 250 technology service providers surveyed by Tercera identified one of these three vendors as their primary partner. This data comes from Tercera’s third annual report on the Top 30 Partner Ecosystems.

The report emphasizes the “gravitational pull” of these hyperscalers, attracting partners despite their already vast networks. Each of the major cloud vendors maintains relationships with thousands of software and services partners. “The hyperscalers continue to defy the law of large numbers when you look at how many partners are in their ecosystems,” said Michelle Swan, CMO at Tercera.

The Shift in Channel Alliances

The emergence of cloud vendors as top partners for service providers has been evident since at least 2021. That year, a survey by Accenture of 1,150 channel companies found that AWS, Google, and Microsoft accounted for the majority of revenue for these partners. This represents a significant shift in channel economics, where traditionally large hardware companies occupied the top spots in partner alliances.

AI’s Role in Partner Ecosystem Growth

The rise of generative AI (GenAI) is reshaping alliance strategies, as service providers increasingly align themselves with hyperscalers and their AI technology partners. For instance, AWS channel partners interested in GenAI are likely to work with Anthropic, following Amazon’s $4 billion investment in the AI company.
Meanwhile, Microsoft partners tend to collaborate with OpenAI, as Microsoft has committed up to $13 billion in investments to expand their partnership. “They have their own solar systems,” Swan remarked, referencing AWS, Google, Microsoft, and the AI startups within their ecosystems.

Tiers of Partner Ecosystems

Tercera categorizes its top 30 ecosystems into three tiers. The first tier, known as “market anchors,” includes AWS, Google, Microsoft, and large independent software vendors (ISVs) such as Salesforce and ServiceNow. The second tier, “market movers,” features publicly traded vendors with evolving partner ecosystems. The third tier, “market challengers,” is made up of privately held vendors with a partner-centric focus, such as Anthropic and OpenAI.

Generative AI Ecosystem Survey

A 2024 generative AI survey conducted by TechTarget and its Enterprise Strategy Group supports the idea that the leading cloud vendors play a central role in AI ecosystems. In a poll of 610 GenAI decision-makers and users, Microsoft topped the list of ecosystems supporting GenAI initiatives, with 54% of respondents citing it as the best ecosystem. Microsoft’s partner, OpenAI, followed with 35%. Google and AWS ranked third and fourth, with 30% and 24% of the responses, respectively.

The survey covered a wide range of industries, including business services and IT, further reinforcing the dominant role hyperscalers play in shaping AI and partner ecosystems.


Growing Energy Consumption in Generative AI

Growing Energy Consumption in Generative AI, but ROI Impact Remains Unclear

The rising energy costs associated with generative AI aren’t always central in enterprise financial considerations, yet experts suggest IT leaders should take note. Building a business case for generative AI involves both obvious and hidden expenses. Licensing fees for large language models (LLMs) and SaaS subscriptions are visible expenses, but less apparent costs include data preparation, cloud infrastructure upgrades, and managing organizational change.

One under-the-radar cost is the energy required by generative AI. Training LLMs demands vast computing power, and even routine AI tasks like answering user queries or generating images consume energy. These intensive processes require robust cooling systems in data centers, adding to energy use.

While energy costs haven’t been a focus for GenAI adopters, the issue is drawing wider attention. The International Energy Agency (IEA) predicts a doubling of data center electricity consumption by 2026, attributing much of the increase to AI. Goldman Sachs echoed these concerns, projecting data center power consumption to more than double by 2030.

For now, generative AI’s anticipated benefits outweigh energy cost concerns for most enterprises, with hyperscalers like Google bearing the brunt of these costs. Google recently reported a 13% increase in greenhouse gas emissions, citing AI as a major contributor and suggesting that reducing emissions might become more challenging with AI’s continued growth.

While not a barrier to adoption, energy costs play into generative AI’s long-term viability, noted Scott Likens, global AI engineering leader at PwC, emphasizing that “there’s energy being used; you don’t take it for granted.”

Energy Costs and Enterprise Adoption

Generative AI users might not see a line item for energy costs, yet these are embedded in fees.
Ryan Gross of Caylent points out that the costs are mainly tied to model training and inferencing, with each model query, though individually minor, adding up over time. These expenses are often spread across the customer base, as companies pay for generative AI access through a licensing model.

A PwC sustainability study showed that GenAI power costs, particularly from model training, are distributed among licensees. Token-based pricing for LLM usage also reflects inferencing costs, though these charges have decreased. Likens noted that the largest expenses still come from infrastructure and data management rather than energy.

Potential Efficiency Gains

Though energy isn’t a primary consideration, enterprises could reduce consumption indirectly through technological advancements. Newer, more cost-efficient models like OpenAI’s GPT-4o mini are 60% less expensive per token than prior versions, enabling organizations to deploy GenAI on a larger scale while keeping costs lower. Small, fine-tuned models can be used to address latency and lower energy consumption, part of a “multimodel” approach that can provide different accuracy and latency levels with varying energy demands.

Agentic AI also offers opportunities for cost and energy savings. By breaking down tasks and routing them through specialized models, companies can minimize latency and reduce power usage. According to Likens, using agentic architecture could cut costs and consumption, particularly when tasks are routed to more efficient models.

Rising Data Center Energy Needs

While enterprises may feel shielded from direct energy costs, data centers bear the growing power demand. Cooling solutions are evolving, with liquid cooling systems becoming more prevalent for AI workloads. As data centers face the “AI growth cycle,” the demand for energy-efficient cooling solutions has fueled a resurgence in thermal management investment.
Liquid cooling, being more efficient than air cooling, is gaining traction due to the power demands of AI and high-performance computing. IDTechEx projects that data center liquid cooling revenue could exceed $50 billion by 2035. Meanwhile, data centers are exploring nuclear power, with AWS, Google, and Microsoft among those considering nuclear energy as a sustainable solution to meet AI’s power demands.

Future ROI Considerations

While enterprises remain shielded from the full energy costs of generative AI, careful model selection and architectural choices could help curb consumption. PwC, for instance, factors in the “carbon impact” as part of its GenAI deployment strategy, recognizing that energy considerations are now a part of the generative AI value proposition. As organizations increasingly factor sustainability into their tech decisions, energy efficiency might soon play a larger role in generative AI ROI calculations.
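The multimodel and agentic routing idea described under Potential Efficiency Gains can be sketched in a few lines. This is only an illustration under stated assumptions: the model names, the complexity score, the 0.5 threshold, and the prices are all hypothetical. The small model is priced at 0.4 units against 1.0 for the large one simply to mirror a 60% per-token reduction, not to reflect any vendor's actual rates.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Model:
    name: str                    # hypothetical label, not a real product
    price_per_1k_tokens: float   # hypothetical price units


# Assumed catalogue: a small fine-tuned model and a large general one.
SMALL = Model("small-finetuned", 0.4)
LARGE = Model("large-general", 1.0)


def route(complexity: float, threshold: float = 0.5) -> Model:
    """Send simple tasks to the small model, complex ones to the large."""
    return LARGE if complexity >= threshold else SMALL


def total_cost(tasks: Iterable[Tuple[float, int]]) -> float:
    """Sum token cost for (complexity, token_count) pairs after routing."""
    return sum(route(c).price_per_1k_tokens * tokens / 1000
               for c, tokens in tasks)


def baseline_cost(tasks: Iterable[Tuple[float, int]]) -> float:
    """Cost if every task went to the large model, for comparison."""
    return sum(LARGE.price_per_1k_tokens * tokens / 1000
               for _, tokens in tasks)


if __name__ == "__main__":
    workload = [(0.2, 1000), (0.9, 2000)]
    print(total_cost(workload), baseline_cost(workload))
```

In practice the complexity score would have to come from somewhere, such as a classifier or task metadata, and token price is at best a proxy for energy use, so the sketch shows the shape of the trade-off rather than a way to measure it.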
