Scaling Generative AI: Infrastructure Challenges and Emerging Solutions
Early generative AI adopters often relied on SaaS tools like ChatGPT or Microsoft Copilot, which were cost-effective without presenting infrastructure burdens. But as companies move to scale, these challenges are surfacing.
Organizations exploring gen AI usually start with enterprise-grade cloud-based services like OpenAI’s ChatGPT or Anthropic’s Claude. Positive early results, like efficiency gains in producing executive summaries or marketing content, inspire further AI use across the enterprise. In the coming year, experts anticipate even broader adoption of these tools as they are integrated into platforms like Adobe Photoshop, Google Workspace, and Salesforce, where infrastructure management remains with the providers. However, companies aiming to go beyond off-the-shelf solutions will need to build custom models, fine-tune existing ones, or deploy retrieval-augmented generation (RAG) systems that integrate real-time data, necessitating a more robust infrastructure.
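To make the RAG idea concrete, here is a minimal sketch of the retrieval step, assuming a small in-memory list of documents and a simple word-overlap (cosine) score. The function names (`tokenize`, `cosine`, `retrieve`, `build_prompt`) and the prompt wording are hypothetical and chosen for illustration. A production system would more likely use an embedding model and a vector store, and would pass the resulting prompt to an LLM.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, k=2):
    """Return up to k documents that share words with the query, best match first."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]


def build_prompt(query, documents, k=2):
    """Assemble a prompt that grounds the question in the retrieved context."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

The point of the sketch is the shape of the pipeline: fresh data is retrieved at query time and placed in the prompt, which is why RAG depends on data pipelines and storage that the organization itself has to run.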
Case in Point: Spirent’s Custom AI Path
Spirent, a telecom testing firm, initially used OpenAI’s enterprise edition for its generative AI needs; the service offered data privacy assurances without requiring a custom large language model (LLM). However, as its AI applications expanded, Spirent recognized a need for more tailored solutions, which led to significant infrastructure upgrades. The journey began with a data modernization effort on AWS to ensure a strong data foundation, which is essential for effective AI performance. When legacy integration tools proved insufficient for large-scale data flows, Spirent chose SnapLogic for its powerful capabilities, including an AI builder to streamline costs.
Data Management: A Key Enabler for Gen AI
Spirent’s data modernization reflects a broader trend. A Deloitte report indicates that 75% of organizations are increasing investments in data lifecycle management due to gen AI. Proper data handling is crucial for maintaining high-quality AI output, whether through a modern data lake or improved pipelines. Spirent’s infrastructure overhaul enabled employees to leverage AI to produce contextually accurate materials in Salesforce, drawing from both SharePoint and Salesforce repositories.
Global Challenges: Regional and Technical Constraints
As a global company, Spirent faces unique challenges, such as managing LLM access in countries with restricted AI resources. One solution is deploying AI models in specific regions, though not all LLMs are available globally, making regional adaptation difficult. For now, Spirent continues to rely on large commercial providers like OpenAI.
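One way to picture region-specific deployment is a small routing table with fallbacks. The sketch below is an illustration only, not Spirent's implementation: the `Deployment` type, the `pick_deployment` function, and the region and model names used with it are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    """A model available in a given region (names are placeholders)."""
    model: str
    region: str


def pick_deployment(user_region, deployments, fallback_order):
    """Prefer a deployment in the user's region, then try listed fallback regions.

    Returns None when no permitted region has a deployment.
    """
    by_region = {}
    for d in deployments:
        # Keep the first deployment listed for each region.
        by_region.setdefault(d.region, d)

    if user_region in by_region:
        return by_region[user_region]

    for region in fallback_order.get(user_region, []):
        if region in by_region:
            return by_region[region]
    return None
```

A region with no entry in the fallback table gets no deployment at all, which mirrors the constraint described above: where a model is not offered or not permitted, the request cannot simply be sent elsewhere.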
The Role of Public, Private, and GPU-as-a-Service Clouds
Many organizations follow a hybrid approach to AI infrastructure, combining public clouds, colocation facilities, and on-prem solutions. Specialized GPU-as-a-service vendors are becoming popular for handling high-demand AI computations, helping businesses manage costs without compromising performance. Business process outsourcing company TaskUs, for example, focuses on optimizing compute and data flows as it scales its gen AI deployments, while Cognizant advises companies to distinguish between training and inference needs, since each has different latency requirements.
The AI Infrastructure Skill Gap
Despite infrastructure advancements, skill shortages in AI management persist. As companies scale AI pilots into production, challenges around data storage, connectivity, and compute capacity only increase. To address this, experts suggest using gen AI itself to manage infrastructure complexity, possibly through self-optimizing AI-driven code that adapts to evolving provider offerings.
The rapid evolution of gen AI makes it critical for organizations to remain agile, adjusting strategies and infrastructure to meet expanding AI workloads. As generative AI reshapes enterprise workflows, its infrastructure demands require proactive, scalable, and cost-effective solutions that enable continuous growth without compromising data integrity or compliance.