AI technology has experienced generational shifts and rapid innovation over the past several years. Despite this progress, there is still much to be accomplished in the AI ecosystem before the technology becomes commonplace in both our daily lives and enterprise workflows. Carli Stein, an Investor at M12 focused on AI startups across the stack, explores how smaller language models, robust security tools, and improved testing methodologies will enable broader AI adoption.
Key Points from Carli Stein’s POV:
Why will AI infrastructure be such an important enabler of enterprise AI adoption going forward?
Smaller AI models, enabled by new architectures beyond the transformer, will support the deployment of cheaper, faster, and more accurate AI in the enterprise. “While large language models (LLMs) are rapidly improving, they are not a one-size-fits-all solution,” says Stein. Smaller models, built on millions rather than billions of parameters, are better able to serve task-specific use cases in real time. Given their lower latency, these models can potentially run on a CPU and be deployed at the edge or on low-cost devices, democratizing access to AI for both developers and consumers. Likewise, their smaller size reduces the GPU-hours needed for training by orders of magnitude, dramatically decreasing the development and deployment costs of models tailored to curated datasets. “This shift signifies a departure from the 'bigger is better' mindset, offering faster, cheaper, and more accurate AI solutions tailored to various tasks.”
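The training-cost gap described above can be put on a back-of-envelope footing with the widely used approximation that training a dense transformer costs roughly 6 × parameters × training tokens in FLOPs. A minimal sketch, with purely illustrative model sizes rather than figures from any vendor:

```python
# Back-of-envelope training compute comparison for a small vs. large model,
# using the common approximation: training FLOPs ~= 6 * parameters * tokens.
# Model sizes and token counts below are illustrative assumptions.

def training_flops(params: float, tokens: float) -> float:
    """Approximate total training FLOPs for a dense transformer."""
    return 6 * params * tokens

small = training_flops(params=3e9, tokens=600e9)   # hypothetical 3B-parameter model
large = training_flops(params=300e9, tokens=6e12)  # hypothetical 300B-parameter model

# The small model needs orders of magnitude less compute to train.
ratio = large / small
print(f"small: {small:.2e} FLOPs, large: {large:.2e} FLOPs, gap: ~{ratio:.0f}x")
```

Under these assumed numbers the gap is three orders of magnitude, which is the economic intuition behind "a fraction of the cost" for task-specific small models.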
Enterprise AI demands higher standards for security, testing, and cost efficiency when putting AI into production. “I’m excited about startups addressing these opportunities with new infrastructure solutions as the market becomes more competitive with ambitious founders. I expect the next wave of solutions to create cost efficiencies that improve ROI,” says Stein. “While I’ve seen firsthand some of the concerns enterprises have when deploying AI, I remain bullish on the investable opportunities in the AI infrastructure space that will further propel AI into production. Engineering teams are quickly innovating existing products from the ground up, while business teams are calculating the true ROI of these expenditures in productivity and revenue generation. There is a huge opportunity for startups innovating with efficient AI models and standardizing data protection, identity management, and reliable AI application testing.” The increase in R&D expenditure to deploy AI applications in production creates an open landscape for startups building infrastructure for enterprise-grade security and reliability. These opportunities include:
AI-specific security, particularly for agents: Securing AI agents is critical as they become more embedded in our daily lives and businesses. Unauthorized access through agents to sensitive data such as financial records, medical records, or passwords can lead to disastrous data breaches. Enterprises want to be certain of security and privacy in their use of agents, so there is untapped potential for startups creating solutions to secure AI agents and design data protection policies. “The need for security is driving an opportunity for innovation around permission management and identity frameworks for agents,” adds Stein. “These developments are crucial for fostering trust in AI systems and ensuring their safe and responsible use.”
Testing and evaluation software: Traditional software testing methodologies lack the sophistication to accommodate the non-deterministic nature of AI models. LLMs are generalized by nature and can thus hallucinate on specific tasks even with guardrails. Effective scaling of LLMs within enterprises requires robust testing to ensure accurate performance and deliver measurable ROI. “While some companies are building in-house solutions, the testing market remains fragmented, as most testing solutions are inconsistent across a range of application types. Aligning AI and refining SLMs will be crucial for more efficient AI deployment, and I am eager to see the next wave of startups that will evaluate new types of AI in production.”
What are some of the emerging business models or use cases that you’re keeping an eye on in this category?
Smaller AI Models: Small model providers are demonstrating success mirroring the TaaS (token-as-a-service) billing structures of larger model providers. These companies have an arbitrage opportunity: they can charge just as much as the leading LLMs while developing and training their models at a fraction of the cost. “As I’ve seen with Microsoft’s Phi-3 model family and now GPT-4o mini, along with M12’s recent investments into two stealth SLM companies, I’m excited to explore how tiny models may enable the next wave of AI into production. This will ultimately make AI accessible to the masses, as AI can run on-device on commodity hardware. Emerging architectures, such as symbolic AI, have the potential to disrupt the transformer landscape as we know it.”
AI-agent-specific security that manages identity, access, and agent-to-agent interactions: AI agent security is essential for managing permissions and agent-to-agent interaction. Automated security tools may eventually replace humans in the loop to control data access and prevent agents from leaking sensitive information. With the rise of multi-agent and agent-to-agent systems, identity verification is vital to confirm authorized access. This is an emerging space with a few early startups, but one that will become more important as the AI agent ecosystem continues to search for product-market fit.
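To make the permission-management idea concrete, here is a minimal deny-by-default sketch in Python. The tool names, scope strings, and `AgentIdentity` type are hypothetical illustrations, not drawn from any real agent framework:

```python
# Sketch of scope-based permission checks for AI agents.
# All identifiers here (tools, scopes) are illustrative assumptions.
from dataclasses import dataclass

@dataclass(frozen=True)
class AgentIdentity:
    agent_id: str
    scopes: frozenset  # permissions granted to this agent

# Each tool or data source declares the scope it requires.
TOOL_REQUIREMENTS = {
    "read_medical_records": "phi:read",
    "read_financial_records": "finance:read",
    "send_email": "comms:send",
}

def authorize(agent: AgentIdentity, tool: str) -> bool:
    """Deny by default: unknown tools and missing scopes are both refused."""
    required = TOOL_REQUIREMENTS.get(tool)
    return required is not None and required in agent.scopes

support_bot = AgentIdentity("support-bot", frozenset({"comms:send"}))
print(authorize(support_bot, "send_email"))            # granted scope
print(authorize(support_bot, "read_medical_records"))  # no phi:read scope
print(authorize(support_bot, "delete_database"))       # unregistered tool
```

The deny-by-default design choice matters here: an agent acquiring a new tool gains no access until a scope is explicitly registered and granted, which is the kind of policy the permission-management startups described above would formalize.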
Testing and evaluation software that accommodates each layer of the AI value chain: Demand is growing for evaluation software across AI agents, multi-modal models, and applications. This market is nascent and fragmented, with existing observability and monitoring tools adding AI-specific features to their products and new startups emerging. As the ecosystem unfolds and enterprises deploy AI at greater scale, the market needs an enterprise-focused company that wins across the value chain for testing AI in production.
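One way non-determinism changes testing is that a single expected output no longer works: evaluations instead sample the model repeatedly and gate deployment on a pass rate. A toy sketch of this pattern, where `mock_model` is a stand-in function simulating occasional hallucination rather than a real LLM call:

```python
# Toy sketch of pass-rate evaluation for a non-deterministic model.
# mock_model is a hypothetical stand-in, not a real LLM API.
import random

def mock_model(prompt: str, rng: random.Random) -> str:
    # Simulates an LLM that occasionally "hallucinates" a wrong answer.
    return "Paris" if rng.random() < 0.9 else "Lyon"

def pass_rate(prompt: str, checker, n: int = 100, seed: int = 0) -> float:
    """Sample the model n times and report the fraction passing the checker."""
    rng = random.Random(seed)  # fixed seed makes the evaluation reproducible
    passes = sum(checker(mock_model(prompt, rng)) for _ in range(n))
    return passes / n

rate = pass_rate("Capital of France?", checker=lambda out: out == "Paris")
print(f"pass rate: {rate:.2f}")
# A deployment gate would then assert rate >= some threshold, e.g. 0.95.
```

Real evaluation products layer much more on top (semantic checkers, regression suites, per-task dashboards), but the shift from exact-match tests to sampled, thresholded checks is the core difference from traditional software testing described above.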
What are some of the potential roadblocks?
Two challenges may slow the adoption of small language models:
Enterprises have invested heavily in LLMs. While smaller models offer performance advantages, LLMs may need more enterprise saturation before smaller models penetrate the market. Much of the initial traction for smaller models has been through open source initiatives that have achieved product-led growth from developers themselves. “PLG adoption may take time to translate into enterprise-level adoption,” says Stein.
The market for smaller models will be highly competitive and present choices to customers. “Different models will be more effective for different use cases. Moreover, incumbents such as Microsoft are continuing to release smaller models,” says Stein. “That said, I am bullish there is a tremendous market opportunity in this space with room for multiple winners.”
Market consolidation across monitoring, evaluation, observability, and deployment may create an ecosystem where emerging players push for differentiation. As the market saturates, many platforms aim to capture and retain a ‘sticky’ and sustainable customer base that warrants venture backing. Companies will likely need to pivot upstream or downstream to effectively capture a larger market share.
While there is overwhelming demand, it is unclear how many winners this category can support, and whether they will be holistic or specialized solutions. “In assessing testing methodologies for AI applications, there's a notable split between companies focusing on pre-production monitoring and those tackling full application testing. While the space is active, I’ve yet to see clear leaders emerge, raising questions about scalability and revenue generation. The potential for a winning solution is plausible, but differentiation and stickiness remain key challenges.”
IN THE INVESTOR’S OWN WORDS
Over the past two years, I have seen generational shifts in AI technology. My thesis centered on identifying the most transformative types of AI agents. I saw everything from workflow agents to next-gen RPA agents, and even agent-to-agent interactions. I searched for agents with proactive rather than reactive capabilities and multi-agent chaining abilities.
However, I have come to the conclusion that AI agents are still climbing toward product-market fit for enterprise or consumer adoption. AI agents are not widely adopted yet due to 1) the costs necessary to implement and train LLMs, 2) security concerns, and 3) the need for more robust evaluation infrastructure. This thesis remains ahead of its time, as AI agents have yet to cement their place as fixtures of the AI era.
That said, I’m bullish on the opportunity for smaller models to propel AI agents into production-grade technology. AI agents must be cost-efficient, extremely low latency, and accurate to provide true ROI for enterprises to deploy them in production. Small models will take big strides toward getting agents closer to accomplishing their goals successfully without hallucination. While companies can apply guardrails, there is an opportunity to create solutions that protect against security vulnerabilities through identity management. Lastly, AI applications must have robust testing solutions to ensure they act as intended, and I believe the market has yet to find a true evaluation solution that solves end-to-end testing for enterprise-grade AI products.
To draw a historical parallel, the first wave of iPhone and mobile apps didn’t have the lasting impact of those that succeeded them. Similarly, today’s AI agents need underlying infrastructure and security to make a significant impact. The AI agents of today are not the AI agents of tomorrow. We are about to see a transformational shift, enabling broad enterprise AI deployment into production.
Looking ahead, the AI industry’s focus should return to the underlying infrastructure to foster enduring AI native applications. There is a tremendous opportunity for companies to address performance gaps, reduce costs, lower latency, and improve reliability for widespread adoption across enterprise and consumer domains.
WHAT ELSE TO WATCH IN THIS CATEGORY
Open source projects are key to the advancements in the underlying infrastructure that enables greater AI adoption. “Open source projects are often overlooked in driving AI advancement and adoption,” adds Stein. Microsoft's Phi-2 and Phi-3 models, both of which are open source, are examples of smaller models that demonstrate comparable capabilities to LLMs while requiring significantly less computing resources to operate. “Our GitHub Fund invests in open source projects, promoting open source monetization strategies and supporting developer-driven AI infrastructure growth. I am on the hunt for startups continuing to innovate in open source that will propel AI into production.”