Structured data is broken for AI!
As soon as you try to convert that POC into a production app, things fall apart fairly quickly.
GenAI, LLM-based AI agents, and AI assistants are starting to have a real and significant impact on how people live and work. Companies of all sizes and across all verticals are racing to build and leverage AI in their user and business workflows.
RAG-based workflows have become the de facto standard for AI applications. Chatting with your PDFs and unstructured data is great and, to be fair, has unlocked huge swaths of previously impenetrable knowledge bases. However, I believe strongly that without incorporating the crown jewels of the business, i.e., structured operational data (databases, data warehouses, and API-based systems), the true potential of AI can’t be unlocked.
So why is it so hard to build real-world AI agents and applications incorporating structured data? After all, there is Text2SQL, MCP is quickly becoming the way to connect to various data sources, and LLMs are getting smarter with more reasoning almost every day!
Here’s the issue: some of the above techniques work well in demos and POCs. But as soon as you try to convert that POC into a production app, things fall apart fairly quickly. None of these methods handles the complexity of real operational data, which is often spread across multiple systems with evolving schemas and inconsistent naming, accurately and repeatably, and definitely not at scale.
Data Lakes won’t fix the problem
Data silos are a fact of life.
Every business, whether it’s an SMB or a large enterprise, has a unique tech stack that exists within the context of the business. Some parts of the company may use Postgres, others may use MySQL or Oracle or Snowflake, and yet others may use Salesforce, Stripe, and Hubspot, to meet the needs of that particular team or business unit.
The “ideal world” consolidation approach, where all data always lives in one place and can be queried easily from there, is appealing but just isn’t practical in a lot of cases. I know firsthand from my experience working directly with countless Google Spanner and GCP customers that no matter how well designed and carefully architected your data lake or data warehouse is, there’s always a data source somewhere in the organization that isn’t included in it.
In addition, even when data is orchestrated through complex pipelines into the lake, by the time it lands there it’s stale, so it can’t serve real-time decision-making use cases; those require fetching data on demand. That means maintaining multiple different workflows and pipelines to ETL data from a source and also to retrieve data live from that same source for different use cases… what a waste of precious time, engineering resources, and energy!
Smarter LLMs aren’t the holy grail either
Will the accuracy and complexity issues of structured data retrieval be solved by better reasoning models and smarter LLMs? My answer is no!
Even the best reasoning machines today (humans) don't always make the right decisions, especially when they don't have the right information. In fact, we tend to make up stories to fill in gaps in our own knowledge. That’s why we’ve built so many systems of record over time, including databases, spreadsheets, and other structured and tabular formats, to give us precise lookups of exact data exactly when we need them.
Switching back to machines designed to think like humans: research conducted by Transluce found that as reasoning models become “cleverer”, they also tend to become less reliable, because they fabricate information, sometimes quite elaborately, to fill in their own knowledge gaps. This means reasoning models are more likely to produce inaccurate and misleading information. That’s not something we want to rely on when we need specific data to make critical decisions.
What’s considered state of the art today?
Let’s shift gears and talk about solutions that are currently available to AI builders and AI developers. They work… until they don’t!
There’s Text2SQL. But it doesn’t work without spoon-fed context, and when you do stuff the usually large schemas into the context, LLMs tend to get overwhelmed or confused. They are prone to hallucination when they don’t know the exact SQL dialect or the exact name of a data object; instead, they guess. They make up table names and column names. Maybe Text2SQL works in a demo, but it breaks in production for even moderately complex data or queries.
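To make that failure mode concrete, here is a minimal sketch of the naive Text2SQL pattern, assuming an OpenAI-style chat API; get_schema_ddl() and the model name are illustrative placeholders, not a prescribed implementation.

```python
# Minimal sketch of the naive Text2SQL pattern described above.
# Assumes the OpenAI Python SDK; get_schema_ddl() is a hypothetical helper.
from openai import OpenAI

client = OpenAI()

def get_schema_ddl() -> str:
    """Placeholder: dump every CREATE TABLE statement into one big string.
    On a real warehouse this can be hundreds of tables and thousands of columns."""
    ...

def text2sql(question: str) -> str:
    prompt = (
        "You are a SQL assistant. Given this schema:\n"
        f"{get_schema_ddl()}\n\n"
        f"Write one SQL query that answers: {question}"
    )
    response = client.chat.completions.create(
        model="gpt-4o",  # any chat model; illustrative
        messages=[{"role": "user", "content": prompt}],
    )
    # Nothing here verifies that the tables, columns, or dialect in the
    # generated SQL actually exist, which is exactly where hallucination bites.
    return response.choices[0].message.content
```

Everything rides on the model never misremembering an identifier, and there is no feedback loop for when it does.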
So what about using RAG to retrieve schema details? It’s somewhat more scalable but still wildly unreliable… and again, it leads to more hallucinations. RAG is notoriously bad at exact matches and precise lookups, which is exactly what structured data retrieval needs.
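Here is a rough sketch of that RAG-on-schema approach, with embed() standing in for whatever embedding model you use and made-up table names; it also shows why “close in embedding space” is not the same as “the exact identifier the SQL needs.”

```python
# Rough sketch of RAG over schema metadata: embed table descriptions and
# pull the top-k most similar ones into the prompt. embed() is a placeholder
# for whatever embedding model or API you use; the tables are made up.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder embedding call (an embeddings API or a local model)."""
    ...

schema_docs = [
    "orders(order_id, customer_id, total_usd, created_at)",
    "clients(client_id, name, region)",        # named 'client', not 'customer'
    "invoices(invoice_id, client_id, amount)",
]
doc_vectors = np.stack([embed(doc) for doc in schema_docs])

def retrieve_schema(question: str, k: int = 2) -> list[str]:
    q = embed(question)
    sims = doc_vectors @ q / (np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q))
    return [schema_docs[i] for i in np.argsort(sims)[::-1][:k]]

# A question about "total customer spend" may surface 'orders' but miss
# 'clients' and 'invoices': semantic similarity is fuzzy, while the join
# keys the SQL needs are exact.
```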
LLM fine-tuning can work, especially combined with good prompting techniques to increase the consistency of outputs. But how much is your dataset likely to evolve? In the real world, schemas change continuously, and so do data sets. It’s not practical to re-train every time a schema change happens, and you definitely can’t train on every internal data system your business uses. That’s never-ending upkeep and maintenance, which just isn’t sustainable.
Then there’s MCP. It’s a great open-source start on the “connector” problem, providing a consistent, standard way to do tool calling for LLMs. But MCP is just the plumbing. To actually use it well, you still need a smart agent that understands the business context inside and out: how the data is structured, where it lives, and which queries actually make sense. The hard stuff still isn’t easily repeatable or automated; a *lot* of the heavy lifting is left to the “agent(s)”, i.e., ultimately to the AI app developer, to figure out.
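For illustration, here is roughly what such a connector can look like, assuming the MCP Python SDK’s FastMCP interface; the database helper is a placeholder. The protocol standardizes how the tool is exposed, but nothing in it knows whether the SQL an agent sends references real tables or makes business sense.

```python
# Sketch of an MCP server exposing a raw SQL tool, assuming the MCP Python
# SDK's FastMCP interface. execute_readonly() is a placeholder for your own
# database access layer.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("postgres-connector")

def execute_readonly(query: str) -> str:
    """Placeholder: run the query against a read-only replica and return rows."""
    ...

@mcp.tool()
def run_sql(query: str) -> str:
    """Run a read-only SQL query against the operations database."""
    # MCP gives the agent a standard way to call this tool; it does not
    # decide which tables exist, what they mean, or whether the query is sane.
    return execute_readonly(query)

if __name__ == "__main__":
    mcp.run()
```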
Why Snow Leopard is different
At Snow Leopard, we’re building an intelligent data retrieval solution that combines the techniques mentioned above with semantic intelligence about the data itself. This lets us tackle the data consistency and accuracy issues that are rampant in LLM-based agents and problematic for many enterprise AI use cases.
Text2SQL agents and similar agentic workflows for structured data retrieval fail today because LLMs and AI agents are missing the business context around the data.
This means the relationships within a data source (e.g., between different columns and tables) and across data sources (where objects that mean the same thing have different names and ways to address them, such as the customer_id column in Postgres being the same as the client_id column in Snowflake) are completely unknown to the agent. Often, these relationships exist only in the heads of the analysts and data engineers who manage and maintain the business’s data infrastructure. But when this business logic can be extracted and fed to the agent, the results are far better and more grounded in reality.
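As a toy illustration of what that missing context looks like, here is the kind of semantic mapping an agent would need, built around the customer_id/client_id example above; all source, table, and column names are invented.

```python
# Toy semantic layer: one logical entity mapped to its different physical
# names across sources. All names below are invented for illustration.
SEMANTIC_MAP = {
    "customer": {
        "postgres":   {"table": "customers",  "key": "customer_id"},
        "snowflake":  {"table": "DIM_CLIENT", "key": "client_id"},
        "salesforce": {"object": "Account",   "key": "AccountId"},
    },
}

def physical_key(entity: str, source: str) -> str:
    """Resolve a logical entity to the identifier a given source actually uses."""
    return SEMANTIC_MAP[entity][source]["key"]

# physical_key("customer", "snowflake") -> "client_id"
```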
That is the mission we’re on at Snow Leopard! We are making it simpler and easier for AI developers to use their operational business data in AI applications, in production and on demand, without having to pre-define complex data workflows or build all the pipelines. This process, which we call “live retrieval,” takes a federation (rather than consolidation) approach: Snow Leopard evaluates each query in real time and decides on the fly where to pull data from, without pre-storing or pre-processing it.
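To make “live retrieval” a bit more concrete, here is a deliberately rough sketch of a federated, on-demand flow. It is illustrative only, not Snow Leopard’s actual implementation; route() and fetch_live() are hypothetical placeholders.

```python
# Deliberately rough sketch of a federated, on-demand retrieval flow.
# Not Snow Leopard's implementation; route() and fetch_live() are placeholders.
def route(question: str) -> str:
    """Decide which live source can answer the question, e.g. with an LLM
    grounded in semantic metadata like the mapping sketched earlier."""
    ...

def fetch_live(source: str, question: str):
    """Placeholder: translate the question into that source's dialect or API
    and execute it against the live system."""
    ...

def answer(question: str):
    source = route(question)             # decide on the fly where to go
    return fetch_live(source, question)  # fetch on demand; nothing pre-stored
```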
We’re excited to be on this journey to make structured, operational data work for AI!