The Missing Language for Managing Data at Scale

Picture this: it is budget season, and the Chief Operating Officer, the Chief Technology Officer, and the Head of Risk and Compliance are in the same room, staring at a finite investment envelope. Each has a list of priorities. The COO wants to accelerate a customer onboarding programme. The CTO needs to replace three applications approaching end of life. The Head of Risk is insisting on a data quality remediation that the regulator has flagged. Each priority is justified. Each comes with its own slide deck, its own business case, its own set of numbers. And yet nobody in the room can answer the most basic question: where do these investments overlap, where do they conflict, and which part of the system will break first if we get the sequence wrong?

This is not a failure of intelligence or goodwill. It is a failure of language. There is no shared picture of the data landscape that lets these executives see their competing priorities on the same surface — no way to point to a specific region of the system and say, “this is where three projects are about to land simultaneously on infrastructure that is already fragile.” The slide decks do not connect to each other. The architecture diagrams, if they exist at all, are months out of date and readable only by specialists. The risk register lives in a separate system from the project portfolio, which lives in a separate system from the technology roadmap.

I have spent four decades building, running, and fixing IT systems for large, global financial institutions, and this scene has played out in front of me more times than I can count. The picture we have of our data is never the picture of what is actually happening. We draw diagrams, we maintain inventories, we produce dashboards — and within weeks, sometimes days, they are already behind reality.

This is not because teams are lazy or underfunded. It is a structural problem. Systems evolve faster than the artefacts used to describe them. Every schema, data model, lineage diagram, and data dictionary serves a narrow audience and a specific moment in time. Dashboards built on inventories of systems, controls, and projects are maintained independently, never situated in relation to one another or to the data domains they actually affect. Enterprise architecture, which is supposed to address these overlapping contexts, has not added the data layer to its traditional dimensions of applications, business capabilities, and processes. The result is that the impact of isolated decisions involving data exchanges, from technology choices to process control improvements, is invisible. And because it is invisible, it cannot be managed.

There is a name for what has gone wrong. It is the reification fallacy: the error of treating a model of a system as if it were the system itself. Inventories and catalogues try to replicate systems. They are not digital twins. They are snapshots of intent, not representations of reality. And in a world where AI adoption is accelerating system change far beyond the pace at which architecture documentation can follow, intent without reality is worse than useless: it is misleading.

I decided to look at this problem from a completely different direction. Instead of asking how we can make our architecture artefacts keep up with change, I asked a simpler question: is there a language that humans already use to navigate complex, changing environments — one that does not require specialist training, and that scales from a quick glance to a deep investigation?

The answer was obvious, and it has been in front of us for centuries: cartography.

Maps are not terrain. They never have been. A map of London does not reproduce London; it projects London into a form that lets you navigate it, even if you have never set foot in the city. Since Google Maps launched in 2005, almost everyone with a smartphone has become a natural map reader. The conventions are intuitive: distance tells you how far things are from each other, landmarks orient you, colour signals status, and you zoom, pan, and filter without thinking about it. Nobody needs a training course to read a city map. This is the language I have built Data Cartography on.

Data Cartography is the practice of projecting complex, evolving data systems into navigable maps that reveal structure, stability, and change across domains.

Data Cartography is not a new database, not a graph technology, not a dashboard replacement, and not a visualisation of raw data. It is a projection — selective, purpose-driven, and designed for navigation rather than precision measurement. Like a geographic map, it is an interpretive tool that makes complex terrain accessible to anyone who needs to make decisions about it.

The visual grammar of Data Cartography consists of four elements, each borrowed directly from the principles that make geographic maps work.

The first element is spatial semantics — the map itself. Position and distance convey meaning. Objects placed close together on a Data Map have strong functional relationships, much like neighbouring districts in a city that share transport links and a common character. The foundational structures on the map are data zones: stable regions that represent the core data entities recognised by the organisation — customers, transactions, products, risks. These zones are like continents or boroughs: they change slowly, if at all, and they give the map its enduring shape. Distance between objects on the map indicates the strength of their relationship. Systems that exchange data in real time sit close together. Systems with occasional, loosely coupled interactions are further apart. The map does not calculate this with mathematical precision — it represents it with the same spatial intuition you use when you know that Soho and Covent Garden are neighbours, but Heathrow is a different journey entirely.
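For readers who think in code, the idea that proximity encodes relationship strength can be sketched in a few lines of Python. This is an illustration only: the coupling scores are invented, and the function name `nearest_first` is an assumption made for the example rather than part of Data Cartography. In keeping with the point about spatial intuition, the sketch returns a rough ordering of neighbours, not a precise measurement.

```python
# Hypothetical coupling scores between data zones, from 0 (unrelated)
# to 1 (tightly coupled). The numbers are invented for illustration.
COUPLING = {
    ("Customer", "Transactions"): 0.9,  # real-time exchange: close neighbours
    ("Customer", "Products"): 0.5,
    ("Transactions", "Risks"): 0.2,     # occasional, loosely coupled: far apart
}


def nearest_first(zone):
    """List the zones related to `zone`, closest (most strongly coupled) first."""
    related = []
    for (a, b), strength in COUPLING.items():
        if zone in (a, b):
            other = b if a == zone else a
            related.append((other, strength))
    ranked = sorted(related, key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked]


print(nearest_first("Customer"))
```

A layout step would then place the first names in the list nearest to the zone in question and the last ones furthest away.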

The second element is visual primitives — the vocabulary. Icons represent commonly understood landmarks: applications, business processes, data stores. Size represents volume or population — a large circle means more data, more users, or more transactions flowing through that point. Colour represents risk or intensity: green for stable, amber for emerging concern, red for high risk. Paths represent connections between landmarks — some permanent, like major roads, and some temporary, like construction detours. Borders demarcate zones of responsibility, just as political borders or geological boundaries do on a geographic map. And labels name things and point to further detail, exactly as street names and station markers do in any city.

The third element is layers and annotations — situational awareness. Layers represent organisational structures: departments, teams, functions, each with prescribed responsibilities for data within the business operating model. They are placed over the map during specific events to create context. This is the equivalent of dropping a pin on Google Maps and saying “you are here” or “meet me here.” Layers are more dynamic than the map itself — they change as organisations restructure or as projects move through their lifecycle — but they are essential for turning a static picture into a basis for communication and action.

The fourth element is interactions — navigation. Even static maps support interaction. In a paper atlas, you zoom by turning to a page that shows a smaller area in greater detail. You pan by following margin references to adjacent pages. You filter by choosing a thematic map — terrain, transport, population density — that shows the same geography through a different lens. Electronic maps make these interactions native: pinch to zoom, drag to pan, toggle layers on and off. Data Maps work the same way. Zoom into a data zone to see finer-grained detail. Pan to an adjacent zone to understand how data flows between them. Filter to see a different projection: cost, risk, regulatory exposure, or AI readiness.
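Zoom and filter can be thought of as two simple selections over the same set of landmarks. The Python sketch below is one possible way to express that, under stated assumptions: the `Landmark` class, the landmark names, the zone assignments, and the ratings are all hypothetical and exist only to show the shape of the idea.

```python
from dataclasses import dataclass, field


@dataclass
class Landmark:
    name: str
    zone: str
    # Lens name -> colour rating ("green", "amber" or "red"). Invented values.
    ratings: dict = field(default_factory=dict)


LANDMARKS = [
    Landmark("Account Balance", "Accounts", {"risk": "red", "cost": "amber"}),
    Landmark("Account Holder", "Customer", {"risk": "green"}),
    Landmark("Ledger Entries", "Financial Transactions",
             {"risk": "amber", "cost": "green"}),
]


def zoom(landmarks, zone):
    """Zoom: keep only the landmarks that sit inside one data zone."""
    return [lm for lm in landmarks if lm.zone == zone]


def apply_lens(landmarks, lens):
    """Filter: show the same landmarks through one lens, such as risk or cost."""
    return {lm.name: lm.ratings[lens] for lm in landmarks if lens in lm.ratings}


print([lm.name for lm in zoom(LANDMARKS, "Customer")])
print(apply_lens(LANDMARKS, "cost"))
```

The landmarks themselves never change; only the selection applied to them does, which is what lets different readers look at the same map through different lenses.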

To make this concrete, consider the example below — a Data Map for something every financial institution deals with daily: calculating an account balance.

[Figure: Data Map for calculating an account balance]

This Data Map is laid over a city-map background — deliberately. It uses the visual language everyone already knows. At the centre sits the Account Balance, the landmark that everything else relates to: the balance as of the local day end, rendered as a prominent icon the way a central station or town hall would appear on a city map. Around it, four data zones are placed according to their functional proximity. The Customer zone — the party associated with the account — sits close by, connected through the account holder relationship, drawn as a directional path the way a main road connects a residential district to a commercial centre. The Financial Transactions zone — credits, debits, adjustments — sits on the other side, linked by the path that derives the balance from transaction history. Further out, the End of Day cutoff point and the Account Region Timezone occupy their own positions, connected to the centre through policy-driven paths: the daily snapshot cutoff and the timezone that determines when “end of day” actually occurs for each account.

Now look at the layers. Dashed borders in different colours demarcate responsibility zones, just as political and administrative boundaries do on a geographic map. A green border encloses Customer Responsibility — the domain where customer data is owned and governed. An orange border encloses Ledger Responsibility — the domain accountable for transaction integrity. A blue border traces the Time and Regional Policy layer, cutting across both domains because timezone and cutoff rules affect everyone but are owned by neither the customer team nor the ledger team alone. This is precisely the kind of cross-cutting dependency that traditional architecture diagrams bury in footnotes or miss entirely.
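The cross-cutting nature of the policy layer becomes explicit once layers are written down as sets of zones. The following Python sketch assumes a particular membership for each layer; the exact zones enclosed by each border are an assumption made for illustration, as is the function name `layers_covering`.

```python
# Each responsibility layer is the set of zones its border encloses.
# Membership is assumed here for illustration only.
LAYERS = {
    "Customer Responsibility": {"Customer"},
    "Ledger Responsibility": {"Financial Transactions"},
    "Time and Regional Policy": {
        "Customer",
        "Financial Transactions",
        "End of Day Cutoff",
        "Account Region Timezone",
    },
}


def layers_covering(zone):
    """Return the names of every layer whose border encloses the given zone."""
    return sorted(name for name, zones in LAYERS.items() if zone in zones)


print(layers_covering("Customer"))
```

Any zone that appears under more than one layer is a place where two teams need to agree before a change is safe.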

At the bottom of the map, warning icons flag the high-risk conditions that this particular data zone is prone to: late transactions arriving after the cutoff, incorrect day-end cutoff calculations, and timezone misalignment between account regions. These are the equivalent of roadworks and hazard warnings on a transport map — signals that something in this area needs attention, visible to anyone reading the map regardless of their technical background.

What does this Data Map reveal that a traditional entity-relationship diagram or a data dictionary cannot? Three things. First, the spatial relationships between data zones — you can see at a glance that the account balance is the gravitational centre, and that everything from customer identity to regional policy converges on it. Second, the responsibility boundaries — the overlapping dashed borders make immediately visible that calculating a correct balance requires coordination across at least three organisational domains, each with its own governance. A schema will tell you the tables involved. A lineage diagram will show you the pipeline. Neither will show you that the customer team, the ledger team, and the regional policy team must all be aligned for this single number to be trustworthy. Third, the risks are situated — they are not in a separate register or a disconnected dashboard, but placed directly on the map where they occur, in the context of the data zones and responsibility boundaries they affect.

This matters now more than it ever has, because AI is not solving the data problem — it is amplifying it. In the age of AI, the final data product is never final. The output of a customer support interaction — a complaint logged as text and voice — becomes the input of analytical processes identifying trends and anomalies, which in turn become the input of a chatbot handling the next enquiry from that same customer. Boundaries between internal and external data sources are fading within company systems. Boundaries between structured and unstructured data are disappearing as large language models are adopted into business processes. Autonomous decisioning and AI-driven feedback loops mean that teams must understand the impact of their work on the overall system — not just on their own application or their own pipeline. Without a shared language to describe how data moves through this increasingly interconnected landscape, investment decisions, risk management, and regulatory compliance are based on out-of-context, subjective interpretation presented in slide decks that have no connection to architectural or engineering reality. Data Cartography provides that shared language.

Return now to the budget meeting from the opening of this article. Imagine the COO, the CTO, and the Head of Risk are looking at the Account Balance Data Map together. The COO’s customer onboarding programme touches the Customer zone — the green boundary on the left. The CTO’s end-of-life replacement programme affects the systems that calculate the daily snapshot cutoff — the blue boundary at the top. The risk remediation demanded by the regulator targets late transactions and timezone misalignment — the warning icons at the bottom of the map. On separate slide decks, these three initiatives look independent. On the Data Map, they converge. All three are changing systems that feed directly into the Account Balance at the centre. If the onboarding programme alters how customer records are linked to accounts at the same time as the cutoff calculation is being migrated to a new platform, while the risk team is tightening controls on late transactions, the result is three concurrent changes hitting the most critical point on the map simultaneously.

Now add a layer. Switch on the technology lifecycle filter, and the map shows that the application currently handling the end-of-day cutoff is eighteen months past its vendor support date. The infrastructure underneath the Financial Transactions zone is flagged amber — not yet end of life, but approaching it. Suddenly the conversation in the room changes. The CTO’s replacement programme is not just a modernisation exercise — it is the precondition for the other two initiatives to land safely. The COO can see that her onboarding changes need to be sequenced after the cutoff system is stabilised, not before. The Head of Risk can see that the remediation she is planning depends on infrastructure that is itself a risk. The investment priorities have not changed, but the sequence and the dependencies are now visible to everyone — on the same map, in the same room, in a language that does not require an architecture degree to read.

This is what over-investment and under-investment look like when you can finally see them. A cluster of project icons piling up in one corner of the map — three, four, five concurrent changes in a single data zone — while other zones sit empty of investment but marked with ageing infrastructure and rising risk indicators. Without the map, each project sponsor believes their initiative is well-scoped and properly resourced. With the map, the executives can see that they are overloading one part of the system and neglecting another. They can ask the question that no slide deck has ever answered for them: are we investing in the right place, in the right order, and is the system strong enough to absorb these changes at the pace we are planning?
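The same observation can be made mechanically by counting how many initiatives touch each zone. The Python sketch below is a simplified illustration: the project-to-zone assignments loosely follow the budget meeting example, while the "Products" zone, the threshold of three, and the function names are assumptions added for the example.

```python
from collections import Counter

# Which zones each initiative changes. Assignments are illustrative.
PROJECTS = [
    ("Customer onboarding", ["Customer", "Account Balance"]),
    ("End-of-life replacement", ["End of Day Cutoff", "Account Balance"]),
    ("Risk remediation",
     ["Financial Transactions", "Account Region Timezone", "Account Balance"]),
]

# "Products" is a hypothetical zone that no current initiative touches.
ALL_ZONES = [
    "Account Balance", "Customer", "Financial Transactions",
    "End of Day Cutoff", "Account Region Timezone", "Products",
]


def change_load(projects):
    """Count the concurrent initiatives landing on each zone."""
    return Counter(zone for _, zones in projects for zone in zones)


def hotspots(projects, threshold=3):
    """Zones absorbing at least `threshold` concurrent changes."""
    load = change_load(projects)
    return sorted(zone for zone, count in load.items() if count >= threshold)


def untouched(projects, all_zones):
    """Zones receiving no investment at all."""
    load = change_load(projects)
    return [zone for zone in all_zones if load[zone] == 0]


print(hotspots(PROJECTS))
print(untouched(PROJECTS, ALL_ZONES))
```

A count like this says nothing about whether a zone can absorb the load; that judgement still depends on the lifecycle and risk indicators shown on the map.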

A Data Map does not require new technology. Most commercial diagramming tools support the visual language needed to get started. The Resource Description Framework (RDF), an established and widely supported standard for data representation, can be used to build and manage the underlying structure of a map, and the node-and-edge structure that drawing tools already work with translates naturally into RDF's subject-predicate-object statements. The first step is straightforward: choose a single, well-understood data domain (customer data, transaction processing, or account balances) and map it. Identify the data zones. Place the landmarks. Draw the paths. Apply a single layer of organisational responsibility. You will find that the act of mapping reveals questions about your data landscape that years of inventories and lineage diagrams never surfaced.
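To give a sense of what an RDF-style structure for the account balance map might look like, the Python sketch below records the map as subject-predicate-object statements using plain tuples. It is not real RDF syntax and does not use an RDF library; the `ex:` names, the predicates, and the `match` helper are all hypothetical and chosen only for illustration.

```python
# Each statement is (subject, predicate, object), the shape RDF statements take.
# All identifiers are invented for this example.
TRIPLES = [
    ("ex:AccountBalance", "ex:derivedFrom", "ex:FinancialTransactions"),
    ("ex:AccountBalance", "ex:heldBy", "ex:Customer"),
    ("ex:AccountBalance", "ex:snapshotAt", "ex:EndOfDayCutoff"),
    ("ex:AccountBalance", "ex:timezoneFrom", "ex:AccountRegionTimezone"),
    ("ex:Customer", "ex:ownedBy", "ex:CustomerResponsibility"),
    ("ex:FinancialTransactions", "ex:ownedBy", "ex:LedgerResponsibility"),
]


def match(triples, s=None, p=None, o=None):
    """Return the statements matching the given parts; None matches anything."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Everything the Account Balance landmark is connected to.
for _, predicate, target in match(TRIPLES, s="ex:AccountBalance"):
    print(predicate, target)
```

Zones, landmarks, paths, and responsibility layers all fit this one shape, which is why a small set of statements is enough to start a first map.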

One important caveat. Data Cartography provides a shared visual language for navigating complexity, but it does not resolve the deeper problem of semantics and meaning. What a “customer” means to the onboarding team, what it means to the ledger team, and what it means to the regulator may be three different things — and those differences are still a significant impediment to good communication and the safe and correct use of data across silos in large organisations. The map can show you that these teams share a data zone and that their responsibilities overlap, but it cannot, by itself, reconcile the definitions they are each working with. That remains a challenge that data governance, shared ontologies, and sustained cross-functional dialogue must address. Data Cartography makes the need for that dialogue visible. It does not replace it.

The maps you already carry in your head — the informal, unwritten understanding of how your systems fit together — are the foundation. Data Cartography gives them a grammar, a structure, and a shared surface where everyone in the room can finally point to the same place and say: “Here. This is where we need to invest first — and here is what needs to wait.”

Simone Steel CITP FBCS is a senior technology leader in financial services with four decades of experience in systems architecture, data management, and digital transformation. Data Cartography is her original framework for navigating enterprise data landscapes.

The author acknowledges the pioneering work of Simon Wardley, whose Wardley Maps demonstrated that mapping as a practice — not just a metaphor — can transform strategic decision-making in technology. Data Cartography builds on a different foundation and serves a different purpose, but it shares with Wardley’s work the conviction that visual, spatial reasoning is essential for navigating complexity.

linkedin.com/in/simone-t-steel