Steven Hart

Knowledge Architect

Designing an enterprise knowledge graph for a private equity intelligence platform

PEI Group publishes intelligence for the private equity and infrastructure investment industry. Its platform holds rich data across fund records, investor profiles, deal histories, editorial content, people, and events, but that data is stored in isolated silos with no formal relationships between entities.

I created a knowledge graph proof of concept in Neo4j to demonstrate what a formally connected data layer could do for the product, and to make the case for a new direction.

Skills this work demonstrated

I created a knowledge graph from scratch to model the infrastructure private equity asset class.

The schema was built around a deep understanding of the data, gained from desk analysis and meetings with subject matter experts within PEI, combined with insights into how customers wanted to use the data, gleaned from participating in user research interviews.

This case study demonstrates:

Domain translation at depth: A complex, relationship-dense commercial domain formally modelled as a queryable knowledge graph.

Ontology design practice: Entity resolution, controlled vocabularies, relationship directionality, and property design applied to a real production-scale dataset.

Business-technical bridging: Stakeholder communication from data team to design leadership to CPO, with the model as the connective artefact.

Design thinking applied to knowledge modelling: Jobs To Be Done methodology used to identify the right entities and relationships, treating ontology design as a user research problem, not a data engineering problem.

The domain challenge: translating private equity into a knowledge model

Private equity infrastructure investment is a relationship-dense field with specific logic. For example, organisations declare strategic intent (the sectors, geographies, and fund strategies they will pursue), and that intent needs to be traceable to actual fund commitments, the managers running those funds, the portfolio companies held within them, and the editorial signals that provide market context.

The core architectural problem was that PEI's existing platform stored entities without relationships. Funds existed in the fund database. Investors existed in the CRM. Articles existed in the CMS. The connections between them were either manual, duplicated, or missing entirely. The platform was built for publishing content, not analysing it.

The primary design challenge was formalising these relationships: deciding which entities mattered, how they connected, which properties carried analytical value, and how to resolve the same real-world entity appearing under different names across different source systems.

This is less a data engineering problem than a knowledge modelling problem, one that requires deep domain understanding and structured thinking about meaning.

Ontology and data model design

I designed a property graph ontology structured around the objects and relationships of the private equity domain. The model comprises eight core entity types:

  • Fund

  • InvestmentManager

  • Investor

  • Deal

  • PortfolioCompany

  • Person

  • Event

  • Article

The model also included taxonomy nodes to help tie objects to the mental model that most key users have of the market:

  • Strategy

  • Sector

  • Region

  • Appetite

  • Intention

Relationships were named, quantified and given direction, enabling multi-hop traversals across the full ecosystem.
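
To illustrate what naming a relationship and giving it a direction means in practice, here is a minimal Python sketch. It is not the prototype's schema: the relationship names (`MANAGES`, `COMMITTED_TO` and so on) and the example nodes are assumptions chosen for illustration, and the prototype itself was built in Neo4j rather than in Python.

```python
from dataclasses import dataclass

# Hypothetical relationship types: name -> (source label, target label).
# These names are illustrative assumptions, not the prototype's actual schema.
RELATIONSHIP_SCHEMA = {
    "MANAGES": ("InvestmentManager", "Fund"),
    "COMMITTED_TO": ("Investor", "Fund"),
    "INVESTED_IN": ("Fund", "PortfolioCompany"),
    "FOLLOWS_STRATEGY": ("Fund", "Strategy"),
    "MENTIONS": ("Article", "InvestmentManager"),
}


@dataclass(frozen=True)
class Node:
    label: str  # entity type, e.g. "Fund"
    name: str   # display name of the entity


@dataclass(frozen=True)
class Relationship:
    source: Node
    rel_type: str
    target: Node


def is_valid(rel: Relationship) -> bool:
    """Check that a relationship's type and direction match the schema."""
    expected = RELATIONSHIP_SCHEMA.get(rel.rel_type)
    return expected == (rel.source.label, rel.target.label)


# Invented example nodes.
manager = Node("InvestmentManager", "Example Capital")
fund = Node("Fund", "Example Infrastructure Fund I")

forward = Relationship(manager, "MANAGES", fund)   # matches the schema
reverse = Relationship(fund, "MANAGES", manager)   # wrong direction
```

Here `is_valid(forward)` is true and `is_valid(reverse)` is false: the same two nodes and the same relationship name are rejected when the direction is reversed.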

Some key modelling decisions

Modelling the domain brought me into contact with some fundamental questions about how the business operates and understands itself. I worked through them by combining direct conversations with willing SMEs, extrapolation from user research sessions, and desk research. Examples of the modelling decisions made include:

Entity resolution: Identifying when different references across editorial, CRM, and database sources described the same real-world entity (a non-trivial problem at scale).

Controlled vocabularies: Applying consistent taxonomies across varied and inconsistent source content to enable reliable querying.

Relationship directionality: Deciding which direction each relationship should traverse to support the most valuable query patterns.

Property design: Determining which attributes belonged on nodes versus relationships, and which carried analytical versus display value.

Scalability: Ensuring the model could evolve to accommodate new entity types and use cases without structural rework.
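
The entity resolution decision listed above can be illustrated with a deliberately simple Python sketch. It assumes that name normalisation alone is enough to match records, which would rarely hold at scale, where aliases, identifiers and manual review are likely to be needed. The source system names, the firm names and the suffix list are all invented for the example, and this is not a description of the method used in the prototype.

```python
import re
from collections import defaultdict

# Legal suffixes to ignore when comparing names (illustrative, incomplete).
LEGAL_SUFFIXES = {"llc", "llp", "lp", "ltd", "limited", "inc", "plc"}


def normalise(name: str) -> str:
    """Lowercase, strip punctuation and drop trailing legal suffixes."""
    # Remove full stops first so "L.L.P." collapses to "llp".
    cleaned = name.lower().replace(".", "")
    tokens = re.sub(r"[^a-z0-9 ]", " ", cleaned).split()
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def resolve(records):
    """Group (source_system, name) records that share a normalised key."""
    groups = defaultdict(list)
    for source, name in records:
        groups[normalise(name)].append((source, name))
    return dict(groups)


# Invented records from three hypothetical source systems.
records = [
    ("crm", "Example Capital Partners LLP"),
    ("cms", "Example Capital Partners"),
    ("fund_db", "EXAMPLE CAPITAL PARTNERS, L.L.P."),
    ("crm", "Other Infrastructure Ltd"),
]
resolved = resolve(records)
```

The three spellings of the first firm collapse to a single key, so each group could become one node in the graph, with the original source records kept as provenance.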

A new article page, with knowledge graph-driven network intelligence modules to either side, is now a window onto the whole data ecosystem, not just a container for static content.

Understanding the business: from domain knowledge to structured model

The most demanding part of the work was the 'translation layer': mapping the domain, understanding how private equity professionals actually think and work, and encoding that understanding into a formal model which enabled both a realistic technical implementation path and a product interface that users would understand and want to use.

A key insight was that PEI's users do not merely browse content. They execute investment research strategies. Their mental model is: 'show me everything relevant to Infrastructure investment in Europe and Asia-Pacific — who is active, who is moving, what is being written about it.' The existing CMS-driven navigation (News, Awards, Events) was organised around content production, not investment intent.

This insight became the design anchor for the new model.

I identified Strategy, Sector, and Region (SSR) as the primary dimensions through which users experience the platform and its content or data, and I designed the graph to support that query pattern natively.
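
A Strategy, Sector, and Region selection can be treated as an intersection over taxonomy nodes. The sketch below shows the idea in plain Python. The edge names (`HAS_STRATEGY`, `HAS_SECTOR`, `HAS_REGION`), the node identifiers and the function `market_focus` are hypothetical, and in the prototype this kind of lookup would have been a graph query rather than in-memory Python.

```python
from collections import defaultdict

# Edges as (source, relationship, target) triples. All names are hypothetical.
EDGES = [
    ("Fund:Example Infra Fund I", "HAS_STRATEGY", "Strategy:Infrastructure"),
    ("Fund:Example Infra Fund I", "HAS_SECTOR", "Sector:Energy"),
    ("Fund:Example Infra Fund I", "HAS_REGION", "Region:Europe"),
    ("Article:Example APAC transport story", "HAS_STRATEGY", "Strategy:Infrastructure"),
    ("Article:Example APAC transport story", "HAS_SECTOR", "Sector:Transport"),
    ("Article:Example APAC transport story", "HAS_REGION", "Region:Asia-Pacific"),
    ("Event:Example buyout forum", "HAS_STRATEGY", "Strategy:Buyout"),
    ("Event:Example buyout forum", "HAS_REGION", "Region:Europe"),
]


def build_index(edges):
    """Map each taxonomy node to the set of nodes that point at it."""
    index = defaultdict(set)
    for source, _rel, target in edges:
        index[target].add(source)
    return index


def market_focus(index, *taxonomy_nodes):
    """Return nodes linked to every selected taxonomy node (set intersection)."""
    matches = [index.get(node, set()) for node in taxonomy_nodes]
    return sorted(set.intersection(*matches)) if matches else []


INDEX = build_index(EDGES)
europe_infra = market_focus(INDEX, "Strategy:Infrastructure", "Region:Europe")
```

Selecting Infrastructure and Europe returns only the fund, while selecting Infrastructure alone returns both the fund and the article, regardless of which silo each originally lived in.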

The network intelligence for an Infrastructure strategy (above) and the interface tool the graph enables in the product (below). This is just one tool that lets users traverse the graph meaningfully.

What the graph enables

The property graph model enables query patterns that were previously manual, time-intensive, or simply not feasible.

Multi-hop relationship traversal

Starting from a named investment strategy, a rich network of connected objects can be traced and queried in a single operation.
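
The prototype's Cypher queries are not reproduced here. As a stand-in, the following Python sketch shows the general shape of a multi-hop traversal: a breadth-first search outward from a strategy node, treating relationships as traversable in either direction. The relationship names, node identifiers and the function `within_hops` are assumptions made for the example.

```python
from collections import defaultdict, deque

# Edges as (source, relationship, target) triples. All names are hypothetical.
EDGES = [
    ("Fund:Example Infra Fund I", "FOLLOWS_STRATEGY", "Strategy:Infrastructure"),
    ("Manager:Example Capital", "MANAGES", "Fund:Example Infra Fund I"),
    ("Investor:Example Pension Plan", "COMMITTED_TO", "Fund:Example Infra Fund I"),
    ("Person:Jane Example", "WORKS_AT", "Manager:Example Capital"),
    ("Article:Example fundraising story", "MENTIONS", "Manager:Example Capital"),
    ("Fund:Example Buyout Fund II", "FOLLOWS_STRATEGY", "Strategy:Buyout"),
]


def neighbours(edges):
    """Build an adjacency map that ignores direction for traversal."""
    adjacency = defaultdict(set)
    for source, _rel, target in edges:
        adjacency[source].add(target)
        adjacency[target].add(source)
    return adjacency


def within_hops(edges, start, max_hops):
    """Breadth-first search returning {node: hop distance} within max_hops."""
    adjacency = neighbours(edges)
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in adjacency[node]:
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)
    del distance[start]  # report only the connected nodes
    return distance


network = within_hops(EDGES, "Strategy:Infrastructure", 3)
```

Within three hops of the Infrastructure strategy, the search reaches the fund, its manager and investor, and then the person and article connected to that manager. The buyout fund is never reached because nothing links it to the starting node.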

Strategy drift analysis

The graph enables direct comparison of declared investment focus against actual portfolio behaviour, surfacing signals that would require significant manual analysis in the old model.
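
One way to picture this comparison is as a set difference between the sectors a fund declares and the sectors of the companies it actually holds. The Python sketch below assumes that simple framing. The fund, its holdings, the sector names and the `off_strategy_share` measure are invented for illustration and are not taken from the prototype.

```python
# Hypothetical data: sectors a fund declares versus sectors of its holdings.
DECLARED_SECTORS = {
    "Example Infra Fund I": {"Energy", "Transport"},
}
HOLDINGS = {
    "Example Infra Fund I": [
        ("Example Wind Co", "Energy"),
        ("Example Data Centres", "Digital"),
        ("Example Fibre Co", "Digital"),
    ],
}


def strategy_drift(fund):
    """Compare a fund's declared sectors with the sectors it actually holds."""
    declared = DECLARED_SECTORS[fund]
    holdings = HOLDINGS[fund]
    actual = {sector for _company, sector in holdings}
    off_strategy = [company for company, sector in holdings if sector not in declared]
    return {
        # Sectors held but never declared.
        "undeclared_sectors": sorted(actual - declared),
        # Sectors declared but not represented in the portfolio.
        "unused_sectors": sorted(declared - actual),
        # Fraction of holdings that fall outside the declared sectors.
        "off_strategy_share": len(off_strategy) / len(holdings),
    }


drift = strategy_drift("Example Infra Fund I")
```

In this invented case the fund declares Energy and Transport but holds two Digital companies out of three, so Digital appears as an undeclared sector and Transport as a declared sector with no holdings.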

Investor intention mapping

Knowing which investors have declared intent to invest in which strategies, and being able to trace those intentions to the people, events, and articles connected to them, enables a class of product intelligence that flat database queries cannot support.

From model to product capability

The graph was not designed as a backend model in isolation. It was designed to drive product features directly. The architecture actively enables:

Network intelligence modules: Article and profile pages can surface contextually relevant connections across the ecosystem — investors affected by a topic, funds managed by a mentioned firm, events attended by key people.

Market Focus widget: A single interface element allowing users to select Strategy, Sector, and Region and instantly surface all relevant content, data, and signals — driven by a simple graph query.

Dashboards by investment profile: Market overviews anchored to investor or fund manager context, rather than generic content aggregation.

Agentic AI readiness: The graph provides the structured business context that future AI agents will need to answer complex questions about the private equity market — the foundational capability layer.

All working files, including Cypher queries, import scripts and design documentation, are available in a GitHub repository. Please contact me for access.

Outcomes and reception

The prototype was built in Neo4j Aura with 800+ nodes, 1,500+ relationships, and synthetic data covering the infrastructure investment ecosystem. It included demo Cypher queries, a comprehensive written pitch, and interactive D3.js visualisations.

The work was reviewed and supported by senior leadership. The Head of Data & AI called it "a kick up the arse" and "exactly the direction" the company needed to go in.

The Head of Design called it groundbreaking. Both responses confirmed what the work set out to demonstrate: that restructuring information at the data layer changes what a product can do.
