Identity as a Big Data Problem
- Simon Moffatt
- May 9, 2025
- Identity Security Posture Management
- 5 MIN READ

Identity as a Lifecycle
- Creation and Assurance
- Credentials and Authentication
- Access Change & Request
- Access Removal
This is the first post of a three-part series looking at the emerging yet critical role of data-centricity in identity security and identity security posture management.
Let us first consider that our identity and access management (IAM) landscape does not exist in a vacuum, especially in the business-to-employee (B2E) workforce ecosystem. Identities enter through an authoritative source, with verification and validation stages increasingly used to provide a level of identity assurance (IA). From there the interesting dynamics of B2E identity take hold: credential issuance and management, job changes, role and permission associations, and more contextual, often weekly, changes associated with projects, tasks and functions. All of these require subtle changes to assurance levels, to permissions, and to how authentication and authorization decisions are enforced within downstream systems.
This entire identity lifecycle is in constant flux too, with an ever-increasing number of systems needing integration and a growing number and variety of identities under management. These sub-lifecycles have historically been managed with isolated tools and processes, often resulting in disconnection and duplication of core IAM functions.
By breaking down our IAM functions into a lifecycle we can start to analyze, and in turn understand, where our issues potentially lie with respect to security, productivity and risk. At each stage of a B2E journey there are numerous touch points, management planes and success factors to consider when designing future IAM service integration.
Data as a Lifecycle
- Ingest
- Correlate
- Normalize
A large change in our understanding of this IAM landscape comes when we consider that each part of the IAM lifecycle is both a data consumer and a data generator. Failures in parts of this lifecycle often stem from a lack of context: too few signals are consumed, which leads to poor decisions about IAM risk and security. Many identity governance and administration (IGA) projects, for example, are hampered during both the access request and access review stages by exactly this lack of context, with bad security outcomes as a result.
However, other parts of the IAM lifecycle need to be considered too: onboarding, identity verification and validation (IDV), privileged access management (PAM), authorization policy decision points (PDP) and policy enforcement points (PEP), identity providers (IDP), and orthogonal areas such as security information and event management (SIEM) and cloud infrastructure entitlements management (CIEM).
Many of these resource-centric pillars lack context and detailed information about the other parts of the access path an identity may be part of.
But creating a data-centric fabric or mesh across these often disparate and isolated systems requires several tenets of data management and governance:
- Firstly, the ability to ingest data from a variety of sources such as databases, traditional LDAP directories and typical REST/JSON-based APIs. Standards such as SCIM help here, but the consumption of identity data, authoritative sources and activity data should be simple and repeatable. Ingestion should also cover information from ticketing systems and configuration management databases (CMDB).
- Secondly, that data needs to be correlated using a variety of out-of-the-box methods: linking different naming standards and username formats, and often dynamically creating and linking identifiers. From that point, normalization of data signals can occur, providing a unified view of identity attribute data as well as usage and system-related activity.
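To make the ingest, correlate and normalize steps a little more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions rather than a description of any product: the `Identity` model, the `correlation_key` rule (strip a `DOMAIN\` prefix or `@domain` suffix and lowercase the rest) and the sample records are all hypothetical. Only the attribute names `uid`/`cn` (LDAP) and `userName`/`displayName` (SCIM) follow those standards.

```python
from dataclasses import dataclass, field


@dataclass
class Identity:
    """Unified view of one person across source systems (hypothetical model)."""
    key: str
    display_name: str = ""
    accounts: dict = field(default_factory=dict)  # source name -> raw username


def correlation_key(username: str) -> str:
    """Reduce differing username formats to one comparable key.

    Assumed convention, for illustration only: drop a DOMAIN\\ prefix
    or an @domain suffix, then lowercase what is left.
    """
    name = username.strip()
    if "\\" in name:
        name = name.split("\\", 1)[1]
    if "@" in name:
        name = name.split("@", 1)[0]
    return name.lower()


def ingest(sources: dict) -> list:
    """Flatten records from every source into (source, record) pairs."""
    return [(src, rec) for src, records in sources.items() for rec in records]


def correlate_and_normalize(pairs: list) -> dict:
    """Link accounts that share a correlation key and unify their attributes."""
    identities = {}
    for source, record in pairs:
        # Each source names the login attribute differently; try each in turn.
        raw = record.get("uid") or record.get("userName") or record.get("login")
        if not raw:
            continue  # nothing to correlate on
        key = correlation_key(raw)
        ident = identities.setdefault(key, Identity(key=key))
        ident.accounts[source] = raw
        name = record.get("cn") or record.get("displayName")
        if name and not ident.display_name:
            ident.display_name = name.strip()
    return identities


# Hypothetical sample data standing in for LDAP, SCIM and an HR database.
sources = {
    "ldap": [{"uid": "jsmith", "cn": "Jane Smith"}],
    "scim": [{"userName": "JSmith@example.com", "displayName": "Jane Smith"}],
    "hr_db": [{"login": "CORP\\jsmith"}, {"login": "CORP\\bjones"}],
}

identities = correlate_and_normalize(ingest(sources))
for ident in identities.values():
    print(ident.key, sorted(ident.accounts))
```

In practice the correlation rule is the hard part: real environments usually need several matching strategies (employee number, email, fuzzy name matching) rather than the single string rule assumed here.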
IAM Data Concerns
- Lack of Integration
- Lack of Visibility
- Rise in Identity Sprawl
- Poor Identity Hygiene
Many organizations struggle to create an IAM-specific, data-centric view. A lack of integration is often the first concern, with only a subset of high-risk systems typically integrated into IDP or IGA platforms. This lack of integration also extends to systems that can provide significant value to the IAM chain, such as a CMDB for linking application and IP data to account activity logs, or IT service management (ITSM) ticketing systems for capturing context about access request metrics.
As our identity ecosystem continues to grow, both in deployment types across on-premises and cloud and in the variety and number of identities, this identity sprawl creates blind spots for visibility as well as for governance and management, simply due to a lack of connectivity.
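As a rough illustration of the CMDB point, the sketch below enriches account activity events with the application that owns the destination IP address. The CMDB rows, the event fields (`account`, `dest_ip`, `action`) and the helper names are assumptions made for this example; real CMDB exports and log schemas will differ.

```python
import ipaddress

# Hypothetical CMDB export: network range -> owning application.
cmdb = [
    {"cidr": "10.1.0.0/24", "application": "payroll"},
    {"cidr": "10.2.0.0/24", "application": "crm"},
]

# Hypothetical account activity log entries.
activity = [
    {"account": "jsmith", "dest_ip": "10.1.0.15", "action": "login"},
    {"account": "bjones", "dest_ip": "10.2.0.7", "action": "login"},
    {"account": "jsmith", "dest_ip": "192.168.5.9", "action": "login"},
]


def application_for(ip: str, cmdb_rows: list):
    """Return the application whose CMDB range contains ip, or None."""
    addr = ipaddress.ip_address(ip)
    for row in cmdb_rows:
        if addr in ipaddress.ip_network(row["cidr"]):
            return row["application"]
    return None


def enrich(events: list, cmdb_rows: list) -> list:
    """Copy each event and attach the matching application as context."""
    return [
        {**event, "application": application_for(event["dest_ip"], cmdb_rows)}
        for event in events
    ]


enriched = enrich(activity, cmdb)
for event in enriched:
    print(event["account"], event["dest_ip"], event["application"])
```

Events that map to no known application (the third one here) are themselves a useful signal: they point at systems the CMDB, and therefore the IAM view, does not yet cover.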
The end result is bad hygiene across many parts of the identity journey, from the more obvious redundant and orphaned accounts to excessive permissions, underused roles and policies, and missing access request workflows. This has a considerable impact on a range of stakeholders across the business, each with different metrics and success factors. Productivity suffers because poor access request and review processes delay end-user onboarding and permissions fulfilment. Wasted effort is amplified at multiple levels, from help desk operators completing access request tickets through to line managers engaged in application access reviews.
From a security and risk perspective, numerous issues start to manifest, chiefly an inability to understand the control assurance posture for access removal, right-sized permission association and access request approval. The end result is that the wrong identities have the wrong permissions at the wrong time.
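Once identity and activity data sit in one place, some of these hygiene checks reduce to simple set operations. The sketch below flags orphaned accounts (the owner is no longer an active person) and granted-but-unused permissions. The data shapes, the permission strings and the function names are hypothetical, chosen only to illustrate the idea.

```python
# Hypothetical inputs: people known to the authoritative source, accounts
# with their granted permissions, and permissions actually seen in use.
active_people = {"jsmith", "bjones"}

accounts = [
    {"account": "jsmith", "owner": "jsmith",
     "permissions": {"payroll:read", "payroll:approve"}},
    {"account": "bjones", "owner": "bjones", "permissions": {"crm:read"}},
    {"account": "svc-old", "owner": "kdoe", "permissions": {"crm:admin"}},
]

used = {"jsmith": {"payroll:read"}, "bjones": {"crm:read"}}


def orphaned_accounts(accounts: list, active_people: set) -> list:
    """Accounts whose owner is not in the authoritative source."""
    return sorted(a["account"] for a in accounts if a["owner"] not in active_people)


def unused_permissions(accounts: list, used: dict) -> dict:
    """Granted permissions with no observed use, per account."""
    findings = {}
    for a in accounts:
        idle = a["permissions"] - used.get(a["account"], set())
        if idle:
            findings[a["account"]] = sorted(idle)
    return findings


print(orphaned_accounts(accounts, active_people))
print(unused_permissions(accounts, used))
```

The checks are only as good as the activity window behind `used`: a permission exercised once a year for an audit would look idle in a 90-day sample, so findings like these are candidates for review rather than automatic removal.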
A Path Ahead
- Leverage Existing Data Sources
- Integrate and Overlay
- Benefits
A move towards identity data-centricity is emerging and can provide significant benefits to organizations of different sizes and in different sectors. It is important to understand that existing data sources play a vital role in this. The core IAM systems clearly hold a vast amount of valuable data to help with attribute triangulation and assurance, but other orthogonal data sources play a role too. By drawing on application, network and endpoint configuration data, the IAM landscape can be enriched with more signals that provide context, identify redundancy and support more informed decisions about “who should have access to what” and “who is accessing what”.
Many parts of the identity lifecycle have an important role to play – including existing IDP, IGA, PAM and authorization tools. They are unlikely to be replaced readily and often have a long lifespan. These tools should be leveraged and augmented with a more holistic approach to identity data capture, analytics and in turn outbound processing.
Resource-centric products such as IGA or PAM often have quite tightly coupled features and integrations. Do not necessarily look to replace them; instead, overlay them to gain increased value from a conjoined view of profile and activity data.
As identity moves to become a strategic enabler and not just an IT and operational stalwart, the ability to provide improved data analytics and insight at this level is crucial. Taking existing data management capabilities and applying them to the IAM landscape can deliver significant value to an ever-increasing array of stakeholders. The rise in identity variety and volume is a catalyst for generating vast amounts of actionable insight, and “big data”-style concepts are needed to handle it.
The next article in this series will move our thinking ahead and examine the strategic components of Identity Security Posture Management (ISPM). Why has it emerged, what problems does it look to solve, and how can a data-centric approach support and expand ISPM's core capabilities to deliver a more preventative, pre-breach approach to security?