In my last blogpost (was it really more than two months ago—I plead CEO-overload!), we looked at how SQL and data integration are essential to the development of a truly useful customer profile. At the end, I promised to step through the process of nurturing relationships, where we guide prospects and customers through each stage, sharing and collecting information in a step-wise cadence. So here goes—and note that I’m using the vocabulary and categorizations from Salesforce, one of the main customer relationship management apps on the market:
- First, a set of information is collected from an interested party—also known as a lead—and further information is sent to match the needs of that lead.
- After that, the lead is qualified as a prospect, and the sales rep conducts further qualification discussions to move that prospect to the next stage of the pipeline.
- At this point, enough information is known on the needs of the prospect to determine if an opportunity for a sale exists. If yes, the sales rep takes the final qualification step by negotiating the terms of a deal.
- When (and if) a deal is struck, that opportunity becomes a customer.
What we can see in this nurturing process, as in most business processes or complex transactions, is that the whole operation is built around a series of steps, or a business workflow. At each step, specific information is gathered and you move to the next steps only when the information requirement of the current step is fulfilled, as we see below:
What I am describing here is obvious at the business level—or “conceptual level” in the parlance of the data-modeling world. However, when it comes to the details of low-level implementation at the data structure or database level, things are not so cleanly delineated and as a result, currently deployed solutions are far from optimal. So let’s revisit this pattern as it applies to the integration of a user profile at the level of SQL.
read more →
Last week, we took a look at the challenges faced by “traditional IAM” vendors as they try to move into the customer identity space. Such vendors offer web access management and federation packages that are optimized for LDAP/AD and aimed at employees. Now we should contrast that with the new players in this realm and explore how they’re shaping the debate—and growing the market.
Beyond Security with the New IAM Contenders: Leveraging Registration to Build a More Complete Customer Profile
So let’s review the value proposition of the two companies that have brought us this new focus on customer identity: Gigya and Janrain. For these newcomers, the value is not only about delivering security for access or a better user experience through registration. They’re also aimed at leveraging that registration process to collect data for a complete customer profile, moving from a narrow security focus to a broader marketing/sales focus—and this has some consequences for the identity infrastructure and services needed to support these kind of operations.
For these new contenders, security is a starting point to serve better customer knowledge, more complete profiles, and the entire marketing and sales lifecycle. So in their case it is not only about accessing or recording customer identities, it’s about integrating and interfacing this information into the rest of the marketing value chain, using applications such as Marketo and others to build a complete profile. So one of the key values here is about collecting and integrating customer identity data with the rest of the marketing/sales activities.
At the low level of storage and data integration, that means the best platform for accomplishing this would be SQL—or better yet, a higher-level “join” service that’s abstracted or virtual, as in the diagram below. It makes sense that you’d need some sort of glue engine to join identities with the multiple attributes that are siloed across the different processes of your organization. And we know that LDAP directories alone, without some sort of integration mechanism, are not equipped for that. In fact, Gigya, the more “pure play” in this space, doesn’t even use LDAP directories; instead, they store everything in a relational database because SQL is the engine for joining.
So if we look at the customer identity market through this lens of SQL and the join operation, I see a couple of hard truths for the traditional IAM folks:
- First, if we’re talking about using current IAM packages in the security field for managing customer access, performance and scalability are an issue due to the “impedance” problem. Sure, your IAM package “supports” SQL but it’s optimized for LDAP, so unless you migrate—or virtualize—your customers’ identity from SQL to LDAP in the large volumes that are characteristic of this market, you’ll have problems with the scalability and stability of your solution. (And this does not begin to cover the need for flexibility or ease of integration with your existing applications and processes dealing with customers).
- And second, if you are looking at leveraging the customer registration process as a first step to build a complete profile, your challenge is more in data/service integration than anything else. In that case, I don’t see where there’s a play for “traditional WAM” or “federation” vendors that stick to an LDAP model, because no one except those equipped with an “unbound” imagination would use LDAP as an engine for integration and joining… 🙂
The Nature of Nurturing: An Object Lesson in Progressive, Contextual Disclosure
Before we give up all hope on directories (or at least on hierarchies, graphs, and LDAP), let’s step beyond the security world for a second and look at the marketing process of nurturing prospect and customer relationships. Within this discipline, a company deals with prospects and customers in a progressive way, guiding them through each stage of the process in a series of steps and disclosing the right amount of information within the right context. And of course, it’s natural that such a process could begin with the registration of a user.
We’ll step through this process in my next post, so be sure to check back for more on this topic…
Current Web Access Management Solutions Will Work for the Customer Identity Market—If We Solve the Integration Challenge
I find it ironic that within the realm of IAM/WAM, we’re only now discovering the world of customer identity, when the need for securing customer identity has existed since the first business transactions began happening on the Internet. After all, the e-commerce juggernauts from Amazon to eBay and beyond have figured out the nuances of customer registration, streamlined logons, secure transactions, and smart shopping carts which personalize the experience, remembering everything you’ve searched and shopped for, in order to serve up even more targeted options at the moment of purchase.
It reminds me of a parable from a classic book on investing*: Imagine a Wall Street insider at the Battery in New York, pointing out all the yachts that belong to notorious investment bankers, brokers, and hedge fund managers. After watching for a while, one lone voice pipes up and asks: “That’s great—but where are the customers’ yachts?”
Could this new focus on “customer identity” be an attempt by IAM/packaged WAM vendors to push their solution toward what they believe is a new market? Let’s take a look at what would justify their bets in the growing customer identity space.
Customer Identity: The Case for the WAM Vendors
The move to digitization is unstoppable for many companies and sectors of the economy, opening opportunities for WAM vendors to go beyond the enterprise employee base. As traditional brick and mortar companies move to a new digitized distribution model based on ecommerce, they’re looking for ways to reach customers without pushing IT resources into areas where they have no expertise.
While there are many large ecommerce sites that have “grown their own” when it comes to security, a large part of this growing demand will not have the depth and experience of the larger Internet “properties.” So a packaged solution for security makes a lot of sense, with less expense and lower risks. And certainly, the experience of enterprise WAM/federation vendors, with multiple packaged solutions to address the identity lifecycle, could be transferred to this new market with success. However, such a transition will need to address a key challenge at the level of the identity infrastructure.
The Dilemma for WAM Vendors: Directory-Optimized Solutions in a World of SQL
As we know, the current IAM/WAM stack is tightly tied to LDAP and Active Directory—these largely employee-based data stores are bolted into the DNA of our discipline, and, in the case of AD, offer an authoritative list of employees that’s at the center of the local network. This becomes an issue when we look at where the bulk of customer identities and attributes are stored: in a SQL database.
So if SQL databases and APIs are the way to access customer identities, we should ask ourselves if the current stack of WAM/federation solutions, built on LDAP/AD to target employees, would work well as well with customers. Otherwise, we’re just selling new clothes to the emperor—and this new gear is just as invisible as those customers’ yachts.
Stay tuned over the next few weeks as I dive deeper into this topic—and suggest solutions that will help IAM vendors play in the increasingly vital world of customer identity data services.
*Check out “Where Are the Customers’ Yachts: or A Good Hard Look at Wall Street” by Fred Schwed. A great read—and it’s even funny!
How a Federated ID Hub Helps You Secure Your Data and Better Serve Your Customers
Welcome back to my series on bringing identity back to IAM. Today we’re going to take a brief look at what we’ve covered so far, then surf the future of our industry, as we move beyond access to the world of relationships, where “identity management” will help us not only secure but also know our users better—and meet their needs with context-driven services.
We began by looking at how the wave of cloud services adoption is leading to a push for federation—using SAML or OpenID Connect as the technology for delivering cloud SSO. But as I stressed in this post, for most medium-to large-enterprises, deploying SAML will require more than just federating access. By federating and delegating the authentication from the cloud provider to the enterprise, your organization must act as an Identity provider (IdP)—and that’s a formidable challenge for many companies dealing with a diverse array of distributed identity stores, from AD and legacy LDAP to SQL and web services.
It’s becoming clear that you must federate your identity layer, as well. Handling all these cloud service authentication requests in a heterogeneous and distributed environment means you’ll have to invest some effort into aggregating identities and rationalizing your identity infrastructure. Now you could always create some point solution for a narrow set of sources, building what our old friend Mark Diodati called an “identity bridge.” But how many how of these ad hoc bridges can you build without a systematic approach to federating your identity? Do you really want to add yet another brittle layer to an already fragmented identity infrastructure, simply for the sake of expediency? Or do you want to seriously rationalize your infrastructure instead, making it more fluid and less fragile? If so, think hub instead of bridge.
Beyond the Identity Bridge: A Federated Identity Hub for SSO and Authorization
This identity hub gives you a federated identity system where identity is normalized—and your existing infrastructure is respected. Such a system offers the efficiency of a “logical center” without the drawbacks of inflexible modeling and centralization that we saw with, say, the metadirectory. In my last post, we looked at how the normalization process requires require some form of identity correlation that can link global IDs to local IDs, tying everything together without having to modify existing identifiers in each source. Such a hub is key for SSO, authorization, and attribute provisioning. But that’s not all the hub gives you—it’s also way to get and stay ahead of the curve, evolving your identity to meet new challenges and opportunities.
The Future’s Built In: The Hub as Application Integration Point and Much More
Another huge advantage of federating your identity? Now that you can tie back the global ID to all those local representations, the hub can act as a key integration point for all your applications. Knowing who’s who across different applications allows you to bring together all the specific aspects of a person that have been collected by those applications. So while it begins as a tool for authentication, the hub can also aggregate attributes about a given person or entity from across applications. So yes, the first win beyond authentication is also in the security space: those rich attributes are key for fine-grained authorization. But security is not our only goal. I would contend that this federated identity system is also your master identity table—yes, read CDI and MDM—which is essential for application integration. And if you follow this track to its logical conclusion, you will move toward the promised land of context-aware applications and semantic representations. I’ve covered this topic extensively, so rather than repeat myself, I will point you to this series of posts I did last spring—think of it as Michel’s Little Red Book on Context… 😉
- First we introduced the topic of Context as the Next Frontier of Your Digital Identity.
- Then we went From Groups to Roles to Context, looking at the Emergence of Attributes in Authorization.
- Then we explored Attributes, Predicates, and Sentences as the Building Blocks of Context.
- And finally, we achieved Valhalla: Man and Machine, Speaking the Same Language.
So the way we see it here at Radiant, the emergence of the hub puts you on the path toward better data management and down the road to the shining Eldorado of semantic integration, where your structured and unstructured data comes together to serve you better. But you don’t have to wait for that great day to realize a return—your investment starts to pay off right away as you secure your devices and cloud services.
read more →
In my last post on digital context, we took a trip back to logic class, looking at how we could begin to describe our world using “sentences” based on first order logic. This essential “predicate semantics” is the foundation of all mathematics, and hence, computing. In fact, it’s the basis for our most key data storage mechanisms (think SQL). With so much of structured information already encoded in this predicate representation, this gives us an excellent foundation for more semantically-driven contextual computing.
Let’s Begin at the Beginning: What is Context, Anyway?
According to my Webster’s, the word “context” comes from the Latin ”contextus,” which means a joining or weaving together. There are a couple of different types of context:
- There’s context as represented through language, or “the parts of a sentence, paragraph, or discourse immediately next to or surrounding a specified word or passage and determining its exact meaning (e.g., to quote a remark out of context)”.
- And there’s the context we glean through perceptions, meaning “the whole situation, background, or environment relevant to a particular event, personality, creation, etc…”
It’s this second aspect, the perceptual side, which most would agree upon as the meaning of context. Using this definition, our animal friends are “context-aware” up to some level, able to “read” a situation and act accordingly. But we also have the first aspect, language, which allows us to describe the world in sentences, sharing contextual information. So context can be represented by a set of related sentences about a given subject—that’s our “parts of a…discourse immediately next to or surrounding a specified word.” And what makes this especially interesting from my perspective, which begins in the narrow field of security, a “security context” is a set of facts about a given “subject” represented by attributes and relations between entities. As such, a security context can be represented as a subset of first order logic—or by sentences in a limited, constrained form of English.
So if you can find a way to extract information for a given subject from a structured system and represent it as sentences then you are, in fact, extracting the underlying “application” context for this subject. And—drumroll, please—that’s just what we’ve done! Basically, we’ve returned to first principles here at Radiant, devising a “contextual and computational language” method to reverse engineer metadata from an application and represent it in a way that’s as easy to interpret at the human level as it is to execute at the machine level.
Now, this wasn’t my idea alone—if you follow the developments around the semantic web, you know that the idea to semantically encode the web (HTML/text) so that our machines can more meaningfully interpret our descriptions and intentions is based on this same foundation. But standards such as RDF and OWL depend on adoption, which cannot be controlled and is currently confined to a minuscule part of the web. On top of that, they have a different purpose. While they are tagging text the same way than we do—object attribute/verb value or other object—their objective is for machine to be able to interpret these tags. Our goal is bigger: we want to create sentences that are readable by both man and machine. So unless you can read the XML that’s behind RDF as if it were your own language, why not speak in plain English instead, rather than working at the interface level and supporting RDF at the generation phase? But we’ll get to that part a little later on…
From Database Standards to Semantics: Making Structured Data Searchable Across Silos
There’s no single data standard representation in our enterprises—you have vital data stored across SQL databases, LDAP directories, XML, web services/REST, and more. While useful on their own, this “Babel tower” of protocols, data representations, and abstractions makes it difficult, if not impossible, to connect the information across different application kingdoms. Why is this so important? Because each silo offers plenty of powerful contextual richness that we can leverage well beyond the scope of that application.
This is essential because even in the very specialized scope of security, you can’t adequately protect a system of applications if you don’t have a clear picture of what’s really enforced at the level of each application, and how all your applications are interrelated. This is why, despite lots of tools for creating roles and policies, progress in authorization has been extremely slow. The challenge is not just in knowing what you want to enforce—that’s the easy part—you must first understand what exists and what is really enforceable, both at the level of a single application and across a complicated process made up of multiple applications. For instance, when I talk to people in the banking sector about their compliance efforts, what I hear is that it’s not only about defining what they want to enforce, it’s about understanding what they have in the first place.
Context is also vital because this structured data is so valuable. It represents perhaps only 10% of the data in the world, but 90% of the value that we derive from automation. Without structured data, automation would be extremely limited, and the productivity that we derive from automation would evaporate. So wouldn’t it be great if we could understand at the layman’s layer what exists in an application (beyond just forms and interface), and link it to the rest of the infrastructure?
Think about what HTML did for text and other unstructured data on the web, making it searchable, discoverable, and so much more useful. Now imagine your structured data, all that incredible process-driven information and context trapped in application silos. What if we could read all that information, link all that data, and free all those contextual relationships that exist between silos? After all, it’s not only the facts, it’s the links between facts that build up a context. Go back to the etymology we discussed above: “context” is from the Latin contextus and it means the joining, the weaving together.
Again, these ideas are not mine alone—there’s a whole discipline within the semantic web dealing with “linked data,” based on how you could link information once it’s tagged under the form of RDF, which means subject-verb-object or subject-attribute-value. (See my last post for an in-depth look at this.)
read more →
We covered the key role of attributes in my last blogpost, moving from the blunter scope of groups and roles to the more fine-grained approach of attributes. Now we’re going to take this progression a step further, as we narrow in on my favorite topic: digital context. (If you haven’t already, check out my first two posts on context, where I laid out the roadmap and looked at groups, roles, and attributes.) Our first order today is to travel back to logic class and think about predicates.* But Michel, you’re thinking, what does all this have to do with digital context? Well, one way to describe a context about something is to express it using sentences related to the question. While we will come back to the definition of context in a following post, for now let’s just say that we need some building blocks to express facts about the world, some form of sentences that can be interpreted by a computer, and logic is one of the tools for that.
Subject-Predicate-Object: First Order Logic 101
In my most recent post, we saw how the notions of groups and roles ended up in the increased use of attributes as a way to categorize or define identities. This should not be surprising. Behind this use of attributes lays a fundamental mechanism—a way to represent a simple fact. And it’s the same mechanism that we use when we reason based on the rules of formal logic, which has been in practice forever, or when we represent a fact on a computer (think SQL). In fact, one of the greatest achievements of the early 20th century has been the formalization of logic (needed for mathematic foundation) and computation. This type of logical representation is core to everything we do, as reasoned thinkers and as computer scientists.
But in case you’re a few years removed from logic class, let’s examine this mechanism at work by looking at some very simple diagrams about what we are doing when we associate some attribute with a person or an object, such as assigning a person to a group:
Or assigning a subgroup to a group:
Each of these constructs can be summarized by the following diagram:
In this diagram, a fact can be asserted by the notation: subject-predicate-object. In predicate logic (AKA first order logic), it’s conventionally written as predicate(X,Y), where the variables X and Y could be themselves objects (references to entities) and/or values (arbitrarily “quoted” labels belonging to the initial vocabulary of our logic system). For instance, in our example above, the fact that “Jane is member of the product marketing group” can be written as memberOf(“Jane”,”Product Marketing”) and subGroupOf(“Product Marketing”,“Marketing”).
These kinds of predicates are called “binary” predicates and they are quite common. So if there are binary predicates, the astute reader (that’s you!) might well wonder if there are also unary predicates and, more generally, n-ary predicates. Indeed, the unary predicate exists and generally it’s used to assign a label to an entity—so if we want to say that Jane is an executive, you would write it as executive(“Jane”). As for the n-ary predicate, well here’s where you will find the usual “n-slots” notation of entities/tables as they’re used in the relational/SQL world. So we’d see something like this: age(“Jane”, “33”) or employee(“Jane”, “33”,”product marketing”).
Now, if you look at all those diagrams above, you’ll notice they have a direction, an orientation that tells us which entity plays the role of subject, since the object for a given predicate cannot generally be substituted. This translates into a given order for the different slots of a predicate; for example, in the notation age(“Jane”, “33”), the first slot—“Jane”—is for the person, and the second—“33”—is for her age. Of course, there are always exceptions where the slots are permutable, such as the “brother binary predicate,” where if x is a brother of y—brother(“x”,”y”)— then y is also a brother of x, which could read: brother(“y”,”x”)= brother(“x”,”y”). But in general order, orientation matters.
The diagrams above form directed graphs and the orientation is essential for preserving the semantics of this representation. After all, saying that x kills y—Kill(“x”,”y”)—is very different from saying that y kills x—Kill(“y”,”x”)!
read more →
Last week, I introduced my favorite topic—digital context—and laid out a plan for how to consider the case. Today, we’ll dive in with a real-world example, looking at how freeing context from across application silos helps us make more considered, immediate, and relevant access control decisions. For those of you who have been following along (and thanks for sticking with me in my madness), this is blog 8 in response to Ian Glazer’s provocative video on killing IAM in order to save it. And if you haven’t been with me from the beginning: I’m in favor of skipping the murder and going straight to the resurrection. Those of you who are coming in late to the game, here’s the recent introduction to context, or you can catch up with the entire story in order here: one, two, three, four, five, six, seven.
It All Starts with Groups: The Simple, Not Especially Sophisticated Solution
Let’s start first with the notion of groups and their implementation. On the surface, nothing could be more straightforward: If I have to manage a sizeable set of users and assign them different rights to applications, I need to categorize those users into groups with the same profile, whether that’s by function, role, need to know, hierarchy, or some other factor. This is the simplest approach to any categorization, creating some “relevant” labels, then assigning people that fit within those label to define groups.
So let’s say we’re creating groups based work functions, such as sales, marketing, production, and administration. All we need to do is list all the people under a particular function, create a label, and then assign this label to those people. Couldn’t be easier, right? The simplicity of the process explains the huge success of groups—and although we implementers tend to make fun of groups as crude categorizations, I would guesstimate that at least 90% of our authorization policies are still implemented through groups. (So much for all that talk about advanced fine-grained authorization! But I’m getting ahead of myself here…)
In fact, we’ve become so dependent on groups that in many cases, especially with sizeable organization where the business processes are quite refined and well managed, we’re seeing that there are often more groups than users! At first glance, this seems paradoxical—after all, what’s the point of regrouping people if you have more groups than people? But the joke is on us technical people because we ignored another key reality: the business one. Sure, we could have a lot of people, but generally a well-managed and productive organization can have more activities (or different aspects of a given activity) that require the multiplication of those groups. So we gave our users a simple mechanism to categorize people into groups, and they used it—talk about being a victim of our own success! 🙂
Basically, we played the sorcerer’s apprentice and our simple formula yielded a multiplication of groups, which quickly became un-manageable. So we went back to the formula and started to tweak it, creating groups inside groups, hierarchies of groups, and nested groups; introducing Boolean operations on groups; aggregating them into roles, and so on. So what we were just saying about groups being simple? Simple for whom? Simple for the group implementers—yes, definitely. Simple for a user in charge of the initial creation of the group—sure. But add any complexity into the mix and the chaos begins.
So Much for the Digital Revolution: Every Change, Managed Manually
From a computer’s point of view, the assignment of a user to a group is totally opaque—just an explicit list entered by the person in charge of creating the group. This explicit list contains no information about why or how a user is dispatched into or associated with a group. In short, the definition of membership rests with the group owner, which is fine on the face of it. But that excludes any automated assignment of a new member to the group without manual intervention of the group owner. That means every change must be entered by hand—imagine the complexity as people constantly change roles and shift responsibilities. And imagine how easy it would be for an overworked manager to miss removing the name of the person she just fired from just one of the groups he was part of. Now imagine the security risk if that guy’s still got access to sensitive files.
Without explicitly externalizing those rules, those policies, the administration of the system becomes tied to the group owners/creators. The effort of sub-categorizing with nested groups or introducing more flexible ways to combine groups by using Boolean operators just reveals the root of the problem: When you give users better ways to characterize their groups, you are forcing those users to either make explicit the formation rules of their groups—or continue to make every single change manually, even as those changes become more complex and unmanageable.
And that’s how we (re)discovered the value of attribute-based group definitions.
I know I’ve been the Old Man of Novato, ranting about context all these years, but the market, the industry, and—most importantly—the technology are finally evolving toward this direction. For the longest time, it was just me and the usual suspects in academia and elsewhere, muttering in our corners about the Semantic Web, but now we’re hearing about context-aware computing from every direction. While I’ve refined a set of slides on context that I’ve delivered to groups large and small over the years, along with a demo of our Context Browser technology, now seems like a great time to put everything I know down in writing.
Although my French heritage and Math background prefer to start from theory and illustrate through examples, my newly American pragmatic tinkerer side is planning to do a quick roadmap here, then look at examples from our existing systems and, through them, make the theoretical case. It’ll take a few posts to get there, but then, I’ve really been enjoying blogging lately, as my manifesto in response to Ian Glazer will testify. Read it from the beginning, if you’d like a peek into my recent madness: one, two, three, four, five, six.
Context Matters: Where We’re At, Where We’re Headed
We’ve already seen the word creeping into marketing materials, but one of these days—okay, maybe months or years—it’s going to be more than a promise: digital context will be everything. As we get closer to digitalizing our entire lives, we’re also moving toward a context-aware computing world. Now, when we’ve talked about context-aware computing so far, it has seemed like one of those woolly concepts straight from a hyper-caffeinated analyst’s brain (or an over-promising marketer’s pen). But the truth is, any sizeable application that’s not somehow context-aware is pretty useless or poorly designed.
Sure, there are pieces of code or programs that exist to provide some transition between observable states and, as such, are “stateless.” And I know that on the geeking edge, it’s trendy to talk about stateless systems, which are an important part of the whole picture. In reality, however, the world needs to record all kinds of states, because a stateless world also means a world without any form of memory—no past, present, or future. So it’s not like most of our programs and applications are not context-aware. They are, and most of the time they’re pretty good at managing their own context.
The problem is that we move from context to context, and in the digital world this means that unless those programs, those agents, those devices share their context, we are facing a stop-and-go experience where the loss of context can be as annoying—or as dangerous—as an interrupted or broken service. The lack of context integration can mean a bad user experience—or a dead patient due to a wrong medication. In a world where actions and automated decisions can be taken in a split-second, this absence of context integration is a huge challenge. Nowhere is the issue is more acute than in security, in authentication and authorization.
read more →