In my last post on digital context, we took a trip back to logic class, looking at how we could begin to describe our world using “sentences” based on first order logic. This essential “predicate semantics” is the foundation of all mathematics, and hence, computing. In fact, it’s the basis for our most fundamental data storage mechanisms (think SQL). With so much structured information already encoded in this predicate representation, we have an excellent foundation for more semantically-driven contextual computing.
Let’s Begin at the Beginning: What is Context, Anyway?
According to my Webster’s, the word “context” comes from the Latin “contextus,” which means a joining or weaving together. There are a couple of different types of context:
- There’s context as represented through language, or “the parts of a sentence, paragraph, or discourse immediately next to or surrounding a specified word or passage and determining its exact meaning (e.g., to quote a remark out of context)”.
- And there’s the context we glean through perceptions, meaning “the whole situation, background, or environment relevant to a particular event, personality, creation, etc…”
It’s this second aspect, the perceptual side, which most would agree upon as the meaning of context. Using this definition, our animal friends are “context-aware” up to some level, able to “read” a situation and act accordingly. But we also have the first aspect, language, which allows us to describe the world in sentences, sharing contextual information. So context can be represented by a set of related sentences about a given subject—that’s our “parts of a…discourse immediately next to or surrounding a specified word.” And here’s what makes this especially interesting from my perspective, which begins in the narrow field of security: a “security context” is a set of facts about a given “subject,” represented by attributes and relations between entities. As such, a security context can be represented as a subset of first order logic—or by sentences in a limited, constrained form of English.
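As a rough sketch of that idea (with hypothetical names throughout—none of this is drawn from a real system), a security context can be modeled as a small set of first-order facts about a subject, each readable as a constrained-English sentence:

```python
# Hypothetical example: a "security context" as first-order facts,
# each one a (subject, attribute-or-relation, value) predicate.
facts = [
    ("alice", "has_role", "teller"),
    ("alice", "member_of", "branch_42"),
    ("teller", "may_access", "checking_accounts"),
]

def to_sentence(subject, predicate, value):
    # Render one predicate as a constrained-English sentence.
    return f"{subject} {predicate.replace('_', ' ')} {value}."

# The context for "alice" is simply the set of sentences about her.
alice_context = [to_sentence(*f) for f in facts if f[0] == "alice"]
```

The same tuples a machine can query are, with the underscores smoothed out, sentences a person can read—which is the whole point of the constrained-English representation.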
So if you can find a way to extract information for a given subject from a structured system and represent it as sentences then you are, in fact, extracting the underlying “application” context for this subject. And—drumroll, please—that’s just what we’ve done! Basically, we’ve returned to first principles here at Radiant, devising a “contextual and computational language” method to reverse engineer metadata from an application and represent it in a way that’s as easy to interpret at the human level as it is to execute at the machine level.
Now, this wasn’t my idea alone—if you follow the developments around the semantic web, you know that the idea of semantically encoding the web (HTML/text) so that our machines can more meaningfully interpret our descriptions and intentions is based on this same foundation. But standards such as RDF and OWL depend on adoption, which cannot be controlled and is currently confined to a minuscule part of the web. On top of that, they have a different purpose. While they tag text the same way we do—object, attribute/verb, value (or another object)—their objective is for machines to be able to interpret these tags. Our goal is bigger: we want to create sentences that are readable by both man and machine. So unless you can read the XML behind RDF as if it were your own language, why not speak in plain English instead, rather than working at the interface level and supporting RDF at the generation phase? But we’ll get to that part a little later on…
From Database Standards to Semantics: Making Structured Data Searchable Across Silos
There’s no single standard data representation in our enterprises—you have vital data stored across SQL databases, LDAP directories, XML, web services/REST, and more. While useful on their own, this “Tower of Babel” of protocols, data representations, and abstractions makes it difficult, if not impossible, to connect the information across different application kingdoms. Why is this so important? Because each silo offers plenty of powerful contextual richness that we can leverage well beyond the scope of that application.
This is essential because even in the very specialized scope of security, you can’t adequately protect a system of applications if you don’t have a clear picture of what’s really enforced at the level of each application, and how all your applications are interrelated. This is why, despite lots of tools for creating roles and policies, progress in authorization has been extremely slow. The challenge is not just in knowing what you want to enforce—that’s the easy part—you must first understand what exists and what is really enforceable, both at the level of a single application and across a complicated process made up of multiple applications. For instance, when I talk to people in the banking sector about their compliance efforts, what I hear is that it’s not only about defining what they want to enforce, it’s about understanding what they have in the first place.
Context is also vital because this structured data is so valuable. It represents perhaps only 10% of the data in the world, but 90% of the value that we derive from automation. Without structured data, automation would be extremely limited, and the productivity that we derive from automation would evaporate. So wouldn’t it be great if we could understand, at the layman’s level, what exists in an application (beyond just forms and interface), and link it to the rest of the infrastructure?
Think about what HTML did for text and other unstructured data on the web, making it searchable, discoverable, and so much more useful. Now imagine your structured data, all that incredible process-driven information and context trapped in application silos. What if we could read all that information, link all that data, and free all those contextual relationships that exist between silos? After all, it’s not only the facts, it’s the links between facts that build up a context. Go back to the etymology we discussed above: “context” is from the Latin contextus and it means the joining, the weaving together.
Again, these ideas are not mine alone—there’s a whole discipline within the semantic web dealing with “linked data,” based on how you can link information once it’s tagged in the form of RDF: subject-verb-object or subject-attribute-value triples. (See my last post for an in-depth look at this.)
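To make the linking idea concrete, here’s a minimal sketch (all names invented for illustration) of weaving together facts about one subject from two silos, once both are expressed as subject-verb-object triples:

```python
# Invented example: two application silos, each already "tagged"
# as subject-verb-object triples.
hr_triples = [("alice", "works_in", "finance"),
              ("bob", "works_in", "sales")]
crm_triples = [("alice", "manages_account", "acme_corp")]

def link(subject, *silos):
    # Weave together every triple about a subject across silos --
    # the "contextus," the joining, that builds up a context.
    return [t for silo in silos for t in silo if t[0] == subject]

linked = link("alice", hr_triples, crm_triples)
```

Once both silos speak the same triple representation, “linking” is nothing more exotic than gathering every fact that shares a subject.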
Using Model-Driven Virtualization to Reveal the Hidden Semantics of Structured Data
Here at Radiant, we’re leveraging the same principles as those behind the semantic web, but applying them to all the structured data you have stored in SQL, LDAP, web services, REST, etc… We re-discovered these ideas because we were facing the challenge of integrating identity and its extended profiles across silos. And in order to do that, you first need to translate—to tag—this information into a common language or annotation. The method we use is called “virtualization by model,” and we’re going to take a look at how that works in the slideshow below. (And yes, my old friend Dave Kearns, the wise man of IdM, always teases me about the fact that I’ve been telling this same story with the same slides for an eternity now. So Dave, please forgive me here. As far as I know, the theorems of Thales or Pythagoras are more than 2000 years old now and are still presented with the same original proofs, too…). 🙂
The common language is first order logic, and the method we propose largely automates the tagging—the metadata reverse engineering and translation of our structured data from multiple specific protocols into this common representation. With this virtualization approach, we can extract application contexts that can be linked to unify these application silos. And because those predicates, with the right vocabulary defined by the users, can be read as a limited, “constrained” form of English sentences, these contexts are intelligible to the non-IT person. While it isn’t Shakespeare by any means, our system delivers English sentences that we can all understand—even machines.
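A toy illustration of that reverse engineering (the table and column names here are assumed, not taken from any real schema): a SQL row becomes a set of predicate sentences, with the key column supplying the subject and every other column an attribute:

```python
# Assumed schema, purely for illustration: one row from an
# "employees" table, keyed by the "emp_id" column.
row = {"emp_id": "alice", "dept": "finance", "title": "analyst"}

def row_to_sentences(key, row):
    # Each non-key column becomes an attribute-value predicate,
    # rendered as a constrained-English sentence about the subject.
    return [f"{row[key]} has {col} {val}."
            for col, val in row.items() if col != key]

sentences = row_to_sentences("emp_id", row)
```

The real work, of course, is in reverse engineering the metadata—discovering which columns are keys, which are attributes, and which encode relations—but the target representation is this simple.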
A Sort-Of Pidgin English, Readable By Both Marketers and Machines
This basic “caveman” English allows a business user—the exact person who knows precisely what’s needed to define the security policy—to “read” into the “books” of the automated process made up of application silos. This gives business users a way to directly understand what really exists and is currently enforced in their applications, and to interact with this information to define the security policies they want to enforce.
These application contexts, described in English sentences (a subset of first order logic), allow any business user to understand exactly what is occurring within an automated process as it is enforced by an application. But all this sounds very esoteric—let’s take a look at how this might work in real life. We saw above that using virtualization by model lets us deliver different “views” of identity to meet the needs of diverse applications, whether they’re expecting LDAP, SQL, REST, and yes, even RDF. But we can take that a step further and deliver views of data designed for the human user as well, using these plain English sentences I keep talking about. So here’s an example of a possible user interface that lets an end user search and navigate contextual information:
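In the same spirit, here’s a hedged sketch of the “views” idea—one underlying fact rendered for different consumers. The LDAP and SQL renderings are deliberate caricatures, not real wire formats:

```python
# One underlying fact, three simplified "views" of it.
fact = ("alice", "member_of", "finance")

def as_ldap(subject, _predicate, group):
    # Caricature of an LDAP-style view.
    return f"dn: uid={subject},ou={group}"

def as_sql(subject, _predicate, group):
    # Caricature of a SQL-style view.
    return f"SELECT * FROM membership WHERE user = '{subject}' AND grp = '{group}';"

def as_english(subject, _predicate, group):
    # The plain-English view for the human user.
    return f"{subject} is a member of {group}."

views = [render(*fact) for render in (as_ldap, as_sql, as_english)]
```

The design point is that the English sentence is just one more view over the same underlying predicate, not a separate system.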
Voilà, digital context, delivered through the magic of first order logic and model-driven virtualization! 🙂
PS: Let me hear your thoughts, feelings, and questions on all this, particularly those of you who’ve been following along with my magnum opus in response to Ian Glazer’s video on Killing IAM in order to save it. If you’re coming in late to the game, you can get caught up here: one, two, three, four, five, six, seven, eight, nine.