Semantic Technology: A Promising Solution to Today’s Metadata Needs
By Vickie Farrell, Principal
The promise of the Semantic Web, an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation, is often described using examples like this: You book a trip to San Francisco on the Internet and receive, through appropriate and comprehensive integration of data throughout the web, a list of suggested Greek restaurants (your favorite) in the neighborhood of your hotel. If you have a hard time translating this into relevance for your business, you’re probably not alone.
Actually, the vision of the Semantic Web [1] has spawned some useful technologies and W3C language standards, such as RDF and OWL, for representing data and the relationships between data items, which is a good thing. An even better thing is that a handful of vendors and some pragmatic corporate practitioners have found a very good use for semantic technology, one that’s not far from home for most organizations.
They’re using it to solve the age-old problem of metadata. Semantic technology can be and is being applied to everything from automating tailored customer service on a website to managing coalition forces’ data in real time to complex product configuration to dynamic document creation. Fundamental to each of these applications is using semantic technology to create a flexible metadata management solution to foster dynamic data integration, which is not as disruptive as it sounds.
For decades, computers have been used to process structured data, automating simple but once-manual tasks such as adding, sorting and selecting records based on a specific value in a particular field. Semantic technology allows computers to interpret and process unstructured data (whether it’s on the internet or in your own corporate environment) to automate more complex tasks. Examples include finding all the documents that relate to a certain process or part (even if the part is provided by multiple suppliers), or inserting the appropriate text into a loan agreement based on the particular customer’s situation.
The approach is to develop a metadata model that goes beyond defining data syntax (format, structure, source, values) and also provides a means for the computer to understand the meaning of the data and to ensure proper use and consistency across applications. Thus, functional departments or separate divisions that define business terms differently can maintain their current definitions but interoperate effectively without having to code point-to-point data transformations, and without requiring mass adoption of, and rigid adherence to, a single agreed-to definition in a static metadata repository. Both of these older approaches have become impractical. Point-to-point data mapping between systems is costly, time-consuming to maintain, prone to error, and doesn’t scale. Forcing agreement on a common set of terms in a repository, even if it were feasible to do once, won’t last past the first reorganization, product reclassification or revision in regulatory requirements.
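To see why point-to-point mapping doesn’t scale, a rough count helps. The sketch below assumes, for illustration only, that every pair of systems needs a separate transformation in each direction, and that a shared concept model needs one term-to-concept mapping per system. The function names are hypothetical.

```python
def point_to_point_mappings(n_systems: int) -> int:
    """Transformations needed if every system maps directly to every other.

    Assumes one transformation per ordered pair (A -> B and B -> A differ).
    """
    return n_systems * (n_systems - 1)


def shared_model_mappings(n_systems: int) -> int:
    """Mappings needed if each system maps its terms once to a shared model."""
    return n_systems


if __name__ == "__main__":
    for n in (3, 10, 50):
        print(f"{n:>3} systems: "
              f"{point_to_point_mappings(n):>5} point-to-point vs "
              f"{shared_model_mappings(n):>3} shared-model mappings")
```

Under these assumptions, 10 systems need 90 point-to-point transformations but only 10 mappings to a shared model. Adding an eleventh system adds 20 transformations in the first case and one mapping in the second.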
By defining concepts and the relationships of terms to those concepts using a graph structure rather than a relational one, you can allow not only human users but also applications and services that define the terms differently to interoperate without having to hardcode transformations between their data. This is particularly important for organizations that are implementing a service-oriented architecture (SOA) and for those that are addressing data quality and master data management (MDM).
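A minimal sketch of this idea follows. It is not any vendor’s implementation: the class, the department names and the terms are all hypothetical, and a production system would more likely use a W3C standard such as RDF than a hand-built dictionary. Each department keeps its own term, every term points to a shared concept, and a translation is found by walking the graph rather than by a hardcoded transformation.

```python
from typing import Dict, Optional, Tuple


class ConceptGraph:
    """Toy concept graph: terms point to concepts, concepts to broader concepts."""

    def __init__(self) -> None:
        self._concept_of: Dict[Tuple[str, str], str] = {}  # (system, term) -> concept
        self._term_for: Dict[Tuple[str, str], str] = {}    # (system, concept) -> term
        self._broader: Dict[str, str] = {}                 # concept -> broader concept

    def map_term(self, system: str, term: str, concept: str) -> None:
        """Record that `system` uses `term` for `concept` (last mapping wins)."""
        self._concept_of[(system, term)] = concept
        self._term_for[(system, concept)] = term

    def set_broader(self, concept: str, broader: str) -> None:
        """Record that `concept` is a narrower kind of `broader`."""
        self._broader[concept] = broader

    def translate(self, term: str, source: str, target: str) -> Optional[str]:
        """Find the target system's term for the same or nearest broader concept."""
        concept = self._concept_of.get((source, term))
        seen = set()  # guards against accidental cycles in the broader links
        while concept is not None and concept not in seen:
            seen.add(concept)
            found = self._term_for.get((target, concept))
            if found is not None:
                return found
            concept = self._broader.get(concept)
        return None


# Hypothetical departments that name the same concept differently.
graph = ConceptGraph()
graph.map_term("sales", "client", "Customer")
graph.map_term("billing", "account_holder", "Customer")
graph.map_term("support", "customer", "Customer")
graph.map_term("sales", "key_account", "PremiumCustomer")
graph.set_broader("PremiumCustomer", "Customer")

print(graph.translate("client", "sales", "billing"))
print(graph.translate("key_account", "sales", "support"))
```

In this sketch, the support department has no term for the hypothetical PremiumCustomer concept, so the lookup falls back to the broader Customer concept. No department had to change its vocabulary, and no pair of departments needed its own transformation.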
SOA. A semantic approach to metadata provides a layer of abstraction for information, similar to the business process and transport abstraction layers used to enable SOA and assure reuse and agility. Standards groups in many industries have defined standard industry data models, such as SID for telecommunications, HL7 for healthcare and ACORD for insurance. As organizations in those industries intend to implement these models anyway, they may find that using them as the basis for an information abstraction layer helps achieve agility and standards conformance in one implementation.
Pantero in Waltham, MA, with a focus on the telecommunications industry, automates the design and implementation of a common model architecture for information abstraction and brokering between applications, leveraging existing infrastructure that provides abstraction at the business process and transport tier. Unicorn Solutions, recently acquired by IBM as part of WebSphere Metadata Services, offers tools that allow technical users to map disparate metadata sources and to edit standard-based metamodels graphically during design and at runtime.
Data quality and MDM. Silver Creek Systems in Westminster, CO focuses its technology on matching product data, an order of magnitude more complex than the more familiar issue of name and address matching. Conventional statistical matching is so inaccurate that it usually becomes a manual process. Silver Creek’s approach allows automation by using semantic matching to extract relevant patterns, yielding a higher level of accuracy.
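The general difference between matching on raw text and matching on meaning can be illustrated with a small sketch. This is not Silver Creek’s method, which is not described here. The vocabulary, the product descriptions and the use of simple token overlap as a stand-in for conventional matching are all assumptions made for illustration.

```python
import re
from typing import Set

# Hypothetical vocabulary: maps surface tokens to canonical "attribute:value" concepts.
VOCABULARY = {
    "screw": "type:screw",
    "ss": "material:stainless",
    "stainless": "material:stainless",
    "ph": "drive:phillips",
    "phillips": "drive:phillips",
    "pan": "head:pan",
}


def tokenize(description: str) -> Set[str]:
    """Lowercase and split into runs of letters or digits ('6MM' -> '6', 'mm')."""
    return set(re.findall(r"[a-z]+|\d+", description.lower()))


def to_concepts(description: str) -> Set[str]:
    """Replace known tokens with canonical concepts; keep unknown tokens as-is."""
    return {VOCABULARY.get(token, token) for token in tokenize(description)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by the size of the union."""
    return len(a & b) / len(a | b) if (a or b) else 1.0


# Two hypothetical supplier descriptions of the same part.
supplier_a = "SCREW SS PAN PH 6MM"
supplier_b = "Phillips pan-head stainless screw, 6 mm"

raw = jaccard(tokenize(supplier_a), tokenize(supplier_b))
semantic = jaccard(to_concepts(supplier_a), to_concepts(supplier_b))

print(f"raw token overlap: {raw:.2f}")
print(f"concept overlap:   {semantic:.2f}")
```

With this toy vocabulary, the two descriptions share 4 of 9 distinct raw tokens (about 0.44) but 6 of 7 distinct concepts (about 0.86), because abbreviations such as "SS" and "PH" resolve to the same concepts as the spelled-out words.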
The bottom line. SOA can’t be successful without an enterprise information metadata solution that provides as much flexibility as loosely coupled application services. Using semantic technology to build that solution, based on an industry standard data model if possible, is likely to represent today’s best practice. Several companies offer a package that starts with the industry model and allows you to extend it for your environment.
Similarly, MDM systems for complex data such as products or clinical diagnoses and treatments need a flexible metadata structure that can accurately and automatically relate inconsistent data from many different structured and unstructured sources for use in different contexts in real time. Semantic technology supports a unified view via a common model while providing the needed flexible expression and interpretation of rules and data relationships.
1. The vision of the Semantic Web was first described in a ground-breaking paper by Sir Tim Berners-Lee, Professor Jim Hendler and Ora Lassila (The Semantic Web, Scientific American, May 2001).