by Marcia Kaufman
I just returned from the IBM Information on Demand 2006 conference, a meeting that pulled together the product offerings from IBM and its many acquisitions in information management. Over the past five years, IBM has bought more than 25 companies in the information management space, including high-profile acquisitions such as Informix, Ascential Software, and FileNet. In addition, there have been smaller buys such as SRD for identity resolution, Venetica for managing unstructured data, Green Pasture for enterprise content management, Unicorn for metadata management, and Trigo for data synchronization, to name but a few. Not surprisingly, IBM has had its hands full integrating these offerings into a comprehensive strategy. IBM is using the objective of delivering information as a service to create a data strategy that supports the strong movement toward Service Oriented Architecture (SOA). In this piece, I offer a glimpse into how data is transformed through SOA and IBM’s new offering, the IBM Information Server.
With the movement toward Service Oriented Architecture, organizations are recognizing that the information stored in a vast array of data silos needs to become a set of shared services accessible across the organization. For this to become a reality, a different approach to information management is required: a life cycle approach that ensures the business can trust where its data comes from, understands who is responsible for it, and can reuse it as needed throughout the organization. This approach is called delivering information as a service.
Traditionally, customer or product data are tightly connected to applications for a specific line of business or division. These data may be monitored and, therefore, trusted by the responsible business entity. But confusion and quality problems may ensue when the business tries to make decisions based on customer and product data that are spread across separate business entities.
For example, integrating data across the data stores of merged companies can result in misinterpretations of the combined customer base. The definition of a customer may vary between the companies, making it hard to determine top customers without a lot of manual intervention. The lack of a single view of the customer across the various lines of business can lead to missed opportunities for the business and dissatisfied customers.
The Information Server is IBM’s response to an increasing corporate imperative to create an approach to information management that provides the various users of business information with the ability to control their own data sources while maintaining flexibility for reuse. With a great deal of promotional fanfare and the confidence that comes from having over 75 external customers beta test your product, IBM introduced the IBM Information Server. Based on a very thorough demonstration of the Information Server’s capabilities by IBM’s product and development teams as well as some hands-on demos I took advantage of during the conference, I got a pretty good introduction to the new product.
The IBM Information Server provides customers with a unified approach to leveraging their data assets wherever they may reside in the organization. It provides a framework for delivering trusted data in a distributed manner, and it gives a consistent look and feel to the group of IBM WebSphere-branded technologies designed for understanding, cleansing, transforming, and federating the diverse information sources found in an organization.
The Information Server is not really a single product. It resembles the world of the application server, which aggregated different technologies and allowed data to be passed easily and consistently back and forth between a database, an application, and, through a Web browser, the end user. Prior to the introduction of the application server, early Web applications required many different types of connections, causing them to take a very long time to load in the end user's browser.
Likewise, the IBM Information Server includes multiple products from both IBM and Ascential Software (which IBM purchased in May 2005) that are aggregated to enable a more efficient and consistent information integration process. The products can be used separately, but the combined capabilities, which embed data quality tools in the information integration process, help customers raise the bar on quality.
The main components of the Information Server are:
- WebSphere Information Analyzer. Analyzes the structure of the data and determines the relationships among data across fields and sources. Business-side subject matter experts and data analysts can use this product to ensure that integration of data sources is based on a thorough understanding of the available data.
- WebSphere Business Glossary. This is where the business metadata (such as how the data are used and the rules for governing those data) are recorded to provide a business context for the data. Business subject matter experts and other business users may deploy this component to establish an understanding of how the data elements are defined by the business.
- WebSphere QualityStage. Based on certain rules that can be adjusted by the business, this product is used to clean up the duplicate records and enable organizations to create a single view of their customers or products across various data stores.
- WebSphere DataStage. This product is designed to carry out the transformation of information across data stores. It is the core for the primary ETL (extract, transform, and load) functions of the integration process.
- WebSphere Federation Server. This technology helps hide the complexity of the various data stores from the end-user. It provides a way to access data across the organization regardless of its location, format, and application.
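To make the "single view of the customer" idea more concrete, here is a minimal Python sketch of rule-based duplicate matching. It is only an illustration of the general technique and does not use or represent any IBM product API. The record fields (`source`, `name`, `postal_code`, `spend`) and the functions `match_key` and `single_view` are hypothetical names chosen for this example, and the matching rule (same normalized name and postal code) is an assumption that a business would adjust to its own data.

```python
from collections import defaultdict


def match_key(record):
    """Hypothetical matching rule: same normalized name and postal code."""
    name = record["name"].lower().replace(".", "").replace(",", "")
    name = " ".join(name.split())  # collapse repeated whitespace
    return (name, record["postal_code"].replace(" ", ""))


def single_view(records, rule=match_key):
    """Group records that the rule treats as the same customer and merge them."""
    groups = defaultdict(list)
    for record in records:
        groups[rule(record)].append(record)

    merged = []
    for matches in groups.values():
        merged.append({
            "name": matches[0]["name"],  # keep the first spelling seen
            "postal_code": matches[0]["postal_code"],
            "sources": sorted({m["source"] for m in matches}),
            "total_spend": sum(m["spend"] for m in matches),
        })
    return merged


# Illustrative records from two hypothetical data stores.
records = [
    {"source": "crm", "name": "Acme Corp.", "postal_code": "02139", "spend": 1200},
    {"source": "billing", "name": "ACME  Corp", "postal_code": "02139", "spend": 800},
    {"source": "crm", "name": "Globex Inc", "postal_code": "10001", "spend": 500},
]
customers = single_view(records)
```

Because the rule is passed in as a parameter, a different definition of "the same customer" can be swapped in without touching the merge logic, which mirrors the point that the matching rules should be adjustable by the business.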
IBM’s goal for customers who deploy the Information Server is to bring together the components of an information management strategy so that the overall process of integrating data can be reused in many different business situations rather than applying these efforts to a single project.
The IBM Information Server is an example of what it will take for a company to begin to move to delivering information as a service. It is based on SOA principles so that each component is designed both to work on its own, if necessary, and at the same time to work as part of a loosely coupled environment. This would enable customers to move efficiently from understanding to cleansing to transforming the data sources within the organization and ultimately to delivering consistent and trusted information to the business users.
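The idea of components that work on their own and also as part of a loosely coupled flow can be sketched in a few lines of Python. This is a generic illustration under stated assumptions, not a description of how the Information Server is implemented: the stage functions `cleanse` and `transform`, the helper `compose`, and the field names `cust_id`, `nm`, `customer_id`, and `customer_name` are all hypothetical.

```python
def cleanse(records):
    """Trim whitespace in string fields and drop records with no customer id."""
    cleaned = []
    for record in records:
        record = {
            key: value.strip() if isinstance(value, str) else value
            for key, value in record.items()
        }
        if record.get("cust_id"):
            cleaned.append(record)
    return cleaned


def transform(records):
    """Map source field names onto a shared (hypothetical) target schema."""
    return [
        {"customer_id": r["cust_id"], "customer_name": r["nm"].title()}
        for r in records
    ]


def compose(*stages):
    """Chain independent stages; each takes and returns a list of records."""
    def run(records):
        for stage in stages:
            records = stage(records)
        return records
    return run


pipeline = compose(cleanse, transform)
raw = [
    {"cust_id": " 17 ", "nm": " acme corp "},
    {"cust_id": "", "nm": "unknown"},
]
result = pipeline(raw)
```

Each stage shares only a simple contract (a list of records in, a list of records out), so `cleanse` can be called by itself on one project and reused inside a longer pipeline on another.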