Software Labs
Proven Data Automation

Magic Quadrant for Data Integration Tools  
September 22, 2008

Organizations increasingly view investments in data integration tools as a strategic basis for enterprise data management. Vendors with capabilities across multiple styles of data delivery, supported by strong metadata management and service enablement, are becoming the focus of market demand.

The data integration tools market is gaining new momentum as organizations recognize the role of these technologies in support of high-profile initiatives such as master data management (MDM), business intelligence (BI) and delivery of service-oriented architectures (SOAs). Recent focus on cost control has made data integration tools a surprising priority as organizations realize that the "people" commitment for implementing and supporting custom-coded or semimanual data integration approaches is no longer reasonable. Vendor consolidation continues, driven by the convergence of single-purpose tools into data integration suites or platforms. While most vendors still approach this market with multiple products, metadata-driven architectures supporting a range of data delivery styles continue to emerge. Organizations seeking data integration tools must assess their current and future requirements and map them against product functionality, including support for a range of data integration patterns and latencies. Buyers must recognize that, because this market is still evolving, disruptions caused by merger and acquisition activity are likely as smaller vendors with valuable technology continue to be subsumed into larger entities to form more complete data integration tools portfolios.

Market Overview

The discipline of data integration comprises the practices, architectural techniques and tools for achieving consistent access to, and delivery of, data across the spectrum of data subject areas and data structure types in the enterprise, to meet the data consumption requirements of all applications and business processes. As such, data integration capabilities are at the heart of the information-centric infrastructure and will power the frictionless sharing of data across all organizational and system boundaries. Contemporary pressures are leading to an increased investment in data integration in all industries and geographic regions. Business drivers, such as the imperative for speed to market and agility to change business processes and models, are forcing organizations to manage their data assets differently. Simplification of processes and of the IT infrastructure is necessary to achieve transparency, and transparency requires a consistent and complete view of the data that represents the performance and operation of the business. Data integration is a critical component of an overall enterprise information management (EIM) strategy that can address these data-oriented issues.

From a technology point of view, data integration tools were traditionally delivered via a set of related markets, with vendors in each market offering a specific style of data integration tool. Traditionally, tools for extraction, transformation and loading (ETL) used predominantly in data warehouse/mart implementations held the largest market shares in this overall space and formed a "center of gravity" for market consolidation. BI efforts, with their focus on metadata, promoted the infusion of the combined data integration market with metadata management capabilities. However, in the past two years, vendors and leading organizations have been pursuing a strategy of centralizing their capabilities for semantic interpretation and reconciliation as a service within a platform and reducing, in relative terms, their emphasis on connectivity and specific data delivery styles. A variety of other related markets, such as those for data quality tools, adapters and data modeling tools, overlap with the data integration tools space. The result of this historical fragmentation in the markets is the equally fragmented and complex way in which data integration is accomplished in large enterprises — different teams using different tools with little consistency, lots of overlap and redundancy, and no common management and leverage of metadata. Technology buyers have been forced to acquire a portfolio of tools from multiple vendors to amass the capabilities necessary to address the full range of their data integration requirements. The market has not yet reached a point at which data integration is typically achieved via a single platform or suite, but notable improvements exist — especially relative to data services delivery and metadata management.

With the emergence of the data integration tools market, separate and distinct submarkets continue to converge, both at a vendor and technology level. This is being driven by buyer demands (for example, organizations realizing they need to think about data integration holistically and have a common set of data integration capabilities they can use across the enterprise, particularly when working on SOA initiatives that have many different consumers working in various contexts). It is also being driven by vendors' actions (for example, vendors in individual data integration submarkets organically expanding their capabilities into neighboring areas, and acquisition activity bringing vendors from multiple submarkets together). The result is a market for complete data integration tools that address a range of different data integration styles and are based on common design tooling, metadata and runtime architecture. This market has supplanted the former data integration tools submarkets, such as ETL, and has become the competitive landscape in which Gartner evaluates vendors for placement within this Magic Quadrant. While the traditional ETL vendor submarket was the driver for consolidation, it is important to note that, even after vendors acquired federation or message-based integration capabilities, the market is not showing wide implementation of either of these approaches. Vendors that supply all three types of integration techniques generally exhibit an overwhelming strength in one, and this holds even for vendors that acquired market leaders in a second submarket.

Gartner estimates the size of the market for data integration tools at approximately $1.44 billion as of the end of 2007, and believes it is growing at a compound annual rate of more than 17% (see "Market Trends: Data Integration Tools, Worldwide, 2007-2012"). Services revenue from data integration tools implementations is also growing, with the time and effort required to implement the tools varying widely depending on the scope and complexity of the deployment (see "Toolkit Sample Template: Data Integration LOE Estimator").

During 2H07 and 1H08, the market showed the most pronounced dichotomy in execution that we have seen since 2006, when the Magic Quadrant for Data Integration Tools first replaced the Magic Quadrant for ETL. The top execution vendors (IBM, Informatica, SAP-Business Objects, Oracle and Microsoft) come from the more traditional data integration tools heritage, and their execution capabilities arise from each vendor's traditional strength in supporting integration of tabular (structured) data. The vendors in the Leaders quadrant have that traditional background, plus the incorporation of data integration for newly emergent (unstructured) data types, as well as the ability to support all major styles of data delivery. The remaining vendors demonstrate an interesting response to market demands and represent the potential for disruptive practices in data integration. While ETI, Syncsort, Open Text and Pitney Bowes Software are successful in the traditional data integration tools market (providing only minimal support beyond ETL), the others (all the Visionaries plus Sybase) are challenging the traditional Leaders with a different strategy. Some, including Sun Microsystems and Tibco Software, recognize the emerging importance of services architectures; the Leaders also offer this type of solution, but the Visionaries recognize this area as an opportunity. SAS, iWay Software, Pervasive Software and Sybase are each pursuing a focus that differs from that of the Leaders in terms of market execution: channels, integration tightly linked with quality and analytics, and/or blending search with integration. The Leaders also possess features and functionality in these areas, but the Visionaries differentiate on these characteristics (among others).

The challenge for the Visionaries will be to expand on execution, which depends in part on their vision focus matching the emerging needs of the market related to services-style delivery, metadata and data quality.

Market Definition/Description

The data integration tools market comprises vendors that offer software products to enable the construction and implementation of data access and delivery infrastructure for a variety of data integration scenarios, including:

  • Data acquisition for BI and data warehousing: Extracting data from operational systems, transforming and merging that data, and delivering it to integrated data structures for analytic purposes. BI and data warehousing remain a mainstay of the demand for data integration tools.
  • Creation of integrated master data stores: Enabling the consolidation and rationalization of the data, representing critical business entities such as customers, products and employees. MDM may or may not be subject-based, and data integration tools can be used to build the data consolidation and synchronization processes that are key to success.
  • Data migrations/conversions: These were traditionally addressed most often via the custom coding of conversion programs, but data integration tools are increasingly addressing the data movement and transformation challenges inherent in the replacement of legacy applications and in consolidation efforts during merger and acquisition activities.
  • Synchronization of data between operational applications: Similar in concept to each of the previous scenarios, this use of data integration tools provides the capability to ensure database-level consistency across applications, both on an internal and interenterprise basis, and in a bidirectional or unidirectional manner.
  • Creation of federated views of data from multiple data stores: Data federation, often referred to as enterprise information integration (EII), is growing in popularity as an approach for providing real-time integrated views across multiple data stores without physical movement of data. Data integration tools are increasingly including this type of virtual federation capability.
  • Delivery of data services in an SOA context: An architectural technique, rather than a data integration usage itself, data services are the emerging trend for the role and implementation of data integration capabilities within SOAs. Data integration tools will increasingly enable the delivery of many types of data services.
  • Unification of structured and unstructured data: Not a specific use-case itself, and relevant to each of the above scenarios, there is an early but growing trend toward leveraging data integration tools for merging both structured and unstructured data sources, as organizations work on delivering a holistic information infrastructure that addresses all data types.

Gartner has defined several classes of functional capabilities that vendors of data integration tools must possess to deliver optimal value to organizations in support of a full range of data integration scenarios:

  • Connectivity/adapter capabilities (data source and target support).
  • Data delivery capabilities.
  • Data transformation capabilities.
  • Metadata and data modeling capabilities.
  • Design and development environment capabilities.
  • Data governance capabilities (data quality, profiling and mining).
  • Runtime platform capabilities.
  • Operations and administration capabilities.
  • Architecture and integration.
  • Service-enablement capabilities.

Connectivity/Adapter Capabilities (Data Source and Target Support)

The ability to interact with a range of different data structure types, including:

  • Relational databases.
  • Legacy and nonrelational databases.
  • Various file formats.
  • XML.
  • Packaged applications such as CRM and supply chain management.
  • Industry-standard message formats such as electronic data interchange (EDI), SWIFT and Health Level Seven (HL7).
  • Message queues, including those provided by application integration middleware products and standards-based products (such as Java Message Service [JMS]).
  • Emergent data types, such as e-mail, Web sites, office productivity tools and content repositories.

In addition, data integration tools must support different modes of interaction with this range of data structure types, including:

  • Bulk acquisition and delivery.
  • Granular trickle-feed acquisition and delivery.
  • Changed-data capture (ability to identify and extract modified data).
  • Event-based acquisition (time-based or data-value-based).
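As an illustration of the changed-data capture mode listed above, the following is a minimal Python sketch that identifies modified data by comparing two snapshots of a source. It is not taken from any vendor's product: the function name `capture_changes` and the sample customer records are hypothetical, and commercial tools commonly rely on database logs or triggers rather than full snapshot comparison.

```python
from typing import Any, Dict, Hashable, List, Tuple

Row = Dict[str, Any]


def capture_changes(
    previous: Dict[Hashable, Row],
    current: Dict[Hashable, Row],
) -> List[Tuple[str, Hashable, Row]]:
    """Return (operation, key, row) tuples describing how `current` differs from `previous`."""
    changes: List[Tuple[str, Hashable, Row]] = []
    # Rows present now: either new (insert) or modified (update).
    for key, row in current.items():
        if key not in previous:
            changes.append(("insert", key, row))
        elif row != previous[key]:
            changes.append(("update", key, row))
    # Rows that existed before but are gone now (delete).
    for key, row in previous.items():
        if key not in current:
            changes.append(("delete", key, row))
    return changes


# Hypothetical customer snapshots keyed by customer ID.
yesterday = {
    1: {"name": "Acme", "city": "Boston"},
    2: {"name": "Globex", "city": "Austin"},
}
today = {
    1: {"name": "Acme", "city": "Chicago"},
    3: {"name": "Initech", "city": "Dallas"},
}

for change in capture_changes(yesterday, today):
    print(change)
```

Only the changed rows are passed downstream, which is what distinguishes this mode from bulk acquisition of the full data set.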

Data Delivery Capabilities

The ability to provide data to consuming applications, processes and databases in a variety of modes, including:

  • Physical bulk data movement between data repositories.
  • Federated views formulated in memory.
  • Message-oriented movement via encapsulation.
  • Replication of data between homogeneous or heterogeneous database management systems (DBMSs) and schemas.

In addition, support for delivery of data across the range of latency requirements is important:

  • Scheduled batch delivery.
  • Streaming/real-time delivery.
  • Event-driven delivery.

Data Transformation Capabilities

Built-in capabilities for achieving data transformation operations of varying complexity, including:

  • Basic transformations, such as data type conversions, string manipulations and simple calculations.
  • Intermediate complexity transformations, such as lookup and replace operations, aggregations, summarizations, deterministic matching and management of slowly changing dimensions.
  • Complex transformations, such as sophisticated parsing operations on free-form text and rich media.

In addition, the tools must provide facilities for development of custom transformations and extension of packaged transformations.
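The difference between basic and intermediate-complexity transformations can be sketched in a few lines of Python. The example below is illustrative only: the record layout, the `COUNTRY_LOOKUP` reference table and the function names are hypothetical, and a packaged tool would normally expose these operations through a graphical designer rather than hand-written code.

```python
from collections import defaultdict
from decimal import Decimal
from typing import Any, Dict, Iterable

# Hypothetical reference table for the lookup-and-replace step.
COUNTRY_LOOKUP = {"US": "United States", "DE": "Germany"}


def transform(record: Dict[str, str]) -> Dict[str, Any]:
    """Apply basic transformations plus a lookup to one source record."""
    return {
        "customer": record["customer"].strip().title(),  # string manipulation
        "amount": Decimal(record["amount"]),              # data type conversion
        "country": COUNTRY_LOOKUP.get(record["country"], "Unknown"),  # lookup and replace
    }


def total_by_country(records: Iterable[Dict[str, Any]]) -> Dict[str, Decimal]:
    """Aggregate transformed records into a total amount per country."""
    totals: Dict[str, Decimal] = defaultdict(Decimal)
    for rec in records:
        totals[rec["country"]] += rec["amount"]
    return dict(totals)


# Hypothetical raw input, as it might arrive from a flat file.
raw = [
    {"customer": "  acme corp ", "amount": "10.50", "country": "US"},
    {"customer": "globex", "amount": "4.25", "country": "US"},
    {"customer": "umbrella", "amount": "7", "country": "FR"},
]

cleaned = [transform(r) for r in raw]
print(total_by_country(cleaned))
```

A custom transformation, in the sense of the paragraph above, would amount to replacing or extending `transform` with logic specific to the organization.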

Metadata and Data Modeling Capabilities

As the increasingly important heart of data integration capabilities, metadata management and data modeling requirements include:

  • Automated discovery and acquisition of metadata from data sources, applications and other tools.
  • Data model creation and maintenance.
  • Physical to logical model mapping and rationalization.
  • Defining model-to-model relationships via graphical attribute-level mapping.
  • Lineage and impact analysis reporting, via graphical and tabular format.
  • An open metadata repository, with the ability to share metadata bidirectionally with other tools.
  • Automated synchronization of metadata across multiple instances of the tools.
  • Ability to extend the metadata repository with customer-defined metadata attributes and relationships.
  • Documentation of project/program delivery definitions and design principles in support of requirements definition activities.
  • Business analyst/end-user interface to view and work with metadata.
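Lineage and impact analysis, as listed above, can be thought of as traversals over a graph of attribute-level mappings: lineage walks upstream from a target attribute, and impact analysis walks downstream from a source attribute. The Python sketch below illustrates that idea under stated assumptions; the `LineageGraph` class and the attribute names are hypothetical and do not represent any specific metadata repository.

```python
from collections import defaultdict, deque
from typing import Dict, Set


class LineageGraph:
    """Directed graph of attribute-level mappings (source -> target)."""

    def __init__(self) -> None:
        self._downstream: Dict[str, Set[str]] = defaultdict(set)
        self._upstream: Dict[str, Set[str]] = defaultdict(set)

    def add_mapping(self, source: str, target: str) -> None:
        self._downstream[source].add(target)
        self._upstream[target].add(source)

    @staticmethod
    def _reach(start: str, edges: Dict[str, Set[str]]) -> Set[str]:
        # Breadth-first traversal collecting every attribute reachable from `start`.
        seen: Set[str] = set()
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in edges.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen

    def impact(self, attribute: str) -> Set[str]:
        """Attributes affected if `attribute` changes (downstream)."""
        return self._reach(attribute, self._downstream)

    def lineage(self, attribute: str) -> Set[str]:
        """Attributes that `attribute` is derived from (upstream)."""
        return self._reach(attribute, self._upstream)


# Hypothetical mappings from two sources through staging to a report.
graph = LineageGraph()
graph.add_mapping("crm.customer.name", "staging.customer.full_name")
graph.add_mapping("erp.account.name", "staging.customer.full_name")
graph.add_mapping("staging.customer.full_name", "warehouse.dim_customer.name")
graph.add_mapping("warehouse.dim_customer.name", "report.top_customers.name")

print(sorted(graph.impact("crm.customer.name")))
print(sorted(graph.lineage("warehouse.dim_customer.name")))
```

The same structure could back both the graphical and the tabular reports mentioned in the list, since each is a different rendering of the reachable set.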

Design and Development Environment Capabilities

Facilities for enabling the specification and construction of data integration processes, including:

  • Graphical representation of repository objects, data models and data flows.
  • Workflow management for the development process, addressing requirements such as approvals and promotions.
  • Granular role-based and developer-based security.
  • Team-based development capabilities, such as version control and collaboration.
  • Functionality to support reuse across developers and projects, and facilitate identification of redundancies.
  • Support for testing and debugging.

Data Governance Capabilities (Data Quality, Profiling and Mining)

Mechanisms for aiding the understanding and assurance of quality of data over time, including interoperability with:

  • Data profiling tools.
  • Data mining tools.
  • Data quality tools.

Runtime Platform Capabilities

Breadth of support for hardware and operating systems on which data integration processes may be deployed, specifically:

  • Mainframe environments, such as IBM z/OS and z/Linux.
  • Midrange environments, such as IBM System i (formerly AS/400) or HP Tandem.
  • Unix-based environments.
  • Wintel environments.
  • Linux environments.

Operations and Administration Capabilities

Facilities for enabling adequate ongoing support, management, monitoring and control of data integration processes implemented via the tools, such as:

  • Error-handling functionality, both predefined and customizable.
  • Monitoring and control of runtime processes.
  • Collection of runtime statistics to determine use and efficiency, as well as an application-style interface for visualization and evaluation.
  • Security controls, for both data "in flight" and administrator processes.
  • Runtime architecture that ensures performance and scalability.

Architecture and Integration

The degree of commonality, consistency and interoperability between the various components of the data integration toolset, including:

  • Minimal number of products (ideally one) supporting all data delivery modes.
  • Common metadata (single repository) and/or the ability to share metadata across all components and data delivery modes.
  • Common design environment for supporting all data delivery modes.
  • Ability to switch seamlessly and transparently between delivery modes with minimal rework.
  • Interoperability with other integration tools and applications, via certified interfaces and robust application programming interfaces (APIs).
  • Efficient support for all data delivery modes regardless of runtime architecture type (centralized server engine vs. distributed runtime).

Service-Enablement Capabilities

As acceptance of data services concepts continues to grow, data integration tools must exhibit service-oriented characteristics and provide support for SOA deployments, such as:

  • Ability to deploy all aspects of runtime functionality as data services.
  • Management of publication and testing of data services.
  • Interaction with service repositories and registries.
  • Service enablement of the development and administration environments, such that external tools and applications can dynamically modify and control runtime behavior of the tools.

Inclusion and Exclusion Criteria

For vendors to be included in this Magic Quadrant, they had to meet the following requirements:

  • Possess within their technology portfolio the subset of capabilities identified by Gartner as most critical from within the overall range of capabilities expected in data integration tools. Specifically, vendors must deliver the following functional requirements:
    • Range of connectivity/adapter support (sources and targets): native access to relational DBMS products, plus access to nonrelational legacy data structures, flat files, XML, and message queues.
    • Mode of connectivity/adapter support (against a range of sources and targets): bulk/batch and changed-data capture.
    • Data delivery modes support: bulk/batch (ETL-style) delivery, plus at least one additional mode (federated views, message-oriented delivery or data replication).
    • Data transformation support: at a minimum, packaged capabilities for basic transformations (such as data type conversions, string manipulations and calculations).
    • Metadata and data modeling support: automated metadata discovery, lineage and impact analysis reporting, and an open metadata repository including mechanisms for bidirectional sharing of metadata with other tools.
    • Design and development support: graphical design/development environment and team development capabilities (such as version control and collaboration).
    • Data governance support: ability to interoperate at a metadata level with data profiling and/or data quality tools.
    • Runtime platform support: Windows, Unix or Linux operating systems.
    • Service enablement (ability to deploy functionality as services conforming to SOA principles).

For this iteration of the Magic Quadrant, we added support for interaction with message queues, changed-data capture capabilities, and the ability to deploy functionality as data services. These additions reflect the importance of those capabilities in the ideal of a comprehensive information-centric infrastructure, as well as increasing demand for this functionality in the market.

In addition, vendors had to:

  • Generate at least $20 million of annual software revenue from data integration tools or maintain at least 300 production customers.
  • Support data integration tools customers in at least two of the major geographic regions (North America, Latin America, Europe and Asia/Pacific).
  • Have customer implementations that reflect the use of the tools at an enterprise (cross-departmental and multiproject) level.

We excluded vendors focusing only on one specific data subject area (for example, only customer data integration), a single industry, or their own data models and architectures.

Many other vendors of data integration tools exist beyond those included in this Magic Quadrant. However, most do not meet the above criteria and, therefore, we have not included them in this analysis. Market trends in the past three years indicate that organizations want to use data integration tools that provide flexible data access, delivery and operational management capabilities within a single vendor solution. Excluded vendors frequently provide products to address one very specific style of data delivery (for example, only data federation) but cannot support other styles. Others provide a range of functionality, but operate only in a single region or support only narrow, departmental implementations. Some vendors meet all the functional, deployment and geographic requirements but are very early in their maturity and have limited revenue and few production customers. The following vendors are sometimes considered by Gartner clients alongside those appearing in the Magic Quadrant, either when deployment needs are aligned with their specific capabilities or because they are newer market entrants with relevant capabilities:

Ab Initio, Lexington, Massachusetts, www.abinitio.com — Application development toolbox (Co>Operating System) and component library for metadata management and data integration.

Alebra Technologies, Minneapolis, Minnesota, www.alebra.com — Parallel Data Mover for cross-platform file and database copying and sharing.

Apatar, Chicopee, Massachusetts, www.apatar.com — Open-source data integration tools focused on ETL and data federation scenarios.

Attunity, Burlington, Massachusetts, www.attunity.com — A range of data-integration-oriented products, including adapters (Attunity Connect), change data capture (Attunity Stream) and data federation (Attunity Federate) for various platforms and database/file types.

CA, Islandia, New York, www.ca.com — Advantage Data Transformer provides ETL-oriented data integration. InfoRefiner provides replication and propagation capabilities for mainframe data repositories.

CDB Software, Houston, Texas, www.cdbsoftware.com — CDB/Delta provides change data capture and replication capabilities for IBM DB2 on the z/OS platform.

Composite Software, San Mateo, California, www.compositesw.com — Composite Information Server provides data federation/EII capabilities and supports delivery of data services.

Datawatch, Chelmsford, Massachusetts, www.datawatch.com — The Monarch Data Pump product provides ETL functionality with a bias toward extracting data from report text, PDF files, spreadsheets and other less-structured data sources.

Denodo Technologies, Palo Alto, California and Madrid, Spain, www.denodo.com — The Denodo Platform provides data federation and mashup capabilities for joining structured data sources with data from Web sites, documents and other less-structured repositories.

Embarcadero Technologies, San Francisco, California, www.embarcadero.com — The DT/Studio ETL tool provides support for a range of relational and other data sources, and integrates with the vendor's data modeling and database design tools.

ETL Solutions, Blaenau Ffestiniog, U.K., www.etlsolutions.com — Transformation Manager provides a metadata-driven toolset for the authoring, testing, debugging and deployment of various data integration requirements.

Exeros, Santa Clara, California, www.exeros.com — The Discovery product automates the process of discerning the business rules that enable mapping and transformation of data between dissimilar data structures.

expressor software, Burlington, Massachusetts, www.expressor-software.com — The expressor product is based on a semantic approach to designing and managing data integration processes.

GoldenGate Software, San Francisco, California, www.goldengate.com — Real-time, heterogeneous data replication capabilities provided by the Transactional Data Management (TDM) software platform.

Ikan Software, Mechelen, Belgium, www.etl4all.com — Java-based ETL technology named ETL4all, supporting transformation servers on Windows, Linux, Unix and IBM iSeries.

Innovative Routines International (CoSort), Melbourne, Florida, www.cosort.com — The Fast Extract and SortCL tools provide for rapid unloading and transformation of data in Oracle databases in support of ETL processes.

Jitterbit, Oakland, California, www.jitterbit.com — Freely downloadable software with a focus on both application integration (event- and message-based) and data integration.

Kalido, Burlington, Massachusetts, and London, U.K., www.kalido.com — The Kalido Active Information Management software enables dynamic data modeling and change management for data warehouses and master data environments.

Metatomix, Dedham, Massachusetts, www.metatomix.com — Follows a semantics-based approach to creation of data services and federated views of data across multiple data sources.

Pentaho, Orlando, Florida, www.pentaho.org — A provider of open-source BI solutions, Pentaho has added data integration tools to its portfolio by leveraging the Kettle open-source project and providing services and support.

Progress Software, Bedford, Massachusetts, www.progress.com — The DataXtend and DataDirect product lines provide tools for data access, replication and synchronization.

Quest Software, Aliso Viejo, California, www.quest.com — SharePlex provides real-time replication support for Oracle DBMS environments and is targeted primarily at high-availability applications.

Red Hat/MetaMatrix, Raleigh, North Carolina, www.redhat.com — The MetaMatrix Server, Enterprise and Query products support creation of data models and model-driven federated views of data.

Relational Solutions, Westlake, Ohio, www.relationalsolutions.com — The BlueSky Integration Studio provides ETL capabilities in a simplified, low-cost toolset that runs in the Windows environment.

SchemaLogic, Kirkland, Washington, www.schemalogic.com — Creation and maintenance of data models (Workshop), business models (SchemaServer), and the ability to propagate models and data across applications (Integration Service).

Seagull Software, Atlanta, Georgia, www.seagullsoftware.com — SmartDB for data migrations to the Oracle E-Business Suite.

SOALogix, Reston, Virginia, www.soalogix.com — The Confero SOA product offers a platform for the creation and delivery of data services for SOA.

Software AG, Darmstadt, Germany, www.softwareag.com — The Enterprise Information Integrator product provides data federation capabilities and is geared toward SOA deployments. The vendor's acquisition of webMethods in 2007 added process-oriented integration capabilities.

Software Labs, Roseville, California, www.softlabsco.com — The xFusion Studio product provides ETL functionality positioned toward a range of use-cases including BI and migrations.

Sypherlink, Dublin, Ohio, www.sypherlink.com — Metadata discovery and mapping via Harvester, and access to data sources for creation of integrated views via Exploratory Warehouse.

Talend, Los Altos, California, and Suresnes, France, www.talend.com — Open Studio is an open-source tool that primarily supports ETL-oriented implementations and is provided for on-premises deployment as well as in a software-as-a-service (SaaS) delivery model.

TigerLogic (formerly Raining Data), Irvine, California, www.tigerlogic.com — TigerLogic XDMS provides XML-based data federation and persistence, as well as delivery of data services.

Vamosa, Glasgow, U.K., and Boston, Massachusetts, www.vamosa.com — Provides content integration and migration, aimed at synchronization and consolidation of document repositories, via its Content X-Change and Content Migrator products.

Vision Solutions, Irvine, California, www.visionsolutions.com — Real-time database replication functionality is provided in the Vision Replicate1 product.

WhereScape, Portland, Oregon, www.wherescape.com — WhereScape Red enables rapid creation and maintenance of data warehouses, including ETL functionality.

XAware, Colorado Springs, Colorado, www.xaware.com — Provides support for the access, integration and service enablement of data sources via its XA-Suite product.

 

Copyright © 2010 Software Labs Inc. All rights reserved.