Designing a Data Warehouse


By Michael Haisten

In my white paper Planning For A Data Warehouse, I covered the essential issues of the data warehouse planning process.¹ This time I move on to take a detailed look at the topic of warehouse design. In this discussion I focus on design issues often overlooked in implementing several of the warehouse services outlined in my previous article:

  • Acquisition - bringing data into the warehouse from the operational environment
  • Preparation - packaging the data for a variety of uses by a diverse set of information consumers
  • Navigation - assisting consumers in finding and using warehouse resources
  • Access - supporting direct interaction with data warehouse data by consumers using a variety of tools
  • Delivery - providing agent-driven collections and subscription information delivery to consumers

Creating the Data Warehouse Data Model
Moral: A good data warehouse model is a synthesis of diverse non-traditional factors.
A data warehouse model must be comprehensive, current, and dynamic, and it must provide a complete picture of the physical reality of the warehouse as it evolves. The data warehouse database schema should be generated and maintained directly from the model. The model is not just a design aid, but an ongoing communication and management tool.

Managing the dynamic nature of a data warehouse is one of the most daunting challenges. New
feeds must be supported from new and existing sources. New configurations of data must be packaged with tools to meet continually changing needs. Under-utilized datasets must be retired or consolidated with others. Staying on top of these changes is easier when the model is religiously used to maintain the database structure.

A Hybrid of Concepts, Techniques and Methods
A good data warehouse model is a hybrid representing the diversity of data containers¹ required to acquire, store, package, and deliver sharable data.

To be useful, a warehouse data model must contain physical representations, such as summaries and derived data. The basic entities and attributes that will become core atomic tables are just the starting point for a data warehouse. A purely logical model contains none of the substance that defines the primary warehouse deliverables. These deliverables are the many aggregates, packages or collections of data produced to satisfy consumer access needs.

The model must also include archive
data and meta data as well as core data and collections. An archive is a managed, historical extension of the on-line warehouse tables (see "Operations: Implementing a True Archive" below). You must include the definition of the archive tables in the model to record the full context of available data. Meta data is best modeled side-by-side with the data itself and stored co-resident with the warehouse tables.

There is even a more fundamental way in which the data warehouse model is a hybrid. The model is not created using a single methodology. In fact, the better models are a synthesis of differing techniques each contributing a part of the overall result.

Bottom Up Analysis
To design a data warehouse, you must learn the needs of the information consumer. You must decide how the data should be organized. But often designers miss an even more fundamental question, which should come first: What data is
available for cross-functional sharing and analysis? The question "What do you want?" is often answered with the question "I don't know, what have you got?" This is a common circular track in the requirements process. A strict, by-the-book modeler or requirements analyst will loop back with the reply "Well, I really can't say, since what we most need to know is what you know you need."

Unfortunately, this is the wrong way out of the loop when designing a data warehouse. What data is available often influences what can be done and even what questions are worth asking. What the information consumer
will need in the future may be more important than what they know they need now. A major part of our job is to help them understand what they don't know they need.

The problem is that we often have no clear picture of what data is locked up in the
data jailhouses that are our operational systems. Data models are not current, offer insufficient detail, or, more likely, do not exist at all. Trying to interpret the fragmented, denormalized layout of typical legacy systems is a daunting task if done manually.

One technique to consider is the use of a
reverse-engineering tool. Such a tool reads the legacy system data directly to produce a tentative logical model. Several deliverables are produced which we can use to develop a data warehouse model:

  • A consolidated list of unique data elements
  • An indication of the fundamental relationships which exist
  • A normalized structure which can serve as a starting point for both re-engineering the system and for the design of data warehouse core tables

A candidate list of data warehouse data elements can be produced without any fundamental requirements analysis at all. Start with the consolidated list of data elements from the operational source, then carry out a form of triage. Scan the available data element list and divide them quickly into those that are 1) operational and transitory in nature, 2) of analytic and historical value, and 3) of unknown use or value.

This is easier than it sounds. The first category includes flags and indicators reflecting states of operational processes, control totals, and derived values produced to support transaction processing. Many data elements are replicated from one file to another for performance efficiency.

When in doubt, ask the question, does this element have any lasting meaning after completion of this process? Does it have any historical significance? As an example, a field called
"awaiting shipment" is clearly one of these transitory indicators. A shipment record contains data of permanent value but "awaiting shipment" is worthless after the fact. On the other hand, a field called "shipment wait duration" has a potential use after the close of an order. We can use this element to compute a profile of shipment delay times.

A large number of fields or columns in an operational system will fall into the first category. The percentage ranges from forty to seventy percent of all fields.

After eliminating the operational and transitory elements, a healthy percentage of the rest should be clearly of analytic or historic value. Elements in the second category are included as candidates for our data warehouse model without question. This includes elements that record the essence of the core business event. Some will become the facts in our data warehouse. Examples include sales order quantity, voucher payment amount, and values like shipment wait duration. Other essential elements of the core business event define the dimensions of analysis such as order date, customer ID, product code, vendor number, etc.

What remains for the third category is often a small proportion of the available data elements (usually between ten and thirty percent).

One principle of warehouse design is to include all category-two elements without insisting on documenting specific current requirements. We suggest taking this principle one step further and including category three, the indeterminate, as candidates for warehouse inclusion as well. Being a candidate means you must generate solid reasons NOT to acquire these elements for the data warehouse. This turns the traditional requirements process on its head!

Why? Because a common data warehouse failing is to not have what the consumer wants when they come shopping. This is guaranteed to happen if you limit your scope to known requirements only. A warehouse must be built to anticipate future needs, not limited to what is a known need now. And you know what? It is both easier and less costly in the long run to take it all than to pick and choose!
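To make the triage bookkeeping concrete, here is a minimal Python sketch under stated assumptions: the element names and their category assignments are hypothetical examples, not drawn from any particular source system.

    # Triage of candidate data elements, as described above.
    OPERATIONAL = 1   # transitory: process flags, control totals, replicated values
    ANALYTIC = 2      # lasting analytic or historical value
    UNKNOWN = 3       # unknown use or value -- still a warehouse candidate

    # Hypothetical element names and assignments.
    triage = {
        "awaiting_shipment_flag": OPERATIONAL,   # meaningless after the process completes
        "shipment_wait_duration": ANALYTIC,      # usable for delay-time profiles
        "sales_order_quantity":   ANALYTIC,
        "customer_id":            ANALYTIC,
        "batch_control_total":    OPERATIONAL,
        "legacy_region_code":     UNKNOWN,       # keep unless a solid reason to exclude emerges
    }

    # Categories two and three both go on the candidate list; only category one is dropped.
    candidates = sorted(name for name, category in triage.items() if category != OPERATIONAL)
    print(candidates)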

Top Down Analysis
So how do you anticipate the future? We suggest facilitated brainstorming. Avoid
blank slate modeling sessions that start from scratch. Avoid a passive requirement process that focuses on revealing known needs. Instead, structure a process that introduces many possibilities and uses strategic business directions to catalyze a conversation about what is known, what may be, and what should be.

This facilitated brainstorming takes the form of several sessions with participation from different constituencies. This process works best when there is a mix of information systems professionals and business information consumers. You start by opening their eyes to the possibilities. You first overview what is known and what is available. Cataloging existing analytic and reporting usage is an effective precursor. If you have completed the bottom up process as discussed above, the candidate data element list is an input to the process. This is used to trigger ideas of new forms of analysis that offer potential benefits to the business.

You continue with a directed process that focuses on the key business and information challenges today that must be overcome. Using published corporate direction statements is often a powerful technique to drive home this message.

An expected deliverable of this process is a weighted or prioritized list of anticipated analytic uses of data. This anticipated usage helps our data warehouse design in two ways:

1. They are used to validate the selection of elements.

2. They provide insight on the data associations that have to be supported in the database.

This second factor directly supports data preparation, or the development of specific data packaging to satisfy anticipated needs.

The True Role of an Enterprise Model
There are two flaws with the common perception of the role of an enterprise model in data warehouse design. The first is that an enterprise model is essential. It is not. The second is that the enterprise model is the sole or primary input to the design process. It is only one of many contributing factors. An enterprise model plays a pivotal role in establishing the context and high level structure of a data warehouse data model. But it has very limited impact on the details of the design.

An enterprise model provides a conceptual framework for data integration. It defines the major high-level business entities (which should equate to the warehouse notion of
subject areas). It depicts how the high level entities interrelate across the full spectrum of the enterprise. Depending on the level of detail, it may suggest critically important business facts that should be included in the warehouse design.

However, an enterprise model is never a sufficient starting point for detailed design. The enterprise entities have insufficient detail to design even the core atomic tables of the warehouse. And the vast bulk of the tables in a warehouse, the ones that take the most effort to design, implement, redesign, and reimplement, are the aggregate and summary collections. This derived data would never be represented in an enterprise model, but it is essential to the design of a data warehouse.

The reality is that very few enterprises have a comprehensive and current enterprise model. What exists may not even be sufficient as an integration framework. Worse yet, it may not even serve to identify all the subject areas of the business. You can, and many do, implement an effective data warehouse design without a formal complete enterprise model.

However, in its absence you need some mechanism for defining and bounding your subject areas. You must produce a high-level subject area map to clearly communicate the scope of each subject area and the relationships between subject areas. It is impossible to integrate without having a picture of the whole to which each part relates.

In summary (see Figure 1):

  • Use reverse engineering of source systems to define a normalized view of existing data. The resulting logical models are candidates for table design in the warehouse.
  • Use the unique data element list from reverse engineering to create a list of candidate elements to be included.
  • Conduct facilitated brainstorming sessions to create an ordered list of anticipated data usage. This is used to both validate candidate elements for inclusion and to help define the data associations that will become aggregate and summary collections of data.
  • Employ an enterprise model as an integration template revealing the relationship between subject areas. Use it to group tables into subject areas and define macro needs.

Data Acquisition Synchronization Methods
Moral: Selecting and mapping the source is easy. Synchronizing the target is tough.
Acquisition services involve getting data out of the enterprise legacy systems and into the base tables of the data warehouse (a simple sketch of these stages appears after the list below):

  • Extract - creates a data snapshot from a source system based on identified selection criteria
  • Transport - sends a snapshot to the location where it will be loaded into the data warehouse
  • Clean - inspects the data for validity and then optionally corrects the content or marks errors
  • Transform - implements conversion logic to produce a new value from one or more source fields
  • Populate - loads warehouse base tables from one or more snapshots
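As a rough illustration only, the sketch below strings the five services together as simple Python pipeline stages. The function bodies and the toy rows are placeholders, since real implementations depend entirely on the source systems, the transport mechanism, and the target DBMS.

    # A minimal sketch of the five acquisition services as pipeline stages.
    def extract(source, criteria):
        """Create a snapshot of the source data matching the selection criteria."""
        return [row for row in source if criteria(row)]

    def transport(snapshot, destination):
        """Move the snapshot to the location where it will be loaded."""
        destination.extend(snapshot)
        return destination

    def clean(snapshot, is_valid):
        """Inspect rows for validity; here invalid rows are simply marked."""
        return [dict(row, _error=not is_valid(row)) for row in snapshot]

    def transform(snapshot, rules):
        """Derive new values from one or more source fields."""
        return [dict(row, **{name: fn(row) for name, fn in rules.items()}) for row in snapshot]

    def populate(target_table, snapshot):
        """Load the warehouse base table from the prepared snapshot."""
        target_table.extend(row for row in snapshot if not row["_error"])

    # Toy run: one valid order and one with an impossible quantity.
    source = [{"order_id": 1, "qty": 5, "unit_price": 2.0},
              {"order_id": 2, "qty": -1, "unit_price": 3.0}]
    staging, warehouse = [], []
    snapshot = extract(source, criteria=lambda r: True)
    snapshot = transport(snapshot, staging)
    snapshot = clean(snapshot, is_valid=lambda r: r["qty"] >= 0)
    snapshot = transform(snapshot, rules={"extended_amount": lambda r: r["qty"] * r["unit_price"]})
    populate(warehouse, snapshot)
    print(warehouse)   # only the valid, transformed row reaches the base table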

The advent of automated data acquisition tools has fueled the growing interest in data warehousing. Tools such as Carlton Passport, ETI·Extract Tool Suite by Evolutionary Technologies International, and Warehouse Manager by Prism have finally validated the benefits of CASE code generation technology, providing reusable definitions and highly efficient development and maintenance.

Uninformed use of these tools, however, may hide key design decisions implicit in the default techniques used by these products. Specifying mappings and transformations in real-world contexts is always trickier than it appears. Data validation and cleanup is also often a difficult task. The area where these tools are most limited is in the types of synchronization choices they provide.

By synchronization, I mean the combination of extract, transport, and population techniques chosen to balance results achieved, performance and resource utilization, while assuring source and target data consistency. This trade-off is affected by the volatility, volume, and value of the data, the means of transport, and the operational batch window for extract processing. The more options you have, and the more flexibility to mix and match techniques, the better off you will be when faced with the inevitable
unique circumstances.

Extract Methods
The main extract methods used are the following:

Full Copy unloads the entire file or table. A bulk export or copy command is often the most efficient means. Automated tools may use SQL.

Selective Extraction generally uses SQL either directly against the source data or via a database gateway or transparency layer to create a subset of the source data. For some data sources, a proprietary data manipulation language or export specification is more efficient.

Net Changes (selective) is a variant of selective extraction where the criteria are tuned to extract records that have changed or have been added (net changes) since the last execution.

Net Changes (log) determines net changes by reading the log file of the source DBMS or file system instead of accessing the source data directly.

Net Changes (bucket) accumulates net changes in a bucket or hold file under the control of the source or producer applications - no external extract process is used with this method.

Triggered Replication is a record or transaction level process that determines new or updated data automatically when the change event occurs.

Differencing is a technique that compares two copies from different time periods to identify new, updated or deleted records. A full copy unload is a precursor to differencing, which uses flat file comparison techniques.
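To illustrate the selective net-change variant, here is a hedged Python sketch. It assumes the source table carries a last-modified timestamp column; the table name, the columns, and the use of sqlite3 are hypothetical choices made only to keep the example self-contained.

    import sqlite3

    def extract_net_changes(conn, last_run_ts):
        """Pull only the rows added or changed since the previous extract run."""
        cur = conn.execute(
            "SELECT order_id, status, qty, last_modified "
            "FROM orders WHERE last_modified > ?",
            (last_run_ts,),
        )
        return cur.fetchall()

    # Self-contained demonstration against an in-memory source.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_id INTEGER, status TEXT, qty INTEGER, last_modified TEXT)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)",
                     [(1, "shipped", 5, "1996-01-01"), (2, "open", 3, "1996-02-15")])
    print(extract_net_changes(conn, last_run_ts="1996-01-31"))   # only order 2 is extracted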

Populate Methods
The main populate methods used are:

Refresh completely reloads the target warehouse table each time. This can be done by either truncating (emptying) the table or dropping and then creating the table. The loading process may either use SQL, a bulk load utility, or proprietary commands.

Logical Update appends rows to the table and additionally supports logical update via effective dates. Use of a single as-of-date is the norm, but a begin-date and end-date pair are occasionally used for reference data.

Physical Update issues an actual update command to modify values in an existing row. Physical updates should never be used on base tables in the data warehouse, because they destroy information by modifying history data.

Refresh operations are by definition set-oriented. Append and logical update operations can be either record-oriented or set-oriented. A set-oriented approach may use either a bulk load utility or proprietary commands. A record-oriented approach is generally not recommended except in conjunction with streaming transport.
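A hedged sketch of the logical update approach follows: new versions are appended with an as-of date, and no physical UPDATE is ever issued against the base table. The table layout and single as-of-date column are hypothetical, and sqlite3 stands in for the warehouse DBMS.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer_base (customer_id INTEGER, credit_limit REAL, as_of_date TEXT)")

    def logical_update(conn, rows, as_of_date):
        """Append the incoming snapshot rows, stamped with their effective date."""
        conn.executemany(
            "INSERT INTO customer_base VALUES (?, ?, ?)",
            [(r["customer_id"], r["credit_limit"], as_of_date) for r in rows],
        )

    logical_update(conn, [{"customer_id": 42, "credit_limit": 1000.0}], "1996-01-31")
    logical_update(conn, [{"customer_id": 42, "credit_limit": 2500.0}], "1996-02-29")

    # History is preserved: the current value is simply the row with the latest as-of date.
    print(conn.execute(
        "SELECT credit_limit FROM customer_base WHERE customer_id = 42 "
        "ORDER BY as_of_date DESC LIMIT 1").fetchone())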

Combinations of Synchronization Methods
Some of the main combinations of synchronization methods are discussed below.

Full Copy extract and Refresh populate - this simple mode is overused and should be avoided in most cases. Generally, only small decode tables or reference data should be refreshed, and then only if the source contains both active and inactive values. Business facts or event data should always be maintained incrementally. Prior attempts to deliver management information to the desktop, such as information centers, failed due to overuse of image copies of operational data. The lack of continuity and history is one of the flaws a warehouse is intended to fix.

Net change (any) extract and Logical Update populate - This is the most common mode for most warehouse tables. The three variants are noted below:

  • Net change (selective) is the most universal method, but offers increasingly poor performance for larger tables.
  • Net change (bucket) offers the best performance, but creates an ongoing dependency on the source system. This significantly inhibits the dynamic flexibility of the warehouse to respond to changing needs. Since the extract is not specified in an automated tool, reusability and development efficiencies are not realized.
  • Net Change (log), or what is called changed data capture by some vendors, provides performance advantages over the selective approach and retains the warehouse control lost via the bucket approach. The downside includes specification complexity and limitations on the number of cases where this method can be applied. For example, the file system or DBMS may not offer a log mechanism that captures the details required, or the log may not be enabled. Another issue is that the log may not be maintained in a manner that is compatible with data warehouse synchronization; for example, logs may be reset during an operational cycle or after backup and recovery.

Differencing extract and Logical Update populate - Net change methods often involve a file or table scan under control of a file or DBMS manager. Differencing uses extremely fast flat file record comparator techniques and is amazingly efficient when the volatility is low-to-intermediate on moderate-to-large tables. This technique has been used effectively in custom implementations.
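A minimal sketch of the differencing idea, using in-memory snapshots keyed by primary key in place of the sorted flat files a production comparator would use:

    def difference(prior, current):
        """Return the keys of added, updated, and deleted records between two snapshots."""
        added   = [k for k in current if k not in prior]
        deleted = [k for k in prior if k not in current]
        updated = [k for k in current if k in prior and current[k] != prior[k]]
        return added, updated, deleted

    # Yesterday's and today's full-copy snapshots of an orders file (hypothetical data).
    prior   = {1: ("open", 5),    2: ("open", 3)}
    current = {1: ("shipped", 5), 3: ("open", 7)}
    print(difference(prior, current))   # ([3], [1], [2]): one add, one update, one delete

The adds and updates feed a logical update of the target table; deletes are recorded as logical deletions rather than physically removed from warehouse history.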

Net Change extract and Streaming transport - This technique is useful, and often essential, in situations where very large volumes of data exist on both source and target systems, data volatility is high, and batch processing time on the source system is limited. A continuous end-to-end process is initiated, moving extracted data a row or transaction set at a time directly into the target table until all the net changes have been processed for the session.

Replication extract via a DBMS server process - The use of replication servers is generally not recommended for data warehouse data acquisition. The initial round of such products were created to maintain operational integrity and consistency between two remote copies of the data, one a master, the other a slave. They are near real-time transaction processes that perform hard updates and deletes on the target; a warehouse table is not a replicant of an operational table. New products are being introduced that support lazy (time-sequenced, non-real-time) replication of data and allow the preferred logical update mode. These products are more suitable for data warehousing.

Preparation: Packaging the Data for Use
Moral: It's not just what you have, it's how it's packaged that makes the difference.
Collect a broad spectrum of core detailed data drawn from multiple source systems to populate your target subject area(s) in the warehouse. Anticipate future needs and take the atomic detail that is needed today and of potential use tomorrow. Doing this right is necessary, but does not guarantee success. You will have acquired the right raw materials, but what will make or break your success is how you prepare the data for use.

A Data Warehouse is fundamentally very different from prior attempts at decision support and end user access. One of the most essential differences is the reliance on massively redundant storage schemes. The data is packaged in a variety of ways to separately optimize each form of access and mode of analysis.

Core Warehouse Data, the basic atomic facts, is structured for efficient and rational storage. These tables are the on-line component of the historical archive and the raw materials for producing specialized tables for consumer use. Core data is always stored in a highly normalized fashion.

Standard Collections are aggregates, related sets of information from multiple base tables, and light summaries that support a wide variety of common uses. If core data is the raw materials in a data warehouse, standard collections are the intermediate assemblies. Manufacturers build intermediate assemblies to more efficiently produce a wide array of very specific, often very different, consumer products from a much smaller number of standardized building blocks. Likewise, in a data warehouse, standard collections are produced to more efficiently construct a diversity of specialized data structures. You must produce aggregates and summaries to efficiently support high-level analysis. It is prohibitively expensive to produce all summaries directly from the basic core tables.

Standard collections may be normalized but summarized, or they may be denormalized by pre-joining related information or by including time series arrays. However, regardless of structure, standard collections are generally maintained in relational database tables. This is because they will be used by relational tools to produce consumer collections and by SQL access tools directly for analysis by those with detailed information needs.
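For illustration, the following sketch builds one such light summary from a core table. The table and column names are hypothetical, and sqlite3 stands in for the warehouse DBMS.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE core_sales (order_date TEXT, product_code TEXT, qty INTEGER, amount REAL)")
    conn.executemany("INSERT INTO core_sales VALUES (?, ?, ?, ?)", [
        ("1996-01-05", "A1", 2, 20.0),
        ("1996-01-19", "A1", 1, 10.0),
        ("1996-02-02", "B7", 4, 60.0),
    ])

    # A monthly product summary: the intermediate assembly that many consumer
    # collections can then be produced from, instead of rescanning the core table.
    conn.execute("""
        CREATE TABLE std_monthly_product_sales AS
        SELECT substr(order_date, 1, 7) AS month,
               product_code,
               SUM(qty)    AS total_qty,
               SUM(amount) AS total_amount
        FROM core_sales
        GROUP BY substr(order_date, 1, 7), product_code
    """)
    print(conn.execute("SELECT * FROM std_monthly_product_sales").fetchall())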

Consumer Collections are the uniquely packaged selective subsets of data optimized for a defined usage. They are constructed with the content, level of detail, and format required for a given purpose. The purpose is defined by a business question posed by a given consumer who intends to use a specific type of analysis with a particular tool in mind. Often the type of analysis and the tool dictate a specialized form or packaging of the data.

The format and structure of consumer collections varies quite widely, including the following:

  • General Denorms are highly summarized tables containing basic and derived data selected for a specific requirement. These can be employed by general purpose access tools for a wide variety of uses. Their disadvantage is that many of these have to be built to satisfy the diversity of uses.
  • Historical summaries are implemented as Time Series data stored as arrays. Arrays are often far more amenable to use by novice consumers and general-purpose access tools.
  • Star Joins exploit a single fact table and about three to five composite dimensional tables containing information regarding the rollup hierarchy. This structure was popularized by Metaphor and is now supported by the Red Brick database system and other specialized tools. It is more flexible than hypercubes and more efficient for access, analysis, and drill-down queries than a normalized design (a minimal schema sketch follows this list).
  • Snowflake structures are similar to a star join. They also contain a single central fact table, but the dimensions are specified by a set of tables representing the rollup hierarchy and characteristics in a more normalized fashion. This structure is more accessible by a general purpose query tool but does not offer the performance advantages of a star join.
  • Hypercube is a single dataset containing multiple levels of pre-calculated information. This dataset may be a relational table or a proprietary structure maintained by a particular DSS or EIS tool. Even when stored in a relational table, this structure is exclusively usable by the tool that creates, maintains, and provides access to it. You trade generalized management and accessibility for very high drilldown efficiency. You also lose flexibility since data must be reloaded rather than incrementally updated.
  • Exports is a general name for flat file datasets created to satisfy the import requirements of a target tool. The tool may be a standalone OLAP server, a client-based DSS system or DBMS, a spreadsheet tool, or virtually any other data-oriented tool.
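The sketch below lays out a minimal star join of the kind described above: one fact table joined to a handful of composite dimension tables carrying the rollup hierarchies. All names are hypothetical, and sqlite3 stands in for a specialized engine.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dim_date     (date_key INTEGER PRIMARY KEY, day TEXT, month TEXT, quarter TEXT, year INTEGER);
        CREATE TABLE dim_product  (product_key INTEGER PRIMARY KEY, product_code TEXT, brand TEXT, category TEXT);
        CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, customer_id TEXT, region TEXT, segment TEXT);

        CREATE TABLE fact_sales (
            date_key     INTEGER REFERENCES dim_date(date_key),
            product_key  INTEGER REFERENCES dim_product(product_key),
            customer_key INTEGER REFERENCES dim_customer(customer_key),
            order_qty    INTEGER,
            sales_amount REAL
        );
    """)

    # A typical drill-down query joins the fact table to its dimensions and
    # rolls up along the hierarchy columns carried in the dimension tables.
    query = """
        SELECT d.quarter, p.category, SUM(f.sales_amount)
        FROM fact_sales f
        JOIN dim_date d    ON f.date_key = d.date_key
        JOIN dim_product p ON f.product_key = p.product_key
        GROUP BY d.quarter, p.category
    """
    print(conn.execute(query).fetchall())   # empty until the star is populated

A snowflake variant would replace each composite dimension table with a set of more normalized tables describing the same rollup hierarchy.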

You need to match the wide diversity of needs with an equal diversity of data packaging. One size does not fit all. As business needs change, you need to retire obsolete data packages and build new ones. As new tools are introduced, they often come with unique requirements for data and meta data import, storage, and export.

You have three alternatives for handling the broad diversity of needs:

1. Develop an inventory management process to continually evaluate and refine the data products you offer.

2. Invest in new (but quickly evolving) software products that help you manage warehouse aggregates, for example, HP Warehouse Analyzer, Stanford Technologies Metacube.

3. Buy a VERY BIG machine and run all queries off the base tables.

Operations: Implementing a True Archive
Moral: Maintaining the atoms and building viable history requires a dynamic archive.
Earlier we stated that one factor which makes a data warehouse unique is the insistence on acquiring atomic level detail. A second factor is the dedication to maintaining enterprise history. Prior decision support and end-user efforts were, at best, only partially successful because they were based only on a current view of incomplete and primarily summarized data. The clear challenge is to handle the massive volumes that will result from maintaining both atomic and enterprise history data. The solution is to integrate an
archive as a planned component of your data warehouse infrastructure from day one.

An archive is auxiliary storage managed to maintain a consistent historical perspective by offering automated archival, on-demand retrieval, and transparent view creation. An archive supports access and analysis of information in the extended inventory without regard to what happens to be
on-line at the time.

Data Containers in an Archive
An archive is organized to manage several different types of data necessary to retain enterprise history:

Atomic History - The row-by-row atomic details that are often archived as the data is acquired from the source systems to build the base fact core event tables.

Context History - Retaining details is not sufficient to reconstruct an accurate picture of what the environment looked like when the data was recorded. Organizations change, rollup hierarchies are in constant flux, coding schemes come and go, and in general the reference data that defines the information context is highly volatile. Context history is the periodic recording of this volatile data to understand the full information environment (the context) that existed at the time.

With these two broad categories, we have accounted for all the raw data which is captured in our data warehouse. However, to facilitate practical day-to-day use of the archive, we need at least two other primary data containers:

Time Fragments - Time is the dominant dimension in a data warehouse. Many forms of analysis involve either trends or inter-period comparisons. Since an archive is simply an extension of the on-line data warehouse backwards in time, it pays for us to store primary standard collections of lightly summarized data in a time-aware manner. Time fragments are chunks of an extended domain divided into physically separate datasets based on some common periodicity. For instance, we might retain at least seven years of financial data with three years on-line. The data in the archive is stored in quarterly time units that can be retrieved separately. In this way, you can reassemble almost any timeframe for analysis selectively.
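A minimal sketch of fragment retrieval under these assumptions; the naming convention and the in-memory stand-in for archive storage are hypothetical.

    # Quarterly fragments of a lightly summarized financial domain (toy data).
    ARCHIVE = {
        "finance_summary_1995Q3": [("1995-07", 120.0), ("1995-08", 130.0), ("1995-09", 125.0)],
        "finance_summary_1995Q4": [("1995-10", 140.0), ("1995-11", 150.0), ("1995-12", 145.0)],
    }

    def retrieve(domain, quarters):
        """Reassemble the requested quarters of a domain into one working set."""
        working_set = []
        for quarter in quarters:
            working_set.extend(ARCHIVE[f"{domain}_{quarter}"])
        return working_set

    # Pull just the quarters needed for a specific comparison.
    print(retrieve("finance_summary", ["1995Q3", "1995Q4"]))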

Strategic Results - Some highly summarized datasets represent business results that are quite significant and are widely reported. Annual operating results, quarterly sales figures by region, and monthly product forecasts are common examples. Any attempts to maintain enterprise history must archive these end-point artifacts or final results, which were used to make strategic decisions.

Services Provided by an Archive
An archive should support at least the following types of services:

  • Atomic Archive/Retrieve - Storing rows of atomic detail in a fashion that allows selective retrieval by defined dimensions or criteria.
  • Context Management - Periodic storage of related reference data that may optionally be retrieved when either atomic data or time fragments are retrieved.
  • Fragment Management - Provides a mechanism to define the periodicity of a data domain, store data according to this periodic interval, and retrieve the data for one or more periods.
  • View/Retrieve Table Creation - Provides a home for data that is retrieved. One method is to create a union view which encompasses the on-line table and one or more identically structured tables of retrieved data (a minimal sketch of this method follows the list). A second method is to dynamically create and populate a single table into which the data is retrieved. An option of this latter method is to load all, or selected contents, of the corresponding on-line table into the newly created retrieved-data table.
  • Strategic Results Archive/Retrieve - Stores and reloads strategic results tables.
  • Navigation Services - an information catalog browsing service that helps the consumer identify what data is available, in what level of detail, at what location, and for what timeframe. To support an archive, you need to identify for what interval of time the data is available on-line vs. in the archive.
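The union-view method mentioned under View/Retrieve Table Creation can be sketched as follows; the table and view names are hypothetical, and sqlite3 stands in for the warehouse DBMS.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- The on-line portion of the domain plus an identically structured table
        -- that the archive retrieval process has just populated.
        CREATE TABLE sales_online    (order_id INTEGER, order_date TEXT, amount REAL);
        CREATE TABLE sales_retrieved (order_id INTEGER, order_date TEXT, amount REAL);

        -- Consumers query the view; they need not care which rows happen to be on-line.
        CREATE VIEW sales_extended AS
            SELECT order_id, order_date, amount FROM sales_online
            UNION ALL
            SELECT order_id, order_date, amount FROM sales_retrieved;
    """)

    conn.execute("INSERT INTO sales_online VALUES (2, '1996-05-01', 310.0)")
    conn.execute("INSERT INTO sales_retrieved VALUES (1, '1993-02-01', 250.0)")
    print(conn.execute("SELECT * FROM sales_extended ORDER BY order_date").fetchall())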

Delivery: Defining a Delivery Infrastructure
Moral: Interactive access is not the whole game. In fact, alone it is not enough.
The access and analysis tools market is growing by leaps and bounds. For example, Dr. E.F. Codd has coined the term On-Line Analytic Processing (OLAP) and new vendors are attempting to differentiate themselves by introducing pet phrases such as surging, grazing or diving to supplement trending, data mining, and drill-down, which have been around a little longer. What they all have in common is an assumption that consumers want to interact with their data.

Analysis implies interactivity. However, interactive analysis does not require interactive access. Once consumers have grazed, and determined the criteria that provided useful data, they don't want to graze anymore. They prefer to have the data delivered when they need it.

For many information consumers, analysis is not even the goal. Access and presentation are the sole objectives. One of our most pervasive myths is that our primary customer is an information analyst. The vast majority of the direct consumers of a data warehouse will be information clerks and go-betweens who prepare data for others.

Information clerks don't surf or graze or dive. They select, format, and present. They often follow known paths, repeating identical or similar processes periodically. What will help them most are mechanisms that free up their machine as quickly as possible and that automate any repetitive activity.

Both an information analyst and an information clerk will benefit from a
delivery service as an alternative to interactive access. With a delivery service, consumers specify what they want, what form they want it in, where they want it delivered, and when they want delivery to take place.

Alternative Delivery Channels
There are two radically different approaches to delivery: Order fulfillment and agent collection. Order fulfillment, sometimes called subscription delivery, allows a consumer to place an order describing the content and format of the desired results and have it scheduled for delivery to a specified location at a given time. The order may be one time or a repetitive, standing order. Agent collection allows consumers to specify data requirements that are acted upon when a defined event occurs or trigger conditions are met. Orders are time-based captive processes. Agents are event-based autonomous processes.

A range of alternative delivery mechanisms exist:

  • Batch SQL: SQL is exported from any access tool and scheduled for independent execution on the DBMS server. Results are returned by file transfer to a local server or retrieved by the consumer from the remote output folder or directory. The result files are generally raw tab- or comma-delimited text files, but may be post-processed into a specific PC platform standard.
  • Client Timer: The client access tool may offer a delayed execution capability. At a specified time, the software activates, possibly in background mode, and runs a prepared script or query template. The results are made available on the local client platform in query document form, not just as raw data.
  • Object-Aware Server: The client tool may submit a query object to be executed at a specified time on a remote batch server. Generally, an administrator facility is available to control execution sequence, priority, error handling and more. The results are returned to the client platform in a query document form. Unlike the client timer option, this mechanism frees up the client machine entirely while offering more uniform control of consumer order processing.
  • Order-Based Agent: A query template specifying desired content and format is attached to an agent process. The agent is assigned to execute based on a trigger event or condition. An event example is when new sales data is received from a remote site. A condition example is when costs exceed a set amount for a specific department or account. The results are returned as either raw data or a query document (a minimal sketch of this mechanism follows the list).
  • Search-Based Agent: Information needs are defined in a general, categorical manner. The agent is instructed to inform you when data is found matching your general criteria. It may optionally collect the data itself, or an abstract or sample if one exists. An example is to specify a search for any mention of a competitor's product within several sources. You may further specify that any information on pricing should be retrieved automatically.
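As a rough illustration of the order-based agent mechanism referred to above, here is a minimal Python sketch; the trigger condition, the prepared query, and the delivery target are all hypothetical.

    def order_based_agent(trigger, run_query, deliver):
        """Run the prepared query and deliver the result only when the trigger fires."""
        if trigger():
            deliver(run_query())

    # Example wiring: fire when costs exceed a set amount for a specific department.
    department_costs = {"D100": 125000.0}
    order_based_agent(
        trigger=lambda: department_costs["D100"] > 100000.0,
        run_query=lambda: [("D100", department_costs["D100"])],
        deliver=lambda rows: print("delivering:", rows),
    )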

Providing both interactive and delivery-oriented access is essential to offer a complete solution. Agents will play a major role in the future of information access. Some believe agent-based processing will also play a growing role in data analysis.

Originally published in InfoDB

References
1. Michael Haisten, "Planning for a Data Warehouse," InfoDB, Volume 9, Number 1.
