- Denormalization
is often the preferred data structure because it reduces joins and improves
query performance. It is also better suited to managing the various types
of time sensitivity that the business requires.
- Detail vs. Aggregation
is an evolving issue. Early warehouse designers advocated summarization
tables. The current thinking is that it may be better to keep the
detail in the warehouse and do the aggregation in the data marts.
- Special Purpose Containers
exist to provide an enhanced environment for staging data. They add
metadata management, 100% indexing, data modeling, system monitoring,
query activity tracking, and load balancing.
- Production Loading
is a very important issue. The container must be defined so that
it can be loaded within a time window that works for the business. Experience
is teaching us that the container definition may be constrained more
by load and query issues than by other design considerations.
- Process and Content Metadata
It seems that as soon as a container definition stabilizes, someone
wants to change it. This is a good thing as it means that the warehouse
is being used and growing. The lesson here is that we need to plan to
manage the ongoing changes to the structures, content, and load events.
- Application Profiles
should be documented so that it is easier to manage multiple
types of downstream data uses.
Samples:
- Exploratory Warehouse
- Operational Data Store
- Appending transaction details to holding files
- Updating attribute tables as per the correct frozen point in time
- Ad-hoc warehouses with evolving or dynamic structures
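
The Detail vs. Aggregation item above can be illustrated with a minimal sketch. It assumes a hypothetical detail table, `warehouse_sales_detail`, that stays in the warehouse, and a summary table, `mart_daily_sales`, that is derived from it for a data mart. The table names, column names, and the use of an in-memory SQLite database are illustrative assumptions, not part of any particular warehouse product.

```python
import sqlite3


def build_mart_summary(conn):
    """Rebuild a per-day, per-product summary from the warehouse detail rows.

    The detail table is left untouched; only the (hypothetical) data mart
    table is dropped and recreated.
    """
    conn.execute("DROP TABLE IF EXISTS mart_daily_sales")
    conn.execute(
        """
        CREATE TABLE mart_daily_sales AS
        SELECT sale_date,
               product,
               SUM(amount_cents) AS total_cents,
               COUNT(*)          AS txn_count
        FROM warehouse_sales_detail
        GROUP BY sale_date, product
        """
    )
    return conn.execute(
        "SELECT sale_date, product, total_cents, txn_count "
        "FROM mart_daily_sales ORDER BY sale_date, product"
    ).fetchall()


# Hypothetical detail data: one row per transaction, amounts in integer cents.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE warehouse_sales_detail "
    "(sale_date TEXT, product TEXT, amount_cents INTEGER)"
)
conn.executemany(
    "INSERT INTO warehouse_sales_detail VALUES (?, ?, ?)",
    [
        ("2024-01-01", "widget", 1000),
        ("2024-01-01", "widget", 500),
        ("2024-01-01", "gadget", 750),
        ("2024-01-02", "widget", 250),
    ],
)

summary = build_mart_summary(conn)
detail_rows = conn.execute(
    "SELECT COUNT(*) FROM warehouse_sales_detail"
).fetchone()[0]
```

Because the summary is derived rather than stored in place of the detail, a different data mart could aggregate the same rows by another grain (for example, by product only) without reloading the warehouse.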