Relationships
must be managed to ensure that you have adequate legal ownership,
security permissions, and physical access to the data sources.
Process Repeatable steps that can be performed to
gain access to and control of specific data elements.
Synthesis - Decisions on what data sources to go after, what
is usable, what can be acquired at reasonable cost, and which data elements
can be acquired easily that may be useful later.
Tools Data comes in many formats, and there are a variety
of ways to extract it from its native format and move it to an open container
(e.g., loaders, SQL, query generators, ETL tools, replication).
Containers are typically open databases used to hold
interim stages of the data. These provide optimized functions for managing
and manipulating the results of the initial extracts.
Metadata This is the initial repository of the solution.
It must contain the physical metadata, and you should consider starting
to include the logical metadata and business rules as well. Knowledge
of the data characteristics begins growing during this step. This knowledge
must be passed along with the acquired data to the next step.
Samples:
Relational OLTP databases
Structured Flat Files exported from other systems
Unstructured data exported from other systems
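
The Tools, Containers, and Metadata items above can be illustrated with a minimal sketch, assuming a structured flat file as the source and an in-memory SQLite database standing in for the open container. The file contents, the `acquire` function, and the `stg_customers` and `acquisition_metadata` table names are all hypothetical and chosen only for illustration; a real acquisition step would likely rely on a dedicated loader or ETL tool.

```python
import csv
import io
import sqlite3

# Hypothetical flat-file export from a source system (stand-in for a real file).
FLAT_FILE = io.StringIO(
    "customer_id,name,balance\n"
    "1,Acme Corp,1200.50\n"
    "2,Globex,310.00\n"
)


def acquire(source, conn, table):
    """Load a delimited flat file into a staging table and record physical metadata."""
    reader = csv.reader(source)
    columns = next(reader)   # header row gives the physical column names
    rows = list(reader)

    # Container: an interim staging table. Every column is kept as TEXT here;
    # assigning real types is left to a later step.
    col_defs = ", ".join(f'"{c}" TEXT' for c in columns)
    conn.execute(f'CREATE TABLE "{table}" ({col_defs})')
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)

    # Metadata: a very small initial repository describing what was acquired,
    # so that this knowledge travels with the data to the next step.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS acquisition_metadata "
        "(table_name TEXT, column_name TEXT, position INTEGER, row_count INTEGER)"
    )
    conn.executemany(
        "INSERT INTO acquisition_metadata VALUES (?, ?, ?, ?)",
        [(table, c, i, len(rows)) for i, c in enumerate(columns)],
    )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
loaded = acquire(FLAT_FILE, conn, "stg_customers")
```

Keeping the staging columns untyped is a deliberate simplification: it lets the extract succeed even when the source data is dirty, and it defers typing and business rules to the point where the logical metadata is better understood.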