DW/BI professionals are often tasked with making evolutionary upgrades and improvements to minimize cost and upheaval in the current analytic environment. We explore four upgrades that can breathe new life into legacy data warehouses. Few readers have the luxury of working with a blank slate when it comes to the development of their data warehouse/business […]

Childhood guessing games sometimes rely on the distinction of “person, place or thing” for early mystery-solving clues. Some modelers use these same characterizations in their data models by creating abstract person, place and/or thing (typically referred to as product) tables. While generalized tables appeal to the purist in all of us and may provide flexibility and reusability advantages for […]

Meaningless integer keys, otherwise known as surrogate keys, are commonly used as primary keys for dimension tables in data warehouse designs. Our students frequently ask us: what about fact tables? Should a unique surrogate key be assigned for every row in a fact table? Although for the logical design of a fact table, the answer is no, […]

Consistent data is the Holy Grail for most data warehouse initiatives, and data stewards are the crusaders who fearlessly strive toward that goal. An active data stewardship program identifies, defines and protects data across the organization. Stewardship ensures the initial effort to populate the data warehouse is done correctly, while significantly reducing the amount of […]

Your ETL system may need to process late arriving dimension data for a variety of reasons. This design tip discusses the scenario where the entire dimension row routinely arrives late, perhaps well after impacted fact rows have been loaded. For example, a new employee may be eligible for healthcare insurance coverage beginning with their first day on the […]

An overarching false statement about dimensional models is that they’re only appropriate for summarized information. Some people maintain that data marts with dimensional models are intended for managerial, strategic analysis and therefore should be populated with summarized data, not operational details. We strongly disagree! Dimensional models should be populated with the most detailed, atomic data captured by the source […]