Data warehousing populates a central data warehouse (DW) with data from source systems through extraction, transformation, and loading (ETL). Massive volumes of transaction data are routinely recorded in applications such as retail commerce, banking systems, and website management. A transaction record captures a time together with the reference data relevant to that transaction. It is non-trivial for a standard ETL process to handle transaction data because of its dependencies and velocity. This paper presents a two-tiered segmentation approach for transaction data warehousing. The approach uses a so-called two-staging ETL to process the detailed records from operational source systems, followed by dimensional data processing to populate a dimension data store with a star or snowflake schema. The proposed approach is an all-in-one solution capable of processing fast- and slowly changing data as well as early- and late-arriving data. The paper evaluates the proposed method empirically, and the results show its effectiveness for transaction data warehousing.
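As a rough illustration of staging records before dimensional loading, the following Python sketch parks transactions whose reference data has not yet arrived and retries them on a later run. It is not the chapter's algorithm: the class `TwoStageEtl`, its fields, and the record layout (`time`, `customer`, `amount`) are all hypothetical and chosen only to make the idea concrete.

```python
from dataclasses import dataclass, field


@dataclass
class TwoStageEtl:
    """Toy two-stage ETL (hypothetical): stage 1 lands raw records,
    stage 2 resolves references and loads fact rows."""
    customers: dict = field(default_factory=dict)  # natural key -> surrogate key
    stage_one: list = field(default_factory=list)  # raw transaction records
    pending: list = field(default_factory=list)    # records awaiting reference data
    fact_rows: list = field(default_factory=list)  # rows loaded into the fact table

    def extract(self, records):
        # Stage 1: land the detailed records unchanged.
        self.stage_one.extend(records)

    def add_customer(self, natural_key):
        # Register reference data; a new key gets the next surrogate key.
        self.customers.setdefault(natural_key, len(self.customers) + 1)

    def transform_and_load(self):
        # Stage 2: retry parked records together with newly staged ones.
        work = self.pending + self.stage_one
        self.stage_one, self.pending = [], []
        for rec in sorted(work, key=lambda r: r["time"]):
            key = self.customers.get(rec["customer"])
            if key is None:
                # Reference data has not arrived yet: park the record.
                self.pending.append(rec)
            else:
                self.fact_rows.append(
                    {"time": rec["time"], "customer_key": key, "amount": rec["amount"]}
                )


etl = TwoStageEtl()
etl.add_customer("alice")
etl.extract([
    {"time": 1, "customer": "alice", "amount": 10.0},
    {"time": 2, "customer": "bob", "amount": 5.0},  # reference data not yet known
])
etl.transform_and_load()   # the "bob" record is parked in etl.pending
etl.add_customer("bob")    # late-arriving reference data
etl.transform_and_load()   # the parked record is now loaded
```

Keeping unresolved records in a separate pending list, rather than rejecting them, is one simple way to tolerate reference data that arrives after the transactions that depend on it.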
|Title of host publication|Emerging Perspectives in Big Data Warehousing|
|Editors|David Taniar, Wenny Rahayu|
|Number of pages|27|
|Publication date|Jun 2019|
|Publication status|Published - Jun 2019|
|Series|Advances in Data Mining and Database Management (ADMDM) Book Series|
- technology, engineering and IT