Data Warehouse

Being a market leader today requires a competitive advantage over rival organizations. By investing in data warehouses, organizations can better predict market trends and offer services best suited to their customers' needs. A Data Warehouse (DW) can be defined as a non-volatile, subject-oriented database containing records accumulated over years [1,2]. DWs support strategic decision making and help answer questions like “Who was our best customer for this item last year?” [3]. DW systems differ in their components, but most share a few major ones. The first component is the data sources. A DW receives inputs from different data sources, such as Point-of-Sale (POS) systems, Automated Teller Machines (ATMs) in banks, and cash terminals. The second component is the data staging area. Data arriving from the data sources is placed in the staging area, where it is transformed and cleaned of anomalies. After this transformation, the data is placed into the third component, the storage area, which is usually a relational database management system (RDBMS). This process of extracting data from data sources, transforming it, and finally loading it into storage is known as Extract, Transform, and Load (ETL). Data saved in the storage area can be viewed by reporting units. Several online analytical processing (OLAP) tools help generate reports based on the data saved in the storage area [4,5,6,7,8]. We believe testing should be built into DW development; therefore, each of the DW components should be tested.
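The flow from data source through staging to storage and reporting can be illustrated with a minimal sketch. It is not a real DW implementation: the sample records, the table name `sales`, and the functions `extract`, `transform`, `load`, and `best_customer` are all hypothetical, and an in-memory SQLite database stands in for the RDBMS storage area.

```python
import sqlite3

# Hypothetical raw records, as they might arrive from a POS data source.
RAW_SALES = [
    {"customer": " Alice ", "item": "widget", "amount": "19.99"},
    {"customer": "Bob", "item": "widget", "amount": "5.00"},
    {"customer": "alice", "item": "widget", "amount": "10.01"},
    {"customer": "Carol", "item": "widget", "amount": "n/a"},  # anomaly
]


def extract():
    """Extract: read the raw records from the (simulated) data source."""
    return list(RAW_SALES)


def transform(rows):
    """Transform: normalize customer names and drop rows with bad amounts."""
    cleaned = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # discard records whose amount cannot be parsed
        customer = row["customer"].strip().title()
        cleaned.append((customer, row["item"], amount))
    return cleaned


def load(conn, rows):
    """Load: write the cleaned rows into the storage area."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(customer TEXT, item TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


def best_customer(conn, item):
    """Reporting: return (customer, total) with the highest spend on an item."""
    return conn.execute(
        "SELECT customer, SUM(amount) AS total FROM sales "
        "WHERE item = ? GROUP BY customer ORDER BY total DESC LIMIT 1",
        (item,),
    ).fetchone()


# In-memory SQLite is an assumption standing in for the RDBMS storage area.
conn = sqlite3.connect(":memory:")
load(conn, transform(extract()))
```

Keeping each stage as a separate function also suits the testing argument: extraction, transformation, loading, and reporting can each be checked in isolation, for example by asserting that `transform` drops the malformed record.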
One of the main challenges in testing DW systems is that they differ between organizations. Each organization has its own DW system that conforms to its requirements and needs, so DW systems differ in many aspects (such as database technology, tools used, size, number of users, number of data sources, and how components are connected) [9]. Another major challenge facing DW testers concerns preparing test data. Using real data for testing purposes violates privacy laws in some countries (for example, using real data from bank accounts and other such information is illegal in many countries). Successful testing of a DW requires a huge amount of test data. In a real-time environment, the system may behave differently in the presence of terabytes of data [10].
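One common way to approach the test data problem is to generate synthetic records instead of copying real ones. The sketch below is only an illustration under stated assumptions: the function `make_synthetic_accounts`, the field names, and the branch list are hypothetical and do not come from any particular DW system.

```python
import random


def make_synthetic_accounts(n, seed=0):
    """Yield n fake bank-account records; no real customer data is involved."""
    rng = random.Random(seed)  # fixed seed keeps test runs reproducible
    branches = ["north", "south", "east", "west"]  # hypothetical branch codes
    for i in range(n):
        yield {
            "account_id": f"ACC{i:08d}",                 # unique synthetic ID
            "balance_cents": rng.randint(0, 10_000_000),  # arbitrary range
            "branch": rng.choice(branches),
        }


# A small sample; a larger n can be streamed without holding it all in memory.
sample = list(make_synthetic_accounts(1000))
```

Because the function is a generator, the same code can in principle produce very large volumes for load testing, although realistic terabyte-scale data sets would also need realistic value distributions, which this sketch does not attempt.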