DATA WAREHOUSING
Data Warehousing
A Data warehouse is a repository where information is gathered from multiple data sources and stored electronically. Data engineers and Business analysts for business intelligence purposes in an organization can access data warehouses that store data in larger amounts. The data is accessible and easy to evaluate for business insights. The data gathered from multiple sources is in multiple formats, such as Structured, Semi-structured, or Unstructured. The collected data is processed, modified, and converted into information so that the users can access the data through multiple tools such as Spreadsheets, SQL Clients, etc. A data warehouse is a centralized location where all the enterprise data gets stored for further analysis. Business Intelligence tools help organizations to explore the data to make better business decisions.
The data is stored in the data warehouse through ETL (Extract, Transform and Load). The data is extracted from the source and transformed into high-quality data before being loaded into the warehouse (El-Sappagh, Hendawi, and Bastawissy, 2011). To keep the data up to date, businesses use this procedure regularly. Business intelligence (BI) tools become more important when an organization wants to use its data for analytics or reporting. Deriving valuable insights from their business data, the two BI tools -Data exploration and visual analytics help organizations.
Types of a Data Warehouse:
• Enterprise Datawarehouse
• Data Mart
• Operational Datastore
Enterprise Datawarehouse: It is centralized for organizing and representing data. It provides unified approaches, and various departments across the enterprise seek support for decision-making.
Operational Data store: Operational Data Store, also referred to as ODS. It is used when the data warehouse does not meet an organization’s needs (reporting). A data warehouse can be refreshed in real-time, making it ideal for keeping staff information.
Data Mart: Data mark is specifically designed for business lines such as accounts, sales, purchases, finance, and inventory (datachannel, 2022). As shown in figure:1
fig:1 three tire data warehouse architecture
Stages in Data warehouse:
• Offline Operational Database
• Offline Data Warehouse
• Real-time Data Warehouse
• Integrated Data Warehouse
Offline Operational Database: In this stage, the data loading, Processing, and reporting do not influence the operating system performance. The data is copied from the operating system to a server.
Offline Data Warehouse: In this stage, the Operational Database regularly updates the data stored in the warehouse for business insights.
Real-time Data Warehouse: In this stage, the data warehouses are updated whenever the transaction happens in the operational database. Both the operational database and data warehouse will be synchronized.
Integrated Data Warehouse: In this stage, the data warehouses are updated whenever the transaction happens in the operational database. Then, the transactions are forwarded to the operational database, which is generated by the data warehouse (datachannel, 2022).
Data warehouse Architecture:
• Single-tier architecture
• Two-tier architecture
• Three-tier architecture
Single-tier architecture: In this tier, the aim is to remove data redundancy. So, the data is minimized and stored. This is not a frequent helpful practice.
Two-tier architecture: In this tier, the levels of the data warehouse divide the data warehouse from physically accessible sources. This architecture cannot be scalable. Because of network limitations, it has connectivity problems (Vaisman, and Zimányi, 2014).
Three-tier architecture: The most used architecture of data warehouse has Top, Middle, and Bottom tiers (Chaudhary, Murala, and Srivastav, 2011).
1)Top Tier: The front-end client layer; is where you get connected and get the data from the data warehouse through tools and APIs.
2)Middle Tier: The OLAP server is implemented using the ROLAP or MOLAP model. This application tier displays an abstract view of the database to the user.
3)Bottom Tier: In this tier, the database serves as the bottom tier, typically a relational data (guru99, 2022).
Features of Data warehouse:
• Integrated
• Time-Variant
• Non-Volatile (Static)
• Subject oriented
Banks employ data warehouses for governance and to maintain regulatory compliance. Different areas of business in banking generate numerous operating systems, resulting in dispersed, inconsistent data. Mergers and acquisitions exacerbate the problem. Banks may employ data warehouses to have access to reliable data for reporting and analytics.(Al-Okaily, et al. 2022).
Secure access to standardized data that has been collected from various systems, including clinical, staff, patient, and financial activities, is required by healthcare organizations. After reviewing this reliable data, they are better equipped to coordinate care, deliver coordinated treatment, and guarantee positive health outcomes.
REFERENCES
Al-Okaily, A., Al-Okaily, M., Teoh, A.P. and Al-Debei, M.M., (2022). An empirical study on data warehouse systems effectiveness: the case of Jordanian banks in the business intelligence era. EuroMed Journal of Business, (ahead-of-print).
Chaudhary, S., Murala, D.P. and Srivastav, V.K., (2011). A critical review of data warehouse. Global Journal of Business Management and Information Technology, 1(2), pp.95-103.
datachannel (2022) introduction to data warehousing. Available at: . https://datachannel.co/blogs/introduction-to-data-warehousing/ (Accessed: 20/November/2022).
El-Sappagh, S.H.A., Hendawi, A.M.A. and El Bastawissy, A.H., (2011). A proposed model for data warehouse ETL processes. Journal of King Saud University-Computer and Information Sciences, 23(2), pp.91-104.
guru99 (2022) data warehouse architecture. Available at: https://www.guru99.com/data-warehouse-architecture.html (Accessed: 20/November/2022).
Vaisman, A. and Zimányi, E., (2014). Data warehouse systems. Data-Centric Systems and Applications.
Comments
Post a Comment