ETL testing plays a vital role in data warehouse systems. It involves testing a data warehouse system from end to end. Continue reading to learn more about the phases of ETL testing.
Requirements testing: The goal of requirements testing is to verify that all business requirements have been defined and are in line with the expectations of business users. During this phase, the testing team should analyze the business requirements for testability and completeness. The following points should be taken into consideration:
Verification of the logical data model using design documents.
Verification of many-to-many attribute relationships
Verification of the types of keys used
The rules of transformation must be clearly stated.
Data models or design documents must specify the target data type.
The purpose and summary of the report must be specified clearly.
Reports should be available.
You should specify all report details, such as groupings, parameters, and filters.
Technical definitions for each report, such as data definitions or details about tables and fields, must be specified.
The header, footer, and column headings must all be specified clearly.
The data sources, parameter names, and values must be specified clearly.
Documentation is required to verify the technical mapping of each report in terms of the table name, column name, report name, and description.
Data Model Testing: This testing is designed to verify that the physical data model matches the logical data model. During this testing, the following activities should take place:
Verification of the logical data model according to design documents
Verification of the entity relationships in the design document
The attributes and keys of each entity must be clearly defined.
Make sure that the model meets all of your requirements.
Verify that the physical model and design are in sync.
Verify that objects are named according to the naming conventions.
Perform schema verification
Ensure that the table structures, keys, and relationships in the physical model are implemented as per the logical design.
Validation of indexes and partitioning
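The schema verification step above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: it assumes the physical model lives in SQLite (used here only because it ships with Python), and the table dim_customer, its columns, and the LOGICAL_MODEL dictionary are hypothetical stand-ins for whatever the design documents specify.

```python
import sqlite3

# Hypothetical logical model: table -> {column: (declared type, is primary key)}
LOGICAL_MODEL = {
    "dim_customer": {
        "customer_key": ("INTEGER", True),
        "customer_name": ("TEXT", False),
        "country_code": ("TEXT", False),
    }
}

def schema_mismatches(conn, logical_model):
    """Return readable differences between the logical and the physical model."""
    problems = []
    for table, expected_cols in logical_model.items():
        # Table names come from our own trusted dictionary, so an f-string is acceptable here.
        rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
        if not rows:
            problems.append(f"{table}: table is missing")
            continue
        # Each row is (cid, name, type, notnull, dflt_value, pk)
        actual = {r[1]: (r[2].upper(), r[5] > 0) for r in rows}
        for col, spec in expected_cols.items():
            if col not in actual:
                problems.append(f"{table}.{col}: column is missing")
            elif actual[col] != spec:
                problems.append(f"{table}.{col}: expected {spec}, found {actual[col]}")
        for col in sorted(set(actual) - set(expected_cols)):
            problems.append(f"{table}.{col}: column not in logical model")
    return problems

# Physical model with a deliberate defect: country_code has the wrong type.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dim_customer ("
    "customer_key INTEGER PRIMARY KEY, customer_name TEXT, country_code INTEGER)"
)
problems = schema_mismatches(conn, LOGICAL_MODEL)
for line in problems:
    print(line)
```

In a real warehouse the same comparison would typically read the database's own catalog views instead of PRAGMA table_info, and would also cover indexes and partitions.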
Unit Testing: Unit testing aims to verify that the implemented component meets design specifications and business requirements. This includes testing business transformation rules, error conditions, mapping fields, and staging levels. Unit testing should consider the following points:
The logic of transformation should be the same from source to target.
The generation of surrogate keys has been done correctly.
NULL values are populated where they are expected.
Records are rejected where expected, and a reject log is created with sufficient detail.
The auditing process is carried out properly.
Compare the source and target data counts to ensure that all source data expected to be loaded has reached the target.
The fields are filled with their full content. No data fields are truncated during the transformation.
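Several of the unit testing points above (count reconciliation, surrogate keys, truncation) can be expressed as simple SQL checks. The sketch below assumes hypothetical tables src_customer, tgt_customer, and rej_customer in an in-memory SQLite database, and it assumes the transformation is not supposed to shorten the name field. Real checks would follow the actual mapping document.

```python
import sqlite3

def unit_checks(conn):
    """Run a few unit-level checks and return {check name: passed}."""
    def one(sql):
        return conn.execute(sql).fetchone()[0]

    results = {}
    # Every source row should end up either in the target or in the reject table.
    src = one("SELECT COUNT(*) FROM src_customer")
    tgt = one("SELECT COUNT(*) FROM tgt_customer")
    rej = one("SELECT COUNT(*) FROM rej_customer")
    results["row_counts_reconcile"] = src == tgt + rej
    # COUNT(DISTINCT ...) ignores NULLs, so this catches duplicate and NULL keys.
    results["surrogate_keys_unique"] = one(
        "SELECT COUNT(*) - COUNT(DISTINCT customer_key) FROM tgt_customer"
    ) == 0
    # A target value shorter than its source value suggests truncation.
    results["no_truncation"] = one(
        "SELECT COUNT(*) FROM src_customer s "
        "JOIN tgt_customer t ON t.customer_id = s.customer_id "
        "WHERE LENGTH(t.name) < LENGTH(s.name)"
    ) == 0
    return results

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE src_customer (customer_id INTEGER, name TEXT);
CREATE TABLE tgt_customer (customer_key INTEGER, customer_id INTEGER, name TEXT);
CREATE TABLE rej_customer (customer_id INTEGER, reason TEXT);
INSERT INTO src_customer VALUES (1, 'Alice Johnson'), (2, 'Bob Lee'), (3, NULL);
INSERT INTO tgt_customer VALUES (100, 1, 'Alice Johnson'), (101, 2, 'Bob Lee');
INSERT INTO rej_customer VALUES (3, 'name is NULL');
""")
print(unit_checks(conn))
```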
Integration Testing: Integration testing follows unit testing. It aims to verify that all components work together as expected. Data warehouse applications must be compatible with upstream and downstream flows, and all ETL components must be executed according to the correct schedule. The following points should be taken into consideration during integration testing:
ETL packages with Initial Load
ETL packages with Incremental Load
Executing ETL packages sequentially
Handling rejected records
Exception handling verification
Logging of Errors
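To make the initial load, incremental load, and reject handling points concrete, here is a small sketch in plain Python. It is an assumption-laden toy: the load function, the row fields id, amount, and updated_at, and the watermark approach are all hypothetical, and a real ETL package may detect changes differently (for example with change data capture).

```python
def load(source_rows, target, rejects, watermark=None):
    """Load rows newer than the watermark into target; return the new watermark.

    With watermark=None this behaves as an initial (full) load.
    """
    new_watermark = watermark
    for row in source_rows:
        if watermark is not None and row["updated_at"] <= watermark:
            continue  # already processed by an earlier run
        if row["amount"] is None:
            # Rejected rows are kept with a reason so they can be reviewed later.
            rejects.append({"row": row, "reason": "amount is missing"})
        else:
            target[row["id"]] = row  # insert or update by business key
        if new_watermark is None or row["updated_at"] > new_watermark:
            new_watermark = row["updated_at"]
    return new_watermark

# ISO-formatted dates compare correctly as strings.
source = [
    {"id": 1, "amount": 10.0, "updated_at": "2024-01-01"},
    {"id": 2, "amount": None, "updated_at": "2024-01-02"},
    {"id": 3, "amount": 7.5, "updated_at": "2024-01-03"},
]
target, rejects = {}, []

# Initial load: everything is processed, one row is rejected.
watermark = load(source, target, rejects)

# The source then changes: row 1 is updated and row 4 is new.
source[0] = {"id": 1, "amount": 12.0, "updated_at": "2024-01-05"}
source.append({"id": 4, "amount": 3.0, "updated_at": "2024-01-04"})

# Incremental load: only the changed and new rows are picked up.
watermark = load(source, target, rejects, watermark)
print(sorted(target), len(rejects), watermark)
```

An integration test would run both loads in sequence, as above, and then assert on the target contents, the reject log, and the stored watermark.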
Data Validation Testing: This testing is designed to verify that the data flowing through the ETL process is correct and has been cleansed according to the business rules. Data validation testing should consider the following points:
Comparison of data between source and destination
Data flow according to business logic
Mismatch of data types
Validation of source and target row counts
Data duplication
Data correctness
Data Completeness
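The checks in this list can be combined into one small validation routine. The following is a sketch under simplifying assumptions: source and target are already extracted as lists of dictionaries, rows are matched on a single key column, and the function name validate and the sample fields are hypothetical.

```python
from collections import Counter

def validate(source, target, key, expected_types):
    """Compare target rows against source rows and return a findings report."""
    src_by_key = {r[key]: r for r in source}
    tgt_counts = Counter(r[key] for r in target)
    return {
        "row_count_match": len(source) == len(target),
        # Completeness: source keys that never reached the target
        "missing_in_target": sorted(set(src_by_key) - set(tgt_counts)),
        # Duplication: keys that appear more than once in the target
        "duplicates_in_target": sorted(k for k, n in tgt_counts.items() if n > 1),
        # Data types: non-NULL target values that are not of the expected type
        "type_mismatches": sorted({
            (r[key], col)
            for r in target
            for col, typ in expected_types.items()
            if r[col] is not None and not isinstance(r[col], typ)
        }),
        # Correctness: target rows that differ from the source row with the same key
        "value_mismatches": sorted({
            r[key] for r in target
            if r[key] in src_by_key and r != src_by_key[r[key]]
        }),
    }

source = [{"id": 1, "amount": 10}, {"id": 2, "amount": 20}, {"id": 3, "amount": 30}]
# Defective target: row 2 is duplicated with a string amount, row 3 is missing.
target = [{"id": 1, "amount": 10}, {"id": 2, "amount": "20"}, {"id": 2, "amount": "20"}]

report = validate(source, target, key="id", expected_types={"amount": int})
for name, finding in report.items():
    print(name, finding)
```

Note that the row counts match in this example even though the target is wrong, which is why a count comparison alone is not sufficient.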
Security Testing: This testing ensures only authorized users can access reports according to their assigned rights. When performing security tests, it is important to consider the following:
Unauthorized user access
Access to reports based on role
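One way to organize these checks is as a table of expected allow and deny outcomes that is run against the access check. The sketch below is illustrative only: ROLE_REPORTS, can_view, and the role and report names are hypothetical, and in practice the check would call the BI platform's own authorization mechanism.

```python
# Hypothetical role-to-report assignments
ROLE_REPORTS = {
    "finance_analyst": {"revenue_summary", "cost_breakdown"},
    "sales_rep": {"pipeline_overview"},
}

def can_view(user_roles, report):
    """Stand-in for the real access check: allow if any role grants the report."""
    return any(report in ROLE_REPORTS.get(role, set()) for role in user_roles)

# Each case: (roles of the user, report requested, expected outcome)
ACCESS_CASES = [
    (["finance_analyst"], "revenue_summary", True),
    (["sales_rep"], "revenue_summary", False),           # role lacks this report
    (["sales_rep", "finance_analyst"], "cost_breakdown", True),
    ([], "pipeline_overview", False),                    # user with no roles
    (["unknown_role"], "pipeline_overview", False),      # unrecognized role
]

def access_violations(cases, check=can_view):
    """Return the cases where the actual outcome differs from the expected one."""
    return [
        (roles, report)
        for roles, report, expected in cases
        if check(roles, report) != expected
    ]

print(access_violations(ACCESS_CASES))
```

Including deny cases matters as much as allow cases, since an access check that always grants access would pass a test suite made only of positive cases.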
Report Testing: The goal of report testing is to verify that BI reports meet all functional requirements defined in the Business Requirements Document. When performing functional testing of reports, it is important to consider the following:
Report drill up, drill down and drill through
Report navigation and embedded Links
Filters
Sorting
Export functionality
Report dashboard
Dependent reports
Verify that the report runs correctly with different parameter values and on the device the user will use to receive it. Verify that subscriptions run the report and distribute it as intended.
Verify the data returned is what you expected
Check that the report’s performance is within acceptable limits
Validation of report data (Correctness, completeness, and integrity).
Verify the required security measures
Automate the testing process where possible, as it can save a lot of time
Verify that the business rules have been followed
Regression Test: Regression tests aim to confirm that existing functionality still works each time new code is added for a feature or existing code is changed to correct application defects. Before regression testing is performed, an impact analysis should be conducted together with the developers to identify the functional areas that will be affected. Ideally, the full regression suite is run for every drop/build. If builds are frequent or testing time is limited, regression runs should be scheduled based on the priority of the test cases.
Performance Testing: Performance testing aims to ensure that the non-functional requirements for loading data and running reports are met. It involves different types of tests, such as stress tests, load tests, and volume tests. When performing performance testing, it is important to consider the following factors:
Compare SQL query execution times for data in the report UI and in the backend
Multiple users can access the same reports simultaneously
Report rendering with multiple filters applied
Check the ETL process by loading a large volume of data and verifying that it completes within the expected time.
Browse the cube using multiple options to validate the performance of the OLAP system
Analyze the maximum number of users who can access and process BI reports during peak and off-peak times
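The concurrent access point above can be approximated with a simple timing harness. This sketch uses only the Python standard library. The function fake_report merely sleeps to stand in for a real report query, and the 2-second limit in SLA_SECONDS is an assumed figure, not a recommendation. A dedicated load testing tool would normally be used for serious measurements.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def timed(fn, *args):
    """Call fn and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start

def simulate_concurrent_users(run_report, n_users):
    """Run the same report for n_users in parallel; return each elapsed time."""
    with ThreadPoolExecutor(max_workers=n_users) as pool:
        futures = [pool.submit(timed, run_report) for _ in range(n_users)]
        return [f.result()[1] for f in futures]

def fake_report():
    time.sleep(0.05)  # placeholder for executing a report query
    return "ok"

SLA_SECONDS = 2.0  # assumed acceptable response time
durations = simulate_concurrent_users(fake_report, n_users=5)
print(f"slowest run: {max(durations):.3f}s, within limit: {max(durations) <= SLA_SECONDS}")
```

The same timed helper can wrap an ETL load of a large data volume to check it against its expected completion time.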

