@@ -8,33 +8,33 @@ Feature: Pipeline tests using the planets dataset
   Some validation of entity attributes is performed: SQL expressions and Python filter
   functions are used, and templatable business rules feature in the transformations.
 
-  Scenario: Validate and filter planets (spark)
-    Given I submit the planets file planets_demo.csv for processing
-    And A spark pipeline is configured
-    And I add initial audit entries for the submission
-    Then the latest audit record for the submission is marked with processing status file_transformation
-    When I run the file transformation phase
-    Then the planets entity is stored as a parquet after the file_transformation phase
-    And the latest audit record for the submission is marked with processing status data_contract
-    When I run the data contract phase
-    Then there is 1 record rejection from the data_contract phase
-    And the planets entity is stored as a parquet after the data_contract phase
-    And the latest audit record for the submission is marked with processing status business_rules
-    When I run the business rules phase
-    Then The rules restrict "planets" to 1 qualifying record
-    And At least one row from "planets" has generated error code "HIGH_DENSITY"
-    And At least one row from "planets" has generated error code "WEAK_ESCAPE"
-    And the planets entity is stored as a parquet after the business_rules phase
-    And the latest audit record for the submission is marked with processing status error_report
-    When I run the error report phase
-    Then An error report is produced
-    And The entity "planets" does not contain an entry for "Jupiter" in column "planet"
-    And The entity "planets" contains an entry for "Neptune" in column "planet"
-    And The statistics entry for the submission shows the following information
-      | parameter                | value |
-      | record_count             | 9     |
-      | number_record_rejections | 18    |
-      | number_warnings          | 0     |
+#  Scenario: Validate and filter planets (spark)
+#    Given I submit the planets file planets_demo.csv for processing
+#    And A spark pipeline is configured
+#    And I add initial audit entries for the submission
+#    Then the latest audit record for the submission is marked with processing status file_transformation
+#    When I run the file transformation phase
+#    Then the planets entity is stored as a parquet after the file_transformation phase
+#    And the latest audit record for the submission is marked with processing status data_contract
+#    When I run the data contract phase
+#    Then there is 1 record rejection from the data_contract phase
+#    And the planets entity is stored as a parquet after the data_contract phase
+#    And the latest audit record for the submission is marked with processing status business_rules
+#    When I run the business rules phase
+#    Then The rules restrict "planets" to 1 qualifying record
+#    And At least one row from "planets" has generated error code "HIGH_DENSITY"
+#    And At least one row from "planets" has generated error code "WEAK_ESCAPE"
+#    And the planets entity is stored as a parquet after the business_rules phase
+#    And the latest audit record for the submission is marked with processing status error_report
+#    When I run the error report phase
+#    Then An error report is produced
+#    And The entity "planets" does not contain an entry for "Jupiter" in column "planet"
+#    And The entity "planets" contains an entry for "Neptune" in column "planet"
+#    And The statistics entry for the submission shows the following information
+#      | parameter                | value |
+#      | record_count             | 9     |
+#      | number_record_rejections | 18    |
+#      | number_warnings          | 0     |
 
   Scenario: Handle a file with no extension provided (spark)
     Given I submit the planets file planets_no_extension for processing
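
The feature description above says validation uses SQL expressions and Python filter functions, and the commented-out scenario asserts the error codes "HIGH_DENSITY" and "WEAK_ESCAPE" plus parquet output after each phase. As a rough illustration only, the PySpark sketch below shows how such checks could be wired together; the column names (density, escape_velocity), thresholds, and output path are invented for this example and are not taken from the repository.

```python
# Hypothetical sketch, NOT the project's pipeline code: a SQL-expression rule
# plus a Python filter over a planets entity, ending with a parquet write.
from pyspark.sql import SparkSession
import pyspark.sql.functions as F

spark = SparkSession.builder.appName("planets-demo").getOrCreate()

# The scenarios submit planets_demo.csv; header/schema handling is assumed.
planets = (
    spark.read.option("header", True)
    .option("inferSchema", True)
    .csv("planets_demo.csv")
)

# SQL-expression rules: tag rows with the error codes the scenario asserts.
# The density/escape_velocity columns and 5.0 cut-offs are made up here.
flagged = planets.withColumn(
    "error_code",
    F.when(F.expr("density > 5.0"), F.lit("HIGH_DENSITY"))
    .when(F.expr("escape_velocity < 5.0"), F.lit("WEAK_ESCAPE"))
    .otherwise(F.lit(None).cast("string")),
)

# Python filter function: only rows with no error code qualify.
qualifying = flagged.filter(F.col("error_code").isNull())

# Each phase persists the entity as parquet, as the Then steps assert.
qualifying.write.mode("overwrite").parquet("output/planets_business_rules.parquet")
```

In the pipeline itself such cut-offs would presumably come from the templatable business rules the feature description mentions, rather than being hard-coded as they are in this sketch.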