Commit ebea1ee

Merge branch 'main' of https://github.com/NHSDigital/data-validation-engine into ci/add_conda_publish_setup
2 parents 68d61f2 + db0d300 commit ebea1ee

132 files changed

Lines changed: 3886 additions & 2130 deletions

.github/pull_request_template.md

Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+## TLDR of changes
+
+
+## What kind of changes does this PR introduce?
+
+__Tick all that apply__
+
+- [ ] fix: A bug fix. Correlates with PATCH in SemVer
+- [ ] feat: A new feature. Correlates with MINOR in SemVer
+- [ ] docs: Documentation only changes
+- [ ] style: Changes that do not affect the meaning of the code (white-space, formatting, missing semi-colons, etc)
+- [ ] refactor: A code change that neither fixes a bug nor adds a feature
+- [ ] perf: A code change that improves performance
+- [ ] test: Adding missing or correcting existing tests
+- [ ] build: Changes that affect the build system or external dependencies (example scopes: pip, docker, npm)
+- [ ] ci: Changes to CI configuration files and scripts (example scopes: GitLabCI)
+
+
+## Please check if the PR fulfills these requirements
+
+- [ ] I have read and followed the [Contributing guidance](../CONTRIBUTE.md)
+- [ ] Docs have been added / updated
+- [ ] Tests and Linting in the CI are passing
+- [ ] Changes have been reviewed and approved by a Project Maintainer
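
Reviewer note: the checklist in this template ties `fix` to a PATCH bump and `feat` to a MINOR bump. The sketch below illustrates that mapping in Python. The helper names `bump_for` and `next_version` are hypothetical and are not part of the DVE or commitizen, and treating every other listed type as "no bump" is an assumption, since the template does not say how those types affect the version.

```python
from typing import Optional

# Mapping taken from the PR template checklist: fix -> PATCH, feat -> MINOR.
# All other commit types are assumed (not stated in the template) to leave
# the version unchanged.
BUMP_BY_TYPE = {"fix": "PATCH", "feat": "MINOR"}


def bump_for(commit_type: str) -> Optional[str]:
    """Return the SemVer component a commit type bumps, or None."""
    return BUMP_BY_TYPE.get(commit_type.strip().lower())


def next_version(version: str, commit_type: str) -> str:
    """Apply the bump implied by commit_type to a MAJOR.MINOR.PATCH string."""
    major, minor, patch = (int(part) for part in version.lstrip("v").split("."))
    bump = bump_for(commit_type)
    if bump == "MINOR":
        return f"{major}.{minor + 1}.0"
    if bump == "PATCH":
        return f"{major}.{minor}.{patch + 1}"
    return f"{major}.{minor}.{patch}"


if __name__ == "__main__":
    print(next_version("0.7.1", "fix"))
    print(next_version("0.6.2", "feat"))
```

In practice commitizen performs the bump itself; the sketch only shows why ticking `fix` or `feat` matters for the release number.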
Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
+name: Publish Documentation
+
+on:
+  push:
+    branches: main
+
+permissions:
+  contents: read
+  pages: write
+  id-token: write
+
+jobs:
+  deploy:
+    environment:
+      name: github-pages
+      url: ${{ steps.deployment.outputs.page_url }}
+    runs-on: ubuntu-24.04
+    steps:
+      - uses: actions/configure-pages@v5
+
+      - uses: actions/checkout@v5
+
+      - name: Install extra dependencies for a python install
+        run: |
+          sudo apt-get update
+          sudo apt -y install --no-install-recommends liblzma-dev libbz2-dev libreadline-dev
+
+      - name: Install asdf cli
+        uses: asdf-vm/actions/setup@b7bcd026f18772e44fe1026d729e1611cc435d47
+
+      - name: Install software through asdf
+        uses: asdf-vm/actions/install@b7bcd026f18772e44fe1026d729e1611cc435d47
+
+      - name: reshim asdf
+        run: asdf reshim
+
+      - name: ensure poetry using desired python version
+        run: poetry env use $(asdf which python)
+
+      - name: install docs requirements
+        run: |
+          poetry install --sync --no-interaction --with docs
+
+      - run: poetry run zensical build --clean
+      - uses: actions/upload-pages-artifact@v4
+        with:
+          path: site
+      - uses: actions/deploy-pages@v4
+        id: deployment

.github/workflows/ci_linting.yml

Lines changed: 2 additions & 2 deletions
@@ -17,10 +17,10 @@ jobs:
           sudo apt -y install --no-install-recommends liblzma-dev libbz2-dev libreadline-dev
 
       - name: Install asdf cli
-        uses: asdf-vm/actions/setup@v4
+        uses: asdf-vm/actions/setup@b7bcd026f18772e44fe1026d729e1611cc435d47
 
       - name: Install software through asdf
-        uses: asdf-vm/actions/install@v4
+        uses: asdf-vm/actions/install@b7bcd026f18772e44fe1026d729e1611cc435d47
 
       - name: reshim asdf
         run: asdf reshim

.github/workflows/ci_testing.yml

Lines changed: 2 additions & 2 deletions
@@ -20,10 +20,10 @@ jobs:
           sudo apt -y install --no-install-recommends liblzma-dev libbz2-dev libreadline-dev libxml2-utils
 
       - name: Install asdf cli
-        uses: asdf-vm/actions/setup@v4
+        uses: asdf-vm/actions/setup@b7bcd026f18772e44fe1026d729e1611cc435d47
 
       - name: Install software through asdf
-        uses: asdf-vm/actions/install@v4
+        uses: asdf-vm/actions/install@b7bcd026f18772e44fe1026d729e1611cc435d47
 
       - name: reshim asdf
         run: asdf reshim
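
Reviewer note: both CI workflows swap the `asdf-vm/actions` tag reference `@v4` for a full commit SHA. A minimal Python sketch of how such pinning could be checked is given below. The function name `unpinned_actions` is hypothetical, and the check is a plain line-by-line regular expression rather than a YAML parse, so it is an illustration under those assumptions and not a tool used by this repository.

```python
import re
from typing import List

# A `uses:` reference counts as pinned when the ref after "@" is a full
# 40-character lowercase hex commit SHA (assumption made for this sketch).
USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*(?P<action>[^@\s]+)@(?P<ref>\S+)")
SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def unpinned_actions(workflow_text: str) -> List[str]:
    """Return every `action@ref` whose ref is not a 40-character SHA."""
    unpinned = []
    for line in workflow_text.splitlines():
        match = USES_RE.match(line)
        if match and not SHA_RE.match(match.group("ref")):
            unpinned.append(f"{match.group('action')}@{match.group('ref')}")
    return unpinned


if __name__ == "__main__":
    sample = """
      - name: Install asdf cli
        uses: asdf-vm/actions/setup@v4
      - name: Install software through asdf
        uses: asdf-vm/actions/install@b7bcd026f18772e44fe1026d729e1611cc435d47
    """
    print(unpinned_actions(sample))
```

Run against the new documentation workflow in this commit, a check like this would still report the `actions/*` steps, which are referenced by version tag (`@v5`, `@v4`) rather than by SHA.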

CHANGELOG.md

Lines changed: 29 additions & 0 deletions
@@ -1,3 +1,32 @@
+## v0.7.2 (2026-04-02)
+
+### Fix
+
+- amend messaging for missing CSV fields check in ddb reader (#83)
+- ensure publish audit only captures information about the indivdual submission (#77)
+
+### Refactor
+
+- add extra clarification to the units presented in the file size row of the error report (#81)
+
+## v0.7.1 (2026-03-12)
+
+### Fix
+
+- enable passing of original_entity_override to fix instance where id cannot be sourced from current entity
+
+## v0.7.0 (2026-03-12)
+
+### Feat
+
+- add row index (#57)
+- add option in csv readers to clean and null empty strings (#64)
+
+### Fix
+
+- add ability to strictly enforce date format in conformatteddate (#70)
+- add postcode type to model gen (#69)
+
 ## v0.6.2 (2026-03-09)
 
 ### Fix

CONTRIBUTE.md

Lines changed: 37 additions & 9 deletions
@@ -1,30 +1,42 @@
 # DVE Contributing guidelines
 
-# Developer information
+__If you're planning to contribute to the DVE, please follow all the guidance below. Failure to follow the guidance in this document may result in your contributions being automatically rejected.__
 
 ## Getting started
 
-To begin with I would recommend that users read all the documentation available within the [docs](./docs/). It gives an overview of how the DVE works and how to work with the dischema json document.
+I would recommend that you read all the documentation available within the [docs](https://nhsdigital.github.io/data-validation-engine/). It gives an overview of how the DVE works and how to work with the dischema json document.
 
 ## General requirements
-To start contributing to the DVE project you will need the following tooling available:
+
+To start contributing to the DVE project you will need the tooling listed within the `.tool-versions` or `mise.toml`. The following tools are required because...
+
 | Tool | Version | Reason |
 | ---- | ------- | ------ |
-| `Python` | 3.7.17 | Currently supported version of `Python` for the DVE. |
-| `Poetry` | 1.4.2 | Build and venv tool used for the DVE. |
+| `Python` | 3.11 | Latest version of Python supported by the DVE. |
+| `Poetry` | 2.2.1 | Build and package manager tool used for the DVE. |
 | `Java` | java liberica-1.8.0 | `Java` version required for `PySpark`. |
-| `pre-commit` | 2.21.0 | Currently installed as part of the `poetry` venv but seperate installation is fine. |
-| `commitizen` | 3.9.1 | Like `pre-commit`, installed as part of the `poetry` venv but seperate installation is fine. This is used to manage commits and automated semantic versioning. |
+| `pre-commit` | 4.3.0 | Currently installed as part of the `poetry` venv but seperate installation is fine. |
+| `commitizen` | 4.9.1 | Like `pre-commit`, installed as part of the `poetry` venv but seperate installation is fine. This is used to manage commits and automated semantic versioning. |
 | `git-secrets` | Latest | Utilised as part of the `pre-commit` to ensure that no secrets are commited to the repository. There is a helper installation script within [scripts](/scripts/git-secrets/). |
 
-Additionally, we have created a [asdf support](.tool-versions) and [mise-en-toml](.mise.toml) for those utilising `asdf` or `mise-en-toml` software.
+You can install all the developer requirements with the following command:
+
+```bash
+poetry install --with dev
+```
 
 ## Testing Requirements
 
 Testing requirements are given in [pyproject.toml](./poetry.toml#48) under `tool.poetry.group.test.dependencies`. These are always pinned versions for consistency, but should be updated regularly if new versions are released. The following core packages are used for testing:
 - [pytest](https://docs.pytest.org/en/stable/): Used for Python unit tests, and some small e2e tests which check coverage.
 - [behave](https://github.com/behave/behave): Used for full, business-driven end-to-end tests.
-- [coverage](https://coverage.readthedocs.io/en/7.10.7/): Used to get coverage for `pytest` tests.
+- [coverage](https://coverage.readthedocs.io/en/): Used to get coverage for `pytest` tests.
+
+You can install these requirements with the following command:
+
+```bash
+poetry install --with test
+```
 
 ## Linting/Formatting/Type Checking Requirements
 
@@ -38,6 +50,12 @@ This mostly breaks down to:
 
 We use these tools to ensure that code quality is not excessively compromised, even when working at pace.
 
+You can install these requirements with the following command:
+
+```bash
+poetry install --with lint
+```
+
 ## Installation for Development
 
 We are utilising Poetry for build dependency management and packaging. If you're on a system that has `Make` available, you can simply run `make install` to setup a local virtual environment with all the dependencies installed (this won't install Poetry for you).
@@ -50,6 +68,16 @@ Tests should be run after installing the package for development as outlined abo
 - To check the coverage run `poetry run coverage report -m`
 - To run the behave tests, run `poetry run behave tests/features` (these are not included in coverage calculations)
 
+## Committing
+
+We use [commitizen](https://github.com/commitizen-tools/commitizen) to commit new changes. This ensures...
+
+1. A consistent standard for the commit messages
+2. Generation of changelog from the commit messages
+3. Allows for automatic bumping of the version based on the changes
+
+Please use `poetry run cz c` or `cz c` (if already in the venv).
+
 ## Submitting a pull request
 
 If you want to contribute to the DVE then please follow the steps below:

README.md

Lines changed: 8 additions & 8 deletions
@@ -1,5 +1,5 @@
 <h1 style="display: flex; align-items: center; gap: 10px;">
-<img src="overrides/.icons/nhseng.svg" alt="NHS Logo" width="5%" height="100%" align="left">
+<img src="https://github.com/NHSDigital/data-validation-engine/blob/616b55890306db4546177f7effac48ca241857ec/overrides/.icons/nhseng.svg" alt="" width="5%" height="100%" align="left">
 Data Validation Engine
 </h1>
 
@@ -25,11 +25,11 @@ If you'd like more detailed documentation around these services the please read
 
 The DVE has been designed in a way that's modular and can support users who just want to utilise specific "services" from the DVE (i.e. just the file transformation + data contract). Additionally, the DVE is designed to support different backend implementations. As part of the base installation of DVE, you will find backend support for `Spark` and `DuckDB`. So, if you need a `MySQL` backend implementation, you can implement this yourself. Given our organisations requirements, it will be unlikely that we add anymore specific backend implementations into the base package beyond Spark and DuckDB. So, if you are unable to implement this yourself, I would recommend reading the guidance on [requesting new features and raising bug reports here](#requesting-new-features-and-raising-bug-reports).
 
-Additionally, if you'd like to contribute a new backend implementation into the base DVE package, then please look at the [Contributing][#Contributing] section.
+Additionally, if you'd like to contribute a new backend implementation into the base DVE package, then please look at the [Contributing](#Contributing) section.
 
 ## Installation and usage
 
-The DVE is a Python package and can be installed using `pip`. As of release v0.6.1 we currently support Python 3.10 & 3.11, with Spark version 3.4 and DuckDB version of 1.1. In the future we will be looking to upgrade the DVE to working on a higher versions of Python, DuckDB and Spark.
+The DVE is a Python package and can be installed using package managers such as [pip](https://pypi.org/project/pip/). As of the latest release we support Python 3.10 & 3.11, with Spark v3.4 and DuckDB v1.1. In the future we will be looking to upgrade the DVE to working on a higher versions of Python, DuckDB and Spark.
 
 If you're planning to use the Spark backend implementation, you will also need OpenJDK 11 installed.
 
@@ -38,12 +38,12 @@ Python dependencies are listed in `pyproject.toml`.
 To install the DVE package you can simply install using a package manager such as [pip](https://pypi.org/project/pip/).
 
 ```
-pip install git+https://github.com/NHSDigital/data-validation-engine.git@v0.6.1
+pip install data-validation-engine
 ```
 
-Once you have installed the DVE you are ready to use it. For guidance on how to create your dischema JSON document (configuration), please read the [documentation](./docs/).
+*Note - Only versions >=0.6.2 are available on PyPi. For older versions please install directly from the git repo or build from source.*
 
-Please note - The long term aim is to make the DVE available via PyPi and Conda but we are not quite there yet. Once available this documentation will be updated to contain the new installation options.
+Once you have installed the DVE you are ready to use it. For guidance on how to create your dischema JSON document (configuration), please read the [documentation](./docs/).
 
 Version 0.0.1 does support a working Python 3.7 installation. However, we will not be supporting any issues with that version of the DVE if you choose to use it. __Use at your own risk__.
 
@@ -60,10 +60,10 @@ Below is a list of features that we would like to implement or have been request
 | ------- | --------------- | --------- |
 | Open source release | 0.1.0 | Yes |
 | Uplift to Python 3.11 | 0.2.0 | Yes |
-| Upgrade to Pydantic 2.0 | Not yet confirmed | No |
+| Upgrade to Pydantic 2.0 | Before 1.0 release | No |
 | Create a more user friendly interface for building and modifying dischema files | Not yet confirmed | No |
 
-Beyond the Python upgrade, we cannot confirm the other features will be made available anytime soon. Therefore, if you have the interest and desire to make these features available, then please read the [Contributing](#contributing) section and get involved.
+Beyond the Python and Pydantic upgrade, we cannot confirm the other features will be made available anytime soon. Therefore, if you have the interest and desire to make these features available, then please read the [Contributing](#contributing) section and get involved.
 
 ## Contributing
 Please see guidance [here](./CONTRIBUTE.md).
