Commit 31eabfc

Merge branch 'v4.1.0-dev' of github.com:muthu-ku/ACI-Pre-Upgrade-Validation-Script into muthuk-CSCwd37387
2 parents 2b46fea + 07ea2db commit 31eabfc

38 files changed

Lines changed: 1771 additions & 36 deletions

.github/workflows/pytest.yml

Lines changed: 2 additions & 2 deletions
@@ -2,9 +2,9 @@ name: Pytest
 
 on:
   push:
-    branches: [master]
+    branches: [master, 'v[0-9].[0-9]+.[0-9]+*']
   pull_request:
-    branches: [master]
+    branches: [master, 'v[0-9].[0-9]+.[0-9]+*']
 
 permissions:
   contents: read
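
The new branch filter is written in GitHub Actions filter-pattern syntax, not as a regular expression. As a rough illustration of which branch names it would select, the pattern can be approximated with a Python regex. This sketch assumes that `.` is a literal dot, `+` repeats the preceding character class, and `*` matches any run of characters other than `/`. The function name and the translated regex are hypothetical and are not part of this commit.

```python
import re

# Approximate regex form of the workflow filter 'v[0-9].[0-9]+.[0-9]+*'.
# Assumption: '.' is literal, '+' repeats the preceding class,
# and '*' matches any characters except '/'.
BRANCH_FILTER = re.compile(r"v[0-9]\.[0-9]+\.[0-9]+[^/]*")


def branch_triggers_workflow(branch: str) -> bool:
    """Return True if a push or pull request on `branch` would match the filter."""
    return branch == "master" or BRANCH_FILTER.fullmatch(branch) is not None
```

Under these assumptions, a development branch such as `v4.1.0-dev` matches, while a name with a two-digit major version such as `v10.1.0` does not, because the first character class has no `+` after it.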

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -50,6 +50,7 @@ coverage.xml
 .hypothesis/
 .pytest_cache/
 cover/
+preupgrade_validator*.tgz
 
 # Translations
 *.mo

aci-preupgrade-validation-script.py

Lines changed: 261 additions & 13 deletions
Large diffs are not rendered by default.

docs/docs/validations.md

Lines changed: 90 additions & 2 deletions
@@ -82,7 +82,7 @@ Items | Faults | This Script
 [Fabric Port Status][f19] | F1394: ethpm-if-port-down-fabric | :white_check_mark: | :no_entry_sign:
 [Equipment Disk Limits][f20] | F1820: 80% -minor<br>F1821: -major<br>F1822: -critical | :white_check_mark: | :no_entry_sign:
 [VMM Inventory Partially Synced][f21] | F0132: comp-ctrlr-operational-issues | :white_check_mark: | :no_entry_sign:
-
+[APIC Storage Inode Usage][f22] | F4388: 75% - 85% -warning<br>F4389: 85% - 90% -major<br>F4390: 90% or more -critical | :white_check_mark: | :no_entry_sign:
 
 [f1]: #apic-disk-space-usage
 [f2]: #standby-apic-disk-space-usage
@@ -105,6 +105,7 @@ Items | Faults | This Script
 [f19]: #fabric-port-status
 [f20]: #equipment-disk-limits
 [f21]: #vmm-inventory-partially-synced
+[f22]: #apic-storage-inode-usage
 
 ### Configuration Checks
 
@@ -194,6 +195,9 @@ Items | Defect | This Script
 [ISIS DTEPs Byte Size][d27] | CSCwp15375 | :white_check_mark: | :no_entry_sign:
 [Policydist configpushShardCont Crash][d28] | CSCwp95515 | :white_check_mark: | :no_entry_sign:
 [Auto Firmware Update on Switch Discovery][d29] | CSCwe83941 | :white_check_mark: | :no_entry_sign:
+[Rogue EP Exception List missing on switches][d30] | CSCwp64296 | :white_check_mark: | :no_entry_sign:
+[N9K-C9408 with more than 5 N9K-X9400-16W LEMs][d31] | CSCws82819 | :white_check_mark: | :no_entry_sign:
+[Multi-Pod Modular Spine Bootscript File][d32] | CSCwr66848 | :white_check_mark: | :no_entry_sign:
 
 [d1]: #ep-announce-compatibility
 [d2]: #eventmgr-db-size-defect-susceptibility
@@ -224,6 +228,9 @@ Items | Defect | This Script
 [d27]: #isis-dteps-byte-size
 [d28]: #policydist-configpushshardcont-crash
 [d29]: #auto-firmware-update-on-switch-discovery
+[d30]: #rogue-ep-exception-list-missing-on-switches
+[d31]: #n9k-c9408-with-more-than-5-n9k-x9400-16w-lems
+[d32]: #multi-pod-modular-spine-bootscript-file
 
 ## General Check Details
 
@@ -1551,6 +1558,56 @@ EPGs using the `pre-provision` resolution immediacy do not rely on the VMM inven
 
 This check returns a `MANUAL` result as there are many reasons for a partial inventory sync to be reported. The goal is to ensure that the VMM inventory sync has fully completed before triggering the APIC upgrade to reduce any chance for unexpected inventory changes to occur.
 
+
+### APIC Storage Inode Usage
+
+If a Cisco APIC is running low on inode capacity for any reason, the Cisco APIC upgrade can fail. The Cisco APIC raises one of three faults, depending on inode utilization. If any of these faults is raised on the system, resolve the issue before performing the upgrade.
+
+* **F4388**: A warning level fault for Cisco APIC storage inode utilization. This is raised when utilization is between 75% and 85%.
+
+* **F4389**: A major level fault for Cisco APIC storage inode utilization. This is raised when utilization is between 85% and 90%.
+
+* **F4390**: A critical level fault for Cisco APIC storage inode utilization. This is raised when utilization is 90% or more.
+
+Even when the filesystem has adequate storage space, inode usage can still be a problem. This happens when a large number of small files or directories have been created.
+
+Recommended Action:
+
+To recover from this fault, try the following actions:
+
+1. Free up space on the affected disk partition.
+2. TAC may be required to analyze and clean up certain directories due to filesystem permissions. Cleanup of `/` is one such example.
+
+!!! example "Fault Example (F4390: Critical fault for APIC Inode Utilization)"
+    ```
+    moquery -c faultInst -f 'fault.Inst.code=="F4390"'
+    Total Objects shown: 1
+
+    # faultInst
+    ack : yes
+    alert : no
+    cause : equipment-full
+    changeSet : available (Old: 19408344, New: 19407972), inodesFree (Old: 263915, New: 263842), inodesUsed (Old: 2357525, New: 2357598),
+    used (Old: 19436092, New: 19436464)
+    code : F4390
+    created : 2024-08-05T05:42:31.975+02:00
+    delegated : no
+    descr : Storage unit /scratch-writes on node 3 with hostname apic3 mounted at /scratch-writes is 90% full for Inodes
+    dn : topology/pod-2/node-3/sys/ch/p-[/scratch-writes]-f-[/dev/mapper/atx-scratch]/fault-F4390
+    domain : infra
+    highestSeverity : critical
+    lastTransition : 2024-08-05T09:41:18.152+02:00
+    lc : raised
+    occur : 2
+    origSeverity : critical
+    prevSeverity : cleared
+    rule : eqpt-storage-inode-critical
+    severity : critical
+    subject : equipment-full
+    type : operational
+    ```
+
+
 ## Configuration Check Details
 
 ### VPC-paired Leaf switches
@@ -2648,6 +2705,7 @@ Due to [CSCwp95515][59], upgrading to an affected version while having any `conf
 
 If any instances of `configpushShardCont` are flagged by this script, Cisco TAC must be contacted to identify and resolve the underlying issue before performing the upgrade.
 
+
 ### Auto Firmware Update on Switch Discovery
 
 [Auto Firmware Update on Switch Discovery][63] automatically upgrades a new switch to the target firmware version before registering it to the ACI fabric. This feature activates in three scenarios:
@@ -2668,6 +2726,33 @@ To avoid this risk, consider disabling Auto Firmware Update before upgrading to
 This issue occurs because older switch firmware versions are not compatible with switch images 6.0(3) or newer. The APIC version is not a factor.
 
 
+### Rogue EP Exception List missing on switches
+
+The Rogue/COOP Exception List feature, introduced in 5.2(3), allows exclusion of specific MAC addresses from Rogue Endpoint Control and COOP Dampening. Initially, each MAC address had to be configured individually in each bridge domain. In 6.0(3), this feature was enhanced to support fabric-wide exception lists with wildcard options per bridge domain and the ability to exclude MAC addresses in L3Outs.
+
+However, due to [CSCwp64296][64], when upgrading spine switches to version 6.0(3)+ from an older version with Rogue/COOP Exception Lists configured, some exception lists may not be pushed to the spine switches. As a result, the feature may stop functioning after the upgrade.
+
+The root cause is that internal objects called `presListener` for Rogue/COOP Exception List, which publish the configuration from APICs to switches, may be missing on the APICs after an upgrade.
+
+Recommended action: Delete the affected exception list and create it again. If needed, contact Cisco TAC to help recover missing `presListener` objects on APICs.
+
+
+### N9K-C9408 with more than 5 N9K-X9400-16W LEMs
+
+Due to defect [CSCws82819][65], an N9K-C9408 switch will experience a boot loop with a `dt_helper` process crash if it is upgraded to versions 16.1(2f) to 16.1(5) or 16.2(1g) with more than 5 N9K-X9400-16W LEMs installed.
+
+To avoid this issue, upgrade to a fixed version or use fewer than 6 N9K-X9400-16W LEMs in one chassis.
+
+
+### Multi-Pod Modular Spine Bootscript File
+
+Due to [CSCwr66848][66], in Multi-Pod environments, upgrading a modular spine to 6.1(4h) may cause inter-pod traffic to stop working if the `/bootflash/bootscript` file is missing on the spine prior to the upgrade. The traffic interruption occurs because the spine incorrectly identifies the reason for its reload, leading to an unnecessary attempt to load the missing bootscript file.
+
+This issue happens only when the target version is specifically 6.1(4h).
+
+To avoid this issue, change the target version to another version, or verify that the `bootscript` file exists in the bootflash of each modular spine switch prior to upgrading to 6.1(4h). If the file is missing, perform a clean reboot of the impacted spine to ensure that `/bootflash/bootscript` is created again. If you have already upgraded your spine and are experiencing traffic impact due to this issue, a clean reboot of the spine will restore the traffic.
+
+
 [0]: https://github.com/datacenter/ACI-Pre-Upgrade-Validation-Script
 [1]: https://www.cisco.com/c/dam/en/us/td/docs/Website/datacenter/apicmatrix/index.html
 [2]: https://www.cisco.com/c/en/us/support/switches/nexus-9000-series-switches/products-release-notes-list.html
@@ -2731,4 +2816,7 @@ To avoid this risk, consider disabling Auto Firmware Update before upgrading to
 [60]: https://www.cisco.com/c/en/us/solutions/collateral/data-center-virtualization/application-centric-infrastructure/white-paper-c11-743951.html#Inter
 [61]: https://www.cisco.com/c/en/us/solutions/collateral/data-center-virtualization/application-centric-infrastructure/white-paper-c11-743951.html#EnablePolicyCompression
 [62]: https://bst.cloudapps.cisco.com/bugsearch/bug/CSCwe83941
-[63]: https://www.cisco.com/c/en/us/td/docs/dcn/aci/apic/all/apic-installation-aci-upgrade-downgrade/Cisco-APIC-Installation-ACI-Upgrade-Downgrade-Guide/m-auto-firmware-update.html
+[63]: https://www.cisco.com/c/en/us/td/docs/dcn/aci/apic/all/apic-installation-aci-upgrade-downgrade/Cisco-APIC-Installation-ACI-Upgrade-Downgrade-Guide/m-auto-firmware-update.html
+[64]: https://bst.cloudapps.cisco.com/bugsearch/bug/CSCwp64296
+[65]: https://bst.cloudapps.cisco.com/bugsearch/bug/CSCws82819
+[66]: https://bst.cloudapps.cisco.com/bugsearch/bug/CSCwr66848
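
The fault table added to `docs/docs/validations.md` maps inode utilization to F4388 (75% - 85%), F4389 (85% - 90%), and F4390 (90% or more). A minimal sketch of that mapping follows. The function name is hypothetical, and treating each lower bound as inclusive is an assumption. The documentation only confirms the 90% boundary, through the F4390 example that reports "90% full".

```python
from typing import Optional


def inode_fault_code(utilization_percent: float) -> Optional[str]:
    """Map APIC inode utilization to the fault code the docs describe.

    Assumption: each lower bound is inclusive. Only the 90% bound is
    confirmed by the documented F4390 example.
    """
    if utilization_percent >= 90:
        return "F4390"  # critical
    if utilization_percent >= 85:
        return "F4389"  # major
    if utilization_percent >= 75:
        return "F4388"  # warning
    return None  # below the warning threshold, no inode fault expected
```

For instance, the 82% utilization seen in the test data corresponds to F4388, and the 90% value in the documented fault example corresponds to F4390.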
Lines changed: 80 additions & 0 deletions
@@ -0,0 +1,80 @@
+[
+    {
+        "faultInst": {
+            "attributes": {
+                "ack": "no",
+                "alert": "no",
+                "cause": "equipment-full",
+                "changeSet": "available (Old: 37868344, New: 37859228), inodesFree (Old: 810163, New: 479339), inodesUsed (Old: 1811277, New: 2142101), inodesUtilized (Old: 70, New: 82), used (Old: 976092, New: 985208)",
+                "code": "F4388",
+                "created": "2026-03-06T11:58:43.579+00:00",
+                "delegated": "no",
+                "descr": "Storage unit /data/admin/bin/avread on Node 1 mounted at /data/admin/bin/avread is 82% full for Inodes",
+                "dn": "topology/pod-1/node-1/sys/ch/p-[/data/admin/bin/avread]-f-[overlayfs]/fault-F4388",
+                "domain": "infra",
+                "highestSeverity": "warning",
+                "lastTransition": "2026-03-06T11:58:43.579+00:00",
+                "lc": "raised",
+                "occur": "1",
+                "origSeverity": "warning",
+                "prevSeverity": "warning",
+                "rule": "eqpt-storage-inode-warning",
+                "severity": "warning",
+                "subject": "equipment-full",
+                "type": "operational"
+            }
+        }
+    },
+    {
+        "faultInst": {
+            "attributes": {
+                "ack": "no",
+                "alert": "no",
+                "cause": "equipment-full",
+                "changeSet": "available (Old: 37868344, New: 37859228), inodesFree (Old: 810163, New: 479339), inodesUsed (Old: 1811277, New: 2142101), inodesUtilized (Old: 70, New: 82), used (Old: 976092, New: 985208)",
+                "code": "F4388",
+                "created": "2026-03-06T11:58:43.587+00:00",
+                "delegated": "no",
+                "descr": "Storage unit /etc/hosts on Node 1 mounted at /etc/hosts is 82% full for Inodes",
+                "dn": "topology/pod-1/node-1/sys/ch/p-[/etc/hosts]-f-[overlayfs]/fault-F4388",
+                "domain": "infra",
+                "highestSeverity": "warning",
+                "lastTransition": "2026-03-06T11:58:43.587+00:00",
+                "lc": "soaking",
+                "occur": "1",
+                "origSeverity": "warning",
+                "prevSeverity": "warning",
+                "rule": "eqpt-storage-inode-warning",
+                "severity": "warning",
+                "subject": "equipment-full",
+                "type": "operational"
+            }
+        }
+    },
+    {
+        "faultInst": {
+            "attributes": {
+                "ack": "no",
+                "alert": "no",
+                "cause": "equipment-full",
+                "changeSet": "available (Old: 37868344, New: 37859228), inodesFree (Old: 810163, New: 479339), inodesUsed (Old: 1811277, New: 2142101), inodesUtilized (Old: 70, New: 82), used (Old: 976092, New: 985208)",
+                "code": "F4388",
+                "created": "2026-03-06T11:58:43.595+00:00",
+                "delegated": "no",
+                "descr": "Storage unit /scratch-writes on Node 1 mounted at /scratch-writes is 82% full for Inodes",
+                "dn": "topology/pod-1/node-1/sys/ch/p-[/scratch-writes]-f-[/dev/mapper/atx-scratch]/fault-F4388",
+                "domain": "infra",
+                "highestSeverity": "warning",
+                "lastTransition": "2026-03-06T11:58:43.595+00:00",
+                "lc": "raised-clearing",
+                "occur": "1",
+                "origSeverity": "warning",
+                "prevSeverity": "warning",
+                "rule": "eqpt-storage-inode-warning",
+                "severity": "warning",
+                "subject": "equipment-full",
+                "type": "operational"
+            }
+        }
+    }
+]
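
Each `dn` in the fault records above encodes the pod, node, mount point, backing filesystem, and fault code. As an illustration only, the sketch below pulls those fields out with a regular expression. The `InodeFaultLocation` class and `parse_inode_fault_dn` function are hypothetical names, and the pattern is inferred from the sample `dn` values shown here, not taken from the validation script.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Pattern inferred from the sample dn values in the test data (an assumption).
_DN_PATTERN = re.compile(
    r"topology/pod-(\d+)/node-(\d+)/sys/ch/"
    r"p-\[(.+?)\]-f-\[(.+?)\]/fault-(F\d+)"
)


@dataclass(frozen=True)
class InodeFaultLocation:
    pod: int
    node: int
    mount_point: str
    filesystem: str
    fault_code: str


def parse_inode_fault_dn(dn: str) -> Optional[InodeFaultLocation]:
    """Split a storage fault dn into its parts, or return None if it does not match."""
    match = _DN_PATTERN.fullmatch(dn)
    if match is None:
        return None
    pod, node, mount_point, filesystem, code = match.groups()
    return InodeFaultLocation(int(pod), int(node), mount_point, filesystem, code)
```

A parsed record makes it straightforward to report which APIC node and mount point raised the fault, instead of printing the raw `dn`.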
Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
+[
+    {
+        "faultInst": {
+            "attributes": {
+                "ack": "no",
+                "alert": "no",
+                "cause": "equipment-full",
+                "changeSet": "available (Old: 37868344, New: 37859228), inodesFree (Old: 810163, New: 479339), inodesUsed (Old: 1811277, New: 2142101), inodesUtilized (Old: 70, New: 82), used (Old: 976092, New: 985208)",
+                "code": "F4388",
+                "created": "2026-03-06T11:58:43.579+00:00",
+                "delegated": "no",
+                "descr": "Storage unit /data/admin/bin/avread on Node 1 mounted at /data/admin/bin/avread is 82% full for Inodes",
+                "dn": "topology/pod-1/node-1/sys/ch/p-[/data/admin/bin/avread]-f-[overlayfs]/fault-F4388",
+                "domain": "infra",
+                "highestSeverity": "warning",
+                "lastTransition": "2026-03-06T11:58:43.579+00:00",
+                "lc": "cleared",
+                "occur": "1",
+                "origSeverity": "warning",
+                "prevSeverity": "warning",
+                "rule": "eqpt-storage-inode-warning",
+                "severity": "warning",
+                "subject": "equipment-full",
+                "type": "operational"
+            }
+        }
+    },
+    {
+        "faultInst": {
+            "attributes": {
+                "ack": "no",
+                "alert": "no",
+                "cause": "equipment-full",
+                "changeSet": "available (Old: 37868344, New: 37859228), inodesFree (Old: 810163, New: 479339), inodesUsed (Old: 1811277, New: 2142101), inodesUtilized (Old: 70, New: 82), used (Old: 976092, New: 985208)",
+                "code": "F4388",
+                "created": "2026-03-06T11:58:43.587+00:00",
+                "delegated": "no",
+                "descr": "Storage unit /etc/hosts on Node 1 mounted at /etc/hosts is 82% full for Inodes",
+                "dn": "topology/pod-1/node-1/sys/ch/p-[/etc/hosts]-f-[overlayfs]/fault-F4388",
+                "domain": "infra",
+                "highestSeverity": "warning",
+                "lastTransition": "2026-03-06T11:58:43.587+00:00",
+                "lc": "retaining",
+                "occur": "1",
+                "origSeverity": "warning",
+                "prevSeverity": "warning",
+                "rule": "eqpt-storage-inode-warning",
+                "severity": "warning",
+                "subject": "equipment-full",
+                "type": "operational"
+            }
+        }
+    }
+]
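
The two data files differ only in the `lc` (lifecycle) values: `raised`, `soaking`, and `raised-clearing` in the first, `cleared` and `retaining` in the second. One plausible reading is that the second file represents faults a check should ignore, but the script logic is not rendered on this page, so that is an assumption. Under that assumption, a filter over records of this shape might look like the sketch below. The names `INACTIVE_LIFECYCLES` and `active_inode_faults` are hypothetical.

```python
from typing import Any, Dict, Iterable, List

# Assumption: faults in these lifecycle states no longer need attention.
INACTIVE_LIFECYCLES = frozenset({"cleared", "retaining"})


def active_inode_faults(
    records: Iterable[Dict[str, Any]],
    inactive: frozenset = INACTIVE_LIFECYCLES,
) -> List[Dict[str, str]]:
    """Return the attributes of faultInst records whose lifecycle is still active."""
    active = []
    for record in records:
        attrs = record.get("faultInst", {}).get("attributes")
        if not attrs:
            continue  # skip records that are not faultInst objects
        if attrs.get("lc") not in inactive:
            active.append(attrs)
    return active
```

Loading either file with `json.load` and passing the resulting list to `active_inode_faults` would, under this reading, return three faults for the first file and none for the second.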
