Skip to content

fix: plan-resolve silently drops packages not in pool#44

Open
marcos-mendez wants to merge 1 commit intoturnkeylinux:masterfrom
popsolutions:fix/plan-resolve-missing-packages
Open

fix: plan-resolve silently drops packages not in pool#44
marcos-mendez wants to merge 1 commit intoturnkeylinux:masterfrom
popsolutions:fix/plan-resolve-missing-packages

Conversation

@marcos-mendez
Copy link
Copy Markdown
Member

Summary

Plan.resolve() in fablib/plan.py silently drops packages from the plan that are not found in the pool, instead of passing them through as repo-installable packages. This causes critical packages like locales, curl, nginx, git, iptables, fail2ban, postfix, ssh, build-essential, isolinux, linux-image-amd64 etc. to be missing from root.spec, breaking all appliance builds.

Root cause

Three bugs in Plan.resolve():

  1. missing set never populatedpackages.missing was not accumulated across while-loop iterations. The missing variable stayed empty throughout execution.

  2. all_missing computed in wrong scope — it was calculated inside the while loop using only the last iteration's packages.missing, instead of after the loop using the fully accumulated set.

  3. brokendeps included virtual packages — packages already satisfied by bootstrap Provides (e.g., awk provided by mawk) were included in the output, causing apt-get install awk to fail.

Fix

  • plan.py: Accumulate missing |= packages.missing inside the loop. Move all_missing calculation after the loop. Filter brokendeps by all_missing.
  • resolve.py: Add iter_provided() to extract Provides from bootstrap dpkg/status. Pass bootstrap-provided virtual packages to Plan.resolve() so they are excluded from missing.

Testing

Before fix:

$ fab-plan-resolve plan/main --bootstrap=... | grep locales
# (empty - locales silently dropped)

After fix:

$ fab-plan-resolve plan/main --bootstrap=... | grep locales
locales                                                # plan/main

Full appliance build (NetBox 4.5.7) completes successfully with ISO and tar.gz generation.

Impact

This bug affects all appliance builds on TKLDev with fab 1.1rc5. Without this fix, no appliance can build correctly — critical packages are missing from the root filesystem.

Three bugs in Plan.resolve():

1. The 'missing' set was never populated — packages.missing was not
   accumulated across while-loop iterations. Fixed by adding
   'missing |= packages.missing' inside the loop.

2. The 'all_missing' computation was inside the loop but should be
   after it, using the fully accumulated 'missing' set.

3. The 'brokendeps' list iterated over 'missing' without filtering
   by 'all_missing', including virtual packages already satisfied
   by bootstrap Provides.

Also adds iter_provided() to resolve.py to extract virtual package
names (Provides) from the bootstrap dpkg status, so packages like
'awk' (provided by mawk) are correctly recognized as satisfied.

Without this fix, packages like locales, curl, nginx, git, iptables,
fail2ban, postfix, ssh, build-essential, etc. were silently dropped
from root.spec, causing build failures.
@JedMeister
Copy link
Copy Markdown
Member

When I first tried to reproduce the issue you note I could. However it didn't make any sense to me as there were a ton of other packages that were getting pulled in from the common plans - including packages that are only in the base plan. But then I realized that the current 19.x-dev branch plan/base doesn't include locales! 🤦

The couple of commits that added locales (& mawk) to plan/base are/were in turnkeylinux/common#343 (which was part of your turnkeylinux/common#339). So I cherry-picked those out into their own PR: turnkeylinux/common#350. With that branch checked out I could no longer reproduce it... See below.

Regardless, after having a look at your code changes, there's definitely improvements there that are worth including. Although now I can't reproduce the issue that you are experiencing I'd really like to spend some time properly reviewing and testing before merging. And I'm so close to a 19.0 RC I really want to focus on that... Once I have that done I really want to turn to reviewing your new apps so we can publish them too!

FWIW I have a slightly newer version of fab, but the changes are irrelevant to your issue (they are related to a fab-rewind bug and an ISO build issue)...

root@dev-box:/turnkey/public/fab# apt policy fab
fab:
  Installed: 1.1rc5+2+g5972cd4
  Candidate: 1.1rc5+2+g5972cd4
  Version table:
 *** 1.1rc5+2+g5972cd4 500
        500 http://archive.turnkeylinux.org/debian trixie-testing/main amd64 Packages
        500 http://archive.turnkeylinux.org/debian trixie-testing/main arm64 Packages
        500 http://archive.turnkeylinux.org/debian trixie-testing/main all Packages
        100 /var/lib/dpkg/status
root@dev-box:/turnkey/public/fab# git diff --name-only v1.1rc5 5972cd4 # note commit ID from package version
contrib/fab-rewind
share/load_env

Note that without settingFAB_PLAN_INCLUDE_PATH or using -I (like it does by default when building an app with make) it just throws a stracktrace - because it can't find turnkey/base. A nicer error message with a hint would be nice, but that's something of an aside...

fab-plan-resolve :

root@dev-box:/turnkey/fab/products/tkldev# grep -r locales /turnkey/fab/common/plans/
/turnkey/fab/common/plans/turnkey/base:locales
root@dev-box:/turnkey/fab/products/tkldev# FAB_PLAN_INCLUDE_PATH=/turnkey/fab/common/plans fab-plan-resolve plan/main | grep locales
locales                    # plan/main [FORCE_POOL]

And

root@dev-box:/turnkey/fab/products/tkldev# fab-plan-resolve -I/turnkey/fab/common/plans plan/main | grep locales
locales                    # plan/main [FORCE_POOL]

Including --bootstrap=... makes no difference (which is expected because locales definitely isn't in the bootstrap):

root@dev-box:/turnkey/fab/products/tkldev# FAB_PLAN_INCLUDE_PATH=/turnkey/fab/common/plans fab-plan-resolve plan/main --bootstrap=/turnkey/fab/bootstraps/trixie-amd64/ | grep locales 
locales                    # plan/main [FORCE_POOL]

And abusing --bootstrap=... to demonstrate that it is in fact installed included in an appliance (tkldev in this case):

root@dev-box:/turnkey/fab/products/tkldev# make clean && NO_PROXY=true TKL_TESTING=y make root.patched
[...]
root@dev-box:/turnkey/fab/products/tkldev# fab-plan-resolve --bootstrap=build/root.patched | grep locales
locales                    # bootstrap [FORCE_POOL]

@marcos-mendez
Copy link
Copy Markdown
Member Author

Hey Jed, thanks for taking the time to dig into this — and sorry for the confusion.

You're right. I went back and checked my local environment and found that my common repo on 19.x-dev has 14 commits ahead of origin/19.x-dev — including the locales and mawk additions to plan/base from the old #339. Those commits were sitting locally but never made it upstream when #339 was closed.

So when I was debugging the missing locales issue, I had it in my local plan/base and assumed the resolve code was silently dropping it. In reality, on a clean 19.x-dev checkout, locales simply isn't there — exactly what you found.

I think with the excess of local branches and uncommitted work I got confused and something slipped past me. That's on me — I should have double-checked against upstream before opening the PR.

That said, looking at the code itself, I do believe the fixes to missing accumulation and brokendeps filtering are still valid improvements (the missing set is initialized but never written to inside the while loop, so it's always empty). But the immediate symptom I reported was clearly caused by the missing plan entry, not a code bug. Your #350 is the right fix for that.

I also cleaned up a duplicate line I had in the last push (missing |= packages.missing was repeated twice — copy-paste mistake).

No rush on the review — totally understand the focus on the 19.0 RC. Looking forward to working on the new apps once that's out!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants