⚠️ DISCLAIMER: This project is for authorized security research and defensive auditing only. It is intended to help owners identify and protect their own exposed secrets. Do not use it illegally or against systems you do not own or are not explicitly permitted to assess.
The project authors are not responsible for any consequences resulting from misuse, unauthorized scanning, or other unlawful activity.
GitSteal is a small Python utility that searches GitHub for repositories matching one or more keywords and then runs TruffleHog against the discovered repository URLs. It can help identify exposed API keys in authorized environments.
- Search GitHub commits by keyword using the GitHub Search API
- Deduplicate repository URLs before scanning
- Run TruffleHog in parallel with configurable concurrency
- Persist scanned repositories to avoid rescanning the same URLs
- Save verified findings to JSON for later review
- Prune the scan history automatically when it grows large
- Python 3.14 or newer
- A GitHub personal access token with access to the search API
- TruffleHog installed and available on your PATH
- Install dependencies with uv:

      uv sync

- Create a .env file in the project root and add your GitHub token:

      GITHUB_API_KEY=your_token_here

- Add search keywords to key_words.txt, one per line.
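
For reference, here is a minimal sketch of how a token stored as described above could be read at startup. It is not taken from main.py: the helper name `load_github_token` and the fallback order (process environment first, then the .env file) are assumptions made for illustration.

```python
import os
from pathlib import Path


def load_github_token(env_path: str = ".env") -> str:
    """Return GITHUB_API_KEY from the environment, falling back to a .env file.

    Hypothetical helper: the real project may load the token differently.
    """
    token = os.environ.get("GITHUB_API_KEY")
    if token:
        return token

    path = Path(env_path)
    if path.exists():
        for raw in path.read_text(encoding="utf-8").splitlines():
            line = raw.strip()
            # Skip blanks, comments, and lines that are not KEY=VALUE pairs.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            if key.strip() == "GITHUB_API_KEY":
                # Tolerate optional surrounding quotes around the value.
                return value.strip().strip("'\"")

    raise RuntimeError("GITHUB_API_KEY is not set; add it to .env or the environment")
```

Failing early with a clear error keeps a missing token from surfacing later as an opaque authentication failure.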
Run the scanner from the project root:

    uv run main.py

If you are using the virtual environment directly, activate it first and then run:

    python main.py

- --start-page: first GitHub search page to fetch, default 1
- --last-page: last GitHub search page to fetch, default 3
- --concurrency: number of parallel TruffleHog scans, default 10
- --keywords: path to the keyword file, default key_words.txt
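
The options above map naturally onto argparse. The following is a sketch only, not the code in main.py: the flag names and defaults come from the list above, while the function names and the range checks are assumptions added for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the four documented options and their defaults."""
    parser = argparse.ArgumentParser(
        description="Search GitHub by keyword and scan the results with TruffleHog"
    )
    parser.add_argument("--start-page", type=int, default=1,
                        help="first GitHub search page to fetch")
    parser.add_argument("--last-page", type=int, default=3,
                        help="last GitHub search page to fetch")
    parser.add_argument("--concurrency", type=int, default=10,
                        help="number of parallel TruffleHog scans")
    parser.add_argument("--keywords", default="key_words.txt",
                        help="path to the keyword file")
    return parser


def parse_args(argv=None) -> argparse.Namespace:
    """Parse arguments and reject obviously invalid ranges (assumed checks)."""
    args = build_parser().parse_args(argv)
    if args.start_page < 1 or args.last_page < args.start_page:
        raise SystemExit("--start-page must be >= 1 and --last-page must be >= --start-page")
    if args.concurrency < 1:
        raise SystemExit("--concurrency must be >= 1")
    return args
```

argparse converts the dashes in flag names to underscores, so `--start-page` is read back as `args.start_page`.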
Example:

    uv run main.py --start-page 1 --last-page 5 --concurrency 6 --keywords key_words.txt

The keyword file supports one keyword per line. Empty lines are ignored, and lines starting with # are treated as comments.
Example:
OPENAI_API_KEY
GEMINI_API_KEY
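
The keyword file rules described above (one keyword per line, blank lines skipped, # lines treated as comments) can be sketched as follows. The function name `load_keywords` is hypothetical and the actual parsing in main.py may differ in detail.

```python
from pathlib import Path


def load_keywords(path: str = "key_words.txt") -> list[str]:
    """Read keywords, skipping blank lines and lines that start with '#'."""
    keywords = []
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        keywords.append(line)
    return keywords
```

Note that this sketch only treats whole lines as comments; a # appearing after a keyword on the same line is kept as part of the keyword.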
The scanner writes local state to the project root:
- scanned_urls.txt: repository URLs that have already been processed
- trufflehog_results.json: verified findings returned by TruffleHog
These files are ignored by Git so local scan state does not get committed.
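
As an illustration of how the scan history could work, the sketch below loads scanned_urls.txt, filters out already-seen URLs, and prunes the oldest entries on save. The function names and the `MAX_HISTORY` threshold are assumptions; the limit main.py actually uses is not stated here.

```python
from pathlib import Path

MAX_HISTORY = 50_000  # hypothetical cap; the project's real threshold may differ


def load_history(path: str = "scanned_urls.txt") -> list[str]:
    """Return previously scanned URLs in file order (oldest first)."""
    file = Path(path)
    if not file.exists():
        return []
    return [line.strip() for line in file.read_text(encoding="utf-8").splitlines() if line.strip()]


def filter_new(urls: list[str], history: list[str]) -> list[str]:
    """Drop duplicates and URLs already in the history, preserving order."""
    seen = set(history)
    fresh = []
    for url in urls:
        if url not in seen:
            seen.add(url)
            fresh.append(url)
    return fresh


def save_history(path: str, history: list[str], scanned: list[str],
                 max_entries: int = MAX_HISTORY) -> list[str]:
    """Append newly scanned URLs and keep only the most recent max_entries."""
    combined = history + scanned
    pruned = combined[-max_entries:] if max_entries > 0 else []
    Path(path).write_text("\n".join(pruned) + ("\n" if pruned else ""), encoding="utf-8")
    return pruned
```

Pruning from the front means the oldest URLs become eligible for rescanning first, which is one reasonable trade-off between file size and repeated work.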
- Load keywords from key_words.txt
- Query GitHub for repositories related to each keyword
- Remove repositories that have already been scanned
- Run TruffleHog against each new repository URL
- Save verified findings and update scan history
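
The scanning step in the workflow above can be sketched with asyncio and a semaphore that caps the number of concurrent TruffleHog processes. This is an assumption about structure, not the implementation in main.py. The names `run_trufflehog` and `scan_all` are hypothetical, and the `trufflehog git <url> --only-verified --json` invocation is assumed to match your installed TruffleHog version, so check `trufflehog --help` before relying on it.

```python
import asyncio
import json


async def run_trufflehog(url: str) -> list[dict]:
    """Scan one repository URL and parse one JSON finding per output line."""
    proc = await asyncio.create_subprocess_exec(
        "trufflehog", "git", url, "--only-verified", "--json",  # assumed flags
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.DEVNULL,
    )
    stdout, _ = await proc.communicate()
    findings = []
    for line in stdout.decode("utf-8", errors="replace").splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue  # ignore anything that is not a JSON object
        try:
            findings.append(json.loads(line))
        except json.JSONDecodeError:
            continue
    return findings


async def scan_all(urls, concurrency: int = 10, scan=run_trufflehog) -> dict:
    """Run scan() over all URLs with at most `concurrency` scans in flight."""
    semaphore = asyncio.Semaphore(concurrency)

    async def bounded(url):
        async with semaphore:
            return url, await scan(url)

    pairs = await asyncio.gather(*(bounded(url) for url in urls))
    # Keep only repositories that produced at least one finding.
    return {url: findings for url, findings in pairs if findings}
```

Passing the scan function as a parameter makes it possible to exercise the concurrency logic with a stub, without TruffleHog installed. A call such as `asyncio.run(scan_all(urls, concurrency=6))` would return a dictionary that can be written to trufflehog_results.json with `json.dump`.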
Use this tool only on repositories and accounts you own or are explicitly authorized to assess. Do not use it to collect secrets from third-party systems or repositories without permission.
For additional legal and operational details, see DISCLAIMER.md.
- main.py: scanner implementation and CLI entry point
- key_words.txt: sample keyword list
- DISCLAIMER.md: usage and legal notice
- LICENSE: project license text
- pyproject.toml: project metadata and dependencies
This project is licensed under the terms of the LICENSE file.
- The code currently uses GitHub’s search API and may be affected by rate limits.
- TruffleHog must be installed separately; this project only orchestrates it.
- Results depend on the keywords you provide and the repositories returned by GitHub search.
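
To make the rate-limit note concrete, here is a hedged sketch of a commit search request that reads GitHub's `x-ratelimit-remaining` and `x-ratelimit-reset` response headers. It uses only the standard library, and every function name in it is hypothetical; main.py may use a different HTTP client, different query parameters, or a different backoff strategy.

```python
import json
import time
import urllib.parse
import urllib.request

SEARCH_URL = "https://api.github.com/search/commits"


def build_search_request(keyword: str, page: int, token: str,
                         per_page: int = 100) -> urllib.request.Request:
    """Build an authenticated commit-search request for one result page."""
    query = urllib.parse.urlencode({"q": keyword, "page": page, "per_page": per_page})
    return urllib.request.Request(
        f"{SEARCH_URL}?{query}",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )


def extract_repo_urls(payload: dict) -> list[str]:
    """Collect unique repository URLs from a search response, in order."""
    urls = []
    for item in payload.get("items", []):
        url = (item.get("repository") or {}).get("html_url")
        if url and url not in urls:
            urls.append(url)
    return urls


def seconds_until_reset(headers, now=None) -> float:
    """Return how long to wait if the rate-limit budget is exhausted, else 0."""
    remaining = headers.get("x-ratelimit-remaining")
    reset = headers.get("x-ratelimit-reset")
    if remaining is None or reset is None or int(remaining) > 0:
        return 0.0
    now = time.time() if now is None else now
    return max(0.0, float(reset) - now)


def search_page(keyword: str, page: int, token: str) -> list[str]:
    """Fetch one page and pause before returning if the budget hit zero."""
    request = build_search_request(keyword, page, token)
    with urllib.request.urlopen(request, timeout=30) as response:
        wait = seconds_until_reset(response.headers)
        payload = json.load(response)
    if wait:
        time.sleep(wait)
    return extract_repo_urls(payload)
```

This sketch only pauses proactively when the remaining budget reaches zero. If GitHub rejects a request outright, `urlopen` raises `urllib.error.HTTPError`, which a fuller implementation would need to catch and retry.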