itguyeric/podcasting


Podcasting Automation Toolkit

This repository contains scripts and supporting files to automate the entire podcasting workflow, from download to final publication assets.
It is designed to replace fragmented manual steps (e.g., transcription services like AccurateScribe.ai, ad-hoc file management) with a fully local, reproducible process you can run on macOS or Linux.

Features

  • Batch download from YouTube and other platforms that yt-dlp supports, reading the URLs from urls.txt.
  • Audio normalization with ffmpeg.
  • Automated transcription with WhisperX including:
    • Speaker diarization
    • JSON output
    • Paragraph timestamps
  • DOCX transcript generation with optional episode metadata mapping via mapping.csv.
  • Organized asset management using type-based folder organization for audio, metadata, transcripts, and intermediate WhisperX JSON.
  • Extensible workflow: room to add episode planning, audio cleanup (Auphonic), and publishing automation.
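
As a rough illustration of how the download, normalization, and transcription features could chain together, here is a minimal sketch. It is not the contents of transcribe_urls_whisperx.sh. The function names `read_urls` and `plan_steps`, the `episode` base name, the WAV intermediate format, and skipping `#` comment lines in urls.txt are all assumptions made for the example. The commands are only printed, not executed.

```shell
# Sketch only: the real transcribe_urls_whisperx.sh may work differently.

# Print usable URLs from a list file, skipping blank lines and '#' comments.
# (Comment support in urls.txt is an assumption, not documented behavior.)
read_urls() {
  grep -Ev '^[[:space:]]*(#|$)' "$1"
}

# Print the three commands for one URL without running them (dry run).
# The second argument is a hypothetical base name for intermediate files.
plan_steps() {
  local url="$1" base="${2:-episode}"
  # Download and extract audio, keeping yt-dlp's metadata JSON alongside it.
  echo "yt-dlp -x --audio-format wav --write-info-json -o downloads/${base}.%(ext)s ${url}"
  # Loudness-normalize and downmix to 16 kHz mono.
  echo "ffmpeg -i downloads/${base}.wav -af loudnorm -ar 16000 -ac 1 audio/${base}.wav"
  # Transcribe with speaker diarization and JSON output.
  # Diarization in WhisperX also needs a Hugging Face token (--hf_token).
  echo "whisperx audio/${base}.wav --diarize --output_format json --output_dir whisper"
}

# Usage:
#   read_urls urls.txt | while read -r url; do plan_steps "$url" ITG020; done
```

Printing the commands first makes it easy to review what would run for each URL before spending time on downloads and transcription.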

Folder Structure

podcasting/
├── audio/          # Normalized audio files
├── downloads/      # Raw downloads
├── info/           # yt-dlp metadata
├── tx_docx/        # Final transcripts
├── whisper/        # WhisperX JSON output
├── mapping.csv     # Optional metadata mapping file
├── urls.txt        # Video URLs to process
└── transcribe_urls_whisperx.sh

The repository organizes files by type rather than by episode folder. .gitkeep files preserve folder structure, while .gitignore excludes generated media and other output to keep the repository clean.
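
To make the .gitkeep and .gitignore arrangement concrete, the sketch below recreates the type-based folders and prints ignore patterns that would keep generated files out of version control while preserving the empty folders. The helper names `scaffold_dirs` and `print_gitignore` are hypothetical, and the patterns are an assumption about how such a .gitignore is commonly written, not a copy of the one in this repository.

```shell
# Create the type-based folders under a root directory,
# with a .gitkeep in each so Git tracks the empty folder.
scaffold_dirs() {
  local root="${1:-.}" d
  for d in audio downloads info tx_docx whisper; do
    mkdir -p "${root}/${d}"
    touch "${root}/${d}/.gitkeep"
  done
}

# Print .gitignore patterns: ignore everything in each folder
# except the .gitkeep placeholder. (Assumed patterns, for illustration.)
print_gitignore() {
  local d
  for d in audio downloads info tx_docx whisper; do
    printf '%s/*\n!%s/.gitkeep\n' "$d" "$d"
  done
}

# Usage:
#   scaffold_dirs .
#   print_gitignore >> .gitignore
```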

Prerequisites

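Tested on macOS and Linux. Required tools: yt-dlp, ffmpeg, and WhisperX (installed into a Python 3 virtual environment).

Example (Homebrew shown; on Linux, install yt-dlp and ffmpeg with your distribution's package manager):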

brew install yt-dlp ffmpeg
python3 -m venv ~/.venvs/whisperx
source ~/.venvs/whisperx/bin/activate
pip install git+https://github.com/m-bain/whisperX.git
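
Before a long batch run it can help to confirm the tools are on the PATH. The following is a small sketch of such a check. `check_tools` is a hypothetical helper and is not part of this repository.

```shell
# Report any missing command on stderr; return non-zero if at least one is absent.
check_tools() {
  local missing=0 tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (with the whisperx virtual environment activated):
#   check_tools yt-dlp ffmpeg whisperx || exit 1
```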

Usage

./transcribe_urls_whisperx.sh -n ITG020
./transcribe_urls_whisperx.sh --language en

Mapping Metadata

mapping.csv example:

video_id,episode_id,title,guest,date
abc123,ITG020,Episode Title,Guest,2024-08-11
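
One way to read a field from this file by video ID is a short awk lookup, sketched below. `lookup_mapping` is a hypothetical helper, not a function from the repository script, and it assumes a header row and no quoted commas inside fields.

```shell
# Print one column of the mapping.csv row whose video_id (first column) matches.
# Assumes a header row and simple comma-separated fields without quoting.
lookup_mapping() {
  local csv="$1" video_id="$2" column="$3"
  awk -F',' -v id="$video_id" -v col="$column" '
    # Header row: find the index of the requested column name.
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == col) c = i; next }
    # Data rows: print the requested field for the matching video_id.
    c && $1 == id { print $c; exit }
  ' "$csv"
}

# Usage with the example row above:
#   lookup_mapping mapping.csv abc123 episode_id   # prints ITG020
```

Titles or guest names that contain commas would need a real CSV parser rather than this field split.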

Output Example

runs/2025-08-11_123456-ITG020/
├── audio/
├── info/
├── whisper/
├── tx_docx/
└── downloads/
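
A run directory with this shape could be created as sketched below. This is a guess at the naming scheme based only on the example path: it assumes the middle part is a date plus an HHMMSS time and that the suffix is the run name. `make_run_dir` is a hypothetical helper, and the actual script may build the name differently.

```shell
# Create runs/<date>_<time>-<name>/ with the per-type subfolders and print its path.
# Assumption: the timestamp format is YYYY-MM-DD_HHMMSS, inferred from the example.
make_run_dir() {
  local name="$1" root="${2:-runs}" stamp dir d
  stamp="$(date +%Y-%m-%d_%H%M%S)"
  dir="${root}/${stamp}-${name}"
  for d in audio info whisper tx_docx downloads; do
    mkdir -p "${dir}/${d}"
  done
  echo "$dir"
}

# Usage:
#   run_dir="$(make_run_dir ITG020)"
```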

License

This project is licensed under the GNU General Public License v3.0. See the LICENSE file for details.
