A multi-source data analysis project that explores trends in the U.S. tech job market using datasets from LinkedIn, Glassdoor, and other job boards.
This project analyzes demand for roles, skills, and locations over time to uncover actionable insights for job seekers, hiring managers, and labor analysts.
Between 2020 and 2024, the job market for tech professionals evolved significantly due to the pandemic, the rise of remote work, and economic shifts. This project aggregates and analyzes multiple job posting datasets to identify:
- In-demand job titles and skills over time
- Geographic trends in tech hiring (remote vs on-site)
- Salary distributions and growth in different states
The project is being expanded into an interactive Streamlit dashboard so that students can explore these trends dynamically.
- Multi-source Dataset Integration: Combines job data from LinkedIn, Glassdoor, and other job boards.
- Data Cleaning & Preprocessing: Standardizes columns, cleans text, handles missing values, and merges across schemas.
- Exploratory Data Analysis: Insightful charts and tables generated with Matplotlib, Seaborn, and Pandas.
- Location-based Insights: Visualizes how job demand varies across states and regions.
- Title & Skill Analysis: Frequency tracking of roles like Data Scientist, Software Engineer, ML Engineer, and in-demand skills (e.g., Python, SQL, AWS).
- Future NLP Pipeline (In Progress): Applying keyword extraction and topic modeling to analyze trends in job descriptions.
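
The cleaning, schema-merging, and skill-frequency steps described above could look roughly like the following sketch. It is illustrative only and is not the project's actual pipeline: the sample rows, the source column names, the shared target schema (`title`, `location`, `posted_date`, `description`), and the `standardize` helper are all assumptions made up for this example.

```python
import pandas as pd

# Hypothetical raw extracts; column names and rows are invented for illustration.
linkedin = pd.DataFrame({
    "Job Title": ["Data Scientist ", "software engineer"],
    "Location": ["Austin, TX", "Remote"],
    "Posted": ["2023-01-15", "2024-03-02"],
    "Description": ["Python, SQL and AWS required", "Java or Python; AWS a plus"],
})
glassdoor = pd.DataFrame({
    "title": ["ML Engineer", "Data Scientist"],
    "job_location": ["Seattle, WA", None],
    "date_posted": ["2023-06-30", "2024-01-10"],
    "job_description": ["PyTorch, Python, SQL", "SQL and Tableau"],
})

# Assumed mapping from each source's columns onto one shared schema.
COLUMN_MAPS = {
    "linkedin": {"Job Title": "title", "Location": "location",
                 "Posted": "posted_date", "Description": "description"},
    "glassdoor": {"job_location": "location", "date_posted": "posted_date",
                  "job_description": "description"},
}
SHARED_COLUMNS = ["source", "title", "location", "posted_date", "description"]


def standardize(df: pd.DataFrame, source: str) -> pd.DataFrame:
    """Rename columns to the shared schema and apply basic cleaning."""
    out = df.rename(columns=COLUMN_MAPS[source]).copy()
    out["source"] = source
    # Trim whitespace and lowercase titles so variants of a role group together.
    out["title"] = out["title"].str.strip().str.lower()
    # Keep rows with a missing location, but mark them explicitly.
    out["location"] = out["location"].fillna("Unknown")
    out["posted_date"] = pd.to_datetime(out["posted_date"])
    return out[SHARED_COLUMNS]


# Stack the standardized sources into a single table.
jobs = pd.concat(
    [standardize(linkedin, "linkedin"), standardize(glassdoor, "glassdoor")],
    ignore_index=True,
)

# Count postings per year that mention each skill as a whole word.
SKILLS = ["python", "sql", "aws"]
descriptions = jobs["description"].str.lower()
skill_flags = pd.DataFrame(
    {skill: descriptions.str.contains(rf"\b{skill}\b", regex=True) for skill in SKILLS}
)
skill_counts = skill_flags.groupby(jobs["posted_date"].dt.year).sum()

print(jobs)
print(skill_counts)
```

Keeping the per-source column mappings in one dictionary means a new job board can be added by supplying its mapping rather than by editing the cleaning logic. Whole-word matching is a deliberately simple stand-in for the keyword extraction planned in the NLP pipeline.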
| Tool / Library | Purpose |
|---|---|
| Python 3.11+ | Primary programming language |
| Pandas / NumPy | Data manipulation and cleaning |
| Matplotlib / Seaborn | Data visualization |
| Jupyter Notebooks | Exploratory analysis |
| Streamlit (WIP) | Dashboard app interface |
| Natural Language Toolkit | NLP and clustering for job descriptions |
| Beautiful Soup | Web scraping |
| Selenium | Web scraping for websites with dynamic pages |
Planned enhancements to this project include:
- Deploying the dashboard via Streamlit Cloud
- Using Tableau or Power BI for visualization