2026-05-19 Executive Team #100

@joecastiglione

Agenda

  • Telecommute Frequency Model Update and Tests (RSG & Friends)

Notes

Telecommute Frequency Model Refinement — Status Update

Presenter: Joel Freedman (RSG)
Presentation

Background & Motivation

The current ActivitySim telecommute frequency model uses distance to workplace as its accessibility variable. This creates a behavioral inconsistency: improved accessibility may allow workers to travel farther for the same generalized cost, paradoxically increasing predicted telecommute frequency. The goal of this task is to replace distance with a mode choice log sum for more consistent policy response. SANDAG was selected as the estimation data source due to available processed survey data.

JC: This issue was originally flagged by MWCOG, whose model showed no sensitivity of telecommuting to increased parking costs in the core. Higher parking costs were expected to produce more telecommuting, but they had no effect because the impedance term was distance only: it was not a generalized cost or logsum, and it had no separate monetary cost parameter. An additional reason for using SANDAG data is that the MWCOG model was based on it.
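
To illustrate why a distance-only term cannot respond to parking costs while a mode choice logsum can, here is a minimal sketch. The two-mode choice set, the coefficients `B_TIME` and `B_COST`, and all travel times and costs are hypothetical values chosen for illustration; none of them come from the SANDAG or MWCOG models.

```python
import math

# Hypothetical coefficients, for illustration only.
B_TIME = -0.03  # utility per minute of travel time
B_COST = -0.20  # utility per dollar of out-of-pocket cost


def mode_utility(time_min, cost_usd):
    """Linear-in-parameters utility of one commute mode."""
    return B_TIME * time_min + B_COST * cost_usd


def mode_choice_logsum(modes):
    """Log of the summed exponentiated utilities over all available modes."""
    return math.log(sum(math.exp(mode_utility(t, c)) for t, c in modes.values()))


def commute_logsum(parking_cost):
    """Logsum for a fixed home-work pair; only the parking cost varies."""
    modes = {
        "auto": (25.0, 2.0 + parking_cost),  # (minutes, dollars)
        "transit": (45.0, 3.0),
    }
    return mode_choice_logsum(modes)


base = commute_logsum(parking_cost=5.0)
priced = commute_logsum(parking_cost=15.0)
print(f"logsum change after raising parking cost: {priced - base:.3f}")
```

The home-to-work distance is identical in both scenarios, so a distance-based term would be unchanged, while the logsum falls when parking becomes more expensive.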

Data Issues

The estimation dataset combines SANDAG multi-day household travel surveys from 2016 and 2022. A significant anomaly was identified: the 2016 subsample appears severely over-represented (~33,000 workers), likely because the unique-household deduplication logic is not functioning correctly for the 2016 wave. All results shared should be treated as preliminary pending resolution of this issue.
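
A quick diagnostic for this kind of over-representation is to compare raw row counts with unique worker counts per survey wave. The sketch below assumes one record per person-day with hypothetical field names `wave`, `household_id`, and `person_id`; the actual layout of the SANDAG estimation bundle may differ.

```python
from collections import Counter

# Hypothetical person-day records from a multi-day survey.
sample_records = [
    {"wave": 2016, "household_id": 1, "person_id": 1, "day": 1},
    {"wave": 2016, "household_id": 1, "person_id": 1, "day": 2},
    {"wave": 2016, "household_id": 1, "person_id": 2, "day": 1},
    {"wave": 2022, "household_id": 1, "person_id": 1, "day": 1},
    {"wave": 2022, "household_id": 1, "person_id": 1, "day": 2},
]


def raw_rows_per_wave(records):
    """Count every record, including repeated days for the same worker."""
    return Counter(rec["wave"] for rec in records)


def workers_per_wave(records):
    """Count each (wave, household, person) combination once."""
    seen = set()
    counts = Counter()
    for rec in records:
        key = (rec["wave"], rec["household_id"], rec["person_id"])
        if key not in seen:
            seen.add(key)
            counts[rec["wave"]] += 1
    return counts


print("rows:   ", dict(raw_rows_per_wave(sample_records)))
print("workers:", dict(workers_per_wave(sample_records)))
```

Including the wave in the key matters if household identifiers are reused across survey years. A large gap between the two counts for one wave only would be consistent with deduplication failing for that wave.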

Preliminary Estimation Results

Initial runs replacing distance with the mode choice log sum produced a sign error: the coefficient is positive, implying more accessible workers are more likely to telecommute — the opposite of the intended direction. Possible causes:

  • Over-representation of 2016 households biasing estimation
  • Errors in ActivitySim's log sum calculations
  • A genuine regional data pattern (e.g., high-accessibility downtown workers who also telecommute frequently)

Discussion Highlights

  • Return-to-office dynamics (Guy Rousseau): The model handles temporal variation via year-specific constants (2016 vs. 2022). Agencies can apply a scalar (0–1) to blend pre- and post-pandemic conditions. Future survey waves would allow finer-grained year constants.
  • Weighted estimation (Amir Samimi): Joel noted weighting by survey year biases standard errors; year dummy variables are preferred.
  • Industry vs. occupation (Shaun Tabone / Paris Brunton): The 2016 data includes industry but not occupation. Paris noted that in a parallel regression, work type dominated over distance, with location dropping out as non-significant once industry/occupation was controlled.
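
One way the 0–1 scalar mentioned in the return-to-office discussion could work is as a linear interpolation between the two year-specific constants. This is an assumption about the mechanism, not a description of the estimated model; the function name and the constant values below are hypothetical.

```python
def blended_year_constant(const_2016, const_2022, post_pandemic_share):
    """Interpolate between pre- and post-pandemic year constants.

    post_pandemic_share = 0 reproduces the 2016 constant,
    post_pandemic_share = 1 reproduces the 2022 constant.
    """
    if not 0.0 <= post_pandemic_share <= 1.0:
        raise ValueError("post_pandemic_share must be between 0 and 1")
    return (1.0 - post_pandemic_share) * const_2016 + post_pandemic_share * const_2022


# Hypothetical constants: 2016 as the base year, 2022 with a positive shift.
for share in (0.0, 0.5, 1.0):
    print(share, blended_year_constant(0.0, 1.2, share))
```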

Next Steps

  1. Fix the 2016 household deduplication issue in the SANDAG estimation bundle.
  2. Re-estimate and check whether the log sum sign corrects.
  3. If not, test individual time/cost variables (transit IVT, parking cost, transit fare) as alternatives to the composite log sum.
  4. If still inconclusive, explore estimation using another region's data (e.g., MWCOG), subject to budget.
  5. RSG to distribute the auto-calibration design document (same day).
  6. Follow-up update in ~3 weeks.

Poisson Sampling & Disaggregate Accessibilities — Bias Issue

Raised by: Jan Zill (Outer Loop)

The Poisson sampling integration for the explicit error term (EET) branch is complete and ready for review. Testing uncovered a systematic bias in disaggregate accessibilities on the current main branch under importance sampling with replacement.

The root cause: McFadden's sampling correction factor is specified in the YAML as a constant shift, which drops out of location choice utilities but carries into log sum calculations. Under importance sampling with replacement, this is misapplied, introducing a bias equal to log(sample size), taken as the natural log. For SANDAG (~100 samples) and MTC Extended (~725 samples), this equates to roughly 4.6 and 6.6 utility units respectively, which is one-third to one-half the mean disaggregate accessibility value.
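
The size of this shift can be reproduced in a toy setting. The sketch below is not ActivitySim code: it assumes a correction term of the form ln(pick count / sampling probability) added to each sampled alternative's utility, draws alternatives with replacement, and compares the resulting logsum with the exact logsum over all alternatives. The utilities, sampling probabilities, and function names are hypothetical.

```python
import math
import random


def true_logsum(utils):
    """Exact logsum over the full set of alternatives."""
    return math.log(sum(math.exp(v) for v in utils))


def sampled_logsum(utils, probs, n_draws, rng):
    """Logsum over a sample drawn with replacement, with ln(count / prob) added."""
    draws = rng.choices(range(len(utils)), weights=probs, k=n_draws)
    counts = {}
    for j in draws:
        counts[j] = counts.get(j, 0) + 1
    return math.log(
        sum(math.exp(utils[j] + math.log(c / probs[j])) for j, c in counts.items())
    )


def mean_bias(n_draws, n_alts=50, reps=200, seed=0):
    """Average difference between sampled and exact logsum over many replications."""
    rng = random.Random(seed)
    utils = [-2.0 * rng.random() for _ in range(n_alts)]
    raw = [math.exp(0.5 * v) for v in utils]  # imperfect importance weights
    total = sum(raw)
    probs = [r / total for r in raw]
    exact = true_logsum(utils)
    diffs = [sampled_logsum(utils, probs, n_draws, rng) - exact for _ in range(reps)]
    return sum(diffs) / reps


for n in (100, 725):
    print(f"n={n}: simulated bias {mean_bias(n):.2f}, ln(n) {math.log(n):.2f}")
```

In this toy setup the corrected sum has expectation n times the true sum of exponentiated utilities, so the logsum is shifted up by approximately ln(n) regardless of the utilities themselves.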

Key implications:

  • Affects SANDAG and MTC Extended in production (where disaggregate accessibilities are used); does not affect standard test models.
  • Switching to Poisson sampling will not produce equivalent results under the current spec.
  • Existing models estimated on biased accessibilities would be inconsistent if the code is corrected without re-estimation.

Path forward: Jan will prepare slides covering the bias, numerical magnitude, and resolution options (code fix vs. YAML workaround) for discussion at the Thursday/Friday engineering meeting. Jeff Newman framed the approach: establish the ideal fix without legacy constraints, then assess backward compatibility for agencies with production models built on the existing behavior.


Action Items

Item | Owner | Timeline
Fix 2016 household deduplication in SANDAG estimation bundle | RSG | ~2–3 weeks
Re-estimate telecommute frequency model with corrected data | RSG | Following data fix
Distribute auto-calibration design document | RSG | EOD 5/19
Schedule follow-up telecommute model update | Alex Bettinardi | ~3 weeks
Prepare slides on Poisson sampling bias & resolution options | Jan Zill | By Thu engineering meeting
Discuss bias resolution path | Engineering group | Thu/Fri engineering meeting
