Agenda
- Telecommute Frequency Model Update and Tests (RSG & Friends)
Notes
Telecommute Frequency Model Refinement — Status Update
Presenter: Joel Freedman (RSG)
Presentation
Background & Motivation
The current ActivitySim telecommute frequency model uses distance to workplace as its accessibility variable. This creates a behavioral inconsistency: improved accessibility may allow workers to travel farther for the same generalized cost, paradoxically increasing predicted telecommute frequency. The goal of this task is to replace distance with a mode choice log sum for more consistent policy response. SANDAG was selected as the estimation data source due to available processed survey data.
JC: Note that this issue was originally flagged by MWCOG, whose model showed no sensitivity of telecommuting to increasing parking costs in the core. Higher parking costs were expected to produce more telecommuting but had no effect, because the impedance term was distance only: it was not a generalized cost or logsum, and it had no separate monetary cost parameter. An additional reason for using SANDAG data is that the MWCOG model was based on it.
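The point about parking cost can be illustrated with a small sketch. It assumes a two-mode commute with made-up coefficients, times, and costs (none of these values come from the SANDAG or MWCOG models); it only shows that a mode choice logsum moves when parking cost changes, while a distance term cannot.

```python
import math


def mode_choice_logsum(utilities):
    """Logsum across mode utilities: ln(sum(exp(V_m)))."""
    return math.log(sum(math.exp(v) for v in utilities.values()))


def commute_utilities(parking_cost):
    """Hypothetical two-mode commute; all coefficients are illustrative only."""
    time_coef = -0.03   # per minute (assumed)
    cost_coef = -0.20   # per dollar (assumed)
    return {
        "drive": time_coef * 25 + cost_coef * parking_cost,
        "transit": time_coef * 40 + cost_coef * 2.50,
    }


distance_miles = 10.0  # unchanged by parking policy
logsum_low_parking = mode_choice_logsum(commute_utilities(parking_cost=5.0))
logsum_high_parking = mode_choice_logsum(commute_utilities(parking_cost=20.0))

print("distance:", distance_miles, "(same in both scenarios)")
print("logsum, $5 parking: ", round(logsum_low_parking, 3))
print("logsum, $20 parking:", round(logsum_high_parking, 3))
```

With a negative cost coefficient, the logsum falls as parking cost rises, so a telecommute model specified on the logsum has something to respond to.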
Data Issues
The estimation dataset combines SANDAG multi-day household travel surveys from 2016 and 2022. A significant anomaly was identified: the 2016 subsample appears severely over-represented (~33,000 workers), likely because the unique-household deduplication logic is not functioning correctly for the 2016 wave. All results shared should be treated as preliminary pending resolution of this issue.
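A quick diagnostic for this kind of over-representation is to compare raw and de-duplicated worker counts by survey wave. The sketch below uses hypothetical records and assumes, purely for illustration, that the duplication comes from repeated person-day rows; the actual cause in the SANDAG bundle has not been confirmed.

```python
from collections import Counter

# Hypothetical worker records: (survey_year, household_id, person_id, travel_day).
records = [
    (2016, "H1", 1, 1), (2016, "H1", 1, 2), (2016, "H1", 1, 3),
    (2016, "H2", 1, 1), (2016, "H2", 1, 2),
    (2022, "H3", 1, 1), (2022, "H3", 2, 1),
]


def workers_per_wave(rows, dedupe):
    """Count worker rows per survey year, optionally keeping one row per person."""
    if dedupe:
        rows = {(year, hh, person) for year, hh, person, _day in rows}
    return Counter(row[0] for row in rows)


raw = workers_per_wave(records, dedupe=False)
unique = workers_per_wave(records, dedupe=True)
for year in sorted(raw):
    # A ratio well above 1 for one wave only points at wave-specific duplication.
    print(year, raw[year], unique[year], round(raw[year] / unique[year], 2))
```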
Preliminary Estimation Results
Initial runs replacing distance with the mode choice log sum produced a sign error: the coefficient is positive, implying more accessible workers are more likely to telecommute — the opposite of the intended direction. Possible causes:
- Over-representation of 2016 households biasing estimation
- Errors in ActivitySim's log sum calculations
- A genuine regional data pattern (e.g., high-accessibility downtown workers who also telecommute frequently)
Discussion Highlights
- Return-to-office dynamics (Guy Rousseau): The model handles temporal variation via year-specific constants (2016 vs. 2022). Agencies can apply a scalar (0–1) to blend pre- and post-pandemic conditions. Future survey waves would allow finer-grained year constants.
- Weighted estimation (Amir Samimi): Joel noted weighting by survey year biases standard errors; year dummy variables are preferred.
- Industry vs. occupation (Shaun Tabone / Paris Brunton): The 2016 data includes industry but not occupation. Paris noted that in a parallel regression, work type dominated over distance, with location dropping out as non-significant once industry/occupation was controlled.
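One possible reading of the 0–1 scalar mentioned under return-to-office dynamics is a multiplier on the 2022 year constants, with 2016 as the base. The sketch below follows that reading; the alternative names, base utilities, and constants are hypothetical and are not taken from the estimated model.

```python
import math

# Hypothetical utilities for the 2016 base year and 2022 year-specific constants.
BASE_UTILITIES = {"no_telecommute": 0.0, "1_day": -2.0, "2_3_days": -2.5, "4_days": -3.0}
YEAR_2022_CONSTANTS = {"no_telecommute": 0.0, "1_day": 0.8, "2_3_days": 1.2, "4_days": 1.0}


def telecommute_shares(blend):
    """Logit shares with the 2022 constants scaled by blend (0 = 2016, 1 = 2022)."""
    if not 0.0 <= blend <= 1.0:
        raise ValueError("blend must be between 0 and 1")
    utilities = {
        alt: BASE_UTILITIES[alt] + blend * YEAR_2022_CONSTANTS[alt]
        for alt in BASE_UTILITIES
    }
    denom = sum(math.exp(u) for u in utilities.values())
    return {alt: math.exp(u) / denom for alt, u in utilities.items()}


for blend in (0.0, 0.5, 1.0):
    shares = telecommute_shares(blend)
    print(blend, {alt: round(s, 3) for alt, s in shares.items()})
```

Under these assumed values, the no-telecommute share declines as the blend moves from 0 toward 1.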
Next Steps
- Fix the 2016 household deduplication issue in the SANDAG estimation bundle.
- Re-estimate and check whether the log sum sign corrects.
- If not, test individual time/cost variables (transit IVT, parking cost, transit fare) as alternatives to the composite log sum.
- If still inconclusive, explore estimation using another region's data (e.g., MWCOG), subject to budget.
- RSG to distribute the auto-calibration design document (same day).
- Follow-up update in ~3 weeks.
Poisson Sampling & Disaggregate Accessibilities — Bias Issue
Raised by: Jan Zill (Outer Loop)
The Poisson sampling integration for the explicit error term (EET) branch is complete and ready for review. Testing uncovered a systematic bias in disaggregate accessibilities on the current main branch under importance sampling with replacement.
The root cause: McFadden's sampling correction factor is specified in the YAML as a constant shift, which drops out of location choice utilities but carries into log sum calculations. Under importance sampling with replacement, this is misapplied, introducing a bias equal to ln(sample size). For SANDAG (~100 samples) and MTC Extended (~725 samples), this equates to roughly 4.6 and 6.6 utility units respectively, or one-third to one-half the mean disaggregate accessibility value.
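The magnitude can be reproduced with a small numerical sketch. It assumes the correction takes the form ln(pick_count / prob) and that including the sample size n, as ln(pick_count / (n * prob)), removes the shift. This is an illustration of the mechanism described above, not the ActivitySim code or a proposed fix. The utilities are hypothetical, and the sampling probabilities are idealized (proportional to exp(utility)) so that the corrected estimate matches the full logsum exactly.

```python
import math
import random
from collections import Counter


def true_logsum(utilities):
    """Logsum over the full choice set."""
    return math.log(sum(math.exp(v) for v in utilities))


def sampled_logsum(utilities, probs, picks, include_sample_size):
    """Logsum over alternatives sampled with replacement.

    Each sampled alternative gets a correction ln(pick_count / prob); when
    include_sample_size is True it is ln(pick_count / (n * prob)) instead.
    """
    n = len(picks)
    total = 0.0
    for alt, pick_count in Counter(picks).items():
        correction = math.log(pick_count / probs[alt])
        if include_sample_size:
            correction -= math.log(n)
        total += math.exp(utilities[alt] + correction)
    return math.log(total)


rng = random.Random(0)
utilities = [rng.uniform(-3.0, 0.0) for _ in range(2000)]  # hypothetical zone utilities
denom = sum(math.exp(v) for v in utilities)
probs = [math.exp(v) / denom for v in utilities]  # idealized importance weights

for n in (100, 725):  # approximate SANDAG and MTC Extended sample sizes
    picks = rng.choices(range(len(utilities)), weights=probs, k=n)
    shifted = sampled_logsum(utilities, probs, picks, include_sample_size=False)
    corrected = sampled_logsum(utilities, probs, picks, include_sample_size=True)
    gap = shifted - true_logsum(utilities)
    print(n, "gap:", round(gap, 3), "ln(n):", round(math.log(n), 3))
```

In this idealized setup the gap between the shifted and full logsum equals ln(n), about 4.6 for 100 samples and about 6.6 for 725. With other sampling probabilities the corrected estimate would be noisy rather than exact, but the ln(n) offset between the two variants is the same for any given draw.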
Key implications:
- Affects SANDAG and MTC Extended in production (where disaggregate accessibilities are used); does not affect standard test models.
- Switching to Poisson sampling will not produce equivalent results under the current spec.
- Existing models estimated on biased accessibilities would be inconsistent if the code is corrected without re-estimation.
Path forward: Jan will prepare slides covering the bias, numerical magnitude, and resolution options (code fix vs. YAML workaround) for discussion at the Thursday/Friday engineering meeting. Jeff Newman framed the approach: establish the ideal fix without legacy constraints, then assess backward compatibility for agencies with production models built on the existing behavior.
Action Items
| Item | Owner | Timeline |
| --- | --- | --- |
| Fix 2016 household deduplication in SANDAG estimation bundle | RSG | ~2–3 weeks |
| Re-estimate telecommute frequency model with corrected data | RSG | Following data fix |
| Distribute auto-calibration design document | RSG | EOD 5/19 |
| Schedule follow-up telecommute model update | Alex Bettinardi | ~3 weeks |
| Prepare slides on Poisson sampling bias & resolution options | Jan Zill | By Thu engineering meeting |
| Discuss bias resolution path | Engineering group | Thu/Fri engineering meeting |