This repository is the official implementation of the paper: PyEPO: A PyTorch-based End-to-End Predict-then-Optimize Library for Linear and Integer Programming (Accepted to Mathematical Programming Computation (MPC))
Citation:
@article{tang2024,
title={PyEPO: a PyTorch-based end-to-end predict-then-optimize library for linear and integer programming},
author={Tang, Bo and Khalil, Elias B},
journal={Mathematical Programming Computation},
issn={1867-2957},
doi={10.1007/s12532-024-00255-x},
year={2024},
month={July},
publisher={Springer}
}
If you use the CaVE loss, please also cite:
@inproceedings{tang2024cave,
title={CaVE: A Cone-Aligned Approach for Fast Predict-then-Optimize with Binary Linear Programs},
author={Tang, Bo and Khalil, Elias B},
booktitle={Integration of Constraint Programming, Artificial Intelligence, and Operations Research},
pages={193--210},
year={2024},
publisher={Springer}
}
PyEPO (PyTorch-based End-to-End Predict-then-Optimize Tool) is a Python-based, open-source software that supports modeling and solving predict-then-optimize problems with linear objective functions. The core capability of PyEPO is to build optimization models with GurobiPy, COPT, Pyomo, Google OR-Tools, MPAX or any other solvers and algorithms, then embed the optimization model into an artificial neural network for the end-to-end training. For this purpose, PyEPO implements various methods as PyTorch autograd modules.
For end-to-end learning on binary linear programs (TSP, CVRP, knapsack, ...), PyEPO ships CaVE [13]. CaVE replaces the per-step ILP solve with a cone-alignment projection onto the binding-constraint normals at the true optimum; backed by an interior-point QP solver (Clarabel) with a low iteration cap, this delivers paper-faithful regret on TSP-scale binary LPs. Because the cone projection is far cheaper than the per-instance ILP solve, CaVE trains an order of magnitude faster than SPO+ at this scale.
PyEPO also integrates MPAX, a JAX solver that runs the first-order PDHG method on GPU. Because both the prediction network and the solver stay on the GPU, MPAX solves a whole mini-batch of instances at once and avoids the GPU-to-CPU transfer that CPU solvers like Gurobi pay at every training step.
The official PyEPO docs can be found at https://khalil-research.github.io/PyEPO.
Our recent tutorial was at the ACC 2024 conference. You can view the talk slides here.
01 Optimization Model: Build an optimization solver
02 Optimization Dataset: Generate synthetic data and use optDataset
03 Training and Testing: Train and test different approaches
04 CaVE for Binary Linear Programs: Train with the cone-aligned CaVE loss vs SPO+ on TSP
05 2D Knapsack Solution Visualization: Visualize solutions for the knapsack problem
06 Warcraft Shortest Path: Train shortest path models on the Warcraft terrains dataset
07 Real-World Energy Scheduling: Apply PyEPO to real energy data
08 kNN Robust Losses: Use optDatasetKNN for robust losses
09 Solving on MPAX with PDHG: Use MPAX for GPU-accelerated batch solving
10 JAX Frontend: Train any loss in JAX/Flax with
jax.grad(MPAX or any solver)
To reproduce the experiments in the original paper, please use the code and follow the instructions in this branch. Please note that this branch is a very early version.
- End-to-end gradient surrogates for predict-then-optimize, covering the seven families in the docs:
- Surrogate losses — convex upper bound on regret (SPO+ [1]) and finite-difference directional gradient (PG [11]).
- Perturbed methods — Monte-Carlo gradients over random cost perturbations: DPO and PFYL [5] [6], I-MLE [9], AI-MLE [10].
- Regularized methods — L2-regularized Frank-Wolfe over the convex hull of feasible solutions: RFWO and RFYL [6].
- Black-box methods — informative gradient estimates that replace the solver's zero gradient: DBB [3] (interpolation) and NID [4] (signed identity).
- Cone-aligned estimation — supervise the predicted cost by projecting onto the binding-constraint normals at the true optimum; binary linear programs only: CaVE [13] — an order of magnitude faster than SPO+ at TSP scale.
- Contrastive methods — margin against a cached pool of non-optimal solutions: NCE and CMAP [7].
- Learning to rank — rank the true optimum highest among the pool: pointwise / pairwise / listwise LTR [8].
- Multi-solver backend under a unified
optModelAPI: Gurobi, COPT, Pyomo, Google OR-Tools, and the GPU-native MPAX PDHG solver. - Symbolic modeling with
pyepo.dsl: define an LP, MIP, or QP once withVariable,Parameter, and constraints, then compile it to any backend. The compiled model is a standardoptModel, so every loss above works unchanged. - Parallel solving via a Pathos worker pool to amortize per-instance ILP solves across a mini-batch.
- Solution caching [7] reuses previously computed optima to skip redundant solver calls in contrastive and ranking training.
- kNN-smoothed targets [12] replace each label with a neighborhood aggregate for noise-robust regret.
You can download PyEPO from our GitHub repository.
git clone -b main --depth 1 https://github.com/khalil-research/PyEPO.gitAnd install it.
pip install PyEPO/pkg/.The package is now available for installation on PyPI. You can easily install PyEPO using pip by running the following command:
pip install pyepoPyEPO is also available on Anaconda Cloud. If you prefer to use conda for installation, you can install PyEPO with the following command:
conda install -c pyepo pyepoAn end-to-end predict-then-optimize example. The optimization model is defined with pyepo.dsl and compiled to Gurobi; change backend to run the same model on COPT, Pyomo, OR-Tools, or MPAX.
#!/usr/bin/env python
# coding: utf-8
import numpy as np
import pyepo
from pyepo import EPO, dsl
import torch
from torch import nn
from torch.utils.data import DataLoader
# prediction model
class LinearRegression(nn.Module):
def __init__(self):
super(LinearRegression, self).__init__()
self.linear = nn.Linear(num_feat, num_item)
def forward(self, x):
out = self.linear(x)
return out
if __name__ == "__main__":
# generate data
num_data = 1000 # number of data
num_feat = 5 # size of feature
num_item = 10 # number of items
weights, x, c = pyepo.data.knapsack.genData(num_data, num_feat, num_item,
dim=3, deg=4, noise_width=0.5, seed=135)
# optimization model: define symbolically, compile to Gurobi
items = dsl.Variable(num_item, vtype=EPO.BINARY)
cost = dsl.Parameter(num_item)
optmodel = dsl.Problem(dsl.Maximize(cost @ items),
[weights @ items <= np.array([7, 8, 9])]).compile(backend="gurobi")
# init prediction model
predmodel = LinearRegression()
# set optimizer
optimizer = torch.optim.Adam(predmodel.parameters(), lr=1e-2)
# init SPO+ loss
spop = pyepo.func.SPOPlus(optmodel, processes=1)
# build dataset
dataset = pyepo.data.dataset.optDataset(optmodel, x, c)
# get data loader
dataloader = DataLoader(dataset, batch_size=32, shuffle=True)
# training
num_epochs = 10
for epoch in range(num_epochs):
for data in dataloader:
x, c, w, z = data
# forward pass
cp = predmodel(x)
loss = spop(cp, c, w, z)
# backward pass
optimizer.zero_grad()
loss.backward()
optimizer.step()
# eval
regret = pyepo.metric.regret(predmodel, optmodel, dataloader)
print("Regret on Training Set: {:.4f}".format(regret))End-to-end training of a shortest-path predictor on a 5x5 grid with the SPO+ loss (Flax + optax):
import jax
import jax.numpy as jnp
import optax
from flax import linen as nn
import pyepo
from pyepo.data.dataset import optDataset
from pyepo.func.jax import SPOPlus
# optimization model: 5x5 grid shortest path (any PyEPO solver works)
grid = (5, 5)
optmodel = pyepo.model.shortestPathModel(grid)
# synthetic data
x, c = pyepo.data.shortestpath.genData(
num_data=1000, num_features=5, grid=grid, deg=4, noise_width=0.5, seed=135,
)
ds = optDataset(optmodel, x, c)
xj = jnp.asarray(x, jnp.float32)
cj, wj, zj = (jnp.asarray(a, jnp.float32) for a in (ds.costs, ds.sols, ds.objs))
# linear predictor and SPO+ loss
predmodel = nn.Dense(optmodel.num_cost)
params = predmodel.init(jax.random.PRNGKey(0), xj[:1])
spo = SPOPlus(optmodel, reduction="mean")
optimizer = optax.adam(1e-2)
opt_state = optimizer.init(params)
# end-to-end training
for epoch in range(10):
grads = jax.grad(lambda p: spo(predmodel.apply(p, xj), cj, wj, zj))(params)
updates, opt_state = optimizer.update(grads, opt_state)
params = optax.apply_updates(params, updates)- [1] Elmachtoub, A. N., & Grigas, P. (2021). Smart “predict, then optimize”. Management Science.
- [2] Mandi, J., Stuckey, P. J., & Guns, T. (2020). Smart predict-and-optimize for hard combinatorial optimization problems. In Proceedings of the AAAI Conference on Artificial Intelligence.
- [3] Vlastelica, M., Paulus, A., Musil, V., Martius, G., & Rolínek, M. (2019). Differentiation of blackbox combinatorial solvers. arXiv preprint arXiv:1912.02175.
- [4] Sahoo, S. S., Paulus, A., Vlastelica, M., Musil, V., Kuleshov, V., & Martius, G. (2022). Backpropagation through combinatorial algorithms: Identity with projection works. arXiv preprint arXiv:2205.15213.
- [5] Berthet, Q., Blondel, M., Teboul, O., Cuturi, M., Vert, J. P., & Bach, F. (2020). Learning with differentiable perturbed optimizers. Advances in neural information processing systems, 33, 9508-9519.
- [6] Dalle, G., Baty, L., Bouvier, L., & Parmentier, A. (2022). Learning with Combinatorial Optimization Layers: a Probabilistic Approach. arXiv:2207.13513.
- [7] Mulamba, M., Mandi, J., Diligenti, M., Lombardi, M., Bucarey, V., & Guns, T. (2021). Contrastive losses and solution caching for predict-and-optimize. Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence.
- [8] Mandi, J., Bucarey, V., Mulamba, M., & Guns, T. (2022). Decision-focused learning: through the lens of learning to rank. Proceedings of the 39th International Conference on Machine Learning.
- [9] Niepert, M., Minervini, P., & Franceschi, L. (2021). Implicit MLE: backpropagating through discrete exponential family distributions. Advances in Neural Information Processing Systems, 34, 14567-14579.
- [10] Minervini, P., Franceschi, L., & Niepert, M. (2023, June). Adaptive perturbation-based gradient estimation for discrete latent variable models. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 37, No. 8, pp. 9200-9208).
- [11] Gupta, V., & Huang, M. (2024). Decision-Focused Learning with Directional Gradients. Training, 50(100), 150.
- [12] Schutte, N., Postek, K., & Yorke-Smith, N. (2023). Robust Losses for Decision-Focused Learning. arXiv preprint arXiv:2310.04328.
- [13] Tang, B., & Khalil, E. B. (2024). CaVE: A Cone-Aligned Approach for Fast Predict-then-Optimize with Binary Linear Programs. In Integration of Constraint Programming, Artificial Intelligence, and Operations Research (pp. 193-210).

