
[FC-0118] docs: add ADR for standardizing pagination across APIs#38300

Open
Abdul-Muqadim-Arbisoft wants to merge 1 commit into openedx:docs/ADRs-axim_api_improvements from edly-io:docs/ADR-standardize_pagination_usage

Conversation


@Abdul-Muqadim-Arbisoft Abdul-Muqadim-Arbisoft commented Apr 8, 2026

Currently, Open edX REST APIs implement pagination inconsistently across endpoints — some use page/page_size, others use limit/offset, and several return full, unbounded result sets. This forces every API consumer, whether an MFE, a mobile client, or an AI agent, to implement custom data-loading logic per endpoint, and risks overloading clients with large unpaginated payloads. This ADR proposes standardizing all list-type endpoints on the existing DefaultPagination class from edx-drf-extensions, enforcing a consistent response envelope across the platform and enabling consumers to implement a single reusable pagination loop for all Open edX APIs.
Issue: http://github.com/openedx/openedx-platform/issues/38266

- Proposes DefaultPagination from edx-drf-extensions as platform-wide standard
- Documents migration path for LimitOffsetPagination and unpaginated endpoints
- Includes code examples for ListAPIView, APIView, and mobile pagination
- Outlines rollout plan and alternatives considered
@Abdul-Muqadim-Arbisoft Abdul-Muqadim-Arbisoft changed the base branch from master to docs/ADRs-axim_api_improvements April 8, 2026 06:01
* **User Accounts API** (``/api/user/v1/accounts/``) — pagination behavior differs from other user-related APIs, making it difficult for consumers to use a single data-loading pattern.
* **Course Members API** (``/api/courses/v1/.../members/``) — returns all enrollments without pagination, relying on a ``COURSE_MEMBER_API_ENROLLMENT_LIMIT`` setting (default 1000) to cap results and raising ``OverEnrollmentLimitException`` instead of paginating.
* **Enrollment API** (``/api/enrollment/v1/``) — some list endpoints return full result sets without pagination support.
* **Course Blocks API** (``/api/courses/v2/blocks/``) — intentionally returns unpaginated data for the entire course structure, which can result in very large response payloads.

In general, pagination of tree structures is complicated, to say the least.

Does a "page" size of 10 refer to 10 top-level items, which may potentially have hundreds of children included? Or a "varying shape" response with 1 top-level item + 9 children, or 8 top-level items + 2 children, and even more complexities with grandchildren and great-grandchildren? Or do we limit to returning 1 depth level at a time to avoid this?

Claude suggests the following:

The most principled approach distinguishes between two different questions clients are asking:

"What is the shape of this tree?" — This is a structural query. The answer (IDs, types, parent-child relationships, display names) is typically small and bounded even for large courses. It should be returned in full, without pagination, at controlled depth. A course with 500 blocks has maybe 5–15KB of structure. Trying to paginate this creates more problems than it solves.

"What is the full data for these nodes?" — This is a content query. Node content (student view data, completion state, grade details) can be large per node. This is where you paginate — but over a flat list of node IDs, not the tree itself.

Specifically, for the ADR, that would mean stating something like this:

Tree-shaped endpoints must not apply standard item-count pagination to the full node set. Instead, they must choose one of:

  1. Return the complete structural representation (IDs, types, relationships) and paginate separately over node content when requested, or
  2. Return the tree to a fixed maximum depth and provide explicit child-fetch URLs for any subtrees beyond that depth.

CC @jesperhodge re taxonomy pagination.

Note: Claude also said:

The course blocks API is actually a reasonable example of getting this mostly right already — requested_fields lets you strip the response down to structural metadata, and you can fetch full block detail separately. Its main gap is that the approach isn't documented as an explicit standard, so other tree-shaped APIs have reinvented things differently. ADR 0036 should probably make this the pattern explicitly.


Hmm, I guess this is actually explored in #38305 - why not just combine that ADR into this one?

Alternatives Considered
-----------------------

* **Standardize on LimitOffsetPagination instead of PageNumberPagination**: Rejected because ``edx-drf-extensions`` already ships ``DefaultPagination`` based on ``PageNumberPagination``, and a significant portion of the platform already uses it. Additionally, ``limit``/``offset`` pagination degrades in performance with large offsets because the database must scan and skip all preceding rows, making it unsuitable for large Open edX datasets such as enrollments and completions.

Additionally, limit/offset pagination degrades in performance with large offsets because the database must scan and skip all preceding rows, making it unsuitable for large Open edX datasets such as enrollments and completions.

This doesn't make any sense. limit/offset pagination and page number pagination have exactly the same database performance characteristics if implemented naively. But this is just the client-facing API shape; technically, there are ways to implement either page number pagination or limit/offset pagination using a cursor internally to improve performance.

The main reasons to prefer page number pagination are that it's already widely used, and it's much easier for humans to understand than limit/offset.
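The equivalence claimed here is easy to see once both client-facing shapes are reduced to the naive SQL they would generate. This is a sketch, not platform code; the table and column names are made up:

```python
def limit_offset_sql(limit, offset):
    # Naive translation of a limit/offset request.
    return f"SELECT * FROM enrollments ORDER BY id LIMIT {limit} OFFSET {offset}"


def page_number_sql(page, page_size):
    # Page N is just offset (N - 1) * page_size under the hood, so a naive
    # page-number implementation emits the identical query.
    return limit_offset_sql(page_size, (page - 1) * page_size)


print(page_number_sql(3, 10) == limit_offset_sql(10, 20))  # → True
```

Either way the database scans and discards the skipped rows, which is why the performance argument in the quoted bullet does not distinguish the two schemes.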

* **Adopt CursorPagination as the platform standard**: Rejected because cursor-based pagination, while performant for large and frequently-changing datasets, does not support random page access (jumping to page N). This would break existing MFE patterns that display numbered page controls. Cursor pagination also requires a stable, unique, sequential sort key on every queryset, which not all Open edX models guarantee today.

Cursor pagination does not require sort keys to be sequential or unique. It just requires that you can define a deterministic ORDER BY on every QuerySet, and that the sort key is indexed (for performance).

While "basic" cursor-based pagination works like WHERE id > :last_seen_id ORDER BY id LIMIT :page_size, you could instead use WHERE (sort_key, id) > (last_value, last_id) ORDER BY sort_key, id to make cursor-based pagination work for any comparable, indexed type — timestamps, strings, UUIDs, whatever.
