
Overhaul the python and cython lexers #2271

Open
jneen wants to merge 6 commits into main from maint.python-overhaul

jneen (Member) commented Apr 10, 2026

This is one of the earliest lexers written in the project, and it definitely shows its age. It relied on all manner of large joined regexes, along with lookaheads that were frankly unnecessary.

I've introduced a special `:newline` state, which is pushed at every newline. The Cython lexer has also been adjusted to be compatible with this.
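For readers unfamiliar with the pattern, here is a rough sketch of what "pushing a state at every newline" can look like. It is written as a tiny standalone Python state-stack lexer rather than in the project's own lexer DSL, and everything in it is an assumption made for illustration: the rule tables, the token names, and the choice of decorators as the line-start-only construct are not taken from this PR.

```python
import re

# Hypothetical rule tables: (pattern, token, action). An action of "#pop"
# pops the current state; any other string pushes the state of that name.
RULES = {
    "root": [
        (re.compile(r"\n"), "Text", "newline"),        # push the line-start state
        (re.compile(r"[^\S\n]+"), "Text", None),       # other whitespace
        (re.compile(r"[^\W\d]\w*"), "Name", None),
        (re.compile(r"\d+"), "Number", None),
        (re.compile(r"@"), "Operator", None),          # mid-line @ is matrix multiply
        (re.compile(r"[^\s\w@]+"), "Punctuation", None),
    ],
    "newline": [
        (re.compile(r"[^\S\n]+"), "Text", None),       # indentation
        (re.compile(r"@[^\W\d][\w.]*"), "Name.Decorator", "#pop"),
        (re.compile(r""), None, "#pop"),               # nothing special: back to root
    ],
}


def lex(source):
    """Return (token, text) pairs for source using the rule tables above."""
    stack = ["root", "newline"]  # the first line also starts at a line boundary
    pos = 0
    tokens = []
    while pos < len(source):
        for pattern, token, action in RULES[stack[-1]]:
            m = pattern.match(source, pos)
            if m is None:
                continue
            if m.end() > pos:  # the empty fallback rule emits no token
                tokens.append((token, m.group()))
                pos = m.end()
            if action == "#pop":
                stack.pop()
            elif action is not None:
                stack.append(action)
            break
        else:
            raise ValueError(f"no rule matched at position {pos}")
    return tokens
```

With this sketch, `lex("@cache\nx = a @ b\n")` tags `@cache` as `Name.Decorator` and the later `@` as `Operator`, without any lookbehind or line-anchored regex in the root state. The appeal of the design is that line-start-only rules live in one small state instead of being guarded individually.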

The lookaheads that remain are for `case` and `match`, which cannot be lexed with full accuracy without doing a full parse. The current approach handles the most common cases and a few uncommon ones, but a specially crafted string literal inside a case pattern or match statement can still break it. Even in the worst case, `case` and `match` are simply highlighted as Name, and the lexer recovers.
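To make the tradeoff concrete, here is a minimal sketch of a line-based soft-keyword heuristic. It is not the regex used in this PR; the `classify` function and its rules are hypothetical, chosen only to show how such a check can succeed on ordinary code and still be fooled by a string literal.

```python
import re

# Hypothetical heuristic: a line-leading `match`/`case` counts as a keyword
# only if the rest of the line, after a naive comment strip, ends in a colon.
HEAD = re.compile(r"[ \t]*(match|case)\b[ \t]*(.*)")


def classify(line):
    """Return "Keyword", "Name", or None if the line does not start with match/case."""
    m = HEAD.match(line)
    if m is None:
        return None
    # Naive on purpose: this ignores the possibility of `#` inside a string.
    rest = m.group(2).split("#", 1)[0].rstrip()
    looks_like_statement = rest.endswith(":") and not rest.startswith(
        ("=", ".", ",", ")", "]", ":")
    )
    return "Keyword" if looks_like_statement else "Name"
```

Under these assumptions, `classify("match command.split():")` and `classify("    case [x, y]:")` return `"Keyword"`, while `classify("match = re.match(p, s)")` and `classify("match(x)")` return `"Name"`. The line `match "a#b":` is a valid match statement, but the naive comment strip cuts it at the `#`, so it falls back to `"Name"`. That is the same kind of graceful degradation described above: a wrong token type on one word, with no effect on the lines that follow.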

The Mojo lexer was also edited to use the `:funcname` state after its `fn` keyword.

jneen (Member, Author) commented Apr 10, 2026

image

Added this to the visual spec for the Python lexer.
