feat(mdxish): add legacy variable tokenizer by eaglethrost · Pull Request #1339 · readmeio/markdown

eaglethrost · 2026-02-13T08:24:00Z

	Fix RM-15239

🧰 Changes

Adds a micromark tokenizer for the legacy <<variable>> syntax so that MDXish can parse it into variable nodes. It follows the pattern we already use for magic block parsing. This improves the engine architecture and removes the need for the legacy variable preprocessing we currently do in the frontend in the readme repo.

The tokenizer also supports parsing legacy glossary terms, which it converts to a glossary node (created here); that node is then converted to a Glossary component.
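
For readers who want a feel for the parsing step, here is a minimal sketch of the idea as a plain string scanner. It is not the micromark tokenizer from this PR: the node shapes, the `tokenizeLegacyVariables` name, and the `glossary:` prefix check are assumptions made for illustration only.

```typescript
// Hypothetical node shapes; the real types live in types.d.ts and may differ.
type TextNode = { type: 'text'; value: string };
type VariableNode = { type: 'variable'; name: string };
type GlossaryNode = { type: 'glossary'; term: string };
type InlineNode = TextNode | VariableNode | GlossaryNode;

// Assumed prefix that marks a legacy glossary term, e.g. <<glossary:API key>>.
const GLOSSARY_PREFIX = 'glossary:';

function tokenizeLegacyVariables(input: string): InlineNode[] {
  const nodes: InlineNode[] = [];
  let textStart = 0; // start of the pending plain-text run
  let i = 0;

  // Emit the plain text collected between textStart and end, if any.
  const flushText = (end: number): void => {
    if (end > textStart) nodes.push({ type: 'text', value: input.slice(textStart, end) });
  };

  while (i < input.length) {
    if (input.startsWith('<<', i)) {
      const close = input.indexOf('>>', i + 2);
      const body = close === -1 ? '' : input.slice(i + 2, close).trim();
      // Unterminated, empty, multi-line, or nested "<" bodies stay as plain text.
      if (close !== -1 && body !== '' && !body.includes('\n') && !body.includes('<')) {
        flushText(i);
        nodes.push(
          body.startsWith(GLOSSARY_PREFIX)
            ? { type: 'glossary', term: body.slice(GLOSSARY_PREFIX.length).trim() }
            : { type: 'variable', name: body },
        );
        i = close + 2;
        textStart = i;
        continue;
      }
    }
    i += 1;
  }

  flushText(input.length);
  return nodes;
}
```

Calling `tokenizeLegacyVariables('Hello <<name>>!')` yields a text node, a variable node named `name`, and a trailing text node. A real micromark construct would work on character codes and emit tokens through `effects`, so treat this only as a picture of the matching rules.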

🧬 QA & Testing

You can test quicker in the markdown playground. Turn on the mdxish flag and type the <<variable>> syntax in normal and weird cases. The <<variable>> syntax should get resolved to VARIABLE. I'm not sure why it isn't resolved to a value, maybe because we haven't added the value context, but if it turns into the capitalised version and the <<>> is stripped away, the tokenizer has worked (because it wasn't doing that before).
If you want to test in the readme app, make sure to prevent the legacy variable preprocessing from happening on MDXish RDMD here by removing the opts.mdxish part.

eaglethrost · 2026-02-13T13:04:12Z

processor/transform/mdxish/evaluate-expressions.ts

After adding the Glossary node type, I needed to make some type adjustments here because it introduced TypeErrors elsewhere.

eaglethrost · 2026-02-13T15:03:08Z

I'm not sure why there's a failing test; it looks like it won't run the build for some reason. I've verified that all tests pass locally, though.

eaglethrost · 2026-02-13T15:03:59Z

__tests__/lib/mdxish/magic-block-table-perf.test.ts

Adding the tokenizer to the table cell magic block slowed parsing down a bit, so I had to increase this timeout to make the test pass.

eaglethrost · 2026-02-13T15:05:06Z

processor/transform/mdxish/mdxish-component-blocks.ts

 const EXCLUDED_TAGS = new Set(['HTMLBlock', 'Table', 'Glossary', 'Anchor']);

-const inlineMdProcessor = unified().use(remarkParse);
+const inlineMdProcessor = unified()


For parsing legacy variables inside MDX components like <Comp>Hello <<name>></Comp>

eaglethrost · 2026-02-13T15:05:44Z

types.d.ts

  value: string;
 }

+interface Glossary extends Node {


Creating a glossary node so we can represent it in mdast

rafegoldberg

Nice start here @eaglethrost; thanks for picking this up! Working well in the baseline scenarios, but still seeing some discrepancies with these <<legacy_vars>> when used in a code context, or when trying to escape said var, per the following screenshot/example Markdown:

- **Old**: <<email>>
- **New**: {user.email}
- ***Escaped***
  - **Old**: \<<email>>
  - **New**: \{user.email}
- ***Inline Code***
  - **Old**: `<<email>>`
  - **New**: `{user.email}`
- ***Code Block***
  - **Old**:
    ```
    <<email>>
    ```
  - **New**:
    ```
    {user.email}
    ```

(Also just noticing that it seems like the new {user.var} syntax isn't being escaped properly either, but that's an issue for a different PR…)
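
To make the cases above concrete, here is a small sketch that lists which legacy variables in a Markdown string would be candidates for resolution. It assumes one possible policy (backslash-escaped variables and anything inside inline code or fenced code are left alone); whether code contexts should resolve is exactly what is being discussed in this thread. The function name and the simplified fence and backtick handling are illustrative, not the engine's behaviour.

```typescript
// Collect legacy variable names that are neither escaped nor inside code.
// Simplifications (assumptions): fences are lines starting with ``` and
// inline code is delimited by single backticks on one line.
function findResolvableLegacyVars(markdown: string): string[] {
  const names: string[] = [];
  let inFence = false;

  for (const line of markdown.split('\n')) {
    // Toggle fenced-code state on fence lines and skip everything inside.
    if (line.trimStart().startsWith('```')) {
      inFence = !inFence;
      continue;
    }
    if (inFence) continue;

    let inInlineCode = false;
    let i = 0;
    while (i < line.length) {
      const ch = line[i];
      if (ch === '\\' && !inInlineCode) {
        i += 2; // the backslash escapes the next character, so \<<email>> stays literal
        continue;
      }
      if (ch === '`') {
        inInlineCode = !inInlineCode;
        i += 1;
        continue;
      }
      if (!inInlineCode && line.startsWith('<<', i)) {
        const close = line.indexOf('>>', i + 2);
        const name = close === -1 ? '' : line.slice(i + 2, close).trim();
        if (name !== '') {
          names.push(name);
          i = close + 2;
          continue;
        }
      }
      i += 1;
    }
  }
  return names;
}
```

Under this policy, only the first `<<email>>` in the example list above would be reported; the escaped, inline-code, and code-block occurrences would be skipped.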

eaglethrost · 2026-02-13T17:01:05Z


Oh, I didn't think we'd want to resolve variables in code; I thought code should be left untouched. But I'll look into it 🙏

commit e657a21
Author: eagletrhost <dimazanugrah12@gmail.com>
Date: Mon Feb 16 19:51:12 2026 +1100

    refactor: clean code & comment

commit c28853d
Author: eagletrhost <dimazanugrah12@gmail.com>
Date: Mon Feb 16 19:40:18 2026 +1100

    test: fix legacy

commit 7574b19
Author: eagletrhost <dimazanugrah12@gmail.com>
Date: Mon Feb 16 18:36:04 2026 +1100

    feat: glossary adjustments

commit 6e92df0
Author: eagletrhost <dimazanugrah12@gmail.com>
Date: Mon Feb 16 18:13:02 2026 +1100

    feat: parse legacy vars in codes & api header block

eaglethrost · 2026-02-16T09:06:45Z

Update: It now resolves legacy variables in code blocks, though it ended up being more complicated than expected. I still had to resort to some pre-processing of the legacy vars in code blocks because the tokenizer doesn't operate on code nodes. I'm looking to see if there's a cleaner way, but it works now. I also extended it to match the glossary resolution behaviour in legacy.

Another limitation I'm working on is correctly parsing legacy variables with spaces, which doesn't work yet.

Demo:
https://github.com/user-attachments/assets/94339567-a1b8-4485-8c87-d2d300146ad6

eaglethrost · 2026-02-17T12:20:22Z

Another update: The current state tokenises legacy variables to variable nodes IF they're not in code blocks / inline code. I've reverted my preprocessing function that converted legacy vars to MDX vars, because legacy doesn't touch vars in code blocks either, and it still wouldn't really work for variables with spaces / special chars. It's also not an option to tokenize code content, because it's important to keep the code string intact and unparsed for the Code component syntax-highlighter to work.

Hence, as of now legacy vars in code NO LONGER get resolved. After a bit of digging, I found that in legacy they get resolved in the CodeMirror syntax-highlighter (see the Code component). So I think the best path forward is to extend the syntax-highlighter package to be mdxish aware and allow it to also parse legacy variables for mdxish (currently it only parses MDX-style user variables in MDXish). This way, we can keep the legacy variable syntax in code blocks and still have it resolved at render time, which follows how legacy handles it.

I've made a PR for that here: readmeio/syntax-highlighter#608

Important @kevinports
So to get this tokenizer to full parity with legacy, that PR is blocking this one. However, it's not really needed to resolve the akamai ticket that relies on this PR, because that ticket doesn't care about code blocks. If we need to get it resolved asap, we can do a fast follow to add back the code block resolution.

kevinports · 2026-02-18T18:48:22Z

> Hence, as of now legacy vars in codes NO LONGER get resolved. After doing a bit of digging, I found that in legacy, they get resolved in the CodeMirror syntax-highlighter (see the Code component). So I think the best path forward is to extend the syntax-highlighter package to be mdxish aware, and allow it to also parse legacy variables for mdxish (currently it only parses MDX style user variables in MDXish). This way, we can keep the legacy variable syntax in code blocks and have it still get resolved in rendering, and follows how legacy handles it.

Ok this makes sense. But I do want to push back a little on whether the syntax highlighter is the best place to do this. If you run this example: https://non-git.readme.io/docs/variables you'll see the variable resolution doesn't happen until after the React app mounts (because I believe the syntax highlighting is client side only). So you see a flash of unresolved syntax from the SSR:

Screen.Cast.2026-02-18.at.12.48.47.PM.mp4

That UX kind of blows right? Is there any way we can do this engine side? or at the very least on SSR?

> So to get this tokenizer full parity with legacy, that PR is blocking this. However, it's not really needed to resolve this akamai ticket which relies on this PR, because it doesn't care about code blocks. If we need to get it resolved asap then we can do a fast follow to add back the code block resolution

I am fine with moving the work to resolve vars within code blocks as a fast follow.

eaglethrost · 2026-02-18T21:44:03Z

> That UX kind of blows right? Is there any way we can do this engine side? or at the very least on SSR?

Yeah we definitely can, we just need to pass in the variables list to the engine for resolution. It will be straightforward to do and we just need to extend the function arguments to accept the variables

So to summarize, we want to resolve legacy AND mdx variables in code on the engine side? To do this, I think we can just add a transformer that visits code nodes and uses regex to resolve the variables, or extend the variable transformer we have now. Do note, though, that I think this way of resolving vars in code is different from legacy & mdx, but I guess it would be an improvement.
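
A rough sketch of that idea, assuming a minimal mdast-like tree and a plain name-to-value map. The `TreeNode` shape, the function names, and the regex are illustrative assumptions, not the engine's actual transformer or argument names.

```typescript
// Minimal mdast-like node; real mdast nodes carry more fields (position, lang, ...).
interface TreeNode {
  type: string;
  value?: string;
  children?: TreeNode[];
}

// Matches either <<name>> (legacy) or {user.name} (MDX style) in a single pass.
const VARIABLE_PATTERN = /<<\s*([^<>\n]+?)\s*>>|\{\s*user\.([A-Za-z_$][\w$]*)\s*\}/g;

function resolveInString(value: string, variables: ReadonlyMap<string, string>): string {
  return value.replace(
    VARIABLE_PATTERN,
    (match: string, legacy: string | undefined, mdx: string | undefined): string => {
      const name = legacy ?? mdx;
      // Unknown variables are left exactly as written.
      return name === undefined ? match : variables.get(name) ?? match;
    },
  );
}

// Walk the tree and rewrite only code and inlineCode values in place.
function resolveVariablesInCode(node: TreeNode, variables: ReadonlyMap<string, string>): void {
  if ((node.type === 'code' || node.type === 'inlineCode') && typeof node.value === 'string') {
    node.value = resolveInString(node.value, variables);
  }
  for (const child of node.children ?? []) {
    resolveVariablesInCode(child, variables);
  }
}
```

Because the values are substituted before rendering, the server-rendered HTML would already contain the resolved text, which is what would avoid the flash of unresolved syntax mentioned above. Text nodes are deliberately untouched here, since the tokenizer already handles those.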

> I am fine with moving the work to resolve vars within code blocks as a fast follow

If you're happier to move this work in a follow up PR, then this PR is basically done! Let me know if you're happy with my plan above and if it's better to create a follow up. (@kevinports)

kevinports

Lgtm.


Sounds good to me. I think the improvement is worth it while we're here.

[![PR App][icn]][demo] | Fix RM-XYZ
:-------------------:|:----------:

## 🧰 Changes

As a follow up of #1339, we want to resolve variables in inline code & code blocks, and the tokenizer couldn't do that. This PR adds an additional argument to the engine for the project variables, and a transformer to visit code nodes & use regexes to resolve legacy & MDX variables to their value.

## 🧬 QA & Testing

The variables in this should get resolved, test with code blocks

```
`<<name>> {user.name}`

// Remove the \, basically make it a code block
\```
My name is <<name>>
My other name is {user.name}
\```

[block:code]
{
  "codes": [
    {
      "code": "My name is <<name>> and {user.name}",
      "language": "js"
    }
  ]
}
[/block]
```

- [Broken on production][prod].
- [Working in this PR app][demo].

[demo]: https://markdown-pr-PR_NUMBER.herokuapp.com
[prod]: https://SUBDOMAIN.readme.io
[icn]: https://user-images.githubusercontent.com/886627/160426047-1bee9488-305a-4145-bb2b-09d8b757d38a.svg

## Version 13.3.0

### ✨ New & Improved

* **mdxish:** add legacy variable tokenizer ([#1339](#1339)) ([8e8b11b](8e8b11b))
* add option to perserve variable syntax in plain text compiler ([#1345](#1345)) ([5ab350e](5ab350e))
* **mdxish:** resolve variables in code blocks ([#1350](#1350)) ([a6460f8](a6460f8))
* **mdxish:** use variable name for heading slug generation ([#1340](#1340)) ([61a97d3](61a97d3))

rafegoldberg · 2026-02-20T15:41:41Z

This PR was released!

🚀 Changes included in v13.3.0

eaglethrost added 9 commits February 13, 2026 16:56

feat: basic legacy variable tokenizer

a2e073f

fix: remove print

123e36b

fix: temp map glossary to variable for now

04e9050

test: basic var test

559eaa7

fix: skip glossary parsing for now

b392aa2

test: complete edge case testing

402eb77

feat: add variable tokenize to magic block & component parsers

7cac7ef

fix: escape character test

c290a2e

refactor: rename to legacy variable

6c15146

eaglethrost marked this pull request as draft February 13, 2026 08:54

eaglethrost changed the title ~~fear(mdxish): add legacy variable tokenizer~~ feat(mdxish): add legacy variable tokenizer Feb 13, 2026

eaglethrost added 5 commits February 13, 2026 23:10

feat: legacy glossary parsing support

df9b7a8

style: comments

4f7cb94

Merge branch 'dimas/mdxish-legacy-variables-tokenizer-glossary' into …

5a1bbf5

…dimas/RM-15239-parse-legacy-vars-in-mdxish

fix: updatae variable name

839a1c7

fix: position type issue

0ebc7eb

eaglethrost commented Feb 13, 2026

test: increase timeout & add markdown test

2388bcf

eaglethrost commented Feb 13, 2026

eaglethrost marked this pull request as ready for review February 13, 2026 15:14

eaglethrost requested review from kevinports, maximilianfalco and rafegoldberg and removed request for kevinports, maximilianfalco and rafegoldberg February 13, 2026 15:14

eaglethrost requested review from kevinports and maximilianfalco and removed request for maximilianfalco February 13, 2026 15:14

rafegoldberg reviewed Feb 13, 2026

eaglethrost added 3 commits February 16, 2026 15:59

Merge branch 'next' into dimas/RM-15239-parse-legacy-vars-in-mdxish

2ca93ea

chore: bump bundle

fd01200

eaglethrost added 3 commits February 17, 2026 13:34

feat: reuse mdx code to deal with invalid variable names

d7ceac0

feat: pass in var context in legacy

9bc2c8a

refactor: remove legacy var preprocessor

d2c84b3

kevinports approved these changes Feb 18, 2026

eaglethrost added 2 commits February 19, 2026 08:59

Merge branch 'next' into dimas/RM-15239-parse-legacy-vars-in-mdxish

85af10f

test: in mdx components

b9591ce

eaglethrost merged commit 8e8b11b into next Feb 19, 2026
21 of 24 checks passed

eaglethrost deleted the dimas/RM-15239-parse-legacy-vars-in-mdxish branch February 19, 2026 12:43

eaglethrost mentioned this pull request Feb 19, 2026

feat(mdxish): resolve variables in code blocks #1350

Merged

rafegoldberg added the released label Feb 20, 2026
