ENS Name Normalization

Any thoughts on 2-in-1 characters (ꜳæꜵꜷꜹꜻꜽʤʣʥᴔꭁꭂʩǁʪɮʫʨꝷʦʧꜩɱᵯ) being confusable? E.g. ꜳ vs aa. The plain aa shouldn’t itself be treated as confusable because it’s just two ASCII letters.


Combining Marks (CM) modify how a character is presented:

  • å = 61 30A (where 30A is a CM)
  • å = E5

NFC is responsible for collapsing these together, eg. they both normalize to E5. For some characters, there is no combined glyph, eg. e̊ = 65 30A has no corresponding single character form.
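To make the collapsing concrete, here is a small sketch using Python's standard `unicodedata` module. The helper name `nfc_hex` is just something I made up for illustration; it is not part of any ENS library.

```python
import unicodedata

def nfc_hex(text):
    """Return the NFC form of `text` as a list of hex code points."""
    normalized = unicodedata.normalize("NFC", text)
    return [format(ord(ch), "X") for ch in normalized]

# a + combining ring above (61 30A) collapses to the single code point E5
print(nfc_hex("\u0061\u030A"))  # ['E5']

# the precomposed form is already NFC, so it is unchanged
print(nfc_hex("\u00E5"))        # ['E5']

# e + combining ring above has no precomposed form, so it stays as two code points
print(nfc_hex("\u0065\u030A"))  # ['65', '30A']
```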

Multiple CM can be attached to the same character, eg. ã̰ = 61 303 330. NFC is responsible for putting the CM in a canonical order.
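A quick way to see the canonical ordering is to feed the same two marks in both orders and compare the results. This is only a sketch with Python's `unicodedata`; the variable names are mine. The ordering is driven by each mark's canonical combining class, which `unicodedata.combining` exposes.

```python
import unicodedata

TILDE_ABOVE = "\u0303"  # combining tilde
TILDE_BELOW = "\u0330"  # combining tilde below

# Same base letter, same two marks, typed in opposite orders
above_first = "a" + TILDE_ABOVE + TILDE_BELOW  # 61 303 330
below_first = "a" + TILDE_BELOW + TILDE_ABOVE  # 61 330 303

# The raw strings differ...
assert above_first != below_first

# ...but normalization sorts marks by combining class, so both orders
# end up as the same sequence
nfc_a = unicodedata.normalize("NFC", above_first)
nfc_b = unicodedata.normalize("NFC", below_first)
assert nfc_a == nfc_b

# NFD shows the reordering directly: the lower class comes first
nfd = unicodedata.normalize("NFD", above_first)
print([format(ord(ch), "X") for ch in nfd])
print(unicodedata.combining(TILDE_ABOVE), unicodedata.combining(TILDE_BELOW))
```

Note that NFC may also compose the base letter with one of the marks after reordering, so the NFC output is not necessarily the same length as the input.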

You can stack CM on characters, eg. ã̃̃̃̃̃̃̃̃̃ and a̰̰̰̰̰̰̰̰̰̰.

Some CM stack without any visual indication, eg. a̸ (1x) vs a̸̸̸̸̸̸̸̸̸̸ (10x).
a̸̸.eth ≠ a̸̸̸.eth ≠ a̸̸̸̸.eth = ...
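As a rough illustration that these really are different names even after NFC, the sketch below builds the three labels and counts their marks. `count_marks` is a hypothetical helper of mine, and I'm treating any code point in a Unicode Mark category (Mn, Mc, Me) as a CM, which is an assumption rather than the exact ENS definition.

```python
import unicodedata

def count_marks(label):
    """Count code points whose general category is a Mark (Mn, Mc, Me)."""
    return sum(1 for ch in label if unicodedata.category(ch).startswith("M"))

SOLIDUS = "\u0338"  # combining long solidus overlay

# a + 2, 3 and 4 overlays: they can render identically but are distinct strings
names = ["a" + SOLIDUS * n + ".eth" for n in (2, 3, 4)]
normalized = [unicodedata.normalize("NFC", name) for name in names]

# NFC does not merge repeated marks, so all three stay distinct
assert len(set(normalized)) == 3

print([count_marks(name) for name in normalized])  # [2, 3, 4]
```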

There are currently 500 registered names with CM. Could we disallow some of the malicious ones (underscore-like, very small, etc.)? Could we disallow stacking?
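One possible shape for a "no stacking" rule, purely as a sketch: reject a label if the same CM appears twice in a row after decomposition. The function name and the rule itself are my assumptions, not an agreed ENS policy, and some scripts may legitimately repeat marks, so a real rule would probably need exceptions.

```python
import unicodedata

def has_repeated_mark(label):
    """Return True if the same combining mark occurs twice in a row.

    The label is decomposed with NFD first so that marks hidden inside
    precomposed characters (e.g. E3 = 61 303) are also seen.
    """
    decomposed = unicodedata.normalize("NFD", label)
    previous_mark = None
    for ch in decomposed:
        is_mark = unicodedata.category(ch).startswith("M")
        if is_mark and ch == previous_mark:
            return True
        previous_mark = ch if is_mark else None
    return False

print(has_repeated_mark("a\u0338"))        # False: a single overlay
print(has_repeated_mark("a\u0338\u0338"))  # True: the same overlay stacked twice
print(has_repeated_mark("a\u0303\u0330"))  # False: two different marks
```

This only catches identical marks repeated back to back. Limiting the total number of marks per base character would be a separate, stricter check.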
