ENS Name Normalization

Characters that should probably change:

  • Currency symbols: $, ¢, £, ¥, €, ₿, ... These are disallowed, they should be allowed.
  • Checkmarks: ❌❌︎🗙🗴🗵🗶🗷✔️✔🗸🗹 These are allowed, they should be consolidated.
  • Negative Circled (Serif vs San-serif): ❻ ➏ these are allowed, they should be equivalent.
  • Double vs Single Circled Digits: ➀ vs ⓵ these are allowed, they should be equivalent.
  • Emoji Tags: 🏴󠁧󠁢󠁥󠁮󠁧󠁿 1F3F4 E0067 E0062 E0065 E006E E0067 E007F: these are allowed, they should be ignored as they hide arbitrary data.

Reasonable but Non-RGI ZWJ Sequences:

Not sure:

  • Variations of this guy: ༼つ◕o◕༽つ, ಠ‿ಠ, └།๑益๑།┘
  • Blocks: ▁▂▃▄▅▆▇█ If these are allowed, why is _ disallowed? And what’s this? 🗕
  • Math Symbols: many seem very cool and unique, but need individually reviewed.
  • Abstract shapes

These lists are incomplete.

Edit: I’ve computed the 75 ZWJ sequences that are non-RGI but show up on emojipedia as JSON. I’ve also included them into the Emoji report.

4 Likes