ENS Name Normalization

Another very suspect character (that I wasn’t aware of) that’s still valid in IDNA 2008 is 0332 (COMBINING LOW LINE), a combining character that underlines whatever character precedes it:

X + 0332 = underlined X

  • a = 0061
  • a̲ = 0061 0332
  • a = <a>0061</a>
  • a̲ = <a>0061 0332</a>
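
To make the code points above concrete, here is a minimal sketch using only Python's standard `unicodedata` module. The helper name `codepoints` is mine, purely for illustration; it is not part of any ENS library.

```python
import unicodedata

BASE = "a"            # U+0061 LATIN SMALL LETTER A
LOW_LINE = "\u0332"   # U+0332 COMBINING LOW LINE

underlined = BASE + LOW_LINE


def codepoints(s: str) -> str:
    """Format a string as space-separated 4-digit hex code points."""
    return " ".join(f"{ord(ch):04X}" for ch in s)


print(codepoints(BASE))        # 0061
print(codepoints(underlined))  # 0061 0332
print(unicodedata.name(LOW_LINE))  # COMBINING LOW LINE

# A non-zero canonical combining class marks this as a combining character.
is_combining = unicodedata.combining(LOW_LINE) != 0

# There is no precomposed "underlined a", so NFC keeps both code points.
nfc = unicodedata.normalize("NFC", underlined)
print(codepoints(nfc))         # 0061 0332
```

Since normalization does not fold the sequence into anything else, the underlined form survives as a distinct 2-code-point string.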

Here are the current deviations from the spec that were already discussed:

  1. Emoji (UTS-51) are handled separately from text (UTS-46).

  2. There should only be one stop/label-separator character, (.) 002E (FULL STOP), rather than the four that UTS-46 accepts (002E, 3002, FF0E, FF61).
    ENS Name Normalization - #7 by nick.eth

  3. Underscore should be allowed (_) 005F (LOW LINE).
    ENS Name Normalization - #26 by nick.eth

  4. Because UTS-51 is sloppy, all emoji must be whitelisted.
    ENS Name Normalization - #23 by raffy
    The ambiguity of “poop joiner” prevents an algorithmic solution.
    ENS Name Normalization - #39 by raffy

  5. There are some well-supported non-RGI emoji that should be allowed. So far, I’ve only experimentally whitelisted “women wrestling”. This likely requires community review.
    ENS Name Normalization - #24 by raffy

  6. Tag sequences must be whitelisted because invalid tags render invisibly. Luckily, there are only 3. Again, this requires community review.
    ENS Name Normalization - #27 by raffy
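
As a rough illustration of deviations 2 and 6, here is a minimal sketch, not the actual adraffy implementation. The function names are hypothetical, and I am assuming the 3 whitelisted tag sequences are the England, Scotland and Wales flags, which the list above does not spell out.

```python
STOP = "\u002E"  # FULL STOP: the only label separator kept

# The three extra separators UTS-46 would also accept (for contrast only).
UTS46_EXTRA_STOPS = ("\u3002", "\uFF0E", "\uFF61")

BLACK_FLAG = "\U0001F3F4"  # WAVING BLACK FLAG, base of flag tag sequences
CANCEL_TAG = "\U000E007F"  # CANCEL TAG, terminates a tag sequence


def tag_sequence(ascii_tag: str) -> str:
    """Build base + tag characters + cancel tag for an ASCII tag like 'gbeng'."""
    tags = "".join(chr(0xE0000 + ord(c)) for c in ascii_tag)
    return BLACK_FLAG + tags + CANCEL_TAG


# Assumption: the whitelist is England, Scotland, Wales.
TAG_WHITELIST = frozenset(tag_sequence(t) for t in ("gbeng", "gbsct", "gbwls"))


def split_labels(name: str) -> list[str]:
    """Split on U+002E only; other stop-like characters stay inside the label."""
    return name.split(STOP)


def is_allowed_tag_sequence(seq: str) -> bool:
    """Accept a tag sequence only if it is on the explicit whitelist."""
    return seq in TAG_WHITELIST


print(split_labels("a_b.eth"))        # ['a_b', 'eth']
print(len(split_labels("a\u3002eth")))  # 1
print(is_allowed_tag_sequence(tag_sequence("gbeng")))  # True
print(is_allowed_tag_sequence(tag_sequence("usca")))   # False
```

A whitelist is used instead of a pattern match because any well-formed but unsupported tag sequence would pass a structural check and still render as nothing.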

This is the latest report, which runs 553K registered labels through both eth-ens-namehash and adraffy-1.3.13 and compares the results. adraffy-1.3.13 corresponds to UTS-51 + IDNA 2008 + CheckHyphen + CheckBidi + ContextJ + ContextO + the changes above.
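
For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of a diff harness. It is not the script that produced the report: `compare` and `try_norm` are hypothetical names, and the two normalizers are passed in as plain callables that return the normalized label or raise on invalid input.

```python
from typing import Callable, Iterable, Optional

Normalizer = Callable[[str], str]


def try_norm(fn: Normalizer, label: str) -> Optional[str]:
    """Return the normalized label, or None if the normalizer rejects it."""
    try:
        return fn(label)
    except Exception:
        return None


def compare(labels: Iterable[str], old: Normalizer, new: Normalizer) -> dict[str, list[str]]:
    """Bucket each label by how the two normalizers agree or disagree."""
    buckets: dict[str, list[str]] = {
        "same": [], "differ": [], "old_only": [], "new_only": [], "both_fail": [],
    }
    for label in labels:
        a, b = try_norm(old, label), try_norm(new, label)
        if a is None and b is None:
            key = "both_fail"
        elif a is None:
            key = "new_only"   # only the new normalizer accepts it
        elif b is None:
            key = "old_only"   # only the old normalizer accepts it
        elif a == b:
            key = "same"
        else:
            key = "differ"
        buckets[key].append(label)
    return buckets


# Toy stand-ins, NOT the real libraries: one lowercases, one also rejects spaces.
def toy_old(label: str) -> str:
    return label.lower()


def toy_new(label: str) -> str:
    if " " in label:
        raise ValueError("whitespace not allowed")
    return label.lower()


result = compare(["Abc", "a b"], toy_old, toy_new)
print(result["same"])      # ['Abc']
print(result["old_only"])  # ['a b']
```

The "old_only" and "differ" buckets are the interesting ones, since they hold registered labels whose meaning would change under the new rules.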
