Another very suspect combination (one I wasn't aware of) that's still valid in IDNA 2008 involves 0332 (COMBINING LOW LINE), a combining character that underlines the character before it:
X + 0332 = X̲ (underlined X)
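To make the underline problem concrete, here is a minimal Python sketch using only the standard `unicodedata` module. The helper name `has_combining_low_line` is hypothetical and is not part of any ENS library; it only shows that the mark is a separate code point that NFC does not fold away, so a normalizer has to handle it explicitly.

```python
import unicodedata

LOW_LINE_MARK = "\u0332"  # COMBINING LOW LINE


def describe(ch):
    """Return (name, general category, canonical combining class) for one code point."""
    return unicodedata.name(ch), unicodedata.category(ch), unicodedata.combining(ch)


def has_combining_low_line(label):
    """Hypothetical check: does the label contain the underline mark anywhere?"""
    return LOW_LINE_MARK in label


underlined = "X" + LOW_LINE_MARK

# The mark is a nonspacing combining character, not a letter.
print(describe(LOW_LINE_MARK))

# NFC has no precomposed "underlined X", so both code points survive normalization.
print(len(underlined), len(unicodedata.normalize("NFC", underlined)))

print(has_combining_low_line(underlined))
```

Because the sequence stays two code points after NFC, visual review alone will not catch it; the label has to be inspected code point by code point.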
Here are the current deviations from the spec that were already discussed:
- Emoji (UTS-51) are handled separately from text (UTS-46).
- There should be only 1 stop (label separator) character, 002E (.) FULL STOP, rather than 4. (ENS Name Normalization - #7 by nick.eth)
- Underscore, 005F (_) LOW LINE, should be allowed. (ENS Name Normalization - #26 by nick.eth)
- Because UTS-51 is sloppy, all emoji must be whitelisted (ENS Name Normalization - #23 by raffy). The ambiguity of the “poop joiner” prevents an algorithmic solution (ENS Name Normalization - #39 by raffy).
- There are some well-supported non-RGI emoji that should be allowed. So far, I’ve only experimentally whitelisted “women wrestling”. This likely requires community review. (ENS Name Normalization - #24 by raffy)
- Tag sequences must be whitelisted because invalid tags render invisibly. Luckily, there are only 3. Again, this requires community review. (ENS Name Normalization - #27 by raffy)
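Two of the deviations above (a single stop character and a tag-sequence whitelist) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the actual normalizer: the function names are hypothetical, rejecting the three other UTS-46 stops outright is my assumption about what "only 1 stop" implies, and I am assuming the 3 tag sequences are the RGI subdivision flags (England, Scotland, Wales).

```python
FULL_STOP = "\u002E"

# The four label separators that UTS-46 recognizes.
UTS46_STOPS = ("\u002E", "\u3002", "\uFF0E", "\uFF61")
OTHER_STOPS = frozenset(UTS46_STOPS) - {FULL_STOP}

TAG_BASE = "\U0001F3F4"    # WAVING BLACK FLAG
TAG_CANCEL = "\U000E007F"  # CANCEL TAG


def tag_sequence(spec):
    """Build an emoji tag sequence from an ASCII tag spec such as 'gbeng'."""
    tags = "".join(chr(0xE0000 + ord(c)) for c in spec)
    return TAG_BASE + tags + TAG_CANCEL


# Assumption: the whitelist holds the three RGI subdivision flags.
TAG_WHITELIST = frozenset(tag_sequence(s) for s in ("gbeng", "gbsct", "gbwls"))


def split_labels(name):
    """Split on 002E only; the other UTS-46 stops are rejected (an assumption)."""
    for ch in name:
        if ch in OTHER_STOPS:
            raise ValueError(f"disallowed stop character U+{ord(ch):04X}")
    return name.split(FULL_STOP)


def is_allowed_tag_sequence(seq):
    """Accept a tag sequence only if it is on the whitelist."""
    return seq in TAG_WHITELIST


print(split_labels("a_b.eth"))
print(is_allowed_tag_sequence(tag_sequence("gbsct")))
print(is_allowed_tag_sequence(tag_sequence("usca")))
```

A whitelist is used instead of a structural check because a well-formed but unrecognized tag sequence still parses correctly while showing nothing for the tag characters.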
This is the latest report, using 553K registered labels, for eth-ens-namehash vs adraffy-1.3.13, where adraffy-1.3.13 corresponds to UTS-51 + IDNA2008 + CheckHyphen + CheckBidi + ContextJ + ContextO + ChangesAbove.
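For readers who want to reproduce this kind of comparison, here is a rough Python sketch of tallying where two normalizers agree over a list of labels. It is not the tooling behind the report: `compare`, `lenient`, and `strict` are hypothetical stand-ins, and real normalizers would be plugged in as callables that either return a normalized string or raise.

```python
from collections import Counter


def compare(labels, old_norm, new_norm):
    """Tally how two normalizer callables agree over a list of labels."""

    def run(fn, label):
        try:
            return fn(label)
        except Exception:
            return None  # any exception counts as "rejected"

    tally = Counter()
    for label in labels:
        a, b = run(old_norm, label), run(new_norm, label)
        if a is None and b is None:
            tally["both-reject"] += 1
        elif a is None:
            tally["only-new-accepts"] += 1
        elif b is None:
            tally["only-old-accepts"] += 1
        elif a == b:
            tally["same"] += 1
        else:
            tally["different"] += 1
    return tally


# Stand-in normalizers, for illustration only.
def lenient(label):
    return label.lower()


def strict(label):
    if "\u0332" in label:  # reject COMBINING LOW LINE
        raise ValueError("disallowed character")
    return label.lower()


print(compare(["Abc", "x\u0332", "a_b"], lenient, strict))
```

The interesting rows in such a report are the ones where only one side accepts, or where both accept but produce different output, since those are the labels whose meaning would change.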