DNS Collisions of ENS Names in Browser Input

ENS-over-DNS services like eth.limo are amazing (e.g. vitalik.eth.limo), but some names collide once they are translated to DNS.

There are names that work perfectly with DNS ("vitalik.eth"), names that don’t work with DNS ("-x-.eth" or "xn--💩.eth"), and names that can be converted to Punycode ("💩.eth" → "xn--ls8h.eth").

Most browsers use UTS-46 (IDNA 2003, Transitional=True) on URL input, but not all browsers work the same. Most Punycode libraries follow the same conventions.

The following (2) names are distinct in ENS:

  1. 😵‍💫😵‍💫😵‍💫 1F635 200D 1F4AB 1F635 200D 1F4AB 1F635 200D 1F4AB
    (3 sets of 😵+ZWJ+💫)
  2. 😵💫😵💫😵💫 1F635 1F4AB 1F635 1F4AB 1F635 1F4AB
    (6 separate emoji)

However, most browsers and Punycode implementations translate both names to xn--ns8haa78mbab, resulting in a collision.
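The collision can be reproduced with a minimal sketch. It assumes that the only relevant part of the UTS-46 transitional mapping is the handling of the four deviation characters (ZWJ and ZWNJ are dropped, ß becomes ss, ς becomes σ), and it skips case folding and Unicode normalization entirely. The function name `transitional_label` is made up for illustration; the Punycode step uses Python's built-in `punycode` codec.

```python
ZWJ = "\u200d"

def transitional_label(label: str) -> str:
    """Approximate what a transitional UTS-46 mapping does to one label.

    Only the four deviation characters are handled; this is an
    illustration, not a full UTS-46 implementation.
    """
    mapped = (
        label.replace("\u200d", "")        # ZWJ is removed
             .replace("\u200c", "")        # ZWNJ is removed
             .replace("\u00df", "ss")      # sharp s -> ss
             .replace("\u03c2", "\u03c3")  # final sigma -> sigma
    )
    if mapped.isascii():
        return mapped
    return "xn--" + mapped.encode("punycode").decode("ascii")

with_zwj = ("\U0001F635" + ZWJ + "\U0001F4AB") * 3  # 3 ZWJ sequences
without_zwj = "\U0001F635\U0001F4AB" * 3            # 6 separate emoji

print(transitional_label(with_zwj))     # xn--ns8haa78mbab
print(transitional_label(without_zwj))  # xn--ns8haa78mbab
```

Both inputs are different strings, yet they end up as the same DNS label once the ZWJ is dropped.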

For reference, under my proposed ENS normalization spec, ZWJ may only appear in emoji sequences.


There should be a mechanism to determine whether an ENS name can work with DNS (verbatim, punycoded, or invalid). If it can, the mechanism should provide the correct DNS translation.
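As a rough sketch of what such a mechanism could look like for a single label: the function name `classify_label` and the exact rules are my assumptions, not an existing API. It assumes the label is already ENS-normalized (lowercase), treats leading or trailing hyphens and anything with "--" in positions 3 and 4 (such as Punycode literals) as invalid, and Punycode-encodes the label verbatim, without any IDNA mapping.

```python
import re
from typing import Optional, Tuple

# Assumption: normalized ENS labels are lowercase, so only a-z, 0-9, hyphen.
ASCII_LABEL = re.compile(r"[a-z0-9-]+")
MAX_DNS_LABEL = 63  # DNS limit on the length of one label

def classify_label(label: str) -> Tuple[str, Optional[str]]:
    """Return (status, dns_label); status is 'verbatim', 'punycoded' or 'invalid'."""
    # Leading/trailing hyphens and "--" in positions 3-4 (e.g. "xn--...")
    # are rejected, which also excludes Punycode literals.
    if not label or label[0] == "-" or label[-1] == "-" or label[2:4] == "--":
        return ("invalid", None)
    if label.isascii():
        if ASCII_LABEL.fullmatch(label) and len(label) <= MAX_DNS_LABEL:
            return ("verbatim", label)
        return ("invalid", None)
    # Encode the label as-is: no ZWJ stripping and no deviation mapping.
    encoded = "xn--" + label.encode("punycode").decode("ascii")
    if len(encoded) > MAX_DNS_LABEL:
        return ("invalid", None)
    return ("punycoded", encoded)

print(classify_label("vitalik"))         # ('verbatim', 'vitalik')
print(classify_label("-x-"))             # ('invalid', None)
print(classify_label("\U0001F4A9"))      # ('punycoded', 'xn--ls8h')
print(classify_label("xn--\U0001F4A9"))  # ('invalid', None)
```

Encoding verbatim means the result is the exact Punycode for the ENS name, which is not necessarily what a browser produces from typed input.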

Because there’s no way to determine the preferred name for a collision, names that have collisions should not be reachable. I see the following solutions:

  1. Do nothing. All names can be addressed via Punycode. Tell users not to trust URL input translation for names with ZWJ sequences or deviation characters. Users who want to ensure DNS functionality should register the IDNA-mangled version(s) of their name.

  2. Do not resolve names that could potentially collide:

    • Do not resolve any name that contains a ZWJ sequence, with or without the ZWJ. In the above example, 😵‍💫 1F635 200D 1F4AB is a valid RGI emoji sequence. Therefore, do not resolve a name containing 1F635 1F4AB or 1F635 200D 1F4AB.
    • I don’t know what to do about the deviations ss <=> ß and ς <=> σ. Browsers transform these differently too (Firefox leaves ß alone, Chrome maps it to ss). It seems unwise to ban every ENS name that contains ss from DNS. If you only ban the deviations, then jeß.eth might go to jess.eth in certain browsers without warning.
  3. Maintain a database of actively registered collisions (tracking registrations, renews, and expiration). Unfortunately, this would mean your DNS/URL would randomly break if someone registers a collision.

  4. Encourage the owner to purchase all permutations of their name (if they really want DNS functionality but have a name with collision possibilities) and then check for common owner on resolution (unnecessarily complicated.)

  5. Get first-class ENS support in browsers!

ENS names registered as Punycode literals (xn--*.eth) should not be reachable from DNS.


Here is the list of current collisions:

// note: some collisions are Punycode literals
// escape notation: {HEX} -> \u{HEX}
[
    ["edelwei{DF}", "edelweiss"],
    ["weisswein", "wei{DF}wein"],
    ["xn--0ciaa", "{2728}{2728}{2728}"],
    ["xn--og8haa", "{1F308}{1F308}{1F308}"],
    ["{56E7}{56E7}{56E7}", "xn--8bsaa"],
    ["{1F635}{1F4AB}{1F635}{1F4AB}{1F635}{1F4AB}", "{1F635}{200D}{1F4AB}{1F635}{200D}{1F4AB}{1F635}{200D}{1F4AB}"],
    ["xn--ki8haa", "{1F34A}{1F34A}{1F34A}"],
    ["{1F408}{2B1B}{1F408}{2B1B}{1F408}{2B1B}", "{1F408}{200D}{2B1B}{1F408}{200D}{2B1B}{1F408}{200D}{2B1B}"],
    ["{1F3F4}{200D}{2620}{1F3F4}{200D}{2620}{1F3F4}{200D}{2620}", "{1F3F4}{2620}{1F3F4}{2620}{1F3F4}{2620}"],
    ["{1F97A}{1F449}{1F448}", "xn--tp8hb001b"],
    ["{1F636}{1F32B}{1F636}{1F32B}{1F636}{1F32B}", "{1F636}{200D}{1F32B}{1F636}{200D}{1F32B}{1F636}{200D}{1F32B}"],
    ["nussbaumer", "nu{DF}baumer"],
    ["mnussbaumer", "mnu{DF}baumer"],
    ["{2764}{1F525}{2764}{1F525}{2764}{1F525}", "{2764}{200D}{1F525}{2764}{200D}{1F525}{2764}{200D}{1F525}"],
    ["{1F636}{200D}{1F32B}{1F636}{200D}{1F32B}", "{1F636}{1F32B}{1F636}{1F32B}"],
    ["xn--ms8ha18h", "{1F4AA}{1F60E}{1F4AA}"],
    ["{1F3F3}{200D}{1F308}{1F3F3}{200D}{1F308}{1F3F3}{200D}{1F308}", "{1F3F3}{1F308}{1F3F3}{1F308}{1F3F3}{1F308}"],
    ["{1F43B}{200D}{2744}{1F43B}{200D}{2744}{1F43B}{200D}{2744}", "{1F43B}{2744}{1F43B}{2744}{1F43B}{2744}"],
    ["{1F3F3}{200D}{26A7}{1F3F3}{200D}{26A7}{1F3F3}{200D}{26A7}", "{1F3F3}{26A7}{1F3F3}{26A7}{1F3F3}{26A7}"],
    ["weissbier", "wei{DF}bier"],
    ["{2764}{200D}{1F525}{2764}{200D}{1F525}", "{2764}{1F525}{2764}{1F525}"],
    ["gro{DF}mann", "grossmann"],
    ["fu{DF}ball", "fussball"]
]
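For anyone who wants to check a list like this, here is a small sketch of how such pairs could be detected: decode the {HEX} escape notation, map each name to an approximate browser-style DNS label, and group names that share a label. The helper names (`unescape`, `dns_form`, `find_collisions`) are hypothetical, and the mapping only covers the deviation characters (ZWJ, ZWNJ, ß, ς), so it is not a full UTS-46 implementation.

```python
import re
from collections import defaultdict

def unescape(name: str) -> str:
    """Turn the {HEX} escape notation into real characters."""
    return re.sub(r"\{([0-9A-Fa-f]+)\}", lambda m: chr(int(m.group(1), 16)), name)

def dns_form(label: str) -> str:
    """Approximate transitional mapping (deviation characters only) plus Punycode."""
    mapped = (
        label.replace("\u200d", "")        # drop ZWJ
             .replace("\u200c", "")        # drop ZWNJ
             .replace("\u00df", "ss")      # sharp s -> ss
             .replace("\u03c2", "\u03c3")  # final sigma -> sigma
    )
    if mapped.isascii():
        return mapped  # includes Punycode literals, which pass through as-is
    return "xn--" + mapped.encode("punycode").decode("ascii")

def find_collisions(names):
    """Group escaped names by DNS label and keep groups with more than one name."""
    groups = defaultdict(list)
    for name in names:
        groups[dns_form(unescape(name))].append(name)
    return {label: group for label, group in groups.items() if len(group) > 1}

sample = ["edelwei{DF}", "edelweiss", "xn--0ciaa", "{2728}{2728}{2728}", "vitalik"]
collisions = find_collisions(sample)
print(collisions)
# {'edelweiss': ['edelwei{DF}', 'edelweiss'], 'xn--0ciaa': ['xn--0ciaa', '{2728}{2728}{2728}']}
```

Running this over all registered labels would require an index of registrations, which is not shown here.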

Edit: I don’t think this is anything serious, but it popped up while thinking through some DNS stuff, and I figured I should make note of it.

Can someone pay this man please?

What happens if a user enters these names into their browser? If the browser shows the normalised version (e.g., 😵‍💫😵‍💫😵‍💫 transforms into 😵💫😵💫😵💫), then it seems to me that we should just preserve the default behaviour.

😵‍💫😵‍💫😵‍💫 and 😵💫😵💫😵💫 are two different names.

You pointed out this example to me:

The point here is that they translate to the same destination when typed in as <emoji>.eth.limo.

Right, understood. But if only one of them can be resolved via DNS, that’s not necessarily a problem as long as it’s clear to a user which one they’re getting.
