ENS Name Normalization

I’ve hedged either way, so whatever way the cookie crumbles, it’ll be fine for me

Please guys, none of this. It’s ad hominem because you’re not arguing anything on technical merit, instead you’re seeking to discredit someone purely based on the domains they hold.

3 Likes

Cheers, I wasn’t going to say anything, I am used to it from them, also a cheap way to pump their club at the end

I actually laugh at things like this now; it shows they are worried about their bags

Since 2003, but that doesn’t mean registrars allowed it or that anything was consistent. IDNs have allegedly been around for 10 years or so, but nobody uses them. I would be hesitant to say it was bungled, because once something is standardized it is better to keep it that way, and to restrict IDNs to specific country-based TLDs.

This is mostly a political forum, and that’s not an ad hominem as I see it. It’s perfectly fair to argue politically. In fact the main arguments for and against are not technical. There’s no technical limitation really, it’s all philosophical which comes down to politics. The only technical things involved are implementation, and compatibility issues.

We all want to stop scams, and while I disagreed with Theth.eth’s - vs _ confusion argument, I definitely see it as valid. The effort to restrict domains so as to allow or disallow certain behaviors, based on underlying empirical observations, is pure politics. Keeping motives transparent and open to question is a key tenet of democracy. There’s also nothing wrong with having a motive of self-interest.

I’m arguing for a broader picture, because many of the arguments made here show, in my opinion, a bit too narrow a worldview. Not everyone is a gamer or gets the 90s culture of IRC/AOL/ICQ handles. Not everyone codes, and even people who do don’t know every language’s conventions. There are a few sticky places where inclusivity touches security concerns.

Not touched in the discussion was the role of +, for example. Only last night, while researching ways to index Web3 sites, did I discover some things about - versus _ and + in some libraries. It would be nice if at least subdomains could have + as a complement to -. Sure, you can access the text records of a domain, but as ENS domains will likely be subdomain heavy, it would be nice to have conventions that allow faster indexing with keyworded subdomains.

That is the same level of low-caliber ad hominem I warned him about. Don’t do that either please, come on.

Saying “you just want to save your own bags” is nothing but a personal attack. You are pretending to know the heart of the other person, and attacking them based on your own assumptions. Feel free to argue politically or technically, but keep personal attacks out of it.

The rest is fine, let’s just stay on topic and civil please

@raffy What about domains like this: ENS Resolver


I don’t know why they render vertically, but is there a possibility of applying some normalization to them?

Combining marks stack/grow in different directions. Some build towers, some appear at the same spot, etc. Some marks obscure the underlying character. Some marks are very tiny.

  • i [69] vs ı̇ [131 307] (dotless i + dot)

Marks of the same class don’t reorder and form distinct names:

  • 1̉̇ [31 309 307] vs 1̇̉ [31 307 309]
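Both examples can be checked in any JavaScript runtime that supports `String.prototype.normalize`. This is only a sketch: the helper name `hexCodePoints` is made up for illustration and is not part of any ENS library.

```javascript
// Hypothetical helper: list a string's code points in hex,
// matching the bracket notation used in the examples above.
function hexCodePoints(s) {
  return [...s].map((ch) => ch.codePointAt(0).toString(16).toUpperCase());
}

const plainI = 'i';                      // [69]
const dotlessIWithDot = '\u0131\u0307';  // [131 307]

// NFC does not fold dotless i + dot into a plain "i".
console.log(hexCodePoints(dotlessIWithDot.normalize('NFC'))); // [ '131', '307' ]
console.log(plainI === dotlessIWithDot.normalize('NFC'));     // false

// U+0309 and U+0307 share a combining class, so normalization
// keeps their order and the two strings remain different names.
const hookThenDot = '1\u0309\u0307';     // [31 309 307]
const dotThenHook = '1\u0307\u0309';     // [31 307 309]
console.log(hookThenDot.normalize('NFD') === dotThenHook.normalize('NFD')); // false
```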

If you convert every valid registered name to NFD and then count sequences of adjacent combining marks, only 41K names have 1 mark, 400 names have 2, and 400 have 3+.
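A rough sketch of that kind of count, assuming "combining mark" means any code point with General_Category M (`\p{M}`) and that names are bucketed by their longest run of adjacent marks. The function names are hypothetical and the exact method behind the numbers above may differ.

```javascript
// Hypothetical helper: length (in code points) of the longest run of
// adjacent combining marks in a name, measured after NFD decomposition.
function longestMarkRun(name) {
  const runs = name.normalize('NFD').match(/\p{M}+/gu) ?? [];
  return runs.reduce((max, run) => Math.max(max, [...run].length), 0);
}

// Bucket names by longest run: keys are '0', '1', '2' and '3+'.
function tallyByMarkRun(names) {
  const tally = {};
  for (const name of names) {
    const n = longestMarkRun(name);
    const key = n >= 3 ? '3+' : String(n);
    tally[key] = (tally[key] ?? 0) + 1;
  }
  return tally;
}

// One name each with a longest run of 0, 1 and 2 marks.
console.log(tallyByMarkRun(['abc', 'caf\u00E9', '1\u0309\u0307']));
```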

UAX 39 suggests disallowing duplicate marks and only allowing 4 per character. I mentioned this issue a few times in this thread but ultimately I think enforcement should happen at validation level, instead of encoding this logic into the normalization spec.

For Latin, probably 1 CM is sufficient (and many shouldn’t validate, e.g. a̲ [332] COMBINING LOW LINE), unless we want some very complicated logic that rejects duplicates and non-stacking/overlapping variations and blacklists some combinations. I’m not exactly sure what marks are needed in other scripts, but we should try to keep it as minimal as possible.

IDNA says labels shouldn’t start with a CM. Additionally, I think a CM following an emoji is invalid too.
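A validation-level check along these lines might look like the sketch below. Everything in it is an assumption for illustration: the function name, the error strings, and the limit of 4 (taken from the UAX 39 figure quoted above). It is not code from the ENSIP or from ens-normalize.js.

```javascript
// Hypothetical validation check; not part of any ENS library.
const MAX_MARKS = 4;          // assumed limit, per the UAX 39 figure above
const MARK = /^\p{M}$/u;      // assumes CM = General_Category M

// Returns an error string, or null if the label passes these checks.
function checkCombiningMarks(label) {
  const cps = [...label.normalize('NFD')];
  if (cps.length > 0 && MARK.test(cps[0])) {
    return 'label starts with a combining mark';
  }
  let run = [];               // marks attached to the current base character
  for (const cp of cps) {
    if (!MARK.test(cp)) { run = []; continue; }
    if (run.includes(cp)) return 'duplicate combining mark';
    run.push(cp);
    if (run.length > MAX_MARKS) return 'too many combining marks';
  }
  return null;
}

console.log(checkCombiningMarks('1\u0309\u0307')); // null
console.log(checkCombiningMarks('a\u0301\u0301')); // 'duplicate combining mark'
console.log(checkCombiningMarks('\u0301a'));       // 'label starts with a combining mark'
```

The emoji rule is left out on purpose: valid emoji sequences themselves contain code points in the mark category (U+FE0F, U+20E3), so that check would need emoji-aware parsing first.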

1 Like

I found a small issue: there are 8 hyphen-like characters that get mapped to other hyphen-likes which we already map to hyphen. There are 17 names that use these characters. These need to be mapped to hyphen too:

  • ‐ [2010]
  • ‒ [2012]
  • ― [2015]
  • ⁻ [207B]
  • ₋ [208B]
  • ︱[FE31]
  • ︲[FE32]
  • ﹘[FE58]

Edit: This should be fixed in the ENSIP, reference implementation, and ens-normalize.js library.
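To illustrate the intended behaviour, here is a minimal sketch of mapping those 8 code points straight to the ASCII hyphen. The table and function names are hypothetical; this is not the ens-normalize.js API, and a real implementation would apply the mapping as part of the full normalization pipeline.

```javascript
// Hypothetical lookup table built from the 8 code points listed above.
const HYPHEN_LIKE = new Set([
  0x2010, 0x2012, 0x2015, 0x207B, 0x208B, 0xFE31, 0xFE32, 0xFE58,
]);

// Replace every hyphen-like code point with "-" and leave the rest alone.
function mapHyphenLikes(label) {
  return [...label]
    .map((ch) => (HYPHEN_LIKE.has(ch.codePointAt(0)) ? '-' : ch))
    .join('');
}

console.log(mapHyphenLikes('a\u2010b\uFE31c')); // "a-b-c"
console.log(mapHyphenLikes('a_b'));             // "a_b" (underscore is not in the table)
```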

4 Likes

This proves my point yet again

The underscore should map to the hyphen

They stopped using the underscore in Web2 as it was a confusable for the hyphen

Here we have vertical lines being mapped to the hyphen as they are confusables

Which looks more like a hyphen?? A vertical line or an underscore ?? I know which one I think does

People think it’s about me protecting my bags, but it’s not, I’m just stating facts……

1 Like

A hyphen is a hyphen, an underscore is an underscore. Your thesis that these are confusable doesn’t hold much water.

Repeating the same thing continuously doesn’t change the fact that there is a plethora of reasons why underscore is a good idea for ENS dynamism, reasons that far outweigh your concerns imo.

At this point, whether or not you are protecting your bags is irrelevant. If you are looking out for the ENS ecosystem, that’s great, but the reasons not to allow underscore (at least those that I’ve read here) are objectively not compelling enough (and this seems to be what the majority of people who have voiced their opinion here believe too).

The work being done here protects against most (if not all) of your concerns.

1 Like

Well there is a vote going on now……

You also need to look at all the people who have already manually minted underscore names and are pushing for them to be allowed, as it benefits them greatly; if they’re not allowed, then they have lost money

You can’t call me out as just protecting bags (which I’m not) without also calling them out for trying to make sure their bags are actually worth something

Something to take into consideration…….

At this point I don’t really think we should be getting caught up on who has minted what. What matters is what’s better or worse for the ecosystem long-term.

4 Likes

Funny that in this very thread it’s been used against me and my reasons for posting and that was ok, then turn it the other way and it’s all “we shouldn’t worry about what others have minted” :joy:

If you consistently ask for underscores to map to hyphens, and the reasoning you provide isn’t compelling enough and is often the same (almost to the point where it might even start to feel a little like spam), it’s not strange if people start thinking you might have an agenda to push, or at least that you are not being objective. Maybe you are genuinely concerned about the ENS ecosystem, maybe you have a personal interest, maybe both. Whichever the case may be, it’s irrelevant when you think about it.

I’ll simply return to my previous answer: imo, based on what I’ve read here, it doesn’t make sense to map underscores to hyphens on the basis of them being visually confusable with each other; they just aren’t (nor mechanically, just because they are typed with the same key on desktop keyboards). I do not believe they would devalue other ENS names and/or anger companies either. I don’t see why them not being allowed in Web2 in certain instances should determine Web3 identity diversity. Historically, underscores have been very present in usernames and even emails; my email has an underscore (it’s 20+ years old).

I’m sure people would be open to considering any more additional reasons if you (or someone else) wants to write about them but at least to me everything said so far against underscores holds little to no water.

My first comments about this subject:

Several links showing that the underscore was worked out of web2 as it was a confusable with the hyphen

More proof again:

and again:

Then you have this when ENS is a DAO and run by a separate company paid to do the service

And now we have vertical lines that are being mapped to the hyphen as they are mapped to others that are deemed confusables

Again, how is a vertical line more like a hyphen than an underscore…it makes zero sense…

Even the public agree from a Twitter vote I did, though I was very surprised how close the results were

note: there were ZERO explanations of why some voted for the vertical line over the underscore

https://twitter.com/hyphenate_eth/status/1558375030773481473

I know Nick wants the underscore introduced to ENS, fine do it, but map it to the hyphen

ENS is going to have a billion $ valuation quite easily, looking at the valuation of UD currently. Doing things wrong now will have implications in the future. The running of ENS needs to take this into account now to save problems later; mistakes are only going to snowball.

1 Like

Mapping a hyphen to an underscore is equivalent to mapping a 0 to the letter o. Two completely different characters.

I’ve also seen multiple BAYC holders rocking a .eth name with an underscore in it so I don’t see a problem with it.

1 Like

Don’t really see what relevance what others hold has on this thread, but if you want to worship BAYC holders feel free to

Your example of o & 0 is also completely different:

1 - Both digits and letters are already issued

2 - They are on different keys on the keyboard

2 Likes

For clarity, all of those hyphens above, including the vertical ones, are mapped to a hyphen of specific length according to IDNA 2003. It’s not an ad-hoc assignment.

We’ve already mapped minus, en-, and em- to hyphen. This minor fix ensures that the remaining hyphens map to the same hyphen.

As mentioned before, the alternative would be to disallow them (terrible UX) or make all non-standard hyphens unsafe. Considering that people use hyphen, minus, en-, and em- regularly, mapping is clearly the best solution.


You can check this yourself:

// these all equal "xn--8ug"
new URL('http://︱').host
new URL('http://—').host
new URL('http://\u{FE31}').host
new URL('http://\u{2014}').host

3 Likes

Someone voicing their opinion is the entire purpose of a DAO.

Labelling it as spam is ridiculous, and I would almost go as far as saying that it’s close to censorship.

Similarly to how Nick.eth said “end of debate.”

No…the point is to have a discussion. It’s one sided, clearly, and I’m disappointed in the DAO for taking this stance.

Underscores will inevitably cause problems for users. It is painfully obvious that there will be outrage when users lose their ETH or beloved NFTs all because there was a character that looks the same.

If we’re prioritizing safety of users and a good user experience, why would you even consider risking having a character that could cause this to happen?

Finally, having “easily confusables” like | map to the already working hyphen, but not an underscore _ is the most backwards thing I’ve ever seen on here.

Let’s not screw this up, for the sake and safety of the mass of users yet to come.

2 Likes

Stop leaning on the radio test argument and pretending you guys care about users. It doesn’t make any sense. There are 100 other ways to confuse names.
Of all the normalised characters, you chose underscores to talk about for months, trying to explain why they’re confusing. They’re not.

- isn’t _

4 Likes