ENS Name Normalization

I don’t think it’s visually similar at all; they are two very distinct characters. Lots of characters are “similar”; that’s what we’ve all learned on this normalization journey with @raffy. I don’t think that’s a good reason to say “really is it needed” though.

ENS already allows all kinds of characters that are not used (or maybe not even valid) in web2 domains, and that’s okay. When ENS started it was centered around “domains”, but I think now it is seen through the broader lens of “profiles”, not just website domains.

So sure, why not allow underscores. I agree with Nick that hyphen and underscore are clearly distinct, and one should not be mapped to another.
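To make the two options in this debate concrete, here is a minimal sketch of what "distinct" versus "mapped" would mean in practice. The function names are hypothetical, and plain lowercasing stands in for the real, much larger normalization rules.

```python
def normalize_distinct(label: str) -> str:
    """Hypothetical policy A: underscore and hyphen stay separate characters."""
    return label.lower()


def normalize_mapped(label: str) -> str:
    """Hypothetical policy B: underscore is folded into hyphen."""
    return label.lower().replace("_", "-")


def collide(normalizer, a: str, b: str) -> bool:
    """Two spellings collide when they normalize to the same label."""
    return normalizer(a) == normalizer(b)
```

Under the mapped policy, `my_name` and `my-name` are the same name; under the distinct policy they are two separately registrable names.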

The question is why new characters need to be introduced. If you keep doing this, you will continue to lose the trust of the people buying names. Will the attitude be “What is coming next… is someone going to be able to copy my name??”

@raffy did agree that it should be mapped, along with any other dash/minus-looking character:

"I think mapping to hyphen is reasonable as well."

Yes, you can tell the difference, but as they are on the same physical key on a keyboard, people will make mistakes. But again, why is it needed, why are the rules changing so much, and what is next?? !@#$%^&*()+=:"’;<>?/~`| one of these, or all of them??

If underscores were used in web2, then why, when I go to GoDaddy and make up a random name with an underscore, does it show zero results, while the same search with a hyphen comes up?

Why not leave the ASCII characters as they were in web2: letters, numbers and hyphens only?

Leave the fun for the other characters and emoji.

This will help onboard people, as it’s the system they know and trust.

The people already here are not the main market; the main market hasn’t even heard of ENS yet.

I don’t think there’s been a single normalization change since EIP-137 so no trust has been lost from “rules changing”. IMO it’s the opposite: there are serious issues with nonstandard normalization, weird emoji, zero-width characters, and various confusables.

My ENSIP enables only two previously disallowed characters, "$" and "_". They’re both collateral damage of the STD3 flag and legacy DNS rules.

  • "$" is the only disabled currency symbol under IDNA 2003.
  • "_" is actively used in DNS records.

ASCII characters are the most valuable in ENS. They’re universally supported, recognized, and easy to type. IMO, we should enable as many as possible, and "$" and "_" seem like good candidates, whereas "@" or "#" seem like a huge mistake since they have established delimiter-like uses.

Above ASCII, there are thousands of characters that never should have been enabled, but we’re stuck with them.

We can improve trust by (1) standardizing this process (re: my ENSIP), (2) deploying an on-chain normalization contract that follows the standard, and finally (3) deploying an on-chain validation contract that asserts whether a name is unambiguous.

My investigations have shown that (3) is hard, but it’s almost trivial to provide confusable-free validation for almost all names (DNS + Emoji is 95%+ coverage). By separating normalization from validation, the remaining exotic names will still be able to normalize and resolve (but ideally there would be some feedback that they fail validation during user-input.)
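As a rough illustration of keeping normalization and validation separate, here is a toy sketch. The character sets and function names are assumptions made up for this example: the real rules are far larger, emoji handling is left out entirely, and nothing here reflects an actual contract or library.

```python
import unicodedata

# Hypothetical, heavily simplified character sets (assumptions for illustration).
ZERO_WIDTH = {"\u200b", "\u200c"}  # zero-width space, zero-width non-joiner
SAFE_ASCII = set("abcdefghijklmnopqrstuvwxyz0123456789-_$")


def normalize(name: str) -> str:
    """Map a name to a canonical form, or raise ValueError if it cannot be."""
    labels = unicodedata.normalize("NFC", name).lower().split(".")
    for label in labels:
        if not label:
            raise ValueError("empty label")
        if any(ch in ZERO_WIDTH for ch in label):
            raise ValueError("zero-width character")
    return ".".join(labels)


def validate(normalized: str) -> bool:
    """Separate, stricter check: True only if every label is plain safe ASCII."""
    return all(set(label) <= SAFE_ASCII for label in normalized.split("."))


# A name with a Cyrillic "a" (U+0430) still normalizes and could resolve,
# but it fails validation, so a UI could warn the user at input time.
exotic = normalize("r\u0430ffy.eth")
```

The point of the split is that `normalize` decides whether a name can exist at all, while `validate` only decides whether it is safe to display without a warning.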

I totally get why you are sorting some of this out. It’s been needed, as things were opened up far too much in the past without being fully thought through, hence all the problems with ZWJ, etc.

Things need to be tidied up

I just chose # and @ as two random characters; it could just as easily be ! or & or *, etc.

I fully agree you need to standardise the process and lock in what can and can’t be used, but this is my point: when are the changes going to stop? Is this the only one?

The hyphen is gaining momentum every single day in its use

In the past few weeks we have seen many hyphen emoji names minted (I hold zero). People are realising that the old web2 username rules are no longer needed with SIWE and that names can include hyphens, which is only going to amplify their use. But now the proposal means that people and companies using a hyphenated name will have to try to get the underscore version too, to guard against a copycat.

If it mapped to the hyphen name then great; it would save these people a lot of hassle and cost, though for Coca-Cola or G-StarRAW or Y-3 it wouldn’t be much money in the grand scheme of things.

I’m guessing it will all be packaged into one DAO vote: this is what we have decided is best, do you agree, yes or no?

There are only 128 ASCII characters. I think this can be decided and frozen.

  • "?", "&" search string param separator
  • "/" path separator
  • "\", "%" escape characters
  • """, "'" quotes
  • "()[]{}" brackets (markdown syntax, etc.) → "(raffy.eth)"
  • "*" DNS wildcard
  • "|" is a "!"-confusable
  • ",;:" are "."-confusables (the most important character)
  • "!" maybe? likely "."-confusable
  • "+", "~" maybe? likely "-"-confusable
  • rest are control characters

If we dislike "_", it should stay disabled.
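For a sense of how small the decision space is, here is a quick tally of ASCII by category, using only Python's standard `string` module. The function name is made up for this sketch.

```python
import string


def ascii_breakdown() -> dict:
    """Count the 128 ASCII code points (0 through 127) by rough category."""
    chars = [chr(i) for i in range(128)]
    # Control characters are code points 0-31 plus DEL (127).
    control = [c for c in chars if ord(c) < 32 or ord(c) == 127]
    return {
        "total": len(chars),
        "control": len(control),
        "letters": sum(c in string.ascii_letters for c in chars),
        "digits": sum(c in string.digits for c in chars),
        "punctuation": sum(c in string.punctuation for c in chars),
        "space": chars.count(" "),
    }


breakdown = ascii_breakdown()
```

That leaves 32 punctuation characters to rule on, of which the hyphen is already in use and the period is the label separator.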

One question I have about the underscore: has anyone actually come forward and asked for it to be included?

If not, then in my view that shows it is not needed.

I’m guessing someone has come forward and asked for $ to be added.

100%

This sounds nonsensical to me.

I don’t see any issue with introducing the “_”, just like I don’t see any issue with $ and other currency symbols. Once decided, it can be “locked”, and that fear of never-ending ASCII additions would stop.

I might be being cynical, but it sounds like you might own some “-” hyphen ENS names and want to protect yourself from the “_” version.

You are right about one thing: only the tip of the iceberg knows about ENS. And I can guarantee that, unlike web2, this won’t be as brand-centred; quite the opposite, in fact.

Underscores are permitted in DNS names - they are just prohibited by registrars for registration. The normalisation function needs to support a superset of valid DNS characters, so underscores need to be included.
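The distinction can be sketched in code. The hostname rule below follows the letters-digits-hyphen convention from RFC 952/1123, while DNS itself (RFC 2181) only limits a label's length. The function names are made up for illustration; `_dmarc` is a label commonly seen in real TXT record names.

```python
import re

# Hostname ("LDH") label: letters, digits, hyphen; no leading or trailing hyphen;
# at most 63 characters.
LDH_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_hostname_label(label: str) -> bool:
    """True if the label satisfies the traditional hostname convention."""
    return LDH_LABEL.fullmatch(label) is not None


def is_dns_label(label: str) -> bool:
    """True if the label fits the DNS protocol limit of 1 to 63 octets."""
    return 1 <= len(label.encode("utf-8")) <= 63
```

So `_dmarc` is a valid DNS label but not a valid hostname label, which is why a normalization function that wants to cover DNS has to accept a superset of the hostname characters.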

In my first post (I think it was in this thread), I said I own hyphenated names.

No need to guess, it’s further up the discussion

I’ve been fully open

REPLY TO NICK BELOW AS I CAN’T POST AGAIN

I’ve still not seen a single TLD with an underscore. I’ve seen subdomains, but never a TLD.

They were also phased out of use in subdomains.

I know this is web3 and not web2, but your argument falls flat.

DNS has already gone through this process and phased out the underscore, as they realised it was a mistake.

First off, I’m assuming you mean domain names vs subdomains (as TLDs would be .com, .net, .org, .eth and so on).

I looked into this a while ago, and the convention of not using underscores in domain names comes from an incredibly outdated specification hailing from 1987 that was last updated in 1997 (the last spec is from 1998 but is about URIs):

Summary: HostNamingRules < CF < TWiki


The TL;DR is that it was decided that domain names should follow ARPANET hostname rules to prevent incompatibility with legacy software like telnet and mail, before operating systems other than Plan9 conformed to Unicode.

ENS doesn’t have to make those same considerations because it doesn’t support legacy software regardless. I haven’t seen any up-to-date specification that disallows underscores for any reason beyond this, let alone a reason that matters to ENS or modern software.

These same specifications also contain many antiquated rules restricting the length of names, internationalized names, all Unicode, and many more things we wouldn’t want to restrict ourselves to.

From 2020:

From 2019:

https://www.entrust.com/blog/2019/01/removal-of-underscores-from-domain-names/

https://cabforum.org/2018/11/12/ballot-sc-12-sunset-of-underscores-in-dnsnames/

Those articles aren’t about which characters are valid in domain names, they are about rules for the issuance of SSL certificates by centralized Certificate Authorities.

ENS names aren’t required to point to a website, or to use SSL certificates, and self-issued certificates aren’t bound by those rules.

Totally agree ENS names are different, BUT it shows that underscores were used and then revoked in DNS for the same reason I am arguing against issuing the underscore in ENS.

It’s history repeating itself, but as ENS is decentralised, it can’t be revoked like on DNS. If it’s allowed, it’s here to stay.

No, it doesn’t. Underscores not being allowed in domain names has nothing to do with those articles; it has to do with very old considerations for ARPANET hostnames. You can read about it in my comment here:

As I have been saying, they were used in subdomains only, not in the main domain name.

They have also been revoked because they are confusable, which is what I have also been saying for ENS.

So why follow the same path when it has already happened in DNS and they changed it?

It obviously doesn’t work…

I am really not interested in litigating this further, and it’s a massive distraction from the normalisation function as a whole.

Underscores are permitted in DNS and will be permitted in the new normalisation function. End of debate.

Underscore is for file path word separation only imo.

I’m just gonna bump this one post up and pretend I didn’t reply before reading @nick.eth’s recent post.

I thought the DAO decided, not one single person??

Why are you not addressing the concerns of the people you are meant to be serving instead of dictating to them?

Raffy has already said this about mapping it:

Why not introduce it to allow DNS integration, but map it to the hyphen?

This gives the best of both worlds, which is what Raffy was proposing.

As it stands, what you are proposing is, in my view, a rug pull on anyone who has registered an ENS name with a hyphen in it,

Coca-Cola.eth included, along with all the others.

Glad to see “_” is likely going to happen.

As it should.
