ENS Name Normalization

Same physical key on a keyboard

I feel there will be many mistakes

I am biased but I feel it would be wrong to not map it to the hyphen name

I own several hyphen names and several hyphen pre-punk names

Think about companies like Coca-Cola, who own and use coca-cola.eth, and then y-3.eth, g-starRAW.eth, etc.

All these companies will lose confidence in ENS if you then allow coca_cola.eth to be registered

They would all need to fight again to try and secure their names, at a cost

In my view, do not annoy these companies, as they may turn their backs on ENS

It would also mean all those pre-punk names with hyphens would be able to be copied

Actually not just pre-punk names but all names with a hyphen in them, and there are plenty

I know Nick doesn’t like hyphen names, but don’t shoot ENS in the foot. It’s already got its problems; don’t make another one

ENS could be huge, it is already becoming a beast and I don’t think it’s even started yet, but again with mismanagement it could also fail

I have been openly critical of Nick and his ideas on how to issue 2- and 1-character names. I still think the tax method he was suggesting is stupid, but that is just my opinion. If you don’t listen to opinions, it just becomes an echo chamber and mistakes are made

Regarding the $ sign

Does ENS really need it?

I’m not really seeing why it is needed

Remember that for mass adoption of ENS it needs to appeal to the masses. The masses don’t spend all day at a computer keyboard, simplicity will attract and keep the masses, make it too technical or have too many issues and it will scare away the masses or make them hesitant in adopting it

Thank you, raffy, your work is appreciated!

Agree

Is this the outcome that was discussed and approved? I read through the most recent comments, but I’m not exactly sure what the outcome was. Will the Extended Arabic-Indic (Persian) digits route to the regular Arabic-Indic digits in the cases where they overlap?

I believe they need to be mapped, if we want a good UX.

According to UTS-46 w/ Context O, the recommendation for Arabic numerals is to never allow digit mixing. However, that still permits visually identical names for corresponding digits.

According to UAX-15 and visual inspection, 0-3 and 7-9 are confusable, so either you disallow those characters or you pick a preferred one. The recommended solution is to convert to punycode, so the user sees a gibberish name but now has the information necessary to differentiate the confusable characters (if they know the correct punycode form). For ENS, we don’t have an alternative input form.

  • Names should normalize or fail → How do I resolve xyz.eth?
  • Names should be valid/accepted/not confusing, or fail/warn → Is xyz.eth a spoof?
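
To make the digit question concrete, here is a minimal Python sketch of the two behaviours being discussed: mapping the confusable Extended Arabic-Indic digits to a preferred form, and rejecting labels that mix the two digit sets. The choice of the Arabic-Indic form as the preferred one and the helper names (`map_digits`, `mixes_digit_sets`, `punycode`) are assumptions for illustration only, not the behaviour of any ENS implementation.

```python
# Hypothetical sketch, not the ENS normalization algorithm.
ARABIC_INDIC = [chr(0x0660 + d) for d in range(10)]  # U+0660..U+0669
EXTENDED = [chr(0x06F0 + d) for d in range(10)]      # U+06F0..U+06F9

# Digits described above as confusable between the two sets.
CONFUSABLE = {0, 1, 2, 3, 7, 8, 9}

# Assumed policy: prefer the Arabic-Indic form wherever the shapes overlap.
PREFERRED = {EXTENDED[d]: ARABIC_INDIC[d] for d in CONFUSABLE}


def map_digits(label: str) -> str:
    """Replace confusable Extended Arabic-Indic digits with the preferred form."""
    return "".join(PREFERRED.get(ch, ch) for ch in label)


def mixes_digit_sets(label: str) -> bool:
    """True if the label contains digits from both sets."""
    has_arabic_indic = any(ch in ARABIC_INDIC for ch in label)
    has_extended = any(ch in EXTENDED for ch in label)
    return has_arabic_indic and has_extended


def punycode(label: str) -> str:
    """ASCII form that lets a reader tell look-alike labels apart."""
    return "xn--" + label.encode("punycode").decode("ascii")


a = "\u0661\u0662\u0663"  # 123 in Arabic-Indic digits
b = "\u06F1\u06F2\u06F3"  # 123 in Extended Arabic-Indic digits
print(punycode(a), punycode(b))        # two different ASCII strings
print(map_digits(b) == a)              # True
print(mixes_digit_sets(a + "\u06F4"))  # True
```

With mapping, both spellings of 123 resolve to the same name; without it, they look identical but only the punycode form reveals the difference.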

Discussed? Yes. Approved? No, that’s the purpose of this discussion!

Hyphen and underscore are distinct characters with distinct visual appearances. Both are already used in DNS, too, where they have different meanings. We definitely shouldn’t map them to the same character.

Thanks, everyone, for the great discussion and all the work.

I noticed that names containing an (emoji)+(word) are not being recognized on Etherscan.
When I searched for 🍪cookie.eth, the response was “Name does not follow UTS-46 normalization.”
Don’t get me wrong. Maybe this is already being addressed, but I am afraid that if we cannot make it work across different platform interfaces like Etherscan, and especially CEXs, this could trigger huge concerns and a lack of trust in ENS.

I don’t want to sound alarmist but I don’t think the majority of people that bought emoji domains know about this.

I am here to help if you need it.

Underscores are NOT allowed in DNS domain names or subdomains, and it has been that way for years

This one is a good read

Domains are not case sensitive, so why make a difference between a hyphen ‘-’ and an underscore ‘_’?

The existence of _dmarc and _domainkey subdomains would tend to indicate otherwise.

An underscore is not an uppercase hyphen.
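
A quick way to see this, using only Python’s standard `unicodedata` module: case-insensitivity comes from case folding, and neither case folding nor NFKC relates the two characters. The helper name `folds_together` is made up for this sketch.

```python
import unicodedata

HYPHEN, UNDERSCORE = "-", "_"


def folds_together(a: str, b: str) -> bool:
    """True if NFKC normalization plus case folding makes the strings equal."""
    def norm(s: str) -> str:
        return unicodedata.normalize("NFKC", s).casefold()
    return norm(a) == norm(b)


print(f"U+{ord(HYPHEN):04X} {unicodedata.name(HYPHEN)}")          # U+002D HYPHEN-MINUS
print(f"U+{ord(UNDERSCORE):04X} {unicodedata.name(UNDERSCORE)}")  # U+005F LOW LINE
print(folds_together("Coca-Cola", "coca-cola"))  # True: differs only by case
print(folds_together("coca-cola", "coca_cola"))  # False: different characters
```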

I think @raffy might have been talking about the Arabic-Indic digits in this case, not hyphen/underscore.

This argument doesn’t make any sense, since lots of characters which are available in ENS are not viable in traditional domain names. The whole point of this thread is to find a “middle ground” by allowing many new Unicode characters and making them viable for our web 3.0 environment. We are not bound to follow the same standard as traditional domains. The goal is to allow for creativity, but not let malicious people spoof others by registering malignant/confusable domain names. The current implementation by raffy already accomplishes and fixes most of this.

We are seeing that people are currently most interested in visually appealing domain names like emojis/digits/numerals (Dune), which traditionally would be mapped down to punycode, which thankfully is not needed for ENS.

So the only argument for not allowing underscore domains would be that they are confusable with hyphens, but in my opinion, at least, they are plenty visually different from each other, as Nick has already pointed out. If you read the thread, there are much more appalling characters that are confusable, but in this case it’s very clear.

I would like to add to the argument for not allowing underscores in ENS domains: we want .eth domains to be accessible natively or through a gateway in browsers. Introducing the underscore as an allowable character may violate browser standards(?). This is an uninformed post, more a question than a suggestion.

If I’m interpreting this correctly, RFC 3986 allows underscores as an unreserved character in URIs (section 2.3), which should mean that browsers support them in the address field.

The only thing I could find for actual domain names is that underscores aren’t allowed in hostnames but are allowed for arbitrary records (CNAME, TXT, and so on), so they’re likely to be supported.
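
The two rules can be put side by side in a short Python sketch. The first pattern is the unreserved set from RFC 3986 section 2.3; the second is the traditional letters-digits-hyphen hostname label rule. The function names are made up for this example, and browsers may apply further checks of their own.

```python
import re

# RFC 3986 section 2.3: unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = re.compile(r"[A-Za-z0-9\-._~]+")

# Hostname label: letters, digits, hyphen; 1-63 chars; no leading or trailing hyphen.
HOST_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_unreserved(text: str) -> bool:
    """True if every character is in the RFC 3986 unreserved set."""
    return UNRESERVED.fullmatch(text) is not None


def is_hostname_label(label: str) -> bool:
    """True if the label satisfies the letters-digits-hyphen hostname rule."""
    return HOST_LABEL.fullmatch(label) is not None


print(is_unreserved("coca_cola"))      # True: fine as URI characters
print(is_hostname_label("coca_cola"))  # False: not a valid hostname label
print(is_hostname_label("coca-cola"))  # True
print(is_hostname_label("_dmarc"))     # False: a record name, not a hostname
```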

My personal opinion is also that underscores should be allowed. They’re significantly different characters from hyphens.

Agreed, sorry, I meant to reply to the above post.


This is already false for names with complex emoji and other characters. While punycode can encode all non-ASCII characters, almost all URL inputs preprocess with some version of UTS-46 + IDNA 2003, which mangles them.
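
A small Python illustration of one such mangling step, using only the standard library. IDNA 2003’s Nameprep maps the characters in RFC 3454 table B.1 to nothing, and that table includes the zero-width joiner (U+200D) and the variation selectors that complex emoji depend on. The helper name `strip_mapped_to_nothing` is made up, and this models only that single step, not what any particular browser does.

```python
import stringprep

# Family emoji sequence: man + ZWJ + woman + ZWJ + girl
FAMILY = "\U0001F468\u200D\U0001F469\u200D\U0001F467"


def strip_mapped_to_nothing(label: str) -> str:
    """Drop characters that RFC 3454 table B.1 maps to nothing (a Nameprep step)."""
    return "".join(ch for ch in label if not stringprep.in_table_b1(ch))


# Punycode alone is lossless: encoding and decoding returns the same sequence.
round_trip = FAMILY.encode("punycode").decode("punycode")
print(round_trip == FAMILY)  # True

# The Nameprep step removes both joiners, leaving three separate emoji.
mangled = strip_mapped_to_nothing(FAMILY)
print([f"U+{ord(ch):04X}" for ch in mangled])  # ['U+1F468', 'U+1F469', 'U+1F467']
```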

Point proven

SUBDOMAINS

This is worth consideration for the new normalisation function, too. How does the new normalisation function normalise names that have already been UTS-46 normalised? In what circumstances is ENSNORM(UTS46(name)) != ENSNORM(name)?

For valid names, this holds for everything that isn’t punycode (xn--...).
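
The question can be framed as a property check over a list of names. In the Python sketch below, `toy_uts46` and `toy_ensnorm` are deliberately simplified stand-ins (NFKC plus case folding, with only the UTS-46 stand-in decoding `xn--` labels); they are assumptions for illustration and do not reproduce either real algorithm. Only the harness, `composition_mismatches`, is the point.

```python
import unicodedata


def toy_uts46(name: str) -> str:
    """Toy stand-in: decode xn-- labels, then apply NFKC and case folding."""
    labels = []
    for label in name.split("."):
        if label.lower().startswith("xn--"):
            label = label[4:].encode("ascii").decode("punycode")
        labels.append(unicodedata.normalize("NFKC", label).casefold())
    return ".".join(labels)


def toy_ensnorm(name: str) -> str:
    """Toy stand-in: NFKC and case folding only, no punycode decoding."""
    return ".".join(
        unicodedata.normalize("NFKC", label).casefold()
        for label in name.split(".")
    )


def composition_mismatches(names, ensnorm, uts46):
    """Return the names where ensnorm(uts46(name)) != ensnorm(name)."""
    bad = []
    for name in names:
        try:
            if ensnorm(uts46(name)) != ensnorm(name):
                bad.append(name)
        except UnicodeError:
            bad.append(name)  # treat a failure in either function as a mismatch
    return bad


samples = ["Nick.eth", "\uFF21\uFF22\uFF23.eth", "xn--mnchen-3ya.eth"]
print(composition_mismatches(samples, toy_ensnorm, toy_uts46))
# ['xn--mnchen-3ya.eth']
```

With these stand-ins, only the punycode label differs, which is the shape of the claim above; a real report would plug in the actual implementations.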

I will peel off the testing part of ens-normalize.js into a separate repo to compare existing implementations against the reference and my compressed version, and include this as one of the reports.

Hey Raffy,
I came across someone online who developed a Chrome extension that can verify a Twitter user and check whether the .eth name they have posted as their display name is actually owned by them. I think it goes to OpenSea, scrapes their Twitter handle, and also checks whether the wallet owns the .eth name. The extension then adds a badge to “verify” that the Twitter account is linked to OpenSea and also linked to their .eth name.
Do you think this is something ENS would fund as a grant, do on their own, or be supportive of in any way? It could also maybe help with normalization in some way down the road.
I can paste this elsewhere, but I kind of wanted thoughts first to see if it even makes sense to start.

I’m not the person to ask, but your idea seems useful. You probably don’t want an OpenSea dependency and should instead resolve everything on-chain via a browser wallet (Infura) or via direct fetch. (Personally, I avoid most browser extensions, as they usually demand far too many privileges and make auditing difficult, but I’m a huge fan of Tampermonkey scripts.)


I’ve split up my repos:

Once I’ve finished splitting up ens-normalize.js (possibly this weekend), I’ll run the reference and compressed implementations through the Test Suite, and then I think my ENSIP is ready for consideration.

After that, I’ll finish the contract NFC implementation and confirm that it matches the reference implementation.

Edit:

Going back to the hyphen debate, I’m still unsure why you are looking to introduce the underscore.

It was not used in web2 TLDs; it has had limited use in web2 subdomains, and that is it

It is, in my view, visually very similar to the hyphen, so is it really needed?

Coca-Cola and all the other companies using a hyphen name would need to fight to get the name again to protect themselves, all at a time when we are trying to onboard people to ENS. This would also flow on to all the people who have registered hyphen names, including the recent rush to register hyphen emoji names. All those people would have the rug pulled out from under them.

Surely, if you are going to add another ASCII character, these would have more use cases:

@ - @username.eth

# - #123.eth

I do kinda get the $ addition, but I also don’t feel it’s needed when the emoji sign 💲 already works. Adding the $ will double up all the names, adding to confusion. If the $ is added, are you also going to do the Euro sign, the GBP sign, and the other currencies?

ENS is flooded with emoji names now, with plenty of different options, so why not just leave the ASCII characters as they are and were in web2?

If you keep changing the rules, nobody is going to trust ENS. How are you meant to onboard people if there is no trust?