ENS Name Normalization

We love the underscores. I run a Discord group with many BAYC and MAYC holders, and many of them have refrained from minting a name because underscores are not allowed. I think this is much needed and will onboard even more users.

Point proven, just had this sent to me:

Coca_Cola.eth was registered on the 26th

Could say the same about your last 5 or so posts.

The only argument that can be made against underscores is that they are confusable with hyphens, which they clearly aren’t if you look at the issue objectively. I think most reasonable people would agree here.

These are the official confusables for the hyphen: Unicode Utilities: Confusables

For underscore: Unicode Utilities: Confusables

Additionally, Nick has already pointed out that underscores are in fact valid in regular domain names (but not in hostnames), so there is even less of an argument there.

And can we please stop pretending that cocacola and gstarraw are in any way negatively impacted by allowing the underscore? G-StarRAW.eth is even using uppercase letters on their Twitter, clearly showing they are not interested in correctly portraying an ENS address anyway.

I trust raffy’s and Nick’s expertise in this matter, since they have been at the forefront of getting a handle on this normalization for more than half a year now. Raffy’s argument for allowing as many non-confusable ASCII characters as possible sounds very reasonable, since they are the most easily accessible symbols.

It would be nice to get back on topic, since this discussion seems rather pointless when the matter is this clear-cut.


“I trust raffy’s and Nick’s expertise in this matter”

Raffy was for mapping the underscore to the hyphen

ONLY Nick was against it…

ENS is not case sensitive, so lower case is not “the correct way”. People will use upper case in advertising, and this WILL translate into people using uppercase when typing names in.

Have you looked at what is currently getting registered via the Twitter bot??

People are front-running the possible rule change by bypassing the official ENS registration page

As I’ve said, you have created a beast in ENS. Now is the time to try to control it rather than lose control of it

https://twitter.com/ensregistry

There’s a coca_cola on Instagram as well. Coca-Cola didn’t reach out to buy it. It’s a dead account.

What about allowing domains to start with a - or _ in the next update? If these characters are allowed to be entered at all, why not also allow names to start or end with them?

So -nick- or _nick_ for example. I don’t see any harm in allowing these at any position in the name.


Already getting minted on the chance it happens

-7-.eth looks clean, only 10 possible (all gone) :wink: these are total grail names

Single-digit palindrome, the palindrome floor of the 999 is quite high, so I expect these to go for top $ as they have no numbers on the edges and are thus 10x as rare as any 999 palindrome

I also saw someone do an underscore and then a keycap digit

$-$.eth has also gone

-420- has gone

_69 has gone

69_ has gone

someone has been minting the N_N names

If you are asking about it you are too late

Stylistically a leading hyphen or underscore makes a ton of sense as an identity. This is allowed in almost every online game where you can choose your handle. There are many --#1-- type usernames.

It’s not like the unicode letters are compatible with Web 2.0 as it is. The ability to finally have your own language in a domain is amazing. That’s the killer difference of ENS. It is actually universal.


Yes, so many gaming and poker handles start with - or _, and it would make sense they would want to just have the same option here.

The reason people put an underscore or a hyphen at the start or the end of a user-name is that they want a specific user name but can’t get it, so they copy-cat the name by putting in the hyphen or underscore before/after

In ENS if you had 7 or -7- or --7 or 7-- or 7 or 7 for example, it is like having a single digit, but the cheaper version, you could even go -7_ or -_7 etc etc, all are a single digit, but the cheaper version

It does mean there will be cheaper versions of all names due to the dilution eg 474- or 474_ or _474 or -474 etc etc

It also means that some of the new groups are rarer than existing ones like the 999: there are only 100 -NN names, only 100 NN- names, and only 10 -N- names, all with the same holding cost per year as the 999 names. These names are all copy-cats of 2- or 1-digit names, which are obviously not available at this time
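The group sizes quoted above are easy to check by brute force. This is only a rough sketch: the pattern syntax (`N` for any digit) and the helper name `expand` are made up for illustration and are not part of any ENS tooling.

```python
from itertools import product

DIGITS = "0123456789"

def expand(pattern: str) -> list[str]:
    """Expand a pattern into concrete labels: 'N' is any digit, anything else is literal."""
    slots = [DIGITS if ch == "N" else ch for ch in pattern]
    return ["".join(combo) for combo in product(*slots)]

# Group sizes for the patterns mentioned in the post.
counts = {p: len(expand(p)) for p in ("NNN", "-NN", "NN-", "-N-")}

# Three-digit palindromes (e.g. 474) within the 999 group.
palindromes_999 = [name for name in expand("NNN") if name == name[::-1]]

print(counts)                # {'NNN': 1000, '-NN': 100, 'NN-': 100, '-N-': 10}
print(len(palindromes_999))  # 100
```

So the 10 -N- names are a tenth the size of the 100-name palindrome subset of the 999, which matches the "10x as rare" figure mentioned earlier in the thread.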

Every copy-cat name is also cheaper to register per year than the name they are copying due to the additional character/s (until it gets to 5 characters)

I don’t know if this is something ENS wants to happen, but I’m sure we will find out with the normalisation that is coming and if this is the angle that is wanted or not

Edit: Just noticed that some underscores are not showing up around the 7 number as I’m guessing it makes some sort of coding

yet another problem…nothing wrong with the hyphen versions though…

Pic:

What is the website to check ENS normalization?

You can use raffy’s tool here: ENS Resolver

Edit: Just noticed that some underscores are not showing up around the 7 number as I’m guessing it makes some sort of coding

ser:

__7__


If you wrap it in backticks (“`”), that will escape (ignore) the markdown formatting :wink:


@3070.eth
First and foremost, we are glad to see this is your first post! We welcome you. Each individual utilizes different dApps, tools and services and it’s important that you share issues you come across. That helps the whole project succeed. Input from the community is valuable and highly encouraged. Again, thanks for sharing.

Could you please elaborate on the conditions that caused this error?

I don’t think underscores should be disallowed on the basis that they devalue existing names. Underscores are hardly confusable and, to use your example, no one refers to Coca-Cola as “Coca_Cola”, just like no one refers to Google as “goog1e”.

This is true. While underscores are allowed by DNS, they aren’t allowed in hostnames. To me, the “web2-bridge” features (eth.link/eth.limo, .xyz/.luxe/.club, DNSSEC, etc) are the most compelling use-cases for ENS (if not the most compelling use-cases for blockchain in all of crypto), so I think ENS should strive to ensure that all .eth domains work within this context. If users see that “http://example-domain.eth.limo” works, they might be surprised that “http://example_domain.eth.limo” breaks. Selling domains which are known to break with web2 could be considered user-hostile.

To my knowledge, Unicode letters are fully compatible with web2 via punycode. Allowing “_” and “$” would be the first major time that ENS breaks with web2 (which I think would be unfortunate).
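For anyone who hasn’t seen the punycode round trip, here is a minimal sketch using Python’s built-in `idna` codec. The helper names are made up for illustration, and the stdlib codec implements the older IDNA 2003 rules rather than whatever ENS or a given browser uses, so treat it as a demonstration of the idea only.

```python
def to_ascii_hostname(name: str) -> str:
    """Encode each label of a Unicode name into its ASCII ("xn--") form."""
    return name.encode("idna").decode("ascii")

def to_unicode_hostname(name: str) -> str:
    """Decode an ASCII ("xn--") name back into Unicode for display."""
    return name.encode("ascii").decode("idna")

ascii_form = to_ascii_hostname("münchen.example")
print(ascii_form)                       # xn--mnchen-3ya.example
print(to_unicode_hostname(ascii_form))  # münchen.example
```

The Unicode label survives as plain letters, digits and hyphens, which is why it keeps working with web2 hostnames, while “_” and “$” have no such escape hatch.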

Can you elaborate on this? What specific technical use-cases are solved by allowing underscores? In DNS, underscores are only used for service records, which I’d consider most similar to ENS text records (where underscores are allowed). Absent that, I can’t really think of any compelling technical arguments for why ENS should break with UTS46 on this, but am interested in hearing why you think they’re necessary.

From UTS46:

IDNA2003 provides for a flag, UseSTD3ASCIIRules, that allows for implementations to choose whether or not to abide by the rules in [STD3]. These rules exclude ASCII characters outside the set consisting of A-Z, a-z, 0-9, and U+002D ( - ) HYPHEN-MINUS. For example, some browsers also allow characters such as U+005F ( _ ) LOW LINE (underbar) in domain names, and thus use UseSTD3ASCIIRules=false, plus their own validity checks for the other ASCII characters.
While UseSTD3ASCIIRules=true is strongly recommended, Section 5, IDNA Mapping Table provides data to allow implementations to support UseSTD3ASCIIRules=false for compatibility with IDNA2003 implementations where necessary.
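To make the flag concrete, here is a rough sketch of what the quoted passage describes for ASCII labels. The function name and the `extra_allowed` parameter are hypothetical stand-ins for an implementation’s “own validity checks”; this is not taken from UTS46 or from any ENS library.

```python
import string

# STD3 rule quoted above: only A-Z, a-z, 0-9 and HYPHEN-MINUS are allowed.
STD3_ALLOWED = set(string.ascii_letters + string.digits + "-")

def ascii_label_allowed(label: str,
                        use_std3_ascii_rules: bool = True,
                        extra_allowed: str = "_") -> bool:
    """Check the characters of one ASCII label.

    With use_std3_ascii_rules=False the caller opts in to extra characters
    (hypothetical parameter; here just the underscore by default).
    """
    allowed = STD3_ALLOWED if use_std3_ascii_rules else STD3_ALLOWED | set(extra_allowed)
    return bool(label) and all(ch in allowed for ch in label)

print(ascii_label_allowed("example-domain"))                              # True
print(ascii_label_allowed("example_domain"))                              # False
print(ascii_label_allowed("example_domain", use_std3_ascii_rules=False))  # True
```

Note that turning the flag off does not decide anything by itself: every extra character still has to be allowed explicitly, which is the extra surface the quoted recommendation is warning about.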

In my opinion, there should be a pretty high bar for ignoring this recommendation.

I think the main draw for many here is the exact opposite though, namely not being held to the same strict and very old ruleset that web 2.0 domain names are held to. I’d argue that the success of ENS is due to this fact, as you can see from the plethora of communities that are forming around various Unicode subgroupings right at this moment (https://www.ens.vision/). Interest is growing day by day, which in turn makes the whole ENS ecosystem thrive.

Coming from the emoji ENS community personally, I am glad that emojis in ENS are not devalued to punycode and that they can be used as a form of expression, which in turn makes them valuable and collectible. This is why the normalization thread was originally created by raffy: to make emojis work properly within the ecosystem, something that is entirely different from traditional domain naming systems, where emoji support was a mere afterthought (punycode) and is therefore basically not used at all, which is a shame. The same goes for any Unicode web 2.0 domain names really. They are not used because punycode is not a visually appealing solution. ENS is doing the right thing, in my opinion, by refraining from using punycode mappings.

The argument that Unicode is supported in web 2.0 doesn’t make sense in an ENS context, since punycode is visually uninteresting and not something anyone would create a community around. ENS is popular because Unicode is widely supported. Allowing more creative freedom by supporting additional Unicode symbols without punycode, while simultaneously reducing the issue of confusable symbols/ZWJ/spoofing, should be the goal of this whole effort (an onchain solution being the holy grail), rather than adhering to web 2.0 standards forever so as not to break compatibility.

I don’t want to speak for anyone, but the last months have shown that trading and owning a visually interesting and unique web 3.0 identity alias in the form of an .ETH address is something way more people are interested in than having your eth domain be compatible with your web 2.0 domain, which is really not the main draw for many of us. :hugs:

Agreed. The growth in ENS over the past few months cannot be ignored or overlooked.

There already are names that don’t survive the widely implemented IDNA preprocessing before punycode, there’s ambiguity about puny literals, and there are names that transform differently between the most popular browsers.
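One way to read the puny-literal ambiguity: a label typed as Unicode and a label typed as a raw “xn--” string can end up as the same DNS name, so a naming system has to decide whether the “xn--” spelling is a plain string or an encoded one. A small sketch with Python’s stdlib `idna` codec (IDNA 2003 rules, helper name made up for illustration):

```python
def dns_ascii_form(label: str) -> str:
    """Return the ASCII form DNS would carry for a single label."""
    return label.encode("idna").decode("ascii")

unicode_label = "bücher"
puny_literal = "xn--bcher-kva"

# Two different input strings...
print(unicode_label == puny_literal)  # False
# ...that are indistinguishable once they reach DNS.
print(dns_ascii_form(unicode_label) == dns_ascii_form(puny_literal))  # True
```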

My suggestion is that the official registration app should state the compatibility w/DNS. There are 3 possibilities:

  1. will encode reliably
  2. must be punycoded beforehand (some complex emoji)
  3. cannot be punycoded (check hyphen violations, _$, puny literals, invalid puny: xn--💩, etc.)

I don’t see anything wrong with a small percentage of names being incompatible with DNS.
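A registration app could surface those three buckets with a check along these lines. This is a deliberately coarse sketch: the enum and function names are made up, the input is assumed to be a single already-lowercased label, and every non-ASCII label is put in bucket 2, whereas telling bucket 1 from bucket 2 for Unicode labels would need a real IDNA/UTS46 implementation.

```python
import re
from enum import Enum

class DnsCompat(Enum):
    ENCODES_RELIABLY = 1
    NEEDS_PUNYCODE = 2
    CANNOT_ENCODE = 3

LDH = re.compile(r"[a-z0-9-]+")  # letters, digits, hyphen

def classify_label(label: str) -> DnsCompat:
    """Roughly sort one lowercased label into the three buckets listed above."""
    # Leading/trailing hyphen, or "--" in positions 3-4 (covers "xn--" literals).
    hyphen_violation = (
        label.startswith("-") or label.endswith("-") or label[2:4] == "--"
    )
    if not label or hyphen_violation or any(ch in "_$" for ch in label):
        return DnsCompat.CANNOT_ENCODE
    if LDH.fullmatch(label):
        return DnsCompat.ENCODES_RELIABLY
    if label.isascii():
        return DnsCompat.CANNOT_ENCODE  # other ASCII punctuation
    return DnsCompat.NEEDS_PUNYCODE     # coarse: all remaining Unicode labels

for label in ("coca-cola", "bücher", "coca_cola", "-7-", "xn--💩"):
    print(label, classify_label(label).name)
```

Even this toy version shows where bucket 3 comes from: it is exactly the set of names that the underscore, dollar sign and edge-hyphen proposals would newly allow.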


The reason people put an underscore or a hyphen at the start or the end of a user-name is that they want a specific user name but can’t get it, so they copy-cat the name by putting in the hyphen or underscore before/after

That’s one reason, but a leading underscore also has meaning in many programming languages. When you use a leading underscore on a variable, it conventionally signals something about it, for example that it is an input parameter or an internal name. So _trueArtist means something different than just TrueArtist not being available.

Secondly, there are stylistic reasons in names, such as ----OCD---, Drunk_, XxUNDEADxX, --James--, or -_- (someone regged that already).

In principle, you are right, it increases supply, but I bet a vote for -8-, 008, or __8 zero wouldn’t be close if people had no idea of the 999 Club. This is because walled digits are a niche. Our group holds 999 domains, but we would be in favor of leading hyphens and underscores because it makes sense from a gamer perspective.

It’s not easy to create a ruleset that lasts decades, supports billions of users, and is agreed upon by every stakeholder in the web2 ecosystem. A lot of thought has been put into those rules, and it would be foolish for ENS to dismiss them out of hand. But ENS is different from DNS, not all of the old rules still apply, and I agree that ENS should take the opportunity to consider modifications which make sense for ENS. I just don’t think allowing underscores actually improves the ENS product. On the contrary, it makes it more “edge case-y”.

Emoji is a good example where the cost/benefit for ENS was particularly favorable. As you mention, they open up a whole new class of creative and colorful domains, with the cost of breaking some legacy web2 infrastructure (but not all web2 infrastructure, thanks to punycoding). In contrast, the benefits of setting UseSTD3ASCIIRules=false are much more modest, but the cost is greater because these domains completely break web2 compatibility. Unless there are other technical benefits which I’m discounting, I just don’t see the cost/benefit favorably for this particular change.

1 is the happy path. 2 is unfortunate, but largely out of ENS’s control (and possibly changeable with active involvement in the Unicode Consortium), and I agree that users should know about this before registration. Currently, there is no 3, so why would we inflict that complexity burden on ENS and ecosystem developers if we didn’t absolutely need to?