Do you believe your ENSIP is ready to be advanced for standardisation?
Do you have an updated report of names that are invalidated or change their normalisation as a result of the new standard? I can send you a new list of registered names if need be.
Transposition is terrifyingly easy, I absolutely agree 100%!
That’s why I think popularizing ENS = Ethereum address as an absolute is dangerous. I’ve sent things to the wrong address from an unchecksummed OCR’d address before (meaning it was all lowercase). It’s not good enough just to have an address; you need the underlying checksummed Ethereum address always printed as a crosscheck, both to validate that the dapp developer didn’t fat-finger something themselves and to check that you didn’t do something wrong.
Whatever the result of normalization now or in the future, we need better filters and alerts for counterfeit names. With Arabic numerals it’s a real problem in marketplaces because the default alert is meaningless: for most digits, the Urdu, Kurdish, and Farsi forms are written exactly the same as the Arabic ones. This is leading to scams. There should probably be established language-group flags in the output.
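To illustrate why the default alert is meaningless here: digits from these scripts are distinct code points that render near-identically, so a marketplace filter could compare a digit “skeleton” instead of raw code points. This is just a sketch using Python’s standard unicodedata module; the function name is mine and this is not part of any ENS implementation.

```python
import unicodedata

def digit_skeleton(label: str) -> str:
    """Map every decimal digit to its ASCII value, so visually identical
    digits from different scripts collapse to the same skeleton.
    Illustrative sketch only, not part of any ENS implementation."""
    out = []
    for ch in label:
        d = unicodedata.digit(ch, None)  # None for non-digit characters
        out.append(str(d) if d is not None else ch)
    return "".join(out)

# Arabic-Indic "١٢٣" (U+0661..63) and Extended Arabic-Indic "۱۲۳"
# (U+06F1..F3) look the same but hash to different names:
assert digit_skeleton("\u0661\u0662\u0663") == "123"
assert digit_skeleton("\u06F1\u06F2\u06F3") == "123"
```

A marketplace could flag two listed names whose skeletons match but whose code points differ.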
Edit: And in programming, every single developer will be very suspicious if their code compiles the first time. It’s scary when that happens. Then you ask yourself what you did wrong to make it work first try. It’s much more reassuring when the compiler yells at you for a missing semicolon or extra bracket a couple times.
The changes I’m proposing (which ens-normalize.js, norm-ref-impl.js, and the resolver follow) are outlined in my ENSIP draft. They’re actually pretty minimal compared to what I was originally proposing (invalidating ~2% of names).
I have an updated list but I might be missing some.
Pretty much. Give me this evening to review the language in the ENSIP.
The latest ens_tokenize has the NFC Quick Check logic that only runs NFC on the minimal parts of the string. That’s the only piece the Solidity contract was missing to be a 100% match.
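For reference, the quick-check idea can be sketched with Python’s standard unicodedata module. The real ens_tokenize is finer-grained and normalizes only the minimal substrings that fail the check; this coarse version just skips the common already-normalized case.

```python
import unicodedata

def nfc_lazy(s: str) -> str:
    """Run NFC only when the quick check says the string may not already
    be normalized. The common case (ASCII, precomposed text) becomes a
    cheap property scan with no re-composition work. Sketch only; the
    actual implementation normalizes just the failing substrings."""
    if unicodedata.is_normalized("NFC", s):
        return s
    return unicodedata.normalize("NFC", s)

assert nfc_lazy("vitalik") == "vitalik"   # already NFC: returned as-is
assert nfc_lazy("e\u0301") == "\u00e9"    # decomposed é -> composed é
```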
They were already valid in IDNA 2003 / EIP-137. They aren’t valid in IDNA 2008 (which is why there is confusion.)
Validation will happen after normalization. The underlying idea is simple: each label gets checked to see if it’s safe for use. Potentially there is a full name check.
It’s hard to state exactly what names are safe but it’s easy to chip away at it. I think everyone agrees DNS labels are safe. So we can write a validator which returns true for names composed of DNS labels. From my calculations above, that’s 95% of names.
Each validator is simple to write because it doesn’t need to process emoji, it only deals with normalized characters, and most validators are single-script. There is also an efficient way to determine which validators could apply.
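A minimal sketch of the DNS-label validator described above, using the LDH rule (lowercase letters, digits, and hyphens that are neither leading nor trailing). The regex and function names are mine for illustration, not from the reference implementation.

```python
import re

# LDH rule: letters, digits, hyphen; no leading/trailing hyphen; 1-63 chars.
# Sketch of the "safe because it's a plain DNS label" validator, not the
# actual ENS reference implementation.
DNS_LABEL = re.compile(r"^(?!-)[a-z0-9-]{1,63}(?<!-)$")

def is_dns_name(name: str) -> bool:
    """True when every dot-separated label is a plain DNS (LDH) label."""
    return all(DNS_LABEL.fullmatch(label) for label in name.split("."))

assert is_dns_name("vitalik.eth")
assert not is_dns_name("caf\u00e9.eth")   # non-ASCII: needs another validator
assert not is_dns_name("-bad.eth")        # leading hyphen disallowed
```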
There is a question of what to do with this information.
For contracts or headless scenarios, you either just normalize (throw on invalid, allow unsafe) or normalize with the ENS recommendation (throw on invalid or unsafe).
For user input, like text fields, unsafe names should show a warning, but the name should still work if the user acknowledges. This requires a change to the ENS recommendation and various applications.
A power-user feature might be to limit the accepted charsets: maybe you want to be more strict than the ENS recommendation and warn on everything that isn’t composed of DNS labels. This could occur at the user-input level (“I want metamask to warn non-DNS names”) or at the application level (*.cb.id might just be strict DNS.)
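To make these scenarios concrete, here is one hypothetical shape such an API could take. The function name, the strict flag, the lowercasing, and the ASCII-only “safety” test are all invented stand-ins for illustration; they bear no relation to the real normalization or validators.

```python
import unicodedata

class DisallowedLabel(ValueError):
    pass

def ens_normalize_sketch(name, *, strict=False):
    """Normalize first, then validate each label: invalid always raises,
    while 'unsafe' raises only in strict mode and is otherwise returned
    as a warning for the UI to surface. Purely illustrative -- lower()
    and the ASCII test stand in for real normalization and validators."""
    norm = unicodedata.normalize("NFC", name.lower())
    warnings = []
    for label in norm.split("."):
        if not label:                       # empty label: invalid everywhere
            raise DisallowedLabel("empty label")
        if not label.isascii():             # stand-in for a real safety check
            if strict:
                raise DisallowedLabel(f"unsafe label: {label!r}")
            warnings.append(label)
    return norm, warnings
```

A headless caller would pass `strict=True` and treat any exception as a hard failure; a text field would call it non-strict and show the returned warnings for the user to acknowledge.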
That is not an error; it just means someone has manually registered the capital characters against the smart contracts. The name with uppercase characters will just normalize to the correct name with lowercase characters in all client wallets/sites. And that is true today as well, even before any of this updated normalization code goes into effect.
When/how was it last updated? If it’s no more than a couple of weeks old, that should give us a good idea.
Just to revive the underscore issue - what do you think of only permitting it as the first character? I know that would be a deviation from the rest of the function, which is position-independent, but it would allow service domains etc, without allowing it in arbitrary positions.
Edit: I see a lot of Arabic-numeral names in the diff-norm list. Is this due to there being two versions of certain characters in different alphabets? Do you know how often both normalisations are registered?
I assume you’re talking about this report? The formatting isn’t the best: “eth-ens-namehash-error” means the error only occurs in “eth-ens-namehash”, which is the official implementation.
A few days ago, 1.4M names.
This seems reasonable to me. Just a single underscore as the first character, or can there be multiple leading underscores?
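A sketch of the rule proposed above as a per-label check, assuming at most one underscore and only in the first position (the multiple-leading-underscore question stays open). The regex and function name are mine for illustration.

```python
import re

# "_" allowed only as the leading character of a label (service names
# like _dnslink), never in any other position. Illustrative sketch of
# the proposal, not the actual valid() function.
LEADING_UNDERSCORE_ONLY = re.compile(r"^_?[^_]*$")

def underscore_ok(label: str) -> bool:
    return bool(LEADING_UNDERSCORE_ONLY.fullmatch(label))

assert underscore_ok("_dnslink")     # service-style label: allowed
assert underscore_ok("plain")        # no underscore: allowed
assert not underscore_ok("foo_bar")  # arbitrary position: rejected
assert not underscore_ok("__x")      # multiple leading: rejected here
```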
Hey @raffy how long do clients usually take to integrate the updates after release? Say for example OpenSea (assuming it’s using the provided normalisation and not an in-house implementation) - how long does it usually take for them to implement the update? I’m assuming MetaMask will take the lead and implement it asap? Thank you.