ENS Normalization with {FE0F} inclusion - ENS Fee and more

kraft · January 19, 2023, 8:24am

Hi ENS DAO, I wanted to bring your attention to an important matter.

I had previously observed that some emojis are sequenced differently in the official Unicode releases than in what ENS normalization treats as official. This is because the FE0F part of the sequence is stripped off since, per @raffy, it contributes nothing to the name.

Consider this for an example -

In this you can see that only the first sequence is marked as fully qualified by Unicode; the rest are marked as unqualified or minimally qualified. But normalization resolves to the 4th one, which is marked as unqualified in this Unicode 15.0 release.

The same is the case with the rest of the emojis above and many more. Raffy told me that the FE0F part has been removed because it does not contribute anything to the sequence, which remains the same regardless. Some examples like the and as well. In this case too, it doesn’t matter which one you register; the normalisation caters only to the older version, which is
.

In the above case, it should have been a 5 character emoji, costing $160 instead of $640, and the current one that the normalisation points to should have been invalidated. But that’s not the case, since it was already registered when ENS was following the 2003 spec of Unicode. It was later noticed that this caused many anomalies due to FE0F: contrary to what was previously believed, FE0F was not unnecessary and was important in some emoji versions. But ENS ignored them and only normalised the non-FE0F versions.

There has been some discussion around this as well like -

and this one too, which has never been addressed by the DAO even though the problem is legit -

I had a chat with @raffy to understand what’s happening here, since in the above discussion Raffy too was advocating for including all ZWJ and FE0F sequences from the newer versions. But now his stance is also to let FE0F be stripped off.

This had already created some issues previously, but they were ignored by the DAO on the pretext that the individual was not registering the name from the main frontend. My argument against this is that they don’t need to. I’m able to use AAVE or UNI without using their main frontend/app. One must be able to register an emoji from the contract itself without depending on the frontend, just by referring to the Unicode release, and be sure that the fully qualified version is the one that gets normalised and the unqualified one is not. It’s too much to assume that someone registering an ENS name from the contract would also be reading the governance forums for something as basic as the inclusion of FE0F, a version that Unicode clearly marks as VALID but ENS marks as INVALID.

You should ofc not break the previous Emoji registrations for this, but it sure can be corrected for the new ones. Consider this example -

This was released last year ('22). Reading this, you’d assume ENS would have opted for the first one as the correct normalised sequence, since Unicode marked that one as fully qualified (note: marked). But it’s not so, and therefore instead of being a 4-character emoji it’s a 3-character emoji: same thing, but it now costs $640 instead of $160.

One should be able to look up the Unicode doc, find the sequence, read that ‘this one’s fully qualified’ and register that from the contract without using the app frontend. ENS should not be imposing their own frontend, which makes it seem more centralised. One can still register the FE0F version of the emoji, but since normalization doesn’t pick that up, it’s a waste of money, and it makes no sense not to follow what Unicode terms fully qualified. We’ve also already observed some anomalies this caused in previous versions, like the poop emoji and the snowflake one.

Now the next argument is why even change that now that it’s working - my main take on this is, the thing that should ideally cost me $160 or $5, costs me $640. It sure brings more revenue to the DAO but as a user I’m getting rekt. Even if I use the main frontend, it doesn’t solve the problem, the cost remains the same. Therefore what @raffy mentioned as ‘it doesn’t change anything’ is wrong, it does reduce the cost for the end user to register these.

Moreover there are more examples, like in this one, I’d just need to register the first emoji as min 2 emojis and it’d take 4 char and cost me 160$ but now it costs me 640$ and have to register min 3 emojis, which only gets normalized to the first version and ignores the 2nd version altogether.

And in this, both are 4 char emojis for what unicode ‘marked’ as fully qualified, but ENS ignores them and goes with the 3 char ones which are marked as unqualified by unicode.

We sure cannot break previous names, but we can improve this for the newer emojis involving the FE0F sequence. Apparently 578 more emojis are to be released, and hundreds of them use FE0F in the fully qualified version of their sequences. If the current FE0F stripping is not removed, it will keep costing more to register normal emojis like those, and normalization will keep catering to the version Unicode marks as ‘unqualified’.

It’s been an inefficiency on both ENS’s and Unicode’s end, but it can now be corrected at least by ENS. The DAO won’t earn an extra $1mm with this FE0F stripping; moreover, not correcting it is less user friendly and favors the DAO more than its community.

Open to a fruitful discussion where we think not only from the DAO’s perspective but also from the users’ pov, keeping the ethos of decentralization in mind. Normalization is the main bottleneck here and partly contributes to a centralized approach, but one should be able to look into Unicode releases, find the sequence ‘marked’ FULLY QUALIFIED, register that from the contract w/o the main frontend, and be sure that normalization won’t do anything crazy like ignoring a part of the sequence (FE0F) and raising the registration and renewal fee.

Tagging others as well who have been actively taking part in this topic previously -
@serenae, @nick.eth and also since we’re discussing ENS fees as well here, tagging @vbuterin for his take from this blogpost in https://vitalik.ca/general/2022/09/09/ens.html

edit:spelling

serenae · January 19, 2023, 2:38pm

I think the current rules are correct, FE0F should be stripped. That is not something that is changing with the new normalization library – FE0Fs were stripped before, and they’ll continue to be stripped after the updates.

I would be against selectively allowing FE0F on “newer sequences” because then you’d introduce arbitrary inconsistencies where some sequences normalize with the FE0F and some without. It’s harder and more error-prone to keep track of which sequences “should have FE0F” and then add it in, than it is to just strip FE0F out if you see it. And since the normalization should be consistent, all-or-nothing, I think the current method is the correct one.

Raffy does have that Beautifier tool that clients can use to restore FE0F for display purposes as well.

Yeah, the minimally-qualified version of an emoji sequence is one character less, which means it may be more expensive. But at least it’s consistent for everyone. If we made inconsistent special cases for new emoji sequences while preserving current ones, then that would be unfair to the registrants of the older names who may have to pay more in renewal fees than registrants of newer emoji sequences. DAO revenue does not and should not factor into this decision at all.
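To make the length argument concrete, here is a minimal Python sketch. It assumes that label length is counted in Unicode code points and that yearly prices are $640 for 3 characters, $160 for 4, and $5 for 5 or more; the dollar figures are the ones quoted in this thread, but the exact tier mapping is an assumption, and `annual_price_usd` is a hypothetical helper, not part of any ENS library.

```python
FE0F = "\ufe0f"  # VARIATION SELECTOR-16
ZWJ = "\u200d"   # ZERO WIDTH JOINER


def annual_price_usd(label: str) -> int:
    """Assumed yearly price tiers by code-point length (hypothetical helper)."""
    length = len(label)  # Python's len() counts code points
    if length < 3:
        raise ValueError("labels shorter than 3 characters are not priced here")
    if length == 3:
        return 640
    if length == 4:
        return 160
    return 5


# Eye in speech bubble, with and without FE0F.
fully_qualified = "\U0001F441" + FE0F + ZWJ + "\U0001F5E8" + FE0F
stripped = fully_qualified.replace(FE0F, "")

for name, label in [("with FE0F", fully_qualified), ("stripped", stripped)]:
    print(name, len(label), annual_price_usd(label))
```

Under these assumptions, a single-emoji label lands in the most expensive tier once both FE0F code points are removed, which is the effect being debated here.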

kraft · January 19, 2023, 2:50pm

There’s already some inconsistency in some pairs of emojis with FE0F absent and present. We’re talking about 578 more FE0F sequence emojis that are about to be released soon. The additional cost for a 1-character difference is not just a 1% or 2% increase; it’s several times higher when you remove FE0F. Many emojis in the newer version have more than one FE0F element, for directional emojis, and removing them turns a few of them from 5 characters into 3.

Also, I don’t think it’s unfair for previous ones in the sense that they’re already using the wrong version, DAO is already doing good to them by not stripping their version and validating the correct one for normalization. Why continue the same rule and inc the cost multiple times for users when you can correct it now.

serenae · January 19, 2023, 2:51pm

What inconsistency? Do you have an example?

kraft · January 19, 2023, 2:54pm

the snowflake emoji, poop emoji, frowning emoji and many more. You’ve even addressed one such eg here -

even if you let aside this inconsistency, the cost factor is still not solved

serenae · January 19, 2023, 2:54pm

There is no inconsistency.

All FE0Fs are stripped out in normalization for all sequences, it is consistent for everyone.
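The rule described here can be sketched in a few lines of Python. This is only an illustration of the FE0F-stripping step, assuming nothing else about the name changes; actual ENS normalization performs many more checks and mappings, and `strip_fe0f` is a hypothetical name.

```python
FE0F = "\ufe0f"  # VARIATION SELECTOR-16


def strip_fe0f(label: str) -> str:
    """Remove every FE0F, whatever sequence it appears in (illustrative only)."""
    return label.replace(FE0F, "")


with_vs = "\u2744\ufe0f" * 3   # three snowflakes, each followed by FE0F
without_vs = "\u2744" * 3      # three bare snowflakes

# Both spellings collapse to the same label, and stripping is idempotent.
assert strip_fe0f(with_vs) == without_vs
assert strip_fe0f(without_vs) == without_vs
```

Because the rule does not depend on which emoji surrounds the FE0F, no per-sequence table is needed for this step.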

kraft · January 19, 2023, 2:57pm

But when you do, you do realize some emojis are not in their latest form, right? Some emojis are in their latest versions with the FE0F form, like the snowflake or genie or poop emoji. Not everyone is using raffy’s tool to beautify either.

Currently, if I have to register an ENS emoji via the contract without using the main frontend, I’d need to refer to this governance forum, or raffy’s tool, for what’s working and what’s not.

serenae · January 19, 2023, 3:13pm

Some emoji sequences (even the ones that present as a single glyph) have more characters than others, but that’s just life. Unicode can be complex and weird, but I think the current rules do the best job at working within those confines while being fair to everyone.

Also, for all practical purposes it doesn’t matter to the average person. Anyone can just use either the fully or minimally-qualified version, and it will go to the same place (this is the point of normalization). As pointed out in that thread you linked.

Again, the FE0F stripping is not new. That’s the way it has always worked. If you go to the ENS manager and enter in ❄️❄️❄️.eth, it will automatically go to the page for ❄❄❄.eth. All other frontends should be doing the same thing. Metamask obviously does as shown above.

ENS.Vision does this properly as well: https://ens.vision/name/.eth

OpenSea does too:

kraft · January 19, 2023, 3:25pm

There are several emojis that are unqualified, not even minimally qualified, and still are used as the main normalised version. About the snowflake emoji too, my point was not which one is being normalised, ofc I know metamask, Open Sea and all others follow the one that ENS validates as the normalised version.

consider this eg, you’re not even using the minimally qualified one, you’re using the marked - unqualified version of the emoji. That does not make sense.

Raffy told me that FE0F stripping was brought into the picture because the previously registered ones didn’t use it when registering, like the eye-cloud one above, so you guys just stripped it off.

To address your point of being fair, I don’t think it’s such, those who already own these - for eg the eye cloud emoji, is using the unicode marked invalid emoji and is up for sale for 1000eth. Heart on fire emoji had the biggest sale too, while its seq too is marked unqualified by Unicode. DAO is already being fair to them by letting it be.

What’s unfair to me is the cost factor that you’re not addressing: these additional 500+ emojis would be several times cheaper to register if it were with FE0F (the marked qualified version of the emoji).

here’s an eg of some new ones -

serenae · January 19, 2023, 3:49pm

The “cost factor” is not an issue. The FE0F stripping is something that has always been done, so everyone has always played by the same rules. Some emoji sequences have fewer characters than others, that’s life.

I think stripping FE0Fs is still the correct course, and in that example, it still presents the same way in browsers anyway:

If someone went directly to the contracts and registered some other variation with FE0F, then that was their mistake.

It seems that you are hinging a lot of your argument on the idea that unqualified sequences are “unicode marked invalid”. You’re using the word “unqualified” as if it’s supposed to be “invalid”, but that is not the case. It’s just a technical term for a form of the emoji sequence. From the Unicode spec:

ED-17a. qualified emoji character — An emoji character in a string that (a) has default emoji presentation or (b) is the first character in an emoji modifier sequence or (c) is not a default emoji presentation character, but is the first character in an emoji presentation sequence.

ED-18. fully-qualified emoji — A qualified emoji character, or an emoji sequence in which each emoji character is qualified.

ED-18a. minimally-qualified emoji — An emoji sequence in which the first character is qualified but the sequence is not fully qualified.

ED-19. unqualified emoji — An emoji that is neither fully-qualified nor minimally qualified.

Later in the spec, Unicode also notes this:

minimally-qualified or unqualified emoji ZWJ sequences may be handled in the same way as their fully-qualified forms; the choice is up to the implementation.

The point is that with normalization, all different forms should normalize to the same thing. So all four forms you have listed there will normalize to 👁‍🗨.eth. In my opinion the most sensible and least error-prone way to do that is to be consistent across the board and strip FE0Fs in normalization.

Again, it doesn’t change any utility for the user. Anyone can use any of those forms with FE0F. Any frontend can display whichever form they want. It doesn’t matter, because the normalization library guarantees that all of those forms will point to the same registered ENS name.
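The ED-17a to ED-19 definitions quoted above can be turned into a small classifier. The following Python sketch is a simplification under stated assumptions: it treats every code point other than FE0F and ZWJ as an emoji character, ignores keycap and tag sequences, and takes the set of default-emoji-presentation characters as a parameter (`default_emoji`, a hypothetical argument that a real tool would fill from Unicode data files).

```python
FE0F = "\ufe0f"  # VARIATION SELECTOR-16
ZWJ = "\u200d"   # ZERO WIDTH JOINER
MODIFIERS = frozenset(chr(cp) for cp in range(0x1F3FB, 0x1F400))  # skin tones


def qualification(seq: str, default_emoji=frozenset()) -> str:
    """Classify a sequence per ED-17a..ED-19 (simplified sketch)."""
    flags = []
    for i, ch in enumerate(seq):
        if ch in (FE0F, ZWJ):
            continue  # selectors and joiners are not emoji characters
        nxt = seq[i + 1] if i + 1 < len(seq) else ""
        flags.append(
            ch in default_emoji or ch in MODIFIERS  # (a) default emoji presentation
            or nxt in MODIFIERS                     # (b) starts a modifier sequence
            or nxt == FE0F                          # (c) starts a presentation sequence
        )
    if not flags:
        raise ValueError("no emoji characters found")
    if all(flags):
        return "fully-qualified"
    return "minimally-qualified" if flags[0] else "unqualified"


EYE, BUBBLE = "\U0001F441", "\U0001F5E8"  # both default to text presentation
forms = [
    EYE + FE0F + ZWJ + BUBBLE + FE0F,
    EYE + ZWJ + BUBBLE + FE0F,
    EYE + FE0F + ZWJ + BUBBLE,
    EYE + ZWJ + BUBBLE,
]
for form in forms:
    print(len(form), qualification(form), form.replace(FE0F, "") == forms[3])
```

All four forms differ only in where FE0F appears, so removing FE0F maps each of them to the last, shortest form, whatever its qualification label.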

kraft · January 19, 2023, 4:32pm

The cost factor is an issue though: what would cost $160 or $5 now costs $640. I don’t understand where the FE0F stripping came from, or why it was suggested in the first place when things were fine before.

Your point of someone using the smart contract to register being a mistake sounds wrong too, why impose one particular frontend on the masses when you should keep it as such that people should be able to register the right normalised sequence.

I’m aware of these ED points from the Unicode spec, but that’s also my point: however it’s marked, why choose the version that is the costliest to register? I also didn’t intend to make it sound invalid, that’s why I used ‘marked’; I’m aware they all work the same.

Choosing the non-FE0F versions is not error-free either; I’ve already mentioned some examples above which show that using FE0F was the right move and would have easily avoided wrongly normalised registrations. Even in that case the utility won’t change btw.

I still don’t understand though: the cost multiplies 5x, or in some cases 64x, which is not like a 10-20% increase, and for so many emojis. Also, I’m not raising something new; as mentioned in my original post, it had been discussed previously that retaining FE0F in ZWJ sequences in the newer versions seems to be the correct move.

That’s not life, it’s not like it cannot be corrected in the newer version if it’s always been this way in the current version, knowing well that if corrected the users won’t have to bear additional cost which is multiple times of it. Earlier today Raffy was giving me an eg of a=A in normalization so {no FE0F}= with FE0F, the argument doesn’t stand cost wise though.

kraft · January 19, 2023, 4:35pm

Also, point out to me how it’s unfair to include FE0F in the newer version. You saying that registering via the contract without our frontend was their mistake is equivalent to me saying that registering 👁‍🗨.eth with the 3-char sequence was wrong. We can keep going in this loop, but I don’t see how decreasing the cost for newer emojis harms the previous ones. They too registered an arbitrary version.

serenae · January 19, 2023, 4:35pm

You have decided that it’s “wrongly normalized”. I disagree. If this is the crux of the argument then I suppose we’ll just have to agree to disagree.

kraft · January 19, 2023, 4:40pm

I’ve already pointed out the examples of the snowflake, poop and genie emojis, and so on; they don’t look the same without the FE0F. Also, you saying that browsers show the same thing anyway doesn’t make sense. Why must we care which medium shows what when we know removing FE0F does create a difference for a select few?

The right move would have been to strip off the already registered non-FE0F ones, since not including FE0F in some, like the snowflake, makes it a wrong decision. But ofc we shouldn’t break previously registered emojis.

All we can do is correct it here on and reduce the cost.

And yes, wrongly normalised is the right word to describe that else we won’t have the snowflake issue.

some discussions around this -

kraft · January 19, 2023, 4:49pm

you saved already registered emoji domains like👁‍🗨.eth, .eth and .eth by sabotaging future registrars and the existing who all would have to pay nearly 5x or 64x just because the DAO decided to move ahead with a version that also unicode marks as unqualified. Whether qualified or unqualified they all look the same so it didn’t make sense to make this move in the first place. I’ve also demonstrated examples of issues it created in a selected few like the snowflake emoji.

kraft · January 19, 2023, 4:53pm

Lastly, the DAO previously issued refunds to users who registered domains from the frontend and were wrong. Why not do that for non-FE0F ones like 👁‍🗨.eth, .eth and .eth, instead of increasing the price 5x or 64x for future registrars?

Assuming ENS lives on, this cost would only compound.

serenae · January 19, 2023, 4:57pm

Your recent comments haven’t really brought anything new, I’ve already addressed all those points (and misunderstandings) in my previous comments, so I’ll leave it at that.

kraft · January 19, 2023, 5:01pm

You’ve been addressing a portion of my responses. Haven’t found a response to these inefficiencies -

cost is 5x or 64x due to this
why choose a version of the emoji seq that is costlier, why not refund the already registered ones and correct this issue. The issue is there and I’ve pointed that out with a single snowflake eg.
why not improve that for the upcoming 578 emojis when you know that saves cost for the users.
why is the DAO defending 👁‍🗨.eth, .eth and .eth in this version of the normalisation and ditching others, when the FE0F version would not have any issues in the first place
and lastly what happens if 5 years down the line FE0F becomes somewhat important, like it is for a select few to distinguish (eg - snowflake versions); we’re not talking a few months but years. If FE0F becomes relevant for many future emojis, would the DAO ditch all the current non-FE0F domains.

You’re only assuming that FE0F may not be useful.

What would the DAO do if they do become relevant? what’s the next course of action if your assumption turns out wrong in the future 5 or 10years down the line

raffy · January 19, 2023, 7:42pm

You are correct that my original idea was to grandfather all the old/unqualified emoji (without FE0F) and normalize future emoji to their fully-qualified forms. There was an implementation that worked like this.

However, every unqualified emoji sequence is 1:1 with the corresponding fully-qualified form. There are no ambiguities. Since registrations so far have removed FE0F, I see no problem with continuing this trend. I don’t foresee Unicode releasing a compound emoji where parsing varies with the presence of FE0F. See: Emoji Variation Selector Notes.

Even if future emoji were normalized with FE0F, old emoji (depending on platform) would still require some form of beautification to standardize their presentation. The same applies to keycaps, which are the most popular emoji type and the most mangled by the loss of FE0F. The use of beautification makes the lack of FE0F irrelevant for all emoji.

For reference, IDNA 2003 (the only UTS-46 flavor that supports emoji) strips FE0F. Most web browsers use IDNA 2003. EIP-137/ENSIP-1/ens-normalize.js use IDNA 2003.

The only benefit of allowing FE0F is to improve the appearance of raw normalized names that contain future emoji (without beautification) and reduce registration costs for the few who registered the shortest variants (since this introduces an additional superfluous invisible character.)
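The beautification idea mentioned above relies on each stripped sequence mapping back to exactly one fully-qualified form. A minimal Python sketch of that lookup follows; it is not the actual Beautifier tool, the two-entry `FULLY_QUALIFIED` table is a hypothetical stand-in for a complete list of emoji sequences, and `beautify` is an illustrative name.

```python
FE0F = "\ufe0f"  # VARIATION SELECTOR-16

# Hypothetical, tiny table of fully-qualified sequences; a real tool would
# load the complete list from Unicode data.
FULLY_QUALIFIED = [
    "\u2744\ufe0f",                            # snowflake
    "\U0001F441\ufe0f\u200d\U0001F5E8\ufe0f",  # eye in speech bubble
]

# Stripped form -> fully-qualified form (1:1 for the entries above).
RESTORE = {seq.replace(FE0F, ""): seq for seq in FULLY_QUALIFIED}
MAX_LEN = max(len(key) for key in RESTORE)


def beautify(normalized: str) -> str:
    """Re-insert FE0F for display, using greedy longest-match lookup."""
    out = []
    i = 0
    while i < len(normalized):
        for size in range(min(MAX_LEN, len(normalized) - i), 0, -1):
            chunk = normalized[i:i + size]
            if chunk in RESTORE:
                out.append(RESTORE[chunk])
                i += size
                break
        else:
            out.append(normalized[i])  # not an emoji we know; copy as-is
            i += 1
    return "".join(out)


print(beautify("\u2744\u2744\u2744.eth"))
```

The stored, normalized name stays FE0F-free; only the displayed string changes, which is why the approach works for old and new emoji alike.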

kraft · January 20, 2023, 11:35am

Quoting from the same Unicode source you’ve put above: they have put it as an option for implementations, and do mark at least some old ones as defective. We still could have chosen the more cost friendly path, right? Nothing stops us from doing that.

I don’t think this part too is correct, there are no more interpretations after normalization because you stopped it at the normalization stage itself. As shown above not having FE0F has created issues with emoji versions of snowflake and a few more. Why are we assuming that things would remain as you think they would. The scenario would be different had ENS been part of the consortium itself, but rn we rely on any info released from them.
a clear example of frowning face emoji.

Are you sure this would always remain the case? We’re talking about ENS in general; a change right now would be in effect for decades if not corrected in the early years. What happens if the assumptions are wrong: would the DAO strip the old non-FE0F ones and legitimize the FE0F ones in case something obvious comes out in support of FE0F?

Again, as I pointed out, this number may be small in registrations, but cost wise they could become 5x to 64x, which is not a mere 10-20% jump. Furthermore, who are we to punish specifically these selected users by charging them heavily? The DAO didn’t (or couldn’t) do anything about future registrations of the black bird emoji and many more like that. Then why hold only these selected FE0F ones accountable, when Unicode suggests there is nothing wrong with using FE0F and you have the option to reduce cost without creating any new issues?

To me it seems like we rely a lot upon speculative texts in the Unicode doc, which made us strip FE0F from thousands of emojis. We’re not sure whether FE0F becomes relevant in the future, say 5 years down the line. Unicode may not be aware of the effect this has on us; emojis are scarce and hold value.

Also, the number of combinations that could be made with some emojis like this is not small, and thus the removal of FE0F does not affect just a very small selected set.

In my current understanding we’re not ready for changes where FE0F becomes necessary, whether from Unicode’s end or if some browser starts to use them.

More so, we’d have a few more emojis to register had it not been for FE0F stripping; very few, but still.

My suggestion would be to bring it into effect at least for the newer 578 ones, and also for the DAO to register them asap and auction them later, after they are officially released. I don’t see it as a lot of additional work.

I also suggest the DAO become a member of the Unicode Consortium like other important orgs are; in ENS’s case emojis matter a lot. Instead of assuming something, you’d have the right to vote within the Unicode system.

(for some reason I seem to have been rate limited due to the number of replies therefore late response)