ENS Name Normalization

Unicode 15 has been officially released. I don’t know whether we want to include the new emoji now or wait until there’s platform support (I say include them now, since they’re official). I’ll also check what other differences show up when I switch to the latest Unicode data files.


Platform support takes years; Windows still doesn’t support Unicode 14. I say just include them now, so we don’t have to release a new version of our libraries and then get ethers, MetaMask, and everyone else to update again.

  1. 2019 (’) RIGHT SINGLE QUOTATION MARK should be valid.

This allows names and surnames to be used on the Ethereum Name Service (O’Brien, O’Donnell, Kevin O’Leary, etc.).


Okay, should we enforce:

  • Can’t touch another 2019? e.g. ’’’.eth
  • Can’t start the label? End the label?
  • Can’t touch an emoji? e.g. ❤️’.eth
    (this would be similar to the combining mark rules)

Since ' [27] has always been disallowed, we could map it to 2019 for UX.

  • Can’t touch another 2019
  • Can’t start or end label
  • Can touch an emoji

Mapping ' [27] to 2019 is perfect.
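
For reference, here is a minimal sketch of how the rules agreed on above could be checked for a single label. The function name and error messages are hypothetical and are not taken from ens_normalize; it only covers the 27 => 2019 mapping and the three placement rules (no adjacent 2019, no leading or trailing 2019, touching an emoji is fine).

```javascript
const APOSTROPHE = /'/g;   // U+0027, mapped rather than valid
const RSQM = "\u2019";     // U+2019 RIGHT SINGLE QUOTATION MARK

// Hypothetical helper: map U+0027 to U+2019, then enforce the placement rules.
function normalizeApostrophes(label) {
  const mapped = label.replace(APOSTROPHE, RSQM);
  const cps = Array.from(mapped); // iterate by code point, not UTF-16 unit
  cps.forEach((cp, i) => {
    if (cp !== RSQM) return;
    if (i === 0) throw new Error("label cannot start with U+2019");
    if (i === cps.length - 1) throw new Error("label cannot end with U+2019");
    if (cps[i + 1] === RSQM) throw new Error("U+2019 cannot touch another U+2019");
  });
  return mapped;
}

console.log(normalizeApostrophes("o'brien")); // the apostrophe is replaced by U+2019
```

Neighbouring emoji are deliberately not inspected, so a label such as ❤️’s passes this check.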


Where can we see the 250 names with 2044?


Bunch of random updates:

  • During the cleanup of the code that derives the spec from the Unicode files and the rules we’ve been developing, I discovered that there are 3 Modifier_Base emoji whose Modifier_Base + Modifier sequences are not RGI. I had mistakenly assumed that every combination was RGI. I assume we still want to include them (via the whitelist)?
Emoji Whitelist Additions
// missing MOD_BASE + MODIFIER combinations
// (👪) FAMILY  
'1F46A 1F3FB',
'1F46A 1F3FC',
'1F46A 1F3FD',
'1F46A 1F3FE',
'1F46A 1F3FF',
// (👯) WOMAN WITH BUNNY EARS 
'1F46F 1F3FB',
'1F46F 1F3FC',
'1F46F 1F3FD',
'1F46F 1F3FE',
'1F46F 1F3FF',
// (🤼) WRESTLERS
'1F93C 1F3FB',
'1F93C 1F3FC',
'1F93C 1F3FD',
'1F93C 1F3FE',
'1F93C 1F3FF',
  • During some testing, I discovered that there are 166 characters that, when decomposed (whether valid or mapped), become 2+ adjacent combining marks. I disallowed all of them, since our combining mark rule eliminates them anyway.
    They have very minimal use: JSON

  • For the punctuation discussion above:

    • I mapped 2027 (‧) HYPHENATION POINT to hyphen instead of disallowing it.
    • After looking at prior registrations, I kept 2022 (•) BULLET valid.
  • At the moment, only 2 emoji are disallowed. They are default text-presentation so they format unstyled like !! and !? (but with less kerning). For reference, both ? and ! by themselves are invalid. Nothing prevents them from being valid with how we currently handle emoji.

    • 203C (‼️) double exclamation mark
    • 2049 (⁉️) exclamation question mark
  • I updated everything to Unicode 15 and applied the latest changes.

  • Added code for deriving the spec from Unicode files and ENS-specific rules (example).

  • Added code for generating validation tests from custom examples, the derived rules, random names, and registered names.
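
As a rough illustration of the combining-mark item in the list above, the check "decomposes into 2+ adjacent combining marks" can be approximated with the built-in normalizer. This is only a sketch: the function name is made up, it uses plain NFD rather than the ENS mapping step, and the result depends on the Unicode version shipped with the JavaScript runtime, so it will not necessarily reproduce the exact set of 166.

```javascript
// Hypothetical helper: true if the NFD form of `ch` contains two or more
// adjacent combining marks (General_Category M*).
function decomposesToAdjacentMarks(ch) {
  return /\p{M}{2}/u.test(ch.normalize("NFD"));
}

// U+01D6 decomposes to u + U+0308 + U+0304, so it is flagged.
console.log(decomposesToAdjacentMarks("\u01D6")); // true
// U+00E9 decomposes to e + U+0301, a single mark, so it is not.
console.log(decomposesToAdjacentMarks("\u00E9")); // false
```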


Edit: Delta Report: ens_normalize (1.6.3) vs ens_normalize (1.6.4) [1484938 labels] @ 2022-09-19T05:21:53.105Z


Not too sure about the bullet


What are they?

In the spoiler, (👪) FAMILY, (👯) WOMAN WITH BUNNY EARS, (🤼) WRESTLERS


They seem worth including; are there wider implications of doing so?


None. I just wanted to clarify the policy: all RGI emoji should be included, and anything else that ENS feels is reasonably supported can be whitelisted.
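
To make the whitelist additions concrete, here is a small sketch that regenerates the 15 entries listed earlier (3 bases × 5 skin-tone modifiers) instead of typing them by hand. The variable names are illustrative only; the code points are the ones from the whitelist above.

```javascript
// Modifier_Base emoji whose modifier sequences are not RGI (from the list above).
const MOD_BASES = [0x1F46A, 0x1F46F, 0x1F93C];
// Skin-tone modifiers U+1F3FB..U+1F3FF.
const MODIFIERS = [0x1F3FB, 0x1F3FC, 0x1F3FD, 0x1F3FE, 0x1F3FF];

const hex = (cp) => cp.toString(16).toUpperCase();

// Same "BASE MODIFIER" string format as the whitelist entries.
const whitelist = MOD_BASES.flatMap((base) =>
  MODIFIERS.map((mod) => `${hex(base)} ${hex(mod)}`)
);

console.log(whitelist.length); // 15
console.log(whitelist[0]);     // 1F46A 1F3FB
```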

U+00D7 has been found now, and people are starting to mint names with it.

No idea how they think it’s not going to be seen as a confusable.

Times sign

Middle dot as well, U+00B7


I’m thinking we can disallow the following between 80..FF

0xA1, // (¡) INVERTED EXCLAMATION MARK
0xA6, // (¦) BROKEN BAR
0xA7, // (§) SECTION SIGN // maybe?
0xAB, // («) LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
0xAC, // (¬) NOT SIGN
0xBB, // (») RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
0xBF, // (¿) INVERTED QUESTION MARK // maybe this is needed?
0xD7, // (×) MULTIPLICATION SIGN
0xF7, // (÷) DIVISION SIGN

// old english?
0xF0, // (ð) LATIN SMALL LETTER ETH
0xFE, // (þ) LATIN SMALL LETTER THORN
Frequencies in Registered Names
{
  'A1 (¡) INVERTED EXCLAMATION MARK': 19,
  'A6 (¦) BROKEN BAR': 3,
  'A7 (§) SECTION SIGN': 18,
  'AB («) LEFT-POINTING DOUBLE ANGLE QUOTATION MARK': 5,
  'AC (¬) NOT SIGN': 13,
  'BB (») RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK': 8,
  'BF (¿) INVERTED QUESTION MARK': 9,
  'B7 (·) MIDDLE DOT': 173,
  'D7 (×) MULTIPLICATION SIGN': 109,
  'F7 (÷) DIVISION SIGN': 10,
  'F0 (ð) LATIN SMALL LETTER ETH': 28,
  'FE (þ) LATIN SMALL LETTER THORN': 13
}
Actual Names
{
	"A1 (¡) INVERTED EXCLAMATION MARK": [
		"¡69",
		"¡88",
		"¡ape",
		"¡apple",
		"¡bayc",
		"¡coinbase",
		"¡facebook",
		"¡fb",
		"¡google",
		"¡mayc",
		"¡meta",
		"¡moon",
		"¡opensea",
		"¡porn",
		"¡rise",
		"¡°×°¡",
		"¡¡¡",
		"₮₩¡₮₮€₹",
		"cap¡tan"
	],
	"A6 (¦) BROKEN BAR": [
		"⁅-¦-⁆",
		"¦0000",
		"5¦5¦5"
	],
	"A7 (§) SECTION SIGN": [
		"§§§",
		"§wallet",
		"42usc§2000d",
		"cr¥ptø₿rīck§",
		"§555§",
		"§69",
		"§777",
		"§888§",
		"§corptan",
		"§crt",
		"§ens",
		"§ingh",
		"§ss",
		"§□§□§",
		"§§§§",
		"§§§§§",
		"§69420",
		"§vault"
	],
	"AB («) LEFT-POINTING DOUBLE ANGLE QUOTATION MARK": [
		"«««",
		"«««»»»",
		"«⨀⨆⨀»",
		"»«»«»«",
		"«-0-»"
	],
	"AC (¬) NOT SIGN": [
		"■-■¬",
		"◐-◐¬",
		"◑-◑¬",
		"◑‒◑¬",
		"◧-◧¬",
		"◨‒◨¬",
		"◧-◧¬⌐◨-◨",
		"¬0000",
		"凸¬‿¬凸",
		"¬‿¬",
		"¬¬¬¬¬",
		"❪¬‿¬❫",
		"◩-◩¬"
	],
	"BB (») RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK": [
		"1⁄4»3⁄4",
		"c—lxxxx»",
		"«««»»»",
		"«⨀⨆⨀»",
		"»«»«»«",
		"»»»",
		"»»———————►",
		"«-0-»"
	],
	"BF (¿) INVERTED QUESTION MARK": [
		"¿¿¿",
		"¿¿¿¿¿",
		"8¿iiu",
		"õ¿ö",
		"°0¿0°",
		"‹°¿°›",
		"°¿°",
		"●¿●",
		"¿¿¿¿"
	],
	"B7 (·) MIDDLE DOT": [
		"elonmusk·",
		"zeit·geist",
		"·····",
		"bryan·",
		"埃隆·马斯克",
		"an·ni·hi·late",
		"markcuban·",
		"ðÿ‡°ðÿ‡·",
		"伊隆·马斯克",
		"google·wallet",
		"i·seoul·u",
		"···",
		"darry·ring",
		"伊隆·麥斯克",
		"0001·",
		"00·933",
		"0·00",
		"0·0·1",
		"0····",
		"1000·",
		"10493···",
		"11·11",
		"1·23",
		"1····",
		"4⃣2⃣0⃣·6⃣9⃣",
		"435·4",
		"4··--20",
		"55·55",
		"66·66",
		"69·69",
		"6·666",
		"7777·",
		"7·7·7",
		"8888·",
		"888·",
		"888··",
		"88·88",
		"8·8·8",
		"99·99",
		"al·la·huak·bar",
		"amer·i·ca",
		"as·suage",
		"bit·coin",
		"chi·na",
		"clark·1",
		"con·trast",
		"con·tri·bu·tion",
		"cryptophys·i·cist",
		"darkmarket·",
		"def·e·cate",
		"de·gen",
		"diamond·hands",
		"eight·een",
		"elon·musk",
		"eth2·0",
		"in·ter·mit·tent",
		"in·ter·mit·tentfasting",
		"knick·ers",
		"met·a",
		"me·nageatrois",
		"mikasa-·ackerman",
		"ob·fus·cate",
		"ob·fus·cated",
		"ob·se·qui·ous",
		"om·nis·cient",
		"o·•o·",
		"pepsi·com",
		"pokemon·",
		"pokémon·",
		"pres·sur·ize",
		"qui·es·cent",
		"sas·quatch",
		"six·ti·eth",
		"sub·stan·tial",
		"tra·der",
		"trog·lo·dyte",
		"vi·sion·ar·y",
		"wall·e",
		"⌐◨-◨·",
		"ロロノア·ゾロ",
		"克里斯汀·迪奥",
		"利昂内尔·梅西",
		"爱新觉罗·玄烨",
		"伯克希尔·哈撒韦",
		"维塔利克·布特林",
		"阿尔方斯·艾尔利克",
		"迪丽热巴·迪力木拉提",
		"二三八·三",
		"布拉德·皮特",
		"贾斯汀·比伯",
		"迈克尔·乔丹",
		"勒布朗·詹姆斯",
		"唐纳德·特朗普",
		"沙奎尔·奥尼尔",
		"贝拉克·奥巴马",
		"爱德华·艾尔利克",
		"👁👄·👁",
		"查理·芒格",
		"路易·威登",
		"ニコ·ロビン",
		"乔治·克鲁尼",
		"伊隆·馬斯克",
		"杰森·斯坦森",
		"正宗·哥吉拉",
		"沃伦·巴菲特",
		"波雅·汉库克",
		"罗伊·马斯坦",
		"肯伊·韦斯特",
		"泰勒·斯威夫特",
		"馬克·扎克伯格",
		"马克·扎克伯格",
		"金·卡戴珊",
		"·⌐◨-◨",
		"·000",
		"·0000",
		"·0001",
		"·0008",
		"·1111",
		"·1234",
		"·4444",
		"·5555",
		"·555·",
		"·6666",
		"·666·",
		"·6699",
		"·7777",
		"·8888",
		"·888·",
		"·9999",
		"·diamondhands",
		"波特夹斯·d·艾斯",
		"モンキー·d·ルフィ",
		"蒙其·d·路飞",
		"豚林·vitalik",
		"·•888",
		"·•—•·",
		"马克·艾略特·扎克伯格",
		"🐼·🐼·🐼",
		"二·一·一二",
		"··⌐◨-◨",
		"··555·",
		"·⁄·⁄·",
		"··👻··",
		"····",
		"····0",
		"····6957",
		"····•••",
		"ᗧ·····▪▪▪ᗣ▪▪▪···ᗣ··",
		"······",
		"·······",
		"ᗧ···ᗣ···ᗣ··",
		"········",
		"·········",
		"科比·布莱恩特",
		"穆罕默德·本·萨勒曼🇸🇦🇨🇳",
		"77·77",
		"00·00",
		"克里斯蒂亚诺·罗纳尔多",
		"vitalik·buterin",
		"·0888",
		"·666666",
		"·1000",
		"粤a·77777",
		"粤a·66666",
		"··0005",
		"·007",
		"罗德里戈·埃尔南德斯·卡斯坎特",
		"科怀·伦纳德",
		"乔尔·恩比德",
		"达米恩·利拉德",
		"德克·诺维茨基",
		"劳尔·冈萨雷斯·布兰科",
		"416·"
	],
	"D7 (×) MULTIPLICATION SIGN": [
		"hunter×hunter",
		"○∑❒×⬡♦",
		"eth×eth",
		"×daniξl×",
		"00×00",
		"0×🇺🇸",
		"0×0",
		"0×00",
		"0×000",
		"0×001",
		"0×002",
		"0×005",
		"0×007",
		"0×008",
		"0×01",
		"0×018",
		"0×02",
		"0×069",
		"0×08",
		"0×1",
		"0×10",
		"0×100",
		"0×101",
		"0×11",
		"0×111",
		"0×123",
		"0×1337",
		"0×2",
		"0×20",
		"0×200",
		"0×21",
		"0×22",
		"0×222",
		"0×28",
		"0×3",
		"0×30",
		"0×300",
		"0×33",
		"0×333",
		"0×40",
		"0×404",
		"0×420",
		"0×44",
		"0×4590",
		"0×50",
		"0×55",
		"0×555",
		"0×66",
		"0×666",
		"0×668",
		"0×6699",
		"0×69",
		"0×6969",
		"0×77",
		"0×777",
		"0×8",
		"0×80",
		"0×88",
		"0×888",
		"0×8888",
		"0×90",
		"0×987",
		"0×99",
		"0×999",
		"0×abc",
		"0×××0",
		"1000×",
		"10×10",
		"10×10×10",
		"1×1",
		"24×7",
		"24×7×365",
		"3×210",
		"3×3eyes",
		"420×69",
		"4×4",
		"69×420",
		"69×69",
		"69×69×69",
		"7680×4320",
		"8×8×8",
		"abc×xyz",
		"ø×ø",
		"se×",
		"space×",
		"¡°×°¡",
		"×0000",
		"×0001",
		"×000×",
		"×01",
		"×0123",
		"×0÷0×",
		"×1234",
		"×666×",
		"×69",
		"×777×",
		"×̶̶͜×̶",
		"×××",
		"××××",
		"×××××",
		"×××××××",
		"60×1",
		"×0-0×",
		"1×1×1",
		"0×1×0",
		"0×0×1",
		"0×033",
		"888×888",
		"000×000"
	],
	"F7 (÷) DIVISION SIGN": [
		"0÷000",
		"0÷0÷0",
		"4÷2",
		"50÷50",
		"abc÷xyz",
		"÷0123",
		"÷1234",
		"÷÷÷",
		"÷÷÷÷÷",
		"×0÷0×"
	],
	"F0 (ð) LATIN SMALL LETTER ETH": [
		"ððð",
		"ðoge",
		"mð°rkzuckerberg",
		"ðÿ‡°ðÿ‡·",
		"ðÿœtm",
		"alfǫðr",
		"ásgarðr",
		"bróðir",
		"ððððð",
		"ðeviantobeðient",
		"frœði",
		"geirahöð",
		"góðrmorginn",
		"guðr",
		"hermaðr",
		"hlaðguðr",
		"hlórriði",
		"miðgarðr",
		"óðinn",
		"ragnarrloðbrókkr",
		"randgríðr",
		"sanngriðr",
		"sigurður",
		"skaði",
		"sveið",
		"þrúðr",
		"viðskipti",
		"vörður"
	],
	"FE (þ) LATIN SMALL LETTER THORN": [
		"¶◾▫◾þ",
		"alþingi",
		"aþþle",
		"祡fliρþeŕマ",
		"hjalmþrimul",
		"hjörþrimul",
		"þögn",
		"þórr",
		"þrima",
		"þrúðr",
		"þþþ",
		"þþþþþ",
		"þutin"
	]
}
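
For anyone who wants to reproduce the frequency table, a tally like the one above can be sketched as follows. The counts appear to be per label (a label counts once even if the character repeats), so that is what this does; the function name and the sample labels in the usage line are illustrative, and the real input would be the full list of registered labels.

```javascript
// Candidate code points from the disallow list and frequency table above.
const CANDIDATES = [0xA1, 0xA6, 0xA7, 0xAB, 0xAC, 0xBB, 0xBF, 0xB7, 0xD7, 0xF7, 0xF0, 0xFE];

// Count how many labels contain each candidate at least once.
function tallyLabels(labels, candidates = CANDIDATES) {
  const counts = new Map(candidates.map((cp) => [cp, 0]));
  for (const label of labels) {
    const seen = new Set(Array.from(label, (ch) => ch.codePointAt(0)));
    for (const cp of candidates) {
      if (seen.has(cp)) counts.set(cp, counts.get(cp) + 1);
    }
  }
  return counts;
}

const sample = tallyLabels(["\u00A1\u00A1\u00A1", "\u00D70\u00F70\u00D7", "elon\u00B7musk"]);
console.log(sample.get(0xD7)); // 1 (the label is counted once despite two × characters)
```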

I restored the ContextO rule for B7, which only permits it between two L's. If that’s too esoteric, we can just disallow it.
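
For anyone unfamiliar with it, the ContextO rule for B7 (from the IDNA contextual rules) can be sketched like this. The function name is hypothetical, and the sketch assumes the label has already been mapped to lowercase, so "L" here means U+006C.

```javascript
const MIDDLE_DOT = 0xB7;
const SMALL_L = 0x6C;

// Hypothetical helper: every U+00B7 must sit directly between two U+006C.
function middleDotContextOk(label) {
  const cps = Array.from(label, (ch) => ch.codePointAt(0));
  return cps.every(
    (cp, i) => cp !== MIDDLE_DOT || (cps[i - 1] === SMALL_L && cps[i + 1] === SMALL_L)
  );
}

console.log(middleDotContextOk("col\u00B7lecci\u00F3")); // true
console.log(middleDotContextOk("elon\u00B7musk"));       // false
```

Under this rule, most of the registered B7 names listed above (leading or trailing dots, digit separators, transliterated names) would fail.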

These changes would leave the following valid or mapped between 00..FF

Summary
24 ($) DOLLAR SIGN
27 (') APOSTROPHE => 2019 (’) RIGHT SINGLE QUOTATION MARK
2D (-) HYPHEN-MINUS
30 (0) DIGIT ZERO
31 (1) DIGIT ONE
32 (2) DIGIT TWO
33 (3) DIGIT THREE
34 (4) DIGIT FOUR
35 (5) DIGIT FIVE
36 (6) DIGIT SIX
37 (7) DIGIT SEVEN
38 (8) DIGIT EIGHT
39 (9) DIGIT NINE
41 (A) LATIN CAPITAL LETTER A => 61 (a) LATIN SMALL LETTER A
42 (B) LATIN CAPITAL LETTER B => 62 (b) LATIN SMALL LETTER B
43 (C) LATIN CAPITAL LETTER C => 63 (c) LATIN SMALL LETTER C
44 (D) LATIN CAPITAL LETTER D => 64 (d) LATIN SMALL LETTER D
45 (E) LATIN CAPITAL LETTER E => 65 (e) LATIN SMALL LETTER E
46 (F) LATIN CAPITAL LETTER F => 66 (f) LATIN SMALL LETTER F
47 (G) LATIN CAPITAL LETTER G => 67 (g) LATIN SMALL LETTER G
48 (H) LATIN CAPITAL LETTER H => 68 (h) LATIN SMALL LETTER H
49 (I) LATIN CAPITAL LETTER I => 69 (i) LATIN SMALL LETTER I
4A (J) LATIN CAPITAL LETTER J => 6A (j) LATIN SMALL LETTER J
4B (K) LATIN CAPITAL LETTER K => 6B (k) LATIN SMALL LETTER K
4C (L) LATIN CAPITAL LETTER L => 6C (l) LATIN SMALL LETTER L
4D (M) LATIN CAPITAL LETTER M => 6D (m) LATIN SMALL LETTER M
4E (N) LATIN CAPITAL LETTER N => 6E (n) LATIN SMALL LETTER N
4F (O) LATIN CAPITAL LETTER O => 6F (o) LATIN SMALL LETTER O
50 (P) LATIN CAPITAL LETTER P => 70 (p) LATIN SMALL LETTER P
51 (Q) LATIN CAPITAL LETTER Q => 71 (q) LATIN SMALL LETTER Q
52 (R) LATIN CAPITAL LETTER R => 72 (r) LATIN SMALL LETTER R
53 (S) LATIN CAPITAL LETTER S => 73 (s) LATIN SMALL LETTER S
54 (T) LATIN CAPITAL LETTER T => 74 (t) LATIN SMALL LETTER T
55 (U) LATIN CAPITAL LETTER U => 75 (u) LATIN SMALL LETTER U
56 (V) LATIN CAPITAL LETTER V => 76 (v) LATIN SMALL LETTER V
57 (W) LATIN CAPITAL LETTER W => 77 (w) LATIN SMALL LETTER W
58 (X) LATIN CAPITAL LETTER X => 78 (x) LATIN SMALL LETTER X
59 (Y) LATIN CAPITAL LETTER Y => 79 (y) LATIN SMALL LETTER Y
5A (Z) LATIN CAPITAL LETTER Z => 7A (z) LATIN SMALL LETTER Z
5F (_) LOW LINE
61 (a) LATIN SMALL LETTER A
62 (b) LATIN SMALL LETTER B
63 (c) LATIN SMALL LETTER C
64 (d) LATIN SMALL LETTER D
65 (e) LATIN SMALL LETTER E
66 (f) LATIN SMALL LETTER F
67 (g) LATIN SMALL LETTER G
68 (h) LATIN SMALL LETTER H
69 (i) LATIN SMALL LETTER I
6A (j) LATIN SMALL LETTER J
6B (k) LATIN SMALL LETTER K
6C (l) LATIN SMALL LETTER L
6D (m) LATIN SMALL LETTER M
6E (n) LATIN SMALL LETTER N
6F (o) LATIN SMALL LETTER O
70 (p) LATIN SMALL LETTER P
71 (q) LATIN SMALL LETTER Q
72 (r) LATIN SMALL LETTER R
73 (s) LATIN SMALL LETTER S
74 (t) LATIN SMALL LETTER T
75 (u) LATIN SMALL LETTER U
76 (v) LATIN SMALL LETTER V
77 (w) LATIN SMALL LETTER W
78 (x) LATIN SMALL LETTER X
79 (y) LATIN SMALL LETTER Y
7A (z) LATIN SMALL LETTER Z
A2 (¢) CENT SIGN
A3 (£) POUND SIGN
A4 (¤) CURRENCY SIGN
A5 (¥) YEN SIGN
AA (ª) FEMININE ORDINAL INDICATOR => 61 (a) LATIN SMALL LETTER A
B0 (°) DEGREE SIGN
B1 (±) PLUS-MINUS SIGN
B2 (²) SUPERSCRIPT TWO => 32 (2) DIGIT TWO
B3 (³) SUPERSCRIPT THREE => 33 (3) DIGIT THREE
B5 (µ) MICRO SIGN => 3BC (μ) GREEK SMALL LETTER MU
B6 (¶) PILCROW SIGN
B7 (·) MIDDLE DOT
B9 (¹) SUPERSCRIPT ONE => 31 (1) DIGIT ONE
BA (º) MASCULINE ORDINAL INDICATOR => 6F (o) LATIN SMALL LETTER O
BC (¼) VULGAR FRACTION ONE QUARTER => [31 2044 34]
BD (½) VULGAR FRACTION ONE HALF => [31 2044 32]
BE (¾) VULGAR FRACTION THREE QUARTERS => [33 2044 34]
C0 (À) LATIN CAPITAL LETTER A WITH GRAVE => E0 (à) LATIN SMALL LETTER A WITH GRAVE
C1 (Á) LATIN CAPITAL LETTER A WITH ACUTE => E1 (á) LATIN SMALL LETTER A WITH ACUTE
C2 (Â) LATIN CAPITAL LETTER A WITH CIRCUMFLEX => E2 (â) LATIN SMALL LETTER A WITH CIRCUMFLEX
C3 (Ã) LATIN CAPITAL LETTER A WITH TILDE => E3 (ã) LATIN SMALL LETTER A WITH TILDE
C4 (Ä) LATIN CAPITAL LETTER A WITH DIAERESIS => E4 (ä) LATIN SMALL LETTER A WITH DIAERESIS
C5 (Å) LATIN CAPITAL LETTER A WITH RING ABOVE => E5 (å) LATIN SMALL LETTER A WITH RING ABOVE
C6 (Æ) LATIN CAPITAL LETTER AE => E6 (æ) LATIN SMALL LETTER AE
C7 (Ç) LATIN CAPITAL LETTER C WITH CEDILLA => E7 (ç) LATIN SMALL LETTER C WITH CEDILLA
C8 (È) LATIN CAPITAL LETTER E WITH GRAVE => E8 (è) LATIN SMALL LETTER E WITH GRAVE
C9 (É) LATIN CAPITAL LETTER E WITH ACUTE => E9 (é) LATIN SMALL LETTER E WITH ACUTE
CA (Ê) LATIN CAPITAL LETTER E WITH CIRCUMFLEX => EA (ê) LATIN SMALL LETTER E WITH CIRCUMFLEX
CB (Ë) LATIN CAPITAL LETTER E WITH DIAERESIS => EB (ë) LATIN SMALL LETTER E WITH DIAERESIS
CC (Ì) LATIN CAPITAL LETTER I WITH GRAVE => EC (ì) LATIN SMALL LETTER I WITH GRAVE
CD (Í) LATIN CAPITAL LETTER I WITH ACUTE => ED (í) LATIN SMALL LETTER I WITH ACUTE
CE (Î) LATIN CAPITAL LETTER I WITH CIRCUMFLEX => EE (î) LATIN SMALL LETTER I WITH CIRCUMFLEX
CF (Ï) LATIN CAPITAL LETTER I WITH DIAERESIS => EF (ï) LATIN SMALL LETTER I WITH DIAERESIS
D1 (Ñ) LATIN CAPITAL LETTER N WITH TILDE => F1 (ñ) LATIN SMALL LETTER N WITH TILDE
D2 (Ò) LATIN CAPITAL LETTER O WITH GRAVE => F2 (ò) LATIN SMALL LETTER O WITH GRAVE
D3 (Ó) LATIN CAPITAL LETTER O WITH ACUTE => F3 (ó) LATIN SMALL LETTER O WITH ACUTE
D4 (Ô) LATIN CAPITAL LETTER O WITH CIRCUMFLEX => F4 (ô) LATIN SMALL LETTER O WITH CIRCUMFLEX
D5 (Õ) LATIN CAPITAL LETTER O WITH TILDE => F5 (õ) LATIN SMALL LETTER O WITH TILDE
D6 (Ö) LATIN CAPITAL LETTER O WITH DIAERESIS => F6 (ö) LATIN SMALL LETTER O WITH DIAERESIS
D8 (Ø) LATIN CAPITAL LETTER O WITH STROKE => F8 (ø) LATIN SMALL LETTER O WITH STROKE
D9 (Ù) LATIN CAPITAL LETTER U WITH GRAVE => F9 (ù) LATIN SMALL LETTER U WITH GRAVE
DA (Ú) LATIN CAPITAL LETTER U WITH ACUTE => FA (ú) LATIN SMALL LETTER U WITH ACUTE
DB (Û) LATIN CAPITAL LETTER U WITH CIRCUMFLEX => FB (û) LATIN SMALL LETTER U WITH CIRCUMFLEX
DC (Ü) LATIN CAPITAL LETTER U WITH DIAERESIS => FC (ü) LATIN SMALL LETTER U WITH DIAERESIS
DD (Ý) LATIN CAPITAL LETTER Y WITH ACUTE => FD (ý) LATIN SMALL LETTER Y WITH ACUTE
DF (ß) LATIN SMALL LETTER SHARP S
E0 (à) LATIN SMALL LETTER A WITH GRAVE
E1 (á) LATIN SMALL LETTER A WITH ACUTE
E2 (â) LATIN SMALL LETTER A WITH CIRCUMFLEX
E3 (ã) LATIN SMALL LETTER A WITH TILDE
E4 (ä) LATIN SMALL LETTER A WITH DIAERESIS
E5 (å) LATIN SMALL LETTER A WITH RING ABOVE
E6 (æ) LATIN SMALL LETTER AE
E7 (ç) LATIN SMALL LETTER C WITH CEDILLA
E8 (è) LATIN SMALL LETTER E WITH GRAVE
E9 (é) LATIN SMALL LETTER E WITH ACUTE
EA (ê) LATIN SMALL LETTER E WITH CIRCUMFLEX
EB (ë) LATIN SMALL LETTER E WITH DIAERESIS
EC (ì) LATIN SMALL LETTER I WITH GRAVE
ED (í) LATIN SMALL LETTER I WITH ACUTE
EE (î) LATIN SMALL LETTER I WITH CIRCUMFLEX
EF (ï) LATIN SMALL LETTER I WITH DIAERESIS
F1 (ñ) LATIN SMALL LETTER N WITH TILDE
F2 (ò) LATIN SMALL LETTER O WITH GRAVE
F3 (ó) LATIN SMALL LETTER O WITH ACUTE
F4 (ô) LATIN SMALL LETTER O WITH CIRCUMFLEX
F5 (õ) LATIN SMALL LETTER O WITH TILDE
F6 (ö) LATIN SMALL LETTER O WITH DIAERESIS
F8 (ø) LATIN SMALL LETTER O WITH STROKE
F9 (ù) LATIN SMALL LETTER U WITH GRAVE
FA (ú) LATIN SMALL LETTER U WITH ACUTE
FB (û) LATIN SMALL LETTER U WITH CIRCUMFLEX
FC (ü) LATIN SMALL LETTER U WITH DIAERESIS
FD (ý) LATIN SMALL LETTER Y WITH ACUTE
FF (ÿ) LATIN SMALL LETTER Y WITH DIAERESIS
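
As a quick sanity check on a few of the mapped rows above, several of them coincide with what NFKC plus lowercasing produces in JavaScript. This is only an observation about these particular rows, not a description of how the spec derives its mappings (27 => 2019, for example, is an ENS-specific rule and is not covered by it). The row selection and names are mine.

```javascript
// Rows copied from the summary above: [input, expected mapping].
const ROWS = [
  ["\u00AA", "a"],               // ª => a
  ["\u00B2", "2"],               // ² => 2
  ["\u00B5", "\u03BC"],          // µ => μ
  ["\u00BC", "1\u2044" + "4"],   // ¼ => [31 2044 34]
  ["\u00C0", "\u00E0"],          // À => à
];

// Approximation only: compatibility normalization followed by lowercasing.
const approxMap = (s) => s.normalize("NFKC").toLowerCase();

const mismatches = ROWS.filter(([input, expected]) => approxMap(input) !== expected);
console.log(mismatches.length); // 0
```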

It kinda sucks to lose the glasses-variants, but that not-sign is incredibly hyphen-like. There’s also its reverse: 0x2310, // (⌐) REVERSED NOT SIGN


Internally, I want to treat all “symbol” characters (non-character, non-emojis) as emojis, such that they have extra rules attached, like you can’t attach a combining mark to them, etc.

I’ve already included all of the non-emoji extended pictographic characters as emoji. If it’s important, we can give them a different name (“single character symbols” or w/e).

This just seems like a clear win to me, for example, you shouldn’t be allowed to attach (x̀) COMBINING GRAVE ACCENT to a (☗) BLACK SHOGI PIECE.
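
A minimal sketch of that rule, assuming "symbol" is approximated by General_Category S* (the real set would come from the derived spec, and the function name is made up). U+FE0F and U+20E3 are skipped because they are technically combining marks but belong to emoji sequences.

```javascript
// A symbol immediately followed by a combining mark, other than the
// emoji-sequence marks U+FE0F (variation selector) and U+20E3 (keycap).
const SYMBOL_THEN_MARK = /\p{S}(?![\uFE0F\u20E3])\p{M}/u;

// Hypothetical helper: true if a combining mark is attached to a symbol.
function hasMarkOnSymbol(label) {
  return SYMBOL_THEN_MARK.test(label);
}

console.log(hasMarkOnSymbol("\u2617\u0300")); // true: grave accent on the shogi piece
console.log(hasMarkOnSymbol("a\u0300"));      // false: a mark on a letter is fine
```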


Not sure which direction you want to go with this, but anything round and dot-like could be a confusable for Braille, since dots are what Braille uses.

No idea whether Braille is a direction you want to go down or not, but currently the Braille characters are valid.


I think it’s correct that these are disallowed. They can be recreated using singles that are emoji-presentation default.

  • 2753 (❓) BLACK QUESTION MARK ORNAMENT
  • 2754 (❔) WHITE QUESTION MARK ORNAMENT
  • 2755 (❕) WHITE EXCLAMATION MARK ORNAMENT
  • 2757 (❗) HEAVY EXCLAMATION MARK SYMBOL

I’d like to review another unanswered question from a while ago:

There’s also Squared Letters and Negative Squared Letters. I don’t want to pick favorites. The fairest solution would be to map them to their unstyled equivalents (a-z0-9). For reference, Squared Letters are mapped to unstyled letters by IDNA.

Styled Letters and Digits
0x24EB, // (⓫) NEGATIVE CIRCLED NUMBER ELEVEN
0x24EC, // (⓬) NEGATIVE CIRCLED NUMBER TWELVE
0x24ED, // (⓭) NEGATIVE CIRCLED NUMBER THIRTEEN
0x24EE, // (⓮) NEGATIVE CIRCLED NUMBER FOURTEEN
0x24EF, // (⓯) NEGATIVE CIRCLED NUMBER FIFTEEN
0x24F0, // (⓰) NEGATIVE CIRCLED NUMBER SIXTEEN
0x24F1, // (⓱) NEGATIVE CIRCLED NUMBER SEVENTEEN
0x24F2, // (⓲) NEGATIVE CIRCLED NUMBER EIGHTEEN
0x24F3, // (⓳) NEGATIVE CIRCLED NUMBER NINETEEN
0x24F4, // (⓴) NEGATIVE CIRCLED NUMBER TWENTY

0x24FF, // (⓿) NEGATIVE CIRCLED DIGIT ZERO

0x24F5, // (⓵) DOUBLE CIRCLED DIGIT ONE
0x24F6, // (⓶) DOUBLE CIRCLED DIGIT TWO
0x24F7, // (⓷) DOUBLE CIRCLED DIGIT THREE
0x24F8, // (⓸) DOUBLE CIRCLED DIGIT FOUR
0x24F9, // (⓹) DOUBLE CIRCLED DIGIT FIVE
0x24FA, // (⓺) DOUBLE CIRCLED DIGIT SIX
0x24FB, // (⓻) DOUBLE CIRCLED DIGIT SEVEN
0x24FC, // (⓼) DOUBLE CIRCLED DIGIT EIGHT
0x24FD, // (⓽) DOUBLE CIRCLED DIGIT NINE
0x24FE, // (⓾) DOUBLE CIRCLED NUMBER TEN

0x2776, // (❶) DINGBAT NEGATIVE CIRCLED DIGIT ONE
0x2777, // (❷) DINGBAT NEGATIVE CIRCLED DIGIT TWO
0x2778, // (❸) DINGBAT NEGATIVE CIRCLED DIGIT THREE
0x2779, // (❹) DINGBAT NEGATIVE CIRCLED DIGIT FOUR
0x277A, // (❺) DINGBAT NEGATIVE CIRCLED DIGIT FIVE
0x277B, // (❻) DINGBAT NEGATIVE CIRCLED DIGIT SIX
0x277C, // (❼) DINGBAT NEGATIVE CIRCLED DIGIT SEVEN
0x277D, // (❽) DINGBAT NEGATIVE CIRCLED DIGIT EIGHT
0x277E, // (❾) DINGBAT NEGATIVE CIRCLED DIGIT NINE
0x277F, // (❿) DINGBAT NEGATIVE CIRCLED NUMBER TEN

0x2780, // (➀) DINGBAT CIRCLED SANS-SERIF DIGIT ONE
0x2781, // (➁) DINGBAT CIRCLED SANS-SERIF DIGIT TWO
0x2782, // (➂) DINGBAT CIRCLED SANS-SERIF DIGIT THREE
0x2783, // (➃) DINGBAT CIRCLED SANS-SERIF DIGIT FOUR
0x2784, // (➄) DINGBAT CIRCLED SANS-SERIF DIGIT FIVE
0x2785, // (➅) DINGBAT CIRCLED SANS-SERIF DIGIT SIX
0x2786, // (➆) DINGBAT CIRCLED SANS-SERIF DIGIT SEVEN
0x2787, // (➇) DINGBAT CIRCLED SANS-SERIF DIGIT EIGHT
0x2788, // (➈) DINGBAT CIRCLED SANS-SERIF DIGIT NINE
0x2789, // (➉) DINGBAT CIRCLED SANS-SERIF NUMBER TEN

0x278A, // (➊) DINGBAT NEGATIVE CIRCLED SANS-SERIF DIGIT ONE
0x278B, // (➋) DINGBAT NEGATIVE CIRCLED SANS-SERIF DIGIT TWO
0x278C, // (➌) DINGBAT NEGATIVE CIRCLED SANS-SERIF DIGIT THREE
0x278D, // (➍) DINGBAT NEGATIVE CIRCLED SANS-SERIF DIGIT FOUR
0x278E, // (➎) DINGBAT NEGATIVE CIRCLED SANS-SERIF DIGIT FIVE
0x278F, // (➏) DINGBAT NEGATIVE CIRCLED SANS-SERIF DIGIT SIX
0x2790, // (➐) DINGBAT NEGATIVE CIRCLED SANS-SERIF DIGIT SEVEN
0x2791, // (➑) DINGBAT NEGATIVE CIRCLED SANS-SERIF DIGIT EIGHT
0x2792, // (➒) DINGBAT NEGATIVE CIRCLED SANS-SERIF DIGIT NINE
0x2793, // (➓) DINGBAT NEGATIVE CIRCLED SANS-SERIF NUMBER TEN

0x1F150, // (🅐) NEGATIVE CIRCLED LATIN CAPITAL LETTER A
0x1F151, // (🅑) NEGATIVE CIRCLED LATIN CAPITAL LETTER B
0x1F152, // (🅒) NEGATIVE CIRCLED LATIN CAPITAL LETTER C
0x1F153, // (🅓) NEGATIVE CIRCLED LATIN CAPITAL LETTER D
0x1F154, // (🅔) NEGATIVE CIRCLED LATIN CAPITAL LETTER E
0x1F155, // (🅕) NEGATIVE CIRCLED LATIN CAPITAL LETTER F
0x1F156, // (🅖) NEGATIVE CIRCLED LATIN CAPITAL LETTER G
0x1F157, // (🅗) NEGATIVE CIRCLED LATIN CAPITAL LETTER H
0x1F158, // (🅘) NEGATIVE CIRCLED LATIN CAPITAL LETTER I
0x1F159, // (🅙) NEGATIVE CIRCLED LATIN CAPITAL LETTER J
0x1F15A, // (🅚) NEGATIVE CIRCLED LATIN CAPITAL LETTER K
0x1F15B, // (🅛) NEGATIVE CIRCLED LATIN CAPITAL LETTER L
0x1F15C, // (🅜) NEGATIVE CIRCLED LATIN CAPITAL LETTER M
0x1F15D, // (🅝) NEGATIVE CIRCLED LATIN CAPITAL LETTER N
0x1F15E, // (🅞) NEGATIVE CIRCLED LATIN CAPITAL LETTER O
0x1F15F, // (🅟) NEGATIVE CIRCLED LATIN CAPITAL LETTER P
0x1F160, // (🅠) NEGATIVE CIRCLED LATIN CAPITAL LETTER Q
0x1F161, // (🅡) NEGATIVE CIRCLED LATIN CAPITAL LETTER R
0x1F162, // (🅢) NEGATIVE CIRCLED LATIN CAPITAL LETTER S
0x1F163, // (🅣) NEGATIVE CIRCLED LATIN CAPITAL LETTER T
0x1F164, // (🅤) NEGATIVE CIRCLED LATIN CAPITAL LETTER U
0x1F165, // (🅥) NEGATIVE CIRCLED LATIN CAPITAL LETTER V
0x1F166, // (🅦) NEGATIVE CIRCLED LATIN CAPITAL LETTER W
0x1F167, // (🅧) NEGATIVE CIRCLED LATIN CAPITAL LETTER X
0x1F168, // (🅨) NEGATIVE CIRCLED LATIN CAPITAL LETTER Y
0x1F169, // (🅩) NEGATIVE CIRCLED LATIN CAPITAL LETTER Z

0x1F170, // (🅰) NEGATIVE SQUARED LATIN CAPITAL LETTER A
0x1F171, // (🅱) NEGATIVE SQUARED LATIN CAPITAL LETTER B
0x1F172, // (🅲) NEGATIVE SQUARED LATIN CAPITAL LETTER C
0x1F173, // (🅳) NEGATIVE SQUARED LATIN CAPITAL LETTER D
0x1F174, // (🅴) NEGATIVE SQUARED LATIN CAPITAL LETTER E
0x1F175, // (🅵) NEGATIVE SQUARED LATIN CAPITAL LETTER F
0x1F176, // (🅶) NEGATIVE SQUARED LATIN CAPITAL LETTER G
0x1F177, // (🅷) NEGATIVE SQUARED LATIN CAPITAL LETTER H
0x1F178, // (🅸) NEGATIVE SQUARED LATIN CAPITAL LETTER I
0x1F179, // (🅹) NEGATIVE SQUARED LATIN CAPITAL LETTER J
0x1F17A, // (🅺) NEGATIVE SQUARED LATIN CAPITAL LETTER K
0x1F17B, // (🅻) NEGATIVE SQUARED LATIN CAPITAL LETTER L
0x1F17C, // (🅼) NEGATIVE SQUARED LATIN CAPITAL LETTER M
0x1F17D, // (🅽) NEGATIVE SQUARED LATIN CAPITAL LETTER N
0x1F17E, // (🅾) NEGATIVE SQUARED LATIN CAPITAL LETTER O
0x1F17F, // (🅿) NEGATIVE SQUARED LATIN CAPITAL LETTER P
0x1F180, // (🆀) NEGATIVE SQUARED LATIN CAPITAL LETTER Q
0x1F181, // (🆁) NEGATIVE SQUARED LATIN CAPITAL LETTER R
0x1F182, // (🆂) NEGATIVE SQUARED LATIN CAPITAL LETTER S
0x1F183, // (🆃) NEGATIVE SQUARED LATIN CAPITAL LETTER T
0x1F184, // (🆄) NEGATIVE SQUARED LATIN CAPITAL LETTER U
0x1F185, // (🆅) NEGATIVE SQUARED LATIN CAPITAL LETTER V
0x1F186, // (🆆) NEGATIVE SQUARED LATIN CAPITAL LETTER W
0x1F187, // (🆇) NEGATIVE SQUARED LATIN CAPITAL LETTER X
0x1F188, // (🆈) NEGATIVE SQUARED LATIN CAPITAL LETTER Y
0x1F189, // (🆉) NEGATIVE SQUARED LATIN CAPITAL LETTER Z
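
To show what mapping the styled letters to their unstyled equivalents would look like, here is a sketch covering only the three A..Z runs (Squared at 1F130, Negative Circled at 1F150, Negative Squared at 1F170). The function name is hypothetical and digits are left out. Note that a few of these code points (1F170, 1F171, 1F17E, 1F17F) are also emoji, which this sketch does not try to resolve.

```javascript
// Start of each contiguous A..Z run of styled capital letters.
const STYLED_LETTER_STARTS = [0x1F130, 0x1F150, 0x1F170];

// Hypothetical helper: replace styled letters with plain a-z, leave the rest alone.
function unstyleLetters(label) {
  return Array.from(label, (ch) => {
    const cp = ch.codePointAt(0);
    for (const start of STYLED_LETTER_STARTS) {
      if (cp >= start && cp < start + 26) {
        return String.fromCodePoint(0x61 + (cp - start));
      }
    }
    return ch;
  }).join("");
}

console.log(unstyleLetters("\u{1F170}\u{1F171}c")); // abc
```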
Braille Patterns
0x2800, // (⠀) BRAILLE PATTERN BLANK
0x2801, // (⠁) BRAILLE PATTERN DOTS-1
0x2802, // (⠂) BRAILLE PATTERN DOTS-2
0x2803, // (⠃) BRAILLE PATTERN DOTS-12
0x2804, // (⠄) BRAILLE PATTERN DOTS-3
0x2805, // (⠅) BRAILLE PATTERN DOTS-13
0x2806, // (⠆) BRAILLE PATTERN DOTS-23
0x2807, // (⠇) BRAILLE PATTERN DOTS-123
0x2808, // (⠈) BRAILLE PATTERN DOTS-4
0x2809, // (⠉) BRAILLE PATTERN DOTS-14
0x280A, // (⠊) BRAILLE PATTERN DOTS-24
0x280B, // (⠋) BRAILLE PATTERN DOTS-124
0x280C, // (⠌) BRAILLE PATTERN DOTS-34
0x280D, // (⠍) BRAILLE PATTERN DOTS-134
0x280E, // (⠎) BRAILLE PATTERN DOTS-234
0x280F, // (⠏) BRAILLE PATTERN DOTS-1234
0x2810, // (⠐) BRAILLE PATTERN DOTS-5
0x2811, // (⠑) BRAILLE PATTERN DOTS-15
0x2812, // (⠒) BRAILLE PATTERN DOTS-25
0x2813, // (⠓) BRAILLE PATTERN DOTS-125
0x2814, // (⠔) BRAILLE PATTERN DOTS-35
0x2815, // (⠕) BRAILLE PATTERN DOTS-135
0x2816, // (⠖) BRAILLE PATTERN DOTS-235
0x2817, // (⠗) BRAILLE PATTERN DOTS-1235
0x2818, // (⠘) BRAILLE PATTERN DOTS-45
0x2819, // (⠙) BRAILLE PATTERN DOTS-145
0x281A, // (⠚) BRAILLE PATTERN DOTS-245
0x281B, // (⠛) BRAILLE PATTERN DOTS-1245
0x281C, // (⠜) BRAILLE PATTERN DOTS-345
0x281D, // (⠝) BRAILLE PATTERN DOTS-1345
0x281E, // (⠞) BRAILLE PATTERN DOTS-2345
0x281F, // (⠟) BRAILLE PATTERN DOTS-12345
0x2820, // (⠠) BRAILLE PATTERN DOTS-6
0x2821, // (⠡) BRAILLE PATTERN DOTS-16
0x2822, // (⠢) BRAILLE PATTERN DOTS-26
0x2823, // (⠣) BRAILLE PATTERN DOTS-126
0x2824, // (⠤) BRAILLE PATTERN DOTS-36
0x2825, // (⠥) BRAILLE PATTERN DOTS-136
0x2826, // (⠦) BRAILLE PATTERN DOTS-236
0x2827, // (⠧) BRAILLE PATTERN DOTS-1236
0x2828, // (⠨) BRAILLE PATTERN DOTS-46
0x2829, // (⠩) BRAILLE PATTERN DOTS-146
0x282A, // (⠪) BRAILLE PATTERN DOTS-246
0x282B, // (⠫) BRAILLE PATTERN DOTS-1246
0x282C, // (⠬) BRAILLE PATTERN DOTS-346
0x282D, // (⠭) BRAILLE PATTERN DOTS-1346
0x282E, // (⠮) BRAILLE PATTERN DOTS-2346
0x282F, // (⠯) BRAILLE PATTERN DOTS-12346
0x2830, // (⠰) BRAILLE PATTERN DOTS-56
0x2831, // (⠱) BRAILLE PATTERN DOTS-156
0x2832, // (⠲) BRAILLE PATTERN DOTS-256
0x2833, // (⠳) BRAILLE PATTERN DOTS-1256
0x2834, // (⠴) BRAILLE PATTERN DOTS-356
0x2835, // (⠵) BRAILLE PATTERN DOTS-1356
0x2836, // (⠶) BRAILLE PATTERN DOTS-2356
0x2837, // (⠷) BRAILLE PATTERN DOTS-12356
0x2838, // (⠸) BRAILLE PATTERN DOTS-456
0x2839, // (⠹) BRAILLE PATTERN DOTS-1456
0x283A, // (⠺) BRAILLE PATTERN DOTS-2456
0x283B, // (⠻) BRAILLE PATTERN DOTS-12456
0x283C, // (⠼) BRAILLE PATTERN DOTS-3456
0x283D, // (⠽) BRAILLE PATTERN DOTS-13456
0x283E, // (⠾) BRAILLE PATTERN DOTS-23456
0x283F, // (⠿) BRAILLE PATTERN DOTS-123456
0x2840, // (⡀) BRAILLE PATTERN DOTS-7
0x2841, // (⡁) BRAILLE PATTERN DOTS-17
0x2842, // (⡂) BRAILLE PATTERN DOTS-27
0x2843, // (⡃) BRAILLE PATTERN DOTS-127
0x2844, // (⡄) BRAILLE PATTERN DOTS-37
0x2845, // (⡅) BRAILLE PATTERN DOTS-137
0x2846, // (⡆) BRAILLE PATTERN DOTS-237
0x2847, // (⡇) BRAILLE PATTERN DOTS-1237
0x2848, // (⡈) BRAILLE PATTERN DOTS-47
0x2849, // (⡉) BRAILLE PATTERN DOTS-147
0x284A, // (⡊) BRAILLE PATTERN DOTS-247
0x284B, // (⡋) BRAILLE PATTERN DOTS-1247
0x284C, // (⡌) BRAILLE PATTERN DOTS-347
0x284D, // (⡍) BRAILLE PATTERN DOTS-1347
0x284E, // (⡎) BRAILLE PATTERN DOTS-2347
0x284F, // (⡏) BRAILLE PATTERN DOTS-12347
0x2850, // (⡐) BRAILLE PATTERN DOTS-57
0x2851, // (⡑) BRAILLE PATTERN DOTS-157
0x2852, // (⡒) BRAILLE PATTERN DOTS-257
0x2853, // (⡓) BRAILLE PATTERN DOTS-1257
0x2854, // (⡔) BRAILLE PATTERN DOTS-357
0x2855, // (⡕) BRAILLE PATTERN DOTS-1357
0x2856, // (⡖) BRAILLE PATTERN DOTS-2357
0x2857, // (⡗) BRAILLE PATTERN DOTS-12357
0x2858, // (⡘) BRAILLE PATTERN DOTS-457
0x2859, // (⡙) BRAILLE PATTERN DOTS-1457
0x285A, // (⡚) BRAILLE PATTERN DOTS-2457
0x285B, // (⡛) BRAILLE PATTERN DOTS-12457
0x285C, // (⡜) BRAILLE PATTERN DOTS-3457
0x285D, // (⡝) BRAILLE PATTERN DOTS-13457
0x285E, // (⡞) BRAILLE PATTERN DOTS-23457
0x285F, // (⡟) BRAILLE PATTERN DOTS-123457
0x2860, // (⡠) BRAILLE PATTERN DOTS-67
0x2861, // (⡡) BRAILLE PATTERN DOTS-167
0x2862, // (⡢) BRAILLE PATTERN DOTS-267
0x2863, // (⡣) BRAILLE PATTERN DOTS-1267
0x2864, // (⡤) BRAILLE PATTERN DOTS-367
0x2865, // (⡥) BRAILLE PATTERN DOTS-1367
0x2866, // (⡦) BRAILLE PATTERN DOTS-2367
0x2867, // (⡧) BRAILLE PATTERN DOTS-12367
0x2868, // (⡨) BRAILLE PATTERN DOTS-467
0x2869, // (⡩) BRAILLE PATTERN DOTS-1467
0x286A, // (⡪) BRAILLE PATTERN DOTS-2467
0x286B, // (⡫) BRAILLE PATTERN DOTS-12467
0x286C, // (⡬) BRAILLE PATTERN DOTS-3467
0x286D, // (⡭) BRAILLE PATTERN DOTS-13467
0x286E, // (⡮) BRAILLE PATTERN DOTS-23467
0x286F, // (⡯) BRAILLE PATTERN DOTS-123467
0x2870, // (⡰) BRAILLE PATTERN DOTS-567
0x2871, // (⡱) BRAILLE PATTERN DOTS-1567
0x2872, // (⡲) BRAILLE PATTERN DOTS-2567
0x2873, // (⡳) BRAILLE PATTERN DOTS-12567
0x2874, // (⡴) BRAILLE PATTERN DOTS-3567
0x2875, // (⡵) BRAILLE PATTERN DOTS-13567
0x2876, // (⡶) BRAILLE PATTERN DOTS-23567
0x2877, // (⡷) BRAILLE PATTERN DOTS-123567
0x2878, // (⡸) BRAILLE PATTERN DOTS-4567
0x2879, // (⡹) BRAILLE PATTERN DOTS-14567
0x287A, // (⡺) BRAILLE PATTERN DOTS-24567
0x287B, // (⡻) BRAILLE PATTERN DOTS-124567
0x287C, // (⡼) BRAILLE PATTERN DOTS-34567
0x287D, // (⡽) BRAILLE PATTERN DOTS-134567
0x287E, // (⡾) BRAILLE PATTERN DOTS-234567
0x287F, // (⡿) BRAILLE PATTERN DOTS-1234567
0x2880, // (⢀) BRAILLE PATTERN DOTS-8
0x2881, // (⢁) BRAILLE PATTERN DOTS-18
0x2882, // (⢂) BRAILLE PATTERN DOTS-28
0x2883, // (⢃) BRAILLE PATTERN DOTS-128
0x2884, // (⢄) BRAILLE PATTERN DOTS-38
0x2885, // (⢅) BRAILLE PATTERN DOTS-138
0x2886, // (⢆) BRAILLE PATTERN DOTS-238
0x2887, // (⢇) BRAILLE PATTERN DOTS-1238
0x2888, // (⢈) BRAILLE PATTERN DOTS-48
0x2889, // (⢉) BRAILLE PATTERN DOTS-148
0x288A, // (⢊) BRAILLE PATTERN DOTS-248
0x288B, // (⢋) BRAILLE PATTERN DOTS-1248
0x288C, // (⢌) BRAILLE PATTERN DOTS-348
0x288D, // (⢍) BRAILLE PATTERN DOTS-1348
0x288E, // (⢎) BRAILLE PATTERN DOTS-2348
0x288F, // (⢏) BRAILLE PATTERN DOTS-12348
0x2890, // (⢐) BRAILLE PATTERN DOTS-58
0x2891, // (⢑) BRAILLE PATTERN DOTS-158
0x2892, // (⢒) BRAILLE PATTERN DOTS-258
0x2893, // (⢓) BRAILLE PATTERN DOTS-1258
0x2894, // (⢔) BRAILLE PATTERN DOTS-358
0x2895, // (⢕) BRAILLE PATTERN DOTS-1358
0x2896, // (⢖) BRAILLE PATTERN DOTS-2358
0x2897, // (⢗) BRAILLE PATTERN DOTS-12358
0x2898, // (⢘) BRAILLE PATTERN DOTS-458
0x2899, // (⢙) BRAILLE PATTERN DOTS-1458
0x289A, // (⢚) BRAILLE PATTERN DOTS-2458
0x289B, // (⢛) BRAILLE PATTERN DOTS-12458
0x289C, // (⢜) BRAILLE PATTERN DOTS-3458
0x289D, // (⢝) BRAILLE PATTERN DOTS-13458
0x289E, // (⢞) BRAILLE PATTERN DOTS-23458
0x289F, // (⢟) BRAILLE PATTERN DOTS-123458
0x28A0, // (⢠) BRAILLE PATTERN DOTS-68
0x28A1, // (⢡) BRAILLE PATTERN DOTS-168
0x28A2, // (⢢) BRAILLE PATTERN DOTS-268
0x28A3, // (⢣) BRAILLE PATTERN DOTS-1268
0x28A4, // (⢤) BRAILLE PATTERN DOTS-368
0x28A5, // (⢥) BRAILLE PATTERN DOTS-1368
0x28A6, // (⢦) BRAILLE PATTERN DOTS-2368
0x28A7, // (⢧) BRAILLE PATTERN DOTS-12368
0x28A8, // (⢨) BRAILLE PATTERN DOTS-468
0x28A9, // (⢩) BRAILLE PATTERN DOTS-1468
0x28AA, // (⢪) BRAILLE PATTERN DOTS-2468
0x28AB, // (⢫) BRAILLE PATTERN DOTS-12468
0x28AC, // (⢬) BRAILLE PATTERN DOTS-3468
0x28AD, // (⢭) BRAILLE PATTERN DOTS-13468
0x28AE, // (⢮) BRAILLE PATTERN DOTS-23468
0x28AF, // (⢯) BRAILLE PATTERN DOTS-123468
0x28B0, // (⢰) BRAILLE PATTERN DOTS-568
0x28B1, // (⢱) BRAILLE PATTERN DOTS-1568
0x28B2, // (⢲) BRAILLE PATTERN DOTS-2568
0x28B3, // (⢳) BRAILLE PATTERN DOTS-12568
0x28B4, // (⢴) BRAILLE PATTERN DOTS-3568
0x28B5, // (⢵) BRAILLE PATTERN DOTS-13568
0x28B6, // (⢶) BRAILLE PATTERN DOTS-23568
0x28B7, // (⢷) BRAILLE PATTERN DOTS-123568
0x28B8, // (⢸) BRAILLE PATTERN DOTS-4568
0x28B9, // (⢹) BRAILLE PATTERN DOTS-14568
0x28BA, // (⢺) BRAILLE PATTERN DOTS-24568
0x28BB, // (⢻) BRAILLE PATTERN DOTS-124568
0x28BC, // (⢼) BRAILLE PATTERN DOTS-34568
0x28BD, // (⢽) BRAILLE PATTERN DOTS-134568
0x28BE, // (⢾) BRAILLE PATTERN DOTS-234568
0x28BF, // (⢿) BRAILLE PATTERN DOTS-1234568
0x28C0, // (⣀) BRAILLE PATTERN DOTS-78
0x28C1, // (⣁) BRAILLE PATTERN DOTS-178
0x28C2, // (⣂) BRAILLE PATTERN DOTS-278
0x28C3, // (⣃) BRAILLE PATTERN DOTS-1278
0x28C4, // (⣄) BRAILLE PATTERN DOTS-378
0x28C5, // (⣅) BRAILLE PATTERN DOTS-1378
0x28C6, // (⣆) BRAILLE PATTERN DOTS-2378
0x28C7, // (⣇) BRAILLE PATTERN DOTS-12378
0x28C8, // (⣈) BRAILLE PATTERN DOTS-478
0x28C9, // (⣉) BRAILLE PATTERN DOTS-1478
0x28CA, // (⣊) BRAILLE PATTERN DOTS-2478
0x28CB, // (⣋) BRAILLE PATTERN DOTS-12478
0x28CC, // (⣌) BRAILLE PATTERN DOTS-3478
0x28CD, // (⣍) BRAILLE PATTERN DOTS-13478
0x28CE, // (⣎) BRAILLE PATTERN DOTS-23478
0x28CF, // (⣏) BRAILLE PATTERN DOTS-123478
0x28D0, // (⣐) BRAILLE PATTERN DOTS-578
0x28D1, // (⣑) BRAILLE PATTERN DOTS-1578
0x28D2, // (⣒) BRAILLE PATTERN DOTS-2578
0x28D3, // (⣓) BRAILLE PATTERN DOTS-12578
0x28D4, // (⣔) BRAILLE PATTERN DOTS-3578
0x28D5, // (⣕) BRAILLE PATTERN DOTS-13578
0x28D6, // (⣖) BRAILLE PATTERN DOTS-23578
0x28D7, // (⣗) BRAILLE PATTERN DOTS-123578
0x28D8, // (⣘) BRAILLE PATTERN DOTS-4578
0x28D9, // (⣙) BRAILLE PATTERN DOTS-14578
0x28DA, // (⣚) BRAILLE PATTERN DOTS-24578
0x28DB, // (⣛) BRAILLE PATTERN DOTS-124578
0x28DC, // (⣜) BRAILLE PATTERN DOTS-34578
0x28DD, // (⣝) BRAILLE PATTERN DOTS-134578
0x28DE, // (⣞) BRAILLE PATTERN DOTS-234578
0x28DF, // (⣟) BRAILLE PATTERN DOTS-1234578
0x28E0, // (⣠) BRAILLE PATTERN DOTS-678
0x28E1, // (⣡) BRAILLE PATTERN DOTS-1678
0x28E2, // (⣢) BRAILLE PATTERN DOTS-2678
0x28E3, // (⣣) BRAILLE PATTERN DOTS-12678
0x28E4, // (⣤) BRAILLE PATTERN DOTS-3678
0x28E5, // (⣥) BRAILLE PATTERN DOTS-13678
0x28E6, // (⣦) BRAILLE PATTERN DOTS-23678
0x28E7, // (⣧) BRAILLE PATTERN DOTS-123678
0x28E8, // (⣨) BRAILLE PATTERN DOTS-4678
0x28E9, // (⣩) BRAILLE PATTERN DOTS-14678
0x28EA, // (⣪) BRAILLE PATTERN DOTS-24678
0x28EB, // (⣫) BRAILLE PATTERN DOTS-124678
0x28EC, // (⣬) BRAILLE PATTERN DOTS-34678
0x28ED, // (⣭) BRAILLE PATTERN DOTS-134678
0x28EE, // (⣮) BRAILLE PATTERN DOTS-234678
0x28EF, // (⣯) BRAILLE PATTERN DOTS-1234678
0x28F0, // (⣰) BRAILLE PATTERN DOTS-5678
0x28F1, // (⣱) BRAILLE PATTERN DOTS-15678
0x28F2, // (⣲) BRAILLE PATTERN DOTS-25678
0x28F3, // (⣳) BRAILLE PATTERN DOTS-125678
0x28F4, // (⣴) BRAILLE PATTERN DOTS-35678
0x28F5, // (⣵) BRAILLE PATTERN DOTS-135678
0x28F6, // (⣶) BRAILLE PATTERN DOTS-235678
0x28F7, // (⣷) BRAILLE PATTERN DOTS-1235678
0x28F8, // (⣸) BRAILLE PATTERN DOTS-45678
0x28F9, // (⣹) BRAILLE PATTERN DOTS-145678
0x28FA, // (⣺) BRAILLE PATTERN DOTS-245678
0x28FB, // (⣻) BRAILLE PATTERN DOTS-1245678
0x28FC, // (⣼) BRAILLE PATTERN DOTS-345678
0x28FD, // (⣽) BRAILLE PATTERN DOTS-1345678
0x28FE, // (⣾) BRAILLE PATTERN DOTS-2345678
0x28FF, // (⣿) BRAILLE PATTERN DOTS-12345678

Do Braille labels need the spacer? We could allow pure-Braille labels (and map that spacer to hyphen?). I don’t see many other options.

Another alternative would be choosing a Braille open/close character, and only allowing Braille between those characters, like [⣿⣺⣯].eth. I guess that’s getting really weird.
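
A sketch of the first option (pure-Braille labels, with the blank pattern mapped to hyphen), just to show how small the rule is. The function name and the choice to return null for rejected labels are assumptions for illustration.

```javascript
const BRAILLE_BLANK = 0x2800;
const isBraille = (cp) => cp >= 0x2800 && cp <= 0x28FF;

// Hypothetical helper: accept a label only if every code point is a Braille
// pattern, mapping U+2800 (blank) to hyphen. Returns null otherwise.
function normalizePureBraille(label) {
  const cps = Array.from(label, (ch) => ch.codePointAt(0));
  if (cps.length === 0 || !cps.every(isBraille)) return null;
  return cps
    .map((cp) => (cp === BRAILLE_BLANK ? "-" : String.fromCodePoint(cp)))
    .join("");
}

console.log(normalizePureBraille("\u28FF\u2800\u28FA")); // ⣿-⣺
console.log(normalizePureBraille("a\u28FF"));             // null (mixed label)
```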


I’ve got no idea, but I’m guessing they do. I’ve only dipped my toe into Braille a little, and there are many different languages in it as well.

I just know that letters, numbers, and words are made up of a single round dot or a combination of them.

I made some Braille names because they look nice visually, and I was going to use them for various wallet addresses as something a bit different.


This seems reasonable to me.

I want to push back a bit on really getting into the weeds here, though. I thought we were close to a new normalisation standard, and I don’t want us to get bogged down reviewing all of Unicode.
