© 2024 Blockchain Commons
Authors: Wolf McNally, Christopher Allen
Date: December 7, 2024
Bytemoji is a curated set of 256 emojis that are chosen to be easily recognized and distinguished from each other, especially when used in combination. Bytemoji are intended to be used as a simple and quick way to visually identify objects in digital systems, for example, by converting a 32-bit hash (e.g., CRC-32 or truncated SHA-256) to its four corresponding Bytemojis.
Other ways to visually identify digital objects include ByteWords and LifeHash. Bytemoji combine the value of cryptographic hash visualization with easy display and handling as text.
Unlike ByteWords, Bytemoji are not intended to "round-trip" data between text and binary format, although this is technically possible.
Each line below represents a combination of four bytes, represented as Bytemojis.
๐ ๐ฉ ๐ฅ ๐ซ
๐งต ๐ ๐ ๐
๐ซ ๐ค ๐ ๐
๐ช ๐ ๐ ๐ป
๐งธ ๐ฅ ๐ง ๐
๐ ๐ ๐ฌ ๐ง
๐งฆ ๐ฝ ๐ ๐ฆ
๐ ๐ญ ๐ฅบ ๐
๐ฅ ๐ฆ ๐น ๐ข
๐ฝ ๐ ๐บ ๐
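
The conversion described above, from a 32-bit hash to four Bytemojis, can be sketched in Python as follows. This is a minimal illustration, not the reference implementation: the function names are hypothetical, the big-endian byte order used for CRC-32 is an assumption, and `PLACEHOLDER_TABLE` is a stand-in for the real 256-entry Bytemoji table.

```python
import hashlib
import zlib


def crc32_bytes(data: bytes) -> bytes:
    """CRC-32 of `data` as four bytes (big-endian order is an assumption)."""
    return zlib.crc32(data).to_bytes(4, "big")


def truncated_sha256(data: bytes) -> bytes:
    """First four bytes of the SHA-256 digest of `data`."""
    return hashlib.sha256(data).digest()[:4]


def to_bytemoji(digest: bytes, table: list[str]) -> str:
    """Map each byte of `digest` to the table entry at that index."""
    if len(table) != 256:
        raise ValueError("table must have exactly 256 entries")
    return " ".join(table[b] for b in digest)


# Stand-in for the real table: entry i is just the byte value in hex.
PLACEHOLDER_TABLE = [f"<{i:02x}>" for i in range(256)]

print(to_bytemoji(truncated_sha256(b"hello"), PLACEHOLDER_TABLE))
# prints "<2c> <f2> <4d> <ba>"
```

Passing the table as a parameter keeps the mapping step independent of the choice of hash, so the same function serves CRC-32, truncated SHA-256, or any other byte string.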
Although Bytemoji are chosen partly for their visual distinctness, they are not intended to be individually identifiable. Bytemoji should never be displayed in isolation: they should always be displayed in clusters of four or more to represent cryptographic hashes. In addition, they should be clustered with other indicators of the digital object's unique identity, such as hex codes, ByteWords, or a LifeHash.
This clustering ensures sufficient visual distinction and reduces the risk of ambiguity, even if individual emojis may share some similar features. In this example, the Bytemojis, the ByteWords, and the raw hex representation are shown together, under the user's chosen name of the object:
**My First Cryptographic Seed**
๐ ๐น ๐ฝ ๐
JUGS DELI GIFT WHEN
71 27 4d f1
This mix of modalities further increases the likelihood of accurate recognition and decreases the risk of confusion. It is also useful for accessibility, as it provides multiple ways to present the information via assistive technologies.
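
As a rough sketch of how such a cluster might be assembled, the Python below renders the name, Bytemojis, ByteWords, and hex together and refuses to render fewer than four bytes. The function name `render_identity` and the mapping parameters are hypothetical. The four ByteWords come from the example above, while the Bytemoji entries are placeholders rather than the real table values.

```python
from typing import Mapping


def render_identity(name: str, digest: bytes,
                    bytemoji: Mapping[int, str],
                    bytewords: Mapping[int, str]) -> str:
    """Render the name, Bytemojis, ByteWords, and hex as one cluster."""
    if len(digest) < 4:
        raise ValueError("show at least four Bytemojis, never fewer")
    return "\n".join([
        f"**{name}**",
        " ".join(bytemoji[b] for b in digest),
        " ".join(bytewords[b].upper() for b in digest),
        " ".join(f"{b:02x}" for b in digest),
    ])


digest = bytes.fromhex("71274df1")
# ByteWords taken from the example above; Bytemoji entries are
# placeholders, not the real table values.
words = {0x71: "jugs", 0x27: "deli", 0x4D: "gift", 0xF1: "when"}
emoji = {b: f"[{b:02x}]" for b in digest}
print(render_identity("My First Cryptographic Seed", digest, emoji, words))
```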
See our previous work on the Object Identity Block (OIB) for more information on identifying digital objects.
The byte sequences that encode emojis can become quite long and complex:
- Some emojis that render as a single glyph use several combining forms. For example, "I am a witness" takes 17 UTF-8 bytes!
👁️‍🗨️
- Some emojis are sequences of several emojis joined together. For example, "family: man, woman, girl, boy with various skin tones" takes 41 UTF-8 bytes (eight 4-byte code points joined by three 3-byte zero-width joiners). Note that this is a single emoji!
👨🏿‍👩🏾‍👧🏽‍👦🏼
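
To see where those bytes come from, the following Python sketch breaks an emoji sequence into its code points and reports the UTF-8 size of each. The helper name `utf8_breakdown` is hypothetical. The "I am a witness" sequence is written with escapes so that the invisible variation selectors and zero-width joiner are visible in the source.

```python
def utf8_breakdown(text: str) -> list[tuple[str, int]]:
    """Return (code point, UTF-8 byte count) for each code point in `text`."""
    return [(f"U+{ord(ch):04X}", len(ch.encode("utf-8"))) for ch in text]


# "I am a witness" (eye in speech bubble): eye, variation selector,
# zero-width joiner, speech bubble, variation selector.
witness = "\U0001F441\uFE0F\u200D\U0001F5E8\uFE0F"

for code_point, size in utf8_breakdown(witness):
    print(code_point, size)
print(len(witness.encode("utf-8")))  # prints 17
```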
So to keep things simple while still providing a wide range of visual objects, we selected a set of 256 emojis that:
- All render as single glyphs.
- All have code points that serialize as 3 or 4 UTF-8 bytes.
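
The second of these criteria, and part of the first, can be checked mechanically. The sketch below uses a hypothetical helper, `is_candidate`, which tests only that a string is a single code point encoding as 3 or 4 UTF-8 bytes. It does not check that the code point is an emoji or how a given platform renders it. The code points used are arbitrary examples.

```python
def is_candidate(emoji: str) -> bool:
    """True if `emoji` is one code point that encodes as 3 or 4 UTF-8 bytes."""
    return len(emoji) == 1 and len(emoji.encode("utf-8")) in (3, 4)


print(is_candidate("\u2B50"))      # one 3-byte code point: prints True
print(is_candidate("\U0001F9F5"))  # one 4-byte code point: prints True
# A five-code-point sequence is rejected:
print(is_candidate("\U0001F441\uFE0F\u200D\U0001F5E8\uFE0F"))  # prints False
```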
In addition, we used these other selection criteria:
- All emojis are visually distinct, with maximally unique shapes and designs.
- All emojis must render on a wide range of platforms.
- Avoid emojis that are highly similar or could be easily confused.
- Avoid emojis that depend solely on color differences to be distinguished.
- Prefer emojis that read well at smaller sizes.
- Ensure that contrast is good when displayed on light or dark backgrounds.
- Exclude combining forms, skin tone modifiers, and gender modifiers.
- Ensure the set covers a wide range of themes and concepts.
- Prefer emojis with positive or neutral connotations.
- Avoid national, ideological, and controversial symbols.
|   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ๐ | ๐ | ๐ | ๐ | ๐ | ๐ | ๐ | ๐ | ๐ | ๐ญ | ๐ซ | ๐ฅฑ | ๐คฉ | ๐ถ | ๐คจ | ๐ซฅ |
| 1 | ๐ฅต | ๐ฅถ | ๐ณ | ๐คช | ๐ต | ๐ก | ๐คข | ๐ | ๐ค | ๐คก | ๐ฅณ | ๐ฅบ | ๐ฌ | ๐ค | ๐ | ๐คฏ |
| 2 | ๐ | ๐น | ๐บ | ๐ | ๐ป | ๐ฝ | ๐บ | ๐น | ๐ป | ๐ฝ | ๐ | ๐ฟ | ๐ซถ | ๐คฒ | ๐ | ๐ค |
| 3 | ๐ | ๐ | ๐ | ๐ | ๐ช | ๐ | ๐ฆท | ๐ | ๐ | ๐ง | ๐ | ๐ค | ๐ฆถ | ๐ | ๐ | ๐ |
| 4 | ๐ | ๐ | ๐ | ๐ | ๐ซ | ๐ | ๐ | ๐ | ๐ฅ | ๐ | ๐ฅ | ๐ฅฆ | ๐ | ๐ฝ | ๐ฅ | ๐ซ |
| 5 | ๐ง | ๐ฅ | ๐ฅฏ | ๐ | ๐ง | ๐ฅ | ๐ | ๐ญ | ๐ | ๐ | ๐ | ๐ฎ | ๐ฅ | ๐ฑ | ๐ | ๐ค |
| 6 | ๐ | ๐ฅ | ๐จ | ๐ฆ | ๐ | ๐ชด | ๐ต | ๐ฑ | ๐ | ๐ | ๐ | ๐น | ๐บ | ๐ผ | ๐ป | ๐ธ |
| 7 | ๐จ | ๐ | ๐ง | ๐ฆ | ๐ | ๐ | ๐ | ๐ | ๐ | ๐ | ๐ | ๐ | ๐ซ | โญ | ๐ช | ๐ |
| 8 | ๐ | ๐ | ๐ | ๐ | ๐ | ๐ | ๐ฉ | ๐ฌ | ๐ฏ | ๐ซ | ๐ด | ๐ท | ๐ฉ | ๐ | ๐บ | ๐ |
| 9 | ๐ | ๐ | ๐ | ๐ต | ๐จ | ๐ | ๐ | ๐ | ๐ฆ | ๐ฐ | ๐ก | ๐ข | ๐ | ๐ | ๐ | ๐ |
| A | ๐ช | ๐ช | ๐ | ๐ | ๐ฆ | ๐ซ | ๐ | ๐ | ๐ | ๐งฎ | ๐ | ๐ | ๐ท | โฐ | โณ | ๐ก |
| B | ๐ก | ๐ฐ | ๐งฒ | ๐งธ | ๐ | ๐ | ๐ | ๐ชญ | ๐ | ๐ซ | ๐ญ | ๐ | ๐ | ๐ฅ | ๐ท | ๐บ |
| C | ๐ | ๐ | ๐พ | ๐ | โจ | ๐ฅ | ๐ฅ | ๐ | ๐ | ๐ | ๐ฉณ | ๐ | ๐ | ๐งข | ๐ | ๐งถ |
| D | ๐งต | ๐ | ๐ | ๐ | ๐งฆ | ๐งค | ๐ | ๐ | ๐ฑ | ๐ถ | ๐ญ | ๐น | ๐ฐ | ๐ฆ | ๐ป | ๐ผ |
| E | ๐จ | ๐ฏ | ๐ฆ | ๐ฎ | ๐ท | ๐ธ | ๐ต | ๐ | ๐ฅ | ๐ฆ | ๐ฆ | ๐ด | ๐ฆ | ๐ | ๐ | ๐ฆ |
| F | ๐ | ๐ | ๐ข | ๐บ | ๐ | ๐ชฝ | ๐ | ๐ฆ | ๐ชผ | ๐ฆ | ๐ฆ | ๐ | ๐ฆญ | ๐ | ๐ฌ | ๐ณ |
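
The table reads as a 16×16 grid. Assuming, as the worked example earlier suggests, that the row label is the high nibble of the byte and the column label is the low nibble, the position of any byte can be computed as in this small Python sketch (the function names are hypothetical):

```python
def table_position(byte: int) -> tuple[str, str]:
    """Return the (row, column) hex labels for a byte value."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must be in range 0..255")
    row, column = divmod(byte, 16)
    return f"{row:X}", f"{column:X}"


def table_index(row: str, column: str) -> int:
    """Inverse of table_position: hex labels back to the byte value."""
    return int(row, 16) * 16 + int(column, 16)


print(table_position(0x4D))   # prints ('4', 'D')
print(table_index("F", "1"))  # prints 241
```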
The current reference implementation is in the bc-ur Rust crate. The bc-ur crate is available on crates.io and GitHub.
The following line contains all 256 Bytemojis in order, which may be used for testing and in further implementations:
๐๐๐๐๐๐๐๐๐๐ญ๐ซ ๐ฅฑ๐คฉ๐ถ๐คจ๐ซฅ๐ฅต๐ฅถ๐ณ๐คช๐ต๐ก๐คข๐๐ค ๐คก๐ฅณ๐ฅบ๐ฌ๐ค๐๐คฏ๐๐น๐บ๐๐ป๐ฝ๐บ๐น๐ป๐ฝ๐๐ฟ๐ซถ๐คฒ๐๐ค๐๐๐๐๐ช๐๐ฆท๐๐๐ง ๐๐ค๐ฆถ๐๐๐๐๐๐๐๐ซ๐๐๐๐ฅ๐๐ฅ๐ฅฆ๐
๐ฝ๐ฅ๐ซ๐ง๐ฅ๐ฅฏ๐๐ง๐ฅ๐๐ญ๐๐๐๐ฎ๐ฅ๐ฑ๐๐ค๐๐ฅ ๐จ๐ฆ๐๐ชด๐ต๐ฑ๐๐๐๐น๐บ๐ผ๐ป๐ธ๐จ๐๐ง๐ฆ๐๐๐๐๐๐๐๐๐ซโญ๐ช๐๐๐๐๐๐๐๐ฉ๐ฌ๐ฏ๐ซ๐ด๐ท๐ฉ๐๐บ๐๐๐๐๐ต๐จ๐๐๐๐ฆ๐ฐ๐ก๐ข๐ ๐ ๐๐๐ช๐ช๐๐๐ฆ๐ซ๐๐๐๐งฎ๐๐๐ทโฐโณ๐ก๐ก๐ฐ๐งฒ๐งธ๐๐๐๐ชญ๐๐ซ๐ญ๐๐๐ฅ๐ท๐บ๐๐๐พ๐โจ๐ฅ๐ฅ๐๐๐๐ฉณ๐๐๐งข๐๐งถ๐งต๐๐ ๐๐งฆ๐งค๐๐๐ฑ๐ถ๐ญ๐น๐ฐ๐ฆ๐ป๐ผ๐จ๐ฏ๐ฆ๐ฎ๐ท๐ธ๐ต๐๐ฅ๐ฆ๐ฆ๐ด๐ฆ๐๐๐ฆ๐๐๐ข๐บ๐๐ชฝ๐๐ฆ๐ชผ๐ฆ๐ฆ๐๐ฆญ๐๐ฌ๐ณ
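
An implementation might load its table from a string like the one above and sanity-check it before use. The following Python sketch does this under a few assumptions: whitespace in the string is ignored, each entry is a single code point, and entries must be unique. The function name `load_table` is hypothetical, and the input used here is a stand-in range of code points, not the real Bytemoji set.

```python
def load_table(text: str) -> list[str]:
    """Split a test string into 256 single-code-point entries."""
    entries = [ch for ch in text if not ch.isspace()]
    if len(entries) != 256:
        raise ValueError(f"expected 256 entries, found {len(entries)}")
    if len(set(entries)) != 256:
        raise ValueError("entries must be unique")
    for ch in entries:
        if len(ch.encode("utf-8")) not in (3, 4):
            raise ValueError(f"U+{ord(ch):04X} is not 3 or 4 UTF-8 bytes")
    return entries


# Stand-in input: 256 consecutive code points, NOT the real Bytemoji set.
stand_in = "".join(chr(0x1F300 + i) for i in range(256))
table = load_table(stand_in)
print(len(table))  # prints 256
```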
The rendering of emojis can vary significantly between platforms. The Bytemoji set was chosen to be as platform-independent as possible and the entire set should be universally supported where emojis are supported, but some variation in appearance is to be expected.