We think about yottabytes, but can't handle just one byte

Ask yourself (or a nearby computer person) this question:

How many bytes are needed to write the value of one byte?

If the answer is "one" (because a byte is a byte), that is a common error. One hexadecimal digit can express the value of only 4 bits. Therefore, in hexadecimal notation, two bytes (one per character) are needed to write the value of a single byte, which ranges from 00 to FF.
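
The count can be checked directly. The following minimal Python sketch (the helper name `hex_text` is made up for illustration) formats every one-byte value the conventional way and measures how many bytes the resulting text occupies, assuming an ASCII-compatible encoding.

```python
def hex_text(value: int) -> str:
    """Return the conventional two-digit hexadecimal text for one byte."""
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit in one byte")
    return format(value, "02X")


# Collect the text size (in bytes of ASCII) for every possible byte value.
sizes = {len(hex_text(v).encode("ascii")) for v in range(256)}
print(hex_text(0xFF), sizes)
```

The set `sizes` contains a single element, 2: every byte value costs two bytes of text.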

Programmers keep developing new and more expressive programming languages in which you can say more while writing less. Yet almost nobody is interested in something as "primitive" as more effective digits.

This article proposes a new notation for hexadecimal digits, which also allows the value of one byte to be written as a single character in a systematic and meaningful way.


The shape of a digit can also be positional

Once positional notation for numbers had been widely accepted, most people came to think that perfection had been achieved in this area and that nothing better could be developed. But the Arabic numerals in wide use today are not the final stage of development, because the shapes of the digits come from a historically accumulated mess, not from a system. There have been more orderly digits, for example Mayan and Babylonian digits, in which not only the order of digits in a number was positional, but the shape of each digit was built up positionally as well. This approach has been forgotten in today's Arabic digits.

fig01.png
Mayan digits

fig02.png
Babylonian digits

Scientists rethought number systems intensively in the 1970s, at the beginning of the computer era. The problem was (and still is) that computers find it easy to group bits in fours, but the conventional decimal digits 0 to 9 are not enough to name all the values of four bits. Therefore computer science uses the so-called hexadecimal system (which, according to Donald Knuth, would more correctly be called the "senidenary" or "sedenary" system). At first, new digit shapes were proposed to distinguish them from decimal digits. Bruce Martin proposed writing hexadecimal digits positionally, with the significant bits shown as ordered horizontal lines joined by a vertical line.

fig03.png
Hexadecimal digits proposed by Bruce Martin

Although the proposal was good in terms of order, it was poor in terms of ease of writing, because such digits take three or four strokes, while decimal digits can be written with two (more or less sophisticated) strokes. Today the accepted hexadecimal digits are the decimal digits 0 to 9 supplemented with the Latin letters A to F. This seems customary now, but it is not always a comfortable approach. (Because the same digits are used for different bases, programming languages need an additional marker, such as the prefix 0x, to indicate base 16.)

Finding the simplest form for a 4-bit digit is quite simple: one needs to show the presence or absence of each bit in the digit. To show a 4-bit number you therefore need to show four bits, for example as four "boolean sticks" arranged in the shape of the letter o.

fig04.png
Hexadecimal digits with minimal redundancy but complicated writing and reading

As the picture shows, such redundancy-free digits are very hard to write and read correctly. (Taken to the extreme, zero could be an empty space, with only three bits for non-zero digits, but then these digits would not be readable at all.) Digits with hard-to-read vertical lines are underlined in red, and digits with hard-to-read horizontal lines are underlined in magenta.

However, with small changes to the digit shapes proposed by Bruce Martin, it is possible to develop hexadecimal digits that are both easy to write and easy to read, thanks to the (necessary) redundancy in their shape. I propose writing hexadecimal digits in an 8-shaped matrix in the following way. For better compatibility with binary and decimal digits, I propose ordering the significant bits from top to bottom: the top horizontal line is the +1 bit, the middle horizontal line is the +2 bit, and the bottom horizontal line is the +4 bit. The vertical lines encode the +8 bit and act as mutually exclusive switches for non-zero numbers: a digit has either the right-hand vertical line (+0, the +8 bit is not set) or the left-hand vertical line (+8). Only in zero are both of these lines present.

fig13.png
Rules for bit positioning in proposed digits
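
The positioning rule can be written down as a small Python sketch. The stroke names (`top`, `middle`, `bottom`, `left`, `right`) and the function names are labels chosen for this illustration, not terms from the fonts. The sketch follows only the rule as stated in the text; in particular, it assumes zero consists of just the two vertical lines and does not model how the drawn glyph closes into an O.

```python
# Weights of the horizontal strokes, ordered from top to bottom.
HORIZONTAL = {"top": 1, "middle": 2, "bottom": 4}


def strokes(nibble: int) -> frozenset:
    """Return the set of strokes for a 4-bit value, following the stated rule."""
    if not 0 <= nibble <= 0xF:
        raise ValueError("expected a 4-bit value")
    if nibble == 0:
        # Zero is the only digit that carries both vertical lines.
        return frozenset({"left", "right"})
    result = {name for name, weight in HORIZONTAL.items() if nibble & weight}
    # Exactly one vertical line: left means +8, right means +0.
    result.add("left" if nibble & 8 else "right")
    return frozenset(result)


def value(stroke_set) -> int:
    """Read a digit back by summing the weights of its strokes."""
    total = sum(w for name, w in HORIZONTAL.items() if name in stroke_set)
    if "left" in stroke_set and "right" not in stroke_set:
        total += 8
    return total


for n in range(16):
    print(format(n, "X"), sorted(strokes(n)))
```

Under these assumptions all sixteen stroke sets are distinct, and `value(strokes(n))` gives back `n` for every 4-bit value.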

All digits in the proposed system look as follows.

fig05.png
Look of proposed hexadecimal digits

In this system zero can be drawn as either O or o. If the "bit sticks" are shown strictly formally, 1 should always be drawn as fig15.png, but for readability it can also be drawn as a simple vertical line I.

Because these digits are drawn systematically, "stick arithmetic" can be used to work out the value of a number. This could be especially enjoyable for small children.

fig07.png
Adding numbers shows their positional shape
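
One way to read this stick arithmetic, offered as an interpretation rather than as a description of what the figure shows: when two digits share no weighted stick (+1, +2, +4 or +8), overlaying their weighted sticks gives their sum. The right-hand "+0" line carries no weight and is ignored here. The function name `overlay` is hypothetical.

```python
def overlay(a: int, b: int) -> int:
    """Overlay the weighted sticks of two 4-bit digits that share no stick."""
    if not (0 <= a <= 0xF and 0 <= b <= 0xF):
        raise ValueError("expected 4-bit values")
    if a & b:
        # A shared stick means plain overlaying no longer equals addition.
        raise ValueError("digits share a stick; a carry would be needed")
    return a | b


# 1 (top stick) overlaid with 2 (middle stick) gives 3 (both sticks).
print(overlay(1, 2), 1 + 2)
```

When the digits do share a stick, for example 1 + 1, a carry into the next stick is needed, just as in ordinary binary addition.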

Because only the position of the "bit sticks" within the digit (its topology) matters for reading, such digits can be heavily modified to suit personal style, graphic design and other requirements while their shape remains easy to distinguish and their value easy to read. For example, I found that when writing these digits by hand I can draw 3 and A as differently rotated (and shaped) versions of 1. Of course, somebody may develop more stylish digits that are easier to read and write and still represent the bit positions well.

fig06.png
Proposed digits written by hand

Like handwritten digits, typeset digits can also have a more elaborate form than plain "LCD boxed" digits, to make them less monotonous and to stress their differences for easier reading. (For example, because 00 and FF are both "boxy" in the current hexadecimal form, it is quite hard to spot FFs in a long list of 00s.)

fig14.png
Digits can be more different than just equally shaped boxes

But all that was just a prelude. Now for the most important part.

With the proposed digits it is possible to write the value of a byte as a single character.

On an ordered foundation you can build more complicated things

If the shapes of digits are created in an orderly manner, they have a wonderful property: you can use them as "bricks" to build more complicated structures, for example a unique shape for an 8-bit digit. You can, for instance, join these digits in pairs into so-called ligatures.

fig08.png
Joining digits into a ligature to express the full value of a byte

The full table of 8-bit digits from 00 to FF then looks as follows.

fig09.png
Table of 8-bit numbers
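
The arithmetic behind such a table is the usual split of a byte into two 4-bit digits. A minimal Python sketch follows; the function names are hypothetical, and it assumes the first digit of a pair is the high half, as in conventional hexadecimal (how the two halves are placed inside a ligature glyph is defined by the figures, not by this code).

```python
def split_byte(byte: int) -> tuple:
    """Split one byte into its (high, low) 4-bit digits."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value from 0x00 to 0xFF")
    return byte >> 4, byte & 0xF


def join_digits(high: int, low: int) -> int:
    """Combine two 4-bit digits back into one byte."""
    return (high << 4) | low


# A 16 x 16 table: row = high digit, column = low digit.
table = [[join_digits(h, l) for l in range(16)] for h in range(16)]
print(split_byte(0x1A))
```

Each of the 256 cells corresponds to exactly one pair of digits, which is why a ligature per pair covers every byte value.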

Of course, with such digits one can write twice as much information in a string of the same length, using just one character per byte.

fig10.png
Comparison of classic hexadecimal numbers and newly proposed numbers

With such digits there is no need for the cumbersome one-byte-to-two-characters representation used in binary editors. Compare the classic representation, with two characters per byte, with the new representation in a binary editor.

fig11.png
Comparison of binary editor with classical and newly proposed hexadecimal digits

Because the width is the same, one character, a binary editor can in fact use a single view of the data and simply switch each byte between its numeric and its symbolic representation (e.g. 41 or A). It is also possible to merge the two views by showing Latin characters for readable bytes and numeric values for the other, system or non-visible, byte values.
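
The merged view could be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `combined_view` is hypothetical, "readable" is taken to mean visible ASCII (0x21 to 0x7E), and the two-character hex string stands in for what the proposed fonts would render as a single ligature.

```python
def combined_view(data: bytes) -> list:
    """Return one display cell per byte: a letter if visible, else hex."""
    cells = []
    for byte in data:
        if 0x20 < byte < 0x7F:
            cells.append(chr(byte))            # visible ASCII character
        else:
            cells.append(format(byte, "02X"))  # would render as one ligature
    return cells


print(combined_view(b"AB\x00\xff"))
```

Every byte produces exactly one cell, so the columns of such an editor stay aligned whichever representation a cell uses.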

Resources for practical exercises

Everything described above was prepared with new fonts developed using FontForge. Because FontForge is rather weak as a graphical editor, the actual glyph shapes were drawn in Inkscape (the shapes are saved in the file Ligaturas.svg) and then copied into FontForge.

  1. xDigitsClock.ttf shows the decimal digits 0 to 9 and the letters A to F as the newly proposed digits, drawn as "boolean sticks" in the style of an (old-fashioned LCD) digital watch.
  2. xDigitsSans.ttf shows the decimal digits 0 to 9 and the letters A to F in a sans style. If there is no space between "digits", they are joined in pairs into ligatures, giving the full 2^8 combinations of digits. To show strings of stand-alone digits, separate them with a space, which has zero width in this font. If a "normal" space is needed, use a non-breaking space or a tab.
  3. xDigitsSystem.ttf shows the bytes from 00 to FF as ligatures. It can be used to show the contents of binary files in an editor.
  4. If you do not use Linux, DejaVuSansMono.ttf may help: it was used as the template for the fonts above and therefore has the same metrics, notably width and height.
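
Text intended for display in xDigitsSans.ttf could be prepared along the following lines. This Python sketch relies only on the behaviour described in the list above (pairs of adjacent digits form ligatures, the ordinary space is zero-width, a non-breaking space gives a visible gap). The function names are hypothetical, and the use of uppercase letters is an assumption, since the list mentions only the letters A to F.

```python
NBSP = "\u00a0"  # visible gap; the ordinary space is zero-width in the font


def ligature_text(data: bytes, gap: str = NBSP) -> str:
    """Two uppercase hex digits per byte, bytes separated by a visible gap."""
    return gap.join(format(byte, "02X") for byte in data)


def standalone_digits(data: bytes) -> str:
    """Every hex digit separated by an ordinary space so no ligature forms."""
    return " ".join(data.hex().upper())


print(ligature_text(b"\x1a\xff"))
print(standalone_digits(b"\x1a\xff"))
```

Passing `gap=""` to `ligature_text` gives an unbroken run of digit pairs, which the font would then show as consecutive ligatures with no gap between them.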

Food for thought and future work

What should be done when ligatures need to be copied by hand? There are two options. The safer one, which takes more effort, is to write each ligature out as separate digits. Because the way ligatures are built is easy to understand, splitting them into individual digits is also easy. The other approach is to write the ligatures by hand as well. More investigation is needed to decide whether this more complicated writing style is worthwhile compared with separate digits.

Currently the 8-bit digits have no names, and there is no standard for how to pronounce them. One solution is to join the digit names in the same way as their shapes are joined: the names of the decimal digits and Latin letters are used, and the name of a joined digit has two roots with, for example, a stop consonant between them (for better audibility). For example, 1A is pronounced on-ka, BB is bek-be, FF is fek-fe, and so on.

Considering that Chinese readers learn thousands of characters, these digits could in theory be extended to 16-bit digits as well, with the digit names extended similarly (and systematically!).

(Everything in this section is just food for thought, and I encourage you to develop even better solutions.)

In science fiction movies, producers try to astonish us with artificially developed numbers, alphabets and languages (e.g. the Klingon language and the Matrix console). If pragmatically minded computer people are not yet ready to embrace these digits, I encourage at least writers and film makers to adopt them for science fiction, to show that digits can be used in a more organized way than we use them today.

fig12.png

P.S.
Of course, in time these digits could be used in the decimal system too ;-)

Created by Valdis Vītoliņš on 2015-05-03 07:17
Last modified by Valdis Vītoliņš on 2021-04-13 14:27
 
Creative Commons Attribution 3.0 Unported License