> Back in the 1980s, most of the major CPUs in the world were big-endian, while Intel bucked the trend being little-endian. The reason is that some engineer made a simple optimization back when the 8008 processor...
The article started talking about the VAX and how it was the gold standard everybody competed against.
The VAX is little endian.
Little endian is not a hack. It's a natural way to represent numbers. It's just that most languages on Earth write words left to right while writing numbers right to left.
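To make the ordering concrete, here's a quick C sketch (my own illustration, nothing from the article) that prints the bytes of a 32-bit value as they sit in memory:

    #include <stdio.h>
    #include <stdint.h>

    int main(void) {
        uint32_t x = 0x12345678;
        const unsigned char *p = (const unsigned char *)&x;

        /* On a little-endian machine this prints "78 56 34 12":
           the least significant byte sits at the lowest address.
           A big-endian machine would print "12 34 56 78". */
        for (int i = 0; i < 4; i++)
            printf("%02x ", p[i]);
        printf("\n");
        return 0;
    }

Read the little-endian dump right to left and it looks exactly like the number as you'd write it down.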
The history of why the 8008 was little-endian is interesting and predates the 8008. In 1970, the mostly forgotten company CTC released the Datapoint 2200, a desktop computer built from TTL chips (not a microprocessor) and sold as a programmable terminal. It had a serial processor using shift-register memory chips. It was an 8-bit processor but since it operated on one bit at a time, it had to start with the lowest bit to make addition work. As a result, it was little-endian.
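If it helps, here's a rough C sketch of what a bit-serial add looks like (my own illustration, not Datapoint's actual logic): the carry out of each bit position feeds the next one, so the bits have to arrive least-significant first, and the low-order byte naturally comes first in memory.

    #include <stdio.h>
    #include <stdint.h>

    /* One-bit-at-a-time adder: a single full adder reused 8 times,
       with the carry held in a flip-flop between steps. */
    static uint8_t serial_add(uint8_t a, uint8_t b) {
        uint8_t sum = 0, carry = 0;
        for (int i = 0; i < 8; i++) {           /* must start at bit 0 */
            uint8_t ai = (a >> i) & 1;
            uint8_t bi = (b >> i) & 1;
            uint8_t s  = ai ^ bi ^ carry;
            carry      = (ai & bi) | (carry & (ai ^ bi));
            sum       |= (uint8_t)(s << i);
        }
        return sum;
    }

    int main(void) {
        printf("%u\n", serial_add(200, 55));    /* prints 255 */
        return 0;
    }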
CTC talked to Intel and Texas Instruments to see if the processor could be put onto VLSI chips to replace the board of TTL chips. Texas Instruments produced the TMX 1795 processor, shortly followed by the Intel 8008, both processors cloning the Datapoint 2200's instruction set and architecture including little-endian. CTC rejected both processors and stuck with TTL. TI couldn't find another customer for the TMX 1795 and it vanished from history. Intel successfully marketed the 8008 as a general-purpose microprocessor. Its architecture was copied for the 8080 and then modified for the 16-bit 8086, leading to the x86 architecture that rules the desktop and server market. As a result, x86 has the little-endian architecture and other features of the Datapoint 2200. I consider the Datapoint 2200 to be one of the most influential processors ever, even though it's almost completely forgotten.
Hmm, a minor quibble: "VLSI" would normally be more than 10k gates or 100k transistors on a single chip, wouldn't it? But the 8008 had only about 3500 transistors (I don't see an exact count in http://www.righto.com/2016/12/die-photos-and-analysis-of_24....) and so probably about 1000–1500 gates, so I think it should be called "LSI" rather than "VLSI".
A funny thing about the 8008 is that Intel's manual for its instruction set is unnecessarily shitty — even if you didn't know the history with Datapoint, the Intel manual is obviously not by the people who designed the instruction set because it's in hexadecimal, a tradition sadly followed by the 8080 and 8086 manuals. The Datapoint manuals, by contrast, are all in octal, making the machine code enormously easier to understand. (The H8 I grew up with used an Intel chip, but the front panel monitor program used octal.)
Yes, it's kind of amazing how the 8080/Z80/8086 instruction sets make much more sense in octal, but are always displayed in hexadecimal. In hex you can kind of see some patterns, but everything is obvious in octal. The 6502 is also based on bit triples, but they grouped the bits from the top, so octal doesn't make things any better.
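A concrete example of what octal buys you on the 8080 (my own sketch, not from any manual): MOV is 01DDDSSS in binary, so in octal it's simply 1DS, and the register fields fall right out of the digits.

    #include <stdio.h>

    /* 8080 register codes: B=0 C=1 D=2 E=3 H=4 L=5 M=6 A=7 */
    static const char regs[] = "BCDEHLMA";

    int main(void) {
        unsigned char op = 0x78;                        /* MOV A,B */
        printf("opcode = %03o (octal)\n", (unsigned)op); /* prints 170 */
        if ((op >> 6) == 1 && op != 0x76)               /* 0x76 is HLT, the lone hole */
            printf("MOV %c,%c\n",
                   regs[(op >> 3) & 7],                 /* middle octal digit: dest   */
                   regs[op & 7]);                       /* low octal digit: source    */
        return 0;
    }

In hex the same opcode is 78h, which tells you nothing at a glance; in octal, 170 reads directly as "move into register 7 (A) from register 0 (B)".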
The Datapoint 2200, by the way, used decimal decoder chips to decode the octal parts of the instruction set and simply ignored the 8 and 9 outputs.
Veering radically off topic, Ken, I recently became aware of the US Navy's transistorized fleet-wide fire-control computers of the early 1960s. Is there anybody restoring those and presenting museum exhibits of their capabilities? They were fielded as replicated networked real-time load-sharing multi-processors in 1965! Why didn't the Apollo program draw on that experience?
Also, speaking of Apollo, Hal Laning made the first-ever actual compiler, in the mid-late-'50s, and a real-time multi-tasking load-shedding OS for Apollo, and was snubbed for the Turing Award every year after right up to his death. I never even encountered his name until I read Sunburst and Luminary.
> They were fielded as replicated networked real-time load-sharing multi-processors in 1965! Why didn't the Apollo program draw on that experience? [...] Why do we ignore the real pioneers?
I think early computer history in general worked this way. Communication was slow before the Internet. Computer engineering largely existed as an art of building isolated one-off systems. Basic concepts were painstakingly reinvented and re-engineered all the time under different circumstances. The lack of standard architectures made transfer of concepts difficult. And in the end, many computer designs died in isolation and obscurity; their architectures left no influence, and no trace of their heritage can be found in later computers. The game only changed with minicomputers, which turned computers into mass-market products. Microprocessors did it to an even greater degree.
Take the SIGSALY system, built in late World War II: it already pioneered digital-to-analog conversion, digital signal processing, digital PCM audio, speech compression codecs, and cryptography. But I doubt later DSP engineers ever heard of this project.
Or consider Multics, a relatively late and well-known system. Today it would be called a "cloud computing" operating system, designed for a world where computers are public utilities providing time-sharing service to the public, much like the telephone or electricity. It was designed for strong reliability and security. But its architecture was all but forgotten (the hierarchical file system survived in Unix, but that was a minor feature compared to Multics's real achievements - and I suspect it wasn't even the first time the hierarchical file system was invented). Why didn't later computers reuse its design and concepts? For starters, Multics was designed to run on a modified GE-600-series mainframe with customized hardware. Nobody else in the world had the same machine.
It's why it's crucial to preserve any remaining historical materials. Otherwise we won't even know of their existence.
Since Unix was written as an effort to recapture the good parts of Multics after Bell Labs pulled out, and the people working on Unix for the first several years had also been working on Multics before that, I don't think Multics is a good example of "Basic concepts were painstakingly reinvented and re-engineered all the time under different circumstances. … died in isolation and obscurity; their architectures left no influence." The people who worked on Unix for its first decade were intimately familiar with Multics, and Multics was well documented publicly at the time, unlike certain other systems from the 01960s. The research literature in subsequent years frequently compared and contrasted systems designs with Multics's design.
Even some of the Multics features that weren't in Unix in its first decade, like ACLs, memory-mapped files, process accounting, shared libraries, and SMP support, got added to Unix later.
All this is to say that, to the extent that later systems rejected Multics's design decisions, they did it consciously, not out of ignorance. It's easy to look back at the things Multics attempted, like strong security, and believe that it achieved them, and consequently that more recent systems designs represent backsliding. In many cases, though (like that one!) it did not, and later systems designs solve unanticipated problems that arose from the Multics design choices.
A lot of DSP, speech codec, and even TCP/IP work in later decades was guided by NSA people who were familiar with the SIGSALY history, even if they didn't tell the uncleared people they were working with. See https://ee.stanford.edu/~gray/lpcip/ for a detailed history of speech protocols. In other cases, like the DES, the NSA people deliberately sabotaged the resulting work.
I do agree that preserving historical materials is important.
Are you talking about the Navy Tactical Data System (NTDS)? This important system (that Cray worked on) is mostly forgotten, but the book "When Computers Went to Sea" provides good coverage. There's one on exhibit at the Computer History Museum. Apollo did make heavy use of them: the Univac 642B computers from NTDS were used at the ground stations around the world, relaying data to Mission Control.
Shit, that's brilliant! I guess the 74138 didn't exist yet? JiaLiChuang PCB has the 74HC138 in their "basic" parts list for free PCB pick-and-place but no decimal decoders. The CD405[123] did exist but were presumably far too slow for Datapoint.
VAX's predecessor, the PDP-11, is also a little-endian architecture in its basic form (and the PDP-11 was also a major source of influence to many microprocessor designers, just like VAX's influence on Unix workstations).
The "PDP-endian" is only a quirk due to its Floating Point Unit's long integer and double-precision floating point formats. The FPU was an extra module attached to the processor, and the original PDP-11 did not even have an FPU. It only appeared on later models: on low-end machines a simplified FPU version was available for separate purchase with limited functionalities, and only high-end models had the full FPU. On a system without FPU installed, you basically don't need to worry about "PDP-endian", it's a pure little-endian machine. But for convenience, the Unix C compiler always stored long integers in PDP-endian to avoid swapping endians. Because the same Unix and C software ran on all machines with or without FPU, all Unix programmers needed to worry about it, thus the PDP-endian folklore.
But why did the PDP-11 FPU use this strange format? @aka_pugs on Twitter did some digging and found that the PDP-endian format was already in use as a softfloat format in DEC's PDP-11 Fortran compiler. So the FPU was made compatible with that...
PDP-11 long ints were mixed-endian: high 16 bits first, then low 16 bits. But within each half, the low 8 bits came first, then the upper 8 bits. It was sometimes called the "NUXI" format, for how it scrambled the bytes in "UNIX".
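For anyone who hasn't seen it spelled out, here's a tiny C sketch (mine, just for illustration) of that layout:

    #include <stdio.h>
    #include <stdint.h>

    /* PDP-11 "middle-endian" long: high 16-bit word first, then low word,
       but each word stored low byte first. */
    static void pdp_store32(uint8_t mem[4], uint32_t v) {
        uint16_t hi = (uint16_t)(v >> 16);
        uint16_t lo = (uint16_t)(v & 0xFFFF);
        mem[0] = hi & 0xFF;  mem[1] = hi >> 8;   /* high word, low byte first */
        mem[2] = lo & 0xFF;  mem[3] = lo >> 8;   /* low word, low byte first  */
    }

    int main(void) {
        uint8_t mem[4];
        pdp_store32(mem, 0x554E4958u);           /* bytes 'U' 'N' 'I' 'X' */
        printf("%c%c%c%c\n", mem[0], mem[1], mem[2], mem[3]);  /* prints NUXI */
        return 0;
    }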
What I was saying is that the basic PDP-11, as originally designed, is a pure 16-bit machine with no 32-bit capabilities. The Unix C and other compilers used the middle-endian format for 32-bit integers largely as an artificial choice to be compatible with the FPU, since middle-endian was the FPU's native long-integer format. But the FPU was only a later hardware extension, not an inherent part of the basic system, and the format is not enforced by the basic PDP-11 instruction set. It's entirely possible to modify the Unix C compiler to store long integers in little-endian order.
OTOH Arabic got its numerals from Indic systems, and Indic scripts are generally LTR, which points to big endian being “more natural”. It also matches the spelling of positional numerals in most languages, and it does make sense from a convenience point of view: when talking, it's easier to round off by just stopping as you go than to figure out what rounding you should apply beforehand.
If you spell out numbers in little-endian, once you start you’re committed to spelling it out in full, whereas big endian lets you stop at basically any point you feel like.
Yes, most modern languages have lost little-endianness, and those that kept it use it only for the first 100 numbers.
That's because, as you correctly point out, for bigger and bigger numbers big endian is more useful when saying them aloud, since you can often ignore the less significant digits.
My original point was different though:
You can easily render little endian hexdumps equally readable as big endian hexdumps by just writing them in the order that is meant for numbers, namely right to left.
We even align numbers to the right in spreadsheets. That's the same thing.
Look at an old DEC manual (Digital Unix or VMS) and you'll see hexdumps where the numeric part is aligned from the center towards the left and the ASCII part from the center to the right.
With this layout you can easily read multibyte numbers naturally.
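Roughly like this (my own approximation of that layout in C, not the exact DUMP output): the hex field runs right to left, so a little-endian 32-bit value reads naturally, while the ASCII field runs left to right.

    #include <stdio.h>
    #include <ctype.h>
    #include <string.h>
    #include <stdint.h>

    /* Dump 8 bytes per line: hex running right-to-left (lowest address at the
       right edge of the hex field), ASCII running left-to-right. */
    static void dump_line(const unsigned char *p, size_t n) {
        for (size_t i = 0; i < 8; i++) {         /* hex: highest address first */
            size_t j = 7 - i;
            if (j < n) printf("%02X ", p[j]);
            else       printf("   ");
        }
        printf("  ");
        for (size_t i = 0; i < n; i++)           /* ASCII: lowest address first */
            putchar(isprint(p[i]) ? p[i] : '.');
        putchar('\n');
    }

    int main(void) {
        unsigned char buf[8];
        uint32_t x = 0x12345678;                 /* little-endian host assumed */
        memcpy(buf, &x, sizeof x);
        memcpy(buf + 4, "ABCD", 4);
        dump_line(buf, 8);   /* hex field ends "... 12 34 56 78", so the value
                                reads left to right just as you would write it */
        return 0;
    }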