
> It's time for big endian to be consigned to the dustbin of history.

And especially what most people call big-endian, which is a bastardized mixed-endian mess: the most significant byte is numbered zero, while the least significant bit is likewise numbered zero.



While I have a strong personal preference for little endian, one thing I've always appreciated about IBM System/360 and its successors is that it at least has consistent notational conventions: most significant byte first, most significant bit zero[1][2].

[1] https://bitsavers.trailing-edge.com/pdf/ibm/360/princOps/A22...

[2] https://www.ibm.com/docs/en/SSQ2R2_15.0.0/com.ibm.tpf.toolki...
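
For concreteness, a tiny sketch (my own illustration, not from the manuals) of how IBM's MSB-0 numbering relates to the LSB-0 numbering most other architectures use:

    # MSB-0 (System/360 style): bit 0 is the most significant bit.
    # LSB-0 (the more common convention): bit 0 is the least significant bit.
    def msb0_to_lsb0(bit, width=32):
        return width - 1 - bit

    assert msb0_to_lsb0(0) == 31   # S/360 "bit 0" is the 2**31 place
    assert msb0_to_lsb0(31) == 0   # S/360 "bit 31" is the units place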


> IBM System/360 and its successors ... at least has consistent notational conventions

Yes, if I hadn't known about that, I probably wouldn't have written "most."

> While I have a strong personal preference for little endian

Despite the purportedly even-handed treatment given in the seminal paper:

https://www.rfc-editor.org/ien/ien137.txt

That paper was obviously a product of motivated reasoning. And motivated reasoning in the hands of an intelligent and articulate person is always dangerous.

(Today, in the public sphere, we are seeing successful motivated reasoning by people who are much less intelligent and articulate, but that is a completely separate issue.)

The primary benefit (from observation of past arguments) that big-endian has is that when you are dumping data and looking at a sequence of bytes, you don't have to mentally swap them around.

But that itself raises the question: if you are so keen on putting the big end first, why does your dump start at the small end of memory?


Not just that. If you store a lower-precision type (e.g. a 2-byte integer) inside a higher-precision type (e.g. an 8-byte integer), you can read the 2-byte value by just reading the first two bytes. Extending or shrinking data types this way leads to a very natural implementation of arbitrary-precision arithmetic. To get the same capability with big endian, your pointer has to point at the end of the number. And if you wanted byte arrays, like strings, to behave consistently with that, you would have to reverse the order in which you lay data out, starting from the end of the array and always decrementing your index from the array pointer. The same would then apply to structs: you would point at the end of the struct and subtract the field offsets.
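
A little Python sketch (my illustration of the property described above, using struct rather than raw pointers) of why the narrow read works in little endian but needs an end pointer in big endian:

    import struct

    value = 1234                                # fits in 2 bytes
    le = struct.pack("<q", value)               # 8-byte little-endian encoding
    (narrow,) = struct.unpack("<h", le[:2])     # read only the first 2 bytes
    assert narrow == value                      # low-order bytes come first

    be = struct.pack(">q", value)               # 8-byte big-endian encoding
    (narrow,) = struct.unpack(">h", be[-2:])    # must read from the *end*
    assert narrow == value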

Overall this seems like a pretty weird choice on a planet where the vast majority of text is written from left to right and only numbers are written right to left. Especially since endianness only affects byte order but not bit order, as you said.


The primary benefit touted for big-endian is "When I do a memory dump, the data looks right."

But if you really believe the left side is bigger, why do you put the smaller memory address on the left side of your dump?
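
To make that concrete, a quick illustration (mine, not the parent's) of how the same 32-bit value looks in a lowest-address-first dump under each byte order:

    import struct

    print(struct.pack(">I", 0x0A0B0C0D).hex(" "))  # 0a 0b 0c 0d -- big endian "reads right"
    print(struct.pack("<I", 0x0A0B0C0D).hex(" "))  # 0d 0c 0b 0a -- little endian looks reversed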


> And especially what most people call big-endian, which is a bastardized mixed-endian mess: the most significant byte is numbered zero, while the least significant bit is likewise numbered zero.

In the 1980s at AT&T Bell Labs, I had to program 3B20 computers to process the phone network's data. 3B20s used the weird byte order 1324 (maybe it was 2413), and since the various switches that sent data didn't define endianness, I had to tweak the network protocols to start packets with a BOM (byte order mark) and then swap bytes accordingly.
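
A simplified sketch of that BOM trick (my reconstruction, assuming only two candidate byte orders rather than the 3B20's four-byte shuffle):

    import struct

    BOM = 0xFEFF  # the conventional byte-order-mark value

    def read_u16_payload(packet):
        (marker,) = struct.unpack(">H", packet[:2])
        fmt = ">H" if marker == BOM else "<H"   # a swapped BOM reads as 0xFFFE
        return [struct.unpack(fmt, packet[i:i+2])[0]
                for i in range(2, len(packet), 2)]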

Lesson learned was Never Ignore Endian issues.


While I have no personal experience with the 3B2 series, its documentation[1] clearly illustrates the GP's complaint: starting from the most significant binary digit, bit numbers decrease while byte addresses increase.

As for networking, Ethernet is particularly fun: bits within each byte go least significant first, multi-byte fields go most significant byte first, and the 32-bit CRC for a frame of k bits is calculated by treating bit n of the frame as the coefficient of the (k - 1 - n)th-order term of a polynomial of order k - 1, with the coefficients of the resulting 31st-order remainder polynomial sent highest-order first.

[1] https://vtda.org/docs/computing/AT&T/3B2/3b2_Assembly_Lang_P...
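
That LSB-first bit order is baked into the usual software implementation: a minimal sketch of the standard reflected CRC-32 (the Ethernet polynomial, processed low bit first), checkable against the stdlib:

    import binascii

    def crc32(data):
        # 0xEDB88320 is the bit-reversed Ethernet polynomial 0x04C11DB7
        crc = 0xFFFFFFFF
        for byte in data:
            crc ^= byte
            for _ in range(8):                  # one step per bit, LSB first
                crc = (crc >> 1) ^ (0xEDB88320 if crc & 1 else 0)
        return crc ^ 0xFFFFFFFF

    assert crc32(b"123456789") == binascii.crc32(b"123456789")  # 0xCBF43926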


I know this particular pain intimately.

I was in charge of the firmware for a modem. I had written the V.42 error correction, and we contracted out the addition of the MNP correction protocol. Both protocols used the same CRC.

The Indian (only important because of their cultural emphasis on book learning) subcontractor found my CRC function, decided it didn't quite look like the academic version they were expecting, and added code to swap it around and use it for MNP, thus making it wrong.

When I pointed out it was wrong, they claimed they had tested it. By having one of our modems talk to another one of our modems. Sheesh.


> Lesson learned was Never Ignore Endian issues.

This is an excellent lesson for data transport protocols and file formats.

> since the various switches that sent data didn't define endianness, I had to tweak the network protocols to start packets with a BOM (byte order mark) and then swap bytes accordingly.

(A similar thing happened to me with the Python switch from 2 to 3. Strings all became Unicode, and it was too difficult to add the b sigil in front of every string literal in a large codebase, so I simply ensured that, at the very few places where data was transported to or from files, all the strings were properly converted to what the internal processing expected.)
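
In sketch form (my illustration, not the actual codebase), that boundary-conversion approach looks like this:

    def read_text(path):
        with open(path, "rb") as f:
            return f.read().decode("utf-8")    # bytes -> str at the boundary

    def write_text(path, text):
        with open(path, "wb") as f:
            f.write(text.encode("utf-8"))      # str -> bytes at the boundary

    # Everything between these boundaries handles str only.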

But, as many other commenters have rightly noted, big-endian CPUs are going the way of CPUs with 18-bit bytes and ones'-complement arithmetic, so unless you have a real need to run your program on a dinosaur, you can safely forget about CPU endianness issues.



