Consistent? What do you mean, "consistent"? Sometimes it's comma separated, some...

paulddraper · on Oct 10, 2017

> Sometimes it's comma separated, sometimes it's semicolon separated (depending on the user's locale), sometimes it's separated by tabs

CSV is comma separated. [1]

Valid YAML

    foo: bar baz

Invalid YAML

    foo: "bar" baz

Valid YAML

    foo: "bar baz"

Invalid YAML

    foo: "bar baz

Valid YAML

    foo: bar baz"

[1] https://tools.ietf.org/html/rfc4180

efreak · on Oct 10, 2017

You would think so, but people are dumb. I've seen tab-delimited files named .csv instead of .tsv, and I've also seen the semicolon delimiter a few times, though I can't recall where. I think Excel actually pops up a prompt when importing to confirm the delimiter in some cases?

From your link, it's quite clear that you should not assume any particular CSV file to follow any particular rules.

> Interoperability considerations: Due to lack of a single specification, there are considerable differences among implementations. Implementors should "be conservative in what you do, be liberal in what you accept from others" (RFC 793 [8]) when processing CSV files. An attempt at a common definition can be found in Section 2....

> Published specification: While numerous private specifications exist for various programs and systems, there is no single "master" specification for this format. An attempt at a common definition can be found in Section 2.

Section 2 states:

> This section documents the format that seems to be followed by most implementations:

Piskvorrr · on Oct 11, 2017

"All theory, dear friend, is gray, but the golden tree of life springs ever green." -Goethe

If CSV were indeed always comma-separated, my hair would be at least 5% less gray. Alas, most programs emit semicolon-separated "CSV" in some locales (MS Office, LibreOffice, you-name-it-they-got-it).

Of course, I understand that your academic position "if it chokes the RFC-compliant parser, it's not a True CSV and should be sent to /dev/null" tautologically exists - but for some reason, users tend to object to such treatment (especially when they have no useful tools that would emit your One True Format for them).

TL;DR: there is no single standard fitting all the things that call themselves "CSV".
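For what it's worth, here is a rough sketch of what "be liberal in what you accept" can look like in practice, using Python's standard `csv` module. The `read_rows` helper and the candidate delimiter set (comma, semicolon, tab) are my own assumptions for illustration, not anything the RFC prescribes, and `csv.Sniffer` is only a heuristic that can guess wrong on short or irregular input.

```python
import csv
import io

# Assumed candidate set: the delimiters mentioned in this thread.
CANDIDATE_DELIMITERS = ",;\t"


def read_rows(text):
    """Parse delimited text, guessing the delimiter instead of assuming a comma."""
    try:
        # Heuristic guess, restricted to the candidate delimiters above.
        dialect = csv.Sniffer().sniff(text, delimiters=CANDIDATE_DELIMITERS)
    except csv.Error:
        # No consistent delimiter found (e.g. a single-column file);
        # fall back to the comma-separated "excel" dialect.
        dialect = csv.excel
    return list(csv.reader(io.StringIO(text), dialect))


# Three files that all call themselves "CSV".
comma = "name,city\nAnna,Prague\nBob,Brno\n"
semicolon = "item;price\nfoo;1,5\nbar;2,25\n"  # decimal commas, semicolon-separated
tab = "name\tcity\nAnna\tPrague\nBob\tBrno\n"

for sample in (comma, semicolon, tab):
    print(read_rows(sample))
```

Note that this only papers over the delimiter question. Quoting rules, encodings, and decimal separators inside the fields are still up to whoever produced the file.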

nur0n · on Oct 10, 2017

You seem like the perfect person to ask: what is a format that is close to the (apparent) simplicity of CSV, but is actually consistent?

Piskvorrr · on Oct 11, 2017

I am so sorry.

In other words, as soon as you start exchanging data, you'll get something that is complex, broken, or (most common case) both. Existence of a simple, consistent general format has not been conclusively proven impossible, but I have yet to see one in practice.

(Of course, everybody and their dog have cooked up simple data schemes, yes, but those are a) domain-specific, and b) not in widespread use.)