The question is simple, and it's about math and statistics. How do you count lin...

ssmoot · on March 28, 2014

So backwards first I guess. "per person". Ok. But given the range of programmers I guess that's not an incredible surprise. Yes the person is more important than the language. I'd buy that.

I guess "expression" seems semi-obvious to me since it's a standard rule in SBT. Variable assignments, return values and function bodies might get close.

  val a = 1 + 1

That would be an expression. Instantiating a DTO with a dozen fields, using keyword arguments and newlines between for clarity would be a single expression to me.

An if/else with a simple switch for the return value would be an expression for example. A more complex else case might have nested expressions though.
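A rough sketch of those cases in Scala, to make "expression" a bit more concrete. The `CustomerDto` type, its fields and the helper names are all invented for illustration (and trimmed to four fields rather than a dozen); nothing here comes from a real codebase.

```scala
// Hypothetical DTO, invented purely for illustration.
final case class CustomerDto(
  id: Long,
  name: String,
  email: String,
  active: Boolean
)

object ExpressionSketch {
  // One expression, even though it spans several lines:
  // keyword arguments, one per line for clarity.
  val customer: CustomerDto = CustomerDto(
    id = 42L,
    name = "Ada",
    email = "ada@example.com",
    active = true
  )

  // if/else is itself an expression in Scala: the whole thing yields a value.
  def label(c: CustomerDto): String =
    if (c.active) "active" else "inactive"

  // A more complex else branch contains nested expressions.
  def greeting(c: CustomerDto): String =
    if (c.active) s"Hello, ${c.name}"
    else {
      val domain = c.email.dropWhile(_ != '@').drop(1)
      s"Account at $domain is disabled"
    }

  def main(args: Array[String]): Unit = {
    println(label(customer))
    println(greeting(customer))
  }
}
```

By the loose counting above, `customer` is one expression on six lines, `label` is one expression on one line, and `greeting` is one outer expression with a couple nested inside it.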

It takes some charity I suppose; one of those "I know it when I see it" things. I don't do a lot of math-based programming though. It's all business rules, DTOs, serialization, etc. So maybe not something that could be formalized too easily.

I guess where I'd intuitively disagree (and would be interested in further reading) is that LOC as a measure just doesn't feel like it works for me.

Considering only LOC to implement a task it's likely: Java, Ruby and Scala in that order (from most to fewest). But in my personal experience bugs are probably more like: Ruby, Java, Scala from most to fewest.

Hopefully that helps clarify and not just muddy what I'm trying to express further.

What confuses me is that you appear to be claiming that fewer LOC should correlate strongly with fewer bugs, but then go on to say that terser is not automatically better (in this context (sic?)). Maybe I'm reading more into it than you intend, but I'm left a bit confused.

beagle3 · on March 28, 2014

> one of those "I know it when I see it" things.

Which is a confusing use of the term "expression", since it is very well defined when talking about languages - in fact, most formal grammars have a nonterminal called "expr" or "expression" when describing the language.

Your description, though, more closely correlates with what most languages consider a statement.

Regardless, it's just pure statistics - if you calculate it, you'll notice that you have e.g. 1.3 expressions per line, with a standard deviation of 1 expression per line. Over 1000 lines the per-line noise averages out (the standard deviation of the total grows only with the square root of the line count, so about 32 here), which means you'll have, with better than 95% confidence, 1200-1400 expressions -- it wouldn't matter if you measure LOC or "expressions".
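For anyone who wants to check the arithmetic, here is a minimal sketch. It assumes per-line expression counts are independent with the same mean and standard deviation, and uses a normal approximation for the total; `totalInterval` and `Interval` are names made up for this example.

```scala
object ExpressionCountSketch {
  final case class Interval(low: Double, high: Double)

  // Assumption: lines are independent and identically distributed, so the
  // standard deviation of the total is sdPerLine * sqrt(lines).
  // z = 1.96 is the usual two-sided 95% value for a normal distribution.
  def totalInterval(
      lines: Int,
      meanPerLine: Double,
      sdPerLine: Double,
      z: Double = 1.96
  ): Interval = {
    val expectedTotal = lines * meanPerLine
    val sdOfTotal = sdPerLine * math.sqrt(lines.toDouble)
    Interval(expectedTotal - z * sdOfTotal, expectedTotal + z * sdOfTotal)
  }

  def main(args: Array[String]): Unit = {
    val ci = totalInterval(lines = 1000, meanPerLine = 1.3, sdPerLine = 1.0)
    println(f"${ci.low}%.0f to ${ci.high}%.0f expressions")
  }
}
```

With these numbers the 95% interval comes out to roughly 1240 to 1360, comfortably inside the 1200-1400 quoted above. If lines are strongly correlated (say, whole files in one style) the interval would be wider, but the relative spread still shrinks as the line count grows.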

> What confuses me is that you appear to be claiming that fewer LOC should correlate strongly with fewer bugs, but then go on to say that terser is not automatically better (in this context (sic?)). Maybe I'm reading more into it than you intend, but I'm left a bit confused.

What I'm claiming is that, when people actually measured this, they found out that a given programmer tends to have a nearly constant number of bugs per line, regardless of language - that is, person X tends to have (on average) one bug per 100 lines, whether those lines are C, Fortran or Basic - the variance per programmer is way larger than the variance of that programmer per language.

Now, Peopleware, which references those studies (it's where I read about them), was written 20 years ago or so - so the Java or C++ considered wasn't today's Java/C++, and things like Scala and Ruby were not considered. However, I'd be surprised if they significantly changed the results - because those studies DID include Lisp, which -- even 20 years ago -- had everything to offer that you can get from Scala today.

So, in a sense - yes, you should write terse programs, regardless of which language you do that in. If you wrote assembly code using Scala syntax, and compiled with a Scala compiler - Scala is not helping you one bit.
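To illustrate the "assembly with Scala syntax" point, here is a small made-up example: the same computation written twice in Scala. The function names and the task (sum of squares of the even numbers) are invented for this sketch.

```scala
object TersenessSketch {
  // "Assembly in Scala syntax": manual index, mutable accumulator,
  // explicit branch. Twelve lines, each one a place to slip up.
  def sumOfEvenSquaresVerbose(xs: Array[Int]): Int = {
    var total = 0
    var i = 0
    while (i < xs.length) {
      val x = xs(i)
      if (x % 2 == 0) {
        total = total + x * x
      }
      i = i + 1
    }
    total
  }

  // The same computation using what the language actually offers.
  def sumOfEvenSquaresTerse(xs: Array[Int]): Int =
    xs.filter(_ % 2 == 0).map(x => x * x).sum
}
```

Both compile with the same compiler and return the same result; if bugs track line count, only the second version gets any benefit from being written in Scala.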