The question is simple, and it's about math and statistics.
How do you count lines? On unix, "wc -l"; if you insist, sloccount, but "wc -l" is a good approximation.
How do you count expressions? The fact it will take you a few paragraphs to answer (you haven't, btw) indicates that it's a poor thing to measure and try to reason about.
I've done some IO code in C# (mostly WCF, but not only), and I still think you are playing with semantics as far as the statistics are concerned.
Figure out an objective, automatable way to count your "expressions" or "compositions" or "code points" or "functional points" or whatever you want to call them. Run it on a code base alongside a plain line count, and compute the Pearson correlation coefficient r between the two. It's likely to be above 0.95, which means one is an excellent approximation of the other.
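A rough sketch of what such a measurement could look like, for illustration only. It assumes the code base is Python source and uses "number of statement nodes in the syntax tree" as one possible stand-in for an "expression" count; that proxy and the helper names (`count_lines`, `count_statements`, `pearson_r`) are my own choices, not something anyone in this thread defined.

```python
import ast
import math


def count_lines(source: str) -> int:
    """Count non-blank lines (close to `wc -l`, minus blank lines)."""
    return sum(1 for line in source.splitlines() if line.strip())


def count_statements(source: str) -> int:
    """Count statement nodes in the parsed tree.

    Assumption: a statement node is used as the 'expression' proxy here;
    any other objective, automatable count could be swapped in.
    """
    return sum(isinstance(node, ast.stmt) for node in ast.walk(ast.parse(source)))


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlate(sources):
    """Correlate per-file line counts with per-file statement counts."""
    lines = [count_lines(s) for s in sources]
    statements = [count_statements(s) for s in sources]
    return pearson_r(lines, statements)
```

Feeding `correlate` the contents of every file in a project gives one number to argue about. It needs at least two files whose counts differ, otherwise the denominator is zero.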
And I have no idea what you were trying to say about Scala. I wasn't saying "terser is automatically better". I was saying (and I'm quoting myself here): '"number of bugs per line" tends to be a low-variance statistic per person, with the programming language playing a minor role'. Note the "per person"?
So backwards first I guess. "per person". Ok. But given the range of programmers I guess that's not an incredible surprise. Yes the person is more important than the language. I'd buy that.
I guess "expression" seems semi-obvious to me since it's a standard rule in SBT. Variable assignments, return values and function bodies might get close.
val a = 1 + 1
That would be an expression. Instantiating a DTO with a dozen fields, using keyword arguments and newlines between for clarity would be a single expression to me.
An if/else with a simple switch for the return value would be an expression for example. A more complex else case might have nested expressions though.
It takes some charity, I suppose; it's one of those "I know it when I see it" things. I don't do a lot of math-based programming though. It's all business rules, DTOs, serialization, etc. So maybe it's not something that could be formalized too easily.
I guess where I'd intuitively disagree (and would be interested in further reading) is that LOC as a measure just doesn't feel like it works for me.
Considering only the LOC needed to implement a task, the order is likely Java, Ruby, Scala (from most lines to fewest). But in my personal experience the order for bugs is probably more like Ruby, Java, Scala (from most to fewest).
Hopefully that helps clarify and not just muddy what I'm trying to express further.
What confuses me is that you appear to be claiming that fewer LOC should correlate strongly with fewer bugs, but then go on to say that terser is not automatically better (in this context (sic?)). Maybe I'm reading more into it than you intend, but I'm left a bit confused.
Which is a confusing use of the term "expression", since it is very well defined when talking about languages - in fact, most formal grammars have a nonterminal called "expr" or "expression" when describing the language.
Your description, though, corresponds more closely to what most languages consider a statement.
Regardless, it's just pure statistics: if you calculate it, you'll notice that you have e.g. 1.3 expressions per line, with a standard deviation of 1 expression per line, which means that over 1000 lines you'll have, with 95% confidence, 1200-1400 expressions. It wouldn't matter whether you measure LOC or "expressions".
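To make the arithmetic behind that kind of estimate concrete, here is a minimal sketch. It assumes the per-line counts are independent and that a normal approximation applies to the total, so the total over n lines has mean n * mean and standard deviation sd * sqrt(n). The function name `total_interval` is hypothetical, and the 1.3 and 1 figures are just the example numbers from above.

```python
import math


def total_interval(mean_per_line, sd_per_line, n_lines, z=1.96):
    """Approximate 95% interval for the total count over n_lines.

    Assumptions: per-line counts are independent and the total is
    roughly normal (central limit theorem); z=1.96 is the two-sided
    95% quantile of the standard normal distribution.
    """
    expected = mean_per_line * n_lines
    spread = z * sd_per_line * math.sqrt(n_lines)
    return expected - spread, expected + spread


# Example figures from the comment: 1.3 per line, standard deviation 1, 1000 lines.
low, high = total_interval(1.3, 1.0, 1000)
print(f"{low:.0f} to {high:.0f}")
```

Under those assumptions the interval is roughly 1300 plus or minus 62, which sits inside the 1200-1400 range. Real code has correlated lines, so treat this as an illustration of the reasoning rather than a measurement.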
> What confuses me is that you appear to be claiming that fewer LOC should correlate strongly with fewer bugs, but then go on to say that terser is not automatically better (in this context (sic?)). Maybe I'm reading more into it than you intend, but I'm left a bit confused.
What I'm claiming is that, when people actually measured this, they found that a given programmer tends to have a nearly constant number of bugs per line, regardless of language. That is, person X tends to have (on average) one bug per 100 lines, whether those lines are C, Fortran or Basic. The variance between programmers is far larger than the variance across languages for any one programmer.
Now, Peopleware, which references those studies (and is where I read about them), was written 20 years ago or so, so the Java or C++ considered wasn't today's Java/C++, and things like Scala and Ruby were not considered. However, I'd be surprised if they significantly changed the results, because those studies DID include Lisp, which, even 20 years ago, had everything to offer that you can get from Scala today.
So, in a sense, yes: you should write terse programs, regardless of which language you do that in. If you write assembly-style code using Scala syntax and compile it with a Scala compiler, Scala is not helping you one bit.