[tlug] Code Readability (was: perl?)

Date: Sun, 21 Aug 2016 05:15:32 +0900
From: "Stephen J. Turnbull" <turnbull.stephen.fw@example.com>
Subject: [tlug] Code Readability (was: perl?)
References: <22451.6786.829094.726@turnbull.sk.tsukuba.ac.jp> <20160819132402.GC30780@quadratic.cynic.net>

Curt Sampson writes:
 > On 2016-08-15 17:14 +0900 (Mon), Stephen J. Turnbull wrote:

 > > Or all of them!  (Hi, Curt!)  But that hurts readability, too, not all
 > > developers can be that multilingual, and sometimes you do need to read
 > > code that others maintain.
 > 
 > Oh, hi!
 > 
 > A lot of people think about readability as a property of a language,
 > but that's not right.

Anaphoric argumentum ad hominem?  (== I may believe that Python is
relatively readable, but that ain't what you quoted!)

My point was entirely in line with yours:  If your audience is not
multilingual, using many languages in your code base (or even across
code bases in your organization) is going to harm readability.

On to discussion of *your* points.

 > In code that might have to be maintained by random people of varying
 > talents and skill levels who don't program much, and certainly rarely
 > touch your project, you might well be advised to write:
 > 
 >     def double_each_number_in_array(array_of_numbers)
 > 	array_of_doubles = []
 > 	for number in array_of_numbers
 > 	    array_of_doubles.append(number * 2)
 > 	end
 > 	return array_of_doubles
 >     end

Which in Python you would write

    def double_each_number_in_array(array_of_numbers):
        array_of_doubles = []
        for number in array_of_numbers:
            array_of_doubles.append(number * 2)
        return array_of_doubles

saving two lines at the expense of two characters, and equally
(un)readable.  If you're allowed to mutate (as suggested by the
name!), you could save more lines:

    def double_each_number_in_array(array_of_numbers):
        for position in range(len(array_of_numbers)):
            array_of_numbers[position] *= 2

but most likely you'd write (inline, no function definition at all):

    [2 * number for number in array_of_numbers]

which produces a copy as your model did.  Anybody who knows that [] is
a list constructor in Python can probably guess what that means.
Furthermore, once you've learned that, essentially the same syntax
serves for dictionaries:

    {number : 2 * number for number in array_of_numbers}
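
A minimal sketch of the copy-versus-mutate distinction made above.  The
function names doubled_copy and double_in_place are hypothetical,
chosen only for this illustration:

```python
def doubled_copy(numbers):
    # Builds and returns a new list; the caller's list is left untouched.
    return [2 * number for number in numbers]


def double_in_place(numbers):
    # Mutates the caller's list; like most mutating functions it returns None.
    for position in range(len(numbers)):
        numbers[position] *= 2


original = [1, 2, 3]
copy = doubled_copy(original)
unchanged = list(original)      # snapshot: still [1, 2, 3] at this point

double_in_place(original)       # now original itself has been doubled
```

After the last line, original and copy hold equal values but remain
two distinct list objects.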

I'll grant that there's a gotcha:

    (2 * number for number in array_of_numbers)

does *not* produce a tuple, but rather an iterator (a lazy list whose
elements are accessed in order using iteration syntax such as "for",
or explicit calls to next()).  Producing a tuple is easy enough; just
apply the tuple function to the iterator:

    tuple(2 * number for number in array_of_numbers)

This is *not* an inconsistency in Python's design, BTW.  The *comma*,
not the parentheses, is the tuple constructor.  However, because the
comma "looks like grit on Tim's screen", and the parenthesis is always
part of the printed representation of a tuple (there is exactly one
exception for the comma!), this *is* a readability issue until you've
*learned* to read that as a "generator expression" whose value is an
iterator, not a "tuple comprehension" (which doesn't exist in Python).
That took a few seconds for me ("lazy" lists are important to me), but
I can't claim that "typical" students would catch on that fast (even
though I'm willing to claim that "list comprehensions" as above *are*
generally intuitive).
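
For readers who want to poke at the gotcha, here is a small sketch.
The variable names are made up for the example, and reading the
"one exception" as the empty tuple is my assumption, not something the
text above spells out:

```python
array_of_numbers = [1, 2, 3]

lazy = (2 * number for number in array_of_numbers)
kind = type(lazy).__name__      # "generator", not "tuple"

first_pass = list(lazy)         # consumes the generator
second_pass = list(lazy)        # already exhausted, so this is empty

as_tuple = tuple(2 * number for number in array_of_numbers)

# The comma, not the parentheses, builds the tuple:
one_tuple = 1,                  # a 1-tuple, printed as (1,)
not_a_tuple = (1)               # just the integer 1
empty_tuple = ()                # assumed to be the exception: no comma at all
```

The second_pass line is the practical consequence of laziness: a
generator can be walked only once.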

I'm sure you can do the same thing in Ruby.  Guessing from your
"mathematical conventions" example, I tried this:

    array_of_numbers.map { |x| x*2 }

which did the trick.  But I know Lisp and use map{,c,car,cdr,list} all
the time in that language.  I doubt you would claim that the required
level of Ruby knowledge to guess what that does is as minimal as
Python's.  More controversially, I would maintain that it will be less
easy to remember for the Ruby novice than the Python equivalent is for
the Python novice.  Granted, .map is more powerful.  But how often do
novice programmers use that power?
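
For comparison, Python has a builtin map as well.  A rough sketch of
the same doubling written three ways; the helper name double is
hypothetical:

```python
array_of_numbers = [1, 2, 3]


def double(x):
    return x * 2


# map() returns a lazy iterator in Python 3, so wrap it in list().
via_map = list(map(double, array_of_numbers))
via_lambda = list(map(lambda x: x * 2, array_of_numbers))
via_comprehension = [x * 2 for x in array_of_numbers]
```

All three produce the same list; which one reads best is exactly the
kind of audience-dependent question under discussion.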

Note that in Python, it's (almost) obvious how to generalize a list
comprehension to a dictionary comprehension.  (I first found out it
worked just by trying it, in fact.)  And I think once you understand
list comprehension, *reading* a dictionary comprehension would be
trivial.

There are a number of other places where I prefer Python idiom to
Ruby.  For example, where Ruby objects implement "standard methods"
such as .size, Python has standard builtin functions, where len()
corresponds to Ruby's .size.  That may be purely a personal preference
(shared with a genius named Guido, of course, so there *may* be
something to it<wink/>).
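
A sketch of how the builtin-function style works in practice: len()
delegates to a __len__ method, so a user-defined class joins in by
implementing it.  The Playlist class is invented purely for this
example:

```python
class Playlist:
    """Hypothetical container, used only to illustrate the protocol."""

    def __init__(self, tracks):
        self._tracks = list(tracks)

    def __len__(self):
        # len(playlist) ends up calling this method.
        return len(self._tracks)


things = [[1, 2, 3], "abc", {"a": 1}, Playlist(["x", "y"])]
sizes = [len(thing) for thing in things]
```

The reader sees the same spelling, len(thing), for lists, strings,
dictionaries, and the custom class alike.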

So I would argue that yes, one great language can be more readable
than another for a very broad audience.

I don't know Ruby (or Perl<wink/>) well enough to come up with
examples where they would be "more readable" than Python, but in the
case of Ruby I would guess it would be easy for an experienced,
tasteful developer to find examples using blocks.

 > But only if you're not touching it a lot. If you're working with
 > "regular" Ruby programmers, you'll certainly find this much better:
 > 
 >     def double_array(numbers)
 > 	numbers.map do |n|
 > 	    n * 2
 > 	end
 >     end

A bit OT: The bars always give me a double-take.  Is there any
precedent for that notation?  (I see Rust uses them too, but I suppose
Ruby was earlier.)

 > Personally, I find even that very tedious, with the (admittedly rather
 > mathematical) conventions within which I work, and would write:
 > 
 >     def double(xs); xs.map { |x| x*2 }; end

Ugh.  I think it would take me quite some time to get used to the fact
that you can elide one but not both of the "end" statements.  BTW,
which one is left?  I'm guessing it terminates the def, because of the
keyword and the semicolon before the function body.  The semicolon
also gives me a double-take, coming from way too much C, and the fact
that it serves no semantic purpose AFAICS (the def statement should be
over at the close parenthesis).  This is also true of the colon in
Python def and class statements, but the colon is more readable to a
native English speaker since it's used to indicate "I've got more to
say about that!"  (N.B. This may genuinely depend on "native English".
It doesn't seem to be true of Japanese and Chinese college and grad
students, and at least some Indian GSoC interns, who frequently use
semicolons where colons would be appropriate.)

 > I read this third example easily two or three times as fast as the
 > second example,

I'm sure you do; I do, too, even though I can barely keep up with that
much Ruby syntax.<wink/>  But to me it's a poor example since I'm
sure it can be inlined (both for immediate execution and as a lambda),
and most likely it would be.  I think a longer or more complex example
(eg, Josh's examples using partial as well as map, but I can't
actually parse them!) would be more persuasive.

 > 1. If you have just a generic number in a small context, call it `x`.
 > 2. A list of things is the name of the thing with an `s` on the end.
 > 3. Put short stuff on a single line.
 > 4. Don't use extra characters and extra lines for do/end when you
 >    can use the shorter braces {} that also let you do paren matching
 > 
 > Points 1 and 2 are conventions you learn, get used to, and they become
 > nice concise ways of saying something that everybody understands.

Borrowed from Haskell conventions?  I use those too, but I don't
believe that everybody will understand your example without shared
convention.  For example, try typing this into your friendly
neighborhood Python interpreter:

    >>> def double(xs): return [ 2*x for x in xs ]
    ... # just type Enter here
    >>> double(["a", "b", "c"])

Had too much beer, guys?<wink/>
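
For anyone without an interpreter handy, a sketch of what that session
does.  The call does not raise an error, because "*" applied to a
string (or a list) means repetition rather than arithmetic:

```python
def double(xs):
    return [2 * x for x in xs]


numbers = double([1, 2, 3])         # arithmetic doubling
strings = double(["a", "b", "c"])   # each string is repeated twice
nested = double([[0], [1]])         # each inner list is repeated twice
```

So the name "xs" tells you "a list of things", but only shared
convention tells you the things are meant to be numbers.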

For point 3, in the case of higher-order programming (including use of
map and partial), I agree, defining functions with mnemonic names that
short is often more readable than inlining, and to the extent they're
really just a single (higher-order) expression a single line is more
readable.  On the other hand, for me personally I typically find that
if the name is mnemonic enough, the definition can be far away from
the point of use, saving one screen line (more likely two: I'd
probably leave an empty line after even a one-line function
definition) compared to your practice.

Especially with modern IDEs, which will typically post the docstring
(or even the code) for an identifier in a tooltip.

 > Point 3 lets you see more code on the screen, and live with mere
 > 80 and 90 line terminals.

I don't program for a living, but even writing mathematical proofs
(which in TeX typically take several lines for something I'd write in
half a line on foolscap) I find mere 80-line terminals a luxury.
Luxurious, of course! but hardly something I can't live without.  I
think 45 lines is about the point where I start to think about buying
a larger screen with more pixels.

 > Point 4 is usually where I get the strongest pushback amongst Ruby
 > programmers (especially Rails ones), and I have no idea why.

I wonder if it doesn't have something (indirect) to do with my feeling
about the bars on the block argument.  People have strong feelings
about which parts of syntax should be punctuation and which should be
keywords, and which parts should be whitespace.<wink/>

 > Another example of something you never want to retreat from once
 > you get used to it is dropping syntactic "if" for more concise
 > boolean expressions.

Speak for yourself.  In Python it would never occur to me to write:

    author == "Guido" or print("You are so full of bad advice!")

instead of

    if author != "Guido":
        print("You are so full of bad advice!")

although I do use

    test "$os" = "darwin" && echo "Mac OS X ($version)"

and similar in preference to "if" in shell and autoconf scripts.
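
A sketch showing that the two Python spellings behave identically.
The function names are hypothetical, and a list stands in for print()
so the effect can be inspected:

```python
MESSAGE = "You are so full of bad advice!"


def complain_short_circuit(author, log):
    # "or" evaluates its right operand only when the left one is false.
    author == "Guido" or log.append(MESSAGE)


def complain_with_if(author, log):
    if author != "Guido":
        log.append(MESSAGE)


log_a, log_b = [], []
for name in ("Guido", "Curt"):
    complain_short_circuit(name, log_a)
    complain_with_if(name, log_b)
```

Both logs end up with a single message (for "Curt"), so the choice
between them is purely one of readability.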

 > First, write for your audience. Programmers spend most of their time
 > reading code, and so you need to optimize your style for the people
 > who read it most. This style will change over time as you work
 > together.

Sure.  The Python[2] developers recognize this even though they pride
themselves on creating a readable language.  That's why (a) PEP 8
exists and (b) PEP 8 is explicitly advocated *only* for core Python
code (the standard library).  Of course you're welcome to adopt it for
your own project, and it's very common practice to do so -- but it's
almost equally common for project founders who adopt PEP 8 to insist
on additions, subtractions, and variations.  Furthermore, (c) the
encouragement of PEP 8 conformance is tempered by a recognition that
readability of old code and especially changes to code are enhanced by
*not* changing it just to conform to some style guide.  (This is a
*separate* concern from backward compatibility, such as renaming
exported variables and classes to conform with a style guide, which
constitutes an API break, of course.)

But it's also possible to design a language for better readability.  I
can't speak to many of the languages that have been mentioned, such as
Rust and even Perl, but I would advocate the position that there's a
large class of users, including the novices of both languages, for
whom Python is "more readable" than Ruby, and that this is due to
attention to design for readability.  OK, so I went and skimmed large
portions of the Rust book <https://doc.rust-lang.org/book/>.  I'm not
impressed with its overall readability (too much type crap, too many
primitive types, too low-level), but compared to C++ (which is what it
seems to want to kill, and more power to it!) it's way readable.[3]

Footnotes: 
[1]  This isn't useful, but you could use a similar construct to
memoize a fibonacci function implemented recursively.

[2]  I mention Python here because I'm familiar with it and because it
is an example that fits your principles.

[3]  I am totally ignoring the primary benefit of using Rust, of
course, in favor of focusing on readability of the language.  That's
not fair to the language, which may be as readable as it can be given
the applications it targets.