Perl: Readability, Expressiveness, and Concision

Writing readable code means expressing yourself as clearly and correctly as you can, not targeting the lowest common denominator of reader.

There are many people out there who are preaching the gospel of readable code. I myself am one of them. I am a strong proponent of expressive variable and subroutine names, for example. Unless you are a fire-and-forget contract programmer, you are going to be reading your code a lot more often than you write it. But honestly, if code readability is your primary concern, use Python, not Perl.

Reading good Python code is like reading a novel by Dan Brown or Michael Crichton. The language is clean and simple, allowing you to move forward easily, and quickly grasp the point of the story. The vocabulary is easily attainable and won't stretch your brain too far. I like reading Python code. It's an easy read — a beach read.

Reading good Perl code is like reading Shakespeare. Every line is deeply expressive, both in its content and its rhythm. You probably need a dictionary to look up some words that, while not perhaps in everyday usage, express exactly what the author intended. It's still English, but you have to work a little harder at reading it, because it's beautiful, powerful English.

When I write code, I am concerned about readability. But I am also concerned about expressiveness and concision. I have something I am trying to say. I don't always want two paragraphs of easy prose that approximate my point. Most of the time I want le mot juste, just the right word, a precise expression of exactly what I mean that requires no further elaboration.

Now, I don't intend to imply that my Perl is of a class with Shakespeare's verse (nor Flaubert's prose). What I am saying is that some stories cannot be told in terms of Dick and Jane and Spot. Complex problems often require complex language to describe their solutions. I love writing such solutions in Perl, because Perl, like English, offers me a broad and rich vocabulary of idiom I can use to construct complex works easily.

Those who speak pidgin-Perl can usually puzzle out what the code is doing, but they are going to get hung up at some points, and they are going to miss a lot of detail. That is not a flaw in the language, nor an error on the part of the writer. It is simply the nature of the form. But those who understand the depth of the language can truly appreciate the subtlety of a nice, tight hack.

The key to writing code that is both readable and expressive is to build a bridge with your comments that helps the unfamiliar reader find his way through the complex and obscure portions. If you find yourself using a rare or arcane idiom because it's the right thing to do, don't dumb it down into fifty lines of loops for novice programmers. Instead, write a full paragraph of explanation above your one-line hack. Include references to man pages, books, and web sites. Scholarly works always have footnotes. The reader will not only gain a better understanding of your code, but perhaps learn a new technique that improves his own.
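
As a small illustration of that advice, here is a sketch of a one-line idiom with its "footnotes" attached. The word list is made up for the example; the idiom itself is the duplicate-removal technique described in perlfaq4.

```perl
use strict;
use warnings;

my @words = qw(pear apple pear fig apple);

# Remove duplicates from @words while keeping first-seen order.
#
# How it works: grep evaluates the block once per element, with the
# element in $_. $seen{$_}++ returns the count *before* incrementing,
# so it is 0 (false) the first time a word appears and true after that.
# Negating it keeps only the first occurrence of each word.
#
# See: perldoc -f grep, and "How can I remove duplicate elements from
# a list or array?" in perldoc perlfaq4.
my %seen;
my @unique = grep { !$seen{$_}++ } @words;

print "@unique\n";    # pear apple fig
```

The comment is several times longer than the code, and that is the point: the one-liner stays, and the novice gets a map.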

Writing readable code does not mean writing for the lowest common denominator of reader. It means expressing your point as clearly and correctly as you can. Try to make your code readable, but don't sacrifice concision for readability. You should expect more from your reader, rather than expecting less from your code.

P.S. For another view on the subject, you can read about the Unobfuscated Perl Code Contest.


The DBI Imp strikes!

Somewhere in the process of doing something or other in the last few weeks, I managed to break my installation of MySQL on my MacBook Pro. I can't remember now how I managed to break it, but I had shoved a note onto my "to do" list to fix it.

After downloading and installing the latest stable version of MySQL (5.0.51b), I discovered that Perl could no longer access the database. I figured I just needed to recompile DBD::mysql against the new MySQL libraries. But when I tried, I got complaints similar to this one:

Can't use dbi_imp_data of wrong size (127 not 124) at /System/Library/Perl/Extras/5.8.8/darwin-thread-multi-2level/DBI.pm line 1190.

It seems I did two things wrong. First, I downloaded the "Mac OS X 10.5 (x86_64)" version of the MySQL server. Apparently, although Leopard (Mac OS X 10.5) is supposed to be 64-bit, the included Perl 5.8.8 is compiled as 32-bit. So for compatibility, I scrapped MySQL and re-downloaded the 32-bit version. Of course, I may be mistaken about this, because...

I still got the errors! Some more searching suggested that some bug fixes in DBI for FreeBSD might be related. I was using the pre-installed DBI 1.52, and that was error number two. So I upgraded to the latest DBI (1.605), ran make realclean on DBD::mysql, and tried again.

I got a warning when running Makefile.PL:

Multiple copies of Driver.xst found in:
  /Library/Perl/5.8.8/darwin-thread-multi-2level/auto/DBI/
  /System/Library/Perl/Extras/5.8.8/darwin-thread-multi-2level/auto/DBI/
  /Library/Perl/5.8.6/darwin-thread-multi-2level/auto/DBI/
at Makefile.PL line 759

It seems to be detecting both the original Apple version of this file, and the new one I just installed from CPAN. However, I was still able to build the software, so apparently this is a pickled herring or whatever you call those things that look scary but don't actually matter.

Whew! Now we're back up and running. I'm tempted to try the 64-bit version again to see if it will work now that I have updated DBI, but that's too much like work, and since I only need it for testing, it isn't very important.

The moral of the story? Erm ... Don't break stuff? I'm not sure, but anyway, if my experience helps you get your MySQL working faster, I'll be happy.
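
If you hit something similar, a quick diagnostic can save some guessing. The script below is only a sketch, not what I ran at the time: it reports whether your perl is a 32-bit or 64-bit build, which DBI version actually gets loaded, and every DBI.pm found on the module search path.

```perl
use strict;
use warnings;
use Config;

# Architecture and pointer size of this perl (4 bytes = 32-bit, 8 = 64-bit).
print "perl:     $]\n";
print "archname: $Config{archname}\n";
print "ptrsize:  $Config{ptrsize}\n";

# Report the version and location of the DBI that perl actually loads.
if (eval { require DBI; 1 }) {
    print "DBI:      $DBI::VERSION ($INC{'DBI.pm'})\n";
}
else {
    print "DBI:      not installed\n";
}

# List every DBI.pm on the search path. More than one hit means an
# older copy is installed alongside the one perl picks up first.
my @copies = grep { !ref($_) && -e "$_/DBI.pm" } @INC;
print "found:    $_/DBI.pm\n" for @copies;
```

More than one "found" line is the same situation the Makefile.PL warning describes: an Apple-supplied copy and a CPAN copy living side by side.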

ORM - PITA?

Recently I read a note called DBIx::Class Gotchas over at Perl Alchemy. I've been struggling myself for several months with a bundle of old Class::DBI code, and I can relate. Both packages fall into the category of Object-Relational Mapping (ORM), software that tries to map programming objects onto entities stored in a relational database.

For a long time I was a huge proponent of ORM software, and I still am for certain types of projects. But the honeymoon is over for me, as I've started to hit some of those limitations they always warn you about. ORM buys you a big gain in productivity, because the software reduces database access patterns from a dozen lines of code to just one or two. If you need to get a small project up and running fast, ORM is a tool that can help a great deal.

The trouble with ORM is that your objects don't always map directly to the queries you want to make. To simplify the thorny problem of SQL generation, often the underlying implementation is doing things in Perl code that would be more efficiently done in the database, or executing multiple SQL queries when a single join would have done the trick. When your project gets very large, very complex, or very dependent on database performance, the ORM layer suddenly seems to hinder as much as it helps. I now spend as much of my time coding around the ORM layer as using it.
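
To make the "multiple queries versus one join" point concrete, here is a schematic sketch. The authors and books tables are hypothetical, and the SQL strings are written by hand for illustration; they are not the output of Class::DBI, DBIx::Class, or any other ORM.

```perl
use strict;
use warnings;

# Hypothetical data: three authors, each of whom we want books for.
my @author_ids = (1, 2, 3);

# Pattern 1: one query for the parent rows, then one more per row.
my @lazy = ('SELECT id, name FROM authors');
push @lazy, "SELECT title FROM books WHERE author_id = $_" for @author_ids;

# Pattern 2: a single join returns the same information in one round trip.
my @joined = (
    'SELECT a.name, b.title FROM authors a JOIN books b ON b.author_id = a.id',
);

printf "per-row lookups: %d statements\n", scalar @lazy;      # 4
printf "single join:     %d statement\n",  scalar @joined;    # 1
```

With three authors the difference is trivial. With thirty thousand, the first pattern is where the time goes.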

Still, without ORM I would spend a lot more time writing "select * from table" queries and the like, and I almost always have the need to optimize for development time rather than execution time. So I have this love/hate relationship.

What's your experience with ORM software? Love it? Hate it? Tolerate it?

Perl 5: Hash slices can replace loops

How many times have you written a for loop to do something simple with a hash and thought, there must be a better way to do this? Using hash slices instead of simple loops can save you lines of code and execution time.

A hash slice is a syntax for accessing the values of multiple keys of a hash in a single statement. It is a succinct and efficient technique, but it is also one of those collections of punctuation that give Perl a reputation as a write-only language. Once you have learned it, however, you will feel much more clever! Here are a few examples of how I use hash slices to make my code shorter and faster. (Note that you can also slice arrays, but today we are just talking about hashes.)

Basic hash slice syntax

You perform a hash slice by putting a list of keys inside the braces, rather than a single scalar key, and prefixing the hash name with the @ sigil rather than the $ sigil you would use to get a single value.

my %number_for = (one => 1, two => 2, three => 3);
# Regular access to scalar key
print $number_for{one}; # 1
# Hash slice accesses multiple keys. Note the '@'
print @number_for{qw(one two three)}; # 123
# This also works
print @number_for{'one','two','three'}; # 123

A cautionary note: notice how the scalar index uses a bare word as the key. Perl gives you the quoting for free in this case. A slice takes a list of keys, and Perl does not quote bare words in a list, so you have to do the quoting yourself (qw is handy for this).
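
A few more properties of slices are worth seeing in action. The sketch below reuses the %number_for hash from above; the other variable names are only for illustration. Note that under use strict, an unquoted list such as @number_for{one, two} is rejected at compile time, which is why every key here is quoted.

```perl
use strict;
use warnings;

my %number_for = (one => 1, two => 2, three => 3);

# A slice produces a list, so it can feed join, map, and friends.
my $joined = join ',', @number_for{qw(one two three)};
print "$joined\n";    # 1,2,3

# Keys can come from any list expression, such as an array.
my @wanted = ('three', 'one');
my @picked = @number_for{@wanted};
print "@picked\n";    # 3 1

# A key that is not in the hash yields undef in that position,
# and reading the slice does not create the key.
my @maybe = @number_for{qw(one nine)};
print defined $maybe[1] ? "defined\n" : "undef\n";                # undef
print exists $number_for{nine} ? "created\n" : "not created\n";   # not created
```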

Merging two hashes

Since hash slices can be lvalues, they can be used to merge one hash into another. A common example is when you get configuration information from more than one source, but you want to consolidate it to look up in just one place.

my %your_numbers = (two => 2, four => 4, six => 6);
# I get all your numbers! 
# (And your number will override mine if they differ)
@number_for{keys %your_numbers} = values %your_numbers;
print sort values %number_for; # 12346
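
Here is the configuration case spelled out as a sketch. The setting names and values are invented for the example. The trick relies on a documented guarantee: as long as a hash is not modified in between, keys and values return their items in matching order.

```perl
use strict;
use warnings;

# Hypothetical settings: built-in defaults and values read from a file.
my %config    = (host => 'localhost', port => 3306, user => 'guest');
my %from_file = (port => 3307, user => 'app');

# Later sources win. keys and values of the same unmodified hash
# come back in matching order, so each value lands on its own key.
@config{keys %from_file} = values %from_file;

print join(', ', map { "$_=$config{$_}" } sort keys %config), "\n";
# host=localhost, port=3307, user=app
```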

Accessing keys in a particular order

Here is a common thing you run into in web development. You have received input from a web form and validated it. (You have validated it, right?) The data lives in a hash, and you want to store it in a database. You have your SQL statement all prepared, but it requires that the values be bound in exact column order. Unfortunately, the values function cannot be relied upon to return the values in the order you want. (And besides, you don't want to store the value of the submit button!)

# get valid data from your validation code
my %validated = %number_for;
# Columns of your table, in order needed by your SQL
my @columns = qw(six one three);
# Get the bind values with a slice
my @bind = @validated{@columns}; # 6,1,3
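
The same column list can also build the SQL itself, so the placeholders and the bind values can never get out of step. This is a sketch only: the table name people and the form fields are made up, and no database is touched.

```perl
use strict;
use warnings;

# Validated form input (hypothetical fields); 'submit' must not be stored.
my %validated = (name => 'Ann', email => 'ann@example.com', submit => 'Save');

# Columns in the order the statement expects.
my @columns = qw(email name);

# One '?' placeholder per column.
my $sql = sprintf 'INSERT INTO people (%s) VALUES (%s)',
    join(', ', @columns),
    join(', ', ('?') x @columns);

# The slice returns values in @columns order and skips 'submit'.
my @bind = @validated{@columns};

print "$sql\n";     # INSERT INTO people (email, name) VALUES (?, ?)
print "@bind\n";    # ann@example.com Ann
```

With DBI, you would typically prepare $sql and pass @bind to the statement handle's execute method.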

Accessing values sorted by keys

Say you want to sort a hash by its keys, and then use the values in that sorted order. Using the above data, perhaps we want to print the numbers in alphabetical order of their names.

print @number_for{sort keys %number_for}; # 41632

Slicing a hash reference

Eventually you will find yourself with a reference to a hash, and you will discover that the above syntax does not work. You may try three or four different combinations of curlies and arrows that just generate errors. Don't give up! You can slice a hashref! First, let's review using a hashref to get at scalar values.

my $num_for = \%number_for;
# Common syntax for dereferencing and getting a scalar index
print $num_for->{one}; # 1
# Alternate syntax, the lazy way:
print $$num_for{two}; # 2
# Alternate syntax, the explicit way
print ${$num_for}{six}; # 6

The key to slicing a reference to a hash is to use the alternate syntax shown above, replacing the initial $ sigil with @.

# The lazy way:
print @$num_for{@columns}; # 613
# The explicit way:
print @{$num_for}{@columns}; # 613

Note the distinct absence of the "arrow" syntax. The arrow implies a scalar, and we want a list.
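
Two follow-on cases tend to come up as soon as you start slicing references: assigning through the reference, and slicing a reference that comes out of a larger data structure. The sketch below assumes the same %number_for data; the %data hash is hypothetical.

```perl
use strict;
use warnings;

my %number_for = (one => 1, two => 2, three => 3);
my $num_for    = \%number_for;

# Assigning through a sliced reference changes the original hash.
@{$num_for}{qw(four five)} = (4, 5);
print join(',', map { $number_for{$_} } qw(four five)), "\n";    # 4,5

# When the reference comes from an expression (here, a nested
# structure), the block form @{ ... } is required.
my %data = (numbers => $num_for);
my @some = @{ $data{numbers} }{qw(one three five)};
print "@some\n";    # 1 3 5
```

The lazy form only works when the reference is a plain scalar variable. For anything more complicated, wrap the expression in curlies.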

Powerful syntax

The hash slice is an advanced syntax demonstrating Perl's concision and expressiveness. Now you should be able to recognize it when you see it, and hopefully apply it to your own projects to save time and space. (But remember, use your Perl superpowers only for Good, never for Evil!)


Best Error Message Ever

Oh, the irony is so heavy, it's breaking me!

Best Error Ever.png

This blog is licensed under a Creative Commons License.