Code This, Not That: Python string interpolation/concatentation

That title is a bit of a mouthful, ain’t it?

I often see VB code in FDM jobs that is creating SQL statements – concatenating things together, adding in parameters from variables, lots of quotes, single quotes, plus signs, underscores, and more all over the place. It’s hideous.

For example, check some of this garbage out:

strSQL = "INSERT INTO " & TEMP & " (ENTITY,ACCOUNT,COSTCENTER,AMOUNT) " & _
" SELECT Entity.label, ACCOUNT.label, Entity.label+'_CC'", & Period & _
" from " & DATABASE & " as FACT"

For the record, I’m not saying if you see this or write this that you’re a bad developer. What I will say, however, is that you and/or your predecessors were hobbled by a clunky programming language and now you can atone for your syntactic sins with Python (Jython).

Python offers some nice facilities for handling strings. If you are looking for a straight translation of the normal VBScript string concatenation, you could do this:

account = "170100"
select = "SELECT Amount FROM Transactions WHERE Account = " + account

Python uses the plus operator (+) instead of the ampersand (&) to concatenate strings. This is straightforward and works well for simple situations. Of course, things get a little harder to read when we need to surround our value with single quotes (since it’s a SQL query after all):

select = "SELECT Amount FROM Transactions WHERE Account = '" + account + "'"

This isn’t terribly hard to read but we’re on a slippery slope. One thing we can try is simple “printf” style string interpolation:

select = "SELECT Amount FROM Transactions WHERE Account = '%s'" % account

In this case we have a %s placeholder meaning that we will place a string in it. Then we are using the modulo (%) operator to place the value of account into the string. This probably looks funny if you’re not familiar. What’s going on with this syntax?

Remember how you used the ampersand in VB to concatenate strings but in Python you use a normal plus? Well, the fact that we can use the plus in Python is due to operator overloading. This is programming term that essentially means we can actually program in our own behavior for math operations. You’re really used to this with integers (think 2 + 3 = 5), but maybe not so used to it for strings. It’s a clever trick and a nice facility that let’s us work with strings (or indeed, other types of objects) without having to resort to a lot of boilerplate code like newStr = StringUtilities.concatenate(query1, query2) or something similarly awkward.

Along those lines, whereas the modulo operator in math means “give me the remainder when dividing these two numbers” (such as 5 % 3 = 2), in Python this operator has been overloaded to provide convenient operations with strings (after all, how on earth would you ever perform modulo arithmetic on strings?). So when we write string1 % string2 (which is what we had above with account), Python calls the appropriate handler for the string object, which in this case is a fancy string replacement functionality.

This string replacement facility, as it turns out, is pretty handy – and this leads me to the final example of this post.

Python supports objects known as dictionaries. For those from the Java world, this is essentially the same as a Map (or perhaps more like a Map). VBScript has these objects too but they aren’t part of the language itself – they have a clunky syntax. In Python we can make a dictionary simply like this:

params = {
    "ACCOUNT" : "0170200",
    "PERIOD"  : 2
}

Next, Python also lets us create multi-line strings (and we don’t even need the line continuation operator from VB where the lines end with & _). We tell Python we have a multi-line string simply by using three quotes:

select = """
    SELECT 
        Amount
    FROM
        Transactions
    WHERE
        Account = '%(ACCOUNT)s'
        AND Period = %(PERIOD)d
""" 

Note that this string has been parameterized with ACCOUNT and PERIOD placeholders. The syntax is %(KEYNAME)s for strings and %(KEYNAME)d for integers (there are other codes as well but these are the most common).

Now we have a multi-line string with placeholders that we can replace with values from a dictionary. We could print this out easily:

print select % params

And we get this code:

SELECT 
    Amount
FROM
    Transactions
WHERE
    Account = '0170200'
    AND Period = 2   

This works exceptionally well for crafting SQL queries (also, yes, I know about placeholders with ? that are specific to SQL query parameters, let’s save that for another post). The wins here are that you get to format the query how you want, you don’t have quotes and ampersands and plus symbols all over the place, it’s easy to read, easy to maintain, and even faster to write.

Remember, code for readability. Also keep in mind that if you are converting VB to Python/Jython, you are not supposed to slavishly translate line by line – you are translating the solution as whole.

One thought on “Code This, Not That: Python string interpolation/concatentation

  1. The other good thing to mention about the replacement functionality in Jython/Python is that all the casting is done implicitly for you

Leave a Reply to SH Cancel reply

Your email address will not be published.