Quadruple Backslashes

No, Quadruple Backslashes is not a college rock band (that I know of). But maybe it should be one. Anyway, ever see quadruple backslashes in a path in a batch file? Like this:

SET FILE=\\\\some-server\\path-to-app-folder\\app\\data.txt

What’s going on here? This is a UNC path, whose normal form looks like this:

SET FILE=\\some-server\path-to-app-folder\app\data.txt

Let’s say we are doing a data import in a MaxL script:

import database Sample.Basic data
    from data_file "$FILE"
    using server rules_file "LdAll"
    on error write to "errors.txt";

In many scripting languages, backslashes are special. They are used to “escape” things inside of strings. What if you need a newline inside of a string? That’s \n. What if you want a tab? That’s \t. What if you want to print a backslash itself? You can’t just put in a single backslash, because then the interpreter thinks the next character is something special. So to write out just a backslash, you escape the backslash itself (so it’s \\).

To further complicate things, many scripting languages have a notion of interpolated and non-interpolated strings. Strings contained inside double quotes (“This is a double-quoted string!”) have their contents parsed/scanned for special characters; strings in single quotes (‘This is a single-quoted string!’) don’t. So in a single-quoted string we can sometimes (not always) get away with doing whatever we want with backslashes. But if we use a single-quoted string, we can’t stick variables in it.

So now we need a double-quoted string so we can put variables in it, but that makes putting backslashes in our UNC path a little complicated. No worries: just escape the backslashes. Returning to our quadruple-backslashed MaxL example surrounded by double quotes, what’s happening is that MaxL parses the string contents (the stuff between the double quotes), and the four backslashes become two backslashes in the final string. The double backslashes in the other parts of the UNC path each get translated into a single backslash.

The resulting string starts with the two backslashes that denote a UNC path, and every other path delimiter is a single backslash.
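The same escaping rules apply in Java string literals, so here is a quick, self-contained illustration of exactly what an interpreter does with those backslashes:

public class UncPathDemo {
    public static void main(String[] args) {
        // Four backslashes in the source collapse to two in the actual string,
        // and each doubled backslash collapses to one.
        String file = "\\\\some-server\\path-to-app-folder\\app\\data.txt";
        System.out.println(file); // \\some-server\path-to-app-folder\app\data.txt
    }
}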

Normal Paths

If you are just loading a file from a normal, non-networked path (or a mapped drive), and you’re on Windows, then I highly recommend just using forward slashes to delimit the path. For example, consider an input file at D:\Hyperion\data.txt. Your double-quoted string would be D:\\Hyperion\\data.txt, which gets translated down to single backslashes. Better yet, for the last decade or more, Windows has had awesome support for the more traditional forward-slash Unix path separator, meaning you can just use D:/Hyperion/data.txt. The forward slash isn’t special at all, so you don’t need to escape it. You can’t use double forward slashes to denote a UNC path, though, so //server-name/path/to/file.txt will not work, thus requiring some backslash kung fu to get working.
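For instance, reusing the import from earlier, a forward-slash path needs no escaping at all:

import database Sample.Basic data
    from data_file "D:/Hyperion/data.txt"
    using server rules_file "LdAll"
    on error write to "errors.txt";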

Vess: a Virtual Essbase JDBC driver

Normally I finish some programming and then throw it over the wall to you guinea pigs (er, enthusiastic blog readers) before I say anything about it (no vaporware here, nosiree!), but I thought I’d share something I’ve been playing with as a proof of concept. It’s sort of inspired by some of the ODI/Essbase integration I’ve been working with lately.

The Essbase/Hyperion knowledge modules in ODI are sort of different from typical knowledge modules because in a lot of ways they cram a square peg into a round hole. So you kind of get this relational facade over an Essbase cube in terms of loading and integrating data/metadata. Instead of relying on a normal JDBC driver, the Hyperion KMs more or less delegate out to some custom Java code that wraps the Essbase Java API. This isn’t a bad thing; it’s just how it works.

Unrelated to ODI, there are many situations where we are jumping through hoops to get data of some kind – not just cube data and metadata – out of Essbase. For example, you run a MaxL script to dump stats from a cube to a text file. Then you run some regular expression voodoo on it to get to the data you actually want.

Side note: parsing MaxL text output must surely amuse the boffins at Oracle who designed it. I don’t think many people know this, but MaxL communicates with the Essbase server using a hacked-up JDBC driver (it’s ingenious, really). So when we parse the pluses, minuses, and other ASCII crap off of some MaxL output, what’s effectively happening is this: a real dataset comes back from the Essbase server, the MaxL interpreter does the extra work of prettying it up with some text art (the only thing missing is lime green letters from the ’80s), the result gets dumped to a text file, and then, in so many places, that text is reparsed back into data.

To some extent, the MaxL Perl and Essbasepy modules can be used to get MaxL data in a cleaner way, and the Java API can definitely be used. Of course, I have no problem jumping in and coding up an Essbase Java API routine to pull some data down. But for people who aren’t as comfortable with that, stripping info out of MaxL output is the path of least resistance, so I can’t say I blame them.

So we have all of these ways to get data out of Essbase (not just actual cube data, but also metrics from the server): EAS, MaxL, MaxL output dumped to a text file, the C API, the Java API, and so on. But a lot of these approaches have undesirable combinations of effort, quality, and reward.

Let’s talk about JDBC for a moment. JDBC is the Java Database Connectivity API. Java is one of the most popular programming languages on the planet (perhaps not the sexiest language, but it’s getting better), and JDBC is the tried-and-true model for connecting to databases. Every popular database in the world has a JDBC driver: you can grab a generic SQL tool (I’ve been using the simple RazorSQL as of late), pop in the appropriate JDBC driver (for MySQL, Microsoft SQL Server, Oracle, DB2, whatever), connect, and start issuing SELECT statements to your heart’s content.

So this all brings me to something interesting I’ve been playing with for a while: Vess, a virtual Essbase JDBC driver. Vess is a JDBC-compliant driver that can be dropped into any generic SQL tool that supports JDBC drivers (such as RazorSQL), as well as used out of the box with ODI. Vess offers a JDBC URL for connecting to Essbase, such as the following:

jdbc:essbase://epm11123.corp.saxifrages.com:9000/Sample.Basic;mode=embedded

Vess then presents several tables, views, and stored procedures. For example, consider substitution variables. These are modeled in a table with four String columns (the VARCHAR database type): application, database, variable name, and value. So you can do this:

SELECT * FROM SYS.SUBSTITUTION_VARIABLES

And you get back a dataset kind of like this:

Sample, Basic, CurrMonth, Jan

(It’s not actually text with commas; it’s a normal dataset.) There are similar tables with other system/cube information in them, such as database stats, connected users (the Sessions tab in EAS), and more. There are views for dimensions/members/aliases/UDAs, and there are stored procedures for calling calc scripts (e.g. sp_exec_calc('CalcAll')). MDX queries can also be crammed into a normal JDBC ResultSet with a little bit of imagination.
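As a rough sketch of what consuming Vess from plain Java might look like (assuming the driver jar is on the classpath and registers itself with DriverManager the way JDBC 4 drivers do, and glossing over credentials):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class VessDemo {
    public static void main(String[] args) throws Exception {
        // Vess URL from above; mode=embedded is a Vess-specific option
        String url = "jdbc:essbase://epm11123.corp.saxifrages.com:9000/Sample.Basic;mode=embedded";
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT * FROM SYS.SUBSTITUTION_VARIABLES")) {
            while (rs.next()) {
                // columns: application, database, variable name, value
                System.out.printf("%s, %s, %s, %s%n",
                        rs.getString(1), rs.getString(2),
                        rs.getString(3), rs.getString(4));
            }
        }
    }
}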

So, why bother?

Using a JDBC driver would let us easily (and cleanly) get to Essbase data and metadata from any generic database tool, as well as drop the driver right into ODI and use it as part of a normal interface. Whereas ODI uses Jython to call custom Java classes, this could potentially simplify the layers a little bit and offer an interesting alternative. It could also replace the places where MaxL output is being parsed via black magic, obviate the need to write custom Essbase JAPI programs, and cover some of the use cases of the outline extractor.

Status

As of right now, the framework of the driver is already built: you can drop it into a SQL tool such as the aforementioned RazorSQL, run a command like SELECT name FROM SYS.SUBSTITUTION_VARIABLES WHERE APP = 'Sample', and get the proper results back. Over time (as time permits) I will be adding a few more gizmos. If there are any intrepid Java developers out there who would like to play with this, let me know and I can make it an open source project on GitHub that could be refined over time.

Essbase Outline Export Parser released

I had a use case today where I needed to parse an XML file created by the relatively new MaxL command “export outline”. This command generates an XML file for a given cube, covering either all dimensions or just the dimensions you specify. I just needed to scrape the file for the hierarchy of a given dimension, and that’s exactly what this tool does: pass in an XML file that was generated by export outline, then pass in the name of a dimension, and the output to the console will be a space-indented list of the members in that dimension. There is more information on usage at the Essbase Outline Export Parser GitHub page, including sample input, sample output, and command-line usage.
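For reference, generating the input file looks something like this (the file name is just an illustration; check the MaxL reference for the exact export outline grammar in your version):

export outline Sample.Basic all dimensions to xml_file "Sample_Basic.xml";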

Also note that the venerable Harry Gates has created something similar that includes a GUI in addition to working on the command line. While both are written in Java, we’re using different methods to parse the XML. Since I’m more familiar/comfortable with JAXB for reading XML, I went with that; in my experience it gives a nice, clean, extensible way to model the XML file and read it without too much trouble. The code for this project could easily be extended to provide other output formats.

Performance nuances with MaxL data imports with local and server

Some time ago, I reviewed and revamped the MaxL automation for a client. One of the major gains was actually pretty simple to implement but resulted in a huge performance improvement.

Did you know that the MaxL import data command can be told whether the file to load is a local data file or a server data file? Check out the MaxL reference here for a quick refresher. See that bold “local” after from? That’s the default, meaning that if we omit the keyword altogether, the MaxL interpreter just assumes it’s a local file.

Imagine that you have an Essbase server, and then a separate machine with the MaxL interpreter. This could be your local workstation or a dedicated automation server. Let’s say that there is a text file on your workstation at C:/Essbase/data.txt. You would craft a MaxL import command to import the local data file named C:/Essbase/data.txt. That’s because the file is local to the MaxL interpreter.

Now imagine that the file we want to load is actually on the server itself and we have a drive mapped (such as the Y: drive) from our workstation to the server. We can still import the data file as a local file, but this time it’s Y:/data.txt (Assume that the drive is mapped directly to the folder containing the file).

In this scenario, MaxL reads the file over the network from the server to the client, then uploads that data right back to the server. This data flow is represented in the left figure of this diagram:

MaxL data loads: server vs. local

You might be thinking, “But wait, the file is on the server, shouldn’t it be faster?” Well, no. But there’s hope. Now consider server file loading. In this case we use the server keyword on the import statement and specify the name of the file to load. Note that the file location is based on the database being loaded to: if you’re loading to Sample Basic, then Essbase will look in the ../app/Sample/Basic folder for the file.

If you don’t want to put files in the database folder, you can actually cheat a little bit and specify a relative path such as ..\..\data.txt. By specifying the ..\..\, Essbase will go up two folders (to the \app folder) and look for the file there. You can fudge the paths a little, but the key is this: Essbase reads the file itself, without the MaxL client incurring the performance penalty of two full trips of the data. This is depicted in the right figure of the diagram: the MaxL client issues a data load command to the server, the server loads the file directly, and we never pay the cost of dragging the data across the network twice.
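Continuing with the Sample/Basic example from earlier, the server-side variant is a one-keyword change:

import database Sample.Basic data
    from server data_file "data.txt"
    using server rules_file "LdAll"
    on error write to "errors.txt";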

In my case, the automation was written to load a file that was already on the server (in the \app folder), so I just changed the import to be a server-style import and immediately cut the data import time dramatically.

I wouldn’t be surprised if this “anti-pattern” is being used in other places – so take a look at your automation. Let me know if you find this in your environment and are able to get a performance boost!


The MaxL “Repeated calls to startMaxl.bat stops working” on Windows issue

I’ve run into this before and a few colleagues have recently run into this. Sometimes you’ll write some MaxL automation that loops a lot, for example, looping over a bunch of input files (perhaps you have a single text file per day of data or something).

So in your batch file you have a for loop, or just a bunch of repeated calls to startMaxl.bat (note that startMaxl.bat is a wrapper of sorts for essmsh, the MaxL interpreter). The first several runs work just fine, then all of a sudden new calls to the script stop working. Closing the command prompt window and reopening it “fixes” the issue.

In the past when I bumped into this it was very tricky to troubleshoot. Let’s look at the code for startMaxl.bat:

@ECHO OFF
@REM This file is created to startMaxl
rem Set ESSBASEPATH
call "%~dp0\setEssbaseEnv.bat"
set PATH=%ESSBASEPATH%\bin;%PATH%
%ESSBASEPATH%\bin\essmsh.exe %*

Nothing too fancy, but notice the second-to-last line, where we set PATH to the ESSBASEPATH variable plus the existing PATH. This is a fairly common pattern. The problem, though, is that Windows caps how long the value of an environment variable can be (32,767 characters). The repeated calls to startMaxl keep growing the PATH variable in that console session (i.e., the changes aren’t permanent), and once the length limit is hit, the job fails.

The workaround? There are a couple. If you can call essmsh directly, then do that. Sometimes you get a “MaxL initialization error” or similar – this is due to environment variables not being set (check the setEssbaseEnv.bat file for the variables that need to be set); you can add these to your system variables and then call essmsh directly. Alternatively, we could change startMaxl or create a new file next to it that only appends to the PATH once, say, by tracking this in another temporary variable, as in the sketch below.
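Here is a minimal, hypothetical wrapper along those lines (the ESSBASE_PATH_SET flag name is just an illustration); it uses a goto so the PATH expansion happens after setEssbaseEnv.bat has actually run:

@ECHO OFF
rem Prepend the Essbase bin folder to PATH only once per console session,
rem so repeated calls no longer grow the PATH variable.
if defined ESSBASE_PATH_SET goto run
call "%~dp0\setEssbaseEnv.bat"
set PATH=%ESSBASEPATH%\bin;%PATH%
set ESSBASE_PATH_SET=1
:run
"%ESSBASEPATH%\bin\essmsh.exe" %*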

Just wanted to pass this on, since it can be a little tricky to troubleshoot. Good luck!


A quick trick to avoid hard-coding folder names in batch files

I’m still surprised (although I guess I shouldn’t be) at how often I come across batch automation files whose first line sets the current folder using a hard-coded folder name. For example, if the automation is located in D:\Essbase\Automation, then the first line of the script looks like this:

cd /d D:\Essbase\Automation

99% of the time, this folder is the same as the folder containing the batch file itself. Instead of hard-coding it, you can just use a handy shortcut on Windows:

cd /d %~dp0

The %~dp0 token gets replaced at runtime with the drive and path of the currently executing batch file. The /d parameter simply tells the cd command to change drives if necessary (so that the change-directory command works when it’s going from the C drive to the D drive, for example).

Using the above technique, you can write batch files that are more standard looking, portable, and more flexible. It’s a good thing (said in the voice of Martha Stewart).

essbasepy Python MaxL module for Hyperion moved!

As some of you may know, I am now the active maintainer for the essbasepy Python module for MaxL. This project is an analog to the MaxL Perl module that was originally created by David Welden. I have put a fair bit of time into getting up and running with it, updating it, and testing it against EPM 11.1.2.3. I have now moved the code from its previous home on Google Code to an open repo on my GitHub account.

Other than moving to GitHub, I have included a few updates for the newest version of Hyperion, updated the documentation, and consolidated the distribution down to one master set of files. The future plans for this project are to keep validating it against Hyperion updates, polish it a bit, and enhance the documentation even more. At this point I don’t have any plans to significantly change the functionality.

I know there’s a handful of you out there that are hardcore users of this so if you have any issues or questions, please don’t hesitate to contact me.

Hyperion Essbase rejected record automation one-liner

I’m just cleaning up some old files here and came across a post from quite some time ago that never got published. Whoops. This was before I officially released the Rejected Record Summary Java routine for analyzing/summarizing a Hyperion data load rejected record file.

So imagine row after row of something like the following:

\\ Member Ac.0170001 Not Found In Database
09 0170001 900 11 .00

\\ Member Ac.0170001 Not Found In Database
09 0170001 904 11 .00

\\ Member Ac.0170001 Not Found In Database
09 0170001 905 11 .00

\\ Member Ac.0170001 Not Found In Database
09 0170001 906 11 .00

You could run the following one-liner on it:

grep \\\\ sample1.txt | sed -e 's/\\\\ Member //' -e 's/ Not Found In Database//' | sort | uniq -c | sort -nr

And get something like the following:

24 Ac.0453902
24 Ac.0397511
24 Ac.0171026
24 Ac.0170926
24 Ac.0170126
23 Ac.0909100
23 Ac.0901100
23 Ac.0201220
23 Ac.0170326

Now, hopefully you aren’t getting any rejected records (and certainly not this many, but then again, it’s just test data). No matter what, though, your Hyperion automation practices should include regularly inspecting your rejected records, if any, and this one-liner might help if you happen to work it into some automation.

This example of course presumes that your Hyperion automation server has the standard Unix command-line tools, so alternatively you could install Cygwin or something similar on Windows if you script there. And for complete platform independence, check out my Java library!

Hyperion Essbase wish list: Import a compressed file

I thought this up while attending Dan Pressman’s Kscope presentation How ASO Works and How to Design for Performance, a presentation that definitely appealed to my inner Hyperion geek. Dan did a crazy deep dive on performance tuning with particular respect to loading ASO. He had some pretty bangin hardware to play with too.

Long story short, and many of us have known this for awhile, but there are ways to format your Essbase load files so that they load faster. Basically what you are trying to do is make things easier on Essbase: stream in less data, don’t repeat things you don’t need to repeat, don’t thrash blocks in and out of memory, and so on. That’s all well and good.

The advent and proliferation of SSDs in the enterprise has done wonderful things for Hyperion performance by eliminating a lot of the performance quirks of rotational media and the penalties of fragmentation. But at the end of the day we are still looking for ways to pump ever-increasing amounts of information into our cubes even faster than we were the day before.

For instances where we are loading a file that resides on the same machine as the Hyperion apps/cubes, or even across the network, I wonder what performance benefits, if any, are to be had if we had the ability to import a zip file.

Zip files can get awesome compression on text files. They can also have their uncompressed contents streamed; in other words, it’s not necessary to extract the contents of a zip file before you can read them (starting at the beginning). In theory, if one achieved moderate to decent compression on a zip file and handed that to Essbase (say, with a specialized import data MaxL command), it would save time on the disk-read side of the data load at the expense of some additional CPU usage. Many Essbase load operations are disk I/O bound anyway, so this seems like a reasonable tradeoff to make.
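To make the streaming point concrete, here is a small Java sketch (data.zip and the per-record processing are placeholders) that reads text records straight out of an archive without ever extracting it to disk:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class ZipStreamDemo {
    public static void main(String[] args) throws Exception {
        try (ZipInputStream zin = new ZipInputStream(
                Files.newInputStream(Paths.get("data.zip")))) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                // The stream decompresses on the fly; read() reports end-of-stream
                // at the end of the current entry.
                BufferedReader reader = new BufferedReader(
                        new InputStreamReader(zin, StandardCharsets.UTF_8));
                String line;
                while ((line = reader.readLine()) != null) {
                    // a real loader would hand each record to the load routine here
                    System.out.println(entry.getName() + ": " + line);
                }
            }
        }
    }
}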

As an additional benefit, or an elaboration on the concept, multiple text files could be placed into the same zip file, perhaps with a “load manifest” or options on the load command, and Essbase would attempt to parallelize the data load to the extent it can. This would likely be an add-on feature once the basic support is in place. In all, you would need to augment the data load process with a zip file reader routine (off-the-shelf libraries for this are quite common), a couple of new MaxL import data variants, and an augmentation to the Java API. I suppose you could leave the MaxL command alone and just program the interpreter to look for a .zip extension and treat it accordingly, but it seems like the better choice would be to specifically indicate that the data load is from a compressed file.

Of course, if you’re loading just from SQL, this whole thing wouldn’t apply to you. Loading data files may seem low-tech, but it’s incredibly common, and oftentimes I prefer it, as I have an exact text file to tie back to if need be, versus a possibly changing SQL data store (but that’s a conversation for a different blog post). This feature would cater to the performance nuts out there – and if Kscope is any indication, there are plenty. I’d be curious to hear anyone’s thoughts on this.

Viva La Database Notes

I originally titled this post “The best little Hyperion Essbase feature you’re not using” but thought better of it. So, database notes. Have you ever wondered what that little Notes button in the Excel add-in does? Or rather, what it did (since even yours truly has finally made the official jump to Smart View)?

Essbase Excel add-in Database Note dialog

It is basically a facility for letting the Hyperion administrator set a “message of the day” on a per-database basis. This message can be set manually from EAS, set via automation with MaxL, and of course updated via the Java API. Since 99% of the cubes I am used to maintaining are wrapped with automation, the notes are a great place to put some information about the automation process, even if I’m the only one who is going to refer to it. I honestly don’t know of any environments where even a significant portion of users is aware of or uses this feature, but I am sure there are some.
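For example, stamping the note from a nightly MaxL routine is a one-liner (the message text here is just an illustration; see alter database in the MaxL reference):

alter database Sample.Basic set note "Nightly load finished 04:30, 0 rejected records";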

Just off the top of my head, here are some ideas for information that can be put in the database notes:

  • Last cube refresh time
  • Rejected record stats (such as those generated by the Rejected Record Summary tool)
  • Automation stats (duration, errors, etc)
  • Next cube refresh time
  • Notes about business process calendar/schedule
  • Link to cube status information website
  • Timestamp/version information from file(s) used to load the cube
  • Indicate date of archival for an “archive” cube

As an example, I once created an automation routine that depended on files from various divisions. Each division was on its own schedule, denoted by an integer, typically between one and 40. It was useful to me to know which was the most recent file loaded in, so I programmed the notes to contain this information, as well as some rejected record stats. On a day-to-day basis I didn’t need to refer to the notes very often, but when things went a little haywire, they were a great first place to check before having to dig in further.
