Hyperion Essbase wish list: Import a compressed file

I thought this up while attending Dan Pressman’s Kscope presentation, “How ASO Works and How to Design for Performance,” which definitely appealed to my inner Hyperion geek. Dan did a crazy deep dive on performance tuning, particularly with respect to loading ASO. He had some pretty bangin’ hardware to play with too.

Long story short, and many of us have known this for a while: there are ways to format your Essbase load files so that they load faster. Basically, what you are trying to do is make things easier on Essbase: stream in less data, don’t repeat things you don’t need to repeat, don’t thrash blocks in and out of memory, and so on. That’s all well and good.

The advent and proliferation of SSDs in the enterprise has done wonderful things for Hyperion performance by eliminating a lot of the performance quirks of rotational media and the penalties from fragmentation. But at the end of the day we are still looking for ways to pump ever-increasing amounts of information into our cubes even faster than we did the day before.

For cases where we are loading a file that resides on the same machine as the Hyperion apps/cubes, or even across the network, I wonder what performance benefits, if any, we would see if we had the ability to import a zip file.

Zip files can get awesome compression on text files. They can also have their uncompressed contents streamed. In other words, it’s not necessary to extract the contents of a zip file to disk before you can read them (starting at the beginning). In theory, if you got moderate to decent compression on the zip file and handed it to Essbase (say, with a specialized import data MaxL command), the load would save time on the disk-read side at the expense of some additional CPU usage. Many Essbase load operations are disk I/O bound anyway, so this seems like a reasonable tradeoff to make.
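To make the streaming idea concrete, here is a minimal Python sketch of reading rows out of a zip member without ever extracting it to disk. This is not anything Essbase does today; the function names, the data.txt member name, and the sample row are all made up for illustration. One caveat: Python’s zipfile module needs a seekable source because the zip directory lives at the end of the file, but the member itself is decompressed as you read it.

```python
import io
import zipfile


def stream_zip_lines(zip_source, member_name, encoding="utf-8"):
    """Yield text lines from one zip member, decompressing as we read.

    Nothing is extracted to disk; the member is read front to back.
    """
    with zipfile.ZipFile(zip_source) as archive:
        with archive.open(member_name) as raw:
            for line in io.TextIOWrapper(raw, encoding=encoding):
                yield line.rstrip("\r\n")


def build_sample_zip(row_count=10000):
    """Build an in-memory zip holding a hypothetical, repetitive load file."""
    text = '"Jan" "Sales" "East" 100\n' * row_count
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("data.txt", text)
    return buffer, len(text.encode("utf-8"))


if __name__ == "__main__":
    buffer, raw_size = build_sample_zip()
    row_total = sum(1 for _ in stream_zip_lines(buffer, "data.txt"))
    print(f"rows streamed: {row_total}")
    print(f"uncompressed bytes: {raw_size}, zip bytes: {len(buffer.getvalue())}")
```

The sample data here is absurdly repetitive, so its compression ratio says nothing about real load files. The point is only that the reader consumes the compressed bytes and hands back rows as it goes, which is the disk-read-for-CPU trade described above.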

As an additional benefit or elaboration on the concept, multiple text files could be placed into the same zip file, perhaps with a “load manifest” or options on the load command, and Essbase would attempt to parallelize the data load to the extent it can. This would likely be an add-on feature once the basic support is in place. In all, you would need to augment the data load process with a zip file reader routine (this would be an off-the-shelf library that is quite common), a couple of new MaxL import data variants, and an augmentation to the Java API. I suppose you could leave the MaxL command alone and just program the interpreter to look for a .zip extension and treat it accordingly, but it seems like the better choice would be to indicate explicitly that the data load is from a compressed file.
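Here is a rough Python sketch of what the “load manifest” idea might look like from the outside. Everything in it is an assumption on my part: the manifest.json name, its {"files": [...]} layout, and the function names are invented, and counting rows stands in for whatever a real load would do with each file.

```python
import json
import zipfile
from concurrent.futures import ThreadPoolExecutor


def count_rows(zip_path, member_name):
    """Stand-in for a real load: stream one member and count its rows."""
    # Each worker opens its own handle so no reader state is shared.
    with zipfile.ZipFile(zip_path) as archive:
        with archive.open(member_name) as raw:
            return member_name, sum(1 for _ in raw)


def load_from_manifest(zip_path, manifest_name="manifest.json", max_workers=4):
    """Read the (hypothetical) manifest, then process the listed files in parallel."""
    with zipfile.ZipFile(zip_path) as archive:
        manifest = json.loads(archive.read(manifest_name))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(lambda name: count_rows(zip_path, name), manifest["files"])
        return dict(results)
```

Given a zip containing manifest.json plus the files it lists, load_from_manifest("load.zip") returns a row count per file. Whether Essbase could actually load those files in parallel is a separate question; this only shows that one archive can be fanned out to several independent readers.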

Of course, if you’re loading just from SQL, this whole thing wouldn’t apply to you. Loading data files may seem low-tech, but it’s incredibly common, and oftentimes I prefer it because I have an exact text file to tie back to if need be, versus a possibly changing SQL data store (but that’s a conversation for a different blog post). This feature would cater to the performance nuts out there, and if Kscope is any indication, there are plenty. I’d be curious to hear anyone’s thoughts on this.

Essbase/EAS feature request: Metadata and description fields on Essbase objects

As a programmer, I find it imperative to document code as part of the coding process itself. I have learned to treat the construction of good code as a process that includes the documentation as part of the code. One of the things I love about ODI is the ability to add a description to many objects such as interfaces. Even within a single interface you can document individual elements. This is incredibly useful to provide context around why something is designed the way it is designed, to remind yourself of something, or for the next person to use/edit the code (which could be many years in the future as well as after you have moved on).

Essbase is no different. Calc scripts are among the most important things to document, and they quite frequently do get good documentation. We have the ability to write documentation in them, as much or as little as we want (hopefully we write as much as is needed and no more). We can add documentation to report scripts (while still useful, these seem to have fallen quite out of favor, given all of the tools that can move Essbase data around). We can add comments to individual members in an outline. We can add comments to our batch files and MaxL scripts. We can add notes to databases (did you know that? You can set a note on a database… I have used these quite successfully, but they seem to be seldom used despite how useful they can be, or at least were).

I wish we had more, though. I’d like to at least have a Description field that I could fill in for applications, databases, load rules, outlines, calc scripts (beyond the inline documentation), and any other object. I could see a lot of uses for this: making notes on temporary databases or other temporary files, explaining the purpose of a particular load rule, flagging quirks in an outline (notes to a future admin), and so on. Of course, it’s possible to keep all this in a separate document, but as we all know, those get stale and go out of date. They also rarely cover everything, so new features or files pop up and never make it into the documentation.

Come to think of it, more metadata than even just Description would be useful: last person to edit, create time, edit time, basically all the usual stuff. As a bonus feature, the ability to associate some arbitrary files with a server or application would be nice so that we could upload a Word doc or PDF or something to its associated cube/server and have it available as a quick reference.

Anyone else think enhanced metadata and associated functionality would be useful?