EIS, IES, and things that make you go hmm…

This will be another one of those “fun” posts that I get to occasionally do. Every now and then I check out the web traffic stats for this blog and other online endeavors and take a look at who is coming and how they are getting here.

I wrote an article years ago about my experiences with EIS (Essbase Integration Services). It has turned out, hit-wise, to be pretty popular. There’s just not a lot of information about this wonderful little tool out there.

So, naturally, the search terms for people looking up information on EIS are either “Essbase Integration Services” or just “EIS.” Curiously, though, one of the major traffic contributors to the EIS page is actually people typing in “Boobeis” (presumably a misspelling of “Boobies”). When someone searches on that term, Google seems to helpfully break it up into its constituent parts (Boob + eis) and suggest my site as a result.

So, yeah. According to Google, my EIS posts get two types of visitors: those looking for information on how to build a cube from a database schema and those wanting to see naked females. Hopefully those two demographics don’t overlap… at least during work hours.

An Introduction to Essbase Integration Services (EIS)

So, just what is this EIS thing?  You’ve heard about it, you know it has something to do with outlines, but you haven’t used it and you just don’t know where to start.  I understand.  I’ve been there.  I couldn’t really see an immediate payoff to using it, which made getting it all set up a little more daunting.  Please note that this article is written referring mostly to version 7 of EIS — I’m not sure what changes may have occurred since then.

Interestingly, my motivation for using EIS was a bit odd.  I was building some new cubes for the enterprise, and for the first time they were ASO.  The cubes were the evolution of a set of cubes that had been around in the company for a number of years, and the goal was to take what worked (and avoid what didn’t) and apply it to the entire organization.  Which is huge.  Hence the reason for ASO.  Along with learning all of the quirks that ASO brings to the table versus BSO, I also ran into another issue.  The original cubes, as part of their calculation method, would FIX on all of the accounts related to Gross Profit, flip the sign, and then roll up the whole cube.  The reason for this is that the GP accounts come in with their “natural” accounting sign, so something like a sales account would be negative in the source database.  But since there are no calc scripts in ASO, I had to come up with another solution.  Well, it turns out that there’s this neat little feature on load rules that lets you flip the sign for members tagged with a particular UDA.

Given that the Accounts dimension changes from period to period (and has  a bajillion members in it), it was necessary to come up with a mechanism to tag all of the appropriate accounts with the UDA I needed.  I got this working just fine with a load rule, ran it, and voila, I had a nice shiny UDA called “GP” on the members I needed.  I was feeling pretty pleased with myself… until the next time I ran the automation.  I ran it again and again for testing purposes.  And when I cracked open the outline again in EAS, I noticed that my members had about 20 copies of the “GP” UDA on them.

As it turns out, I believe this was a bona fide bug in Essbase 7.x that was fixed at some point — at the time I was writing the automation, my servers were on 7.1.2.  But at the time we had no plans to put in the point upgrade, and not being the kind of guy that likes to wait around anyway, I decided to take a somewhat unconventional approach and just implement the entire thing in EIS.  So, you can honestly say that my entire reason for picking up EIS was that my UDAs were multiplying like tribbles on me, but that’s life; funny things like that happen.

So, getting back to the main idea here, what exactly is EIS?  In short, it’s a tool for creating outlines based on relational data.  It essentially takes the place of editing your outlines in EAS using dimension build load rules.  You can also load data through EIS (although I typically decline to do so and instead automate it with MaxL elsewhere).  And, unlike EAS with its oh-why-don’t-you-just-stab-me-in-the-eyes-already interface, EIS has a pretty slick interface.  In fact, the interface is so nice, it makes me just a little bit tingly on the inside… but then again that could just be related to my unnatural obsession with all things multi-dimensional.

Why use EIS?  Once you get good at it, and have some good upstream data to work with, you can crank out some outlines pretty fast.  You can also keep them updated very easily, by virtue of updating the source relational data that EIS reads.  I’m also completely and utterly sick of whipping up new load rules to update dimensions.  In the case of some of my attribute dimensions, I can’t even imagine the pain I would have to go through to implement the same functionality with a load rule.  EIS is also the tool you’ll need to use to link up to some drill-through data.

To get started with EIS, you need to make sure that EIS is installed on a server somewhere.  I just have mine running on the Essbase server itself and I am happy with this approach.  EIS will store all of its data in something referred to as the “metadata catalog,” which is just a SQL database that you’ll need to set up.  Oddly enough, when I started using EIS, I didn’t actually know I had a SQL server, but I figured there was one somewhere, so I started pinging things, and behold, I found a SQL box lying around (I guess the proper way of doing this would be to fill out a capital appropriations request for a server or something, but this makes a way better story).  It’s a SQL Server box with modest specs, but it gets the job done just fine.  It’s also the same server that I use to hold all of the relational data that I use for building outlines.

In a nutshell, you need to create a new database, then use one of the scripts that comes with EIS to populate the new database with some initial data.  This data is simply something that EIS will use under the hood, and you won’t have to (and shouldn’t) edit it by hand once you get it all loaded up.  After you have the shell of the metadata catalog set up, you will need to define some data sources from the EIS server to your relational data.  In my case, with Windows servers, this meant just setting up an ODBC connection on the Essbase server and pointing it to another new SQL database on the SQL server.
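
If you’re on SQL Server like I am, the first step really is that simple.  Here’s a minimal sketch; the database name is just a placeholder, and the name and location of the catalog-creation script vary by EIS version, so check your install:

-- Hypothetical example: an empty database to hold the EIS metadata catalog.
-- "EIS_CATALOG" is a made-up name; use whatever fits your environment.
CREATE DATABASE EIS_CATALOG;
-- Next, run the catalog-creation script that ships with EIS against this
-- database, then point an ODBC DSN on the EIS server at it.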

At this point, I would highly recommend following the EIS instructions and setting up the data they include for TBC (The Beverage Company).  Setting up this example means that you will be set up with the EIS model for TBC, as well as a metaoutline for TBC.

In EIS terminology, you create a model first, then a metaoutline.  A model is something you create in order to link together your SQL tables and tell EIS how they relate to each other.  At the center of this model you will have a fact table.  You can sort of think of the fact table as being similar in nature to the type of data that you would load to your Essbase cube with a normal load rule.  For example, if the rows in your data file had fields for the scenario, year, time period, location, department, and measure, followed by a dollar amount or other figure, you can think of that as your fact table.  In this case, think of the different time periods for a moment.  In a typical periods/quarters/year setup, your text file would just have an 01 for period 01, an 02 for period 02, and so on.  Generally the source data wouldn’t make any mention of the quarters, but EIS needs to know about them.  On the model, you would link the Period field on the fact table to another table in your relational database that shows how to build the Time dimension based on that data.
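
To make that a bit more concrete, here’s a minimal sketch of what such a fact table and its companion time table might look like in SQL Server.  All of the table and column names are made up for illustration; your source schema will differ:

-- Hypothetical fact table: one row per scenario/year/period/location/
-- department/measure combination, plus the amount.
CREATE TABLE SALES_FACT (
    SCENARIO    VARCHAR(20),
    YR          INT,
    PERIOD      CHAR(2),        -- '01' through '12'
    LOCATION    VARCHAR(10),
    DEPARTMENT  VARCHAR(10),
    MEASURE     VARCHAR(10),
    AMOUNT      DECIMAL(15,2)
);

-- Hypothetical time table.  In the model, EIS joins this to
-- SALES_FACT.PERIOD, which is how it learns about the quarter/year
-- rollup that the fact data itself never mentions.
CREATE TABLE TIME_DIM (
    PERIOD      CHAR(2),        -- join column
    PERIOD_DESC VARCHAR(20),    -- e.g. 'Jan'
    QUARTER     VARCHAR(2)      -- 'Q1' through 'Q4'
);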

The metaoutline has to be created after the model, because it is highly dependent on the way the model is set up.  In fact, when you create the metaoutline, you tell EIS which model to base it on.  As seasoned EIS veterans know, once you commit to something in the model, you are basically married to it.  You can’t go around gutting the model too much unless you first make changes to your metaoutline(s).

If you’ve set up the model in a sane manner, you can then use the dimensions/tables you defined as Lego blocks in your metaoutline: mix and match different dimensions and come up with a working outline.  You can even create arbitrary members in dimensions, or completely new dimensions, if that’s what you need for that particular outline.  Try your best to keep it in the model though — it’ll usually pay off in the long run.

With the metaoutline properly defined, you can then load the members to a cube.  This is a straightforward process (assuming everything is set up correctly) where EIS will take your metaoutline, which is in turn based on your model, which is in turn based on your relational (SQL) data, and build a completely fresh outline for you.  Of course, this process can be automated (as with just about everything else).  If you’ve built everything correctly you can also load data through EIS, although in practice I tend to leave that to a separate automation system; it’s always good to know that your models are built properly, though.

The barriers to entry for setting up EIS can be daunting if you aren’t already using it.  You have to install it on a server, set up a SQL database for the metadata catalog, have some source data in SQL that you want to use to build/load outlines (which may require you to brush up on your SQL skills if you’re a bit rusty), and figure out how the tool works.  As always, learning from examples and experimenting on your own are very good ways to go — so if you can get the EIS demo app installed you should be able to see how things are put together.  The payoff for using EIS, though, can be immense.  You don’t have to mess around with all those dimension build load rules, you can spin up new outlines in a jiffy (without necessarily reinventing the wheel every time), and girls think it’s pretty cool (and for all you female cube geeks out there, I’m sure the guys will be impressed too).

If you’re still curious about EIS (and who wouldn’t be?), feel free to drop me a line.  Also, if you’d like to see more articles about EIS, let me know and I can dig into some of the more complex stuff.  The screenshots below show some EIS screens — a model, a metaoutline, and building an outline from a metaoutline.  See in the model how the fact table is in the middle, with some other dimensions attached to it.  I blurred out some server information, but it doesn’t affect the purpose of the screenshots.

MaxL tricks and strategies for upgrading a legacy automation system from ESSCMD

The Old

In many companies, there is a lot of code lying around that is, for lack of a better word, “old.” In the case of Essbase-related functionality, this may mean there are automation systems built on several ESSCMD scripts.  You could rewrite them in MaxL, but where’s the payoff?  There is nothing inherently bad about old code; in fact, you can often argue a strong case to keep it: it tends to reflect many years of tweaks and refinements, is well understood, and generally “just works” — and even when it doesn’t, you have a pretty good idea where it tends to break.

Rewrite it?

That being said, there are some compelling reasons to do an upgrade.  The MaxL interpreter brings a lot to the table that I find incredibly useful.  The existing ESSCMD automation system in question (essentially a collection of batch files, all the ESSCMD scripts with the .aut extension, and some text files) is all hard-coded to particular paths.  Due to the use of absolute UNC paths, and for some other historical reasons, only a production copy of the code exists (there was perhaps a test version at some point, but because of all the hard-coding, the deployment method consisted of a massive search-and-replace operation in a text editor).  Because the system is very mature, stable, and well understood, it has essentially been “grandfathered” in as a production system (it’s kind of a “black box” that just works).

The Existing System

The current system performs several different functions across its discrete job files.  There are jobs to update outlines, process new period data, perform a historical rebuild of all cubes (this is currently a six-hour job, and in the future I will show you how to get it down to a small fraction of its original time), and some glue jobs that scurry data between different cubes and systems.  The databases in this system are set up such that there are about a dozen very similar cubes.  They are modeled on a series of financial pages, but due to differences in the way some of the pages work, it was decided years ago that the best way to model cubes on the pages was to split them up into different sets of cubes, rather than one giant cube.  This design decision has paid off in many ways.  For one, it keeps the cubes cleaner and more intuitive; interdimensional irrelevance is also kept to a minimum.  Strategic dense/sparse settings and other outline tricks, like dynamic calcs in the Time dimension rollups, also keep things pretty tight.

Additionally, since the databases are used during the closing period, not just after it (for reporting purposes), new data can be pushed through pretty quickly to keep the cubes essentially real-time with how the accounting allocations are being worked out.  Keeping the cubes small also allows for a lot less downtime (although realistically speaking, even in the middle of a calc, read access is still pretty reliable).

So, first things first.  Since there currently are no test copies of these “legacy” cubes, we need to get them set up on the test server.  This presents a somewhat ironic development step: using EAS to copy the apps from the production server to the development server.  These cubes are not spun up from EIS metaoutlines, and there is very little compelling business reason to convert them to EIS just for the sake of converting them, so this seems to be the most sensible approach.

Although the outlines are in sync right now between production and development (because I just copied them), the purpose of one of the main ESSCMD jobs is to update the outlines each period, so this seems like a good place to start.  The purpose of the outline update process is basically to sync the Measures dimension to the latest version of the company’s internal cross-reference.  The other dimensions are essentially static and only need to be updated rarely (e.g., to add a new year member).  The cross-reference is like a master list of which accounts are on which pages and how they aggregate.

On a side note, the cross-reference is part of a larger internal accounting system.  What it lacks in flexibility, it probably more than makes up for with reliability and a solid ROI.  One of the most recognized benefits that Essbase brings to the table in this context is a completely new and useful way of analyzing existing data (not to mention write-back functionality for budgeting and forecasting) that didn’t exist before.  Although Business Objects exists within the company too, it is not recognized as being nearly as useful to the internal financial customers as Essbase is.  I think part of this stems from the fact that BO seems to be pitched more to the IT crowd within the organization, and as such, serves mostly as a tool to let them distribute data in some fashion and call it a day.  Essbase really shines, particularly because it is aligned with the Finance team and is customized (by finance team members) to function as a finance tool, versus just shuttling gobs of data from the mainframe to the user.

The cross-reference is parsed out in an Access database in order to massage the data into various text files that serve as the basis of dimension build load rules for all the cubes.  I know, I know, I’m not a huge Access fan either, but again, the system has been around forever, It Just Works, and I see no compelling reason to upgrade this process to, say, SQL Server.  Because of how many cubes there are, different aliases, different rollups, and all sorts of fun stuff, there are dozens of text files that are used to sync up the outlines.  This has resulted in some pretty gnarly-looking ESSCMD scripts.  They also use the BEGININCBUILD and ENDINCBUILD ESSCMD statements, which basically means that the cmd2mxl.exe converter is useless to us.  But no worries — we want to make some more improvements besides just doing a straight code conversion.

In a nutshell, the current automation script logs in (with a nicely hard-coded server path, user name, and password), outputs to a fixed location, logs in to each database in sequence, and runs a bunch of INCBUILDDIM statements.  ESSCMD, she’s an old girl: faithful, useful, but just not elegant.  You need a cheat sheet to figure out what the invocation parameters all mean.  I’ll spare you the agony of seeing what the old code looks like.

Goals

Here are my goals for the conversion:

  • Convert to MaxL. As I mentioned, MaxL brings a lot of nice things to the table that ESSCMD doesn’t provide, which will enable some of the other goals here.
  • Get databases up and running completely in test — remember: the code isn’t bad because it’s old or “legacy code,” it’s just “bad” because we can’t test it.
  • Be able to use the same scripts in test as in production.  The ability to update the code in one place, test it, then reliably deploy it via a file-copy operation (as opposed to hand-editing the “production” version) is very useful (also made easier because of MaxL).
  • Strategically use variables to simplify the code and make it directory-agnostic.  This will allow us to easily adapt the code to new systems, for example if we want to consolidate to a different server in the future (even one on a different operating system).
  • And as a tertiary goal: start using a version control system to manage the automation system.  This topic warrants an article all to itself, which I fully intend to write in the future.  In the meantime, if you don’t currently use some type of VCS, all you need to know about the implications here is that we will have a central repository of the automation code, which can be checked out and checked in.  In the future we’ll be able to look at the revision history of the code.  We can also use the repository to deploy code to the production server.  This means that I will be checking out the code to my workstation to do development, and I’m also going to be running the code from my workstation with a local copy of the MaxL interpreter.  This development methodology is made possible in part because, in this case, my workstation is Windows and so are the Essbase servers.

For mostly historical reasons the old code has been operated and developed on the analytic server itself, and there are various aspects of the way the code has been developed that mean you can’t run it from a remote server.  As such, there are various semantic client/server inconsistencies in the current code (e.g., in some cases we refer to a load rule by its App/DB context, and in some cases we use an absolute file path).  Ensuring that the automation works from a remote workstation will mean that these inconsistencies are cleaned up, and if we choose to move the automation to a separate server in the future, it will be much easier.

First Steps

So, with all that out of the way, let’s dig into this conversion!  For the time being we’ll just assume that the versioning system back-end is taken care of, and we’ll be putting all of our automation files in one folder.  The top of our new MaxL file (RefreshOutlines.msh) looks like this:

msh "conf.msh";
msh "$SERVERSETTINGS";

What is going on here?  We’re using some of MaxL’s features right away.  Since there will be overlap among many of these automation jobs, we’re going to put a bunch of common variables in one file.  These can be things like folder paths, app/database names, and so on.  One of those variables is $SERVERSETTINGS, which lets us configure, within conf.msh, where the server-specific MaxL configuration file lives.  This is one method that allows us to centralize certain passwords and folder paths (like where to put error files, where to put spool files, where to put dataload error files, and so on).  Configuring things this way gives us a lot of flexibility, and further, we only really need to change conf.msh in order to move things around — everything else builds on top of the core settings.
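
To give an idea of what I mean, a stripped-down conf.msh might look something like this.  This is just a sketch; everything in it other than the SERVERSETTINGS variable is a made-up placeholder:

/* conf.msh -- common variables shared by all of the automation jobs. */
/* The paths and app/database names below are placeholders.           */
set SERVERSETTINGS = "D:/Essbase/Automation/serversettings.msh";
set APPNAME = "Foo";
set DBNAME = "Bar";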

Next we’ll set a process-specific configuration variable, which is a folder path.  This defines the source folder for all of the dimension build data files.

SET SRCPATH = "../../Transfer/Data";

Next, we’ll log in:

login $ESSUSER identified by $ESSPW on $ESSSERVER;

These variables are found in the $SERVERSETTINGS file.  Again, this file has the admin user and password in it.  If we needed more granularity (say, a special ID just for the databases in question, instead of running all of the automation as the god-user), we could put that in our conf.msh file.  As it is, there aren’t any issues on this server with using a master ID for the automation.
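
For reference, the server-specific settings file is just more of the same.  Something along these lines, with the server name, credentials, and paths obviously being placeholders:

/* serversettings.msh -- settings that differ between test and production. */
/* All of the values below are placeholders for illustration.              */
set ESSSERVER = "essbaseprod";
set ESSUSER = "admin";
set ESSPW = "secret";
set LOGPATH = "D:/Essbase/Automation/logs";
set ERRORPATH = "D:/Essbase/Automation/errors";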

spool stdout on to "$LOGPATH/spool.stdout.RefreshOutlines.txt";
spool stderr on to "$LOGPATH/spool.stderr.RefreshOutlines.txt";

Now we use the spooling feature of MaxL to divert standard output and error output to two different places.  This is useful to split out because if the error output file has a size greater than zero, it’s a good indicator that we should take a look and see if something isn’t going as we intended.  Notice how we are using a configurable LOGPATH directory.  This is the “global” logpath, but if we wanted it somewhere else we could have just configured it that way in the “local” configuration file.

Now we are ready for the actual “work” in this file.  With dimension builds, this is one of the areas where ESSCMD and MaxL do things a bit differently.  Rather than block everything out with begin/end build sections, we can jam all the dimension builds into one statement.  This particular example has been modified from the original in order to hide the real names and to simplify it a little, but the concept is the same.  The nice thing about just converting the automation system (and not trying to fix other things that aren’t broken — like moving to an RDBMS and EIS) is that we get to keep all the same source files and the same build rules.

import database Foo.Bar dimensions

    from server text data_file "$SRCPATH/tblAcctDeptsNon00.txt"
    using server rules_file 'DeptBld' suppress verification,

    from server text data_file "$SRCPATH/tblDept00Accts.txt"
    using server rules_file 'DeptBld'

    preserve all data
    on error write to "$ERRORPATH/JWJ.dim.Foo.Bar.txt";

In the actual implementation, the import database blocks go on for about another dozen databases.  Finally, we finish up the MaxL file with some pretty boilerplate stuff:

spool off;
logout;
exit;

Note that we are referring to the source text data file in the server context.  Although you are supposed to be able to use App/database naming for this, it seems that on 7.1.x, even if you start the filename with a file separator, it still just looks in the folder of the current database.  I have all of the data files in one place, so I was able to work around this by just changing the SRCPATH variable to go up two folders from the current database, then back down into the Transfer\Data folder.  The Transfer\Data folder is under the Essbase app\ folder.  It’s sort of a nexus folder where external processes can dump files because they have read/write access to the folder, but it’s also the name of a dummy Essbase application (Transfer) and database (Data), so we can refer to it and load data from it, from an Essbase-naming perspective.  It’s a pretty common Essbase trick.  We are also referring to the rules files from a server context.  The error output files, however, are written to a local location.  This all means that we can run the automation from some remote computer (for testing/development purposes), and we can run it on the server itself.  It’s nice to sort of “program ahead” for options we may want to explore in the future.
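
If you want to set up the same kind of nexus application yourself, it’s nothing more than creating an empty dummy app and database and giving the external processes write access to its directory.  A quick sketch in MaxL (the Transfer/Data names match the example above; use whatever you like):

/* One-time setup for the "nexus" trick: an empty dummy application and  */
/* database whose directory external processes can drop files into, and  */
/* which load statements can then reference by app.db name.              */
create application 'Transfer';
create database 'Transfer'.'Data';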

For the sake of completeness, when we go to turn this into a job on the server, we’ll just use a simple batch file that will look like this:

cd /d %~dp0
essmsh RefreshOutlines.msh

The particular job scheduling software on this server does not run the job in the current folder of the job, therefore we use cd /d %~dp0 as a Windows batch trick to change folders (and drives if necessary) to the folder of the current file (that’s what %~dp0 expands out to).  Then we run the job (the folder containing essmsh is in our PATH so we can run this like any other command).

All Done

This was one of the trickier files to convert (although I have just shown a small section of the overall script).  Converting the other jobs is a little more straightforward (since this is the only one with dimension build stuff in it), but we’ll employ many of the same concepts with regard to the batch file and the general setup of the MaxL file.

How did we do with our goals?  Well, we converted the file to MaxL, so that was a good start.  We copied the databases over to the test server, which was pretty simple in this case.  Can we use the same scripts in test/dev and production?  Yes.  The server-specific configuration files handle any folder/username/password differences between the servers, and the automation doesn’t care (it just loads the settings from whatever file we tell it), so I’d say we addressed this just fine.  We used MaxL variables to clean things up and simplify, which was a pretty nice improvement over the original ESSCMD scripts.  And lastly, although I didn’t really delve into it here, this was all developed on a workstation (my laptop) and checked in to a Subversion repository; further, the automation runs just fine from a remote client.  If we ever need to move some folders around, change servers, or make some other sort of change, we can probably adapt and test pretty quickly.

All in all, I’d say it was a pretty good effort today.  Happy holidays, y’all.

Normalize a fact table for a better EIS model

Generally I don’t load data through EIS because, quite honestly, I fudged some of the EIS models when I first made them and now I’m pretty much married to them.  They are perfect for loading the outlines, but not data.  Earlier this week I was playing around with one of my newer models, trying to keep it “clean” as it were, and after some tweaking I got it set just right so that the default SQL (quite a gnarly little query, really), much to my amazement, loaded exactly the data it needed from the fact table.  The one major thing I have had to tweak is the format of the fact table.  Upstream stores three facts in the table: for each period/location/account, we store the value for the current year, the prior year, and the budget.  What I do is normalize the data with a SQL script that takes all of that and puts it into a different table with a year (YR) field and a scenario type.  This way, in the EIS model, I can link the modified fact table cleanly to TIME, YEARS, SCENARIO, LOCATION, and possibly DEPARTMENTS.  It works quite nicely!
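
For illustration, the normalization boils down to something like the following.  This is just a sketch: the table and column names (FACT_WIDE, CUR_YR_AMT, and so on) are made up, and I’m assuming the wide upstream table carries the fiscal year as a number:

-- Unpivot the three facts (current year, prior year, budget) into one row
-- per year/scenario combination.  All names here are hypothetical.
INSERT INTO FACT_NORM (PERIOD, LOCATION, ACCOUNT, YR, SCENARIO, AMOUNT)
SELECT PERIOD, LOCATION, ACCOUNT, FISCAL_YR,     'Actual', CUR_YR_AMT   FROM FACT_WIDE
UNION ALL
SELECT PERIOD, LOCATION, ACCOUNT, FISCAL_YR - 1, 'Actual', PRIOR_YR_AMT FROM FACT_WIDE
UNION ALL
SELECT PERIOD, LOCATION, ACCOUNT, FISCAL_YR,     'Budget', BUDGET_AMT   FROM FACT_WIDE;

With the data shaped that way, each column in the normalized table maps onto exactly one dimension in the model, which is what lets the default SQL pull exactly the data it needs.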