The %COMPUTERNAME%/MaxL trick for syncing test to production

There’s an automation trick I’ve been using for a while that I like.  Having talked about automation and such with other people at ODTUG this year, it seems that several people are using this technique or a variant of it on their own systems.

Basically, the idea is that you want to be able to sync your production automation scripts from your test server as easily as possible.  The number one cause of scripts not working after copying them from test to production is that they have some sort of hard-coded path, filename, server name, user id, password, or other value that simply doesn’t work on the server you are dropping your scripts onto.

Therefore, you want to try and write your automation scripts as generically as possible, and use variables to handle anything that is different between test and production.  As an added bonus for making the sync from test to prod just a little bit easier, why not dynamically choose the proper configuration file?

Assuming you are running on Windows (the same concept will work on other platforms with some tweaks for your local scripting environment), one way to handle it is like this: your main script is main.bat (or whatever).  One of the environment variables on a Windows server is the COMPUTERNAME variable.  Let’s say that your test server is essbase30 and your production server is essbase10.  On the production server, COMPUTERNAME is essbase10; on the test server, it is essbase30.

Knowing that we can set environment variables in a batch file that will be available in MaxL, we could set up a file called essbase30.bat that has all of our settings for when the script runs on that server.  For example, the contents of essbase30.bat might be this:

SET ESSUSR=admin
SET ESSPW=password
SET ESSSERVER=essbase30

From main.bat, we could then do this:

cd /d %~dp0
call %COMPUTERNAME%.bat
essmsh cleardb.msh

Assuming that the two batch files and the cleardb.msh are in the same folder, cleardb.msh could contain the following MaxL:

login $ESSUSR identified by $ESSPW on $ESSSERVER;
alter database Sample.Basic reset data;
logout;
exit;

Now for a little explanation.  Note that in essbase30.bat I am explicitly setting the name of the server.  We could assume that this is localhost or the COMPUTERNAME but why not set it here so that if we want to run the script against a remote server, we could do that as well (note that if we did run it remotely, we’d have to change the name of our batch file to match the name of the server running the script).  In general, more flexibility is a good thing (don’t go overboard though).  The first line of main.bat (the cd command) is simply a command to change the current directory to the directory containing the script.  This is handy if our script is launched from some other location — and using this technique we don’t have to hard-code a particular path.

Then we use a call command in the batch file to run the batch file named %COMPUTERNAME%.bat, where %COMPUTERNAME% will be replaced with the name of the computer running the automation, which is in this case essbase30.  The batch file will run, all of the SET commands inside it will associate values to those environment variables, and control flow will return to the calling batch file, which then calls essmsh to run cleardb.msh (note that essmsh should be in your current PATH for this to work).  The file cleardb.msh then runs and can “see” the environment variables that we set in essbase30.bat.

If we want to, we can set variables for folder names, login names, SQL connection names/passwords, application names, and database names.  Using this technique can make your MaxL scripts fairly portable and more easily reusable.

In order to get this to work on the production server, we could just create another batch file called essbase10.bat that has the same contents as essbase30.bat but with different user names and passwords or values that are necessary for that server.
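The same selection logic carries over to other scripting environments.  Here is a minimal sketch in Python (not part of my actual setup) that picks the settings file from the machine name and reads the SET lines out of it.  The function names are made up for illustration, and falling back to the host name when COMPUTERNAME is not defined is my assumption for non-Windows machines:

```python
import os
import socket


def current_computer_name():
    """Use COMPUTERNAME where it exists (Windows), else fall back to the host name."""
    return os.environ.get("COMPUTERNAME") or socket.gethostname()


def config_file_for(computer_name):
    """Return the per-server settings file name, e.g. 'essbase30.bat'."""
    return f"{computer_name}.bat"


def parse_set_lines(text):
    """Collect NAME=value pairs from 'SET NAME=value' lines; ignore anything else."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if line.upper().startswith("SET ") and "=" in line:
            name, value = line[4:].split("=", 1)
            settings[name.strip()] = value
    return settings


if __name__ == "__main__":
    # Prints the name of the settings file this machine would load.
    print(config_file_for(current_computer_name()))
```

The point is the same as in the batch version: nothing in the main script names a server, so the only file that differs between test and production is the one named after the machine.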

For all you advanced batch file scripters out there, it may be necessary to use a setlocal command so the variables in the batch file don’t stomp on something you need that is already an environment variable.  As you can see, I’m a big fan of the %COMPUTERNAME% technique.  However, there are a few things to watch out for:

  • You might be putting passwords in clear text in a batch file.  You can alleviate some of this by using MaxL encryption, although I haven’t actually done this myself.  The folder with these files on my servers already has filesystem-level security that prevents access, and for the time being, this has been deemed good enough.
  • It’s difficult with this technique to run automation for test and prod from the same server (say, some central scheduling server).  It’s not too difficult to address this, though.
  • If you run the automation from a remote server instead of the Essbase server itself, you may end up with lots of different config files — in which case, you might want to adjust this technique a bit.

As for deploying your test files to your production server, if you’ve set everything up in a generic way, then simply changing the variables in the configuration file should allow you to run the exact same script on each server.  Therefore, your deploy method could literally be as simple as copying the folder from the test server to the production server.  I’m actually using Subversion to version control the automation systems now, but a simple file copy works just as well.

Remote server automation with MaxL

Did you know that you don’t have to run your MaxL automation on the Essbase server itself?  Of course, there is nothing wrong with running your Essbase automation on the server: network delays are less of a concern, it’s one less server to worry about, and in many ways, it’s just simpler.  But perhaps you have a bunch of functionality you want to leave on a Windows server and have it run against your shiny new AIX server, or you just want all of the automation on one machine.  In either case, it’s not too difficult to set up; you just have to know what to look out for.

If you’re used to writing MaxL automation that runs on the server, there are a few things you need to look out for in order to make your automation more location-agnostic.  It is possible to specify the locations of rules, reports, and data files all using either a server-context or a client-context.  For example, your original automation may have referred to absolute file paths that are only valid if you are on the server.  If the automation is running on a different machine then it’s likely that those paths are no longer valid.  You can generally adjust the syntax to explicitly refer to files that are local versus files that are remote.

The following example is similar in content to an earlier example I showed dealing with converting an ESSCMD automation system to MaxL.  This particular piece of automation will also run just as happily on a client or workstation or remote server (that has the MaxL interpreter, essmsh, installed of course).  Keep in mind, however, that if we do run this script on our workstation, some entries refer to paths/files on the server (highlighted in red in the original post), and others refer to things that are relevant to the client executing the script (highlighted in green).  The walkthrough after the script points out which is which.  So, here is the script:

/* conf includes SET commands for the user, password, server,
   logpath, and errorpath */

msh "conf.msh";

/* Transfer.Data is a "dummy" application on the server that is useful
   to be able to address text files within a App dot Database context 

   Note that I have included the ../../ prefix because with version 7.1.x of
   Essbase even though prefixing the file name with a directory separator is
   supposed to indicate that the path is an app/database path, I can't get it
   to work, but using ../../ seems to work (even on a Windows server)

 */

set DATAFOLDER = "../../Transfer/Data";

login $ESSUSER identified by $ESSPW on $ESSSERVER;

/* different files for the spool and errors */

spool stdout on to "$LOGPATH/spool.stdout.PL.RefreshOutline.txt";
spool stderr on to "$LOGPATH/spool.stderr.PL.RefreshOutline.txt";

/* update P&L database 

   Note that we are using 3 different files to update the dimensions all at once
   and that suppress verification is on the first two. This is roughly analogous
   to the old BEGININCBUILD-style commands from EssCmd

*/

import database PL.PL dimensions

    from server text data_file "$DATAFOLDER/DeptAccounts.txt"
    using server rules_file 'DeptAcct' suppress verification,

    from server text data_file "$DATAFOLDER/DeptAccountAliases.txt"
    using server rules_file 'DeptActA' suppress verification,

    from server text data_file "$DATAFOLDER/DeptAccountsShared.txt"
    using server rules_file 'DeptShar'

    preserve all data
    on error write to "$ERRORPATH/dim.PL.txt";

/* clean up */

spool off;

logout;
exit;

This is a script that updates dimensions on a fictitious “PL” app/cube.  We are using simple dimension build load rules to update the dimensions.  Following line by line, you can see the first thing we do is run the “conf.msh” file.  This is merely a file with common configuration settings in it that are declared similarly to the following “set” line.  Next, we set our own helper variable called DATAFOLDER.  While not strictly necessary, I find that it makes the script more flexible and cleans things up visually.  Note that although it appears we are using a file path (“../../Transfer/Data”) this actually refers to a location on the server, specifically, it is the app/Transfer/Data path in our Hyperion folder (where Transfer is the name of an application and Data is the name of a database in that application).  This is a common trick we use in order to have both a file location as well as a way to refer to files in an Essbase app/db way.

Next, we log in to the Essbase server.  Again, this just refers to variables that are defined in the conf.msh file.  We set our output locations for the spool command.  Here is our first real difference when it comes to running the automation on the server versus running somewhere else.  These locations are relevant to the system executing the automation, not the Essbase server.

Now on to the import command.  Note that although we are using three different rules files and three different input files for those rules files, we can do all the work in one import command.  Also note that the spacing and spanning of the command over multiple lines makes it easier for us humans to read — and the MaxL interpreter doesn’t really care one way or another.  The first file we are loading in is DeptAccounts.txt, using the rules file DeptAcct.

In other words, here is the English translation of the command: “Having already logged in to Essbase server $ESSSERVER with the given credentials, update the dimensions in the database called PL (in the Application PL), using the rules file named DeptAcct (which is also located in the database PL), and use it to parse the data in the DeptAccounts.txt file (which is located in the Transfer/Data folder).  Also, suppress verification of the outline for the moment.”

The next two sections of the command do basically the same thing; however, we omit the “suppress verification” on the last one so that now the server will validate all the changes for the outline.  Lastly, we want to preserve all of the data currently in the cube, and send all rejected data (records that could not be used to update the dimensions) to the dim.PL.txt file (which is located on the machine executing this script, in the $ERRORPATH folder).

So, as you can see, it’s actually pretty simple to run automation on one system and have it take action on another.  Also, some careful usage of MaxL variables, spacing, and comments can make a world of difference in keeping things readable.  One of the things I really like about MaxL over ESSCMD is that you don’t need a magic decoder ring to understand what the script is trying to do — so help yourself and your colleagues out by putting that extra readability to good use.

MaxL Essbase automation patterns: moving data from one cube to another

A very common task for Essbase automation is to move data from one cube to another.  There are a number of reasons you may want or need to do this.  One, you may have a cube that has detailed data and another cube with higher level data, and you want to move the sums or other calculations from one to the other.  You may accept budget inputs in one cube but need to push them over to another cube.  You may need to move data from a “current year” cube to a “prior year” cube (a data export or cube copy may be more appropriate, but that’s another topic).  In any case, there are many reasons.

For the purposes of our discussion, the Source cube is the cube with the data already in it, and the Target cube is the cube that is to be loaded with data from the source cube.  There is a simple automation strategy at the heart of all these tasks:

  1. Calculate the source cube (if needed)
  2. Run a Report script on the source cube, outputting to a file
  3. Load the output from the report script to the target cube with a load rule
  4. Calculate the target cube

This can be done by hand, of course (through EAS), or you can do what the rest of us lazy cube monkeys do, and automate it.  First of all, let’s take a look at a hypothetical setup:

We will have an application/database called Source.Foo which represents our source cube.  It will have dimensions and members as follows:

  • Location: North, East, South, West
  • Time: January, February, …, November, December
  • Measures: Sales, LaborHours, LaborWages

As you can see, this is a very simple outline.  For the sake of simplicity I have not included any rollups, like having “Q1/1st Quarter” for January, February, and March.  For our purposes, the target cube, Target.Bar, has an outline as follows:

  • Scenario: Actual, Budget, Forecast
  • Time: January, February, …, November, December
  • Measures: Sales, LaborHours, LaborWages

These outlines are similar but different.  The target cube has a Scenario dimension with Actual, Budget, and Forecast (whereas in the source cube, since it is for budgeting only, everything is assumed to be Budget).  Also note that Target.Bar does not have a Location dimension; instead, this cube only concerns itself with totals for all regions.  Looking back at our original thoughts on automation, in order for us to move the data from Source.Foo to Target.Bar, we need to calculate the source cube (to roll up all of the data for the Locations), run a report script that will output the data how we need it for Target.Bar, use a load rule on Target.Bar to load the data, and then calculate Target.Bar.  Of course, business needs will affect the exact implementation of this operation, such as the timing, the calculation to use, and other complexities that may arise.  You may actually have two cubes that don’t have a lot in common (dimensionally speaking), in which case your load rule might need to really jump through some hoops.

We’ll keep this example really simple though.  We’ll also assume that the automation is being run from a Windows server, so we have a batch file to kick things off:

cd /d %~dp0
essmsh ExportAndLoadBudgetData.msh

I use the cd /d %~dp0 on some of my systems as a shortcut to switch the current directory to the folder containing the batch file, since the particular automation tool installed does not set the current working directory to that folder.  Then we invoke the MaxL shell (essmsh, which is in the PATH) and run ExportAndLoadBudgetData.msh.  I enjoy giving my automation files unnecessarily long filenames.  It makes me feel smarter.

As you may have seen from an earlier post, I like to modularize my MaxL scripts to hide/centralize configuration settings, but again, for the sake of simplicity, this example will forgo that.  Here is what ExportAndLoadBudgetData.msh could look like:

/* Copies data from the Budget cube (Source.Foo) to the Budget Scenario
   of Target.Bar */
/* your very standard login sequence here */
login AdminUser identified by AdminPw on EssbaseServer;
/* at this point you may want to turn spooling on (omitted here) */

/* disable connections to the application -- this is optional */
alter application Source disable connects;

/* PrepExp is a Calc script that lives in Source.Foo and for the purposes
   of this example, all it does is makes sure that the aggregations that are
   to be exported in the following report script are ready. This may not be
   necessary and it may be as simple as a CALC ALL; */

execute calculation Source.Foo.PrepExp;

/* Budget is the name of the report script that runs on Source.Foo and outputs a
   text file that is to be read by Target.Bar's LoadBud rules file */

export database Source.Foo
    using report_file 'Budget'
    to data_file 'foo.txt';

/* enable connections, if they were disabled above */
alter application Source enable connects;
/* again, technically this is optional but you'll probably want it */
alter application Target disable connects;

/* this may not be necessary but the purpose of the script is to clear out
   the budget data, under the assumption that we are completely reloading the
   data that is contained in the report script output */

execute calculation Target.Bar.ClearBud;

/* now we import the data from the foo.txt file created earlier. Errors
   (rejected records) will be sent to errors.txt */

import database Target.Bar data
    from data_file 'foo.txt'
    using rules_file 'LoadBud'
    on error write to 'errors.txt';

/* calculate the new data (may not be necessary depending on what the input
   format is, but in this example it's necessary) */

execute calculation Target.Bar.CalcAll;

/* enable connections if disabled earlier */
alter application Target enable connects;
/* boilerplate cleanup. Turn off spooling if turned on earlier */

logout;
exit;

At this point, if we don’t have them already, we would need to go design the aggregation calc script for Source.Foo (PrepExp.csc), the report script for Source.Foo (Budget.rep), the clearing calc script on Target.Bar (ClearBud.csc), the load rule on Target.Bar (LoadBud.rul), and the final rollup calc script (CalcAll.csc).  Some of these may be omitted if they are not necessary for the particular process (you may opt to use the default calc script, may not need some of the aggregations, etc.).

For our purposes we will just say that the PrepExp and CalcAll calc scripts are just a CALC ALL or the default calc.  You may want a “tighter” calc script, that is, you may want to design the calc script to run faster by way of helping Essbase understand what you need to calculate and in what order.

What does the report script look like?  We just need something to take the data in the cube and dump it to a raw text file.

<ROW ("Time", "Measures")

{ROWREPEAT}
{SUPHEADING}
{SUPMISSINGROWS}
{SUPZEROROWS}
{SUPCOMMAS}
{NOINDENTGEN}
{SUPFEED}
{DECIMAL 2}

<DIMBOTTOM "Time"
<DIMBOTTOM "Measures"
"Location"
!

Most of the commands here should be pretty self-explanatory.  If the syntax looks a little different than you’re used to, it’s probably because you can also jam all of the tokens in one line if you want, like {ROWREPEAT SUPHEADING}, but historically I’ve had them one to a line.  If there were more dimensions that we needed to represent, we’d put them on the <ROW line.  As per the DBAG, we know that the various tokens in between the {} braces format the data somehow: we don’t need headings, missing rows, or rows that are zero (although there are certainly cases where you might want to carry zeros over), we want no indentation, and numbers will have two decimal places (instead of some long scientific notation).  Also, I have opted to repeat row headings (just like you can repeat row headings in Excel) for the sake of simplicity.  As another optimization tip, this isn’t strictly necessary either; it just makes our lives easier in terms of viewing the text file and loading it to a SQL database or such.

As I mentioned earlier, we didn’t have rollups such as different quarters in our Time dimension.  That’s why we’re able to get away with using <DIMBOTTOM; if the outline did have rollups and we wanted just the Level 0 members (the months, in this case), we could use the appropriate report script command.  Lastly, from the Location dimension we are using the Location member itself, the parent of the different regions (whereas <DIMBOTTOM “Time” tells Essbase to give us all the members at the bottom of the Time dimension, simply specifying a member or members from the dimension will give us those members).  “Location” will not actually be written in the output of the report script because we don’t need it: the outline of Target.Bar does not have a Location dimension since it’s implied that it represents all locations.

The output of the report script will look similar to the following:

January Sales 234.53
January LaborHours 35.23
February Sales 532.35

From here it is a simple matter of designing the load rule to parse the text file.  In this case, the rule file is part of Target.Bar and is called LoadBud.  If we’ve designed the report script ahead of time and run it to get some output, we can then go design the load rule.  When the load rule is done, we should be able to run the script (and schedule it in our job scheduling software) to carry out the task in a consistent and automated manner.
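To make it concrete what the load rule has to deal with, here is a small Python sketch that reads rows shaped like the sample output and tags each one with a Scenario member, since Target.Bar has a Scenario dimension that the report output doesn’t carry.  This is only an illustration: how LoadBud actually supplies “Budget” isn’t shown in this post, the function name is made up, and the sketch assumes member names contain no spaces.

```python
SAMPLE_OUTPUT = """\
January Sales 234.53
January LaborHours 35.23
February Sales 532.35
"""


def parse_report_output(text, scenario="Budget"):
    """Split each report row into Time, Measures and value, and add the Scenario."""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        time_member, measure, value = line.split()
        records.append((scenario, time_member, measure, float(value)))
    return records


if __name__ == "__main__":
    for record in parse_report_output(SAMPLE_OUTPUT):
        print(record)
```

Each resulting record names one member from every dimension of Target.Bar plus a data value, which is what the load has to end up with one way or another.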

As an advanced topic, there are several performance considerations that can come into play here.  I already alluded to the fact that we may want to tighten up the calc scripts in order to make things faster.  In small cubes this may not be worth the effort (and often isn’t), but as we have more and more data, designing the calc properly (and basing it off of good dense/sparse choices) is critical.  Similarly, the performance of the report script is also subject to the dense/sparse settings, the order of the output, and other configuration settings in the app and database.  In general, what you are always trying to do (performance-wise) is to help the Essbase engine do its job better, and you do this by making the tasks you want to perform more conducive to the way that Essbase processes data.  In other words, the more closely you can align your data processing to the under-the-hood mechanisms of how Essbase stores and manipulates your data, the better off you’ll be.  Lastly, the load rule on the Target database, and the dense/sparse configurations of the Target database, will impact the data load performance.  You probably will not be able to optimize everything all at once.  It’s a balancing act, since a good setting for a report script may result in a suboptimal calculation process.  But don’t let this scare you: try to just get it to work first and then go in and understand where the bottlenecks may be.

As always, check the DBAG for more information; it has lots of good stuff in it.  And of course, try experimenting on your own.  It’s fun, and the harder you have to work for knowledge, the more likely you are to retain it.  Good luck out there!

A quick and dirty substitution variable updater

There are a lot of different ways to update your substitution variables.  You can tweak them with EAS by hand, or use one of several different methods to automate it.  Here is one method that I have been using that seems to hit a relative sweet spot in terms of flexibility, reusability, and effectiveness.

First of all, why substitution variables?  They come in handy because you can leave your Calc and Report scripts alone, and just change the substitution variable to the current day/week/month/year and fire off the job.  You can also use them in load rules.  You would do this if you only want to load in data for a particular year or period, or records that are newer than a certain date, or something similar.

The majority of my substitution variables seem to revolve around different time periods.  Sometimes the level of granularity is just one period or quarter (and the year of the current period, if in a separate Years dimension), and sometimes it’s deeper (daily, hourly, and so on).

Sure, we could change the variables ourselves, manually, but where’s the fun in that?  People that know me know that I have a tendency to automate anything I can, although I still try to have respect for what we have come to know as “keeping an appropriate level of human intervention” in the system.  That being said, I find that automating updates to timing variables is almost always a win.

Many organizations have a fiscal calendar that is quite different than a typical (“Gregorian”) calendar with the months January through December.  Not only can the fiscal calendar be quite different, it can have some weird quirks too.  For example, periods may have only four weeks one year but have five weeks in other years, and on top of that, there is some arcane logic used to calculate which is which (well, it’s not really arcane, it just seems that way).  The point is, though, that we don’t necessarily have the functionality on-hand that converts a calendar date into a fiscal calendar date.

One approach to this problem would be to simply create a data file (or table in a relational database, or even an Excel sheet) that maps a specific calendar date to its equivalent fiscal date counterparts.  This is kind of the “brute-force” approach, but it works, and it’s simple.  You just have to make sure that someone remembers to update the file from year to year.
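As a rough sketch of that brute-force approach, here is what the lookup could look like in Python.  The column names, the “P12” period label, and the sample rows are all hypothetical; a real table would come from whoever owns the fiscal calendar.

```python
import csv
import io
from datetime import date

# Hypothetical mapping rows: calendar date, fiscal year, fiscal period, day of week.
SAMPLE_CALENDAR = """\
calendar_date,fiscal_year,fiscal_period,fiscal_day
2008-12-21,2008,P12,1
2008-12-22,2008,P12,2
2008-12-23,2008,P12,3
"""


def load_fiscal_calendar(text):
    """Build a dict keyed by calendar date from CSV text."""
    table = {}
    for row in csv.DictReader(io.StringIO(text)):
        key = date.fromisoformat(row["calendar_date"])
        table[key] = (row["fiscal_year"], row["fiscal_period"], row["fiscal_day"])
    return table


def fiscal_date(table, day):
    """Look up a calendar date; fail loudly if nobody extended the table."""
    try:
        return table[day]
    except KeyError:
        raise LookupError(f"{day} is missing from the fiscal calendar table") from None


if __name__ == "__main__":
    calendar = load_fiscal_calendar(SAMPLE_CALENDAR)
    print(fiscal_date(calendar, date(2008, 12, 22)))
```

Raising an error for a missing date is deliberate: when someone forgets the yearly update, the job stops instead of quietly setting a stale period.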

For example, for the purposes of the date “December 22, 2008” in a cube with separate years, time, and weekday dimensions, I need to know three things: the fiscal year (probably 2008), the fiscal period (we’ll say Period 12 for the sake of simplicity), and the day of the week (day “2”).  Of course, this can be very different across different companies and organizations.  Monday might be the first day of the week or something.  If days are included in the Time dimension, we don’t really need a separate variable here.  So, the concepts are the same but the implementation will look different (as with everything in Essbase, right?).

I want something a bit “cleaner,” though.  And by cleaner, I mean that I want something algorithmic to convert one date to another, not just a look-up table.  Check with the Java folks in your company; if you’re lucky, they may already have a fiscal calendar class that does this for you.  Or it might be Visual Basic, or C++, or something else.  But, if someone else did the hard work already, then by all means, don’t reinvent the wheel.

Here is where the approaches to updating variables start to differ.  You could do the whole thing in Java, updating variables with the Java API.  You could have a fancy XML configuration file that is interpreted and tells the system what variables to create, where to put them, and so on.  In keeping with the KISS philosophy, though, I’m going to leave the business logic separate from the variable update mechanism.  Meaning this: in this case I will use just enough program code to generate the variables, then output them to a space-delimited file.  I will then have a separate process that reads the file and updates the Essbase server.  One of the other common approaches here would be to simply output MaxL or ESSCMD script itself, then run the file.  This works great too, however, I like having “vanilla” files that I can load in to other programs if needed (or, say, use in a SQL Server DTS/SSIS job).

At the end of the day, I’ve generated a text file with contents like this:

App1 Db1 CurrentYear 2008
App1 Db1 CurrentPeriod P10
App1 Db1 CurrentWeek Week4
App2 Db1 CurrentFoo Q1

Pretty simple, right?  Note that this simplified approach is only good for setting variables with a specific App/database.  It needs to be modified a little to set global substitution variables (but I’m sure you are enterprising enough to figure this out — check the tech ref for the appropriate MaxL command).
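To show how little is needed to get from this file to MaxL, here is a minimal Python sketch that turns each line into an “alter database … set variable” statement.  The function name is made up, the sketch assumes whitespace-delimited fields in the order shown above (App, Db, variable name, value), and treating lines that start with a semicolon as comments is my own convention here:

```python
def maxl_statements(conf_text):
    """Turn 'App Db Variable Value' lines into MaxL 'set variable' statements."""
    statements = []
    for line in conf_text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith(";"):
            continue  # skip blank lines and ;-style comment lines
        if len(fields) != 4:
            raise ValueError(f"expected 4 fields, got: {line!r}")
        app, db, name, value = fields
        statements.append(f"alter database {app}.{db} set variable {name} {value};")
    return statements


if __name__ == "__main__":
    sample = "App1 Db1 CurrentYear 2008\nApp1 Db1 CurrentPeriod P10\n"
    print("\n".join(maxl_statements(sample)))
```

Keeping the file “vanilla” and generating the statements in a separate step like this is what lets the same file feed other tools as well.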

At this point we could set up a MaxL script that takes variables on the command line and uses them in its commands to update the corresponding substitution variable, but there is also another way to do this: we can stuff the MaxL statement into our invocation of the MaxL shell itself.  In a Windows batch file, this whole process looks like this:

SET SERVER=essbaseserver
SET USER=essbaseuser
SET PW=essbasepw

REM generates subvar.conf file
REM this is your call to the Java/VB/C/whatever program that
REM updates the variable file
subvarprogram.exe

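REM this isn't strictly needed but it makes me feel better
REM (sleep is an add-on utility on some Windows versions; on newer
REM ones, timeout /t 2 is a built-in alternative)
sleep 2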

REM This is batch code to read subvar.conf's 4 fields (App, Db,
REM variable name, value) and pipe a MaxL statement for each line
REM into a MaxL session
REM NOTE: the FOR command must stay on ONE line

FOR /f "eol=; tokens=1,2,3,4 delims=, " %%i in (subvar.conf) do echo alter database %%i.%%j set variable %%k %%l; | essmsh -s %SERVER% -l %USER% %PW% -i

REM You would use the below statement (uncommented) the first time you
REM need to initialize the variables, but you will use the above
REM statement for updates to them (you can also just create the
REM variables in EAS)

REM FOR /f "eol=; tokens=1,2,3,4 delims=, " %%i in (subvar.conf) do echo alter database %%i.%%j add variable %%k; | essmsh -s %SERVER% -l %USER% %PW% -i

Always remember: there’s more than one way to do it.  And always be mindful of keeping things simple, but not too simple.  Happy holidays, y’all.

MaxL tricks and strategies on upgrading a legacy automation system from ESSCMD

The Old

In many companies, there is a lot of code lying around that is, for lack of a better word, “old.”  In the case of Essbase-related functionality, this may mean that there are some automation systems with several ESSCMD scripts lying around.  You could rewrite them in MaxL, but where’s the payoff?  There is nothing inherently bad about old code.  In fact, you can often argue a strong case to keep it: it tends to reflect many years of tweaks and refinements, is well understood, and generally “just works,” and even when it doesn’t, you have a pretty good idea where it tends to break.

Rewrite it?

That being said, there are some compelling reasons to do an upgrade.  The MaxL interpreter brings a lot to the table that I find incredibly useful.  The existing ESSCMD automation system in question (essentially a collection of batch files, all the ESSCMD scripts with the .aut extension, and some text files) is all hard-coded to particular paths.  Due to using absolute paths with UNC names, and for some other historical reasons, there only exists a production copy of the code (there was perhaps a test version at some point, but due to all of the hard-coded things, the deployment method consisted of doing a massive search and replace operation in a text editor).  Because the system is very mature, stable, and well-understood, it has essentially been “grandfathered” in as a production system (it’s kind of like a “black box” that just works).

The Existing System

The current system performs several different functions across its discrete job files.  There are jobs to update outlines, process new period data, perform a historical rebuild of all cubes (this is currently a six hour job and in the future I will show you how to get it down to a small fraction of its original time), and some glue jobs that scurry data between some different cubes and systems.  The databases in this system are set up such that there are about a dozen very similar cubes.  They are modeled on a series of financial pages, but due to differences in the way some of the pages work, it was decided years ago that the best way to model cubes on the pages was to split them up into different sets of cubes, rather than one giant cube.  This design decision has paid off in many ways.  For one, it keeps the cubes cleaner and more intuitive; interdimensional irrelevance is also kept to a minimum.  Strategic dense/sparse settings and other outline tricks like dynamic calcs in the Time dimension rollups also keep things pretty tight.

Additionally, since the databases are used during the closing period, not just after (for reporting purposes), new processes can go through pretty quickly and update the cubes to essentially keep them real-time with how the accounting allocations are being worked out.  Keeping the cubes small allows for a lot less down-time (although realistically speaking, even in the middle of a calc, read-access is still pretty reliable).

So, first things first.  Since there currently are no test copies of these “legacy” cubes, we need to get these set up on the test server.  This presents a somewhat ironic development step: using EAS to copy the apps from the production server to the development server.  These cubes are not spun up from EIS metaoutlines, and there is very little compelling business reason to convert them to EIS just for the sake of converting them, so this seems to be the most sensible approach.

Although the outlines are in sync right now between production and development because I just copied them, the purpose of one of the main ESSCMD jobs is to update the outlines on a period basis, so this seems like a good place to start.  The purpose of the outline update process is basically to sync the Measures dimension to the latest version of the company’s internal cross-reference.  The other dimensions are essentially static, and only need to be updated rarely (e.g., to add a new year member).  The cross-reference is like a master list of which accounts are on which pages and how they aggregate.

On a side note, the cross-reference is part of a larger internal accounting system.  What it lacks in flexibility, it probably more than makes up for with reliability and a solid ROI.  One of the most recognized benefits that Essbase brings to the table in this context is a completely new and useful way of analyzing existing data (not to mention write-back functionality for budgeting and forecasting) that didn’t exist.  Although Business Objects exists within the company too, it is not recognized as being nearly as useful to the internal financial customers as Essbase is.  I think part of this stems from the fact that BO seems to be pitched more to the IT crowd within the organization, and as such, serves mostly as a tool to let them distribute data in some fashion, and call it a day.  Essbase really shines, particularly because it is aligned with the Finance team, and it is customized (by finance team members) to function as a finance tool, versus just shuttling gobs of data from the mainframe to the user.

The cross-reference is parsed out in an Access database in order to massage the data into various text files that will serve as the basis of dimension build load rules for all the cubes.  I know, I know, I’m not a huge Access fan either, but again, the system has been around forever, It Just Works, and I see no compelling reason to upgrade this process to, say, SQL Server.  Because of how many cubes there are, different aliases, different rollups, and all sorts of fun stuff, there are dozens of text files that are used to sync up the outlines.  This has resulted in some pretty gnarly looking ESSCMD scripts.  They also use the BEGININCBUILD and ENDINCBUILD ESSCMD statements, which basically means that the cmd2mxl.exe converter is useless to us.  But no worries: we want to make some more improvements besides just doing a straight code conversion.

In a nutshell, the current automation script logs in (with a nice hard-coded server path, user name, and password), outputs to a fixed location, logs in to each database in sequence, and has a bunch of INCBUILDDIM statements.  ESSCMD, she’s an old girl, faithful, useful, but just not elegant.  You need a cheatsheet to figure out what the invocation parameters all mean.  I’ll spare you the agony of seeing what the old code looks like.

Goals

Here are my goals for the conversion:

  • Convert to MaxL. As I mentioned, MaxL brings a lot of nice things to the table that ESSCMD doesn’t provide, which will enable some of the other goals here.
  • Get databases up and running completely in test — remember: the code isn’t bad because it’s old or “legacy code,” it’s just “bad” because we can’t test it.
  • Be able to use same scripts in test as in production.  The ability to update the code in one place, test it, then reliably deploy it via a file-copy operation (as opposed to hand-editing the “production” version) is very useful (also made easier because of MaxL).
  • Strategically use variables to simplify the code and make it directory-agnostic.  This will allow us to easily adapt the code to new systems in the future (for example, if we want to consolidate to a different server, even one on a different operating system).
  • And as a tertiary goal: Start using a version control system to manage the automation system.  This topic warrants an article all to itself, which I fully intend to write in the future.  In the meantime, if you don’t currently use some type of VCS, all you need to know about the implications of this are that we will have a central repository of the automation code, which can be checked-in and checked-out.  In the future we’ll be able to look at the revision history of the code.  We can also use the repository to deploy code to the production server.  This means that I will be “checking-out” the code to my workstation to do development, and I’m also going to be running the code from my workstation with a local copy of the MaxL interpreter.  This development methodology is made possible in part because in this case, my workstation is Windows, and so are the Essbase servers.

For mostly historical reasons the old code has been operated and developed on the analytic server itself, and there are various aspects about the way the code has been developed that mean you can’t run it from a remote server.  As such, there are various semantic client/server inconsistencies in the current code (e.g. in some cases we are referring to a load rule by its App/DB context, and in some cases we are using an absolute file path).  Ensuring that the automation works from a remote workstation will mean that these inconsistencies are cleaned up, and if we choose to move the automation to a separate server in the future, it will be much easier.

First Steps

So, with all that out of the way, let’s dig in to this conversion!  For the time being we’ll just assume that the versioning system back-end is taken care of, and we’ll be putting all of our automation files in one folder.  The top of our new MaxL file (RefreshOutlines.msh) looks like this:

msh "conf.msh";
msh "$SERVERSETTINGS";

What is going on here?  We’re using some of MaxL’s features right away.  Since there will be overlap in many of these automation jobs, we’re going to put a bunch of common variables in one file.  These can be things like folder paths, app/database names, and other things.  One of those variables is the $SERVERSETTINGS variable.  This allows us to configure a variable within conf.msh that points to the server-specific MaxL configuration file.  This is one method that allows us to centralize certain passwords and folder paths (like where to put error files, where to put spool files, where to put dataload error files, and so on).  Configuring things this way gives us a lot of flexibility, and further, we only really need to change conf.msh in order to move things around, since everything else builds on top of the core settings.
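As a rough illustration of that layering (an analogy in Python, not the MaxL itself), the sketch below merges a set of common settings with a set of server-specific ones.  All of the names and values are made up for the example, and the rule that a server-specific value wins over a common one is my assumption about how you would want the two files to interact.

```python
# Hypothetical common settings shared by every job (the role of conf.msh)
common = {
    "LOGPATH": "../Logs",
    "ERRORPATH": "../Errors",
}

# Hypothetical server-specific settings (the role of the $SERVERSETTINGS file)
server_specific = {
    "ESSUSER": "admin",
    "ESSPW": "password",
    "ESSSERVER": "essbase30",
}


def effective_settings(common, server_specific):
    """Merge the two layers; a server-specific value wins over a common one."""
    merged = dict(common)
    merged.update(server_specific)
    return merged


settings = effective_settings(common, server_specific)
print(settings["ESSSERVER"], settings["LOGPATH"])
```

The point is only that a job reads one merged set of values and never needs to know which file a given value came from.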

Next we’ll set a process-specific configuration variable which is a folder path.  This allows us to define the source folder for all of the input files for the dimension build datafiles.

SET SRCPATH = "../../Transfer/Data";

Next, we’ll log in:

login $ESSUSER identified by $ESSPW on $ESSSERVER;

These variables are found in the $SERVERSETTINGS file.  Again, this file has the admin user and password in it.  If we needed more granularity (i.e., a special ID for just the databases in question, instead of running all automation as the god-user), we could put that in our conf.msh file.  As it is, there aren’t any issues on this server with using a master ID for the automation.

spool stdout on to "$LOGPATH/spool.stdout.RefreshOutlines.txt";
spool stderr on to "$LOGPATH/spool.stderr.RefreshOutlines.txt";

Now we use the spooling feature of MaxL to divert standard output and error output to two different places.  This is useful to split out because if the error output file has a size greater than zero, it’s a good indicator that we should take a look and see if something isn’t going as we intended.  Notice how we are using a configurable LOGPATH directory.  This is the “global” logpath, but if we wanted it somewhere else we could have just configured it that way in the “local” configuration file.
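The “size greater than zero” check is easy to automate after the job finishes.  Here is a minimal Python sketch of it; the function name is hypothetical, and the path you pass in would be wherever your $LOGPATH actually points.

```python
import os


def spool_has_errors(stderr_path):
    """Return True when the stderr spool file exists and is not empty."""
    return os.path.isfile(stderr_path) and os.path.getsize(stderr_path) > 0
```

A wrapper script could call this on spool.stderr.RefreshOutlines.txt and send a notification only when it returns True.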

Now we are ready for the actual “work” in this file.  With dimension builds, this is one of the areas where ESSCMD and MaxL do things a bit differently.  Rather than block everything out with begin/end build sections, we can jam all the dimension builds into one statement.  This particular example has been modified from the original in order to hide the real names and to simplify it a little, but the concept is the same.  The nice thing about just converting the automation system (and not trying to fix other things that aren’t broken — like moving to an RDBMS and EIS) is that we get to keep all the same source files and the same build rules.

import database Foo.Bar dimensions

    from server text data_file "$SRCPATH/tblAcctDeptsNon00.txt"
    using server rules_file 'DeptBld' suppress verification,

    from server text data_file "$SRCPATH/tblDept00Accts.txt"
    using server rules_file 'DeptBld'

    preserve all data
    on error write to "$ERRORPATH/JWJ.dim.Foo.Bar.txt";

In the actual implementation, the import database blocks go on for about another dozen databases.  Finally, we finish up the MaxL file with some pretty boilerplate stuff:

spool off;
logout;
exit;

Note that we are referring to the source text data file in the server context.  Although you are supposed to be able to use App/database naming for this, it seems that on 7.1.x, even if you start the filename with a file separator, it still just looks in the folder of the current database.  I have all of the data files in one place, so I was able to work around this by just changing the SRCPATH variable to go up two folders from the current database, then back down into the Transfer\Data folder.  The Transfer\Data folder is under the Essbase app\ folder.  It’s sort of a nexus folder where external processes can dump files because they have read/write access to the folder, but it’s also the name of a dummy Essbase application (Transfer) and database (Data) so we can refer to it and load data from it, from an Essbase-naming perspective.  It’s a pretty common Essbase trick.  We are also referring to the rules files from a server context.  The output files are to a local location.  This all means that we can run the automation from some remote computer (for testing/development purposes), and we can run it on the server itself.  It’s nice to sort of “program ahead” for options we may want to explore in the future.
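To see why going up two folders works, here is a small Python sketch of the path arithmetic.  The app/Foo/Bar layout is an assumption for illustration (the server looking in the folder of the current database, with databases living under app/).

```python
import posixpath

# Hypothetical layout: relative names are resolved from the current
# database folder, here app/Foo/Bar under the Essbase install.
db_folder = "app/Foo/Bar"
src_path = "../../Transfer/Data"

# Two levels up from app/Foo/Bar is app/, then back down into Transfer/Data
resolved = posixpath.normpath(posixpath.join(db_folder, src_path))
print(resolved)  # app/Transfer/Data
```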

For the sake of completeness, when we go to turn this into a job on the server, we’ll just use a simple batch file that will look like this:

cd /d %~dp0
essmsh RefreshOutlines.msh

The particular job scheduling software on this server does not run the job in the current folder of the job, therefore we use cd /d %~dp0 as a Windows batch trick to change folders (and drives if necessary) to the folder of the current file (that’s what %~dp0 expands out to).  Then we run the job (the folder containing essmsh is in our PATH so we can run this like any other command).
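If your scheduler launches a script rather than a batch file, the same idea carries over.  The following is a sketch in Python, assuming essmsh is on the PATH as described above; the function names are made up for the example.

```python
import os
import subprocess


def script_folder(script_path):
    """Return the absolute folder that contains script_path."""
    return os.path.dirname(os.path.abspath(script_path))


def run_job(script_path, msh_name="RefreshOutlines.msh"):
    """Run essmsh from the script's own folder, whatever the scheduler's cwd is."""
    # Assumes essmsh is on PATH; returns the interpreter's exit code.
    return subprocess.call(["essmsh", msh_name], cwd=script_folder(script_path))
```

Calling run_job(__file__) from a launcher script has the same effect as the cd /d %~dp0 line: the MaxL file and its relative paths are resolved from the folder the launcher lives in.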

All Done

This was one of the trickier files to convert (although I have just shown a small section of the overall script).  Converting the other jobs is a little more straightforward (since this is the only one with dimension build stuff in it), but we’ll employ many of the same concepts with regard to the batch file and the general setup of the MaxL file.

How did we do with our goals?  Well, we converted the file to MaxL, so that was a good start.  We copied the databases over to the test server, which was pretty simple in this case.  Can we use the same scripts in test/dev and production?  Yes.  The server-specific configuration files handle any folder/username/password differences between the servers, and the automation doesn’t care (it just loads the settings from whatever file we tell it), so I’d say we addressed this just fine.  We used MaxL variables to clean things up and simplify, which was a pretty nice cleanup over the original ESSCMD scripts.  And lastly, although I didn’t really delve into it here, this was all developed on a workstation (my laptop) and checked in to a Subversion repository.  Further, the automation all runs just fine from a remote client.  If we ever need to move some folders around, change servers, or make some other sort of change, we can probably adapt and test pretty quickly.

All in all, I’d say it was a pretty good effort today.  Happy holidays, y’all.

Automate that old cube archive process!

Your server may have dozens or even hundreds of cubes on it.  A common strategy with a large and slowly changing Measures dimension (or some other dimension like Product) is to spin off a copy of the cube after a certain time period, typically the fiscal year end.  There are a number of different reasons that you might do this.  First, the cube may simply focus on Current Year and Prior Year, or a fixed number of years and scenarios such that the cube becomes too unwieldy when you start adding more.  Second, if you need to be able to go back and pull a report so that it looks exactly how it did in a certain fiscal year, then you may need to spin off the cube.  Depending on how many cubes you end up spinning off for each fiscal year, it may be necessary to go and clean them up at some point, but you might still want to keep them around, just in case.  You can do this by hand by stopping the app, zipping up the app folder and all its contents, and deleting the app from within EAS.

Here is an example of a batch file you could use on Windows.  This relies on the free 7-Zip package being installed somewhere.  The nice thing about this approach is that while it uses MaxL, it doesn’t actually have any MaxL files; it just injects the MaxL command via the command-line.  Edit the variables for your setup, and you’re on your way.  It’s not pretty, but it’s nice if you have to go clean up a bunch of apps!  Happy cubing! -- Jason

@echo off

SET USER=adminuser
SET PW=adminuserpw
SET SERVER=essbaseserver
SET APPPATH=D:\Essbase\App
SET ZIP="7zp\App\7-Zip\7z.exe"

@echo.
@echo -------------------------------------------
@echo This is the cube archiver utility...
@echo.
@echo Looking for App %1 ...

REM Guard: with no argument, %APPPATH%\%1 would match the whole App folder
IF "%~1"=="" GOTO NoApp
IF NOT EXIST %APPPATH%\%1 GOTO NoApp

@echo.
@echo I found it at %APPPATH%\%1 ...
@echo.
@echo Attempting to stop the app...

REM essmsh -l %USER% -p %PW% -s %SERVER% StopCube.msh %1

echo alter system unload application %1; | essmsh -s %SERVER% -l %USER% %PW% -i

@echo Archiving the app ...

%ZIP% a -tzip EssApp_%1.zip %APPPATH%\%1

echo.

choice /M "Okay to delete app %1"

IF ERRORLEVEL 2 GOTO Done

echo alter application %1 enable startup; | essmsh -s %SERVER% -l %USER% %PW% -i
echo drop application %1 cascade force; | essmsh -s %SERVER% -l %USER% %PW% -i

GOTO Done

:NoApp

@echo I could not find that app at %APPPATH%\%1 !!!

:Done
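For anyone who would rather not depend on 7-Zip, the archive step on its own can be sketched with the Python standard library.  This covers only the zipping; stopping and dropping the app would still go through essmsh as in the batch file.  The function name and arguments are hypothetical.

```python
import os
import shutil


def archive_app_folder(app_root, app_name, dest_dir):
    """Zip app_root/app_name into dest_dir/EssApp_<app_name>.zip; return the zip path.

    Raises FileNotFoundError if no app name is given or the folder is missing.
    """
    app_dir = os.path.join(app_root, app_name)
    if not app_name or not os.path.isdir(app_dir):
        raise FileNotFoundError("No app folder at " + app_dir)
    base = os.path.join(os.path.abspath(dest_dir), "EssApp_" + app_name)
    # root_dir/base_dir keep the app folder name as the top level inside the zip
    return shutil.make_archive(base, "zip", root_dir=app_root, base_dir=app_name)
```

Refusing an empty app name matters here for the same reason it does in the batch file: an empty name would otherwise point at the whole App folder.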

Optimizing Essbase Automation Jobs (for fun and profit!) – Part 1

I freely submit that I am a complete geek.  But it’s okay, because I have come to accept and embrace that inner geek.  Anyway… Many organizations run jobs to perform a task at certain intervals.  Your job as a good Essbase/Hyperion administrator is to be lazy; that is, make the computers do the boring stuff, while you get to do the fun stuff, like thinking.  What we end up doing is setting up batch and script files that do all of the things in a particular sequence that we might do by hand.  So instead of logging in, copying a data file, clearing a cube, running a calc, loading some data, running another calc, and all that other stuff, we do this automatically.

After a while, you get a feel for how long a certain job takes.  The job that loads last week’s sales might take about an hour to run.  The job to completely restate the database for the entire fiscal year might take six hours to run.  This probably also becomes painfully obvious when it turns out that the job didn’t run correctly, and now you need to run it again (and keep those users happy).  I am all about jobs running as fast as they can.  This is useful on a few different levels.  First and probably foremost, it finishes faster, which is nice, especially for those times when the sooner you have data the better.  It’s also nice when you need to re-run things.  It can also open up some new possibilities to use the tool in ways you didn’t think or know you could — such as during the business week, getting live updates of data.

Typically there are a few big time-killers in an automated process.  These are the data loads, calculations, and reports that are run.  There are also some others, such as calling out to a SQL server to do some stuff, restructuring outlines, pulling tons of data off the WAN, and all that.  You can always go through your database logs to see how long a particular calculation took, but this is a bit tedious.  Besides, if you’re like me, then you want to look under EVERY single nook and cranny for places to save a little time.  To borrow a saying from the race car world, instead of finding one place to shave off 100 pounds, we can find 100 places to shave off one pound.  In practice, of course, some of these places you can’t cut anything from, but some of them you can cut down quite a bit.

The first, and probably most important step to take is to simply understand where you are spending your processing time.  This will give you a better window into where you spend time, rather than just a few things in the normal Essbase logs.  There are numerous ways you can do this.  But, if you know me at all by now, you know that I like to do things on the cheap, and make them dead-simple.  That being said, the following posts will be based around Windows batch file automation, but the concepts are easily portable to any other platform.

First of all, take a look at the below graph:

Total Seconds to Run Essbase Automation Job

What you’re looking at is a graph that has been created in Excel, based on data in a text file that was generated as a result of a profiling process.  The steps are as granular as we want them to be.  In other words, if we simply want to clump together a bunch of jobs that all run really fast, then we can do that.  Similarly, if we want to break down a single job that seems to take a relatively long amount of time, we can do that too.  Each segment in the stacked bar chart is based on the number of seconds that the particular step took.  We can see that in the original (pre-optimization) scenario, the orangey-brown step takes up a pretty considerable amount of the overall processing time.  After some tweaking, we were able to dramatically slash the amount of time that it took to run the whole job.  And, in the third bar, after yet more tweaking, we were able to slash the overall processing time quite a bit again.

The important thing here is that we are now starting to build an understanding of exactly where we are spending our time, and we can therefore prioritize where we want to attempt to reduce that time.  My next post will detail a sample mechanism for getting this kind of raw data into a file, and the next few posts after that will start to dig in to where and how time was shaved off due to better outline, system, and other settings and changes.  Happy profiling!
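In the meantime, here is one possible way to collect per-step timings into a text file.  This is a Python sketch of the general idea, not the batch-based mechanism promised for the next post; the function name and the tab-separated file layout are assumptions.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed_step(name, log_path):
    """Time the enclosed block and append 'name<TAB>seconds' to log_path."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Written even if the step fails, so a broken run still shows its timings
        elapsed = time.perf_counter() - start
        with open(log_path, "a") as log:
            log.write("%s\t%.1f\n" % (name, elapsed))


# Usage (hypothetical step; wrap each call to essmsh the same way):
#     with timed_step("Load data", "profile.txt"):
#         subprocess.call(["essmsh", "LoadData.msh"])
```

Each run appends one line per step, and the resulting file opens directly in Excel for the kind of stacked bar chart described above.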

A MaxL quickie to unload those databases that are eating your precious memory

Due to various business requirements, some organizations end up archiving many of their cubes each year.  For example, if you have a huge Measures dimension that is constantly changing (even in subtle ways), but you need to be able to go back at some point in the future and see what some numbers looked like at some particular point in time, you might find yourself spinning off a copy of the cube that sits on the server.  Most of the time it just sits there, dormant, not being used.  But every now and then someone comes along and spins it up so they can refresh some obscure report.  It’s got a decent sized outline to it, so the overhead just on having this cube running is probably in excess of 50 megs of memory, just to sit there!  If you aren’t rebooting your servers that frequently, that app and database are just going to sit there until someone comes along and stops it.  This might not be a problem for your current situation, but for this particular server, there are over two-hundred apps available at any given time — and RAM is a finite resource.

Now, we could just set up a job to unload everything, and indeed, that’s part of the solution.  But I like to keep the core apps hot so they don’t have to spin up when people (or I) log in.  So what to do?  Create a whitelist of core apps and databases and write a little MaxL to unload everything, then start just the things I want.

For the purposes of brevity, I will just assume that connect.msh has a valid login statement in it.  Then the code to unload the apps (unloadall.msh) is pretty straightforward (spool to some output file as needed…):

msh "connect.msh";
alter system unload application all;
logout;
exit;

Then we have a script file that starts up a specific app/db passed on the command line (startappdb.msh):

msh "connect.msh";
alter application $1 load database $2;
logout;
exit;

So, then we have a simple text file with the list of each app/database combination to fire up (whitelist.txt):

App1 Db1
App2 Db2

Then, in a simple Windows batch file we could do the following (which I imagine you could port pretty easily to whatever platform/scripting combination you have):

essmsh unloadall.msh
FOR /f "eol=; tokens=1,2 delims= " %%i in (whitelist.txt) do essmsh startappdb.msh %%i %%j

That’s it!  Dead simple, but effective.  The FOR in batch is basically broken out as follows: eol is the end-of-line or comment character (which we aren’t using in the data file in this instance), the first and second fields are broken down into %i and %j, and the delimiter between them is a space.  Then we call the script that will start it up (named startappdb.msh), passing along the App and Db.  There are many ways you could do this (passing commands on the command line itself) but this method to me is clean and simple.
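If you are porting this to another scripting environment, the whitelist handling might look something like the following Python sketch.  It mirrors the FOR options above (skip lines starting with a semicolon, take the first two fields), though it splits on any whitespace rather than only spaces; the function names are made up.

```python
def read_whitelist(lines):
    """Return (app, db) pairs from whitelist lines, skipping blanks and ';' comments."""
    pairs = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(";"):
            continue
        fields = line.split()
        if len(fields) >= 2:
            pairs.append((fields[0], fields[1]))
    return pairs


def start_commands(pairs):
    """Build one essmsh command (as an argument list) per app/db pair."""
    return [["essmsh", "startappdb.msh", app, db] for app, db in pairs]


commands = start_commands(read_whitelist(["App1 Db1", "; a comment", "App2 Db2"]))
print(commands)
```

Each argument list can be handed to subprocess.call after the unloadall.msh step has run.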

The Essbase Spreadsheet Automation Trick

This one is a bit of a blast from the past.  And I mean that — the timestamp on this file is over five years ago.  You may have heard people refer to this method.  If you’ve ever found yourself getting confused over what-job-runs-when-and-on-what, then this might be a technique that works for you.  Obviously, this one was written when ESSCMD was all the rage, but you could no doubt adapt it to whatever you want.

In the attached spreadsheet, there are rows and rows of different jobs — just ESSCMD scripts — that all serve a particular purpose (and in this case, there are two sets of databases: a “Weekly” database and a “Daily” database.  This is for historical performance reasons — everything is in one cube now).  The days of the week are in columns, with an X to denote if the job runs on that day.  Note that the jobs are numbered so you always run them smallest to largest.  There is a simple Excel formula that populates the corresponding columns to the right if that script has an X for that day.  The idea is that you can then copy that and paste it into your batch file and then call all those scripts in that day’s file.  It’s remarkably simple, and it works well.  It’s also self-documenting: any time you update the spreadsheet, and the subsequent batch file, you just print up the schedule again and you have up-to-date documentation.

This particular example shows its age a little (ESSCMD scripts, hard coded paths, and all that), but it shows the concept quite nicely.
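The Excel formula does the work in the spreadsheet, but the same lookup is easy to sketch in code.  In this Python version the job names and days are invented for illustration, and the zero-padded numeric prefix is what makes a plain sort run the jobs smallest to largest.

```python
# Hypothetical schedule: job script name -> days it runs on (the X marks)
schedule = {
    "030_LoadWeekly.scr": {"Mon"},
    "010_ClearDaily.scr": {"Mon", "Tue", "Wed", "Thu", "Fri"},
    "020_LoadDaily.scr": {"Mon", "Tue", "Wed", "Thu", "Fri"},
}


def batch_lines_for(day, schedule):
    """Return 'esscmd <script>' lines for the jobs marked on day, in prefix order."""
    jobs = sorted(name for name, days in schedule.items() if day in days)
    return ["esscmd " + name for name in jobs]


print("\n".join(batch_lines_for("Mon", schedule)))
```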

[Attached spreadsheet: ninja_schedule]

Automated EIS outline builds

Here’s a little quickie that I thought of and put in the other day. If you use the “export script” feature in EIS to script your outline builds, you get a script that is specific to a particular server. I like to keep my test machine automation totally synced up with my production machine automation, but if I needed different scripts for different servers, then I would have to have a different automation file. Well, here’s what I did:

Name the outline build script with the name of the server in it. I have my server name already set in an environment variable (it’s used all over the place for various automation and logging things). So if the .cbs was normally foo.cbs, and the name of the machine (prod) is bar01, and the test machine is bar02, and I reference the outline build command as foo_%SERVERNAME%.cbs, then whichever machine the script runs on, it’ll grab the right script. Kind of a small thing but I like it!
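The same name-by-server lookup, sketched in Python for anyone scripting outside of batch.  SERVERNAME is the environment variable from the text; the function name is hypothetical, and failing loudly when the variable is missing is my own choice.

```python
import os


def build_script_name(base, env=os.environ):
    """Return '<base>_<SERVERNAME>.cbs' using the SERVERNAME environment variable."""
    server = env.get("SERVERNAME")
    if not server:
        # Fail early rather than silently looking for 'foo_.cbs'
        raise KeyError("SERVERNAME is not set")
    return "%s_%s.cbs" % (base, server)


print(build_script_name("foo", {"SERVERNAME": "bar01"}))  # foo_bar01.cbs
```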