ODI journal table performance tweak with an index

The other day I provided a crash course on simple journalization in ODI with an Oracle database. I wanted to set the stage for this performance tweak that I put in a while ago on a few tables.

I mentioned that ODI stores journalized rows in a special table it creates that tracks the subscriber, processing status, and date of each row. The possible performance issue has to do with what ODI does during the processing of rows out of the table. If you have two different systems (subscribers) for a given table, when an interface runs, it is going to process data for just one subscriber at a time (usually). During processing ODI needs to mark these rows that it’s working with by changing the JRN_CONSUMED column to 1 for each matching subscriber.

In other words, if our interface processing data out of the CUSTOMERS table (by way of the journal table J$CUSTOMERS) is for subscriber ‘MARKETING’, then at some point in the ODI interface there will be a step like this:

UPDATE J$CUSTOMERS SET JRN_CONSUMED = '1' WHERE JRN_SUBSCRIBER = 'MARKETING'

If there are a lot of rows in the table, and particularly if there are a lot of rows in this table AND there are multiple subscribers, then the WHERE clause on this UPDATE statement can be a bottleneck. This is because the JRN_SUBSCRIBER field is not indexed, meaning that Oracle has to do a full table scan to find every row with a matching value. On a few interfaces I had at a client, this step was taking an inordinate amount of time. So I added an index on the JRN_SUBSCRIBER column (and eventually added one for the combination of JRN_SUBSCRIBER and JRN_CONSUMED), and performance increased dramatically, as Oracle was now able to avoid a full table scan.
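
As a sketch of what the change looks like (the index names here are just my own convention, and the journal table obviously depends on the table being journalized):

-- Single-column index on the subscriber column of the journal table
CREATE INDEX IDX_J_CUSTOMERS_SUB ON J$CUSTOMERS (JRN_SUBSCRIBER);

-- Composite index covering both columns ODI filters on during processing
CREATE INDEX IDX_J_CUSTOMERS_SUB_CON ON J$CUSTOMERS (JRN_SUBSCRIBER, JRN_CONSUMED);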

The first time I made this change was over a year ago and I haven’t seen any adverse effects or otherwise had any problems, and it dropped processing time for this one step from over 30 minutes to well under a minute.

If you have large, multi-subscriber journal tables being worked on by performance-critical interfaces, you might want to open up Operator and see how long they are taking and if a carefully placed index could benefit them.

 

ODI journalization crash course

I do seem to love me some ODI journalization. It has been a really great feature that I have implemented countless times to great effect. For those not familiar, journalization in ODI is where you tell ODI that you want to record what rows get updated and inserted in a particular table, such that when you load data out of that table, you can choose to work with just those updated rows.

This is useful for many obvious reasons: only processing data that’s new or changed is usually faster than processing a whole table. When you are dealing with millions and billions of rows, it’s a necessity. Journalization is also nice in cases where it can replace processing data based on a time window.

For example, I have reworked many systems where the data to be processed was sysdate - 1 or some other time period, and the automation runs daily. This can work. But what if something goes wrong and you don’t catch it for a day? Well, you go in and tweak things and run a one-off job to fix it. Or you have other coping mechanisms. What if data from a month ago is updated? Many times just processing anything that changed is the effective strategy.

It’s how journalization works under the hood, though, that is the focus of this post (for performance reasons). When you journalize a table (at least for the simple Oracle implementation), ODI installs a trigger on that table so that updates and inserts are recorded somewhere.

In this case, that somewhere is your work schema. If you were to journalize a table called CUSTOMERS and the primary key of customers was CUSTOMER_ID, then you would end up with a journal table called J$CUSTOMERS that has the following columns:

  • JRN_SUBSCRIBER
  • JRN_CONSUMED
  • JRN_FLAG
  • JRN_DATE
  • CUSTOMER_ID

The structure will always be the four JRN_ columns and the primary key (which can be multiple columns but is often just one) of the table being recorded. This simple table records everything you need to know about the new and updated rows.
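
To make the mechanics a little more concrete, here is a heavily simplified sketch of the kind of trigger involved. This is not ODI’s actual generated code (the real trigger also handles subscribers, updates to existing journal entries, and so on), and the trigger and schema names are just illustrative:

-- Heavily simplified illustration of a journalizing trigger; ODI generates
-- its own, more involved version of this for you
CREATE OR REPLACE TRIGGER T$CUSTOMERS
AFTER INSERT OR UPDATE ON CUSTOMERS
FOR EACH ROW
BEGIN
  INSERT INTO WORKSCHEMA.J$CUSTOMERS
    (JRN_SUBSCRIBER, JRN_CONSUMED, JRN_FLAG, JRN_DATE, CUSTOMER_ID)
  VALUES
    ('MARKETING', '0', 'I', SYSDATE, :NEW.CUSTOMER_ID);
END;
/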

Here’s a quick crash course in these columns:

JRN_SUBSCRIBER: You supply a text name indicating what the consumer of the data will be. For example, you may have two systems that independently want to process new and updated customers, so you might have two subscribers: MARKETING and CRM. Rather than creating two separate journal tables, we just have a column in here to differentiate the subscribers.

JRN_CONSUMED: This defaults to 0 and stays 0 until data is processed out of the journal. This column provides a really elegant solution to making sure you don’t lose data when processing journalized data. When an ODI interface runs to process data out of a journal, it sets the JRN_CONSUMED column to 1 for the rows it is about to process. This does a couple of things. One, any rows that are added to the journal after a job starts do not get lost or otherwise suffer from some timing issue; they will just get picked up on the next processing job. Two, if the ODI interface fails before it completes successfully and you need to re-run the job, you won’t have lost the journalized rows: they are only removed when the job completes successfully (it’s one of the last cleanup steps in the job).

JRN_FLAG: Indicates whether the row was inserted or updated (I or U), although in practice you will notice that the value is always ‘I’, indicating an inserted row (even if it was actually just an update). This is because later on in the job, ODI marks the rows that are actually updates with a U and then handles the processing accordingly.

JRN_DATE: The date (time) that the row was recorded into the journal.
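
As a quick illustration (sticking with the CUSTOMERS example and a hypothetical MARKETING subscriber), the rows waiting to be processed for one subscriber can be peeked at with a simple query against the journal table:

-- Journal entries not yet consumed by the MARKETING subscriber
SELECT CUSTOMER_ID, JRN_FLAG, JRN_DATE
FROM J$CUSTOMERS
WHERE JRN_SUBSCRIBER = 'MARKETING'
  AND JRN_CONSUMED = '0'
ORDER BY JRN_DATE;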

With all of this infrastructure in place (plus a couple of views that ODI puts in for you), journalization is ready to go. You design your interface as normal, and if you want to process just the new/updated rows, all you have to do is check a single checkbox on the Mapping tab (Use Journalized Data). ODI handles the rest for you.

This is a really killer feature of ODI and can clean up countless home-brew solutions that are fragile, complex, and code-intensive.

That being said, the original purpose of this post was to actually talk about a performance issue with ODI journals, but it was necessary to provide some context with an overview of journalization first. With that out of the way, check back tomorrow for a tweak you can make to your journal tables that might speed things up if you are processing large amounts of data!

ODI Oracle Merge Incremental Update burns sequence values

I use Oracle Data Integrator for quite a bit of ETL work. Most of it tends to be with Oracle databases, but also a fair bit with MySQL, SQL Server, and others. A common integration is to update data in a table based on a primary key, and to insert new rows if matching data doesn’t already exist.

It’s also common to insert a sequence value into a row that you populate in a table somewhere. Sequences in Oracle are a mechanism where you basically ask “Hey, give me the next number in something we’re counting!” and then plug that value in. They are commonly used to generate unique values for rows that get inserted somewhere.

Sequences aren’t random numbers – they start off at 1 (or whatever you configure them to) and then count up, and up, and up. Like up to the number 999999999999999999999999999 (that’s huge!). In fact, there are so many possible sequence values that if you happen to use a few by mistake, it’s not that big of a deal. Just use a new sequence value! It’s not uncommon to see gaps in the sequence values across a table’s rows – this just happens due to the way data is processed and how the Oracle server will return blocks of sequence values to be used in a given operation, then discard any that don’t get used (this might seem wasteful, but it’s not).
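
If you haven’t worked with sequences before, here’s the basic shape of one (the sequence name is just an example):

-- Create a simple sequence and pull a couple of values from it
CREATE SEQUENCE CUSTOMER_SEQ START WITH 1 INCREMENT BY 1;

SELECT CUSTOMER_SEQ.NEXTVAL FROM DUAL;  -- returns 1
SELECT CUSTOMER_SEQ.NEXTVAL FROM DUAL;  -- returns 2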

In any case, I was alerted to something interesting in a client’s ETL job a while ago: sequence values in a table being updated by ODI were starting to seem abnormally large. In other words, this table generally has about 100,000 rows in it, and they were used to the sequence values being less than a hundred million or so. All of a sudden the sequence values were trending much higher.

It turns out that the job in question uses the Oracle MERGE IKM to update rows. This is a strategy where existing rows (identified by a particular primary key) are updated and new rows are inserted. MERGE is kind of an elegant way of writing this without having to have separate UPDATE and INSERT steps. In fact, MERGE is thought of as an “upsert” due to being a combination of the two.

Under the hood, it seems that Oracle rewrites MERGE queries into something resembling two separate steps. And the way that Oracle rewrites the query is such that it burns, or otherwise needs to use, a sequence value for every row in the table, on every run. So if there are 100,000 rows in the table, about 100,000 sequence values are getting used on each run, even if only one row is being updated. Couple this with the fact that the job runs hourly and almost always processes data, and you have a recipe for sequence values to go up pretty fast.
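
To illustrate the shape of the problem (the table, column, and sequence names here are made up), a MERGE with a sequence in its insert branch looks roughly like this; per the behavior described above, Oracle can end up evaluating NEXTVAL for rows coming out of the USING clause even when they only ever hit the WHEN MATCHED branch:

MERGE INTO CUSTOMERS_TGT t
USING CUSTOMERS_STAGE s
  ON (t.CUSTOMER_ID = s.CUSTOMER_ID)
WHEN MATCHED THEN
  UPDATE SET t.NAME = s.NAME, t.EMAIL = s.EMAIL
WHEN NOT MATCHED THEN
  INSERT (t.ROW_SK, t.CUSTOMER_ID, t.NAME, t.EMAIL)
  VALUES (ROW_SK_SEQ.NEXTVAL, s.CUSTOMER_ID, s.NAME, s.EMAIL);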

Of course, we could literally burn one trillion sequence values on every run of the job, and schedule the job to run a trillion times a day, and still not have to worry about running out of sequence values. But it does seem kind of wasteful.

For this reason, you might want to use one of the other Oracle Incremental Update IKMs, such as the traditional INSERT/UPDATE.

Late note: I wrote all this up, and it seems that the IKM has since been modified to adjust the MERGE statement so that rows that already exist in the target aren’t even touched, reducing the number of rows in the MERGE data set and lowering the rate at which sequence values get burned. So I think the stock MERGE IKM is still fine to use with sequences in the target. It’s still true that MERGE will use sequence values on non-updated rows; it’s just that ODI tries to be smart and trim the update record set so it doesn’t include those rows in the first place.

Deleted journalized rows in ODI populating target with NULLs

This issue has been on my radar for a while and I finally had a chance to dig into it the other day. I noticed an occasional issue with an ODI job that journalizes a large source table and updates corresponding rows in a target on another system. The source and target are both Oracle databases, so I’m using the DBLINK LKM and, in this case, the MERGE IKM.

If you’re not familiar with journalized tables in ODI, it’s a really great feature that allows you to track rows that are inserted/updated on a table and then process only those rows when you run the interface. This capability is leveraged simply by way of a checkbox to tell the interface to only process the new rows (of course, setting up the journalization itself is a couple of extra steps but it’s also pretty straightforward). Journalization is incredibly useful in a variety of situations. The main one is that if the source is huge, you can’t or don’t want to process the entire table just to sync up with a target.

In addition to being able to track inserted and updated rows, with a simple option you can also synchronize deletions. This means that if a row gets deleted in the source, you have the option to delete the corresponding row (based on the primary key) in the target. I’d say about half of the interfaces I make with journalization have this option turned on and half have it turned off.

That all being said, I have a few interfaces at a client that have synchronize deletions turned on, but the sources very rarely get a row deleted. In fact, they almost never get a row deleted. These interfaces have been failing now and then, though, and the root cause turned out to be that some rows in the source were being manually deleted. The actual error as seen in ODI, however, is that an insert to the target table fails because the ODI job is trying to insert null values into columns that require non-null values. Curiously, these rows were not null in the source. As it turns out, ODI was getting null values because it was doing a left outer join against deleted records (generating records with just a primary key and all null fields).
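
To picture what was happening (heavily simplified, reusing the CUSTOMERS naming from earlier; in reality ODI joins through its journal views and flow table rather than the raw tables), the journalized keys get outer-joined back to the source, so a key whose source row has been deleted comes back with nothing but NULLs:

-- Simplified sketch: a deleted CUSTOMER_ID still has a journal entry,
-- but the outer join finds no source row, so NAME and EMAIL come back NULL
SELECT j.CUSTOMER_ID, s.NAME, s.EMAIL
FROM J$CUSTOMERS j
LEFT OUTER JOIN CUSTOMERS s
  ON s.CUSTOMER_ID = j.CUSTOMER_ID
WHERE j.JRN_SUBSCRIBER = 'TARGET_SYSTEM';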

But wait, shouldn’t these be deleted? Yes! I dug through the steps in the Operator, however, and there was no step that was deleting these rows from the flow table. So let’s take a look at the steps in the IKM:

ODI Oracle MERGE IKM steps

 

I see two critical steps with Order 140 and 150. Step 140 “Synchronize deletions from journal” is running and does exactly what we think it should do: delete records from the target that were deleted in the source. So far so good.


Based on analysis, though, this left some rows in the flow table (the temporary ODI staging table) that were just getting reinserted again. But there’s a step for handling that!


We have the “Remove deleted rows from flow table” step, which removes the deleted rows from the flow table so that they won’t be part of the join that creates the rows going to the target. But why isn’t that step running? Well, it turns out it’s set to only execute if the FLOW_CONTROL option is set.
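
For reference, this step conceptually boils down to a delete against the flow table. The real KM step is written with ODI substitution methods rather than hard-coded names, and the flow table and flag column here are assumptions based on common ODI conventions, not pulled from this particular KM:

-- Rough idea of "remove deleted rows from flow table": drop the
-- deletion-flagged rows so they don't get re-inserted into the target
-- (the exact flag column depends on the KM; JRN_FLAG is assumed here)
DELETE FROM I$_CUSTOMERS
WHERE JRN_FLAG = 'D';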

So, perhaps counterintuitively (for me, right now, at least…), synchronizing deletions is dependent on FLOW_CONTROL being turned on, and the issue was that Flow Control was turned off in this job. I don’t think this is a bug, but it caught me by surprise. I turned the option back on (the default is actually on, but it was off for whatever reason), regenerated the scenario, and the issue is solved for now. At the moment I’m of the mind that this step shouldn’t be tied to Flow Control, but I’m not going to question it for now. Hopefully this helps someone out down the road (but more likely I’ll Google it in a couple of years and help myself out, which, curiously, was the original motivation for this blog…)

Thoughts on using an ODI SKM to expose tables as a web service

It seems that I am trucking right along in my quest to explore every nook and cranny of what Oracle Data Integrator has to offer (in the words of the inimitable Cameron Lackpour, “Wait, you use ODI for real ETL stuff?”). People always talk about ODI’s knowledge modules (KMs) and are typically referring to the workhorse RKMs, LKMs, and IKMs. A more exotic and less talked about KM is the SKM – the Service Knowledge Module.

The idea behind an SKM is fairly straightforward. One of the fundamental building blocks in ODI is the data store – this is typically, but not always, a table in a relational database. We end up spending a good bit of our lives figuring out how to make data go from many source data stores to a target data store.

What’s this SKM business all about? It builds a web service with typical “CRUD” operations that you can deploy. This means we end up with a web service that clients could use to add, get, and search data in data stores we have.

There are several usage scenarios for this that I can think of. First, it might be a lot more tenable to deploy web services to expose data than to expose the database itself (think network security and such). That’s probably a slam dunk use case on its own right there. Second, you might want to access data from a language that doesn’t have support for your underlying data source technology. Third, it could just be a good architectural abstraction – making your clients oblivious to the exact underlying data source, giving you the flexibility to swap it out without affecting clients.

For the sake of argument, let’s say we have a dozen tables in a MySQL database and we’d like to expose a few of them as a web service. ODI Studio lets us pick which data stores to expose and which SKM to use. Then we just define a JAX-WS server and deploy.

In my case, I decided to deploy to a Wildfly server (aka JBoss AS). This is a JavaEE application server, like WebLogic – but not like Tomcat, which is technically just a servlet container and doesn’t contain all of the functionality that the ODI web services need (the EE part of JavaEE). In fact, I originally tried to deploy to Axis2 running under Tomcat 7, only to discover that Axis2 support has been deprecated/removed (for those not familiar, Axis2 is a project that runs in a servlet container and provides a base for exposing web services). So, first note: do not use Axis2, especially if you are playing around in ODI 11, because I can’t even get it to work in ODI 12.

Other than that, ODI dutifully generates a bread-and-butter WAR file that can be deployed to JBoss (again, the typical deployment server is probably WebLogic, but I’m a bit of a Red Hat guy, so…). The ODI-generated web service expects a JNDI data source to be configured. Again, for those of you not familiar, all JNDI does is make it so that the WAR (servlet/application/web service) doesn’t need to include its own particular database connection details. So rather than a particular web service being hardcoded to a certain database server and a particular username/password, it allows the application to say “Hey there, mister application container, can you give me the ‘Finance’ database connection? Thanks!” and go about its merry business. I ran into a small snag with the JNDI configuration that required me to modify the web.xml to add a <lookup-name> tag for my JNDI data source, but other than that, the service deployed without a hitch.

So now on to the service itself: you get a traditional SOAP-based web service, complete with a WSDL file for consumption by clients. For easy testing you can point an application like SoapUI at the WSDL file and generate methods that you can easily test with. You get methods to add, list, and filter. You can specify as much or as little data to filter on for a given entity (data store), so long as you include the primary key. Any data store exposed by the SKM as part of the web service must have a primary key – that’s the main catch. Not that it should be too burdensome, since if you aren’t designing tables with primary keys, you might have bigger problems when it comes to using ODI…

Thoughts so far:

  1. Don’t use Axis2 to deploy SKMs (doesn’t work/not supported anymore)
  2. Wildfly (JBoss AS) does work, but seems to need a web.xml tweak for JNDI
  3. You might not have a JavaEE (Wildfly, WebLogic) container setup already, this could be a hurdle
  4. Generating the web service is pretty slick
  5. You get an “old school” WSDL-based web service

I actually really like how slick this can be. If you already have an application server and a set of data stores you want to expose, then boom, you can be up and running with a web service to provide access to those pretty easily. I guess the bigger question is this: is this what you want to do? Many web services are much more semantic in nature – i.e., your clients or potential clients might want more cohesive data rather than having to reach out to this table, that table, and some other thing, then combine it together into something. You wouldn’t use this, among other reasons, to expose Twitter data, for example.

As an additional thought, these SKMs have been around for quite some time – ages in internet terms. As a fairly experienced Spring developer, I’ve watched amazing advances in things such as Spring Data, Spring Data REST, and other technologies that let you build web-based create/read/update/delete operations on simple domain objects with a more modern technology stack, such as JSON payloads and HTTP verbs like POST/PUT/DELETE. So personally I’d be more inclined to go in that direction to expose data than the SKM route.

But at the end of the day, if you have a set of tables you just need to expose over the web for whatever reason, this is a nice, low-investment way to accomplish that.

 

 

Flipping an ODI model to a different technology and kicking the interfaces

One of my recent ODI projects is a relatively complex transformation job. I am effectively building up a master/detail set of records from a single table. The single table isn’t really a single table in the source, it’s multiple tables. Within ODI I make several passes on it, dialing in the fields with interfaces and procedures. I opted to use the in-memory engine (MEMORY_ENGINE) because I thought the architecture would be a little cleaner, and the amounts of data being pushed through are not huge.

Everything was fine, until I hit a legitimate ODI bug. I actually found a relevant case in Oracle support for it: ODI-1228 “statement is not in batch mode”. There was even a patch! Unfortunately, the patch required a version of ODI higher than what I had available. So on a tight deadline my choices were to push through an ODI upgrade or to find some workaround.

I decided to see if I had a low-cost option of switching from using the memory engine to just using an Oracle schema as a stage (note that the package and interfaces themselves are all just moving data between various Oracle servers, nothing Hyperion related even). So I went into my model for the staging table that I was using, and just switched it from the In-memory Engine to Oracle (using the drop down). No complaints from ODI there.

Next I went into one of the interfaces that was previously set up with source/staging/target as its whole process. I went straight to the Flow tab but ran into some issues and fun little NullPointerException errors, which is always a good time. The thing is, I changed the technology on a model being used in various interfaces, but it’s not like any part of ODI went into those interfaces to say “Hey, this changed…” – in fact, when you change the technology of a model, ODI helpfully and plainly just says “Hey, this is likely to break stuff. Proceed at your own risk… THAR BE DRAGONS.” Or something like that.

Anyway, I found that I could sort of ‘kick’ the interface when I opened it by checking the “Staging Area Other Than Target” option, then turning it off again (it was off in most of my interfaces). This forced the interface to recalculate and reset the flow, which took into account the updated technology of the model. There might be a better way to do this than this “fuzzy” method, but it worked and I didn’t have to redo the plethora of interfaces in this package.

Hopefully this helps someone else out someday!

Pointing an ODI Procedure to different schemas on the same server to facilitate testing

One of the truly awesome and powerful features of Oracle Data Integrator is to develop interfaces against a logical schema, then let ODI choose the physical schema at run time. In other words, we have the same logical job that should work in development, quality assurance, and production environments – and this works without having to create and maintain multiple copies of the same job.

Some environments, however, are less than ideal and don’t have separate servers for testing (!). ODI really, really benefits from extensive testing in development, though. So what to do? Create a separate schema on the production server. We can then map the test context in ODI to this separate schema and more or less approximate having a true testing environment.

What if we need to write an ODI Procedure, though? Obviously our first choice is to create an Interface, but we need to drop down to procedures here and there for various reasons. Normally a procedure would work just fine across different physical servers because the code being executed, including any schema references, would be the same. This is not true if we have a faux test schema on our production server. We might have CUSTOMERS and our newly created CUSTOMERS_DEV, or something similar. If an ODI procedure wants to do an update on a table, for instance, it might look like this:

UPDATE CUSTOMERS.INFO SET fav_color = 'Red'

Our procedure is tied to that particular schema. Again, if we had the same schema name on different physical servers, this wouldn’t be an issue. But we don’t in this scenario. Let’s update the procedure to pull the proper schema name from the ODI context so that the procedure works in both our quasi-dev and production environments:

UPDATE <%=odiRef.getSchemaName()%>.INFO SET fav_color = 'Red'

Tada! In the development context, the schema CUSTOMERS_DEV will be used, and in production, CUSTOMERS will be used. Of course, the ideal solution is to have a full test and production environment, but in a pinch this will allow us to do testing on the same server and ensure that our ODI implementation is well-tested, correct, and robust.
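
In other words, the same procedure line renders differently depending on the context the scenario runs in (assuming the development physical schema is mapped to CUSTOMERS_DEV and production to CUSTOMERS, as above):

-- Rendered in the development context:
UPDATE CUSTOMERS_DEV.INFO SET fav_color = 'Red'

-- Rendered in the production context:
UPDATE CUSTOMERS.INFO SET fav_color = 'Red'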

Fixing an esoteric Oracle Incremental Merge Update Knowledge module bug with journalizing and an aggregate function

Editing knowledge modules is not for the faint of heart. KMs translate your interfaces and other fun business things into executable code. In short, they convert ODI’s mostly declarative “what” definition of a job into executable units by taking care of the “how.” There are numerous options available on interfaces, and these largely affect the code that is generated for your job.

I ran into a problem the other week where my integration into an Oracle schema by way of the Oracle Incremental Update (MERGE) IKM from a journalized data source using an aggregate function was not working.

Thankfully, someone else had run into this issue and Google turned up a quite helpful OTN post that solved 99% of the problem for me. But the post is from a little while ago and didn’t fully work for me.

Here’s the relevant fix code from the brave and helpful OTN poster:

<%=odiRef.getGrpBy()%>
<%if ((odiRef.getGrpBy().length() > 0) && (odiRef.getPop("HAS_JRN").equals("1"))) {%>
,JRN_FLAG,JRN_SUBSCRIBER,JRN_DATE
<%}%>

This doesn’t quite work, owing to a relatively new ODI feature: multiple data sets in the source of an interface. Multiple data sets are a really, really cool feature that was added a while back. They basically let you take the UNION of different data sources and integrate them into the same target. This is useful in contrast to needing two interfaces for the same target. You can also do intersections and minus out rows, but I haven’t quite had occasion to use that just yet.

As it pertains to us, though, there’s now a parameter on much of the ODI-generated code that is used to specify which of the data sets to perform an action on. Here’s the updated update (yes, that’s two “update”s in a row):

<%=odiRef.getGrpBy(i)%>
<% if ((odiRef.getGrpBy(i).length() > 0) && (odiRef.getPop("HAS_JRN").equals("1"))) {%>
,JRN_FLAG,JRN_SUBSCRIBER,JRN_DATE
<%}%>

Note the addition of the (i) in two places, which is how the data set is specified. Many thanks to the original poster for helping out in the first place and helping me fix my issue.

In case you missed it: Oracle Data Integrator success story webinar

A little while ago I presented a success story using Oracle Data Integrator. The webinar goes over ODI functionality at a high-level, giving a quick tour of the benefits of major functional areas and how they were used to greatly improve and build out the ETL and automation at a high profile client. If you are interested in using ODI in your organization or aren’t sure (but want to know more!) then I strongly encourage you to check it out. ODI is one of my specialization areas now where I have been doing a tremendous amount of work, so I look forward to becoming an expert in this technology and sharing this expertise with others.

Note that this webinar, while technical, is not a deep dive into implementation details. I am planning a follow-up webinar to very comprehensively look at using ODI as an ETL/automation solution and implementing it with best practices. That will be the webinar to look for if you are more technical and really want to see the ins and outs of how things work and can be architected for robustness, maintainability, and efficiency.

Using ODI error tables (and tables with dollar signs in their names) as models for transformations

I set up a solution a while ago where ODI facilitates data movement from an ODI error table into another data store. As you may know, ODI has a naming convention where its temporary and work tables are prefixed with some combination of a letter (or two), a dollar sign, and perhaps an underscore. So if my original table in a relational data store is CUSTOMERS, then the error table for this table would be E$_CUSTOMERS.

So of course I fire up my RKM to build out the model metadata for these new tables, just like I normally would. Everything goes fine but when I go to build the interface and run it, things don’t work so well. A dollar sign ($) is a very common character for scripts and programming languages to indicate that variable interpolation should occur (in other words, to fill in a placeholder). For example, if the variable $color = “orange” and you interpolate a string such as echo “My favorite color is $color!” then the result is “My favorite color is orange!” Nothing too fancy here.

That being said, sometimes a dollar sign can wreak havoc when used where some code doesn’t expect it. I’m not sure if it ever got fixed, but I developed at a place once that would dole out Active Directory IDs based on letters from the first and last name – three letters from each, to be precise. People with very short last names would have dollar signs appended to their IDs. There was a bug in Hyperion where you couldn’t log in with one of these user names, likely because it just never got tested (oops). I’ll assume that’s fixed now.

In any case, back to ODI. In a given ODI datastore, you have the name, the resource name, and the alias. The name is typically how you see it in the ODI design context. The resource name is the actual resource in the schema/database. The alias is used by ODI in interfaces as, well, an alias. I didn’t record the exact error message, but given that my interface was failing and the output in Operator was a bit odd, I immediately suspected the dollar sign was throwing things off, either in ODI’s generated code, on the target system (an Oracle database server), or both.

My quick solution? Just replace the dollar sign with an underscore in the alias in the datastore. You can delete the dollar sign entirely but I decided to just go with an underscore. By making the change in the datastore, subsequent interfaces leveraging this datastore will also benefit from the fix, as opposed to trying to adjust just the one interface. Of course, leave the resource name alone (and the name) since they don’t need to be changed (and in the case of the resource name, shouldn’t be changed).
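
As a rough illustration of the effect (the work schema and the generated statement here are made up; ODI’s actual generated code is more involved), the resource name keeps its dollar sign while the alias used in the SQL switches to an underscore:

-- Before: the alias mirrored the resource name, dollar sign and all
-- SELECT E$_CUSTOMERS.CUSTOMER_ID FROM WORKSCHEMA.E$_CUSTOMERS E$_CUSTOMERS
-- After: resource name unchanged, alias uses an underscore instead
SELECT E__CUSTOMERS.CUSTOMER_ID
FROM WORKSCHEMA.E$_CUSTOMERS E__CUSTOMERS;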

Another approach to all of this would be to change the physical topology’s naming convention so that error tables (or other entities that ODI names) don’t get a dollar sign, but I quite like ODI’s preferred naming strategy (for the most part).