OTN Appreciation Day: Error Hospital / Resiliency in SOA 12c

otn appreciation day

A few weeks ago Oracle ACE Director Tim Hall put out a call to all community members to show appreciation for the Oracle Technology Network (OTN) and make it a special day on which we thank OTN by talking about our favorite Oracle feature. Today is that day, OTN Appreciation Day: Tuesday the 11th of October 2016. Before I go into my favorite Oracle feature, I first want to thank OTN.

Why is OTN important to me?

When I started working with Oracle, well before cloud became mainstream, OTN was the main place to download most of my software. Besides great forums where I find answers to issues I encounter during my work, it also provides loads of outstanding articles, and I have had the opportunity to publish an article there myself. Beyond the website, OTN hosts and records videos in which community members are invited to talk about cool stuff.


But it doesn’t end there! In the past years, really since I started working for AMIS, I have become more active in the community: I started blogging and speaking at (international) conferences, and through the Oracle ACE Program I became an ACE Associate in 2015 and was promoted to ACE in 2016. My goal is to become an ACE Director in the future. I couldn’t have done this without the help of the OTN community. Finally, what makes my trips to foreign conferences special are the events and receptions that OTN sponsors, where you can meet other community members.

One of my favorite Oracle features: Error Hospital / Resiliency in SOA 12c


One of my favorite features is the Error Hospital in SOA Suite 12c. Errors themselves are not my favorite, and we should try to prevent errors within SOA applications, but sometimes we cannot control external factors. We can minimize the impact of these factors by understanding the Error Hospital and the resiliency features available in SOA 12c. Instead of explaining everything the Error Hospital consists of, I want to focus on two things: fault policies and the circuit breaker.

Fault Policies

Fault policies were first added in 11g so that you could easily intervene when a (SOAP/BPEL) fault was thrown. But you could only create them in source mode; there was no graphical editor. The SOA Suite 12c version of JDeveloper includes a visual editor for creating these fault policies. With the Fault Policy Editor you can now design and edit fault policies. Besides the existing functionality, a lot of new features have been added.

When you design a new policy, the editor opens with an empty policy document. A policy document can hold more than one policy, so faults can be grouped, e.g. one policy for system faults and one for service faults. For every type of fault you can create a fault handler, and for each handler you can select one or more actions. When adding more than one action, you can use an XPath expression as a filter to select between them. With the editor you can also create alerts, property sets and new (custom) actions.

fault policy window

For every type of fault you can create a fault handler. You can select a default action (termination unless you change it) from the list of available actions. If you have created your own action, you can select that one as well. Some actions may not be shown if they are not applicable to the fault type.

fault policy actions

You can add multiple actions to a fault handler and use contextual if/then/default logic to select the action to execute. You can also assign an alert to an action. With this you can send an alert through email or JMS, or write the alert to a log file.

fault policy multiple actions
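To make the if/then/default selection a bit more concrete, here is a minimal Python sketch of how I think about a policy document. It is only a mental model: the class names, the fault types and the action names (`retry`, `human-intervention`) are illustrative assumptions, a plain Python function stands in for the XPath filter, and none of this is Oracle's fault policy schema or a SOA Suite API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


# All names below are hypothetical; this models the concept only,
# not the fault policy file format used by SOA Suite.
@dataclass
class Condition:
    action: str
    # Stand-in for the XPath filter; None means "always matches".
    test: Optional[Callable[[dict], bool]] = None


@dataclass
class FaultHandler:
    fault_type: str
    conditions: List[Condition] = field(default_factory=list)
    default_action: str = "termination"

    def select_action(self, fault: dict) -> str:
        # if/then: the first condition whose filter matches wins.
        for condition in self.conditions:
            if condition.test is None or condition.test(fault):
                return condition.action
        # default: no filter matched.
        return self.default_action


@dataclass
class FaultPolicy:
    name: str
    handlers: Dict[str, FaultHandler] = field(default_factory=dict)

    def resolve(self, fault: dict) -> Optional[str]:
        handler = self.handlers.get(fault["type"])
        # None means this policy has no handler for the fault type.
        return handler.select_action(fault) if handler else None


# Example: one policy for system faults with a single handler.
system_policy = FaultPolicy(
    name="SystemFaults",
    handlers={
        "remoteFault": FaultHandler(
            fault_type="remoteFault",
            conditions=[
                Condition("retry", lambda f: f.get("attempts", 0) < 3),
                Condition("human-intervention", lambda f: f.get("critical", False)),
            ],
        )
    },
)

print(system_policy.resolve({"type": "remoteFault", "attempts": 1}))
```

The point of the sketch is the ordering: conditions are checked top to bottom, and the default action only applies when none of the filters match.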

With the editor you can also create and edit alerts, property sets and new (custom) actions per policy. Alerts can be used to send a message to an email address or a JMS queue, or to write the message to a log. With property sets you can create constants that can be used to configure alerts and actions. Actions control what to do when a fault occurs.


When you create a new fault policies document, a set of default actions is created. These default actions show the action types that are supported and can be altered to your own liking. You can also create your own actions and delete unused ones.

fault policy actions

There is so much more to talk about, but then I would go into too much detail, and that was not really the point of this blog. You can find a lot of information in this blog post.

Circuit Breaker

A problem we have all experienced is that downstream services become unavailable, causing instances to fail and fill up the Error Hospital. Manual recovery is sometimes difficult and time-consuming, and these failing instances consume resources unnecessarily.

Because of these failing instances, the operational cost of recovering instances in the Error Hospital increases. The system may also become unstable because of the errors on business-critical instances.

In the 12.2.1 release Oracle introduced the Circuit Breaker, a feature that has saved me a few times. It automatically suspends upstream inbound services, and the messages are added to queues on disk for later processing. The inbound services automatically resume when the downstream service endpoint is up again.

The circuit breaker monitors downstream system failures, and after x failures within y minutes it suspends every upstream service/adapter from which the failed messages originated. With adapters (e.g. file, JMS, AQ) the messages are not lost; they are simply not processed until the downstream system comes back up. With web services the requests are rejected, and it is up to the client program to handle these failures.

circuit breaker

You can find this option under soa-infra -> SOA Administration -> Resiliency Configuration. The feature is part of the Oracle Integration Continuous Availability add-on, and you turn it on by ticking the checkbox. You can configure the failure rate and the retry interval. After the retry interval, the downstream service is automatically checked for availability. It is also possible to send out a notification when the Circuit Breaker is activated.

The failure rate can also be configured for a specific service if it should differ from the global setting. This excellent blog describes all the details, so check that out.
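If you want to see the mechanics without a SOA domain at hand, the behaviour described above can be sketched in a few lines of Python. This is a toy model under my own assumptions, not how SOA Suite implements it: the class, its parameters (`max_failures`, `window_seconds`, `retry_interval_seconds`) and the `parked` list are hypothetical names, and time is passed in explicitly to keep the example deterministic.

```python
from collections import deque


class CircuitBreaker:
    """Toy model: suspend upstream after max_failures within window_seconds."""

    def __init__(self, max_failures, window_seconds, retry_interval_seconds):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self.retry_interval_seconds = retry_interval_seconds
        self.failures = deque()    # timestamps of recent downstream failures
        self.suspended_at = None   # None means the upstream service is running
        self.parked = []           # adapter messages waiting for the resume

    @property
    def suspended(self):
        return self.suspended_at is not None

    def record_failure(self, now):
        self.failures.append(now)
        # Forget failures that fall outside the monitoring window.
        while self.failures and now - self.failures[0] > self.window_seconds:
            self.failures.popleft()
        if not self.suspended and len(self.failures) >= self.max_failures:
            self.suspended_at = now

    def submit(self, message, source):
        """Return 'processed', 'parked' or 'rejected'."""
        if not self.suspended:
            return "processed"
        if source == "adapter":
            # Adapter messages are not lost, only held back.
            self.parked.append(message)
            return "parked"
        # Web service callers have to handle the failure themselves.
        return "rejected"

    def probe(self, now, downstream_is_up):
        """After the retry interval, check downstream and resume if it is up."""
        if not self.suspended or now - self.suspended_at < self.retry_interval_seconds:
            return []
        if not downstream_is_up():
            self.suspended_at = now  # wait for another retry interval
            return []
        self.suspended_at = None
        self.failures.clear()
        released, self.parked = self.parked, []
        return released


breaker = CircuitBreaker(max_failures=3, window_seconds=60, retry_interval_seconds=30)
for t in (100, 110, 120):
    breaker.record_failure(t)

adapter_result = breaker.submit("order-1", "adapter")
ws_result = breaker.submit("order-2", "web-service")
released = breaker.probe(now=150, downstream_is_up=lambda: True)
print(adapter_result, ws_result, released)
```

In this model a per-service setting would simply be a second `CircuitBreaker` instance with its own thresholds, which is roughly how I picture the service-level override of the global failure rate.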

So thanks again to the folks at OTN, like Bob Rhubart, for the great services! I hope many more people will use this OTN Appreciation Day to talk about their favorite Oracle feature.

Robert is a respected author, a speaker at (international) conferences and a frequent blogger on the AMIS Technology blog and the Oracle Technology Network, and he participates in OTN ArchBeat Podcasts. Robert is a member of the board of the Dutch Oracle User Group (nlOUG) and also organizes SIG meetups. He also works closely with the SOA Oracle Product Management team by participating in several Beta programs. In 2017, Robert was named Oracle Developer Champion, and he also holds the Oracle ACE title for SOA and Middleware because of his contributions to the community. He is co-author of the first Oracle PaaS book, which was published in January 2017. His fascination with the latest technology has led him to research Blockchain as a replacement for the currently used B2B patterns and tooling.

