Thursday, July 14, 2011

Large number of instances in the SOA 11g dehydration store causes EM console performance issues


Applies To:

SOA Suite 11g R1 (11.1.1.3, 11.1.1.4, 11.1.1.5)

Scenario and Symptoms:

The BPEL Engine Audit level is set to Development.

The BPEL engine is load tested with a very large number of transactions, generating a large number of instances in the dehydration store (SOAINFRA schema).

The number of instances is, say, 10-50 lakh (1-5 million). The developers complain that the EM console is very slow: drilling into composites and instances takes minutes. The following activities in the EM console are slow:

1. EM login page load time
2. Time taken to log in to EM
3. Time taken to render the home page for SOAINFRA
4. Expanding SOAINFRA and each partition within it
5. Time taken to render the home page for each deployed composite
6. Time taken to render details for each instance of any deployed composite

Cause:

A large number of instances in the dehydration store with the audit level set to Development. When you log in to the EM console, it tries to load large amounts of instance and fault data from the database, which slows down the EM console response time.


Solution:

Improving the Loading of Pages in Oracle Enterprise Manager Fusion Middleware Control Console

You can improve the loading of pages that display large amounts of instance and fault data in Oracle Enterprise Manager Fusion Middleware Control Console by setting two properties in the Display Data Counts section of the SOA Infrastructure Common Properties page.

These two properties enable you to perform the following:
  • Disable the fetching of instance and fault count data to improve loading times for the following pages:
    • Dashboard pages of the SOA Infrastructure, SOA composite applications, service engines, and service components
    • Delete with Options: Instances dialog

    These settings disable the loading of all metrics information upon page load. For example, on the Dashboard page for the SOA Infrastructure, the values that typically appear in the Running and Total fields in the Recent Composite Instances section and the Instances column of the Deployed Composites section are replaced with links. When these values are large, it can take time to load this page and other pages with similar information.

  • Specify a default time period that is used as part of the search criteria for retrieving recent instances and faults for display on the following pages:

    • Dashboard pages and Instances pages of the SOA Infrastructure, SOA composite applications, service engines, and service components
    • Dashboard pages of services and references
    • Faults and Rejected Messages pages of the SOA Infrastructure, SOA composite applications, services, and references
    • Faults pages of service engines and service components
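To see why a default time period helps, here is a minimal sketch in Python. It does not touch SOA Suite or the SOAINFRA schema at all: the `instances` table, its columns, and the helper names are hypothetical stand-ins, and SQLite is used only so the example runs on its own. The point is simply that a time-window predicate makes the page fetch a small slice of rows instead of everything in the store.

```python
import sqlite3
from datetime import datetime, timedelta


def build_demo_store(now, total=1000):
    """Create an in-memory table standing in for an instance store (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE instances (id INTEGER PRIMARY KEY, created_at TEXT NOT NULL)"
    )
    # One instance per hour, going back in time from `now`.
    rows = [(i, (now - timedelta(hours=i)).isoformat()) for i in range(total)]
    conn.executemany("INSERT INTO instances VALUES (?, ?)", rows)
    return conn


def recent_instances(conn, now, window_hours):
    """Return only the instances created within the last `window_hours` hours."""
    cutoff = (now - timedelta(hours=window_hours)).isoformat()
    return conn.execute(
        "SELECT id FROM instances WHERE created_at >= ? ORDER BY created_at DESC",
        (cutoff,),
    ).fetchall()


now = datetime(2011, 7, 14, 12, 0, 0)
conn = build_demo_store(now)
all_rows = conn.execute("SELECT COUNT(*) FROM instances").fetchone()[0]
recent = recent_instances(conn, now, window_hours=24)
print(f"rows in store: {all_rows}, rows fetched for a 24-hour window: {len(recent)}")
```

The smaller the window, the less data each page has to pull back, which is the same trade-off you make when choosing the default time period in the console.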

Other Suggestions/Best Practices:

1. Purge the instances: the moment we purge the instances, we see good performance. The flip side is that you cannot purge data regularly in production (to meet SLAs). My take is that on a development server you can afford to purge instances frequently. In Stage/Prod the audit level would be set to Production, hence performance issues due to a large number of instances would not be seen. After you purge the dehydration store, make sure you shrink the SOAINFRA tables along with their indexes (or rebuild the indexes).

2. Set the audit level to Production/Off: the flip side is that developers won't be able to troubleshoot issues with their composites. Go ahead with these settings in Production.

3. Another strategy is switching the audit configuration to 'Deferred', which allows the auditing operations to be performed asynchronously, resulting in performance comparable to setting the audit level to disabled. Please refer to Tuning BPEL audit performance [ID 1328382.1]. This is recommended in production and can also be applied to Dev/Stage environments.

4. As the number of records in the dehydration store grows, the EM console takes longer to return information about instances. I believe this is a result of badly performing queries and full table scans. Generate an AWR report and see if you can tune some queries and build some indexes on the tables. Get this done by the DBAs.

5. Use a fast single-threaded server (e.g. M5000/9000) for the database instead of slow CMT servers like the Sun T5140/5240. Please refer to Migration from fast single threaded CPU machine to CMT UltraSPARC T1 and T2 results in increased CPU reporting and diminished performance [ID 781763.1].

6. Try using M-series boxes for the application tier as well. But if you have to use CMT servers like the T5240, make sure you follow note 860459.1 and apply the steps in its solution section to adapt all the components to the CMT machine.
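For suggestion 4, the idea of "find the full table scan, then add an index" can be tried out on a scratch database before involving the DBAs. The sketch below is only an illustration under stated assumptions: it uses SQLite's `EXPLAIN QUERY PLAN` rather than Oracle's AWR or execution plans, and the `instances` table, its columns, and the index name are hypothetical, not the real SOAINFRA objects.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the plan detail strings SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row is the human-readable detail text.
    return [row[-1] for row in rows]


conn = sqlite3.connect(":memory:")
# Hypothetical table standing in for an instance store.
conn.execute(
    "CREATE TABLE instances ("
    "id INTEGER PRIMARY KEY, composite_name TEXT, state TEXT)"
)
conn.executemany(
    "INSERT INTO instances (composite_name, state) VALUES (?, ?)",
    [(f"composite_{i % 50}", "completed") for i in range(5000)],
)

sql = "SELECT id, state FROM instances WHERE composite_name = ?"

# Without an index, the filter has to scan the whole table.
plan_before = query_plan(conn, sql, ("composite_7",))

# After indexing the filtered column, the planner can search the index instead.
conn.execute("CREATE INDEX idx_instances_composite ON instances (composite_name)")
plan_after = query_plan(conn, sql, ("composite_7",))

print("before:", plan_before)
print("after: ", plan_after)
```

On a real SOAINFRA schema the equivalent check belongs to the DBAs, since extra indexes also add cost to every insert the BPEL engine makes.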

Let me know if the above article helps.

Soumya...

16 comments:

  1. Thanks for your insightful comments, Siddharth.
    Quick questions: Are the recommendations on performance based on Linux, AIX or just Solaris servers? What are the guidelines on partitioning the process instance data in an Oracle 11g database (R1/R2)? Long-running process data including attachments (e.g. documents) may need to be maintained in the SOA server / process engine till they expire or reach an end state because of an SLA. Basically, can the SOA Suite retrieve / hydrate process data from a clustered and partitioned Oracle database? What are the other architectural options to optimize?

  2. Hi Siddharth,

    Thanks for the wonderful article. I am facing one issue. My SOA server audit level is set to Production. I cannot see any errored-out instances in the EM console.

    Is this because of Production Audit trail? If not what could be the issue...

    Any help highly appreciated.

    Regards,
    Sudheer

  3. A nice post! It works for me! I also notice a large number of JVM threads running, and I think this number of threads impacts performance. What can I do about it?

  4. When we set the audit level to Off in production, it stopped some of our composites. We were seeing "null" for mediators. Learned it the hard way!
