Unissant

 
image_pdfimage_print

Big Data Benchmark Testing Using Oracle Engineered Systems

Outcome Highlights
66 %

Average Query Run Performance Improvement

Exalytics

76 %

Exalytics Dashboard Query Improvement

Exadata Flash Cache

94 %

Exalytics Dashboard Query Improvement

Exadata Infiband

Case Study Details

The Challenge

Unissant has a very strong heritage in Business Intelligence (BI).  Recently,  we decided to push the edges of the analytical envelope with one of the most innovative BI solutions in the marketplace, Oracle Exalytics. Our Unissant engineers leveraged the Oracle Solution Center (OSC) to conduct benchmark testing of the new Exalytics Engineered System.   Specifically, our engineering team gauged “real life” business reporting scenarios, one using commodity servers, and the other using the high performance Oracle engineered systems. The tests included commodity Oracle Business Intelligence Enterprise Edition (OBIEE) servers vs. Exalytics, commodity Oracle 11g database vs. Oracle Exadata, TimesTen, and BI Cache tests. Our test data set was based off mortgage industry data. To push the boundaries of a “real life” scenario we ran performance tests against a data set with 48 billion records and “73” attributes, totaling “3.6” terabytes in size. We conducted two performance tests to include processing pre-defined dashboards and “real life” ad hoc reporting scenarios. The data was parsed into multiple sets, one with 2.5 billion data records and a second set containing 48 billion records.  We were very interested to see the results and anxious to compare performance between the different environments.

 

Exalytics Overview

 

Oracle Exalytics is an engineered system that is comprised of multiple high performance components that work in harmony together providing maximum analytical performance and significant scalability. The impressive list of components include 40 processing cores, 1 terabyte of RAM, and Infiniband networking delivering 40 gigabytes per second networking throughput.  The software packages include Oracle Business Intelligence Enterprise Edition (OBIEE), R for statistical computing, Oracle Essbase, and the Times Ten in-memory database. In addition, Oracle has created special features for Exalytics such as the “Summary Advisor” and the “Times Ten Aggregate Store” used in delivering lightning fast query speed and unparalleled flexibility for business analytics.

The key to providing faster analytics in any BI environment is in aggregate tables and cache mechanisms, but you cannot always justify the time it takes to build and maintain aggregate tables or the cost of caching data as they don’t always provide a performance ROI. To aid in performance ROI, Exalytics provides the “Summary Advisor” which analyzes all object combinations of frequently queried data and makes recommendations for aggregates based on hierarchy combinations and data compression ratios. The Summary Advisor also generates a script to be run in the TimesTen in-memory database to create the aggregates, as well as, defines the metadata in the OBIEE repository file to utilize these aggregates. This automated process can significantly reduce the aggregate build and maintain cycle, as well as, eliminate the need for DBA assistance. The Exalytics BI Administrator can run the scripts recommended from the Summary Advisor in the TimesTen in-memory database residing on the Exalytics box. Unissant recommends that organizations automate a process for running the Summary Advisor on a frequent basis because the advisor leverages BI Usage Tracking information and will capture all the query usage patterns, thus recommending performance enhancement. To completely automate the process aggregates can be created after the advisor provides the recommendations, building an automated self-tuning environment.

Unissant recognizes that our customers’ demand for more analytics is growing, and the need to data mine large, aggregate corporate datasets is inevitable.  At the same time, a system administrator’s ability to rapidly performance tune and support growing user communities is challenged with limited available computing and human resources. Through this exercise, Unissant demonstrated that the Oracle Exalytics platform can deliver significant performance gains over traditional BI solutions by scaling to accommodate the largest of analytical datasets. The tremendous performance gains are attributed to Oracle’s OBIEE software optimization including leveraging up to 40 server cores simultaneously, leveraging 1 terabyte of RAM for server caching, and parallel threads for processing.  All of this can be done without increasing System Administrator staff size and by leveraging one integrated, self-contained platform.


Environment

The Unissant test environment included Exalytics, Exadata, Commodity Oracle 11g, and Commodity OBIEE 11g systems for performance comparison. Our test cases were a series of dashboards containing a variety of formats from Time Series Graphs, Pie Charts, Pivot Tables, and Performance Scorecards. Our ad hoc test cases were queries containing multiple facts and dimension objects with a total of 6 joins to the fact table converted into a graphical display. This provided us a complete sample set of tests simulating a “real-life” business scenario. Below is the environment and test case (dashboard and ad hoc) detail.

BD1

Property Inspection Analysis Dashboard:

BD02

Mortgage Risk Analysis Dashboard:

BD03

Mortgage Sub-Prime Analysis Dashboard:

BD04

The Ad hoc reports covered 4 different scenarios for performance testing:

  1. Ad hoc query #1 included one join to the fact table from a dimension table
  2. Ad hoc query #2 included two joins to the fact table from two different dimensions
  3. Ad hoc query #3 included three joins to the fact table from three different dimensions
  4. Ad hoc query #4 included four joins to the fact table from four different dimensions

 

Each of the ad hoc reports were converted in a graphical format after the initial database query returned the results. The purpose of this was to see if there were any lag times with drawing the graph when converted.


Environment Setup

Unissant utilized our pre-existing dashboards and data model built in our “Analytix360” managed service offering as the baseline application for testing.  This platform uses an Oracle OBIEE 11g database deployed in Unissant’s cloud-based Innovation Center. Our first step was to FTP the OBIEE metadata (catalog and .rpd) to the Exalytics platform and deploy the files in their respective folders on the engineered system. We then FTP our build scripts onto the Oracle 11g Commodity and Exadata database servers to deploy our mortgage data model. The next step was to load the sample data into the target dataset that included 8 dimension tables and 1 fact table. The fact table included an initial sample dataset of 200k rows.  A data extrapolation script was written to incrementally increase the data volumes in both the commodity DB and the Exadata DB. This script read the records from the fact table and looped through the records in creating the large volume datasets (2.5 billion records and 48 billion records).

The load script is shown below:
–Create data rows
INSERT /*+APPEND*/ INTO EXASTACK.LOAN_FACT

(BORROWER_KEY, LENDER_KEY, PROPERTY_KEY, PURPOSE_PROD_KEY, OCCUPY_TYPE_KEY,

LOAN_PRODUCT_KEY, BORROWER_INCOME, OLTV, LOAN_NUMBER, LTV, FICO, DAYS_DELINQUENT,

ORIG_UPB, OUTSTANDING_UPB, INTEREST_RATE, INSPECTION_SCORE, ORIG_DATE_KEY,

INSPECTION_DATE_KEY)

SELECT

BORROWER_KEY, LENDER_KEY, PROPERTY_KEY, PURPOSE_PROD_KEY, OCCUPY_TYPE_KEY,

LOAN_PRODUCT_KEY, BORROWER_INCOME, OLTV, LOAN_NUMBER, LTV, FICO, DAYS_DELINQUENT,

ORIG_UPB, OUTSTANDING_UPB, INTEREST_RATE, INSPECTION_SCORE, ORIG_DATE_KEY,

INSPECTION_DATE_KEY

FROM   EXASTACK.LOAN_FACT l,

(SELECT rownum repeat FROM dual

CONNECT BY LEVEL<= (SELECT MAX(LOAN_PRODUCT_KEY) from EXASTACK.OCCUPY_TYPE_DIM))r WHERE l.OCCUPY_TYPE_KEY>=r.repeat;

COMMIT WORK;

 

We ran into a few challenges with the commodity database running out of space; therefore, we had to tweak the script by lowering the amount of loops to accommodate smaller loads. The amount of loops was controlled by the max key values from the primary keys in the dimension tables, in this case was the LOAN_PRODUCT_KEY whose max value was 13. Therefore, the script looped 13 times against the dataset using a select statement and inserted the rows into the fact table generating the billions of rows of data.  While the commodity environment required special coding, the Exadata Engineered System performed seamlessly without the need for extra looping code.  Below are the load statistics and load times for each environment:

BD05

These load statistics reflect that Exadata loaded 506% faster than a commodity box up to 2.5 billion rows of data.

Our next step was to explode the dataset to 48 billion rows on the Exadata server for testing commodity OBIEE vs Exalytics. The Exadata box was able to process the load to the 48 billion records in 79.34 minutes.

BD06

 

Once the datasets were expanded to 2.5 billion and 48 billion rows, it was time to prepare the reporting environment. First, we ran the dashboards and ad hoc queries against both datasets to get the baseline performance statistics. Afterwards, we tweaked the OBIEE .RPD file for performance enhancement specific to the Exalytics environment. Next, we ran the aggregate persistence wizard to identify the levels of aggregation that would assist with query performance against very large data sets. Once we received the recommendations from the OBIEE engine, we ran the recommended scripts in the Exalytics TimesTen Aggregate Store (database). The TimesTen database is an in-memory database that has 1TB of space allocated to it on the Exalytics engineered system providing the ability for lightning fast query response times.


Exalytics – Database Test Summary (2.5 billion records):

These sets of tests were run using Analytix360 on Exalytics against a commodity back end database for performance comparison without utilizing the Exalytics TimesTen aggregator store for query performance enhancement. Then the same tests were run after performance enhancing Exalytics using the TimesTen aggregator store to improve the query response time. The dashboards used for testing contain a variety of formats from Time Series Graphs, Pie Charts, Pivot Tables, and Performance Scorecards. Ad hoc queries contained multiple facts and dimension objects with six joins to the fact table, and then converted into a graphical display.


Test 1: This test was a fresh run against the three dashboards using Exalytics. There was no performance tuning of Exalytics against 2.5 billion records.
The results of this test reflect that the Exadata Engineered System processed the dashboard queries on an average of 66% faster than the commodity database. This would be attributed to the compatibility between the Exalytics and Exadata Engineered Systems using InfiniBand and the Exadata Flash Cache compared to a vanilla high horsepower commodity database server of similar characteristics.

BD07

Test 2: This test covered four (4) ad hoc queries run using Exalytics. There was no performance tuning of Exalytics against 2.5 billion records.
The results of this test reflect that the Exadata Engineered System processed the ad hoc queries on an average of 66.75% faster than the commodity database. The performance gains would be attributed to the same features as described above for the dashboards around the compatibility between the Exalytics and Exadata Engineered Systems compared to a vanilla high horsepower commodity database server of similar characteristics.

BD08


Test 3: This test was a fresh run against the three dashboards using Exalytics with performance tuning involving aggregates in the TimesTen database against 2.5 billion records.
After we optimized and tuned the Exalytics platform there was very little variance in performance querying the commodity database compared to the Exadata Engineered System. The
improvement by using Exadata in this scenario was only an average of .03%.

BD09

Test 4: This test covered the four (4) ad hoc queries ran against 2.5 billion records using Exalytics, performance tuned with aggregates in the TimesTen database.

 

BD10

After we optimized and tuned the Exalytics platform there was very little variance in performance querying the commodity database compared to the Exadata Engineered System. The improvement by using Exadata in this scenario was only an average of .04% for the ad hoc queries.


Exalytics vs Commodity OBIEE using Exadata (48 billion records):

This set of tests was run using Analytix360 on both Exalytics and an OBIEE commodity server of similar characteristics. Due to the commodity Oracle 11g database server capacity constraints we could not get a benchmark test with the commodity OBIEE and the commodity Oracle 11g database combination. Although, this may have not proved very useful for our test as we are looking for statistics with the Oracle Engineered Systems and would have not provided query results in a reasonable amount of time against such a large data set. These tests were processed against a 48 billion record fact table using Oracle Exadata as the back end database.


Test 5: This test was a fresh run against the three dashboards using Exalytics vs. Commodity OBIEE. There was no performance tuning of Exalytics for 1 run, then we tuned Exalytics against 48 billion records.

The results of this test reflect that the Exalytics Engineered System processed the dashboard queries on an average of 94% faster than the OBIEE Commodity server. This would be attributed to the compatibility of the Exalytics and Exadata Engineered Systems using InfiniBand, and the Exadata Flash Cache. The Exadata server had been tuned for queries against 48 billion records and was available to both the Exalytics platform and Commodity OBIEE platform. After we tuned the Exalytics box our query response time was split second against the 48 billion record dataset.

BD11

 

Test 6: This test covered the four (4) queries ran using Exalytics. There was no performance tuning of Exalytics for 1 run, then we tuned Exalytics against 48 billion records.
The results of this test reflect that the Exalytics Engineered System processed the dashboard queries on an average of 76% faster than the OBIEE Commodity server. This would be attributed to the compatibility of the Exalytics and Exadata Engineered Systems using InfiniBand, and the Exadata Flash Cache. The Exadata server had been tuned for queries against 48 billion records and was available to both the Exalytics platform and Commodity OBIEE platform. After we tuned the Exalytics box, and again experienced split second response time against the 48 billion record dataset.

BD12

Conclusion

As seen in the performance testing results with the Oracle Engineered Systems, there is a true benefit to organizations in the market for a big data analytical solution. The time saved, in some cases from hours to sub-seconds, translates into increased staff productivity and lower labor cost, while also increasing user satisfaction.  Simply put, your organization can conduct existing analysis in a fraction of the time and can conduct new analysis where previously the time and the patience required was prohibitive. Adopters of this technology can mine vast enterprise datasets to monitor, predict, and develop new strategies to support their business needs. The major highlight of this effort was using Exalytics to query 48 billion records of data and return the results in a split second timeframe. This was achieved by Unissant Engineers tuning the system appropriately utilizing the aggregates, TimesTen database, and specific Exalytics environment parameters that allowed queries to quickly sift through terabytes of data instantaneously.

Unissant’s investment into the Oracle Engineered Systems for our customers will help temper their worries about how they handle big data analytics today and in years to come. We bring with us to each customer lessons learned and best practices only obtained by hands on experience in using these engineered systems. With the addition of Exadata database on the back end our customers will experience extreme scalability, performance, reliability, and availability that will accommodate future growth.  At Unissant, we understand that all companies don’t have a need for Exalytics or Exadata, but the ones that do, and make the investment in these platforms, will realize many benefits such as future cost year avoidance, scalability into the out years, and reduced out year risk.  Perhaps most importantly is the ability to analyze large sets of information that previously took too much time and effort.  This type of analysis may provide your organization with a competitive advantage or a cost-effective, significant regulatory capability that changes your business for years to come.

Contact Unissant

Unissant Logo - DSTI tag
11800 Sunrise Valley Dr., Ste 1000
Reston VA 20191
Email: information (at) unissant.com
Phone: 703.889.8500
Fax: 703.889.8501
www.unissant.com

Unissant is an innovative software development and consulting company that manages complex initiatives, solves data challenges, and transforms business.  Unissant brings technical excellence and program/project execution best practices that exceed the expectations of our clients in the Banking and Finance, Health and Life Sciences, National Security, and Federal/Civilian sectors.

Copyright © 2006-2015 Unissant, Inc. All rights reserved. No part of this publication may be reproduced, distributed, or transmitted in any form or by any means, including photocopying, recording, or other electronic or mechanical methods, without the prior written permission of Unissant, Inc., except in the case of brief quotations embodied in critical reviews and certain other noncommercial uses permitted by copyright law.

End of Document
image_pdfimage_print