Crystal Enterprise 8.5 Baseline Benchmark: IBM AIX

Executive Summary

Broad-scale business intelligence is rapidly becoming a necessity in competitive organizations. As the size of deployment and the degree of dependence on the system increase (often dictated by extranet deployments and mission-critical information delivery requirements), predictability becomes a fundamental criterion in evaluating the strength of a business intelligence system. Like other key systems in your data center, a business intelligence system should demonstrate predictable scalability and a compelling cost/performance equation.

Crystal Enterprise 8.5 was designed from the ground up to operate as a mission-critical component of your IT infrastructure, demonstrating the scalability, performance and reliability required to support company-wide business intelligence initiatives.

Given a complex, multi-dimensional user load, Crystal Enterprise demonstrates outstanding scalability and performance.

Highlights:
• Supports 75 concurrent users per processor with an average response time of less than 5 seconds for all operations.
• Scales in nearly perfect linear fashion (95% efficiency gains) within a cluster of machines or on a single machine.
• Handles over 47 million requests per day, resulting in over 3.9 million pages of content.

Benchmarks in the configured test system and the summary results highlighted above are based on the IBM AIX benchmark tests. Your performance/results may vary depending on the hardware/software data structures and usage pattern. Crystal Decisions makes no guarantees about your performance.
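
To put the daily totals in the highlights into per-second terms, the conversion can be sketched as below. This is only arithmetic on the quoted figures; it assumes the load is spread evenly over a 24-hour day, which the benchmark does not state, and the function name is illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def per_second(daily_total: float) -> float:
    """Convert a daily total into an average per-second rate (assumes even load)."""
    return daily_total / SECONDS_PER_DAY


# Lower bounds quoted in the highlights ("over 47 million", "over 3.9 million").
requests_per_sec = per_second(47_000_000)
pages_per_sec = per_second(3_900_000)

print(f"{requests_per_sec:.0f} requests/s, {pages_per_sec:.0f} pages/s")
```

Under that even-load assumption, the quoted totals correspond to roughly 544 requests and 45 pages per second on average.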
Our approach to scalability

Seven years of experience building and deploying clustered, scalable enterprise reporting systems has given us extensive insight into the challenge of modeling real-world customer scenarios and measuring the predictability of a BI system. In designing our system and testing its behavior, we employ four guiding principles:

Principle #1: Use realistic customer-based models
We have designed a “blended” test suite that emulates large customer usage patterns by mixing different content, users and activities into a set of test scripts that are then run in parallel. Rather than testing the behavior of individual components, or isolating a single activity, we simulate a broad range of usage patterns within a single test. This mirrors customer usage more closely and gives customers a better baseline for understanding how a real-world system would behave. It also prevents us from tuning the system for each isolated activity, which does not provide useful information for predicting system behavior.

Principle #2: Measure predictability
We believe that a benchmark should give the customer fair insight into predicting how different configurations will handle load. This allows customers to make intelligent cost/benefit trade-offs in architecting their system and making hardware purchase decisions. Based on our documented, “blended” test suite, we can demonstrate exactly how the addition of users and/or hardware impacts system performance.

Principle #3: Find the “performance zone”
A predictable system should demonstrate a “performance zone”, a bounded area within which the customer can make informed cost/performance trade-offs. Additionally, vendors should be able to identify a median point that delivers the optimal cost/performance relationship, while still providing enough capacity to handle spikes in load.

Principle #4: Ensure tests are open and repeatable
Within the software world, there are a number of standards that help customers make informed software purchasing decisions based on hard scalability and performance data (e.g. TPC benchmarks for databases). No such standard exists for BI systems, but we believe that fully documented, repeatable tests provide the first step towards the rigorous, standards-based tests that customers need to evaluate systems. Our tests use as many standard components as possible (including TPC standard databases and off-the-shelf testing software) and we document every aspect of our testing methodology.

November 2002. Copyright © 2002 Crystal Decisions, Inc. All Rights Reserved.

Key Findings

Crystal Enterprise 8.5 demonstrates near perfect linear scalability across multiple processors and physical machines
Crystal Enterprise can scale either horizontally (by adding physical machines to a cluster) or vertically (by adding processors to any given machine in the deployment). Our tests indicate that doubling the number of processors in a cluster using either of these methods results in a 95% increase in overall system throughput.

Crystal Enterprise can support over 47 million requests per day while simultaneously generating over 3.9 million pages in a 24-hour period
Business intelligence systems are rarely dedicated to a single operation; they are called upon to handle a broad variety of user requests, while efficiently generating content in the background or on demand. Crystal Enterprise can handle large request volumes, while simultaneously generating millions of pages of highly formatted, interactive information.

Crystal Enterprise can deliver sub five-second response times if configured in the “performance zone”
Users’ expectations for a web-based business intelligence system are based on their experience with other websites and applications, where response times over 8 seconds are seen as a sign that the system is not responding properly. Crystal Enterprise is able to readily deliver sub five-second response times within a defined performance zone.

No theoretical or real performance limits were discovered during testing
Despite intensive testing on single machines and clusters of machines with up to 32 processors, our tests showed no significant performance degradation or compromises in response time as overall deployment sizes and user bases were increased. This indicates that Crystal Enterprise could still support significantly more users and generate more content on larger clusters.
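
The throughput finding above can be turned into a rough projection. The sketch below assumes, as a simplification that the benchmark itself does not state, that the same 95% gain compounds at every doubling from a 4-processor baseline; the constant and function names are illustrative.

```python
import math

# "doubling the number of processors ... results in a 95% increase" in throughput
GAIN_PER_DOUBLING = 1.95


def relative_throughput(cpus: int, base_cpus: int = 4) -> float:
    """Throughput relative to the baseline, assuming the gain compounds per doubling."""
    doublings = math.log2(cpus / base_cpus)
    return GAIN_PER_DOUBLING ** doublings


def scaling_efficiency(cpus: int, base_cpus: int = 4) -> float:
    """Projected throughput divided by the perfectly linear ideal (cpus / base_cpus)."""
    return relative_throughput(cpus, base_cpus) / (cpus / base_cpus)


for n in (4, 8, 16, 32):
    print(f"{n:>2} CPUs: {relative_throughput(n):.2f}x baseline, "
          f"{scaling_efficiency(n):.1%} of perfect linear")
```

Under this compounding assumption, 32 processors would deliver about 7.4 times the 4-processor throughput, or roughly 93% of the perfectly linear 8 times.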

Comparative guidelines

Unfortunately, no two benchmark tests are the same. Product behavior and testing methodologies vary from vendor to vendor, and benchmarks are often used less to prove scalability than to make marketing claims about performance. We have attempted, in our benchmark design and execution, to provide evidence of linear scalability under real-world, mixed load conditions. We encourage customers to find points of commonality between our tests and those of other vendors, but to do so with the following points in mind.

Our tests use smaller average “think times” than many other vendors’ performance tests
Think times are variables set within the test scripts to regulate how long a simulated user waits before making the next request. For the IBM AIX benchmark tests, think times ranged from 5 seconds to 35 seconds. Other vendors use fixed wait times upwards of 30 seconds, which results in more regular and predictable load patterns, rather than the wide-ranging patterns often seen in real-world deployments.

Our tests tax the entire system at once
Our test suite is designed to measure a broad and complex mix of user activities that result in every component of the system being stressed simultaneously. We believe that this provides a more accurate picture of system behavior under real-world use and removes the ability to “tweak” the system configuration to suit specific operations. Many vendors test each system component in isolation, and are unable to give customers a clear picture of how to combine these one-dimensional tests into a complete view of overall system performance.

Our tests assume reasonable response times (maximum of 32 seconds)
Studies indicate that most users working in a web environment expect responses to common operations in fewer than eight seconds.
Our recommendations around the performance zone for Crystal Enterprise assume average wait times of fewer than ten seconds with a maximum wait time of 32 seconds. Some vendors’ tests assume maximum wait times of up to 90 seconds in order to show higher user volumes on a given hardware configuration.

Our tests use commercially available testing software
We strive to eliminate testing bias and provide greater transparency by using off-the-shelf tools to conduct our tests and measure our system performance. Our pure web-based design also means that it is easy to hook these tools up to our system without incurring excess overhead.
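
To illustrate how think times shape the load a simulated user generates, the sketch below draws think times from the 5 to 35 second range quoted above and applies the standard interactive response time law, X = N / (R + Z). The uniform distribution, the fixed random seed, the 5-second response time and the function names are all assumptions made for illustration; the benchmark does not say how think times were distributed, and the per-user transaction rate computed here is not necessarily the same metric as the request counts reported elsewhere in this document.

```python
import random

THINK_MIN_S, THINK_MAX_S = 5.0, 35.0  # think-time range quoted for the benchmark


def sample_think_time(rng: random.Random) -> float:
    """Draw one think time in seconds (uniform distribution is an assumption)."""
    return rng.uniform(THINK_MIN_S, THINK_MAX_S)


def transactions_per_second(users: int, mean_response_s: float,
                            mean_think_s: float) -> float:
    """Interactive response time law: X = N / (R + Z)."""
    return users / (mean_response_s + mean_think_s)


rng = random.Random(42)  # fixed seed so the sketch is repeatable
samples = [sample_think_time(rng) for _ in range(10_000)]
mean_think = sum(samples) / len(samples)

# 75 users per processor, an assumed 5 s response time, 20 s mean think time.
rate = transactions_per_second(users=75, mean_response_s=5.0, mean_think_s=20.0)
print(f"mean think time {mean_think:.1f} s, {rate:.1f} transactions/s per processor")
```

With a mean think time of 20 seconds (the midpoint of the range), 75 users at a 5-second response time would generate 3 user-level transactions per second per processor. A fixed 30-second think time would lower that rate and remove the variation between requests.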

The information contained in this document represents the best current view of Crystal Decisions on the issues discussed as of the date of publication, but should not be interpreted to be a commitment on the part of Crystal Decisions or a guarantee as to the accuracy of any information presented.

This document is for informational purposes only. CRYSTAL DECISIONS MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED, IN THIS DOCUMENT. CRYSTAL DECISIONS SHALL HAVE NO LIABILITY OR OBLIGATION ARISING OUT OF THIS DOCUMENT.

© Copyright 2002 Crystal Decisions, Inc. All rights reserved. Crystal Reports, Crystal Enterprise, and Crystal Decisions are the trademarks or registered trademarks of Crystal Decisions, Inc. All other trademarks referenced are the property of their respective owners.

Specifications and product offerings subject to change without notice.

Contents

EXECUTIVE SUMMARY
CONTENTS
INTRODUCTION
THE VALUE OF BENCHMARKING
BENCHMARK TESTS
SCALABILITY DEFINED
BENCHMARK TEST RESULTS
BENCHMARK CONCLUSION
APPENDIX 1: TEST DEFINITIONS
APPENDIX 2: SCALABILITY TEST ENVIRONMENT
APPENDIX 3: PERFORMANCE TEST ENVIRONMENT (LOGON AND SCHEDULE)
APPENDIX 4: LOAD TESTING SOFTWARE

Introduction

This document is designed to help Crystal Decisions customers understand how Crystal Enterprise scales, in order to better plan their deployments and anticipate system requirements.

Benchmarks and the resulting information in the configured test system are based on the IBM AIX benchmark tests. Your performance/results may vary depending on the hardware/software data structures and usage pattern. Crystal Decisions makes no guarantees about your performance.

Crystal Enterprise is a multi-tier system. Although the components are responsible for different tasks, they can be logically grouped based on the type of work they perform. In Crystal Enterprise, there are four tiers: the client tier, the intelligence tier, the processing tier, and the data tier. To provide flexibility, the components that make up each of these tiers can be installed on one machine, or spread across many.

Along with a flexible architecture that allows for growth horizontally and vertically while providing high availability and fault tolerance (see Crystal Enterprise 8.5 DataCenter Certification), it is important for the system not only to scale but also to scale linearly.

The value of benchmarking

Many performance benchmarks try to come up with a single high-impact number to grab headlines. Using one-dimensional test suites that stress only a single system component or process at a time is a proven mechanism for generating big numbers for marketing purposes. We believe that this approach does not yield useful deployment planning information, as it runs exactly counter to how a business intelligence system is stressed in real-world environments.

In reality, virtually every large-scale business intelligence deployment handles dozens of very different simultaneous requests every second. Not only must the system cope with a complex mix of transactions, but it must also cope with a constantly changing mix of transactions.
It is exactly because of these two real-world requirements that we have chosen to build a composite test suite that not only mixes multiple transaction types, but also varies the mix of these transactions in a random fashion. We believe that this provides a more useful indicator of how our system will behave under real-world load. It does not generate a single large number, but should give the prudent customer confidence that Crystal Enterprise can handle the specific and varied demands of their business.

Benchmark Tests

Tests outlined in this document consist of scalability tests conducted at the IBM Performance Test labs in San Mateo, California, as well as component-based tests conducted at Crystal Decisions Performance Labs. All test measurements were obtained using Rational Software’s Rational Test Manager.

IBM Performance Lab Tests
These results are based on four test series run across 4, 8, 16 and 32 CPU configurations respectively. Tests were conducted on single machines (to emphasize vertical scalability) and across multiple machines (to highlight horizontal scalability). Measurements for all tests are based on user response time and transaction rate. These tests employed our Baseline benchmark test suite (see appendix for a full description of this suite), a diverse and complex mixture of user activities, load patterns and content types.

Component-based tests
Additionally, a range of tests was run on a 12-processor, 3-node cluster to measure how many commands could be performed simultaneously (per second). These are provided only to illustrate peak load handling for isolated operations and are not as illustrative of overall system performance as the IBM Performance Lab Tests.

Scalability defined

Before any comparison of scalability in competing Business Intelligence systems can be made, there must be a clear definition of scalability. This issue can be confusing, as scalability is often talked about in combination with high-availability design and reliability. Although reliability and a highly available architecture are both factors of scalability, their existence alone within the design of the system does not guarantee scalability.

Scalability: the capacity to address additional users or transactions by adding resources without fundamentally altering the implementation architecture or implementation design. For a Business Intelligence system to be scalable, customers should be able to maintain steady performance as the load increases simply by adding resources such as servers, processors or memory.

Throughput: a measure of a system’s transaction rate on a given hardware configuration.

Response time: a measurement, in seconds, of the time taken between a user request and a response from the server. In other words, response time is an end-user experience metric.

Performance: Performance can be measured in either throughput or response time. The main objective is to achieve a linear relationship between the amount of resources added and the resulting increase in performance, while keeping transaction speed constant.

Linear scalability: a straight-line relationship between resources and load handling capabilities. The slope of this line demonstrates the efficiency of a system at using additional resources. Perfect linear scalability suggests a 1:1 ratio, or a line with a slope of 1. In a perfectly linear scalable Business Intelligence system, it should be possible to service twice as many users in the same amount of time by adding twice as much hardware. Conversely, if the hardware were fixed and the number of users (or, more specifically, the number of transactions generated) doubled, it would take twice as long to service those requests (i.e. performance would decay by a predictable factor). System reliability becomes a critical factor in ensuring this behavior.

Vertical scalability (scaling up): a system’s ability to handle extra load as the number of processors installed on a single machine is increased.

Horizontal scalability (scaling out): a system’s ability to handle extra load as the number of machines in a cluster is increased.

Ideally, a system should demonstrate roughly equivalent performance and load handling characteristics, regardless of which approach is employed to scale the system.

Reliability: a system’s ability to gracefully handle increased load, component failure or overload. A Business Intelligence system should not stop working as user load increases; it should simply slow down gracefully.

For the purposes of this document, a highly available architecture and reliability of the Business Intelligence system are assumed.

In real life, (perfect) linear scalability is impossible.
The extra overhead for resource management and communication between services reduces the ability of an application to scale linearly, relative to the constraining resources. Figure 1 shows the difference between a non-linear and a linearly scalable system. A linearly scalable system will continue to benefit from additional hardware. A non-linearly scalable system will, at some point, fail to benefit from additional hardware, and further down the line additional hardware can be expected to actually degrade the system’s performance. Most consumers would prefer the performance line showing perfect linear scalability, but since this is virtually impossible, the astute Business Intelligence consumer should strongly consider only Business Intelligence systems that can demonstrate a relationship of performance to additional hardware that is as close as possible to 1:1.

Figure 1: Different degrees of scalability. This chart compares "perfect" linear scalability with "good" linear scalability and "bad" non-linear scalability.

Benchmark Test Results

IBM Performance Lab Tests
The following results were obtained from tests that were run at the IBM Test Labs in San Mateo, California. The results are based on a specific environment and scenario that was designed to emulate the environment and behavior of the most typical Crystal Enterprise customer. The purpose of these tests was therefore to demonstrate that linear scalability can be achieved in a practical environment that includes all concurrent user functions and system tiers, rather than only showing scalability in separate component-based tests where individual components and tiers are isolated. Component-based tests are valuable for isolating and testing the performance of specific tiers or components; however, they cannot test or indicate the performance of the complete system.

Our tests indicate exceptional levels of performance that can be maintained as load and hardware are increased. Within the described test environment, reported benchmark results are based on 75 Concurrent Active users per processor. For this environment, this was found to be the point before any degradation in performance occurred (well before the “peak” in the degradation curve). In other words, 75 Concurrent Active users per CPU offered an ideal cost/performance trade-off.
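
The slope described in the definition of linear scalability can be estimated from a set of measurements. The sketch below is one way to do that, using a least-squares fit on values normalized to the smallest configuration; the throughput numbers are hypothetical placeholders, not results from this benchmark, and the function name is illustrative.

```python
def scaling_slope(resources: list[float], throughput: list[float]) -> float:
    """Least-squares slope of normalized throughput against normalized resources.

    Both series are divided by their first entry, so a slope of 1.0 means
    perfect linear scalability and lower values mean less efficient scaling.
    """
    xs = [r / resources[0] for r in resources]
    ys = [t / throughput[0] for t in throughput]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


cpus = [4, 8, 16, 32]              # configurations used in the lab tests
perfect = [100, 200, 400, 800]     # ideal 1:1 scaling, arbitrary units
measured = [120, 230, 450, 880]    # HYPOTHETICAL values, for illustration only

print(f"perfect: {scaling_slope(cpus, perfect):.3f}")
print(f"measured: {scaling_slope(cpus, measured):.3f}")
```

A slope close to 1 corresponds to the "good" line in Figure 1; a slope that falls as configurations grow would indicate the non-linear case.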

Finding the “Performance Zone”

Figure 2: Performance Region. This chart shows how response time varies as user load is increased on fixed hardware (4 CPUs).

Speed spot: The speed spot in the graph is the point where the system begins to perform under any degree of stress. Before the speed spot, the system is underutilized, i.e. an excessive ratio of hardware to load, or 10% to 15% CPU utilization.

Sweet spot: The sweet spot represents the ideal performance/hardware relationship. The system is under stress, properly utilizing available resources and maintaining an acceptable level of performance, while still leaving room for graceful degradation if extra load is applied (50% to 75% CPU utilization).

Peak: The peak is the area where system resources are being stressed to the point where there is no further room for degradation (100% CPU utilization). This peak is an area that should be avoided in everyday system usage. Before this point is reached, additional hardware resources should be considered. Benchmark results demonstrate through upward and outward linear scalability that adding new resources will allow for predictable performance gains.

The performance zone runs between the speed spot and the sweet spot. Depending on response time requirements, usage patterns and resource constraints, customers should architect their system to fall within this zone. Customers seeking optimal response time should architect towards the low end of the performance zone, and all customers should be aware that larger reports, busy database servers and certain operations like exporting and searching can significantly affect overall system performance. The Crystal Enterprise Sizing Guide is still your best resource for architecting a system that will fit your specific requirements.

All data included in this benchmark is based on the recorded performance zone determined using the test’s specific Scalability Test Environment.
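
As a rough aid for reading CPU utilization against the zones described above, the sketch below maps a utilization percentage to a zone. The exact cut-off values (15% and 75%) are assumptions derived from the utilization ranges quoted for the speed spot and sweet spot; the document does not define hard boundaries, and the function name and labels are illustrative.

```python
def classify_utilization(cpu_percent: float) -> str:
    """Map CPU utilization to a zone (thresholds are assumed, not from the benchmark)."""
    if not 0 <= cpu_percent <= 100:
        raise ValueError("CPU utilization must be between 0 and 100")
    if cpu_percent < 15:
        # Speed spot region: more hardware than the load requires.
        return "underutilized"
    if cpu_percent <= 75:
        # Between the speed spot and the upper end of the sweet spot.
        return "performance zone"
    # Beyond the sweet spot: little room left for graceful degradation.
    return "approaching peak"


for reading in (12, 60, 95):
    print(f"{reading}% CPU: {classify_utilization(reading)}")
```

A monitoring script could apply such a check to sustained averages rather than momentary spikes, since short bursts above the sweet spot are what the remaining headroom is meant to absorb.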
With this specific test environment, the sweet spot was determined to be 75 Concurrent Active users per CPU and the speed spot was found to be 50 per CPU. Subsequently, the results apply to 300 Concurrent Active users on 4 CPUs, 600