The Keys to Performance Tuning and Testing
Executives from across the industry explain that the keys to using performance testing and tuning to improve speed and quality center on performance and scalability.
To gather insights on the current and future state of Performance Testing and Tuning, we talked to 14 executives involved in performance testing and tuning. We asked them, "What are the keys to using performance testing and tuning to improve speed and quality?" Here's what they told us:
Performance
- Alleviate concerns and reduce the risk of poor quality given the acceleration of releases. Run multi-user performance load tests, with backend infrastructure that integrates with load generators such as JMeter. Heavy load testing does not fit neatly into an Agile development cycle, so the question becomes how to insert more practices and capabilities into the cycle without overloading the developer or the tester. Open eyes to the value of inserting performance testing into smoke and regression tests (a minimal smoke-check sketch follows this list): it reduces the cost of reworking code later in the SDLC and provides the ability to make objective decisions without the time pressure of a launch.
- 1) Performance as response time for application users. 2) Efficiency in the cloud: managing cost by optimizing the design schema, queries, and infrastructure can provide savings that are orders of magnitude greater. 3) Database optimization, looking at the schema, queries, and general application architecture (a query-plan sketch follows this list). 4) Scalability: if the app grows, can I still get the performance I want? Find the balance between single- and multi-node databases. Be realistic in understanding what you are looking for.
- It is important to monitor the performance of the systems as well as the experience customers have when interacting with those systems. Testing and tuning functions need to be aligned around a shared understanding of customers, and a shared set of objectives/metrics for customer experience. A combination of synthetic and real user monitoring should be used where available for all relevant applications and protocols. Test locations should mirror where users are located and be representative of the ISPs being used by customers.
- Performance is key for any provider of an online service or app. Start with the product: what experience you want the user to have, loading time, and features. Ensure consistent performance across all geographies.
- While there are many different types of monitoring, advanced end-user testing is essential because it measures actual, real-time performance where it really counts. Not only does this enable IT administrators to be proactive and prevent issues before they derail the organization, but in our case, we use this data to help create new and informative data points for our Intelligent AutoScaling feature, which enables dynamic and automated scaling of compute resources based on real usage trends. This means we can not only use data from performance testing to make further strategic recommendations for improving our customers’ IT environments, but also drive new insights and refine our understanding of industry standards.
- The first step is to identify what performance metric is being tuned for and have a definition of "good" vs. "bad" performance for a given workload, as tuning for a metric is often a series of trade-offs. For example, tuning for low latency in the CPU scheduler may sacrifice overall throughput. Similarly, when tuning storage, it's important to know whether tuning for higher I/O operations per second or higher throughput is preferred because it influences what storage parameters are modified and can influence the choice of I/O scheduler. Similar concerns apply to other aspects of tuning, so knowing exactly which metrics take priority is critical. Once that is established, it is desirable to have a reproducible, reliable, and representative test scenario to determine if each tuning step is improving or regressing the workload (a sketch of such a before/after comparison follows this list). This is not always possible in the field, but it is desirable. It is generally possible during development, as artificial workloads can be designed that exercise the system in different ways. Once the test scenario is tuned for, it's important to validate that the tuning helps the production workload. A key observation here is that a workload must be understood, because tuning that improves workload A may not improve and may even regress workload B. One should not apply tuning parameters blindly. Next, it's important to know where the bottlenecks are and tune as close to the problem as possible. Even if the analysis indicates that a workload has a bottleneck in system software, libraries, or the kernel, it should be considered whether this is manifested by poor tuning within the application itself. If so, the application should be tuned first before tuning the system, as that is often a more effective solution. Finally, it's important to realize that tuning is not the only tool in the box. There are cases where modifying the software, be it the system software or the application, is the most appropriate solution. This is a less common outcome for a customer problem, but it does happen. I did have an experience recently where a problem could be mitigated by tuning but altering the application would be a much-preferred outcome. Unfortunately, application modification was impractical, but it was possible to optimize the system call in question to cope with the application behavior, and the resulting performance was superior to what tuning achieved. Fundamentally, the key with testing and tuning is to have good data, understand how to analyze it, be able to explain why a particular tuning decision improves the performance of a workload, and document each tuning that is applied with an explanation of why it was used, for future reference.
- Speed is a common interest or goal and is typically easier to quantify and measure. For example, customers will often focus on single metrics like response time performance to describe performance and infer quality. It's a well-touted metric and often quoted (along with concurrency) as the primary objective or goal. However, metrics such as 'speed' or 'concurrency' are simply data. Looked at in isolation they lack context or meaning. Combined with observations, they can start to explain behavior. Understanding system behavior lets us make more accurate predictions which can be further tested. It's not until we are able to ask these questions and test them that we get closer to more qualitative objectives such as quality. Performance testing is simply the means to achieve an objective. It can be described as putting a demand on a system and measuring it for performance. Quality is not just a single objective. It might include things like reliability, availability, scalability, and so on. Great performance testing delves into questions about performance, gathering evidence to build confidence or reduce the risk for a range of tasks and objectives. Examples might be describing performance in a fail-over scenario. Or perhaps it will address issues of scalability and cost-effectiveness for a given architecture. Maybe it will help forecast capacity and predict demand. There are many questions about performance which can be addressed through testing.
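To make the first point above concrete, here is a minimal sketch of a smoke-level load check that could sit alongside a regression suite. It uses only the Python standard library; the endpoint URL, concurrency, request count, and latency budget are illustrative assumptions rather than values from the article, and a real suite would more likely delegate to a load generator such as JMeter.

```python
import statistics
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

ENDPOINT = "http://localhost:8080/health"   # hypothetical endpoint under test
CONCURRENCY = 10                            # illustrative values only
REQUESTS = 100
P95_BUDGET_SECONDS = 0.5                    # hypothetical latency budget

def timed_request(_):
    start = time.perf_counter()
    with urllib.request.urlopen(ENDPOINT, timeout=5) as resp:
        resp.read()
    return time.perf_counter() - start

def test_smoke_load():
    with ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
        latencies = sorted(pool.map(timed_request, range(REQUESTS)))
    p95 = latencies[int(len(latencies) * 0.95) - 1]     # approximate p95
    print(f"median={statistics.median(latencies):.3f}s p95={p95:.3f}s")
    assert p95 <= P95_BUDGET_SECONDS, f"p95 {p95:.3f}s exceeds {P95_BUDGET_SECONDS}s budget"

if __name__ == "__main__":
    test_smoke_load()
```

The point is simply that a small, fast check like this can run on every build, leaving heavier load tests for dedicated stages.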
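For the database-optimization point above, the sketch below shows the kind of query-plan check that schema and query tuning starts from. It uses SQLite from the standard library so it is self-contained; the table, query, and index are invented for illustration, and a production database would use its own EXPLAIN/ANALYZE tooling.

```python
import sqlite3

# Self-contained example database (invented schema and data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")
conn.executemany("INSERT INTO orders (customer_id, total) VALUES (?, ?)",
                 [(i % 1000, i * 1.5) for i in range(10_000)])

QUERY = "SELECT SUM(total) FROM orders WHERE customer_id = 42"

def show_plan(label):
    # EXPLAIN QUERY PLAN reports whether SQLite scans the table or uses an index.
    plan = conn.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()
    print(label, [row[-1] for row in plan])

show_plan("before index:")   # expect a full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
show_plan("after index:")    # expect a search using idx_orders_customer
```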
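The reproducible before/after comparison described in the tuning answer above can be as simple as the harness sketched here: run the same workload repeatedly, summarize the one metric being tuned for, apply a single change, and measure again. The workload function and run count are placeholders; in practice the workload would drive the real system and the tuning step would be a documented configuration change.

```python
import statistics
import time

def run_workload():
    """Stand-in workload; replace with the real, representative scenario."""
    start = time.perf_counter()
    sum(i * i for i in range(200_000))   # placeholder work so the sketch runs
    return time.perf_counter() - start

def measure(runs=15):
    samples = sorted(run_workload() for _ in range(runs))
    return {
        "p50": statistics.median(samples),
        "p95": samples[int(len(samples) * 0.95) - 1],   # approximate p95
    }

baseline = measure()
# ... apply exactly one tuning change here (scheduler, sysctl, I/O parameters) ...
tuned = measure()

for metric in ("p50", "p95"):
    delta = (tuned[metric] - baseline[metric]) / baseline[metric] * 100
    print(f"{metric}: baseline {baseline[metric]:.4f}s, tuned {tuned[metric]:.4f}s ({delta:+.1f}%)")
# Record which tuning was applied and why, so each change stays explainable later.
```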
Scalability
- As the application evolves and new features are added, it is important to keep the performance testing environment up to date with the latest features. I do not mean testing every single newly added button or API method, but rather keeping a defined set of major use cases that exercise any newly developed features (a sketch of such a versioned scenario set follows this list). From there, using various performance testing techniques, try to measure the system's performance at high scale to see whether the newly developed features scale well and don't introduce any bottlenecks.
- Set up, load, and test at scale. Ask the client what they want to accomplish, then identify the best solution based on their needs with regard to load and performance testing. Do the first performance test in the test environment, then run tests in the production environment. Big architectural changes have the most significant effect on performance, so identify the gaps in the architecture to optimize resources. Gauge the traffic coming in and see whether the system can take 25 to 50% more, performing burst testing to optimize for traffic (a burst-test sketch follows this list).
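As a follow-up to the first scalability point, here is a sketch of what a versioned set of major use cases might look like: a small registry of journeys with latency budgets that grows as features ship. The base URL, paths, and budgets are hypothetical, and real journeys would usually need POSTs, authentication, and data setup.

```python
import time
import urllib.request

BASE_URL = "http://localhost:8080"          # hypothetical system under test
JOURNEYS = {                                # kept in version control; extend as features ship
    "browse_catalog": ["/api/products", "/api/products/123"],
    "checkout":       ["/api/cart", "/api/checkout/confirm"],
}
BUDGETS_SECONDS = {"browse_catalog": 0.3, "checkout": 0.8}   # hypothetical budgets

def run_journey(paths):
    # Real journeys would also need POSTs, auth, and test data setup.
    start = time.perf_counter()
    for path in paths:
        with urllib.request.urlopen(BASE_URL + path, timeout=5) as resp:
            resp.read()
    return time.perf_counter() - start

for name, paths in JOURNEYS.items():
    elapsed = run_journey(paths)
    verdict = "ok" if elapsed <= BUDGETS_SECONDS[name] else "over budget"
    print(f"{name}: {elapsed:.3f}s ({verdict})")
```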
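And for the burst-testing point, a minimal sketch of a headroom check: run a stage at roughly current traffic, then at 25% and 50% above it, and watch error counts and tail latency. The endpoint, worker counts, and request volume are assumptions; a dedicated load tool would normally drive this.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

ENDPOINT = "http://localhost:8080/api/orders"   # hypothetical endpoint
BASELINE_WORKERS = 20                           # stands in for current peak traffic
REQUESTS_PER_STAGE = 200

def fire_request(_):
    try:
        start = time.perf_counter()
        with urllib.request.urlopen(ENDPOINT, timeout=5) as resp:
            resp.read()
        return time.perf_counter() - start
    except Exception:
        return None                             # count failures rather than crash the test

def run_stage(workers):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(fire_request, range(REQUESTS_PER_STAGE)))
    ok = sorted(r for r in results if r is not None)
    errors = len(results) - len(ok)
    p95 = ok[int(len(ok) * 0.95) - 1] if ok else float("inf")
    print(f"workers={workers:3d} errors={errors:3d} p95={p95:.3f}s")

for factor in (1.0, 1.25, 1.5):                 # baseline, then 25% and 50% bursts
    run_stage(int(BASELINE_WORKERS * factor))
```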
Other
- 1) You need a solution platform for troubleshooting and diagnostics capable of monitoring every transaction and every command in real time. 2) There's a fundamental disconnect between application team members and infrastructure team members, since the application team is now leveraging the second generation of APM solutions. There's no common understanding of what the computer and application are trying to manage, and no end-to-end view. Infrastructure complexity and scale are beyond human comprehension, so you need intelligence within the monitoring tools.
- Applications have multiple legs of communications, from CDNs and DNS providers to DDoS mitigation providers to origin servers, plus backend links via APIs to SaaS providers. There are so many pieces outside specific WAN boundaries. Given that diversity, you need to figure out how to establish a baseline on the new normal between application and network teams. UX depends on multiple pieces working together, so understand all the components of communications and establish a baseline (a simple baseline check is sketched after this list). As applications evolve, there are more cloud components. Map web API journeys so application developers and operations have a common understanding of what's normal to manage. Know what to measure and look at the components of the UX. Determine what you need to manage and how to restore to the norm. Have a shared understanding of the application and infrastructure.
- Maturity is not high. First, make sure the app doesn't crash. Next, make sure the customer has a pleasant journey. Look at the user journey and determine what matters to them. Test real end-to-end test cases with positive and negative experiences; do what users are really doing. Measure performance with regard to speed and practice continuous performance testing. Include usability testing, UX testing, A/B tests, and accessibility testing (508 compliance). Show the correlation with the changes you are making and determine the remedy to improve the UX. Test cases should be user stories.
- Build an unbreakable continuous delivery (CD) pipeline. Use continuous performance testing with validation steps to see how the application is performing under load; more customers now maintain a continuous performance environment. Test auto-deployment, then load, then do a performance assessment (a pipeline-gate sketch follows this list). Look at more than response time: throughput, customers, CPU cycles per endpoint, and CPU cycles per test.
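Following the point above about establishing a baseline across the components of a user journey, here is a minimal sketch of a baseline check. The component names, baseline values, and tolerance are invented; real baselines would come from historical synthetic and real-user monitoring data.

```python
BASELINE_MS = {             # agreed "normal" per component of the user journey
    "dns_lookup": 30,
    "cdn_edge": 80,
    "origin_api": 250,
    "payments_saas": 400,
}
TOLERANCE = 1.3             # flag anything more than 30% above its baseline

def check(current_ms):
    for component, baseline in BASELINE_MS.items():
        observed = current_ms.get(component)
        if observed is None:
            print(f"{component}: no data")
        elif observed > baseline * TOLERANCE:
            print(f"{component}: {observed}ms vs baseline {baseline}ms -> investigate")
        else:
            print(f"{component}: {observed}ms within the normal range")

# Example measurement cycle; the numbers are made up.
check({"dns_lookup": 28, "cdn_edge": 95, "origin_api": 410, "payments_saas": 390})
```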
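And for the continuous-delivery point, a sketch of a performance quality gate that a pipeline stage could run after the load step. The metric names, thresholds, regression limit, and result-file format are assumptions; the only requirement is that the script exits non-zero so the pipeline fails on a regression.

```python
import json
import sys

THRESHOLDS = {                  # hypothetical absolute limits for this service
    "p95_response_ms": 500,
    "error_rate": 0.01,
}
MAX_REGRESSION = 0.10           # fail if more than 10% worse than the previous build

def gate(current, previous):
    failures = []
    for metric, limit in THRESHOLDS.items():
        value = current[metric]
        if value > limit:
            failures.append(f"{metric}={value} exceeds limit {limit}")
        if previous and metric in previous and value > previous[metric] * (1 + MAX_REGRESSION):
            failures.append(f"{metric} regressed: {previous[metric]} -> {value}")
    return failures

if __name__ == "__main__":
    # Result files are assumed to be written by the load-test stage of the pipeline.
    with open("load_test_results.json") as f:
        current = json.load(f)
    try:
        with open("previous_results.json") as f:
            previous = json.load(f)
    except FileNotFoundError:
        previous = None

    problems = gate(current, previous)
    for problem in problems:
        print("GATE FAIL:", problem)
    sys.exit(1 if problems else 0)
```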
Here’s who we spoke to:
- Dawn Parzych, Director of Product and Solution Marketing, Catchpoint Systems Inc.
- Andreas Grabner, DevOps Activist, Dynatrace
- Amol Dalvi, Senior Director of Product, Nerdio
- Peter Zaitsev, CEO, Percona
- Amir Rosenberg, Director of Product Management, Perfecto
- Edan Evantal, VP, Engineering, Quali
- Mel Forman, Performance Team Lead, SUSE
- Sarah Lahav, CEO, SysAid
- Antony Edwards, CTO and Gareth Smith, V.P. Products and Solutions, TestPlant
- Alex Henthorn-Iwane, V.P. Product Marketing, ThousandEyes
- Tim Koopmans, Flood IO Co-founder & Flood Product Owner, Tricentis
- Tim Van Ash, S.V.P. Products, Virtual Instruments
- Deepa Guna, Senior QA Architect, xMatters
Opinions expressed by DZone contributors are their own.