Black Friday Lessons

In the last blog (found here), I talked about the first five of the top 10 ways to help make your systems bulletproof against the reliability, availability, and scalability demands of holiday processing. This week I continue with those reliability, availability, and scalability topics from my geeky technology background point of view.

The next five items will help you understand the application and database performance technology side of my suggestions. The system architecture, application code base, and their detailed implementation provide the technology landscape that will determine your metrics. Here are the next five items:

  • Three Rs for performance tuning. When I was a fresh IT college graduate in my first job, I was lucky enough to work with old-school performance experts who were always fixing and improving applications. They said system and application performance was real and impacted every user every day during their interactions. Therefore, a developer’s coding problems or later improvements reverberated through the years of application usage.

    My performance mentors emphasized capturing real end-user experiences and completely reliable and resilient statistics that reflected real application interactions regardless of their results. Good, bad, or indifferent, the real statistics reflected the system and application usage. I say the three Rs because real, reliable, and resilient statistics are the foundation for improving any situation. This three Rs approach has guided me well over many years of mainframe, client-server, distributed, and now cloud system performance tuning. The three Rs concept applies to many parts of technology, such as costs, development time, and the number of interconnected processes, and it is especially pertinent to performance. Start with the three Rs (real, reliable, and resilient statistics) and you can always improve the performance of any situation.

  • Understand the number of transactions, especially web transactions. This is somewhat related to the first item, the three Rs concept. With web pages and web microservices, web transactions can have many different types of performance implications. Web application performance is critical, and many factors affect it, such as web page load time, the number of bytes that make up the page, and the number of objects, connections, and hosts involved in the transaction.

    All of these factors, along with the number of database calls, the number of databases involved, the types of calls, and the number of web partners verifying web transactions for fraud, credit worthiness, and upselling, make dissecting web transactions and diagnosing web performance very difficult. Comparing these statistics against last year’s is even more maddening when the standard web transaction has been enhanced with a new programming architecture, twice the application servers, and twice as many microservices. Knowing your application history, architecture, and the processes that make up a transaction is the first step to measuring it and dissecting its performance.

  • Understand when to add more capacity. Performance is only as good as the limits of your capacity. Knowing when your servers or mainframe need more capacity is a critical part of performance evaluation for any application or system. Capacity is determined not only by whether your CPU is 100% busy, but also by the memory, storage, and network capacity your application consumes while serving customers.

    Capacity means knowing the details of the hardware within your operating environment, such as the number of processors, the number of storage connections, and the network bandwidth in and out of the system. Know the system, maybe not down to the manufacturer’s specs, but at least its general configuration. The smaller the system, such as a UNIX server or a VM, the easier it is to understand the resources and maximize performance.

    The bigger the systems, the more shared resources and the more workload management supervisors are required. In a previous blog, I talked about UNIX VMs and how the VMware vMotion workload management product can dynamically reallocate CPU, memory, and resources to other VMs within the vMotion system fabric. Within the mainframe world and DB2 LUW environments, Workload Manager (WLM) can prioritize and manage workloads. So understand your capacity well enough to recognize when your system is running out of resources and when a WLM supervisor is not reallocating them appropriately to achieve good performance.

  • Know your security overhead. Security overhead used to be something that only happened at the beginning of transactions or jobs. Now security protocols can be encountered hundreds of times within transactions or jobs because of the use of called, subordinate, or third-party interfaces. The security overhead, in the form of processing delays and extra CPU, I/O, and network usage, can have a dramatically negative impact on overall processing.
    Security continues to be the most important aspect of the system and is worth every bit of processing, but application designers and developers need to be conscious of the overhead costs and processing delays.

  • Know your database interactions. Having been a database performance and security consultant for many years, I always hear management, developers, and DBAs complain about database performance. The only way to address these complaints is through solid metrics and statistics. In many situations the application developers believe their code executes one way, while it actually executes completely differently. To monitor and improve any problem, specifically a suspected database issue, a comprehensive group of items needs to be gathered to determine the root cause of the performance situation.
    The following is a list of statistics that I recommend clients gather and retain to support their ongoing performance analysis. These minimal figures and statistics are usually only a subset of the information that can be gathered from most performance monitors. The only consideration is how long your company should retain the statistics. Usually 18 months of performance details is good, so that year-to-year comparisons can show the cost of extra application features from a performance impact perspective.

Transaction/Interaction Duration Started/Ended
Processing areas: startup, security, network, application, SQL, database, I/Os, CPU, memory time statistics

System and application utilization CPU, I/O, zIIP, memory usage
Max/Average Concurrent processes
Max/Average CPU percent used
Max/Average storage reads/writes per second
Max/Average Memory utilization used
Objects referenced, tables, columns and indexes used/accessed
Max/Average Sync reads, List and Dynamic pre-fetch
Get pages, I/O and Wait time

Java performance statistics

JVM settings and utilization
Max/Average open JDBC connections
Max/Average Total queue records processed
Heap % and Health score
Garbage collection percent of JVM time
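
As a rough illustration of how a few of the statistics above might be captured and rolled up into Max/Average figures, here is a minimal Python sketch. The names `TransactionSample` and `summarize`, and the particular fields chosen, are hypothetical and not taken from any specific performance monitor. Failed transactions are kept in the figures, in the spirit of recording real statistics regardless of their results.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TransactionSample:
    """One observed transaction; all field names are illustrative assumptions."""
    name: str
    started: float      # start time in seconds
    ended: float        # end time in seconds
    cpu_percent: float  # CPU percent used while the transaction ran
    get_pages: int      # database get pages attributed to the transaction
    succeeded: bool     # failures are recorded, not filtered out

    @property
    def duration(self) -> float:
        return self.ended - self.started


def summarize(samples):
    """Roll raw samples up into Max/Average figures suitable for retention."""
    if not samples:
        raise ValueError("no samples to summarize")
    durations = [s.duration for s in samples]
    cpu = [s.cpu_percent for s in samples]
    return {
        "count": len(samples),
        "failures": sum(1 for s in samples if not s.succeeded),
        "max_duration": max(durations),
        "avg_duration": mean(durations),
        "max_cpu_percent": max(cpu),
        "avg_cpu_percent": mean(cpu),
        "total_get_pages": sum(s.get_pages for s in samples),
    }


# Made-up sample data, purely to show the shape of the roll-up.
samples = [
    TransactionSample("checkout", 0.0, 0.5, 40.0, 120, True),
    TransactionSample("checkout", 1.0, 2.5, 80.0, 300, False),
]
summary = summarize(samples)
print(summary)
```

Summaries like this, stored per transaction type and per day, are small enough to retain for the 18 months suggested earlier and make year-to-year comparisons straightforward.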

These five different areas document what the system, application, and database are doing and what your developers asked these components to do. Your developers’ job is to give an application an efficient architecture, robust hardware, and proper code, along with the best table and index designs, joining the tables through the best indexed keys and providing SQL host variable values that give the database the best information to quickly do an index retrieval of your data. It is amazing to find shops still searching for the silver bullet technology that will resolve all their performance issues. Regardless of the technology or database platform, statistical analysis is paramount for performance.
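
The point about indexed keys and host variable values can be checked rather than assumed. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in for a production database, with a bound parameter playing the role of a host variable. The `orders` table, the `idx_orders_customer` index, and the data are all made up for illustration; on DB2 you would use that platform's own explain facilities instead.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for a real database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 50, i * 1.5) for i in range(1000)],
)

# The "?" marker is bound at execution time, much like a host variable.
query = "SELECT id, total FROM orders WHERE customer_id = ?"

# Ask the database how it intends to run the query before trusting it.
plan_rows = conn.execute("EXPLAIN QUERY PLAN " + query, (7,)).fetchall()
plan = " ".join(row[3] for row in plan_rows)  # fourth column is the plan detail text
print(plan)

rows = conn.execute(query, (7,)).fetchall()
print(len(rows))
```

If the plan text reports a table scan instead of naming the index, that is the kind of gap between what developers believe the code does and what actually executes.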

Dave Beulke is a system strategist, application architect, and performance expert specializing in Big Data, data warehouses, and high performance internet business solutions. He is an IBM Gold Consultant, Information Champion, President of DAMA-NCR, former President of International DB2 User Group, and frequent speaker at national and international conferences. His architectures, designs, and performance tuning techniques help organizations better leverage their information assets, saving millions in processing costs. Follow him on Twitter or connect through LinkedIn.
