App usage has skyrocketed over the last ten years, for consumers and businesses alike. Not only do people interact with apps more frequently in every aspect of their day-to-day lives, but they also use a far broader range of apps to serve different functions. For individuals, this does not cause too much of an issue: managing applications is usually as simple as adding them to or removing them from a desktop or smart device. But for organisations handling a much greater number of users and far larger amounts of data, managing applications is far more complex.
To counteract these difficulties, many enterprises are turning to application performance monitoring (APM) platforms to help troubleshoot and optimise their infrastructure. But the reality is that many APM tools are overwhelmed by the complexity of the issue, and unable to provide organisations with the actionable insights they need.
Orchestrating enterprise applications
Modern-day enterprise applications are often composable — that is, they’re built from individual containers and co-ordinated by platforms across local and cloud environments. Individual containers and the servers they run on can be spun up or down elastically as needed. This provides great flexibility and efficient use of resources, but it also makes for applications that are incredibly complex, with thousands of individual components.
In practice, this means that what looks to the user like a single transaction (paying for an item in a shopping cart, for instance) can actually involve millions of nested method calls behind the scenes. Something might go wrong in any of the components involved in executing that transaction, so when you're troubleshooting, you need to be able to see into every one of those nodes, because the problem could be anywhere.
As a consequence, APM tools often struggle to handle the 'three Vs':
● The volume of data involved
● The velocity at which it flows through the application
● The variance of metadata in individual transactions
To revisit the shopping cart example, the items in the cart and the total amount the customer would pay for them constitute metadata. If there were a problem with the transaction, an organisation would need to know whether it was related to a £10 cart or a £10,000 cart!
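To make the metadata point concrete, here is a minimal sketch in Python. The `Transaction` record, its field names and the `cart_total_gbp` key are all hypothetical, invented for illustration and not taken from any particular APM product. The sketch shows how keeping metadata alongside each transaction lets you separate a failed £10 cart from a failed £10,000 cart.

```python
from dataclasses import dataclass, field


@dataclass
class Transaction:
    """Hypothetical record of one captured transaction."""
    name: str
    duration_ms: float
    failed: bool
    # Business metadata attached to the transaction (assumed key names).
    metadata: dict = field(default_factory=dict)


def failed_carts_over(transactions, threshold_gbp):
    """Return failed transactions whose cart total is at or above the threshold."""
    return [
        t for t in transactions
        if t.failed and t.metadata.get("cart_total_gbp", 0) >= threshold_gbp
    ]


transactions = [
    Transaction("checkout", 120.0, False, {"cart_total_gbp": 10.0, "items": 1}),
    Transaction("checkout", 4500.0, True, {"cart_total_gbp": 10_000.0, "items": 12}),
    Transaction("checkout", 3900.0, True, {"cart_total_gbp": 10.0, "items": 1}),
]

high_value_failures = failed_carts_over(transactions, 1_000)
print(len(high_value_failures))  # 1
```

Without the metadata, the two failed checkouts would look much the same; with it, the high-value failure can be found and prioritised first.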
An unfinished symphony
There are several commonly used tricks to circumvent the limitations of these APM tools. One is to instrument only parts of the application, because the platform is unable to scale to tens of thousands of components. Another is to capture data based on triggers, monitoring components only if a trigger condition is met.
But the problem with these techniques is that they produce an incomplete picture. Because only a sample of transactions is captured, a crucial problem with application performance might be missed. If components are instrumented selectively, the deep layers of the application where the root problems lie are often neglected, and discovering the underlying cause can take time and effort. This leads to a constant trade-off between data quality and scale. And deciding which components are relevant to a specific issue requires a very intimate knowledge of the code.
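The blind spot created by trigger-based capture can be sketched in a few lines of Python. This is an illustrative toy, not how any specific APM tool works: the `Transaction` record, the `capture_on_trigger` function and the 1,000 ms threshold are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    """Hypothetical record of one transaction."""
    tx_id: int
    duration_ms: float
    failed: bool


def capture_on_trigger(transactions, slow_threshold_ms):
    """Trigger-based capture: keep only transactions slower than the threshold."""
    return [t for t in transactions if t.duration_ms >= slow_threshold_ms]


def capture_all(transactions):
    """Full capture: keep every transaction."""
    return list(transactions)


transactions = [
    Transaction(1, 80.0, False),
    Transaction(2, 2500.0, False),  # slow, so it trips the trigger
    Transaction(3, 95.0, True),     # fast but failed, so the trigger never fires
    Transaction(4, 110.0, False),
]

triggered = capture_on_trigger(transactions, slow_threshold_ms=1000.0)
everything = capture_all(transactions)

# Failures that exist in the full data set but not in the triggered one.
missed_failures = [t.tx_id for t in everything if t.failed and t not in triggered]
print(missed_failures)  # [3]
```

Transaction 3 failed quickly, so a trigger tuned for slowness never recorded it. Any trigger encodes a guess about what a problem will look like, and problems that do not match the guess go unseen.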
Fine-tuning APM
Now here's the good news: this trade-off can be avoided. Best-in-class APM solutions say goodbye to sampling. Instead, they are capable of capturing everything, regardless of the volume, using a clustered architecture that delivers a 10x increase in ability to scale. A single analysis server can support tens of thousands of agents capturing many billions of transactions per day, whilst logging and storing every app transaction, along with system metrics at one-second intervals, with all of the relevant metadata, down to the deepest levels of the call stack.
But what makes running APM at scale truly important is the real-world business benefits it provides:
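To get a feel for what these volumes mean, a little back-of-the-envelope arithmetic helps. The figures below (5 billion transactions per day, 10,000 agents, 20 metrics per agent) are assumptions picked purely for illustration, not numbers from any vendor.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def transactions_per_second(transactions_per_day):
    """Average sustained rate implied by a daily transaction count."""
    return transactions_per_day / SECONDS_PER_DAY


def metric_samples_per_day(agents, metrics_per_agent, interval_seconds=1):
    """Number of metric samples collected per day at a fixed interval."""
    return agents * metrics_per_agent * (SECONDS_PER_DAY // interval_seconds)


# Assumed workload, for illustration only.
rate = transactions_per_second(5_000_000_000)
samples = metric_samples_per_day(agents=10_000, metrics_per_agent=20)

print(f"{rate:,.0f} transactions per second")  # 57,870 transactions per second
print(f"{samples:,} metric samples per day")   # 17,280,000,000 metric samples per day
```

Under these assumptions, capturing everything means sustaining tens of thousands of transactions every second, around the clock, on top of billions of metric samples a day.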
● Alignment with business growth: the ability to monitor keeps pace as an organisation’s infrastructure grows in scale and complexity
● Never missing a performance problem: no more blind spots for troubleshooting or business analysis
● Simplified APM deployment: no need to configure or maintain monitoring infrastructure to focus on hot spots
This results in faster mean time to resolution, proactively resolved production problems, and proper priority given to high-value efforts. Armed with a complete data set free of blind spots, business decision makers are in the best possible position to ensure that mission-critical technology continues to perform at an optimal level as organisations scale. In turn, this allows departments and employees to maximise productivity and keep pace in a challenging and competitive digital landscape.