network observability

Comparing New Relic’s New AI-Driven Digital Experience Monitoring Solution with Datadog

In the ever-evolving landscape of digital experience monitoring, two prominent players have emerged with innovative solutions: New Relic and Datadog. Both companies aim to enhance user experiences and optimize digital interactions, but they approach the challenge with different strategies and technologies. Let’s dive into what sets them apart.

New Relic’s AI-Driven Digital Experience Monitoring Solution

New Relic recently launched its fully-integrated, AI-driven Digital Experience Monitoring (DEM) solution, which promises to revolutionize how businesses monitor and improve their digital interactions. Here are some key features:

1. AI Integration: New Relic’s solution leverages artificial intelligence to provide real-time insights into user interactions across all applications, including AI applications. This helps identify incorrect AI responses and user friction points, ensuring a seamless user experience.
2. Comprehensive Monitoring: The platform offers end-to-end visibility, allowing businesses to monitor real user interactions and proactively resolve issues before they impact the end user.
3. User Behavior Analytics: By combining website performance monitoring, user behavior analytics, real user monitoring (RUM), session replay, and synthetic monitoring, New Relic provides a holistic view of the digital experience.
4. Proactive Issue Resolution: Real-time data on application performance and user interactions enable proactive identification and resolution of issues, moving from a reactive to a proactive approach.

Datadog’s Offerings

Datadog focuses on providing comprehensive monitoring solutions for infrastructure, applications, logs, and more. Here are some highlights:

1. Unified Monitoring: Datadog offers a unified platform that aggregates metrics and events across the entire DevOps stack, providing visibility into servers, clouds, applications, and more.
2. End-to-End User Experience Monitoring: Datadog provides tools for monitoring critical user journeys, capturing user interactions, and detecting performance issues with AI-powered, self-maintaining tests.
3. Scalability and Performance: Datadog’s solutions are designed to handle large-scale applications with high performance and low latency, ensuring that backend systems can support seamless digital experiences.
4. Security and Compliance: With enterprise-grade security features and compliance with industry standards, Datadog ensures that data is protected and managed securely.

Key Differences

While both New Relic and Datadog aim to enhance digital experiences, their approaches and focus areas differ significantly:

• Focus Area: New Relic is primarily focused on monitoring and improving the front-end user experience, while Datadog provides comprehensive monitoring across the entire stack, including infrastructure and applications.

• Technology: New Relic leverages AI to provide real-time insights and proactive issue resolution, whereas Datadog focuses on providing scalable and secure monitoring solutions.

• Integration: New Relic’s solution integrates various monitoring tools to provide a comprehensive view of the digital experience, while Datadog offers a unified platform that aggregates metrics and events across the full DevOps stack.

Conclusion

Both New Relic and Datadog offer valuable solutions for enhancing digital experiences, but they cater to different aspects of the digital ecosystem. New Relic’s AI-driven DEM solution is ideal for businesses looking to proactively monitor and improve user interactions, while Datadog’s robust monitoring offerings provide comprehensive visibility across infrastructure and applications. By leveraging the strengths of both platforms, businesses can ensure a seamless and optimized digital presence.

What do you think about these new offerings? Do you have a preference for one over the other?

Network Monitoring for Cloud-Connected IoT Devices

One of the emerging trends in network monitoring is the integration of cloud computing and Internet of Things (IoT) devices. Cloud computing refers to the delivery of computing services over the internet, such as storage, processing, and software. IoT devices are physical objects that are connected to the internet and can communicate with other devices or systems. Examples of IoT devices include smart thermostats, wearable devices, and industrial sensors.

Cloud-connected IoT devices pose new challenges and opportunities for network monitoring. On one hand, cloud computing enables IoT devices to access scalable and flexible resources and services, such as data analytics and artificial intelligence. On the other hand, cloud computing introduces additional complexity and risk to the network, such as latency, bandwidth consumption, and security threats.

Therefore, network monitoring for cloud-connected IoT devices requires a comprehensive and proactive approach that can address the following aspects:

  • Visibility: Network monitoring should provide a clear and complete view of the network topology, status, and performance of all the devices and services involved in the cloud-IoT ecosystem. This includes not only the physical devices and connections, but also the virtual machines, containers, and microservices that run on the cloud platform. Network monitoring should also be able to detect and identify any anomalies or issues that may affect the network functionality or quality.
  • Scalability: Network monitoring should be able to handle the large volume and variety of data generated by cloud-connected IoT devices. This requires a scalable and distributed architecture that can collect, store, process, and analyze data from different sources and locations. Network monitoring should also leverage cloud-based technologies, such as big data analytics and machine learning, to extract meaningful insights and patterns from the data.
  • Security: Network monitoring should ensure the security and privacy of the network and its data. This involves implementing appropriate encryption, authentication, authorization, and auditing mechanisms to protect the data in transit and at rest. Network monitoring should also monitor and alert on any potential or actual security breaches or attacks that may compromise the network or its data.
  • Automation: Network monitoring should automate as much as possible the tasks and processes involved in network management. This includes using automation tools and scripts to configure, deploy, update, and troubleshoot network devices and services. Network monitoring should also use automation techniques, such as artificial intelligence and machine learning, to perform predictive analysis, anomaly detection, root cause analysis, and remediation actions.
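As a minimal sketch of the anomaly detection mentioned under Automation, the following Python flags latency samples that deviate strongly from a rolling baseline. The sample readings, the window size, and the threshold of three standard deviations are assumptions chosen for illustration, not values taken from any specific monitoring product.

```python
from statistics import mean, stdev

def find_anomalies(samples, window=5, threshold=3.0):
    """Return indices of samples that deviate from the preceding window.

    A sample is flagged when it lies more than `threshold` standard
    deviations away from the mean of the previous `window` samples.
    """
    anomalies = []
    for i in range(window, len(samples)):
        baseline = samples[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # Flat baseline: any change at all counts as a deviation.
            if samples[i] != mu:
                anomalies.append(i)
            continue
        if abs(samples[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies

# Hypothetical round-trip latencies (ms) reported by one IoT sensor.
latency_ms = [20, 21, 19, 20, 22, 21, 20, 95, 21, 20]
print(find_anomalies(latency_ms))
```

A rolling baseline is used here because IoT traffic patterns drift over time; a production system would more likely rely on seasonal models or learned baselines.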

Solutions for Network Monitoring for Cloud-Connected IoT Devices

There are many solutions available for network monitoring for cloud-connected IoT devices. Some of them are native to cloud platforms or specific IoT platforms, while others are third-party or open-source solutions. Some of them are specialized for certain aspects or layers of network monitoring, while others are comprehensive or integrated solutions. Some of them are:

  • Domotz: Domotz is a cloud-based network and endpoint monitoring platform that also provides system management functions. This service is capable of monitoring security cameras as well as network devices and endpoints. Domotz can monitor cloud-connected IoT devices using SNMP or TCP protocols. It can also integrate with various cloud platforms such as AWS, Azure, and GCP.
  • Splunk Industrial for IoT: Splunk Industrial for IoT is a solution that provides end-to-end visibility into industrial IoT systems. Splunk Industrial for IoT can collect and analyze data from various sources such as sensors, gateways, and cloud services. Splunk Industrial for IoT can also provide dashboards, alerts, and insights into the performance, health, and security of cloud-connected IoT devices.
  • Datadog IoT Monitoring: Datadog IoT Monitoring is a solution that provides comprehensive observability for cloud-connected IoT devices. Datadog IoT Monitoring can collect and correlate metrics, logs, traces, and events from various sources such as sensors, gateways, and cloud services. Datadog IoT Monitoring can also provide dashboards, alerts, and insights into the performance, health, and security of cloud-connected IoT devices.
  • Senseye PdM: Senseye PdM is a solution that provides predictive maintenance for industrial IoT systems. Senseye PdM can collect and analyze data from various sources such as sensors, gateways, and cloud services. Senseye PdM can also provide dashboards, alerts, and insights into the condition, performance, and reliability of cloud-connected IoT devices.
  • SkySpark: SkySpark is a solution that provides analytics and automation for smart systems. SkySpark can collect and analyze data from various sources such as sensors, gateways, and cloud services. SkySpark can also provide dashboards, alerts, and insights into the performance, efficiency, and optimization of cloud-connected IoT devices.

Network monitoring for cloud-connected IoT devices is a vital and challenging task that requires a holistic and adaptive approach. Network monitoring can help to optimize the performance, reliability, and security of the network and its components. Network monitoring can also enable new capabilities and benefits for cloud-IoT applications, such as enhanced user experience, improved operational efficiency, and reduced costs.

Cloud Native Security: Cloud Native Application Protection Platforms

Back in 2022, 77% of interviewed CIOs stated that their IT environment was constantly changing. We can only guess that if the respondents were asked today, this number would be as high as 90% or more. Detecting flaws and security vulnerabilities is becoming more and more challenging in 2023, since the complexity of a typical software deployment is increasing exponentially from year to year. The relatively new category of Cloud Native Application Protection Platforms (CNAPP) is now supported by the majority of cybersecurity companies, which offer their CNAPP solutions for cloud and on-prem deployments.

CNAPP’s rapid growth is driven by cybersecurity threats, and misconfiguration is one of the most frequently reported causes of security breaches and data loss. As workloads and data move to the cloud, the skill sets required of IT and DevOps teams must also become much more specialized. The likelihood of an unintentional misconfiguration increases because the majority of seasoned IT workers still have more expertise and training on-prem than in the cloud. In contrast, a young “cloud-native” DevOps professional often has very little knowledge of “traditional” security such as network segmentation or firewall configuration, which will typically result in configuration errors.

Some CNAPPs are proud to be “agentless”, eliminating the need to install and manage agents that can cause various issues, from machine overload to agent vulnerabilities due to security flaws and, guess what, due to the agent’s misconfiguration. Agentless monitoring has its benefits, but it is not free of risks. Any monitored device should be “open” for such monitoring, typically coming from a remote server. If an adversary is able to fake a monitoring attempt, they can easily get access to all the monitored devices and compromise the entire network. So “agentless CNAPP” does not automatically mean a better solution than a competing security platform. Easier for IT staff to maintain? Yes, it is. Is it more secure? Probably not.

Predictive Networks: is it the Future?

Post-chatGPT Update as of May 26th, 2023:
Cisco and their EVP Liz Centoni have probably never been so wrong before in their useless predictions!

“Predictive Network” is a cool term, but it comes down to some things that Cisco EVP Liz Centoni does not consider cool or trending anymore: Artificial Intelligence (AI) and Machine Learning (ML), which collect and analyze millions of network events and deliver problem-solving solutions. AI-based Predictive Networks, which by the way are one of Liz’s 2023 “trends” predictions, contradict her statement that

The cloud and AI are no longer frontiers

Obviously, Cisco’s EVP and Chief Strategy Officer Centoni refers to Cisco’s own Predictive Network product which, quoting Cisco now

 rely on a predictive engine in charge of computing (statistical, machine learning) models of the network using several telemetry sources

So how exactly is AI “no longer the frontier”, Liz, if machine learning powers the Predictive Networks that you predict will become a 2023 trend?

Full Stack IT Observability Will Drive Business Performance in 2023

Cisco predicts that 2023 will be shaped by a few exciting trends in technology, including network observability with business correlation. Cisco’s EVP & Chief Strategy Officer Liz Centoni is sure that

To survive and thrive, companies need to be able to tie data insights derived from normal IT operations directly to business outcomes or risk being overtaken by more innovative competitors

and we cannot agree more.

Proper intelligent monitoring of digital assets, along with distributed tracing, should be tightly connected to the business context of the enterprise. This way, any organization can benefit from actionable business insights while improving the online and digital user experience for customers, employees, and contractors. Additionally, a fast IT response based on AI analysis of monitored and collected network and asset events can prevent, or at least quickly remediate, the most common security threat that exists in nearly any modern digital organization: misconfiguration. 79% of firms have already experienced a data breach in the past 2 years, and 67% of them pointed to security misconfiguration as the main reason.

Misconfiguration of most software products can be detected and fixed in a timely manner through data collection and machine learning applied to the network events and configuration files analyzed by network observability and network monitoring tools. An enterprise should require its IT department to reach full stack observability and connect the results with the business context. This is particularly important since we know that 99% of cloud security failures are customers’ mistakes (source: Gartner). Business context should be widely adopted as a part of the results delivered by intelligent observability and cybersecurity solutions.
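To make the idea of analyzing configuration files concrete, here is a minimal sketch of a misconfiguration audit. It is a simple rule-based check rather than the machine learning approach described above, and the setting names, the rules, and the example resource are all hypothetical.

```python
# Hypothetical rules: each maps a setting name to a predicate that
# returns True when the configured value is considered risky.
RISKY_SETTINGS = {
    "public_access": lambda value: value is True,
    "encryption_at_rest": lambda value: value is False,
    "allowed_cidr": lambda value: value == "0.0.0.0/0",
}

def audit_config(resource_name, config):
    """Return a list of human-readable findings for one resource."""
    findings = []
    for key, is_risky in RISKY_SETTINGS.items():
        # Only evaluate settings that are actually present in the config.
        if key in config and is_risky(config[key]):
            findings.append(
                f"{resource_name}: risky value for '{key}' ({config[key]!r})"
            )
    return findings

# Hypothetical configuration of a cloud storage bucket.
storage_bucket = {
    "public_access": True,
    "encryption_at_rest": True,
    "allowed_cidr": "10.0.0.0/8",
}
for finding in audit_config("customer-data-bucket", storage_bucket):
    print(finding)
```

In practice, findings like these would be enriched with business context, for example which customer-facing service depends on the affected resource.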

Observability and Protection for Cloud Native Applications

Banks and other financial institutions are moving to the cloud. It is a slow process, but the trend is here. Cloud computing business models give financial organizations the flexibility to deploy pay-as-you-go cloud services. Furthermore, the cloud comes with built-in scalability, so businesses can react to market changes quickly. Pay-as-you-go infrastructure drastically reduces costs for banks and financial services institutions (BFSI), but then other questions arise. The first of these questions would be “is it secure to move my data and services to the cloud?”. Here network observability and AI-based network monitoring come to help, particularly because financial institutions need to be compliant with regulations such as PIPEDA.

A MarketsandMarkets report predicts that the market for cloud-native protection platforms will reach $19.3 billion by 2027. This is more than double the $7.8 billion that the market research firm estimated for 2022. BFSI and other enterprises are moving to the cloud. This requires intelligent network observability and security solutions based on artificial intelligence and machine learning, and thus such rapid market growth, at a 19.9% CAGR in 2022-2027, seems to be a very reasonable assumption. Today, AI-based observability and security solutions analyze hundreds of thousands of events a day. We should expect that the next generation of these software solutions will create and analyze a few orders of magnitude more events daily, scaling up to tens to hundreds of millions of events a day for an average cloud-based BFSI organization. The report names a few market leaders, among them Check Point (Israel), Trend Micro (Japan), Palo Alto Networks (US), CrowdStrike (US), Fortinet (US), Forcepoint (US), Proofpoint (US), Radware (Israel), and Zscaler (US).
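The growth figures quoted above can be sanity-checked with the standard compound annual growth rate formula. The short Python sketch below assumes a five-year span (2022 to 2027) and uses only the numbers cited in the text; the function names are illustrative.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate implied by a start and end value."""
    return (end_value / start_value) ** (1 / years) - 1

def project(start_value, rate, years):
    """Value after compounding `rate` annually for `years` years."""
    return start_value * (1 + rate) ** years

# Market size in billions of USD, as quoted: 7.8 (2022) to 19.3 (2027).
implied_rate = cagr(7.8, 19.3, 5)
print(f"Implied CAGR: {implied_rate:.1%}")

# Going the other way: compound 7.8 at the quoted 19.9% for five years.
print(f"Projected 2027 market: ${project(7.8, 0.199, 5):.1f}B")
```

Both directions agree with the quoted figures to within rounding, so the three numbers in the report are internally consistent.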

Cloud Monitoring Market Size Estimations

According to a market study, the global IT infrastructure monitoring market is expected to grow at a 13.6% CAGR, reaching USD $64.5 in 2031. Modern IT infrastructure is becoming increasingly complex and requires new skills from IT personnel, often blurring the borders between IT staff, DevOps, and development teams. With the continued move from on-prem deployments to the enterprise cloud, IT infrastructure goes to the cloud as well, and thus IT teams have to learn basic cloud-DevOps skills, such as scripting, cloud-based scaling, event creation, and monitoring. Furthermore, no company today offers a complete monitoring solution that can monitor any network device and software component.

Thus, IT teams have to build their monitoring solutions piece by piece, using various systems that are mostly not interconnected and are developed by different, often competing vendors. For some organizations, it also comes down to compliance, such as GDPR or ISO requirements, and to SLAs that obligate the IT department to detect, report, and fix any issue with their systems in a timely manner. In this challenging multi-system and multi-device environment, network observability becomes the key to enterprise success. IT organizations keep increasing their budgets, seeking comprehensive cloud and on-prem monitoring for their systems and devices, and force employees to run network and device monitoring software on their personal devices, such as mobile phones and laptops. This trend also increases IT spending on cybersecurity solutions such as SDR and network security analysis with various SIEM tools.

Strategies to Combat Emerging Gaps in Cloud Security

As cloud customers enter 2023 with a hybrid presence in multiple clouds, they are prioritizing strategies to combat emerging gaps in cloud security.

Most large enterprises are consuming cloud services across numerous public clouds while maintaining enterprise systems and private clouds in their own data centers.

One of the ways of closing these gaps in security could be adopting deep observability. We have already reviewed a few Deep Observability providers such as Gigamon. While Gigamon probably can be considered a current market leader in this relatively new and small market with under $2B annual market size, they still should watch out for the newcomers who come with shiny new products and great technologies under the hood.

CtrlStack is one of these startups and they recently got a second round of funding from Lightspeed VC, led by Kearny Jackson and Webb Investment Network.

The delivery of features and applications by today’s digital-first companies and developers is accelerating. Teams from information technology operations and software development must collaborate closely to do this, forming a practice known as DevOps. When events occur, they may involve any number of digital environment systems, including operations, infrastructure, code, or any combination of modifications made to any of them.

The CtrlStack platform connects cause and effect to make troubleshooting easier and incident root cause analysis faster by tracking relationships between components in a customer’s systems. By giving DevOps teams the tools they need, it helps developers and engineers solve problems quickly.

By forming a knowledge graph of all the infrastructure, interconnected services, and their impact, CtrlStack can supply the full picture while capturing the changes and relationships throughout the whole system stack. Using the CtrlStack product, DevOps teams can view dependencies, measure the impact of changes, and examine events in real time.
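The idea of measuring the impact of a change through a dependency graph can be sketched in a few lines of Python. This is an illustration of the general technique only, not CtrlStack’s implementation; the component names and the graph itself are hypothetical.

```python
from collections import deque

# Hypothetical dependency edges: each key depends on the listed components.
DEPENDS_ON = {
    "web-frontend": ["checkout-api"],
    "checkout-api": ["payments-service", "postgres"],
    "payments-service": ["postgres", "feature-flags"],
    "postgres": [],
    "feature-flags": [],
}

def impacted_by(changed, depends_on):
    """Return every component that directly or transitively depends on `changed`."""
    # Invert the edges so we can walk from a component to its dependents.
    dependents = {}
    for component, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(component)

    # Breadth-first search over the inverted graph.
    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, []):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return sorted(seen)

# Which components could be affected by a change to the feature flags?
print(impacted_by("feature-flags", DEPENDS_ON))
```

A real platform would also attach timestamps and change events to each node, so that an incident can be traced back to the specific change that preceded it.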

Key capabilities of the platform include an event timeline that lets teams browse and filter change events without having to sift through log files or poll users, and a visual representation that offers insights into operational data. Both of those capabilities also power dashboards for developers and DevOps teams.

Developers can also access dashboards that give visibility into any changes to code commits, configuration files, or feature flags, all in one click. DevOps teams get a dashboard for root cause analysis that lets them capture all of the context at the moment incidents occurred, with a searchable timeline of dependencies displaying the whole impacted topology and impacted metrics.

Deep Observability and Zero Trust

Zero trust architecture has established itself as a highly recognized method of safeguarding both on-premises systems and the cloud in response to the exponential rise in ransomware and other cyber threats. For example, although only 51% of EMEA IT and security professionals said they were confident implementing zero trust in 2019, that percentage increased noticeably to 83% in 2022.

Put simply, a zero trust architecture eliminates the implicit trust that is placed in internal network traffic, people, or devices. Businesses can increase both productivity and security with this defense-in-depth approach to security.

For businesses, implicit confidence in the technology stack can be a major problem. IT teams frequently struggle to put the right trust controls in place because they typically assume that the company owns the system, that all users are employees, or that the network was previously safe. These trust indicators, however, are insufficient. Organizations are becoming more exposed to risk as a result of trust built on assumptions. These careless measurements of trust can be utilized by threat actors against a company to facilitate network intrusion and data breaches.

A zero trust framework gets rid of any implicit trust and instead determines whether a company should grant access in each specific situation. It is more crucial now that bring-your-own-device (BYOD) initiatives have become so popular due to the rise of remote and hybrid working.

Deep observability is the addition of real-time network-level intelligence to metric, event, log, and trace-based monitoring and observability tools in order to increase their effectiveness and reduce risk. It brings more insight to strengthen a company’s security posture, since deep observability enables security professionals to examine the metadata that threat actors leave behind after evading endpoint detection and response systems or SIEMs. It is therefore essential for supporting a thorough zero trust strategy.

In the end, zero trust’s primary objective is to identify and categorize all network-connected devices, not only those that have endpoint agents installed and functioning, and to tightly enforce a least-privilege access policy based on a detailed analysis of the device. This cannot be done for devices or users that you cannot access.
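The deny-by-default, least-privilege decision described above can be illustrated with a small Python sketch. The roles, resources, and device attributes are hypothetical, and a real zero trust policy engine would evaluate far more signals than these.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AccessRequest:
    user_role: str
    device_identified: bool  # device has been discovered and categorized
    device_compliant: bool   # device passed the (assumed) posture checks
    resource: str

# Hypothetical least-privilege policy: role -> resources it may reach.
ALLOWED_RESOURCES = {
    "billing-analyst": {"billing-db"},
    "sre": {"billing-db", "prod-cluster"},
}

def is_allowed(request):
    """Evaluate one request with no implicit trust: deny unless every check passes."""
    # A device that was never identified cannot be evaluated, so it is denied.
    if not request.device_identified:
        return False
    if not request.device_compliant:
        return False
    # Unknown roles get an empty set, which denies everything.
    return request.resource in ALLOWED_RESOURCES.get(request.user_role, set())

request = AccessRequest("billing-analyst", True, True, "prod-cluster")
print(is_allowed(request))
```

Note that being on the internal network never appears as an input: every request is evaluated on identity, device state, and the specific resource requested.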

Metrist raises $5.5M for eBPF-based cloud monitoring

Metrist, a startup with DevOps roots, has raised $5.5M to help companies deal with cloud service outages. Metrist was founded by two DevOps veterans, Jeff Martens and Ryan Duffield, whose past experience includes working for New Relic, PagerDuty, and similar observability and monitoring companies.

Metrist Founders, Image Credit: Metrist

Metrist’s idea is not very original: negotiate outages that vendors’ SLAs do not cover. Surprisingly, there are not too many competitors in this area. Some competition for Metrist’s business comes from Parametric Insurance, which sells insurance policies that include cloud and CDN outages.

In contrast to selling insurance, Metrist is willing to play the role of the trusted arbiter in negotiating outage outcomes with vendors and the affected company.

One of the interesting parts of this story is that, according to a TechCrunch report, the Metrist team plans to run an eBPF agent to gather data about the services a customer runs. There are a few issues associated with this technical approach:

  1. Metrist is going to miss all container deployments, e.g. ECS on AWS or any Kubernetes and Docker infrastructure. This is quite a big part of cloud infrastructure that Metrist won’t be able to observe with eBPF-based agents.
  2. On top of that, eBPF cannot see into serverless deployments, e.g. AWS Lambda functions. This further reduces the world of apps that Metrist can monitor.
  3. And there is a third factor that limits Metrist’s scale-up: most enterprises become very suspicious once they are asked to run yet another agent on their cloud VM or a bare metal machine. While companies like PagerDuty or New Relic have already overcome this psychological barrier by being on the market for long enough, it could still be a showstopper for a young startup that needs to prove itself to its customers.

Having said this, we wish the Metrist team all the success.