From logs to insights: AI breakthroughs redefined

by SkillAiNest

Presented by Elastic


Logs are set to become the primary tool for finding the “why” when diagnosing network incidents.

Modern IT environments have a data problem: there’s too much of it. Organizations that manage enterprise environments are increasingly challenged to detect and diagnose problems in real time, optimize performance, improve reliability, and ensure security and compliance.

The modern observability landscape has many tools that offer solutions. Most revolve around DevOps teams or site reliability engineers (SREs) uncovering patterns, figuring out what’s going on across the network, and working out why a problem or incident occurred. The trouble is that this process creates information overload: a single Kubernetes cluster can emit 30 to 50 gigabytes of logs a day, and suspicious behavior patterns can sneak past human eyes.

"In the world of AI, it’s so anachronistic now to think of humans observing the infrastructure," says Ken Exner, chief product officer at Elastic. "I hate to break it to you, but machines are better than humans at pattern matching."

An industry-wide focus on looking at symptoms forces engineers to search for answers manually. The crucial "why" is buried in logs, but because logs contain large amounts of unstructured data, the industry uses them as a tool of last resort. This forces teams into costly trade-offs: spend countless hours building complex data pipelines, drop log data and risk significant visibility gaps, or log and forget.

Elastic, the Search AI Company, recently released a new observability feature called Streams, which aims to make logs the primary signal for investigations by taking noisy logs and turning them into patterns, context and meaning.

Streams uses AI to automatically segment and parse logs and extract relevant fields, greatly reducing the effort required of SREs to make logs usable. Streams also automatically surfaces significant events, such as critical errors and anomalies, from context-rich logs, giving SREs early warning and a clear understanding of their workloads so they can investigate and resolve issues quickly. The ultimate goal is to show corrective actions.
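As a rough illustration of what turning raw log lines into usable fields and patterns involves, here is a minimal Python sketch. It is not how Streams works internally, which the article does not describe; the log format, the regular expression, and the names parse_line, to_pattern and summarize are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical log format, assumed only for this illustration:
# "<timestamp> <LEVEL> <service>: <free-text message>"
LINE_RE = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<level>[A-Z]+)\s+(?P<service>[\w.-]+):\s+(?P<message>.*)$"
)


def parse_line(line):
    """Return a dict of extracted fields, or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def to_pattern(message):
    """Mask variable parts (hex ids, numbers) so similar messages group together."""
    message = re.sub(r"0x[0-9a-fA-F]+", "<HEX>", message)
    return re.sub(r"\d+", "<NUM>", message)


def summarize(lines):
    """Count message patterns and collect records logged at ERROR or CRITICAL."""
    patterns = Counter()
    critical = []
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue  # unparseable lines are simply skipped in this sketch
        patterns[to_pattern(record["message"])] += 1
        if record["level"] in {"ERROR", "CRITICAL"}:
            critical.append(record)
    return patterns, critical


if __name__ == "__main__":
    sample = [
        "2025-01-01T00:00:01Z INFO checkout: request 101 served in 12 ms",
        "2025-01-01T00:00:02Z INFO checkout: request 102 served in 15 ms",
        "2025-01-01T00:00:03Z ERROR checkout: connection to db-3 timed out",
    ]
    patterns, critical = summarize(sample)
    for pattern, count in patterns.most_common():
        print(count, pattern)
    for record in critical:
        print("critical:", record["service"], record["message"])
```

Real log formats vary far more than this single regular expression allows, which is part of why hand-built parsing is so much work for SREs.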

"From raw, big, messy data, Streams automatically generates structure, and puts it into a form that’s usable, automatically alerts you to problems and helps you fix them." Exner says. "This is the magic of rivers."

A broken workflow

Streams upends an observability process that some say is broken. Typically, SREs set up metrics, logs, and traces. They then set up alerts and service level objectives (SLOs), often hard-coded rules, to indicate where a service or process has gone out of bounds or a particular pattern has been detected.
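A hard-coded rule of the kind described above might look like the following Python sketch. The metric names, the limits, and the ThresholdRule and slo_met helpers are hypothetical, chosen only to show a static threshold and a simple SLO success-ratio check.

```python
from dataclasses import dataclass


@dataclass
class ThresholdRule:
    """A static alert rule: fire when a metric exceeds a fixed limit."""
    metric: str
    limit: float

    def fires(self, sample: dict) -> bool:
        # A metric missing from the sample never fires in this sketch.
        return sample.get(self.metric, 0.0) > self.limit


def slo_met(good_events: int, total_events: int, target: float) -> bool:
    """True when the observed success ratio is at or above the SLO target."""
    if total_events == 0:
        return True  # no traffic, so nothing was violated
    return good_events / total_events >= target


if __name__ == "__main__":
    # Assumed example values, not taken from any real system.
    rules = [
        ThresholdRule("cpu_percent", 90.0),
        ThresholdRule("p99_latency_ms", 500.0),
    ]
    sample = {"cpu_percent": 95.2, "p99_latency_ms": 310.0}
    for rule in rules:
        if rule.fires(sample):
            print(f"ALERT: {rule.metric} above {rule.limit}")
    print("SLO met:", slo_met(good_events=9_995, total_events=10_000, target=0.999))
```

Rules like these only say that a limit was crossed; they carry no information about why, which is where the rest of the workflow begins.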

When an alert is triggered, it points to the metric that is exhibiting an anomaly. From there, SREs turn to a metrics dashboard, where they can visualize the issue and compare the alert to other metrics, or CPU to I/O to memory, and start looking for patterns.
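The manual comparison step can be approximated in code by ranking other metrics by how closely they move with the alerting one. The sketch below uses a plain Pearson correlation; the metric names and sample values are invented for illustration, and a high correlation is only a hint about where to look, not a root cause.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series carries no co-movement signal
    return cov / sqrt(var_x * var_y)


def rank_related(alerting, others):
    """Sort other metrics by absolute correlation with the alerting series."""
    scores = {name: pearson(alerting, series) for name, series in others.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


if __name__ == "__main__":
    latency_ms = [100, 120, 180, 260, 400]  # the metric that triggered the alert
    others = {
        "cpu_percent": [40, 45, 60, 75, 95],
        "memory_percent": [70, 70, 70, 70, 70],
        "io_wait_percent": [30, 28, 31, 29, 30],
    }
    for name, score in rank_related(latency_ms, others):
        print(f"{name}: {score:+.2f}")
```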

They may then need to look at a trace, and examine upstream and downstream dependencies in the application to dig down to the root cause of the problem. Once they figure out what’s causing the problem, they jump into the logs for that database or service to try and debug the problem.

Some companies simply try to add more tools when existing ones prove ineffective. That leaves SREs hopping from tool to tool to monitor and diagnose problems in their infrastructure and applications.

"You’re hopping across different tools. You’re relying on a human to interpret these things, visually see the relationships between systems in a service map, visually see graphs on a metrics dashboard, to figure out what and where the problem is," Exner says. "But AI automates this workflow."

With AI-powered Streams, logs aren’t just used reactively to resolve issues, but also proactively, to process potential issues and create information-rich alerts that help teams jump right into fixing the problem, or even to fix it outright before automatically notifying the team that it’s taken care of.

"I believe that logs, the richest set of information, the original signal type, will start driving a lot of the automation that a service reliability engineer typically does today, and does very manually," he added. "A human shouldn’t be in that process, where they’re digging in themselves, trying to figure out what’s going on, where and what the issue is, and then once they find the root cause, they’re trying to figure out how to debug it."

The future of observability

Large language models (LLMs) may be a key player in the future of observability. LLMs excel at recognizing patterns in large amounts of repetitive data, which closely resembles log and telemetry data in complex, dynamic systems. And today’s LLMs can be trained for specific IT processes. Combined with automation tooling, an LLM has the knowledge and tools needed to troubleshoot database errors, Java heap issues and more, particularly when built into platforms that bring context and relevance.

Automated remediation will still take some time, Exner says, but automated runbooks and playbooks generated by LLMs will become standard practice within the next two years. In other words, remediation steps will be driven by LLMs. Instead of an expert being called in, the LLM will offer fixes, and a human will verify and implement them.

Addressing skill shortages

Going all-in on AI for observability will help address a major shortage of the talent needed to manage IT infrastructure. Hiring is slow because organizations need teams with a lot of experience and an understanding of potential issues and how to resolve them quickly. Exner says that experience can come from an LLM that is contextually grounded.

"We can help address the skills gap by augmenting people with LLMs that make them all instant experts," he explains. "I think it makes it much easier for us to take novice practitioners and turn them into expert practitioners in security and observability, and it will make it possible for a more novice practitioner to act like an expert."

Streams is now available in Elastic Observability. Get started by reading more on Streams.


Sponsored articles are content produced by a company that is either paying for the post or has a business relationship with VentureBeat, and they are always clearly marked. For more information, contact sales@venturebeat.com.
