eventstats - Splunk Documentation (2024)

Description

Generates summary statistics from fields in your events and saves those statistics in a new field.

Only those events that have fields pertinent to the aggregation are used in generating the summary statistics. The generated summary statistics can be used for calculations in subsequent commands in your search. See Usage.

Syntax

The required syntax is in bold.

eventstats
[allnum=<bool>]
<stats-agg-term> ...
[<by-clause>]

Required arguments

<stats-agg-term>
Syntax: <stats-func>( <evaled-field> | <wc-field> ) [AS <wc-field>]
Description: A statistical aggregation function. See Stats function options. The function can be applied to an eval expression, or to a field or set of fields. Use the AS clause to place the result into a new field with a name that you specify. You can use wild card characters in field names.

Optional arguments

allnum
Syntax: allnum=<bool>
Description: If set to true, computes numerical statistics on each field, if and only if ,all of the values of that field are numerical. If you have a BY clause, the allnum argument applies to each group independently.
Default: false
<by-clause>
Syntax: BY <field-list>
Description: The name of one or more fields to group by.

Stats function options

stats-func
Syntax: The syntax depends on the function that you use. Refer to the table below.
Description: Statistical and charting functions that you can use with the eventstats command. Each time you invoke the eventstats command, you can use one or more functions. However, you can only use one BY clause. See Usage.
The following table lists the supported functions by type of function. Use the links in the table to see descriptions and examples for each function. For an overview about using functions with commands, see Statistical and charting functions.
Type of functionSupported functions and syntax
Aggregate functionsavg()

count()
distinct_count()
estdc()
estdc_error()

exactperc<int>()

max()
median()
min()
mode()

perc<int>()

range()
stdev()
stdevp()

sum()

sumsq()
upperperc<int>()
var()
varp()

Event order functionsearliest()
first()
last()
latest()
Multivalue stats and chart functionslist(X)
values(X)

Usage

The eventstats command is a dataset processing command. See Command types.

The eventstats search processor uses a limits.conf file setting named max_mem_usage_mb to limit how much memory the eventstats command can use to keep track of information. When the limit is reached, the eventstats command processor stops adding the requested fields to the search results.

Do not set max_mem_usage_mb=0 as this removes the bounds to the amount of memory the eventstats command processor can use. This can lead to search failures.

Splunk Cloud Platform
To change the max_mem_usage_mb setting, request help from Splunk Support. If you have a support contract, file a new case using the Splunk Support Portal at Support and Services. Otherwise, contact Splunk Customer Support.
Splunk Enterprise
To change the max_mem_usage_mb setting, follow these steps.
Prerequisites
  • Have the permissions to change the max_mem_usage_mb setting. Only users with file system access, such as system administrators, can increase the max_mem_usage_mb setting using configuration files.
  • Know how to edit configuration files. Review the steps in How to edit a configuration file in the Splunk Enterprise Admin Manual.
  • Decide which directory to store configuration file changes in. There can be configuration files with the same name in your default, local, and app directories. See Where you can place (or find) your modified configuration files in the Splunk Enterprise Admin Manual.

Never change or copy the configuration files in the default directory. The files in the default directory must remain intact and in their original location. Make changes to the files in the local directory.

Steps
  1. Open or create a local limits.conf file at $SPLUNK_HOME/etc/system/local.
  2. Under the [default] stanza, look for the max_mem_usage_mb setting.
  3. Under Note, read the information about he eventstats command and how the max_mem_usage_mb and the maxresultrows settings are used to determine the maximum number of results to return.
  4. Change the value for the max_mem_usage_mb setting and if necessary the maxresultrows setting.

Differences between eventstats and stats

The eventstats command is similar to the stats command. You can use both commands to generate aggregations like average, sum, and maximum.

The differences between these commands are described in the following table:

stats commandeventstats command
Events are transformed into a table of aggregated search resultsAggregations are placed into a new field that is added to each of the events in your output
You can only use the fields in your aggregated results in subsequent commands in the searchYou can use the fields in your events in subsequent commands in your search, because the events have not been transformed

How eventstats generates aggregations

The eventstats command looks for events that contain the field that you want to use to generate the aggregation. The command creates a new field in every event and places the aggregation in that field. The aggregation is added to every event, even events that were not used to generate the aggregation.

For example, you have 5 events and 3 of the events have the field you want to aggregate on. the eventstats command generates the aggregation based on the data in the 3 events. A new field is added to every event and the aggregation is added to that field in every event.

Statistical functions that are not applied to specific fields

With the exception of the count function, when you pair the eventstats command with functions that are not applied to specific fields or eval expressions that resolve into fields, the search head processes it as if it were applied to a wildcard for all fields. In other words, when you have | eventstats avg in a search, it returns results for | eventstats avg(*).

This "implicit wildcard" syntax is officially deprecated, however. Make the wildcard explicit. Write | eventstats <function>(*) when you want a function to apply to all possible fields.

Functions and memory usage

Some functions are inherently more expensive, from a memory standpoint, than other functions. For example, the distinct_count function requires far more memory than the count function. The values and list functions also can consume a lot of memory.

If you are using the distinct_count function without a split-by field or with a low-cardinality split-by by field, consider replacing the distinct_count function with the the estdc function (estimated distinct count). The estdc function might result in significantly lower memory usage and run times.

Event order functions

Using the first and last functions when searching based on time does not produce accurate results.

  • To locate the first value based on time order, use the earliest function, instead of the first function.
  • To locate the last value based on time order, use the latest function, instead of the last function.

For example, consider the following search.

index=test sourcetype=testDb| eventstats first(LastPass) as LastPass, last(_time) as mostRecentTestTime BY testCaseId | where startTime==LastPass OR _time==mostRecentTestTime | stats first(startTime) AS startTime, first(status) AS status, first(histID) AS currentHistId, last(histID) AS lastPassHistId BY testCaseId

When you use the stats and eventstats commands for ordering events based on time, use the earliest and latest functions.

The following search is the same as the previous search except the first and last functions are replaced with the earliest and latest functions.

index=test sourcetype=testDb | eventstats latest(LastPass) AS LastPass, earliest(_time) AS mostRecentTestTime BY testCaseId | where startTime==LastPass OR _time==mostRecentTestTime | stats latest(startTime) AS startTime, latest(status) AS status, latest(histID) AS currentHistId, earliest(histID) AS lastPassHistId BY testCaseId

Basic examples

1. Calculate the overall average duration

This example uses the sample data from the Search Tutorial but should work with any format of Apache web access log. To try this example on your own Splunk instance, you must download the sample data and follow the instructions to get the tutorial data into Splunk. Use the time range All time when you run the search.

Calculate the overall average duration of a set of transactions, and place the calculation in a new field called avgdur.

host=www1 | transaction clientip host maxspan=30s maxpause=5s | eventstats avg(duration) AS avgdur

Because no BY clause is specified, a single aggregation is generated and added to every event in a new field called avgdur .

When you look at the list of Interesting Fields, you will see that avgdur has only one value.

2. Calculate the average duration grouped by a specific field

This example is the same as the previous example except that an average is calculated for each distinct value of the date_minute field. The new field avgdur is added to each event with the average value based on its particular value of date_minute.

host=www1 | transaction clientip host maxspan=30s maxpause=5s | eventstats avg(duration) As avgdur BY date_minute

When you look at the list of Interesting Fields, you will see that avgdur has 79 values, based on the timestamp, duration, and date_minute values.

3. Search for spikes in the volume of errors

This searches for spikes in error volume. You can use this search to trigger an alert if the count of errors is higher than average, for example.

eventtype="error" | eventstats avg(bytes) AS avg | where bytes>avg

Extended example

The following example provides you with a better understanding of how the eventstats command works. This example is actually a progressive set of small examples, where one example builds on or extends the previous example.

It's much easier to see what the eventstats command does by showing you examples, using a set of simple events.

These examples use the makeresults command to create a set of events. The streamstats and eval commands are used to create additional fields in the events.

Creating a set of events

Let's start by creating a set of four events. One of the events contains a null value in the age field.

| makeresults count=4 | streamstats count | eval age = case(count=1, 25, count=2, 39, count=3, 31, count=4, null())| eval city = case(count=1 OR count=3, "San Francisco", count=2 OR count=4, "Seattle")

  • The streamstats command is used to create the count field. The streamstats command calculates a cumulative count for each event, at the time the event is processed.
  • The eval command is used to create two new fields, age and city. The eval command uses the value in the count field.
  • The case function takes pairs of arguments, such as count=1, 25. The first argument is a Boolean expression. When that expression is TRUE, the corresponding second argument is returned.

The results of the search look like this:

_timeagecitycount
2020-02-05 18:32:0725San Francisco1
2020-02-05 18:32:0739Seattle2
2020-02-05 18:32:0731San Francisco3
2020-02-05 18:32:07Seattle4

Using eventstats with a BY clause

The BY clause in the eventstats command is optional, but is used frequently with this command. The BY clause groups the generated statistics by the values in a field. You can use any of the statistical functions with the eventstats command to generate the statistics. See the Statistical and charting functions.


In this example, the eventstats command generates the average age for each city. The generated averages are placed into a new field called avg(age).

The following search is the same as the previous search, with the eventstats command added at the end:

| makeresults count=4 | streamstats count | eval age = case(count=1, 25, count=2, 39, count=3, 31, count=4, null())| eval city = case(count=1 OR count=3, "San Francisco", count=2 OR count=4, "Seattle")| eventstats avg(age) BY city

  • For San Francisco, the average age is 28 = (25 + 31) / 2.
  • For Seattle, there is only one event with a value. The average is 39 = 39 / 1. The eventstats command places that average in every event for Seattle, including events that did not contain a value for age.

The results of the search look like this:

_timeageavg(age)citycount
2020-02-05 18:32:072528San Francisco1
2020-02-05 18:32:073939Seattle2
2020-02-05 18:32:073128San Francisco3
2020-02-05 18:32:0739Seattle4

Renaming the new field

By default, the name of the new field that is generated is the name of the statistical calculation. In these examples, that name is avg(age). You can rename the new field using the AS keyword.

In the following search, the eventstats command has been adjusted to rename the new field to average age by city.

| makeresults count=4 | streamstats count | eval age = case(count=1, 25, count=2, 39, count=3, 31, count=4, null())| eval city = case(count=1 OR count=3, "San Francisco", count=2 OR count=4, "Seattle")| eventstats avg(age) AS "average age by city" BY city

The results of the search look like this:

_timeageaverage age by citycitycount
2020-02-05 18:32:072528San Francisco1
2020-02-05 18:32:073939Seattle2
2020-02-05 18:32:073128San Francisco3
2020-02-05 18:32:0739Seattle4

Events with text values

The previous examples show how an event is processed that does not contain a value in the age field. Let's see how events are processed that contain an alphabetic character value in the field that you want to use to generate statistics .

The following search includes the word test as a value in the age field.

| makeresults count=4 | streamstats count | eval age = case(count=1, 25, count=2, 39, count=3, 31, count=4, "test")| eval city = case(count=1 OR count=3, "San Francisco", count=2 OR count=4, "Seattle")


The results of the search look like this:

_timeagecitycount
2020-02-05 18:32:0725San Francisco1
2020-02-05 18:32:0739Seattle2
2020-02-05 18:32:0731San Francisco3
2020-02-05 18:32:07testSeattle4


Let's add the eventstats command to the search.

| makeresults count=4 | streamstats count | eval age = case(count=1, 25, count=2, 39, count=3, 31, count=4, "test")| eval city = case(count=1 OR count=3, "San Francisco", count=2 OR count=4, "Seattle")| eventstats avg(age) BY city


The alphabetic values are treated like null values. The results of the search look like this:

_timeageavg(age)citycount
2020-02-05 18:32:072528San Francisco1
2020-02-05 18:32:073939Seattle2
2020-02-05 18:32:073128San Francisco3
2020-02-05 18:32:07test39Seattle4

Using the allnum argument

But suppose you don't want statistics generated when there are alphabetic characters in the field or the field is empty?

The allnum argument controls how the eventstats command processes field values. The default setting for the allnum argument is FALSE. Which means that the field used to generate the statistics does not need to contain all numeric values. Fields with empty values or alphabetic character values are ignored. You've seen this in the earlier examples.

You can force the eventstats command to generate statistics only when the fields contain all numeric values. To accomplish this, you can set the allnum argument to TRUE.

| makeresults count=4 | streamstats count | eval age = case(count=1, 25, count=2, 39, count=3, 31, count=4, "test")| eval city = case(count=1 OR count=3, "San Francisco", count=2 OR count=4, "Seattle")| eventstats allnum=true avg(age) BY city


The results of the search look like this:

_timeageavg(age)citycount
2020-02-05 18:32:072528San Francisco1
2020-02-05 18:32:0739Seattle2
2020-02-05 18:32:073128San Francisco3
2020-02-05 18:32:07testSeattle4

Because the age field contains values for Seattle that are not all numbers, the entire set of values for Seattle are ignored. No average is calculated.

The allnum=true argument applies to empty values as well as alphabetic character values.

See also

Commands
stats
streamstats
Blogs
Search commands > stats, eventstats and streamstats
eventstats - Splunk Documentation (2024)

FAQs

What is the difference between stats and eventstats in Splunk? ›

Eventstats calculates a statistical result same as stats command only difference is it does not create statistical results, it aggregates them to the original raw data. Streamstats command uses events before the current event to compute the aggregate statistics that are applied to each event.

What is the difference between stats and Streamstats? ›

Using streamstats to bolster data with cumulative analysis enables the trending of growth over time. Preserving events with summary calculations: Unlike the stats command, streamstats allows for the use of statistical aggregation functions without condensing the results set to a condensed summary.

What are stats in a Splunk query? ›

The stats command is a fundamental Splunk command. It will perform any number of statistical functions on a field, which could be as simple as a count or average, or something more advanced like a percentile or standard deviation.

What are streamstats in Splunk? ›

The SPL2 streamstats command adds a cumulative statistical value to each search result as each result is processed. For example, you can calculate the running total for a particular field, or compare a value in a search result with a the cumulative value, such as a running average.

What are the 4 types of searches in Splunk by performance? ›

How search types affect Splunk Enterprise performance
Search typeRef. indexer throughputPerformance impact
DenseUp to 50,000 matching events per second.CPU-bound
SparseUp to 5,000 matching events per second.CPU-bound
Super-sparseUp to 2 seconds per index bucket.I/O bound
RareFrom 10 to 50 index buckets per second.I/O bound

What is the difference between metrics and events in Splunk? ›

An event, or event time series (ETS), is a specific occurrence that can be represented in Splunk Observability Cloud outside of the flow of streaming metrics. Events provide context to metric data. Events are very basic, and you can't apply analytic functions to events.

What is StreamStats Eventstats stats? ›

Eventstats performs calculations on events within a single search, while streamstats calculate statistics over the entire search result set in a streaming fashion.

What is the difference between stats Ttest_ind and stats Ttest_rel? ›

stats. ttest_rel is for related samples and stats. ttest_ind is for independent samples. Since you have two different values for the size they are two independent samples.

What method does StreamStats use? ›

StreamStats incorporates (1) a map-based user interface for the site selection; (2) a relational data base that contains information for data-collection stations and regression equations used to estimate flow statistics for ungaged sites; (3) a GIS program that allows locating sites of interest in the user interface, ...

How to improve Splunk query performance? ›

Target your search to a narrow dataset

Limit the timeframe of your search to 15 minutes or less. Reduce the amount of data the Splunk platform needs to search through by specifying specific index names in your searches. Typically, you want to store like data that is commonly searched together in the same index.

What is the difference between stats and transaction commands in Splunk? ›

Stats provides the aggregation. transaction provides the unique number / count. Like you perform 10 steps as part of one transaction.

What is the difference between stats and chart in Splunk? ›

Use the stats command when you want to create results tables that show granular statistical calculations. Use the stats command when you want to specify 3 or more fields in the BY clause. Use the chart command when you want to create results tables that show consolidated and summarized calculations.

What do eventstats do in Splunk? ›

The SPL2 eventstats command generates summary statistics from fields in your events and saves those statistics into a new field. The eventstats command places the generated statistics in new field that is added to the original raw events.

What is the difference between StreamStats and Eventstats? ›

eventstats adds the desired stats function result to the event, derived from the entire set of events. Streamstats adds the desired stats function result to the event, derived from the point in time of the current event in the stream. An example is a moving average.

Can Splunk track user activity? ›

The User Activity dashboard displays panels representing common risk-generating user activities such as suspicious website activity. For more information about risk scoring, see How Splunk Enterprise Security assigns risk scores.

What is the difference between stats and statsmodels? ›

stats has all of the probability distributions and some statistical tests. It's more like library code in the vein of numpy and scipy. Statsmodels on the other hand provides statistical models with a formula framework similar to R and it works with pandas DataFrames.

What is the difference between stats and chart command in Splunk? ›

In Summary

Use the stats command when you want to specify 3 or more fields in the BY clause. Use the chart command when you want to create results tables that show consolidated and summarized calculations. Use the chart command to create visualizations from the results table data.

What is the difference between events and logs in Splunk? ›

An event is a thing that happened anywhere at any time. It might be in Splunk and it might not. A log is the digital exhaust of that event; it is the plain-text vestige that indicates than an event happened. A result is each thing that is returned from a Splunk search.

Top Articles
Latest Posts
Article information

Author: Mrs. Angelic Larkin

Last Updated:

Views: 6482

Rating: 4.7 / 5 (67 voted)

Reviews: 90% of readers found this page helpful

Author information

Name: Mrs. Angelic Larkin

Birthday: 1992-06-28

Address: Apt. 413 8275 Mueller Overpass, South Magnolia, IA 99527-6023

Phone: +6824704719725

Job: District Real-Estate Facilitator

Hobby: Letterboxing, Vacation, Poi, Homebrewing, Mountain biking, Slacklining, Cabaret

Introduction: My name is Mrs. Angelic Larkin, I am a cute, charming, funny, determined, inexpensive, joyous, cheerful person who loves writing and wants to share my knowledge and understanding with you.