A customer inquired about identifying logs that were recently modified, but not during the current day. Although calculating the data written to a particular log file is straightforward, identifying files that weren't written to is a bit trickier, since the
logVolume metric is not reported for inactive log files.
This query is useful for anyone interested in the log activity of a particular server, and could be used to identify logs for use in alerts or search queries.
- This query should be run over multiple days (we used 3 days in our example), as it identifies logs that were modified during this period
- Today is 2020-09-04 and we are in Pacific Standard Time (PST).
- The value for today that
timebucket("1d")returns is 2020-09-03 17:00 (PST)
- This is the equivalent of 2020-09-04 00:00 (UTC)
- As previously mentioned, a constraint of the
timebucketfunction is its use of the UTC timezone (this will likely be addressed in a future update).
- The value for today that
The final query is provided here for convenience; however, I will break it down by segment and include some tips and explanations that may assist readers with adapting it for their particular needs
tag='logVolume' metric='logBytes' !(forlogfile=='none') |
group log_mb=sum(value) / 1024 / 1024 by ts=timebucket("1d"), host, forlogfile |
columns ts, host, forlogfile, ts2=timebucket(queryend(),"1d") |
let endDate=(ts=ts2) |
sort ts, host, forlogfile |
group tally=count(forlogfile), tallyEndDate=count(endDate), lastActive=last(ts) by host, forlogfile |
filter tallyEndDate=0 |
columns timestamp=lastActive, host, forlogfile, tally, tallyEndDate
1: We are evaluating the
logBytes metric for a particular log file
2: Total the
logBytes and convert from bytes to megabytes. Group by day (as an epoch value), host, and logfile. This aggregation condenses the log events down to an essential list.
columns only returns specified columns, similar to a
ts2=timebucket(queryend(),"1d") takes the end of the query (as specified in the PowerQuery UI) and converts it to a day and assigns it to each row to identify whether the row was modified at this time.
let endDate=(ts=ts2) generates a boolean (either 0 or 1) based on whether the file was modified on the PowerQuery's end date
sort is needed because the
last function is used in an upcoming
group by statement. Otherwise, you wouldn't want to sort the data returned by the query at this point.
6: We're performing another
group by to further distill the log file statistics. The individual days when the log file was modified is no longer of concern, as we're interested in whether the log file was modified today. We use the
sum function to achieve this (although the largest value a given log file from a server could have is 1 , due to the
group by on line 2).
last function is used to carry the last date when the log file / server combination was modified
filter restricts the results to log files that were not modified on the day of the PowerQuery's endDate
8: Setting the
timestamp= causes the the epoch to be displayed as a timestamp