跳转至

日志分析示例

此示例展示了一种可能的创建日志文件分析器的方法。它使用的模型是 mattw/loganalyzer,该模型基于 codebooga,一个34亿参数的模型。

使用方法如下:

python loganalysis.py <logfile>

您可以尝试在此目录中包含的 logtest.logfile 文件。

运行示例

1. 确保您已安装 mattw/loganalyzer 模型:

ollama pull mattw/loganalyzer

2. 安装Python所需库。

pip install -r requirements.txt

3. 运行示例:

python loganalysis.py logtest.logfile

代码审查

此示例的第一部分是一个模型文件,它采用 codebooga 并应用了一个新的系统提示:

SYSTEM """
You are a log file analyzer. You will receive a set of lines from a log file for some software application, find the errors and other interesting aspects of the logs, and explain them so a new user can understand what they mean. If there are any steps they can do to resolve them, list the steps in your answer.
"""

此模型可在 https://ollama.com/mattw/loganalyzer 上找到。您可以使用命令 ollama create <namespace/modelname> -f <path-to-modelfile> 然后 ollama push <namespace/modelname> 将其自定义并添加到您自己的命名空间。

然后 loganalysis.py 扫描给定日志文件中的所有行,并搜索单词 'error'。当找到该词时,前后10行被设置为调用生成API的提示。

data = {
  "prompt": "\n".join(error_logs),
  "model": "mattw/loganalyzer"
}

最后,解析流式输出并打印输出中的响应字段到行。

response = requests.post("http://localhost:11434/api/generate", json=data, stream=True)
for line in response.iter_lines():
  if line:
    json_data = json.loads(line)
    if json_data['done'] == False:
      print(json_data['response'], end='')

接下来的步骤

这里还有很多可以做的事情。这是一种检测错误的简单方式,寻找单词 error。或许寻找日志中的异常活动会很有趣。创建每行的嵌入并比较它们,寻找相似行,或者研究应用Levenshtein距离算法来找到相似行以帮助识别异常行会很有趣。

尝试不同的模型和不同的提示来分析数据。您可以考虑添加检索增强生成(RAG)来帮助理解新的日志格式。

源码

loganalysis.py

import sys
import re
import requests
import json

# prelines and postlines represent the number of lines of context to include in the output around the error
prelines = 10
postlines = 10

def find_errors_in_log_file():
  if len(sys.argv) < 2:
    print("Usage: python loganalysis.py <filename>")
    return

  log_file_path = sys.argv[1]
  with open(log_file_path, 'r') as log_file:
    log_lines = log_file.readlines()

  error_logs = []
  for i, line in enumerate(log_lines):
      if "error" in line.lower():
          start_index = max(0, i - prelines)
          end_index = min(len(log_lines), i + postlines + 1)
          error_logs.extend(log_lines[start_index:end_index])

  return error_logs

error_logs = find_errors_in_log_file()

data = {
  "prompt": "\n".join(error_logs),
  "model": "mattw/loganalyzer"
}

response = requests.post("http://localhost:11434/api/generate", json=data, stream=True)
for line in response.iter_lines():
  if line:
    json_data = json.loads(line)
    if json_data['done'] == False:
      print(json_data['response'], end='', flush=True)

Modelfile

FROM codebooga:latest

SYSTEM """
You are a log file analyzer. You will receive a set of lines from a log file for some software application, find the errors and other interesting aspects of the logs, and explain them so a new user can understand what they mean. If there are any steps they can do to resolve them, list the steps in your answer.
"""

PARAMETER TEMPERATURE 0.3

logtest.logfile

2023-11-10 07:17:40 /docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
2023-11-10 07:17:40 /docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
2023-11-10 07:17:40 /docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
2023-11-10 07:17:40 10-listen-on-ipv6-by-default.sh: info: Getting the checksum of /etc/nginx/conf.d/default.conf
2023-11-10 07:17:40 10-listen-on-ipv6-by-default.sh: info: Enabled listen on IPv6 in /etc/nginx/conf.d/default.conf
2023-11-10 07:17:40 /docker-entrypoint.sh: Sourcing /docker-entrypoint.d/15-local-resolvers.envsh
2023-11-10 07:17:40 /docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh
2023-11-10 07:17:40 /docker-entrypoint.sh: Launching /docker-entrypoint.d/30-tune-worker-processes.sh
2023-11-10 07:17:40 /docker-entrypoint.sh: Configuration complete; ready for start up
2023-11-10 07:17:40 2023/11/10 13:17:40 [notice] 1#1: using the "epoll" event method
2023-11-10 07:17:40 2023/11/10 13:17:40 [notice] 1#1: nginx/1.25.3
2023-11-10 07:17:40 2023/11/10 13:17:40 [notice] 1#1: built by gcc 12.2.0 (Debian 12.2.0-14)
2023-11-10 07:17:40 2023/11/10 13:17:40 [notice] 1#1: OS: Linux 6.4.16-linuxkit
2023-11-10 07:17:40 2023/11/10 13:17:40 [notice] 1#1: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2023-11-10 07:17:40 2023/11/10 13:17:40 [notice] 1#1: start worker processes
2023-11-10 07:17:40 2023/11/10 13:17:40 [notice] 1#1: start worker process 29
2023-11-10 07:17:40 2023/11/10 13:17:40 [notice] 1#1: start worker process 30
2023-11-10 07:17:40 2023/11/10 13:17:40 [notice] 1#1: start worker process 31
2023-11-10 07:17:40 2023/11/10 13:17:40 [notice] 1#1: start worker process 32
2023-11-10 07:17:40 2023/11/10 13:17:40 [notice] 1#1: start worker process 33
2023-11-10 07:17:40 2023/11/10 13:17:40 [notice] 1#1: start worker process 34
2023-11-10 07:17:40 2023/11/10 13:17:40 [notice] 1#1: start worker process 35
2023-11-10 07:17:40 2023/11/10 13:17:40 [notice] 1#1: start worker process 36
2023-11-10 07:17:40 2023/11/10 13:17:40 [notice] 1#1: start worker process 37
2023-11-10 07:17:40 2023/11/10 13:17:40 [notice] 1#1: start worker process 38
2023-11-10 07:17:44 192.168.65.1 - - [10/Nov/2023:13:17:43 +0000] "GET / HTTP/1.1" 200 615 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36" "-"
2023-11-10 07:17:44 2023/11/10 13:17:44 [error] 29#29: *1 open() "/usr/share/nginx/html/favicon.ico" failed (2: No such file or directory), client: 192.168.65.1, server: localhost, request: "GET /favicon.ico HTTP/1.1", host: "localhost:8080", referrer: "http://localhost:8080/"
2023-11-10 07:17:44 192.168.65.1 - - [10/Nov/2023:13:17:44 +0000] "GET /favicon.ico HTTP/1.1" 404 555 "http://localhost:8080/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36" "-"
2023-11-10 07:17:50 2023/11/10 13:17:50 [error] 29#29: *1 open() "/usr/share/nginx/html/ahstat" failed (2: No such file or directory), client: 192.168.65.1, server: localhost, request: "GET /ahstat HTTP/1.1", host: "localhost:8080"
2023-11-10 07:17:50 192.168.65.1 - - [10/Nov/2023:13:17:50 +0000] "GET /ahstat HTTP/1.1" 404 555 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36" "-"
2023-11-10 07:18:53 2023/11/10 13:18:53 [error] 29#29: *1 open() "/usr/share/nginx/html/ahstat" failed (2: No such file or directory), client: 192.168.65.1, server: localhost, request: "GET /ahstat HTTP/1.1", host: "localhost:8080"
2023-11-10 07:18:53 192.168.65.1 - - [10/Nov/2023:13:18:53 +0000] "GET /ahstat HTTP/1.1" 404 555 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36" "-"

仓库地址

python-loganalysis