PostgreSQL error logs and slow query logs play an important role in analyzing production systems, raising early warnings, and troubleshooting problems; I won't elaborate on that here.
This document describes how to collect, analyze, store, and display error logs and slow query logs.
1. Overall idea
PostgreSQL can output logs in several formats, of which the csvlog format carries the most complete information. However, csvlog output can only be written to local files; it cannot be sent over the network directly, so the logs have to be shipped to the log server in real time by other means.
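For reference, here is a minimal postgresql.conf sketch that produces local csvlog output; the values below are illustrative, not taken from the setup described in this article:

```
# enable the logging collector and csvlog format (changing logging_collector requires a restart)
logging_collector = on
log_destination = 'csvlog'
log_directory = 'pg_log'                      # relative to the data directory
log_filename = 'postgresql-%Y-%m-%d_%H%M%S'
# statements running longer than this many milliseconds are logged, feeding the slow query analysis
log_min_duration_statement = 1000
```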
Log collection:
The PostgreSQL servers run on different machines, so we use rsyslog in a client-server setup to collect their logs onto a central log server. Specifically: deploy the rsyslog client on each PostgreSQL server and the rsyslog server on the log server. The client watches the log files for changes and uploads newly appended log lines to the server in real time; the server writes the logs uploaded by different clients into separate files.
The advantage of this approach is that a complete copy of the original CSV log remains on the PostgreSQL server, so the full information is still available locally for inspection and analysis.
Log Analysis:
Logstash is an open-source data collection engine with real-time pipelining capabilities. It can dynamically unify data from different files, filter and clean the data, and normalize it before delivering it to a destination of your choice.
Log storage and display:
Use the usual combination of Elasticsearch for storage and Kibana for display.
2. rsyslog server-side configuration
Add the following content to the rsyslog configuration file /etc/rsyslog.conf and restart the rsyslog service.
$PreserveFQDN on            # needed to obtain the hostname correctly
$FileOwner root             # owner of the created log files
$FileGroup root             # group of the created log files
$FileCreateMode 0644        # permissions of created files
$DirCreateMode 0755         # permissions of created directories
$Umask 0022
$PrivDropToUser root        # user rsyslog drops privileges to
$PrivDropToGroup root       # group rsyslog drops privileges to

module(load="imuxsock")
module(load="imklog")
module(load="imudp")
#input(type="imudp" port="514")
module(load="imtcp" MaxSessions="500")
input(type="imtcp" port="514")

# format of received logs (strip the leading space)
$template linefmt,"%msg:2:$%\n"

# per-client log file path; the variable %fromhost-ip% can also be used here
$template pgloglocation,"/data/pglogs/%hostname%/%$YEAR%-%$MONTH%-%$DAY%.csv"

# write any message whose raw text contains "pg_5432" using the template above
:rawmsg,contains,"pg_5432" ?pgloglocation;linefmt
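To check that the server side receives and routes messages as expected, a test message can be sent from any host with the util-linux logger tool; the server address below is a placeholder:

```
# send a test line over TCP carrying the tag the filter rule matches on
logger -T -n <log-server-ip> -P 514 -t pg_5432 "rsyslog forwarding test"
# on the log server, the line should appear under /data/pglogs/<client-hostname>/<date>.csv
```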
3. rsyslog client configuration
Create a new configuration file under /etc/rsyslog.d/ with the following content and restart the rsyslog service.
module(load="imuxsock")
module(load="imklog")
module(load="imfile")
#module(load="imudp")
#input(type="imudp" port="514")
module(load="imtcp" MaxSessions="500")
input(type="imtcp" port="514")

ruleset(name="remote"){
    action(type="omfwd"
        target=""                                  # log server IP address
        port="514"                                 # port
        protocol="tcp"                             # protocol used
        queue.type="linkedList"                    # asynchronous queue processing
        queue.spoolDirectory="/var/log/rsyslog"    # queue directory
        queue.filename="pglog"                     # queue name
        queue.maxDiskSpace="1g"                    # maximum disk space the queue may use
        queue.saveOnShutdown="on"                  # persist in-memory queue data when rsyslog shuts down
        action.resumeRetryCount="-1"               # retry failed delivery indefinitely
    )
    stop
}

input(
    type="imfile"
    File="/pg/data/log/*.csv"        # PostgreSQL server log path
    Facility="local1"
    Severity="info"
    Tag="pg_5432"                    # log tag; important, the server identifies these logs by this tag
    PersistStateInterval="1"         # interval (seconds) for writing offsets back to the state file; adjust as needed
    deleteStateOnFileDelete="on"
    reopenOnTruncate="on"
    Ruleset="remote"                 # the ruleset name defined above
)
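After restarting rsyslog on the PostgreSQL server, it is worth confirming that imfile is tailing the CSV files and that a TCP connection to the log server is established; the commands below assume systemd and standard tooling:

```
systemctl restart rsyslog
# look for imfile/omfwd errors
journalctl -u rsyslog --since "5 min ago"
# confirm an outgoing TCP connection to the log server on port 514
ss -tnp | grep ':514'
```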
4. Logstash configuration
On the log server, create the Logstash configuration file shown below and then start Logstash:
input {
  file {
    path => ["/data/pglogs/*/*.csv"]
    start_position => "end"
    codec => multiline {
      pattern => "^20[0-9]{2}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}"
      negate => true
      what => "previous"
    }
  }
}

filter {
  csv {
    separator => ","
    columns => ["log_time","user_name","database_name","process_id","connection_from","session_id","session_line_num","command_tag","session_start_time","virtual_transaction_id","transaction_id","error_severity","sql_state_code","message","detail","hint","internal_query","internal_query_pos","context","query","query_pos","location","application_name"]
    convert => {
      "process_id" => "integer"
      "session_line_num" => "integer"
      "transaction_id" => "integer"
      "internal_query_pos" => "integer"
      "query_pos" => "integer"
    }
    skip_empty_columns => true
  }
  mutate {
    split => { "log_time" => " CST" }
    add_field => { "log_time_tmp" => "%{[log_time][0]}" }
    remove_field => ["log_time"]
  }
  date {
    match => ["log_time_tmp", "yyyy-MM-dd HH:mm:ss.SSS"]
    target => "@timestamp"
    locale => "cn"
    remove_field => ["log_time_tmp"]
  }
  if "duration:" in [message] and "ms" in [message] and "statement:" in [message] {
    grok {
      match => { "message" => "duration: %{NUMBER:duration} ms" }
    }
    mutate {
      split => { "message" => "statement: " }
      add_field => { "statement" => "%{[message][1]}" }
      remove_field => ["message"]
    }
  }
  mutate {
    split => { "path" => "/" }
    add_field => { "db_host" => "%{[path][3]}" }
    remove_field => ["path", "host"]
    convert => { "duration" => "float" }
  }
}

output {
  if [error_severity] == "ERROR" or [error_severity] == "FATAL" or [error_severity] == "PANIC" {
    elasticsearch {
      hosts => [":x", ":x"]
      index => "pg_error"
      id => "elasticsearch_pg_error"
    }
  } else if [duration] and [statement] {
    elasticsearch {
      hosts => [":x", ":x"]
      index => "pg_slow"
      id => "elasticsearch_pg_slow"
    }
  }
}
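Before running Logstash as a service, the pipeline can be syntax-checked and then started in the foreground; the configuration path below is a placeholder for wherever the file above was saved:

```
# validate the pipeline configuration and exit
bin/logstash -f /etc/logstash/conf.d/pg.conf --config.test_and_exit
# start it, reloading automatically when the configuration changes
bin/logstash -f /etc/logstash/conf.d/pg.conf --config.reload.automatic
```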
5. Elasticsearch storage and Kibana display are standard setups and are not covered here.
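Even without going into the storage and display setup, a quick way to confirm that documents are reaching the two indices is to query Elasticsearch directly; host and port are placeholders:

```
# count collected error-log documents
curl -s 'http://<es-host>:9200/pg_error/_count?pretty'
# show the single slowest statement collected so far
curl -s 'http://<es-host>:9200/pg_slow/_search?size=1&sort=duration:desc&pretty'
```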
Supplement: PostgreSQL error reporting and logging settings for development/debugging
#### When to log
#client_min_messages = notice
log_min_messages = debug5    # the debug levels are intended for developers; they expose program call information and the data structures SQL statements are converted into (debug5 is the most verbose level)
#### Where to log
#log_destination = 'stderr'
logging_collector = on       # enable log collection
log_directory = 'pg_log'     # log directory
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
#### What to log
debug_print_parse = on       # print the parse tree
debug_print_rewritten = on   # print the query after rewriting
debug_print_plan = on        # print execution plan details
debug_pretty_print = on      # pretty-print the output of debug_print_parse, debug_print_rewritten and debug_print_plan
#log_checkpoints = off       # set to on when studying PostgreSQL disk I/O
log_connections = on         # log connections
log_disconnections = on      # log disconnections
#log_duration = on           # statement execution time, useful for analysis
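Most of the parameters above take effect after a configuration reload; logging_collector is the exception and needs a full server restart. A quick way to apply and verify the changes (connection options are placeholders):

```
# reload postgresql.conf without restarting
psql -U postgres -c 'SELECT pg_reload_conf();'
# confirm the new values are active
psql -U postgres -c 'SHOW log_min_messages;'
```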
The above is my personal experience; I hope it serves as a useful reference. If there are mistakes or points I have not considered fully, corrections and suggestions are welcome.