SoFunction
Updated on 2025-04-20

How to read GC logs efficiently with less on Linux

Introduction

In a Linux environment, log analysis is an indispensable part of daily work for operations engineers and developers. For the garbage collection (GC) logs of Java applications in particular, whose content is complex and whose files are generally large, choosing the right tools and methods matters. Using a practical case, this article explains in detail how to read and analyze GC logs efficiently with the less command, and discusses why less is a better choice than other tools.

Background: an interviewer's question

In a technical interview, the interviewer asked: "How do you read log files in Linux?" The question seems simple, but it tests the candidate's familiarity with Linux tools and practical experience in handling large files. My answer below walks through the common tools and their limitations, and finally arrives at the advantages of less.

1. Using cat directly: the scrolling problem

The most intuitive way is to print the log content with the cat command:

cat 

However, GC logs are usually very large, often several GB or even dozens of GB. With cat, the terminal scrolls so fast that the content flashes by and cannot be read at all. Worse, cat does not support interactive navigation, so you cannot pause or page through the file to view a specific section.

2. Using more: the limitations of one-way paging

To solve the scrolling problem, you can try the more command:

more 

more displays the file one page at a time and moves down a page when you press the space bar. Its limitation is that it is built for paging forward (down), and traditionally it cannot look back at earlier content at all. The more shipped with util-linux can skip back a screen with b, but only for files and not for piped input. For GC log analysis, this mostly one-way navigation is very inconvenient, because we often need to jump back and forth in the log to locate specific points in time or exceptional events.

3. Using vim: the memory usage problem

Next, you might think of the vim editor:

vim 

vim provides powerful text editing and navigation, including paging up and down and searching. However, vim loads the entire file into memory. For a GC log of several GB, loading can take minutes or even longer. More seriously, the heavy memory usage may trigger Linux's OOM Killer (Out-Of-Memory Killer), which can kill business processes and affect system stability. This risk is unacceptable in a production environment.

4. The best choice: less

After this comparison, the less command is the best tool for reading large GC logs:

less 

The core advantages of less are:

  • Low memory usage: less loads file contents on demand and does not read the entire file into memory at once, which makes it suitable for very large files.
  • Two-way paging: Supports paging up and down (arrow keys, Page Up/Page Down), so you can move freely through the log.
  • Powerful search: Supports forward and backward search to locate key information quickly.
  • Real-time viewing: Can follow a file as it grows (similar to tail -f), which suits monitoring live logs.
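As a rough sketch of how these advantages map to command-line options: the options below (-n, -S, +G, +F) are standard less options, while the GC_LOG path, the one-line sample file and the terminal check are assumptions added here so the snippet can run as written without opening a pager in a non-interactive shell.

```shell
#!/bin/sh
# Hypothetical path: point GC_LOG at your real GC log instead.
GC_LOG="${GC_LOG:-$(mktemp)}"

# Synthetic one-line sample, written only if the file is empty or missing,
# so that the commands below have something to open.
[ -s "$GC_LOG" ] || echo '2025-04-18T10:15:32.123+0800: 12345.678: [Full GC (Ergonomics) [PSYoungGen: 2048K->0K(6144K)] [ParOldGen: 8192K->4096K(12288K)] 10240K->4096K(18432K), [Metaspace: 3000K->3000K(1056768K)], 0.1501234 secs] [Times: user=0.30 sys=0.02, real=0.15 secs]' > "$GC_LOG"

# -n : do not compute line numbers (can be slow on very large files)
# -S : chop long lines instead of wrapping them (GC lines are wide)
LESS_OPTS="-n -S"

# Only start the interactive pager when attached to a terminal.
if [ -t 1 ]; then
    less $LESS_OPTS "$GC_LOG"          # start at the top of the file
    # less $LESS_OPTS +G "$GC_LOG"     # start at the end of the file
    # less $LESS_OPTS +F "$GC_LOG"     # start in follow mode, like tail -f
fi
```

Whether -n makes a visible difference depends on the file size and the less version, so treat it as something to try rather than a guaranteed speed-up.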

Practical case: analyzing GC logs with less

Case background

Suppose we are responsible for the performance optimization of a Java application. Recently the system response time has slowed down, and we suspect a GC performance problem. We need to analyze the GC log file to find the root cause of frequent Full GCs. The log file is 5GB in size and contains millions of lines that record the GC activity of the JVM.

Step 1: Open the GC log

First, open the log file with less:

less 

less displays the first page of the file almost immediately, without using much memory.

Step 2: Quickly locate Full GC events

In GC logs, Full GC is usually the culprit behind performance bottlenecks. The search function lets us quickly locate lines containing "Full GC":

  • Press / to enter forward search mode.
  • Type Full GC and press Enter.

less highlights the matches and jumps to the first one. Assume the log format is as follows:

2025-04-18T10:15:32.123+0800: 12345.678: [Full GC (Ergonomics) [PSYoungGen: 2048K->0K(6144K)] [ParOldGen: 8192K->4096K(12288K)] 10240K->4096K(18432K), [Metaspace: 3000K->3000K(1056768K)], 0.1501234 secs] [Times: user=0.30 sys=0.02, real=0.15 secs]

Pressing n repeatedly to step through the matches (N steps backward) and comparing their timestamps, we find that Full GC happens frequently, triggering every few seconds.
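The same check can be sketched non-interactively, which is handy for confirming what the search in less suggests. The sample log below is synthetic (made-up events in the same format as the example line), and the interval calculation assumes the second whitespace-separated field is the JVM uptime in seconds, as in that example.

```shell
#!/bin/sh
# Synthetic sample: made-up events in the same format as the example line above.
GC_LOG="$(mktemp)"
cat > "$GC_LOG" <<'EOF'
2025-04-18T10:15:30.001+0800: 12343.556: [GC (Allocation Failure) [PSYoungGen: 6100K->512K(6144K)] 14000K->8700K(18432K), 0.0051234 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]
2025-04-18T10:15:32.123+0800: 12345.678: [Full GC (Ergonomics) [PSYoungGen: 2048K->0K(6144K)] [ParOldGen: 8192K->4096K(12288K)] 10240K->4096K(18432K), [Metaspace: 3000K->3000K(1056768K)], 0.1501234 secs] [Times: user=0.30 sys=0.02, real=0.15 secs]
2025-04-18T10:15:35.345+0800: 12348.900: [Full GC (Ergonomics) [PSYoungGen: 2100K->0K(6144K)] [ParOldGen: 8300K->4200K(12288K)] 10400K->4200K(18432K), [Metaspace: 3000K->3000K(1056768K)], 0.1412345 secs] [Times: user=0.28 sys=0.01, real=0.14 secs]
2025-04-18T10:15:38.545+0800: 12352.100: [Full GC (Ergonomics) [PSYoungGen: 1990K->0K(6144K)] [ParOldGen: 8250K->4150K(12288K)] 10240K->4150K(18432K), [Metaspace: 3000K->3000K(1056768K)], 0.1498765 secs] [Times: user=0.29 sys=0.02, real=0.15 secs]
EOF

# Count Full GC events without opening a pager.
FULL_GC_COUNT=$(grep -c 'Full GC' "$GC_LOG")

# Seconds between consecutive Full GCs, taken from the JVM uptime field.
INTERVALS=$(awk '/Full GC/ {
    t = $2; sub(/:$/, "", t)          # "12345.678:" becomes "12345.678"
    if (seen) printf "%.3f\n", t - prev
    prev = t; seen = 1
}' "$GC_LOG")

echo "Full GC events: $FULL_GC_COUNT"
echo "Seconds between Full GCs:"
echo "$INTERVALS"

# Interactive equivalent: open less already positioned at the first match.
if [ -t 1 ]; then
    less '+/Full GC' "$GC_LOG"
fi
```

On a real multi-GB log, grep and awk still read the whole file once, so this complements less rather than replacing it.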

Step 3: Page up and down to view the context

To analyze what causes the Full GCs, we need to look at the log before and after each one. Use the following keys to navigate:

  • Up and down arrow keys: Move line by line to view the details of specific GC events.
  • Page Up / Page Down: Page quickly to browse the log around a point in time.
  • g: Jump to the beginning of the file to view the initial configuration of the GC log.
  • G: Jump to the end of the file to check the latest GC activity.

By paging through the log, we noticed that before each Full GC, the memory usage of the young generation (PSYoungGen) quickly reached its upper limit, indicating that the object allocation rate is too high.
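The observation above can be cross-checked with a small script. This is a sketch under assumptions: the sample log is synthetic, and it assumes each Full GC line is preceded by a young-generation GC line in the format shown, from which the PSYoungGen figures are extracted.

```shell
#!/bin/sh
# Synthetic sample: a young-generation GC followed by a Full GC, twice.
GC_LOG="$(mktemp)"
cat > "$GC_LOG" <<'EOF'
2025-04-18T10:15:30.001+0800: 12343.556: [GC (Allocation Failure) [PSYoungGen: 6100K->512K(6144K)] 14000K->8700K(18432K), 0.0051234 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]
2025-04-18T10:15:32.123+0800: 12345.678: [Full GC (Ergonomics) [PSYoungGen: 2048K->0K(6144K)] [ParOldGen: 8192K->4096K(12288K)] 10240K->4096K(18432K), [Metaspace: 3000K->3000K(1056768K)], 0.1501234 secs] [Times: user=0.30 sys=0.02, real=0.15 secs]
2025-04-18T10:15:34.200+0800: 12347.755: [GC (Allocation Failure) [PSYoungGen: 6120K->480K(6144K)] 10216K->4576K(18432K), 0.0049876 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]
2025-04-18T10:15:35.345+0800: 12348.900: [Full GC (Ergonomics) [PSYoungGen: 2100K->0K(6144K)] [ParOldGen: 8300K->4200K(12288K)] 10400K->4200K(18432K), [Metaspace: 3000K->3000K(1056768K)], 0.1412345 secs] [Times: user=0.28 sys=0.01, real=0.14 secs]
EOF

# For every Full GC line, print the PSYoungGen figures
# ("before->after(capacity)") from the line that precedes it.
YOUNG_BEFORE_FULL=$(awk '
    /Full GC/ && prev != "" {
        if (match(prev, /PSYoungGen: [0-9]+K->[0-9]+K\([0-9]+K\)/))
            print substr(prev, RSTART, RLENGTH)
    }
    { prev = $0 }
' "$GC_LOG")

echo "$YOUNG_BEFORE_FULL"

# Quick look at the raw context: one line before each Full GC.
grep -B 1 'Full GC' "$GC_LOG" > /dev/null
```

In the sample, the young generation is at 6100K and 6120K of a 6144K capacity just before the collections, which is the "quickly reached its upper limit" pattern described above.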

Step 4: Dynamically track real-time logs

If the GC log is still being written (for example, the application is running), we can use the real-time follow function of less:

  • Press F to enter a follow mode similar to tail -f, which keeps displaying new content appended to the end of the file. Press Ctrl+C to return to normal navigation.
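Follow mode can also be requested when less starts. The sketch below simulates a JVM appending to the log (the file and its two lines are synthetic) and shows the startup form less +F. The --follow-name variant in the comment is only relevant if the log is rotated by renaming, and assumes your less build supports that option.

```shell
#!/bin/sh
# Synthetic log standing in for a file the JVM is still writing to.
GC_LOG="$(mktemp)"
echo '2025-04-18T10:15:30.001+0800: 12343.556: [GC (Allocation Failure) [PSYoungGen: 6100K->512K(6144K)] 14000K->8700K(18432K), 0.0051234 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]' > "$GC_LOG"

# Simulate the application appending a new event.
echo '2025-04-18T10:15:32.123+0800: 12345.678: [Full GC (Ergonomics) [PSYoungGen: 2048K->0K(6144K)] [ParOldGen: 8192K->4096K(12288K)] 10240K->4096K(18432K), [Metaspace: 3000K->3000K(1056768K)], 0.1501234 secs] [Times: user=0.30 sys=0.02, real=0.15 secs]' >> "$GC_LOG"

# Non-interactive view of what follow mode would be showing at the bottom.
LAST_EVENT=$(tail -n 1 "$GC_LOG")
echo "$LAST_EVENT"

# Interactive: start less directly in follow mode.
# Ctrl+C leaves follow mode (search and paging work again); F resumes it.
if [ -t 1 ]; then
    less +F "$GC_LOG"
    # less --follow-name +F "$GC_LOG"   # keep following the name across log rotation
fi
```

Compared with tail -f, the practical gain is that you can interrupt following, search backward for an earlier Full GC, and then resume without leaving the tool.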

Suppose we find that the Full GC frequency suddenly increases at a certain point in time. Combined with application logs or monitoring data, we can infer that a business function (such as a batch task) may be causing a surge in memory allocation.

Step 5: Search for complex patterns with regular expressions

Sometimes we need to find a specific pattern in the GC log, such as the GC events in a certain period of time. less supports regular expression search. For example, to find the log entries from the 10 a.m. hour on April 18, 2025:

  • Press / to enter search mode.
  • Type 2025-04-18T10: and press Enter.

This matches every GC event logged between 10:00 and 10:59, which helps us focus the analysis on a specific time period.
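A slightly narrower pattern can combine the time window with the event type. The sketch below uses a synthetic sample log and two patterns written with constructs ([0-9] and .*) that behave the same in common regex flavors, since the exact flavor less uses depends on how it was built. The patterns are checked with grep first, then passed to less on startup.

```shell
#!/bin/sh
# Synthetic sample spanning two hours.
GC_LOG="$(mktemp)"
cat > "$GC_LOG" <<'EOF'
2025-04-18T10:15:32.123+0800: 12345.678: [Full GC (Ergonomics) [PSYoungGen: 2048K->0K(6144K)] [ParOldGen: 8192K->4096K(12288K)] 10240K->4096K(18432K), [Metaspace: 3000K->3000K(1056768K)], 0.1501234 secs] [Times: user=0.30 sys=0.02, real=0.15 secs]
2025-04-18T10:17:05.500+0800: 12439.055: [Full GC (Ergonomics) [PSYoungGen: 2100K->0K(6144K)] [ParOldGen: 8300K->4200K(12288K)] 10400K->4200K(18432K), [Metaspace: 3000K->3000K(1056768K)], 0.1412345 secs] [Times: user=0.28 sys=0.01, real=0.14 secs]
2025-04-18T10:42:10.000+0800: 13943.555: [GC (Allocation Failure) [PSYoungGen: 6100K->512K(6144K)] 10300K->4712K(18432K), 0.0051234 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]
2025-04-18T11:03:44.250+0800: 15237.805: [Full GC (Ergonomics) [PSYoungGen: 1990K->0K(6144K)] [ParOldGen: 8250K->4150K(12288K)] 10240K->4150K(18432K), [Metaspace: 3000K->3000K(1056768K)], 0.1498765 secs] [Times: user=0.29 sys=0.02, real=0.15 secs]
EOF

# Every event in the 10 a.m. hour (the pattern used in the steps above).
HOUR_PATTERN='2025-04-18T10:'
# Only Full GCs between 10:10 and 10:19.
FULL_GC_PATTERN='2025-04-18T10:1[0-9]:.*Full GC'

HOUR_MATCHES=$(grep -c "$HOUR_PATTERN" "$GC_LOG")
FULL_GC_MATCHES=$(grep -c "$FULL_GC_PATTERN" "$GC_LOG")
echo "Events at 10 a.m.: $HOUR_MATCHES, Full GCs 10:10-10:19: $FULL_GC_MATCHES"

# Interactive: open less at the first line matching the narrower pattern.
if [ -t 1 ]; then
    less "+/$FULL_GC_PATTERN" "$GC_LOG"
fi
```

Testing a pattern with grep -c on the command line first is a cheap way to confirm it matches what you expect before searching a multi-GB file interactively.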

Step 6: Export the key segment (optional)

If you need to export part of the log for further analysis, you can combine this with the marking function of less:

  • Press m and then a letter (such as a) to mark the current position.
  • Navigate to another location, then press m and another letter (such as b) to mark the end position.
  • Use an external tool (e.g. sed or awk) to extract the content between the marks.

For example, to extract the log between marks a and b. The marks exist only inside less, so mark_a and mark_b below stand for text that identifies the two marked lines (such as their timestamps):

sed -n '/mark_a/,/mark_b/p'  > gc_segment.log
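A concrete version of that extraction might look like the following sketch. The sample log, the two timestamps used as range boundaries and the temporary output directory are all assumptions made for illustration. In practice you would copy the timestamps from the lines you marked in less.

```shell
#!/bin/sh
# Synthetic sample log with one event per line.
GC_LOG="$(mktemp)"
cat > "$GC_LOG" <<'EOF'
2025-04-18T10:15:30.001+0800: 12343.556: [GC (Allocation Failure) [PSYoungGen: 6100K->512K(6144K)] 14000K->8700K(18432K), 0.0051234 secs]
2025-04-18T10:15:32.123+0800: 12345.678: [Full GC (Ergonomics) [PSYoungGen: 2048K->0K(6144K)] [ParOldGen: 8192K->4096K(12288K)] 10240K->4096K(18432K), 0.1501234 secs]
2025-04-18T10:15:36.000+0800: 12349.555: [GC (Allocation Failure) [PSYoungGen: 6110K->500K(6144K)] 10206K->4596K(18432K), 0.0050001 secs]
2025-04-18T10:15:40.250+0800: 12353.805: [Full GC (Ergonomics) [PSYoungGen: 2100K->0K(6144K)] [ParOldGen: 8300K->4200K(12288K)] 10400K->4200K(18432K), 0.1412345 secs]
2025-04-18T10:15:45.000+0800: 12358.555: [GC (Allocation Failure) [PSYoungGen: 6090K->490K(6144K)] 10290K->4690K(18432K), 0.0049999 secs]
EOF

# Text taken from the lines marked a and b in less (hypothetical values).
START='2025-04-18T10:15:32'
END='2025-04-18T10:15:40'

# Write the segment to a scratch directory rather than the current one.
WORKDIR="$(mktemp -d)"
SEGMENT="$WORKDIR/gc_segment.log"

# Print everything from the first line matching START
# through the next line matching END, inclusive.
sed -n "/$START/,/$END/p" "$GC_LOG" > "$SEGMENT"

SEGMENT_LINES=$(grep -c '' "$SEGMENT")
echo "Exported $SEGMENT_LINES lines to $SEGMENT"
```

less can also do this without leaving the pager: according to its manual, typing | followed by a mark letter and a shell command (for example |a and then cat > gc_segment.log) pipes the section between that mark and the current screen to the command.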

Analysis results

Through the above steps, we found:

  • Full GC is triggered frequently because the allocation rate in the young generation is too high.
  • During a certain period, a business function creates a large number of temporary objects, triggering frequent GC.
  • Optimization suggestions: adjust JVM parameters (such as increasing the size of the young generation) or optimize the code to reduce object allocation.
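As an illustration of the first suggestion, the sketch below assembles a JVM command line with a larger young generation. It assumes a JDK 8 style JVM, which matches the log format shown earlier. The size, the log path and app.jar are hypothetical placeholders, and a suitable -Xmn value has to be found by measuring, not copied from here.

```shell
#!/bin/sh
# Hypothetical values: tune the size for your heap and workload.
YOUNG_GEN_SIZE="512m"
GC_LOG_PATH="/var/log/myapp/gc.log"   # hypothetical log location

# -Xmn sets the initial and maximum size of the young generation.
# The remaining flags are the JDK 8 options that produce dated,
# detailed GC logs in the format analyzed above.
JAVA_OPTS="-Xmn${YOUNG_GEN_SIZE} -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:${GC_LOG_PATH}"

# Print the command instead of running it; app.jar is a placeholder.
echo "java $JAVA_OPTS -jar app.jar"
```

After a change like this, repeating the Full GC search from Step 2 on the new log shows whether the interval between Full GCs actually grew.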

Why choose less?

In summary, less has the following advantages when processing GC logs:

  • Efficiency: Low memory footprint and fast loading of large files.
  • Flexibility: Supports two-way paging, search and real-time following to meet complex analysis needs.
  • Safety: System stability is not put at risk by excessive memory usage.

By comparison, cat cannot be used interactively, more offers limited navigation, and vim uses too much memory, so none of them is suitable for handling large GC logs.

Summary

In a Linux environment, reading and analyzing GC logs requires choosing the right tools to balance efficiency and system stability. Through a practical case, we showed how to use less to efficiently locate and analyze Full GC problems in GC logs. Mastering the shortcut keys and functions of less can significantly improve the efficiency of log analysis and help locate and solve problems quickly.
