Using watchfiles in Python to monitor directory changes

At work you will inevitably need to monitor a specified directory and run some processing whenever a file in it changes. How to monitor a directory is the subject of this article.

To monitor directories we can use the watchfiles module, which is simple to use and performs well. The main reason for the performance is that the code that interacts with the underlying filesystem is written in Rust.

After installing it via pip install watchfiles, let's see how it's used.

from watchfiles import watch
# The current directory is /Users/satori/Desktop/project
for change in watch("."):
    print(change)

When we run this program it blocks and keeps listening for changes in the specified directory. Now let's create a few files in the current directory and see what happens.

Create a text file, and the program outputs the following:

{(<Change.added: 1>, '/Users/satori/Desktop/project/')}

Each value yielded is a set. Only one file changed here, so the set has a single element. Each element is a tuple whose first item is the type of operation, a member of the Change enum. There are three types in total: added, modified, and deleted.

The second item of the tuple is the path that changed, so this output tells us that a new file was added to the current directory.
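To make the structure of each yielded set concrete, here is a minimal sketch that turns the (operation, path) tuples into readable lines. The names LABELS, describe_changes and run are hypothetical helpers, not part of watchfiles. The sketch relies only on Change being an integer enum whose members equal 1 (added), 2 (modified) and 3 (deleted).

```python
# Map the integer values of watchfiles.Change to readable labels.
# Change is an IntEnum, so its members compare equal to these integers.
LABELS = {1: "added", 2: "modified", 3: "deleted"}


def describe_changes(changes):
    """Return one 'label: path' line per (operation, path) tuple, sorted by path."""
    lines = []
    for operation, path in sorted(changes, key=lambda item: item[1]):
        lines.append(f"{LABELS[int(operation)]}: {path}")
    return lines


def run(directory="."):
    # Imported here so describe_changes can be used without watchfiles installed.
    from watchfiles import watch

    # Blocks and prints one line per change until interrupted.
    for changes in watch(directory):
        for line in describe_changes(changes):
            print(line)
```

Calling run() starts the blocking watch loop; describe_changes can also be called on its own with any set of tuples.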

Create another txt_files directory, and the program output is as follows:

{(<Change.added: 1>, '/Users/satori/Desktop/project/txt_files')}

Whether the new entry is a directory or a text file, both are treated as files, so the output has the same form. If you want to know whether what was added is a directory or a regular file, you need to check it yourself with the os module.
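The os check mentioned above could look roughly like this. The function name kind_of is a hypothetical helper. Note the assumption that the check runs while the path still exists: for a deleted path there is nothing left to inspect.

```python
import os


def kind_of(path):
    """Classify a path reported by watchfiles as 'directory', 'file' or 'missing'."""
    if os.path.isdir(path):
        return "directory"
    if os.path.isfile(path):
        return "file"
    # Deleted paths no longer exist, so their kind cannot be checked after the fact.
    return "missing"
```

In a watch loop you would call kind_of(path) on the second item of each tuple, typically only for added or modified entries.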

We create a file in the txt_files directory, and the program output is as follows:

{(<Change.added: 1>, '/Users/satori/Desktop/project/txt_files/')}

So the watch function listens not only to the specified directory, but also, recursively, to the subdirectories within it.

Now a question. The current directory contains a file and a txt_files directory, and txt_files also contains a file of the same name. If you move the file from the current directory into txt_files and agree to overwrite, what will the program output?

{(<Change.deleted: 3>, '/Users/satori/Desktop/project/txt_files/'),
(<Change.added: 1>, '/Users/satori/Desktop/project/txt_files/'),
(<Change.deleted: 3>, '/Users/satori/Desktop/project/')}

This time the output set contains three tuples, so the operation involved three file changes. The file in txt_files was replaced, so it was deleted and then created again. The file in the current directory was moved away, so it is reported as deleted.

We then create multiple levels of directories at the same time with mkdir -p a/b/c. The program output is as follows:

{(<Change.added: 1>, '/Users/satori/Desktop/project/a/b/c'),
(<Change.added: 1>, '/Users/satori/Desktop/project/a/b'),
(<Change.added: 1>, '/Users/satori/Desktop/project/a')}

All of this is fairly simple. Besides the watch function there is also awatch, which does the same thing and takes essentially the same arguments, except that awatch must be used inside a coroutine. Let's look at an example.

import sys
import asyncio
from asyncio import StreamReader
from watchfiles import awatch, Change

# Monitor the specified directory
async def watch_files(path):
    # awatch(...) returns an asynchronous generator that must be iterated with async for
    async for change in awatch(path):
        print("-" * 20)
        # change is a set of changes that may involve multiple files
        for item in change:
            if item[0] == Change.added:
                operation = "You added"
            elif item[0] == Change.modified:
                operation = "You modified"
            else:
                operation = "You deleted"
            print(f"{operation} `{item[1]}`")
        print("\n")

# Read command-line input. Note: the built-in input function cannot be used here because it is a synchronous blocking call.
# Such a call is a big no-no inside a coroutine: it blocks the whole thread, so we switch to an asynchronous reader.
async def read_from_stdin():
    reader = StreamReader()
    protocol = asyncio.StreamReaderProtocol(reader)
    loop = asyncio.get_running_loop()
    await loop.connect_read_pipe(lambda: protocol, sys.stdin)
    return reader

# The read_from_stdin function is not the focus for the moment.
# Just know that it reads the command line asynchronously; more on this later.
# Now define the main coroutine
async def main():
    # Monitor the current directory; keep a reference so the task is not garbage-collected
    watcher = asyncio.create_task(watch_files("."))
    # Create the reader
    stdin_reader = await read_from_stdin()
    while True:
        # Read a line of input from the command line
        command = await stdin_reader.readline()
        if not command:
            # An empty result means stdin was closed, so stop reading
            break
        # Execute the command in a shell and wait for it to finish
        procs = await asyncio.create_subprocess_shell(command.decode())
        await procs.wait()
    watcher.cancel()

asyncio.run(main())

Take a look at the results:

Everything works as expected: the file changes are detected. One more point: watch and awatch can listen to multiple directories at the same time, because their first parameter is *paths.

Let's test this by listening to multiple directories at the same time. We'll create two subdirectories in the current directory: boy and girl, and then monitor each of them.

The output is as expected, so both functions can listen to any number of directories. Note that because only the two subdirectories are being monitored, changes to files directly in the current directory are not reported.
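As a rough sketch of the multi-directory case, the code below watches boy and girl together and reports which of the two each change belongs to. The helper names owning_root and run are hypothetical, and the sketch assumes both directories already exist relative to the working directory.

```python
import os


def owning_root(path, roots):
    """Return the watched root that contains path, or None if none does."""
    path = os.path.abspath(path)
    for root in roots:
        root_abs = os.path.abspath(root)
        # commonpath compares whole path components, so "boy2" is not inside "boy".
        if os.path.commonpath([path, root_abs]) == root_abs:
            return root
    return None


def run(roots=("boy", "girl")):
    # Imported here so owning_root can be used without watchfiles installed.
    from watchfiles import watch

    # watch accepts any number of paths as positional arguments.
    for changes in watch(*roots):
        for operation, path in changes:
            print(owning_root(path, roots), operation.name, path)
```

A change to a file directly in the current directory never reaches this loop, since neither root contains it.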

That covers the basic use of these two functions. They also accept other parameters; here are a few of them in brief.

Filter (watch_filter)

watchfiles monitors directories for file changes, but not every change is reported.

watchfiles applies a built-in default filter (DefaultFilter) that drops files that are usually irrelevant to your application. If you want to filter out other files as well, pass your own filter through the watch_filter parameter.
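A custom filter can be as small as a function that takes the change type and the path and returns True when the change should be reported. The sketch below keeps only .txt files; the names only_txt and run are hypothetical. Keep in mind that passing a plain function this way replaces the default filter, so its built-in ignore rules no longer apply.

```python
def only_txt(change, path):
    """Return True for changes that should be reported: here, only .txt files."""
    return path.endswith(".txt")


def run(directory="."):
    # Imported here so only_txt can be used without watchfiles installed.
    from watchfiles import watch

    # Changes for which only_txt returns False are dropped before being yielded.
    for changes in watch(directory, watch_filter=only_txt):
        print(changes)
```

The same function can be passed to awatch through the same watch_filter parameter.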

Stop event (stop_event)

The iterator does not stop on its own while monitoring, so if you want to control when it ends, you can pass an event.

import asyncio
from watchfiles import awatch

async def watch_files(*paths, stop_event):
    async for _ in awatch(*paths, stop_event=stop_event):
        pass
    print("Stop monitoring")

async def main():
    event = asyncio.Event()
    # Pass an event, or to be precise, any object that has an is_set method
    task = asyncio.create_task(watch_files(".", stop_event=event))
    # Monitoring stops once event.is_set() returns True
    print("is_set:", event.is_set())
    await asyncio.sleep(3)  # sleep for 3 seconds
    event.set()
    print("After three seconds, is_set:", event.is_set())
    # Wait for the watcher task to finish printing
    await task

asyncio.run(main())
"""
is_set: False
After three seconds, is_set: True
Stop monitoring
"""

Recursive monitoring (recursive)

If this parameter is True (the default), subdirectories are monitored recursively; otherwise only the files directly inside the given top-level directories are monitored.

The other parameters are rarely used, so I won't go into them here. If you're interested, you can look them up yourself.
