SoFunction
Updated on 2025-03-02

Parsing a custom language with tree-sitter from Rust: a detailed walkthrough

To parse a custom language with tree-sitter from Rust, you define the language's grammar, generate a parser from it, and then link that parser into a Rust program. The detailed steps are:

1. Define the syntax of a custom language

First, you need to create a tree-sitter language definition. Suppose you want to define a simple custom language, create a new directory and add a file named :

mkdir my_language
cd my_language
touch 

Define your language syntax in . For example, the following is a simple example representing a hypothetical language:

 = grammar({
    name: 'my_language',
    rules: {
        // Rule definitions
        program: $ => repeat($.statement),
        statement: $ => choice(
            $.expression,
            $.assignment
        ),
        expression: $ => /[a-zA-Z_][a-zA-Z0-9_]*/,
        assignment: $ => seq(
            $.expression,
            '=',
            $.expression
        ),
    }
});
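To make the rules above concrete, here is a minimal Rust sketch that mirrors them by hand. It does not use tree-sitter or the generated parser, and it is not how tree-sitter works internally. The names Statement, is_expression, and parse_program are hypothetical, and splitting tokens on whitespace is a simplifying assumption.

```rust
/// Mirror of the `statement` rule: a bare expression or an assignment.
#[derive(Debug, PartialEq)]
enum Statement {
    Expression(String),
    Assignment(String, String),
}

/// Mirror of the `expression` rule: /[a-zA-Z_][a-zA-Z0-9_]*/
fn is_expression(token: &str) -> bool {
    let mut chars = token.chars();
    match chars.next() {
        Some(c) if c.is_ascii_alphabetic() || c == '_' => {
            chars.all(|c| c.is_ascii_alphanumeric() || c == '_')
        }
        _ => false,
    }
}

/// Mirror of the `program` rule: repeat(statement).
/// ASSUMPTION: tokens are separated by whitespace, which keeps the sketch short.
fn parse_program(source: &str) -> Result<Vec<Statement>, String> {
    let tokens: Vec<&str> = source.split_whitespace().collect();
    let mut statements = Vec::new();
    let mut i = 0;
    while i < tokens.len() {
        let left = tokens[i];
        if !is_expression(left) {
            return Err(format!("unexpected token `{}`", left));
        }
        if tokens.get(i + 1) == Some(&"=") {
            // assignment: expression '=' expression
            match tokens.get(i + 2) {
                Some(right) if is_expression(right) => {
                    statements.push(Statement::Assignment(left.to_string(), right.to_string()));
                    i += 3;
                }
                Some(right) => return Err(format!("unexpected token `{}`", right)),
                None => return Err("expected an expression after `=`".to_string()),
            }
        } else {
            // a bare expression is a statement on its own
            statements.push(Statement::Expression(left.to_string()));
            i += 1;
        }
    }
    Ok(statements)
}

fn main() {
    println!("{:?}", parse_program("x = y\nz"));
    println!("{:?}", parse_program("x = 10"));
}
```

Under this reading of the rules, input such as x = 10 or z = x + y is rejected, because numbers and + are not covered by the expression rule. With the real generated parser, such input would likely show up as ERROR nodes in the syntax tree rather than as a hard failure.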

2. Generate the C parser

Use the tree-sitter-cli tool to generate a C parser from the grammar. If tree-sitter-cli is not installed yet, install it with npm:

npm install -g tree-sitter-cli

Then run the following command in your custom language directory to generate the parser:

tree-sitter generate

This generates the parser's C source code under my_language/src.

3. Use a custom parser in a Rust project

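Next, use the generated parser in a Rust project. The build script and the project layout shown in the later steps expect the my_language directory to sit inside the Rust project, so move or copy it there after creating the project: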

cargo new tree_sitter_my_language
cd tree_sitter_my_language

4. Add dependencies

In the file, add tree-sitter and cc dependencies:

[dependencies]
tree-sitter = "0.23"
[build-dependencies]
cc = "1.0"

5. Create

Create a file in the project root directory to compile a custom parser:

extern crate cc;
fn main() {
    cc::Build::new()
        .include("my_language/src") // Point to the src directory of the custom language
        .file("my_language/src/")
        .compile("tree-sitter-my_language");
    println!("cargo:rerun-if-changed=my_language/src/");
}
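If the generated sources are missing, the build script above tends to fail with a fairly opaque C compiler error. A small helper along the following lines could check for them first. This is a sketch under assumptions: the file names parser.c and scanner.c follow the usual tree-sitter layout and are not taken from the text above, and collect_parser_sources is a hypothetical helper.

```rust
use std::path::{Path, PathBuf};

/// Returns the C sources found in a tree-sitter `src` directory.
/// ASSUMPTION: the generated parser is named `parser.c`, and an optional
/// hand-written external scanner, if any, is named `scanner.c`.
fn collect_parser_sources(src_dir: &Path) -> Result<Vec<PathBuf>, String> {
    let parser = src_dir.join("parser.c");
    if !parser.is_file() {
        return Err(format!(
            "{} not found; run `tree-sitter generate` first",
            parser.display()
        ));
    }
    let mut sources = vec![parser];
    let scanner = src_dir.join("scanner.c");
    if scanner.is_file() {
        sources.push(scanner);
    }
    Ok(sources)
}

fn main() {
    match collect_parser_sources(Path::new("my_language/src")) {
        Ok(sources) => {
            // Ask Cargo to rebuild whenever one of the C sources changes.
            for source in &sources {
                println!("cargo:rerun-if-changed={}", source.display());
            }
        }
        Err(message) => eprintln!("{}", message),
    }
}
```

In a real build script, the returned paths could be handed to cc::Build::files in place of the single file call.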

6. Write Rust code

Write code in src/ using a custom parser:

use tree_sitter::{Parser, Language};
// Introduce custom language
extern "C" { fn tree_sitter_my_language() -> Language; }
fn main() {
    // Initialize parser
    let mut parser = Parser::new();
    // Set custom language
    let language = unsafe { tree_sitter_my_language() };
    parser.set_language(&language).expect("Error loading custom language grammar");
    // Custom language code to parse
    let source_code = r#"
    x = 10
    y = 20
    z = x + y
    "#;
    // Parse the source code
    let tree = (source_code, None).unwrap();
    // Get the root node of the syntax tree
    let root_node = tree.root_node();
    // Output analysis results
    println!("Parsed custom language code:\n{:?}", root_node);
}
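Printing the root node with {:?} typically shows only that node's kind and position. The tree_sitter crate also provides Node::to_sexp(), which returns the whole tree as a single-line S-expression string. The helper below is a sketch for indenting such a string so the nesting is easier to read. It uses only the standard library; indent_sexp is a hypothetical name, and the sample string is illustrative rather than captured output.

```rust
/// Puts every opening parenthesis after the first on its own line,
/// indented by two spaces per nesting level.
fn indent_sexp(sexp: &str) -> String {
    let mut out = String::new();
    let mut depth = 0usize;
    for ch in sexp.chars() {
        match ch {
            '(' => {
                if !out.is_empty() {
                    // Drop the space that separated this node from the previous text.
                    while out.ends_with(' ') {
                        out.pop();
                    }
                    out.push('\n');
                    out.push_str(&"  ".repeat(depth));
                }
                out.push('(');
                depth += 1;
            }
            ')' => {
                depth = depth.saturating_sub(1);
                out.push(')');
            }
            _ => out.push(ch),
        }
    }
    out
}

fn main() {
    // ASSUMPTION: an illustrative S-expression, not real parser output.
    let sample = "(program (statement (assignment (expression) (expression))))";
    println!("{}", indent_sexp(sample));
}
```

In the program above, the string would come from root_node.to_sexp() instead of a literal.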

7. Run the project

Ensure that the project structure looks like this:

tree_sitter_my_language/
├── 
├── 
├── my_language/        # Custom language directory
│   ├── 
│   ├── src/
│   │   ├── 
│   │   └── ... (Other generated files)
└── src/
    └── 

Then run the following command:

cargo build
cargo run

This parses the custom language code and outputs the root node information of the syntax tree.
