Ten days ago, I released an open source tool for trimming .NET Core programs. Compared with the built-in trimmer of .NET Core, it not only trims programs more effectively, but also supports WPF and WinForms programs.
Many readers are interested in how this open source project works, so this article explains the principles behind it.
Technique 1. Detecting the assemblies and classes loaded by the program
Microsoft provides Diagnostics, a library for analyzing the runtime behavior of .NET Core programs. It exposes rich runtime information, such as class instance creation, assembly loading, class loading, method calls, GC runs, file reads and writes, and network connections. The Visual Studio tool that measures how long each method call takes is implemented using Diagnostics.
To use the Diagnostics library, we first need to install the two NuGet packages that provide it, and then use the DiagnosticsClient class to connect to the process of the .NET Core program being analyzed. The code looks like this:
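using Microsoft.Diagnostics.NETCore.Client;
using Microsoft.Diagnostics.Tracing;
using Microsoft.Diagnostics.Tracing.Parsers;
using Microsoft.Diagnostics.Tracing.Parsers.Clr;
using System.Diagnostics;
using System.Diagnostics.Tracing;

string filepath = @"E:\temp\test6\"; // The path of the program to be analyzed
ProcessStartInfo psInfo = new ProcessStartInfo(filepath);
psInfo.UseShellExecute = true;
using Process? p = Process.Start(psInfo); // Start the program

// Events to be listened to.
// The level and keyword arguments were lost from this listing; Informational and Keywords.All are assumed.
var providers = new List<EventPipeProvider>()
{
    new EventPipeProvider("Microsoft-Windows-DotNETRuntime",
        EventLevel.Informational, (long)ClrTraceEventParser.Keywords.All)
};
var client = new DiagnosticsClient(p.Id); // Set the process that DiagnosticsClient monitors
using EventPipeSession session = client.StartEventPipeSession(providers, false); // Start listening
var source = new EventPipeEventSource(session.EventStream);
source.Clr.All += (TraceEvent obj) =>
{
    if (obj is ModuleLoadUnloadTraceData) // Assembly loading event
    {
        var data = (ModuleLoadUnloadTraceData)obj;
        string path = data.ModuleILPath; // Get the path to the assembly
        Console.WriteLine($"Assembly Loaded:{path}");
    }
    else if (obj is TypeLoadStopTraceData) // Class loading event
    {
        var data = (TypeLoadStopTraceData)obj;
        string typeName = data.TypeName; // Get the class name
        Console.WriteLine($"Type Loaded:{typeName}");
    }
};
source.Process(); // Blocks and dispatches events until the session ends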
Different kinds of events are delivered as objects of different types, all of which inherit from TraceEvent. The two I handle here are the assembly loading event, ModuleLoadUnloadTraceData, and the class loading event, TypeLoadStopTraceData.
This tells us which assemblies and types are loaded while the program runs. Anything that was never loaded is a candidate for deletion.
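Working out what was not loaded is essentially a set difference. The following is a minimal sketch, assuming the loaded assembly paths have already been collected from the ModuleLoadUnloadTraceData events. The helper name FindUnloadedAssemblies and the sample paths are hypothetical and only serve as an illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: returns the candidate files that never appeared in a load event.
static List<string> FindUnloadedAssemblies(IEnumerable<string> candidates, IEnumerable<string> loaded)
{
    // Windows paths are case-insensitive, so compare them ignoring case.
    var loadedSet = new HashSet<string>(loaded, StringComparer.OrdinalIgnoreCase);
    return candidates.Where(path => !loadedSet.Contains(path)).ToList();
}

// Made-up sample data standing in for a directory listing and the recorded events.
var allDlls = new[] { @"C:\app\App.dll", @"C:\app\Used.dll", @"C:\app\Unused.dll" };
var loadedPaths = new[] { @"c:\app\app.dll", @"C:\app\Used.dll" };

foreach (string dll in FindUnloadedAssemblies(allDlls, loadedPaths))
{
    Console.WriteLine($"Never loaded: {dll}");
}
```

The same idea applies to type names collected from TypeLoadStopTraceData. Note that a single run only shows what that particular run touched, so the program should be exercised thoroughly before anything is deleted.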
Technique 2. Deleting unused classes from an assembly
The tool can delete the IL of classes that are never used in an assembly. This feature uses the dnlib library to edit assembly files. dnlib is an open source project that reads, writes and edits .NET assembly files.
In dnlib, we call ModuleDefMD.Load to load an existing assembly; the return value of the Load method is of type ModuleDefMD. A ModuleDefMD represents the assembly's contents. For example, its Types property holds the types in the assembly. We can modify the ModuleDefMD and the objects it contains, and then save the modified assembly to disk by calling the Write method.
For example, the following code changes all non-public types in an assembly to public and removes all the attributes applied to their methods:
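using dnlib.DotNet;

string filename = @"E:\temp\net6.0\";
ModuleDefMD module = ModuleDefMD.Load(filename);
foreach (var typeDef in module.Types)
{
    if (typeDef.IsPublic == false)
    {
        typeDef.Attributes |= TypeAttributes.Public; // Modify the access level of the class
    }
    foreach (var methodDef in typeDef.Methods)
    {
        methodDef.CustomAttributes.Clear(); // Clear the attributes applied to the method
    }
}
module.Write(@"E:\temp\net6.0\"); // Save the modified assembly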
Here is the source code for the assembly to be tested:
internal class Class1
{
    [DisplayName("AAA")]
    public void AA()
    {
        Console.WriteLine("hello");
    }
}
The following are the decompilation results of the modified assembly:
public class Class1
{
    public void AA()
    {
        Console.WriteLine("hello");
    }
}
You can see that our assembly modification works.
Once we know how to modify an assembly with dnlib, deleting unused types looks simple: just remove the corresponding type from the Types property of ModuleDefMD. In practice, however, this runs into problems, because the class we want to delete may be referenced from other places. Even though those places only reference the class and never actually call it, the Write method of ModuleDefMD still validates the assembly before saving it. If a dangling reference remains, Write throws a ModuleWriterException, such as:
ModuleWriterException: 'A method was removed that is still referenced by this module.'
Therefore, we would need to check the assembly carefully to make sure that every reference to the class being deleted is removed as well. Because a class definition itself takes up very little space in the file, and most of the space is taken by the method bodies, I chose an alternative: do not delete the class, just clear the bodies of its methods.
In dnlib, a method is represented by the MethodDef type, and MethodDef's Body property, of type CilBody, represents the method body. If a method has a body (that is, it is not an abstract method or similar), CilBody's Instructions property is the collection of IL instructions that make up the body. So my first idea was to clear the method body with the following code:
methodDef.Body.Instructions.Clear();
However, saving a ModuleDefMD whose method bodies were emptied this way can produce an invalid assembly. For example, a method that declares a return value is left without anything to return once its body is cleared. So I changed my approach: replace every method body with the IL that corresponds to the C# statement throw null;. Throwing an exception is valid in any method body regardless of its return type, so the result stays well-formed. I wrote the following code to clean up the method body:
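methodDef.Body.ExceptionHandlers.Clear();
methodDef.Body.Instructions.Clear();
methodDef.Body.Variables.Clear();
// The opcodes were lost from this listing; nop, ldnull, throw is what "throw null;" compiles to in a debug build.
methodDef.Body.Instructions.Add(new Instruction(OpCodes.Nop) { Offset = 0 });
methodDef.Body.Instructions.Add(new Instruction(OpCodes.Ldnull) { Offset = 1 });
methodDef.Body.Instructions.Add(new Instruction(OpCodes.Throw) { Offset = 2 });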
The IL added by the last three lines is what the C# statement throw null; compiles to.
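To show how the two techniques fit together, here is a minimal sketch that stubs the method bodies of every type that was never loaded. It is not the project's actual code. The names ReplaceBodyWithThrowNull and StubUnloadedTypes are hypothetical, and the sketch assumes that the recorded type names use the same format as dnlib's FullName, which may not hold for generic or nested types. The demo builds a small module in memory instead of loading a real file.

```csharp
using System;
using System.Collections.Generic;
using dnlib.DotNet;
using dnlib.DotNet.Emit;

// Replaces a method body with the IL for "throw null;" (ldnull, throw).
static void ReplaceBodyWithThrowNull(MethodDef method)
{
    CilBody body = method.Body;
    if (body is null) return; // abstract, extern and P/Invoke methods have no IL body
    body.ExceptionHandlers.Clear();
    body.Variables.Clear();
    body.Instructions.Clear();
    body.Instructions.Add(OpCodes.Ldnull.ToInstruction());
    body.Instructions.Add(OpCodes.Throw.ToInstruction());
}

// Hypothetical driver: stubs every method of every type whose name was never recorded.
// Assumption: loadedTypeNames uses the same name format as TypeDef.FullName.
static int StubUnloadedTypes(ModuleDef module, ISet<string> loadedTypeNames)
{
    int stubbed = 0;
    foreach (TypeDef type in module.GetTypes()) // GetTypes() also returns nested types
    {
        if (type.IsGlobalModuleType || loadedTypeNames.Contains(type.FullName)) continue;
        foreach (MethodDef method in type.Methods)
        {
            if (!method.HasBody) continue;
            ReplaceBodyWithThrowNull(method);
            stubbed++;
        }
    }
    return stubbed;
}

// Demo on an in-memory module: one type with one method that returns 42.
var demoModule = new ModuleDefUser("Demo.dll");
var demoType = new TypeDefUser("Demo", "NeverLoaded", demoModule.CorLibTypes.Object.TypeDefOrRef);
demoModule.Types.Add(demoType);
var demoMethod = new MethodDefUser("GetAnswer",
    MethodSig.CreateStatic(demoModule.CorLibTypes.Int32),
    MethodImplAttributes.IL | MethodImplAttributes.Managed,
    MethodAttributes.Public | MethodAttributes.Static);
demoMethod.Body = new CilBody();
demoMethod.Body.Instructions.Add(OpCodes.Ldc_I4.ToInstruction(42));
demoMethod.Body.Instructions.Add(OpCodes.Ret.ToInstruction());
demoType.Methods.Add(demoMethod);

// An empty set means "nothing was loaded", so every method body gets stubbed.
int count = StubUnloadedTypes(demoModule, new HashSet<string>());
Console.WriteLine($"Stubbed {count} method(s)");
```

For a real assembly, the module would come from ModuleDefMD.Load and be saved with Write afterwards. The sketch leaves out the leading nop, which is only a debug-build artifact and is not required for the body to be valid.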
The complete source code is available on the project's GitHub page, project address:/yangzhongke/
Other notes on using dnlib
I learned a few other things while using dnlib, which I record and share here.
Takeaway 1. Problems when dnlib saves assemblies that contain native code
The method described above works fine for most of the assemblies we write ourselves and for assemblies from third-party NuGet packages. However, I ran into a problem when applying it to the base .NET Core assemblies: even if I only load such an assembly and write it back without making any changes, it becomes significantly smaller. For example, take the following code:
using (var mod = ModuleDefMD.Load(@"E:\temp\"))
{
    mod.Write(@"E:\temp\");
}
The original file is 15.9MB, while the file saved by Write is only 5.7MB. After asking the author of dnlib, I learned that these assemblies contain native code (for example, code written in C++/CLI, or assemblies precompiled in formats such as ReadyToRun/NGEN/CrossGen). The Write method ignores this native code when saving, which is why the saved assembly is significantly smaller. We can use the NativeWrite method instead of Write, because it preserves the native code.
However, according to Washi1337, author of AsmResolver (an open source project similar to dnlib), NativeWrite tries to preserve the structure of the native code as much as possible, so it cannot reduce the size of the assembly and may even increase it (for details, see /Washi1337/AsmResolver/issues/267). In practice, I also found that after modifying these assemblies the program failed to start. The Windows event log showed that the CLR failed to start when the program was launched. According to Washi1337, if the assembly contains only ReadyToRun native code, it should be enough to remove the ILLibrary flag from the assembly, so that the CLR skips the ReadyToRun native code and executes the IL directly. After all, an assembly optimized with ReadyToRun still keeps its original IL. However, after doing what Washi1337 suggested, the program still failed to start. I do not know the reason, and since assemblies containing native code cannot be trimmed well anyway, I did not investigate further. Readers who know the CLR well are welcome to share their experience.
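Since such assemblies cannot be trimmed well, it helps to detect them before processing. The following is a minimal sketch that uses the System.Reflection.PortableExecutable API from the base class library, not dnlib. The helper name DescribeImage is hypothetical, and the check itself (ILLibrary flag set, or a non-empty managed native header) is my assumption about how precompiled images can be recognized, so treat it as a heuristic.

```csharp
using System;
using System.IO;
using System.Reflection.PortableExecutable;

// Hypothetical helper: classifies a PE file by looking at its CLR header.
static string DescribeImage(string path)
{
    using var stream = File.OpenRead(path);
    using var pe = new PEReader(stream);
    var cor = pe.PEHeaders.CorHeader;
    if (cor is null) return "not a managed assembly";

    // Assumed heuristic: precompiled images set ILLibrary and carry a managed native header.
    bool ilLibrary = (cor.Flags & CorFlags.ILLibrary) != 0;
    bool hasNativeHeader = cor.ManagedNativeHeaderDirectory.Size > 0;
    return ilLibrary || hasNativeHeader
        ? "contains precompiled native code"
        : "IL only";
}

// Inspect the core library of the runtime that is executing this program.
string coreLib = typeof(object).Assembly.Location;
Console.WriteLine($"{Path.GetFileName(coreLib)}: {DescribeImage(coreLib)}");
```

A trimmer could use a check like this to skip such assemblies, or at least to warn that a plain Write will drop their native code.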
Takeaway 2. Other uses of dnlib
Since dnlib can modify assemblies, we can use it for many things, such as changing a program's default behavior (you know what I mean). We can also use dnlib to write our own code obfuscator, or to implement static weaving for Aspect-Oriented Programming (AOP).
What other uses of dnlib can you think of? You are welcome to share them.