MojoUnityJson is a JSON parser implemented in C#. The algorithm is based on the C implementation in the Mojoc game engine. With the help of the C# class library, it can be simpler and more complete than the C version, especially when parsing Unicode escapes (sequences beginning with \u), because a decoded code unit can be appended directly to a C# StringBuilder as a char.
MojoUnityJson is a recursive descent parser with only about 450 lines of core parsing code (likely just over 300 with blank lines removed), and it supports the standard JSON format. The implementation aims to be concise and clear, using the most direct and fastest way to reach the goal, without complex concepts or patterns. Besides parsing JSON, it also provides a set of convenient and intuitive APIs for accessing JSON data. The whole implementation lives in a single file and depends on only three namespaces: System.Collections.Generic, System.Text, and System, so MojoUnityJson can be easily embedded in other projects.
This article walks through the parsing algorithm, which is simple and efficient enough to be understood at a glance and can be ported almost unchanged to other languages.
Save context information
Use a simple structure to pass some context data during parsing.
    private struct Data
    {
        // The JSON string to be parsed
        public string json;

        // The current index into the JSON string
        public int index;

        // A cached StringBuilder used to build strings that contain escapes
        public StringBuilder sb;

        public Data(string json, int index)
        {
            this.json  = json;
            this.index = index;
            this.sb    = new StringBuilder();
        }
    }
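Because Data is a struct, it has to be passed with ref for index updates to be visible to the caller. The following minimal sketch illustrates the difference; Cursor and RefSketch are hypothetical names used only for this illustration and are not part of MojoUnityJson.

```csharp
// Hypothetical stand-in for the Data struct, reduced to its index field.
public struct Cursor
{
    public int index;
}

public static class RefSketch
{
    // Receives a copy of the struct: the caller's index is unchanged.
    public static void AdvanceByValue(Cursor cursor)
    {
        ++cursor.index;
    }

    // Receives a reference to the caller's struct: the caller's index moves forward.
    public static void AdvanceByRef(ref Cursor cursor)
    {
        ++cursor.index;
    }
}
```

Calling AdvanceByValue leaves the caller's index untouched, while AdvanceByRef advances it, which is presumably why every parsing function in this article takes ref Data.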
Abstract JSON values
We abstract the value of JSON into the following types:
    public enum JsonType
    {
        Object,
        Array,
        String,
        Number,
        Bool,
        Null,
    }
Overall analysis steps
    // Parse a JsonValue
    private static JsonValue ParseValue(ref Data data);

    // Parse a JsonObject
    private static JsonValue ParseObject(ref Data data);

    // Parse a JsonArray
    private static JsonValue ParseArray(ref Data data);

    // Parse a string
    private static JsonValue ParseString(ref Data data);

    // Parse a number
    private static JsonValue ParseNumber(ref Data data);
This is the entire parsing process. ParseValue looks at the current character and dispatches to the matching parsing function. A JsonValue corresponds to one JSON value and carries a JsonType that identifies its type. The process is recursive: ParseValue calls ParseObject and ParseArray, which in turn call ParseValue for each nested value. The parser assumes that the JSON text starts with an Object or Array. Once the top-level value has been parsed, the entire JSON text has been parsed.
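The article does not show the JsonValue class itself. As a rough sketch of what such a tagged value could look like, assuming a single boxed payload plus a type tag, consider the following. JsonValueSketch is a hypothetical name and the real MojoUnityJson class may be organized differently.

```csharp
using System.Collections.Generic;

public enum JsonType { Object, Array, String, Number, Bool, Null }

// Hypothetical minimal value holder: a type tag plus a boxed payload.
public sealed class JsonValueSketch
{
    public readonly JsonType Type;
    private readonly object value;

    public JsonValueSketch(JsonType type, object value)
    {
        Type = type;
        this.value = value;
    }

    // Each accessor casts the payload to the representation the parser stored.
    public string AsString() { return (string) value; }
    public float  AsFloat()  { return (float) value; }
    public bool   AsBool()   { return (bool) value; }
    public bool   IsNull()   { return Type == JsonType.Null; }

    public Dictionary<string, JsonValueSketch> AsObject()
    {
        return (Dictionary<string, JsonValueSketch>) value;
    }

    public List<JsonValueSketch> AsArray()
    {
        return (List<JsonValueSketch>) value;
    }
}
```

With this shape, new JsonValueSketch(JsonType.Bool, false) mirrors how the parsing functions in this article construct their results.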
Parsing whitespace characters
During parsing, whitespace has to be skipped frequently in order to reach the meaningful characters. Because this happens in many places, it is handled by a single function.
    private static void SkipWhiteSpace(ref Data data)
    {
        while (true)
        {
            switch (data.json[data.index])
            {
                case ' ' :
                case '\t':
                case '\n':
                case '\r':
                    // Each consumed character moves the JSON index forward
                    ++data.index;
                    continue;
            }

            break;
        }
    }
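The same loop can be written as a standalone function for experimenting. The sketch below is a variation, not the library's code: it takes the string and index directly and adds a bounds check so that trailing whitespace at the end of the input does not read past the string. WhitespaceSketch and Skip are hypothetical names.

```csharp
public static class WhitespaceSketch
{
    // Returns the index of the first non-whitespace character at or after index,
    // or json.Length when only whitespace remains.
    public static int Skip(string json, int index)
    {
        while (index < json.Length)
        {
            switch (json[index])
            {
                case ' ' :
                case '\t':
                case '\n':
                case '\r':
                    ++index;
                    continue; // continues the while loop

            }

            break; // any other character ends the scan
        }

        return index;
    }
}
```

For the input "  \t\n{" starting at index 0, Skip returns 4, the position of the opening brace.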
Parse JsonValue
    private static JsonValue ParseValue(ref Data data)
    {
        // Skip whitespace characters
        SkipWhiteSpace(ref data);

        var c = data.json[data.index];

        switch (c)
        {
            case '{':
                // An Object
                return ParseObject(ref data);

            case '[':
                // An Array
                return ParseArray(ref data);

            case '"':
                // A string
                return ParseString(ref data);

            case '0':
            case '1':
            case '2':
            case '3':
            case '4':
            case '5':
            case '6':
            case '7':
            case '8':
            case '9':
            case '-':
                // A number
                return ParseNumber(ref data);

            case 'f':
                // Possibly false
                if
                (
                    data.json[data.index + 1] == 'a' &&
                    data.json[data.index + 2] == 'l' &&
                    data.json[data.index + 3] == 's' &&
                    data.json[data.index + 4] == 'e'
                )
                {
                    data.index += 5;
                    // It is false
                    return new JsonValue(JsonType.Bool, false);
                }
                break;

            case 't':
                // Possibly true
                if
                (
                    data.json[data.index + 1] == 'r' &&
                    data.json[data.index + 2] == 'u' &&
                    data.json[data.index + 3] == 'e'
                )
                {
                    data.index += 4;
                    // It is true
                    return new JsonValue(JsonType.Bool, true);
                }
                break;

            case 'n':
                // Possibly null
                if
                (
                    data.json[data.index + 1] == 'u' &&
                    data.json[data.index + 2] == 'l' &&
                    data.json[data.index + 3] == 'l'
                )
                {
                    data.index += 4;
                    // It is null
                    return new JsonValue(JsonType.Null, null);
                }
                break;
        }

        // Nothing matched, so the input cannot be handled
        throw new Exception(string.Format("Json ParseValue error on char '{0}' index in '{1}' ", c, data.index));
    }
- ParseValue is the main entry point for parsing. It produces the abstract JSON value, JsonValue, whose concrete type is determined step by step during parsing.
- After the whitespace has been skipped, a single character is enough to determine the possible value type, with no need to look ahead or behind.
- The fixed literals true, false, and null are handled inline, while the slightly more complex types are handled by dedicated functions.
- A switch with many case labels is used here instead of an if/else chain, to reduce the number of comparisons and improve efficiency.
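The character-by-character checks for true, false, and null index up to four positions ahead, so a truncated input such as "tru" would read past the end of the string. One possible variation, shown here only as a sketch, is to compare the literal with string.CompareOrdinal, which stops at the end of the shorter string instead of throwing. LiteralSketch and Matches are hypothetical names, not part of MojoUnityJson.

```csharp
public static class LiteralSketch
{
    // True when literal occurs in json starting at index.
    // The caller is expected to pass an index inside the string.
    public static bool Matches(string json, int index, string literal)
    {
        // CompareOrdinal compares at most literal.Length characters and
        // treats a shorter remaining input as a mismatch.
        return string.CompareOrdinal(json, index, literal, 0, literal.Length) == 0;
    }
}
```

In ParseValue, a call such as Matches(data.json, data.index, "true") could then replace the chain of character comparisons, at the cost of one method call per literal.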
Parse JsonObject
    private static JsonValue ParseObject(ref Data data)
    {
        // An Object corresponds to a C# Dictionary
        var jsonObject = new Dictionary<string, JsonValue>(JsonObjectInitCapacity);

        // Skip '{'
        ++data.index;

        do
        {
            // Skip whitespace characters
            SkipWhiteSpace(ref data);

            if (data.json[data.index] == '}')
            {
                // Empty Object, "{}"
                break;
            }

            DebugTool.Assert
            (
                data.json[data.index] == '"',
                "Json ParseObject error, char '{0}' should be '\"' ",
                data.json[data.index]
            );

            // Skip '"'
            ++data.index;

            var start = data.index;

            // Scan the key of the Object
            while (true)
            {
                var c = data.json[data.index++];

                switch (c)
                {
                    case '"':
                        // Found the closing '"'
                        break;

                    case '\\':
                        // Skip the escaped character
                        ++data.index;
                        continue;

                    default:
                        continue;
                }

                // The closing '"' has already been skipped
                break;
            }

            // Cut out the key string
            var key = data.json.Substring(start, data.index - start - 1);

            // Skip whitespace characters
            SkipWhiteSpace(ref data);

            DebugTool.Assert
            (
                data.json[data.index] == ':',
                "Json ParseObject error, after key = {0}, char '{1}' should be ':' ",
                key,
                data.json[data.index]
            );

            // Skip ':'
            ++data.index;

            // Recursively call ParseValue to get the value for this key
            jsonObject.Add(key, ParseValue(ref data));

            // Skip whitespace characters
            SkipWhiteSpace(ref data);

            if (data.json[data.index] == ',')
            {
                // Move on to the next key-value pair
                ++data.index;
            }
            else
            {
                // Skip whitespace characters
                SkipWhiteSpace(ref data);

                DebugTool.Assert
                (
                    data.json[data.index] == '}',
                    "Json ParseObject error, after key = {0}, char '{1}' should be '{2}' ",
                    key,
                    data.json[data.index],
                    '}'
                );

                break;
            }
        }
        while (true);

        // Skip '}' and return
        ++data.index;

        return new JsonValue(JsonType.Object, jsonObject);
    }
A JsonObject maps directly to a C# Dictionary whose values are of type JsonValue. Once parsing is complete, the concrete type of each value is known.
Each value is produced by calling ParseValue recursively, so it may be any of the types in the JsonType enum.
Parse JsonArray
    private static JsonValue ParseArray(ref Data data)
    {
        // A JsonArray corresponds to a List
        var jsonArray = new List<JsonValue>(JsonArrayInitCapacity);

        // Skip '['
        ++data.index;

        do
        {
            // Skip whitespace characters
            SkipWhiteSpace(ref data);

            if (data.json[data.index] == ']')
            {
                // Empty Array, "[]"
                break;
            }

            // Recursively parse each element and add it to the List
            jsonArray.Add(ParseValue(ref data));

            // Skip whitespace characters
            SkipWhiteSpace(ref data);

            if (data.json[data.index] == ',')
            {
                // Move on to the next element
                ++data.index;
            }
            else
            {
                // Skip whitespace characters
                SkipWhiteSpace(ref data);

                DebugTool.Assert
                (
                    data.json[data.index] == ']',
                    "Json ParseArray error, char '{0}' should be ']' ",
                    data.json[data.index]
                );

                break;
            }
        }
        while (true);

        // Skip ']'
        ++data.index;

        return new JsonValue(JsonType.Array, jsonArray);
    }
A JsonArray maps directly to a C# List whose elements are of type JsonValue. Once parsing is complete, the concrete type of each element is known.
Each element is produced by calling ParseValue recursively, so it may be any of the types in the JsonType enum.
Parsing strings
    private static JsonValue ParseString(ref Data data)
    {
        // Skip '"'
        ++data.index;

        var    start = data.index;
        string str;

        // Process the string
        while (true)
        {
            switch (data.json[data.index++])
            {
                case '"':
                    // Closing '"': end of the string
                    if (data.sb.Length == 0)
                    {
                        // The StringBuilder was not used, so just cut out the string
                        str = data.json.Substring(start, data.index - start - 1);
                    }
                    else
                    {
                        // The StringBuilder already holds text with decoded escapes
                        str = data.sb.Append(data.json, start, data.index - start - 1).ToString();
                        // Clear the StringBuilder for the next string
                        data.sb.Length = 0;
                    }
                    break;

                case '\\':
                    {
                        // Check the escaped character
                        var  escapedIndex = data.index;
                        char c;

                        // Handle the various escape characters
                        switch (data.json[data.index++])
                        {
                            case '"':
                                c = '"';
                                break;

                            case '\'':
                                c = '\'';
                                break;

                            case '\\':
                                c = '\\';
                                break;

                            case '/':
                                c = '/';
                                break;

                            case 'n':
                                c = '\n';
                                break;

                            case 'r':
                                c = '\r';
                                break;

                            case 't':
                                c = '\t';
                                break;

                            case 'u':
                                // Compute the code point of the Unicode character
                                c = GetUnicodeCodePoint
                                    (
                                        data.json[data.index],
                                        data.json[data.index + 1],
                                        data.json[data.index + 2],
                                        data.json[data.index + 3]
                                    );

                                // Skip the 4 hex digits
                                data.index += 4;
                                break;

                            default:
                                // Unsupported escape: leave it in the preceding text
                                continue;
                        }

                        // Append the preceding plain text and the decoded character
                        data.sb.Append(data.json, start, escapedIndex - start - 1).Append(c);

                        // The next run of plain text starts here
                        start = data.index;

                        continue;
                    }

                default:
                    continue;
            }

            // The closing '"' has already been skipped
            break;
        }

        return new JsonValue(JsonType.String, str);
    }
The tricky part of handling strings is that escape sequences need special processing: if they are copied verbatim, they show up as literal text instead of the characters they stand for. Fortunately, StringBuilder offers Append overloads that cover each case, including appending a substring by start index and length.
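The run-copying idea can be tried in isolation. The sketch below decodes only the simple single-character escapes in the text between two quotes, using the same strategy of appending the preceding run of plain characters followed by the decoded character. It is an illustration under assumptions, not the library's code: UnescapeSketch is a hypothetical name, \b and \f are added because JSON defines them, and \u sequences are deliberately left untouched.

```csharp
using System.Text;

public static class UnescapeSketch
{
    // Decodes simple escapes in the raw text found between two JSON quotes.
    public static string Unescape(string raw)
    {
        var sb    = new StringBuilder();
        int start = 0; // start of the current run of plain characters

        for (int i = 0; i < raw.Length; ++i)
        {
            if (raw[i] != '\\' || i + 1 >= raw.Length)
            {
                continue;
            }

            char c;

            switch (raw[i + 1])
            {
                case '"':  c = '"';  break;
                case '\\': c = '\\'; break;
                case '/':  c = '/';  break;
                case 'b':  c = '\b'; break;
                case 'f':  c = '\f'; break;
                case 'n':  c = '\n'; break;
                case 'r':  c = '\r'; break;
                case 't':  c = '\t'; break;
                default:
                    // Unsupported escape: keep both characters in the current run
                    ++i;
                    continue;
            }

            // Copy the plain run before the backslash, then the decoded character
            sb.Append(raw, start, i - start).Append(c);

            ++i;           // step onto the escaped character
            start = i + 1; // the next run begins after it
        }

        if (sb.Length == 0)
        {
            // No escape was decoded, so the input can be returned unchanged
            return raw;
        }

        return sb.Append(raw, start, raw.Length - start).ToString();
    }
}
```

As in ParseString, the StringBuilder is only touched when an escape is actually present, so plain strings avoid any extra copying.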
Parsing Unicode characters
In JSON, a Unicode escape starts with \u and is followed by 4 hexadecimal digits. StringBuilder's Append has an overload that takes a char, so we only need to convert the 4 characters after \u into the corresponding char value and pass it to Append.
    /// <summary>
    /// Get the unicode code point.
    /// </summary>
    private static char GetUnicodeCodePoint(char c1, char c2, char c3, char c4)
    {
        // Convert the 4 chars after \u into a code point. The result must be a char
        // so that Append treats it as a character rather than a number.
        // Each char is converted to an int and weighted by its hexadecimal position,
        // from high to low; the sum is the code point.
        return (char)
               (
                   UnicodeCharToInt(c1) * 0x1000 +
                   UnicodeCharToInt(c2) * 0x100  +
                   UnicodeCharToInt(c3) * 0x10   +
                   UnicodeCharToInt(c4)
               );
    }

    /// <summary>
    /// Single unicode char convert to int.
    /// </summary>
    private static int UnicodeCharToInt(char c)
    {
        // Use switch case to avoid a chain of if/else checks
        switch (c)
        {
            case '0':
            case '1':
            case '2':
            case '3':
            case '4':
            case '5':
            case '6':
            case '7':
            case '8':
            case '9':
                return c - '0';

            case 'a':
            case 'b':
            case 'c':
            case 'd':
            case 'e':
            case 'f':
                return c - 'a' + 10;

            case 'A':
            case 'B':
            case 'C':
            case 'D':
            case 'E':
            case 'F':
                return c - 'A' + 10;
        }

        throw new Exception(string.Format("Json Unicode char '{0}' error", c));
    }
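As a worked example of the arithmetic, the escape \u4e2d evaluates to 4 * 0x1000 + 14 * 0x100 + 2 * 0x10 + 13 = 0x4E2D. The sketch below expresses the same conversion with shifts and also shows that a character outside the Basic Multilingual Plane, written in JSON as two consecutive \u escapes, comes out as a surrogate pair when each escape is decoded separately. UnicodeSketch, HexValue, and DecodeUnit are hypothetical names used for illustration.

```csharp
using System;

public static class UnicodeSketch
{
    // Converts one hexadecimal digit to its numeric value.
    public static int HexValue(char c)
    {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;

        throw new FormatException("Not a hexadecimal digit: " + c);
    }

    // Decodes the four hex digits at json[index] .. json[index + 3]
    // into one UTF-16 code unit.
    public static char DecodeUnit(string json, int index)
    {
        return (char)
               (
                   (HexValue(json[index])     << 12) |
                   (HexValue(json[index + 1]) << 8)  |
                   (HexValue(json[index + 2]) << 4)  |
                    HexValue(json[index + 3])
               );
    }
}
```

Decoding the digits of \ud83d\ude00 one escape at a time yields the chars 0xD83D and 0xDE00, and char.ConvertToUtf32 combines that pair into the code point 0x1F600. This suggests that appending each decoded char in order, as ParseString does, is enough to preserve such characters.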
Parsing numbers
    private static JsonValue ParseNumber(ref Data data)
    {
        var start = data.index;

        // Collect the numeric characters
        while (true)
        {
            switch (data.json[++data.index])
            {
                case '0':
                case '1':
                case '2':
                case '3':
                case '4':
                case '5':
                case '6':
                case '7':
                case '8':
                case '9':
                case '-':
                case '+':
                case '.':
                case 'e':
                case 'E':
                    continue;
            }

            break;
        }

        // Cut out the numeric string
        var   strNum = data.json.Substring(start, data.index - start);
        float num;

        // Treat it as a float; double could be used instead
        if (float.TryParse(strNum, out num))
        {
            return new JsonValue(JsonType.Number, num);
        }
        else
        {
            throw new Exception(string.Format("Json ParseNumber error, can not parse string [{0}]", strNum));
        }
    }
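One thing to be aware of is that TryParse without a culture argument uses the current culture, so on a machine whose culture uses a comma as the decimal separator, a JSON number such as 1.5 may not parse as expected. A possible variation, shown here as a sketch and not as the library's behavior, passes the invariant culture explicitly and uses double for more precision. NumberSketch and TryParseJsonNumber are hypothetical names.

```csharp
using System.Globalization;

public static class NumberSketch
{
    // Parses a JSON-style number independently of the machine's culture settings.
    public static bool TryParseJsonNumber(string text, out double value)
    {
        // NumberStyles.Float allows a leading sign, a decimal point, and an exponent.
        return double.TryParse(text, NumberStyles.Float, CultureInfo.InvariantCulture, out value);
    }
}
```

Because the scanning loop in ParseNumber accepts any mix of digits, signs, dots, and exponent letters, the final TryParse call is what actually rejects malformed input such as "1.2.3".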
How to use
A single call parses the JSON string into a JsonValue object, which then contains all the values.
    var jsonValue = MojoUnityJson.Json.Parse(jsonString);
JsonValue access API
    // Use the JsonValue as a string
    public string AsString();

    // Use the JsonValue as a float
    public float AsFloat();

    // Use the JsonValue as an int
    public int AsInt();

    // Use the JsonValue as a bool
    public bool AsBool();

    // Check whether the JsonValue is null
    public bool IsNull();

    // Use the JsonValue as a Dictionary
    public Dictionary<string, JsonValue> AsObject();

    // Use the JsonValue as a Dictionary and get the value for key as a JsonValue
    public JsonValue AsObjectGet(string key);

    // Use the JsonValue as a Dictionary and get the value for key as a Dictionary
    public Dictionary<string, JsonValue> AsObjectGetObject(string key);

    // Use the JsonValue as a Dictionary and get the value for key as a List
    public List<JsonValue> AsObjectGetArray(string key);

    // Use the JsonValue as a Dictionary and get the value for key as a string
    public string AsObjectGetString(string key);

    // Use the JsonValue as a Dictionary and get the value for key as a float
    public float AsObjectGetFloat(string key);

    // Use the JsonValue as a Dictionary and get the value for key as an int
    public int AsObjectGetInt(string key);

    // Use the JsonValue as a Dictionary and get the value for key as a bool
    public bool AsObjectGetBool(string key);

    // Use the JsonValue as a Dictionary and check whether the value for key is null
    public bool AsObjectGetIsNull(string key);

    // Use the JsonValue as a List
    public List<JsonValue> AsArray();

    // Use the JsonValue as a List and get the value at index as a JsonValue
    public JsonValue AsArrayGet(int index);

    // Use the JsonValue as a List and get the value at index as a Dictionary
    public Dictionary<string, JsonValue> AsArrayGetObject(int index);

    // Use the JsonValue as a List and get the value at index as a List
    public List<JsonValue> AsArrayGetArray(int index);

    // Use the JsonValue as a List and get the value at index as a string
    public string AsArrayGetString(int index);

    // Use the JsonValue as a List and get the value at index as a float
    public float AsArrayGetFloat(int index);

    // Use the JsonValue as a List and get the value at index as an int
    public int AsArrayGetInt(int index);

    // Use the JsonValue as a List and get the value at index as a bool
    public bool AsArrayGetBool(int index);

    // Use the JsonValue as a List and check whether the value at index is null
    public bool AsArrayGetIsNull(int index);
Finally
The purpose of MojoUnityJson is to do one simple thing: parse JSON strings. Being able to read JSON data is its most important function. I looked at several open source JSON libraries written in C#, but they either had too many features or were somewhat cumbersome in their implementation, so I wrote MojoUnityJson by hand.