SoFunction
Updated on 2025-04-08

Learn to use the XML engine XQEngine

Recently I have been looking for XML search tools. The application I am writing needs to search related XML files regularly. Originally I only wanted to check whether a file contained data matching what I was looking for, but sometimes I also want to output the data I find. At first I tried XSLT and XPath, hoping to turn the search problem into one that XSLT could solve. After some experimentation, however, I found that XSLT did not really solve my problem: the data I wanted to output was a comma-separated list, which XSLT could not give me in the form I needed, and XSLT does not provide full-text search. Next I wanted to see whether the XML query language (XQL) could help, so I looked carefully at the various XQL implementations. By coincidence, a small tool called XQEngine turned out to solve the problem. In this article, I would like to show how to use XQEngine to search for string data in your XML files.

XQEngine can be downloaded from its website. It is a JavaBean that uses a SAX parser to index one or more XML documents, and it then lets you run a combined search across those documents. Its query language is a superset of XQL, with a syntax similar to XPath.

A Java class that uses XQEngine must implement a results() method. When a search completes, the engine calls this method and passes the search results to it. The results can be returned in three different formats. Configuration methods control how documents are indexed. For example, you can tell the engine not to index stop words, or to ignore words shorter than a specified number of characters.

Below is a routine that uses XQEngine. Let's walk through it. First, the main() method instantiates a search engine: XmlEngine engine = new XmlEngine(). It then reads three arguments from the command line: the file name, the result format, and the query. Next it configures the engine. It calls setSaxParserName() to set the full class name of the SAX parser; because we are using the Xerces parser, the name to pass is "". Then it sets the indexing parameters: in this example we do not index numbers or any words shorter than 3 characters. The XQEngine API documentation that comes with the download describes the configuration parameters in detail, so I will not go through them here. Finally, the setDocument() method specifies the XML file that XQEngine will index and search. If you want to index multiple files, simply call setDocument() once for each file.
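To make the indexing rule above concrete (skip numbers, skip words shorter than 3 characters), here is a minimal stand-alone sketch of that kind of filter. It is not XQEngine code and says nothing about how the engine implements the rule internally; the class name IndexFilterSketch and the method indexableWords are hypothetical names made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: IndexFilterSketch and indexableWords are hypothetical
// names, not part of XQEngine.
class IndexFilterSketch {

    // Returns the lower-cased words that survive a "minimum length, optionally no numbers" rule.
    static List<String> indexableWords(String text, int minLength, boolean indexNumbers) {
        List<String> kept = new ArrayList<>();
        // Split on anything that is neither a letter nor a digit
        for (String token : text.split("[^\\p{L}\\p{N}]+")) {
            if (token.isEmpty()) {
                continue; // split() can produce an empty leading token
            }
            boolean isNumber = token.chars().allMatch(Character::isDigit);
            if (isNumber && !indexNumbers) {
                continue; // numbers are left out of the index
            }
            if (token.length() < minLength) {
                continue; // word is too short to index
            }
            kept.add(token.toLowerCase());
        }
        return kept;
    }

    public static void main(String[] args) {
        // Same settings as in the article: minimum length 3, numbers not indexed
        System.out.println(indexableWords("An index of 42 welcome files", 3, false));
    }
}
```

With these settings, "An", "of" and "42" are dropped and only "index", "welcome" and "files" remain.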

From the code below we can also see that XQEngine returns search results in three different formats: STANDARD, SUMMARY, and CSV (comma-separated values). For simplicity, I assigned a number to each result type (1, 2, 3) and call the setListenerType() method with the corresponding constant. I will describe each result type in detail later. There is also a printSessionState() method that prints index and engine information, but I left it commented out in the routine, so the program only prints the search results. The next step calls the addXQLResultListener() method and passes it an instance of Search, which implements the XQLResultListener interface. Then the program calls setQuery() with the query as its argument, and the engine starts executing the query. When the query finishes, the engine calls the results() method of the Search class and passes the query results back. In the routine I provide, the results() method simply prints the result.
Code:

import .*;
import ;
import .*;
import ;

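public class Search implements XQLResultListener
{
    public static void main( String[] args )
    {
        XmlEngine engine = new XmlEngine();
        String searchFile = args[0];   // XML file to index and search
        String searchType = args[1];   // result format: 1, 2, or 3
        String query = args[2];        // query to run

        try {
            // configure the engine
            engine.setSaxParserName( "" );
            ( 3 );       // do not index words shorter than 3 characters
            ( false );   // do not index numbers
            engine.setDocument( searchFile );

            // choose the result format
            if ( searchType.equals( "1" ) ) {
                engine.setListenerType( XmlEngine.STANDARD_LISTENER );
            }
            else if ( searchType.equals( "2" ) ) {
                engine.setListenerType( XmlEngine.SUMMARY_LISTENER );
            }
            else {
                engine.setListenerType( XmlEngine.CSV_LISTENER );
            }
        }
        catch( MissingOrInvalidSaxParserException e ) {
            System.out.println( "Missing or unavailable SAX parser" );
            return;
        }
        catch( FileNotFoundException e ) {
            System.out.println( "The XML file cannot be found: " );
            return;
        }
        catch( CantParseDocumentException e ) {
            System.out.println( "The XML file cannot be parsed: " );
            return;
        }

        // engine.printSessionState();
        engine.addXQLResultListener( new Search() );

        try {
            engine.setQuery( query );
        }
        catch( InvalidQueryException e ) {
            System.out.println( "Invalid query: " + e.getMessage() );
            return;
        }
    }

    // called by the engine when the query has finished
    public void results( String xqlResults )
    {
        System.out.println( xqlResults );
    }
}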
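The control flow of the routine (register a listener, hand the engine a query, receive the results in a callback) can be sketched without XQEngine on the classpath. The types below, ResultListenerSketch and EngineSketch, are stand-ins invented for this illustration and are not the XQEngine classes; the stub engine only echoes the query back instead of evaluating it.

```java
// Stand-in types for illustration only; these are NOT the XQEngine classes.
interface ResultListenerSketch {
    void results(String queryResults);
}

class EngineSketch {
    private ResultListenerSketch listener;

    // Plays the role that registering a result listener has in the article's routine
    void addListener(ResultListenerSketch newListener) {
        this.listener = newListener;
    }

    // A real engine would evaluate the query; this stub just echoes it to the listener
    void runQuery(String query) {
        if (listener == null) {
            throw new IllegalStateException("no listener registered");
        }
        listener.results("results for " + query);
    }
}

class CallbackDemo {
    public static void main(String[] args) {
        EngineSketch engine = new EngineSketch();
        engine.addListener(System.out::println);   // the callback simply prints the result
        engine.runQuery("//welcome-file-list/welcome-file");
    }
}
```

The point of the design is that the caller never asks for the results directly; it hands the engine an object and the engine decides when to call back.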


 
OK, we have written a program that uses XQEngine, so let's run it. Before compiling the code, we need to download XQEngine and a SAX parser. I downloaded the Xerces parser. My operating system is Windows 2000 Professional and my JDK is version 1.3. With those in place, set CLASSPATH: edit CLASSPATH under "environment variables" and add "c:\xql\;c:\xql\;c:\xql\;c:\xerces\". Now we can compile the code, but to run the program we also need an XML file; I used a file from Apache Tomcat for the demonstration. As mentioned earlier, the numbers 1, 2, and 3 stand for the three result formats:

1. Use STANDARD_LISTENER (number 1) with the query "//welcome-file-list/welcome-file":

C:\xql\xql1>java Search 1 "//welcome-file-list/welcome-file"

:

<>
installed successfully
1: indexing
Query: ( // ( / welcome-file-list welcome-file ) )
3 hit(s) for //welcome-file-list/welcome-file
<?xml version="1.0"?>
<xql:result
query="//welcome-file-list/welcome-file"
hitCount="3"
elemCount="3"
docCount="1"
xmlns:xql="/ Standard_Listener.html">
<welcome-file>

</welcome-file>
<welcome-file>

</welcome-file>
<welcome-file>

</welcome-file>
</xql:result>

In the example above, the query asks for every "welcome-file" child element of any "welcome-file-list" element. Note that the search results are essentially excerpts from the original XML document, with nothing that links a result back to its position in that document. The SUMMARY_LISTENER (2) result type is somewhat different: it includes a "docID" number and an "elemIx" number for each hit, so that the result can be linked back to the original document.

Here is an example of the return result:


C:\xql\xql1>java Search 2 "//welcome-file-list/welcome-file"
: <>
installed successfully

1: indexing

Query: ( // ( / welcome-file-list welcome-file ) )

3 hit(s) for //welcome-file-list/welcome-file

<?xml version="1.0"?>
<xql:result
query="//welcome-file-list/welcome-file"
hitCount="3"
elemCount="3"
docCount="1"
xmlns:xql="/
Summary_Listener.html">
<welcome-file xql:docID="0" xql:elemIx="270"/>
<welcome-file xql:docID="0" xql:elemIx="271"/>
<welcome-file xql:docID="0" xql:elemIx="272"/>
</xql:result>
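If you want to pick the docID and elemIx values out of a summary result programmatically, the standard JAXP DOM parser in the JDK is enough. The sketch below is not part of XQEngine: the class SummaryResultReader is a hypothetical helper, the sample document is trimmed from the output above, and the namespace URI in it is a placeholder, because the real one is not reproduced in this article.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of XQEngine.
class SummaryResultReader {

    // Trimmed copy of the summary output; the namespace URI is a placeholder.
    static final String SAMPLE =
          "<?xml version=\"1.0\"?>\n"
        + "<xql:result hitCount=\"3\" xmlns:xql=\"urn:example:placeholder\">\n"
        + "<welcome-file xql:docID=\"0\" xql:elemIx=\"270\"/>\n"
        + "<welcome-file xql:docID=\"0\" xql:elemIx=\"271\"/>\n"
        + "<welcome-file xql:docID=\"0\" xql:elemIx=\"272\"/>\n"
        + "</xql:result>";

    // Returns one "docID:elemIx" string per element directly under the result root.
    static List<String> readHits(String xml) {
        try {
            // The default factory is not namespace-aware, so attributes are
            // looked up by their prefixed names exactly as they appear in the text.
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<String> hits = new ArrayList<>();
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node node = children.item(i);
                if (node.getNodeType() != Node.ELEMENT_NODE) {
                    continue; // skip whitespace text nodes between hits
                }
                Element hit = (Element) node;
                hits.add(hit.getAttribute("xql:docID") + ":" + hit.getAttribute("xql:elemIx"));
            }
            return hits;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("not a parsable result document", e);
        }
    }

    public static void main(String[] args) {
        for (String hit : readHits(SAMPLE)) {
            System.out.println(hit);
        }
    }
}
```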

As I said earlier, what matters most for my application is getting the results back as comma-separated values, so CSV_LISTENER (3) is very useful. It returns the result in comma-separated form, as follows:

C:\xql\xql1>java Search 3 "//welcome-file-list/welcome-file"
:
<>
installed successfully

1: indexing

Query: ( // ( / welcome-file-list welcome-file ) )

3 hit(s) for //welcome-file-list/welcome-file

3,3,1,0
0,270,welcome-file
0,271,welcome-file
0,272,welcome-file
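Reading this output back into a program takes only a few lines. The sketch below is a hypothetical helper (CsvResultReader and Hit are names made up here, not XQEngine classes). It assumes, judging only from the sample above, that the first line is a summary line and that each following line holds a document ID, an element index, and an element name. The first three numbers of the summary line match hitCount, elemCount, and docCount in the XML output; the meaning of the fourth is not stated in this article, so the sketch simply skips that line.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of XQEngine.
class CsvResultReader {

    // One hit line: document ID, element index, element name
    static final class Hit {
        final int docId;
        final int elemIx;
        final String name;

        Hit(int docId, int elemIx, String name) {
            this.docId = docId;
            this.elemIx = elemIx;
            this.name = name;
        }
    }

    // Copy of the CSV output shown above
    static final String SAMPLE =
          "3,3,1,0\n"
        + "0,270,welcome-file\n"
        + "0,271,welcome-file\n"
        + "0,272,welcome-file";

    // Parses every line after the first (assumed to be a summary line) into a Hit.
    static List<Hit> readHits(String csv) {
        List<Hit> hits = new ArrayList<>();
        String[] lines = csv.trim().split("\\R");
        for (int i = 1; i < lines.length; i++) {
            String line = lines[i].trim();
            if (line.isEmpty()) {
                continue; // tolerate blank lines in the console output
            }
            String[] parts = line.split(",", 3);
            if (parts.length != 3) {
                throw new IllegalArgumentException("unexpected line: " + line);
            }
            hits.add(new Hit(Integer.parseInt(parts[0].trim()),
                             Integer.parseInt(parts[1].trim()),
                             parts[2].trim()));
        }
        return hits;
    }

    public static void main(String[] args) {
        for (Hit hit : readHits(SAMPLE)) {
            System.out.println("doc " + hit.docId + ", element " + hit.elemIx + ", " + hit.name);
        }
    }
}
```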

XQEngine has many more features than I can cover here. The documentation that comes with it includes plenty of sample source code and usage notes, so you can explore and use it according to your own needs. If you like, you can even build a GUI program on top of it: the distribution includes a GUI-based search program, SwingQueryDemo, which is worth studying.