Write once, run anywhere

PHP's database API uses different functions for different databases, and people have long tried to wrap them using PHP's object-oriented features. The better-known results include ADODB and PHPLIB, and PEAR DB, which later attracted worldwide attention, is among the best of them. Packages that wrap the database APIs in an object-oriented interface like this are generally called database abstraction layers.

This article is a very good introduction to MDB, a PEAR package that focuses on efficiency, is simple to use and is very powerful. The author is the main creator of MDB.

For the latest PHP/PEAR originals and translations that I have been following recently, please visit my homepage.

Write once - run anywhere

PEAR MDB Database Abstraction Layer

Author: Lukas Smith
Translator: taowen

While this is a Java marketing phrase, it is also a key feature of PHP. Many business models depend on operating system independence to ensure that products can be sold to a wide range of customers. So why lock yourself in with a specific database vendor? Database abstraction layers allow you to develop your application independently of the database. But often they cost more performance than you are willing to give up, or they do not abstract enough to eliminate all database-specific code.

What will this article teach me?

This article gives an introduction to the database abstraction package PEAR MDB. The focus is on explaining the more advanced features of MDB, such as data type abstraction and XML-based schema management, that go beyond what other similar packages offer. A basic understanding of PHP and SQL is recommended.

Why another database class?

Web projects are often added to existing IT infrastructures where the client has already chosen which RDBMS (relational database management system) to work with. Even if that is not the case, different budgets might affect which database you choose for deployment. Finally, you as the developer might simply prefer not to lock yourself in with a specific vendor. Until now, this meant either maintaining a separate version for each supported database or giving up more performance and ease of use than necessary. Enter PEAR MDB.

MDB is a database abstraction layer that aims to make writing RDBMS-independent PHP applications a straightforward process. Most other so-called database abstraction layers for PHP provide only a common API for all supported databases and very limited abstraction (mostly only for sequences). MDB, on the other hand, can be used to abstract all data being sent to and received from the database. Even database schemas can be defined in an RDBMS-independent format. It does this while retaining a high level of performance and ease of use. This was achieved by closely examining two popular database abstraction layers, PEAR DB and Metabase, and merging them. The merge was also used as an opportunity to clean up the combined API and to remove design patterns that hindered performance.

How did MDB come to be?

Back in fall 2001, I was looking for a database abstraction package that would make my company's application framework RDBMS independent. The goal was to reduce database-specific code to zero. The only package I found that offered such features was Metabase. But Metabase had a somewhat uncomfortable API that was partly a result of its compatibility with PHP3. This also made Metabase slower than it needed to be for our purposes, since we did not need PHP3 compatibility. Nonetheless, we decided that Metabase was our only option. But even after adding a performance-enhancing patch to Metabase, we felt that we were giving up too much performance. We met the author of Metabase at the International PHP Conference 2001 and talked about the benefits of having something like Metabase as part of the PEAR project. Shortly afterwards, a discussion began once more on the PEAR mailing list about the potential benefits of merging PEAR DB and Metabase. After much discussion at my company, we decided to take up this task. After several months of hard work, we now have the first stable release of MDB.

What does MDB do for you?

MDB combines most of the features of PEAR DB and Metabase. In fact, the only feature that is missing is PEAR DB's ability to return an object as a result set. This feature was dropped because it was never widely used, while its performance penalty was quite apparent. A lot of development time was also spent on making the API as intuitive as possible. Finally, MDB provides this functionality at a very high level of performance: it is at least as fast as PEAR DB and much faster than Metabase. Here is a list of the most important features:

OO-style API
prepared queries emulation
full data type abstraction for all data passed to and from the database (including LOB support)
transaction support
database/table/index/sequence creation/dropping/altering
RDBMS independent database schema management
Integrated into the PEAR framework (PEAR Installer, PEAR error handling etc.)

So how does it work?

MDB provides some very advanced abstraction features. It is important to keep in mind that these features are optional, but using them is critical when writing RDBMS-independent PHP applications. An example of how the basics of MDB work can be found under Links & Literature at the end of the article. As stated earlier, the focus of this article is on the features that set MDB apart from other database abstraction layers for PHP. Example scripts for all code examples in this article can be found on the CD that is packaged with this issue.

But first we need to get MDB installed. This is actually quite easy using the PEAR Installer. I cannot cover the entire PEAR Installer within this article, but I hear the next issue will cover all the ins and outs of the PEAR framework in great detail. Work is under way to make the Installer run on Windows, but the support is still a bit flaky. On *nix systems you need a CGI version of PHP installed and then simply run the following command:

lynx -source |php

After completing the installation process, you simply need to type one more command and you are all set.

pear install MDB

If the above does not work for you, there is always the option of getting the package directly from the PEAR MDB homepage. The URL is listed at the end of the article.

Making use of data type abstraction

Since most databases have their own specialities and quirks, it is important for MDB to hide these differences from the developer. MDB achieves this by defining its own internal data types: text, boolean, integer, decimal, float, date, time, timestamp and large objects (files). All data that is passed to and from the database can be converted between MDB's internal format and the database's internal format. The example scripts that accompany this section can be found in the datatype directory. Let us look at the following query:

$session = '098f6bcd4621d373cade4e832627b4f6';
// set time out to 30 minutes
$timeout = time()+60*30;
// SELECT query showing how the datatype conversion works
$query = 'SELECT createtime, user_id FROM sessions';
$query .= ' WHERE session = '.$session;
$query .= ' AND lastaccess < '.$timeout;

This query would most likely fail if it were sent to a database. The reason is that the value stored in $session needs to be converted to the correct string format: its special characters have to be escaped and quotes have to be placed around it. PEAR DB provides the method DB::quote() for this. In MDB the method is called MDB::getTextValue(). The difference is that MDB offers such a method for every data type listed above, so we can also convert $timeout to the correct format.

// convert $timeout to the MDB timestamp format
$timeout = MDB_Date::unix2Mdbstamp($timeout);
// SELECT query showing how the datatype conversion works
$query = 'SELECT createtime, user_id FROM sessions';
$query .= ' WHERE session = '.$mdb->getTextValue($session);
$query .= ' AND lastaccess < '.$mdb->getTimestampValue($timeout);
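To see why the first version of the query fails, it can help to print the SQL that plain concatenation produces. The sketch below is plain PHP and does not use MDB at all: quoteText() is a hypothetical stand-in for a text-quoting method such as MDB::getTextValue(), and doubling single quotes is only the common SQL convention, assumed here for illustration. A real driver applies the escaping rules of the target RDBMS.

```php
<?php
// Hypothetical stand-in for a text-quoting method such as MDB::getTextValue().
// Assumption: single quotes are escaped by doubling them, as in standard SQL.
function quoteText(string $value): string
{
    return "'" . str_replace("'", "''", $value) . "'";
}

$session = '098f6bcd4621d373cade4e832627b4f6';
$base = 'SELECT createtime, user_id FROM sessions WHERE session = ';

// Plain concatenation: the session ID ends up as a bare token in the SQL.
$naive = $base . $session;

// Quoted: the session ID becomes a proper string literal.
$quoted = $base . quoteText($session);

echo $naive, "\n";
echo $quoted, "\n";
```

The first line printed contains the session ID without quotes, which a database will not parse as a string; the second wraps it in quotes.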

For the sake of the example, let us assume that we only want to retrieve the first row. MDB::queryRow() fetches the first row, frees the result set and returns the row's content, so it is exactly what we want.

$result = $mdb->queryRow($query);

But different RDBMSs return data such as dates in different formats. So, if we then want to do some date arithmetic, it is important that the data is always returned in the same format regardless of the RDBMS chosen. MDB can do this semi-automatically. All you need to do is tell MDB what types your result columns have, and MDB handles the conversion. The easiest way is to pass this information with the query method call:

$types = array('timestamp', 'integer');
$result = $mdb->queryRow($query, $types);

This tells MDB that the first column of the result set is of the type 'timestamp' and the second is of the type 'integer'. All methods that allow querying can take such meta-information as an optional parameter. The types can also be set later using MDB::setResultTypes(). Depending on the database the data is retrieved from, MDB then converts the returned data accordingly. MDB's internal data format for timestamps follows the ISO 8601 standard. Other packages such as PEAR::Date can handle this format. MDB also provides a small number of methods for date format conversion in the MDB_Date class, which can be included optionally.
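Once every timestamp arrives in one uniform format, date arithmetic becomes ordinary string and integer handling. The following is a minimal sketch in plain PHP, not MDB code: it assumes the timestamp string has the form "YYYY-MM-DD HH:MM:SS", and the two helper functions are hypothetical names introduced only for this illustration. UTC is used so that the result does not depend on the server's time zone.

```php
<?php
// Assumption: timestamps are strings of the form "YYYY-MM-DD HH:MM:SS".
// Both helpers are illustrative only; they are not part of MDB.
function unixToIsoTimestamp(int $unixTime): string
{
    return gmdate('Y-m-d H:i:s', $unixTime);
}

function isoTimestampToUnix(string $timestamp): int
{
    $dt = DateTime::createFromFormat('Y-m-d H:i:s', $timestamp, new DateTimeZone('UTC'));
    if ($dt === false) {
        throw new InvalidArgumentException('unexpected timestamp format: ' . $timestamp);
    }
    return $dt->getTimestamp();
}

// Add a 30 minute timeout to a creation time fetched from the database.
$createtime = '2002-11-05 10:00:00';
$expires = unixToIsoTimestamp(isoTimestampToUnix($createtime) + 60 * 30);

echo $expires, "\n"; // prints "2002-11-05 10:30:00"
```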

Since pretty much every RDBMS returns integer data the same way, there is no need to convert integer data. So, in order to gain a slight performance improvement, you could do the following:

$types = array('timestamp');
$result = $mdb->queryRow($query, $types);

This way only the first column of the result set is converted. Of course, this may become an issue if MDB is used with a database that does return integers differently. However unlikely that is, the slight performance gain might not be worth the risk. But again, it shows that the use of these features is optional.

Listing 1 shows an example use of prepared queries. These can be quite convenient if you have to run a number of queries where only the data passed to the database differs while the structure of the query remains the same. Advanced databases can store the parsed query in memory to offer a performance boost.

Listing 1

$alldata = array(
    array(1, 'one', 'un'),
    array(2, 'two', 'deux'),
    array(3, 'three', 'trois'),
    array(4, 'four', 'quatre')
);

$p_query = $mdb->prepareQuery('INSERT INTO numbers VALUES (?,?,?)');
$param_types = array('integer', 'text', 'text');

foreach ($alldata as $row) {
    $mdb->execute($p_query, NULL, $row, $param_types);
}

Each of the 4 arrays stored in $alldata is used in one execute call. The data is automatically converted to the correct format. Since this is an INSERT statement, the second parameter of MDB::execute() is set to NULL: there are no result columns for which we would need to set data types.
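For databases without native prepared statements, the idea behind emulating them can be sketched in a few lines of plain PHP. This is a simplified illustration and not MDB's actual implementation: emulatePreparedQuery() is a hypothetical function, only the 'integer' and 'text' types are handled, and a question mark inside a string literal in the SQL would be misread as a placeholder.

```php
<?php
// Simplified illustration of prepared query emulation; not MDB's real code.
// Each "?" is replaced, in order, by the parameter converted per its type.
function emulatePreparedQuery(string $sql, array $params, array $types): string
{
    $parts = explode('?', $sql);
    if (count($parts) - 1 !== count($params)) {
        throw new InvalidArgumentException('placeholder and parameter counts differ');
    }

    $query = $parts[0];
    foreach (array_values($params) as $i => $value) {
        if ($types[$i] === 'integer') {
            $literal = (string) (int) $value;
        } else {
            // Assumption: text is quoted by doubling single quotes.
            $literal = "'" . str_replace("'", "''", (string) $value) . "'";
        }
        $query .= $literal . $parts[$i + 1];
    }
    return $query;
}

$types = array('integer', 'text', 'text');
echo emulatePreparedQuery('INSERT INTO numbers VALUES (?,?,?)', array(1, 'one', 'un'), $types), "\n";
// prints "INSERT INTO numbers VALUES (1,'one','un')"
```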

Among the supported data types are also LOBs (Large OBjects), which allow you to store files in a database. Binary files are stored in BLOBs (Binary Large OBjects) and normal text files are stored in CLOBs (Character Large OBjects). In MDB you can only store LOBs using prepared INSERT and UPDATE queries. Using either MDB::setParamBlob() or MDB::setParamClob() you can set the values of the LOB fields in a prepared query. Both methods expect to be passed a LOB object, however, which can be created using MDB::createLob().

$binary_lob = array(
    'Type' => 'inputfile',
    'FileName' => './'
);
$blob = $mdb->createLob($binary_lob);

$character_lob = array(
    'Type' => 'data',
    'Data' => 'this would be a very long string container the CLOB data'
);
$clob = $mdb->createLob($character_lob);

As you can see, MDB::createLob() is passed an associative array. The value of the Type key may be one of the following: data, inputfile or outputfile. The first two are used when you want to write a LOB to the database: use data if you have the LOB stored in a variable, and inputfile to read the LOB directly from a file. Finally, outputfile is used when you want to retrieve a LOB from the database. Depending on whether you are using data or inputfile, you need to specify a value for the Data key or the FileName key, as seen in the above example. Now we will store the above LOBs in the database.

$p_query = $mdb->prepareQuery('INSERT INTO files (id, b_data, c_data) VALUES (1, ?, ?)');

$mdb->setParamBlob($p_query, 1 , $blob, 'b_data');
$mdb->setParamClob($p_query, 2 , $clob, 'c_data');

$result = $mdb->executeQuery($p_query);

In order to fetch the above file from the database, we first need to select the data from the database and create a LOB object using MDB::createLob(). This time we set 'Type' to 'outputfile'.

// keep the result set so that the LOB definition below can refer to it
$result = $mdb->query('SELECT b_data FROM files WHERE id = 1');

$binary_lob = array(
    'Type' => 'outputfile',
    'Result' => $result,
    'Row' => 0,
    'Field' => 'b_data',
    'Binary' => 1,
    'FileName' => './'
);
$blob = $mdb->createLob($binary_lob);

Now we can read the LOB from the result set using MDB::readLob(). Passing a length of 0 to MDB::readLob() means that the entire LOB is read and stored in the file we specified above. Once we are done, we can free the resources. Alternatively, you can set any length larger than zero and read the LOB using a while loop that checks MDB::endofLob().

$mdb->readLob($blob, $data, 0);
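The chunked alternative mentioned above could look roughly like the sketch below. It relies only on the two methods named in the text, readLob() and endofLob(); the wrapper function readLobInChunks() is hypothetical, and the assumption that readLob() returns the number of bytes read (and something other than an integer on failure) should be checked against the MDB documentation before use.

```php
<?php
// Sketch only: $mdb is assumed to be a connected MDB object and $lob a LOB
// created with createLob(). readLobInChunks() itself is not part of MDB.
function readLobInChunks($mdb, $lob, int $chunkSize = 8192): int
{
    $total = 0;
    while (!$mdb->endofLob($lob)) {
        $data = '';
        // Assumption: readLob() returns the number of bytes it has read.
        $read = $mdb->readLob($lob, $data, $chunkSize);
        if (!is_int($read) || $read < 0) {
            throw new RuntimeException('reading the LOB failed');
        }
        if ($read === 0) {
            break; // nothing more to read; avoid looping forever
        }
        $total += $read;
    }
    return $total;
}
```

Reading in chunks keeps memory use bounded for large files, at the cost of more calls into the driver.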

It is important to note that you must not mix this method of fetching with bulk fetching methods like MDB::fetchAll(), as this causes problems in most PHP database extensions. At some point MDB may be able to retrieve LOBs using the bulk fetching methods.

As we have seen in this section, MDB features its own set of data types that are automatically mapped to native data types in the database. This ensures that whatever data we send to or retrieve from the database is always in the same format, no matter which RDBMS is used. As I mentioned in the opening paragraph of this section, this obviously requires that the data types used in the database are what MDB expects. This requirement was made to ensure that the mapping is done with minimal performance loss. The next section shows how MDB assists with using the correct data types in the database.

Making use of XML schema files

With the features described in the last section you can write truly database-independent applications. But MDB tries to go one step further: it allows you to define your schemas in XML. A manager converts this schema into the necessary SQL statements for each RDBMS. This means that you can use the same schema for any of the supported RDBMSs. The examples for this section can be found in the xml_schema directory.

We will now write an XML schema file from scratch. First we must define an XML document. The database definition is contained within a database tag. The name of the database is defined using the name tag. The create tag tells the manager whether the database should be created if it does not yet exist. If you split your schema into several files, you only need to set create to 1 in the file you submit to the manager first.

<?xml version="1.0" encoding="ISO-8859-1" ?>
<database>
  <name>auth</name>
  <create>1</create>
</database>

As you may have guessed from the database name auth, the purpose of this database is to store user data for a simple authentication application. Listing 2 defines a table in which we can store the user data.

Listing 2

<table>
  <name>users</name>
  <declaration>
    <field>
      <name>user_id</name>
      <type>integer</type>
      <notnull>1</notnull>
      <unsigned>1</unsigned>
      <default>0</default>
    </field>
    <field>
      <name>handle</name>
      <type>text</type>
      <length>20</length>
      <notnull>1</notnull>
      <default></default>
    </field>
    <field>
      <name>is_active</name>
      <type>boolean</type>
      <notnull>1</notnull>
      <default>N</default>
    </field>
  </declaration>
</table>

As you can see, things can get a bit lengthy here, as is to be expected when using XML. No need to worry: we are working on a browser-based tool called MDB_frontend that will make this process much easier. I will say a bit more about this project later in the article. The advantage of this rather verbose representation of the table is, hopefully, that things are fairly self-explanatory. The table in the last example is called users and it defines 3 fields: user_id of type integer, handle of type text and is_active of type boolean. Remember that MDB handles the type abstraction for you if you pass it the necessary metadata, as shown in the previous section. You also do not need to worry about what MDB maps these types to in your RDBMS. The other tags you can use in each field declaration are optional: length, notnull, unsigned and default.
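One way to get a feel for this structure is to read the table definition back with PHP's SimpleXML extension and list the declared fields. The sketch below is independent of MDB and its manager. It assumes that the table tag from Listing 2 sits inside the database tag shown earlier, and it abbreviates each field to its name and type.

```php
<?php
// The users table from Listing 2, abbreviated and wrapped in a database tag.
// Assumption: table definitions are nested inside the database tag.
$schema = <<<'XML'
<database>
  <name>auth</name>
  <create>1</create>
  <table>
    <name>users</name>
    <declaration>
      <field><name>user_id</name><type>integer</type></field>
      <field><name>handle</name><type>text</type></field>
      <field><name>is_active</name><type>boolean</type></field>
    </declaration>
  </table>
</database>
XML;

$database = simplexml_load_string($schema);
if ($database === false) {
    throw new RuntimeException('the schema is not well-formed XML');
}

// Collect field name => MDB data type for the table.
$fields = array();
foreach ($database->table->declaration->field as $field) {
    $fields[(string) $field->name] = (string) $field->type;
}

foreach ($fields as $name => $type) {
    echo $name, ': ', $type, "\n";
}
```

The resulting list of types is the same kind of metadata that the query methods in the previous section accept for result columns.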

The next thing we need to do is ensure that user_id is unique by placing a suitable index on the user_id field. The index definition goes inside the declaration tag (Listing 3).

Listing 3

<table>
  <name>users</name>
  <declaration>
    <index>
      <unique>1</unique>
      <name>user_id_index</name>
      <field>
        <name>user_id</name>
        <sorting>ascending</sorting>
      </field>
    </index>
  </declaration>
</table>

The definition in Listing 3 creates a unique ascending index named user_id_index on the field user_id. Of course, we could have specified more than one field in the index definition by simply adding another field tag. What we are still missing is a sequence to generate unique user IDs for us:

<sequence>
  <name>users_user_id</name>
  <start>1</start>
  <on>
    <table>users</table>
    <field>user_id</field>
  </on>
</sequence>

The last example is pretty mind blowing. Going through line by line we see that we first open a sequence tag followed by a name tag which specifies the name of the sequence. This is followed by a start tag that defines the initial value of the sequence. Now, we open an optional on tag. Here we need to set a specific field within a table. This information is used by the manager to set the value of the sequence to the maximum value in the user_id field of the users table. If the users table is empty the value specified in the start tag is used instead. Please note that the value specified in the start tag is the first value that will be returned if you call MDB::nextId().
The previous example packs in quite a lot. Going through it line by line, we first open a sequence tag, followed by a name tag that specifies the name of the sequence. Next comes a start tag that defines the initial value of the sequence. Then we open an optional on tag, in which we name a specific field within a table. The manager uses this information to set the value of the sequence to the maximum value in the user_id field of the users table. If the users table is empty, the value specified in the start tag is used instead. Note that the value specified in the start tag is the first value returned when you call MDB::nextId().
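The initialization rule described above can be sketched in a few lines of PHP. This is only an illustration of the rule as the text states it, not MDB's actual implementation; the function name firstSequenceValue and its parameters are hypothetical.

```php
<?php
// Hypothetical helper, not part of MDB: models which value the first
// call to nextId() would yield for a sequence bound to a table field.
function firstSequenceValue(array $existingIds, int $start): int
{
    // Empty table: the value from the start tag is returned first.
    if (count($existingIds) === 0) {
        return $start;
    }
    // Otherwise the sequence sits at the current maximum,
    // so the next value handed out is one above it.
    return max($existingIds) + 1;
}

echo firstSequenceValue([], 1), "\n";        // empty users table
echo firstSequenceValue([1, 5, 3], 1), "\n"; // users table with three rows
```

With an empty table the start value comes back unchanged, while a table that already holds IDs continues one above the largest of them.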

Of course, you can also initialize a table with any values. For example you may want to initialize the above table with a maintenance user that you always want to include with your application. To do this we need to add an initialization tag to the table tag. Listing 4 defines one row after another enclosed with an insert tag.
Of course, you can also initialize a table with any values you like. For example, you may want to initialize the table above with a maintenance user that you always ship with your application. To do this, we add an initialization tag to the table tag. Listing 4 defines the rows one after another, each enclosed in an insert tag.

Listing 4

<table>
<name>users</name>
<initialization>
<insert>
<field>
<name>user_id</name>
<value>1</value>
</field>
<field>
<name>handle</name>
<value>default</value>
</field>
<field>
<name>is_active</name>
<value>Y</value>
</field>
</insert>
</initialization>
</table>

As you can see from the last example all we have to do is to define a value for each field of the table. We now have the necessary basics to create an XML schema for MDB. The next step is to pass this schema file to the MDB manager.
As you can see from the previous example, all we need to do is set values for each field of the table. We now know the basics necessary to create an MDB XML schema. The next step is to pass this schema file to the MDB manager.

$manager = new MDB_Manager;
$input_file = '';
// we do not have to specify a particular database at this time
$dsn = "mysql://$user:$pass@$host";
$manager->connect($dsn);
$manager->updateDatabase($input_file, $input_file. '.before');

We now have a new database called auth with a table called users. There is one index on the field user_id. There is one row in the table as well. We also have a sequence called users_user_id which will be initialized at 1. The next value in the sequence will therefore be 2. Finally, a copy of the schema was created with the name . This happened because we passed the optional second parameter to MDB_Manager::updateDatabase(). In the next section we will see why this copy is created.
We now have a new database called auth containing a table called users. There is one index on the user_id field and one row in the table. We also have a sequence called users_user_id, which is initialized to 1, so the next value in the sequence is 2. Finally, a copy of the schema was created with the .before suffix, because we passed the optional second parameter to MDB_Manager::updateDatabase(). The next section shows why this copy is created.

This is all fairly amazing, but it gets better. It is often the case that an application needs to be changed at some point. For example, we may decide to change the name of the table from users to people. We also want to add a field called pwd to store the password (see the text box Reserved Words).
All of this is impressive, and there is more. Applications often need to change at some point. For example, we may decide to rename the table from users to people, and we may also want to add a field called pwd to store the password (see the Reserved Words text box).

Reserved Words

The reason we do not call the field password is that this is a reserved word for field names in Interbase. Since we want to be RDBMS independent the MDB manager will either issue a warning or fail if the option fail_on_invalid_names is set to true (which is the default).
The reason we did not name the field password is that it is a reserved word for field names in Interbase. Because we want to stay RDBMS independent, the MDB manager either issues a warning or fails if the fail_on_invalid_names option is set to true (which is the default).

In the old days you would now be in a bit of pain to alter all your existing installations to this new schema. But thanks to MDB this can be automated. In listing 5 are the changes we make to our table definition:
In the old days, altering all of your existing installations to match the new schema would have been painful. With MDB, this work can be automated. Listing 5 shows the changes we make to our table definition:

Listing 5

<table>
<name>people</name>
<was>users</was>
<declaration>
<field>
<name>pwd</name>
<type>text</type>
<length>32</length>
<notnull>1</notnull>
<default></default>
</field>
</declaration>
</table>

Now we want the manager to make the necessary alterations, but first I want to mention a possible pitfall. Since we renamed the table users to people, we also have to change all references to the old name, such as the one in the sequence we built. There, the reference in the on tag needs to be changed to point to the people table. To apply the changes, we pass the new and the old version of the schema to the manager. This is why we created a .before file when we first called MDB_Manager::updateDatabase(). This ensures that we have an old version of the schema to compare the new version with.
Now we want the manager to make the necessary changes, but first I would like to point out a possible pitfall. Because we renamed the table from users to people, we also need to update every reference to the old name, such as the one in the sequence we created: the reference in its on tag must now point to the people table. To apply the changes, we pass both the old and the new version of the schema to the manager. This is why a .before file was created when we first called MDB_Manager::updateDatabase(): it guarantees that an old version of the schema is available to compare the new version against.

$input_file = '';
$manager->updateDatabase($input_file, $input_file.'.before');

That's all! The users table is now called people and now we also have a pwd field.
That's all! The users table is now called people, and it also has a pwd field.
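To make the role of the was tag more concrete, here is a minimal sketch of how a tool could spot renamed tables in a new schema file. It uses PHP's SimpleXML extension and is not how the MDB manager itself is implemented; the function name findRenamedTables and the trimmed-down schema are assumptions made for illustration.

```php
<?php
// Hypothetical helper, not part of MDB: lists tables whose <was> tag
// differs from their <name> tag, as an old-name => new-name map.
function findRenamedTables(string $schemaXml): array
{
    $schema = simplexml_load_string($schemaXml);
    if ($schema === false) {
        throw new InvalidArgumentException('Schema is not well-formed XML');
    }

    $renames = [];
    foreach ($schema->table as $table) {
        $name = trim((string) $table->name);
        $was  = trim((string) $table->was); // empty string if no <was> tag
        if ($was !== '' && $was !== $name) {
            $renames[$was] = $name;
        }
    }
    return $renames;
}

// Assumed minimal schema: only the tags needed for this illustration.
$newSchema = <<<XML
<database>
  <name>auth</name>
  <table>
    <name>people</name>
    <was>users</was>
  </table>
</database>
XML;

print_r(findRenamedTables($newSchema));
```

A check like this only identifies the rename. Comparing against the .before copy is still what lets the manager work out which fields were added or changed.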

I now want to look at one last feature of the XML schema format. This feature is especially important if you want to use the manager programmatically. Imagine that you have several customers that have the same authentication application running on your database server. Every customer has a database running on this server with the same schema but one minor difference: the name of the database. Keeping a separate schema file for each client can make sense when their update cycles differ, but this is not the case for our sample authentication application. Here all clients will be updated at the same time. The XML schema format allows us to use the variable tag for this.
I now want to look at one last feature of the XML schema format, which matters most if you want to drive the manager programmatically. Suppose several customers run the same authentication application on your database server. Each customer has a database on this server with the same schema and only one small difference: the name of the database. Separate schema files per customer can make sense when update cycles differ, but that is not the case for our sample authentication application, where all customers are updated at the same time. The XML schema format lets us use the variable tag for this purpose.

<?xml version="1.0" encoding="ISO-8859-1" ?>
<database>
<name><variable>name</variable></name>
</database>

We can now set the variable name at run time to whatever we may need.
We can now set the variable name at run time to whatever we need.

foreach($clients as $name) {
$variables = array('name' => $name);
$manager->updateDatabase($input_file, $input_file.'.before', $variables);
}
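What the variable tag amounts to can be sketched as a simple substitution step. The code below is an assumption about the general idea, not the manager's real parser; expandVariables is a hypothetical name, and the client names are made up.

```php
<?php
// Hypothetical helper, not part of MDB: replaces every
// <variable>key</variable> element with the value supplied for that key.
function expandVariables(string $schemaXml, array $variables): string
{
    return preg_replace_callback(
        '/<variable>\s*(\w+)\s*<\/variable>/',
        function (array $m) use ($variables) {
            if (!array_key_exists($m[1], $variables)) {
                throw new InvalidArgumentException("Undefined schema variable: {$m[1]}");
            }
            // Escape the value so the resulting schema stays well-formed XML.
            return htmlspecialchars((string) $variables[$m[1]], ENT_XML1);
        },
        $schemaXml
    );
}

$template = '<database><name><variable>name</variable></name></database>';

// Made-up client names, one database per client.
foreach (['client_a', 'client_b'] as $name) {
    echo expandVariables($template, ['name' => $name]), "\n";
}
```

A single schema file can then serve every client, with only the values in the array changing from one call to the next.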

The XML schema management is another important piece in the database abstraction concept that MDB provides. It allows us to keep our schema definition independent of a specific RDBMS. But using this format also ensures that the correct native data types are used so that MDB can correctly map its native data types. Finally, since the format is based on XML it is much easier to write tools that generate or read XML schema files.
XML schema management is another important part of the database abstraction concept that MDB provides. It lets us keep our schema definition independent of any specific RDBMS. Using this format also ensures that the correct native data types are used, so that MDB can map its own data types onto them correctly. Finally, because the format is based on XML, it is much easier to write tools that generate or read XML schema files.
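As a small illustration of the last point, the following sketch generates a table declaration with PHP's SimpleXMLElement class. It is a hedged example rather than a tool shipped with MDB: buildTableXml is a hypothetical name, and only the name and type tags are emitted, so a real schema would need the remaining tags shown in the listings.

```php
<?php
// Hypothetical generator, not part of MDB: builds a <table> element
// with one <field> per entry of a field-name => type map.
function buildTableXml(string $tableName, array $fields): string
{
    $table = new SimpleXMLElement('<table/>');
    $table->addChild('name', $tableName);

    $declaration = $table->addChild('declaration');
    foreach ($fields as $fieldName => $type) {
        $field = $declaration->addChild('field');
        $field->addChild('name', $fieldName);
        $field->addChild('type', $type);
    }

    // asXML() returns the document as a string, XML declaration included.
    return $table->asXML();
}

echo buildTableXml('people', ['handle' => 'text', 'pwd' => 'text']);
```

Because the output is ordinary XML, the same structure can be read back with simplexml_load_string() or processed by any other XML tooling.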

Sounds great but my application already uses ...
Sounds great, but my application already uses ...

Most readers probably find themselves in a position where they already have a number of applications that run on some other database abstraction layer. Due to MDB's heritage most PEAR DB users should find that MDB feels very similar, since the API of MDB is based on that of PEAR DB. Metabase users should find that all their favourite functions have their counterpart in MDB. The XML schema format is exactly the same as in Metabase. A complete guide to porting your existing applications to MDB is beyond the scope of this article; instead, I will use this space to give some tips. If you have any specific questions, feel free to email me.
Most readers probably already have a number of applications running on some other database abstraction layer. Because of MDB's heritage, most PEAR DB users should find that MDB feels very familiar, since its API is based on that of PEAR DB. Metabase users should find that all of their favourite functions have a counterpart in MDB, and the XML schema format is exactly the same as in Metabase. A complete guide to porting existing applications to MDB is beyond the scope of this article, so I will use this space to give some tips instead. If you have specific questions, feel free to email me.

To port your PEAR DB application to MDB, the best place to start is the PEAR wrapper. For one, you can run your application using the PEAR wrapper. The wrapper of course does add a little bit of overhead, so you will probably want to port to the native interface at some point. The first step then should be listing all PEAR DB methods that your application currently uses. Then look at the wrapper for any differences in the API. There are two key differences you will notice: result sets are not objects anymore, and all of the querying methods allow you to pass the data types of the result set, which results in slight changes in the parameter order. The first difference means that instead of calling the fetch method on the result object:
To port a PEAR DB application to MDB, the PEAR wrapper is the best starting point. First, you can simply run your application through the wrapper. The wrapper does add some overhead, so at some point you will probably want to port to the native interface. The first step is to list all the PEAR DB methods your application currently uses, then check the wrapper for any differences in the API. Two key differences stand out: result sets are no longer objects, and all of the querying methods let you pass the data types of the result set, which slightly changes the parameter order. The first difference means that instead of calling the fetch method on the result object:

$result = $db->query($sql);
$row = $result->fetchRow();

You will now have to call the MDB object for fetching:
You now have to call the MDB object to fetch the row:

$result = $mdb->query($sql);
$row = $mdb->fetchRow($result);
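Because this change is mechanical, a regular expression can do much of the rewriting. The sketch below covers only the fetchRow() call shown above and assumes the MDB object lives in a variable named $mdb; portFetchRowCalls is a hypothetical helper, and any real port should still be reviewed by hand.

```php
<?php
// Hypothetical porting aid, not part of MDB: rewrites
//   $anything->fetchRow()
// into
//   $mdb->fetchRow($anything)
// Calls that already pass an argument are left untouched.
function portFetchRowCalls(string $source, string $mdbVar = '$mdb'): string
{
    return preg_replace_callback(
        '/(\$\w+)->fetchRow\(\s*\)/',
        function (array $m) use ($mdbVar) {
            return $mdbVar . '->fetchRow(' . $m[1] . ')';
        },
        $source
    );
}

$old = '$row = $result->fetchRow();';
echo portFetchRowCalls($old), "\n";
```

The same approach extends to other fetch methods by adding one pattern per method, though the changed parameter order of the querying methods needs more care than a simple replacement.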

The second difference is quite easily fixed by looking at the wrapper. As you can see in the wrapper you may simply pass NULL where MDB would otherwise expect data types in the result set. Now, your application should work with MDB. Of course, you are now not really taking advantage of the advanced features of MDB. This most likely will require some changes to your current database schema. The manager can attempt to reverse engineer an XML schema file from an existing database. A very simple front end can be found in the MDB package: the reverse_engineer_xml_schema.php script. Most likely you will need to manually fix the resulting XML schema file, but it will give you a nice starting point.
The second difference is easily handled by looking at the wrapper. As the wrapper shows, you can simply pass NULL wherever MDB would otherwise expect the data types of the result set. Your application should now work with MDB. Of course, you are not yet taking advantage of MDB's advanced features, which will most likely require some changes to your current database schema. The manager can attempt to reverse engineer an XML schema file from an existing database. A very simple front end for this is included in the MDB package: the reverse_engineer_xml_schema.php script. You will most likely need to fix the resulting XML schema file by hand, but it gives you a good starting point.

If you want to port your existing application from Metabase to MDB, you will have to change all of your function calls. Looking at the Metabase wrapper, it will become quite obvious what changes need to be made. If you know regular expressions well, you might even be able to get most of the work done with a few such replacements. Either way, you should be up and running in no time, with your old beloved advanced abstraction features but now using MDB. What you will probably notice is that the method names are much shorter now. If you do some benchmarking, you will also see a nice performance increase.
If you want to port an existing application from Metabase to MDB, you will have to change all of your function calls. The Metabase wrapper makes it quite obvious what needs to change. If you know regular expressions well, you may be able to do most of the work with a few replacements. Either way, you should soon be running the advanced abstraction features you are used to, now on top of MDB. You will probably notice that the method names are much shorter, and if you run some benchmarks you will also see a welcome performance increase.

So what does the future look like for MDB?
So what will MDB look like in the future?

By the time this article is published, MDB will have moved on from the original 1.0 release. In addition to the original MySQL and PostgreSQL drivers, MDB will also have an ODBC driver and possibly even more drivers. This is one key area of focus in the development of MDB. Once MDB has caught up with PEAR DB in terms of drivers, it is likely to become the standard database abstraction layer in the PEAR framework.
By the time this article is published, MDB will have moved beyond the original 1.0 release. Alongside the original MySQL and PostgreSQL drivers, MDB will also have an ODBC driver and possibly more. Drivers are one of the key areas of focus in MDB development. Once MDB has caught up with PEAR DB in terms of drivers, it is likely to become the standard database abstraction layer in the PEAR framework.

But there is another key area of development: the MDB_frontend project. The MDB_frontend will be a phpMyAdmin-like web front end based on MDB and the MDB manager. With this tool you will be able to browse databases stored on any RDBMS that MDB supports. The MDB_frontend will show both the native and the MDB data types. Emulated features such as sequences in MySQL will be hidden. The user will simply see a list of sequences and not a table storing the value of the sequence, which is how sequences are emulated in MySQL. Furthermore, the MDB_frontend will assist in porting existing databases to match the native data types that MDB expects to be used. It will also help in creating and updating XML schema files. Some initial work has been completed, but much more work is needed before a public release can be expected.
But there is another key area of development: the MDB_frontend project. MDB_frontend will be a phpMyAdmin-like web front end built on MDB and the MDB manager. With this tool you will be able to browse databases stored on any RDBMS that MDB supports. MDB_frontend will display both the native and the MDB data types. Emulated features, such as sequences in MySQL, will be hidden: the user will see a list of sequences rather than the table that stores the sequence value, which is how sequences are emulated in MySQL. MDB_frontend will also help port existing databases so that they match the native data types MDB expects, and it will help create and update XML schema files. Some initial work has been done, but much more is needed before a public release.

While drivers and the MDB_frontend are currently the focus of all development, there are other things that MDB users may need. Some may need integrated bulk fetching of LOB fields; others may need foreign and primary key support. As always in open source, things will go faster if you participate in testing and implementation. I am also thankful for any other feedback, such as feature requests.
Drivers and MDB_frontend are the focus of current development, but MDB users may need other things as well: some may want integrated bulk fetching of LOB fields, while others may need foreign and primary key support. As always in open source, things move faster if you take part in testing and implementation. I am also grateful for any other feedback, such as feature requests.

Some final thoughts
Some closing thoughts

After months of hard work, MDB is gaining acceptance among the current PEAR DB and Metabase users. I also hope that people who so far have not been convinced by other database abstraction layers realize the benefits that MDB holds for them. Of course, there are still a lot of applications that need to be tailored specifically to one RDBMS, where a tool like MDB just adds unnecessary overhead and restrictions. Overall, I am very pleased that we made the decision in my company to lead the MDB development. In the beginning, we were all a bit worried that by attempting to please both the PEAR DB and Metabase users the result would end up pleasing neither side. Another source of concern was whether the PHP community would assist in the development or not. I am very happy that the PHP community came through and helped in writing drivers and on the core of MDB as well. Therefore, we consider this project to be a huge success. We are sure that together we will improve MDB even further. And we are happy to have helped make PHP even better.
After months of hard work, MDB is gaining acceptance among current PEAR DB and Metabase users. I also hope that people who have not been convinced by other database abstraction layers will see the benefits MDB offers them. Of course, many applications still need to be tailored to one specific RDBMS, and for those a tool like MDB only adds unnecessary overhead and restrictions. Overall, I am very pleased that my company decided to lead MDB development. At first we were all a little worried that by trying to please both PEAR DB and Metabase users we would end up pleasing neither. Another concern was whether the PHP community would help with development. I am very happy that the community came through, helping to write drivers and to work on the core of MDB. We therefore consider this project a great success, we are sure that MDB will be improved even further, and we are happy to have helped make PHP better.

Lukas Smith is the lead author of PEAR MDB. He actively contributes to various PHP opensource projects and is a founder of the company BackendMedia which specializes in PHP development.
Lukas Smith is the lead author of PEAR MDB. He actively contributes to several PHP open source projects and is a founder of BackendMedia, a company that specializes in PHP development.

Links and Literature

PEAR MDB homepage: /?package=MDB

PEAR MDB documentation: /MDB/docs/

PEAR MDB sample script: //pear/MDB/MDB_test.php

PEAR DB homepage: /?package=DB

Metabase homepage: /?page=%%2Fpackage%

Simple benchmark: /screenshots/30313/