The code is not only shorter, but also easier to understand and efficient. Instead of executing two queries, we execute one.
Although this question may sound far-fetched, in practice we usually summarize that all tables should be in the same database unless there is a very forced reason.
Question 4: No relationship
Relational databases are different from programming languages, and they do not have array types. Instead, they use relationships between tables to create one to multiple structures between objects, which has the same effect as arrays. One problem I've seen in the application is that engineers are trying to use databases as programming language, i.e. creating arrays by using text strings with comma-separated identifiers. Please see the pattern below.
Copy the codeThe code is as follows:
DROP TABLE IF EXISTS files;
CREATE TABLE files (
id MEDIUMINT,
name TEXT,
path TEXT
);
DROP TABLE IF EXISTS users;
CREATE TABLE users (
id MEDIUMINT,
login TEXT,
password TEXT,
files TEXT
);
INSERT INTO files VALUES ( 1, '', 'media/' );
INSERT INTO files VALUES ( 2, '', 'media/' );
INSERT INTO users VALUES ( 1, 'jack', 'pass', '1,2' );
Listing 10.
One user in the system can have multiple files. In programming languages, arrays should be used to represent files associated with a user. In this example, the programmer chooses to create a files field containing a comma-separated list of file ids. To get a list of all files for a specific user, the programmer must first read the rows from the user table, then parse the file's text, and run a separate SELECT statement for each file. The code is shown below.
Copy the codeThe code is as follows:
<?php
require_once("");
function get_files( $name )
{
$dsn = 'mysql://root:password@localhost/bad_norel';
$db =& DB::Connect( $dsn, array() );
if (PEAR::isError($db)) { die($db->getMessage()); }
$res = $db->query( "SELECT files FROM users WHERE login=?",
array( $name ) );
$files = null;
while( $res->fetchInto( $row ) ) { $files = $row[0]; }
$rows = array();
foreach( split( ',',$files ) as $file )
{
$res = $db->query( "SELECT * FROM files WHERE id=?",
array( $file ) );
while( $res->fetchInto( $row ) ) { $rows[] = $row; }
}
return $rows;
}
$files = get_files( 'jack' );
var_dump( $files );
?>
Listing 11.
The technology is slow, difficult to maintain, and does not make good use of the database. The only solution is to re-architect the pattern to convert it back to the traditional relationship form, as shown below.
Copy the codeThe code is as follows:
DROP TABLE IF EXISTS files;
CREATE TABLE files (
id MEDIUMINT,
user_id MEDIUMINT,
name TEXT,
path TEXT
);
DROP TABLE IF EXISTS users;
CREATE TABLE users (
id MEDIUMINT,
login TEXT,
password TEXT
);
INSERT INTO users VALUES ( 1, 'jack', 'pass' );
INSERT INTO files VALUES ( 1, 1, '', 'media/' );
INSERT INTO files VALUES ( 2, 1, '', 'media/' );
Listing 12.
Here, each file is associated with the user in the file table through the user_id function. This may be the opposite of the idea of anyone who thinks of multiple files as arrays. Of course, the array does not reference the objects it contains—in fact, vice versa. But in relational databases, this is how it works, and queries are therefore much faster and easier. Listing 13 shows the corresponding PHP code.
Copy the codeThe code is as follows:
<?php
require_once("");
function get_files( $name )
{
$dsn = 'mysql://root:password@localhost/good_rel';
$db =& DB::Connect( $dsn, array() );
if (PEAR::isError($db)) { die($db->getMessage()); }
$rows = array();
$res = $db->query(
"SELECT files.* FROM users,files WHERE =?
AND =files.user_id",
array( $name ) );
while( $res->fetchInto( $row ) ) { $rows[] = $row; }
return $rows;
}
$files = get_files( 'jack' );
var_dump( $files );
?>
Listing 13. Get_good.php
Here, we do a query to the database to get all the rows. The code is not complicated, and it uses the database as its original purpose.
Question 5: n+1 mode
I really don't know how many times I've seen such large applications where the code first searches some entities (such as customers) and then searches them one by one to get detailed information for each entity. We call this n+1 pattern because the query is executed so many times - a query retrieves a list of all entities and then performs a query for each of the n entities. This is not a problem when n=10, but what about when n=100 or n=1000? Then there will definitely be inefficiency issues. Listing 14 shows an example of this pattern.
Copy the codeThe code is as follows:
DROP TABLE IF EXISTS authors;
CREATE TABLE authors (
id MEDIUMINT NOT NULL AUTO_INCREMENT,
name TEXT NOT NULL,
PRIMARY KEY ( id )
);
DROP TABLE IF EXISTS books;
CREATE TABLE books (
id MEDIUMINT NOT NULL AUTO_INCREMENT,
author_id MEDIUMINT NOT NULL,
name TEXT NOT NULL,
PRIMARY KEY ( id )
);
INSERT INTO authors VALUES ( null, 'Jack Herrington' );
INSERT INTO authors VALUES ( null, 'Dave Thomas' );
INSERT INTO books VALUES ( null, 1, 'Code Generation in Action' );
INSERT INTO books VALUES ( null, 1, 'Podcasting Hacks' );
INSERT INTO books VALUES ( null, 1, 'PHP Hacks' );
INSERT INTO books VALUES ( null, 2, 'Pragmatic Programmer' );
INSERT INTO books VALUES ( null, 2, 'Ruby on Rails' );
INSERT INTO books VALUES ( null, 2, 'Programming Ruby' );
[code]
Listing 14.
This mode is reliable, without any errors. The problem is accessing the database to find all the books in the code of a given author, as shown below.
[code]
<?php
require_once('');
$dsn = 'mysql://root:password@localhost/good_books';
$db =& DB::Connect( $dsn, array() );
if (PEAR::isError($db)) { die($db->getMessage()); }
function get_author_id( $name )
{
global $db;
$res = $db->query( "SELECT id FROM authors WHERE name=?",
array( $name ) );
$id = null;
while( $res->fetchInto( $row ) ) { $id = $row[0]; }
return $id;
}
function get_books( $id )
{
global $db;
$res = $db->query( "SELECT id FROM books WHERE author_id=?",
array( $id ) );
$ids = array();
while( $res->fetchInto( $row ) ) { $ids []= $row[0]; }
return $ids;
}
function get_book( $id )
{
global $db;
$res = $db->query( "SELECT * FROM books WHERE id=?", array( $id ) );
while( $res->fetchInto( $row ) ) { return $row; }
return null;
}
$author_id = get_author_id( 'Jack Herrington' );
$books = get_books( $author_id );
foreach( $books as $book_id ) {
$book = get_book( $book_id );
var_dump( $book );
}
?>
Listing 15.
If you look at the code below, you might think, “Hey, that’s really clear.” First, get the author id, then get the book list, and then get information about each book. Indeed, it is clear, but is it efficient? The answer is no. See how many queries you have to perform when just retrieving Jack Herrington’s books. Get id once, get a list of books another time, and then perform a query for each book. Three books need to be searched five times!
The solution is to use a function to execute a large number of queries as shown below.
Copy the codeThe code is as follows:
<?php
require_once('');
$dsn = 'mysql://root:password@localhost/good_books';
$db =& DB::Connect( $dsn, array() );
if (PEAR::isError($db)) { die($db->getMessage()); }
function get_books( $name )
{
global $db;
$res = $db->query(
"SELECT books.* FROM authors,books WHERE
books.author_id= AND =?",
array( $name ) );
$rows = array();
while( $res->fetchInto( $row ) ) { $rows []= $row; }
return $rows;
}
$books = get_books( 'Jack Herrington' );
var_dump( $books );
?>
Listing 16. Get_good.php
Retrieving the list now requires a quick, single query. This means that I will most likely have to have several methods of these types with different parameters, but there is really no choice. If you want to have an extended PHP application, you must use the database efficiently, which means smarter queries.
The problem with this example is that it is a bit too clear. Generally speaking, these types of n+1 or n*n problems are much more subtle. And they only occur when the database administrator runs the query profiler on the system when the system has performance issues.
Conclusion
Databases are powerful tools, just like all powerful tools, and they will be abused if you don't know how to use them correctly. The trick to identifying and solving these problems is to better understand the underlying technology. For a long time, I have heard business logic writers complain that they don't want to have to understand databases or SQL code. They use the database as an object and wonder why the performance is so poor.
They did not realize how important it is to understand SQL to convert databases from a difficult necessity to a strong alliance. If you use a database every day but are not familiar with SQL, please read The Art of SQL. This book is well written and very practical, and can guide you to a basic understanding of databases.