Two practical techniques

To fully utilize the advantages, it is also very important to not only have a comprehensive and in-depth understanding of the programming model, but also to summarize experience and skills in a timely manner. ADO has years of practical experience, and based on this, it provides richer and more powerful tools; despite this, the design goal is not to provide a plug-and-play tool, and it will not simplify all programming work to the point where it can be done with just a mouse click.
It contains a large number of objects representing various logical entities in the data access model, among which the two objects are the most important. The purpose of a connection is to establish a channel for communication with the backend database, and the creation of a connection object must be based on a specific .NET data provider. Transaction objects can be created on existing connection objects, or they can be created by explicitly executing a BEGIN TRAN SQL statement. Although the theory is simple, in fact, there are many uncertainties surrounding connections and transactions, and they have a critical impact on the overall stability and efficiency of the application.

How to save a connection string to protect sensitive information (such as passwords) that may be contained in the connection string? How to design a complete data access strategy that takes into account security (i.e. authentication, authorization), but does not have too much impact on performance and scalability? If transactions are needed, how can we implement and control transactions efficiently? Using automatic transactions or manual transactions? These issues must be carefully considered when used.

1. Connect strings, connect pools

Database connection is an important, limited and expensive resource, so making good use of connection objects is the most basic requirement of any application. The key points of using database connections can be summarized as follows:

Pay attention to safety when saving connection strings.
Opening the connection should be late, closing the connection should be early.
The connection string is the key to accessing the database. In addition to indicating the data to be accessed, the connection string also contains proof of identity of why the user can access that data. When performing database operations, user identity proof is the most important factor in determining data access rights.

1.1 Save the connection string

Currently, hard-coded connection strings have the best performance because they are compiled directly into the application's code. However, hard-coded strings affect the flexibility of the program and the application must be recompiled once the connection string changes.

Saving connection strings external increases flexibility at the cost of accessing external strings. But in most cases, the performance overhead caused by this is negligible, and the real concern is security issues. For example, an attacker may modify or steal connection strings. Common ways to save connection strings to external environments are: configuration files, UDL files, and Windows registry.

.NET framework configuration files are deployed in the form of plain text files for easy access. If the connection string contains a password, the text format will be the biggest drawback because the password will be saved in plain text. You can consider introducing a dedicated encryption/decryption engine, but this part of the work needs to be done by the developer himself.

UDL files are text files for use by OLE DB providers, that is, SQL Server hosting providers do not support UDL files. UDL files also have the same security problems as the previous configuration files, and generally speaking, there are not many advantages.

Finally, the Windows registry can serve as a naturally secure storage place. The registry is a systematic knowledge base that stores key information. If encryption technology is combined, it can achieve higher security. The main disadvantages of using the registry are the hassle of deployment, requiring the creation of the registration key (and possibly also encrypted) and reading data from the registry. Although the .NET Framework provides a set of encapsulated classes that call the underlying Win32 API, none of these classes provide encryption functionality. The aspnet_setreg.exe tool can be used to create the registration key under HKEY_LOCAL_MACHINE to save the user name and password, for example: aspnet_setreg.exe -k "Software\MyData" -u:userID -p:password. This command encrypts the specified user ID and password.

1.2 Connection pooling principle

The connection pool allows us to reuse existing connection objects through a buffer pool, avoiding creating a new object every time we use the connection object. After using the connection pool, only a small number of connection objects can meet the needs of a large number of clients.

Each connection pool is associated with an independent connection string and its transaction context. Each time a new connection is opened, the data provider tries to match the specified connection string with the connection pool's string. If the match fails, the data provider creates a new connection and adds it to the connection pool. After the connection pool is created, it will not be demolished unless the process ends. Some people think that this way of handling will affect performance, but in fact, it does not require much overhead to maintain an inactive or empty connection pool.

After the connection pool is created, the system creates some connection objects and adds them to the connection pool until the minimum number of connection objects is reached. In the future, the system will create and add connection objects as needed until the maximum number of connection objects is reached. If there are no idle connection objects available when the program requests a connection object, and the number of objects in the connection pool has reached the upper limit, the request is placed in the queue and will be taken out immediately for use once a connection is released back to the buffer pool.

Avoid constructing connection strings programmatically. If the connection string is constructed by combining multiple input data, it is easy to give injection attacks a chance. If the data entered by the user must be used, strict verification must be carried out.

1.3 Close the connection

When closing a connection, the connection object is returned to the connection pool for reuse, but the actual database connection is not removed at this time. If the connection pool is disabled, the actual database connection is also closed. One thing that must be emphasized here is that the connection object should be explicitly closed and returned to the connection pool after use, and do not rely on the garbage collector to release the connection. In fact, when the reference to the connection object is out of the valid range, the connection is not necessarily closed - the function of the garbage collector is to remove .NET encapsulated objects representing physical connections, but this does not mean that the underlying connection is also closed.

Calling the Close or Dispose method can release the connection back to the connection pool. The connection object is deleted from the connection pool only when the lifetime ends or a serious error occurs.

1.4 Connection pool and security

If all data access operations of an application use the same connection string, the advantages of connection pooling will be taken to the limit. However, this is just an ideal situation and is likely to conflict with other requirements of the application. For example, if you only use one connection string, it will be difficult to perform security controls at this level of the database.

On the other hand, if each user is asked to use his own connection string (that is, to set up a database account for each user), there will inevitably be a large number of small connection pools, and many connections will not be reused at all. By convention, the best solution to this type of problem is to find an appropriate trade-off between the two extremes. We can set up a representative set of public accounts and modify the stored procedure at the same time to accept a parameter representing the user ID. The stored procedure performs different operations based on the incoming user ID.

2. Transaction mode

Distributed enterprise applications are inseparable from transactions. There are two main ways to add transaction management functions to the data access code: manual method and automatic method.

In manual mode, the programmer is responsible for writing all configurations and using transaction mechanisms. Automatic (or COM+) transactions add declarative attributes to the .NET class to specify the transaction characteristics of the runtime object. Automatic method facilitates the configuration of multiple components to run within the same transaction. Both transaction methods support local or distributed transactions, but automatic transaction methods greatly simplify distributed transaction processing.

It must be noted that transactions are a very expensive operation, so you must think twice before deciding to use transactions. If you really need to use transactions, you must try to minimize the granularity of the transaction and reduce the locking time and locking range of the database. For example, for SQL Server, a single SQL statement does not need to declare transactions explicitly, and SQL Server will automatically run each statement as an independent transaction. Manual local transactions are always much faster than other transactions because they do not need to involve DTC (Distributed Transaction Coordinator).

Manual transactions and automatic transactions should be regarded as two different and mutually exclusive technologies. Manual transactions are preferred if you want to perform transactional operations on a single database. When a single transaction spans multiple remote databases, or a single transaction involves multiple resource managers (e.g., one database and one MSMQ resource manager), automatic transactions are preferred. In any case, efforts should be made to avoid mixing the two transaction modes. If performance is not particularly important, consider using automatic transactions even for only one database operation, making the code more concise (but slightly slower).

In short, to improve the quality of database access code, you must have an in-depth understanding of the object model and flexibly use various techniques according to actual conditions. It is a public API, and various applications - whether it is a Windows Forms application, ASP page or Web service, can access the database; however, it is not a black box that accepts input and spits out the results, but a toolbox composed of many tools.