Generally, when an application executes an SQL statement, it sets a timeout for that SQL. If the SQL has not finished when the timeout expires, the application terminates it, releases the resources, and returns an error. Here we discuss the concrete practices, the implementation principles, and the impact on the connection pool's connection count in the golang+mysql scenario.
Implementing SQL statement timeout control based on context:
Using context for timeout control is standard practice in golang. When the first parameter of a function is a ctx, the function makes an implicit promise: when the ctx's cancel signal is received, execution should be terminated early and resources released. Currently, the most common way for back-end applications to operate databases is the Gorm framework. Gorm mainly handles SQL assembly and shields the differences between underlying databases; it provides neither a connection pool nor a MySQL client driver. By default, the connection pool is the one provided by the database/sql standard library, and the driver is go-sql-driver/mysql. Therefore, to analyze how context-based timeout control works, we need to look at all three layers.
For Gorm, if you want to apply timeout control to a SQL statement, you can use the WithContext() method directly, as follows:
```go
func main() {
	ctx := context.Background()
	ctx, cancel := context.WithTimeout(ctx, 3*time.Second)
	defer cancel()
	err := db.WithContext(ctx).Exec("select sleep(10)").Error
	if err != nil {
		log.Println(err)
	}
}

// output
// [3001.379ms] [rows:0] select sleep(10)
// 2023/12/17 13:31:54 context deadline exceeded
```
Here, the ctx timeout is set to 3s, and the SQL statement sleeps for 10s. When execution reaches 3s, the context deadline exceeded error is returned.
After gorm's WithContext is called, the ctx is eventually handed to the ExecContext function of the database/sql connection pool.
```go
func (db *DB) ExecContext(ctx context.Context, query string, args ...any) (Result, error) {
	var res Result
	var err error

	err = db.retry(func(strategy connReuseStrategy) error {
		res, err = db.exec(ctx, query, args, strategy)
		return err
	})

	return res, err
}
```
This function takes a connection from the connection pool; the database driver layer then actually executes the SQL.
```go
func (mc *mysqlConn) ExecContext(ctx context.Context, query string, args []driver.NamedValue) (driver.Result, error) {
	dargs, err := namedValueToValue(args)
	if err != nil {
		return nil, err
	}

	// Here is the core code: register ctx with the watcher
	if err := mc.watchCancel(ctx); err != nil {
		return nil, err
	}
	defer mc.finish()

	return mc.Exec(query, dargs)
}
```
Before actually communicating with the MySQL server, go-sql-driver/mysql calls watchCancel to listen for the cancel signal of the current ctx, ensuring that the signal is noticed immediately during SQL execution and the SQL is canceled. The implementation of watchCancel is as follows:
```go
func (mc *mysqlConn) watchCancel(ctx context.Context) error {
	if mc.watching {
		// Reach here if canceled,
		// so the connection is already invalid
		mc.cleanup()
		return nil
	}
	// When ctx is already cancelled, don't watch it.
	if err := ctx.Err(); err != nil {
		return err
	}
	// When ctx is not cancellable, don't watch it.
	if ctx.Done() == nil {
		return nil
	}
	// When watcher is not alive, can't watch it.
	if mc.watcher == nil {
		return nil
	}

	// Under normal circumstances we reach here and put ctx into the watcher channel
	mc.watching = true
	mc.watcher <- ctx
	return nil
}
```
You can see that the core code puts this ctx into a channel, so something must be listening on that channel — which is the following code:
```go
func (mc *mysqlConn) startWatcher() {
	watcher := make(chan context.Context, 1)
	mc.watcher = watcher
	finished := make(chan struct{})
	mc.finished = finished
	go func() {
		for {
			var ctx context.Context
			select {
			case ctx = <-watcher:
			case <-mc.closech:
				return
			}

			select {
			// Here the ctx cancel signal is received and the actual cancel is performed
			case <-ctx.Done():
				mc.cancel(ctx.Err())
			case <-finished:
			case <-mc.closech:
				return
			}
		}
	}()
}
```
This startWatcher function starts a separate goroutine to listen on the connection's watcher channel, taking each ctx from the channel and monitoring its cancel signal. Since SQL statements on the same connection always execute sequentially, listening to each ctx in turn causes no problems. startWatcher is called when the connection is created, ensuring that the ctx attached to every subsequent statement on the connection is watched.
If the cancel signal really does arrive, the cancel function is called to perform the cancellation.
```go
// cancel is called when the query has been canceled.
func (mc *mysqlConn) cancel(err error) {
	mc.canceled.Set(err)
	mc.cleanup()
}

func (mc *mysqlConn) cleanup() {
	if !mc.closed.TrySet(true) {
		return // Makes cleanup idempotent
	}

	close(mc.closech)
	if mc.netConn == nil {
		return
	}
	// The core code: close the TCP connection used for communication
	if err := mc.netConn.Close(); err != nil {
		errLog.Print(err)
	}
}
```
Finally, when the cancel signal is received, the TCP connection to the mysql server is closed.
Implementing SQL statement timeout control based on readTimeout and writeTimeout in the DSN:
Another method is to specify the timeouts in the DSN used to open the db object. The specific method is as follows:
```go
func init() {
	dsn := "root:12345678@tcp(localhost:3306)/test?charset=utf8mb4&parseTime=True&loc=Local&timeout=1500ms&readTimeout=3s&writeTimeout=3s"
	var err error
	db, err = gorm.Open(mysql.Open(dsn), &gorm.Config{})
	if err != nil {
		panic(err)
	}
}

func main() {
	ctx := context.Background()
	err := db.WithContext(ctx).Exec("select sleep(10)").Error
	if err != nil {
		log.Println(err)
	}
}

// output
// [3002.597ms] [rows:0] select sleep(10)
// 2023/12/17 14:21:11 invalid connection
```
Specifying readTimeout=3s&writeTimeout=3s in the DSN and executing sleep(10) also produces an error at the third second, but the error looks a little strange: invalid connection seems to have nothing to do with a timeout. This is because these two timeouts actually apply to the TCP connection underneath the mysql driver. That is, readTimeout is the timeout for reading a packet from the TCP connection, and writeTimeout is the timeout for writing a packet to it; both are implemented with the connection's deadline. Therefore, once a timeout fires, the connection is considered broken, and a connection-failure error is returned.
The specific implementation is again in go-sql-driver/mysql: when a connection is created, the two timeouts are stored on the connection object.
```go
...
// The startWatcher/watchCancel logic mentioned above
mc.startWatcher()
if err := mc.watchCancel(ctx); err != nil {
	mc.cleanup()
	return nil, err
}
defer mc.finish()

mc.buf = newBuffer(mc.netConn)

// After the DSN is parsed, the two timeouts are assigned to the connection object
mc.buf.timeout = mc.cfg.ReadTimeout
mc.writeTimeout = mc.cfg.WriteTimeout
...
```
These two properties are used when actually communicating with the mysql server.
```go
func (b *buffer) readNext(need int) ([]byte, error) {
	if b.length < need {
		// refill
		if err := b.fill(need); err != nil {
			return nil, err
		}
	}

	offset := b.idx
	b.idx += need
	b.length -= need
	return b.buf[offset:b.idx], nil
}

// fill reads into the buffer until at least _need_ bytes are in it
func (b *buffer) fill(need int) error {
	...
	for {
		// If timeout > 0, set a new read deadline on the connection
		if b.timeout > 0 {
			if err := b.nc.SetReadDeadline(time.Now().Add(b.timeout)); err != nil {
				return err
			}
		}

		nn, err := b.nc.Read(b.buf[n:])
		n += nn

		switch err {
		case nil:
			if n < need {
				continue
			}
			b.length = n
			return nil

		case io.EOF:
			if n >= need {
				b.length = n
				return nil
			}
			return io.ErrUnexpectedEOF

		default:
			return err
		}
	}
}
```
Every time data needs to be read from the mysql server, a deadline is set for that read operation: the current time plus the timeout value. If any read exceeds the deadline, an i/o timeout error is reported. When the caller receives this error, the following handling is performed:
```go
data, err = mc.buf.readNext(pktLen)
if err != nil {
	if cerr := mc.canceled.Value(); cerr != nil {
		return nil, cerr
	}
	errLog.Print(err)
	// Close the current connection
	mc.Close()
	// Return invalid connection error
	return nil, ErrInvalidConn
}
```
First the connection is closed, then an invalid connection error is returned, which is why the timeout error in the example above is invalid connection. The part to pay attention to is the Close method, which performs the follow-up handling of the timeout:
```go
func (mc *mysqlConn) Close() (err error) {
	// Makes Close idempotent
	if !mc.closed.IsSet() {
		// Send the quit command to tell the mysql server we are leaving
		err = mc.writeCommandPacket(comQuit)
	}

	// Calling cleanup here is the same operation as after the ctx cancel signal
	mc.cleanup()
	return
}
```
First a quit command (COM_QUIT) is sent to tell the server the client is leaving, then cleanup closes the TCP connection. The timeout handling here is basically the same as the ctx-based path; this path just additionally sends the quit command. Externally the effect looks the same: as long as the connection is closed, the mysql server can reclaim it (perhaps not immediately, but eventually). I searched around, including the official MySQL documentation, but could not find what the consequence would be of closing the TCP connection directly without sending the quit command. I have not studied the MySQL source code — if anyone knows, please advise. My guess is that it causes no real problem; otherwise ctx-based timeout control would have been broken long ago.
How the connection pool handles a timed-out SQL statement:
I personally see no problem with either timeout solution above; what the bottom layer ultimately does is basically the same (closing the TCP connection). I prefer the ctx-based solution: ctx was designed for exactly this purpose, it stays consistent with timeout control in other scenarios, and its error message is friendlier. The next thing to consider is how the connection pool above reacts once the bottom layer errors out and the connection is closed.
```go
func (db *DB) execDC(ctx context.Context, dc *driverConn, release func(error), query string, args []any) (res Result, err error) {
	defer func() {
		// Core code: after the SQL executes, release the connection;
		// the release logic depends on whether err is nil
		release(err)
	}()
	execerCtx, ok := dc.ci.(driver.ExecerContext)
	var execer driver.Execer
	if !ok {
		execer, ok = dc.ci.(driver.Execer)
	}
	if ok {
		var nvdargs []driver.NamedValue
		var resi driver.Result
		withLock(dc, func() {
			nvdargs, err = driverArgsConnLocked(dc.ci, nil, args)
			if err != nil {
				return
			}
			// The driver layer actually executes the query
			resi, err = ctxDriverExec(ctx, execerCtx, execer, query, nvdargs)
		})
		if err != driver.ErrSkip {
			if err != nil {
				return nil, err
			}
			return driverResult{dc.ci, resi}, nil
		}
	}
	...
}
```
After exec completes, the connection is released back into the pool for other queries. When it is put back, the handling depends on whether the SQL returned an error. The injected release function is actually releaseConn, which internally calls putConn.
```go
func (db *DB) putConn(dc *driverConn, err error, resetSession bool) {
	if !errors.Is(err, driver.ErrBadConn) {
		// Ask the driver whether the connection is still usable;
		// if not, mark it with ErrBadConn
		if !dc.validateConnection(resetSession) {
			err = driver.ErrBadConn
		}
	}
	db.mu.Lock()
	if !dc.inUse {
		db.mu.Unlock()
		if debugGetPut {
			fmt.Printf("putConn(%v) DUPLICATE was: %s\n\nPREVIOUS was: %s", dc, stack(), db.lastPut[dc])
		}
		panic("sql: connection returned that was never out")
	}

	// A connection past its maximum lifetime is also marked as unusable
	if !errors.Is(err, driver.ErrBadConn) && dc.expired(db.maxLifetime) {
		db.maxLifetimeClosed++
		err = driver.ErrBadConn
	}
	if debugGetPut {
		db.lastPut[dc] = stack()
	}
	dc.inUse = false
	dc.returnedAt = nowFunc()

	for _, fn := range dc.onPut {
		fn()
	}
	dc.onPut = nil

	// If the connection is unusable, handle it as follows
	if errors.Is(err, driver.ErrBadConn) {
		// A connection was closed; consider opening a new one
		db.maybeOpenNewConnections()
		db.mu.Unlock()
		// Close the connection
		dc.Close()
		return
	}
	if putConnHook != nil {
		putConnHook(db, dc)
	}
	// SQL executed normally (or errored with a healthy connection):
	// return the connection to the pool as usual
	added := db.putConnDBLocked(dc, nil)
	db.mu.Unlock()
	if !added {
		dc.Close()
		return
	}
}
```
Overall, when SQL execution fails, the pool determines whether the connection is still usable. That determination is delegated to the driver layer, which implements the following method:
```go
// IsValid implements the driver.Validator interface
// (From Go 1.15)
func (mc *mysqlConn) IsValid() bool {
	return !mc.closed.IsSet()
}
```
It is used to tell the connection pool whether the connection is still healthy. The cleanup method, called when an SQL execution times out, does one thing first: it marks the connection as no longer usable.
```go
func (mc *mysqlConn) cleanup() {
	// Mark the connection as no longer usable
	if !mc.closed.TrySet(true) {
		return
	}
	...
}
```
After the pool determines the connection is bad, or that it has exceeded its maximum lifetime, it calls the connection pool's Close method — note, the pool's Close, not the driver's. That Close eventually calls finalClose.
```go
func (dc *driverConn) finalClose() error {
	var err error

	var openStmt []*driverStmt
	withLock(dc, func() {
		openStmt = make([]*driverStmt, 0, len(dc.openStmt))
		for ds := range dc.openStmt {
			openStmt = append(openStmt, ds)
		}
		dc.openStmt = nil
	})
	for _, ds := range openStmt {
		ds.Close()
	}
	withLock(dc, func() {
		// Here the driver layer is asked to close the connection
		dc.finalClosed = true
		err = dc.ci.Close()
		dc.ci = nil
	})

	dc.db.mu.Lock()
	// Decrement the number of currently open connections
	dc.db.numOpen--
	dc.db.maybeOpenNewConnections()
	dc.db.mu.Unlock()

	dc.db.numClosed.Add(1)
	return err
}
```
This method mainly does two things. First, it actually closes the connection at the driver layer — this matters mainly for connections that reached their maximum lifetime. Connections already closed by a timed-out SQL are not closed again: TrySet fails inside cleanup, so the TCP-close step is skipped. Second, after closing, it decrements the count of currently open connections, so that a new connection can be opened normally.
Potential problems:
To sum up, although the two timeout control methods are implemented differently, what they do after detecting the timeout is the same: close the connection and decrement the pool's open-connection count. This introduces a problem. Although the client has called Close and considers the connection gone, a mysql server that is not idle does not perceive the close. Concretely: a connection is executing a time-consuming query, the client actively closes the connection, but the mysql server neither terminates the query immediately nor closes its side; show processlist still shows the SQL executing normally. The same phenomenon is visible in the TCP connection states. When both sides are communicating normally, the state is
tcp6 0 0 ::1.3306 ::1.61351 ESTABLISHED
tcp6 0 0 ::1.61351 ::1.3306 ESTABLISHED
After the client actively calls Close, the server side remains in CLOSE_WAIT because it is still executing the current SQL, and the client side enters FIN_WAIT_2. The remaining steps of the close handshake happen only after the server finishes the SQL and truly closes the connection.
tcp6 5 0 ::1.3306 ::1.61351 CLOSE_WAIT
tcp6 0 0 ::1.61351 ::1.3306 FIN_WAIT_2
The problem is that the server-side connection has not yet closed, but the pool's open-connection count has already been decremented, so a new connection can be created. This lets the number of connections on the mysql server side exceed the pool's maximum. If many SQL statements time out, the server-side count can greatly exceed the pool's limit and reach the server's max_connections, after which new connections fail with a too many connections error. If the SQL those leaked connections are running is genuinely slow, or a deadlock has occurred, mysql may effectively deny service for a long time. The lesson for production environments: do not size the connection pool's maximum right up to mysql's max_connections; leave a margin, so that connections leaked by many SQL timeouts cannot fill up the server's connection limit and make the database unavailable.