Sometimes the network is fine, yet clients still cannot connect to Kafka, especially in multi-NIC or cloud environments. This is usually related to Kafka's listener configuration. This article introduces the listener-related configuration items. Currently, the listener-related parameters are the following:
- listeners
- advertised.listeners
- advertised.host.name (historical legacy, deprecated, do not use)
- advertised.port (historical legacy, deprecated, do not use)
- host.name (historical legacy, deprecated, do not use)
The most important ones are listeners and advertised.listeners: when the cluster starts, each broker binds to the addresses configured in listeners and writes the addresses configured in advertised.listeners into ZooKeeper as part of the cluster metadata. The process of a client (producer/consumer) connecting to the Kafka cluster can be divided into 2 steps:
- Connect to a broker through the connection information (ip/host) configured in listeners (brokers periodically fetch and cache the metadata in ZooKeeper), and obtain the advertised.listeners addresses recorded in the metadata.
- Communicate with the Kafka cluster (read/write) through the advertised.listeners addresses obtained in step 1.
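The two steps above can be sketched as a minimal model (illustrative pseudologic only; real clients speak the Kafka wire protocol, and the addresses below are hypothetical):

```python
# Minimal model of the two-step client bootstrap (illustrative only;
# real clients send a Metadata request over TCP, not read these dicts).

# What brokers wrote into ZooKeeper via advertised.listeners
# (hypothetical broker_id -> advertised address values):
CLUSTER_METADATA = {
    0: "internal-host-0:19092",
    1: "internal-host-1:19092",
}

def bootstrap(bootstrap_address: str) -> dict:
    """Step 1: dial a bootstrap address (one of the `listeners`
    addresses) and receive the cluster metadata."""
    return CLUSTER_METADATA

def connect_for_io(metadata: dict, broker_id: int) -> str:
    """Step 2: all produce/consume traffic goes to the advertised address."""
    return metadata[broker_id]

meta = bootstrap("external-host-0:9092")   # step 1
addr = connect_for_io(meta, 0)             # step 2: "internal-host-0:19092"
print(addr)  # if this address is internal-only, external clients time out here
```

If the advertised address returned in step 2 is unreachable from the client, the connection appears to "half work": bootstrap succeeds, data traffic does not.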
Therefore, in virtualized environments with internal/external network isolation (such as Docker or public clouds), external clients can often connect to Kafka (step 1) but time out when producing/consuming data (step 2), because listeners is configured with an external address while advertised.listeners is configured with an internal address. So how should these parameters be configured?
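As a sketch of this failure mode (hostnames are hypothetical), a configuration like the following lets an external client finish step 1 but then hands it an unreachable address for step 2:

```properties
# Reachable from outside, so the bootstrap connection (step 1) succeeds:
listeners=PLAINTEXT://0.0.0.0:9092
# Only the internal address is advertised, so the address returned in the
# metadata is unreachable for external clients and step 2 times out:
advertised.listeners=PLAINTEXT://internal_hostname:9092
```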
Let's first look at the format of the connection information: {listener name}://{HOST/IP}:{PORT}. HOST/IP and PORT are self-explanatory; the subtle part is the "listener name" field. To understand it, you need to know the listener.security.protocol.map configuration item. Its purpose is to map listener names to security protocols (so it is a map of key-value pairs): the key is the listener name, the value is the protocol name, and by default each listener name is the same as its protocol name. That sounds confusing, so here is an example: PLAINTEXT:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL. Before each colon is the key, i.e. the listener name; after it is the value, i.e. the protocol name. We can name a listener arbitrarily, while protocol names come from a fixed, enumerable set. So if we customize a listener name, we must explicitly map it to its protocol.
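For instance, with custom listener names (CLIENT and REPLICATION here are arbitrary examples), the mapping must be declared explicitly:

```properties
# Custom names are not protocols, so they must be mapped explicitly:
listener.security.protocol.map=CLIENT:SASL_SSL,REPLICATION:PLAINTEXT
listeners=CLIENT://0.0.0.0:9092,REPLICATION://0.0.0.0:9093
```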
inter.broker.listener.name and security.inter.broker.protocol are both used to configure communication between brokers. The former specifies a listener name (i.e. a key in listener.security.protocol.map) and the latter specifies a protocol (i.e. a value in that map); the default is PLAINTEXT. Only one of these two configuration items may be set at a time.
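A sketch of the two mutually exclusive ways to select the inter-broker listener (the listener name is illustrative):

```properties
# Option A: select the inter-broker listener by name
# (a key of listener.security.protocol.map)
inter.broker.listener.name=INTERNAL

# Option B: select it by protocol (a value of the map); defaults to PLAINTEXT
#security.inter.broker.protocol=SSL

# Setting both at the same time is a configuration error.
```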
Why is connecting so complicated? Mainly to support various deployment scenarios. Consider a more complex one: we deploy a Kafka cluster on a public cloud, where each node has an external address external_hostname and an internal address internal_hostname, and the external address is not reachable from inside the cloud (most public clouds work this way). We want internal clients to access the cluster without encryption, while external clients must use encryption. To achieve this, the cluster can be configured as follows:
```properties
listener.security.protocol.map=INTERNAL:PLAINTEXT,EXTERNAL:SSL
listeners=INTERNAL://0.0.0.0:19092,EXTERNAL://0.0.0.0:9092
advertised.listeners=INTERNAL://{internal_hostname}:19092,EXTERNAL://{external_hostname}:9092
```
In fact, we can go further: through the optional control.plane.listener.name parameter, we can also dedicate a listener to the connections between the cluster Controller and the other brokers, and the configuration becomes:
```properties
listener.security.protocol.map=INTERNAL:PLAINTEXT,EXTERNAL:SSL,CONTROL:SSL
listeners=INTERNAL://0.0.0.0:19092,EXTERNAL://0.0.0.0:9092,CONTROL://0.0.0.0:9094
advertised.listeners=INTERNAL://{internal_hostname}:19092,EXTERNAL://{external_hostname}:9092,CONTROL://{control_ip}:9094
inter.broker.listener.name=INTERNAL
control.plane.listener.name=CONTROL
```
Finally, here are the default values of these configuration items and some notes:
- If listeners is not configured explicitly, the broker listens on all network interfaces, which is equivalent to 0.0.0.0. The listener names and ports in this configuration item must be unique and cannot be repeated.
- If advertised.listeners is not configured, it defaults to the value of listeners. If listeners is not explicitly configured either, the address obtained by java.net.InetAddress.getCanonicalHostName() is used. If listeners is configured as 0.0.0.0, advertised.listeners must be configured explicitly, because it must be a concrete address and 0.0.0.0 is not allowed (a client cannot connect to a broker through that address). Also, the ports in advertised.listeners are allowed to repeat.
- For listeners and advertised.listeners, when there are multiple addresses, each address must follow the {listener name}://{HOST/IP}:{PORT} format, and multiple addresses are separated by commas.
- If the hostnames of all nodes in the cluster can be correctly resolved by both clients and servers, prefer hostnames over IPs, because java.net.InetAddress.getCanonicalHostName() is used in the code, and access via IP may sometimes fail.
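The address format described in these notes can be validated with a small parser sketch (a hypothetical helper for illustration, not part of Kafka):

```python
from typing import List, Tuple

def parse_listeners(value: str) -> List[Tuple[str, str, int]]:
    """Parse a listeners/advertised.listeners string of the form
    {name}://{host}:{port},... into (name, host, port) tuples."""
    entries = []
    for item in value.split(","):
        name, rest = item.split("://", 1)
        # rsplit so a ':' inside the host part does not break the port:
        host, port = rest.rsplit(":", 1)
        entries.append((name, host, int(port)))
    return entries

print(parse_listeners("INTERNAL://0.0.0.0:19092,EXTERNAL://0.0.0.0:9092"))
# [('INTERNAL', '0.0.0.0', 19092), ('EXTERNAL', '0.0.0.0', 9092)]
```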
Summary:
The listeners addresses are used for the first connection; the advertised.listeners addresses are written into ZooKeeper. The client establishes the initial connection through a listeners address, obtains the advertised.listeners addresses from the metadata, and then interacts with the cluster through those addresses. Therefore, both addresses must be reachable from the client.
That is all for this article on Kafka listener address configuration.