SoFunction
Updated on 2025-04-12

Implementing mute noise reduction for recording on Android

This article shares the code for implementing mute noise reduction for recording on Android, for your reference. The details are as follows.

Requirement:

A customer reported that the product's recordings contain a lot of noise. The cause is that we set the Codec's recording gain to the maximum, there is no dedicated audio processing chip on the board, and the MIC is connected directly to the CPU (with ground shielding). Since the enclosure and the hardware cannot be modified, the problem has to be addressed in software.

The first idea was dual-microphone noise reduction. The principle is roughly as follows: the main microphone picks up the voice while a second microphone picks up ambient noise; the noise waveform is analyzed, phase-inverted, and superimposed on the main microphone's sampled waveform so that the noise cancels out. The drawback is that the two microphones cannot be too close together, and they cannot be too far from the person speaking: if the distance is too large, the angle between the microphones becomes very small and the two signals can no longer be told apart. In addition, depending on how the product is used, either the upper or the lower microphone may end up acting as the main microphone. As a result, the experimental results were not very good.
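The cancellation idea can be illustrated with a deliberately idealized sketch. It assumes the reference microphone captures exactly the noise present in the main signal, perfectly aligned in time and at the same gain, which real hardware does not guarantee; practical systems usually need delay alignment and adaptive filtering, which this sketch does not attempt. The function names are hypothetical and do not come from the product code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp a 32-bit intermediate value to the 16-bit PCM range. */
static int16_t saturate16(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Idealized cancellation (assumption: ref_mic holds only noise, aligned
 * with main_mic). Adding the phase-inverted reference to the main signal
 * is the same as subtracting it sample by sample. */
static void cancel_noise(const int16_t *main_mic, const int16_t *ref_mic,
                         int16_t *out, size_t n)
{
    assert(main_mic != NULL && ref_mic != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        int32_t inverted = -(int32_t)ref_mic[i];
        out[i] = saturate16((int32_t)main_mic[i] + inverted);
    }
}
```

When the two microphones pick up the voice at nearly the same level, this subtraction removes the voice as well, which is one way to understand the placement constraints described above.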

The noise is hard to notice, or has little impact, while someone is speaking, but it is clearly audible during silence. I therefore decided to sidestep the problem with mute noise reduction.

This article implements only a simple form of mute noise reduction. The principle is as follows: after recording starts, there is usually a short wait (for example 0.5 s) before anyone speaks. The noise level measured during this 0.5 s gives a threshold, which is then used to detect where the "vocal" begins. Until the voice arrives, all audio data is set to 0, that is, muted, which is why this is called mute noise reduction. Once the voice arrives, the actual audio data (including the noise it contains) is returned. The threshold is computed simply by summing and averaging.
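The thresholding step can be tried out away from the device with a minimal sketch such as the one below. It assumes 16-bit little-endian PCM and follows the same idea as the article (average the non-negative samples of a frame, then derive a threshold from the averaged noise level). The names frame_level, noise_est and CALIBRATION_FRAMES are made up for this illustration, and the factor 2 is only a starting point that would need tuning.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CALIBRATION_FRAMES 10   /* assumed number of noise frames to average */

/* Mean of the non-negative 16-bit little-endian samples in one frame.
 * Returns 0 when the frame contains no non-negative sample. */
static unsigned int frame_level(const unsigned char *pcm, size_t nbytes)
{
    unsigned long total = 0;
    unsigned int count = 0;

    assert(nbytes % 2 == 0);                 /* whole 16-bit samples only */
    for (size_t i = 0; i + 1 < nbytes; i += 2) {
        uint16_t u = (uint16_t)(pcm[i] | (pcm[i + 1] << 8));
        if ((u & 0x8000) == 0) {             /* sign bit clear: non-negative */
            total += u;
            count++;
        }
    }
    return count ? (unsigned int)(total / count) : 0;
}

typedef struct {
    unsigned int noise;      /* running estimate of the noise level */
    unsigned int frames;     /* noise frames seen so far (capped) */
    unsigned int threshold;  /* levels above this count as voice */
} noise_est;

static void noise_est_init(noise_est *e, unsigned int default_threshold)
{
    e->noise = 0;
    e->frames = 0;
    e->threshold = default_threshold;
}

/* Feed one frame level. Returns 1 for voice, 0 for noise. Noise frames
 * refine the estimate; after CALIBRATION_FRAMES of them the threshold
 * becomes twice the estimated noise level. */
static int noise_est_update(noise_est *e, unsigned int level)
{
    if (level > e->threshold)
        return 1;
    e->noise = (e->frames == 0) ? level : (e->noise + level) / 2;
    if (e->frames < CALIBRATION_FRAMES)
        e->frames++;
    if (e->frames >= CALIBRATION_FRAMES)
        e->threshold = e->noise * 2;
    return 0;
}
```

A caller would compute frame_level for each captured frame and pass the result to noise_est_update. Averaging only the non-negative samples is a cheap stand-in for a true magnitude measure; using absolute values or RMS would be an alternative.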

The following code is implemented under hardware/alsa_sound/ on the RK platform.

#define MUTE_NOISE_REDUCTION
#ifdef MUTE_NOISE_REDUCTION
bool enable_reduction_noise = false;    //Controlled by a system property
int threshold_def = 0x400;    //Default threshold
int threshold = 0;    //Adaptive noise threshold
int threshold_count = 0;    //Noise-frame counter; once it reaches THRESHOLD_COUNT, threshold is used to detect "vocals"
#define THRESHOLD_COUNT 10

#define MUTE_DELAY_COUNT 15           //Number of audio frames kept (not muted) after the vocals stop
#define AUDIO_BUFFER_NUM 4         //Number of audio frames cached
#define AUDIO_BUFFER_SIZE 1024   //Size of one audio frame in bytes
char *audio_buffer[AUDIO_BUFFER_NUM];    //audio_buffer caches audio data
char *audio_buffer_temp;    //Scratch buffer used when swapping audio data
int audio_buffer_pos=0;
#endif
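The declarations above only reserve pointers, and the article does not show where the buffers are allocated. One possible arrangement, assuming they are allocated when recording starts and released when it stops, is sketched below. The helper names mute_nr_init and mute_nr_release are hypothetical and are not part of the original code.

```c
#include <assert.h>
#include <stdlib.h>

#define AUDIO_BUFFER_NUM  4
#define AUDIO_BUFFER_SIZE 1024

static char *audio_buffer[AUDIO_BUFFER_NUM];
static char *audio_buffer_temp;
static int audio_buffer_pos;

/* Hypothetical helper: free all frame buffers and reset the state. */
static void mute_nr_release(void)
{
    for (int i = 0; i < AUDIO_BUFFER_NUM; i++) {
        free(audio_buffer[i]);               /* free(NULL) is a no-op */
        audio_buffer[i] = NULL;
    }
    free(audio_buffer_temp);
    audio_buffer_temp = NULL;
    audio_buffer_pos = 0;
}

/* Hypothetical helper: allocate zero-filled frame buffers.
 * Returns 0 on success, -1 if any allocation fails. */
static int mute_nr_init(void)
{
    assert(AUDIO_BUFFER_SIZE % 2 == 0);      /* frames hold whole 16-bit samples */
    mute_nr_release();
    for (int i = 0; i < AUDIO_BUFFER_NUM; i++) {
        audio_buffer[i] = (char *)calloc(1, AUDIO_BUFFER_SIZE);
        if (audio_buffer[i] == NULL) {
            mute_nr_release();
            return -1;
        }
    }
    audio_buffer_temp = (char *)calloc(1, AUDIO_BUFFER_SIZE);
    if (audio_buffer_temp == NULL) {
        mute_nr_release();
        return -1;
    }
    return 0;
}
```

Zero-filling the buffers means that the first frames returned after a voice is detected are silence rather than uninitialized memory. Resetting threshold, threshold_count and threshold_def at the same point would let each recording recalibrate, although whether the original code does so is not shown.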

#ifdef MUTE_NOISE_REDUCTION
    {
        unsigned int value = 0;
        int is_voice = 0;
        static int is_mute_delay_count;
        //ALOGE("in_begin_swip_num:%d in_begin_narrow_num=%d",in_begin_swip_num,in_begin_narrow_num);        

         if(enable_reduction_noise && bytes > AUDIO_BUFFER_SIZE){
            bytes = AUDIO_BUFFER_SIZE;
        }

        if(enable_reduction_noise){
            unsigned char * buffer_temp=(unsigned char *)buffer;
            unsigned int total = 0;
            unsigned int total_count=0;
            unsigned int total_temp = 0;
            short data16;
            int j = 0;
            for(j=0; j<bytes; j=j+2){
                value = buffer_temp[j+1];    //The second byte holds the high bits
                value = (value<<8)+buffer_temp[j];    //Assemble one 16-bit audio sample
                data16 = value&0xFFFF;
                if( (data16 & 0x8000) == 0){//Positive number
                    total +=data16;        //No overflow: at most 512 samples of at most 0x7FFF each
                    total_count++;        //Count the samples summed
                }
            }

            total_temp = total_count ? total/total_count : 0;    //Avoid dividing by zero when no sample is positive
            if(total_temp > threshold_def){
                is_voice++;        //Voice detected
            }else {    //Noise
                if(threshold_count == 0){
                    threshold = total_temp;
                }else{
                    threshold = (threshold+total_temp)/2;
                }
                threshold_count++;
                if(threshold_count >= THRESHOLD_COUNT){
                    threshold_def = threshold*2;    //Update the threshold; the factor 2 must be determined by experiment on the product.
                    threshold_count = THRESHOLD_COUNT;    //Clamp the counter. Note: as written, threshold_def is still refreshed by every later noise frame
                }
            }

            //is_mute_delay_count: after the vocals stop, keep returning real audio for MUTE_DELAY_COUNT more frames so that the sound does not "suddenly stop".
            if( is_voice != 0 ){
                is_mute_delay_count=MUTE_DELAY_COUNT;
            }else{
                if(is_mute_delay_count != 0)
                    is_mute_delay_count--;
            }

            //audio_buffer: when a vocal is detected, a short piece of audio from just before the speech must also be returned; otherwise the output jumps straight from mute to the vocal with a POP sound.
            //audio_buffer caches AUDIO_BUFFER_NUM frames of data for this purpose.
            if(is_mute_delay_count == 0){//Mute in order to remove noise
                memcpy(audio_buffer[audio_buffer_pos], (char *)buffer, bytes);    //Cache the audio
                memset(buffer, 0, bytes);    //Return muted data
            }else {
                memcpy(audio_buffer_temp, (char *)buffer, bytes);
                memcpy((char *)buffer, audio_buffer[audio_buffer_pos], bytes);    //Return old audio data
                memcpy(audio_buffer[audio_buffer_pos], (char *)audio_buffer_temp, bytes);     //Save new audio data
            }
            audio_buffer_pos++;
            if(audio_buffer_pos>=AUDIO_BUFFER_NUM)
                audio_buffer_pos=0;
        }
    }
#endif
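The interaction between the hold-over counter and the frame cache is easier to follow in a stripped-down model that can be run on a desktop. The sketch below mirrors the logic of the code above with toy sizes (the article uses 1024-byte frames, 4 cached frames and a hold-over of 15); the gate type and function names are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define FRAME_BYTES  4   /* toy frame size */
#define DELAY_FRAMES 2   /* cached frames, i.e. output delay while open */
#define HANGOVER     3   /* frames kept open after the last voice frame */

typedef struct {
    unsigned char ring[DELAY_FRAMES][FRAME_BYTES]; /* cached frames */
    int pos;                                       /* next ring slot */
    int hangover;                                  /* remaining open frames */
} gate;

static void gate_init(gate *g)
{
    memset(g, 0, sizeof *g);
}

/* Process one frame in place. While the gate is closed the frame is
 * cached and replaced with silence; while it is open the frame is
 * swapped with the one cached DELAY_FRAMES calls earlier. */
static void gate_process(gate *g, unsigned char *frame, int is_voice)
{
    unsigned char tmp[FRAME_BYTES];

    assert(g != NULL && frame != NULL);
    if (is_voice)
        g->hangover = HANGOVER;
    else if (g->hangover > 0)
        g->hangover--;

    if (g->hangover == 0) {
        memcpy(g->ring[g->pos], frame, FRAME_BYTES);   /* cache */
        memset(frame, 0, FRAME_BYTES);                 /* mute */
    } else {
        memcpy(tmp, frame, FRAME_BYTES);
        memcpy(frame, g->ring[g->pos], FRAME_BYTES);   /* return old frame */
        memcpy(g->ring[g->pos], tmp, FRAME_BYTES);     /* save new frame */
    }
    g->pos = (g->pos + 1) % DELAY_FRAMES;
}
```

Because audio is returned DELAY_FRAMES frames late while the gate is open, the hold-over count has to be larger than the number of cached frames; otherwise the last voice frames would still be sitting in the cache when the gate closes. The values in the article (15 and 4) satisfy this.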

That is all for this article. I hope it is helpful for your study.