SoFunction
Updated on 2025-03-01

Summary of mean, median, and mode in R

Statistical analysis in R is performed using many built-in functions, most of which are part of the R base package. These functions take an R vector as input, along with optional arguments, and return the result.

The functions discussed in this chapter cover the mean, the median, and the mode.

Mean

The mean is calculated by taking the sum of the values in the data set and dividing it by the number of values.

The function mean() is used to calculate the average value in R.

Syntax

The basic syntax for calculating the mean in R is

mean(x, trim = 0, na.rm = FALSE, ...)

The following is a description of the parameters used

  • x is the input vector.
  • trim is the fraction (0 to 0.5) of observations to discard from each end of the sorted vector.
  • na.rm is used to remove missing values from the input vector.

Example

# Create a vector. 
x <- c(12,7,3,4.2,18,2,54,-21,8,-5)

# Find the mean.
result.mean <- mean(x)
print(result.mean)

When we execute the above code, it produces the following result

[1] 8.22
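
As a quick check of the definition, the mean can also be computed by hand as the sum divided by the number of values. This is only a minimal sketch; the variable name manual.mean is chosen here for illustration and is not part of the original example.

```r
# Same vector as in the example above.
x <- c(12, 7, 3, 4.2, 18, 2, 54, -21, 8, -5)

# Mean by definition: sum of the values divided by how many there are.
manual.mean <- sum(x) / length(x)
print(manual.mean)

# Compare with the built-in function; all.equal() tolerates
# tiny floating-point differences between the two calculations.
print(all.equal(manual.mean, mean(x)))
```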

Applying the trim option

When the trim parameter is supplied, the values in the vector are sorted and the specified fraction of observations is dropped from each end before the mean is calculated.

With trim = 0.3 and 10 values, 3 values are dropped from each end before the mean is computed.

In this case, the sorted vector is (-21, -5, 2, 3, 4.2, 7, 8, 12, 18, 54), and the values removed before calculating the mean are (-21, -5, 2) from the left and (12, 18, 54) from the right.

# Create a vector.
x <- c(12,7,3,4.2,18,2,54,-21,8,-5)

# Find the trimmed mean.
result.mean <- mean(x, trim = 0.3)
print(result.mean)

When we execute the above code, it produces the following result

[1] 5.55
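
The trimming step can be reproduced by hand to see which values survive. The sketch below assumes that the number of values dropped from each end is floor(n * trim), which matches the behaviour described above for this vector; the names k, kept, and manual.trimmed are illustrative only.

```r
x <- c(12, 7, 3, 4.2, 18, 2, 54, -21, 8, -5)
trim <- 0.3

sorted.x <- sort(x)
n <- length(sorted.x)

# Number of values dropped from each end of the sorted vector.
k <- floor(n * trim)

# Keep only the middle part and average it.
kept <- sorted.x[(k + 1):(n - k)]
manual.trimmed <- mean(kept)
print(kept)

# Compare with the built-in trimmed mean.
print(all.equal(manual.trimmed, mean(x, trim = trim)))
```

Here the values that remain are 3, 4.2, 7 and 8, whose average is the 5.55 shown above.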

Applying the na.rm option

If the vector contains missing values, the mean() function returns NA.

To drop missing values from the calculation, use na.rm = TRUE, which removes the NA values before the mean is computed.

# Create a vector. 
x <- c(12,7,3,4.2,18,2,54,-21,8,-5,NA)

# Find the mean.
result.mean <- mean(x)
print(result.mean)

# Find the mean, dropping NA values.
result.mean <- mean(x, na.rm = TRUE)
print(result.mean)

When we execute the above code, it produces the following result

[1] NA
[1] 8.22
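
Before dropping missing values it can help to check how many there are. The following sketch counts them with is.na() and shows that filtering them out by hand gives the same mean as na.rm = TRUE; the names n.missing and manual.mean are illustrative only.

```r
x <- c(12, 7, 3, 4.2, 18, 2, 54, -21, 8, -5, NA)

# Count the missing values in the vector.
n.missing <- sum(is.na(x))
print(n.missing)

# Removing NA values by hand is equivalent to na.rm = TRUE.
manual.mean <- mean(x[!is.na(x)])
print(all.equal(manual.mean, mean(x, na.rm = TRUE)))
```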

Median

The middle value of a sorted data series is called the median. When the series has an even number of values, the median is the average of the two middle values. Use the median() function to calculate this value in R.

Syntax

The basic syntax for calculating the median in R is

median(x, na.rm = FALSE)

The following is a description of the parameters used

  • x is the input vector.
  • na.rm is used to remove missing values from the input vector.

Example

# Create the vector.
x <- c(12,7,3,4.2,18,2,54,-21,8,-5)

# Find the median.
result.median <- median(x)
print(result.median)

When we execute the above code, it produces the following result

[1] 5.6
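
Because this vector has an even number of values, the result is the average of the two middle values of the sorted vector (4.2 and 7). A minimal sketch of that calculation follows; the name manual.median is illustrative only.

```r
x <- c(12, 7, 3, 4.2, 18, 2, 54, -21, 8, -5)

sorted.x <- sort(x)
n <- length(sorted.x)

if (n %% 2 == 0) {
  # Even length: average the two middle values.
  manual.median <- (sorted.x[n / 2] + sorted.x[n / 2 + 1]) / 2
} else {
  # Odd length: take the single middle value.
  manual.median <- sorted.x[(n + 1) / 2]
}
print(manual.median)

# Compare with the built-in function.
print(all.equal(manual.median, median(x)))
```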

Mode

The mode is the value that occurs most frequently in a data set. Unlike the mean and median, the mode can be found for both numeric and character data.

R does not have a standard built-in function to calculate the statistical mode (the built-in mode() function reports an object's storage mode instead). Therefore, we create a user-defined function to calculate the mode of a data set in R. This function takes a vector as input and returns the mode as output.

Example

# Create the function.
getmode <- function(v) {
   uniqv <- unique(v)
   uniqv[which.max(tabulate(match(v, uniqv)))]
}

# Create the vector with numbers.
v <- c(2,1,2,3,1,2,3,4,1,5,5,3,2,3)

# Calculate the mode using the user function.
result <- getmode(v)
print(result)

# Create the vector with characters.
charv <- c("o","it","the","it","it")

# Calculate the mode using the user function.
result <- getmode(charv)
print(result)

When we execute the above code, it produces the following result

[1] 2
[1] "it"
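
Note that getmode() returns a single value even when several values share the highest frequency. In the numeric vector above, both 2 and 3 occur four times, and only the first of them is reported. The sketch below is a hypothetical variant, named getmodes() here for illustration, that returns every tied value instead.

```r
# Hypothetical variant of getmode() that returns all tied modes.
getmodes <- function(v) {
   uniqv <- unique(v)
   # Count how often each distinct value occurs.
   counts <- tabulate(match(v, uniqv))
   # Keep every value whose count equals the maximum count.
   uniqv[counts == max(counts)]
}

v <- c(2,1,2,3,1,2,3,4,1,5,5,3,2,3)
print(getmodes(v))

charv <- c("o","it","the","it","it")
print(getmodes(charv))
```

For the numeric vector this returns both 2 and 3, while the character vector still yields only "it".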
