SoFunction
Updated on 2025-04-10

Grouped counting, summing, and other aggregations in R

df is a data frame with two columns: stratum and psu. Here we count the number of rows in each stratum.

Method 1:

cnt = table(df$stratum)

Method 2:

cnt = tapply(df$psu, INDEX=df$stratum, FUN=length)
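
As a quick check, both methods can be run on a small made-up data frame. The stratum and psu values below are hypothetical and chosen only for illustration; the source does not show the real data.

```r
# Hypothetical sample data: 3 strata, 6 rows (values are illustrative only)
df <- data.frame(
  stratum = c("A", "A", "B", "B", "B", "C"),
  psu     = c(1, 2, 3, 4, 5, 6)
)

# Method 1: table() counts how often each stratum value occurs
cnt1 <- table(df$stratum)

# Method 2: tapply() splits psu by stratum and applies length() to each group
cnt2 <- tapply(df$psu, INDEX = df$stratum, FUN = length)

print(cnt1)
print(cnt2)
```

Both calls give the same counts per stratum. table() returns an object of class "table", while tapply() returns a named array.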

Building on method 2, changing only the FUN argument is enough to get other grouped operations, such as sums and means. For example:

Grouped mean:

tapply(df$psu, INDEX=df$stratum, FUN=mean)
# (Equivalent to a groupby('stratum') followed by a mean in Python.)
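
A minimal sketch of swapping FUN, again on hypothetical data (the values are made up for illustration). The aggregate() call is not part of the original text; it is shown only as a base R alternative that returns a data frame.

```r
# Hypothetical sample data (illustrative values only)
df <- data.frame(
  stratum = c("A", "A", "B", "B", "B", "C"),
  psu     = c(1, 2, 3, 4, 5, 6)
)

# Grouped sum: same call as method 2, with FUN = sum
grp_sum <- tapply(df$psu, INDEX = df$stratum, FUN = sum)

# Grouped mean: same call, with FUN = mean
grp_mean <- tapply(df$psu, INDEX = df$stratum, FUN = mean)

# Alternative: aggregate() computes the same grouped means as a data frame
grp_mean_df <- aggregate(psu ~ stratum, data = df, FUN = mean)

print(grp_sum)
print(grp_mean)
print(grp_mean_df)
```

tapply() is convenient when a named vector is enough. aggregate() may be preferable when the result needs to stay in data frame form.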

Supplement: R | Using a custom function for conditional calculations on the columns of a data set

1. Use the iris dataset

> iris_10 <- head(iris, n = 10)
> ## Custom function: if x >= 5.0, z = y * 10
> get_With_function <- function(x, y, z){
+   if(x >= 5.0){
+     z <- y * 10
+   }
+   c(zlie = z )
+ }
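
Before applying the function to whole columns, it may help to call it on single values. The sketch below repeats the same function without console prompts; the input values are arbitrary test values.

```r
# Same function as above: if x >= 5.0, z becomes y * 10;
# otherwise z is returned unchanged
get_With_function <- function(x, y, z) {
  if (x >= 5.0) {
    z <- y * 10
  }
  c(zlie = z)
}

res_hit  <- get_With_function(5.1, 3.5, 0)  # condition met
res_miss <- get_With_function(4.9, 3.0, 0)  # condition not met, z stays 0

print(res_hit)
print(res_miss)
```

Because if() expects a single TRUE or FALSE, the function handles one row at a time. It therefore has to be applied element by element when used on columns.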

2. To be safe, initialize the z column to 0. The Map call in step 4 passes z as the fallback value, so the column should exist before that call.

> iris_10$z <- 0

3. Use the custom function to test the x value in each row, calculate from the y column, and assign the result to the z column

4. Note the use of Map, which applies the function to the corresponding elements of each column

> iris_10$z <- with(
+   iris_10,
+   Map(
+     get_With_function,
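+     iris_10$Sepal.Length,  # x: column inferred from the output below
+     iris_10$Sepal.Width,   # y: column inferred from the output below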
+     z
+   )
+   )
> iris_10
   Sepal.Length Sepal.Width Petal.Length Petal.Width
1           5.1         3.5          1.4         0.2
2           4.9         3.0          1.4         0.2
3           4.7         3.2          1.3         0.2
4           4.6         3.1          1.5         0.2
5           5.0         3.6          1.4         0.2
6           5.4         3.9          1.7         0.4
7           4.6         3.4          1.4         0.3
8           5.0         3.4          1.5         0.2
9           4.4         2.9          1.4         0.2
10          4.9         3.1          1.5         0.1
   Species  z
1   setosa 35
2   setosa  0
3   setosa  0
4   setosa  0
5   setosa 36
6   setosa 39
7   setosa  0
8   setosa 34
9   setosa  0
10  setosa  0
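
Map() returns a list, so the z column above is stored as a list column. If a plain numeric column is wanted, a vectorized approach is one option. The sketch below is not the author's method. It assumes that x is Sepal.Length and y is Sepal.Width, which is inferred from the printed output (rows with a first column of at least 5.0 have z equal to ten times the second column).

```r
iris_10 <- head(iris, n = 10)

# Vectorized alternative: ifelse() evaluates the condition for every row
# at once and returns a plain numeric vector
iris_10$z <- ifelse(iris_10$Sepal.Length >= 5.0, iris_10$Sepal.Width * 10, 0)

# mapply() is the simplifying counterpart of Map(); with the same custom
# function it returns a vector instead of a list
get_With_function <- function(x, y, z) {
  if (x >= 5.0) {
    z <- y * 10
  }
  c(zlie = z)
}
z_mapply <- mapply(get_With_function,
                   iris_10$Sepal.Length, iris_10$Sepal.Width, 0)

print(iris_10$z)
print(unname(z_mapply))
```

Both approaches give the same values as the output above. ifelse() is usually the simpler choice when the rule is a single condition, while mapply() keeps the custom function reusable for more involved logic.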

The above is based on personal experience. I hope it serves as a useful reference, and I appreciate your continued support. If anything is wrong or incomplete, corrections are welcome.