SoFunction
Updated on 2025-04-10

Summary of key points about Poisson regression in R

Poisson regression covers regression models in which the response variable is a count rather than a fractional number.

Examples include the number of births, or the number of wins in a series of football matches. In addition, the values of the response variable are assumed to follow a Poisson distribution.
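To make the Poisson assumption concrete, the following minimal sketch draws simulated counts with rpois() and evaluates the probability mass function with dpois(). The rate lambda = 4 and the sample size n are arbitrary values chosen for illustration; they do not come from any real dataset.

```r
# Illustrative values only: lambda and n are assumptions, not taken from real data
set.seed(42)
lambda <- 4
n <- 10000

# Draw n simulated counts from a Poisson distribution with mean lambda
counts <- rpois(n, lambda)

# For a Poisson variable, the mean and the variance are both equal to lambda,
# so the two sample statistics should be close to each other
sample_mean <- mean(counts)
sample_var <- var(counts)
print(c(mean = sample_mean, variance = sample_var))

# Probability of observing exactly 0, 1, ..., 10 events
probs <- dpois(0:10, lambda)
print(round(probs, 4))
```

If the sample variance of real count data is much larger than its mean, the Poisson assumption may be a poor fit.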

The general mathematical equation for Poisson regression is

log(y) = a + b1x1 + b2x2 + ... + bnxn

The following is a description of the parameters used:

  • y is the response variable.
  • a and b1, b2, ..., bn are numeric coefficients.
  • x1, x2, ..., xn are the predictor variables.
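The log-linear form of the equation can be illustrated with simulated data. In this sketch the coefficients a = 0.5 and b1 = 0.3 and the single predictor x1 are made-up values (assumptions for illustration only). Counts are generated from the equation and then fitted with glm(), which should recover coefficients close to the ones used to generate the data.

```r
# Hypothetical coefficients and predictor, chosen only for this illustration
set.seed(1)
a <- 0.5
b1 <- 0.3
x1 <- runif(2000, min = 0, max = 5)

# log(expected count) = a + b1 * x1, so the expected count is exp(a + b1 * x1)
mu <- exp(a + b1 * x1)

# Draw one Poisson count per observation with that expected value
y <- rpois(length(x1), mu)

# Fit a Poisson regression; the estimates should be near a and b1
fit <- glm(y ~ x1, family = poisson)
print(coef(fit))
```

Because the model is linear on the log scale, a one-unit increase in x1 multiplies the expected count by exp(b1).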

The function used to create a Poisson regression model is the glm() function.

Syntax

The basic syntax of the glm() function for Poisson regression is

glm(formula, data, family)

The following is a description of the parameters used in the above function:

  • formula is a symbolic description of the relationship between the variables.
  • data is the dataset that provides the values of these variables.
  • family is an R object that specifies the details of the model. For Poisson regression, its value is poisson.
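As a small sketch of how these three arguments fit together, the example below fits the same model with three equivalent ways of writing the family argument. The data frame toy and its columns visits and age are invented for this illustration and are not part of any built-in dataset.

```r
# Hypothetical toy data: 'visits' and 'age' are invented for this illustration
toy <- data.frame(
  visits = c(0, 1, 3, 2, 5, 4, 7, 6),
  age    = c(20, 25, 30, 35, 40, 45, 50, 55)
)

# family given as the family function itself (log link is the default)
fit_a <- glm(visits ~ age, data = toy, family = poisson)

# family given as a call that names the link explicitly
fit_b <- glm(formula = visits ~ age, data = toy, family = poisson(link = "log"))

# family given as a character string
fit_c <- glm(visits ~ age, data = toy, family = "poisson")

# Show which family and link were used, and confirm the fits agree
print(family(fit_a))
print(all.equal(coef(fit_a), coef(fit_b)))
print(all.equal(coef(fit_a), coef(fit_c)))
```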

Example

We use the built-in dataset "warpbreaks", which describes the effect of wool type (A or B) and tension (low, medium, or high) on the number of warp breaks per loom. We take "breaks" as the response variable, since it is a count of the number of breaks. The wool type ("wool") and "tension" are used as predictor variables.

Input data

input <- warpbreaks
print(head(input))

When we execute the above code, it produces the following result:

      breaks   wool  tension
1     26       A     L
2     30       A     L
3     54       A     L
4     25       A     L
5     70       A     L
6     52       A     L
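Before fitting a model, it can help to look at the structure of the data beyond the first six rows. The following sketch uses only base R functions on the built-in warpbreaks dataset; the variable name group_means is introduced here for illustration.

```r
input <- warpbreaks

# Column types: breaks is numeric, wool and tension are factors
str(input)

# Number of observations for each wool/tension combination
print(table(input$wool, input$tension))

# Mean number of breaks for each wool/tension combination
group_means <- aggregate(breaks ~ wool + tension, data = input, FUN = mean)
print(group_means)
```

The group means give a first impression of how the count of breaks varies with wool type and tension before any model is fitted.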

Create a regression model

output <- glm(formula = breaks ~ wool + tension,
              data = warpbreaks,
              family = poisson)
print(summary(output))

When we execute the above code, it produces the following result:

Call:
glm(formula = breaks ~ wool + tension, family = poisson, data = warpbreaks)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-3.6871  -1.6503  -0.4269   1.1902   4.2616  

Coefficients:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)  3.69196    0.04541  81.302  < 2e-16 ***
woolB       -0.20599    0.05157  -3.994 6.49e-05 ***
tensionM    -0.32132    0.06027  -5.332 9.73e-08 ***
tensionH    -0.51849    0.06396  -8.107 5.21e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 297.37  on 53  degrees of freedom
Residual deviance: 210.39  on 50  degrees of freedom
AIC: 493.06

Number of Fisher Scoring iterations: 4
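Because the model is fitted on the log scale, the estimates are often easier to read after exponentiation. The sketch below refits the same model so that it is self-contained, converts the coefficients to multiplicative effects on the expected count, and predicts the expected number of breaks for every wool/tension combination. The names fit, rate_ratios, and new_looms are introduced here for illustration.

```r
# Refit the model from the example so this block runs on its own
fit <- glm(breaks ~ wool + tension, data = warpbreaks, family = poisson)

# exp(coefficient) is the factor by which the expected count is multiplied,
# relative to the baseline levels (wool A, tension L); a value below 1 means
# fewer expected breaks than the baseline, a value above 1 means more
rate_ratios <- exp(coef(fit))
print(round(rate_ratios, 3))

# Expected number of breaks for each combination of wool and tension
new_looms <- expand.grid(wool = c("A", "B"), tension = c("L", "M", "H"))
new_looms$expected_breaks <- predict(fit, newdata = new_looms, type = "response")
print(new_looms)
```

Using type = "response" returns predictions on the count scale; the default, type = "link", would return them on the log scale.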

In the summary, we look for p-values in the last column that are less than 0.05 to decide whether a predictor has an effect on the response variable. As the output shows, wool type B and tension types M and H all have an effect on the count of breaks.
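The same check can be done programmatically rather than by reading the printed table. This sketch refits the model so that it runs on its own, pulls the p-value column out of the coefficient table, and flags the terms below the 0.05 threshold. The names fit, coef_table, p_values, and significant are introduced here for illustration.

```r
# Refit the model from the example so this block runs on its own
fit <- glm(breaks ~ wool + tension, data = warpbreaks, family = poisson)

# coef(summary(fit)) returns the coefficient table as a numeric matrix
coef_table <- coef(summary(fit))

# Extract the p-value column and compare it with the 0.05 threshold
p_values <- coef_table[, "Pr(>|z|)"]
significant <- p_values < 0.05

print(data.frame(p_value = signif(p_values, 3), significant = significant))
```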
