SoFunction
Updated on 2025-04-09

OpenCV for iOS: an introductory image processing programming tutorial

Introduction to OpenCV

OpenCV (Open Source Computer Vision Library) is a popular open-source, cross-platform computer vision library. It implements many general-purpose algorithms in image processing and computer vision, from basic filtering all the way to advanced object detection.

Multilingual interface

OpenCV is developed in C/C++ and also provides interfaces for other languages such as Python, Java, and MATLAB.

Cross-platform

OpenCV is cross-platform and can run on Windows, Linux, macOS, Android, iOS, and other operating systems.

Wide application fields

OpenCV has a wide range of applications, including image stitching, image noise reduction, product quality inspection, human-computer interaction, face recognition, action recognition, motion tracking, and autonomous driving. OpenCV also provides a machine learning module with algorithms such as naive Bayes, K-nearest neighbors, support vector machines, decision trees, random forests, and artificial neural networks.

Integrated OpenCV

1. First create an Xcode project and set Enable Bitcode to NO in Build Settings.

2. Use CocoaPods to configure OpenCV. Open the terminal, cd to the project directory, and run pod init to create the project's Podfile. Edit the Podfile (e.g. with vim Podfile) to add pod 'OpenCV', '~> 4.3.0', and finally run pod install to install OpenCV.

3. Any class file that references OpenCV must have its .m extension changed to .mm, which tells the compiler that the file contains C++ (Objective-C++).

Basic image container Mat

Image representation

When a real-world scene is captured by an electronic device, the device records a numeric value for each point of the image.

An image of size A×B can be represented by an A×B matrix whose element values indicate the brightness of the pixel at each position. Generally speaking, a larger value means a brighter point.

Generally speaking, a grayscale image is represented by a 2-dimensional matrix, and a color (multi-channel) image by a 3-dimensional matrix (M × N × 3). For image display, most devices currently use unsigned 8-bit integers (type CV_8U) to represent pixel brightness.

Image data is stored in computer memory starting from the top-left point (on some systems, the bottom-left point). For a multi-channel image such as an RGB image, each pixel is represented by three bytes. In OpenCV, the channel order of RGB images is BGR.

Key attributes and definitions of Mat class

The key attributes are as follows:

/* The flags field contains a lot of information about the matrix, such as:
   - Mat's magic signature
   - whether the data is continuous
   - depth
   - number of channels */
int flags;
// The dimensionality of the matrix, >= 2
int dims;
// The number of rows and columns; both are -1 if the matrix has more than 2 dimensions
int rows, cols;
// Pointer to the data
uchar* data;
// Pointer to the reference count; NULL if the data was allocated by the user
int* refcount;

Mat is defined as follows:

class CV_EXPORTS Mat
{
public:
    Mat();
    Mat(int rows, int cols, int type);
    Mat(Size size, int type);
    Mat(int rows, int cols, int type, const Scalar& s);
    Mat(Size size, int type, const Scalar& s);
    Mat(int ndims, const int* sizes, int type);
    Mat(const std::vector<int>& sizes, int type);
    Mat(int ndims, const int* sizes, int type, const Scalar& s);
    Mat(const std::vector<int>& sizes, int type, const Scalar& s);
    Mat(const Mat& m);
    Mat(int rows, int cols, int type, void* data, size_t step=AUTO_STEP);
    Mat(Size size, int type, void* data, size_t step=AUTO_STEP);
    // ... (other constructors and members omitted)
          /*! includes several bit-fields:
         - the magic signature
         - continuity flag
         - depth
         - number of channels
     */
    int flags;
    //! the matrix dimensionality, >= 2
    int dims;
    //! the number of rows and columns or (-1, -1) when the matrix has more than 2 dimensions
    int rows, cols;
    //! pointer to the data
    uchar* data;

    //! helper fields used in locateROI and adjustROI
    const uchar* datastart;
    const uchar* dataend;
    const uchar* datalimit;

    //! custom allocator
    MatAllocator* allocator;
    //! and the standard allocator
    static MatAllocator* getStdAllocator();
    static MatAllocator* getDefaultAllocator();
    static void setDefaultAllocator(MatAllocator* allocator);

    //! internal use method: updates the continuity flag
    void updateContinuityFlag();

    //! interaction with UMat
    UMatData* u;

    MatSize size;
    MatStep step;

protected:
    template<typename _Tp, typename Functor> void forEach_impl(const Functor& operation);
};

Create a Mat object

Mat is an excellent image class, and it is also a general matrix class that can be used to create and manipulate multi-dimensional matrices. There are several ways to create a Mat object.

For a two-dimensional multi-channel image, the first thing to do is to define its size, namely the number of rows and columns. Then you need to specify the data type of the storage element and the number of channels for each matrix point. To do this, the definition rules are as follows:

CV_[bit depth][signed/unsigned/float][C][number of channels]

Example: CV_8UC3 means each element is an 8-bit unsigned char, and each pixel has three such elements forming three channels. Up to four predefined channels are supported. Scalar is a short vector that can initialize a matrix with specified custom values, and it can also represent a color.

1. Constructor method creates Mat

Mat M(5,8, CV_8UC3, Scalar(255,0,0));

This creates an image with a height of 5 and a width of 8; each element is an 8-bit unsigned type with 3 channels, and all pixel values are initialized to (255, 0, 0). Since the default color order in OpenCV is BGR, this is a pure blue image (in RGB: (0, 0, 255)).

// Create an image with the given rows, columns, and element type
Mat::Mat(int rows, int cols, int type);
// Create an image of the given size and type
Mat::Mat(Size size, int type);
// Create an image with rows, columns, and type, with all elements initialized to the value s
Mat::Mat(int rows, int cols, int type, const Scalar& s);
// Create an image of the given size and type, with all elements initialized to the value s
Mat::Mat(Size size, int type, const Scalar& s);

2. Create Mat using Create() function

Mat mat;
mat.create(2, 2, CV_8UC3);

Commonly used data structures and functions

Point class

Used to represent points. The Point data structure represents a point in a two-dimensional coordinate system, i.e. a 2D point specified by its image coordinates x and y.

How to use it is as follows:

Point point;
point.x = 2;
point.y = 5;

or

Point point = Point(2, 5);

Scalar class

Used to represent color. Scalar() represents a 4-element array and is widely used in OpenCV to pass pixel values, such as BGR color values. A BGR color takes three parameters; the fourth parameter of Scalar() is optional and can simply be omitted, in which case OpenCV treats the value as having three components.

example:

 Scalar scalar=Scalar(0,2,255);

In OpenCV's BGR order, the defined color components are: 0 for the blue component, 2 for green, and 255 for red.

Scalar comes from the Scalar_ template class, which is a variant of the 4-element Vec class. The commonly used Scalar is actually Scalar_<double>, which is why many functions accept either a Mat or a Scalar.

// Vec is a derived class of Matx: a one-dimensional Matx, very similar to vector.
// Matx is a lightweight Mat whose size must be specified before use.
template<typename _Tp> class Scalar_ : public Vec<_Tp, 4>
{
public:
    //! default constructor
    Scalar_();
    Scalar_(_Tp v0, _Tp v1, _Tp v2=0, _Tp v3=0);
    Scalar_(_Tp v0);

    Scalar_(const Scalar_& s);
    Scalar_(Scalar_&& s) CV_NOEXCEPT;

    Scalar_& operator=(const Scalar_& s);
    Scalar_& operator=(Scalar_&& s) CV_NOEXCEPT;

    template<typename _Tp2, int cn>
    Scalar_(const Vec<_Tp2, cn>& v);

    //! returns a scalar with all elements set to v0
    static Scalar_<_Tp> all(_Tp v0);

    //! conversion to another data type
    template<typename T2> operator Scalar_<T2>() const;

    //! per-element product
    Scalar_<_Tp> mul(const Scalar_<_Tp>& a, double scale=1 ) const;

    //! returns (v0, -v1, -v2, -v3)
    Scalar_<_Tp> conj() const;

    //! returns true iff v1 == v2 == v3 == 0
    bool isReal() const;
};

typedef Scalar_<double> Scalar;

Size class

Used to represent dimensions. The source code of Size class is as follows:

typedef Size_<int> Size2i;
typedef Size_<int64> Size2l;
typedef Size_<float> Size2f;
typedef Size_<double> Size2d;
typedef Size2i Size;

Size_ is a template class; here Size_<int> instantiates it with the type int. The typedefs first give Size_<int> the new name Size2i, and then give Size2i the new name Size. Therefore the three names Size_<int>, Size2i, and Size are equivalent.

The Size_ template is defined as follows:

template<typename _Tp> class Size_
{
public:
    typedef _Tp value_type;

    //! default constructor
    Size_();
    Size_(_Tp _width, _Tp _height);
    Size_(const Size_& sz);
    Size_(Size_&& sz) CV_NOEXCEPT;
    Size_(const Point_<_Tp>& pt);

    Size_& operator = (const Size_& sz);
    Size_& operator = (Size_&& sz) CV_NOEXCEPT;
    //! the area (width*height)
    _Tp area() const;
    //! aspect ratio (width/height)
    double aspectRatio() const;
    //! true if empty
    bool empty() const;

    //! conversion to another data type
    template<typename _Tp2> operator Size_<_Tp2>() const;

    // Common properties: width and height of the template type
    _Tp width;  //!< the width
    _Tp height; //!< the height
};

Several constructors are overloaded inside the Size_ template class; the most useful one is as follows:

Size_(_Tp _width, _Tp _height);

So we can use size.width and size.height to access the width and height respectively.

Example: Size(2,3) constructs a Size with width 2 and height 3, i.e. size.width == 2 and size.height == 3.

 Size size = Size(2, 3);
 int w = size.width;  // 2
 int h = size.height; // 3

Rect class

Used to represent rectangles. The member variables of the Rect class are x, y, width, and height: the coordinates of the top-left corner plus the rectangle's width and height. Commonly used member functions include size(), which returns a Size; area(), which returns the rectangle's area; contains(Point), which determines whether a point lies inside the rectangle; inside(Rect), which determines whether this rectangle lies inside another rectangle; tl(), which returns the top-left corner; and br(), which returns the bottom-right corner. To compute the union and intersection of two rectangles, you can write:

  Rect rect1=Rect(0,0,100,120);
  Rect rect2=Rect(10,10,100,120);
  Rect rect=rect1|rect2;
  Rect rect3=rect1&rect2;

If you want to translate or scale the rectangle, you can:

  Point point(10, 10);
  Size size(5, 5);
  Rect rect = Rect(10, 10, 100, 120);
  Rect rect1 = rect + point; // translate: shifts the top-left corner
  Rect rect2 = rect + size;  // scale: enlarges the width and height

The cvtColor function

Used for color space conversion. cvtColor() is OpenCV's color space conversion function. It can convert RGB to HSV, HSI, and other color spaces, and can also convert an image to grayscale.

The cvtColor() function is defined as follows:

CV_EXPORTS_W void cvtColor( InputArray src, OutputArray dst, int code, int dstCn = 0 );

The first parameter src is the input image, the second parameter dst is the output image, the third parameter code is the color space conversion identifier, and the fourth parameter dstCn is the number of channels of the destination image; if it is 0, the channel count is derived from the source image and the conversion code.

Example: Convert source image to grayscale image

 cvtColor(matInput, grayMat,COLOR_BGR2GRAY);

The color space conversion identifiers are defined in the ColorConversionCodes enumeration inside the OpenCV library.

/** the color conversion codes
@see @ref imgproc_color_conversions
@ingroup imgproc_color_conversions
 */
enum ColorConversionCodes {
    COLOR_BGR2BGRA     = 0, //!< add alpha channel to RGB or BGR image
    COLOR_RGB2RGBA     = COLOR_BGR2BGRA,

    COLOR_BGRA2BGR     = 1, //!< remove alpha channel from RGB or BGR image
    COLOR_RGBA2RGB     = COLOR_BGRA2BGR,

    COLOR_BGR2RGBA     = 2, //!< convert between RGB and BGR color spaces (with or without alpha channel)
    COLOR_RGB2BGRA     = COLOR_BGR2RGBA,

    COLOR_RGBA2BGR     = 3,
    COLOR_BGRA2RGB     = COLOR_RGBA2BGR,
    // ... (many conversion codes omitted)
        //! Demosaicing with alpha channel
    COLOR_BayerBG2BGRA = 139,
    COLOR_BayerGB2BGRA = 140,
    COLOR_BayerRG2BGRA = 141,
    COLOR_BayerGR2BGRA = 142,

    COLOR_BayerBG2RGBA = COLOR_BayerRG2BGRA,
    COLOR_BayerGB2RGBA = COLOR_BayerGR2BGRA,
    COLOR_BayerRG2RGBA = COLOR_BayerBG2BGRA,
    COLOR_BayerGR2RGBA = COLOR_BayerGB2BGRA,

    COLOR_COLORCVT_MAX  = 143
};

Image processing technology

Access pixels in an image

We have learned that the size of the image matrix depends on the color model used, and to be precise, the number of channels used. If it is a grayscale image, the matrix is ​​as follows:

         Column 0   Column 1   ...   Column m
Row 0    (0,0)      (0,1)      ...   (0,m)
Row 1    (1,0)      (1,1)      ...   (1,m)
...      ...        ...        ...   ...
Row n    (n,0)      (n,1)      ...   (n,m)

For multi-channel graphs, the columns in the matrix will contain multiple sub-columns, and the number of sub-columns is equal to the number of channels.

Example: The following represents the matrix of the RGB color model.

         Column 0            Column 1            ...  Column m
Row 0    (0,0) (0,0) (0,0)   (0,1) (0,1) (0,1)   ...  (0,m) (0,m) (0,m)
Row 1    (1,0) (1,0) (1,0)   (1,1) (1,1) (1,1)   ...  (1,m) (1,m) (1,m)
...      ...                 ...                 ...  ...
Row n    (n,0) (n,0) (n,0)   (n,1) (n,1) (n,1)   ...  (n,m) (n,m) (n,m)

It is worth noting that the channel order of sub-columns in OpenCV is the reverse, BGR rather than RGB.

Any image processing algorithm starts with operating each pixel. There are three ways to access each pixel in OpenCV.

Pointer access:

Pointer access uses the C [] operator on a row pointer. This is the fastest method.

// Color space reduction
void colorReduce(Mat& matInput, Mat& matoutput, int div)
{
    // Copy the input image
    matoutput = matInput.clone();
    int rows = matoutput.rows;                        // number of rows (height)
    int cols = matoutput.cols * matoutput.channels(); // columns * channels = elements per row
    // Traverse the image matrix
    for (int i = 0; i < rows; i++) {                  // row loop
        uchar* data = matoutput.ptr<uchar>(i);        // first address of row i
        for (int j = 0; j < cols; j++) {              // column loop
            data[j] = data[j] / div * div + div / 2;  // process every element
        }
    }
}

The public member variable rows in the Mat class is the height of the image, and cols is the width. The channels() function returns the number of channels: 1 for grayscale, 3 for color, and 4 if there is an alpha channel. ptr<uchar>(i) returns the first address of any row of the image; ptr is a template function that returns the first address of row i.

Iterator:

With the iterator method, you only need to obtain iterators at the beginning and end of the image matrix and then advance from begin to end. Applying the * operator to the iterator, as in (*it), gives access to the element it currently points to. Compared with direct pointer access, which can go out of bounds, iterators are a very safe method.

// Color space reduction
void colorReduceIterator(Mat& matInput, Mat& matoutput, int div)
{
    // Copy the input image
    matoutput = matInput.clone();
    // Iterator at the initial position
    Mat_<Vec3b>::iterator it = matoutput.begin<Vec3b>();
    // Iterator at the end position
    Mat_<Vec3b>::iterator itend = matoutput.end<Vec3b>();
    // Traverse the image matrix
    for (; it != itend; ++it) {
        // Process every pixel
        (*it)[0] = (*it)[0] / div * div + div / 2;
        (*it)[1] = (*it)[1] / div * div + div / 2;
        (*it)[2] = (*it)[2] / div * div + div / 2;
    }
}

Dynamic address calculation:

Dynamic address calculation accesses pixels through the at method. This version of colorReduce is concise and clear, but not the fastest.

// Color space reduction
void colorReduceVec(Mat& matInput, Mat& matoutput, int div)
{
    // Copy the input image
    matoutput = matInput.clone();
    int rows = matoutput.rows; // number of rows
    int cols = matoutput.cols; // number of columns
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < cols; j++) { // process every pixel
            // Blue channel
            matoutput.at<Vec3b>(i,j)[0] = matoutput.at<Vec3b>(i,j)[0] / div * div + div / 2;
            // Green channel
            matoutput.at<Vec3b>(i,j)[1] = matoutput.at<Vec3b>(i,j)[1] / div * div + div / 2;
            // Red channel
            matoutput.at<Vec3b>(i,j)[2] = matoutput.at<Vec3b>(i,j)[2] / div * div + div / 2;
        }
    }
}

The at<Vec3b>(i,j) function can be used to access image elements, but the image's data type must be known at compile time. It is important to ensure that the specified data type matches the one in the matrix, because the at method does not perform any type conversion.

Color images

Each pixel consists of three parts, blue channel, green channel, and red channel [BGR].

If there is an alpha channel, each pixel consists of four parts: a blue channel, a green channel, a red channel, and an alpha channel [BGRA].

Three-channel image

It refers to an image with three channels of RGB, which is simply a color image. R: red, G: green, B: blue. For example, red is (255, 0, 0)

Four-channel image

An alpha channel is added on top of the three channels to measure the transparency of a pixel or image. For example, when alpha is 0 the pixel is completely transparent, and when alpha is 255 it is completely opaque. For a Mat containing a color image, each pixel is a vector of three 8-bit values; OpenCV defines this type as Vec3b, a vector of 3 unsigned chars. If the image has an alpha channel, each pixel is a vector of four 8-bit values, which OpenCV defines as Vec4b. So we can write at<Vec3b>(i,j)[0], where the index 0 selects channel 0 of the color, i.e. the B (blue) component of that point.

Image graying

//Grayscale conversion
-(UIImage *)grayInPutImage:(UIImage *)inputImage{
    cv::Mat matInput=[[CVUtil sharedInstance]cvMatFromUIImage:inputImage];
    cv:: Mat grayMat;
    cv::cvtColor(matInput, grayMat,cv::COLOR_BGR2GRAY);
    UIImage *imag=[[CVUtil sharedInstance]UIImageFromCVMat:grayMat];
    return imag;
}

Box filtering

//Box filtering operation
-(UIImage *)boxFilterInPutImage:(UIImage *)inputImage value:(int)value{
    Mat matInput=[[CVUtil sharedInstance]cvMatFromUIImage:inputImage];
    Mat boxFilterMat;
    boxFilter(matInput, boxFilterMat, -1,cv::Size(value+1,value+1));
    UIImage *imag=[[CVUtil sharedInstance]UIImageFromCVMat:boxFilterMat];
    return imag;
}

Mean filtering

//Mean filtering operation
-(UIImage *)blurInPutImage:(UIImage *)inputImage value:(int)value{
    Mat matInput=[[CVUtil sharedInstance]cvMatFromUIImage:inputImage];
    Mat blurMat;
    blur(matInput, blurMat, cv::Size(value+1,value+1),cv::Point(-1,-1));
    UIImage *imag=[[CVUtil sharedInstance]UIImageFromCVMat:blurMat];
    return imag;
}

Gaussian filtering

//Gaussian filtering operation
-(UIImage *)gaussianBlurInPutImage:(UIImage *)inputImage value:(int)value{
    Mat matInput=[[CVUtil sharedInstance]cvMatFromUIImage:inputImage];
    Mat gaussianBlurMat;
    GaussianBlur(matInput, gaussianBlurMat, cv::Size(value*2+1,value*2+1), 0,0);
    UIImage *imag=[[CVUtil sharedInstance]UIImageFromCVMat:gaussianBlurMat];
    return imag;
}

Median filtering

//Median filtering operation
-(UIImage *)medianBlurInPutImage:(UIImage *)inputImage value:(int)value{
    Mat matInput=[[CVUtil sharedInstance]cvMatFromUIImage:inputImage];
    Mat medianBlurMat;
    medianBlur(matInput, medianBlurMat,value*2+1);
    UIImage *imag=[[CVUtil sharedInstance]UIImageFromCVMat:medianBlurMat];
    return imag;
}

Bilateral filtering

//Bilateral filtering operation
-(UIImage *)bilateralFilterInPutImage:(UIImage *)inputImage value:(int)value{
    Mat matInput=[[CVUtil sharedInstance]cvMatFromUIImage:inputImage];
    Mat bilateralFilterMat;
    Mat grayMat;
    cvtColor(matInput, grayMat,cv::COLOR_BGR2GRAY);
    bilateralFilter(grayMat, bilateralFilterMat, value, (double)value*2, (double)value/2);
    UIImage *imag=[[CVUtil sharedInstance]UIImageFromCVMat:bilateralFilterMat];
    return imag;
}

Erosion

//Erosion operation
- (UIImage *)erodeInPutImage:(UIImage *)inputImage value:(int)value{
    Mat matInput=[[CVUtil sharedInstance]cvMatFromUIImage:inputImage];
    Mat element;
    element=cv::getStructuringElement(MORPH_RECT, cv::Size(2*value+1,2*value+1),cv::Point(value,value));
    Mat desimg;
    erode(matInput,desimg,element);
    UIImage *imag=[[CVUtil sharedInstance]UIImageFromCVMat:desimg];
    return imag;
}

Dilation

// Dilation operation
- (UIImage *)dilateInPutImage:(UIImage *)inputImage value:(int)value{
    Mat matInput=[[CVUtil sharedInstance]cvMatFromUIImage:inputImage];
    Mat element;
    element=cv::getStructuringElement(MORPH_RECT, cv::Size(2*value+1,2*value+1),cv::Point(value,value));
    Mat desimg;
    dilate(matInput,desimg,element);
    UIImage *imag=[[CVUtil sharedInstance]UIImageFromCVMat:desimg];
    return imag;
}

Edge detection

//Edge detection
-(UIImage *)cannyInPutImage:(UIImage *)inputImage value:(int)value{
    if (value==0) {
        return inputImage;
    }
    Mat srcImage=[[CVUtil sharedInstance]cvMatFromUIImage:inputImage];
    Mat destImage;
    destImage.create(srcImage.size(), srcImage.type());
    Mat grayImage;
    cvtColor(srcImage, grayImage, COLOR_BGR2GRAY);
    Mat edge;
    blur(grayImage,edge,cv::Size(value,value));
    Canny(edge, edge, 13, 9 ,3);
    destImage=Scalar::all(0);
    srcImage.copyTo(destImage, edge); // copy the source using the edge map as a mask
    UIImage *imag=[[CVUtil sharedInstance]UIImageFromCVMat:destImage];
    return imag;
}

Image contrast and brightness adjustments

//Adjust the contrast and brightness
-(UIImage *)contrasAndBrightInPutImage:(UIImage *)inputImage alpha:(NSInteger)alpha beta:(NSInteger)beta{
    Mat g_srcImage=[[CVUtil sharedInstance]cvMatFromUIImage:inputImage];
    if(g_srcImage.empty()){
        return nil;
    }
    Mat g_dstImage=Mat::zeros(g_srcImage.size(),g_srcImage.type());
    int height=g_srcImage.rows;
    int width=g_srcImage.cols;
    for (int row=0; row<height; row++) {
        for (int col=0; col<width; col++) {
            for (int c=0; c<4; c++) { // 4-channel BGRA image
                g_dstImage.at<Vec4b>(row,col)[c]=saturate_cast<uchar>((alpha*0.01)*(g_srcImage.at<Vec4b>(row,col)[c])+beta);
            }
        }
    }
    UIImage *imag=[[CVUtil sharedInstance]UIImageFromCVMat:g_dstImage];
    return imag;
}

Summary

OpenCV has a wide range of applications. Those who are interested in image processing, human-computer interaction and machine learning algorithms can choose one direction for in-depth research.

This concludes this article on OpenCV iOS image processing programming. For more on OpenCV and iOS image processing, please search my previous articles or continue browsing the related articles below. I hope you will continue to support me!