Image processing: basic image operations

Keywords: OpenCV Computer Vision image processing

1 I/O operations on images

This section introduces how to read, display, and save images.

1.1 reading images

API: cv.imread()

Parameters:

  • Image to read (file path)
  • Flag of reading mode
    • cv.IMREAD_COLOR: loads the image in color; any transparency (alpha channel) is ignored. This is the default flag.
    • cv.IMREAD_GRAYSCALE: loads the image in grayscale mode.
    • cv.IMREAD_UNCHANGED: loads the image as is, including the alpha channel if there is one.

You can use 1, 0 and -1, respectively, in place of the three flags above.

Reference code:

import numpy as np
import cv2 as cv
# Read the image as a grayscale image (flag 0 is cv.IMREAD_GRAYSCALE)
img = cv.imread('messi5.jpg', 0)

Note: if the path is wrong, cv.imread() does not raise an exception; it returns None.

1.2 displaying images

API: cv.imshow()

Parameters:

  • Name of the window that displays the image, given as a string
  • Image to display

Note: after calling cv.imshow(), call cv.waitKey() to give the window time to draw the image; otherwise the window does not respond and the image is not displayed.

In addition, we can also use matplotlib to display images.

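# Display in opencv ('image' is an example window name)
cv.imshow('image', img)
cv.waitKey(0) # 0 waits indefinitely for a key press; any other number is a timeout in ms
# Displayed in matplotlib (imported as plt); img is the grayscale image read above
plt.imshow(img, cmap='gray')
plt.show()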
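
One thing to watch when switching to matplotlib: OpenCV stores color images in BGR channel order, while matplotlib expects RGB, so a color image passed straight to plt.imshow() appears with red and blue swapped. A small sketch of the usual fix, using only NumPy slicing; the 2x2 test image and the variable names are made up for illustration.

```python
import numpy as np

# A tiny 2x2 "image" in OpenCV's BGR order: every pixel is pure blue
bgr = np.zeros((2, 2, 3), np.uint8)
bgr[:, :, 0] = 255  # channel 0 is blue in BGR order

# Reverse the last axis to get the RGB order that matplotlib expects
rgb = bgr[:, :, ::-1]

print(bgr[0, 0].tolist())  # [255, 0, 0]: blue comes first
print(rgb[0, 0].tolist())  # [0, 0, 255]: blue comes last
```

Passing rgb to plt.imshow() then shows the expected colors. cv.cvtColor(img, cv.COLOR_BGR2RGB) is an equivalent alternative.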

1.3 saving images

API: cv.imwrite()

Parameters:

  • File name: the path where the image is saved
  • Image to save

Reference code:

cv.imwrite('messigray.png', img) # 'messigray.png' is an example output file name


1.4 summary

We load an image in grayscale and display it. Pressing 's' saves the image and exits; pressing ESC exits without saving.

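import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt

# 1 read image
img = cv.imread('messi5.jpg',0)

# 2 display image
# 2.1 displaying images with opencv
cv.imshow('image',img)

# 2.2 displaying images in Matplotlib
plt.imshow(img, cmap='gray')
plt.title('Matching results'), plt.xticks([]), plt.yticks([])
plt.show()
k = cv.waitKey(0)

# 3 save image: 's' saves before exiting, ESC (or any other key) exits without saving
if k == ord('s'):
    cv.imwrite('messigray.png', img) # example output file name
cv.destroyAllWindows()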

2 drawing geometry

2.1 draw a straight line

API: cv.line(img,start,end,color,thickness)

Parameters:

  • img: image to draw the line on
  • start, end: start and end points of the line, each given as (x, y)
  • color: color of the line, as a (B, G, R) tuple for color images
  • thickness: line width in pixels

2.2 draw a circle

API: cv.circle(img,centerpoint,r,color,thickness)

Parameters:

  • img: image to draw the circle on
  • centerpoint, r: center (x, y) and radius of the circle
  • color: color of the circle
  • thickness: width of the outline. When it is -1, the circle is drawn filled with the color

2.3 draw a rectangle

API: cv.rectangle(img,leftupper,rightdown,color,thickness)

Parameters:

  • img: image to draw the rectangle on
  • leftupper, rightdown: (x, y) coordinates of the upper left and lower right corners of the rectangle
  • color: color of the outline
  • thickness: line width in pixels

2.4 adding text to images

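API: cv.putText(img,text,station,font,fontsize,color,thickness,cv.LINE_AA)

Parameters:

  • img: image
  • text: text data to write
  • station: where the text is placed, given as the (x, y) of the bottom left corner of the text
  • font: font
  • fontsize: font scale factor
  • color, thickness: color and stroke width of the text
  • cv.LINE_AA: anti-aliased line type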

2.5 effect display

We create an all-black image, then draw shapes on it and add text.

import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt
# 1 create a blank image
img = np.zeros((512,512,3), np.uint8)
# 2 drawing graphics
cv.rectangle(img,(384,0),(510,128),(0,255,0),3)
cv.circle(img,(447,63), 63, (0,0,255), -1)
font = cv.FONT_HERSHEY_SIMPLEX # this font is an assumption; any cv.FONT_* constant works
cv.putText(img,'OpenCV',(10,500), font, 4,(255,255,255),2,cv.LINE_AA)
# 3 image display
plt.imshow(img[:,:,::-1]) # reverse BGR to RGB for matplotlib
plt.title('Matching results'), plt.xticks([]), plt.yticks([])
plt.show()


3 get and modify the pixels in the image

We can read the value of a pixel by indexing the image with its row and column coordinates. For a BGR image, this returns an array of blue, green and red values. For a grayscale image, only the corresponding intensity value is returned. Pixel values are modified the same way.

import numpy as np
import cv2 as cv
img = cv.imread('messi5.jpg')
# Gets the value of a pixel
px = img[100,100]
# Gets only the intensity value of the blue channel
blue = img[100,100,0]
# Modify the pixel value of a location
img[100,100] = [255,255,255]
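
The same indexing can be tried without an image file. Below is a minimal sketch on small synthetic arrays; the sizes and values are arbitrary. It also shows that slicing modifies a whole region at once, and that a grayscale image yields a single intensity value.

```python
import numpy as np

# Small synthetic BGR image: 4 rows, 6 columns, 3 channels
img = np.zeros((4, 6, 3), np.uint8)

img[1, 2] = [10, 20, 30]         # set the pixel at row 1, column 2
px = img[1, 2]                   # all three channel values (B, G, R)
blue = img[1, 2, 0]              # blue channel only

img[0:2, 0:2] = [255, 255, 255]  # set a 2x2 region to white in one step

# A grayscale image has no channel axis, so a pixel is a single intensity
gray = np.zeros((4, 6), np.uint8)
gray[1, 2] = 128
intensity = gray[1, 2]
```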

4 get the attributes of the image

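Image attributes include the shape (number of rows, columns and channels), the data type, the total number of elements, etc.

Attribute                                   API
Shape (rows, columns, channels)             img.shape
Image size (rows × columns × channels)      img.size
Data type                                   img.dtype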
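
Since an OpenCV image in Python is a NumPy array, these attributes can be checked on synthetic arrays. A short sketch with an assumed 640x480 image size:

```python
import numpy as np

color = np.zeros((480, 640, 3), np.uint8)  # 480 rows, 640 columns, BGR
gray = np.zeros((480, 640), np.uint8)      # grayscale: no channel axis

rows, cols, channels = color.shape

print(color.shape)  # (480, 640, 3)
print(color.size)   # 921600, i.e. 480 * 640 * 3 elements
print(color.dtype)  # uint8
print(gray.shape)   # (480, 640)
```

Because a grayscale image has a two-element shape, checking len(img.shape) or img.ndim is a simple way to tell grayscale from color before unpacking.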

5 splitting and merging of image channels

Sometimes you need to work on the B, G and R channels separately. In that case, split the BGR image into single channels. In other cases, you may need to merge separate channels back into a BGR image. You can do both as follows.

# Channel splitting
b,g,r = cv.split(img)
# Channel merging
img = cv.merge((b,g,r))

Posted by djbuddhi on Thu, 04 Nov 2021 22:05:43 -0700