
# Image Translation

mesakarghm, Sep 16 2021, Computer Vision

Image translation is the process of moving an image, or an object within it, from one location to another. Using a predefined transformation matrix, we can shift the image in any direction. The following transformation matrix can be used for image translation.

$$\begin{bmatrix} 1 & 0 & t_x\\ 0 & 1 & t_y \end{bmatrix}$$

where $t_x$ and $t_y$ are the shift distances, in pixels, along the x and y axes.
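
Written out, and assuming each pixel position is expressed in homogeneous coordinates $(x, y, 1)$ (the primed symbols $x'$ and $y'$ are notation introduced here for the translated position), applying the matrix gives:

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 1 & 0 & t_x\\ 0 & 1 & t_y \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} x + t_x \\ y + t_y \end{bmatrix}$$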

If $t_x$ is negative, the image shifts to the left; if it is positive, the image shifts to the right. Similarly, the image shifts up for negative values of $t_y$ and down for positive values, because row indices in an image array increase downward.

To find where a pixel lands in the translated image, we take the dot product of this transformation matrix with the pixel's coordinates, written as the homogeneous vector $(x, y, 1)$. The result is the pixel's new position $(x + t_x, y + t_y)$, and the pixel's value is copied to that position.
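
As a quick check of this step, here is a minimal sketch of the product for a single pixel. The shift values and the pixel position are made-up numbers chosen only for illustration.

```python
import numpy as np

# Hypothetical shift values, chosen only for illustration.
t_x, t_y = 50, -20
ts_mat = np.array([[1, 0, t_x],
                   [0, 1, t_y]])

# A pixel at column x=10, row y=30, in homogeneous form [x, y, 1].
origin_xy = np.array([10, 30, 1])

# The matrix-vector product yields the translated coordinates.
new_x, new_y = ts_mat @ origin_xy
print(new_x, new_y)  # 60 10
```

The positive $t_x$ moves the pixel 50 columns to the right, and the negative $t_y$ moves it 20 rows up.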

Below I provide a Python implementation for image translation.

The given pictures show the translation_img() function in action, where the image is shifted using a shift distance of (50, 50).

[Figure: the image before and after translation]

Learn and practice this concept here:

https://mlpro.io/problems/img-translation/

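```python
import numpy as np


def translation_img(src_img, shift_distance, shape_of_out_img):
    h, w = src_img.shape[:2]
    # shift_distance is an (x, y) pair, e.g. (50, 50)
    x_distance, y_distance = shift_distance
    # 2x3 translation matrix
    ts_mat = np.array([[1, 0, x_distance], [0, 1, y_distance]])

    # Pixels that receive no source value stay black (0)
    out_img = np.zeros(shape_of_out_img, dtype='u1')
    out_h, out_w = shape_of_out_img[:2]

    for i in range(h):
        for j in range(w):
            origin_x = j
            origin_y = i
            # Homogeneous coordinates [x, y, 1]
            origin_xy = np.array([origin_x, origin_y, 1])

            new_xy = np.dot(ts_mat, origin_xy)
            new_x = new_xy[0]
            new_y = new_xy[1]
            # Copy only pixels that land inside the output image
            if (0 <= new_x < out_w) and (0 <= new_y < out_h):
                out_img[new_y, new_x] = src_img[i, j]
    return out_img
```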
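
The per-pixel loop is easy to follow but can be slow on large images. One possible alternative, sketched here under the assumption that the shifts are integers, copies the overlapping region with a single NumPy slice. The name `translate_by_slicing` is hypothetical, and unlike the loop version it keeps the source dtype instead of forcing `'u1'`.

```python
import numpy as np


def translate_by_slicing(src_img, shift_distance, shape_of_out_img):
    """Hypothetical vectorized variant: copy the overlapping region in one slice."""
    t_x, t_y = shift_distance  # assumed to be integers
    h, w = src_img.shape[:2]
    out_h, out_w = shape_of_out_img[:2]
    out_img = np.zeros(shape_of_out_img, dtype=src_img.dtype)

    # Destination window, clipped to the output bounds.
    x0, x1 = max(t_x, 0), min(w + t_x, out_w)
    y0, y1 = max(t_y, 0), min(h + t_y, out_h)
    if x0 < x1 and y0 < y1:
        # The matching source window is the destination window shifted back.
        out_img[y0:y1, x0:x1] = src_img[y0 - t_y:y1 - t_y, x0 - t_x:x1 - t_x]
    return out_img


# Small grayscale example so the result is easy to inspect.
img = np.arange(12, dtype=np.uint8).reshape(3, 4)
shifted = translate_by_slicing(img, (1, 1), img.shape)
print(shifted)
```

With a shift of (1, 1), the first row and first column of `shifted` are zero and the remaining entries are the top-left part of `img` moved one step right and one step down.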