PIL Image to CV2 image

Posted by Wujie of Dasheshire on September 17, 2012

Bands need to be reversed for CV2. PIL is RGB. CV2 is BGR.
Also, if you are getting errors with CV2, it may be because it doesn’t like transformed ‘views’ of arrays. Use the copy method.

	# PIL RGB 'im' to CV2 BGR 'imcv'
	imcv = np.asarray(im)[:,:,::-1].copy()
	# Or
	imcv = cv2.cvtColor(np.asarray(im), cv2.COLOR_RGB2BGR)

	# To gray image
	imcv = np.asarray(im.convert('L'))
	# Or
	imcv = cv2.cvtColor(np.asarray(im), cv2.COLOR_RGB2GRAY)

view raw gistfile1.py hosted with ❤ by GitHub

PIL and CV2 use the same percentages of R,G, and B to make a greyscale image but the results may have a difference of 1 in many places due to rounding off. PIL uses integer division and CV2 uses floating point percentages.
PIL: GRAY = R * 299/1000 + G * 587/1000 + B * 114/1000
CV2: GRAY = R * 0.299 + G * 0.587 + B * 0.114

If the PIL image has four bands, only reverse the first three.

skew matrix with numpy operations

Posted by Wujie of Dasheshire on September 4, 2012

As an alternative to plugging values into [[0,-c,b],[c,0,-a],[-b,a,0]], i tried using numpy operations. This is very slightly faster than the above.

def skew(v):
    if len(v) == 4: v = v[:3]/v[3]
    skv = roll(roll(diag(v.flatten()), 1, 1), -1, 0)
    return skv - skv.T

Sudoku game generator

Posted by Wujie of Dasheshire on September 4, 2012

I wrote a Sudoku game generator based on what I know from playing Sudoku.

it is on ActiveState.com http://goo.gl/FbDQ0

I think it’s possible that some games may have more than one solution. I don’t feel like working with it more, but one way to prevent multiple solutions is have the computer play the game a few times and throw out those that do have more than one solution. I would have to randomize the order in which play options are done. It takes me awhile to solve one by hand, so I haven’t tested it too much, but it’s good so far.

cv2.triangulatePoints

Posted by Wujie of Dasheshire on August 29, 2012

X = cv2.triangulatePoints(P1, P2, x1, x2)
	• P1: Camera projection from X to x1; x1 = dot(P1,X)
	• P2: Camera projection from X to x2; x2 = dot(P2,X)
	• x1: 2xN normalized points
	• x2: 2xN normalized points

The projection matrices have the same origin. If camera one is set as the world origin, then P1 = eye(4)[:3]. The program crashed when I tried integer point arrays.

Complete example:

# Camera projection matrices
P1 = eye(4)
P2 = array([[ 0.878, -0.01 ,  0.479, -1.995],
            [ 0.01 ,  1.   ,  0.002, -0.226],
            [-0.479,  0.002,  0.878,  0.615],
            [ 0.   ,  0.   ,  0.   ,  1.   ]])
# Homogeneous arrays
a3xN = array([[ 0.091,  0.167,  0.231,  0.083,  0.154],
              [ 0.364,  0.333,  0.308,  0.333,  0.308],
              [ 1.   ,  1.   ,  1.   ,  1.   ,  1.   ]])
b3xN = array([[ 0.42 ,  0.537,  0.645,  0.431,  0.538],
              [ 0.389,  0.375,  0.362,  0.357,  0.345],
              [ 1.   ,  1.   ,  1.   ,  1.   ,  1.   ]])
# The cv2 method
X = cv2.triangulatePoints( P1[:3], P2[:3], a3xN[:2], b3xN[:2] )
# Remember to divide out the 4th row. Make it homogeneous
X /= X[3]
# Recover the origin arrays from PX
x1 = dot(P1[:3],X)
x2 = dot(P2[:3],X)
# Again, put in homogeneous form before using them
x1 /= x1[2]
x2 /= x2[2]

print 'X\n', X
print 'x1\n', x1
print 'x2\n', x2

X
[[  1.003   2.008   3.012   1.003   2.011]
 [  4.012   4.009   4.017   4.029   4.019]
 [ 11.019  12.023  13.041  12.089  13.057]
 [  1.      1.      1.      1.      1.   ]]
x1
[[ 0.091  0.167  0.231  0.083  0.154]
 [ 0.364  0.333  0.308  0.333  0.308]
 [ 1.     1.     1.     1.     1.   ]]
x2
[[ 0.42   0.537  0.645  0.431  0.538]
 [ 0.389  0.375  0.362  0.357  0.345]
 [ 1.     1.     1.     1.     1.   ]]

It would be nice if the cv2 function could accept 4×4 transformation and 3xN homogeneous point matrices and also return a homogeneous array.

def triangulatePoints( P1, P2, x1, x2 ):
    X = cv2.triangulatePoints( P1[:3], P2[:3], x1[:2], x2[:2] )
    return X/X[3]

cv2.correctMatches

Posted by Wujie of Dasheshire on August 29, 2012

This function requires two 1xN array of points, how does that work? Maybe 1xNx2 would be more clear. You need to convert the point array to the shape (1,N,2). If your point array looks like mine:

a3xN = \
array([[35, 59, 32, 92, 71, 12,  4,  8, 28, 19],
       [ 9,  4, 36, 80, 85, 47, 60, 83, 38, 68],
       [ 1,  1,  1,  1,  1,  1,  1,  1,  1,  1]])

You need to change it to look like this:

a1xNx2 = \
array([[[35,  9],
        [59,  4],
        [32, 36],
        [92, 80],
        [71, 85],
        [12, 47],
        [ 4, 60],
        [ 8, 83],
        [28, 38],
        [19, 68]]])

Try this:

a1xNx2 = r_[[a3xN[:2].T]]
#or
a1xNx2 = array([a3xN[:2].T])

The output is in the same form. Use output[0].T to convert to a 2xN array. If you don’t need the original a3xN array, then copy the output 2xN into the first two rows of a3xN (a3xN[:2] = output[0].T). Or convert to homogeneous coordinates like this:

outa3xN = append(outa[0].T,ones((1,len(outa[0]))),0)
#or
outa3xN = vstack((outa[0].T,ones(len(outa[0]))))

Complete example:

# A fundamental or essential matrix
F3x3 = array([[ 0.0345, -0.2073, -0.0673],
              [-0.1403, -0.0003,  0.6889],
              [ 0.0604, -0.6725,  0.0349]])
# Homogeneous 2D point arrays
a3xN = array([[35, 59, 32, 92, 71, 12,  4,  8],
              [ 9,  4, 36, 80, 85, 47, 60, 83],
              [ 1,  1,  1,  1,  1,  1,  1,  1]], float)
# Pass an integer array and get an integer array back
b3xN = array([[95, 24, 86, 26,  4, 85, 68, 66],
              [35, 44, 91, 10, 14, 41, 93, 34],
              [ 1,  1,  1,  1,  1,  1,  1,  1]], int)
# The cv2 method
outa, outb = cv2.correctMatches( F3x3, r_[[a3xN[:2].T]], r_[[b3xN[:2].T]] )
# Two ways of making the output homogeneous
outa3xN = append(outa[0].T,ones((1,len(outa[0]))),0)
outb3xN = vstack((outb[0].T,ones(len(outb[0]),int)))
print outa3xN
print outb3xN

[[ 34.37  52.71  12.63  94.56  70.1    5.   -24.92 -14.35]
 [ -1.31 -15.19  -3.31  77.    85.69   0.48  35.01  77.4 ]
 [  1.     1.     1.     1.     1.     1.     1.     1.  ]]
[[ 96  36  88   6  -8  85  49   4]
 [ 33  28  89 -10   7  41 103  44]
 [  1   1   1   1   1   1   1   1]]

Note that if you pass an integer array, then the function returns an int array, so you may need to convert to float first.

Epipole and Epipolar Lines

Posted by Wujie of Dasheshire on August 26, 2012

Epipole from essential matrix:
e1c2 is the position of camera 2 in image 1 coordinates. (Plot it in image 1)

def epipoleSVD(M):
    V = cv2.SVDecomp(M)[2]
    return V[-1]/V[-1,-1]
e1c2 = epipoleSVD(E)
e2c1 = epipoleSVD(E.T)

Note: This cv2 method returns three values, [S, U, V], but only V is needed to find the epipole. The epipole is in form [x,y,1].

Epipolar lines from essential matrix:
elines1 (a 3xN array) are drawn on image 1 and will converge at e1c2 (from above). x1 and x2 are normalized 3xN arrays of corresponding points in image 1 and image 2. (A line vector contains the values [a,b,c] in ax+by+c=0)

elines2 = dot(E, x1)
elines1 = dot(E.T, x2) #or dot(x2.T, E).T

To plot these lines, try the following:

def lines2points(lines, epipole):
    xmax = ones(lines.shape[1])*epipole[0]
    a,c = lines[0]/lines[1], lines[2]/lines[1]
    x = array([-xmax,xmax])
    y = array([r_[-a*Lx[0] - c], r_[-a*Lx[1] - c]])
    return x, y
Lx,Ly = lines2points( dot(E, x1), epipoleSVD(E.T) )
plt.plot(Lx, Ly, 'r')

Note: This will overlay epipolar lines derived from image 1 on corresponding points in image 2 that are between +/- epipole x-coordinate.

Essential Matrix

Posted by Wujie of Dasheshire on August 25, 2012

From point correspondences:
x1 and x2 are 3xN array of normalized points from images 1 and 2.

E = cv2.findFundamentalMat(x1[:2].T, x2[:2].T)[0]

Note: This cv2 method requires points to be in Nx2 format.
The ending [0] returns the matrix. The 2nd returned value [1] is a mask array of inliers (=1) and outliers (=0)

From transformation/projection matrix:
P is a 3×4 (or 4×4) camera projection matrix for image 2.
(World origin at image 1 camera)

R,t = cv2.decomposeProjectionMatrix(P[:3])[1:3]
skewt = roll(roll(diag(t.flat[:3]/t[3]), 1, 1), -1, 0)
skewt -= skewt.T
E = dot(R, skewt)

Note: This method requires a 3×4 array and returns seven items, [1] and [2] are the rotation matrix and translation vector. Also, the t-vec has length 4 and must be normalized by the last value (t[:3]/t[3])
roll, diag, .flat, dot are all NumPy functions.

First steps towards bundle adjustment

Posted by Wujie of Dasheshire on July 4, 2012

The white points are triangulated from the correspondences in the first two video frames. The red shows the two camera positions. Green are from adding data from the third image. The third camera position is farther ahead because 3 frames are skipped in the video sequence.

This was done with three rectified jpeg images. When added to the Ladybug program, it will find correspondences from the raw images (higher resolution) and then project those points to the rectified position. This uses the camera intrinsics produced through the Ladybug API.

PIL projective transform

Posted by Wujie of Dasheshire on June 3, 2012

im.transform(size, PERSPECTIVE, data) image ⇒ image

size = output image size
PERSPECTIVE = Image.PERSPECTIVE (int)
data = 8-tuple (a, b, c, d, e, f, g, h) of coefficients

Questions you may have:
1) How do you know the size of the image after the transform to set the size before doing the transform?
2) How is the transformed image placed within that pre-sized output window?
3) Where does the data 8-tuple come from?
4) How can I use the 8-tuple coefficients to project a point back and forth between original and transformed image.

Answers:
1&2) It depends on how much of the image you want to include. The destination points determine the placement within the new image. You can first calculate the coefficients (forward form, see Ans. 4), then calculate where the image corners will end up, then shift the destination points (or corners) to the desired placement and adjust the image size. Finally, calculate the coefficients a second time (backward form) for the Image.transform function.

3) The coefficients are calculated with some matrix math. You can find an explanation elsewhere. I’ll just show some code:

def get_transform_data(pts8, backward=True ):
    '''This method returns a perspective transform 8-tuple (a,b,c,d,e,f,g,h).

    Use to transform an image:
    X = (a x + b y + c)/(g x + h y + 1)
    Y = (d x + e y + f)/(g x + h y + 1)
    
    Image.transform: Use 4 source coordinates, followed by 4 corresponding 
        destination coordinates. Use backward=True (the default)
        
    To calculate the destination coordinate of a single pixel, either reverse
        the pts (4 dest, followed by 4 source, backward=True) or use the same
        pts but set backward to False.
    
    @arg pts8: four source and four corresponding destination coordinates
    @kwarg backward: True to return coefficients for calculating an originating
        position. False to return coefficients for calculating a destination
        coordinate. (Image.transform calculates originating position.)
    '''
    assert len(pts8) == 8, 'Requires a tuple of eight coordinate tuples (x,y)'
    
    b0,b1,b2,b3,a0,a1,a2,a3 = pts8 if backward else pts8[::-1]
    
    # CALCULATE THE COEFFICIENTS
    A = array([[a0[0], a0[1], 1,     0,     0, 0, -a0[0]*b0[0], -a0[1]*b0[0]],
               [    0,     0, 0, a0[0], a0[1], 1, -a0[0]*b0[1], -a0[1]*b0[1]],
               [a1[0], a1[1], 1,     0,     0, 0, -a1[0]*b1[0], -a1[1]*b1[0]],
               [    0,     0, 0, a1[0], a1[1], 1, -a1[0]*b1[1], -a1[1]*b1[1]],
               [a2[0], a2[1], 1,     0,     0, 0, -a2[0]*b2[0], -a2[1]*b2[0]],
               [    0,     0, 0, a2[0], a2[1], 1, -a2[0]*b2[1], -a2[1]*b2[1]],
               [a3[0], a3[1], 1,     0,     0, 0, -a3[0]*b3[0], -a3[1]*b3[0]],
               [    0,     0, 0, a3[0], a3[1], 1, -a3[0]*b3[1], -a3[1]*b3[1]]] )
    B = array([b0[0], b0[1], b1[0], b1[1], b2[0], b2[1], b3[0], b3[1]])
    
    return linalg.solve(A,B)

If your image has black borders that you want to make transparent, this code could help:

    transparency = 150
    I = asarray(transimage)  # (y, x, band)
    maskmat = (I[:,:,0] | I[:,:,1] | I[:,:,2] == 0)
    mask = ones(size, int).transpose() * transparency
    mask[maskmat] = 0
    mask = Image.fromarray(uint8(mask))
    transimage.putalpha(mask)

This code applies an alpha mask to a transformed image (transimage). Transparency is how transparent the image part will be and the maskmat region is for all the black (non-image) border area. This area of the mask is set to completely transparent (mask[maskmat] = 0). This works with Tkinter where an alpha of zero is transparent and 255 is opaque.

4) When I first saw the equation for calculating the point transform I believed I must have read it wrong. “Why would the equations move a point from the transformed image back to the original? I must be reading this wrong,” I thought. But the data set passed to the transform method must be working in the following way: It first creates a blank image of the specified size and then uses the ‘backwards’ equation to calculate from which pixel in the original image to copy color values into the destination location. Well, it seems counter-intuitive to me so I still call it ‘backwards’ (where ‘backward’ means mapping from destination to the origin, and ‘forward’ is origin to destination). Here is the code to transform a point:

def transform_pt(pt , coeffs ):
    T = coeffs
    x = (T[0]*pt[0] + T[1]*pt[1] + T[2])/(T[6]*pt[0] + T[7]*pt[1] + 1)
    y = (T[3]*pt[0] + T[4]*pt[1] + T[5])/(T[6]*pt[0] + T[7]*pt[1] + 1)
    return x,y

Two ways to use this. Pass a destination coord with the default (backward) transformation coefficients, or pass an origin coord with the forward (backward=False) coefficients.

C enums revisited

Posted by Wujie of Dasheshire on May 27, 2012

Just found an easy way to make enums.

(LADYBUG_DISABLE,
   LADYBUG_EDGE_SENSING,
   LADYBUG_NEAREST_NEIGHBOR,
   LADYBUG_NEAREST_NEIGHBOR_FAST,
   LADYBUG_RIGOROUS,
   LADYBUG_DOWNSAMPLE4,
   LADYBUG_MONO,
   LADYBUG_HQLINEAR) = xrange(8)

This is a simple way to make enums. Overriden values need to be added separately.
Unless a method to retrieve the enum name from the integer is needed, in that case you make another list (or dict for unordered values) to retrieve the enum word. The namedtuple is the best way if you don’t want to put quotes around each word in a long list.
For a ctypes version, you can use map()

(...) = map(c_int, xrange(8))

I guess there really isn’t a need to go the reverse direction unless you want to read the value for debugging. In my novice understanding when first learning ctypes I thought it would be necessary to have this option. Here how to make a list and then enumerate from that list:

LadybugColorProcessingMethod = '''
   LADYBUG_DISABLE
   LADYBUG_EDGE_SENSING
   LADYBUG_NEAREST_NEIGHBOR
   LADYBUG_NEAREST_NEIGHBOR_FAST
   LADYBUG_RIGOROUS
   LADYBUG_DOWNSAMPLE4
   LADYBUG_MONO
   LADYBUG_HQLINEAR
   '''.split()
for n, v in zip( LadybugColorProcessingMethod, range(8) ):
    exec( n + ' = ' + str(v) )

A namedtuple implementation might be preferred if the same name might be used for different values in different enums.

The Python Path

"Because the geeks shall inherit the properties and methods of object Earth" -heard on Slay Radio