
# Author Archives: kyuhyoung

- The places where Windows 10 wall photos are taken
- Preventing MobaXTerm SSH session from freezing
- Binary cross entropy and cross entropy loss usage in PyTorch
- Element-wise operations over multiple tuples in Python
- Intrinsic camera parameters for resized images set of a chessboard
- Role of size of calibration pattern in camera calibration
- OpenCV-related tips

## The places where Windows 10 wall photos are taken

https://spotlight.it-notes.ru/

The above site tells you where the beautiful Windows 10 log-in background photos were taken.


## Preventing MobaXTerm SSH session from freezing

reference : http://www.smallake.kr/?p=19226

Whenever I go out for lunch and come back to my desk, the SSH session in MobaXTerm is frozen.

So I turned on the [Settings -> SSH -> SSH settings -> SSH keepalive] option.

Let me see if this works.

## Binary cross entropy and cross entropy loss usage in PyTorch

reference : https://discuss.pytorch.org/t/lstm-crossentropyloss-change-to-bceloss/5320/4

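    import torch
    from torch.autograd import Variable

    # Raw scores for 10 samples and 2 classes
    t1 = torch.randn(10, 2)
    print('t1 : ', t1)
    score = Variable(t1)

    # Random binary targets
    t2 = torch.rand(10) > 0.5
    print('t2 : ', t2)
    t3 = t2.long()
    print('t3 : ', t3)
    target = Variable(t3)

    lfn1 = torch.nn.CrossEntropyLoss()
    lfn2 = torch.nn.BCELoss()

    # Cross entropy computed directly from the raw scores
    t4 = lfn1(score, target)
    print('t4 : ', t4)

    # Softmax over the class dimension, then BCE on the class-1 probability
    t5 = torch.nn.functional.softmax(score, dim=1)
    print('t5 : ', t5)
    t6 = t5[:, 1]
    print('t6 : ', t6)
    t7 = target.float()
    print('t7 : ', t7)
    t8 = lfn2(t6, t7)
    print('t8 : ', t8)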

Result

    t1 :
    -0.6985  0.4857
    -0.4547  0.8040
    -1.4664  1.2185
     0.5449  0.0662
     0.7061 -0.6480
    -0.8922  0.4205
     0.1444 -0.1308
     1.4784  0.7342
     0.6642 -0.3723
    -0.9741 -0.3100
    [torch.FloatTensor of size 10x2]

    t2 :
     1
     1
     0
     0
     0
     0
     1
     1
     0
     1
    [torch.ByteTensor of size 10]

    t3 :
     1
     1
     0
     0
     0
     0
     1
     1
     0
     1
    [torch.LongTensor of size 10]

    t4 : Variable containing:
     0.8223
    [torch.FloatTensor of size 1]

    t5 : Variable containing:
     0.2343  0.7657
     0.2212  0.7788
     0.0639  0.9361
     0.6174  0.3826
     0.7948  0.2052
     0.2120  0.7880
     0.5684  0.4316
     0.6779  0.3221
     0.7382  0.2618
     0.3398  0.6602
    [torch.FloatTensor of size 10x2]

    t6 : Variable containing:
     0.7657
     0.7788
     0.9361
     0.3826
     0.2052
     0.7880
     0.4316
     0.3221
     0.2618
     0.6602
    [torch.FloatTensor of size 10]

    t7 : Variable containing:
     1
     1
     0
     0
     0
     0
     1
     1
     0
     1
    [torch.FloatTensor of size 10]

    t8 : Variable containing:
     0.8223
    [torch.FloatTensor of size 1]

The results of [CrossEntropyLoss] and [softmax + BCELoss] are the same, which means that CrossEntropyLoss applies softmax internally: it takes the raw scores as input, whereas BCELoss expects probabilities.
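The equality can also be checked by hand without PyTorch. The following is a minimal sketch in plain Python that recomputes both losses from the printed t1 scores and t3 targets; the helper names are mine, and because the scores are rounded to four decimals the result only approximately reproduces the 0.8223 printed above.

```python
import math

# Scores (t1) and targets (t3) copied from the printed output above.
scores = [(-0.6985, 0.4857), (-0.4547, 0.8040), (-1.4664, 1.2185),
          (0.5449, 0.0662), (0.7061, -0.6480), (-0.8922, 0.4205),
          (0.1444, -0.1308), (1.4784, 0.7342), (0.6642, -0.3723),
          (-0.9741, -0.3100)]
targets = [1, 1, 0, 0, 0, 0, 1, 1, 0, 1]


def softmax(row):
    """Numerically stable softmax of one row of scores."""
    top = max(row)
    exps = [math.exp(v - top) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(rows, labels):
    """Mean of -log(softmax(row)[label]): softmax is part of the loss."""
    losses = [-math.log(softmax(r)[t]) for r, t in zip(rows, labels)]
    return sum(losses) / len(losses)


def binary_cross_entropy(probs, labels):
    """Mean binary cross entropy on probabilities of class 1."""
    losses = [-(t * math.log(p) + (1 - t) * math.log(1 - p))
              for p, t in zip(probs, labels)]
    return sum(losses) / len(losses)


ce = cross_entropy(scores, targets)
class1_probs = [softmax(r)[1] for r in scores]
bce = binary_cross_entropy(class1_probs, targets)
print(ce, bce)
```

With only two classes, the probability of class 0 is one minus the probability of class 1, which is why the two losses coincide here.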

## Element-wise operations over multiple tuples in Python

    aa, bb, cc = (1, 3), (9, 31), (20, 60)

    d = tuple(map(lambda a, b, c: a * b + c, aa, bb, cc))
    print(d)
    >> (29, 153)

    e = tuple(map(lambda a, b, c: a * 10 + b > c, aa, bb, cc))
    print(e)
    >> (False, True)
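As a possible alternative (not from the original snippet), the same element-wise operations can be written with zip() and a generator expression, which some readers find easier to follow than map() with a multi-argument lambda:

```python
aa, bb, cc = (1, 3), (9, 31), (20, 60)

# zip() walks the three tuples in lockstep, one element from each per step.
d = tuple(a * b + c for a, b, c in zip(aa, bb, cc))
e = tuple(a * 10 + b > c for a, b, c in zip(aa, bb, cc))

print(d)
print(e)
```

Both forms stop at the shortest input tuple, so they behave the same when the tuples have different lengths.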

## Intrinsic camera parameters for resized images set of a chessboard

I realized that it is more convenient to resize (downsize) the input camera image to a small standard size (in my case 320 x 240) than to change the parameters for each camera image size every time. That is, the original large image and its intrinsic parameters are used only for displaying the rendered augmented-reality scene, and all the processing behind it is done with the resized image and the corresponding intrinsic parameters.

Basically, intrinsic camera parameters are needed to calculate the projection of a 3D point onto the image plane. This projection is an important part of visual SLAM, where the observations of 3D landmarks in the current frame are predicted from the camera pose and the 3D map estimated at the previous frame.

To get the resized image, all I have to do is call the cv::resize() function of OpenCV. The corresponding intrinsic parameters take more work: I have to derive the intrinsic parameters of a virtual camera from a physically existing, calibrated camera.

Intuitively, among the intrinsic parameters, the focal length and principal point of the resized image can be obtained by scaling them by the resizing ratio. For example, if the original camera image is 1280 x 960 and the resized image is 320 x 240, the ratio is 1/4, and the focal length and principal point are scaled by 1/4 as well.
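Written out as a sketch, with s denoting the resizing ratio (s = 1/4 in the example) and primes marking the parameters of the resized image. The notation is mine, and zero skew is assumed:

```latex
f_x' = s\,f_x, \qquad f_y' = s\,f_y, \qquad c_x' = s\,c_x, \qquad c_y' = s\,c_y

K' =
\begin{pmatrix} s & 0 & 0 \\ 0 & s & 0 \\ 0 & 0 & 1 \end{pmatrix} K,
\qquad
K =
\begin{pmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{pmatrix}
```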

What about the distortion factors? Should I scale them like the focal length and principal point? I posted the following question on the visual SLAM community board on Google+, and José Jerónimo Moreira Rodrigues gave me an answer saying “distortion factors remain the same due to their definitions”. Thanks, Rodrigues.

My question :

Let’s say I have chessboard images of width w and height h. (Note that w and h are not the physical size of the chessboard but the size of the images of the chessboard.)

After (Zhang’s) camera calibration, the resulting intrinsic camera parameters are fx (focal length x), fy (focal length y), cx (principal point x), cy (principal point y), and k1 to k5 (the 1st to 5th distortion factors).

If I resize the chessboard images by half, so that the width and height of the resized images are w/2 and h/2, and run camera calibration again, I would theoretically expect to get fx/2, fy/2, cx/2 and cy/2 as the focal lengths and principal point.

How about the distortion factors? What should I theoretically expect for the 1st to 5th distortion factors of the resized chessboard images, in terms of k1, k2, k3, k4 and k5?

Rodrigues :

They don't change. They are a function of X/Z and Y/Z. Check http://www.vision.caltech.edu/bouguetj/calib_doc/htmls/parameters.html. Only the intrinsic matrix changes, not the distortion coefficients.
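This answer can be illustrated with a small sketch of the projection model described on the Caltech page: radial and tangential distortion are applied to the normalized coordinates x = X/Z and y = Y/Z, and only afterwards do the focal length and principal point turn the result into pixels. The function name and the test point are made up for illustration, zero skew is assumed, and the parameter values are the 640 x 480 calibration results reported in this post.

```python
def project(point, fc, cc, kc):
    """Project a 3D point in camera coordinates to pixel coordinates."""
    X, Y, Z = point
    x, y = X / Z, Y / Z                      # normalized image coordinates
    r2 = x * x + y * y
    radial = 1 + kc[0] * r2 + kc[1] * r2 ** 2 + kc[4] * r2 ** 3
    dx = 2 * kc[2] * x * y + kc[3] * (r2 + 2 * x * x)   # tangential part
    dy = kc[2] * (r2 + 2 * y * y) + 2 * kc[3] * x * y
    xd, yd = radial * x + dx, radial * y + dy            # distorted, unitless
    return fc[0] * xd + cc[0], fc[1] * yd + cc[1]        # pixels


fc = (536.93593, 536.52653)
cc = (326.87081, 249.24606)
kc = (-0.28990, 0.10780, -0.00079, 0.00017, 0.00000)
point = (0.1, -0.05, 1.0)  # hypothetical 3D point in front of the camera

u_full, v_full = project(point, fc, cc, kc)

# Half-size image: scale fc and cc by 1/2, keep kc untouched.
fc_half = tuple(f / 2 for f in fc)
cc_half = tuple(c / 2 for c in cc)
u_half, v_half = project(point, fc_half, cc_half, kc)

print(u_full / 2, u_half)
print(v_full / 2, v_half)
```

The distortion step never sees a pixel unit, so halving fc and cc alone moves every projected point to exactly half its original pixel coordinates.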

An experiment using the Camera Calibration Toolbox for Matlab with the chessboard images in the OpenCV example folder also shows that the distortion factors do NOT change.

I ran the experiment by making two sets of images (1280 x 960 and 320 x 240) from the original chessboard images (640 x 480) in the OpenCV example folder. The results are:

1280 x 960 (double)

Focal Length: fc = [ 1074.34831 1073.83567 ]

Principal point: cc = [ 653.70628 499.16838 ]

Distortion: kc = [ -0.28805 0.10555 -0.00083 0.00033 0.00000 ]

640 x 480 (original)

Focal Length: fc = [ 536.93593 536.52653 ]

Principal point: cc = [ 326.87081 249.24606 ]

Distortion: kc = [ -0.28990 0.10780 -0.00079 0.00017 0.00000 ]

320 x 240 (half)

Focal Length: fc = [ 268.65196 268.42076 ]

Principal point: cc = [ 162.95051 124.58663 ]

Distortion: kc = [ -0.29024 0.10502 -0.00072 -0.00004 0.00000 ]
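To put numbers on "do not change", the three result sets can be compared directly. The sketch below (variable names are mine) predicts fc and cc for each image size by scaling the 640 x 480 values, and measures how far the calibrated values deviate from that prediction and how much kc drifts between runs.

```python
# Toolbox results copied from the three calibration runs above,
# keyed by the resize factor relative to the 640 x 480 originals.
results = {
    2.0: {"fc": (1074.34831, 1073.83567), "cc": (653.70628, 499.16838),
          "kc": (-0.28805, 0.10555, -0.00083, 0.00033, 0.00000)},
    1.0: {"fc": (536.93593, 536.52653), "cc": (326.87081, 249.24606),
          "kc": (-0.28990, 0.10780, -0.00079, 0.00017, 0.00000)},
    0.5: {"fc": (268.65196, 268.42076), "cc": (162.95051, 124.58663),
          "kc": (-0.29024, 0.10502, -0.00072, -0.00004, 0.00000)},
}

base = results[1.0]
max_rel_err = 0.0   # worst relative error of "scaled original" vs calibrated
max_kc_diff = 0.0   # worst absolute drift of a distortion coefficient

for scale, run in results.items():
    for key in ("fc", "cc"):
        for measured, original in zip(run[key], base[key]):
            predicted = scale * original
            max_rel_err = max(max_rel_err, abs(measured - predicted) / predicted)
    for measured, original in zip(run["kc"], base["kc"]):
        max_kc_diff = max(max_kc_diff, abs(measured - original))

print("max relative error of scaled fc/cc:", max_rel_err)
print("max absolute change in kc:", max_kc_diff)
```

On these numbers the scaled focal lengths and principal points agree with the calibrated ones to well under 1%, and the distortion coefficients differ only in the third decimal place.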

…

## Role of size of calibration pattern in camera calibration

When using the Camera Calibration Toolbox or OpenCV camera calibration, we are asked to provide the width and height of a grid square on the chessboard. So we measure the lengths with a ruler (in millimeters or meters) and type them into the program.

Suppose the measured width and height of a square are both 25 mm. What happens if we give the program 2500 mm and 2500 mm instead (wrong values, a hundred times bigger than the real size)? The answer is that it makes no difference to the intrinsic parameter values, which implies that the unit of the focal length output by the calibration software above is NOT mm or meters BUT pixels. (It actually affects only the translation, which is part of the extrinsic parameters of each camera frame and an unimportant calibration result.) This also implies that if the image size (resolution) is large enough, it is OK to use a small chessboard such as one printed on a sheet of A4 paper.

This also means that the size is defined only up to scale. That is, when the measured width and height are 25 mm and 50 mm respectively, it is okay to provide any scaled pair such as (250, 500), (25000, 50000), (1, 2) or (0.001, 0.002), as long as the ratio of width to height is the same as that of the real size (here 1 : 2).
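The reason can be seen from the pinhole projection itself. In the following sketch (all numbers and names are hypothetical, and lens distortion is left out), a board declared 100 times too large and placed 100 times farther away lands on exactly the same pixels, with the same rotation and the same intrinsic parameters:

```python
import math


def project(point, rotation, translation, fx, fy, cx, cy):
    """Pinhole projection of a 3D board point into pixel coordinates."""
    cam = [sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
           for i in range(3)]
    return fx * cam[0] / cam[2] + cx, fy * cam[1] / cam[2] + cy


def board_corners(cols, rows, square_w, square_h):
    """Corner positions of a planar chessboard lying in the Z = 0 plane."""
    return [(c * square_w, r * square_h, 0.0)
            for r in range(rows) for c in range(cols)]


angle = math.radians(20.0)                   # board tilted about the Y axis
R = [[math.cos(angle), 0.0, math.sin(angle)],
     [0.0, 1.0, 0.0],
     [-math.sin(angle), 0.0, math.cos(angle)]]
t_mm = [-50.0, -40.0, 400.0]                 # board about 40 cm from the camera
fx, fy, cx, cy = 536.9, 536.5, 326.9, 249.2  # intrinsics in pixels

true_pixels = [project(p, R, t_mm, fx, fy, cx, cy)
               for p in board_corners(4, 3, 25.0, 25.0)]

# Same images explained by a board declared 100x too large: only t scales.
scale = 100.0
t_scaled = [scale * v for v in t_mm]
scaled_pixels = [project(p, R, t_scaled, fx, fy, cx, cy)
                 for p in board_corners(4, 3, 2500.0, 2500.0)]

max_diff = max(abs(a - b)
               for pa, pb in zip(true_pixels, scaled_pixels)
               for a, b in zip(pa, pb))
print(max_diff)
```

Because the scale factor cancels in X/Z and Y/Z, the calibration can absorb a wrong square size entirely into the translation.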

What if we provide width and height values whose ratio differs from the measured one? In other words, if we provide 250 and 750 as the width and height of a square (ratio 1 : 3), the program will not run properly: it will either complain or return very strange intrinsic parameters. This is because a chessboard with a 1 : 3 grid ratio could not have produced the images (of a chessboard with ratio 1 : 2) taken by the camera being calibrated.

## OpenCV-related tips

- To get the same results on PC and Android platforms from the same source code:
- For the matrix inverse function inv(), there are several options, namely DECOMP_SVD, DECOMP_EIG, DECOMP_LU and DECOMP_CHOLESKY.

If you want the inversion results on PC and Android to be the same, you should use either DECOMP_SVD or DECOMP_EIG.

However, DECOMP_SVD and DECOMP_EIG also give results that differ from each other for the same input. For my work, DECOMP_SVD seems to give better results than DECOMP_EIG.

As for speed, I measured the processing times of DECOMP_LU, DECOMP_SVD and DECOMP_EIG and found no significant difference in their running times.

- For arc-tangent, there is an OpenCV function fastAtan2().

If you want the arc-tangent return values on PC and Android to be the same, you should use fastAtan2() instead of atan2() from “math.h”.
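One thing to keep in mind when swapping the two: fastAtan2() returns the angle in degrees in the range 0 to 360, while atan2() returns radians in (-pi, pi]. The OpenCV documentation also quotes an accuracy of about 0.3 degrees for fastAtan2(), so the two are not expected to match exactly. A minimal sketch in Python (the helper name is mine) of the conversion needed before comparing an atan2() value against a fastAtan2() value:

```python
import math


def atan2_degrees(y, x):
    """Angle of the vector (x, y) in degrees, mapped to the range [0, 360)."""
    angle = math.degrees(math.atan2(y, x)) % 360.0
    # A tiny negative angle can round up to exactly 360.0; fold it back to 0.
    return 0.0 if angle >= 360.0 else angle


print(atan2_degrees(1.0, 1.0))     # first quadrant
print(atan2_degrees(-1.0, -1.0))   # third quadrant, negative in radians
```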

As for speed, I compared atan2() and fastAtan2() and got strange results. In Debug mode, fastAtan2() is faster than atan2(), as the name implies. In Release mode, however, atan2() is much faster than fastAtan2(), taking almost 0 milliseconds. (I might have done something wrong.)

- For random number generation, there are the functions rand() and srand() from “stdlib.h”. OpenCV also provides the random number generator class cv::RNG.

Sometimes you have to compare the (intermediate) results of the PC and Android versions of the same source code, so you should remove the randomness from your code. In other words, if you want the sequences of random numbers generated on PC and Android to be fixed and identical, you should use cv::RNG with a fixed seed, such as RNG(1234567), instead of rand() from “stdlib.h”. Even if you give srand() a fixed seed, as in srand(1234567), you get a different fixed sequence on each platform: the sequence is the same on every run on PC, and the same on every run on Android, but the PC and Android sequences differ from each other. To get the same fixed random sequence from the same source code, you should use cv::RNG as in the following example.

    cv::RNG rng(12378213);
    float randomF = rng.uniform(5.f, 45.f);
    int randomI = rng.uniform(5, 45);
    double randomD = rng.uniform((double)5, (double)45);
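The underlying reason is that the C standard does not specify which algorithm rand() uses, so each platform's C library is free to pick its own, whereas cv::RNG carries its algorithm with the OpenCV source. The following sketch illustrates the idea with a tiny self-contained generator in Python. It is NOT the algorithm used by cv::RNG; it is a plain 64-bit linear congruential generator (using the constants of Knuth's MMIX generator) chosen only because it is short, and the class name is hypothetical.

```python
class TinyLcg:
    """Illustrative 64-bit linear congruential generator (not cv::RNG)."""

    MULTIPLIER = 6364136223846793005   # constants of Knuth's MMIX generator
    INCREMENT = 1442695040888963407
    MASK = (1 << 64) - 1

    def __init__(self, seed):
        self.state = seed & self.MASK

    def next_u64(self):
        # Every step is plain integer arithmetic, identical on any platform.
        self.state = (self.state * self.MULTIPLIER + self.INCREMENT) & self.MASK
        return self.state

    def uniform(self, low, high):
        # Keep the top 53 bits so the fraction is an exact double below 1.
        fraction = (self.next_u64() >> 11) / float(1 << 53)
        return low + (high - low) * fraction


rng_a = TinyLcg(12378213)
rng_b = TinyLcg(12378213)
seq_a = [rng_a.uniform(5.0, 45.0) for _ in range(5)]
seq_b = [rng_b.uniform(5.0, 45.0) for _ in range(5)]
print(seq_a == seq_b)  # prints True: same seed, same sequence
```

Since the whole generator is defined in the source code, the sequence depends only on the seed and not on the platform's C library.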

In addition, some OpenCV functions, such as cv::solvePnPRansac(), use a random number generator internally. If you want such a function to give the same fixed results on every run, you should set a fixed seed as follows.

    cv::theRNG().state = 1234567;
    cv::solvePnPRansac(blah, blah, blah, ...);

- About reading a gray image file.

There are several options for the function cv::imread(), namely

    IMREAD_UNCHANGED = -1,
    IMREAD_GRAYSCALE = 0,
    IMREAD_COLOR     = 1,
    IMREAD_ANYDEPTH  = 2,
    IMREAD_ANYCOLOR  = 4

and the default is IMREAD_COLOR.

If you read a gray image file with the default option IMREAD_COLOR, the resulting Mat (say matCOLOR) has 3 channels, like a color image. If you split matCOLOR into its 3 channels (blue, green and red) with the function cv::split(), the resulting blue, green and red matrices (say matBlueFromCOLOR, matGreenFromCOLOR and matRedFromCOLOR respectively) are all identical. If you convert matCOLOR to a gray image (say matGrayFromCOLOR) using cv::cvtColor() with CV_BGR2GRAY, matGrayFromCOLOR is also identical to the blue, green and red matrices.

If you read the gray image file with the option IMREAD_GRAYSCALE or IMREAD_UNCHANGED, the resulting Mat (say matGRAYSCALE or matUNCHANGED respectively) is also identical to the converted Mat matGrayFromCOLOR.

In short, for a gray image file, the following Mats are all identical.

matBlueFromCOLOR, matGreenFromCOLOR, matRedFromCOLOR,

matGRAYSCALE, matUNCHANGED, matGrayFromCOLOR
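That the converted image matches the individual channels follows from the weights of the BGR-to-gray formula given in the OpenCV documentation, Y = 0.299 R + 0.587 G + 0.114 B, which add up to 1. The sketch below checks this in plain Python for every 8-bit gray level; it mirrors the formula only and is not OpenCV's actual (fixed-point) implementation, and the function name is mine.

```python
def bgr_to_gray(b, g, r):
    """Weighted sum from the documented BGR-to-gray formula, rounded to an integer."""
    return round(0.114 * b + 0.587 * g + 0.299 * r)


# For a gray pixel all three channels hold the same value v.
mismatches = [v for v in range(256) if bgr_to_gray(v, v, v) != v]
print(mismatches)  # an empty list means every gray level maps to itself
```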