Camera Calibration Using Catenary

In this work, we introduce a novel calibration technique based on a hanging chain curve that replaces checkerboard-based methods. It is a known physical phenomenon that a hanging chain or a flexible rope under gravity can be modeled by a special curve called a catenary. Therefore, instead of the commonly used planar calibrator, we propose using multiple shots of a catenary-shaped chain for calibration. This approach can solve the out-of-focus problem faced in checkerboard calibration when the board is not large enough. Enlarging a planar calibrator increases manufacturing time and cost, whereas a simple label chain can span a planar area as large and as precise as a rigid checkerboard, yet is easily foldable and transportable. We compare our approach against the widely used checkerboard-based calibration as well as state-of-the-art calibration methods, and show that catenary-based calibration is considerably more accurate than calibration with an A4-sized checkerboard and is competitive with the other approaches.


I. INTRODUCTION
Simply put, a camera is a device that maps the 3D space of the real world onto the 2D plane of the image sensor. This mapping is described mathematically by the camera model, which contains hidden parameters such as the focal length and the principal point. Estimating these hidden parameters is called camera calibration, which is used in many machine vision applications to detect and measure objects, for scene reconstruction, and for sensor fusion [1].
Although modern digital cameras are assembled by robotic arms, and even if they are produced on the same fabrication line, each camera has a unique parameter set and should be calibrated separately. Although several alternative methods exist, cameras are commonly calibrated by a method proposed by Sturm and Maybank [2] and popularized by Zhang [3] two decades ago. This method is based on capturing a checkerboard multiple times in arbitrary orientations with respect to the camera. Although these orientations are unknown, the internal camera parameters are common to all shots, so the orientations can be estimated and the internal parameters can be computed. Thanks to its convenience, this pioneering method has become so popular that it dominates photogrammetric applications unless extreme precision is needed. Nevertheless, camera and sensor technology has advanced considerably in the last two decades. When Zhang's article was published, cameras with analog output were popular, and their resolution was around 0.3 MP. The article suggested printing the planar calibrator on standard A4-sized paper. Standard inkjet printers had a resolution of 300 dpi, which corresponds to around 8 million dots on an A4 page. Therefore, the reference object was precise enough to calibrate a low-resolution camera.
Moreover, at such a low resolution, out-of-focus blur on an A4-sized calibrator was not a problem. Yet, imaging the calibrator with modern cameras suffers from a significant problem caused by depth of field. When the focus is set to infinity, the minimum distance at which a sharp image can be captured is called the hyperfocal distance, which grows with the resolution and the aperture. Modern camera lenses can be auto-focused on any target. Nevertheless, this motion of the lens also shifts the focal length. If the calibration is needed for photogrammetric purposes beyond the hyperfocal distance, calibration should also be performed with the focus set to infinity. In this case, the calibrator should also be placed beyond the hyperfocal distance. Otherwise, blur on the calibrator causes uncertainty in the marker locations and, consequently, errors in the calibration parameters.
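As a rough illustration of the depth-of-field argument, the widely used thin-lens approximation H ≈ f²/(N·c) can be evaluated directly, where f is the focal length, N the f-number and c the circle of confusion (often taken on the order of the pixel pitch). The function name and the numeric values below are hypothetical and are not taken from the experiments in this paper.

```python
def hyperfocal_distance(focal_length_m: float, f_number: float, coc_m: float) -> float:
    """Approximate hyperfocal distance H = f^2 / (N * c), all lengths in meters."""
    if focal_length_m <= 0 or f_number <= 0 or coc_m <= 0:
        raise ValueError("all arguments must be positive")
    return focal_length_m ** 2 / (f_number * coc_m)


# Hypothetical lens: 4 mm focal length, f/2.0, 2 micrometer circle of confusion.
h_example = hyperfocal_distance(4e-3, 2.0, 2e-6)

# A smaller circle of confusion (finer sampling) pushes H farther away.
h_finer = hyperfocal_distance(4e-3, 2.0, 1e-6)
```

Under these assumed values the calibrator would have to sit several meters from the camera to be sharp at infinity focus, which is the situation that motivates a large calibrator.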
Nevertheless, if the calibrator does not occupy roughly the whole field of view, calibration also fails, because the estimation of the perspective parameters is ill-conditioned when the inputs are taken from a local window instead of the whole picture. Naturally, this problem can be overcome by enlarging the calibrator. Considering the resolution and the aperture of modern cameras, the calibrator should be placed one meter away and have an area of roughly 1 m² for a satisfactory calibration. Sticking an A4-sized printout on a rigid planar surface is not difficult, but for a larger calibrator the planarity of the coated rigid surface becomes a problem, and a 1 m² rigid board is not easy to carry around. Inevitably, solving the defocus problem requires enlarging the calibrator, which means higher production time and cost as well as storage and mobility problems.
With the aim of overcoming the disadvantages of the larger calibrator, we propose to use a chain that would easily fit into one's pocket, but would also cover a large area when it is hanging. It is a physical phenomenon that a hanging chain or a flexible rope under gravity fits a special curve called catenary, and forms a planar curve whose parametric equation is known. From this point of view, we propose a novel calibration method with a hanging chain as shown in Fig. 1, which can be used to construct a large planar structure without the hassles listed above.
The contributions of this paper can be summarized as follows: (i) We propose a new calibrator that is shaped by a physical phenomenon, is naturally planar due to the laws of physics, and is easily foldable and transportable. (ii) Our proposed approach provides a solution to the out-of-focus problem, which is seen in lenses that are focused at a large distance and have a reduced depth of focus. Modern cameras in mobile devices have higher resolution and larger apertures, which forces the calibrator to be manufactured in larger dimensions. The proposed calibrator is practical and can be bought off the shelf, which reduces the manufacturing time and cost. (iii) We experiment with a camera having a fixed-focus lens and compare the proposed calibrator with checkerboards of two distinct sizes: A4-sized and 1 m². By taking the larger checkerboard as ground truth, we show that the proposed calibrator gives more accurate values than the smaller checkerboard. (iv) We also compare the catenary approach to several state-of-the-art models and show its competitiveness.

II. LITERATURE REVIEW
Camera calibration methods are separated into two archetypical approaches [4]: photogrammetric [5] and self-calibration [6]. The first approach is based on a simple principle: a model, whose parameters are hidden, maps the 3D points of the world onto the 2D points of the image. If a sufficient number of point pairs are collected, these hidden parameters can be estimated. For that purpose, a special tool called a calibrator, whose dimensions are well known, has to be manufactured. Markers on the calibrator allow selected points to be detected in the image. In contrast, the self-calibration approach does not need a calibrator. In this approach, arbitrary shots are taken of a static scene. Between shots, the camera rotates by random angles, which causes warping due to the perspective transformation. Although these rotation angles are unknown, the hidden parameters can be estimated because they are common to all shots. The pros and cons of these approaches are clear: photogrammetric methods are accurate and scene independent, whereas self-calibration is imprecise and highly scene dependent. On the other hand, the photogrammetric method needs a specially manufactured calibrator, whereas self-calibration needs nothing except the captured scene. The recent deep learning approaches such as [7] and [8] also fall into this category. In this work, we focus our attention only on the photogrammetric methods.
More than two decades ago, a renowned paper by Zhang [3] blended the two approaches by saving the advantages and discarding the disadvantages of these aforementioned methods. Manufacturing a 3D calibrator with high accuracy is fairly costly and time-consuming. Instead of a 3D object with markers on certain points, Zhang proposed using a 2D calibrator which can be easily produced by printing on paper, and estimated the parameters from the several shots taken with arbitrary directions with respect to the calibrator plane.
Nevertheless, with the development of camera technology and the increase in resolution, defocus aberration in capturing the calibrator has become an issue. Between two iPhone models released five years apart (7 vs. 13), the hyperfocal distance has roughly doubled due to larger pixels on the sensor and a larger aperture on the lens [9], [10]. Recent studies that follow the planar calibrator approach have tackled the focus problem. Ha et al. [11] and Bell et al. [12] proposed methods that use a smartphone displaying special patterns to overcome defocus. Later, in an inspiring paper, Chen et al. [13] proposed a novel calibration method which utilizes Zhang's algorithm but gets rid of the printed planar calibrator. Instead, the parabolic trajectory of a bouncing ball is used. A bouncing ball follows a planar trajectory in space whose time function is well known. Therefore, the markers on a planar calibrator in Zhang's method were replaced by tracking a bouncing ball in a video sequence. Sturm and Quan [14] also proposed a similar calibration method by following a geometric approach. In the last decade, other novel camera calibration methods have also been proposed. Wong et al. [15] used the fact that the projection of a sphere is an ellipse to calibrate a camera with balls. Su et al. [16] used a special calibrator consisting of a grid of spheres and modeled the projection of each sphere as an ellipse. Chen [17] used four planar points and an additional non-planar point, which are available in sports arenas. Shen and Hornsey [18] used a special calibrator having spheres with distinct colors to calibrate a multi-camera system. Liu et al. [19] used a minimal planar graph and followed a geometric approach to estimate calibration parameters. Wang and Wan [20] used a simple planar graph consisting of only two vertical lines as a calibrator. Fu et al. [21] used a linear wand with three LEDs as a calibrator. Kong et al. [22] used vertical plumb lines to calibrate a multi-camera system. Lu and Chuang [23] plotted lines on a flat monitor and estimated the projection between the image and monitor planes for multiple shots to calibrate the camera.
In this work, we propose a novel calibration method which is based on a physical phenomenon. It is known that a hanging chain or a flexible rope under gravity fits a special curve called catenary. Therefore, instead of a planar calibrator, multiple shots of the catenary-shaped chain can be used for calibration. This method can solve the "out-of-focus" problem which is faced with the checkerboard unless the size of the board is large enough. Hence, we propose an inexpensive and practical solution that enables precise calibration.

III. METHOD
In this section, we first describe the catenary curve model and how to estimate the camera parameters using the catenary as the calibrator. Then we discuss what would happen if the two ends of the calibrator were not aligned properly and offer a solution to handle it.

A. Catenary Curve
Under constant gravity, a chain having infinitesimally small links, or an ideally flexible rope, takes the shape of a catenary curve. Algebraically, the curve is formulated via the hyperbolic cosine function on the Cartesian plane. This function is a solution of the differential equation derived from the free body diagram of an infinitesimally small link of the chain [24], given as:
$$ y = a \cosh\left(\frac{x - c_1}{a}\right) + c_2 \tag{1} $$
where $x$ and $y$ correspond to the horizontal and vertical components of the curve, and $c_1$ and $c_2$ are constants which shift the solution horizontally and vertically, so they can be set to 0. The shape of the curve is determined only by $a$. A wider catenary curve has a larger $a$ parameter, and vice versa. We note that the equation is independent of the mass of the links and, therefore, of the material of the chain. Initially, the chain is marked at equal intervals so that the marks take the role of the markers on the checkerboard. To formulate the locations of the markers, it is sufficient to integrate the differential lengths over the curve formulated in Eq. 1 [24]. Therefore, the horizontal and vertical components $x$ and $y$ of the curve can be parameterized in terms of the length $u$ from the bottom point as in Eqs. 2 and 3:
$$ x = a \sinh^{-1}\left(\frac{u}{a}\right) \tag{2} $$
$$ y = \sqrt{a^2 + u^2} \tag{3} $$
Then, we need to compute the locations of the marked links using Eqs. 2 and 3. For that purpose, we compute the unknown parameter $a$ using two constraints: the length of the chain and the aperture, i.e. the distance between the two ends of the chain, under the assumption that the two ends are at the same height. This assumption is discussed further in Sec. III-C. These inputs can be measured with a tape measure on the experimental setup. Let $2l$ and $2w$ be the total length of the chain and the aperture, respectively. We can substitute $w$ for $x$ and $l$ for $u$ in Eq. 2, and obtain Eq. 4, which must be satisfied by the unknown parameter $a$:
$$ a \sinh^{-1}\left(\frac{l}{a}\right) = w \tag{4} $$
The positive root of Eq. 4 can be computed using Newton's method.
Fig. 2: Warping between the calibrator and the image plane
After obtaining parameter a, coordinates of marked links can be computed using Eqs. 2 and 3 by substituting evenly spaced numbers over the interval of [−l, l] for u.
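The computation of the parameter a and of the marker coordinates can be sketched as follows. This is a minimal illustration that assumes the standard catenary parameterization x = a·asinh(u/a), y = sqrt(a² + u²) with the lowest point at u = 0; the function names, the starting guess and the tolerance are our own choices, not taken from the original implementation.

```python
import math


def solve_catenary_a(l: float, w: float, tol: float = 1e-12, max_iter: int = 100) -> float:
    """Solve a * asinh(l / a) = w for a > 0 with Newton's method.

    l is half the chain length and w is half the distance between the ends.
    """
    if not 0.0 < w < l:
        raise ValueError("need 0 < w < l (the chain must be longer than the span)")
    a = 1e-3 * l  # assumed starting guess
    # Move the guess to the left of the root; f is increasing and concave,
    # so Newton iterates then approach the root monotonically from below.
    while a * math.asinh(l / a) >= w:
        a *= 0.5
    for _ in range(max_iter):
        f = a * math.asinh(l / a) - w
        df = math.asinh(l / a) - l / math.hypot(a, l)
        step = f / df
        a -= step
        if abs(step) < tol:
            break
    return a


def marker_coordinates(l: float, w: float, n_markers: int):
    """Return a and the (x, y) locations of n evenly spaced marks on the chain."""
    a = solve_catenary_a(l, w)
    coords = []
    for i in range(n_markers):
        u = -l + 2.0 * l * i / (n_markers - 1)  # arc length from the lowest point
        coords.append((a * math.asinh(u / a), math.hypot(a, u)))
    return a, coords


# Values matching the setup described later: 2 m chain, 0.9 m aperture, 13 marks.
a, markers = marker_coordinates(l=1.0, w=0.45, n_markers=13)
```

With these inputs the two outermost marks land at x = ±0.45 m and the middle mark sits at the lowest point of the curve, (0, a).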
Pixel coordinates of the projections of the same points on the image plane can also be selected manually. Therefore, we obtain two sets of coordinates that belong to the real world and the image plane, as seen in Fig. 2. We expect these markers to take the place of the crosses on the checkerboard so that Zhang's method can be followed algebraically. Just as these crosses help to estimate the warp matrix $H$ between the checkerboard and the image planes, the markers on the chain can play the same role. The warping transformation can be formulated as in Eq. 5:
$$ s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = H \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \tag{5} $$
where $(u, v)$ pairs are pixel coordinates, $(x, y)$ pairs are real-world coordinates, and $s$ represents the scale factor of the projection. At least four points are needed to estimate the warp matrix $H$; with more points, $H$ is estimated using least squares estimation (LSE). The importance of the warp matrix for calibration is explained in the next subsection.
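One common way to carry out this least squares estimate is the direct linear transform (DLT), sketched below with NumPy. The function names and the synthetic points are illustrative assumptions; the original implementation may normalize the coordinates or use a different solver.

```python
import numpy as np


def estimate_homography(world_xy, image_uv):
    """Least-squares warp matrix H (3x3, scaled so H[2, 2] = 1) via the DLT."""
    world_xy = np.asarray(world_xy, dtype=float)
    image_uv = np.asarray(image_uv, dtype=float)
    if world_xy.shape != image_uv.shape or world_xy.shape[0] < 4:
        raise ValueError("need at least 4 matching point pairs")
    rows = []
    for (x, y), (u, v) in zip(world_xy, image_uv):
        # Each pair gives two linear equations in the 9 entries of H.
        rows.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y, -u])
        rows.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y, -v])
    # The right singular vector of the smallest singular value minimizes ||M h||.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]  # assumes H[2, 2] is not zero


def apply_homography(H, world_xy):
    """Project planar points with H and divide by the scale factor s."""
    pts = np.column_stack([world_xy, np.ones(len(world_xy))])
    proj = pts @ np.asarray(H).T
    return proj[:, :2] / proj[:, 2:3]


# Synthetic check with a hypothetical warp matrix and six non-collinear points.
H_true = np.array([[800.0, 20.0, 320.0], [-10.0, 780.0, 240.0], [0.05, 0.02, 1.0]])
world = np.array([[-0.45, 1.0], [-0.2, 0.4], [0.0, 0.2], [0.2, 0.4], [0.45, 1.0], [0.1, 0.7]])
image = apply_homography(H_true, world)
H_est = estimate_homography(world, image)
```

On noise-free data the recovered matrix agrees with the one used to generate the points; with manually selected pixel coordinates, the residual of this fit is the projection error used later in the leveling search.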

B. Internal Parameter Estimation
After estimating a warp matrix for each shot, we can perform internal parameter estimation as explained in [2], [3]. To compute the internal camera parameters, the pin-hole camera model is formulated as a projective transformation of the 3D real world onto the 2D image plane as:
$$ s\,m = A \begin{bmatrix} R & t \end{bmatrix} X, \qquad A = \begin{bmatrix} f_u & \gamma & u_0 \\ 0 & f_v & v_0 \\ 0 & 0 & 1 \end{bmatrix} \tag{6} $$
where $m = [u\ v\ 1]^T$ and $X = [x\ y\ z\ 1]^T$ are homogeneous coordinates in the image plane and the real world, respectively. The internal parameters contained in matrix $A$, namely $f_u$, $f_v$, $\gamma$, and $[u_0\ v_0]^T$, are the horizontal and vertical focal lengths, the skew, and the principal point, respectively. The external parameters are the rotation matrix $R$ and the translation vector $t$. Finally, $s$ represents the scale factor of the projection. Warping between the calibrator plane and the image plane can also be formulated as in Eq. 7 by taking the calibrator plane as the $z = 0$ plane:
$$ s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = A \begin{bmatrix} r_1 & r_2 & t \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}, \qquad H = A \begin{bmatrix} r_1 & r_2 & t \end{bmatrix} \tag{7} $$
where $r_1$ and $r_2$ are the first two columns of the rotation matrix $R$. Hence, each estimated warp matrix $H$ can be written explicitly, up to scale, in terms of a common matrix $A$ which carries the internal parameters and the distinct external parameters as in Eq. 7.
The two properties of the columns $r_1$ and $r_2$ of the rotation matrix, orthogonality and equal norm, lead to two equations which constrain the first two columns $h_1$ and $h_2$ of the warp matrix $H$:
$$ h_1^T A^{-T} A^{-1} h_2 = 0, \qquad h_1^T A^{-T} A^{-1} h_1 = h_2^T A^{-T} A^{-1} h_2 \tag{8} $$
Hence, two constraints are obtained for each calibrator capture. Since the matrix A contains five unknown internal parameters, at least three image captures are needed to calculate them. The algebraic derivation summarized so far is widely known as Zhang's method [3], although Sturm and Maybank had presented the same approach earlier in a conference paper [2]. To minimize the projection error, a nonlinear optimization based on maximum likelihood estimation is needed, which also enables estimation of the radial distortion parameters that belong to the nonlinear part of the camera model [3].
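The closed-form step of this derivation can be sketched as follows, using the standard formulas from Zhang [3] with B = A^{-T} A^{-1} defined up to scale. This is a minimal sketch without coordinate normalization, distortion or nonlinear refinement; the helper names, the synthetic poses and the intrinsic values are hypothetical.

```python
import numpy as np


def _v(H, i, j):
    """Vector v_ij such that h_i^T B h_j = v_ij . b, with b the 6 unique entries of B."""
    hi, hj = H[:, i], H[:, j]
    return np.array([
        hi[0] * hj[0],
        hi[0] * hj[1] + hi[1] * hj[0],
        hi[1] * hj[1],
        hi[2] * hj[0] + hi[0] * hj[2],
        hi[2] * hj[1] + hi[1] * hj[2],
        hi[2] * hj[2],
    ])


def intrinsics_from_homographies(homographies):
    """Closed-form internal matrix A from at least three warp matrices."""
    if len(homographies) < 3:
        raise ValueError("at least three warp matrices are required")
    rows = []
    for H in homographies:
        H = np.asarray(H, dtype=float)
        rows.append(_v(H, 0, 1))                # orthogonality of r1 and r2
        rows.append(_v(H, 0, 0) - _v(H, 1, 1))  # equal norm of r1 and r2
    _, _, vt = np.linalg.svd(np.asarray(rows))
    b11, b12, b22, b13, b23, b33 = vt[-1]
    den = b11 * b22 - b12 ** 2
    v0 = (b12 * b13 - b11 * b23) / den
    lam = b33 - (b13 ** 2 + v0 * (b12 * b13 - b11 * b23)) / b11
    fu = np.sqrt(lam / b11)
    fv = np.sqrt(lam * b11 / den)
    gamma = -b12 * fu ** 2 * fv / lam
    u0 = gamma * v0 / fv - b13 * fu ** 2 / lam
    return np.array([[fu, gamma, u0], [0.0, fv, v0], [0.0, 0.0, 1.0]])


def rotation(axis, angle_deg):
    """Rotation matrix from an axis and an angle (Rodrigues' formula)."""
    k = np.asarray(axis, dtype=float) / np.linalg.norm(axis)
    K = np.array([[0.0, -k[2], k[1]], [k[2], 0.0, -k[0]], [-k[1], k[0], 0.0]])
    t = np.deg2rad(angle_deg)
    return np.eye(3) + np.sin(t) * K + (1.0 - np.cos(t)) * (K @ K)


# Synthetic check: hypothetical intrinsics and four differently tilted planes.
A_true = np.array([[1200.0, 0.5, 640.0], [0.0, 1190.0, 360.0], [0.0, 0.0, 1.0]])
poses = [([1, 0, 0], 20.0, [0.1, -0.2, 2.0]), ([0, 1, 0], -25.0, [-0.3, 0.1, 2.5]),
         ([1, 1, 0], 30.0, [0.2, 0.2, 3.0]), ([1, -1, 0.5], 15.0, [0.0, -0.1, 2.2])]
homographies = []
for k, (axis, angle, t) in enumerate(poses):
    R = rotation(axis, angle)
    H = A_true @ np.column_stack([R[:, 0], R[:, 1], t])
    homographies.append((k + 2.0) * H)  # warp matrices are only known up to scale
A_est = intrinsics_from_homographies(homographies)
```

In practice this closed-form estimate would only serve as the starting point of the nonlinear refinement mentioned above.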

C. Leveling Problem
As mentioned in Sec. III-A, we assume that the two ends of the catenary-formed chain are at the same height. Nevertheless, satisfying this condition can be impractical. Hence, a method that relaxes this condition is needed. At this point, the perspective transformation between the image and the calibrator planes plays a key role. If a level difference exists, the locations of the markers shift, as seen in Fig. 3. If the level difference is neglected, the computed locations are erroneous. Hence, the perspective estimation between the image coordinates of the markers and the erroneous marker locations on the calibrator plane also produces a greater estimation error, because the mathematical model does not fit the physical setup. To overcome this problem, a hypothetical level difference can be assumed. If the hypothetical level difference is close to the true value, a smaller estimation error should be expected. Therefore, we search for the hypothetical level difference that gives the smallest estimation error.
Marker locations $X_i(p)$ can be expressed as a function of the hypothetical shift $p$. Hence, the projective transformation between the image coordinates of the markers $x_i$ and the corresponding locations on the calibrator plane $X_i$ can be expressed as:
$$ x_i = H_{LSE}\, X_i(p) + e_i \tag{9} $$
and the optimal hypothetical shift $p^*$ can be computed as:
$$ p^* = \arg\min_p \sum_i \lVert e_i \rVert \tag{10} $$
Here $x_i$ and $X_i$ are homogeneous coordinates, where $i$ is the marker index. The estimated projection matrix $H_{LSE}$ produces estimation errors $e_i$. Our proposed method aims to minimize the total magnitude of the errors by searching for the optimal hypothetical shift $p$ as in Eq. 10. The optimal level difference is then obtained from the optimal shift using Eq. 3. To minimize this cost function, we use successive parabolic interpolation. The experiments regarding the leveling problem are discussed in Sec. IV-D.
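The search over the hypothetical shift can be sketched as below. Several points are assumptions made for this illustration rather than details confirmed by the text: the shift p is taken as an offset of the lowest point along the chain, the parameter a is re-solved for every p so that the horizontal span stays 2w, the cost is the mean squared projection error, and a golden-section search is swapped in for successive parabolic interpolation because it is simpler to make robust. All function names and numeric values are hypothetical.

```python
import numpy as np


def solve_a(l, w, p):
    """Newton solve of a * (asinh((l+p)/a) + asinh((l-p)/a)) = 2w for a > 0."""
    def g(a):
        return a * (np.arcsinh((l + p) / a) + np.arcsinh((l - p) / a)) - 2.0 * w
    a = 1e-3 * l
    while g(a) >= 0.0:  # start left of the root so iterates stay positive
        a *= 0.5
    for _ in range(100):
        dg = (np.arcsinh((l + p) / a) - (l + p) / np.hypot(a, l + p)
              + np.arcsinh((l - p) / a) - (l - p) / np.hypot(a, l - p))
        step = g(a) / dg
        a -= step
        if abs(step) < 1e-13:
            break
    return a


def markers(l, w, p, n):
    """Marker locations X_i(p) on the calibrator plane for a shift p."""
    a = solve_a(l, w, p)
    u = p + np.linspace(-l, l, n)  # arc length measured from the lowest point
    return np.column_stack([a * np.arcsinh(u / a), np.hypot(a, u)])


def fit_homography(X, x):
    """Least-squares warp matrix via the DLT."""
    rows = []
    for (px, py), (u, v) in zip(X, x):
        rows.append([px, py, 1.0, 0.0, 0.0, 0.0, -u * px, -u * py, -u])
        rows.append([0.0, 0.0, 0.0, px, py, 1.0, -v * px, -v * py, -v])
    return np.linalg.svd(np.asarray(rows))[2][-1].reshape(3, 3)


def project(H, X):
    q = np.column_stack([X, np.ones(len(X))]) @ H.T
    return q[:, :2] / q[:, 2:3]


def cost(p, l, w, image_pts):
    """Mean squared projection error for a hypothetical shift p."""
    X = markers(l, w, p, len(image_pts))
    H = fit_homography(X, image_pts)
    return float(np.mean(np.sum((project(H, X) - image_pts) ** 2, axis=1)))


def golden_section(f, lo, hi, tol=1e-7):
    """Minimize a unimodal function f on [lo, hi]."""
    r = (np.sqrt(5.0) - 1.0) / 2.0
    c, d = hi - r * (hi - lo), lo + r * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:
            hi, d, fd = d, c, fc
            c = hi - r * (hi - lo)
            fc = f(c)
        else:
            lo, c, fc = c, d, fd
            d = lo + r * (hi - lo)
            fd = f(d)
    return 0.5 * (lo + hi)


# Synthetic check: generate image points with a known shift, then recover it.
l, w, n, p_true = 1.0, 0.45, 13, 0.02
H_true = np.array([[800.0, 30.0, 640.0], [-20.0, 780.0, 100.0], [0.05, 0.10, 1.0]])
image_pts = project(H_true, markers(l, w, p_true, n))
p_est = golden_section(lambda p: cost(p, l, w, image_pts), -0.1, 0.1)
a_est = solve_a(l, w, p_est)
level_difference = np.hypot(a_est, l + p_est) - np.hypot(a_est, l - p_est)  # via Eq. 3
```

The last line converts the recovered shift into a height difference between the two ends by evaluating the vertical coordinate at both end points.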

IV. EXPERIMENTS
In this section, we describe the setup, perform our experiments on three calibrators and discuss the factors that affect the accuracy.

A. Experimental Setup
For the experiments, we used the Waveshare RPi (B) 2.0 camera module with 5 MP resolution, which is compatible with Raspberry Pi. Its focus is adjusted manually; hence, we ensured that the internal parameters of the camera stay fixed. For the chain, we preferred a two-meter-long label chain (Fig. 1a), as it has tiny links that move freely with respect to each other. We painted 13 links of the chain black at equal intervals so that the marker points can be detected easily in the images (Fig. 1b). The aperture between the two ends of the chain is set to 90 centimeters; hence, an area of roughly 1 m² is created, as seen in Fig. 4.
From the two sets of coordinates that belong to the chain plane and the image plane, a warp matrix can be estimated using LSE. Although a basic calibration is possible by following Zhang's algebraic derivation as summarized above, a more sophisticated calibration that also includes radial distortion requires a nonlinear optimization process that minimizes the projection error. In other words, the parameter set is searched to minimize the error between the detected image locations of the marked points and the projections of the same points computed from their locations on the calibrator plane.

B. Comparison of Calibrators
As mentioned above, the standard Zhang's method with an A4-sized printout suffers from the out-of-focus problem in modern cameras with high resolution and wide aperture. A simple solution is to enlarge the checkerboard and place it farther from the camera. Nevertheless, end users who need a quick camera calibration seldom attempt to construct such a large object. For that purpose, we propose a new calibrator that can take the place of the larger checkerboard: a two-meter-long chain alone can span a surface of roughly 1 m². Hence, in our experiments, we examined three calibrators, as shown in Fig. 4: an A4-sized checkerboard, a 1 m² checkerboard, and a two-meter-long chain.
We surveyed the novel calibration methods in the literature of the last decade [15], [17]-[19], [21], [22]. In these studies, Zhang's method [3] and Bouguet's implementation [25] were taken as ground truth. Hence, we also took the same method [3] and implementation [25] as ground truth, but with a 1 m² checkerboard. Using this ground truth, we compared the widely used A4-sized calibrator and our chain calibrator. Comparative results are given in Sec. IV-C. We also observed the out-of-focus problem in the images of the A4-sized calibrator, which was placed 30 to 40 centimeters away from the camera. Fig. 5 compares the sharpness of the markers of the two checkerboards.

C. Calibration Results
We captured 20 shots for each of the three calibrators. We took special care to capture the calibrators at orientations with respect to the camera coordinates that are as diverse as possible (Fig. 6) [26]. Because changing the orientation of the 1 m² checkerboard and the catenary with respect to the Earth is impractical, we only changed the orientation of the camera by rotating it roughly around its optical axis. We also used a tripod and a remote shutter release to avoid shaky images. The results obtained for each of the three calibrators are listed in Table I. As mentioned in Sec. IV-B, we took the results of the 1 m² checkerboard as ground truth. Then we measured the performance of the proposed method and of the A4-sized checkerboard by computing their relative errors with respect to the ground truth for each internal parameter: the horizontal and vertical components of the focal length and of the principal point. Our proposed method surpasses the A4-sized checkerboard in the estimation of every internal camera parameter. The reason for the failure of the A4-sized checkerboard is the out-of-focus problem. In particular, its larger error on the principal point is caused by blur on the markers, as seen in Fig. 5.
The results are also compared in Table II with the other novel calibration methods of the last decade [15], [17]-[19], [21], [22], described briefly in Sec. II. The performances of these methods are also measured with respect to Zhang's method. Nevertheless, these works do not report the dimensions of the checkerboard used for ground truth. We observed that, among the novel calibration methods given above, only the wand calibrator [21] achieves a precise calibration. The catenary performs as well as the wand and sometimes better.
In the methods named Spheres [15], Non-planar [18], Wand [21] and also our method Catenary, proposed calibrators have spherical shaped markers. Performance of the Wand and Catenary can be explained by the fact that smaller spherical markers produce smaller errors in the location of centers. The weakness of line-based methods, named Quadratic [19] and Plumb [22], is caused by the sparse distribution of the lines on the image while markers of previous methods cover the whole image. Lastly, the 5-Points [17] method uses arbitrary objects on the scene instead of a manufactured calibrator whose dimensions are known. Hence it shows the worst performance.
We also examined the estimation of the radial distortion parameters. Radial distortion is modeled as a polynomial, so it is parameterized by the coefficients of that polynomial. Nevertheless, for a fair comparison, instead of the estimated coefficients, we focused on the projection error of the camera models, including radial distortion, that correspond to the parameters estimated by the distinct calibration methods (Table III). Here, the mean projection error of the A4-sized checkerboard is twice that of the Catenary.
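For reference, a polynomial radial distortion model of the kind used in Zhang's method [3] can be sketched as follows. The two-coefficient form acting on normalized image coordinates is an assumption here, since the number of coefficients used in the experiments is not stated; the function name and the numeric values are hypothetical.

```python
def apply_radial_distortion(x: float, y: float, k1: float, k2: float):
    """Distort ideal normalized coordinates with factor 1 + k1*r^2 + k2*r^4."""
    r2 = x * x + y * y
    factor = 1.0 + k1 * r2 + k2 * r2 * r2
    return x * factor, y * factor


# Hypothetical coefficients: mild barrel distortion pulls the point inward.
xd, yd = apply_radial_distortion(0.1, 0.2, k1=-0.2, k2=0.05)
```

Because different coefficient sets can produce similar displacements over the image, comparing projection errors rather than raw coefficients, as done in Table III, is the more meaningful comparison.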

D. Experiments on leveling
In these experiments, the same setup is used, except that one end of the catenary is lifted by 2.5 centimeters. The other parameters remain the same. 20 images are captured from several camera angles by rotating the camera as mentioned above, with the tripod positioned at the same locations as in the previous experiments. In order to observe the convexity of the optimization problem, the mean perspective error functions of each image are also plotted, as shown in Fig. 7. As shown in the figure, although the error functions vary over distinct intervals, each of them has a parabolic shape.
After the optimization process, the optimum level difference between the ends is computed as ∆h = −2.45 centimeters, which is very close to the true value of 2.5 centimeters. The minus sign of the optimum shift is also consistent with the end of the catenary that is lifted up. Finally, the camera calibration algorithm is run after computing the marker locations with respect to the optimum shift. This process is the same as the previous calibration under the no-shift assumption. The new camera parameters are then estimated and compared with the results obtained with the checkerboard, as seen in Table IV. Compared with the results in Table I, a similar performance is obtained.

E. Effect of the noise on calibration
Detecting the exact location of the markers can be a problem for precise camera calibration. The uncertainty of the marker locations in the image is mostly caused by blur. Although the uncertainty is not on the same scale for the two calibrators, as seen in Fig. 8, noise with the same amplitude is added to the marker locations of both. The noisy locations are then used in camera calibration for the two calibrators. This process is repeated 100 times. Hence, the bias (mean error w.r.t. the ground truth results) and the variance (spread across iterations) caused by noise are observed for the checkerboard and the catenary (Table V). Although the catenary gives better results on average in terms of bias, it is also more sensitive to noise in terms of variance. This weakness is caused by the number of markers: while the checkerboard has 100 markers, the catenary has just 13.
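The repeated-noise procedure can be sketched generically as follows. The `calibrate` argument stands for any routine that maps marker locations to parameter estimates; the centroid used in the demonstration is only a stand-in estimator, not the calibration itself, and the noise level, seeds and marker layouts are hypothetical.

```python
import numpy as np


def noise_bias_variance(calibrate, marker_uv, ground_truth, sigma, trials=100, seed=0):
    """Bias and variance of an estimator under Gaussian marker noise."""
    rng = np.random.default_rng(seed)
    marker_uv = np.asarray(marker_uv, dtype=float)
    estimates = np.array([
        calibrate(marker_uv + rng.normal(0.0, sigma, marker_uv.shape))
        for _ in range(trials)
    ])
    bias = estimates.mean(axis=0) - np.asarray(ground_truth, dtype=float)
    variance = estimates.var(axis=0, ddof=1)  # spread across repetitions
    return bias, variance


def centroid(points):
    """Stand-in estimator used only to exercise the loop."""
    return points.mean(axis=0)


# Hypothetical marker sets with 13 and 100 points in a 1000 x 1000 pixel image.
few = np.random.default_rng(1).uniform(0.0, 1000.0, (13, 2))
many = np.random.default_rng(2).uniform(0.0, 1000.0, (100, 2))
bias_few, var_few = noise_bias_variance(centroid, few, few.mean(axis=0), sigma=1.0)
bias_many, var_many = noise_bias_variance(centroid, many, many.mean(axis=0), sigma=1.0)
```

Even with this simple stand-in, the estimate from 13 points fluctuates more than the one from 100 points, which mirrors the sensitivity to marker count noted above.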

F. Effect of number of markers on calibration
To investigate the effect of the number of markers used on the catenary, we reduced the number of markers by skipping 2, 3, and 4 at a time, and executed the same calibration procedure. The Euclidean distances of the results from the ground truth are given in Table VI. The performance of the calibration appears to saturate at around 13 markers.

V. CONCLUSION
In this work, we introduce a novel calibration technique based on a hanging chain curve, called a catenary, replacing the checkerboard-based methods. The catenary is an easily foldable and transportable object; it is easy to fabricate in large sizes, and is naturally planar. Calibrating lenses that are focused at a large distance and have a reduced depth of focus (such as a zoom lens) can be challenging with a checkerboard, and would require building large calibrators, which incur higher manufacturing time, cost, and storage problems. The catenary makes this much simpler, while providing accurate internal parameters with error rates less than 1%. Further, the catenary always points in the direction of gravity, so it enables camera-accelerometer calibration as well. We show that the proposed method brings a fresh solution to the out-of-focus problem and outperforms the state-of-the-art approaches of the last decade.
Since our calibrator is shaped by gravity, another potential benefit is enabling camera-accelerometer calibration. Nowadays, IMU (inertial measurement unit) sensors are frequently integrated into mobile devices. The acceleration vector indicates the direction of gravity when the sensor remains immobile. Any change in camera orientation is reflected in the gravity vector measured by the accelerometer. This makes it possible to estimate the parameters which relate the IMU sensor coordinates to the camera coordinates. By utilizing both camera and IMU sensors, several applications that require a mutual coordinate system combining photogrammetric and kinematic data, such as image rectification and deblurring of shaky images, can be performed on mobile devices.