Method for Measuring Length of Free-swimming Farmed Fry by 3D Monitoring

To monitor the continuous growth of farmed fry, we present an automated method for measuring the length of free-swimming fry in a small aquarium. The proposed method detects, in a movie, a frame in which the swimming fish is suitable for measurement and estimates the locations of its snout and caudal fin base from its shape characteristics. The length is measured as the distance between these two points in 3D space, where their 3D coordinates are obtained by direct linear transformation. The performance of the proposed method was examined by applying it to 10 fish of each of six species, and the standard deviation of the error rates was less than 5% for every species. In the best case, the mean value and standard deviation of the error rates were 6.6% and 1.7%, respectively.


Introduction
In the research field of fisheries science, measurement of the fish length is one of the most important tasks for researchers to estimate the appropriate feeding amount, catch quantity, etc. Currently, the length of fish is measured by catching fish in a water tank and using a measuring instrument. However, this method is difficult for researchers because they have to straighten a flapping fish and measure its length promptly without hurting the fish. In fact, since fish often die as a result of the measurement, it is difficult to observe the growth of fish continuously. To overcome this problem, a method for estimating fish length without touching the fish has been reported. (1,2) However, since it is necessary to choose two standard points on a fish body to measure the fish length manually by this method, researchers must spend a long time finding the points by eye in a movie of free-swimming fish in a net cage. In addition, since this method only targets large fish such as bluefin tuna in a net cage, it is difficult to apply this method to small fish such as fry farmed in another environment such as small indoor water tanks. Because the caudal fin of fry is more transparent than that of adult fish and the length of fry is only a few centimeters, it is difficult to clearly capture the whole body of fry by cameras, find characteristic points by eye, and precisely choose the points by clicking the mouse.
With this background, to measure the length of farmed fry continuously, we propose a method for measuring the length of a small fish using movies taken by stereo cameras without touching the fish body. The proposed method finds a proper frame and the standard points for the measurement, and measures the length in 3D space.

Conversion from 2D Coordinates to 3D Coordinates Using Stereo Cameras
As means of 3D measurement for an object, the focus method, (3) light stripe projection method, (4) and direct linear transformation (DLT), (5,6) etc. have been reported. The focus method measures the depth of an object using multiple blurred images. However, since the purpose of this method is to recognize the shape of an object in 3D space, it cannot be applied to a moving object; hence, it is not appropriate for measuring the length of free-swimming fish. In the light stripe projection method, a laser beam is shined on an object and 3D coordinates on the surface of the object are found by using the angle of reflected light. However, most of the machines for shining the laser beam are expensive. Therefore, the proposed method finds the 3D coordinates of the snout and the caudal fin base of the fish by DLT, and measures the fish length by calculating the distance between the two points in 3D space.
The DLT obtains 3D coordinates from 2D images taken from different viewing points using Eqs. (1) and (2).

$$u = \frac{P_1 X + P_2 Y + P_3 Z + P_4}{P_9 X + P_{10} Y + P_{11} Z + 1} \tag{1}$$

$$v = \frac{P_5 X + P_6 Y + P_7 Z + P_8}{P_9 X + P_{10} Y + P_{11} Z + 1} \tag{2}$$

where u and v are the x-coordinate and y-coordinate of a point in a 2D image, respectively, and X, Y, and Z are the coordinates in 3D space for the point. $P_1$ to $P_{11}$ are camera parameters.
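To determine these parameters, n control points are prepared by hand in advance. Their 3D coordinates are known because the points are placed at measured positions in real 3D space. Then, by taking images that include the points, a pair of 2D and 3D coordinates, such as $(u_1, v_1)$ and $(X_1, Y_1, Z_1)$, is obtained for each of the n points. Next, the obtained coordinates are substituted into Eq. (3).

$$\begin{bmatrix} X_1 & Y_1 & Z_1 & 1 & 0 & 0 & 0 & 0 & -u_1 X_1 & -u_1 Y_1 & -u_1 Z_1 \\ 0 & 0 & 0 & 0 & X_1 & Y_1 & Z_1 & 1 & -v_1 X_1 & -v_1 Y_1 & -v_1 Z_1 \\ \vdots & & & & & & & & & & \vdots \\ X_n & Y_n & Z_n & 1 & 0 & 0 & 0 & 0 & -u_n X_n & -u_n Y_n & -u_n Z_n \\ 0 & 0 & 0 & 0 & X_n & Y_n & Z_n & 1 & -v_n X_n & -v_n Y_n & -v_n Z_n \end{bmatrix} \begin{bmatrix} P_1 \\ P_2 \\ \vdots \\ P_{11} \end{bmatrix} = \begin{bmatrix} u_1 \\ v_1 \\ \vdots \\ u_n \\ v_n \end{bmatrix} \tag{3}$$

Denoting the left matrix as $\mathbf{X}$, the middle one (i.e., the set of camera parameters) as $\mathbf{P}$, and the right one as $\mathbf{U}$, Eq. (3) is represented as

$$\mathbf{X}\mathbf{P} = \mathbf{U}. \tag{4}$$

Therefore, $\mathbf{P}$ can be obtained as

$$\mathbf{P} = (\mathbf{X}^{T}\mathbf{X})^{-1}\mathbf{X}^{T}\mathbf{U}. \tag{5}$$

The estimation of 3D coordinates by the DLT needs multiple cameras, and $\mathbf{P}$ is calculated for each camera, i.e., the parameters for a camera R and another camera L are calculated as $(P_{R1}, P_{R2}, \ldots, P_{R11})$ and $(P_{L1}, P_{L2}, \ldots, P_{L11})$, respectively. To obtain the 3D coordinates of a point from its 2D coordinates $(u_R, v_R)$ in an image taken by camera R and $(u_L, v_L)$ in an image taken by camera L, Eq. (6) is derived from Eqs. (1) and (2).

$$\begin{bmatrix} P_1 - u P_9 & P_2 - u P_{10} & P_3 - u P_{11} \\ P_5 - v P_9 & P_6 - v P_{10} & P_7 - v P_{11} \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} u - P_4 \\ v - P_8 \end{bmatrix} \tag{6}$$

Since there are two viewing points of cameras R and L, the camera parameters are substituted into Eq. (6) to obtain

$$\begin{bmatrix} P_{R1} - u_R P_{R9} & P_{R2} - u_R P_{R10} & P_{R3} - u_R P_{R11} \\ P_{R5} - v_R P_{R9} & P_{R6} - v_R P_{R10} & P_{R7} - v_R P_{R11} \\ P_{L1} - u_L P_{L9} & P_{L2} - u_L P_{L10} & P_{L3} - u_L P_{L11} \\ P_{L5} - v_L P_{L9} & P_{L6} - v_L P_{L10} & P_{L7} - v_L P_{L11} \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} u_R - P_{R4} \\ v_R - P_{R8} \\ u_L - P_{L4} \\ v_L - P_{L8} \end{bmatrix}. \tag{7}$$

In Eq. (7), denoting the left and right matrices as $\mathbf{A}$ and $\mathbf{B}$, respectively, (x, y, z) is obtained as

$$\begin{bmatrix} x \\ y \\ z \end{bmatrix} = (\mathbf{A}^{T}\mathbf{A})^{-1}\mathbf{A}^{T}\mathbf{B}. \tag{8}$$

The 3D coordinates of the point are obtained by substituting $(u_R, v_R)$, $(u_L, v_L)$, $(P_{R1}, P_{R2}, \ldots, P_{R11})$, and $(P_{L1}, P_{L2}, \ldots, P_{L11})$ into Eq. (8).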
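The calibration and reconstruction steps of the DLT can be illustrated with a short NumPy sketch. It is not the authors' implementation: it assumes the standard 11-parameter DLT form, in which P_1 to P_4 form the numerator of u, P_5 to P_8 the numerator of v, and P_9 to P_11 the shared denominator, and it uses `np.linalg.lstsq` in place of the explicit normal-equation inverse (both give the least-squares solution). The function names, the `synthetic_camera` helper, and all numerical camera settings are hypothetical and serve only to generate test data.

```python
import numpy as np


def calibrate_dlt(world_pts, image_pts):
    """Least-squares estimate of P_1..P_11 for one camera from control points."""
    rows, rhs = [], []
    for (X, Y, Z), (u, v) in zip(world_pts, image_pts):
        rows.append([X, Y, Z, 1, 0, 0, 0, 0, -u * X, -u * Y, -u * Z])
        rows.append([0, 0, 0, 0, X, Y, Z, 1, -v * X, -v * Y, -v * Z])
        rhs += [u, v]
    # lstsq gives the same least-squares solution as (X^T X)^-1 X^T U
    P, *_ = np.linalg.lstsq(np.array(rows, float), np.array(rhs, float), rcond=None)
    return P


def project(P, point):
    """Forward DLT model: 3D point (X, Y, Z) -> image coordinates (u, v)."""
    X, Y, Z = point
    w = P[8] * X + P[9] * Y + P[10] * Z + 1.0
    return np.array([(P[0] * X + P[1] * Y + P[2] * Z + P[3]) / w,
                     (P[4] * X + P[5] * Y + P[6] * Z + P[7]) / w])


def reconstruct(P_R, uv_R, P_L, uv_L):
    """Solve the stacked 4x3 system A (x, y, z)^T = B built from both cameras."""
    A, B = [], []
    for P, (u, v) in ((P_R, uv_R), (P_L, uv_L)):
        A.append([P[0] - u * P[8], P[1] - u * P[9], P[2] - u * P[10]])
        A.append([P[4] - v * P[8], P[5] - v * P[9], P[6] - v * P[10]])
        B += [u - P[3], v - P[7]]
    xyz, *_ = np.linalg.lstsq(np.array(A), np.array(B), rcond=None)
    return xyz


def synthetic_camera(cx, cz=75.0, d=150.0, f=1000.0, u0=960.0, v0=540.0):
    """Hypothetical ideal pinhole camera looking along +y, written as DLT parameters."""
    return np.array([f / d, u0 / d, 0.0, u0 - f * cx / d,
                     0.0, v0 / d, -f / d, v0 + f * cz / d,
                     0.0, 1.0 / d, 0.0])


# Synthetic check: 17 random control points inside a 200 x 50 x 150 mm volume.
rng = np.random.default_rng(0)
control = rng.uniform([0, 0, 0], [200, 50, 150], size=(17, 3))
true_R, true_L = synthetic_camera(cx=140.0), synthetic_camera(cx=60.0)
P_R = calibrate_dlt(control, [project(true_R, p) for p in control])
P_L = calibrate_dlt(control, [project(true_L, p) for p in control])

snout = np.array([73.4, 16.8, 10.5])  # test point in mm
recovered = reconstruct(P_R, project(true_R, snout), P_L, project(true_L, snout))
print(np.round(recovered, 3))
```

With noise-free synthetic projections, the recovered point should agree with the test point to within numerical precision. With real images, the residual of the least-squares fit gives a rough indication of calibration quality.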

Recording environment
In this research, the locations of the snout and the caudal fin base of fish are used as the standard points for measuring the fish length, i.e., the distance between the two points is regarded as the fish length. Figure 1 shows the two points for a fish. To measure the length of small fish, it is desirable to set up a situation where fish swim parallel to the lenses of two stereo cameras located side by side because the standard points have to be captured at the same time by both lenses. In addition, it is also desirable to prepare an aquarium that minimizes the reflection of the fish. Therefore, we set up two GoPro HERO5 Black cameras side by side in front of an aquarium (W200 × D50 × H150 mm), whose depth of 50 mm is very shallow as seen from the cameras. To suppress the reflection, the back wall of the aquarium was painted green. Figure 2 shows the recording environment, where (a) shows an actual situation, (b) is a top view of the configuration of the recording environment, and (c) is a side view of the configuration. As shown in Fig. 2, cameras L and R are set side by side 15 cm in front of the aquarium, which is the shortest distance at which both cameras can capture the whole aquarium. The origin in 3D space is defined as the bottom back left corner of the aquarium, the x-axis is the horizontal line from left to right, the y-axis is the direction from the front to the back of the aquarium, and the z-axis is the vertical (depth) direction seen from the cameras. The movies recorded by the cameras are in MP4 format with 24-bit true color, 1080 × 1920 pixels (height × width), and 30 fps. Figure 3 shows a pair of frames taken by the two cameras at the same time. The proposed method is designed for the situation in which both cameras simultaneously record one free-swimming small fish for 2 min.
In the camera calibration to obtain the camera parameters in the DLT, the calibration frame shown in Fig. 4 was prepared, where the 17 points on the frame were used as the control points.

Detection of the target frames for the measurement
Since the fish length is measured as the distance between the two standard points of the snout and the caudal fin base in 3D space, the length is not correctly measured when the fish body is curved as shown in Fig. 5. Hence, it is necessary to find an appropriate frame for the measurement, i.e., a frame where the side of the fish body faces the cameras and looks straight. To detect such a frame, the fish region is obtained as follows. First, a movie taken by each camera is divided into sequential frames, where the kth frame in the movie taken by camera L (R) is defined as $E_{Lk}$ ($E_{Rk}$). Also, the pixel value at the 2D coordinates (x, y) of $E_{Lk}$ ($E_{Rk}$) is denoted as $E_{Lk}(x, y)$ ($E_{Rk}(x, y)$). Since the water surface becomes noise when it is disturbed, the area of the water surface is cut from the frames by deleting the areas between $E_{Lk}(x, 0)$ and $E_{Lk}(x, 210)$ and between $E_{Rk}(x, 0)$ and $E_{Rk}(x, 210)$, regarding the upper left corner of the frames as the origin (0, 0). In addition, since the bottom area of the aquarium also becomes noise when the shadow of a fish appears on the bottom, the bottom area is cut from the frame by deleting the areas between $E_{Lk}(x, 829)$ and $E_{Lk}(x, 1079)$ and between $E_{Rk}(x, 829)$ and $E_{Rk}(x, 1079)$. Figure 6 shows an image obtained after cutting a frame taken by each camera as described above. Next, by background subtraction, (7) the fish region, which is the foreground in the image, is detected in the images taken by cameras L and R as black-and-white images $I_{Lk}$ and $I_{Rk}$, respectively, where the fish region and its background are shown by white and black pixels, respectively. This processing was implemented using BackgroundSubtractorKNN in the OpenCV library. (8) Then, noise is reduced by median filtering, and holes in the region are filled by repeating dilation and erosion three times. Figure 7 shows the fish region detected from Fig. 6.
The target frame for measuring the fish length is detected as follows. First, the minimum-area rectangle for the fish region is obtained by the rotating calipers algorithm. (9) Figure 8 shows the minimum-area rectangle obtained from the fish region in Fig. 7. Next, from among the pairs ($E_{L1}$, $E_{R1}$) to ($E_{L3600}$, $E_{R3600}$), the mth pair for which the sum of the long sides of the rectangles obtained from both frames is greatest is detected as the target frames $F_{Lm}$ and $F_{Rm}$. Figure 9 shows the pair detected as the target frames.

Removal of the caudal fin
In previous studies, (1,2) the distance from the snout to the central hollow of the caudal fin was measured as the fish length; however, the location of the central hollow depends on the fish. Therefore, in the proposed method, the distance from the snout to the central point at the caudal fin base, which is the root of the fin, is measured as the fish length. To measure this distance, the region of the caudal fin is removed from each of $F_{Lm}$ and $F_{Rm}$. First, to examine the color of the background around the fish region in $F_{Lm}$ and $F_{Rm}$, the mean pixel values of each RGB color over the area of the rectangle excluding the fish region are calculated as $R_b$, $G_b$, and $B_b$. Second, the fish region is represented as a straight line l using the weighted least squares method. (10) Since the line l can be assumed to pass through the center of the fish region, the two standard points are assumed to lie on the part of the line inside the fish region. Figure 10 shows the line l and the standard points for a frame of the movie. Third, for each of the h pixels on this line part, the pixel values of each RGB color are obtained as $R_i$, $G_i$, and $B_i$ (i = 1, 2, ..., h), and the difference D between the fish region and its background is calculated by Eq. (9). Denoting the number of pixels in the fish region as N, the pixel values of each RGB color of all these pixels are extracted as $Rf_j$, $Gf_j$, and $Bf_j$ (j = 1, 2, ..., N). Finally, among these pixels, those that satisfy Eq. (10) are removed as pixels of the caudal fin. In Eq. (10), Th is the threshold used to determine, relative to D, whether a pixel is regarded as belonging to the caudal fin. The larger the value of Th, the more pixels are regarded as those of the caudal fin. Figure 11 shows an example of the removal when Th is set as 0.9.
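The following NumPy sketch illustrates the idea of the removal step. Eqs. (9) and (10) are not reproduced in this text, so their forms here are assumptions, not the authors' definitions: D is taken as the mean Euclidean RGB distance between the line pixels and the background color, and a fish-region pixel is treated as caudal fin when its own distance to the background color is below Th × D. This choice was made only because it matches the stated behavior that a larger Th removes more pixels. The function name and the synthetic image are hypothetical.

```python
import numpy as np


def remove_caudal_fin(image, fish_mask, line_pixels, th):
    """Return the fish mask with background-like (fin) pixels removed.

    image: H x W x 3 RGB array; fish_mask: H x W boolean array;
    line_pixels: (h, 2) integer array of (row, col) on the line l inside the fish.
    """
    image = image.astype(float)
    # (R_b, G_b, B_b): mean color outside the fish region. Simplification:
    # averaged over the whole array passed in, so pass the crop bounded by
    # the minimum-area rectangle to follow the text.
    background = image[~fish_mask].mean(axis=0)
    # Assumed Eq. (9): D = mean RGB distance of the line pixels from the background
    line_rgb = image[line_pixels[:, 0], line_pixels[:, 1]]
    D = np.linalg.norm(line_rgb - background, axis=1).mean()
    # Assumed Eq. (10): pixel j is caudal fin if its distance is below Th * D
    distance = np.linalg.norm(image - background, axis=2)
    return fish_mask & ~(distance < th * D)


# Synthetic example: green background, dark body, nearly transparent fin.
image = np.zeros((20, 60, 3), np.uint8)
image[:] = (0, 200, 0)
image[8:12, 10:40] = (40, 40, 40)     # opaque body
image[8:12, 40:50] = (10, 180, 10)    # fin, close to the background color
mask = np.zeros((20, 60), bool)
mask[8:12, 10:50] = True
line = np.array([(10, c) for c in range(10, 50)])

body = remove_caudal_fin(image, mask, line, th=0.5)
print(int(mask.sum()), int(body.sum()))
```

In this synthetic case the fin pixels are removed and the body pixels are kept. On real frames the outcome depends strongly on Th, which is why the threshold is tuned per species.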

Measurement of fish length
After finding the line part represented by pixels that are in the fish region excluding the caudal fin, as in the images shown in Fig. 11, both ends of the line part are regarded as the standard points of the snout and caudal fin base. Figure 12 shows the pair of standard points obtained from Fig. 11. Then, the 2D coordinates of the standard points in the 2D images are converted into 3D coordinates by the DLT. Finally, the fish length is measured as the distance between the standard points in 3D space. For example, the locations of the two standard points obtained for the images in Fig. 12 are shown in that figure, and their measured 2D coordinates are listed in Table 1. The camera parameters used in the DLT were obtained by the camera calibration and are shown in Table 2. The 2D coordinates and the camera parameters were substituted into Eqs. (7) and (8), and the 3D coordinates of the snout and the caudal fin base were obtained as (73.4 mm, 16.8 mm, 10.5 mm) and (50.5 mm, 31.9 mm, 5.7 mm), respectively. From the 3D coordinates, the fish length was measured to be 27.8 mm.
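As a check of the final step, the length can be recomputed as the Euclidean distance between the two 3D points reported above. The variable names are illustrative only.

```python
import math

snout = (73.4, 16.8, 10.5)        # mm, 3D coordinates from the text
caudal_base = (50.5, 31.9, 5.7)   # mm, 3D coordinates from the text

# sqrt(22.9^2 + 15.1^2 + 4.8^2) = sqrt(775.46)
length_mm = math.dist(snout, caudal_base)
print(f"{length_mm:.1f} mm")  # prints "27.8 mm"
```

The result agrees with the length of 27.8 mm stated in the text.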

Experimental Results
The performance of the proposed method was examined for six species of fish: black neon tetra (B-tetra), Japanese white medaka (W-medaka), molly, gold barb (G-barb), Sumatra, and angelfish. Ten fish were prepared for each species. First, to find the optimum threshold Th for each species, the proposed method was applied to the fish while changing Th from 0.05 to 0.95 in steps of 0.05. The correct lengths were obtained by measuring each fish by hand immediately after the experiment. The performance was represented by the error rate ε obtained by Eq. (11) for each species. In Eq. (11), $V_{mq}$ and $V_{tq}$ are the fish length obtained by the proposed method and the correct length of the qth fish of each species, respectively. Figure 13 shows an image of the measurement by hand to obtain the correct length, where the scale is graduated in units of 1 mm. That is, $V_{tq}$ contains a measurement error in the range of ±0.5 mm. Figure 14 shows the experimental results as the error rate ε vs the threshold Th. In addition, Table 3 shows the optimum threshold obtained from Fig. 14 together with the mean value and the standard deviation of the error rates for the 10 fish of each species.

Table 1. 2D coordinates of the snout and the caudal fin base for the fish shown in Fig. 12.

The purpose of this study is to observe the growth of fish automatically and continuously. Hence, since the error rate must vary little between measurements, the most important value when examining the performance is the standard deviation of ε for each species of fish. From Table 3, it was confirmed that the proposed method successfully measured the fish length for all six species with a standard deviation within 5%.

Discussion
Table 3 confirms that it is possible to observe the growth of fish with a standard deviation of the error rates within 5% for each species. However, the optimum threshold Th and the relation between Th and ε depend on the species, as shown in Fig. 14. Therefore, it is necessary to use a specific threshold for each species of fish. When the proposed method is used, it will be necessary to prepare at least 10 fish of each species to find the optimum threshold Th beforehand. That is, first, the user must obtain the correct length of all the samples by hand. Second, the user must repeatedly apply the proposed method while shifting the threshold from 0 to 1 in steps of 0.05 to obtain the error rate ε for every threshold. Third, the threshold with the smallest standard deviation of the error rates for the samples is determined as the optimum threshold for the species.
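The threshold selection procedure can be sketched as follows. Since Eq. (11) is not reproduced in this text, the sketch assumes that the error rate of one fish is the relative absolute error in percent, 100 × |V_m − V_t| / V_t, and that the population standard deviation is used; both are assumptions. The function names and the example lengths are hypothetical and are not measured data.

```python
import numpy as np


def error_rates(measured, truth):
    """Assumed error rate in percent for each fish: 100 * |V_m - V_t| / V_t."""
    measured, truth = np.asarray(measured, float), np.asarray(truth, float)
    return 100.0 * np.abs(measured - truth) / truth


def select_threshold(thresholds, measured_by_th, truth):
    """Pick the threshold whose error rates have the smallest standard deviation.

    measured_by_th[t][q] is the length of fish q measured with thresholds[t].
    Returns (best threshold, mean error rate, standard deviation).
    """
    eps = error_rates(measured_by_th, truth)  # shape (T, n); truth is broadcast
    stds = eps.std(axis=1)                     # population SD (ddof=0), assumed
    best = int(np.argmin(stds))
    return thresholds[best], float(eps[best].mean()), float(stds[best])


# Hypothetical example with three fish and two candidate thresholds.
truth = [20.0, 25.0, 30.0]                     # hand-measured lengths in mm
measured = [[22.0, 27.5, 33.0],                # Th = 0.10: every fish 10% long
            [20.0, 27.5, 30.0]]                # Th = 0.20: one fish 10% long
best_th, mean_eps, sd_eps = select_threshold([0.10, 0.20], measured, truth)
print(best_th, round(mean_eps, 2), round(sd_eps, 2))
```

In this example the first threshold is selected even though its mean error is larger, because its error is the same for every fish. This reflects the criterion in the text, which ranks thresholds by the standard deviation rather than the mean of the error rates.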
Next, the failures of the proposed method are discussed. First, there were three failures in the extraction of the fish region, all with the same cause. Figure 15 shows one of the failures. Since the background subtraction only works correctly for an object that is constantly moving, the method cannot extract the fish region correctly when the target fish hardly moves, as shown in Fig. 15(b). However, this problem is unlikely to be serious because there are means of getting fish to move, such as feeding them.
Second, there were failures in extracting the line l for angelfish. Figure 16 shows a failure case in the extraction. Since the angelfish has a flat body with wide dorsal and pelvic fins and a body whose width and length are similar, the line l was not extracted correctly, as shown in Fig. 16. Consequently, the error rate for angelfish was the highest among the species, as shown in Table 3. For fish with a flat body such as angelfish, another method of extracting the line l needs to be considered.

Conclusions
We presented a method for measuring the length of free-swimming small fish in a small aquarium using moving images. The proposed method measures the length automatically in 3D space without contact with the fish body. The experiments conducted to examine its performance confirmed that the standard deviation of the error rates was less than 5% for all six species. Among the species, the lowest mean value of the error rates and its standard deviation were 6.6% and 1.7%, respectively. It was also shown experimentally that the proposed method has problems when fish hardly move during the recording time and when the horizontal and vertical lengths of the fish body are similar, such as for angelfish.
As future work, it is necessary to develop a method for obtaining the standard points for fish whose horizontal and vertical lengths are similar.