Segmentation of Activated Sludge Phase Contrast Microscopy Images Using U-Net Deep Learning Model

process have been carried out to verify the proposed image segmentation method. Our proposed U-Net models with the combined loss function give better results than the U-Net models with BCE, fully convolutional network-VGG16 (FCN-VGG16), and a traditional segmentation method


Introduction
The activated sludge process is a common and typical biological wastewater treatment process. Biodegradation occurs in aeration tanks where microorganisms form activated sludge flocs. The settling ability of activated sludge, which directly determines the quality of the wastewater effluent, is critical for the operation of wastewater treatment plants. Therefore, the key to the operation of a sewage treatment plant is to have sufficient activated sludge with good settling ability.
Conventionally, the settling ability of the activated sludge in treatment plants is monitored by measuring physicochemical parameters such as the sludge volume index (SVI). However, regular measurements are costly, tedious, and time-consuming, and have associated environmental hazards.(1) The morphological characteristics and internal structure of flocs and filaments are closely related to the settling ability of activated sludge. Morphological changes of the activated sludge are used for the early detection and identification of abnormal operation conditions, such as sludge bulking and pin flocs. Microscopic examination of activated sludge is required for process control and stable plant operation.(2) However, the accuracy and reliability of manual image analysis largely depend on the operator's prior knowledge.
(5)(6) By extracting the morphological information of flocs and filaments from microscopic images, physical and chemical parameters such as SVI can be effectively measured, and the abnormal operation of a sewage treatment plant can be detected early.(7) Since the collected samples do not require special preparation and filamentous bacteria can be observed at a relatively low magnification, phase contrast microscopy (PCM) is commonly used to observe activated sludge. Over the last two decades, the processing and analysis of PCM images of activated sludge have received considerable attention and have been widely applied to the measurement of SVI and the early detection and fault diagnosis of sludge bulking in wastewater treatment plants.(8,9) Grijspeerdt and Verstraete investigated the relationship between the morphology of activated sludge flocs and the settling ability of the sludge, and performed image analysis to estimate the settling ability of activated sludge.(4) Khan et al. reported a robust segmentation of PCM images of filamentous bacteria, the identification of image analysis parameters for the morphology of the bacteria, and the measurement of SVI.(1) The processing and analysis of microscopy images have proven to be a potential alternative monitoring tool in the early warning and prediction of the settling ability.(10)
Image segmentation is a key step in image analysis and the basis for further understanding the microbial structure. The accuracy of image analysis depends on the quality of the segmentation of microbial aggregates and filaments in microscopy images of activated sludge. Several traditional image segmentation algorithms, including edge detection, clustering, texture-based segmentation, the watershed algorithm, and some combinations of algorithms, have been reported for activated sludge images. Khan et al. applied nine different approaches to segment PCM images of activated sludge (AS) samples and assessed their effectiveness in a comparative experiment.(11) The experiments have shown that these traditional algorithms carry the risks of oversegmentation, undersegmentation, and failure, and are not applicable to all PCM images. Nisar et al. found that flocs are oversegmented by the Otsu thresholding algorithm. Furthermore, they investigated three image segmentation algorithms for PCM images.(12) Khan et al. proposed a robust segmentation procedure for flocs and filamentous bacteria, and investigated regression models for SVI on the basis of the extracted morphological characteristic parameters of filaments and flocs.(13) Jenné et al. developed an automatic image analysis system for monitoring the floc and filament features of activated sludge.(14) However, these methods usually require some parameters to be manually set to achieve automatic segmentation.
Currently, traditional image segmentation methods are based on the assumption of sharp grey contrasts between flocs and filaments. In a microbial image of activated sludge captured by PCM, the greyscale distinction is not clear owing to the illumination or sensor nonuniformity of PCM. Water stains on the slide cause halo and shade-off artifacts in PCM images, as well as many white spots inside microbial aggregates. It is therefore difficult for traditional segmentation methods based on thresholds or edge detection to achieve high performance in a complex PCM field of view.
Compared with traditional image segmentation algorithms, image segmentation algorithms based on deep learning have made significant progress and can solve many problems that traditional image segmentation methods cannot.(15) Shelhamer et al. adapted contemporary classification convolutional networks into fully convolutional networks (FCNs) and transferred their learned representations to the segmentation task.(16) The U-Net deep learning network is a potential tool for reducing the loss of detailed image information caused by the multiple down-sampling operations in the network.(17) The U-Net network structure is easily trained and is suitable for small sample sets.
In our work, an automated segmentation method for PCM images is proposed to extract flocs and filaments using the U-Net deep learning structure with data augmentation. To deal with sample imbalance, we propose a loss function combining the binary cross entropy (BCE) function and the Dice coefficient to improve the segmentation accuracy and sensitivity with unbalanced foreground and background samples.
The rest of the paper is structured as follows. In Sect. 2, we describe our experiment and image acquisition system, as well as an image segmentation algorithm based on the U-Net deep learning model. In Sect. 3, we evaluate our proposed image segmentation method and discuss our main results. We conclude the paper with some final comments and directions for future work in Sect. 4.

Lab-scale activated sludge system
A lab-scale activated sludge system was designed to simulate the biological wastewater treatment process, as shown in Fig. 1. The activated sludge was sampled from a petrochemical wastewater treatment plant, and microorganisms were cultured in the laboratory. The activated sludge was fed with synthetic wastewater with a chemical oxygen demand (COD) of 300 mg/L. The wastewater was prepared using a glucose solution, a phosphoric acid mixed solution, and a sulfuric acid mixed solution. The wastewater is mixed with bacterial flocs in the aeration tank, where pollutants are biodegraded. The reactor is a fully mixed aeration tank in which air is injected into the mixed liquor to maintain a sufficient amount of dissolved oxygen. The experiment lasted for two months. In addition to acquiring PCM images of the activated sludge, we measured several physical and chemical indicators such as sludge volume (SV), pH, temperature, COD, suspended solids (SS), mixed liquor suspended solids (MLSS), and SVI.

Image acquisition system
All the activated sludge samples were collected from the aeration tank in the activated sludge experimental system. The microbial image acquisition system includes an inverted optical microscope (Nikon Eclipse TS100), an industrial digital camera (ToupTek ToupCam ucoms03100kpa), and a set of image acquisition software (ToupView). In this experiment, we used a phase contrast microscope equipped with a color CCD camera to capture the images of the activated sludge, as shown in Fig. 2. The output signal of the CCD is digitized, and each digital image has a size of 1024 × 768 pixels. The microscope has 10×, 20×, and 40× phase contrast objectives and a 10× eyepiece. We chose typical images with a resolution of 0.314 µm/pixel at 10× objective magnification.

Segmentation problem description
PCM images of activated sludge were acquired. When light passes through regions of a sample with different densities (refractive indices) in a phase contrast microscope, slight variations in light phase appear and are amplified into visible changes in light amplitude. This allows the shape and internal morphology of the microorganisms to be displayed without exogenous fixing or staining, thus facilitating the observation and analysis of microbial components by the experimenter. However, the phase contrast microscope is very sensitive to phase variation and has some imperfections in the imaging process, which cause artifacts in the presented image such as halos, light spots, and shade-off [see Fig. 3(a)]. The obtained PCM images have a low contrast among filamentous bacteria, flocs, and the background. Most of the acquired images show a unimodal grey histogram [see Fig. 3(b)]. Therefore, some traditional segmentation methods based on greyscale thresholding will fail or will not be sufficiently good for segmenting filamentous bacteria and flocs from the background. Fuzzy boundaries of filamentous bacteria or flocs may also cause unsatisfactory results with some traditional segmentation methods. Some researchers have attempted to solve the segmentation problems of PCM images.(18,19) However, these methods usually require that some parameters be manually set to achieve automatic segmentation.
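To make the thresholding baseline concrete, the following is a minimal NumPy sketch of Otsu's method, which selects the grey level that maximizes the between-class variance of the histogram. The function name and the synthetic test image are illustrative assumptions and are not taken from the paper. The method relies on the histogram having two separable modes; on a unimodal histogram such as the one in Fig. 3(b), the selected threshold is not tied to any object boundary.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the grey level (0-255) that maximizes between-class variance.

    `gray` is assumed to be a 2-D uint8 image (hypothetical input format).
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)
    w0 = np.cumsum(prob)            # weight of the class with grey level <= t
    mu = np.cumsum(prob * levels)   # cumulative first moment up to t
    mu_total = mu[-1]
    w1 = 1.0 - w0                   # weight of the class with grey level > t
    valid = (w0 > 0) & (w1 > 0)     # skip thresholds that leave a class empty
    between = np.zeros(256)
    between[valid] = (mu_total * w0[valid] - mu[valid]) ** 2 / (w0[valid] * w1[valid])
    return int(np.argmax(between))

# Synthetic bimodal image: left half dark (50), right half bright (200).
demo = np.zeros((10, 10), dtype=np.uint8)
demo[:, :5] = 50
demo[:, 5:] = 200
threshold = otsu_threshold(demo)
foreground = demo > threshold       # bright pixels are treated as foreground
```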

Image segmentation method
An automatic segmentation method for PCM images is proposed to extract flocs and filaments using the U-Net model, as shown in Fig. 4. In the online testing, a new input image is fed to the trained U-Net models to predict the probability that each pixel belongs to filaments or flocs. Then, the segmentation result is obtained by the fusion of the filament and floc probabilities.
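The fusion step can be illustrated with a short sketch. The rule used here, a per-pixel argmax over background, floc, and filament probabilities, is an assumption made for illustration and is not confirmed by the text; the background probability is taken as the remainder after subtracting the two model outputs. The function name and label values are hypothetical.

```python
import numpy as np

def fuse_probabilities(p_floc, p_filament):
    """Fuse two per-pixel probability maps into one label map.

    Assumed rule (not confirmed by the paper): per-pixel argmax.
    Labels: 0 = background, 1 = floc, 2 = filament.
    """
    p_background = 1.0 - p_floc - p_filament
    stacked = np.stack([p_background, p_floc, p_filament], axis=0)
    return np.argmax(stacked, axis=0)

# Toy 2 x 2 outputs of the floc model and the filament model.
p_floc = np.array([[0.8, 0.1], [0.2, 0.3]])
p_filament = np.array([[0.1, 0.7], [0.1, 0.3]])
labels = fuse_probabilities(p_floc, p_filament)
```

With these toy inputs, the top-left pixel is labelled as floc, the top-right pixel as filament, and the bottom row as background.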

U-Net model for image segmentation
The architecture of the U-Net model for the PCM image segmentation of activated sludge is illustrated in Fig. 5. The U-Net architecture consists of a contracting path (left side) for feature encoding and an expanding decoding path (right side) for full-resolution segmentation.
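As a rough illustration of the geometry of the two paths, the sketch below traces the spatial size of the feature maps for a 384 × 512 input through four 2 × 2 pooling stages and four 2 × 2 up-sampling stages, as used in this section. It assumes that the padded convolutions preserve the spatial size ("same" padding); the function name is hypothetical, and channel counts are omitted because the text does not state them.

```python
def unet_feature_map_sizes(height=384, width=512, depth=4):
    """Trace (height, width) through the contracting and expanding paths.

    Assumes size-preserving padded convolutions, so only the 2 x 2 pooling
    and 2 x 2 up-convolution stages change the spatial size.
    """
    sizes = [(height, width)]
    for _ in range(depth):
        height, width = height // 2, width // 2   # 2 x 2 max pooling
        sizes.append((height, width))
    encoder = sizes                # input resolution down to the bottleneck
    decoder = sizes[-2::-1]        # each up-convolution doubles the size again
    return encoder, decoder

encoder_sizes, decoder_sizes = unet_feature_map_sizes()
```

Because every decoder stage has an encoder stage of identical size, the skip connections can concatenate the two feature maps without cropping under this assumption.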
In the encoding path, input images of size (384, 512, 1) are gradually encoded into smaller feature maps after being processed by four down-sampling modules. Each module includes two 3 × 3 padded convolution layers and a 2 × 2 max pooling layer. In the decoding path, four up-sampling modules are used to gradually restore abstract feature maps to a full-resolution image. Each up-sampling module includes two 3 × 3 padded convolution layers and a 2 × 2 upconvolution layer with stride 2. He-normal is used to initialize each convolution layer.(20) A rectified linear unit (ReLU) is used as the activation function. Skip connections are established between the down-sampling and up-sampling modules to capture both the local and contextual information, which ensures that the details of the target image can be gradually restored.(21) The last layer includes a 1 × 1 convolution and a sigmoid activation function. The sigmoid function is defined as

$$S(x) = \frac{1}{1 + e^{-x}},$$

where $x$ is an input pixel value and $S(x)$ is the probability that the pixel is a floc or filament object.
An appropriate loss function in a complex scenario is critical to guaranteeing the performance of the deep learning model. The BCE function is commonly used as a loss function for binary classifiers, i.e.,

$$L_{\mathrm{BCE}} = -\frac{1}{N} \sum_{i=1}^{N} \left[ m_i \log S(x_i) + (1 - m_i) \log\bigl(1 - S(x_i)\bigr) \right],$$

where $X = \{x_1, x_2, \ldots, x_N\}$ is an input image, $N = W \times H$ is the total number of pixels, $M = \{m_1, m_2, \ldots, m_N\}$ is the segmentation mask, and each $m_i \in \{0, 1\}$. However, the BCE function does not take into account the imbalance between the foreground and background pixels, resulting in classification results that are biased towards the class with more pixels. There is such an imbalance in our segmentation task, where flocs or filaments occupy only a small portion of the microscopy image. To penalize the misclassified pixels, we define a new loss function by combining a weighted binary cross-entropy function and the Dice coefficient $\mathrm{Dice}(T, P)$, where $\beta$ is the weight ratio of the foreground to the total number of pixels to balance the class frequencies, $T$ is the real manual segmentation map with each $m_i \in T$, $P$ is the predicted segmentation map with each $S(x_i) \in P$, and the Dice coefficient is defined as

$$\mathrm{Dice}(T, P) = \frac{2\,|T \cap P|}{|T| + |P|}.$$

The loss function can improve the test performance of the U-Net model under the class-imbalanced condition using the weighted binary cross-entropy function and the Dice coefficient.
The activated sludge image is segmented from the output probability maps of the two models, where $X$ is a test image, $P_1(X)$ is the output probability map of the U-Net model for segmenting flocs, $P_2(X)$ is the output of the U-Net model for segmenting filaments, and $P_0(X) = 1 - P_1(X) - P_2(X)$ is the probability map of the background in this image. The segmentation result is the fusion of the outputs of the dual U-Net models.
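The combination of a weighted BCE term and a Dice term can be sketched in NumPy as follows. Two details are assumptions made for illustration, because the exact form of the combined loss is not recoverable from the text: the foreground term is weighted by (1 − β) and the background term by β, and the Dice coefficient enters the loss as 1 − Dice so that a better overlap lowers the loss. All function names are hypothetical.

```python
import numpy as np

def dice_coefficient(t, p, eps=1e-7):
    """Soft Dice overlap between a ground-truth mask t and a prediction p."""
    intersection = np.sum(t * p)
    return (2.0 * intersection + eps) / (np.sum(t) + np.sum(p) + eps)

def weighted_bce(t, p, beta, eps=1e-7):
    """BCE with the rarer foreground class up-weighted.

    Assumption: foreground weight (1 - beta), background weight beta,
    where beta is the foreground fraction of the image.
    """
    p = np.clip(p, eps, 1.0 - eps)   # avoid log(0)
    per_pixel = -((1.0 - beta) * t * np.log(p)
                  + beta * (1.0 - t) * np.log(1.0 - p))
    return float(np.mean(per_pixel))

def combined_loss(t, p, eps=1e-7):
    """Weighted BCE plus (1 - Dice); the exact combination is assumed."""
    beta = float(np.mean(t))         # foreground pixels / total pixels
    return weighted_bce(t, p, beta, eps) + (1.0 - float(dice_coefficient(t, p, eps)))

# Toy mask with 25% foreground, scored against a perfect and an inverted prediction.
truth = np.array([[1.0, 0.0, 0.0, 0.0]])
loss_perfect = combined_loss(truth, truth.copy())
loss_inverted = combined_loss(truth, 1.0 - truth)
```

In a Keras training setup, the same arithmetic would be written with backend tensor operations so that gradients can flow through it; the NumPy version is only meant to show the terms.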

Performance assessment
Standard pixel-wise metrics are calculated on the test images to evaluate the proposed image segmentation method. The accuracy is calculated as

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN},$$

where true positive (TP) is the number of positive-class pixels correctly identified by the U-Net, false positive (FP) is the number of negative-class pixels incorrectly detected as positive, true negative (TN) is the number of negative-class pixels correctly categorized, and false negative (FN) is the number of positive-class pixels incorrectly categorized as negative.
In the semantic segmentation task, the above indicators have different physical meanings.(22) The F-measure is the harmonic mean of precision and recall, which has the same calculation formula as the previously mentioned Dice coefficient. Intersection-over-union (IoU) is the ratio of the intersection to the union of the ground truth and the predicted segmentation. The higher the IoU, the closer the predicted segmentation is to the ground truth.
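The metrics discussed above can be computed from the four pixel counts as in the following sketch. The formulas for precision, recall, F-measure, and IoU are the standard definitions in terms of TP, FP, TN, and FN; the function name and the toy masks are illustrative only.

```python
import numpy as np

def segmentation_metrics(truth, pred):
    """Compute pixel-wise metrics for one binary class.

    `truth` and `pred` are binary masks of the same shape.
    """
    truth = np.asarray(truth, dtype=bool)
    pred = np.asarray(pred, dtype=bool)
    tp = int(np.sum(truth & pred))
    tn = int(np.sum(~truth & ~pred))
    fp = int(np.sum(~truth & pred))
    fn = int(np.sum(truth & ~pred))

    def ratio(num, den):
        # Return 0.0 when a class is absent instead of dividing by zero.
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "f_measure": ratio(2 * tp, 2 * tp + fp + fn),
        "iou": ratio(tp, tp + fp + fn),
    }

# Six pixels: one TP, one FN, one FP, three TN.
scores = segmentation_metrics([1, 1, 0, 0, 0, 0], [1, 0, 1, 0, 0, 0])
```

The F-measure is written here as 2TP / (2TP + FP + FN), which equals the harmonic mean of precision and recall and matches the Dice coefficient for binary masks.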

Dataset
To validate the presented method, several representative images were collected in the lab-scale activated sludge process. Each collected digital image has a size of 1024 × 768 × 3 pixels. PCM images are labeled with a floc mask and a filament mask, as shown in Fig. 6.

Environment configurations
In this work, we used Keras based on TensorFlow as the deep learning framework. Keras is a high-level neural network API with a focus on enabling fast experimentation. All the tests were conducted on a 2.30 GHz 6-core Intel® Xeon® E5 CPU with 13 GB of RAM and an NVIDIA Tesla K80 GPU with 12 GB of memory, running the Ubuntu 18.04 operating system. All program code runs in a Python environment.

Experimental results and analysis
We evaluate the U-Net models with the combined loss function and compare them with U-Nets with BCE, FCN-VGG16, and a traditional segmentation method. The segmentation results of the testing image are shown in Fig. 8.
The convolutional neural model FCN-VGG16 uses VGG16 as the backbone of the model and uses parameters pretrained on the ImageNet dataset. For the traditional segmentation of filaments and flocs, we applied a combination of Gaussian filtering, the Otsu thresholding algorithm, hole filling, and morphological operations. The DSeg segmentation program (24) is used for segmenting filaments, and the PHASECONG MATLAB code (25) for removing halos. Note that our improved U-Net models can correctly segment the objects in the 10 test images with halos, light spots, and shade-off. Our trained models are also not affected by the red ruler on the image. Table 1 gives the segmentation performance characteristics of the four methods. The mean in Table 1 is the average of the corresponding indicator for flocs and filamentous bacteria, excluding the background, over the 10 test images. From Table 1, our proposed U-Net models with the combined loss function have clear advantages in terms of accuracy, precision, recall, F-measure, and IoU. The improved loss function helps improve the PCM image segmentation performance. The filamentous bacteria in our captured images occupy few pixels, which causes an imbalance between the positive and negative classes in the training samples. Our improved loss function combines the BCE and the Dice coefficient, which ensures the smoothness of gradient descent and enhances the image segmentation performance.

Conclusions
In general, the key to measuring the SVI and the early detection of filamentous bulking based on digital image analysis lies in the image segmentation performance for flocs and filaments in the activated sludge wastewater treatment process. We proposed an automatic floc and filament extraction method for PCM images using the U-Net deep learning model with data augmentation. A loss function combining the BCE function and the Dice coefficient is proposed to improve the segmentation accuracy and sensitivity with unbalanced foreground and background class pixels. Our method achieved a mean IoU of 74.67%, a mean F-measure of 0.82, and a mean precision of 81.47% for two types of objects (i.e., floc and filament) on the test set. The visualization of the segmentation results is consistent with these figures. It is worth noting that PCM images from the AS process may be contaminated with phase shifts and light scattering interferences. Thus, ensemble image segmentation techniques combining images at different magnifications should be considered to enhance the image segmentation performance. In addition, because manual labelling is time-consuming, semisupervised image segmentation algorithms should be explored in the future.

Fig. 3. (Color online) Phenomena in PCM. (a) Halos, light spots, and shade-off in the image. (b) Greyscale distribution of the image presenting a single peak.

4 .
The proposed method consists of image acquisition, U-Net model training, and the online testing of the image segmentation of flocs and filamentous bacteria. The image acquisition system aims to capture an activated sludge image through PCM from activated sludge experiments. Datasets including input images and labels of flocs and filaments are divided into training, validation, and testing data. To improve the robustness of the networks, data augmentation is performed by elastically deforming the training samples using random cutting, rotation, and flipping methods. To reduce the training time, the parameters of the floc image segmentation model are initially determined using the filament segmentation model parameters. The networks were trained with the Adam stochastic gradient descent optimization method with a new loss function. The loss function is defined by the sum of the weighted binary cross-entropy function and the Dice coefficient, which guarantees the smoothness of the gradient descent and improves the segmentation accuracy and sensitivity with unbalanced foreground and background class pixels.
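The augmentation step can be sketched as follows. The essential point is that each random transform is applied identically to the image and to its mask so that the labels stay aligned. This sketch covers random cropping, rotation, and flipping only; it restricts rotation to 0° or 180° so that the crop shape is preserved, and it omits elastic deformation. These simplifications, the function name, and the crop size are assumptions made for illustration.

```python
import numpy as np

def augment_pair(image, mask, crop_hw, rng):
    """Apply the same random crop, rotation, and flips to an image and its mask."""
    height, width = image.shape[:2]
    crop_h, crop_w = crop_hw
    top = int(rng.integers(0, height - crop_h + 1))
    left = int(rng.integers(0, width - crop_w + 1))
    image = image[top:top + crop_h, left:left + crop_w]
    mask = mask[top:top + crop_h, left:left + crop_w]

    k = 2 * int(rng.integers(0, 2))          # rotate by 0 or 180 degrees
    image, mask = np.rot90(image, k), np.rot90(mask, k)

    if rng.random() < 0.5:                   # horizontal flip
        image, mask = np.fliplr(image), np.fliplr(mask)
    if rng.random() < 0.5:                   # vertical flip
        image, mask = np.flipud(image), np.flipud(mask)
    return image, mask

rng = np.random.default_rng(0)
demo_image = rng.random((384, 512))
demo_mask = demo_image > 0.5                 # toy mask derived from the image
aug_image, aug_mask = augment_pair(demo_image, demo_mask, (256, 256), rng)
```

Because the toy mask is a pixel-wise function of the image, checking that the augmented mask still equals `aug_image > 0.5` confirms that the two arrays received the same transform.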

Figure 6(a) shows the original microbial image and Fig. 6(b) shows the ground truth of the flocs and filamentous bacteria manually labelled with the LabelMe software tool.(23) The flocs are in red, and the filamentous bacteria are in green. The floc and filament masks are converted to 512 × 384 greyscale images in Figs. 6(c) and 6(d), respectively. A total of 69 typical images and their mask labels are selected to form the sample dataset. The dataset is randomly divided into 52 images for training, 7 for validation, and 10 for testing. The testing dataset is used to evaluate the accuracy of the activated sludge image segmentation based on the improved U-Net deep learning network.
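The random 52/7/10 division of the 69 images can be reproduced in outline as below. The seed and the function name are assumptions for illustration; the text does not state how the random split was generated.

```python
import numpy as np

def split_indices(n_total=69, n_train=52, n_val=7, seed=0):
    """Randomly partition image indices into training, validation, and test sets."""
    rng = np.random.default_rng(seed)        # fixed seed so the split is repeatable
    order = rng.permutation(n_total)
    train = order[:n_train]
    val = order[n_train:n_train + n_val]
    test = order[n_train + n_val:]           # the remaining images
    return train, val, test

train_idx, val_idx, test_idx = split_indices()
```

Fixing the seed keeps the test images identical across runs, which matters when several models are compared on the same 10 test images.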

Figure 7 shows the U-Net model training of the floc and filament image segmentations. In Figs. 7(a) and 7(b), the left diagram shows the combined loss function and the average IoU score value of the training data. The right diagram shows the combined loss function and the average IoU score value of the validation data. As shown in Fig. 7, the convergence rate of the filamentous bacterial U-Net model is higher than that of the floc U-Net model.

Fig. 7. (Color online) U-Net model training of floc and filament image segmentations. (a) Training of the U-Net model for microbial floc segmentation. (b) Training of the U-Net model for filamentous bacterial segmentation.

Table 1
Segmentation performance characteristics of four methods for filamentous bacteria and flocs.