Development of Smart Pillbox Using 3D Printing Technology and Convolutional Neural Network Image Recognition

In this study, we propose a complete concept of an active smart pillbox, which comprises a main control unit, a pill dispenser unit, and application software (app) for the automatic dispensing of medicine. The smart pillbox employs convolutional neural network image recognition and 3D printing technology. We adopt an Arduino-based platform to control the rotation and stopping of the motor to dispense the required quantity of pills as the first step towards a fully automated process. A smartphone can be connected to the smart pillbox by Bluetooth and be used to set the parameters of the system. This pillbox can be used at home and allows users to set the medication time and pill type from their smartphone using an app. Moreover, it can remind users to take their medicine. The device is very promising for use in home care and clinical practice.


Introduction
Pill packaging is one of the key tasks of every hospital and clinic. Particularly in large hospitals, the pharmacy department must exert much effort in handling many types of medicine for packaging. In addition to human factors, medication safety is a major consideration, i.e., whether the quantity and content of the drugs are correct. To solve this problem, some companies, such as TOSHO in Japan (1) and JVM in Korea (2), have developed automated drug-packaging systems with which prescriptions are accepted from various units of the hospital through a computer, and the drugs are then delivered according to the prescription. The pills are counted, placed in a medicine bag, and sealed, with a label printed to complete the packaging of the medicine. In this study, we modify this concept of an automated drug-packaging system to develop a personalized or family-type medicine management system, i.e., a smart pillbox. In the development of the application software (app) for use with a smartphone, with basic functions such as timing and counters, the dispensing of pills only requires defining their mixture during the initial setting of parameters. The system can inform the drug taker when to take the medicine, and according to the parameters set, the pills are dispensed separately. The pills are sorted one by one for consumption after each meal and separated into different drug delivery units. Although there are similar smart products (3) and patents (4,5), users or caregivers are still required to assist in dividing and dispensing medicines. (6)(7)(8) The proposed system can assist in accurate drug delivery and achieve a simpler and safer environment for dispensing medication.

System design and operation flow
The system was implemented as a mechanism fabricated by 3D printing, with signal control by an Arduino-based platform, and programming by an app. The time and type of medication are set through the app, then the smartphone transmits the settings to the main Arduino controller through a Bluetooth wireless connection. The Arduino (master) sends the action commands through a MAX485 to each Arduino (slave). Each Arduino (slave) receives the command and starts the action. While the Arduino (slave) units act, the main Arduino (master) continues transmitting messages to each Arduino (slave) until the actions of all Arduino (slave) units are completed. After the pills are completely dispensed, the main Arduino (master) returns a message to the smartphone to inform the user to take the medicine. The overall structure of the system is shown in Fig. 1. Figure 2 shows the assembly of the pillbox system drawn in SolidWorks. The pillbox and rotating gear were manufactured using a 3D printing machine. Details of the parts are shown in Fig. 3.
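The master-to-slave command exchange over the MAX485 (RS-485) link can be sketched in software. The paper does not specify the wire format, so the start byte, command code, and checksum scheme below are illustrative assumptions, shown here as a simulation rather than the actual firmware:

```python
# Hypothetical sketch of master-to-slave command framing over the MAX485
# (RS-485) link. The byte layout is not given in the paper: the 0xAA start
# byte, command code, and additive checksum are assumptions for illustration.

def build_dispense_frame(slave_addr: int, pill_count: int) -> bytes:
    """Build a command frame telling one slave box to dispense pill_count pills."""
    CMD_DISPENSE = 0x01                           # assumed command code
    payload = bytes([slave_addr, CMD_DISPENSE, pill_count])
    checksum = sum(payload) & 0xFF                # simple 8-bit additive checksum
    return b"\xAA" + payload + bytes([checksum])  # 0xAA = assumed start byte

def parse_frame(frame: bytes):
    """Validate a frame on the slave side and return (addr, cmd, count)."""
    if frame[0] != 0xAA or (sum(frame[1:4]) & 0xFF) != frame[4]:
        raise ValueError("bad frame")
    return frame[1], frame[2], frame[3]
```

On the real hardware the frame bytes would be written to the serial bus (e.g., with pyserial's `Serial.write()` on a PC, or `Serial.write()` in the Arduino firmware); only the framing logic is shown here.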
The smart pillbox is initially operated with a button setting, and MAX485 wired transmission is used to control multiple boxes with a main Arduino-based controller. To increase the number of types of pills that can be stored, the Arduino (master) gives instructions to the slaves through the MAX485, sets the number of each type of pill, and operates the system. MIT App Inventor is used to develop the smart mobile app for the required functions: putting different pills into different boxes, defining which pills are to be dispensed, and then setting the medication commands. The times when the pills should be dropped are set according to the arranged kit number. Multiple times can be set, and the user is reminded to take the medicine at these different times.
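The schedule the app transmits can be represented by a simple mapping from kit numbers to medication times. The field names and the reminder check below are illustrative assumptions, not the actual app's data model:

```python
from datetime import time

# Hypothetical sketch of the medication schedule set through the app:
# each kit number maps to its medication times and pill count.
schedule = {
    1: {"times": [time(8, 0), time(20, 0)], "pills": 2},  # kit 1: morning/evening
    2: {"times": [time(12, 30)], "pills": 1},             # kit 2: after lunch
}

def kits_due(now: time):
    """Return the kit numbers whose medication time matches the current time."""
    return [kit for kit, entry in schedule.items() if now in entry["times"]]
```

A reminder would be raised (and the dispense command sent) whenever `kits_due` returns a non-empty list.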
After the number of pills to be dispensed is set, a command is given to start the motor and the sensor. As the motor operates, the gears on the motor drive the gears connected to the rotating teeth, and the rotating teeth rotate the medicine tank. When a capsule reaches the medicine chute, it drops. A sensor next to the hole detects the number of capsules dropped; for each drop, the value on the seven-segment display decreases by one. Figure 4 shows pictures of the whole pillbox system.

Pill image recognition and training model
Recognizing a picture involves three processes: identification, looking, and scaling down. These three processes are carried out using a convolutional neural network (CNN), which is the most widely used model architecture for image recognition. The first convolution layer uses a small filter to obtain the inner product. The original picture of size 83 × 83 becomes 75 × 75 after convolution with 9 × 9 filters (83 − 9 + 1 = 75). A depth of 64 is used to preserve the color information, which is left unchanged by the first filter. This action is in line with the first two of the three processes of recognizing a picture mentioned above. In the third process, max pooling is implemented using a 10 × 10 mask with a stride of 5, and the largest value in each window is selected, resulting in a 64 × 14 × 14 feature map. Throughout the network, the convolution and max-pooling actions can be repeated.
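The shape arithmetic above can be checked with a small helper (a sketch; the function name is ours):

```python
def conv_out(size: int, kernel: int, stride: int = 1) -> int:
    """Output width of a valid (no-padding) convolution or pooling window."""
    return (size - kernel) // stride + 1

# 83 x 83 input convolved with 9 x 9 filters (stride 1, no padding)
after_conv = conv_out(83, 9)        # 83 - 9 + 1 = 75
# 10 x 10 max-pooling mask with a stride of 5
after_pool = conv_out(75, 10, 5)    # (75 - 10) / 5 + 1 = 14
print(after_conv, after_pool)       # 75 14  -> a 64 x 14 x 14 feature map
```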
In well-known CNN architectures such as GoogLeNet, ResNet, and VGG16, the above concepts are used while varying the size or number of convolution layers; the fundamental concept is not markedly different. The proposed system uses the GoogLeNet architecture for model training. Since the data set obtained in this project is very small, it is difficult to achieve data diversity even with data augmentation. However, an advantage of our system is that the background of the object is highly constrained, so no other miscellaneous objects affect identification. Therefore, for this system, we adopt the concept of the Siamese network, which is often used for face recognition. The principle is to use the neural network to capture the features of the image: instead of entering the final fully connected layer for classification, the network outputs a (128, 1) feature vector, also known as an embedding. The photo of the item to be identified and a photo selected from the database are each passed through the model to obtain a (128, 1) vector, and the two vectors are then compared. To train this model, we must first define a loss function to measure the distance between embeddings. Here, we use the triplet-loss function, which involves three photos: the anchor and the positive and negative images relative to the anchor (Fig. 5).
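The triplet-loss computation can be sketched as follows (a minimal NumPy version; the margin value is an assumption, and real training would backpropagate through the embedding network):

```python
import numpy as np

# Sketch of the triplet loss used to train the Siamese embedding: the anchor
# should be closer to the positive than to the negative by at least a margin.
# Embeddings are 128-dimensional vectors, as described in the text.

def triplet_loss(anchor, positive, negative, margin=0.2):
    """max(0, ||a - p||^2 - ||a - n||^2 + margin)."""
    d_pos = np.sum((anchor - positive) ** 2)   # anchor-positive distance
    d_neg = np.sum((anchor - negative) ** 2)   # anchor-negative distance
    return max(0.0, d_pos - d_neg + margin)
```

At inference time, each photo is mapped to its embedding and compared (e.g., by Euclidean distance) with the embeddings of the database photos, rather than being classified by a fully connected layer.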

Results
For the individual detection of pills, because the data set consists of pictures of single pills, images containing multiple pills are processed to cut out and separate each pill using the edge detection technique. The principle of this technique is to find abrupt changes in pixel intensity at edges, as shown in Fig. 6 as an example. The steps are as follows. Gaussian smoothing: the purpose is to eliminate noise and smooth the image (here, a 3 × 3 mask is used). Canny operation: the horizontal and vertical gradients are calculated with the Sobel operator to find the boundary. Finding the contour: the findContours function of cv2 is used; with the CV_RETR_EXTERNAL parameter, only the outline of the outermost layer is taken. Cropping a square image: the cv2.boundingRect function is used and the crop is saved as an image file (Fig. 7). We also define the training error of the proposed samples as E1 = (1/m) Σ_{n=1}^{m} (X_n − Y_n)², where X_0 is the initial value of the training sample, m is the training size, X_n is the expected output of the training samples, and Y_n is the network simulation output of the training samples.
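The segmentation steps above can be sketched in simplified, dependency-free form. The actual system uses cv2's GaussianBlur, Canny, findContours, and boundingRect; the NumPy stand-ins below (box blur instead of a Gaussian, a gradient threshold instead of full Canny hysteresis) only illustrate the same smooth-gradient-crop pipeline:

```python
import numpy as np

# Simplified NumPy version of the pill-segmentation steps: smooth the image,
# compute Sobel gradients, threshold the edge strength, and take the bounding
# rectangle of the detected edges (a stand-in for cv2's pipeline).

def box_blur3(img):
    """3 x 3 smoothing mask to suppress noise (stand-in for Gaussian blur)."""
    out = np.zeros(img.shape, dtype=float)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out += np.roll(np.roll(img, dy, axis=0), dx, axis=1)
    return out / 9.0

def sobel_magnitude(img):
    """Combine horizontal and vertical Sobel gradients into edge strength."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    gx = np.zeros(img.shape, dtype=float)
    gy = np.zeros(img.shape, dtype=float)
    for i in range(3):
        for j in range(3):
            # shift so that shifted[y, x] == img[y + i - 1, x + j - 1]
            shifted = np.roll(np.roll(img, 1 - i, axis=0), 1 - j, axis=1)
            gx += kx[i, j] * shifted        # horizontal gradient
            gy += kx.T[i, j] * shifted      # vertical gradient
    return np.hypot(gx, gy)

def bounding_box(img, thresh=1.0):
    """Bounding rectangle (y0, x0, y1, x1) of pixels with strong edges."""
    edges = sobel_magnitude(box_blur3(img)) > thresh
    ys, xs = np.nonzero(edges)
    return ys.min(), xs.min(), ys.max(), xs.max()
```

The returned rectangle corresponds to the region that cv2.boundingRect would crop and save as a single-pill image file.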
To improve the safety of medication, a reminder is provided to users who need to take pills regularly, without the erroneous repetition of medications. The purpose of this study is to develop an intelligent automatic pillbox system, tailored for personalized use, on the basis of the concept of the pharmaceutical packaging systems used in smart hospitals. The development of the smartphone app provides a more convenient and safe medication system for patients' home care. It is not necessary to put the tablets one by one into a designated area according to the medication time of the user, which reduces wasted time and unnecessary exposure of the tablets.
The proposed system not only reminds users to take medicine at the right time, but also provides the correct pills and doses. A camera was integrated into the proposed system to acquire pill images, which are transferred to the app for CNN recognition. Finally, the proposed system uses wireless connections to increase convenience and safety in taking medicine. At present, the smart medicine kits can mostly be correctly dispensed according to the set numbers, but the experimental results are still partially limited.
There are still some issues with this system that should be improved. For example, when too many capsules are placed in the medicine box, they generate excessive friction, and the torque of the motor is limited. At present, only fixed-form capsules can be processed, so the system design is applicable only to capsule-type drugs.

Conclusions
The recognition rate of the model on the training set can reach more than 90%, but the recognition rate on data that the model has not seen is still quite low. The reason is the lack of diversity of the data, which leads to the poor recognition of unseen images. The result of retraining after adjusting the layers remains unsatisfactory. To exploit the constrained picture background, we changed to using a Siamese network, but the recognition rate is still not markedly improved. Moreover, because the Siamese network must compare each image with the database photos, processing takes longer. In the future, it is necessary to improve this system so that it can simultaneously dispense different drugs. Existing data sets usually include the type of medicine, with both positive and negative images. If we can accumulate more data sets in the future, the accuracy of the model can be increased, and the model can be compared with other models. Models such as YOLO could also eliminate the need for separate edge detection.