Improving Text Recognition Model Accuracy

Taeyang's Learning Lab 2025. 3. 18. 00:51

Before evaluating performance on the test dataset, we first checked whether the model was overfitting by comparing two training sessions.

The first model, trained with the training and validation datasets, reached accuracy = 0.8257 and val_accuracy = 0.5418.

The second model, trained with the same datasets, reached accuracy = 0.9244 and val_accuracy = 0.3894.

As training progressed, accuracy on the training set increased while accuracy on the validation set decreased. This suggests that the model is overfitting the training data.
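The widening gap can be checked with a little arithmetic. The sketch below only uses the four accuracy figures reported above; the function name `generalization_gap` is made up for illustration.

```python
def generalization_gap(train_acc: float, val_acc: float) -> float:
    """Return training accuracy minus validation accuracy."""
    return train_acc - val_acc


# Figures reported for the two training sessions.
first_gap = generalization_gap(0.8257, 0.5418)
second_gap = generalization_gap(0.9244, 0.3894)

print(f"first session gap:  {first_gap:.4f}")
print(f"second session gap: {second_gap:.4f}")

# A gap that grows between sessions is the overfitting signal described above.
gap_is_growing = second_gap > first_gap
```

A larger gap in the second session means the model fits the training data better while generalizing worse.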

We revised the data preprocessing and tuned the hyperparameters to reduce overfitting and raise accuracy on the test set.

The learning rate and dropout values were the hyperparameters considered. Throughout these experiments, the number of epochs was fixed at 50, with early stopping applied through a callback.
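As a rough illustration of how early stopping interacts with a 50-epoch cap, here is a framework-free sketch. The post does not say which metric was monitored or what patience was used, so monitoring validation loss and `patience=5` are assumptions, and `early_stopping_epoch` is a hypothetical helper, not the callback used in the project.

```python
def early_stopping_epoch(val_losses, patience=5, max_epochs=50):
    """Return the 1-based epoch at which training would stop.

    Training stops once the validation loss has failed to improve for
    `patience` consecutive epochs, or when `max_epochs` is reached.
    """
    best_loss = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if loss < best_loss:
            # New best value: reset the patience counter.
            best_loss = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    # No early stop was triggered: training ran to the cap (or ran out of data).
    return min(len(val_losses), max_epochs)


# Made-up validation losses: improvement stalls after the third epoch.
stalled = [1.0, 0.8, 0.7, 0.75, 0.72, 0.74]
print(early_stopping_epoch(stalled, patience=3))
```

With early stopping in place, 50 acts as an upper bound on training length rather than a fixed duration.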

1. Modifying the stopword list

While tokenizing the text data, unnecessary words are removed using a stopword list, which lets the model infer emotions from the text more effectively.

Before modification: ['은', '는', '이', '가', '을', '를']

After modification: [ "의", "가", "이", "은", "들", "는", "좀", "잘", "걍", "과", "도", "를", "으로", "자", "에", "와", "한", "하다", "에서", "까지", "부터", "마다", "보다", "더", "만", "요", "그리고", "그러나", "하지만", "또한", "때문에", "그래서", "무엇", "어디", "왜", "어떻게", "그래도", "그런데", "그러면", "하면", "이다", "이런", "저런", "뿐", "만큼", "정도" ]

The stopword list consists mainly of particles, connective words, and verbs that carry little meaning on their own.

Because the text recognition model is a hybrid of CNN and Bi-LSTM that can grasp the context of a sentence, conjunctions that help infer the context were excluded from the list.

As a result, the accuracy of the test set increased from 0.5670 (before modification) to 0.5907 (after modification).
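The filtering step can be sketched as follows. The two stopword lists are copied from above; the sample tokens and the helper name `remove_stopwords` are made up for illustration, and the input is assumed to be already tokenized (the post does not say which tokenizer was used).

```python
STOPWORDS_BEFORE = {"은", "는", "이", "가", "을", "를"}

STOPWORDS_AFTER = {
    "의", "가", "이", "은", "들", "는", "좀", "잘", "걍", "과", "도", "를",
    "으로", "자", "에", "와", "한", "하다", "에서", "까지", "부터", "마다",
    "보다", "더", "만", "요", "그리고", "그러나", "하지만", "또한", "때문에",
    "그래서", "무엇", "어디", "왜", "어떻게", "그래도", "그런데", "그러면",
    "하면", "이다", "이런", "저런", "뿐", "만큼", "정도",
}


def remove_stopwords(tokens, stopwords):
    """Return the tokens that are not in the stopword set, in original order."""
    return [token for token in tokens if token not in stopwords]


# Hypothetical tokenized sentence, for illustration only.
tokens = ["영화", "가", "정말", "재미있다", "그리고", "감동", "도"]

filtered_before = remove_stopwords(tokens, STOPWORDS_BEFORE)
filtered_after = remove_stopwords(tokens, STOPWORDS_AFTER)
print(filtered_before)
print(filtered_after)
```

Using a set keeps each membership check constant-time, which matters when the filter runs over every token in the dataset.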

2. Modifying Dropout

Although the updated stopword list raised the test set's accuracy to 0.5907, we were still concerned about overfitting, because accuracy on the training set remained high while accuracy on the validation and test sets remained low.

Therefore, we considered adjusting the number of dropout layers and the dropout rate.

We wanted to know which of the two factors, the number of dropout layers or the dropout rate, has more influence on preventing overfitting. To find out, we removed one dropout layer from the existing model, raised the dropout rate from 0.4 to 0.5 in exchange, and then checked the degree of overfitting.

Before modification: 0.5907

After modification (one fewer dropout layer, dropout rate raised to 0.5): 0.6162

Comparing accuracy before and after the modification, accuracy on the training set decreased, while accuracy on the validation set and the test set both increased.

Although this may vary from model to model, for the current model we concluded that the number of dropout layers has the greater impact on preventing overfitting.
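For readers unfamiliar with what the dropout rate controls, here is a minimal pure-Python sketch of inverted dropout, the variant commonly used by deep learning frameworks. It is an illustration only, not the project's model code; the function name `dropout` and the sample inputs are assumptions.

```python
import random


def dropout(activations, rate, rng):
    """Apply inverted dropout to a list of activations.

    Each value is zeroed with probability `rate`; the survivors are scaled
    by 1 / (1 - rate) so the expected sum stays the same during training.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep_prob = 1.0 - rate
    return [a / keep_prob if rng.random() >= rate else 0.0 for a in activations]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
ones = [1.0] * 10_000

dropped_04 = dropout(ones, 0.4, rng)
dropped_05 = dropout(ones, 0.5, rng)

# Fraction of units that were zeroed out at each rate.
frac_04 = dropped_04.count(0.0) / len(ones)
frac_05 = dropped_05.count(0.0) / len(ones)
print(f"rate 0.4 zeroed {frac_04:.2%}, rate 0.5 zeroed {frac_05:.2%}")
```

Raising the rate from 0.4 to 0.5 zeroes roughly half of the units in a layer on each training step instead of about two fifths, which is a stronger regularizer per layer.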

3. Modifying the Learning Rate

Existing learning rate: 0.0001

Test accuracy when learning rate is 0.0003: 0.6162 -> 0.6212

Test accuracy when learning rate is 0.0005: 0.6212 -> 0.6104

Test accuracy when learning rate is 0.001: 0.6104 -> 0.6152

Raising the learning rate from the existing 0.0001 to 0.0003 increased the test accuracy. Raising it further made little difference in accuracy. Based on this, we treated 0.0003 as the optimal learning rate and trained the model with it.
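The comparison above amounts to picking the learning rate with the highest test accuracy. A small sketch using only the figures reported in this section (the variable names are made up for illustration):

```python
# Test accuracy reported for each learning rate that was tried.
# 0.0001 is the existing rate, whose accuracy is the 0.6162 from section 2.
test_accuracy_by_lr = {
    0.0001: 0.6162,
    0.0003: 0.6212,
    0.0005: 0.6104,
    0.001: 0.6152,
}

# Select the learning rate whose run scored highest.
best_lr = max(test_accuracy_by_lr, key=test_accuracy_by_lr.get)
best_acc = test_accuracy_by_lr[best_lr]

# Spread between the best and worst runs, to show how small the differences are.
spread = best_acc - min(test_accuracy_by_lr.values())

print(f"best learning rate: {best_lr} (test accuracy {best_acc})")
print(f"spread across all runs: {spread:.4f}")
```

The spread across all four runs is only about one percentage point, so the choice of 0.0003 is best read as a reasonable setting rather than a sharply defined optimum.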
