Hi there,
Please see the two attached screenshots. The test results of my machine learning model for certificate image classification are not what I expected: many images labeled as valid are classified as invalid. Is there any way to adjust this?
Thanks
Hi @Architdz_johoce,
Welcome to Google Cloud Community!
When an image classification model in Vertex AI Model Registry misclassifies valid certificates as invalid, the cause is usually a data issue, a model issue, or an evaluation issue. The most likely explanation is that the model hasn't seen enough examples, or that the examples it has seen are confusing or wrongly labeled.
First, ensure you have many clear, correctly labeled pictures of both valid and invalid certificates. If one type is far more common than the other (class imbalance), you need to balance them out. Techniques like oversampling (duplicating examples of the minority class) or undersampling (removing examples from the majority class) can help. Alternatively, you can adjust class weights during model training to give more importance to the minority class.
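To make the balancing options above concrete, here is a minimal sketch in plain Python. It is not Vertex AI code: the file names, the 8-to-2 class split, and the helper names `class_weights` and `oversample` are assumptions for illustration only. The weight formula is the common inverse-frequency heuristic, n_samples / (n_classes * class_count).

```python
import random
from collections import Counter


def class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * cnt) for c, cnt in counts.items()}


def oversample(samples, labels, seed=0):
    """Duplicate random minority-class examples until every class
    has as many examples as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(items) for items in by_class.values())
    out_samples, out_labels = [], []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Hypothetical toy data: 8 "invalid" images and only 2 "valid" ones.
labels = ["invalid"] * 8 + ["valid"] * 2
samples = [f"img_{i}.png" for i in range(10)]

weights = class_weights(labels)
balanced_samples, balanced_labels = oversample(samples, labels)

print(weights)
print(Counter(balanced_labels))
```

If you try something like this, apply the oversampling to the training split only, so that duplicated images do not leak into the test set and inflate the results.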
Next, check if you're using the right kind of model and have correctly tuned its hyperparameters. And finally, ensure you're evaluating your model fairly using appropriate metrics. You might need to look at different aspects of its performance, such as precision, recall, and F1-score, not just the overall accuracy.
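As a rough illustration of why accuracy alone can hide this problem, the sketch below computes precision, recall, and F1 for the "valid" class by hand. The labels and predictions are made-up toy values, not your results, and `precision_recall_f1` is a hypothetical helper name.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F1 for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical test set: 4 valid and 6 invalid certificates.
# The model finds only 1 of the 4 valid ones.
y_true = ["valid"] * 4 + ["invalid"] * 6
y_pred = ["valid", "invalid", "invalid", "invalid"] + ["invalid"] * 6

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
precision, recall, f1 = precision_recall_f1(y_true, y_pred, positive="valid")

print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
```

In this toy case the accuracy is 70%, yet recall for "valid" is only 25%, which is the pattern described in the question: most valid certificates are being rejected.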
If you closely examine the pictures the model misclassifies, you might identify patterns – perhaps it struggles with specific lighting conditions or certificate types. This will help pinpoint the problem's root cause. It's a process of careful checking, adjustment, and refinement.
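One way to look for such patterns is to tag each test image with a few attributes and compare error rates per attribute value. The sketch below assumes you have recorded that metadata yourself; the `lighting` field, the file names, and the `error_rate_by_group` helper are all hypothetical.

```python
from collections import Counter


def error_rate_by_group(records, key):
    """Fraction of misclassified records for each value of records[key]."""
    totals, errors = Counter(), Counter()
    for record in records:
        group = record[key]
        totals[group] += 1
        if record["label"] != record["predicted"]:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}


# Hypothetical evaluation records with a hand-annotated "lighting" attribute.
records = [
    {"file": "a.png", "label": "valid", "predicted": "invalid", "lighting": "dim"},
    {"file": "b.png", "label": "valid", "predicted": "invalid", "lighting": "dim"},
    {"file": "c.png", "label": "valid", "predicted": "valid", "lighting": "bright"},
    {"file": "d.png", "label": "invalid", "predicted": "invalid", "lighting": "bright"},
    {"file": "e.png", "label": "valid", "predicted": "valid", "lighting": "dim"},
]

rates = error_rate_by_group(records, "lighting")
for group, rate in sorted(rates.items(), key=lambda item: -item[1]):
    print(f"{group}: {rate:.0%} misclassified")
```

A group with a much higher error rate than the others (dimly lit images in this toy data) is a good candidate for collecting more training examples.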
Here are some Google Cloud documentation links that you may find helpful:
Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.
Hi @ruthseki
Thanks for the detailed instructions; balancing the samples solved the problem.
Sorry for the late response, I was stuck on another project.