Description
Effective data partitioning is critical in machine learning, especially for high-cost physical systems. Cross-validation techniques such as K-Fold Cross-Validation (KFCV), though used to enhance model robustness, are known to compromise generalization assessment because of their extensive computation and data-shuffling demands. This study uses Simple Random Sampling (SRS) to mitigate the weaknesses of KFCV. Although SRS generally yielded representative samples, it risked producing non-representative sets when the data were imbalanced. Integrating both methods minimized biases, blending the simplicity of SRS with the accuracy of KFCV to optimize data partitioning. The hybrid method improved mean and variance convergence across diverse dataset sizes and trials, effectively balancing performance under various conditions. It proved beneficial in resource-constrained environments and with extensive datasets, providing practical solutions for effective machine learning implementations.
QLS Seminar: Hybrid Data Partitioning for Enhanced Machine Learning in High-Cost Systems