The widespread use of lithium-ion batteries in electric vehicles, energy storage, and portable electronics underscores the need for accurate, precise, and reliable state of charge (SoC) estimation. SoC underpins core battery management system functions such as charge-discharge control, energy and health estimation, power capability prediction, cell balancing, and lifetime management. However, monitoring SoC accurately remains challenging because of the complex multiphysics governing battery cell behavior. To address this, we propose an explainable machine-learning approach for SoC estimation, leveraging three distinct datasets comprising eleven measurement types.
This approach integrates three machine learning techniques: Peter and Clark Momentary Conditional Independence (PCMCI) causal analysis, a long short-term memory (LSTM) network, and a multi-layer perceptron (MLP). For each dataset, PCMCI causal analysis is first applied to identify indirect or redundant variables, reducing unnecessary noise and the risk of overfitting. The refined inputs are then processed by an LSTM neural network to model temporal dynamics, followed by an MLP that performs the final regression step. The model can handle a wide range of variables relevant to SoC estimation, including current, voltage, temperature, physical battery expansion, cathode and anode potential, light intensity, peak wavelength, and gas pressure. To evaluate the proposed framework, we conducted comparative experiments using the LSTM-MLP architecture with and without PCMCI-based variable selection. Across all three datasets, incorporating PCMCI consistently moved the coefficient of determination closer to unity, with R^2 increasing from 0.961-0.9 to 0.999, highlighting the potential of causal variable selection to enhance model generalization and robustness. Interestingly, the model without PCMCI outperformed the PCMCI-enhanced model in terms of root mean squared error (RMSE) and mean absolute error (MAE). This suggests that although PCMCI reduces dimensionality, it may also constrain the model's ability to capture relevant temporal dynamics, especially in sequence-aware architectures such as LSTMs. These findings highlight the strength of sequence-based deep learning architectures and lay the groundwork for future research on applying transformer neural networks and other advanced models to improve SoC estimation accuracy for lithium-ion batteries in real-world battery management systems.
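The three evaluation metrics named above (R^2, RMSE, and MAE) can be illustrated with a minimal sketch. It uses the standard textbook definitions, which are assumed here rather than taken from the study's own evaluation code; the function name `regression_metrics` and the sample SoC values are hypothetical and serve only as an illustration.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (r2, rmse, mae) for two equal-length sequences of SoC values.

    Standard definitions (an assumption; the source does not list formulas):
      R^2  = 1 - SSE / SST
      RMSE = sqrt(SSE / n)
      MAE  = mean(|error|)
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # Sum of squared errors between reference and estimated SoC.
    sse = sum(e * e for e in errors)

    # Total sum of squares around the mean of the reference SoC.
    mean_true = sum(y_true) / n
    sst = sum((t - mean_true) ** 2 for t in y_true)

    # R^2 is undefined when the reference signal is constant.
    r2 = 1.0 - sse / sst if sst > 0 else float("nan")
    rmse = math.sqrt(sse / n)
    mae = sum(abs(e) for e in errors) / n
    return r2, rmse, mae


# Hypothetical SoC values as fractions of full charge (illustration only).
soc_true = [0.0, 0.25, 0.5, 0.75, 1.0]
soc_pred = [0.1, 0.25, 0.5, 0.75, 0.9]

r2, rmse, mae = regression_metrics(soc_true, soc_pred)
print(f"R^2={r2:.3f}  RMSE={rmse:.4f}  MAE={mae:.4f}")
```

R^2 is normalized by the variance of the reference SoC, whereas RMSE and MAE are expressed in the units of SoC itself, so comparisons between model variants are most meaningful when all three metrics are computed on the same test split.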