TY - JOUR
T1 - A comprehensive review on machine learning-based VPN detection: Scenarios, methods, and open challenges
AU - Guerra-Manzanares, Alejandro
AU - Caprolu, Maurantonio
AU - Di Pietro, Roberto
PY - 2025/11
Y1 - 2025/11
N2 - Virtual Private Networks (VPNs) are an essential tool to protect user privacy and enforce secure communications over the Internet. However, they can also be misused to bypass legitimate network security mechanisms and hence access otherwise restricted content. These reasons, combined with the fact that VPN supporting technology has continuously evolved, reaching a considerable level of sophistication, make detecting VPN traffic a research issue of significant interest to both academia and industry. In this paper, we provide a comprehensive review of machine learning (ML)-based solutions for VPN traffic detection. We start by framing the problem and identifying the main scenarios and related adversary models. Then, we provide a thorough analysis of the related literature and the state of the art in ML methodologies for VPN detection, identifying research gaps and unresolved challenges. In particular, we show that the vast majority of current solutions rely on a specific dataset that suffers from a few severe limitations, which calls into question the validity of reported results when applied to real use case scenarios. Finally, we summarize existing knowledge, highlighting common mistakes and providing guidelines as well as future research directions. To the best of our knowledge, this is the first paper that provides a deep dive into ML methodologies for VPN detection, showing current pitfalls, providing actionable recommendations, and suggesting research directions.
AB - Virtual Private Networks (VPNs) are an essential tool to protect user privacy and enforce secure communications over the Internet. However, they can also be misused to bypass legitimate network security mechanisms and hence access otherwise restricted content. These reasons, combined with the fact that VPN supporting technology has continuously evolved, reaching a considerable level of sophistication, make detecting VPN traffic a research issue of significant interest to both academia and industry. In this paper, we provide a comprehensive review of machine learning (ML)-based solutions for VPN traffic detection. We start by framing the problem and identifying the main scenarios and related adversary models. Then, we provide a thorough analysis of the related literature and the state of the art in ML methodologies for VPN detection, identifying research gaps and unresolved challenges. In particular, we show that the vast majority of current solutions rely on a specific dataset that suffers from a few severe limitations, which calls into question the validity of reported results when applied to real use case scenarios. Finally, we summarize existing knowledge, highlighting common mistakes and providing guidelines as well as future research directions. To the best of our knowledge, this is the first paper that provides a deep dive into ML methodologies for VPN detection, showing current pitfalls, providing actionable recommendations, and suggesting research directions.
KW - VPN detection
KW - Virtual private network
KW - VPN traffic identification
KW - Encrypted traffic
KW - Network security
KW - Machine learning
KW - Deep learning
UR - https://doi.org/10.1016/j.cosrev.2025.100781
U2 - 10.1016/j.cosrev.2025.100781
DO - 10.1016/j.cosrev.2025.100781
M3 - Article
VL - 58
JO - Computer Science Review
JF - Computer Science Review
M1 - 100781
ER -