A Modern Implementation of System Call Sequence Based Host-based Intrusion Detection Systems

Jeffrey Byrnes; Thomas Hoang; Nihal Nitin Mehta; Yuan Cheng

doi:10.1109/TPS-ISA50397.2020.00037

Research output: Chapter in Book/Conference proceeding › Conference contribution › peer-review

9 Citations (Scopus)

Abstract

Much research is concentrated on improving models for host-based intrusion detection systems (HIDS). Typically, such research aims at improving a model's results (e.g., reducing the false positive rate) in the familiar static training/testing environment using the standard data sources. Matching advancements in the machine learning community, researchers in the syscall HIDS domain have developed many complex and powerful syscall-based models to serve as anomaly detectors. These models typically show an impressive level of accuracy while emphasizing minimization of the false positive rate. However, with each proposed model iteration, we move further from the setting in which these models are intended to operate. As kernels become more ornate and hardened, the implementation space for anomaly detection models is narrowing. Furthermore, the rapid advancement of operating systems and the underlying complexity introduced dictate that the sometimes decades-old datasets have long been obsolete. In this paper, we attempt to bridge the gap between theoretical models and their intended application environments by examining the recent Linux kernel 5.7.0-rc1. In this setting, we examine the feasibility of syscall-based HIDS in modern operating systems and the constraints imposed on the HIDS developer. We discuss how recent advancements to the kernel have eliminated the previous syscall trace collection method of writing syscall table wrappers, and propose a new approach to generate data and place our detection model. Furthermore, we present the specific execution time and memory constraints that models must meet in order to be operable within their intended settings. Finally, we conclude with preliminary results from our model, which primarily show that in-kernel machine learning models are feasible, depending on their complexity.

Original language	English
Title of host publication	Proceedings - 2020 2nd IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications, TPS-ISA 2020
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	218-225
Number of pages	8
ISBN (Electronic)	9781728185439
DOIs	https://doi.org/10.1109/TPS-ISA50397.2020.00037
Publication status	Published - Oct 2020
Externally published	Yes
Event	2nd IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications, TPS-ISA 2020 - Virtual, Atlanta, United States Duration: 1 Dec 2020 → 3 Dec 2020

Publication series

Name	Proceedings - 2020 2nd IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications, TPS-ISA 2020

Conference

Conference	2nd IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications, TPS-ISA 2020
Country/Territory	United States
City	Virtual, Atlanta
Period	1/12/20 → 3/12/20

Keywords

hidden Markov model
host-based intrusion detection
system calls

ASJC Scopus subject areas

Artificial Intelligence
Computer Networks and Communications
Information Systems and Management
Safety, Risk, Reliability and Quality

Access to Document

10.1109/TPS-ISA50397.2020.00037

Cite this

Byrnes, J., Hoang, T., Mehta, N. N., & Cheng, Y. (2020). A Modern Implementation of System Call Sequence Based Host-based Intrusion Detection Systems. In Proceedings - 2020 2nd IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications, TPS-ISA 2020 (pp. 218-225). Article 9325401 (Proceedings - 2020 2nd IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications, TPS-ISA 2020). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/TPS-ISA50397.2020.00037

@inproceedings{fee7540ba22a438db2f26cfd449cbd1f,

title = "A Modern Implementation of System Call Sequence Based Host-based Intrusion Detection Systems",

keywords = "hidden Markov model, host-based intrusion detection, system calls",

author = "Jeffrey Byrnes and Thomas Hoang and Mehta, {Nihal Nitin} and Yuan Cheng",

note = "Publisher Copyright: {\textcopyright} 2020 IEEE.; 2nd IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications, TPS-ISA 2020 ; Conference date: 01-12-2020 Through 03-12-2020",

year = "2020",

month = oct,

doi = "10.1109/TPS-ISA50397.2020.00037",

language = "English",

series = "Proceedings - 2020 2nd IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications, TPS-ISA 2020",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

pages = "218--225",

booktitle = "Proceedings - 2020 2nd IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications, TPS-ISA 2020",

address = "United States",

}

TY - GEN

T1 - A Modern Implementation of System Call Sequence Based Host-based Intrusion Detection Systems

AU - Byrnes, Jeffrey

AU - Hoang, Thomas

AU - Mehta, Nihal Nitin

AU - Cheng, Yuan

PY - 2020/10

Y1 - 2020/10

KW - hidden Markov model

KW - host-based intrusion detection

KW - system calls

UR - http://www.scopus.com/inward/record.url?scp=85100422159&partnerID=8YFLogxK

U2 - 10.1109/TPS-ISA50397.2020.00037

DO - 10.1109/TPS-ISA50397.2020.00037

M3 - Conference contribution

AN - SCOPUS:85100422159

T3 - Proceedings - 2020 2nd IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications, TPS-ISA 2020

SP - 218

EP - 225

BT - Proceedings - 2020 2nd IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications, TPS-ISA 2020

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 2nd IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications, TPS-ISA 2020

Y2 - 1 December 2020 through 3 December 2020

ER -
