A Modern Implementation of System Call Sequence Based Host-based Intrusion Detection Systems

Jeffrey Byrnes, Thomas Hoang, Nihal Nitin Mehta, Yuan Cheng

Research output: Chapter in Book/Conference proceeding › Conference contribution › peer-review

9 Citations (Scopus)

Abstract

Much research concentrates on improving models for host-based intrusion detection systems (HIDS). Typically, such research aims to improve a model's results (e.g., reducing the false positive rate) in the familiar static training/testing environment using the standard data sources. Matching advancements in the machine learning community, researchers in the syscall HIDS domain have developed many complex and powerful syscall-based models to serve as anomaly detectors. These models typically show an impressive level of accuracy while emphasizing a minimal false positive rate. However, with each proposed model iteration, we get further from the setting in which these models are intended to operate. As kernels become more ornate and hardened, the implementation space for anomaly detection models is narrowing. Furthermore, the rapid advancement of operating systems, and the underlying complexity it introduces, means that the sometimes decades-old datasets have long been obsolete. In this paper, we attempt to bridge the gap between theoretical models and their intended application environments by examining the recent Linux kernel 5.7.0-rc1. In this setting, we examine the feasibility of syscall-based HIDS in modern operating systems and the constraints imposed on the HIDS developer. We discuss how recent changes to the kernel have eliminated the previous syscall trace collection method of writing syscall table wrappers, and we propose a new approach to generating data and placing our detection model. Furthermore, we present the specific execution time and memory constraints that models must meet in order to be operable within their intended settings. Finally, we conclude with preliminary results from our model, which primarily show that in-kernel machine learning models are feasible, depending on their complexity.

Original language: English
Title of host publication: Proceedings - 2020 2nd IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications, TPS-ISA 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 218-225
Number of pages: 8
ISBN (Electronic): 9781728185439
DOIs
Publication status: Published - Oct 2020
Externally published: Yes
Event: 2nd IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications, TPS-ISA 2020 - Virtual, Atlanta, United States
Duration: 1 Dec 2020 → 3 Dec 2020

Publication series

Name: Proceedings - 2020 2nd IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications, TPS-ISA 2020

Conference

Conference: 2nd IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications, TPS-ISA 2020
Country/Territory: United States
City: Virtual, Atlanta
Period: 1/12/20 → 3/12/20

Keywords

  • hidden Markov model
  • host-based intrusion detection
  • system calls

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Information Systems and Management
  • Safety, Risk, Reliability and Quality
