I had the privilege of speaking about eliminating data security threats at Data Connectors in Houston a couple of weeks ago, and several people asked me how much space "all that audit log data" would take up, and how long you could realistically keep it while still being able to report on it. One person who asked explained that he had a product collecting audit data on a single busy file server, but it could only hold a month or so of data before it consumed a full terabyte of space, and (worse) became almost unusable when generating reports.
If you’ve ever enabled native auditing (like "audit object access: success" on Windows or BSM in Solaris) and taken a look at the logs, you’ve certainly noticed, among other things, the astounding number of events they generate. I just enabled native auditing on my workstation while writing this to get some numbers. I then opened one (existing) file, edited one line, saved it, and closed it; this alone generated 130 distinct events (46 events with ID 4656, 46 with ID 4658, and 38 with ID 4663). With numbers like this, it’s no wonder that collecting and storing raw audit logs can take up so much space, and be so slow to parse through.
This is one of the areas where metadata framework technology really shines in unstructured data protection. Not only can a metadata framework replace the inefficient native auditing functionality of many operating systems, it can also normalize the audit information and store it within intelligent data structures. Normalization eliminates redundant information, and the data structures become much easier to process once the computationally expensive parts of the audit trail (like the path and SID) are converted into integers.
With normalization and intelligent data structures, not only can audit information be stored more efficiently, it is also quicker to search and easier to analyze.
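To make the normalization idea concrete, here is a minimal sketch in Python of how repeated strings in an audit trail can be interned into small integers. The `Interner` class, the sample path, and the sample SID are all illustrative assumptions, not the internals of any particular product; real metadata frameworks would use persistent, indexed structures rather than an in-memory dictionary.

```python
class Interner:
    """Maps repeated strings (paths, SIDs) to small integer IDs."""
    def __init__(self):
        self._ids = {}       # string -> integer ID
        self._strings = []   # integer ID -> string

    def intern(self, value: str) -> int:
        """Return the existing ID for value, or assign a new one."""
        if value not in self._ids:
            self._ids[value] = len(self._strings)
            self._strings.append(value)
        return self._ids[value]

    def lookup(self, id_: int) -> str:
        """Recover the original string when generating a report."""
        return self._strings[id_]


paths = Interner()
sids = Interner()

# Raw audit events repeat the same long path and SID strings over and over.
raw_events = [
    ("C:\\Share\\report.xlsx", "S-1-5-21-1004336348-1177238915-682003330-512", "open"),
    ("C:\\Share\\report.xlsx", "S-1-5-21-1004336348-1177238915-682003330-512", "write"),
    ("C:\\Share\\report.xlsx", "S-1-5-21-1004336348-1177238915-682003330-512", "close"),
]

# After normalization, each event carries a few small integers instead of
# long strings, and the path/SID are stored exactly once in the tables.
normalized = [(paths.intern(p), sids.intern(s), op) for p, s, op in raw_events]
print(normalized)  # [(0, 0, 'open'), (0, 0, 'write'), (0, 0, 'close')]
```

Because integer comparisons are far cheaper than string comparisons, filtering or grouping these normalized events (say, "all activity on this path by this SID") becomes a matter of matching small numbers, which is a big part of why reports run faster.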