Real-time event analysis
Posted by Tate Hansen Sun, 18 Mar 2007 17:17:00 GMT
I just finished a workshop on data stream analysis. The technique exists because massive volumes of data (e.g. system and network events) need to be analyzed in near real time, and with a standard relational database you hit the insertion-rate ceiling fast.
Off-the-shelf DBs (PostgreSQL, MySQL, Oracle, etc.) are unable to commit thousands of events per second while simultaneously running complex queries. To have a chance of analyzing events in a reasonable amount of time, you must analyze the incoming streams of data before inserting the data into a database.
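To make "analyze before inserting" a bit more concrete, here is a minimal sketch in Python. It is not what the workshop presented, just one common way to do it: collapse the raw stream into one small summary per fixed time window, so the database only ever sees the summaries. The names `Event`, `WindowSummary`, `summarize`, and `noisy`, the 10-second window, and the threshold are all made up for illustration, and the sketch assumes events arrive in timestamp order.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, List, NamedTuple, Tuple


class Event(NamedTuple):
    """Hypothetical event shape; real log events carry far more fields."""
    timestamp: float  # seconds since the epoch
    host: str
    severity: str


class WindowSummary(NamedTuple):
    window_start: float
    counts: Dict[Tuple[str, str], int]  # (host, severity) -> event count


def summarize(events: Iterable[Event],
              window_seconds: float = 10.0) -> Iterator[WindowSummary]:
    """Yield one summary per fixed (tumbling) window of a time-ordered stream."""
    window_start = None
    counts: Counter = Counter()
    for ev in events:
        # Align the event to the start of the window it falls in.
        start = ev.timestamp - (ev.timestamp % window_seconds)
        if window_start is None:
            window_start = start
        elif start != window_start:
            # The window has closed: emit its summary and start a new one.
            yield WindowSummary(window_start, dict(counts))
            window_start, counts = start, Counter()
        counts[(ev.host, ev.severity)] += 1
    if window_start is not None:
        # Flush the final, possibly partial, window.
        yield WindowSummary(window_start, dict(counts))


def noisy(summary: WindowSummary, threshold: int = 100) -> List[Tuple[str, str]]:
    """Return the (host, severity) pairs that reached the threshold in a window."""
    return [key for key, n in summary.counts.items() if n >= threshold]
```

With something like this in front of the database, each window costs one insert per (host, severity) pair rather than one per event, and a check such as `noisy` can raise an alert without querying the database at all. The trade-off is that the raw events are gone unless you also archive them somewhere cheaper.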
I ran into this scenario last year while building a central log server from off-the-shelf components. Even a few dozen servers can stream events fast enough that you realize pretty quickly that the typical open-source how-tos on building a system that can store, correlate, and alert are inadequate. Data stream processing is required when things get big.
