Showing posts from January, 2024

How to design a solution for a storage monitoring system?

I was recently asked to design a monitoring system that covers several aspects of a system: time-series data, configuration data, alert data, hardware failures, and SNMP traps. The system needs to be scalable, highly available, and fault tolerant. Because the system involves a storage box whose statistics and components must be monitored, I chose to store the time-series data in TDengine, a highly scalable time-series database. It is a good choice for high-cardinality data because it uses a partitioned storage model and columnar encoding to efficiently store and query large volumes of data with a wide range of unique values. It also handles data ingestion, storage, and retrieval efficiently. To handle configuration data, I chose Puppet, a configuration management tool that lets the admin configure all the major tools and components of other systems. Puppet can automate the deli