log.dirs accepts a comma-separated list of directories, typically one per disk, and distributes partitions across them. However:
It doesn’t rebalance, so some disks could be full while others are nearly empty.
It doesn’t tolerate any disk failure; see KIP-18 for more information.
RAID 10 is probably the best middle ground between performance and reliability.
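To see why spreading partitions over several directories can still leave disks unevenly filled, here is a minimal Python simulation. It assumes the broker places each new partition in the directory that currently holds the fewest partitions and ignores how large they are, which is how the placement is usually described. The mount points and partition sizes are made up for illustration.

```python
def pick_log_dir(partition_counts):
    """Return the directory holding the fewest partitions (ties broken by name)."""
    return min(sorted(partition_counts), key=lambda d: partition_counts[d])


def place_partitions(log_dirs, partition_sizes_gb):
    """Place partitions one by one, looking only at partition counts."""
    counts = {d: 0 for d in log_dirs}
    used_gb = {d: 0 for d in log_dirs}
    for size in partition_sizes_gb:
        target = pick_log_dir(counts)
        counts[target] += 1
        used_gb[target] += size  # size plays no role in the placement decision
    return counts, used_gb


# Hypothetical mount points and partition sizes (in GB).
log_dirs = ["/data/disk1", "/data/disk2"]
sizes = [500, 1, 500, 1, 500, 1]

counts, used_gb = place_partitions(log_dirs, sizes)
print(counts)
print(used_gb)
```

Both directories end up with three partitions each, yet nearly all the data sits on the first disk. Nothing moves it afterwards.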
num.io.threads: number of I/O threads that the server uses for executing requests. You should have at least as many threads as you have disks.
num.network.threads: number of network threads that the server uses for handling network requests. Increase it based on the number of producers/consumers and the replication factor.
KAFKA_HEAP_OPTS: a 5–8 GB heap should be enough for most deployments; the file system cache is far more important. LinkedIn runs a 5 GB heap on servers with 32 GB of RAM.
pcstat can help understand how well the system is caching:
A circuit breaker is an automatically operated electrical switch designed to protect an electrical circuit from damage caused by over-current or overload or short circuit. Its basic function is to interrupt current flow after protective relays detect a fault. A circuit breaker can be reset (either manually or automatically) to resume normal operation.
The software analogue, described in chapter 5.2 of Release It!, prevents repeated calls to a failing service by detecting failures and providing a fallback. Using this pattern, it is possible to avoid cascading failures.
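As a rough illustration of the pattern, here is a minimal Python sketch. It is not the implementation from the book. The class name, the thresholds, and the `call(func, fallback)` interface are assumptions made for this example. After a number of consecutive failures the breaker opens and returns the fallback without touching the service. Once the timeout has passed, it lets a single trial call through.

```python
import time


class CircuitBreaker:
    """Illustrative circuit breaker; names and defaults are hypothetical."""

    def __init__(self, max_failures=3, reset_timeout=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Open: skip the failing service entirely.
                return fallback()
            # Timeout elapsed (half-open): let one trial call through.
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.max_failures:
                # Trip the breaker, or re-open it after a failed trial call.
                self.opened_at = self.clock()
            return fallback()
        # Success: close the circuit and reset the failure count.
        self.failures = 0
        self.opened_at = None
        return result


# Usage: a service that always fails is only contacted twice.
calls = []


def flaky():
    calls.append(1)
    raise ConnectionError("service down")


breaker = CircuitBreaker(max_failures=2, reset_timeout=60.0)
results = [breaker.call(flaky, lambda: "fallback") for _ in range(5)]
print(results, len(calls))
```

The caller always gets an answer, and after the second failure the broken service stops receiving traffic. This is what keeps one failing dependency from dragging its callers down with it.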
Requirement 2: Maintain an inventory of system components in scope for PCI DSS to support effective scoping practices.
You will find that public-key authentication is sometimes forbidden, as it’s almost impossible to ensure that employees rotate their keys, keep the private key safe, and protect it with a strong passphrase.
Using Ansible without SSH key-based authentication is painful if you need to run a playbook against hundreds of servers, as you will need to enter your password ad nauseam.
One of Docker’s killer features is environment parity, yet it feels like one little detail was left untold: how to handle configuration files.
Unless you use the same configuration across development, quality assurance, production, etc., you will end up with different endpoints, API keys, secret tokens and feature switches for each environment.
Available Options
There are a couple of different ways to handle configuration in Docker. Below, you will find a non-exhaustive list.
Collectd is a Unix daemon that collects, transfers and stores performance data of computers and network equipment. The acquired data is meant to help system administrators maintain an overview over available resources to detect existing or looming bottlenecks.
Collectd provides a long list of plugins out of the box. However, if you need to collect additional metrics, one of the easiest ways to do so is to use the exec plugin.
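For a sense of what such a script looks like, here is a minimal Python sketch. The exec plugin runs the program and reads `PUTVAL` lines from its standard output, and it exports `COLLECTD_HOSTNAME` and `COLLECTD_INTERVAL` to the script. Everything else is an assumption made for illustration: the plugin name `exec-example`, the metric name `queue_depth`, and the spool directory being counted.

```python
import os
import time

# Hypothetical directory whose entries we want to count.
SPOOL_DIR = "/var/spool/myapp"


def format_putval(host, plugin, type_instance, interval, value):
    """Build one PUTVAL line using the generic 'gauge' type."""
    identifier = f"{host}/{plugin}/gauge-{type_instance}"
    return f'PUTVAL "{identifier}" interval={interval} N:{value}'


def read_metric():
    """Hypothetical metric: number of entries in the spool directory."""
    return len(os.listdir(SPOOL_DIR))


def main():
    # Collectd sets these variables; the defaults only help manual runs.
    host = os.environ.get("COLLECTD_HOSTNAME", "localhost")
    interval = float(os.environ.get("COLLECTD_INTERVAL", "10"))
    while True:
        line = format_putval(host, "exec-example", "queue_depth", interval, read_metric())
        # Flush so collectd sees the value immediately instead of a full buffer later.
        print(line, flush=True)
        time.sleep(interval)


# A real exec script would end by calling main().
```

With the values above, a call such as `format_putval("web01", "exec-example", "queue_depth", 10.0, 42)` produces `PUTVAL "web01/exec-example/gauge-queue_depth" interval=10.0 N:42`.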
In order to use the exec plugin, create a Collectd configuration file: