I came across Apache Spark while looking for tools for an ETL process. I am a big fan of Scala, and Spark with Scala was mentioned a few times in an ETL context, so I couldn’t help considering it as an option.
The first Cloud Developer Days conference ended two weeks ago. It took place on 28–29 May 2018 in Cracow, Poland. I haven’t come across a similar conference in my country before, i.e. one focused on cloud in general rather than on a specific technology.
The main focus was on Cloud Security, Machine Learning, Artificial Intelligence, Serverless and Blockchain.
Do you use CentOS or any other RPM-based Linux distribution to host Java applications or services?
In this article you will see how Gradle can be used for that purpose, in particular:
- which naming conventions can be used with RPM
- which tools can help with RPM generation with Gradle
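As a small illustration of the naming convention point, here is a minimal sketch of the common `name-version-release.arch.rpm` file name layout. The function name and the example package `my-service` are hypothetical and serve only as an illustration; this is not the Gradle setup from the article.

```python
def rpm_file_name(name: str, version: str, release: str, arch: str = "noarch") -> str:
    """Build a file name in the common name-version-release.arch.rpm layout."""
    # Hyphens separate the fields, so version and release must not contain one.
    for label, value in (("version", version), ("release", release)):
        if "-" in value:
            raise ValueError(f"{label} must not contain '-': {value!r}")
    return f"{name}-{version}-{release}.{arch}.rpm"


# Hypothetical package; "noarch" is a typical choice for a plain Java application.
print(rpm_file_name("my-service", "1.4.2", "1"))
```

The package name itself may contain hyphens, which is why only the version and release fields are checked.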
So far the following ideas have been introduced: topic, message, partition, producer, consumer and broker. By now, you should understand how Kafka stores messages on disk using a commit log, topics and partitions. You should also know how a message is structured.
It’s time to introduce consumer groups, which are the missing piece of message distribution in Kafka.
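To give a rough feel for the idea: within one consumer group, each partition of a topic is read by exactly one consumer, so the group as a whole shares the work. The sketch below uses a simplified round-robin assignment as a stand-in for Kafka's real assignors; the function and consumer names are hypothetical.

```python
def assign_partitions(partitions: list[int], consumers: list[str]) -> dict[str, list[int]]:
    """Spread partitions over the consumers of one group, round-robin."""
    if not consumers:
        raise ValueError("a consumer group needs at least one consumer")
    assignment: dict[str, list[int]] = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(sorted(partitions)):
        # Each partition goes to exactly one consumer of the group.
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# Four partitions shared by a hypothetical group of three consumers.
print(assign_partitions([0, 1, 2, 3], ["consumer-a", "consumer-b", "consumer-c"]))
```

With more consumers than partitions, some consumers in this sketch end up with an empty list, which mirrors the fact that extra consumers in a group stay idle.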
The Topic, the Message and the Partition
Traditional messaging patterns, message queue and publish-subscribe, have some limitations as a result of their design.
In the previous post, Apache Kafka Ideas – Part 1, a couple of messaging use cases were introduced. In order to define those cases with Kafka, it is important to understand its ideas. At the very heart of Kafka are topics and partitions. This post explains the basic concepts behind them.
What is Apache Kafka?
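As a toy model of these two concepts, a topic can be pictured as a fixed number of append-only logs (partitions), with the message key deciding which log a message lands in. This is a minimal in-memory sketch under assumptions: the `Topic` class is hypothetical, and `zlib.crc32` is only a stand-in for the hash Kafka's default partitioner actually uses.

```python
import zlib


class Topic:
    """In-memory toy model: a topic is a list of append-only partition logs."""

    def __init__(self, name: str, num_partitions: int) -> None:
        self.name = name
        self.partitions: list[list[tuple[bytes, bytes]]] = [
            [] for _ in range(num_partitions)
        ]

    def append(self, key: bytes, value: bytes) -> tuple[int, int]:
        """Append a message and return its (partition, offset)."""
        # Same key -> same hash -> same partition, so per-key order is kept.
        partition = zlib.crc32(key) % len(self.partitions)
        log = self.partitions[partition]
        log.append((key, value))
        return partition, len(log) - 1


orders = Topic("orders", num_partitions=3)
print(orders.append(b"customer-42", b"created"))
print(orders.append(b"customer-42", b"paid"))
```

Both messages for the same key land in the same partition with consecutive offsets, which is the property that lets ordering be preserved per key but not across the whole topic.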
Apache Kafka can be thought of as a message broker. It has the following characteristics:
- allows sending messages between two parties
- allows one-to-one (peer to peer, queue) or one-to-many (broadcast, topic) message delivery
- persists messages
What ideas are behind Kafka and how does it differ from a classical broker? In this series of posts you’ll find out how Apache Kafka works and be able to run and use a Kafka cluster.
In my previous post I outlined the idea of sending PostgreSQL metrics to Datadog using Java code. This post reveals the implementation details of the Send action described previously.
How does Datadog collect metrics?
There are two basic ways of collecting and sending data:
- Use the Datadog agent
- Collect and send manually
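For the manual route, the sketch below only builds the JSON body for a single gauge value and does not send anything. The payload shape is my understanding of Datadog's v1 metrics submission endpoint and should be checked against the official API reference; the metric name and tag are hypothetical, and the original post used Java rather than Python.

```python
import json
import time
from typing import Optional


def build_series_payload(metric: str, value: float,
                         tags: Optional[list[str]] = None,
                         timestamp: Optional[float] = None) -> dict:
    """Build a request body holding one gauge data point (assumed v1 shape)."""
    ts = int(timestamp if timestamp is not None else time.time())
    return {
        "series": [
            {
                "metric": metric,
                "points": [[ts, value]],  # [unix timestamp, value] pairs
                "type": "gauge",
                "tags": list(tags or []),
            }
        ]
    }


# Hypothetical metric name and tag, for illustration only.
payload = build_series_payload("postgres.autovacuum.count", 3, ["db:orders"])
print(json.dumps(payload, indent=2))
```

Sending the body would additionally require an HTTP POST with an API key, which is left out here on purpose.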
I currently work on a project where AWS RDS PostgreSQL is used as the data store. Since Postgres uses a mechanism called Multiversion Concurrency Control (MVCC), an UPDATE or DELETE command does not remove old versions of a row immediately. These are left on disk, waiting to be collected and cleaned by a vacuum process. Vacuum can be automated, and autovacuum serves that purpose.
Once autovacuum is configured, how do I know it works as expected? Is it triggered when I expect it to be?
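One way to reason about these questions is to compare the dead tuple count of a table with the autovacuum trigger threshold, which PostgreSQL documents as `autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * number of tuples`. The sketch below assumes the stock defaults (50 and 0.2) and approximates the tuple count with `n_live_tup`; an RDS instance may be configured differently, and the function name is hypothetical.

```python
# Query against the standard statistics view; run it with any PostgreSQL client.
AUTOVACUUM_STATS_SQL = """
SELECT relname, n_live_tup, n_dead_tup, last_autovacuum, autovacuum_count
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;
"""


def autovacuum_due(live_tuples: int, dead_tuples: int,
                   base_threshold: int = 50, scale_factor: float = 0.2) -> bool:
    """Return True when dead tuples exceed the (assumed default) trigger threshold."""
    # Assumption: n_live_tup stands in for the planner's row estimate.
    threshold = base_threshold + scale_factor * live_tuples
    return dead_tuples > threshold


# A table with 1000 live rows crosses the default threshold at about 250 dead rows.
print(autovacuum_due(live_tuples=1000, dead_tuples=300))
print(autovacuum_due(live_tuples=1000, dead_tuples=100))
```

If a table stays above the threshold for a long time while `last_autovacuum` does not move, that is a hint that autovacuum is not keeping up or is not being triggered for that table.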