Many companies use Apache Kafka today, and it is sometimes claimed to be the most popular tool of its kind in the world. Kafka's applications range from simple message passing, through inter-service communication in microservice architectures, to full stream processing applications. Let's look at some well-known companies that use Kafka and what they use it for.
Activision
Are you familiar with the computer game series Call of Duty? Activision is the company behind it. In one of their presentations, they describe the challenges they faced with Kafka and how they overcame them.
Activision's Kafka cluster contains over 1,000 topics and handles between 10k and 100k messages per second. The messages carry various kinds of information, such as shooting events and death locations. Topic naming was a challenge, but they concluded that a topic name should describe the type of data it carries, not who produces or consumes it. Activision uses multiple data formats and has its own schema registry, written in Python and based on Cassandra. Messages are wrapped in envelopes built with protobuf.
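The naming rule and the envelope idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the `domain.data_type` naming scheme, the `Envelope` fields, and the `schema_id` value are hypothetical, and JSON stands in for protobuf so that the example needs nothing beyond the standard library.

```python
import json
import re
from dataclasses import dataclass, asdict

# Hypothetical rule: lowercase letters, digits and underscores only.
_VALID_PART = re.compile(r"^[a-z0-9_]+$")


def topic_name(domain: str, data_type: str) -> str:
    """Build a topic name from what the data is, not from who produces or reads it."""
    for part in (domain, data_type):
        if not _VALID_PART.match(part):
            raise ValueError(f"invalid topic name part: {part!r}")
    return f"{domain}.{data_type}"


@dataclass
class Envelope:
    """Wrapper carrying metadata next to the payload (JSON stands in for protobuf)."""
    schema_id: int   # hypothetical reference into a schema registry
    data_type: str   # the kind of event, reused for the topic name
    payload: dict

    def to_bytes(self) -> bytes:
        return json.dumps(asdict(self), sort_keys=True).encode("utf-8")


shot = Envelope(schema_id=1, data_type="player_shot", payload={"x": 10, "y": 4})
topic = topic_name("gameplay", shot.data_type)
print(topic, shot.to_bytes())
```

Because the topic name is derived from the data type alone, adding a new producer or consumer never forces a rename.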
Tinder
The dating application Tinder relies on Kafka for a variety of purposes. Kafka Streams is used by various processes, including:
Scheduling notifications for onboarding users (for example, to upload a profile picture), analytics, content moderation, recommendations, user activation, the user timezone update process, notifications, and others.
Tinder sends approximately 40TB of data per day, spread across roughly 86 billion events. Compared to AWS SQS/Kinesis, Kafka reduced their costs by over 90%.
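To put these figures in perspective, a quick back-of-the-envelope calculation helps. It assumes decimal terabytes (1 TB = 10^12 bytes) and traffic spread evenly over the day, which real traffic is not, so treat the results as rough averages: about 465 bytes per event and about one million events per second.

```python
# Figures quoted above; decimal terabytes are an assumption.
BYTES_PER_DAY = 40 * 10**12
EVENTS_PER_DAY = 86 * 10**9
SECONDS_PER_DAY = 24 * 60 * 60

avg_event_bytes = BYTES_PER_DAY / EVENTS_PER_DAY
events_per_second = EVENTS_PER_DAY / SECONDS_PER_DAY
mb_per_second = BYTES_PER_DAY / SECONDS_PER_DAY / 10**6

print(f"average event size: {avg_event_bytes:.0f} bytes")
print(f"average event rate: {events_per_second:,.0f} events/s")
print(f"average throughput: {mb_per_second:.0f} MB/s")
```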
Pinterest
More than 200M people visit Pinterest every month. There are more than 100B pins, and more than 2B ideas are searched each month. Kafka is used for multiple processes: every click, repin, and photo enlargement generates a Kafka message. Kafka Streams is used for content indexing, recommendations, and spam detection, but most importantly for real-time ad budget calculations.
Uber
Uber relies heavily on real-time processing. As of 2017, they handled trillions of messages per day across tens of thousands of topics, resulting in a data volume measured in petabytes. Kafka Streams is used for a wide range of processes, including such critical ones as customer and driver matching, ETA calculations, and auditing.
On the technical side, Uber uses a REST proxy based on the Confluent one, in which they improved performance and reliability. Kafka is mostly used with at-least-once delivery, so no data is lost, although duplicates are possible. Batching is used to achieve better throughput. Data is divided into regional Kafka clusters, which are later replicated using Uber's own tool, uReplicator.
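The combination of at-least-once delivery and batching can be illustrated with a small in-memory simulation. This is not Uber's code and does not use a Kafka client: `FlakyBroker` and `send_at_least_once` are hypothetical names, and the broker is a stand-in that always stores the batch but sometimes loses the acknowledgement, which forces the sender to retry.

```python
import random


class FlakyBroker:
    """In-memory stand-in for a broker whose acknowledgements sometimes get lost."""

    def __init__(self, ack_loss_rate: float, seed: int = 0):
        self.log = []
        self._rng = random.Random(seed)
        self._ack_loss_rate = ack_loss_rate

    def append_batch(self, batch) -> bool:
        self.log.extend(batch)  # the write itself always succeeds in this sketch
        return self._rng.random() >= self._ack_loss_rate  # False means the ack was lost


def send_at_least_once(broker, messages, batch_size=3, max_attempts=20):
    """Send messages in batches, resending each batch until it is acknowledged."""
    for start in range(0, len(messages), batch_size):
        batch = messages[start:start + batch_size]
        for _ in range(max_attempts):
            if broker.append_batch(batch):
                break
        else:
            raise RuntimeError("batch was never acknowledged")


broker = FlakyBroker(ack_loss_rate=0.5, seed=42)
messages = [f"trip-{i}" for i in range(10)]
send_at_least_once(broker, messages)

print(f"sent {len(messages)} messages, broker stored {len(broker.log)}")
```

Every message ends up in the log at least once, but a retried batch is stored again, so consumers in such a setup need to tolerate or deduplicate repeated messages.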
Netflix
Netflix uses Kafka clusters together with Apache Flink for stream processing. These systems handle a trillion messages per day. Interestingly, Netflix uses only two replicas per partition and also enables unclean leader election. This increases availability but can result in data loss, which is why Netflix developed Inca, a tool that detects lost data. It provides related metrics and confirms that the infrastructure delivers the required processing guarantees (for example, at-least-once).
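One common way to detect lost data is to number messages at the source and look for gaps on the receiving side. The sketch below shows that general idea only; it is not how Inca is implemented, and the `find_lost` function, the producer names, and the sequence-number scheme are all assumptions made for illustration.

```python
from collections import defaultdict


def find_lost(sent, received):
    """Return, per producer, the sequence numbers that were sent but never received.

    sent:     maps producer id -> highest sequence number it emitted (starting at 0)
    received: iterable of (producer id, sequence number) pairs seen downstream
    """
    seen = defaultdict(set)
    for producer_id, seq in received:
        seen[producer_id].add(seq)  # a set ignores at-least-once duplicates

    lost = {}
    for producer_id, last_seq in sent.items():
        missing = sorted(set(range(last_seq + 1)) - seen[producer_id])
        if missing:
            lost[producer_id] = missing
    return lost


sent = {"device-a": 4, "device-b": 2}
received = [
    ("device-a", 0), ("device-a", 1), ("device-a", 1), ("device-a", 4),
    ("device-b", 0), ("device-b", 1), ("device-b", 2),
]
lost = find_lost(sent, received)
print(lost)
```

Here `device-a` is missing sequence numbers 2 and 3, while the duplicate delivery of number 1 is not reported as a problem, which matches an at-least-once guarantee.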
LinkedIn
LinkedIn is where Apache Kafka originated. The system was originally designed to solve their problems with monitoring, tracing, and user activity tracking. LinkedIn now handles 7 trillion messages per day, spread across 100,000 topics, 7 million partitions, and 4,000 brokers. REST Proxy is used for non-Java clients, and Schema Registry is used for schema management. LinkedIn maintains its own patches and releases of Kafka, so it can use some features before they are accepted into the official packages.
Conclusion
Many companies use Kafka, often in conjunction with big data processing. Thanks to its almost linear scaling, it can handle billions or even trillions of messages per day. Kafka Streams and Schema Registry are quite often used together. Kafka is constantly changing the way people and businesses consume data, all for the better. To learn how ENIQUE can help transform your business, email us at info@eniquesolutions.com for a free consultation.