A technology consultancy assignment for a public sector client focuses on stabilising and operationalising an existing Apache Kafka platform. The work combines IT architecture and platform engineering to improve cluster reliability and observability, with hands-on configuration of Kafka topics, connectors and Grafana dashboards.
The mission
This three-month assignment sits inside a platform team responsible for the event streaming that supports microservices and change-data-capture pipelines. The environment is built on Apache Kafka components, including brokers, Schema Registry and Kafka Connect, and needs improvements to monitoring, security and operational runbooks. The deliverables are an operational governance pack, upgraded monitoring dashboards and a set of hardened procedures for troubleshooting and high availability.
Day to day, you will assess the current architecture, implement configuration changes, and introduce monitoring and alerting in Grafana. You will tune topic partitioning and retention, review Kafka Connect topologies, apply best practices for TLS, SASL and ACLs, and hand over documented runbooks to the technical team. The role requires close collaboration with platform engineers and operations to validate changes in staging and production.
Your responsibilities
- Analyse the existing Kafka deployment and produce a prioritised remediation and optimisation plan that improves availability and maintainability
- Design and implement topic, partition and retention strategies that reduce latency and storage costs while meeting SLAs
- Configure and harden Kafka Connect and Kafka Streams setups, applying best practices for connector topologies and error handling
- Implement and extend Grafana dashboards and alerts to surface broker, consumer and connector health metrics
- Secure cluster communication and access using TLS, SASL and ACLs, and document the security configuration for operations
- Produce troubleshooting runbooks and deliver knowledge transfer sessions for the platform and operations teams
Your profile
Essential skills
- Senior, hands-on experience (typically 5+ years) with Apache Kafka in production environments
- Proven capability with Kafka components: brokers (KRaft or ZooKeeper), Schema Registry, Kafka Connect and Kafka Streams
- Practical experience with monitoring tools, specifically Grafana, and with defining alert thresholds and dashboards
- Familiarity with containerised Kafka deployments such as Strimzi or Confluent and configuration management with Ansible
- Strong operational mindset: able to write runbooks, run post-mortems and communicate technical governance to teams
Preferred skills
- Familiarity with streaming data stack components such as Iceberg, Spark or Flink