Job Description

Our client is an information technology company with corporate headquarters in Limassol, Cyprus. A world leader in the research and development of state-of-the-art methods for ingesting data from heterogeneous sources and adapting it to bespoke, intuitive solutions for civilian protection, they are currently looking to hire a skilled Big Data Architect to join their team.

The ideal candidate will focus on researching optimal solutions for huge sets of data from heterogeneous domains, then implementing, maintaining, and monitoring them.

Reference #: GC004372

Job Responsibilities

  • Design, create, deploy, and manage data pipelines within the organization
  • Define how data will be streamed, stored, consumed, and integrated by different data systems
  • Create data architecture documents, standards, and principles, and maintain knowledge of the data models
  • Organize data at the macro level (concepts) as well as the micro level by defining the data models, metadata, business entities, and relations
  • Collaborate and coordinate with multiple teams/departments to identify the data domains and the data gaps between current-state systems and future goals
  • Clearly and effectively communicate the data entities and their relationships within a business model
  • Analyze data flow and recommend appropriate technologies to support the data requirements across the organization
  • Design systems that can ingest structured and unstructured data, and visualize and prepare datasets for use by data scientists and data engineers
  • Develop key metrics for data testing and create data quality rules
  • Focus on performance, scalability, availability, and data governance

Requirements

  • Experience in one of the following: Java, Python, or Scala
  • Proficiency with Hadoop ecosystem services such as MapReduce v2, HDFS, YARN, Hive, HBase
  • Experience with building stream-processing systems using solutions such as Apache Kafka and Apache Spark Streaming
  • Experience with designing, implementing, and deploying in-cluster data pipelines using the Apache Spark framework (RDD, DataFrame, Streaming)
  • Experience with integrating data from multiple heterogeneous sources and various formats (CSV, XML, JSON, Avro, Parquet)
  • Experience with SQL databases and NoSQL databases, such as Elasticsearch and MongoDB
  • Solid understanding of microservices architecture and distributed systems
  • Experience with the Hadoop ecosystem, on-premises or in the cloud
  • Hands-on experience with Docker and Kubernetes (nice to have)

What You’ll Enjoy

  • Competitive salary based on skills and experience
  • Paid medical insurance
  • Yearly performance bonus
  • Work-from-home policy and flexible hours
  • Daily lunch voucher
  • Udemy unlimited membership
  • Sport/gym membership
  • Happy hours, events, and many more team-bonding activities
  • Exposure to new technologies and innovative products
  • Internal hackathons and technical seminars