Open Position

Senior Site Reliability Engineer

UK

Blockchain or Distributed Ledger Technology (DLT) is rapidly changing the way we think about and conduct business. Key advantages such as decentralisation, transparency, and security offer tremendous potential for optimising the interactions between market players in different industries, including digital advertising. 

 

Fiducia, a UK limited company with a US subsidiary, has been developing a DLT platform to address the lack of trust and transparency in digital advertising. The platform harmonises, matches and records data in an immutable ledger across the supply chain of advertisers in near real time, delivering a unified record for each single ad as the “Share Truth”.

 

The Fiducia platform is deployed in the context of TAG TrustNet, a global industry consortium network launched jointly with the Trustworthy Accountability Group (TAG), the leading global initiative fighting criminal activity and increasing trust in the digital advertising industry. TAG’s mission is to eliminate fraudulent traffic, facilitate the sharing of threat intelligence, and promote brand safety by connecting industry leaders, analyzing threats, and sharing best practices worldwide. The 700+ member TAG community includes the world's largest and most influential brands, agencies, publishers, and ad tech providers. TAG was founded in the US by the ANA, the 4A’s and IAB.

 

Why work for Fiducia?

  • Competitive salary and stock options

  • Experienced and supportive team members 

  • Fast-track career development with a forward-thinking company

  • Development of advanced high-impact technology  

 

Role Overview

We are looking for an experienced and talented individual, reporting to the CTO, to join our technology team as Senior Site Reliability Engineer. Your responsibility is to ensure the reliability, scalability and security of our cloud environment and platform software components. Our platform is used across the digital advertising supply chain to harmonise, match and record billions of ad impression data points across multiple data feeds. We use R3 Corda distributed ledger software, AWS cloud services and Java as our primary programming language.

 

The Senior Site Reliability Engineer needs to combine technical leadership with hands-on operational expertise in managing large-scale distributed systems in compliance with data security and service availability requirements. To qualify for the role, you need to be a team player with a solid background in the fundamentals of computer science, distributed computing, security and high-availability cloud systems.

 

Your proven ability to define and implement technical concepts effectively and to solve complex problems as a team contributor will be a critical part of the consideration process. 

 

Responsibilities

  • Ensure compliance with availability, security and performance requirements of our AWS infrastructure and Linux environments in line with company goals.

  • Configure and manage Fiducia’s AWS infrastructure to ensure uptime and security, while controlling costs. 

  • Identify platform weaknesses and anomalies; review configurations, software and hardware choices, and architecture trade-offs; and propose recommendations and action plans.

  • Perform capacity planning for Fiducia infrastructure and application components. 

  • Organise an incident management plan for the production environment and provide operational support. Build tooling for timely identification and escalation of incidents. Build automation to remediate service failures in short timeframes.

  • Create strategies for permanent fixes to production incidents. 

  • Maintain high standards of quality and performance, including mentorship, documentation, performance and reliability testing, fault-tolerance standards, security and stress-testing. 

  • Advance our technology stack with innovative ideas and creative solutions.

  • Draft extensive platform guides for operations and project teams to streamline operational processes and ensure performance and business continuity.

  • Solve problems in a timely manner.
     

Qualifications

  • Minimum 3 years of hands-on experience managing large-scale distributed system architectures and cloud environments.

  • Thorough understanding of the AWS stack (ECS, Config, Security Hub, CloudTrail, CloudFormation, S3), Linux, networking (TCP/IP stack, load balancing), SQL databases (Amazon Aurora), containerisation (Docker) and scripting (Shell).

  • Understanding and working knowledge of Unix operating systems, networking, reliability and scaling techniques.

  • Proven experience in measuring, monitoring and fine-tuning performance in cloud environments.

  • Technical leadership experience in defining goals, visions, solutions and action plans, and managing their implementation within the defined timeframe.

  • Strong analytical, problem-solving and decision-making skills.

  • A degree in Computer Science (preferred) or a related engineering field. MS/PhD is preferred.

  • Must be hard-working, team-oriented, creative, friendly and cooperative, and an extraordinary problem solver.

  • Great written and verbal communication skills. Ability to create easy-to-understand, high-quality project documentation.