Site Reliability Engineer
09/04/24
Remote, European Union
Overview
Telna provides mobile networks, CSPs, and OEMs with a managed global network infrastructure for cellular connectivity. Telna has the largest LTE and LTE-M footprint in the world. Its multi-network platform enables simplified billing and localization, utilizing 6+ telco PoPs globally. Telna’s Cronus connectivity platform allows instant access to its virtualized cellular infrastructure via API or front-end portal. The engineering team is looking for an experienced SRE / SecOps Engineer. The candidate must be able to work independently, understand requirements and build solutions for a sophisticated architecture, and be comfortable working under pressure at times.
Key Responsibilities
- Develop and Maintain Scalable Infrastructure: Design, implement, and manage scalable, highly available, and fault-tolerant systems on-premises or in the cloud.
- Automation: Automate routine operational tasks to improve efficiency and reduce human error. This includes infrastructure provisioning, configuration management, and deployment processes.
- Performance Monitoring: Implement and manage monitoring solutions to track the health, performance, and availability of services. Analyze metrics and logs to preempt potential issues.
- Incident Response: Lead incident response efforts, troubleshoot and resolve system outages or performance issues, and conduct post-mortem analysis to prevent recurrence.
- Capacity Planning and Management: Predict future system demands, plan for capacity increases, and manage resources to ensure optimal performance and cost-efficiency.
- Disaster Recovery and Backup: Develop and maintain disaster recovery plans, including regular testing, to ensure data integrity and availability.
- Security Monitoring and Incident Handling: Monitor security systems for anomalies and indicators of compromise. Lead the response to security incidents, including containment, eradication, recovery, and post-incident analysis.
- Vulnerability Management: Regularly assess systems and applications for vulnerabilities, prioritize remediation efforts based on risk, and apply necessary patches or mitigations.
- Secure Software Development Lifecycle (SDLC): Integrate security practices into the SDLC, including code reviews, security testing, and ensuring secure deployment practices.
- Compliance and Auditing: Ensure systems comply with industry standards and regulatory requirements. Conduct regular audits to identify and address compliance gaps.
- Security Awareness and Training: Promote security awareness within the organization and provide training to employees on security best practices, threat awareness, and incident response protocols.
- Network Security: Implement and manage network security measures such as firewalls, intrusion detection/prevention systems (IDS/IPS), and network segmentation to protect against unauthorized access.
Qualifications
- Programming/Scripting Languages: Proficiency in languages like Python, Go, or Shell scripting to automate tasks and develop internal tools.
- Infrastructure as Code (IaC): Experience with tools like Terraform, Ansible, Chef, or Puppet for automating the provisioning and management of infrastructure.
- Cloud Platforms: Strong knowledge of cloud service providers such as AWS, GCP, or Azure, including their managed services related to computing, storage, networking, and security.
- Containerization and Orchestration: Expertise in using Docker, Kubernetes, or other container management solutions to deploy, manage, and scale applications.
- Monitoring and Logging: Experience with tools like Prometheus, Grafana, ELK stack (Elasticsearch, Logstash, Kibana), or Splunk for monitoring, logging, and analyzing system performance and security events.
- Networking and Security: Understanding of network protocols (TCP/IP, HTTP/HTTPS), DNS, load balancing, firewall rules, VPNs, and encryption technologies (SSL/TLS). Familiarity with security protocols, compliance standards and regulations (e.g., PCI DSS, GDPR), and vulnerability assessment tools.
- Incident Management: Experience in incident response and management processes, including post-mortem analysis and preventive measures.
- Security Practices: Knowledge of secure coding practices, security testing, and implementing security controls and policies to protect systems and data.
- Automation and CI/CD: Familiarity with continuous integration and continuous delivery (CI/CD) pipelines to ensure secure and reliable code deployments.
- Compliance and Risk Management: Understanding of regulatory requirements and risk management frameworks to ensure compliance and protect against vulnerabilities.
Location
Remote from European Union
Language
English - Native or Bilingual Proficiency
Job Type
Full-Time, Mid/Senior Level
Job Category
Engineering
Grow Personally and Professionally
Working for us means being part of creating products that shape the future of digital industries.