
DevOps / Site Reliability Engineer
Two95 International Inc. • United States
Posted: January 13, 2026
Job Description
Job Title: Lead SRE (Site Reliability Engineer)
Location: Remote Work
Type: 6+ month contract-to-hire
Rate: Open (hourly)
Please forward your updated resume to deivy.malli@two95intl.com and include your rate requirement, your contact details, and a suitable time when we can reach you.
Responsibilities
· Own uptime, SLAs, and overall reliability of the cloud infrastructure and kiosk platform.
· Lead incident response and root-cause analysis, and drive actionable postmortems.
· Automate infrastructure, deployments, and operational tasks using modern IaC and scripting in collaboration with the Platform Engineering team.
· Maintain and improve monitoring, alerting, and observability (Grafana, Prometheus, New Relic, etc.).
· Manage, operate and recommend improvement of mo
· Execute and continuously improve disaster recovery and business continuity plans.
· Partner with platform engineering, QA, and development teams to ensure operational readiness.
· Establish and maintain runbooks, operational standards, and reliability best practices.
· Provide leadership, mentorship, and clear communication during both normal operations and incidents.
· Optimize cloud and Kubernetes environments for reliability, performance, and scalability.
Qualifications
· 8+ years in SRE, DevOps, or Platform Engineering roles; 2+ years in a senior or lead capacity.
· Strong experience supporting production environments with strict SLAs and high uptime requirements.
· Deep knowledge of Kubernetes, containers, and cloud-native infrastructure.
· Proficiency in automation and scripting using Bash, Python, or Go.
· Hands-on experience with CI/CD pipelines and release engineering in modern environments.
· Expert-level familiarity with IaC tools (Terraform preferred).
· Strong understanding of monitoring, alerting, logging, and observability tooling.
· Experience implementing and managing GitOps workflows (ArgoCD or similar).
· Demonstrated ability to lead incidents and communicate effectively with technical and non-technical stakeholders.
· Solid understanding of disaster recovery planning, resilience practices, and system hardening.