bigshyft
nference
Staff Site Reliability Engineer
Series C
Start-up
201-500 employees
7y - 10y
₹30 - ₹50 LPA
Bengaluru/ Bangalore
Java, Python, Linux, AWS, CI/CD

Job Description

What you'll do:


  • System Design and Architecture: Design highly available, fault-tolerant systems that scale to meet customer and partner demands.
  • Automation and Tooling: Develop and maintain monitoring, deployment, and operational tools that improve efficiency and reduce manual intervention.
  • Incident Response and Root Cause Analysis: Lead post-incident reviews to identify causes and implement preventive measures.
  • Cross-functional Collaboration: Work with software engineering teams to advocate reliability best practices and influence architectural decisions.
  • Mentorship and Leadership: Mentor junior engineers, conduct technical interviews, and contribute to the SRE community.



What makes you a great fit:


  • 7+ years of experience in Site Reliability Engineering, or in a blend of software engineering and DevOps.
  • Strong Linux fundamentals, including system administration, scripting, performance tuning, and troubleshooting.
  • Proficiency in at least one programming language: Python, Java, or Golang.
  • Deep understanding of cloud platforms (AWS, GCP, Azure) and Kubernetes orchestration.
  • Experience building and managing Kubernetes clusters using Terraform. CRD and operator implementation experience preferred.
  • Familiarity with ArgoCD and Nexus Repository is advantageous.
  • Skilled in creating and utilizing Terraform modules and CI/CD pipelines.
  • Implementation experience with open-source observability and alerting tools like Prometheus, Grafana, Cortex, Thanos, Alertmanager, etc.
  • Networking knowledge (VPC, VNet, DNS) and an understanding of the TCP/IP stack, internet routing, and load balancing.
  • Excellent interpersonal, communication, and teamwork skills, with the ability to work across diverse groups including SREs, engineers, and product managers.
  • Project or team leadership experience, with a commitment to mentoring and developing junior engineers.

Join us in advancing the reliability and scalability of our platform. Apply now and contribute to our dynamic team environment focused on innovation and excellence.

All about us
nference

  • nference partners with leading biopharmaceutical companies to solve major challenges in a variety of areas such as drug discovery, clinical research, life cycle management, clinical operations, and BD/commercial strategy.
  • nference is making the world's biomedical knowledge computable to solve urgent healthcare problems.
nference is a provider of electronic medical records (EMR) datasets. The company offers EMR datasets that can be analyzed for various life sciences and biopharma needs, such as clinical trial design, target discovery, precision medicine, and early diagnosis.

Employee count: 201-500 employees
Employment Type: Full Time Job
Company Type: Start-up
Headquarters: Cambridge, Massachusetts, United States
