Amaris.AI is currently seeking a professional-level Linux System Administrator – Artificial Intelligence to join its infrastructure team.
The Infra and DevSecOps Engineer is responsible for medium-to-advanced system administration tasks, including Artificial Intelligence (AI) system and user support, and for the data-intensive computing resources operated by the Amaris.AI CTO Office and Data Science teams in support of advanced scientific, engineering, and AI research experiments for commercial industry customers. Support for rapid AI experimentation is crucial.
To be successful in this position, the applicant should be able to:
• Build, expand, and operate Amaris infrastructure, including large-scale systems in public and private clouds.
• Work in a fast-paced environment. Participate in technical operations and rotations in response to performance and reliability issues.
• Help improve the whole lifecycle of infrastructure services, from inception and design, through development, to deployment, user support, and refinement.
• Maintain services once they are live by measuring and monitoring availability, latency, and overall system health.
• Practice sustainable incident response and blameless postmortems.
• Aim to eliminate manual processes.
• Identify and resolve problems relating to critical service operations.
• Be involved in the design and subsequent implementation of software and service infrastructure.
The applicant should have:
• At least 3 years of relevant work experience.
• Bachelor’s degree in a relevant field (Computer Science, Big Data, AI, Statistics, Math) or equivalent experience.
• At least 3 years of experience working with Unix/Linux systems.
• At least 3 years of experience in one or more programming languages such as Java, C++, or Go, or scripting experience in Shell and Python.
• Experience with Docker and Kubernetes.
• Ability to debug, optimize code, and automate routine tasks.
• Systematic problem-solving approach, coupled with effective communication skills and a sense of drive.
The ideal candidate should have experience with, or be familiar with, some of the following:
• Self-driven, capable of coping with ambiguity and moving projects from concept to delivery.
• Strong analytical skills and the ability to solve real-world problems in a fast-moving environment.
• Experience in designing, analyzing, and building automation and tools for large-scale systems.
• Experience in building solutions on-premises as well as on AWS, Google Cloud, Azure, and other cloud services.
• Experience in networking technologies such as TCP/IP, BGP, and DNS in a carrier-grade environment.
• Experience in container technologies such as Docker, Kubernetes, and Rancher.
• Experience in tools such as Terraform, Ansible, Puppet, or Chef.
• Experience working on Data Science and Machine Learning projects.
• Familiarity with Big Data platforms.