Sr. AI/ML Ops Engineer

Overview

We are actively seeking a highly skilled and experienced Senior AI/ML Engineer with a focus on MLOps to join our innovative team. If you have 6 to 10 years of hands-on experience in the AI/ML space and a passion for driving technological advancements, this role is for you.

Key Technologies: Python, NumPy, Pandas, PyTorch, Docker, Kubernetes, Git, Jenkins, Azure DevOps, AWS SageMaker, Prometheus, Grafana

Specific Skills
  • Python Expertise: Proficiency in object-oriented Python.
  • Data Science (Jupyter Notebooks): Demonstrated expertise in data science, including analysis and modeling using Jupyter Notebooks.
  • Deep Learning (PyTorch): Proven experience in deep learning, particularly with PyTorch, and familiarity with other frameworks.
  • Good LLM Knowledge: Good understanding of Natural Language Processing (NLP) and Large Language Models (LLMs).
  • Any successful implementation of GenAI (LLMs) on custom data is preferred.
  • Bachelor's/Master's degree in Data Science is preferred.
Responsible For
  • MLOps Implementation (Docker, Kubernetes, Azure DevOps, AWS SageMaker): Lead the implementation of MLOps practices, ensuring seamless integration of machine learning models into production systems. Leverage containerization with Docker and orchestration with Kubernetes. Implement MLOps technologies from both Azure and AWS, such as Azure DevOps and AWS SageMaker.
  • Code Development (Python, NumPy, Pandas): Develop and maintain scalable and efficient Python code for machine learning applications. Utilize NumPy and Pandas for effective data manipulation and analysis.
  • Collaboration (Git): Collaborate with cross-functional teams to understand business requirements and seamlessly integrate machine learning solutions into software applications. Utilize Git for version control and collaborative coding.
  • DevOps Integration (Jenkins, GitLab): Work closely with DevOps teams to streamline deployment processes, ensuring reliability and scalability. Implement continuous integration and deployment (CI/CD) practices with tools like Jenkins or GitLab.
  • Observability (Prometheus, Grafana, Azure Monitor, AWS CloudWatch): Focus on fine-tuning models and identifying data anomalies. Implement observability tools like Prometheus and Grafana for monitoring and troubleshooting. Leverage Azure Monitor and AWS CloudWatch for cloud-specific observability.
  • Model Evaluation (TensorBoard): Implement model evaluation tools such as TensorBoard to ensure models are working as expected and meet performance criteria.
  • Documentation (Confluence, Markdown): Create comprehensive documentation for code, models, and deployment processes using tools like Confluence and Markdown.
  • Training and Knowledge Sharing: Provide training and knowledge-sharing sessions to team members on best practices in MLOps and Python coding.
Job Nature
Full Time
Job Location
Remote, USA
Job Level
Sr. Position

How to Apply

 

Interested candidates can send their resumes to contact@blazop.com, mentioning the job title in the subject line.

Apply Online

Apply for this position

Attach your resume (PDF only, maximum size 2 MB).
