Machine Learning Ops Engineer - US Remote

Hugging Face

Software Engineering, Operations
United States
Posted on Friday, September 15, 2023

Here at Hugging Face, we’re on a journey to advance good Machine Learning and make it more accessible. Along the way, we contribute to the development of technology for the better.

We have built the fastest-growing open-source library of pre-trained models in the world. It has more than 500K models and 250K stars on GitHub, and over 15,000 companies use HF technology in production, including leading AI organizations such as Google, Elastic, Salesforce, Algolia, and Grammarly.

About the role:

As a Machine Learning Ops Engineer, you will work mainly to empower our users with a rich feature set, high availability, and stellar performance as they pursue their journey. We are searching for someone who brings fresh ideas, demonstrates a unique and informed viewpoint, and enjoys collaborating in a progressive, nimble, and decentralized way to develop real-world solutions and positive user experiences at every interaction.

Objectives of this role:

  • Build software and systems to manage platform infrastructure and applications
  • Improve reliability, quality, and time-to-market of our suite of software solutions
  • Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve
  • Provide primary operational support and engineering for multiple large distributed software applications
  • Run the production environment by monitoring availability and taking a holistic view of system health

Daily and monthly responsibilities:

  • Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding
  • Participate in system design consulting, platform management, and capacity planning
  • Create sustainable systems and services through automation and uplift
  • Balance feature development speed and reliability with well-defined service level objectives

Stack we use:

  • Terraform
  • Kubernetes
  • Cloud: AWS & GCP
  • Python and Rust
  • Frameworks (transformers, of course, plus Keras or PyTorch) and libraries (like scikit-learn)

About you:

You’ll enjoy working here if you love to talk tech, you know the pros and cons of multiple languages and frameworks, and GitHub is among your favorite bookmarks. You care about users’ experience and understand that diversity is great but inclusion is key. You like to build things (almost) from scratch and you thrive in a fast-growing international environment. Hugging Face is an English-first company. You also like to build great products and ship them to production, while ensuring everything works well and that we support our community and customers to the best of our ability.

More about Hugging Face:

We are actively working to build a culture that values diversity, equity, and inclusivity. We are intentionally building a workplace where people feel respected and supported, regardless of who they are or where they come from. We believe this is foundational to building a great company and community. Hugging Face is an equal opportunity employer and we do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

We value development. You will work with some of the smartest people in our industry. We are an organization that has a bias toward impact and is always challenging ourselves to continuously grow. We provide all employees with reimbursement for relevant conferences, training, and education.

We care about your well-being. We offer flexible working hours and remote options. We offer health, dental, and vision benefits for employees and their dependents. We also offer 12 weeks of parental leave (20 for birthing mothers) and unlimited paid time off.

We support our employees wherever they are. While we have office spaces in NYC and Paris, we're very distributed and all remote employees have the opportunity to visit our offices. If needed, we'll also outfit your workstation to ensure you succeed.

We want our teammates to be shareholders. All employees have company equity as part of their compensation package. If we succeed in becoming a category-defining platform in machine learning and artificial intelligence, everyone enjoys the upside.

We support the community. We believe major scientific advancements are the result of collaboration across the field. Join a community supporting the ML/AI community.