Senior Machine Learning Engineer / Technical Lead
PhazeRo · Muscat
Job description
About the role
We are seeking a hands‑on Senior Machine Learning Engineer to lead the design, deployment, and operation of production‑grade AI systems spanning large language models, OCR, and voice technologies. The role blends deep technical ownership with leadership of a high‑performing team of junior engineers and direct collaboration with stakeholders.
Key responsibilities
- Mentor and guide junior ML engineers through code reviews, design reviews, and technical direction.
- Enforce software engineering and ML best practices across the team.
- Design, deploy, and operate scalable AI systems with a focus on reliability, performance, and cost efficiency.
- Lead production deployment of LLMs and multimodal solutions (RAG, OCR, voice).
- Own end‑to‑end model performance, including evaluation pipelines, observability, and GPU optimization.
- Architect and manage GPU infrastructure, including model serving, load balancing, and scaling strategies.
- Build and maintain robust MLOps pipelines for version management, CI/CD, automated testing, and rollback.
- Engage with clients to gather requirements, translate business needs into technical solutions, and communicate progress.
Required profile
- Proven experience deploying large language models in production.
- Strong background in GPU‑based inference and performance tuning.
- Solid backend engineering skills, especially Python, APIs, and distributed systems.
- Hands‑on experience with MLOps, Docker, and Kubernetes.
- Familiarity with OCR/document AI and/or speech‑to‑text / text‑to‑speech systems.
- Experience leading or mentoring engineering teams.
- Excellent communication skills to bridge business and technical domains.
Required skills
- Python
- APIs
- Distributed systems
- GPU inference optimization
- MLOps
- Docker
- Kubernetes
- OCR / document AI
- Speech‑to‑text (STT) and text‑to‑speech (TTS)
- Large language models (LLMs)
- Retrieval‑augmented generation (RAG)
- Vector databases
- Agent workflow architectures
- Model serving and load balancing
- CI/CD pipelines
- Automated testing and rollback mechanisms
- Benchmarking and regression testing
- Observability (tracing, latency, error tracking)
- Multi‑GPU serving, batching, quantization, memory tuning
Published 2 hours ago
Expires 1 month from now