Data recruiting guide

Advanced MLflow technical test: Projects, Recipes, Model Registry, serving

MLflow is much more than a simple tracking tool. In a Senior ML Engineer interview, candidates are assessed on their command of MLflow Projects for reproducibility, the Model Registry for governance, and model serving.

Data Builder · June 2025 · 6 min read · Data Scientist · ML Engineer

Contents

Advanced tracking
MLflow Projects
Advanced Model Registry
MLflow Serving
CI/CD with MLflow
MLflow on Databricks
Level grid

1. Advanced tracking: log everything

Discriminating question

What do you systematically log in MLflow beyond basic metrics?

import mlflow
import mlflow.sklearn
from mlflow.models import infer_signature

# Assumes `model` (a fitted scikit-learn-compatible estimator) and `X_train` already exist.
with mlflow.start_run(run_name='xgb_v3_features_engineered') as run:
    # Parameters
    mlflow.log_params({
        'n_estimators': 500,
        'max_depth': 6,
        'learning_rate': 0.05,
        'feature_set': 'v3_with_lags'  # which feature version
    })

    # Metrics
    mlflow.log_metrics({
        'auc_test': 0.87,
        'auc_train': 0.93,
        'precision': 0.82,
        'recall': 0.79,
        'f1': 0.80
    })

    # Artifacts: everything needed to reproduce (the files must already exist on disk)
    mlflow.log_artifact('feature_importance.png')
    mlflow.log_artifact('confusion_matrix.png')
    mlflow.log_artifact('data_stats.json')   # training data stats

    # Signature: expected inputs/outputs.
    # The sklearn flavor serves predict() by default, so the output schema is inferred from predict().
    signature = infer_signature(X_train, model.predict(X_train))
    mlflow.sklearn.log_model(model, 'model', signature=signature)
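
The `data_stats.json` artifact logged above has to be produced somewhere. A minimal sketch of one way to build it, assuming the training rows are available as a list of dicts; the helper names `summarize_columns` and `write_stats`, the sample rows, and the chosen statistics are illustrative and not part of MLflow.

```python
import json
from pathlib import Path
from statistics import mean


def summarize_columns(rows):
    """Return count, mean, min and max for each numeric column (hypothetical helper)."""
    stats = {}
    columns = list(rows[0].keys()) if rows else []
    for col in columns:
        # Keep only numeric values; non-numeric columns are skipped entirely.
        values = [r.get(col) for r in rows if isinstance(r.get(col), (int, float))]
        if values:
            stats[col] = {
                "count": len(values),
                "mean": mean(values),
                "min": min(values),
                "max": max(values),
            }
    return stats


def write_stats(rows, path="data_stats.json"):
    """Write the summary to disk so it can be passed to mlflow.log_artifact(path)."""
    Path(path).write_text(json.dumps(summarize_columns(rows), indent=2))
    return path


# Made-up sample rows, only to show the shape of the output.
sample_rows = [
    {"tenure": 12, "monthly_fee": 30.0, "plan": "basic"},
    {"tenure": 24, "monthly_fee": 50.0, "plan": "premium"},
]
stats = summarize_columns(sample_rows)
```

Inside a run, `write_stats(rows)` followed by `mlflow.log_artifact('data_stats.json')` would attach the file to the run, which makes it possible to spot a change in the training data between two runs.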

2. MLflow Projects: reproducibility

Discriminating question

How do you ensure your training code is reproducible with MLflow Projects?

# MLproject file: defines how to run the project
name: churn_prediction

conda_env: conda.yaml  # or python_env: python_env.yaml

# Documented parameter types are string, float, path and uri
entry_points:
  train:
    parameters:
      n_estimators: {type: float, default: 100}
      max_depth: {type: float, default: 5}
      data_path: {type: path}
    command: 'python train.py --n-estimators {n_estimators} --max-depth {max_depth} --data {data_path}'

  evaluate:
    parameters:
      model_uri: {type: string}
    command: 'python evaluate.py --model {model_uri}'

# Run the train entry point from anywhere (the default entry point is main, so -e is required;
# data_path has no default and must be passed; the path shown is only an example)
mlflow run . -e train -P n_estimators=500 -P max_depth=6 -P data_path=data/train.csv

# Or directly from Git
mlflow run https://github.com/org/churn-model -e train -P n_estimators=500 -P data_path=data/train.csv
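
The `train` entry point only works if `train.py` accepts the flags named in its `command` line. A minimal sketch of that argument parsing, assuming the script uses `argparse`; the function names and the returned dictionary are illustrative, and the training step itself is omitted.

```python
import argparse


def parse_args(argv=None):
    """Parse the flags that the `train` entry point passes on the command line."""
    parser = argparse.ArgumentParser(description="Train the churn model")
    parser.add_argument("--n-estimators", type=int, default=100)
    parser.add_argument("--max-depth", type=int, default=5)
    parser.add_argument("--data", required=True, help="Path to the training data")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    # Training is omitted here. The point is that every knob arrives as a CLI
    # argument, so nothing depends on values hard-coded in the script.
    return {
        "n_estimators": args.n_estimators,
        "max_depth": args.max_depth,
        "data": args.data,
    }


# Same values as the `mlflow run` example; the data path is a placeholder.
config = main(["--n-estimators", "500", "--max-depth", "6", "--data", "data/train.csv"])
```

Keeping the defaults in the MLproject file and in the script aligned avoids a run behaving differently depending on whether it was launched through `mlflow run` or by calling the script directly.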

3. Model Registry: complete workflow

Discriminating question

Describe the complete workflow for promoting a model from development to production.

import mlflow
from mlflow.tracking import MlflowClient

client = MlflowClient()

# Note: stage transitions are deprecated since MLflow 2.9 in favor of model version aliases.

# 1. Register the model from a run (`run_id` is the ID of the training run)
model_uri = f'runs:/{run_id}/model'
model_details = mlflow.register_model(model_uri, 'churn_predictor')

# 2. Add metadata
client.update_model_version(
    name='churn_predictor',
    version=model_details.version,
    description='XGBoost v3, AUC=0.87, trained on 2025-01 data'
)

# 3. Promote to Staging after validation
client.transition_model_version_stage(
    name='churn_predictor',
    version=model_details.version,
    stage='Staging',
    archive_existing_versions=False
)

# 4. Integration tests on Staging...

# 5. Promote to Production
client.transition_model_version_stage(
    name='churn_predictor',
    version=model_details.version,
    stage='Production',
    archive_existing_versions=True  # archives the previous Production version
)
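
Recent MLflow releases (2.9 onward) deprecate stage transitions in favor of model version aliases. A minimal sketch of the same promotion step with an alias, assuming an alias named `champion`; the helper `promote_with_alias` is illustrative, and `FakeClient` only stands in for `MlflowClient` so the snippet runs without a tracking server.

```python
def promote_with_alias(client, name, version, alias="champion"):
    """Point `alias` at the given version and return the URI consumers should load."""
    # MlflowClient.set_registered_model_alias(name, alias, version) moves the alias;
    # the version that previously held it simply loses the alias.
    client.set_registered_model_alias(name, alias, version)
    return f"models:/{name}@{alias}"


class FakeClient:
    """Stand-in for mlflow.tracking.MlflowClient that records alias assignments."""

    def __init__(self):
        self.aliases = {}

    def set_registered_model_alias(self, name, alias, version):
        self.aliases[(name, alias)] = str(version)


client = FakeClient()
uri = promote_with_alias(client, "churn_predictor", 4)  # version 4 is an example
```

With a real client, the returned URI can be passed to `mlflow.pyfunc.load_model`, and a rollback amounts to pointing the alias back at an earlier version.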

4. MLflow Serving: exposing a model

Discriminating question

How do you serve an MLflow model via a REST API?

## Serving via CLI
# Load from the registry
mlflow models serve \
  --model-uri models:/churn_predictor/Production \
  --port 5000 \
  --env-manager local  # replaces the older --no-conda flag

## Serving in Python (for custom handlers)
import pandas as pd
from mlflow.pyfunc import PythonModel

class ChurnPredictor(PythonModel):
    def load_context(self, context):
        import joblib
        # 'model_path' is the artifact key supplied when the model was logged
        self.model = joblib.load(context.artifacts['model_path'])
        self.threshold = 0.6  # business threshold

    def predict(self, context, model_input):
        # Probability of the positive class, then a thresholded decision
        probas = self.model.predict_proba(model_input)[:, 1]
        return pd.DataFrame({
            'probability': probas,
            'prediction': (probas > self.threshold).astype(int)
        })

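## Request to the served API
import requests

# Assumes `X_test` is a pandas DataFrame with the columns the model expects
response = requests.post(
    'http://localhost:5000/invocations',
    json={'dataframe_records': X_test.to_dict('records')}
)
response.raise_for_status()  # fail loudly on a non-2xx status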
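
Besides `dataframe_records`, the MLflow 2.x scoring server also accepts a `dataframe_split` body, which sends column names once instead of repeating them per row. A minimal sketch of building that payload and reading a reply, assuming MLflow 2.x response formatting; the helper names and sample values are illustrative.

```python
import json


def to_split_payload(records):
    """Convert a list of row dicts into a `dataframe_split` request body."""
    columns = list(records[0].keys())
    data = [[row[col] for col in columns] for row in records]
    return {"dataframe_split": {"columns": columns, "data": data}}


def extract_predictions(response_body):
    """Read predictions from a scoring reply (MLflow 2.x wraps them in 'predictions')."""
    return json.loads(response_body)["predictions"]


# Made-up rows, only to show the payload shape.
records = [
    {"tenure": 12, "monthly_fee": 30.0},
    {"tenure": 24, "monthly_fee": 50.0},
]
payload = to_split_payload(records)

# Example body shaped like a scoring-server reply; the values are made up.
preds = extract_predictions('{"predictions": [0, 1]}')
```

The payload would be sent with `requests.post(url, json=payload)`; for wide tables the split orientation keeps the request body noticeably smaller than the records orientation.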

5. CI/CD with MLflow: automating promotions

Discriminating question

How do you integrate MLflow into a CI/CD pipeline for ML?

CI on PR: run training on a data sample and verify that the metrics are above the baseline
Automatic comparison: compare the new model with the model currently in Production, and promote only if it is better
GitHub Actions: a workflow that triggers training, registers the model in MLflow, and promotes it if the metrics pass
Model approval gate: some critical models require human validation before promotion to Production
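
The baseline check and the comparison with Production can be sketched as a single decision function that a CI job calls before promoting. This is a minimal sketch under stated assumptions: the function name `should_promote`, the metric name `auc_test`, and the thresholds are illustrative, and the metric dictionaries are passed in directly rather than fetched from a tracking server.

```python
def should_promote(candidate_metrics, production_metrics, metric="auc_test",
                   baseline=0.0, min_gain=0.0):
    """Return (decision, reason) for a candidate model.

    The candidate must clear an absolute baseline and beat the current
    Production model by at least `min_gain` on the chosen metric.
    """
    candidate = candidate_metrics.get(metric)
    if candidate is None:
        return False, f"candidate is missing metric '{metric}'"
    if candidate < baseline:
        return False, f"{metric}={candidate:.3f} is below baseline {baseline:.3f}"
    current = production_metrics.get(metric) if production_metrics else None
    if current is None:
        return True, "no Production model to compare against"
    if candidate - current < min_gain:
        return False, f"{metric} gain {candidate - current:+.3f} is under {min_gain:.3f}"
    return True, f"{metric} improved from {current:.3f} to {candidate:.3f}"


# Example values only: candidate at 0.87 against a Production model at 0.85.
decision, reason = should_promote(
    {"auc_test": 0.87}, {"auc_test": 0.85}, baseline=0.80, min_gain=0.01
)
```

In a real pipeline the two dictionaries would typically come from the `data.metrics` of the candidate and Production runs, and a negative decision would exit non-zero so the workflow stops before promotion. For models behind an approval gate, the function can still run and its reason can be attached to the review request.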

6. MLflow on Databricks: differences

Discriminating question

How does Managed MLflow on Databricks differ from open source MLflow?

Centralized tracking server: no configuration needed, integrated into the Databricks workspace
Unity Catalog integration: in Databricks 13+, the Model Registry lives in Unity Catalog, which adds lineage and RBAC access control
Auto logging: mlflow.autolog() automatically captures sklearn, XGBoost, PyTorch and LightGBM training
Feature Store: native integration between Databricks Feature Store features and MLflow models

import mlflow
from mlflow.tracking import MlflowClient
from sklearn.metrics import roc_auc_score
from xgboost import XGBClassifier

# Assumes X_train, y_train, X_test, y_test exist and precision_at_k is a user-defined helper.

# Autolog: MLflow captures params, metrics, model and feature importance for supported libraries
mlflow.autolog()

with mlflow.start_run(run_name="xgboost_churn_v3") as run:
    model = XGBClassifier(n_estimators=300, max_depth=6, learning_rate=0.03)
    model.fit(X_train, y_train)

    # Custom metrics
    mlflow.log_metrics({
        "auc_roc": roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]),
        "precision_at_k": precision_at_k(y_test, model, k=100)
    })
    mlflow.log_artifact("shap_summary.png")
    mlflow.set_tags({"team": "risk", "dataset_version": "2025-Q1"})

# Select and promote the best run
client = MlflowClient()
best = client.search_runs(
    experiment_ids=["1"],
    order_by=["metrics.auc_roc DESC"],
    max_results=1
)[0]

mv = mlflow.register_model(f"runs:/{best.info.run_id}/model", "churn_model")
client.transition_model_version_stage("churn_model", mv.version, "Production")

# Serve the model
# mlflow models serve -m models:/churn_model/Production -p 5000

MLflow vs W&B: MLflow is open source, integrated with Databricks and self-hostable. Weights & Biases offers a better deep learning UX, collaboration and hyperparameter sweeps, and is a paid product
Model Registry: version models with stage transitions (None → Staging → Production → Archived). Rollback takes one line if a regression appears in production
MLflow Projects: package the training code (MLproject file + conda.yaml). Full reproducibility: mlflow run . -P learning_rate=0.01
MLflow Serving: a simple REST endpoint for scoring. For high-availability production, deploy the MLflow model in Docker, Kubernetes or Databricks Model Serving
Experiment tracking best practices: one run = one training. Always log hyperparameters, train/val/test metrics, the dataset version and artifacts (SHAP plots, confusion matrix)

7. Level grid

Level	Mastery	GO signal	NO-GO
Mid-level	Complete tracking, Model Registry, basic serving	Logs parameters + metrics + artifacts, has promoted a model via the Registry	Only logs metrics, without parameters or artifacts
Senior	MLflow Projects, production serving, ML CI/CD, custom PythonModel	Has written an MLproject file, serves a custom model in production, runs an ML CI/CD pipeline	Does not know what MLflow Projects is
