Deployment & Observability¶
This guide explains how Make MLOps Easy materialises a deployment, how to shape the deployment behaviour through configuration, and how the built-in observability stack records what happens after a model goes live.
Deployment Pipeline (High-Level)¶
make-mlops-easy train and MLOpsPipeline.run(..., deploy=True) construct a ModelDeployer (see easy_mlops/deployment/deployer.py:18). The deployer pulls options from the deployment section of your configuration, builds a DeploymentContext with the trained estimator, the fitted preprocessor, run metadata, and the resolved output directory (easy_mlops/deployment/steps.py:15), then executes a sequence of deployment steps. Steps are sourced from ModelDeployer.STEP_REGISTRY and run in order, each enriching the context with new artifacts.
Step Catalogue¶
- Context bootstrap. DeploymentContext tracks the shared state: training results, absolute paths, generated metadata, and a central artifact dictionary. Every step can read or write to this context, making downstream steps deterministic.
- create_directory (CreateDeploymentDirectoryStep). Creates a timestamped directory underneath output_dir. The default prefix is deployment, but any prefix can be supplied. If multiple deployments land within the same second, the step appends an incrementing suffix to keep paths unique (easy_mlops/deployment/steps.py:62).
- save_model (SaveModelStep). Delegates to the trainer backend by calling context.model.save_model(<path>), ensuring the persisted artifact includes backend-specific metadata (easy_mlops/deployment/steps.py:83).
- save_preprocessor (SavePreprocessorStep). Serialises the fitted DataPreprocessor with joblib.dump, guaranteeing that later inference reproduces the same transformations (easy_mlops/deployment/steps.py:99).
- save_metadata (SaveMetadataStep). Calls ModelDeployer.create_model_metadata to write metadata.json describing the run: absolute artifact paths, training metrics, configured version tag, and the deployment timestamp (easy_mlops/deployment/deployer.py:43).
- endpoint_script (EndpointScriptStep, optional). When enabled, writes an executable helper that loads the persisted model and preprocessor to score new data. You can provide a custom script template or filename (easy_mlops/deployment/steps.py:177).
The pipeline short-circuits if a prerequisite is missing (for example, trying to save a model before the directory exists raises a RuntimeError), which keeps deployments predictable and debuggable.
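To see which step names are available for the explicit step lists described later, you can inspect the registry directly. This is a minimal sketch that assumes STEP_REGISTRY is a mapping from step name to step class:

from easy_mlops.deployment import ModelDeployer

# Assumption: STEP_REGISTRY maps registered step names to DeploymentStep subclasses.
print(sorted(ModelDeployer.STEP_REGISTRY))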
Interaction with the CLI and Pipeline¶
- make-mlops-easy train performs preprocessing, training, deployment, and log persistence in a single command. Use --config path/to/config.yaml to apply custom deployment settings.
- Pass --no-deploy to skip the deployment stage entirely; the training results are still returned, but no deployment directory is created.
- Call ModelDeployer.deploy(model, preprocessor, training_results, create_endpoint=...) directly when orchestrating bespoke workflows. Overriding create_endpoint at call time lets you toggle endpoint generation without editing configuration files (see the sketch below).
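A minimal sketch of the bespoke-workflow path, assuming model, preprocessor, and training_results were produced by an earlier training run:

from easy_mlops.deployment import ModelDeployer

# model, preprocessor, and training_results are assumed outputs of a prior MLOpsPipeline run.
deployer = ModelDeployer({"output_dir": "./models", "deployment_prefix": "churn_model"})
deployer.deploy(model, preprocessor, training_results, create_endpoint=False)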
Deployment Configuration Recipes¶
The deployer reads options from the deployment node of the YAML configuration. Keys that are omitted fall back to defaults declared in easy_mlops/config/config.py:20 and inside ModelDeployer._initialize_steps. The following scenarios cover the supported deployment modes.
Default timestamped deployments¶
With no custom configuration, deployments land in ./models/deployment_<timestamp>/ and contain model.joblib, preprocessor.joblib, metadata.json, and logs/ when observability persists metrics. This mode is ideal for quick experiments and CI pipelines.
deployment: {}
- output_dir defaults to ./models.
- deployment_prefix defaults to deployment.
- create_endpoint is disabled unless explicitly requested.
Custom output location and naming¶
Change where artifacts live and how directories are named by setting output_dir, deployment_prefix, and optionally overriding filenames.
deployment:
  output_dir: ./artifacts/churn
  deployment_prefix: churn_model
  metadata_filename: manifest.json
- The directory becomes ./artifacts/churn/churn_model_<timestamp>/.
- manifest.json replaces the default metadata filename (easy_mlops/deployment/deployer.py:162).
- All other steps continue to run with their defaults unless specified.
Bundling an endpoint helper script¶
Enable the built-in predict.py template or supply your own script. The script is created during deployment and marked executable.
deployment:
  create_endpoint: true
  endpoint_filename: predict_sales.py
  endpoint_template: |
    #!/usr/bin/env python3
    import json
    from pathlib import Path
    import joblib

    def predict(path):
        base = Path(__file__).parent
        model = joblib.load(base / "model.joblib")["model"]
        data = json.loads(Path(path).read_text())
        return model.predict(data)

    if __name__ == "__main__":
        import sys
        print(predict(sys.argv[1]))
- create_endpoint: true injects EndpointScriptStep during _initialize_steps (easy_mlops/deployment/deployer.py:168).
- endpoint_filename and endpoint_template override the defaults. If you omit the template, the bundled DEFAULT_ENDPOINT_TEMPLATE is used (easy_mlops/deployment/steps.py:150).
- You can still toggle script creation per run by calling ModelDeployer.deploy(..., create_endpoint=False) when needed.
Explicit step lists¶
Provide an ordered list of step specifications to gain full control over the deployment pipeline. Strings refer to registered step names, while dictionaries let you override parameters.
deployment:
  steps:
    - create_directory
    - save_model
    - save_preprocessor
    - type: save_metadata
      params:
        filename: detailed_metadata.json
    - type: endpoint_script
      params:
        enabled: true
        filename: predict.sh
- When steps is supplied, the default step sequence is ignored, so include every action you require.
- Any step not in ModelDeployer.STEP_REGISTRY raises a helpful ValueError (easy_mlops/deployment/deployer.py:182).
- Use this pattern to skip steps (for example, omit save_preprocessor if your backend embeds preprocessing) or to insert custom ones, as in the sketch below.
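The same step list can be passed in code when constructing the deployer. This sketch assumes the constructor accepts the deployment configuration as a plain dict (as in the loading example later in this guide); save_preprocessor is deliberately omitted here:

from easy_mlops.deployment import ModelDeployer

# Equivalent in-code configuration mirroring the YAML above; save_preprocessor is skipped on purpose.
deployer = ModelDeployer(
    {
        "output_dir": "./models",
        "steps": [
            "create_directory",
            "save_model",
            {"type": "save_metadata", "params": {"filename": "detailed_metadata.json"}},
        ],
    }
)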
Registering custom deployment steps¶
Extend the framework by subclassing DeploymentStep, registering it, then referencing it from the steps list.
from easy_mlops.deployment import ModelDeployer, DeploymentStep, DeploymentContext


class UploadToS3Step(DeploymentStep):
    name = "upload_to_s3"

    def run(self, context: DeploymentContext) -> None:
        artifact_dir = context.deployment_dir
        uri = push_directory_to_s3(artifact_dir)  # your own upload helper
        context.artifacts["s3_uri"] = uri


ModelDeployer.register_step(UploadToS3Step)
deployment:
  steps:
    - create_directory
    - save_model
    - save_preprocessor
    - save_metadata
    - upload_to_s3
- ModelDeployer.register_step guards against invalid subclasses and makes the new step available globally (easy_mlops/deployment/deployer.py:34).
- Custom steps can write to the shared artifact map so later automation (for example, CI notifications) can read new locations or identifiers.
Controlling metadata content¶
The metadata writer picks up optional keys from the configuration:
deployment:
  version: 2.1.0
  deployment_time: "2024-05-01T09:00:00Z"
- version becomes part of metadata.json and is surfaced by ModelDeployer.load_deployed_model (easy_mlops/deployment/deployer.py:45).
- Supplying deployment_time is useful when replaying historical runs. If omitted, the current time is recorded in ISO format.
- You can also override output_dir at call time via ModelDeployer.save_deployment_artifacts(..., output_dir="./staging") when the destination depends on runtime context.
Deployment Layout¶
Running make-mlops-easy train (without --no-deploy) creates a timestamped directory under models/ by default:
models/
└── deployment_20240101_120000/
    ├── metadata.json
    ├── model.joblib
    ├── preprocessor.joblib
    ├── predict.py              # optional endpoint script
    └── logs/
        ├── metrics_history.json
        └── predictions_log.json
- model.joblib - Serialized estimator plus metadata such as metrics and problem type.
- preprocessor.joblib - Serialized DataPreprocessor instance capturing fitted scalers and encoders.
- metadata.json - Deployment metadata (paths, training summary, version, timestamps); see the sketch below for inspecting it directly.
- predict.py - Lightweight CLI for inference (created when create_endpoint is enabled).
- logs/ - Persisted metrics and prediction entries managed by ModelMonitor or any custom observability steps.
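To inspect a deployment without rehydrating the model, read metadata.json directly. The directory name below is illustrative, and the exact key names are assumptions based on the fields described above:

import json
from pathlib import Path

deployment_dir = Path("models/deployment_20240101_120000")  # illustrative path
metadata = json.loads((deployment_dir / "metadata.json").read_text())
# Key names are assumptions; the file records artifact paths, training metrics, version, and timestamps.
print(metadata.get("version"), metadata.get("deployment_time"))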
Loading Artifacts¶
The CLI predict, status, and observe commands rehydrate the model and preprocessor using ModelDeployer.load_deployed_model. You can follow the same pattern in custom scripts:
from easy_mlops.deployment import ModelDeployer
deployer = ModelDeployer({"output_dir": "./models"})
model_data, preprocessor, metadata = deployer.load_deployed_model("models/deployment_20240101_120000")
df = preprocessor.load_data("data/new_samples.csv")
X, _ = preprocessor.prepare_data(df, target_column=None, fit=False)
predictions = model_data["model"].predict(X)
Observability at a Glance¶
- ModelMonitor orchestrates a pipeline of ObservabilityStep instances (see easy_mlops/observability/steps.py). Each step reacts to log_metrics and log_prediction events emitted during training, inference, and CLI usage.
- When you call make-mlops-easy train, the pipeline logs evaluation metrics immediately after training completes. Running predict, status, or observe hydrates a fresh monitor using the same configuration and replays saved logs.
- Every observability step persists its own state underneath the deployment directory (logs/), keeping metrics, predictions, and alert evaluations colocated with the model artifacts.
- The CLI surfaces monitoring insights through status (metrics and predictions summaries) and observe (formatted report). You can generate the same output in code via ModelMonitor.generate_report(), as in the sketch after this list.
- The observability subsystem is extensible: register additional steps with ModelMonitor.register_step to stream events to third-party platforms or apply custom heuristics.
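A minimal sketch of exercising the monitor outside the CLI; the log_metrics signature is an assumption inferred from the on_log_metrics hook shown later in this guide:

from easy_mlops.observability import ModelMonitor

monitor = ModelMonitor({"track_metrics": True, "log_predictions": True, "alert_threshold": 0.8})
# Assumption: log_metrics mirrors the on_log_metrics(metrics, model_version) hook signature.
monitor.log_metrics({"accuracy": 0.91, "f1_score": 0.88}, model_version="1.0.0")
print(monitor.generate_report())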
Observability Capabilities and Configuration¶
The observability section of config.yaml determines which steps are active and how they behave. Defaults are:
observability:
  track_metrics: true
  log_predictions: true
  alert_threshold: 0.8
Metrics Logger (metrics_logger)¶
- Collects every metrics payload the pipeline emits, enriching it with timestamps and model versions.
- Produces trend statistics (mean, standard deviation, minimum, maximum) when multiple logs exist.
- Persists to logs/metrics_history.json. Disable by setting track_metrics: false. A sketch of reading the history file directly follows.
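If you need the raw history outside the monitor, the persisted file can be read directly. This sketch assumes metrics_history.json holds a JSON list of logged entries and uses an illustrative deployment path:

import json
from pathlib import Path

# Assumption: the file contains a JSON list of entries with metrics, timestamps, and model versions.
history = json.loads(Path("models/deployment_20240101_120000/logs/metrics_history.json").read_text())
print(f"{len(history)} metric snapshots recorded")
for entry in history[-3:]:
    print(entry)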
Predictions Logger (predictions_logger)¶
- Records individual predictions with timestamps and optional metadata. Non-scalar predictions are converted to strings for safe JSON storage.
- Summaries report totals, first and last timestamps, and the last five entries for quick inspection.
- Persists to logs/predictions_log.json. Disable with log_predictions: false.
Metric Threshold Evaluator (metric_threshold)¶
- Supplies alert decisions via check_metric_threshold(metric_name, value). By default it expects higher values for metrics such as accuracy or F1 and lower values for error metrics such as RMSE.
- Configuration keys:
  - alert_threshold: Default threshold when no per-metric value is provided.
  - metric_thresholds: Mapping of metric name to custom threshold.
  - metric_directions: Optional mapping to override direction (higher or lower) for custom metrics.
- When the threshold step is disabled (for example, by redefining the step list), ModelMonitor falls back to the basic alert_threshold logic for compatibility.
Step Orchestration with steps¶
For full control, provide an explicit ordered list under observability.steps:
observability:
  steps:
    - metrics_logger            # simple string uses default params
    - type: metric_threshold    # dict allows parameter overrides
      params:
        default_threshold: 0.75
        metric_thresholds:
          accuracy: 0.8
        metric_directions:
          rmse: higher          # invert direction if needed
- Strings refer to registered step names (metrics_logger, predictions_logger, metric_threshold).
- Dictionaries must include a type key and can supply nested params.
- Omitting predictions_logger removes prediction capture entirely; downstream summaries will indicate that prediction logging is disabled.
Custom Steps and Integrations¶
Add instrumentation by subclassing ObservabilityStep and registering it:
from easy_mlops.observability import ModelMonitor, ObservabilityStep


class SlackAlertStep(ObservabilityStep):
    name = "slack_alert"

    def on_log_metrics(self, metrics, model_version):
        if metrics.get("accuracy", 1) < 0.7:
            send_to_slack(metrics, model_version)  # your own notification helper


ModelMonitor.register_step(SlackAlertStep)
Once registered, list the step inside observability.steps. Each custom step can persist its own files through save() or restore state with load().
Complete Configuration Example¶
observability:
  track_metrics: true
  log_predictions: true
  alert_threshold: 0.82          # used when no per metric override exists
  metric_thresholds:
    f1_score: 0.78
    rmse: 0.45
  metric_directions:
    rmse: lower                  # explicit direction for custom metrics
  steps:
    - metrics_logger
    - predictions_logger
    - type: metric_threshold
      params:
        default_threshold: 0.82
        metric_thresholds:
          precision: 0.75
Any keys you omit fall back to the defaults baked into easy_mlops/config/config.py.
Observability in Practice¶
Threshold Checks¶
Use check_metric_threshold(metric_name, value) to determine if alerting is required based on configuration:
from easy_mlops.observability import ModelMonitor

monitor = ModelMonitor(
    {
        "track_metrics": True,
        "log_predictions": True,
        "alert_threshold": 0.8,
        "metric_thresholds": {"accuracy": 0.85},
    }
)

should_alert = monitor.check_metric_threshold("accuracy", value=0.82)
if should_alert:
    trigger_notification()  # your own alerting hook
The evaluator respects per-metric thresholds and direction overrides, falling back to the global alert_threshold when no specific rule is defined.
Reports¶
make-mlops-easy observe renders a text report summarizing both logs. You can generate the same report programmatically:
report = monitor.generate_report()
print(report)
The report includes total entries, first and last timestamps, and the latest metrics snapshot to support runbooks or dashboards.
Log Management and Automation¶
- Logs are JSON files suited for downstream ingestion. Copy them to durable storage (see the sketch below) or enhance save_logs to forward directly into your monitoring stack.
- Schedule the status command to verify metric health and deployment metadata on a cadence.
- Combine predict with automated data freshness checks; prediction logs provide an auditable trail.
- Preserve each deployment directory for reproducibility: every run bundles the model, preprocessing artifacts, configuration snapshot, and observability logs together.