Extending AutoPipeline

AutoPipeline already exposes several extension surfaces, but they are not all equivalent in cost.

In practice, most changes should stop at one of three layers: a primitive, a module/pipe, or a pipeline family.


Choose the Smallest Extension Surface

| If you need to... | Preferred extension point | Typical files |
| --- | --- | --- |
| Reuse a low-level backend, detector, segmenter, or helper across multiple pipes | Primitive | `src/autopipeline/components/primitives/` |
| Add a new executable metric, judge, or parsing block referenced by `pipe_name` | Module / Pipe | `src/autopipeline/components/modules/` |
| Add a new runtime orchestration pattern with its own input shaping and execution flow | Pipeline family | `src/autopipeline/pipelines/` |

The practical rule is simple:

- if YAML composition is enough, do not add a new Python type
- if several metrics need the same low-level capability, add a primitive
- if one new score or judge behavior must run from `metric_configs`, add a pipe
- if the runtime no longer fits object-centric, human-centric, or vlm-as-a-judge, add a new pipeline family
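
The rule above can be read as a small decision procedure. The sketch below is only an illustration of that reading: the function name `choose_extension_surfaces` and its boolean parameters are hypothetical and are not part of AutoPipeline.

```python
from typing import List


def choose_extension_surfaces(
    *,
    yaml_is_enough: bool,
    shared_low_level_capability: bool,
    new_metric_behavior: bool,
    fits_existing_family: bool,
) -> List[str]:
    """Return the layers a change would touch, cheapest first.

    Hypothetical helper that mirrors the four rules in the text.
    """
    if yaml_is_enough:
        # Pure YAML composition: no new Python type is needed.
        return []

    surfaces: List[str] = []
    if shared_low_level_capability:
        surfaces.append("primitive")
    if new_metric_behavior:
        surfaces.append("pipe")
    if not fits_existing_family:
        # The runtime is not object-centric, human-centric, or vlm-as-a-judge.
        surfaces.append("pipeline family")
    return surfaces
```

For example, a new judge that reuses existing backends and fits an existing family would return only `["pipe"]`.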

AutoPipeline relies on import-side-effect registration in several places.

That means a new class is not really part of the framework until the corresponding package `__init__.py` imports it.
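
To make the import-side-effect point concrete, here is a minimal sketch of decorator-based registration. The names `PIPE_REGISTRY`, `register_pipe`, `SharpnessPipe`, and `"sharpness-pipe"` are assumptions made for illustration; AutoPipeline's actual registry API may look different.

```python
from typing import Callable, Dict, Type

# Hypothetical registry; the real framework may use other names.
PIPE_REGISTRY: Dict[str, Type] = {}


def register_pipe(pipe_name: str) -> Callable[[Type], Type]:
    """Class decorator that records a pipe class under `pipe_name`."""

    def decorator(cls: Type) -> Type:
        if pipe_name in PIPE_REGISTRY:
            raise ValueError(f"duplicate pipe_name: {pipe_name!r}")
        PIPE_REGISTRY[pipe_name] = cls
        return cls

    return decorator


# In a real package this class would live in its own module (say, a
# hypothetical modules/sharpness.py) and the package __init__.py would hold:
#     from .sharpness import SharpnessPipe  # noqa: F401
# Without that import the decorator never runs, so the lookup by
# pipe_name fails even though the class exists on disk.
@register_pipe("sharpness-pipe")
class SharpnessPipe:
    def __init__(self, **init_config):
        self.init_config = init_config


# Registration happened as a side effect of executing the class definition.
pipe_cls = PIPE_REGISTRY["sharpness-pipe"]
```

The design trade-off is that registration is implicit: nothing fails at definition time if the import is forgotten, so a missing `pipe_name` at load time is usually the first symptom.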

The current codebase already contains a few important reusable pieces:

- `BasePipeline` handles config validation, parser-grounder loading, expert loading, image parsing, and metric pipe loading.
- `parser-grounder` is already the standard front end for region-aware pipelines.
- one pipe class can expose multiple metrics because the YAML metric key and the registered `pipe_name` are different surfaces.
- pipes with the same `(pipe_name, init_config)` share one cached instance inside a pipeline runtime.

Those constraints mean the cheapest successful extension is often:

Only go wider when that stops being structurally correct.
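
The two points about metric keys and instance sharing can be sketched as follows. This is an assumption-laden illustration: `PipeCache`, `DemoScorePipe`, the `"demo-score"` name, and the shape of the `metric_configs` dictionary are all hypothetical, and the real cache key may be built differently.

```python
import json
from typing import Any, Dict, Optional, Tuple


class PipeCache:
    """Hypothetical cache: one pipe instance per (pipe_name, init_config)."""

    def __init__(self, registry: Dict[str, type]) -> None:
        self._registry = registry
        self._instances: Dict[Tuple[str, str], Any] = {}

    def get(self, pipe_name: str, init_config: Optional[dict] = None) -> Any:
        init_config = init_config or {}
        # Serialize with sorted keys so equal configs produce equal cache keys.
        key = (pipe_name, json.dumps(init_config, sort_keys=True, default=str))
        if key not in self._instances:
            self._instances[key] = self._registry[pipe_name](**init_config)
        return self._instances[key]


class DemoScorePipe:
    def __init__(self, model: str = "base") -> None:
        self.model = model


registry = {"demo-score": DemoScorePipe}

# Three metric keys, one registered pipe_name: the metric key and the
# pipe_name are separate surfaces.
metric_configs = {
    "edit_score": {"pipe_name": "demo-score", "init_config": {"model": "base"}},
    "unedit_score": {"pipe_name": "demo-score", "init_config": {"model": "base"}},
    "edit_score_large": {"pipe_name": "demo-score", "init_config": {"model": "large"}},
}

cache = PipeCache(registry)
pipes = {
    metric: cache.get(cfg["pipe_name"], cfg["init_config"])
    for metric, cfg in metric_configs.items()
}
```

Here `edit_score` and `unedit_score` resolve to the same object, while `edit_score_large` gets its own instance because its `init_config` differs.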

A few extension points are more hardcoded than they first appear:

- `parser-grounder` is loaded explicitly by the existing region-aware pipeline families rather than through `metric_configs`
- `BasePipeline` only maps `edit_area` and `unedit_area` into `mask_mode`
- human-centric expert loading is coupled to metric-name prefixes such as `face_*`, `hair_*`, and `body_*`
- the runner and worker layer still contains explicit branching on the current pipeline families

This is why adding a primitive or a pipe is usually straightforward, while adding a new pipeline family requires more integration work.
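
Two of the hardcoded points above, the `mask_mode` mapping and the metric-name prefixes, can be pictured with the sketch below. It is not the framework's code: the function names, the returned values, and the choice to return `None` for unsupported input are assumptions made only to show why a new region or metric prefix does not work without touching existing code.

```python
from typing import Optional

# Assumed set of supported regions, taken from the two names in the text.
SUPPORTED_REGIONS = ("edit_area", "unedit_area")

# Prefixes named in the text; the real list may be longer.
EXPERT_PREFIXES = ("face_", "hair_", "body_")


def resolve_mask_mode(region: str) -> Optional[str]:
    """Return a mask mode for a known region, or None if it is unsupported."""
    return region if region in SUPPORTED_REGIONS else None


def expert_group_for_metric(metric_name: str) -> Optional[str]:
    """Infer the expert group from the metric-name prefix, if any."""
    for prefix in EXPERT_PREFIXES:
        if metric_name.startswith(prefix):
            return prefix.rstrip("_")
    return None


# A metric outside the known prefixes gets no expert loaded for it.
known = expert_group_for_metric("face_identity")
unknown = expert_group_for_metric("hand_pose")
```

Under these assumptions, a new `hand_*` metric or a third region name would silently fall through, which is the kind of integration work the paragraph above refers to.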

No matter which layer you extend, validate in this order:

That order will save time because most real failures happen in registration, config merge, prompt assets, or runtime input-shape mismatches.