Modules Overview

Modules are the executable units registered in PIPE_REGISTRY. Pipeline configs reference them through pipe_name, and they are the components that compute scores, winners, or parsed region artifacts.

Runtime contract

At the pipeline layer, a module is selected through metric_configs or parser_grounder_config:

    metric_configs:
      lpips:
        pipe_name: lpips-pipe
        init_config:
          model_path: ...
        scope: unedit_area
        runtime_params:
          ...

The pipeline runtime then:

  1. resolves pipe_name
  2. instantiates the module with init_config
  3. maps scope to mask_mode
  4. calls the module with the metric name and runtime_params
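
The four steps above can be sketched as follows. This is a minimal illustration, not the project's actual runtime: the dict-based PIPE_REGISTRY, the EchoPipe class, the run_metric helper, the call signature (metric name first, then mask_mode and runtime_params as keyword arguments), and the threshold parameter are all assumptions made for the example.

```python
from typing import Any, Callable, Dict, Optional

# Hypothetical stand-in for the real PIPE_REGISTRY: registry key -> module class.
PIPE_REGISTRY: Dict[str, Callable[..., Any]] = {}

# Scope-to-mask translation as described for BasePipeline.
SCOPE_TO_MASK_MODE = {"edit_area": "inner", "unedit_area": "outer"}


class EchoPipe:
    """Toy module (hypothetical) that only records how it was built and called."""

    def __init__(self, **init_config: Any) -> None:
        self.init_config = init_config

    def __call__(
        self, metric_name: str, mask_mode: Optional[str] = None, **runtime_params: Any
    ) -> dict:
        return {"metric": metric_name, "mask_mode": mask_mode, "params": runtime_params}


PIPE_REGISTRY["echo-pipe"] = EchoPipe


def run_metric(metric_name: str, cfg: dict) -> Any:
    """Follow the four steps: resolve, instantiate, map scope, call."""
    pipe_cls = PIPE_REGISTRY[cfg["pipe_name"]]            # 1. resolve pipe_name
    module = pipe_cls(**cfg.get("init_config", {}))       # 2. instantiate with init_config
    mask_mode = SCOPE_TO_MASK_MODE.get(cfg.get("scope"))  # 3. map scope to mask_mode
    runtime_params = cfg.get("runtime_params", {})
    return module(metric_name, mask_mode=mask_mode, **runtime_params)  # 4. call


# Same shape as the YAML config, with placeholder values (assumed for the demo).
metric_configs = {
    "lpips": {
        "pipe_name": "echo-pipe",
        "init_config": {"model_path": "placeholder"},
        "scope": "unedit_area",
        "runtime_params": {"threshold": 0.5},
    }
}

results = {name: run_metric(name, cfg) for name, cfg in metric_configs.items()}
```

Here results["lpips"] records mask_mode "outer", because the config's scope is unedit_area. The real modules may receive their inputs in a different form.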

Scope and masking

BasePipeline translates scopes into mask behavior:

  • edit_area -> mask_mode="inner"
  • unedit_area -> mask_mode="outer"

This matters for modules that consume full images plus coords, especially lpips-pipe, ssim-pipe, clip-pipe, dino-v3-pipe, and depth-anything-v2-pipe.
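
To make the two mask modes concrete, here is a small sketch on a toy grayscale image. It assumes that "inner" keeps only the pixels inside the box and "outer" keeps only the pixels outside it, and that coords is an (x0, y0, x1, y1) box with exclusive upper bounds. Neither assumption is confirmed by this page, and apply_mask is a hypothetical helper; the real modules may crop or weight regions instead.

```python
from typing import List, Tuple

Image = List[List[float]]         # toy grayscale image as rows of pixel values
BBox = Tuple[int, int, int, int]  # assumed layout: (x0, y0, x1, y1), end-exclusive

SCOPE_TO_MASK_MODE = {"edit_area": "inner", "unedit_area": "outer"}


def apply_mask(image: Image, coords: BBox, mask_mode: str, fill: float = 0.0) -> Image:
    """Keep pixels inside the box for "inner", outside the box for "outer"."""
    if mask_mode not in ("inner", "outer"):
        raise ValueError(f"unknown mask_mode: {mask_mode!r}")
    x0, y0, x1, y1 = coords
    out = []
    for y, row in enumerate(image):
        new_row = []
        for x, value in enumerate(row):
            inside = x0 <= x < x1 and y0 <= y < y1
            keep = inside if mask_mode == "inner" else not inside
            new_row.append(value if keep else fill)
        out.append(new_row)
    return out


image = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
]
box = (1, 1, 3, 3)  # covers the lower-right 2x2 block

edit_only = apply_mask(image, box, SCOPE_TO_MASK_MODE["edit_area"])
unedit_only = apply_mask(image, box, SCOPE_TO_MASK_MODE["unedit_area"])
```

Under these assumptions, a metric computed with scope unedit_area sees only the region the edit was supposed to leave untouched, which is why the scope choice changes what a score such as lpips actually measures.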

Registered module catalog

| Registry key | Implementation | Role | Typical metric or output |
|---|---|---|---|
| parser-grounder | parser_grounder.py | Parse edit instructions and ground regions | parsed object dict + bbox list |
| pairwise-judge | judge.py | Compare two candidate images with an LLM or VLM | Image A, Image B, Tie, Failed |
| viescore | judge.py | Score one or two edited images with a client-backed prompt | single score or pairwise winner |
| clip-pipe | clip_pipe.py | CLIP-based semantic similarity | emd, sam_clip_cls_sim |
| dino-v3-pipe | dino_pipe.py | DINO-based structure or object similarity | dinov3_structure_similarity, sam_dino_cls_sim |
| lpips-pipe | lpips_pipe.py | Perceptual similarity | lpips |
| ssim-pipe | ssim_pipe.py | Structural similarity | ssim, L_channel_ssim |
| sam-pipe | sam_pipe.py | Region mask overlap inside a bbox | iou |
| depth-anything-v2-pipe | depth_anything_v2_pipe.py | Depth-map similarity | depth_ssim |
| face-geometry-pipe | face_pipe.py | Facial landmark geometry consistency | L2_distance |
| face-texture-pipe | face_pipe.py | Facial texture or color consistency | high_frequency_diff, color_similarity, energy_ratio |
| face-identity-pipe | face_pipe.py | Face identity preservation | face_ID_sim, bg_faceID_sim, max_match_face_ID_sim |
| hair-consistency-pipe | hair_pipe.py | Hair-region consistency | color_distance, texture_energy_diff, high_frequency_diff |
| body-pose-and-shape-pipe | human_body.py | Body pose and silhouette consistency | body_shape_iou, body_pose_position_error |
| body-appearance-pipe | human_body.py | Body appearance similarity | body_appearance_dino_cosine_sim |

Important design caveats

The module interface is intentionally flexible, not perfectly uniform:

  • some modules return float
  • some return dict
  • parser-grounder returns a tuple
  • many modules return None or 0.0 on soft failure

The pipelines already tolerate sparse or partially missing scores, so extension code should generally follow the same soft-failure pattern unless a failure must be fatal.
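
Calling code that consumes several modules therefore has to cope with mixed return shapes. The sketch below shows one possible way to flatten float, dict, and None results into a single score table and to average only the scores that are present. The helper names collect_scores and mean_of_present are hypothetical, and this is not how the pipelines are confirmed to aggregate results.

```python
from typing import Any, Dict, Optional

Scores = Dict[str, Optional[float]]


def _as_score(value: Any) -> Optional[float]:
    """Return a float for real numbers, None for anything else (soft failure)."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return None
    return float(value)


def collect_scores(metric_name: str, result: Any) -> Scores:
    """Flatten one module result into {score_name: value or None}."""
    if isinstance(result, dict):
        # Dict-returning modules: prefix each sub-score with the metric name.
        return {f"{metric_name}.{key}": _as_score(val) for key, val in result.items()}
    if isinstance(result, tuple):
        # Tuple results (as from parser-grounder) are artifacts, not scores.
        raise TypeError("tuple results must be handled separately")
    return {metric_name: _as_score(result)}


def mean_of_present(scores: Scores) -> Optional[float]:
    """Average the available scores, ignoring soft failures recorded as None."""
    present = [v for v in scores.values() if v is not None]
    return sum(present) / len(present) if present else None


table: Scores = {}
table.update(collect_scores("lpips", 0.25))
table.update(collect_scores("ssim", None))
table.update(collect_scores("face", {"face_ID_sim": 0.75, "bg_faceID_sim": None}))
average = mean_of_present(table)
```

A soft failure reported as 0.0 cannot be told apart from a genuine zero score in a scheme like this, so returning None is the safer signal wherever the module's contract allows it.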

How to extend this layer safely

  • Reuse an existing module if your need is only a new task-specific config.
  • Add a new module when you need a new executable metric or judge behavior.
  • Import the new module in components/modules/__init__.py; otherwise it will not be registered.
  • Document the module's expected input contract, because not all modules consume the same input shape.
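
As an illustration of the last three points, the sketch below registers a new module and documents its input contract in the docstring. It assumes registration happens through a decorator that runs at import time, which would explain why the import in components/modules/__init__.py is required. The register_pipe decorator, the dict-based PIPE_REGISTRY, the BrightnessPipe class, and its call signature are all hypothetical, not the project's confirmed API.

```python
from typing import Callable, Dict, List, Optional

PIPE_REGISTRY: Dict[str, type] = {}  # hypothetical stand-in for the real registry


def register_pipe(name: str) -> Callable[[type], type]:
    """Assumed registration mechanism: runs when the defining file is imported."""

    def decorator(cls: type) -> type:
        if name in PIPE_REGISTRY:
            raise KeyError(f"pipe already registered: {name}")
        PIPE_REGISTRY[name] = cls
        return cls

    return decorator


@register_pipe("brightness-pipe")
class BrightnessPipe:
    """Hypothetical metric: absolute difference in mean brightness.

    Input contract: two equally sized grayscale images, each a list of rows
    of floats. Returns a float, or None on soft failure (empty or mismatched
    input), following the soft-failure convention described above.
    """

    def __init__(self, scale: float = 1.0) -> None:
        self.scale = scale

    def __call__(
        self,
        metric_name: str,
        source: List[List[float]],
        edited: List[List[float]],
        **runtime_params: object,
    ) -> Optional[float]:
        src = [v for row in source for v in row]
        dst = [v for row in edited for v in row]
        if not src or len(src) != len(dst):
            return None  # soft failure instead of raising
        return self.scale * abs(sum(src) / len(src) - sum(dst) / len(dst))


pipe = PIPE_REGISTRY["brightness-pipe"]()
score = pipe("brightness_diff", [[0.0, 1.0]], [[1.0, 1.0]])
missing = pipe("brightness_diff", [], [])
```

If the file defining BrightnessPipe were never imported, the decorator would never run and the lookup by registry key would fail, which is the failure mode the import rule guards against.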

Source-aligned module pages

The module docs are now split to stay closer to the source tree: