API Reference

Generated from the ll_gen package source. Each symbol links to its definition on GitHub.

Source: ll_gen/ll_gen/generators/base.py:141

Abstract base class for all neural generators.

Subclasses must implement:

  • generate(): Single proposal generation
  • generate_candidates(): Batch proposal generation

Attributes: device: Target device (“cpu” or “cuda”). checkpoint_path: Path to model checkpoint (optional). _model: Lazy-initialized neural network model.

Methods

__init__(device: str = 'cpu', checkpoint_path: Path | None = None) -> None

Source: ll_gen/ll_gen/generators/base.py:154

Initialize the generator.

Args: device: Target device (“cpu” or “cuda”). If “cuda” is not available, falls back to “cpu”. checkpoint_path: Optional path to load model checkpoint from.

generate(prompt: str, conditioning: ConditioningEmbeddings | None = None, error_context: dict[str, Any] | None = None) -> BaseProposal

Source: ll_gen/ll_gen/generators/base.py:172

Generate a single proposal from a prompt.

Args: prompt: User prompt describing the shape/object to generate. conditioning: Optional conditioning embeddings (text/image/multimodal). error_context: Optional error feedback from a prior failed attempt. Contains keys like “error_category”, “previous_latent_vector”, “failure_description”, etc.

Returns: A typed proposal (CommandSequenceProposal or LatentProposal).

generate_candidates(prompt: str, num_candidates: int = 3, conditioning: ConditioningEmbeddings | None = None) -> list[BaseProposal]

Source: ll_gen/ll_gen/generators/base.py:193

Generate multiple candidate proposals from a prompt.

Args: prompt: User prompt describing the shape/object to generate. num_candidates: Number of proposals to generate. conditioning: Optional conditioning embeddings.

Returns: List of proposals, typically ordered by confidence descending.

generate_for_training(prompt: str, conditioning: ConditioningEmbeddings | None = None, error_context: dict[str, Any] | None = None, target_dimensions: tuple[float, float, float] | None = None) -> BaseProposal

Source: ll_gen/ll_gen/generators/base.py:212

Generate a proposal while retaining the computation graph for RL.

target_dimensions is accepted uniformly so the trainer can pass it to any generator; subclasses that support dimension conditioning act on it, the default implementation ignores it.

Unlike generate() (which runs under torch.no_grad), this method keeps gradients flowing so that proposal.log_probs is a differentiable scalar that can be used in a REINFORCE loss::

loss = -advantage * proposal.log_probs

Subclasses should override this to compute log-probs on the same stochastic trajectory that was sampled (same latent z for VAE, same codebook indices for VQ-VAE, etc.).

The default implementation falls back to generate() with a warning — callers should check proposal.log_probs is not None before relying on the gradient.

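Args: prompt: User prompt describing the shape/object to generate. conditioning: Optional conditioning embeddings. error_context: Optional error feedback from a prior failed attempt. target_dimensions: Optional target dimensions as a 3-tuple of floats. Ignored by the default implementation; subclasses that support dimension conditioning act on it.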

Returns: A proposal with log_probs and entropy populated.
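The guard recommended above can be sketched as a small helper. This is a minimal illustration, not ll_gen code: `reinforce_loss` is a hypothetical name, and plain floats stand in for the differentiable scalar that `proposal.log_probs` would be in a real training loop.

```python
from typing import Optional


def reinforce_loss(advantage: float, log_probs: Optional[float]) -> float:
    """Return -advantage * log_probs, refusing to proceed without log-probs.

    Hypothetical helper: floats stand in for the scalar tensor a real
    generator would attach to the proposal.
    """
    if log_probs is None:
        # The default generate_for_training() falls back to generate(),
        # so log_probs may be unset and there is no gradient to use.
        raise ValueError("proposal.log_probs is None; no gradient signal available")
    return -advantage * log_probs


# A positive advantage on a sampled trajectory with log-prob -0.5.
print(reinforce_loss(advantage=2.0, log_probs=-0.5))  # 1.0
```

Raising early keeps a silent fallback from turning into a training step with no gradient.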

load_checkpoint(path: Path) -> None

Source: ll_gen/ll_gen/generators/base.py:297

Load model checkpoint into self._model.

Assumes self._model is already initialized. Loads the state dict and moves the model to the target device.

Args: path: Path to checkpoint file.

Raises: RuntimeError: If self._model is None or checkpoint cannot be loaded. ImportError: If torch is not available.

decode_command_logits(temperature: float = 1.0, latent: Any | None = None) -> tuple[Any, list[Any]] | None

Source: ll_gen/ll_gen/generators/base.py:617

Decode one batch of per-position command/parameter logits.

Runs the model’s decoder to produce the policy’s distribution over a command-token sequence, WITHOUT sampling it. Used by :meth:score_token_sequence to teacher-force the log-probability of a given sequence.

Args: temperature: Accepted for interface parity (the scorer applies the scaling); logits are returned unscaled. latent: Optional generator-specific latent to decode from. When provided (e.g. a proposal’s own latent_vector), the decode is deterministic, yielding a stable reconstruction- likelihood. When None, command-sequence generators draw a fresh prior sample, so the result is a single-sample, non-deterministic estimate.

Returns: (command_logits, param_logits_2d) where command_logits is a [S, C] tensor and param_logits_2d is a list of [S, P] tensors (one per parameter slot), or None for generators that do not emit command-token sequences (e.g. diffusion, which models continuous B-rep latents). Command- sequence generators (VAE, VQ-VAE) override this.

score_token_sequence(token_ids: Any, temperature: float = 1.0, latent: Any | None = None) -> tuple[Any, float]

Source: ll_gen/ll_gen/generators/base.py:649

Teacher-forcing log-probability of token_ids under the policy.

Re-decodes policy logits (:meth:decode_command_logits) and gathers the log-probability of the provided token_ids under those distributions — the exact inverse of the sampling performed in generate_for_training.

This is an evaluation / diagnostic score (sequence perplexity, reconstruction-likelihood logging): it runs a forward pass that is not the trajectory used for the RL gradient, so it must not be fed back as the REINFORCE signal. For the RL gradient use proposal.log_probs from generate_for_training (the sampled trajectory).

.. important:: Determinism depends on latent. When latent is None (the default), command-sequence generators decode from a fresh random prior draw, so repeated calls on the same token_ids return different values — a one-sample prior estimate, NOT a stable likelihood. Pass the sequence’s own latent (e.g. proposal.latent_vector) for a deterministic, meaningful reconstruction-likelihood — this is how evaluate_validity calls it.

Args: token_ids: Sequence of token IDs (geotoken vocabulary), optionally BOS-prefixed, as produced by the generator’s decode. temperature: Sampling temperature the logits are scaled by (match the temperature used when the sequence was generated). latent: Optional latent to decode from for a deterministic score (see the determinism note above).

Returns: (total_log_prob, mean_entropy). total_log_prob is a differentiable scalar tensor (sum of per-token log-probs), or None if no token could be scored / the generator does not emit command-token sequences. mean_entropy is a float.
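The gather step described above can be illustrated in plain Python. This sketch makes several simplifying assumptions that are not confirmed by the source: it scores only a single `[S, C]` table of command logits (parameter logits are ignored), it does not handle a BOS prefix, and it skips token IDs that fall outside the vocabulary. `sketch_score_tokens` is a hypothetical name.

```python
import math
from typing import List, Optional, Sequence, Tuple


def log_softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    log_z = peak + math.log(sum(math.exp(x - peak) for x in scaled))
    return [x - log_z for x in scaled]


def sketch_score_tokens(
    command_logits: Sequence[Sequence[float]],
    token_ids: Sequence[int],
    temperature: float = 1.0,
) -> Tuple[Optional[float], float]:
    """Sum the log-probs of token_ids and average the per-position entropy."""
    total = 0.0
    entropies: List[float] = []
    for row, token in zip(command_logits, token_ids):
        if not 0 <= token < len(row):
            continue  # assumption: unscorable tokens are skipped
        log_probs = log_softmax(row, temperature)
        total += log_probs[token]
        entropies.append(-sum(math.exp(lp) * lp for lp in log_probs))
    if not entropies:
        return None, 0.0  # nothing could be scored
    return total, sum(entropies) / len(entropies)


# Three positions with uniform logits over four classes.
total, entropy = sketch_score_tokens([[0.0] * 4] * 3, [0, 1, 2])
print(total, entropy)
```

With uniform logits each token has probability 1/4, so the total is three times log(1/4) and the mean entropy is log 4.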

Source: ll_gen/ll_gen/proposals/base.py:21

Common fields shared by all neural proposal types.

Subclasses (CodeProposal, CommandSequenceProposal, LatentProposal) add their domain-specific payloads on top of these shared fields.

Attributes: proposal_id: Unique identifier for this proposal instance. confidence: Neural generator’s self-assessed confidence in [0, 1]. For LLM code generation this comes from explicit self-rating or calibrated token probability. For VAE/diffusion it is derived from reconstruction loss or denoising confidence. attempt: Current retry attempt (1-indexed). max_attempts: Maximum retries before giving up. source_prompt: The original user prompt that triggered generation. conditioning_source: Description of the conditioning input type (“text”, “image”, “text+image”, “unconditional”, etc.). generation_metadata: Arbitrary key-value metadata from the generator (e.g. model name, latent vector norm, temperature). alternatives: Sibling proposals generated in the same batch. The orchestrator may try alternatives if this proposal fails. timestamp: UTC creation time. error_context: Structured error feedback from a prior failed attempt, used by the generator to correct on retry.

Methods

should_retry() -> bool

Source: ll_gen/ll_gen/proposals/base.py:83

Whether the retry budget allows another attempt.

Returns: True if attempt < max_attempts.

next_attempt() -> int

Source: ll_gen/ll_gen/proposals/base.py:91

Return the next attempt number.

Returns: attempt + 1, clamped to max_attempts.
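The two retry helpers reduce to a comparison and a clamp. The dataclass below is a hypothetical stand-in that carries only the two fields involved, following the return values documented above.

```python
from dataclasses import dataclass


@dataclass
class RetryBudget:
    """Hypothetical stand-in holding only the retry-related fields."""

    attempt: int = 1        # 1-indexed
    max_attempts: int = 3

    def should_retry(self) -> bool:
        # True while at least one attempt remains in the budget.
        return self.attempt < self.max_attempts

    def next_attempt(self) -> int:
        # attempt + 1, clamped so it never exceeds max_attempts.
        return min(self.attempt + 1, self.max_attempts)


first = RetryBudget(attempt=1, max_attempts=3)
last = RetryBudget(attempt=3, max_attempts=3)
print(first.should_retry(), first.next_attempt())  # True 2
print(last.should_retry(), last.next_attempt())    # False 3
```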

with_error_context(error: dict[str, Any]) -> _ProposalT

Source: ll_gen/ll_gen/proposals/base.py:99

Create a shallow copy with updated error context and incremented attempt.

This is used by the orchestrator when building a retry proposal: the error context from the failed disposal is attached so the generator can condition on it.

Args: error: Structured error dict from DisposalResult.

Returns: New BaseProposal (same type) with updated fields. Subclasses should override to preserve their own fields.
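A shallow copy with an updated error context might look like the sketch below. `SketchProposal` is a hypothetical class with a handful of the shared fields. Whether the real method clamps the incremented attempt to `max_attempts` is an assumption here.

```python
import copy
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class SketchProposal:
    """Hypothetical proposal carrying a few of the shared fields."""

    source_prompt: str = ""
    attempt: int = 1
    max_attempts: int = 3
    generation_metadata: Dict[str, Any] = field(default_factory=dict)
    error_context: Optional[Dict[str, Any]] = None

    def with_error_context(self, error: Dict[str, Any]) -> "SketchProposal":
        retry = copy.copy(self)  # shallow: nested objects stay shared
        retry.error_context = error
        # Assumption: the increment is clamped like next_attempt().
        retry.attempt = min(self.attempt + 1, self.max_attempts)
        return retry


original = SketchProposal(source_prompt="a bracket", generation_metadata={"model": "vae"})
retry = original.with_error_context({"error_category": "topology"})
print(retry.attempt, original.attempt)
```

Because the copy is shallow, `retry.generation_metadata` is the same dict object as on the original, so in-place mutation of it is visible through both proposals.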

summary() -> dict[str, Any]

Source: ll_gen/ll_gen/proposals/base.py:141

Return a compact summary dict for logging / serialization.

Returns: Dict with key identification and status fields.

__init__(proposal_id: str = (lambda: uuid.uuid4().hex)(), confidence: float = 0.0, attempt: int = 1, max_attempts: int = 3, source_prompt: str = '', conditioning_source: str | None = None, generation_metadata: dict[str, Any] = dict(), alternatives: list[Any] = list(), timestamp: str = (lambda: datetime.now(timezone.utc).isoformat())(), error_context: dict[str, Any] | None = None, log_probs: Any | None = None, entropy: float | None = None) -> None

Source: ll_gen/ll_gen/proposals/base.py

Source: ll_gen/ll_gen/codegen/cadquery_proposer.py:27

Wraps cadling’s CadQueryGenerator to produce typed CodeProposal objects.

This class manages the lifecycle of cadling’s CadQueryGenerator, handling both initial generation from prompts and repair generation for code that has failed validation or execution.

Attributes: config: The CodegenConfig for model selection and API provider. generator: The underlying cadling CadQueryGenerator instance.

Raises: ImportError: If cadling is not installed when propose() is called.

Methods

__init__(config: CodegenConfig | None = None) -> None

Source: ll_gen/ll_gen/codegen/cadquery_proposer.py:42

Initialize the CadQueryProposer.

Args: config: Optional CodegenConfig specifying model_name and api_provider. If None, uses defaults from CodegenConfig.

Side-effects: If cadling is available, creates a CadQueryGenerator instance.

propose(prompt: str, image_path: Path | None = None, error_context: dict | None = None, attempt: int = 1) -> CodeProposal

Source: ll_gen/ll_gen/codegen/cadquery_proposer.py:61

Generate a CadQuery code proposal from a prompt.

For the first attempt (attempt=1), generates code from scratch. For retry attempts (attempt>1), uses the repair endpoint with the previous code and error message.

Args: prompt: The natural language prompt describing the part. image_path: Optional path to a reference image (JPEG/PNG). error_context: Required on retry attempts. Dictionary with keys: - “old_code” (str): The previous code that failed - “error_message” (str): The error from execution. attempt: Attempt number (1=initial, 2+=retry). Used to select generation vs repair mode.

Returns: A CodeProposal wrapping the generated code with: - language set to CADQUERY - imports_required extracted from the code - syntax_valid pre-checked

Raises: ImportError: If cadling is not installed. ValueError: If error_context is missing required keys on retry.
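The choice between generation and repair, and the `ValueError` on incomplete retry context, can be sketched as follows. `select_mode` and the returned mode strings are hypothetical; only the two required keys and the attempt rule come from the description above.

```python
from typing import Any, Dict, Optional

REQUIRED_RETRY_KEYS = ("old_code", "error_message")


def select_mode(attempt: int, error_context: Optional[Dict[str, Any]] = None) -> str:
    """Return "generate" for the first attempt and "repair" for retries."""
    if attempt <= 1:
        return "generate"
    context = error_context or {}
    missing = [key for key in REQUIRED_RETRY_KEYS if key not in context]
    if missing:
        raise ValueError(f"error_context is missing required keys: {missing}")
    return "repair"


print(select_mode(1))
print(select_mode(2, {"old_code": "import cadquery", "error_message": "NameError"}))
```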

propose_batch(prompt: str, num_candidates: int = 3, image_path: Path | None = None) -> list[CodeProposal]

Source: ll_gen/ll_gen/codegen/cadquery_proposer.py:139

Generate multiple candidate CadQuery code proposals.

This method calls the generator multiple times to produce diverse candidates. Useful for downstream filtering or ranking.

Args: prompt: The natural language prompt describing the part. num_candidates: Number of distinct candidates to generate. image_path: Optional path to a reference image.

Returns: A list of CodeProposal objects, each with: - language set to CADQUERY - syntax_valid pre-checked - code_hash set for deduplication

Raises: ImportError: If cadling is not installed.
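Deduplicating a batch by `code_hash` could be done along these lines. The sketch hashes the raw UTF-8 code string with SHA-256; whether ll_gen normalizes the code (whitespace, comments) before hashing is not stated, so that part is an assumption. Both function names are hypothetical.

```python
import hashlib
from typing import Iterable, List


def sketch_code_hash(code: str) -> str:
    """SHA-256 hex digest of the code string (no normalization assumed)."""
    return hashlib.sha256(code.encode("utf-8")).hexdigest()


def dedupe_by_hash(candidates: Iterable[str]) -> List[str]:
    """Keep the first occurrence of each distinct code string, in order."""
    seen = set()
    unique: List[str] = []
    for code in candidates:
        digest = sketch_code_hash(code)
        if digest not in seen:
            seen.add(digest)
            unique.append(code)
    return unique


batch = ["box(1, 2, 3)", "sphere(5)", "box(1, 2, 3)"]
print(dedupe_by_hash(batch))  # ['box(1, 2, 3)', 'sphere(5)']
```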

Source: ll_gen/ll_gen/config.py:33

Supported code generation languages.

Source: ll_gen/ll_gen/proposals/code_proposal.py:23

A proposal containing executable CAD code.

Attributes: code: The executable code string. language: Which CAD scripting language the code is written in. syntax_valid: Whether syntax has been pre-checked. None means not yet checked; True/False after validate_syntax() has run. imports_required: List of module names the code requires (extracted from import statements). code_hash: SHA-256 digest of code for dedup / caching.

Methods

validate_syntax() -> bool

Source: ll_gen/ll_gen/proposals/code_proposal.py:56

Check whether the code is syntactically valid.

For Python-based languages (CadQuery, pythonocc) this uses ast.parse. For OpenSCAD it uses a set of heuristic regex checks that catch the most common structural errors (unmatched braces, missing semicolons after statements).

Returns: True if the code parses without error.

Side-effects: Sets self.syntax_valid.
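For the Python-based languages, a check built on `ast.parse` might look like this sketch. It also collects top-level module names from import statements, in the spirit of `imports_required`; the exact form ll_gen stores is an assumption, and `check_python_code` is a hypothetical name.

```python
import ast
from typing import List, Tuple


def check_python_code(code: str) -> Tuple[bool, List[str]]:
    """Return (syntax_ok, sorted top-level module names imported by the code)."""
    try:
        tree = ast.parse(code)
    except SyntaxError:
        return False, []
    modules = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            # Relative imports such as "from . import x" have module=None.
            modules.add(node.module.split(".")[0])
    return True, sorted(modules)


print(check_python_code("import cadquery as cq\nfrom math import pi\n"))
print(check_python_code("def (:"))
```

Parsing never executes the code, so the check is safe to run on untrusted generator output.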

with_error_context(error: dict[str, Any]) -> CodeProposal

Source: ll_gen/ll_gen/proposals/code_proposal.py:189

Create a retry proposal with error context attached.

The new proposal keeps the same language and source prompt but clears the code (the generator will produce a new attempt conditioned on the error).

Args: error: Structured error dict from the disposal result.

Returns: New CodeProposal ready for retry generation.

summary() -> dict[str, Any]

Source: ll_gen/ll_gen/proposals/code_proposal.py:230

Extended summary including code-specific fields.

__init__(proposal_id: str = (lambda: uuid.uuid4().hex)(), confidence: float = 0.0, attempt: int = 1, max_attempts: int = 3, source_prompt: str = '', conditioning_source: str | None = None, generation_metadata: dict[str, Any] = dict(), alternatives: list[Any] = list(), timestamp: str = (lambda: datetime.now(timezone.utc).isoformat())(), error_context: dict[str, Any] | None = None, log_probs: Any | None = None, entropy: float | None = None, code: str = '', language: CodeLanguage = CodeLanguage.CADQUERY, syntax_valid: bool | None = None, imports_required: list[str] = list(), code_hash: str | None = None) -> None

Source: ll_gen/ll_gen/proposals/code_proposal.py

Source: ll_gen/ll_gen/config.py:198

Configuration for code generation (Path A).

Methods

__init__(model_name: str = 'claude-sonnet-4-20250514', api_provider: str = 'anthropic', max_tokens: int = 4096, temperature: float = 0.2, execution_timeout: int = 30, max_retries: int = 3, default_backend: CodeLanguage = CodeLanguage.CADQUERY, allowed_modules: list[str] = (lambda: ['cadquery', 'math', 'numpy'])(), include_examples: bool = True, max_example_tokens: int = 2000) -> None

Source: ll_gen/ll_gen/config.py

class CommandSequenceProposal(BaseProposal)

Source: ll_gen/ll_gen/proposals/command_proposal.py:22

A proposal containing a quantized command token sequence.

Attributes: token_ids: Flat integer token ID sequence as produced by the neural decoder. These are vocabulary-level IDs that include special tokens (PAD=0, BOS=1, EOS=2, SEP=3, UNK=4) and command/parameter tokens. command_dicts: Structured command list. Each dict contains::

    {
        "command_type": str,            # "SOL", "LINE", ...
        "parameters": list[int],        # 16 quantized ints
        "parameter_mask": list[bool],   # 16 bools
    }

Either token_ids or command_dicts must be non-empty. quantization_bits: Bit-width used for parameter quantization. normalization_range: Bounding cube size used for normalization (typically 2.0 for a [-1,1] cube). precision_tier: Name of the precision tier ("DRAFT", "STANDARD", "PRECISION") from geotoken. latent_vector: The latent-space vector that was decoded to produce this sequence (if available). Shape (latent_dim,).

Methods

to_token_sequence() -> Any

Source: ll_gen/ll_gen/proposals/command_proposal.py:60

Convert to a geotoken TokenSequence for downstream processing.

Imports geotoken lazily so ll_gen can be used without it installed (the proposal dataclass itself is pure Python).

Returns: A geotoken.TokenSequence populated with CommandToken entries derived from command_dicts.

Raises: ImportError: If geotoken is not installed. ValueError: If neither token_ids nor command_dicts are populated.

dequantize() -> list[dict[str, Any]]

Source: ll_gen/ll_gen/proposals/command_proposal.py:212

Dequantize command parameters to continuous float values.

Maps quantized integer parameters back to continuous coordinates using the inverse of the symmetric quantization::

value = (param / (levels - 1)) * 2 * range - range

where levels = 2 ** quantization_bits and range = normalization_range. This maps [0, levels-1] back to [-range, +range].

Returns: List of dicts with command_type (str), parameters (list of floats), and parameter_mask (list of bools).
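The inverse mapping above is short enough to write out directly. This sketch implements the formula exactly as stated, with the documented defaults of 8 bits and a normalization range of 2.0; `dequantize_param` is a hypothetical helper and ignores the parameter mask.

```python
def dequantize_param(
    param: int,
    quantization_bits: int = 8,
    normalization_range: float = 2.0,
) -> float:
    """Map a quantized integer in [0, levels - 1] to [-range, +range]."""
    levels = 2 ** quantization_bits
    return (param / (levels - 1)) * 2 * normalization_range - normalization_range


print(dequantize_param(0))    # -2.0
print(dequantize_param(255))  # 2.0
```

The endpoints land exactly on the range limits because the divisor is `levels - 1`, not `levels`.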

command_dicts_from_token_ids() -> list[dict[str, Any]]

Source: ll_gen/ll_gen/proposals/command_proposal.py:267

Decode token_ids into structured command dicts (quantized params).

Returns the same {command_type, parameters, parameter_mask} schema as a pipeline-decoded proposal, so a token-id-only proposal can expose per-command structure to consumers (and matches the pre-unification behavior of the generators). Returns [] when there are no token ids.

with_error_context(error: dict[str, Any]) -> CommandSequenceProposal

Source: ll_gen/ll_gen/proposals/command_proposal.py:320

Create a retry proposal with error context.

Preserves the latent vector (so the generator can perturb it) but clears the decoded tokens.

summary() -> dict[str, Any]

Source: ll_gen/ll_gen/proposals/command_proposal.py:337

Extended summary with command-specific fields.

__init__(proposal_id: str = (lambda: uuid.uuid4().hex)(), confidence: float = 0.0, attempt: int = 1, max_attempts: int = 3, source_prompt: str = '', conditioning_source: str | None = None, generation_metadata: dict[str, Any] = dict(), alternatives: list[Any] = list(), timestamp: str = (lambda: datetime.now(timezone.utc).isoformat())(), error_context: dict[str, Any] | None = None, log_probs: Any | None = None, entropy: float | None = None, token_ids: list[int] = list(), command_dicts: list[dict[str, Any]] = list(), quantization_bits: int = 8, normalization_range: float = 2.0, precision_tier: str = 'STANDARD', latent_vector: Any | None = None) -> None

Source: ll_gen/ll_gen/proposals/command_proposal.py

Source: ll_gen/ll_gen/config.py:350

Configuration for the conditioning layer.

Methods

__init__(text_model: str = 'bert-base-uncased', image_model: str = 'dino_vits16', conditioning_dim: int = 768, freeze_encoders: bool = True, fusion_method: str = 'concat', image_size: int = 224) -> None

Source: ll_gen/ll_gen/config.py

Source: ll_gen/ll_gen/conditioning/embeddings.py:19

Unified conditioning embeddings from text, image, or multimodal sources.

Attributes: token_embeddings: Optional (seq_len, embed_dim) array of per-token embeddings. None if pooled embedding only, or if source doesn’t produce token-level representations. pooled_embedding: Optional (embed_dim,) single-vector summary of the input. Can be None if only token_embeddings is provided. source_type: Type of source — “text”, “image”, or “multimodal”. source_model: Model identifier (e.g., “bert-base-uncased”, “dino_vits16”, “hash_fallback” for deterministic fallback). embed_dim: Embedding dimension. Must be consistent with array shapes. metadata: Additional metadata such as sequence length, image size, region coordinates, or language tags.

Methods

to_tensor(device: str = 'cpu') -> Any | None

Source: ll_gen/ll_gen/conditioning/embeddings.py:44

Convert pooled embedding to torch tensor.

Args: device: Target device (“cpu” or “cuda”).

Returns: torch.Tensor of shape (embed_dim,) or None if pooled_embedding is None.

to_token_tensor(device: str = 'cpu') -> Any | None

Source: ll_gen/ll_gen/conditioning/embeddings.py:69

Convert token embeddings to torch tensor.

Args: device: Target device (“cpu” or “cuda”).

Returns: torch.Tensor of shape (seq_len, embed_dim) or None if token_embeddings is None.

from_tensor(tensor: Any, source_type: str, source_model: str, token_tensor: Any | None = None, metadata: dict[str, Any] | None = None) -> ConditioningEmbeddings

Source: ll_gen/ll_gen/conditioning/embeddings.py:95

Create ConditioningEmbeddings from torch tensor(s).

Args: tensor: torch.Tensor of shape (embed_dim,) for pooled embedding. source_type: Type of source (“text”, “image”, or “multimodal”). source_model: Model identifier. token_tensor: Optional torch.Tensor of shape (seq_len, embed_dim). metadata: Optional metadata dictionary.

Returns: ConditioningEmbeddings instance.

validate() -> bool

Source: ll_gen/ll_gen/conditioning/embeddings.py:155

Validate consistency of embeddings and metadata.

Returns: True if embeddings are valid and consistent, False otherwise.

summary() -> dict[str, Any]

Source: ll_gen/ll_gen/conditioning/embeddings.py:200

Compact summary of the embeddings.

Returns: Dictionary with keys: - source_type - source_model - embed_dim - has_pooled_embedding - has_token_embeddings - token_seq_len (if available) - metadata

__init__(token_embeddings: np.ndarray | None = None, pooled_embedding: np.ndarray | None = None, source_type: str = 'text', source_model: str = 'unknown', embed_dim: int = 768, metadata: dict[str, Any] = dict()) -> None

Source: ll_gen/ll_gen/conditioning/embeddings.py

Source: ll_gen/ll_gen/conditioning/constraint_predictor.py:54

Prediction of a single geometric constraint.

Attributes: constraint_type: Type of constraint (from ConstraintType enum). confidence: Confidence score in [0.0, 1.0]. parameters: Type-specific parameters (e.g., dimensions, axis). source: Origin of prediction (“keyword”, “dimension_regex”, or “learned”).

Methods

__init__(constraint_type: ConstraintType, confidence: float, parameters: dict[str, Any], source: str) -> None

Source: ll_gen/ll_gen/conditioning/constraint_predictor.py

Source: ll_gen/ll_gen/conditioning/constraint_predictor.py:76

Extracts geometric constraints from prompts and embeddings.

Uses keyword patterns, regular expressions for dimensions, and an optional learned MLP for constraint prediction from embeddings.

Attributes: device: Torch device for learned model. embedding_dim: Dimension of embeddings (default 768).

Methods

__init__(device: str = 'cpu', embedding_dim: int = 768) -> None

Source: ll_gen/ll_gen/conditioning/constraint_predictor.py:87

Initialize ConstraintPredictor.

Args: device: Torch device (“cpu” or “cuda:*”). embedding_dim: Embedding dimension for learned model input.

predict_from_prompt(prompt: str) -> list[ConstraintPrediction]

Source: ll_gen/ll_gen/conditioning/constraint_predictor.py:100

Extract constraints from natural language prompt.

Parses prompt for:

  • Dimensional patterns (mm, cm, m, in, inches)
  • Symmetry keywords (symmetric, mirror, symmetrical)
  • Smoothness keywords (smooth, fillet, round, continuous)
  • Regularity keywords (pattern, array, grid, repeated, evenly spaced)
  • Connectivity keywords (connected, joined, attached, assembled)
  • Watertight/manifold keywords (solid, closed, watertight, manifold, printable)
  • Planarity keywords (flat, planar, plane)

Always adds MANIFOLD constraint with confidence 0.5 (default expectation).

Args: prompt: Natural language prompt.

Returns: List of ConstraintPrediction objects.
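A rough sketch of the keyword and dimension parsing listed above follows. It covers only three keyword groups, uses plain dicts instead of `ConstraintPrediction`, and makes up the regular expression, the 0.8 keyword confidence, and the `source` value of the default MANIFOLD entry. Only the keyword lists, the unit list, and the default MANIFOLD confidence of 0.5 come from the description.

```python
import re
from typing import Any, Dict, List

# Hypothetical pattern: a number followed by one of the listed units.
DIMENSION_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(mm|cm|inches|in|m)\b", re.IGNORECASE)

KEYWORDS = {
    "SYMMETRY": ("symmetric", "mirror", "symmetrical"),
    "SMOOTHNESS": ("smooth", "fillet", "round", "continuous"),
    "PLANARITY": ("flat", "planar", "plane"),
}

KEYWORD_CONFIDENCE = 0.8  # hypothetical value, not taken from ll_gen


def sketch_predict_from_prompt(prompt: str) -> List[Dict[str, Any]]:
    predictions: List[Dict[str, Any]] = []
    for value, unit in DIMENSION_RE.findall(prompt):
        predictions.append({
            "constraint_type": "BOUNDING_BOX",
            "confidence": KEYWORD_CONFIDENCE,
            "parameters": {"value": float(value), "unit": unit.lower()},
            "source": "dimension_regex",
        })
    lowered = prompt.lower()
    for constraint_type, words in KEYWORDS.items():
        if any(re.search(rf"\b{re.escape(word)}\b", lowered) for word in words):
            predictions.append({
                "constraint_type": constraint_type,
                "confidence": KEYWORD_CONFIDENCE,
                "parameters": {},
                "source": "keyword",
            })
    # Default expectation: always include MANIFOLD at confidence 0.5.
    predictions.append({
        "constraint_type": "MANIFOLD",
        "confidence": 0.5,
        "parameters": {},
        "source": "keyword",  # assumption: the real source label is unknown
    })
    return predictions


for prediction in sketch_predict_from_prompt("A symmetric bracket 10 mm wide with a flat base"):
    print(prediction["constraint_type"], prediction["confidence"])
```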

predict_from_embeddings(embeddings: ConditioningEmbeddings) -> list[ConstraintPrediction]

Source: ll_gen/ll_gen/conditioning/constraint_predictor.py:272

Predict constraints from embeddings using learned model.

If no learned model has been trained or initialized, returns an empty list. Requires torch to be available.

Args: embeddings: ConditioningEmbeddings instance.

Returns: List of ConstraintPrediction objects from learned model.

to_loss_weights(predictions: list[ConstraintPrediction]) -> dict[str, float]

Source: ll_gen/ll_gen/conditioning/constraint_predictor.py:333

Convert constraint predictions to loss weights.

Maps constraint types to confidence scores for use in RL training reward weighting.

Args: predictions: List of constraint predictions.

Returns: Dictionary mapping constraint type names to confidence weights.
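The mapping to loss weights can be pictured as a dictionary build. How the real method resolves several predictions of the same type is not documented; this sketch assumes the highest confidence wins, and it takes `(type_name, confidence)` pairs rather than `ConstraintPrediction` objects.

```python
from typing import Dict, Iterable, Tuple


def sketch_to_loss_weights(predictions: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Map each constraint type name to a confidence weight."""
    weights: Dict[str, float] = {}
    for name, confidence in predictions:
        # Assumption: duplicates keep the highest confidence seen.
        weights[name] = max(confidence, weights.get(name, 0.0))
    return weights


print(sketch_to_loss_weights([("MANIFOLD", 0.5), ("SYMMETRY", 0.8), ("MANIFOLD", 0.9)]))
```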

initialize_learned_model() -> None

Source: ll_gen/ll_gen/conditioning/constraint_predictor.py:360

Public entry point for initializing the learned constraint model.

Delegates to _init_learned_model which creates the MLP.

set_learned_model(model: Any | None) -> None

Source: ll_gen/ll_gen/conditioning/constraint_predictor.py:396

Set a pre-trained learned constraint model.

Args: model: Torch nn.Module or None to clear the model.

get_learned_model() -> Any | None

Source: ll_gen/ll_gen/conditioning/constraint_predictor.py:409

Get the current learned constraint model.

Returns: The learned model or None.

Source: ll_gen/ll_gen/conditioning/constraint_predictor.py:30

Enumeration of geometric constraint types.

Attributes: BOUNDING_BOX: Dimensional constraints (width, height, depth, etc.). SYMMETRY: Mirror or rotational symmetry requirements. PLANARITY: Surfaces must be flat/planar. SMOOTHNESS: Continuous/smooth surface transitions required. CONNECTIVITY: Parts must connect or be joined. MANIFOLD: Closed, watertight topology. REGULARITY: Regular patterns (arrays, grids, evenly spaced). WATERTIGHT: Closed volume suitable for 3D printing.

Source: ll_gen/ll_gen/config.py:323

Configuration for research dataset loading.

Methods

__init__(deepcad_path: str = 'latticelabs/deepcad', abc_path: str = 'latticelabs/abc', text2cad_path: str = 'latticelabs/text2cad', sketchgraphs_path: str = 'latticelabs/sketchgraphs', streaming: bool = True, shuffle: bool = True, shuffle_buffer_size: int = 10000, max_samples: int | None = None, max_commands: int = 60, quantization_bits: int = 8, normalization_range: float = 2.0, train_split: str = 'train', val_split: str = 'validation', test_split: str = 'test') -> None

Source: ll_gen/ll_gen/config.py

Source: ll_gen/ll_gen/config.py:229

Configuration for the deterministic disposal engine.

Methods

__init__(tolerance: float = 1e-07, angular_tolerance: float = 1e-05, enable_auto_repair: bool = True, max_repair_passes: int = 3, shapefix_precision: float = 1e-07, shapefix_max_tolerance: float = 0.001, shapefix_min_tolerance: float = 1e-07, fuzzy_tolerance_steps: list[float] = (lambda: [1e-07, 1e-06, 1e-05, 0.0001, 0.001])(), check_manifoldness: bool = True, check_euler: bool = True, check_watertightness: bool = True, check_self_intersection: bool = True, always_introspect: bool = True) -> None

Source: ll_gen/ll_gen/config.py

Source: ll_gen/ll_gen/disposal/engine.py:35

Deterministic disposal engine for the propose/dispose architecture.

Accepts typed proposals from the neural layer, executes them through the appropriate executor, validates the result, attempts repair if invalid, computes introspection data, and exports to STEP/STL.

Args: disposal_config: Validation and repair settings. export_config: STEP/STL export settings. feedback_config: Reward signal weights. output_dir: Directory for exported files.

Example::

    engine = DisposalEngine()
    result = engine.dispose(code_proposal)
    if result.is_valid:
        print(f"Success! STEP at {result.step_path}")
    else:
        print(f"Failed: {result.error_message}")

Methods

__init__(disposal_config: DisposalConfig | None = None, export_config: ExportConfig | None = None, feedback_config: FeedbackConfig | None = None, output_dir: str | None = None) -> None

Source: ll_gen/ll_gen/disposal/engine.py:59

dispose(proposal: BaseProposal, export: bool = True, render: bool = False) -> DisposalResult

Source: ll_gen/ll_gen/disposal/engine.py:72

Execute a proposal through the full disposal pipeline.

Pipeline stages:

  1. Execute — Convert proposal to TopoDS_Shape via the appropriate executor (code, command, or surface).
  2. Validate — Run BRepCheck_Analyzer + manifold + Euler + watertight checks.
  3. Repair (if invalid) — Apply ShapeFix_* tools with tolerance escalation.
  4. Introspect — Compute GeometryReport (volume, bbox, face counts, surface types).
  5. Export (if valid) — Write STEP and/or STL files.
  6. Reward — Compute scalar reward for RL training.

Args: proposal: A CodeProposal, CommandSequenceProposal, or LatentProposal. export: Whether to export valid shapes to STEP/STL. render: Whether to generate multi-view renders.

Returns: DisposalResult with complete outcome data.
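Tolerance escalation in the repair stage can be pictured as a loop over increasing tolerances that stops at the first success. The sketch below is illustrative only: `try_repair` is a hypothetical callback standing in for the ShapeFix call, the step values mirror the `fuzzy_tolerance_steps` default in `DisposalConfig`, and the actual engine logic (including how `max_repair_passes` interacts with the steps) is not shown in this reference.

```python
from typing import Callable, Optional, Sequence, Tuple

DEFAULT_STEPS: Tuple[float, ...] = (1e-07, 1e-06, 1e-05, 0.0001, 0.001)


def repair_with_escalation(
    shape: object,
    try_repair: Callable[[object, float], bool],
    steps: Sequence[float] = DEFAULT_STEPS,
) -> Tuple[bool, Optional[float]]:
    """Try each tolerance in order; return (succeeded, tolerance_used)."""
    for tolerance in steps:
        if try_repair(shape, tolerance):
            return True, tolerance
    return False, None


# Stub repair that only succeeds once the tolerance is loose enough.
print(repair_with_escalation("shape", lambda shape, tol: tol >= 1e-05))
```

Starting from the tightest tolerance means a repaired shape deviates from the proposal no more than necessary.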

export_result(result: DisposalResult) -> DisposalResult

Source: ll_gen/ll_gen/disposal/engine.py:284

Export a previously-disposed result to STEP and STL.

Useful when disposal was run with export=False (e.g. during batch candidate generation) and only the winning result needs to be written to disk.

Args: result: A disposal result whose shape is not None and is_valid is True.

Returns: The same result instance with step_path and stl_path populated.

Source: ll_gen/ll_gen/proposals/disposal_result.py:169

Complete outcome of deterministic proposal execution.

This is the primary return type of the DisposalEngine.dispose() method. It carries the constructed shape (or None on failure), validation findings, repair history, introspection data, export paths, and a scalar reward signal for RL training.

Attributes: shape: The constructed TopoDS_Shape, or None if execution failed entirely. Stored as Any to avoid importing OCC at module level. is_valid: Whether the shape passed all validation checks. error_category: Primary error category if invalid, else None. error_details: Per-entity validation findings. geometry_report: Introspection data (volume, bbox, etc.). repair_attempted: Whether deterministic repair was tried. repair_succeeded: Whether repair produced a valid shape. repair_actions: Ordered list of repair steps taken. reward_signal: Scalar reward for RL training. 1.0 = valid, 0.0 = complete failure, with partial credit. error_message: LLM-readable error description for code retry. suggestion: Actionable correction suggestion. step_path: Path to exported STEP file (if valid and exported). stl_path: Path to exported STL file (if valid and exported). render_paths: Paths to multi-view renders (if generated). execution_time_ms: Wall-clock time for disposal in milliseconds. proposal_id: ID of the proposal that produced this result. proposal_type: Class name of the proposal (“CodeProposal”, etc.). generation_history: GenerationHistory from the orchestrator retry loop, capturing routing decisions, per-attempt telemetry, and total wall-clock time. None when the result comes from a single DisposalEngine.dispose() call outside the orchestrator.

Methods

to_dict() -> dict[str, Any]

Source: ll_gen/ll_gen/proposals/disposal_result.py:270

Serialize to a plain dict (shape excluded).

Useful for JSON logging and training data recording.

summary() -> dict[str, Any]

Source: ll_gen/ll_gen/proposals/disposal_result.py:330

Compact summary for logging.

__init__(shape: Any = None, is_valid: bool = False, error_category: ErrorCategory | None = None, error_details: list[ValidationFinding] = list(), geometry_report: GeometryReport | None = None, repair_attempted: bool = False, repair_succeeded: bool = False, repair_actions: list[RepairAction] = list(), reward_signal: float = 0.0, error_message: str | None = None, suggestion: str | None = None, step_path: Path | None = None, stl_path: Path | None = None, render_paths: list[Path] | None = None, execution_time_ms: float = 0.0, proposal_id: str = '', proposal_type: str = '', generation_history: Any = None) -> None

Source: ll_gen/ll_gen/proposals/disposal_result.py

Source: ll_gen/ll_gen/config.py:41

Neural-interpretable error categories.

OpenCASCADE’s BRepCheck_Analyzer reports 37 distinct error codes. These 6 categories collapse them into signals that an LLM or RL reward function can act on.

Source: ll_gen/ll_gen/config.py:65

Severity of a validation finding.

Source: ll_gen/ll_gen/config.py:267

Configuration for STEP/STL export.

Methods

__init__(step_schema: StepSchema = StepSchema.AP214, stl_linear_deflection: float = 0.1, stl_angular_deflection: float = 0.5, stl_ascii: bool = False, render_views: list[str] = (lambda: ['front', 'top', 'right', 'isometric'])(), render_resolution: int = 512) -> None

Source: ll_gen/ll_gen/config.py

Source: ll_gen/ll_gen/config.py:288

Configuration for feedback and reward signals.

Methods

__init__(validity_reward: float = 0.8, shape_constructed_reward: float = 0.16, repairable_reward: float = 0.0, per_tier_reward: float = 0.16, semantic_match_reward: float = 0.2, critical_error_penalty: float = -0.1, nonsolid_valid_fraction: float = 0.1, dimension_tolerance_pct: float = 0.1, dense_dimension_reward: bool = False, dimension_reward_scale: float = 0.5) -> None

Source: ll_gen/ll_gen/config.py

Source: ll_gen/ll_gen/training/metrics.py:20

Aggregated generation quality metrics.

Attributes: validity_rate: Fraction of proposals passing validation (0–1). compile_rate: Fraction of proposals that execute without error (0–1). coverage: Coverage of reference shape set via COV metric (0–1), or None when no reference shape set was supplied (distribution metrics are undefined without a reference and must NOT be reported as 0.0). mmd: Maximum Mean Discrepancy between generated and reference sets, or None when no reference was supplied. jsd: Jensen-Shannon Divergence between point distributions (0–1), or None when no reference was supplied. mean_reward: Mean disposal reward signal across all samples. reward_std: Standard deviation of reward signal. num_samples: Total number of samples evaluated. num_valid: Count of valid samples. num_compiled: Count of successfully compiled samples. num_distinct_valid: Number of distinct valid shapes (by rounded bounding-box dimensions). Guards against mode collapse inflating the validity rate with the same shape repeated. mean_sequence_log_prob: Mean teacher-forcing log-probability of the generated token sequences under the policy, scored from each proposal’s own latent (a deterministic reconstruction-likelihood diagnostic). 0.0 when the generator emits no command-token sequence (e.g. diffusion). Populated by evaluate_validity.

Methods

summary() -> dict[str, Any]

Source: ll_gen/ll_gen/training/metrics.py:63

Generate a summary dict suitable for logging or JSON export.

Returns: Dictionary with all metric values and counts.

__init__(validity_rate: float = 0.0, compile_rate: float = 0.0, coverage: float | None = None, mmd: float | None = None, jsd: float | None = None, mean_reward: float = 0.0, reward_std: float = 0.0, num_samples: int = 0, num_valid: int = 0, num_compiled: int = 0, num_distinct_valid: int = 0, mean_sequence_log_prob: float = 0.0) -> None

Source: ll_gen/ll_gen/training/metrics.py

Source: ll_gen/ll_gen/pipeline/orchestrator.py:70

End-to-end generation orchestrator.

Ties together routing, proposal generation, disposal, feedback, and retry into a single generate() call.

Args: config: Top-level ll_gen configuration.

Example::

    from ll_gen.pipeline.orchestrator import GenerationOrchestrator

    orchestrator = GenerationOrchestrator()
    result = orchestrator.generate(
        "A mounting bracket with 4 bolt holes, 80mm wide, 3mm thick"
    )
    if result.is_valid:
        print(f"STEP file: {result.step_path}")

Methods

__init__(config: LLGenConfig | None = None) -> None

Source: ll_gen/ll_gen/pipeline/orchestrator.py:89

generate(prompt: str, image_path: Path | None = None, force_route: GenerationRoute | None = None, max_retries: int | None = None, export: bool = True, render: bool = False) -> DisposalResult

Source: ll_gen/ll_gen/pipeline/orchestrator.py:114

Generate CAD geometry from a text prompt.

Full pipeline:

  1. Route — Analyze the prompt to decide Code (Path A) vs Neural (Path B).
  2. Propose — Generate a typed proposal via the selected path.
  3. Dispose — Execute, validate, repair through the disposal engine.
  4. Feedback — If invalid, build structured feedback and retry with error context.
  5. Export — Write STEP/STL for valid results.

Args: prompt: Text description of the desired geometry. image_path: Optional image for conditioning. force_route: Override automatic routing. max_retries: Override default max retry count. export: Whether to export valid shapes. render: Whether to generate multi-view renders.

Returns: DisposalResult from the best attempt.
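
The propose, dispose, feedback, and retry stages above can be pictured as a small loop. The following is a minimal sketch under stated assumptions, not the orchestrator's code: `AttemptResult`, `run_with_retries`, and the `propose`/`dispose` callables are hypothetical stand-ins, `max_retries` is treated as the total number of attempts, and routing and export are left out.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class AttemptResult:
    """Hypothetical stand-in for DisposalResult (only the fields used here)."""
    is_valid: bool
    reward_signal: float
    error_message: Optional[str] = None


def run_with_retries(
    prompt: str,
    propose: Callable[[str, Optional[dict]], str],
    dispose: Callable[[str], AttemptResult],
    max_retries: int = 3,
) -> AttemptResult:
    """Propose, dispose, and retry with error context; return the best attempt."""
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")
    best: Optional[AttemptResult] = None
    error_context: Optional[dict] = None
    for attempt in range(1, max_retries + 1):
        result = dispose(propose(prompt, error_context))
        # Keep the highest-reward attempt seen so far.
        if best is None or result.reward_signal > best.reward_signal:
            best = result
        if result.is_valid:
            break
        # Feed the failure back into the next proposal.
        error_context = {
            "attempt": attempt,
            "failure_description": result.error_message,
        }
    return best


calls = []


def demo_propose(prompt, error_context):
    # Toy proposer: only "fixes" the shape once it has seen error feedback.
    calls.append(error_context)
    return "closed box" if error_context else "open shell"


def demo_dispose(proposal):
    # Toy disposal: closed shapes validate, anything else gets partial credit.
    if proposal.startswith("closed"):
        return AttemptResult(is_valid=True, reward_signal=1.0)
    return AttemptResult(False, 0.16, error_message="shell is not closed")


best = run_with_retries("a bracket", demo_propose, demo_dispose)
print(best.is_valid, best.reward_signal, len(calls))
```

The toy proposer succeeds only on the second attempt, so the loop exercises the feedback path once before stopping.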

generate_batch(prompt: str, num_candidates: int = 3, image_path: Path | None = None, force_route: GenerationRoute | None = None) -> list[DisposalResult]

Source: ll_gen/ll_gen/pipeline/orchestrator.py:321

Generate multiple candidate shapes and return all results.

Unlike generate() which retries on failure, this method generates num_candidates independent proposals and returns all disposal results sorted by reward signal (best first).

Args: prompt: Text description. num_candidates: Number of candidates to generate. image_path: Optional image conditioning. force_route: Override automatic routing.

Returns: List of DisposalResults, sorted by reward (best first).

Source: ll_gen/ll_gen/config.py:22

Which generation path to use.

Source: ll_gen/ll_gen/routing/router.py:49

Automatic generation path router.

Analyzes the user prompt and optional context to decide between code generation (Path A) and neural generation (Path B).

Args: config: Routing configuration with keyword lists and thresholds.

Example::

    from ll_gen.routing.router import GenerationRouter

    router = GenerationRouter()
    decision = router.route("A mounting bracket with 4 bolt holes")
    print(decision.route)       # e.g. GenerationRoute.CODE_CADQUERY
    print(decision.confidence)  # e.g. 0.85

Methods

__init__(config: RoutingConfig | None = None) -> None

Source: ll_gen/ll_gen/routing/router.py:66

route(prompt: str, has_image: bool = False, has_reference_geometry: bool = False, force_route: GenerationRoute | None = None) -> RoutingDecision

Source: ll_gen/ll_gen/routing/router.py:83

Analyze a prompt and decide the generation route.

Args: prompt: User’s text description of the desired geometry. has_image: Whether an image is provided as conditioning. has_reference_geometry: Whether reference geometry (STEP/STL) is provided. force_route: If set, override the analysis and use this route directly.

Returns: RoutingDecision with selected route and explanation.

explain(decision: RoutingDecision) -> str

Source: ll_gen/ll_gen/routing/router.py:247

Generate a human-readable explanation of a routing decision.

Args: decision: A RoutingDecision to explain.

Returns: Multi-line string describing the decision rationale.

Source: ll_gen/ll_gen/config.py:362

Configuration for neural generators.

Methods

__init__(vae_checkpoint: str | None = None, diffusion_checkpoint: str | None = None, vqvae_checkpoint: str | None = None, default_temperature: float = 0.8, diffusion_inference_steps: int = 50, diffusion_eta: float = 0.0, vqvae_codebook_dim: int = 512, latent_dim: int = 256, max_seq_len: int = 60) -> None

Source: ll_gen/ll_gen/config.py

Source: ll_gen/ll_gen/proposals/disposal_result.py:28

Introspection data for a TopoDS_Shape.

All measurements are in the model’s native unit system. Fields are set to None when the corresponding OCC query fails (e.g. volume is undefined for non-solid shapes).

Attributes: volume: Solid volume from BRepGProp.VolumeProperties. surface_area: Total surface area from BRepGProp.SurfaceProperties. bounding_box: Axis-aligned bbox as (x_min, y_min, z_min, x_max, y_max, z_max). center_of_mass: (x, y, z) from volume properties. inertia_tensor: 3×3 inertia matrix as a flat 9-element tuple (Ixx, Ixy, Ixz, Iyx, Iyy, Iyz, Izx, Izy, Izz). face_count: Number of topological faces. edge_count: Number of topological edges. vertex_count: Number of topological vertices. shell_count: Number of shells. solid_count: Number of solids. surface_types: Mapping from surface type name (e.g. “Plane”, “Cylinder”, “BSplineSurface”) to count. curve_types: Mapping from curve type name to count. euler_characteristic: V − E + F. is_solid: Whether the shape is a closed solid. oriented_bounding_box: Optional OBB as 15-element tuple (center_x, center_y, center_z, half_x, half_y, half_z, axis1_x, axis1_y, axis1_z, axis2_x, axis2_y, axis2_z, axis3_x, axis3_y, axis3_z).

Methods

matches_dimensions(target_dims: tuple[float, float, float], tolerance_pct: float = 0.1) -> bool

Source: ll_gen/ll_gen/proposals/disposal_result.py:92

Check if bounding box dimensions match a target within tolerance.

Compares sorted dimension tuples so axis order doesn’t matter.

Args: target_dims: Expected (w, h, d) in any order. tolerance_pct: Fractional tolerance (0.10 = 10%).

Returns: True if all three sorted dimensions are within tolerance.
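
As an illustration of the sorted comparison described above, here is a minimal sketch. It assumes the tolerance is measured relative to each target dimension and that the bounding box uses the (x_min, y_min, z_min, x_max, y_max, z_max) layout; the standalone function name and signature are hypothetical and the library's exact comparison may differ.

```python
def dims_match(bbox, target_dims, tolerance_pct=0.1):
    """Compare sorted bounding-box extents with sorted target dimensions."""
    x_min, y_min, z_min, x_max, y_max, z_max = bbox
    actual = sorted((x_max - x_min, y_max - y_min, z_max - z_min))
    target = sorted(target_dims)
    # Assumption: each extent must lie within tolerance_pct of its target.
    return all(abs(a - t) <= tolerance_pct * abs(t) for a, t in zip(actual, target))


bracket_bbox = (0.0, 0.0, 0.0, 80.0, 3.0, 40.0)
same_part_any_order = dims_match(bracket_bbox, (3.0, 40.0, 80.0))
too_long = dims_match(bracket_bbox, (3.0, 40.0, 100.0))
print(same_part_any_order, too_long)
```

Sorting both tuples first is what makes the check independent of which axis the part was modelled along.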

__init__(volume: float | None = None, surface_area: float | None = None, bounding_box: tuple[float, ...] | None = None, center_of_mass: tuple[float, float, float] | None = None, inertia_tensor: tuple[float, ...] | None = None, face_count: int = 0, edge_count: int = 0, vertex_count: int = 0, shell_count: int = 0, solid_count: int = 0, surface_types: dict[str, int] = dict(), curve_types: dict[str, int] = dict(), euler_characteristic: int | None = None, is_solid: bool = False, oriented_bounding_box: tuple[float, ...] | None = None) -> None

Source: ll_gen/ll_gen/proposals/disposal_result.py

Source: ll_gen/ll_gen/embeddings/hybrid_encoder.py:26

Fused Transformer + GNN encoder for shape understanding.

Combines conditioning embeddings (text/image) processed through a transformer branch with B-Rep topology features processed through a GNN branch (lazy-imported from cadling’s BRepNetEncoder or UVNetEncoder).

The two branches produce fixed-dimension embeddings that are fused via concatenation + linear projection.
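
A minimal sketch of "concatenation + linear projection", written in plain Python so it runs without PyTorch. The function name, the toy dimensions, and the hand-written weights are all hypothetical; the real encoder would use learned `torch.nn` layers.

```python
def fuse_embeddings(transformer_emb, graph_emb, weight, bias):
    """Concatenate two branch embeddings and apply y = W @ concat + b."""
    concat = list(transformer_emb) + list(graph_emb)
    if any(len(row) != len(concat) for row in weight):
        raise ValueError("weight rows must match the concatenated length")
    return [
        sum(w * x for w, x in zip(row, concat)) + b
        for row, b in zip(weight, bias)
    ]


# Toy sizes: transformer branch = 4, graph branch = 2, fused output = 3.
transformer_emb = [1.0, 2.0, 3.0, 4.0]
graph_emb = [5.0, 6.0]
weight = [[1.0] * 6 for _ in range(3)]  # 3 x (4 + 2) projection matrix
bias = [0.0, 1.0, 2.0]

fused = fuse_embeddings(transformer_emb, graph_emb, weight, bias)
print(fused)
```

The output length is set by the number of projection rows, which is how the fused embedding can have a fixed dimension regardless of the two branch widths.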

Inherits from torch.nn.Module so that parameters(), state_dict(), load_state_dict(), train(), eval(), and to() work automatically via the standard PyTorch machinery.

Args: input_dim: Dimension of input conditioning embeddings. Defaults to 768. transformer_dim: Hidden dimension for transformer layers. Defaults to 512. graph_dim: Hidden dimension for GNN branch. Defaults to 256. output_dim: Dimension of final fused embedding. Defaults to 512. num_transformer_layers: Number of transformer encoder layers. Defaults to 3. num_heads: Number of attention heads. Defaults to 8. dropout: Dropout probability. Defaults to 0.1. graph_encoder_type: Type of graph encoder (“brep_net” or “uv_net”). Defaults to “brep_net”. device: Target device (“cpu” or “cuda”). Defaults to “cpu”.

Example::

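    from ll_gen.embeddings.hybrid_encoder import HybridShapeEncoder

    encoder = HybridShapeEncoder(
        input_dim=768,
        output_dim=512,
        device="cuda",
    )
    # cond_array: (seq_len, 768) numpy array or torch tensor.
    # graph_dict: dict with node_features, edge_index, and optional edge_attr.
    # Encode conditioning only
    cond_embedding = encoder.encode_conditioning_only(cond_array)
    # Encode with graph
    output = encoder.forward(cond_array, graph_data=graph_dict)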

Methods

__init__(input_dim: int = 768, transformer_dim: int = 512, graph_dim: int = 256, output_dim: int = 512, num_transformer_layers: int = 3, num_heads: int = 8, dropout: float = 0.1, graph_encoder_type: str = 'brep_net', device: str = 'cpu') -> None

Source: ll_gen/ll_gen/embeddings/hybrid_encoder.py:67

Initialize the hybrid encoder.

Args: input_dim: Dimension of input conditioning embeddings. transformer_dim: Hidden dimension for transformer. graph_dim: Hidden dimension for GNN. output_dim: Output embedding dimension. num_transformer_layers: Number of transformer layers. num_heads: Number of attention heads. dropout: Dropout probability. graph_encoder_type: Type of graph encoder. device: Target device.

forward(conditioning_embeddings: Any, graph_data: dict[str, Any] | None = None) -> np.ndarray

Source: ll_gen/ll_gen/embeddings/hybrid_encoder.py:212

Encode conditioning + optional graph to shape embedding.

Args: conditioning_embeddings: Input embeddings, either numpy array (seq_len, input_dim) or torch tensor. If 1D, will be expanded to (1, input_dim). graph_data: Optional dict with keys: - node_features: (num_nodes, feat_dim) tensor or array - edge_index: (2, num_edges) edge indices - edge_attr: Optional (num_edges, edge_feat_dim) edge attributes

Returns: numpy array of shape (output_dim,) containing the fused embedding.

encode_conditioning_only(conditioning_embeddings: Any) -> np.ndarray

Source: ll_gen/ll_gen/embeddings/hybrid_encoder.py:313

Encode conditioning embeddings through transformer branch only.

Useful for text/image inputs without graph data.

Args: conditioning_embeddings: Input embeddings (numpy array or torch tensor).

Returns: numpy array of shape (output_dim,) or (transformer_out_dim,) if graph is available but not used.

encode_graph_only(graph_data: dict[str, Any]) -> np.ndarray

Source: ll_gen/ll_gen/embeddings/hybrid_encoder.py:347

Encode graph data through GNN branch only.

Args: graph_data: Dictionary with node_features, edge_index, edge_attr.

Returns: numpy array of shape (graph_dim,).

Raises: RuntimeError: If GNN encoder is not available.

to(device: str) -> HybridShapeEncoder

Source: ll_gen/ll_gen/embeddings/hybrid_encoder.py:365

Move all parameters to device.

Args: device: Target device (“cpu” or “cuda”).

Returns: Self for chaining.

Source: ll_gen/ll_gen/conditioning/image_encoder.py:39

Encodes images into ConditioningEmbeddings.

Uses ll_stepnet’s ImageConditioner if available, otherwise falls back to deterministic hash-based embeddings.

Attributes: model_name: Vision model identifier (e.g., “dino_vits16”). conditioning_dim: Embedding dimension. freeze_encoder: Whether to freeze encoder parameters. device: Torch device (“cpu” or “cuda:*”). image_size: Image size for preprocessing.

Methods

__init__(model_name: str = 'dino_vits16', conditioning_dim: int = 768, freeze_encoder: bool = True, device: str = 'cpu', image_size: int = 224) -> None

Source: ll_gen/ll_gen/conditioning/image_encoder.py:53

Initialize ImageConditioningEncoder.

Args: model_name: Vision model identifier. conditioning_dim: Embedding dimension. freeze_encoder: Whether to freeze encoder parameters. device: Torch device (“cpu” or “cuda:*”). image_size: Size to resize images to (square).

encode(image_path: str | Path) -> ConditioningEmbeddings

Source: ll_gen/ll_gen/conditioning/image_encoder.py:77

Encode an image from file path.

Args: image_path: Path to image file.

Returns: ConditioningEmbeddings with patch/region embeddings.

encode_from_array(image: np.ndarray) -> ConditioningEmbeddings

Source: ll_gen/ll_gen/conditioning/image_encoder.py:132

Encode an image from numpy array.

Args: image: Image array (H, W, 3) with values in [0, 255] or [0, 1].

Returns: ConditioningEmbeddings.

Source: ll_gen/ll_gen/config.py:391

Top-level configuration for the ll_gen package.

Methods

__init__(routing: RoutingConfig = RoutingConfig(), codegen: CodegenConfig = CodegenConfig(), disposal: DisposalConfig = DisposalConfig(), export: ExportConfig = ExportConfig(), feedback: FeedbackConfig = FeedbackConfig(), datasets: DatasetConfig = DatasetConfig(), conditioning: ConditioningConfig = ConditioningConfig(), generators: GeneratorConfig = GeneratorConfig(), training: TrainingConfig = TrainingConfig(), max_retries: int = 3, output_dir: str = 'output', log_level: str = 'INFO', device: str = 'cpu') -> None

Source: ll_gen/ll_gen/config.py

Source: ll_gen/ll_gen/proposals/latent_proposal.py:23

A proposal containing decoded B-rep component geometry.

Attributes: face_grids: Per-face point grids. Each entry is an ndarray of shape (U, V, 3) — typically (32, 32, 3) — sampling the face surface in parameter space. edge_points: Per-edge point arrays. Each entry is an ndarray of shape (N, 3) sampling the edge curve. face_bboxes: Face bounding boxes, shape (F, 6) where each row is (x_min, y_min, z_min, x_max, y_max, z_max). edge_bboxes: Edge bounding boxes, shape (E, 6). vertex_positions: Vertex positions, shape (V, 3). Vertices are the endpoints/intersections of edges. face_edge_adjacency: Mapping from face index to list of edge indices that bound that face. stage_latents: Raw latent tensors from each diffusion stage, keyed by stage name ("face_positions", "face_geometry", "edge_positions", "edge_vertex_geometry"). Preserved for debugging and ablation studies.

Methods

validate_shapes() -> list[str]

Source: ll_gen/ll_gen/proposals/latent_proposal.py:95

Check internal consistency of array shapes.

Returns: List of error messages. Empty list means all checks pass.

compute_bounding_box() -> np.ndarray | None

Source: ll_gen/ll_gen/proposals/latent_proposal.py:150

Compute overall axis-aligned bounding box from all points.

Returns: Array of shape (6,) as [x_min, y_min, z_min, x_max, y_max, z_max], or None if no geometry data is present.

compute_face_areas_approximate() -> list[float]

Source: ll_gen/ll_gen/proposals/latent_proposal.py:176

Approximate face areas from point grid spacing.

Uses the cross-product method on the UV grid to estimate surface area per face.

Returns: List of approximate areas for each face.
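
One way to read "the cross-product method on the UV grid" is to sum the magnitude of the cross product of the two edge vectors of every grid cell. The sketch below follows that reading using nested lists of (x, y, z) points; the function name is hypothetical and the library may use a different stencil.

```python
import math


def approximate_face_area(grid):
    """Sum |du x dv| over the cells of a grid where grid[i][j] = (x, y, z)."""
    area = 0.0
    for i in range(len(grid) - 1):
        for j in range(len(grid[i]) - 1):
            p = grid[i][j]
            du = [a - b for a, b in zip(grid[i + 1][j], p)]
            dv = [a - b for a, b in zip(grid[i][j + 1], p)]
            cross = (
                du[1] * dv[2] - du[2] * dv[1],
                du[2] * dv[0] - du[0] * dv[2],
                du[0] * dv[1] - du[1] * dv[0],
            )
            area += math.sqrt(sum(c * c for c in cross))
    return area


# A flat 5 x 5 sampling of the unit square in the z = 0 plane.
unit_square = [[(i / 4, j / 4, 0.0) for j in range(5)] for i in range(5)]
square_area = approximate_face_area(unit_square)
print(square_area)
```

For a planar, evenly sampled face the estimate is exact; for curved faces it improves as the grid gets denser.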

with_error_context(error: dict[str, Any]) -> LatentProposal

Source: ll_gen/ll_gen/proposals/latent_proposal.py:207

Create a retry proposal with error context.

Preserves stage_latents so the diffusion model can re-denoise from an intermediate stage, but clears decoded geometry.

summary() -> dict[str, Any]

Source: ll_gen/ll_gen/proposals/latent_proposal.py:225

Extended summary with latent-specific fields.

__init__(proposal_id: str = (lambda: uuid.uuid4().hex)(), confidence: float = 0.0, attempt: int = 1, max_attempts: int = 3, source_prompt: str = '', conditioning_source: str | None = None, generation_metadata: dict[str, Any] = dict(), alternatives: list[Any] = list(), timestamp: str = (lambda: datetime.now(timezone.utc).isoformat())(), error_context: dict[str, Any] | None = None, log_probs: Any | None = None, entropy: float | None = None, face_grids: list[np.ndarray] = list(), edge_points: list[np.ndarray] = list(), face_bboxes: np.ndarray | None = None, edge_bboxes: np.ndarray | None = None, vertex_positions: np.ndarray | None = None, face_edge_adjacency: dict[int, list[int]] | None = None, stage_latents: dict[str, Any] | None = None) -> None

Source: ll_gen/ll_gen/proposals/latent_proposal.py

Source: ll_gen/ll_gen/generators/latent_sampler.py:23

Utilities for exploring VAE latent space.

Provides interpolation, neighborhood sampling, prior sampling, and GAN-based latent vector generation. Optionally integrates with a NeuralVAEGenerator for decoding latent vectors back to proposals.

Attributes: vae_generator: Optional NeuralVAEGenerator for decoding latents. latent_dim: Latent vector dimension. device: Target device (“cpu” or “cuda”).

Methods

__init__(vae_generator: NeuralVAEGenerator | None = None, latent_dim: int = 256, device: str = 'cpu') -> None

Source: ll_gen/ll_gen/generators/latent_sampler.py:36

Initialize the latent sampler.

Args: vae_generator: Optional VAE generator for decoding latents. latent_dim: Dimensionality of latent vectors. device: Target device (“cpu” or “cuda”).

interpolate(latent1: np.ndarray, latent2: np.ndarray, steps: int = 5) -> list[np.ndarray]

Source: ll_gen/ll_gen/generators/latent_sampler.py:53

Interpolate between two latent vectors via spherical linear interpolation.

Uses SLERP to follow the geodesic on the hypersphere, providing smooth interpolation in latent space.

Args: latent1: First latent vector (shape: (latent_dim,)). latent2: Second latent vector (shape: (latent_dim,)). steps: Number of interpolation steps (including endpoints).

Returns: List of interpolated latent vectors, from latent1 to latent2.
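
A minimal SLERP sketch on plain Python lists, for illustration only. It assumes the angle is measured between the normalized vectors, that the weights are applied to the raw vectors, and that near-parallel or zero-length inputs fall back to linear interpolation; the library's handling of those edge cases is not documented here.

```python
import math


def _lerp(v0, v1, t):
    return [(1.0 - t) * a + t * b for a, b in zip(v0, v1)]


def slerp(v0, v1, t):
    """Spherical linear interpolation between v0 (t=0) and v1 (t=1)."""
    norm0 = math.sqrt(sum(a * a for a in v0))
    norm1 = math.sqrt(sum(b * b for b in v1))
    if norm0 == 0.0 or norm1 == 0.0:
        return _lerp(v0, v1, t)
    cos_omega = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    omega = math.acos(max(-1.0, min(1.0, cos_omega)))
    sin_omega = math.sin(omega)
    if sin_omega < 1e-6:
        # Vectors are (anti)parallel: the great-circle weights are ill-defined.
        return _lerp(v0, v1, t)
    w0 = math.sin((1.0 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


def interpolate(latent1, latent2, steps=5):
    """Return `steps` vectors from latent1 to latent2, endpoints included."""
    if steps < 2:
        raise ValueError("steps must be at least 2 to include both endpoints")
    return [slerp(latent1, latent2, i / (steps - 1)) for i in range(steps)]


path = interpolate([1.0, 0.0], [0.0, 1.0], steps=3)
midpoint_norm = math.sqrt(sum(x * x for x in path[1]))
print(path, midpoint_norm)
```

Unlike a straight linear blend, the midpoint here keeps unit length, which is the reason for following the arc on the hypersphere.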

explore_neighborhood(seed_latent: np.ndarray, radius: float = 0.3, num_samples: int = 5, seed: int | None = None) -> list[np.ndarray]

Source: ll_gen/ll_gen/generators/latent_sampler.py:112

Sample points in a hypersphere around a seed latent vector.

Generates random directions on the unit hypersphere, scales by the specified radius, and adds to the seed.

Args: seed_latent: Center of the neighborhood (shape: (latent_dim,)). radius: Radius of the hypersphere neighborhood. num_samples: Number of points to sample. seed: Optional random seed for reproducibility.

Returns: List of latent vectors sampled in the neighborhood.
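
The sampling recipe described above (random unit direction, scaled by the radius, added to the seed) can be sketched as follows. This is an illustrative reading using the standard library instead of NumPy; under it every sample lies at distance exactly `radius` from the seed, and whether the library also varies the distance is not stated.

```python
import math
import random


def explore_neighborhood(seed_latent, radius=0.3, num_samples=5, seed=None):
    """Sample points at distance `radius` from seed_latent in random directions."""
    rng = random.Random(seed)
    samples = []
    for _ in range(num_samples):
        # A normalized Gaussian vector is uniformly distributed on the unit sphere.
        direction = [rng.gauss(0.0, 1.0) for _ in seed_latent]
        norm = math.sqrt(sum(d * d for d in direction)) or 1.0
        samples.append(
            [c + radius * d / norm for c, d in zip(seed_latent, direction)]
        )
    return samples


center = [0.0] * 8
samples = explore_neighborhood(center, radius=0.3, num_samples=4, seed=7)
distances = [math.dist(s, center) for s in samples]
repeat = explore_neighborhood(center, radius=0.3, num_samples=4, seed=7)
print(distances)
```

Passing the same `seed` reproduces the same neighborhood, which is what makes the exploration repeatable across runs.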

sample_from_prior(num_samples: int = 3, seed: int | None = None) -> list[np.ndarray]

Source: ll_gen/ll_gen/generators/latent_sampler.py:151

Sample latent vectors from the prior N(0, I).

Args: num_samples: Number of vectors to sample. seed: Optional random seed for reproducibility.

Returns: List of latent vectors sampled from the standard normal prior.

sample_from_gan(num_samples: int = 3, seed: int | None = None) -> list[np.ndarray]

Source: ll_gen/ll_gen/generators/latent_sampler.py:175

Sample latent vectors from a learned GAN generator (if available).

Falls back to sample_from_prior if GAN is not available.

Args: num_samples: Number of vectors to sample. seed: Optional random seed for reproducibility.

Returns: List of latent vectors from GAN or prior.

decode_latents(latents: list[np.ndarray], prompt: str = '') -> list[CommandSequenceProposal]

Source: ll_gen/ll_gen/generators/latent_sampler.py:234

Decode latent vectors into command sequence proposals.

Requires self.vae_generator to be set.

Args: latents: List of latent vectors to decode. prompt: Optional prompt for the proposals.

Returns: List of CommandSequenceProposal objects.

Raises: RuntimeError: If vae_generator is not set.

Source: ll_gen/ll_gen/training/metrics.py:85

Computes generation quality metrics from disposal results.

Supports MMD (maximum mean discrepancy), JSD (Jensen-Shannon divergence), coverage analysis, and reward aggregation.

Attributes: num_bins: Number of histogram bins for JSD computation. kernel_bandwidth: Bandwidth for RBF kernel in MMD.

Methods

__init__(num_bins: int = 64, kernel_bandwidth: float = 0.1) -> None

Source: ll_gen/ll_gen/training/metrics.py:96

Initialize the metrics computer.

Args: num_bins: Number of histogram bins for JSD computation. kernel_bandwidth: Bandwidth parameter for RBF kernel in MMD.

is_valid_solid(result: DisposalResult) -> bool

Source: ll_gen/ll_gen/training/metrics.py:106

Honest validity gate: a sample counts as valid only if it passes BRepCheck (is_valid) AND forms a non-degenerate solid.

is_valid (BRepCheck) alone passes volume-less shells and zero-volume degenerates, so a generator that emits unsewable faces can score ~1.0 while producing zero real CAD solids. We therefore additionally require a closed solid with positive volume whenever a geometry report is present. Abstract stand-ins without a geometry report (unit tests) fall back to is_valid so the rate arithmetic stays testable.

compute_validity_rate(results: list[DisposalResult]) -> float

Source: ll_gen/ll_gen/training/metrics.py:127

Compute fraction of valid samples (honest, non-degenerate-solid gated).

Args: results: List of disposal results.

Returns: Validity rate in [0, 1]. Returns 0.0 if empty.
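
The gate and the rate built on it can be sketched with two small dataclasses. `SolidReport` and `CheckedResult` are hypothetical stand-ins that carry only the fields the gate reads; the logic follows the description above but is not the library's source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SolidReport:
    """Hypothetical subset of GeometryReport."""
    volume: Optional[float]
    is_solid: bool


@dataclass
class CheckedResult:
    """Hypothetical subset of DisposalResult."""
    is_valid: bool
    geometry_report: Optional[SolidReport] = None


def is_valid_solid(result):
    """Valid only if BRepCheck passed AND the shape is a positive-volume solid."""
    if not result.is_valid:
        return False
    report = result.geometry_report
    if report is None:
        # No geometry report (e.g. a test stand-in): fall back to is_valid.
        return True
    return report.is_solid and report.volume is not None and report.volume > 0.0


def validity_rate(results):
    """Fraction of results passing the gate; 0.0 for an empty list."""
    if not results:
        return 0.0
    return sum(is_valid_solid(r) for r in results) / len(results)


batch = [
    CheckedResult(True, SolidReport(volume=12.5, is_solid=True)),   # real solid
    CheckedResult(True, SolidReport(volume=None, is_solid=False)),  # open shell
    CheckedResult(False, SolidReport(volume=3.0, is_solid=True)),   # failed check
    CheckedResult(True),                                            # no report
]
rate = validity_rate(batch)
print(rate)
```

The open shell passes the BRepCheck flag but is rejected by the gate, which is the case the stricter rate is designed to catch.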

compute_compile_rate(results: list[DisposalResult]) -> float

Source: ll_gen/ll_gen/training/metrics.py:141

Compute fraction of samples that compiled (produced a shape).

A shape is “compiled” if result.has_shape is True (i.e., result.shape is not None).

Args: results: List of disposal results.

Returns: Compile rate in [0, 1]. Returns 0.0 if empty.

compute_mmd(set1: list[np.ndarray], set2: list[np.ndarray], kernel: str = 'rbf') -> float

Source: ll_gen/ll_gen/training/metrics.py:158

Compute Maximum Mean Discrepancy between two point cloud sets.

MMD = E[k(x,x’)] + E[k(y,y’)] - 2*E[k(x,y)]

Each element in set1/set2 is a point cloud (N, 3). Points are flattened to centroids for efficient pairwise comparison.

Args: set1: List of point clouds (each (N, 3) array). set2: List of point clouds (each (M, 3) array). kernel: Kernel type (“rbf” or other).

Returns: MMD value. Returns 0.0 if either set is empty.
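
A minimal sketch of the formula above evaluated on centroids. It assumes an RBF kernel of the form exp(-d^2 / (2 * bandwidth^2)) and the simple estimator that averages over all pairs; strictly, this expression is the squared MMD, and whether the library takes a square root or uses a different kernel scaling is not stated.

```python
import math


def centroid(cloud):
    """Mean point of a point cloud given as a list of (x, y, z) tuples."""
    n = len(cloud)
    return tuple(sum(p[k] for p in cloud) / n for k in range(3))


def rbf(a, b, bandwidth):
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth):
    """Average kernel value over all pairs (x, y)."""
    total = sum(rbf(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def compute_mmd(set1, set2, bandwidth=0.1):
    """E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)] on point-cloud centroids."""
    if not set1 or not set2:
        return 0.0
    c1 = [centroid(c) for c in set1]
    c2 = [centroid(c) for c in set2]
    return (
        mean_kernel(c1, c1, bandwidth)
        + mean_kernel(c2, c2, bandwidth)
        - 2.0 * mean_kernel(c1, c2, bandwidth)
    )


cloud_a = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
cloud_b = [(10.0, 10.0, 10.0)]
mmd_same = compute_mmd([cloud_a, cloud_b], [cloud_a, cloud_b])
mmd_far = compute_mmd([cloud_a], [cloud_b])
print(mmd_same, mmd_far)
```

Identical sets give zero, while two well-separated single-cloud sets approach the upper value of 2 for this kernel.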

compute_jsd(dist1: np.ndarray, dist2: np.ndarray, num_bins: int | None = None) -> float

Source: ll_gen/ll_gen/training/metrics.py:277

Compute Jensen-Shannon Divergence between two point distributions.

Computes 1D histograms per axis, averages JSD across axes. JSD = 0.5 * KL(P||M) + 0.5 * KL(Q||M) where M = 0.5*(P+Q)

Args: dist1: (N, 3) point array. dist2: (M, 3) point array. num_bins: Number of histogram bins. Uses self.num_bins if None.

Returns: JSD in [0, 1]. Returns 0.0 if either distribution is empty.
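
The per-axis computation can be sketched as below. This is an illustrative version: it assumes both histograms share a bin range spanning the two inputs and uses base-2 logarithms, which is what bounds the result to [0, 1]; the library's binning details may differ.

```python
import math


def histogram(values, lo, hi, num_bins):
    """Normalized histogram of values over [lo, hi]."""
    width = (hi - lo) / num_bins
    if width == 0.0:
        width = 1.0  # all values identical: everything lands in bin 0
    counts = [0] * num_bins
    for v in values:
        counts[min(int((v - lo) / width), num_bins - 1)] += 1
    total = sum(counts)
    return [c / total for c in counts]


def kl(p, q):
    """KL(P || Q) in bits, skipping empty bins of P."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def jsd_1d(p, q):
    m = [0.5 * (a + b) for a, b in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def compute_jsd(dist1, dist2, num_bins=64):
    """Average the 1D JSD over the x, y, and z axes."""
    if not dist1 or not dist2:
        return 0.0
    total = 0.0
    for axis in range(3):
        a = [p[axis] for p in dist1]
        b = [p[axis] for p in dist2]
        lo, hi = min(a + b), max(a + b)
        total += jsd_1d(histogram(a, lo, hi, num_bins), histogram(b, lo, hi, num_bins))
    return total / 3.0


dist_a = [(0.0, 0.0, 0.0)] * 10
dist_b = [(1.0, 1.0, 1.0)] * 10
jsd_same = compute_jsd(dist_a, dist_a)
jsd_disjoint = compute_jsd(dist_a, dist_b)
print(jsd_same, jsd_disjoint)
```

Identical inputs give 0 and inputs with no overlapping bins give 1, the two ends of the documented range.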

compute_coverage(generated: list[np.ndarray], reference: list[np.ndarray], threshold: float = 0.05) -> float

Source: ll_gen/ll_gen/training/metrics.py:347

Compute coverage: fraction of reference shapes covered by generated.

For each reference point cloud, check if any generated point cloud is within Chamfer distance threshold (computed on centroids).

Args: generated: List of generated point clouds (each (N, 3)). reference: List of reference point clouds (each (M, 3)). threshold: Chamfer distance threshold for coverage.

Returns: Coverage in [0, 1]. Returns 0.0 if either set is empty.
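
A minimal sketch of the coverage computation, assuming that "computed on centroids" means the Euclidean distance between cloud centroids and that a reference counts as covered when that distance is at most the threshold. Both points are assumptions about details the description leaves open.

```python
import math


def centroid(cloud):
    """Mean point of a point cloud given as a list of (x, y, z) tuples."""
    n = len(cloud)
    return tuple(sum(p[k] for p in cloud) / n for k in range(3))


def compute_coverage(generated, reference, threshold=0.05):
    """Fraction of reference clouds with a generated cloud within threshold."""
    if not generated or not reference:
        return 0.0
    gen_centroids = [centroid(c) for c in generated]
    covered = 0
    for ref in reference:
        ref_centroid = centroid(ref)
        if any(math.dist(ref_centroid, g) <= threshold for g in gen_centroids):
            covered += 1
    return covered / len(reference)


reference = [
    [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)],  # centroid at the origin
    [(5.0, 5.0, 5.0)],                    # far from anything generated
]
generated = [[(-0.99, 0.0, 0.0), (1.01, 0.0, 0.0)]]  # centroid near the origin
coverage = compute_coverage(generated, reference)
print(coverage)
```

One generated cloud can cover several references, so coverage rewards spread across the reference set without requiring a one-to-one matching.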

compute_all(results: list[DisposalResult], reference_points: list[np.ndarray] | None = None) -> GenerationMetrics

Source: ll_gen/ll_gen/training/metrics.py:442

Compute all metrics at once.

Extracts point clouds from results that have geometry_report with bounding_box, then calls individual compute_* methods.

Args: results: List of disposal results. reference_points: Optional reference point clouds for coverage/MMD/JSD.

Returns: Populated GenerationMetrics dataclass.

compute_distinct_valid(results: list[DisposalResult], ndigits: int = 3) -> int

Source: ll_gen/ll_gen/training/metrics.py:514

Count distinct valid shapes by rounded bounding-box dimensions.

Two valid shapes are considered the same when their (w, h, d) bounding-box dimensions agree to ndigits decimals. Valid shapes without a geometry report fall back to a per-result identity so they are never silently merged.

Args: results: Disposal results to inspect. ndigits: Decimal places for the bounding-box comparison key.

Returns: Number of distinct valid shapes.
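
The deduplication key can be sketched as follows. For brevity each result is represented as a hypothetical `(is_valid, bounding_box)` pair rather than a full DisposalResult, with `None` standing for a missing geometry report.

```python
def compute_distinct_valid(results, ndigits=3):
    """Count valid results with distinct rounded (w, h, d) extents."""
    keys = set()
    for index, (is_valid, bbox) in enumerate(results):
        if not is_valid:
            continue
        if bbox is None:
            # No geometry report: use a per-result key so nothing is merged.
            keys.add(("no-report", index))
            continue
        x_min, y_min, z_min, x_max, y_max, z_max = bbox
        dims = (x_max - x_min, y_max - y_min, z_max - z_min)
        keys.add(tuple(round(d, ndigits) for d in dims))
    return len(keys)


results = [
    (True, (0, 0, 0, 1, 2, 3)),
    (True, (10, 10, 10, 11, 12, 13.0000001)),  # same size, translated
    (True, (0, 0, 0, 1, 2, 4)),
    (False, (0, 0, 0, 9, 9, 9)),               # invalid: ignored
    (True, None),
    (True, None),
]
distinct = compute_distinct_valid(results)
print(distinct)
```

The first two entries collapse into one key because only extents are compared, so a generator that repeats the same box at different positions is counted once.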

Source: ll_gen/ll_gen/conditioning/multimodal.py:21

Fuses text and image embeddings for multimodal conditioning.

Attributes: text_encoder: TextConditioningEncoder instance. image_encoder: ImageConditioningEncoder instance. conditioning_dim: Embedding dimension. fusion_method: Strategy for combining embeddings. device: Torch device.

Methods

__init__(text_model: str = 'bert-base-uncased', image_model: str = 'dino_vits16', conditioning_dim: int = 768, fusion_method: str = 'concat', device: str = 'cpu') -> None

Source: ll_gen/ll_gen/conditioning/multimodal.py:32

Initialize MultiModalConditioner.

Args: text_model: Text model identifier. image_model: Image model identifier. conditioning_dim: Base embedding dimension. fusion_method: Fusion strategy (“concat”, “average”, “text_only”, “image_only”). device: Torch device (“cpu” or “cuda:*”).

encode(prompt: str, image_path: str | Path | None = None) -> ConditioningEmbeddings

Source: ll_gen/ll_gen/conditioning/multimodal.py:70

Encode text and optional image into fused embeddings.

Args: prompt: Text prompt. image_path: Optional path to image file.

Returns: ConditioningEmbeddings with fused embeddings.

encode_batch(prompts: list[str], image_paths: list[str | Path | None] | None = None) -> list[ConditioningEmbeddings]

Source: ll_gen/ll_gen/conditioning/multimodal.py:193

Encode multiple text-image pairs.

Args: prompts: List of text prompts. image_paths: Optional list of image paths (aligned with prompts). If None, encodes text-only.

Returns: List of ConditioningEmbeddings.

class NeuralDiffusionGenerator(BaseNeuralGenerator)

Source: ll_gen/ll_gen/generators/neural_diffusion.py:23

Neural generator wrapping StructuredDiffusion for B-rep synthesis.

Generates per-face point grids (U×V×3) and per-edge point arrays (N×3) through progressive denoising, with support for stage-specific retry strategies.

Attributes: diffusion_config: Optional diffusion configuration. inference_steps: Number of denoising steps. eta: DDIM eta parameter (0.0 = deterministic, 1.0 = stochastic).

Methods

__init__(diffusion_config: Any | None = None, checkpoint_path: Path | None = None, device: str = 'cpu', inference_steps: int = 50, eta: float = 0.0) -> None

Source: ll_gen/ll_gen/generators/neural_diffusion.py:36

Initialize the diffusion generator.

Args: diffusion_config: Optional diffusion configuration object. checkpoint_path: Path to model checkpoint. device: Target device (“cpu” or “cuda”). inference_steps: Number of DDIM steps. eta: DDIM eta parameter.

generate(prompt: str, conditioning: ConditioningEmbeddings | None = None, error_context: dict[str, Any] | None = None) -> LatentProposal

Source: ll_gen/ll_gen/generators/neural_diffusion.py:109

Generate B-rep geometry via diffusion.

Args: prompt: User prompt (stored for tracing). conditioning: Optional conditioning embeddings. error_context: Optional error context from a failed attempt.

Returns: LatentProposal with face grids and edge points.

generate_for_training(prompt: str, conditioning: ConditioningEmbeddings | None = None, error_context: dict[str, Any] | None = None, target_dimensions: tuple[float, float, float] | None = None) -> LatentProposal

Source: ll_gen/ll_gen/generators/neural_diffusion.py:161

Generate with a REAL DDPO policy-gradient signal for RL training.

target_dimensions is accepted for trainer-call uniformity (diffusion does not yet condition on it).

This runs StructuredDiffusion.sample_with_log_prob — a stochastic DDIM reverse process (DDPO; Black et al., 2023) executed with gradients enabled. The returned log_probs is the sum of the per-step Gaussian log-probabilities of the actual sampled denoising trajectory, whose transition means are produced by the denoiser network. It is therefore connected to the model parameters: the RL trainer’s -advantage * log_probs REINFORCE update backpropagates into the diffusion denoisers and trains them. This replaces the former decoupled noise-prior stand-in, which produced a finite loss while updating zero parameters.

A non-zero DDIM eta is required for a usable policy gradient (a deterministic trajectory has a degenerate policy); when self.eta is 0 (the deterministic inference default) the training path uses eta = 1.0.

Args: prompt: User prompt. conditioning: Optional conditioning embeddings. error_context: Optional error context. target_dimensions: Accepted for call uniformity; unused here.

Returns: LatentProposal whose log_probs is a differentiable scalar connected to the model parameters and whose entropy is the trajectory’s summed Gaussian entropy.
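
To make the "sum of per-step Gaussian log-probabilities" concrete, here is a small numeric sketch. It uses plain floats and an isotropic Gaussian per step, so nothing here is differentiable; in the trainer these quantities would be tensors tied to the denoiser. The function names and the `(sample, mean, sigma)` step layout are hypothetical.

```python
import math


def gaussian_log_prob(x, mean, sigma):
    """log N(x; mean, sigma^2 I) for an isotropic Gaussian."""
    dim = len(x)
    sq_err = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
    return (
        -0.5 * sq_err / sigma ** 2
        - dim * math.log(sigma)
        - 0.5 * dim * math.log(2.0 * math.pi)
    )


def gaussian_entropy(dim, sigma):
    """Differential entropy of N(mean, sigma^2 I) in `dim` dimensions."""
    return 0.5 * dim * math.log(2.0 * math.pi * math.e * sigma ** 2)


def trajectory_log_prob(steps):
    """Sum log-probs over (sample, mean, sigma) triples of a trajectory."""
    return sum(gaussian_log_prob(x, mean, sigma) for x, mean, sigma in steps)


def training_eta(eta):
    """Use a stochastic eta for training when inference eta is deterministic."""
    return eta if eta > 0.0 else 1.0


steps = [([0.0], [0.0], 1.0), ([0.0], [0.0], 1.0)]
total_log_prob = trajectory_log_prob(steps)
total_entropy = sum(gaussian_entropy(len(x), sigma) for x, _, sigma in steps)
reinforce_loss = -2.0 * total_log_prob  # -advantage * log_probs with advantage = 2
print(total_log_prob, total_entropy, reinforce_loss, training_eta(0.0))
```

With `eta = 0` the per-step variance collapses to zero and the Gaussian log-probability is undefined, which is why the training path needs a non-zero eta.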

generate_candidates(prompt: str, num_candidates: int = 3, conditioning: ConditioningEmbeddings | None = None) -> list[LatentProposal]

Source: ll_gen/ll_gen/generators/neural_diffusion.py:243

Generate multiple candidate geometries.

Args: prompt: User prompt. num_candidates: Number of candidates to generate. conditioning: Optional conditioning embeddings.

Returns: List of LatentProposal objects, sorted by confidence descending.

generate_from_error_context(error_context: dict[str, Any]) -> LatentProposal | None

Source: ll_gen/ll_gen/generators/neural_diffusion.py:299

Generate a retry proposal from a prior stage.

If error_context contains stage_latents, can re-run from an intermediate stage rather than from noise.

Args: error_context: Error dict with optional stage_latents.

Returns: LatentProposal or None if no stage latents available.

class NeuralVAEGenerator(BaseNeuralGenerator)

Source: ll_gen/ll_gen/generators/neural_vae.py:24

Neural generator wrapping a STEPVAE.

All generation paths (generate, generate_for_training, generate_candidates, generate_from_error_context) decode through the shared _decode_and_sample, so RL improvements to the policy are reflected at inference.

Attributes: vae_config: Optional VAE configuration object. temperature: Sampling temperature (higher = more stochastic). max_seq_len: Maximum sequence length (default 60).

Methods

__init__(vae_config: Any | None = None, checkpoint_path: Path | None = None, device: str = 'cpu', temperature: float = 0.8, max_seq_len: int = 60) -> None

Source: ll_gen/ll_gen/generators/neural_vae.py:37

Initialize the VAE generator.

Args: vae_config: Optional VAEConfig object from stepnet. checkpoint_path: Path to model checkpoint. device: Target device (“cpu” or “cuda”). temperature: Sampling temperature. max_seq_len: Maximum sequence length.

generate(prompt: str, conditioning: ConditioningEmbeddings | None = None, error_context: dict[str, Any] | None = None, target_dimensions: tuple[float, float, float] | None = None) -> CommandSequenceProposal

Source: ll_gen/ll_gen/generators/neural_vae.py:59

Generate a command sequence from the VAE model.

Args: prompt: User prompt (stored in proposal for tracing). conditioning: Optional conditioning embeddings. error_context: Optional error context from a failed attempt. target_dimensions: Optional (w, h, d) to condition generation on (shifts the latent via the trained dimension encoder). This makes dimension conditioning available on the deployment/inference path, not only during training.

Returns: CommandSequenceProposal with decoded token sequence and command dicts.

generate_for_training(prompt: str, conditioning: ConditioningEmbeddings | None = None, error_context: dict[str, Any] | None = None, target_dimensions: tuple[float, float, float] | None = None) -> CommandSequenceProposal

Source: ll_gen/ll_gen/generators/neural_vae.py:112

Generate with gradients, computing log-probs on the sampled trajectory.

Runs the shared _decode_and_sample with gradients live so the returned proposal.log_probs is a differentiable scalar usable in a REINFORCE loss. Each token is sampled once from the live forward pass and its log-prob evaluated on that same sample — an unbiased estimator.

Args: prompt: User prompt (stored for tracing). conditioning: Optional conditioning embeddings. error_context: Optional error context from a failed attempt. target_dimensions: Optional (w, h, d) to condition generation on (shifts the latent via the trained dimension encoder).

Returns: CommandSequenceProposal with log_probs and entropy set.
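The sample-once, score-the-same-sample idea described above can be sketched in plain Python. This is an illustration only, not the library's `_decode_and_sample`: the function names and the toy logits are assumptions, and plain floats stand in for the tensors that carry gradients in the real generator.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of floats."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_trajectory(per_step_logits, temperature=0.8, rng=None):
    """Hypothetical helper: draw one token per step and score that same draw."""
    rng = rng or random.Random()
    tokens, log_prob, entropy = [], 0.0, 0.0
    for logits in per_step_logits:
        probs = softmax(logits, temperature)
        # One stochastic draw per position.
        token = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        tokens.append(token)
        # The log-prob is evaluated on the token that was actually sampled,
        # never on a second, independent draw.
        log_prob += math.log(probs[token])
        entropy -= sum(p * math.log(p) for p in probs if p > 0.0)
    return tokens, log_prob, entropy


tokens, log_prob, entropy = sample_trajectory(
    [[2.0, 0.5, -1.0], [0.1, 0.2, 0.3]], rng=random.Random(0)
)
```

Scoring a different draw than the one returned would decouple the reward from the action that earned it, which is what biases a REINFORCE estimate.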

decode_command_logits(temperature: float = 1.0, latent: Any | None = None) -> tuple[Any, list[Any]]

Source: ll_gen/ll_gen/generators/neural_vae.py:159

Decode per-position command/parameter logits for sequence scoring.

Mirrors the decode in :meth:_decode_and_sample (z → decode → command_head / param_heads) but stops at the logits, returning them for teacher-forcing in :meth:BaseNeuralGenerator.score_token_sequence. Gradients are live, so the gathered log-probabilities are differentiable.

Args: temperature: Accepted for interface parity (the scorer applies the scaling); the logits themselves are returned unscaled. latent: Optional latent z ([1, latent_dim] tensor or numpy array, e.g. a proposal’s latent_vector). When provided the decode is deterministic (reconstruction-likelihood); when None a fresh N(0, I) prior is drawn, giving a single, non-deterministic sample.

Returns: (command_logits [S, C], param_logits_2d list of [S, P]).

generate_candidates(prompt: str, num_candidates: int = 3, conditioning: ConditioningEmbeddings | None = None) -> list[CommandSequenceProposal]

Source: ll_gen/ll_gen/generators/neural_vae.py:317

Generate multiple candidate sequences.

Args: prompt: User prompt. num_candidates: Number of candidates to generate. conditioning: Optional conditioning embeddings.

Returns: List of CommandSequenceProposal objects, sorted by confidence descending.

generate_from_error_context(error_context: dict[str, Any]) -> CommandSequenceProposal | None

Source: ll_gen/ll_gen/generators/neural_vae.py:368

Generate a retry proposal by perturbing a prior latent vector.

If error_context contains ‘previous_latent_vector’ and the error category is known, perturb the latent and decode directly.

Args: error_context: Error dict with optional ‘previous_latent_vector’.

Returns: CommandSequenceProposal or None if no prior latent available.
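A minimal sketch of the perturb-and-retry idea, assuming Gaussian noise whose scale depends on the error category. The category strings, the noise scales, and the fallback scale for unknown categories are all hypothetical; only the “previous_latent_vector” and “error_category” keys and the None return for a missing latent come from this reference.

```python
import random

# Hypothetical per-category noise scales (not taken from ll_gen).
NOISE_SCALES = {"topology_error": 0.3, "geometry_error": 0.1}
DEFAULT_SCALE = 0.2  # assumption: used when the category is not in the table


def perturb_latent(error_context, rng=None):
    """Return a perturbed copy of the prior latent, or None if there is none."""
    previous = error_context.get("previous_latent_vector")
    if previous is None:
        return None
    rng = rng or random.Random()
    scale = NOISE_SCALES.get(error_context.get("error_category"), DEFAULT_SCALE)
    # Add independent Gaussian noise to every latent coordinate.
    return [z + rng.gauss(0.0, scale) for z in previous]


retry_latent = perturb_latent(
    {"previous_latent_vector": [0.0, 1.0, -1.0], "error_category": "topology_error"},
    rng=random.Random(1),
)
```

The perturbed vector would then be decoded directly instead of drawing a fresh latent from the prior.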

class NeuralVQVAEGenerator(BaseNeuralGenerator)

Source: ll_gen/ll_gen/generators/neural_vqvae.py:22

Neural generator wrapping VQVAEModel + CADGenerationPipeline.

Uses discrete quantized codebooks instead of continuous VAE latent, enabling masked codebook re-sampling for targeted error recovery.

Attributes: checkpoint_path: Path to model checkpoint. temperature: Sampling temperature. codebook_dim: Codebook vector dimension. max_seq_len: Maximum sequence length. _pipeline: Lazy-initialized CADGenerationPipeline.

Methods

__init__(checkpoint_path: Path | None = None, device: str = 'cpu', temperature: float = 0.7, codebook_dim: int = 512, max_seq_len: int = 60) -> None

Source: ll_gen/ll_gen/generators/neural_vqvae.py:36

Initialize the VQ-VAE generator.

Args: checkpoint_path: Path to model checkpoint. device: Target device (“cpu” or “cuda”). temperature: Sampling temperature. codebook_dim: Codebook embedding dimension. max_seq_len: Maximum sequence length.

generate(prompt: str, conditioning: ConditioningEmbeddings | None = None, error_context: dict[str, Any] | None = None) -> CommandSequenceProposal

Source: ll_gen/ll_gen/generators/neural_vqvae.py:99

Generate a command sequence from the VQ-VAE model.

Args: prompt: User prompt (stored for tracing). conditioning: Optional conditioning embeddings. error_context: Optional error context from a failed attempt.

Returns: CommandSequenceProposal with decoded token sequence.

decode_command_logits(temperature: float = 1.0, latent: Any | None = None) -> tuple[Any, list[Any]]

Source: ll_gen/ll_gen/generators/neural_vqvae.py:161

Decode per-position command/parameter logits for sequence scoring.

Samples codes from the three autoregressive codebook decoders (no gradient on the code draw), then deterministically decodes them to features and projects to command/parameter logits with gradients — so the log-probabilities teacher-forced in :meth:BaseNeuralGenerator.score_token_sequence are differentiable through the projection/decoder.

Args: temperature: Accepted for interface parity (the scorer applies the scaling); logits are returned unscaled. latent: Accepted for interface parity with the VAE. The VQ-VAE policy lives at the codebook level (discrete codes), not a continuous latent, so a caller-supplied latent is not applicable here — codes are always sampled fresh. The score is therefore a single-sample estimate; see the determinism note on :meth:BaseNeuralGenerator.score_token_sequence.

Returns: (command_logits [S, C], param_logits_2d list of [S, P]).

generate_for_training(prompt: str, conditioning: ConditioningEmbeddings | None = None, error_context: dict[str, Any] | None = None, target_dimensions: tuple[float, float, float] | None = None) -> CommandSequenceProposal

Source: ll_gen/ll_gen/generators/neural_vqvae.py:230

Generate with gradients for RL training.

target_dimensions is accepted for trainer-call uniformity (VQ-VAE does not yet condition on it).

Bypasses CADGenerationPipeline.generate() (which wraps everything in torch.no_grad) and instead runs the three autoregressive codebook decoders directly with gradients. Each code is sampled once and the log-probability is accumulated on that same draw, ensuring an unbiased REINFORCE estimator.

After code generation, the codes are deterministically decoded to features and then to command/parameter logits via the pipeline’s projection heads. Token IDs are obtained by argmax over those logits (deterministic — no second stochastic draw).

Args: prompt: User prompt. conditioning: Optional conditioning embeddings. error_context: Optional error context. target_dimensions: Accepted for trainer-call uniformity; not yet used for conditioning.

Returns: CommandSequenceProposal with log_probs and entropy.

generate_candidates(prompt: str, num_candidates: int = 3, conditioning: ConditioningEmbeddings | None = None) -> list[CommandSequenceProposal]

Source: ll_gen/ll_gen/generators/neural_vqvae.py:398

Generate multiple candidate sequences.

Args: prompt: User prompt. num_candidates: Number of candidates to generate. conditioning: Optional conditioning embeddings.

Returns: List of CommandSequenceProposal objects, sorted by confidence descending.

generate_from_masked_codebooks(error_context: dict[str, Any]) -> CommandSequenceProposal | None

Source: ll_gen/ll_gen/generators/neural_vqvae.py:451

Generate a retry by re-sampling with masked codebooks.

Excludes codebook entries that led to topology or geometry failures.

Args: error_context: Error dict with optional masked_codebook_indices.

Returns: CommandSequenceProposal or None if no mask available.
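Excluding failed codebook entries during re-sampling can be sketched as follows. This is not the library's implementation: `sample_with_mask` and its arguments are hypothetical, and the sketch simply drops masked indices before drawing from a softmax over the remaining logits.

```python
import math
import random


def sample_with_mask(logits, masked_indices, rng=None):
    """Draw one codebook index, never returning an index in masked_indices."""
    rng = rng or random.Random()
    allowed = [i for i in range(len(logits)) if i not in masked_indices]
    if not allowed:
        raise ValueError("every codebook entry is masked")
    peak = max(logits[i] for i in allowed)
    # Unnormalized softmax weights over the surviving entries only.
    weights = [math.exp(logits[i] - peak) for i in allowed]
    return rng.choices(allowed, weights=weights, k=1)[0]


rng = random.Random(0)
draws = [sample_with_mask([3.0, 1.0, 0.5, 2.0], {0, 3}, rng=rng) for _ in range(200)]
```

Because masked entries are removed before the draw, the remaining probability mass is renormalized over the allowed codes rather than wasted on rejected samples.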

Source: ll_gen/ll_gen/codegen/openscad_proposer.py:27

Wraps cadling’s OpenSCADGenerator to produce typed CodeProposal objects.

This class manages the lifecycle of cadling’s OpenSCADGenerator, handling both initial generation from prompts and repair generation for code that has failed validation or execution.

Attributes: config: The CodegenConfig for model selection and API provider. generator: The underlying cadling OpenSCADGenerator instance.

Raises: ImportError: If cadling is not installed when propose() is called.

Methods

__init__(config: CodegenConfig | None = None) -> None

Source: ll_gen/ll_gen/codegen/openscad_proposer.py:42

Initialize the OpenSCADProposer.

Args: config: Optional CodegenConfig specifying model_name and api_provider. If None, uses defaults from CodegenConfig.

Side-effects: If cadling is available, creates an OpenSCADGenerator instance.

propose(prompt: str, image_path: Path | None = None, error_context: dict | None = None, attempt: int = 1) -> CodeProposal

Source: ll_gen/ll_gen/codegen/openscad_proposer.py:61

Generate an OpenSCAD code proposal from a prompt.

For the first attempt (attempt=1), generates code from scratch. For retry attempts (attempt>1), uses the repair endpoint with the previous code and error message.

Args: prompt: The natural language prompt describing the part. image_path: Optional path to a reference image (JPEG/PNG). error_context: Dictionary used for retry attempts, with keys “old_code” (str), the previous code that failed, and “error_message” (str), the error from execution. attempt: Attempt number (1=initial, 2+=retry). Used to select generation vs repair mode.

Returns: A CodeProposal wrapping the generated code with: - language set to OPENSCAD - imports_required extracted from the code - syntax_valid pre-checked via heuristic validation

Raises: ImportError: If cadling is not installed. ValueError: If error_context is missing required keys on retry.
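The attempt-based switch between generation and repair can be sketched like this. `select_mode` is a hypothetical helper written for illustration; only the “old_code” and “error_message” keys and the ValueError on a malformed retry are taken from the description above.

```python
REQUIRED_RETRY_KEYS = ("old_code", "error_message")


def select_mode(attempt, error_context=None):
    """Return "generate" for the first attempt and "repair" for retries."""
    if attempt <= 1:
        return "generate"
    context = error_context or {}
    missing = [key for key in REQUIRED_RETRY_KEYS if key not in context]
    if missing:
        # A retry cannot be repaired without the failed code and its error.
        raise ValueError(f"error_context is missing required keys: {missing}")
    return "repair"


first = select_mode(1)
retry = select_mode(2, {"old_code": "cube(10);", "error_message": "syntax error"})
```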

propose_batch(prompt: str, num_candidates: int = 3, image_path: Path | None = None) -> list[CodeProposal]

Source: ll_gen/ll_gen/codegen/openscad_proposer.py:139

Generate multiple candidate OpenSCAD code proposals.

This method calls the generator multiple times to produce diverse candidates. Useful for downstream filtering or ranking.

Args: prompt: The natural language prompt describing the part. num_candidates: Number of distinct candidates to generate. image_path: Optional path to a reference image.

Returns: A list of CodeProposal objects, each with: - language set to OPENSCAD - syntax_valid pre-checked via heuristic validation - code_hash set for deduplication

Raises: ImportError: If cadling is not installed.

Source: ll_gen/ll_gen/training/rl_trainer.py:26

RL alignment trainer using disposal validation as oracle.

Implements REINFORCE policy gradient with baseline subtraction. The generator’s neural model is trained to maximize disposal rewards (validity, compilation, geometry correctness).

Attributes: generator: Neural generator to train. disposal_config: Configuration for validation/repair. feedback_config: Configuration for reward signals. learning_rate: Optimizer learning rate. baseline_decay: EMA decay for reward baseline. entropy_coeff: Entropy bonus coefficient for exploration. max_grad_norm: Gradient clipping threshold. device: Training device (“cpu” or “cuda”). output_dir: Directory for saving checkpoints.

Methods

__init__(generator: BaseNeuralGenerator, disposal_config: DisposalConfig | None = None, feedback_config: FeedbackConfig | None = None, learning_rate: float = 1e-05, baseline_decay: float = 0.99, entropy_coeff: float = 0.01, max_grad_norm: float = 1.0, device: str = 'cpu', output_dir: str = 'training_output', seed: int | None = None) -> None

Source: ll_gen/ll_gen/training/rl_trainer.py:45

Initialize the RL alignment trainer.

Args: generator: The neural generator to train. disposal_config: Disposal/validation configuration. Uses defaults if None. feedback_config: Reward signal configuration. Uses defaults if None. learning_rate: Optimizer learning rate (Adam). baseline_decay: EMA decay for running reward baseline. entropy_coeff: Weight of entropy bonus for exploration. max_grad_norm: Gradient clipping threshold. device: Target device (“cpu” or “cuda”). output_dir: Directory for saving checkpoints and logs. seed: Random seed for reproducible shuffling. None for non-deterministic.

train_step(prompt: str, target_dimensions: tuple | None = None) -> dict[str, float]

Source: ll_gen/ll_gen/training/rl_trainer.py:133

Execute a single training step with REINFORCE.

  1. Generate proposal from prompt
  2. Dispose (validate + repair)
  3. Compute reward
  4. Update baseline (EMA)
  5. Compute policy gradient loss with advantage
  6. Backward + gradient clipping + optimizer step

Args: prompt: User prompt describing the shape. target_dimensions: Optional target bbox dimensions for semantic bonus.

Returns: Dictionary with keys:

  • “reward”: scalar reward from disposal
  • “advantage”: reward minus baseline
  • “loss”: scalar training loss
  • “baseline”: updated baseline value
  • “is_valid”: whether result passed validation

Raises: RuntimeError: If training not initialized or generation fails.
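Steps 4 and 5 above can be written out as scalar arithmetic. This is a sketch under assumptions rather than the trainer's code: it follows the listed order (baseline updated before the advantage is taken), treats `log_prob` and `entropy` as plain floats instead of differentiable tensors, and leaves out the backward pass, gradient clipping, and optimizer step.

```python
def reinforce_step(reward, log_prob, entropy, baseline,
                   baseline_decay=0.99, entropy_coeff=0.01):
    """Hypothetical scalar version of the baseline update and REINFORCE loss."""
    # Step 4: exponential moving average of the reward.
    new_baseline = baseline_decay * baseline + (1.0 - baseline_decay) * reward
    # Step 5: advantage-weighted negative log-likelihood, minus an entropy bonus.
    advantage = reward - new_baseline
    loss = -advantage * log_prob - entropy_coeff * entropy
    return {
        "reward": reward,
        "advantage": advantage,
        "loss": loss,
        "baseline": new_baseline,
    }


metrics = reinforce_step(reward=1.0, log_prob=-2.0, entropy=0.5, baseline=0.0)
```

Subtracting the baseline leaves the expected gradient unchanged while reducing its variance, and the entropy term discourages the policy from collapsing onto a single output too early.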

train_epoch(dataset: list[dict[str, Any]], batch_size: int = 4, shuffle: bool = True) -> dict[str, float]

Source: ll_gen/ll_gen/training/rl_trainer.py:391

Train for one epoch on a dataset.

Each sample in dataset should have at least a “prompt” key. Optionally includes “target_dimensions” for semantic matching.

Args: dataset: List of dicts with “prompt” and optional “target_dimensions”. batch_size: Batch size (processes one sample at a time; used for reporting). shuffle: Whether to shuffle dataset before training.

Returns: Dictionary with aggregated metrics:

  • “mean_reward”: average reward across epoch
  • “mean_loss”: average loss
  • “validity_rate”: fraction of valid results
  • “epoch_time_ms”: wall-clock time in milliseconds

evaluate(test_set: list[dict[str, Any]]) -> dict[str, float]

Source: ll_gen/ll_gen/training/rl_trainer.py:471

Evaluate on a test set without gradient updates.

Args: test_set: List of test samples with “prompt” keys.

Returns: Dictionary with metrics:

  • “validity_rate”: fraction of valid results
  • “compile_rate”: fraction of results for which a shape was produced
  • “mean_reward”: average reward
  • “reward_std”: standard deviation of reward

save_checkpoint(path: Path) -> None

Source: ll_gen/ll_gen/training/rl_trainer.py:528

Save training checkpoint (model, optimizer, baseline, history).

Args: path: Path to save checkpoint to.

load_checkpoint(path: Path) -> None

Source: ll_gen/ll_gen/training/rl_trainer.py:564

Load training checkpoint.

Args: path: Path to checkpoint file.

get_training_history() -> list[dict[str, float]]

Source: ll_gen/ll_gen/training/rl_trainer.py:629

Get the training history.

Returns: List of dicts from each training step.

Source: ll_gen/ll_gen/proposals/disposal_result.py:148

Record of a single deterministic repair step.

Attributes: tool: Which ShapeFix tool was used (“ShapeFix_Shape”, “ShapeFix_Wire”, “ShapeFix_Face”, “ShapeFix_Shell”, “ShapeFix_Solid”, “BOPAlgo_PaveFiller”). action: Description of what was done. status: Outcome (“done”, “failed”, “partial”). tolerance_used: Tolerance value used for the repair. entities_affected: Number of entities modified.

Methods

__init__(tool: str = '', action: str = '', status: str = 'done', tolerance_used: float | None = None, entities_affected: int = 0) -> None

Source: ll_gen/ll_gen/proposals/disposal_result.py

Source: ll_gen/ll_gen/config.py:78

Configuration for the generation router.

Methods

__init__(mechanical_keywords: list[str] = (lambda: ['extrude', 'cut', 'hole', 'fillet', 'chamfer', 'thread', 'bore', 'counterbore', 'countersink', 'slot', 'pocket', 'boss', 'rib', 'shell', 'loft', 'sweep', 'revolve', 'mirror', 'pattern', 'mounting', 'bracket', 'plate', 'bolt', 'nut', 'washer', 'flange', 'housing', 'enclosure', 'gear', 'shaft', 'bearing', 'hinge', 'clamp', 'spacer', 'standoff', 'bushing', 'collar', 'pin', 'key', 'keyway', 'groove', 'channel', 'rail', 'guide'])(), openscad_keywords: list[str] = (lambda: ['union', 'difference', 'intersection', 'hull', 'minkowski', 'openscad', 'scad'])(), freeform_keywords: list[str] = (lambda: ['smooth', 'flowing', 'sculpted', 'organic', 'aerodynamic', 'freeform', 'curved', 'biomorphic', 'blob', 'amorphous', 'ergonomic', 'contoured', 'streamlined', 'natural'])(), exploration_keywords: list[str] = (lambda: ['interpolate', 'morph', 'vary', 'explore', 'blend', 'transition', 'mix', 'combine', 'latent', 'sample'])(), codebook_keywords: list[str] = (lambda: ['quantize', 'discrete', 'codebook', 'disentangle'])(), confidence_threshold: float = 0.3, default_route: GenerationRoute = GenerationRoute.CODE_CADQUERY) -> None

Source: ll_gen/ll_gen/config.py

Source: ll_gen/ll_gen/routing/router.py:30

Result of the routing analysis.

Attributes: route: Selected generation route. confidence: Confidence in [0, 1] that this is the right route. scores: Per-route scores from the analysis. reasons: Human-readable reasons for the decision. forced: Whether the route was user-forced (override).

Methods

__init__(route: GenerationRoute = GenerationRoute.CODE_CADQUERY, confidence: float = 0.5, scores: dict[str, float] = dict(), reasons: list[str] = list(), forced: bool = False) -> None

Source: ll_gen/ll_gen/routing/router.py

Source: ll_gen/ll_gen/config.py:57

STEP export application protocol.

Source: ll_gen/ll_gen/conditioning/text_encoder.py:29

Encodes text prompts into ConditioningEmbeddings.

Uses ll_stepnet’s TextConditioner if available, otherwise falls back to deterministic hash-based embeddings.

Attributes: model_name: Hugging Face model identifier. conditioning_dim: Embedding dimension. freeze_encoder: Whether to freeze encoder parameters. device: Torch device (“cpu” or “cuda:*”).
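A deterministic hash-based fallback embedding could look like the sketch below. The actual fallback scheme is not documented here, so the SHA-256 counter construction and the [-1, 1] value range are assumptions chosen only to show the property that matters: the same prompt always maps to the same vector.

```python
import hashlib


def hash_embedding(prompt, dim=768):
    """Hypothetical fallback: expand SHA-256 digests into a fixed-size vector."""
    values = []
    counter = 0
    while len(values) < dim:
        # Hash the prompt with a running counter to get as many bytes as needed.
        digest = hashlib.sha256(f"{counter}:{prompt}".encode("utf-8")).digest()
        # Map each byte from [0, 255] to [-1.0, 1.0].
        values.extend(byte / 127.5 - 1.0 for byte in digest)
        counter += 1
    return values[:dim]


embedding = hash_embedding("a bracket with two mounting holes")
```

Such vectors carry no semantic meaning, so they keep the pipeline runnable without the text encoder but should not be expected to condition generation usefully.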

Methods

__init__(model_name: str = 'bert-base-uncased', conditioning_dim: int = 768, freeze_encoder: bool = True, device: str = 'cpu') -> None

Source: ll_gen/ll_gen/conditioning/text_encoder.py:42

Initialize TextConditioningEncoder.

Args: model_name: Hugging Face model identifier. conditioning_dim: Embedding dimension. freeze_encoder: Whether to freeze encoder parameters. device: Torch device (“cpu” or “cuda:*”).

encode(prompt: str) -> ConditioningEmbeddings

Source: ll_gen/ll_gen/conditioning/text_encoder.py:63

Encode a single text prompt.

Args: prompt: Text prompt to encode.

Returns: ConditioningEmbeddings with token and pooled embeddings.

encode_batch(prompts: list[str]) -> list[ConditioningEmbeddings]

Source: ll_gen/ll_gen/conditioning/text_encoder.py:120

Encode multiple text prompts.

Args: prompts: List of text prompts.

Returns: List of ConditioningEmbeddings.

Source: ll_gen/ll_gen/config.py:377

Configuration for RL alignment training.

Methods

__init__(learning_rate: float = 1e-05, batch_size: int = 4, num_epochs: int = 10, eval_interval: int = 5, baseline_decay: float = 0.99, entropy_coeff: float = 0.01, max_grad_norm: float = 1.0, checkpoint_dir: str = 'checkpoints') -> None

Source: ll_gen/ll_gen/config.py

Source: ll_gen/ll_gen/proposals/disposal_result.py:122

A single validation error or warning on a specific sub-shape.

Attributes: entity_type: TopAbs type name (“FACE”, “EDGE”, “VERTEX”, “WIRE”, “SHELL”, “SOLID”). entity_index: Index of the entity within its type (0-based enumeration order from TopExp_Explorer). error_code: Raw OCC BRepCheck_Status name (e.g. "BRepCheck_NotClosed"). error_category: Mapped neural-interpretable category. severity: How critical this finding is. description: Human-readable description of the error. suggestion: Actionable suggestion for correction.

Methods

__init__(entity_type: str = '', entity_index: int = 0, error_code: str = '', error_category: ErrorCategory = ErrorCategory.TOPOLOGY_ERROR, severity: ErrorSeverity = ErrorSeverity.CRITICAL, description: str = '', suggestion: str = '') -> None

Source: ll_gen/ll_gen/proposals/disposal_result.py

Source: ll_gen/ll_gen/pipeline/verification.py:60

Verify generated geometry against the original prompt.

Args: dimension_tolerance: Fractional tolerance for dimensional matching (0.15 = 15%). vlm_backend: Optional VLM backend name for vision-based verification. Supported: "clip", "llm". If None, only dimensional checking is used.

Example::

    verifier = VisualVerifier()
    result = verifier.verify(
        render_paths=[Path("view_front.png"), ...],
        prompt="A box 100mm × 50mm × 20mm",
        geometry_report=report,
    )
    print(result.matches_intent)  # True/False

Methods

__init__(dimension_tolerance: float = 0.15, vlm_backend: str | None = None) -> None

Source: ll_gen/ll_gen/pipeline/verification.py:81

verify(render_paths: list[Path] | None = None, prompt: str = '', geometry_report: GeometryReport | None = None) -> VerificationResult

Source: ll_gen/ll_gen/pipeline/verification.py:91

Run verification against the prompt.

Combines dimensional checking (if GeometryReport available) with optional VLM verification (if renders available).

Args: render_paths: Paths to rendered images of the shape. prompt: Original text prompt. geometry_report: GeometryReport from introspection.

Returns: VerificationResult with pass/fail and details.
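The fractional tolerance used for dimensional checking can be illustrated with a short sketch. `dimensions_match` is hypothetical, and sorting both triples so the comparison ignores axis order is an assumption; the reference does not say how the verifier pairs measured and requested dimensions.

```python
def dimensions_match(actual, target, tolerance=0.15):
    """True if every dimension is within `tolerance` (fractional) of its target."""
    # Assumption: compare sorted dimensions so orientation does not matter.
    pairs = zip(sorted(actual), sorted(target))
    return all(abs(a - t) <= tolerance * abs(t) for a, t in pairs)


close_enough = dimensions_match((110.0, 48.0, 21.0), (100.0, 50.0, 20.0))
too_long = dimensions_match((130.0, 50.0, 20.0), (100.0, 50.0, 20.0))
```

With the default of 0.15, a 100 mm target accepts measured values between 85 mm and 115 mm.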

build_code_feedback(result: DisposalResult, original_proposal: CodeProposal) -> str

Source: ll_gen/ll_gen/feedback/feedback_builder.py:38

Build an LLM-readable error message for code generation retry.

The returned string is designed to be injected into the LLM’s system prompt as error context so it can generate a corrected script. It includes:

  • The original code (truncated to 100 lines if needed)
  • The primary error category and description
  • All critical findings with entity type and suggestion
  • Geometry report summary (if available)
  • An explicit instruction to correct the issue

Args: result: DisposalResult from the failed disposal attempt. original_proposal: The CodeProposal that was executed.

Returns: Multi-line string suitable for LLM retry prompting.
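Truncating the original code to 100 lines before embedding it in the feedback message might be done as in this sketch. The helper name and the wording of the truncation marker are assumptions; only the 100-line limit comes from the description above.

```python
def truncate_code(code, max_lines=100):
    """Keep at most `max_lines` lines and note how many were dropped."""
    lines = code.splitlines()
    if len(lines) <= max_lines:
        return code
    omitted = len(lines) - max_lines
    # Hypothetical marker so the LLM knows the listing is incomplete.
    return "\n".join(lines[:max_lines] + [f"... ({omitted} more lines truncated)"])


long_script = "\n".join(f"line_{i} = {i}" for i in range(250))
snippet = truncate_code(long_script)
```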

build_neural_feedback(result: DisposalResult) -> dict[str, Any]

Source: ll_gen/ll_gen/feedback/feedback_builder.py:232

Build structured feedback for neural (VAE/diffusion) retry.

Returns a dict that the neural generator can use to condition its next attempt:

  • error_category: Primary category string.
  • failed_entity_indices: Dict mapping entity type to list of failing entity indices.
  • parameter_hints: Adjustment hints per error category.
  • topology_stats: V/E/F counts and euler characteristic.
  • severity_counts: Number of findings per severity level.

Args: result: DisposalResult from the failed disposal attempt.

Returns: Structured feedback dict.

build_training_feedback(result: DisposalResult) -> dict[str, Any]

Source: ll_gen/ll_gen/feedback/feedback_builder.py:321

Build feedback for RL training reward shaping.

Returns a dict containing per-finding penalty breakdown, tier pass/fail status, and the composite reward signal.

Args: result: DisposalResult from disposal.

Returns: Training feedback dict.

compute_reward(result: DisposalResult, config: FeedbackConfig | None = None, target_dimensions: tuple[float, float, float] | None = None, target_volume: float | None = None) -> float

Source: ll_gen/ll_gen/feedback/reward_signal.py:31

Compute a composite scalar reward from a DisposalResult.

The reward is built up from independent components, each contributing a signed delta. The components are:

  1. Base validity (config.validity_reward): +0.8 if the shape is completely valid, else 0.0.

  2. Shape constructed (config.shape_constructed_reward): +0.16 if a TopoDS_Shape was produced at all (even if invalid).

  3. Repairable (config.repairable_reward): +0.0 if deterministic repair succeeded.

  4. Per-tier bonus (config.per_tier_reward): +0.16 for each passing validation tier: manifold, watertight, euler, no-self-intersection.

  5. Semantic match (config.semantic_match_reward): +0.2 if bounding box dimensions match the target within config.dimension_tolerance_pct, plus an additional +0.1 (50% of semantic_match_reward) if the volume also matches. Combined semantic reward is capped at 1.5x semantic_match_reward.

  6. Critical error penalty (config.critical_error_penalty): −0.1 per critical-severity finding.

The final reward is floored at −1.0 but has no upper clamp, so the RL trainer receives full gradient signal for semantic match.

Args: result: Disposal result to score. config: Feedback config with reward weights. Uses defaults if None. target_dimensions: Expected (w, h, d) bounding box dims. If provided, enables semantic match scoring. target_volume: Expected volume. Used as supplementary semantic check (±10% tolerance).

Returns: Scalar reward ≥ −1.0.
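The component sum described above can be sketched with the documented default weights. `RewardWeights` and `composite_reward` are hypothetical stand-ins for FeedbackConfig and compute_reward: the inputs are passed as plain booleans and counts rather than read from a DisposalResult, and the dimension and volume comparisons are assumed to have been done by the caller.

```python
from dataclasses import dataclass


@dataclass
class RewardWeights:
    """Hypothetical stand-in for FeedbackConfig, using the values listed above."""
    validity_reward: float = 0.8
    shape_constructed_reward: float = 0.16
    repairable_reward: float = 0.0
    per_tier_reward: float = 0.16
    semantic_match_reward: float = 0.2
    critical_error_penalty: float = 0.1


def composite_reward(is_valid, has_shape, repaired, tiers_passed,
                     critical_findings, dims_match=False,
                     volume_matches=False, weights=None):
    w = weights or RewardWeights()
    reward = 0.0
    if is_valid:
        reward += w.validity_reward
    if has_shape:
        reward += w.shape_constructed_reward
    if repaired:
        reward += w.repairable_reward
    reward += w.per_tier_reward * tiers_passed
    semantic = 0.0
    if dims_match:
        semantic += w.semantic_match_reward
        if volume_matches:
            semantic += 0.5 * w.semantic_match_reward
    # Semantic bonus is capped at 1.5x semantic_match_reward.
    reward += min(semantic, 1.5 * w.semantic_match_reward)
    reward -= w.critical_error_penalty * critical_findings
    # Floor at -1.0, with no upper clamp.
    return max(reward, -1.0)


best = composite_reward(True, True, False, 4, 0, dims_match=True, volume_matches=True)
worst = composite_reward(False, False, False, 0, 20)
```

Under these defaults a fully valid shape that passes all four tiers and matches both dimensions and volume scores 0.8 + 0.16 + 4 × 0.16 + 0.3 = 1.9, while twenty critical findings on a missing shape hit the −1.0 floor.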

get_ll_gen_config(**overrides) -> LLGenConfig

Source: ll_gen/ll_gen/config.py:412

Create an LLGenConfig with optional overrides.

Supports nested overrides via dotted keys: get_ll_gen_config(**{"codegen.temperature": 0.5})

Args: **overrides: Key-value pairs to override defaults. Top-level keys are set directly on LLGenConfig. Dotted keys like “codegen.temperature” are set on the corresponding sub-config.

Returns: Configured LLGenConfig instance.
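One way dotted-key overrides can be resolved onto nested config objects is sketched below. The `Settings` and `CodegenSettings` dataclasses, their fields, and `apply_overrides` are all hypothetical; they stand in for LLGenConfig and its sub-configs, whose fields are not listed in this section.

```python
from dataclasses import dataclass, field


@dataclass
class CodegenSettings:  # hypothetical sub-config
    temperature: float = 0.2


@dataclass
class Settings:  # hypothetical top-level config
    max_retries: int = 3
    codegen: CodegenSettings = field(default_factory=CodegenSettings)


def apply_overrides(config, **overrides):
    """Set plain keys on `config` and dotted keys on the named sub-config."""
    for key, value in overrides.items():
        *parents, leaf = key.split(".")
        target = config
        for name in parents:
            target = getattr(target, name)  # walk down to the sub-config
        if not hasattr(target, leaf):
            raise AttributeError(f"unknown config key: {key}")
        setattr(target, leaf, value)
    return config


cfg = apply_overrides(Settings(), **{"codegen.temperature": 0.5, "max_retries": 5})
```

Dotted keys are not valid Python identifiers, which is why they have to be passed by unpacking a dict rather than as ordinary keyword arguments.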