API Reference
Generated from the stepnet package source. Each symbol links to its definition on GitHub.
class AdaptiveLayer(nn.Module)
Source: ll_stepnet/stepnet/conditioning.py:45
Single transformer block with cross-attention for conditioning injection.
Contains a self-attention sub-layer, a cross-attention sub-layer (attending to conditioning embeddings), and a feed-forward sub-layer.
Args:
hidden_dim: Model hidden dimension.
num_heads: Number of attention heads.
dropout: Dropout rate.
Methods
__init__
__init__(hidden_dim: int = 1024, num_heads: int = 8, dropout: float = 0.1) -> None
Source: ll_stepnet/stepnet/conditioning.py:57
forward
forward(hidden_states: torch.Tensor, conditioning: torch.Tensor, attention_mask: Optional[torch.Tensor] = None) -> torch.Tensor
Source: ll_stepnet/stepnet/conditioning.py:93
Forward pass with self-attention, cross-attention, and FFN.
Args:
hidden_states: [B, S, D] decoder hidden states.
conditioning: [B, C, D] conditioning embeddings.
attention_mask: Optional mask for the conditioning sequence.
Returns: Updated hidden states [B, S, D].
class BranchAnnotation
Source: ll_stepnet/stepnet/annotations.py:20
Annotation for a single DFS branch (subtree rooted at a root entity).
Attributes:
root_id: Entity ID of the branch root.
root_type: STEP entity type of the branch root.
descendant_count: Number of descendants (excluding root).
max_depth: Maximum depth reached in this branch.
type_distribution: Counts of each entity type in the branch.
Methods
format
format(max_types: int = 5) -> str
Source: ll_stepnet/stepnet/annotations.py:37
Format branch annotation as text string.
Args: max_types: Maximum number of type counts to include.
Returns: Formatted annotation string with [BRANCH] delimiters.
__init__
__init__(root_id: int, root_type: str, descendant_count: int, max_depth: int, type_distribution: dict[str, int] = dict()) -> None
Source: ll_stepnet/stepnet/annotations.py
class CADDenoiser(nn.Module)
Source: ll_stepnet/stepnet/diffusion.py:356
Self-attention denoiser that predicts noise from noisy latents.
Architecture: sinusoidal timestep embedding + self-attention transformer with num_layers layers and num_heads heads.
Args:
latent_dim: Dimension of the input noisy latent.
hidden_dim: Transformer hidden dimension.
num_layers: Number of self-attention layers.
num_heads: Number of attention heads.
dropout: Dropout rate.
Methods
__init__
__init__(latent_dim: int = 256, hidden_dim: int = DEFAULT_DENOISER_HIDDEN_DIM, num_layers: int = 12, num_heads: int = DEFAULT_DENOISER_HEADS, dropout: float = 0.1) -> None
Source: ll_stepnet/stepnet/diffusion.py:370
forward
forward(noisy_latent: torch.Tensor, timesteps: torch.Tensor) -> torch.Tensor
Source: ll_stepnet/stepnet/diffusion.py:413
Predict noise from a noisy latent and timestep.
Args:
noisy_latent: [B, D] or [B, S, D] noisy data.
timesteps: [B] integer timestep indices.
Returns: Predicted noise, same shape as noisy_latent.
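The sinusoidal timestep embedding named in the architecture note can be illustrated with a minimal sketch. It uses plain Python floats rather than torch tensors, and the frequency layout (sines first, then cosines, with a max_period of 10000) is the common transformer convention, assumed here rather than taken from the stepnet source. The function name is hypothetical.

```python
import math
from typing import List


def sinusoidal_timestep_embedding(timestep: int, dim: int, max_period: float = 10000.0) -> List[float]:
    """Return a dim-length sinusoidal embedding for one integer timestep."""
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    half = dim // 2
    # Geometric frequency ladder starting at 1 and decaying towards 1/max_period.
    freqs = [math.exp(-math.log(max_period) * i / half) for i in range(half)]
    # First half holds sines, second half holds cosines (assumed layout).
    return [math.sin(timestep * f) for f in freqs] + [math.cos(timestep * f) for f in freqs]


emb = sinusoidal_timestep_embedding(0, 8)
print(len(emb))
```

In a denoiser of this kind, such a vector would typically be projected to hidden_dim and added to the latent tokens so each layer can tell which noise level it is operating at.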
class CADGenerationPipeline(nn.Module)
Source: ll_stepnet/stepnet/generation_pipeline.py:58
End-to-end CAD generation pipeline.
Connects ll_stepnet generative models → geotoken decoding → cadling reconstruction. Supports VAE, VQ-VAE, and Diffusion sampling modes.
Attributes:
model: Generative model (STEPVAE, VQVAEModel, or StructuredDiffusion).
mode: Generation mode ('vae', 'vqvae', or 'diffusion').
device: Target device ('cpu' or 'cuda').
executor_tolerance: Tolerance for cadling CommandExecutor validation.
quantization_levels: Number of quantization levels per parameter.
Methods
__init__
__init__(model: nn.Module, mode: str = 'vae', device: str = 'cpu', executor_tolerance: float = 1e-06, quantization_levels: int = 256, fallback_config: Optional[FallbackConfig] = None, on_error: Optional[Callable[[CADGenerationError], None]] = None) -> None
Source: ll_stepnet/stepnet/generation_pipeline.py:72
Initialize the CAD generation pipeline.
Args:
model: Generative model (STEPVAE, VQVAEModel, or StructuredDiffusion).
mode: Generation mode: 'vae', 'vqvae', or 'diffusion'.
device: Target device: 'cpu' or 'cuda'.
executor_tolerance: Tolerance threshold for cadling CommandExecutor.
quantization_levels: Number of quantization bins per parameter.
fallback_config: Configuration for error recovery strategies.
on_error: Optional callback invoked when generation errors occur.
Raises:
ValueError: If mode is not one of the supported modes.
TypeError: If model is not a recognized generative model.
generate
generate(num_samples: int = 1, seq_len: Optional[int] = None, reconstruct: bool = True, **kwargs) -> List[Dict[str, Any]]
Source: ll_stepnet/stepnet/generation_pipeline.py:184
Generate CAD sequences end-to-end.
Samples from the generative model, decodes to geotoken TokenSequence, and optionally reconstructs using cadling’s CommandExecutor.
Args:
num_samples: Number of sequences to generate.
seq_len: Sequence length for generation (mode-dependent).
reconstruct: Whether to reconstruct via cadling. Defaults to True.
**kwargs: Additional keyword arguments passed to the model's sampling method (e.g., temperature, top_k for diffusion).
Returns:
List of result dictionaries, each containing:
- 'token_sequence': geotoken TokenSequence (if geotoken installed)
- 'commands': List of command dicts (if geotoken installed)
- 'command_logits': Raw command logits [num_samples, seq_len, 6]
- 'param_logits': Raw parameter logits, list of 16 tensors
- 'shape': Reconstructed cadling Shape (if reconstruct=True and cadling installed)
- 'valid': Boolean validity flag (if reconstruct=True and cadling installed)
- 'error': Error message if reconstruction failed (if reconstruct=True)
evaluate
evaluate(generated_results: List[Dict[str, Any]], reference_shapes: Optional[List[Any]] = None) -> Dict[str, float]
Source: ll_stepnet/stepnet/generation_pipeline.py:1055
Evaluate generation quality using cadling’s metrics.
Computes validity rate, and optionally novelty and coverage if reference shapes are provided.
Args:
generated_results: List of result dicts from generate().
reference_shapes: Optional list of reference cadling.Shape objects for novelty/coverage computation.
Returns:
Dictionary with evaluation metrics:
- 'validity_rate': Fraction of valid generated shapes
- 'novelty': Novelty score (if references provided)
- 'coverage': Coverage score (if references provided)
Raises: ImportError: If cadling is not installed.
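As a rough illustration of the 'validity_rate' metric, the sketch below counts the result dicts whose 'valid' flag is set. It does not call cadling. Two choices are assumptions of this sketch, not documented behavior: a result without a 'valid' key counts as invalid, and an empty list yields 0.0. The function name is hypothetical.

```python
from typing import Any, Dict, List


def validity_rate(results: List[Dict[str, Any]]) -> float:
    """Fraction of result dicts whose 'valid' flag is truthy."""
    if not results:
        return 0.0  # assumed convention for an empty batch
    valid = sum(1 for r in results if r.get("valid", False))
    return valid / len(results)


results = [
    {"valid": True},
    {"valid": False, "error": "reconstruction failed"},
    {"command_logits": None},  # e.g. reconstruct=False, so no 'valid' key
    {"valid": True},
]
print(validity_rate(results))  # prints 0.5
```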
class CadlingDataset(Dataset)
Source: ll_stepnet/stepnet/data.py:413
PyTorch Dataset for cadling Sketch2DItem objects.
Accepts a list of cadling Sketch2DItem instances and converts
each one to the same dict format as GeoTokenDataset by
calling item.to_geotoken_commands() to get the command sequence
in geotoken-compatible format.
This lets you train ll_stepnet’s generative models (STEPVAE, SkexGenVQVAE, etc.) directly on cadling’s in-memory geometry objects without writing them to disk first.
Each __getitem__ returns:
- command_types: [max_commands] integer command type IDs
- parameters: [max_commands, NUM_PARAM_SLOTS] parameter values
- parameter_mask: [max_commands, NUM_PARAM_SLOTS] active-parameter mask
- attention_mask: [max_commands] validity mask
Args:
sketch_items: List of cadling Sketch2DItem objects (or any object
with a to_geotoken_commands() method).
max_commands: Maximum command sequence length.
include_topology: If True, build topology and include in output.
labels: Optional labels for supervised learning.
Methods
__init__
__init__(sketch_items: List, max_commands: int = DEFAULT_MAX_SEQ_LEN, include_topology: bool = False, labels: Optional[List] = None) -> None
Source: ll_stepnet/stepnet/data.py:451
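The fixed-length item layout described above (command_types, parameters, attention_mask, all padded to max_commands) can be sketched with plain lists. This is an illustration only: the pad value 0, the value NUM_PARAM_SLOTS = 16, and the helper name pad_commands are assumptions, and parameter_mask is omitted because it depends on the real PARAMETER_MASKS table.

```python
from typing import Dict, List

NUM_PARAM_SLOTS = 16  # assumed value; the real constant lives in stepnet


def pad_commands(
    command_types: List[int],
    parameters: List[List[int]],
    max_commands: int,
) -> Dict[str, list]:
    """Pad or truncate one command sequence to exactly max_commands rows."""
    n = min(len(command_types), max_commands)
    pad = max_commands - n
    return {
        "command_types": command_types[:n] + [0] * pad,
        "parameters": [row[:] for row in parameters[:n]]
        + [[0] * NUM_PARAM_SLOTS for _ in range(pad)],
        # 1 marks a real command, 0 marks padding.
        "attention_mask": [1] * n + [0] * pad,
    }


item = pad_commands([1, 2], [[3] * 16, [4] * 16], max_commands=4)
print(item["attention_mask"])  # prints [1, 1, 0, 0]
```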
class CodebookDecoder(nn.Module)
Source: ll_stepnet/stepnet/vqvae.py:548
Autoregressive transformer decoder for generating codebook indices.
Given a sequence of codebook indices, this module predicts the next
index autoregressively. One CodebookDecoder is instantiated per
codebook stream (topology, geometry, extrusion) so that each stream
can be generated independently.
The architecture is a standard GPT-style transformer decoder with causal masking, learned positional embeddings, and a final linear head projecting to the codebook vocabulary.
Args:
code_dim: Hidden dimension of the transformer.
num_layers: Number of transformer decoder layers.
num_heads: Number of attention heads.
vocab_size: Number of codebook entries this decoder predicts
over (must match the corresponding VectorQuantizer
num_embeddings).
max_codes: Maximum sequence length of codes to generate.
dropout: Dropout rate applied throughout the transformer.
Methods
__init__
__init__(code_dim: int = 256, num_layers: int = 4, num_heads: int = 8, vocab_size: int = 500, max_codes: int = 16, dropout: float = 0.1) -> None
Source: ll_stepnet/stepnet/vqvae.py:571
forward
forward(codes: torch.Tensor, mask: Optional[torch.Tensor] = None) -> torch.Tensor
Source: ll_stepnet/stepnet/vqvae.py:627
Compute next-code logits for a sequence of codebook indices.
Args:
codes: (batch, seq_len) LongTensor of codebook indices.
Should be prepended with BOS token for teacher-forced
training.
mask: Optional (batch, seq_len) padding mask where
True/1 indicates valid positions and
False/0 indicates padding.
Returns:
Logits tensor of shape (batch, seq_len, vocab_size)
giving the predicted distribution over the next codebook
index at each position.
sample
sample(num_samples: int, max_codes: Optional[int] = None, temperature: float = 1.0, top_k: Optional[int] = None) -> torch.Tensor
Source: ll_stepnet/stepnet/vqvae.py:680
Generate code sequences autoregressively.
Starts from a BOS token and samples one code at a time, feeding each sampled code back as input for the next step.
Args:
num_samples: Batch size of sequences to generate in
parallel.
max_codes: Maximum number of codes to generate per
sequence. Defaults to self.max_codes.
temperature: Sampling temperature. Higher values produce
more diverse outputs.
top_k: If set, only sample from the top-k highest
probability entries at each step.
Returns:
(num_samples, max_codes) LongTensor of generated
codebook indices.
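To make the temperature and top_k arguments concrete, here is a minimal single-step sketch in plain Python. It shows one common way to combine them (scale logits by 1/temperature, keep the k largest, softmax, draw); whether stepnet applies them in exactly this order is an assumption, and sample_next_code is a hypothetical helper, not part of the package.

```python
import math
import random
from typing import List, Optional


def sample_next_code(
    logits: List[float],
    temperature: float = 1.0,
    top_k: Optional[int] = None,
    rng: Optional[random.Random] = None,
) -> int:
    """Sample one index from logits using temperature and optional top-k."""
    rng = rng or random.Random()
    scaled = [x / temperature for x in logits]
    # Indices sorted by descending logit; keep only the first top_k if set.
    order = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)
    keep = order[:top_k] if top_k is not None else order
    # Softmax weights over the kept entries, shifted by the max for stability.
    m = scaled[keep[0]]
    weights = [math.exp(scaled[i] - m) for i in keep]
    return rng.choices(keep, weights=weights, k=1)[0]


print(sample_next_code([0.1, 2.0, -1.0], top_k=1))  # prints 1
```

With top_k=1 the draw is deterministic (the argmax). Larger temperatures flatten the weights and so spread the draws across more codes.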
class CommandType(IntEnum)
Source: ll_stepnet/stepnet/output_heads.py:31
CAD command types matching geotoken's six-command vocabulary.
Ordering matches geotoken.CommandType enum ordering so that integer indices are directly interchangeable between the two modules.
class CommandTypeHead(nn.Module)
Source: ll_stepnet/stepnet/output_heads.py:66
Predicts the CAD command type at each sequence position.
Args:
embed_dim: Dimension of the decoder hidden states.
num_command_types: Number of distinct command types.
Methods
__init__
__init__(embed_dim: int = 256, num_command_types: int = NUM_COMMAND_TYPES) -> None
Source: ll_stepnet/stepnet/output_heads.py:74
forward
forward(hidden_states: torch.Tensor) -> torch.Tensor
Source: ll_stepnet/stepnet/output_heads.py:78
Compute command type logits.
Args: hidden_states: [batch, seq_len, embed_dim]
Returns: Logits [batch, seq_len, num_command_types].
class CompositeHead(nn.Module)
Source: ll_stepnet/stepnet/output_heads.py:129
Combined command-type and parameter prediction head with masking.
During training this head:
1. Predicts command type logits.
2. Predicts 16 parameter logits.
3. Applies PARAMETER_MASKS to zero out gradients for
parameters that do not belong to the predicted (or target)
command type.
4. Optionally predicts vertex positions via
VertexPredictionHead (when include_vertex_head=True).
Args:
embed_dim: Dimension of the decoder hidden states.
num_command_types: Number of distinct command types.
num_param_slots: Number of parameter slots.
num_levels: Number of quantisation levels per parameter.
include_vertex_head: Whether to include the
VertexPredictionHead for direct 3D vertex prediction.
max_vertices: Maximum number of vertex slots (only used when
include_vertex_head=True).
num_refinement_steps: Number of learned vertex refinement
iterations (only used when include_vertex_head=True).
Methods
__init__
__init__(embed_dim: int = 256, num_command_types: int = NUM_COMMAND_TYPES, num_param_slots: int = NUM_PARAM_SLOTS, num_levels: int = DEFAULT_QUANTIZATION_LEVELS, include_vertex_head: bool = False, max_vertices: int = 512, num_refinement_steps: int = 3) -> None
Source: ll_stepnet/stepnet/output_heads.py:154
forward
forward(hidden_states: torch.Tensor, command_targets: Optional[torch.Tensor] = None) -> Dict[str, object]
Source: ll_stepnet/stepnet/output_heads.py:204
Predict command types and parameters with optional masking.
Args:
hidden_states: [batch, seq_len, embed_dim]
command_targets: [batch, seq_len] integer command-type targets. When provided, masking uses the ground-truth command types; otherwise the argmax prediction is used.
Returns:
Dictionary with:
- command_type_logits: [batch, seq_len, num_command_types]
- parameter_logits: list of 16 [batch, seq_len, num_levels]
- parameter_mask: [batch, seq_len, num_param_slots] bool
decode_to_token_sequence
decode_to_token_sequence(command_logits: torch.Tensor, param_logits: List[torch.Tensor], batch_index: int = 0)
Source: ll_stepnet/stepnet/output_heads.py:262
Convert model output logits to a geotoken TokenSequence.
Takes the raw logits from a generative model’s forward pass and
produces a geotoken-compatible TokenSequence by argmaxing
command types and parameters, applying PARAMETER_MASKS, and
stopping at the first EOS token.
This is the inverse of what GeoTokenDataset does
(TokenSequence → tensors); here we go tensors → TokenSequence.
Args:
command_logits: [B, S, num_command_types] logits.
param_logits: List of 16 [B, S, num_levels] logits.
batch_index: Which sample in the batch to decode (default 0).
Returns:
A geotoken.TokenSequence with decoded command_tokens.
Only command tokens are populated; graph and constraint tokens
are not (those come from separate decoders in full models).
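The decoding steps described above (argmax, mask, stop at the first EOS) can be sketched for a single sample using nested lists instead of tensors. The mask table below is a toy with 4 slots and invented active patterns; the real PARAMETER_MASKS covers 16 slots and its contents are not shown in this reference. Dropping the EOS command from the output is also an assumption of the sketch. The command ordering SOL=0, LINE=1, ARC=2, CIRCLE=3, EXTRUDE=4, EOS=5 follows the GeoTokenDataset description.

```python
from typing import Dict, List, Tuple

EOS = 5  # SOL=0, LINE=1, ARC=2, CIRCLE=3, EXTRUDE=4, EOS=5

# Hypothetical masks over 4 slots, for illustration only.
TOY_MASKS: Dict[int, List[bool]] = {
    0: [False, False, False, False],  # SOL
    1: [True, True, False, False],    # LINE
    2: [True, True, True, False],     # ARC
    3: [True, True, True, False],     # CIRCLE
    4: [True, True, True, True],      # EXTRUDE
    5: [False, False, False, False],  # EOS
}


def argmax(values: List[float]) -> int:
    return max(range(len(values)), key=lambda i: values[i])


def decode(command_logits, param_logits) -> List[Tuple[int, List[int]]]:
    """command_logits: [S][num_types]; param_logits: [slots][S][num_levels]."""
    out = []
    for pos, logits in enumerate(command_logits):
        cmd = argmax(logits)
        if cmd == EOS:
            break  # stop at the first EOS
        # Inactive slots are filled with 0 instead of the argmax level.
        params = [
            argmax(param_logits[slot][pos]) if active else 0
            for slot, active in enumerate(TOY_MASKS[cmd])
        ]
        out.append((cmd, params))
    return out


command_logits = [[0, 9, 0, 0, 0, 0], [0, 0, 0, 0, 0, 9], [0, 0, 9, 0, 0, 0]]  # LINE, EOS, ARC
param_logits = [[[0, 0, 9]] * 3 for _ in range(4)]  # every slot argmaxes to level 2
decoded = decode(command_logits, param_logits)
print(decoded)  # prints [(1, [2, 2, 0, 0])]
```

The ARC at position 2 never appears in the output because decoding halts at the EOS in position 1.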
class ConditioningConfig
Source: ll_stepnet/stepnet/config.py:212
Configuration for cross-attention conditioning modules.
Methods
__init__
__init__(text_encoder_name: str = 'bert-base-uncased', image_encoder_name: str = 'facebook/dinov2-base', conditioning_dim: int = 1024, skip_cross_attention_blocks: int = 2, freeze_encoder: bool = True, num_adaptive_layers: int = 1) -> None
Source: ll_stepnet/stepnet/config.py
class DDPMScheduler
Source: ll_stepnet/stepnet/diffusion.py:30
Linear-beta DDPM noise scheduler.
Precomputes alpha, alpha_bar, and sigma schedules for T timesteps and supports:
- add_noise (forward process)
- step (single reverse step, DDPM)
- pndm_step (accelerated PNDM/PLMS reverse step)
Args:
num_timesteps: Total number of diffusion steps.
beta_start: Starting value of the linear beta schedule.
beta_end: Ending value of the linear beta schedule.
inference_steps: Number of evenly-spaced steps for PNDM sampling.
Methods
__init__
__init__(num_timesteps: int = 1000, beta_start: float = 0.0001, beta_end: float = 0.02, inference_steps: int = 200) -> None
Source: ll_stepnet/stepnet/diffusion.py:46
add_noise
add_noise(x_start: torch.Tensor, noise: torch.Tensor, timesteps: torch.Tensor) -> torch.Tensor
Source: ll_stepnet/stepnet/diffusion.py:85
Forward diffusion: add noise to clean data.
Draws x_t from q(x_t | x_0): x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, with eps ~ N(0, I).
Args:
x_start: Clean data [B, …] (e.g. [B, D] or [B, S, D]).
noise: Gaussian noise, same shape as x_start.
timesteps: Integer timestep indices [B].
Returns:
Noisy data, same shape as x_start, at the given timesteps.
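A minimal sketch of the forward process under a linear beta schedule, written with plain Python floats for a single sample. The default values mirror the scheduler signature (1000 steps, beta from 0.0001 to 0.02); how stepnet spaces the betas internally is an assumption, and both helper names are hypothetical.

```python
import math
from typing import List


def linear_alpha_bars(num_timesteps: int = 1000, beta_start: float = 1e-4, beta_end: float = 0.02) -> List[float]:
    """Cumulative products of (1 - beta_t) for a linear beta schedule."""
    alpha_bars, running = [], 1.0
    denom = max(num_timesteps - 1, 1)
    for t in range(num_timesteps):
        beta = beta_start + (beta_end - beta_start) * t / denom
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x_start: List[float], noise: List[float], alpha_bar_t: float) -> List[float]:
    """x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, elementwise."""
    a, b = math.sqrt(alpha_bar_t), math.sqrt(1.0 - alpha_bar_t)
    return [a * x + b * e for x, e in zip(x_start, noise)]


ab = linear_alpha_bars()
x_t = add_noise([1.0, -1.0], [0.3, 0.7], ab[500])
print(len(x_t))
```

Because alpha_bar shrinks monotonically with t, early timesteps keep most of the signal while late ones are close to pure noise.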
step
step(model_output: torch.Tensor, timestep: int, sample: torch.Tensor) -> torch.Tensor
Source: ll_stepnet/stepnet/diffusion.py:114
Single DDPM reverse step: predict x_{t-1} from x_t.
Args:
model_output: Predicted noise [B, D].
timestep: Current integer timestep.
sample: Current noisy sample [B, D].
Returns: Denoised sample at t-1, [B, D].
ddim_step_with_log_prob
ddim_step_with_log_prob(model_output: torch.Tensor, timestep: int, timestep_prev: int, sample: torch.Tensor, eta: float = 1.0) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor]
Source: ll_stepnet/stepnet/diffusion.py:155
Stochastic DDIM reverse step returning its Gaussian log-probability.
Implements the per-step transition used by diffusion policy-gradient
training (DDPO; Black et al., 2023). The math is the DDIM eta
sampler shared verbatim by the reference implementations
(Make-a-Shape gaussian_diffusion.ddim_sample and the identical
brepdiff/diff3d variants):
    x0     = (x_t - sqrt(1 - ab_t) * eps) / sqrt(ab_t)
    sigma  = eta * sqrt((1 - ab_prev)/(1 - ab_t)) * sqrt(1 - ab_t/ab_prev)
    mean   = sqrt(ab_prev) * x0 + sqrt(max(1 - ab_prev - sigma^2, 0)) * eps
    x_prev = mean + sigma * noise    # noise ~ N(0, I)
where eps is the model's predicted noise. The returned log-prob is
log N(x_prev; mean, sigma^2 I) summed over the feature dimension
(one scalar per batch element). Because mean is a function of
model_output (the denoiser’s epsilon prediction), gradients flow
into the model parameters — this is what makes the RL signal real
rather than a detached stand-in.
Args:
model_output: Predicted noise eps_theta(x_t, t), shape [B, D].
timestep: Current integer timestep t.
timestep_prev: Next (smaller) timestep t'; pass -1 for the
final step landing on x_0 (ab_prev = 1.0).
sample: Current noisy sample x_t, shape [B, D].
eta: DDIM stochasticity. Must be > 0 for a non-degenerate policy
gradient; 1.0 recovers ancestral-DDPM-like noise.
Returns:
Tuple (x_prev, log_prob, entropy) where log_prob and
entropy are shape [B] tensors. log_prob carries the
gradient to model_output (and thus the model parameters).
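The four equations above translate almost line for line into code. The sketch below handles one sample (no batch dimension) with plain floats and takes alpha_bar values directly instead of timestep indices. It cannot show gradient flow, since it uses no autograd. The entropy is computed as the closed-form entropy of an isotropic Gaussian, which is an assumption about what the real method returns, and the explicit noise argument is added only to keep the sketch deterministic.

```python
import math
from typing import List, Tuple


def ddim_step_with_log_prob(
    eps: List[float],
    x_t: List[float],
    noise: List[float],
    ab_t: float,
    ab_prev: float,
    eta: float = 1.0,
) -> Tuple[List[float], float, float]:
    """One stochastic DDIM step for a single sample, with log-prob and entropy."""
    sigma = eta * math.sqrt((1 - ab_prev) / (1 - ab_t)) * math.sqrt(1 - ab_t / ab_prev)
    if sigma <= 0.0:
        # eta = 0 or ab_prev = 1 gives a degenerate (zero-variance) transition.
        raise ValueError("sigma must be positive for a finite log-probability")
    dir_coef = math.sqrt(max(1 - ab_prev - sigma ** 2, 0.0))
    x_prev, log_prob = [], 0.0
    for e, x, n in zip(eps, x_t, noise):
        x0 = (x - math.sqrt(1 - ab_t) * e) / math.sqrt(ab_t)
        mean = math.sqrt(ab_prev) * x0 + dir_coef * e
        xp = mean + sigma * n
        x_prev.append(xp)
        # log N(xp; mean, sigma^2), summed over the feature dimension.
        log_prob += -0.5 * ((xp - mean) / sigma) ** 2 - math.log(sigma) - 0.5 * math.log(2 * math.pi)
    # Entropy of a D-dimensional isotropic Gaussian with std sigma (assumed definition).
    entropy = len(eps) * (0.5 * math.log(2 * math.pi * math.e) + math.log(sigma))
    return x_prev, log_prob, entropy


x_prev, lp, ent = ddim_step_with_log_prob(
    [0.1, -0.2], [0.5, 0.3], [0.0, 0.0], ab_t=0.5, ab_prev=0.8
)
print(len(x_prev))
```

In this sketch the final step (ab_prev = 1.0) makes sigma zero and raises; how the real method treats that step is not stated in this reference.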
reset_pndm
reset_pndm() -> None
Source: ll_stepnet/stepnet/diffusion.py:261
Reset the PNDM multi-step buffer before a new sampling run.
pndm_step
pndm_step(model_output: torch.Tensor, timestep: int, sample: torch.Tensor) -> torch.Tensor
Source: ll_stepnet/stepnet/diffusion.py:265
Pseudo Numerical Methods for Diffusion Models (PNDM/PLMS) reverse step.
Uses a linear multi-step method (up to 4th order) for faster inference with fewer function evaluations.
Args:
model_output: Predicted noise [B, D].
timestep: Current timestep index.
sample: Current noisy sample [B, D].
Returns: Denoised sample, [B, D].
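The "linear multi-step method (up to 4th order)" typically means blending the most recent noise predictions with Adams-Bashforth weights, as in the usual PLMS formulation. The sketch below shows only that blending step, on plain lists. That stepnet uses these exact coefficients is an assumption, and combine_noise_history is a hypothetical name.

```python
from typing import List

# Adams-Bashforth weights by order, newest prediction first.
_AB_WEIGHTS = {
    1: [1.0],
    2: [3 / 2, -1 / 2],
    3: [23 / 12, -16 / 12, 5 / 12],
    4: [55 / 24, -59 / 24, 37 / 24, -9 / 24],
}


def combine_noise_history(history: List[List[float]]) -> List[float]:
    """Blend up to the four most recent noise predictions (newest last)."""
    recent = history[-4:][::-1]  # reorder so the newest comes first
    weights = _AB_WEIGHTS[len(recent)]
    dim = len(recent[0])
    return [sum(w * e[i] for w, e in zip(weights, recent)) for i in range(dim)]


print(combine_noise_history([[0.0], [2.0]]))  # prints [3.0]
```

The order grows as the buffer fills, which is why a reset such as reset_pndm is needed before each new sampling run: stale predictions from a previous trajectory would otherwise leak into the first steps.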
class DataConfig
Source: ll_stepnet/stepnet/config.py:122
Configuration for data loading.
Methods
__init__
__init__(data_dir: str = 'data', train_split: str = 'train', val_split: str = 'val', test_split: str = 'test', max_length: int = 2048, use_topology: bool = True, num_workers: int = 4) -> None
Source: ll_stepnet/stepnet/config.py
class DiffusionConfig
Source: ll_stepnet/stepnet/config.py:185
Configuration for the diffusion-based CAD denoiser.
Methods
__init__
__init__(num_timesteps: int = 1000, beta_start: float = 0.0001, beta_end: float = 0.02, inference_steps: int = 200, denoiser_layers: int = 12, denoiser_heads: int = DEFAULT_DENOISER_HEADS, denoiser_hidden_dim: int = DEFAULT_DENOISER_HIDDEN_DIM, latent_dim: int = 256, num_faces: int = 8, num_edges: int = 12, uv_grid_size: int = 8, edge_num_points: int = 12, codec_hidden_dim: int = 256) -> None
Source: ll_stepnet/stepnet/config.py
class DiffusionTrainer
Source: ll_stepnet/stepnet/training/diffusion_trainer.py:30
Trainer for denoising diffusion probabilistic models on CAD data.
Implements the DDPM training procedure:
- Sample random timesteps for each item in the batch
- Add noise according to the noise schedule at those timesteps
- Train the model to predict the added noise
- Maintain an EMA copy of the model for generation
Args:
model: Denoising model that takes (noisy_input, timestep) and predicts noise.
scheduler: Noise scheduler with add_noise() and step() methods, providing the beta schedule and noise levels for each timestep.
train_dataloader: DataLoader for training data.
val_dataloader: Optional DataLoader for validation data.
device: Device string. 'auto' selects CUDA if available, else CPU.
checkpoint_dir: Directory path for saving checkpoints and samples.
ema_decay: Decay rate for exponential moving average (default 0.9999).
Methods
__init__
__init__(model: nn.Module, scheduler: Any, train_dataloader: DataLoader, val_dataloader: Optional[DataLoader] = None, device: str = 'auto', checkpoint_dir: Optional[str] = None, ema_decay: float = 0.9999) -> None
Source: ll_stepnet/stepnet/training/diffusion_trainer.py:50
train_epoch
train_epoch() -> Dict[str, float]
Source: ll_stepnet/stepnet/training/diffusion_trainer.py:231
Train for one epoch of denoising diffusion.
For each batch:
- Sample random timesteps uniformly
- Sample Gaussian noise
- Create noisy versions of the input
- Predict the noise with the model
- Compute MSE between predicted and actual noise
- Update EMA model
Returns: Dictionary with keys: ‘loss’, ‘noise_mse’.
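Two of the per-batch steps above, the noise-prediction MSE and the EMA update, reduce to short formulas. The sketch below shows them on flat lists of floats. The update rule ema = decay * ema + (1 - decay) * param is the standard form and is assumed to match the trainer; both function names are hypothetical.

```python
from typing import List


def noise_mse(predicted: List[float], actual: List[float]) -> float:
    """Mean squared error between predicted and actual noise."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


def ema_update(ema_params: List[float], params: List[float], decay: float = 0.9999) -> List[float]:
    """Return decay * ema + (1 - decay) * param for each parameter."""
    return [decay * e + (1.0 - decay) * p for e, p in zip(ema_params, params)]


loss = noise_mse([1.0, 3.0], [0.0, 1.0])
ema = ema_update([0.0, 0.0], [1.0, 2.0], decay=0.9)
print(loss)  # prints 2.5
```

With the default decay of 0.9999 the EMA weights move very slowly, so they average over roughly the last ten thousand optimizer steps.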
validate
validate() -> Dict[str, float]
Source: ll_stepnet/stepnet/training/diffusion_trainer.py:334
Run validation on the diffusion model.
Computes noise prediction MSE on the validation set using both the training model and the EMA model.
Returns: Dictionary with keys: ‘val_loss’, ‘val_noise_mse’, ‘ema_val_loss’.
sample_and_visualize
sample_and_visualize(num_samples: int, epoch: int) -> None
Source: ll_stepnet/stepnet/training/diffusion_trainer.py:396
Generate samples using the EMA model and save visualizations.
Runs the full reverse diffusion process starting from pure noise and saves the resulting samples and a visualization to the checkpoint directory.
Args:
num_samples: Number of samples to generate.
epoch: Current epoch number for labeling the output.
train
train(num_epochs: int, save_every: int = 1) -> None
Source: ll_stepnet/stepnet/training/diffusion_trainer.py:504
Train for multiple epochs with EMA updates and periodic sampling.
Orchestrates the full diffusion training loop:
- Per-epoch noise prediction training
- EMA model updates each step
- Validation and sample generation
- Checkpointing
Args:
num_epochs: Total number of epochs to train.
save_every: Save a checkpoint and generate samples every N epochs.
save_checkpoint
save_checkpoint(filename: str) -> None
Source: ll_stepnet/stepnet/training/diffusion_trainer.py:571
Save model and EMA model checkpoint to disk.
Args: filename: Name of the checkpoint file.
load_checkpoint
load_checkpoint(filename: str) -> None
Source: ll_stepnet/stepnet/training/diffusion_trainer.py:596
Load model and EMA model checkpoint from disk.
Args: filename: Name of the checkpoint file to load.
save_history
save_history() -> None
Source: ll_stepnet/stepnet/training/diffusion_trainer.py:623
Save training history to a JSON file in the checkpoint directory.
class DisentangledCodebooks(nn.Module)
Source: ll_stepnet/stepnet/vqvae.py:269
SkexGen’s three-codebook disentangled quantization system.
Maintains separate codebooks for three aspects of a CAD model:
- Topology codebook: encodes curve type sequences (e.g. line, arc, spline ordering in a sketch profile).
- Geometry codebook: encodes 2D point positions on a 64x64 quantized grid.
- Extrusion codebook: encodes 3D extrusion operations (direction, depth, taper, boolean type).
Each input feature stream is projected to code_dim, quantized
through its respective codebook, then decoded back. The model
produces 10 total codes per CAD model split across the three
codebooks (3 topology + 4 geometry + 3 extrusion by default).
Args:
topology_codes: Number of entries in the topology codebook.
geometry_codes: Number of entries in the geometry codebook.
extrusion_codes: Number of entries in the extrusion codebook.
code_dim: Dimensionality shared by all codebook vectors.
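Quantizing "through a codebook" usually means replacing each projected vector with the index of its nearest codebook entry. The sketch below shows that lookup for one stream with a tiny 2-D codebook in plain Python. Using squared Euclidean distance is the standard VQ-VAE choice and is assumed here; the function name and toy values are illustrative only.

```python
from typing import List, Sequence


def quantize(vector: Sequence[float], codebook: Sequence[Sequence[float]]) -> int:
    """Return the index of the codebook entry nearest to vector (squared L2)."""

    def sq_dist(entry: Sequence[float]) -> float:
        return sum((v - c) ** 2 for v, c in zip(vector, entry))

    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
codes: List[int] = [quantize(v, codebook) for v in ([0.9, 1.2], [0.1, -0.2], [0.2, 1.9])]
print(codes)  # prints [1, 0, 2]
```

In the three-codebook setup, the same lookup would run once per stream against that stream's own table, so topology, geometry, and extrusion indices never share a vocabulary.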
Methods
__init__
__init__(topology_codes: int = 500, geometry_codes: int = 1000, extrusion_codes: int = 1000, code_dim: int = 256) -> None
Source: ll_stepnet/stepnet/vqvae.py:299
set_epoch
set_epoch(epoch: int) -> None
Source: ll_stepnet/stepnet/vqvae.py:388
Propagate epoch counter to all child codebooks.
Args: epoch: Current training epoch (0-indexed).
encode
encode(sketch_features: torch.Tensor, geometry_features: torch.Tensor, extrusion_features: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor, torch.Tensor]
Source: ll_stepnet/stepnet/vqvae.py:411
Encode feature streams into discrete codebook indices.
Each feature tensor is projected, reshaped into a short code sequence, then quantized through the corresponding codebook.
Args:
sketch_features: Topology/sketch features of shape
(batch, code_dim).
geometry_features: Geometry features of shape
(batch, code_dim).
extrusion_features: Extrusion features of shape
(batch, code_dim).
Returns:
A tuple of (topology_codes, geometry_codes, extrusion_codes)
where each is a LongTensor of codebook indices with shape
(batch, num_codes_for_that_stream).
decode
decode(topology_codes: torch.Tensor, geometry_codes: torch.Tensor, extrusion_codes: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor, torch.Tensor]
Source: ll_stepnet/stepnet/vqvae.py:475
Decode codebook indices back to reconstructed feature vectors.
Args:
topology_codes: (batch, TOPOLOGY_NUM_CODES) LongTensor
of topology codebook indices.
geometry_codes: (batch, GEOMETRY_NUM_CODES) LongTensor
of geometry codebook indices.
extrusion_codes: (batch, EXTRUSION_NUM_CODES) LongTensor
of extrusion codebook indices.
Returns:
A tuple of (topo_features, geom_features, extr_features)
each of shape (batch, code_dim).
decode_quantized
decode_quantized() -> tuple[torch.Tensor, torch.Tensor, torch.Tensor]
Source: ll_stepnet/stepnet/vqvae.py:513
Decode using the cached quantized outputs from the last encode.
This uses the straight-through quantized tensors (with gradients)
from the most recent encode call, which is needed during
training to allow gradient flow through the VQ bottleneck.
Returns:
A tuple of (topo_features, geom_features, extr_features)
each of shape (batch, code_dim).
Raises:
RuntimeError: If called before encode.
class GANTrainer
Source: ll_stepnet/stepnet/training/gan_trainer.py:28
Trainer for Wasserstein GAN with gradient penalty in the latent space.
Trains a generator and discriminator (critic) to produce realistic latent vectors that can be decoded by a pre-trained VAE decoder into valid CAD token sequences.
The WGAN-GP formulation provides:
- Wasserstein distance as a meaningful training signal
- Gradient penalty that stabilizes training and reduces the risk of mode collapse
- Alternating critic/generator updates with configurable ratio
Args:
generator: Generator network mapping noise -> latent vectors.
discriminator: Discriminator (critic) network scoring latent vectors.
train_dataloader: DataLoader providing real latent vectors for training.
device: Device string. 'auto' selects CUDA if available, else CPU.
checkpoint_dir: Directory path for saving checkpoints.
gp_lambda: Gradient penalty coefficient (default 10.0 per WGAN-GP paper).
n_critic: Number of critic updates per generator update (default 5).
lr_gen: Learning rate for the generator optimizer.
lr_disc: Learning rate for the discriminator optimizer.
Methods
__init__
__init__(generator: nn.Module, discriminator: nn.Module, train_dataloader: DataLoader, device: str = 'auto', checkpoint_dir: Optional[str] = None, gp_lambda: float = 10.0, n_critic: int = 5, lr_gen: float = 0.0001, lr_disc: float = 0.0001) -> None
Source: ll_stepnet/stepnet/training/gan_trainer.py:52
train_discriminator_step
train_discriminator_step(real_latents: torch.Tensor) -> Dict[str, float]
Source: ll_stepnet/stepnet/training/gan_trainer.py:202
Perform one discriminator (critic) training step.
Computes the WGAN loss with gradient penalty: D_loss = D(fake) - D(real) + lambda * GP
Args: real_latents: Real latent vectors, shape (batch, latent_dim).
Returns: Dictionary with keys: ‘d_loss’, ‘gp_loss’, ‘wasserstein_dist’.
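The critic loss D_loss = D(fake) - D(real) + lambda * GP can be worked through by hand with a toy linear critic D(x) = w . x. Its gradient with respect to the input is w everywhere, so the penalty (||grad|| - 1)^2 needs no autograd or interpolation. This is an illustration under that simplification. Reporting 'gp_loss' unscaled and 'wasserstein_dist' as D(real) - D(fake) are assumptions about the returned keys.

```python
import math
from typing import Dict, List


def critic(w: List[float], x: List[float]) -> float:
    """Toy linear critic D(x) = w . x, whose input gradient is w itself."""
    return sum(wi * xi for wi, xi in zip(w, x))


def wgan_gp_critic_loss(
    w: List[float],
    real: List[List[float]],
    fake: List[List[float]],
    gp_lambda: float = 10.0,
) -> Dict[str, float]:
    d_real = sum(critic(w, x) for x in real) / len(real)
    d_fake = sum(critic(w, x) for x in fake) / len(fake)
    # For a linear critic, ||grad_x D|| equals ||w|| at every interpolate.
    grad_norm = math.sqrt(sum(wi * wi for wi in w))
    gp = (grad_norm - 1.0) ** 2
    return {
        "d_loss": d_fake - d_real + gp_lambda * gp,
        "gp_loss": gp,
        "wasserstein_dist": d_real - d_fake,
    }


metrics = wgan_gp_critic_loss([3.0, 4.0], real=[[1.0, 0.0]], fake=[[0.0, 0.0]])
print(metrics["d_loss"])  # prints 157.0
```

Here ||w|| = 5, so the penalty term (5 - 1)^2 * 10 = 160 dominates the loss and pushes the critic back towards unit gradient norm.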
train_generator_step
train_generator_step(batch_size: int) -> Dict[str, float]
Source: ll_stepnet/stepnet/training/gan_trainer.py:246
Perform one generator training step.
Generates fake latents and optimizes the generator to fool the critic: G_loss = -D(G(z))
Args: batch_size: Number of samples to generate.
Returns: Dictionary with key: ‘g_loss’.
train_epoch
train_epoch() -> Dict[str, float]
Source: ll_stepnet/stepnet/training/gan_trainer.py:275
Train for one epoch with alternating critic/generator updates.
Performs n_critic discriminator updates for every 1 generator update, following the WGAN-GP training protocol.
Returns: Dictionary with keys: ‘d_loss’, ‘g_loss’, ‘gp_loss’, ‘wasserstein_dist’.
validate
validate() -> Dict[str, float]
Source: ll_stepnet/stepnet/training/gan_trainer.py:362
Compute validation metrics for the GAN.
Evaluates generation quality using FID-style metrics:
- Mean and std difference between generated and real latent distributions
- Approximate FID score from distribution moments
- Discriminator accuracy on real vs fake
Returns: Dictionary with validation metrics including ‘fid_approx’, ‘mean_diff’, ‘std_diff’, ‘disc_accuracy’.
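One plausible reading of "approximate FID from distribution moments" is the Fréchet distance between two Gaussians with diagonal covariance, which reduces to the squared mean difference plus the squared standard-deviation difference. The sketch below computes that on lists of latent vectors. Whether stepnet uses the diagonal form, and how it defines 'mean_diff' and 'std_diff', is an assumption here.

```python
import math
from typing import Dict, List, Tuple


def _mean_std(rows: List[List[float]]) -> Tuple[List[float], List[float]]:
    """Per-dimension mean and population standard deviation."""
    n, dim = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(dim)]
    std = [math.sqrt(sum((r[j] - mean[j]) ** 2 for r in rows) / n) for j in range(dim)]
    return mean, std


def moment_metrics(real: List[List[float]], fake: List[List[float]]) -> Dict[str, float]:
    mu_r, sd_r = _mean_std(real)
    mu_f, sd_f = _mean_std(fake)
    mean_sq = sum((a - b) ** 2 for a, b in zip(mu_r, mu_f))
    std_sq = sum((a - b) ** 2 for a, b in zip(sd_r, sd_f))
    return {
        "mean_diff": math.sqrt(mean_sq),
        "std_diff": math.sqrt(std_sq),
        # Frechet distance between two diagonal Gaussians.
        "fid_approx": mean_sq + std_sq,
    }


stats = moment_metrics(real=[[0.0, 0.0], [2.0, 2.0]], fake=[[1.0, 1.0], [3.0, 3.0]])
print(stats["fid_approx"])  # prints 2.0
```

The two sets in the example have the same spread and differ only by a shift of 1 in each dimension, so the whole score comes from the mean term.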
sample
sample(num_samples: int) -> torch.Tensor
Source: ll_stepnet/stepnet/training/gan_trainer.py:444
Generate latent vectors using the trained generator.
Args: num_samples: Number of latent vectors to generate.
Returns: Tensor of generated latent vectors, shape (num_samples, latent_dim).
train
train(num_epochs: int, save_every: int = 1) -> None
Source: ll_stepnet/stepnet/training/gan_trainer.py:466
Train for multiple epochs.
Orchestrates the full GAN training loop with checkpointing and periodic validation.
Args:
num_epochs: Total number of epochs to train.
save_every: Save a checkpoint every N epochs.
save_checkpoint
save_checkpoint(filename: str) -> None
Source: ll_stepnet/stepnet/training/gan_trainer.py:527
Save generator and discriminator checkpoint to disk.
Args: filename: Name of the checkpoint file.
load_checkpoint
load_checkpoint(filename: str) -> None
Source: ll_stepnet/stepnet/training/gan_trainer.py:552
Load generator and discriminator checkpoint from disk.
Args: filename: Name of the checkpoint file to load.
save_history
save_history() -> None
Source: ll_stepnet/stepnet/training/gan_trainer.py:587
Save training history to a JSON file in the checkpoint directory.
class GeoTokenCollator
Source: ll_stepnet/stepnet/data.py:520
Collate function for batching GeoTokenDataset samples.
Handles variable-length graph and constraint token sequences by padding to the maximum length in the batch.
Args: pad_token_id: Token ID for padding. Default 0.
Methods
__init__
__init__(pad_token_id: int = 0) -> None
Source: ll_stepnet/stepnet/data.py:530
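Padding to the longest sequence in a batch can be sketched in plain Python as follows. The helper name `pad_batch` and the returned validity mask are assumptions for illustration; the collator's actual output keys are not documented here.

```python
def pad_batch(sequences, pad_token_id=0):
    """Right-pad variable-length ID lists to the longest one in the batch.

    Returns the padded rows plus a 1/0 mask marking real vs padded slots.
    """
    max_len = max((len(s) for s in sequences), default=0)
    padded = [list(s) + [pad_token_id] * (max_len - len(s)) for s in sequences]
    mask = [[1] * len(s) + [0] * (max_len - len(s)) for s in sequences]
    return padded, mask
```

For example, `pad_batch([[5, 6, 7], [8]])` pads the second row to length 3 and marks its last two positions as padding.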
class GeoTokenDataset(Dataset)
Source: ll_stepnet/stepnet/data.py:281
PyTorch Dataset for geotoken TokenSequence objects.
Consumes geotoken TokenSequences directly — no format conversion needed because ll_stepnet’s CommandType enum and PARAMETER_MASKS match geotoken’s natively.
The integer index of each geotoken CommandType enum member maps directly to ll_stepnet’s CommandType IntEnum: SOL=0, LINE=1, ARC=2, CIRCLE=3, EXTRUDE=4, EOS=5
Each item is a dictionary containing:
- command_types: [seq_len] integer command type IDs
- parameters: [seq_len, NUM_PARAM_SLOTS] quantized parameter values
- parameter_mask: [seq_len, NUM_PARAM_SLOTS] boolean active-parameter mask
- attention_mask: [seq_len] validity mask (1=real, 0=padding)
When encode_graph_tokens=True and the TokenSequence has graph
tokens, the item also contains:
- graph_token_ids: [variable] integer IDs from CADVocabulary
When encode_constraint_tokens=True and the TokenSequence has
constraint tokens, the item also contains:
- constraint_token_ids: [variable] integer IDs from CADVocabulary
Args: token_sequences: List of geotoken TokenSequence objects. max_commands: Maximum command sequence length (pad/truncate). labels: Optional labels for supervised learning. encode_graph_tokens: Encode graph tokens via CADVocabulary. Default False. encode_constraint_tokens: Encode constraint tokens via CADVocabulary. Default False.
Methods
__init__
__init__(token_sequences: List, max_commands: int = DEFAULT_MAX_SEQ_LEN, labels: Optional[List] = None, encode_graph_tokens: bool = False, encode_constraint_tokens: bool = False) -> None
Source: ll_stepnet/stepnet/data.py:318
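To make the command-type mapping and the pad/truncate behaviour concrete, here is a minimal sketch. The enum values come from the mapping listed above. The helper `fit_to_length` and its choice of EOS as the pad value are assumptions, since the documentation does not say which value fills padded positions.

```python
from enum import IntEnum


class CommandType(IntEnum):
    SOL = 0
    LINE = 1
    ARC = 2
    CIRCLE = 3
    EXTRUDE = 4
    EOS = 5


def fit_to_length(commands, max_commands, pad_value=int(CommandType.EOS)):
    """Truncate or right-pad command IDs to exactly max_commands entries.

    pad_value is an assumed choice; the attention mask is what tells
    real positions (1) from padding (0).
    """
    ids = [int(c) for c in commands][:max_commands]
    real = len(ids)
    ids += [pad_value] * (max_commands - real)
    mask = [1] * real + [0] * (max_commands - real)
    return ids, mask
```

Because the enum is an `IntEnum`, a member looked up by name (for example `CommandType["ARC"]`) compares equal to its integer ID, which is why no conversion table is needed.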
class ImageConditioner(nn.Module)
Source: ll_stepnet/stepnet/conditioning.py:332
Condition CAD generation on rendered images.
Wraps a frozen DINOv2 or CLIP vision encoder, projects patch embeddings to the decoder dimension, and applies AdaptiveLayer blocks for cross-attention injection.
Args: encoder_name: Hugging Face model identifier (e.g. “facebook/dinov2-base”). conditioning_dim: Dimension of the conditioning embeddings. freeze_encoder: Whether to freeze the vision encoder weights. num_adaptive_layers: Number of AdaptiveLayer blocks. skip_cross_attention_blocks: Number of initial decoder blocks that skip cross-attention (Text2CAD default = 2).
Methods
__init__
__init__(encoder_name: str = 'facebook/dinov2-base', conditioning_dim: int = 1024, freeze_encoder: bool = True, num_adaptive_layers: int = 1, skip_cross_attention_blocks: int = 2) -> None
Source: ll_stepnet/stepnet/conditioning.py:349
encode_image
encode_image(pixel_values: torch.Tensor) -> torch.Tensor
Source: ll_stepnet/stepnet/conditioning.py:438
Encode pixel values into conditioning embeddings.
Args: pixel_values: [B, C, H, W] preprocessed image tensors.
Returns: Conditioning embeddings [B, N, conditioning_dim] where N is the number of patch tokens.
forward
forward(hidden_states: torch.Tensor, pixel_values: torch.Tensor, block_index: Optional[int] = None) -> torch.Tensor
Source: ll_stepnet/stepnet/conditioning.py:461
Condition decoder hidden states on image features.
Early decoder blocks skip cross-attention (same logic as TextConditioner).
Args: hidden_states: [B, S, D] decoder hidden states. pixel_values: [B, C, H, W] preprocessed image tensors. block_index: Optional zero-based decoder block index.
Returns: Conditioned hidden states [B, S, D].
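The skip rule can be pictured as a simple gate on the block index. This is a sketch only: the helper name is hypothetical, and treating `block_index=None` as "always apply cross-attention" is an assumption that the documentation does not confirm.

```python
from typing import Optional


def applies_cross_attention(block_index: Optional[int], skip_blocks: int = 2) -> bool:
    """Decide whether a decoder block should receive conditioning.

    Blocks 0 .. skip_blocks-1 are skipped; later blocks are conditioned.
    """
    if block_index is None:
        return True  # assumed behaviour when no index is supplied
    return block_index >= skip_blocks
```

With the default `skip_cross_attention_blocks=2`, blocks 0 and 1 would pass hidden states through unchanged and conditioning would start at block 2.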
class LatentDiscriminator(nn.Module)
Source: ll_stepnet/stepnet/latent_gan.py:75
Wasserstein critic that scores latent vectors.
Args: latent_dim: Dimensionality of the input latent vector. hidden_dims: Sizes of hidden layers in the MLP.
Methods
__init__
__init__(latent_dim: int = 256, hidden_dims: Optional[List[int]] = None) -> None
Source: ll_stepnet/stepnet/latent_gan.py:83
forward
forward(z: torch.Tensor) -> torch.Tensor
Source: ll_stepnet/stepnet/latent_gan.py:111
Score a latent vector (higher = more real).
Args: z: [batch_size, latent_dim].
Returns: Scalar scores [batch_size, 1].
class LatentGAN
Source: ll_stepnet/stepnet/latent_gan.py:123
WGAN-GP training loop for the VAE latent space.
Manages the generator, discriminator, their optimizers, and the gradient-penalty computation.
Args: latent_dim: Dimensionality of the latent space. gen_hidden_dims: Generator MLP hidden sizes. disc_hidden_dims: Discriminator MLP hidden sizes. gp_lambda: Gradient penalty coefficient. n_critic: Number of discriminator updates per generator update. lr_gen: Generator learning rate. lr_disc: Discriminator learning rate. device: Torch device.
Methods
__init__
__init__(latent_dim: int = 256, gen_hidden_dims: Optional[List[int]] = None, disc_hidden_dims: Optional[List[int]] = None, gp_lambda: float = 10.0, n_critic: int = 5, lr_gen: float = 0.0001, lr_disc: float = 0.0001, device: Optional[torch.device] = None) -> None
Source: ll_stepnet/stepnet/latent_gan.py:140
train_step
train_step(real_latents: torch.Tensor) -> Dict[str, float]
Source: ll_stepnet/stepnet/latent_gan.py:217
One training step: update critic n_critic times, then generator once.
Args: real_latents: [batch_size, latent_dim] latent vectors produced by the VAE encoder on real data.
Returns: Dictionary of loss values for logging:
- disc_loss: latest discriminator loss
- gen_loss: generator loss (0 if not updated this step)
- gp: gradient penalty
- wasserstein_distance: estimated Wasserstein distance
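For orientation, the standard WGAN-GP objectives are written out below, with D the critic, G the generator, z a real latent, and λ corresponding to gp_lambda. This is the textbook formulation; how the logged values above map onto each term in this implementation is an assumption.

```latex
\begin{aligned}
\tilde{z} &= G(\text{noise}), \qquad
\hat{z} = \epsilon\, z + (1-\epsilon)\,\tilde{z}, \quad \epsilon \sim U[0,1] \\
L_D &= \mathbb{E}\big[D(\tilde{z})\big] - \mathbb{E}\big[D(z)\big]
      + \lambda\, \mathbb{E}\Big[\big(\lVert \nabla_{\hat{z}} D(\hat{z}) \rVert_2 - 1\big)^2\Big] \\
L_G &= -\,\mathbb{E}\big[D(\tilde{z})\big] \\
W &\approx \mathbb{E}\big[D(z)\big] - \mathbb{E}\big[D(\tilde{z})\big]
\end{aligned}
```

The critic minimizes L_D for n_critic steps before the generator takes one step on L_G, and W is the usual estimate of the Wasserstein distance.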
sample
sample(num_samples: int = 1, device: Optional[torch.device] = None) -> torch.Tensor
Source: ll_stepnet/stepnet/latent_gan.py:285
Sample latent vectors from the trained generator.
Args: num_samples: Number of samples to generate. device: Target device (defaults to self.device).
Returns: Generated latent vectors [num_samples, latent_dim].
to
to(device: torch.device) -> LatentGAN
Source: ll_stepnet/stepnet/latent_gan.py:308
Move all models and state to the given device.
Args: device: Target torch device.
Returns: Self for chaining.
class LatentGANConfig
Source: ll_stepnet/stepnet/config.py:173
Configuration for the latent-space WGAN-GP.
Methods
__init__
__init__(latent_dim: int = 256, gen_hidden_dims: List[int] = (lambda: [512, 512])(), disc_hidden_dims: List[int] = (lambda: [512, 512])(), gp_lambda: float = 10.0, n_critic: int = 5, lr_gen: float = 0.0001, lr_disc: float = 0.0001) -> None
Source: ll_stepnet/stepnet/config.py
class LatentGenerator(nn.Module)
Source: ll_stepnet/stepnet/latent_gan.py:26
Maps noise vectors to fake latent codes.
Args: latent_dim: Dimension of both input noise and output latent. hidden_dims: Sizes of hidden layers in the MLP.
Methods
__init__
__init__(latent_dim: int = 256, hidden_dims: Optional[List[int]] = None) -> None
Source: ll_stepnet/stepnet/latent_gan.py:34
forward
forward(z_noise: torch.Tensor) -> torch.Tensor
Source: ll_stepnet/stepnet/latent_gan.py:63
Generate fake latent vectors from noise.
Args: z_noise: [batch_size, latent_dim] samples from N(0,I).
Returns: Fake latent vectors [batch_size, latent_dim].
class MultiModalConditioner(nn.Module)
Source: ll_stepnet/stepnet/conditioning.py:494
Fuses text and image conditioning for CAD generation.
Combines TextConditioner and ImageConditioner by concatenating their conditioning embeddings along the sequence dimension before passing through shared AdaptiveLayer blocks.
Args: text_encoder_name: Hugging Face text model identifier. image_encoder_name: Hugging Face vision model identifier. conditioning_dim: Shared conditioning dimension. freeze_encoders: Whether to freeze both pretrained encoders. num_adaptive_layers: Number of shared AdaptiveLayer blocks.
Methods
__init__
__init__(text_encoder_name: str = 'bert-base-uncased', image_encoder_name: str = 'facebook/dinov2-base', conditioning_dim: int = 1024, freeze_encoders: bool = True, num_adaptive_layers: int = 1, skip_cross_attention_blocks: int = 2) -> None
Source: ll_stepnet/stepnet/conditioning.py:509
forward
forward(hidden_states: torch.Tensor, text_input_ids: Optional[torch.Tensor] = None, text_attention_mask: Optional[torch.Tensor] = None, pixel_values: Optional[torch.Tensor] = None, block_index: Optional[int] = None) -> torch.Tensor
Source: ll_stepnet/stepnet/conditioning.py:563
Condition decoder hidden states on text and/or image features.
At least one of text_input_ids or pixel_values must be provided. Early decoder blocks skip cross-attention per Text2CAD design.
Args: hidden_states: [B, S, D] decoder hidden states. text_input_ids: [B, L_text] text token ids (optional). text_attention_mask: [B, L_text] text padding mask. pixel_values: [B, C, H, W] image tensors (optional). block_index: Optional zero-based decoder block index.
Returns: Conditioned hidden states [B, S, D].
class ParameterHeads(nn.Module)
Source: ll_stepnet/stepnet/output_heads.py:90
Sixteen independent linear heads, one per parameter slot.
Each head maps the decoder hidden state to num_levels logits representing the quantised bin for that parameter.
Args: embed_dim: Dimension of the decoder hidden states. num_param_slots: Number of parameter slots (default 16). num_levels: Number of quantisation levels per parameter.
Methods
__init__
__init__(embed_dim: int = 256, num_param_slots: int = NUM_PARAM_SLOTS, num_levels: int = DEFAULT_QUANTIZATION_LEVELS) -> None
Source: ll_stepnet/stepnet/output_heads.py:102
forward
forward(hidden_states: torch.Tensor) -> List[torch.Tensor]
Source: ll_stepnet/stepnet/output_heads.py:115
Compute per-slot parameter logits.
Args: hidden_states: [batch, seq_len, embed_dim]
Returns: List of NUM_PARAM_SLOTS tensors, each [batch, seq_len, num_levels].
class STEPAnnotatedOutput
Source: ll_stepnet/stepnet/annotations.py:103
Combined reserialized output with structural annotations.
Attributes: summary: File-level structural summary (if generated). branches: Per-branch annotations for each root subtree. reserialized_text: The DFS-reserialized entity text. annotated_text: Full output combining summary, branches, and entities.
Methods
format
format() -> str
Source: ll_stepnet/stepnet/annotations.py:118
Format full annotated output.
Combines summary, branch annotations, and reserialized text into a single string separated by newlines.
Returns: Complete annotated output string.
__init__
__init__(summary: Optional[StructuralSummary] = None, branches: list[BranchAnnotation] = list(), reserialized_text: str = '', annotated_text: str = '') -> None
Source: ll_stepnet/stepnet/annotations.py
class STEPAnnotationConfig
Source: ll_stepnet/stepnet/config.py:145
Configuration for structural annotations.
Methods
__init__
__init__(include_file_summary: bool = True, include_branch_annotations: bool = True, max_types_shown: int = 5, verbose: bool = False) -> None
Source: ll_stepnet/stepnet/config.py
class STEPCaptioningConfig
Source: ll_stepnet/stepnet/config.py:83
Configuration for STEP captioning model.
Methods
__init__
__init__(vocab_size: int = 50000, decoder_vocab_size: int = 50000, output_dim: int = 1024, max_caption_length: int = 128) -> None
Source: ll_stepnet/stepnet/config.py
class STEPClassificationConfig
Source: ll_stepnet/stepnet/config.py:65
Configuration for STEP classification model.
Methods
__init__
__init__(vocab_size: int = 50000, num_classes: int = 10, output_dim: int = 1024, dropout: float = 0.1) -> None
Source: ll_stepnet/stepnet/config.py
class STEPCollator
Source: ll_stepnet/stepnet/data.py:150
Collate function for batching STEP data. Handles variable-length sequences and topology graphs.
Methods
__init__
__init__(pad_token_id: int = 0)
Source: ll_stepnet/stepnet/data.py:156
class STEPDFSSerializer
Source: ll_stepnet/stepnet/reserialization.py:181
DFS-based STEP entity serializer.
Reorders STEP entities by depth-first traversal of the reference graph, producing output where related entities appear contiguously. Each entity is expanded exactly once (branch pruning).
Args: config: Reserialization configuration. Uses defaults if None.
Methods
__init__
__init__(config: Optional[STEPReserializationConfig] = None)
Source: ll_stepnet/stepnet/reserialization.py:192
serialize
serialize(graph: STEPEntityGraph) -> STEPReserializedOutput
Source: ll_stepnet/stepnet/reserialization.py:195
Perform DFS reserialization of entity graph.
Algorithm:
- Find roots using configured strategy
- DFS traverse, visiting each entity exactly once
- Append orphans (unreachable entities)
- Optionally renumber IDs sequentially
- Optionally normalize floats
Args: graph: Parsed STEP entity graph.
Returns: STEPReserializedOutput with reserialized text and metadata.
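The traversal part of the algorithm above can be sketched on a plain adjacency dict. The function names and the dict-based graph are illustrative assumptions, not the serializer's internals. Float normalization is left out, and appending orphans in ascending ID order is an assumed choice.

```python
def dfs_order(children, roots):
    """Pre-order DFS over an entity reference graph.

    children maps every entity ID to the IDs it references. Each entity is
    emitted once, so a shared sub-entity appears only under its first parent.
    """
    visited = set()
    order = []
    for root in roots:
        stack = [root]
        while stack:
            node = stack.pop()
            if node in visited:
                continue  # already expanded under an earlier branch
            visited.add(node)
            order.append(node)
            # Reverse so the first-listed child is popped (visited) first.
            stack.extend(reversed(children.get(node, [])))
    # Orphans: entities unreachable from any root.
    order.extend(n for n in sorted(children) if n not in visited)
    return order


def renumber(order):
    """Map old entity IDs to sequential IDs following the new order."""
    return {old: new for new, old in enumerate(order, start=1)}
```

Marking a node as visited when it is popped, not when it is pushed, keeps the output identical to a recursive pre-order walk even when two parents reference the same child.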
class STEPDataset(Dataset)
Source: ll_stepnet/stepnet/data.py:26
PyTorch Dataset for STEP files. Loads and preprocesses STEP files on-the-fly.
Methods
__init__
__init__(file_paths: List[str], labels: Optional[List] = None, tokenizer: Optional[STEPTokenizer] = None, max_length: int = 2048, use_topology: bool = True, use_reserialization: bool = False, use_annotations: bool = False, reserialization_config: Optional[STEPReserializationConfig] = None, annotation_config: Optional[STEPAnnotationConfig] = None)
Source: ll_stepnet/stepnet/data.py:32
Args: file_paths: List of paths to STEP files labels: Optional labels for supervised learning tokenizer: STEPTokenizer instance (creates default if None) max_length: Maximum sequence length use_topology: Whether to build topology graphs use_reserialization: Whether to apply DFS reserialization to entity text use_annotations: Whether to prepend structural annotations reserialization_config: Configuration for DFS reserialization annotation_config: Configuration for structural annotations
class STEPEncoder(nn.Module)
Source: ll_stepnet/stepnet/encoder.py:454
Complete STEP encoder combining all components.
Architecture:
1. Tokenizer: Text → Token IDs (external)
2. Transformer: Token IDs → Sequence features
3. Feature Extractor: Entities → Geometric features (external)
4. Graph Network: Topology → Structural features
5. Fusion: Combine all representations
The graph encoder accepts node features from either source natively:
- cadling’s TopologyGraph: 48-dim features (default)
- ll_stepnet’s STEPTopologyBuilder: 129-dim features (set graph_input_dim=129)
No adapters or conversion needed — pass topology_data directly from cadling’s TopologyGraph or ll_stepnet’s topology builder.
Methods
__init__
__init__(vocab_size: int = 50000, token_embed_dim: int = 256, graph_node_dim: int = 128, graph_input_dim: int = 48, output_dim: int = 1024, num_transformer_layers: int = 6, num_graph_layers: int = 3, expected_feature_dims: Optional[List[int]] = None)
Source: ll_stepnet/stepnet/encoder.py:473
register_feature_projection
register_feature_projection(in_dim: int) -> nn.Linear
Source: ll_stepnet/stepnet/encoder.py:523
Pre-register a projection layer for a given input feature dimension.
Call this before constructing the optimizer to ensure the projection
parameters are included in model.parameters().
Args: in_dim: Input feature dimension to project from.
Returns:
The nn.Linear projection layer (also stored in
self._feature_projs).
load_state_dict
load_state_dict(state_dict, strict = True, assign = False)
Source: ll_stepnet/stepnet/encoder.py:543
Pre-create lazy projection layers found in checkpoint before loading.
_feature_projs entries are created lazily during forward() so a
freshly constructed model has an empty ModuleDict. Without this
override, load_state_dict would either error (strict=True) or
silently drop the learned projection weights (strict=False).
forward
forward(token_ids: torch.Tensor, topology_data: Optional[Dict] = None) -> torch.Tensor
Source: ll_stepnet/stepnet/encoder.py:570
Args:
token_ids: [batch_size, seq_len] from STEPTokenizer
topology_data: Dict with topology from either cadling or ll_stepnet:
- adjacency_matrix: [num_nodes, num_nodes] (dense, sparse COO, or sparse CSR)
- node_features: [num_nodes, feature_dim] (48-dim from cadling TopologyGraph, or 129-dim from STEPTopologyBuilder)
Returns: encoded: [batch_size, output_dim] final encoding
prepare_topology_data
prepare_topology_data(topology_obj) -> Dict[str, torch.Tensor]
Source: ll_stepnet/stepnet/encoder.py:661
Convert a cadling TopologyGraph or raw dict to forward()-ready format.
Accepts either:
- A dict already in forward() format (with adjacency_matrix and node_features tensor values), which is returned as-is after ensuring values are tensors.
- A cadling TopologyGraph object (or any object with to_numpy_node_features() and to_edge_index() methods), from which it extracts numpy arrays, builds a sparse adjacency matrix, and converts to tensors.
The adjacency matrix is returned as a sparse COO tensor to avoid O(N^2) memory on large B-Rep graphs.
This lets callers pass a cadling TopologyGraph directly without writing glue code:
    topo = cadling_item.topology_graph  # cadling TopologyGraph
    out = encoder(token_ids, STEPEncoder.prepare_topology_data(topo))
Args:
topology_obj: Either a dict with adjacency_matrix and
node_features keys, or an object with
to_numpy_node_features() / to_edge_index() methods
(e.g. cadling’s TopologyGraph).
Returns: Dict with adjacency_matrix (sparse COO [N, N]) and node_features [N, D] float tensors ready for forward().
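To show why the COO layout saves memory, the sketch below turns a [2, E] edge index into coordinate/value lists in plain Python. The helper name is hypothetical, and the duplicate-edge handling and the absence of symmetrization are assumptions. In PyTorch the same three pieces would typically be handed to `torch.sparse_coo_tensor`.

```python
def edge_index_to_coo(edge_index, num_nodes):
    """Convert [source_ids, target_ids] into COO components.

    Storage grows with the number of edges E, not with num_nodes ** 2.
    """
    rows, cols = edge_index
    if len(rows) != len(cols):
        raise ValueError("edge_index rows must have equal length")
    seen = set()
    indices = []
    for r, c in zip(rows, cols):
        if not (0 <= r < num_nodes and 0 <= c < num_nodes):
            raise ValueError(f"edge ({r}, {c}) is out of range")
        if (r, c) not in seen:  # assumed: duplicate edges collapse to one
            seen.add((r, c))
            indices.append((r, c))
    return {
        "indices": indices,
        "values": [1.0] * len(indices),
        "size": (num_nodes, num_nodes),
    }
```

A B-Rep graph with 100,000 nodes and a few hundred thousand edges needs only that many entries here, whereas a dense matrix would need 10^10 cells.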
class STEPEncoderConfig
Source: ll_stepnet/stepnet/config.py:52
Configuration for STEP encoder.
Methods
__init__
__init__(vocab_size: int = 50000, token_embed_dim: int = 256, graph_node_dim: int = 128, graph_input_dim: int = 48, output_dim: int = 1024, num_transformer_layers: int = 6, num_graph_layers: int = 3, dropout: float = 0.1) -> None
Source: ll_stepnet/stepnet/config.py
class STEPEntityGraph
Source: ll_stepnet/stepnet/reserialization.py:58
Graph of STEP entities with parent/child relationships.
Parses raw STEP text into an entity reference graph suitable for
DFS traversal. Each entity node stores its forward references
(children) and back references (parents). See also
STEPFeatureExtractor.extract_entity_info and
STEPFeatureExtractor.extract_references in features.py
for similar single-entity parsing — the regex approach used here
is optimised for bulk graph construction.
Methods
parse
parse(step_text: str) -> 'STEPEntityGraph'
Source: ll_stepnet/stepnet/reserialization.py:72
Parse STEP text into entity graph.
Extracts entity definitions and builds parent/child reference graph.
Args: step_text: Raw STEP text containing entity definitions.
Returns: Populated STEPEntityGraph instance.
roots_by_strategy
roots_by_strategy(strategy: str = 'both') -> list[int]
Source: ll_stepnet/stepnet/reserialization.py:127
Find roots using specified strategy.
Strategies:
- no_incoming: Entities with no parent references
- type_hierarchy: Entities with highest B-Rep type weight
- both: Combine and deduplicate, no_incoming first
Args: strategy: Root-finding strategy name.
Returns: Ordered list of root entity IDs.
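The three strategies can be sketched on plain dictionaries as below. The function name, the argument layout, and in particular the `type_weights` table are hypothetical; the package's real B-Rep weights are not listed in this reference.

```python
def find_roots(parents, entity_types, type_weights, strategy="both"):
    """Pick DFS roots from an entity graph.

    parents: entity ID -> list of referencing entity IDs
    entity_types: entity ID -> STEP type name
    type_weights: type name -> weight (illustrative values only)
    """
    no_incoming = [eid for eid in sorted(parents) if not parents[eid]]
    top = max((type_weights.get(t, 0) for t in entity_types.values()), default=0)
    by_type = [
        eid for eid in sorted(entity_types)
        if top > 0 and type_weights.get(entity_types[eid], 0) == top
    ]
    if strategy == "no_incoming":
        return no_incoming
    if strategy == "type_hierarchy":
        return by_type
    if strategy == "both":
        # dict.fromkeys deduplicates while keeping no_incoming entries first.
        return list(dict.fromkeys(no_incoming + by_type))
    raise ValueError(f"unknown strategy: {strategy!r}")
```

Using `dict.fromkeys` for the merge preserves insertion order, which is what gives the "no_incoming first" ordering.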
__init__
__init__(nodes: dict[int, STEPEntityNode] = dict()) -> None
Source: ll_stepnet/stepnet/reserialization.py
class STEPEntityNode
Source: ll_stepnet/stepnet/reserialization.py:47
A single STEP entity with its references.
Methods
__init__
__init__(entity_id: int, entity_type: str, parameters: str, children: list[int] = list(), parents: list[int] = list(), raw_line: str = '') -> None
Source: ll_stepnet/stepnet/reserialization.py
class STEPFeatureExtractor
Source: ll_stepnet/stepnet/features.py:11
Extracts geometric features from tokenized STEP content. Separate from tokenization - operates on parsed entities.
Methods
__init__
__init__()
Source: ll_stepnet/stepnet/features.py:17
Initialize feature extractor with parameter patterns.
extract_entity_info
extract_entity_info(entity_text: str) -> Dict
Source: ll_stepnet/stepnet/features.py:51
Parse a single STEP entity to extract basic info.
Args: entity_text: Single STEP entity string (e.g., “#31=CYLINDER(…);”)
Returns: Dictionary with entity_id, entity_type, parameters
extract_numeric_params
extract_numeric_params(params_text: str) -> List[float]
Source: ll_stepnet/stepnet/features.py:80
Extract all numeric values from parameter string.
Args: params_text: Parameter text from entity
Returns: List of numeric values
extract_references
extract_references(params_text: str) -> List[int]
Source: ll_stepnet/stepnet/features.py:103
Extract entity reference IDs (#123, #456, etc.).
Args: params_text: Parameter text from entity
Returns: List of referenced entity IDs
extract_geometric_features
extract_geometric_features(entity_text: str) -> Dict
Source: ll_stepnet/stepnet/features.py:117
Extract complete geometric features from an entity.
Args: entity_text: STEP entity text
Returns: Dictionary with:
- entity_id: int
- entity_type: str
- numeric_params: List[float]
- references: List[int]
- named_params: Dict (if known pattern)
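A regex-based sketch of this kind of extraction is shown below. The patterns and the function name `extract_features` are assumptions for illustration, not the extractor's actual patterns. `named_params` is omitted because the table of known entity patterns is not documented here, and digits inside quoted strings are not filtered out.

```python
import re

# "#31=CYLINDRICAL_SURFACE(...);" -> id, type, raw parameter text
ENTITY_RE = re.compile(r"#(\d+)\s*=\s*([A-Z0-9_]+)\s*\((.*)\)\s*;", re.DOTALL)
# "#123" style references to other entities
REF_RE = re.compile(r"#(\d+)")
# Numbers such as 2.5, -1.5, 0., 1.E-3; the lookbehind rejects digits that
# belong to a reference ("#30") or sit inside an identifier.
NUM_RE = re.compile(r"(?<![#\w.])[-+]?(?:\d+\.\d*|\.\d+|\d+)(?:[eE][-+]?\d+)?")


def extract_features(entity_text):
    """Parse one STEP entity line into ID, type, numbers and references."""
    match = ENTITY_RE.search(entity_text)
    if match is None:
        raise ValueError("not a STEP entity definition")
    entity_id, entity_type, params = match.groups()
    return {
        "entity_id": int(entity_id),
        "entity_type": entity_type,
        "numeric_params": [float(x) for x in NUM_RE.findall(params)],
        "references": [int(x) for x in REF_RE.findall(params)],
    }
```

Keeping references and numeric values in separate patterns prevents an ID such as `#30` from being counted as the number 30.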
extract_features_from_chunk
extract_features_from_chunk(chunk_text: str) -> List[Dict]
Source: ll_stepnet/stepnet/features.py:152
Extract features from a chunk of STEP text (multiple entities).
Args: chunk_text: Raw STEP text with multiple entities
Returns: List of feature dictionaries
class STEPForCaptioning(nn.Module)
Source: ll_stepnet/stepnet/tasks.py:14
STEP encoder with caption generation head. Predicts: Natural language description of CAD part.
Use case: “This is a mounting bracket with 4 bolt holes”
Methods
__init__
__init__(vocab_size: int = 50000, decoder_vocab_size: int = 50000, output_dim: int = 1024, max_caption_length: int = 128)
Source: ll_stepnet/stepnet/tasks.py:22
forward
forward(token_ids: torch.Tensor, caption_ids: Optional[torch.Tensor] = None, topology_data: Optional[Dict] = None) -> torch.Tensor
Source: ll_stepnet/stepnet/tasks.py:46
Args: token_ids: [batch, seq_len] STEP tokens caption_ids: [batch, caption_len] target captions (for training) topology_data: Optional topology dict
Returns: logits: [batch, caption_len, vocab_size] - caption predictions
generate
generate(token_ids: torch.Tensor, topology_data: Optional[Dict] = None, max_length: int = 64, num_beams: int = 4, temperature: float = 1.0, eos_token_id: int = 2, pad_token_id: int = 0, bos_token_id: int = 1) -> torch.Tensor
Source: ll_stepnet/stepnet/tasks.py:78
Generate captions using beam search decoding.
Args: token_ids: [batch, seq_len] STEP tokens topology_data: Optional topology dict max_length: Maximum caption length to generate num_beams: Number of beams for beam search (1 = greedy) temperature: Sampling temperature (lower = more deterministic) eos_token_id: End of sequence token ID pad_token_id: Padding token ID bos_token_id: Beginning of sequence token ID
Returns: generated_ids: [batch, generated_len] generated caption token IDs
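For readers unfamiliar with beam search, here is a compact single-sequence sketch. It is a generic illustration, not this method's implementation: `step_log_probs` is a hypothetical callback standing in for the decoder, temperature and length normalization are omitted, and counting BOS towards `max_length` is an assumption.

```python
import math


def beam_search(step_log_probs, bos_token_id=1, eos_token_id=2,
                max_length=64, num_beams=4):
    """Keep the num_beams highest-scoring prefixes at every step.

    step_log_probs(prefix) returns one log-probability per vocabulary ID.
    """
    beams = [([bos_token_id], 0.0)]
    finished = []
    for _ in range(max_length - 1):
        candidates = []
        for tokens, score in beams:
            for tok, lp in enumerate(step_log_probs(tokens)):
                candidates.append((tokens + [tok], score + lp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:num_beams]:
            # Hypotheses ending in EOS stop growing; the rest continue.
            (finished if tokens[-1] == eos_token_id else beams).append((tokens, score))
        if not beams:
            break
    finished.extend(beams)  # include hypotheses cut off by max_length
    return max(finished, key=lambda c: c[1])[0]


def toy_model(prefix):
    """Hypothetical decoder: prefers token 3 twice, then EOS (ID 2)."""
    probs = [0.01, 0.01, 0.08, 0.9] if len(prefix) < 3 else [0.01, 0.01, 0.97, 0.01]
    return [math.log(p) for p in probs]
```

Setting `num_beams=1` reduces the loop to greedy decoding, which matches the "1 = greedy" note in the argument list.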
class STEPForCausalLM(nn.Module)
Source: ll_stepnet/stepnet/pretrain.py:62
Autoregressive (GPT-style) token prediction for STEP files. Predicts next token given previous tokens.
STEP-aware architecture:
- Token sequence modeling with causal attention
- Topology/geometry understanding via graph encoder
- Fusion of both modalities
Train on raw STEP files with NO LABELS!
Methods
__init__
__init__(vocab_size: int = 50000, embed_dim: int = 512, num_layers: int = 12, num_heads: int = 8, max_length: int = 4096, dropout: float = 0.1, graph_node_dim: int = 128, num_graph_layers: int = 3)
Source: ll_stepnet/stepnet/pretrain.py:75
forward
forward(input_ids: torch.Tensor, attention_mask: Optional[torch.Tensor] = None, topology_data: Optional[Dict] = None, labels: Optional[torch.Tensor] = None) -> Dict[str, torch.Tensor]
Source: ll_stepnet/stepnet/pretrain.py:118
Args:
input_ids: [batch_size, seq_len] - tokenized STEP content
attention_mask: [batch_size, seq_len] - mask for padding
topology_data: Optional dict with:
- adjacency_matrix: [num_nodes, num_nodes]
- node_features: [num_nodes, feature_dim]
labels: [batch_size, seq_len] - next token targets (optional, for training)
Returns: Dictionary with:
- logits: [batch_size, seq_len, vocab_size]
- loss: scalar (if labels provided)
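If the caller is expected to supply already-shifted targets, they could be built as in the sketch below. This is an assumption: the reference does not say whether the model shifts labels internally, in which case the labels would simply equal input_ids. The value -100 is the default `ignore_index` of PyTorch's cross-entropy loss; whether this model honours it is also an assumption.

```python
def next_token_labels(input_ids, ignore_index=-100):
    """Build targets so that position i is trained to predict token i + 1.

    The final position has no successor and is marked to be ignored.
    """
    return list(input_ids[1:]) + [ignore_index]
```

For a sequence `[5, 6, 7]` this yields `[6, 7, -100]`, keeping labels the same length as the input.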
generate
generate(input_ids: torch.Tensor, max_new_tokens: int = 100, temperature: float = 1.0, top_k: Optional[int] = 50) -> torch.Tensor
Source: ll_stepnet/stepnet/pretrain.py:173
Generate STEP tokens autoregressively.
Args: input_ids: [batch_size, seq_len] - prompt tokens max_new_tokens: How many tokens to generate temperature: Sampling temperature (higher = more random) top_k: Only sample from top K tokens
Returns: Generated token IDs [batch_size, seq_len + max_new_tokens]
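The interaction of temperature and top_k at a single decoding step can be sketched without any framework. The function below is a generic illustration of temperature-scaled top-k sampling, not this method's code, and its name and `rng` parameter are hypothetical.

```python
import math
import random


def sample_top_k(logits, temperature=1.0, top_k=50, rng=random):
    """Sample one token ID from the top_k highest logits.

    Higher temperature flattens the distribution; top_k=None keeps
    the whole vocabulary.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    k = len(scaled) if top_k is None else min(top_k, len(scaled))
    top = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:k]
    peak = max(scaled[i] for i in top)
    # Unnormalized softmax weights; subtracting the peak avoids overflow.
    weights = [math.exp(scaled[i] - peak) for i in top]
    return rng.choices(top, weights=weights, k=1)[0]
```

With `top_k=1` the call always returns the argmax, so generation becomes deterministic regardless of temperature.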
class STEPForClassification(nn.Module)
Source: ll_stepnet/stepnet/tasks.py:193
STEP encoder with classification head. Predicts: Part category.
Use case: “bracket”, “housing”, “shaft”, “gear”, etc.
Methods
__init__
__init__(vocab_size: int = 50000, num_classes: int = 100, output_dim: int = 1024)
Source: ll_stepnet/stepnet/tasks.py:201
forward
forward(token_ids: torch.Tensor, topology_data: Optional[Dict] = None) -> torch.Tensor
Source: ll_stepnet/stepnet/tasks.py:218
Args: token_ids: [batch, seq_len] topology_data: Optional topology
Returns: logits: [batch, num_classes]
class STEPForHybridLM(nn.Module)
Source: ll_stepnet/stepnet/pretrain.py:335
Hybrid model combining causal and masked prediction. Best of both worlds for pre-training.
Both objectives share a single graph encoder so topology understanding learned from one objective transfers to the other.
Methods
__init__
__init__(vocab_size: int = 50000, embed_dim: int = 512, num_layers: int = 12, num_heads: int = 8, max_length: int = 4096, dropout: float = 0.1, graph_node_dim: int = 128, num_graph_layers: int = 3)
Source: ll_stepnet/stepnet/pretrain.py:344
forward
forward(input_ids: torch.Tensor, attention_mask: Optional[torch.Tensor] = None, topology_data: Optional[Dict] = None, labels: Optional[torch.Tensor] = None, masked_input_ids: Optional[torch.Tensor] = None, masked_labels: Optional[torch.Tensor] = None) -> Dict[str, torch.Tensor]
Source: ll_stepnet/stepnet/pretrain.py:385
Train both objectives simultaneously with STEP topology awareness.
Args: input_ids: For causal LM attention_mask: Attention mask topology_data: STEP topology (adjacency + node features) labels: Next token labels for causal LM masked_input_ids: For masked LM masked_labels: Original tokens for masked LM
class STEPForMaskedLM(nn.Module)
Source: ll_stepnet/stepnet/pretrain.py:221
Masked language modeling (BERT-style) for STEP files. Predict masked tokens from context.
STEP-aware architecture:
- Token sequence modeling with bidirectional attention (can use STEPTransformerEncoder!)
- Topology/geometry understanding via graph encoder
- Fusion of both modalities
Train on raw STEP files with NO LABELS!
Methods
__init__
__init__(vocab_size: int = 50000, embed_dim: int = 512, num_layers: int = 12, num_heads: int = 8, max_length: int = 4096, dropout: float = 0.1, graph_node_dim: int = 128, num_graph_layers: int = 3)
Source: ll_stepnet/stepnet/pretrain.py:234
forward
forward(input_ids: torch.Tensor, attention_mask: Optional[torch.Tensor] = None, topology_data: Optional[Dict] = None, labels: Optional[torch.Tensor] = None) -> Dict[str, torch.Tensor]
Source: ll_stepnet/stepnet/pretrain.py:286
Args:
input_ids: [batch_size, seq_len] - tokenized STEP with [MASK] tokens
attention_mask: [batch_size, seq_len] - mask for padding
topology_data: Optional dict with:
- adjacency_matrix: [num_nodes, num_nodes]
- node_features: [num_nodes, feature_dim]
labels: [batch_size, seq_len] - original tokens (before masking)
Returns: Dictionary with:
- logits: [batch_size, seq_len, vocab_size]
- loss: scalar (if labels provided)
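Preparing masked inputs and labels might look like the sketch below. The 15% masking rate follows common BERT practice and is not taken from this package. Marking unmasked positions with -100 so they are excluded from the loss is likewise an assumption; the labels may instead be the full original sequence, as the argument description suggests.

```python
import random


def mask_tokens(token_ids, mask_token_id, mask_prob=0.15,
                ignore_index=-100, rng=None):
    """Randomly replace tokens with the mask ID and record the originals.

    Returns (masked_ids, labels); labels hold the original token at masked
    positions and ignore_index elsewhere.
    """
    rng = rng or random.Random()
    masked, labels = [], []
    for tok in token_ids:
        if rng.random() < mask_prob:
            masked.append(mask_token_id)
            labels.append(tok)
        else:
            masked.append(tok)
            labels.append(ignore_index)
    return masked, labels
```

Passing a seeded `random.Random` makes the masking reproducible across runs.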
class STEPForPropertyPrediction(nn.Module)
Source: ll_stepnet/stepnet/tasks.py:236
STEP encoder with regression head. Predicts: Physical properties (volume, mass, surface area, etc.)
Use case: Predict part weight, bounding box dimensions, etc.
Methods
__init__
__init__(vocab_size: int = 50000, num_properties: int = 10, output_dim: int = 1024)
Source: ll_stepnet/stepnet/tasks.py:244
forward
forward(token_ids: torch.Tensor, topology_data: Optional[Dict] = None) -> torch.Tensor
Source: ll_stepnet/stepnet/tasks.py:261
Args: token_ids: [batch, seq_len] topology_data: Optional topology
Returns: properties: [batch, num_properties]
class STEPForQA(nn.Module)
Source: ll_stepnet/stepnet/tasks.py:325
STEP encoder for question answering. Predicts: Answer to questions about the CAD part.
Use case: Q: “How many holes does this part have?” A: “4”
Methods
__init__
__init__(step_vocab_size: int = 50000, text_vocab_size: int = 50000, output_dim: int = 1024)
Source: ll_stepnet/stepnet/tasks.py:335
forward
forward(step_token_ids: torch.Tensor, question_token_ids: torch.Tensor, answer_token_ids: Optional[torch.Tensor] = None, topology_data: Optional[Dict] = None) -> torch.Tensor
Source: ll_stepnet/stepnet/tasks.py:362
Args: step_token_ids: [batch, step_seq_len] question_token_ids: [batch, q_seq_len] answer_token_ids: [batch, a_seq_len] (for training) topology_data: Optional
Returns: logits: [batch, a_seq_len, vocab_size]
generate
generate(step_token_ids: torch.Tensor, question_token_ids: torch.Tensor, topology_data: Optional[Dict] = None, max_length: int = 64, num_beams: int = 4, temperature: float = 1.0, eos_token_id: int = 2, pad_token_id: int = 0, bos_token_id: int = 1) -> torch.Tensor
Source: ll_stepnet/stepnet/tasks.py:402
Generate answers using beam search decoding.
Args: step_token_ids: [batch, step_seq_len] STEP tokens question_token_ids: [batch, q_seq_len] question tokens topology_data: Optional topology dict max_length: Maximum answer length to generate num_beams: Number of beams for beam search (1 = greedy) temperature: Sampling temperature (lower = more deterministic) eos_token_id: End of sequence token ID pad_token_id: Padding token ID bos_token_id: Beginning of sequence token ID
Returns: generated_ids: [batch, generated_len] generated answer token IDs
class STEPForSimilarity(nn.Module)
Section titled “class STEPForSimilarity(nn.Module)”Source: ll_stepnet/stepnet/tasks.py:279
STEP encoder for similarity/retrieval tasks. Predicts: Embedding for similar part search.
Use case: “Find similar CAD parts in database”
Methods
__init__
Section titled “__init__”__init__(vocab_size: int = 50000, embedding_dim: int = 512)Source: ll_stepnet/stepnet/tasks.py:287
forward
Section titled “forward”forward(token_ids: torch.Tensor, topology_data: Optional[Dict] = None) -> torch.TensorSource: ll_stepnet/stepnet/tasks.py:303
Args: token_ids: [batch, seq_len] topology_data: Optional topology
Returns: embeddings: [batch, embedding_dim] - L2 normalized
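Because the returned embeddings are L2 normalized, a plain dot product between two of them equals their cosine similarity. The following is a minimal, library-independent sketch of how such embeddings might be ranked for retrieval; `l2_normalize` and `rank_by_similarity` are hypothetical helper names, and plain Python lists stand in for tensors.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return list(vec) if norm == 0.0 else [x / norm for x in vec]


def rank_by_similarity(query, database):
    """Return database indices sorted by descending dot product with the query."""
    scores = [sum(q * d for q, d in zip(query, item)) for item in database]
    return sorted(range(len(database)), key=lambda i: scores[i], reverse=True)


# Toy 3-dim "embeddings"; a real model would emit embedding_dim values per part.
query = l2_normalize([1.0, 0.0, 1.0])
database = [
    l2_normalize(v)
    for v in ([0.0, 1.0, 0.0], [2.0, 0.0, 2.0], [1.0, 1.0, 0.0])
]
ranking = rank_by_similarity(query, database)
```

With unit-norm vectors the scores fall in [-1, 1], so no extra normalization is needed at query time.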
class STEPGraphEncoder(nn.Module)
Section titled “class STEPGraphEncoder(nn.Module)”Source: ll_stepnet/stepnet/encoder.py:350
Graph neural network for STEP/B-Rep topology.
Processes entity reference graphs from either:
- ll_stepnet’s STEPTopologyBuilder (129-dim features: 128 numeric + 1 hash)
- cadling’s TopologyGraph (48-dim features, native format)
The input_dim parameter controls which format is accepted. When
working with cadling data, set input_dim=48 to accept cadling’s
native topology features directly with no conversion.
Methods
__init__
Section titled “__init__”__init__(input_dim: int = 48, node_dim: int = 128, edge_dim: int = 64, num_layers: int = 3)Source: ll_stepnet/stepnet/encoder.py:363
forward
Section titled “forward”forward(node_features: torch.Tensor, adjacency_matrix: torch.Tensor) -> torch.TensorSource: ll_stepnet/stepnet/encoder.py:397
Args: node_features: [num_nodes, input_dim] adjacency_matrix: [num_nodes, num_nodes] — dense, sparse COO, or sparse CSR. Sparse inputs avoid O(N^2) memory for large B-Rep graphs.
Returns: updated_features: [num_nodes, node_dim]
class STEPLearningCurveGenerator
Section titled “class STEPLearningCurveGenerator”Source: ll_stepnet/stepnet/data_requirements.py:128
Generate learning curves for STEP models to determine data requirements.
This class trains models on varying dataset sizes and measures performance to establish empirical scaling relationships.
Methods
__init__
Section titled “__init__”__init__(model_class: type, model_kwargs: Dict, train_kwargs: Optional[Dict] = None, device: str = 'cuda' if torch.cuda.is_available() else 'cpu')Source: ll_stepnet/stepnet/data_requirements.py:136
Args: model_class: STEP model class (e.g., STEPForClassification) model_kwargs: Model initialization arguments train_kwargs: Training configuration (epochs, lr, etc.) device: Device to train on
generate_learning_curve
Section titled “generate_learning_curve”generate_learning_curve(train_dataset: STEPDataset, val_dataset: STEPDataset, sample_fractions: List[float], n_iterations: int = 3, save_dir: Optional[str] = None) -> Dict[str, np.ndarray]Source: ll_stepnet/stepnet/data_requirements.py:170
Generate learning curve by training on varying dataset sizes.
Args: train_dataset: Full training dataset val_dataset: Validation dataset sample_fractions: List of fractions of training data to use (e.g., [0.1, 0.2, 0.5, 1.0]) n_iterations: Number of training runs per sample size save_dir: Optional directory to save checkpoints
Returns: Dictionary with learning curve data
class STEPPropertyPredictionConfig
Section titled “class STEPPropertyPredictionConfig”Source: ll_stepnet/stepnet/config.py:74
Configuration for STEP property prediction model.
Methods
__init__
Section titled “__init__”__init__(vocab_size: int = 50000, num_properties: int = 6, output_dim: int = 1024, dropout: float = 0.1) -> NoneSource: ll_stepnet/stepnet/config.py
class STEPQAConfig
Section titled “class STEPQAConfig”Source: ll_stepnet/stepnet/config.py:99
Configuration for STEP question answering model.
Methods
__init__
Section titled “__init__”__init__(step_vocab_size: int = 50000, text_vocab_size: int = 50000, output_dim: int = 1024) -> NoneSource: ll_stepnet/stepnet/config.py
class STEPReserializationConfig
Section titled “class STEPReserializationConfig”Source: ll_stepnet/stepnet/config.py:134
Configuration for DFS reserialization of STEP files.
Methods
__init__
Section titled “__init__”__init__(max_depth: int = 50, float_precision: int = 6, normalize_floats: bool = True, renumber_ids: bool = True, root_strategy: str = 'both', include_orphans: bool = True) -> NoneSource: ll_stepnet/stepnet/config.py
class STEPReserializedOutput
Section titled “class STEPReserializedOutput”Source: ll_stepnet/stepnet/reserialization.py:170
Output of DFS reserialization.
Methods
__init__
Section titled “__init__”__init__(text: str, traversal_order: list[tuple[int, int]], entity_count: int, orphan_count: int, max_depth_reached: int, id_mapping: dict[int, int] = dict()) -> NoneSource: ll_stepnet/stepnet/reserialization.py
class STEPScalingLawAnalyzer
Section titled “class STEPScalingLawAnalyzer”Source: ll_stepnet/stepnet/data_requirements.py:416
Analyze scaling laws for STEP models and predict data requirements.
Fits power law relationships to learning curve data and extrapolates to estimate required dataset sizes for target performance.
Methods
__init__
Section titled “__init__”__init__()Source: ll_stepnet/stepnet/data_requirements.py:424
fit_power_law
Section titled “fit_power_law”fit_power_law(sample_sizes: np.ndarray, losses: np.ndarray, law_type: str = 'openai') -> Dict[str, float]Source: ll_stepnet/stepnet/data_requirements.py:428
Fit power law to learning curve data.
Args: sample_sizes: Array of dataset sizes losses: Corresponding validation losses law_type: ‘openai’ for L(D) = (D_c/D)^alpha_D or ‘standard’ for Error = a*n^(-b) + c
Returns: Dictionary of fitted parameters
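For the 'openai' form L(D) = (D_c/D)^alpha_D, taking logarithms gives log L = alpha_D * (log D_c - log D), which is linear in log D. The sketch below fits that line with ordinary least squares. It only illustrates the functional form: the fitting procedure actually used by fit_power_law is not documented here, and the function name and the dictionary keys `alpha_D` and `D_c` are assumptions.

```python
import math


def fit_openai_power_law(sample_sizes, losses):
    """Least-squares fit of log L = alpha_D * (log D_c - log D)."""
    xs = [math.log(d) for d in sample_sizes]
    ys = [math.log(loss) for loss in losses]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                  # equals -alpha_D
    intercept = mean_y - slope * mean_x  # equals alpha_D * log D_c
    alpha = -slope
    d_c = math.exp(intercept / alpha)
    return {"alpha_D": alpha, "D_c": d_c}


# Synthetic, noise-free curve generated from known parameters.
sizes = [100, 1000, 10000]
losses = [(400.0 / d) ** 0.5 for d in sizes]
params = fit_openai_power_law(sizes, losses)
```

On real learning curves the losses are noisy, so the recovered parameters are estimates rather than exact values.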
predict_required_samples
Section titled “predict_required_samples”predict_required_samples(target_loss: float, current_sizes: Optional[np.ndarray] = None, current_losses: Optional[np.ndarray] = None) -> intSource: ll_stepnet/stepnet/data_requirements.py:501
Predict number of samples needed to achieve target loss.
Args: target_loss: Desired target loss current_sizes: Optional array of current dataset sizes (for fitting) current_losses: Optional array of current losses (for fitting)
Returns: Estimated required sample size
extrapolate_performance
Section titled “extrapolate_performance”extrapolate_performance(target_size: int, current_sizes: Optional[np.ndarray] = None, current_losses: Optional[np.ndarray] = None) -> floatSource: ll_stepnet/stepnet/data_requirements.py:559
Predict performance at a given dataset size.
Args: target_size: Dataset size to predict for current_sizes: Optional current dataset sizes (for fitting) current_losses: Optional current losses (for fitting)
Returns: Predicted loss at target_size
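Assuming the 'openai' law L(D) = (D_c/D)^alpha_D has already been fitted, the two predictions above reduce to evaluating the law and inverting it: D = D_c * L^(-1/alpha_D). A minimal sketch follows; the function names and the explicit `alpha` and `d_c` arguments are illustrative, not the library's signatures.

```python
import math


def predicted_loss(target_size, alpha, d_c):
    """Evaluate L(D) = (D_c / D) ** alpha at D = target_size."""
    return (d_c / target_size) ** alpha


def required_samples(target_loss, alpha, d_c):
    """Invert the power law and round up to a whole number of samples."""
    return math.ceil(d_c * target_loss ** (-1.0 / alpha))


# Hypothetical fitted parameters, used only for illustration.
alpha, d_c = 0.5, 400.0
n_needed = required_samples(0.5, alpha, d_c)
loss_at_n = predicted_loss(n_needed, alpha, d_c)
```

Extrapolating far beyond the observed dataset sizes assumes the same power law keeps holding, which is worth checking against new measurements.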
class STEPSimilarityConfig
Section titled “class STEPSimilarityConfig”Source: ll_stepnet/stepnet/config.py:92
Configuration for STEP similarity model.
Methods
__init__
Section titled “__init__”__init__(vocab_size: int = 50000, embedding_dim: int = 512) -> NoneSource: ll_stepnet/stepnet/config.py
class STEPStructuralAnnotator
Section titled “class STEPStructuralAnnotator”Source: ll_stepnet/stepnet/annotations.py:139
Generates structural annotations for STEP entity graphs.
Analyzes DFS-reserialized output and the underlying entity graph to produce human/LLM-readable summaries of the file structure.
Args: config: Annotation configuration. Uses defaults if None.
Methods
__init__
Section titled “__init__”__init__(config: Optional[STEPAnnotationConfig] = None)Source: ll_stepnet/stepnet/annotations.py:149
annotate
Section titled “annotate”annotate(graph: STEPEntityGraph, reserialized: STEPReserializedOutput) -> STEPAnnotatedOutputSource: ll_stepnet/stepnet/annotations.py:152
Generate annotations for reserialized output.
Args: graph: The entity graph (pre-reserialization). reserialized: The DFS reserialization output.
Returns: STEPAnnotatedOutput with summary, branch annotations, and full text.
class STEPTokenizer
Section titled “class STEPTokenizer”Source: ll_stepnet/stepnet/tokenizer.py:14
Standard tokenizer for STEP files. Only handles text → token IDs conversion. No feature extraction or graph building.
Methods
__init__
Section titled “__init__”__init__(vocab_size: int = 50000, config: Optional[STEPTokenizerConfig] = None)Source: ll_stepnet/stepnet/tokenizer.py:21
Args: vocab_size: Maximum vocabulary size config: Optional STEPTokenizerConfig instance. When provided, its values override the vocab_size and max_length parameters.
tokenize
Section titled “tokenize”tokenize(text: str) -> List[str]Source: ll_stepnet/stepnet/tokenizer.py:95
Split STEP text into tokens.
Args: text: Raw STEP text
Returns: List of token strings
encode
Section titled “encode”encode(text: str) -> List[int]Source: ll_stepnet/stepnet/tokenizer.py:110
Encode STEP text to token IDs.
Args: text: Raw STEP text
Returns: List of token IDs
decode
Section titled “decode”decode(token_ids: List[int]) -> strSource: ll_stepnet/stepnet/tokenizer.py:132
Decode token IDs back to text (approximate).
Args: token_ids: List of token IDs
Returns: Decoded text
batch_encode
Section titled “batch_encode”batch_encode(texts: List[str], add_special_tokens: bool = True) -> Dict[str, List[List[int]]]Source: ll_stepnet/stepnet/tokenizer.py:151
Batch encode multiple texts.
Args: texts: List of STEP text strings add_special_tokens: Add CLS and SEP tokens
Returns: Dictionary with token_ids
class STEPTopologyBuilder
Section titled “class STEPTopologyBuilder”Source: ll_stepnet/stepnet/topology.py:18
Builds topological graphs from STEP entity relationships. Separate from tokenization and feature extraction.
Methods
__init__
Section titled “__init__”__init__()Source: ll_stepnet/stepnet/topology.py:24
Initialize topology builder.
build_reference_graph
Section titled “build_reference_graph”build_reference_graph(features_list: List[Dict]) -> DictSource: ll_stepnet/stepnet/topology.py:28
Build entity reference graph from extracted features.
Args: features_list: List of feature dicts from STEPFeatureExtractor
Returns: Dictionary with:
- adjacency_dict: Dict[int, List[int]] - entity_id → referenced_ids
- edge_list: List[Tuple[int, int]] - list of (from, to) edges
- num_nodes: int
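As a rough illustration of the returned structure, the sketch below builds the three documented keys from feature dicts that carry 'entity_id' and 'references'. The function name is hypothetical, and dropping references to unknown entity IDs is an assumption made for the sketch, not documented behavior.

```python
def sketch_reference_graph(features_list):
    """Build adjacency_dict, edge_list and num_nodes from feature dicts."""
    known_ids = {feat["entity_id"] for feat in features_list}
    adjacency_dict = {}
    edge_list = []
    for feat in features_list:
        src = feat["entity_id"]
        # Assumption: references to entities missing from the list are ignored.
        targets = [t for t in feat.get("references", []) if t in known_ids]
        adjacency_dict[src] = targets
        edge_list.extend((src, t) for t in targets)
    return {
        "adjacency_dict": adjacency_dict,
        "edge_list": edge_list,
        "num_nodes": len(known_ids),
    }


features = [
    {"entity_id": 1, "references": [2, 3, 99]},  # 99 is a dangling reference
    {"entity_id": 2, "references": [3]},
    {"entity_id": 3, "references": []},
]
graph = sketch_reference_graph(features)
```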
build_adjacency_matrix
Section titled “build_adjacency_matrix”build_adjacency_matrix(reference_graph: Dict) -> torch.TensorSource: ll_stepnet/stepnet/topology.py:74
Convert reference graph to sparse adjacency matrix.
Returns a sparse COO tensor to avoid O(N^2) memory on large B-Rep graphs.
Args: reference_graph: Output from build_reference_graph
Returns: Sparse COO adjacency matrix [N, N] where N = num_nodes
build_edge_index
Section titled “build_edge_index”build_edge_index(reference_graph: Dict) -> torch.TensorSource: ll_stepnet/stepnet/topology.py:114
Build edge index in PyTorch Geometric format.
Args: reference_graph: Output from build_reference_graph
Returns: Edge index tensor [2, num_edges] for PyG
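The [2, num_edges] layout stores source node indices in the first row and target node indices in the second. A minimal sketch using nested lists in place of a tensor is shown here; the function name and the way entity IDs are mapped to contiguous indices are assumptions.

```python
def edge_index_rows(edge_list, node_ids):
    """Map (from, to) entity-ID pairs to a two-row index layout."""
    # Assumption: node indices follow the order of node_ids.
    id_to_idx = {node_id: idx for idx, node_id in enumerate(node_ids)}
    sources = [id_to_idx[src] for src, _ in edge_list]
    targets = [id_to_idx[dst] for _, dst in edge_list]
    return [sources, targets]


rows = edge_index_rows([(10, 20), (10, 30), (20, 30)], [10, 20, 30])
```

Wrapping `rows` in a long-integer tensor would give the shape a PyG-style consumer expects.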
compute_node_degrees
Section titled “compute_node_degrees”compute_node_degrees(reference_graph: Dict) -> Dict[int, Dict[str, int]]Source: ll_stepnet/stepnet/topology.py:140
Compute in-degree and out-degree for each node.
Args: reference_graph: Output from build_reference_graph
Returns: Dict mapping node_id → {‘in_degree’: int, ‘out_degree’: int}
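The degree counts follow directly from the edge list: each (from, to) edge adds one to the out-degree of its source and one to the in-degree of its target. A small sketch with a hypothetical function name, taking the edge list and node IDs directly rather than the full reference graph:

```python
def node_degrees(edge_list, node_ids):
    """Count incoming and outgoing edges per node."""
    degrees = {n: {"in_degree": 0, "out_degree": 0} for n in node_ids}
    for src, dst in edge_list:
        degrees[src]["out_degree"] += 1
        degrees[dst]["in_degree"] += 1
    return degrees


degrees = node_degrees([(1, 2), (1, 3), (2, 3)], [1, 2, 3])
```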
identify_topology_types
Section titled “identify_topology_types”identify_topology_types(features_list: List[Dict]) -> Dict[str, List[int]]Source: ll_stepnet/stepnet/topology.py:158
Categorize entities by topological role.
Args: features_list: List of feature dicts
Returns: Dict mapping category → list of entity IDs
build_node_features
Section titled “build_node_features”build_node_features(features_list: List[Dict], reference_graph: Dict) -> torch.TensorSource: ll_stepnet/stepnet/topology.py:200
Build node feature matrix from extracted features.
Args: features_list: List of feature dicts from STEPFeatureExtractor reference_graph: Output from build_reference_graph
Returns: Node features tensor [num_nodes, feature_dim]
build_complete_topology
Section titled “build_complete_topology”build_complete_topology(features_list: List[Dict], compact: bool = True) -> DictSource: ll_stepnet/stepnet/topology.py:242
Build complete topology representation.
Args:
features_list: List of feature dicts from STEPFeatureExtractor.
compact: If True (default), use build_compact_node_features()
to produce 48-dim features in cadling’s native layout. This
matches the default input_dim=48 of
:class:STEPGraphEncoder. Pass compact=False to use the
legacy build_node_features() (129-dim: 128 numeric + 1
type hash).
Returns: Complete topology dictionary with:
- reference_graph
- adjacency_matrix
- edge_index
- node_degrees
- topology_types
- node_features
- num_nodes
- num_edges
build_coedge_structure
Section titled “build_coedge_structure”build_coedge_structure(features_list: List[Dict]) -> DictSource: ll_stepnet/stepnet/topology.py:288
Build coedge adjacency structure from STEP topology.
In B-Rep topology, each topological edge is shared by (at most) two adjacent faces. Each such sharing creates two oriented coedges — one per face. This method reconstructs the coedge-level graph with next/prev/mate pointers from the STEP entity hierarchy.
The coedge structure is the primary input format for the BRepNet architecture (see cadling.models.segmentation.architectures.brep_net).
Args: features_list: List of feature dicts from STEPFeatureExtractor. Each dict should have ‘entity_id’, ‘entity_type’, ‘references’, and optionally ‘numeric_params’.
Returns: Dictionary with:
- coedge_features: torch.Tensor [num_coedges, feature_dim]
- next_indices: torch.Tensor [num_coedges]
- prev_indices: torch.Tensor [num_coedges]
- mate_indices: torch.Tensor [num_coedges]
- face_indices: torch.Tensor [num_coedges]
- num_coedges: int
- num_faces: int
- face_entity_ids: List[int] - entity IDs of faces
- edge_entity_ids: List[int] - entity IDs of edges
build_compact_node_features
Section titled “build_compact_node_features”build_compact_node_features(features_list: List[Dict], reference_graph: Optional[Dict] = None, feature_dim: int = 48) -> torch.TensorSource: ll_stepnet/stepnet/topology.py:540
Build node features in cadling-compatible compact format.
Produces node features in the same 48-dim layout used by cadling’s TopologyGraph, so both ll_stepnet and cadling feed the same native representation into STEPGraphEncoder (default input_dim=48) and geotoken’s GraphTokenizer.
Feature layout (48 dims):
- [0:32]: first 32 numeric parameters (zero-padded)
- [32:48]: entity type one-hot (16 common B-Rep types)
Args: features_list: Feature dicts from STEPFeatureExtractor. reference_graph: Optional reference graph from build_reference_graph(). If provided, uses its num_nodes and id_to_idx to ensure node_features shape matches adjacency_matrix. If None, uses len(features_list) as num_nodes (legacy behavior). feature_dim: Output feature dimension (default 48).
Returns: torch.Tensor of shape [num_nodes, feature_dim].
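The 48-dim layout described above can be sketched for a single node as follows. The `TYPE_VOCAB` list is a placeholder: the 16 entity types the library actually encodes are not listed here, so the four names below are assumptions, as is the choice to leave the one-hot block all zeros for unknown types.

```python
NUM_NUMERIC = 32  # slots [0:32]
NUM_TYPES = 16    # slots [32:48]

# Hypothetical vocabulary; the real list of 16 B-Rep types lives in the library.
TYPE_VOCAB = ["ADVANCED_FACE", "EDGE_CURVE", "VERTEX_POINT", "CARTESIAN_POINT"]


def compact_features(feat):
    """Return a 48-element list: padded numeric params plus type one-hot."""
    params = list(feat.get("numeric_params", []))[:NUM_NUMERIC]
    numeric = params + [0.0] * (NUM_NUMERIC - len(params))
    one_hot = [0.0] * NUM_TYPES
    entity_type = feat.get("entity_type")
    if entity_type in TYPE_VOCAB:
        one_hot[TYPE_VOCAB.index(entity_type)] = 1.0
    return numeric + one_hot


vec = compact_features({"entity_type": "EDGE_CURVE", "numeric_params": [1.5, 2.5]})
```

Stacking one such row per node gives the [num_nodes, 48] matrix that matches the default input_dim=48 of STEPGraphEncoder.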
to_cadling_topology_graph
Section titled “to_cadling_topology_graph”to_cadling_topology_graph(topo_dict: Dict)Source: ll_stepnet/stepnet/topology.py:622
Convert a build_complete_topology() output dict to a cadling TopologyGraph.
This closes the round-trip: cadling → ll_stepnet → cadling.
The cadling TopologyGraph is constructed from the adjacency
matrix and node features stored in topo_dict.
Args:
topo_dict: Dictionary returned by :meth:build_complete_topology.
Must contain adjacency_matrix (tensor or array) and
node_features (tensor or array).
Returns:
A cadling.datamodel.base_models.TopologyGraph instance.
Raises: ImportError: If cadling is not installed.
class STEPTrainer
Section titled “class STEPTrainer”Source: ll_stepnet/stepnet/trainer.py:39
Trainer for STEP models. Handles training loop, validation, and checkpointing.
Methods
__init__
Section titled “__init__”__init__(model: nn.Module, train_dataloader: DataLoader, val_dataloader: Optional[DataLoader] = None, optimizer: Optional[torch.optim.Optimizer] = None, loss_fn: Optional[Callable] = None, device: str = 'auto', checkpoint_dir: Optional[str] = None)Source: ll_stepnet/stepnet/trainer.py:45
Args: model: STEP model to train train_dataloader: Training data loader val_dataloader: Optional validation data loader optimizer: Optimizer (creates AdamW if None) loss_fn: Loss function (creates based on task if None) device: Device to train on checkpoint_dir: Directory to save checkpoints
train_epoch
Section titled “train_epoch”train_epoch() -> floatSource: ll_stepnet/stepnet/trainer.py:213
Train for one epoch.
Returns: Average training loss
validate
Section titled “validate”validate() -> Dict[str, float]Source: ll_stepnet/stepnet/trainer.py:249
Run validation.
Returns: Dictionary with validation metrics
train
train(num_epochs: int, save_every: int = 1)
Source: ll_stepnet/stepnet/trainer.py:285
Train for multiple epochs.
Args: num_epochs: Number of epochs to train save_every: Save checkpoint every N epochs
save_checkpoint
Section titled “save_checkpoint”save_checkpoint(filename: str)Source: ll_stepnet/stepnet/trainer.py:333
Save model checkpoint.
load_checkpoint
Section titled “load_checkpoint”load_checkpoint(filename: str)Source: ll_stepnet/stepnet/trainer.py:347
Load model checkpoint.
save_history
Section titled “save_history”save_history()Source: ll_stepnet/stepnet/trainer.py:361
Save training history to JSON.
class STEPTransformerDecoder(nn.Module)
Section titled “class STEPTransformerDecoder(nn.Module)”Source: ll_stepnet/stepnet/encoder.py:112
Transformer decoder for STEP token sequences with causal attention. Used for autoregressive generation (GPT-style).
Supports optional conditioning via a conditioner module (TextConditioner, ImageConditioner, or MultiModalConditioner) that applies cross-attention to inject text/image features. Following Text2CAD, the first N blocks can skip cross-attention via the conditioner’s skip_cross_attention_blocks.
Methods
__init__
Section titled “__init__”__init__(vocab_size: int = 50000, embed_dim: int = 256, num_heads: int = 8, num_layers: int = 6, ff_dim: int = 1024, dropout: float = 0.1)Source: ll_stepnet/stepnet/encoder.py:123
forward
Section titled “forward”forward(token_ids: torch.Tensor, attention_mask: Optional[torch.Tensor] = None, cross_attention_memory: Optional[torch.Tensor] = None, conditioner: Optional[nn.Module] = None, conditioning_inputs: Optional[Dict] = None) -> torch.TensorSource: ll_stepnet/stepnet/encoder.py:226
Args:
token_ids: [batch_size, seq_len]
attention_mask: [batch_size, seq_len] optional
cross_attention_memory: [batch_size, mem_len, embed_dim] optional. When provided, used as the memory (key/value) for the decoder cross-attention layers. When None, the decoder uses self-attention only (existing behaviour).
conditioner: Optional conditioning module (TextConditioner, ImageConditioner, or MultiModalConditioner). When provided, applies cross-attention conditioning after each decoder block.
conditioning_inputs: Dict with conditioning data for the conditioner:
- text_input_ids: [B, L] text token ids
- text_attention_mask: [B, L] text padding mask
- pixel_values: [B, C, H, W] image tensors
Returns: encoded: [batch_size, seq_len, embed_dim]
class STEPTransformerEncoder(nn.Module)
Section titled “class STEPTransformerEncoder(nn.Module)”Source: ll_stepnet/stepnet/encoder.py:38
Transformer encoder for STEP token sequences. Standard transformer architecture.
Methods
__init__
Section titled “__init__”__init__(vocab_size: int = 50000, embed_dim: int = 256, num_heads: int = 8, num_layers: int = 6, ff_dim: int = 1024, dropout: float = 0.1)Source: ll_stepnet/stepnet/encoder.py:44
forward
Section titled “forward”forward(token_ids: torch.Tensor, attention_mask: Optional[torch.Tensor] = None) -> torch.TensorSource: ll_stepnet/stepnet/encoder.py:78
Args: token_ids: [batch_size, seq_len] attention_mask: [batch_size, seq_len] optional
Returns: encoded: [batch_size, seq_len, embed_dim]
class STEPVAE(nn.Module)
Section titled “class STEPVAE(nn.Module)”Source: ll_stepnet/stepnet/vae.py:35
Variational Autoencoder wrapping existing STEP encoder/decoder.
Follows the DeepCAD architecture: sequences of CAD command tokens are encoded into a Gaussian latent, then decoded autoregressively back to command-type and parameter predictions.
Args: encoder_config: Configuration object with vocab_size, token_embed_dim, num_transformer_layers, dropout, etc. latent_dim: Dimensionality of the latent vector z. kl_weight: Maximum weight applied to the KL divergence term. num_command_types: Number of distinct CAD command types. num_param_levels: Number of quantisation levels per parameter. max_seq_len: Maximum sequence length the decoder can produce.
Methods
__init__
Section titled “__init__”__init__(encoder_config, latent_dim: int = DEFAULT_QUANTIZATION_LEVELS, kl_weight: float = 1.0, num_command_types: int = NUM_COMMAND_TYPES, num_param_levels: int = DEFAULT_QUANTIZATION_LEVELS, max_seq_len: int = DEFAULT_MAX_SEQ_LEN) -> NoneSource: ll_stepnet/stepnet/vae.py:52
set_epoch
Section titled “set_epoch”set_epoch(epoch: int) -> NoneSource: ll_stepnet/stepnet/vae.py:133
Update current epoch for KL warmup scheduling.
Args: epoch: Current training epoch (0-indexed).
set_kl_warmup_epochs
Section titled “set_kl_warmup_epochs”set_kl_warmup_epochs(warmup_epochs: int) -> NoneSource: ll_stepnet/stepnet/vae.py:141
Set the number of epochs over which beta warms up.
Args: warmup_epochs: Number of warmup epochs.
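KL warmup scales the KL term by a factor beta that grows from 0 to the configured kl_weight over the warmup period. The exact schedule used by STEPVAE is not stated here; the sketch below assumes a linear ramp, and `kl_beta` is a hypothetical helper name.

```python
def kl_beta(epoch, warmup_epochs, kl_weight=1.0):
    """Linear ramp from 0 to kl_weight over warmup_epochs (assumed schedule)."""
    if warmup_epochs <= 0:
        return kl_weight  # no warmup requested: use the full weight at once
    return kl_weight * min(1.0, epoch / warmup_epochs)


schedule = [kl_beta(epoch, warmup_epochs=10) for epoch in (0, 5, 10, 20)]
```

In a training loop, set_epoch(epoch) would be called once per epoch so the model can recompute its current beta.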
encode
Section titled “encode”encode(token_ids: torch.Tensor, attention_mask: Optional[torch.Tensor] = None) -> tuple[torch.Tensor, torch.Tensor]Source: ll_stepnet/stepnet/vae.py:149
Encode token sequence to Gaussian parameters.
Args: token_ids: [batch_size, seq_len] token indices. attention_mask: [batch_size, seq_len] 1=real, 0=pad.
Returns: Tuple of (mu, log_var) each [batch_size, latent_dim].
reparameterize
Section titled “reparameterize”reparameterize(mu: torch.Tensor, log_var: torch.Tensor) -> torch.TensorSource: ll_stepnet/stepnet/vae.py:180
Reparameterization trick: z = mu + eps * exp(0.5 * log_var).
Args: mu: Mean of the posterior, [batch_size, latent_dim]. log_var: Log-variance of the posterior, [batch_size, latent_dim].
Returns: Sampled latent z of shape [batch_size, latent_dim].
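The formula z = mu + eps * exp(0.5 * log_var) draws eps from a standard normal and scales it by the posterior standard deviation, which keeps the sample differentiable with respect to mu and log_var. An element-wise sketch on plain Python lists (no autograd, explicit random generator passed in as an assumption for reproducibility):

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + eps * exp(0.5 * log_var) with eps ~ N(0, 1)."""
    return [
        m + rng.gauss(0.0, 1.0) * math.exp(0.5 * lv)
        for m, lv in zip(mu, log_var)
    ]


rng = random.Random(0)
z = reparameterize([0.0, 1.0], [0.0, -2.0], rng)
```

When log_var is very negative the standard deviation approaches zero and z collapses onto mu.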
decode
Section titled “decode”decode(z: torch.Tensor, seq_len: Optional[int] = None, conditioner: Optional[nn.Module] = None, conditioning_inputs: Optional[Dict] = None) -> torch.TensorSource: ll_stepnet/stepnet/vae.py:198
Decode a latent vector to hidden states.
Optionally applies text/image conditioning via a conditioner module that injects cross-attention. Following Text2CAD, the first N decoder blocks can skip cross-attention (controlled by the conditioner’s skip_cross_attention_blocks parameter).
Args:
z: Latent vector [batch_size, latent_dim].
seq_len: Length of output sequence. Defaults to max_seq_len.
conditioner: Optional conditioning module (TextConditioner, ImageConditioner, or MultiModalConditioner).
conditioning_inputs: Dict with conditioning data for the conditioner:
- text_input_ids: [B, L] text token ids
- text_attention_mask: [B, L] text padding mask
- pixel_values: [B, C, H, W] image tensors
Returns: Hidden states [batch_size, seq_len, embed_dim].
forward
Section titled “forward”forward(token_ids: torch.Tensor, attention_mask: Optional[torch.Tensor] = None, command_targets: Optional[torch.Tensor] = None, param_targets: Optional[torch.Tensor] = None) -> Dict[str, torch.Tensor]Source: ll_stepnet/stepnet/vae.py:307
Full forward pass: encode, reparameterize, decode, compute losses.
Args: token_ids: [batch_size, seq_len] input token ids. attention_mask: [batch_size, seq_len] padding mask. command_targets: [batch_size, seq_len] ground-truth command types. param_targets: [batch_size, seq_len, 16] ground-truth params.
Returns: Dictionary with z, mu, log_var, command_logits, param_logits, kl_loss, and optionally recon_loss and loss.
sample
Section titled “sample”sample(num_samples: int = 1, seq_len: Optional[int] = None, device: Optional[torch.device] = None, conditioner: Optional[nn.Module] = None, conditioning_inputs: Optional[Dict] = None) -> Dict[str, torch.Tensor]Source: ll_stepnet/stepnet/vae.py:396
Sample new CAD sequences from the prior N(0, I).
Optionally applies text/image conditioning via a conditioner module. Following Text2CAD, the first N decoder blocks skip cross-attention to allow initial CAD structure formation before conditioning kicks in.
Args:
num_samples: Number of sequences to generate.
seq_len: Output sequence length. Defaults to max_seq_len.
device: Target device for the generated tensors.
conditioner: Optional conditioning module (TextConditioner, ImageConditioner, or MultiModalConditioner).
conditioning_inputs: Dict with conditioning data:
- text_input_ids: [B, L] text token ids
- text_attention_mask: [B, L] text padding mask
- pixel_values: [B, C, H, W] image tensors
Returns: Dictionary with command_preds [N, S] and param_preds [N, S, 16].
class SinusoidalTimestepEmbedding(nn.Module)
Section titled “class SinusoidalTimestepEmbedding(nn.Module)”Source: ll_stepnet/stepnet/diffusion.py:312
Sinusoidal positional embedding for diffusion timesteps.
Maps integer timesteps to dense vectors using sin/cos functions, then projects through a 2-layer MLP.
Args: embed_dim: Output embedding dimension.
Methods
__init__
Section titled “__init__”__init__(embed_dim: int = 1024) -> NoneSource: ll_stepnet/stepnet/diffusion.py:322
forward
Section titled “forward”forward(timesteps: torch.Tensor) -> torch.TensorSource: ll_stepnet/stepnet/diffusion.py:331
Compute timestep embeddings.
Args: timesteps: Integer timestep indices [B].
Returns: Embeddings [B, embed_dim].
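The sin/cos stage of a timestep embedding can be sketched without any framework. The version below assumes the common formulation with geometrically spaced frequencies, a `max_period` of 10000, an even `embed_dim`, and sine values concatenated before cosine values; none of these details are confirmed for this class, and the 2-layer MLP projection is omitted.

```python
import math


def sinusoidal_embedding(timestep, embed_dim, max_period=10000.0):
    """Return [sin(t * f_0..f_{h-1}), cos(t * f_0..f_{h-1})] with h = embed_dim // 2."""
    half = embed_dim // 2
    # Frequencies decay geometrically from 1 toward 1 / max_period.
    freqs = [math.exp(-math.log(max_period) * i / half) for i in range(half)]
    args = [timestep * f for f in freqs]
    return [math.sin(a) for a in args] + [math.cos(a) for a in args]


emb_t0 = sinusoidal_embedding(0, 8)
emb_t1 = sinusoidal_embedding(1, 8)
```

Applying the function to each entry of a [B] batch of timesteps gives the [B, embed_dim] input that the MLP would then project.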
class StreamingCadlingConfig
Section titled “class StreamingCadlingConfig”Source: ll_stepnet/stepnet/config.py:223
Configuration for streaming cadling data into ll_stepnet trainers.
When passed to a streaming trainer’s __init__, the trainer will
lazy-import cadling.data.streaming.CADStreamingDataset and build
a streaming data pipeline from cadling data automatically.
Attributes:
dataset_id: HuggingFace dataset ID or local path to cadling data.
split: Dataset split ("train", "val", "test").
streaming: Whether to use HuggingFace streaming mode.
batch_size: Training batch size.
shuffle: Whether to shuffle the stream.
shuffle_buffer_size: Buffer size for streaming shuffle.
max_samples: Maximum samples to load (None = all).
max_commands: Maximum command sequence length (pad/truncate).
compact_topology: Use 48-dim compact topology (cadling native).
lazy_load_topology: Whether to load topology on-demand.
topology_cache_size: Maximum topologies to cache in memory.
preprocess_fn: Preprocessing function name ('geotoken', 'tokenize', None).
prefetch_factor: Number of batches to prefetch in background.
max_memory_mb: Maximum memory for cached data.
chunk_size: Number of samples per processing chunk.
include_graph_data: Whether to include graph/topology data.
graph_feature_dim: Expected node feature dimension (48 cadling, 129 legacy).
num_workers: Number of dataloader workers.
Methods
__init__
Section titled “__init__”__init__(dataset_id: str = '', split: str = 'train', streaming: bool = True, batch_size: int = 8, shuffle: bool = True, shuffle_buffer_size: int = 10000, max_samples: Optional[int] = None, max_commands: int = DEFAULT_MAX_SEQ_LEN, lazy_load_topology: bool = True, topology_cache_size: int = 1000, preprocess_fn: Optional[str] = None, prefetch_factor: int = 2, max_memory_mb: int = 4096, chunk_size: int = 1000, include_graph_data: bool = False, graph_feature_dim: int = 48, compact_topology: bool = True, num_workers: int = 4) -> NoneSource: ll_stepnet/stepnet/config.py
class StreamingDiffusionTrainer
Section titled “class StreamingDiffusionTrainer”Source: ll_stepnet/stepnet/training/streaming_diffusion_trainer.py:34
Trainer for diffusion models with streaming dataset support.
Combines the step-based training loop from StreamingVAETrainer with the denoising diffusion training from DiffusionTrainer. Key features:
- Step-based scheduling: Uses total_steps instead of epochs
- EMA model maintenance: Updates EMA weights per step for stable generation
- Noise scheduling: Configurable noise schedule with step-based warmup
- Mid-stream checkpointing: Save/resume within a streaming epoch
- Epoch-based shuffle seed: set_epoch() for reproducible shuffling
Usage:
>>> from cadling.data.streaming import CADStreamingDataset, CADStreamingConfig
>>>
>>> config = CADStreamingConfig(
...     dataset_id="latticelabs/deepcad-latents",
...     batch_size=8,
... )
>>> dataset = CADStreamingDataset(config)
>>>
>>> trainer = StreamingDiffusionTrainer(
...     model=diffusion_model,
...     scheduler=noise_scheduler,
...     dataset=dataset,
...     total_steps=100000,
...     warmup_steps=1000,
...     ema_decay=0.9999,
... )
>>> trainer.train()
Args:
model: Denoising model that takes (noisy_input, timestep) and predicts noise.
scheduler: Noise scheduler with add_noise() method, providing the beta schedule and noise levels for each timestep.
dataset: Streaming dataset with __iter__() method.
val_dataset: Optional validation dataset.
total_steps: Total training steps.
warmup_steps: Steps for learning rate warmup.
ema_decay: Decay rate for exponential moving average (default 0.9999).
optimizer: Optional optimizer (creates AdamW if None).
lr_scheduler: Optional LR scheduler.
device: Device string ('auto' selects CUDA if available).
checkpoint_dir: Directory for saving checkpoints.
log_every: Log metrics every N steps.
eval_every: Run validation every N steps.
save_every: Save checkpoint every N steps.
sample_every: Generate samples every N steps.
gradient_accumulation_steps: Accumulate gradients over N steps.
max_grad_norm: Maximum gradient norm for clipping.
learning_rate: Learning rate for optimizer.
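The per-step EMA maintenance controlled by ema_decay can be illustrated with the standard exponential moving average rule, ema = decay * ema + (1 - decay) * param. The sketch below applies it to flat lists of floats; how the trainer actually stores and updates its EMA weights is not shown in this reference, so treat the helper as an assumption.

```python
def ema_update(ema_params, model_params, decay=0.9999):
    """One EMA step: blend current weights into the running average."""
    return [
        decay * ema + (1.0 - decay) * param
        for ema, param in zip(ema_params, model_params)
    ]


ema = [1.0, 2.0]
for step_params in ([0.0, 4.0], [0.0, 4.0]):
    # A small decay is used here only so the effect is visible in two steps.
    ema = ema_update(ema, step_params, decay=0.5)
```

With the default decay of 0.9999 the average moves slowly, which is why EMA weights tend to give smoother samples than the raw training weights.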
Methods
__init__
Section titled “__init__”__init__(model: 'nn.Module', scheduler: Any, dataset: Any = None, val_dataset: Optional[Any] = None, total_steps: int = 100000, warmup_steps: int = 1000, ema_decay: float = 0.9999, optimizer: Optional[Any] = None, lr_scheduler: Optional[Any] = None, device: str = 'auto', checkpoint_dir: Optional[str] = None, log_every: int = 100, eval_every: int = 5000, save_every: int = 10000, sample_every: int = 10000, gradient_accumulation_steps: int = 1, max_grad_norm: float = 1.0, learning_rate: float = 0.0001, dataset_config: Optional[Any] = None) -> NoneSource: ll_stepnet/stepnet/training/streaming_diffusion_trainer.py:87
train_step
Section titled “train_step”train_step(batch: Dict[str, Any]) -> Dict[str, float]Source: ll_stepnet/stepnet/training/streaming_diffusion_trainer.py:338
Execute a single training step.
Args: batch: Training batch.
Returns: Dictionary with step metrics.
validate
Section titled “validate”validate() -> Dict[str, float]Source: ll_stepnet/stepnet/training/streaming_diffusion_trainer.py:398
Run validation on the validation dataset.
Returns: Dictionary with validation metrics.
sample_and_visualize
Section titled “sample_and_visualize”sample_and_visualize(num_samples: int = 4) -> NoneSource: ll_stepnet/stepnet/training/streaming_diffusion_trainer.py:460
Generate samples using the EMA model and save visualizations.
train
train() -> None
Source: ll_stepnet/stepnet/training/streaming_diffusion_trainer.py:514
Run the full training loop.
Iterates through the streaming dataset, training until total_steps is reached. Supports resuming from checkpoints and handles epoch boundaries for streaming datasets.
save_checkpoint
Section titled “save_checkpoint”save_checkpoint(filename: str) -> NoneSource: ll_stepnet/stepnet/training/streaming_diffusion_trainer.py:611
Save training checkpoint.
Args: filename: Checkpoint filename.
load_checkpoint
Section titled “load_checkpoint”load_checkpoint(filename: str) -> NoneSource: ll_stepnet/stepnet/training/streaming_diffusion_trainer.py:639
Load training checkpoint.
Args: filename: Checkpoint filename.
save_history
Section titled “save_history”save_history() -> NoneSource: ll_stepnet/stepnet/training/streaming_diffusion_trainer.py:667
Save training history to JSON.
class StreamingGANTrainer
Section titled “class StreamingGANTrainer”Source: ll_stepnet/stepnet/training/streaming_gan_trainer.py:31
Trainer for WGAN-GP models with streaming dataset support.
Combines the step-based training loop from StreamingVAETrainer with the Wasserstein GAN gradient-penalty (WGAN-GP) objective from GANTrainer. Key features:
- Step-based scheduling: Uses total_steps instead of epochs
- Alternating critic/generator updates: n_critic critic steps per generator step
- Gradient penalty: WGAN-GP for stable training
- Mid-stream checkpointing: Save/resume within a streaming epoch
- Epoch-based shuffle seed: set_epoch() for reproducible shuffling
Usage:
>>> from cadling.data.streaming import CADStreamingDataset, CADStreamingConfig
>>>
>>> config = CADStreamingConfig(
...     dataset_id="latticelabs/deepcad-latents",
...     batch_size=8,
... )
>>> dataset = CADStreamingDataset(config)
>>>
>>> trainer = StreamingGANTrainer(
...     generator=generator_model,
...     critic=critic_model,
...     dataset=dataset,
...     total_steps=100000,
...     n_critic=5,
...     lambda_gp=10.0,
... )
>>> trainer.train()
Args:
generator: Generator network mapping noise -> latent vectors.
critic: Critic (discriminator) network scoring latent vectors.
dataset: Streaming dataset with __iter__() method.
total_steps: Total training steps (generator updates).
warmup_steps: Steps for learning rate warmup.
n_critic: Number of critic updates per generator update (default 5).
lambda_gp: Gradient penalty coefficient (default 10.0 per WGAN-GP paper).
optimizer_gen: Optional generator optimizer.
optimizer_critic: Optional critic optimizer.
device: Device string ('auto' selects CUDA if available).
checkpoint_dir: Directory for saving checkpoints.
log_every: Log metrics every N steps.
eval_every: Run validation every N steps.
save_every: Save checkpoint every N steps.
sample_every: Generate samples every N steps.
max_grad_norm: Maximum gradient norm for clipping.
lr_gen: Learning rate for generator.
lr_critic: Learning rate for critic.
Methods
__init__
Section titled “__init__”__init__(generator: 'nn.Module', critic: 'nn.Module', dataset: Any = None, total_steps: int = 100000, warmup_steps: int = 1000, n_critic: int = 5, lambda_gp: float = 10.0, optimizer_gen: Optional[Any] = None, optimizer_critic: Optional[Any] = None, device: str = 'auto', checkpoint_dir: Optional[str] = None, log_every: int = 100, eval_every: int = 5000, save_every: int = 10000, sample_every: int = 10000, max_grad_norm: float = 1.0, lr_gen: float = 0.0001, lr_critic: float = 0.0001, dataset_config: Optional[Any] = None) -> NoneSource: ll_stepnet/stepnet/training/streaming_gan_trainer.py:83
train_critic_step
Section titled “train_critic_step”train_critic_step(real_latents: 'torch.Tensor') -> Dict[str, float]Source: ll_stepnet/stepnet/training/streaming_gan_trainer.py:328
Perform one critic training step.
Args: real_latents: Real latent vectors.
Returns: Dictionary with critic metrics.
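For orientation, the WGAN-GP critic objective is mean(f(fake)) - mean(f(real)) + lambda_gp * mean((||grad f(x_hat)|| - 1)^2), where x_hat is a random interpolation between a real and a fake sample. The sketch below uses a hypothetical linear critic f(x) = w . x so that the gradient is simply w and no autograd is needed. The real method presumably differentiates the critic network instead, and none of these helper names come from stepnet.

```python
import math
import random

def critic(w, x):
    """Linear toy critic f(x) = w . x (an assumption for illustration)."""
    return sum(wi * xi for wi, xi in zip(w, x))

def critic_grad(w, x):
    """Gradient of the linear critic w.r.t. x: constant and equal to w."""
    return list(w)

def critic_loss(w, real, fake, lambda_gp=10.0, rng=None):
    """WGAN-GP critic loss for the linear toy critic."""
    rng = rng or random.Random(0)
    wasserstein = (sum(critic(w, f) for f in fake) / len(fake)
                   - sum(critic(w, r) for r in real) / len(real))
    penalties = []
    for r, f in zip(real, fake):
        eps = rng.random()
        # Interpolate between a real and a fake sample.
        x_hat = [eps * ri + (1.0 - eps) * fi for ri, fi in zip(r, f)]
        grad = critic_grad(w, x_hat)
        grad_norm = math.sqrt(sum(g * g for g in grad))
        penalties.append((grad_norm - 1.0) ** 2)
    return wasserstein + lambda_gp * sum(penalties) / len(penalties)

loss = critic_loss([3.0, 4.0], real=[[1.0, 0.0]], fake=[[0.0, 0.0]])
print(loss)
```

Here the gradient norm is 5, so the penalty term dominates and pushes the critic towards unit gradient norm.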
train_generator_step
Section titled “train_generator_step”train_generator_step(batch_size: int) -> Dict[str, float]Source: ll_stepnet/stepnet/training/streaming_gan_trainer.py:373
Perform one generator training step.
Args: batch_size: Number of samples to generate.
Returns: Dictionary with generator metrics.
validate
Section titled “validate”validate() -> Dict[str, float]Source: ll_stepnet/stepnet/training/streaming_gan_trainer.py:408
Compute validation metrics for the GAN.
Returns: Dictionary with validation metrics.
sample
Section titled “sample”sample(num_samples: int = 16) -> 'torch.Tensor'Source: ll_stepnet/stepnet/training/streaming_gan_trainer.py:473
Generate latent vectors using the trained generator.
Args: num_samples: Number of samples to generate.
Returns: Generated latent vectors.
train
train() -> None
Source: ll_stepnet/stepnet/training/streaming_gan_trainer.py:488
Run the full training loop.
Iterates through the streaming dataset, training until total_steps is reached. Uses n_critic critic updates per generator update.
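The alternating schedule can be sketched as a generator of update events. This is only an illustration of the counting: it assumes one step equals one generator update preceded by n_critic critic updates, and says nothing about how the real loop draws batches. The name gan_update_schedule is hypothetical.

```python
def gan_update_schedule(total_steps, n_critic=5):
    """Yield (kind, generator_step) update events.

    Each generator step is preceded by n_critic critic updates.
    """
    for step in range(total_steps):
        for _ in range(n_critic):
            yield "critic", step
        yield "generator", step

events = list(gan_update_schedule(total_steps=2, n_critic=3))
kinds = [kind for kind, _ in events]
print(kinds)
```

Under this reading, total_steps=100000 with n_critic=5 implies 500000 critic updates in total.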
save_checkpoint
Section titled “save_checkpoint”save_checkpoint(filename: str) -> NoneSource: ll_stepnet/stepnet/training/streaming_gan_trainer.py:592
Save training checkpoint.
load_checkpoint
Section titled “load_checkpoint”load_checkpoint(filename: str) -> NoneSource: ll_stepnet/stepnet/training/streaming_gan_trainer.py:618
Load training checkpoint.
save_history
Section titled “save_history”save_history() -> NoneSource: ll_stepnet/stepnet/training/streaming_gan_trainer.py:643
Save training history to JSON.
class StreamingVAETrainer
Section titled “class StreamingVAETrainer”Source: ll_stepnet/stepnet/training/streaming_vae_trainer.py:31
Trainer for VAE models with streaming dataset support.
Extends the standard VAETrainer to work with HuggingFace IterableDatasets and CADStreamingDataset. Key differences from epoch-based training:
- Step-based scheduling: Uses total_steps instead of epochs
- KL warmup on global_step: Linear warmup over warmup_steps
- Mid-stream checkpointing: Save/resume within a streaming epoch
- Epoch-based shuffle seed: set_epoch() for reproducible shuffling
Usage:
>>> from cadling.data.streaming import CADStreamingDataset, CADStreamingConfig
>>>
>>> config = CADStreamingConfig(
...     dataset_id="latticelabs/deepcad-sequences",
...     batch_size=8,
... )
>>> dataset = CADStreamingDataset(config)
>>>
>>> trainer = StreamingVAETrainer(
...     model=vae_model,
...     dataset=dataset,
...     total_steps=100000,
...     warmup_steps=5000,
... )
>>> trainer.train()
Args:
model: VAE model with encode(), decode(), and forward() returning (reconstructed, mu, log_var).
dataset: Streaming dataset with __iter__() method.
val_dataset: Optional validation dataset.
total_steps: Total training steps.
warmup_steps: Steps for KL divergence warmup.
optimizer: Optional optimizer (creates AdamW if None).
scheduler: Optional LR scheduler.
device: Device string ('auto' selects CUDA if available).
checkpoint_dir: Directory for saving checkpoints.
log_every: Log metrics every N steps.
eval_every: Run validation every N steps.
save_every: Save checkpoint every N steps.
gradient_accumulation_steps: Accumulate gradients over N steps.
max_grad_norm: Maximum gradient norm for clipping.
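A minimal sketch of the linear KL warmup on global_step, assuming the weight ramps from 0 to 1 over warmup_steps and that the total loss is recon_loss + weight * kl_loss. Both function names are illustrative and not taken from the trainer.

```python
def kl_weight(global_step, warmup_steps=5000):
    """Linear KL weight: 0 at step 0, 1 from warmup_steps onward."""
    if warmup_steps <= 0:
        return 1.0
    return min(1.0, global_step / warmup_steps)

def vae_loss(recon_loss, kl_loss, global_step, warmup_steps=5000):
    """Total loss with the warmed-up KL term."""
    return recon_loss + kl_weight(global_step, warmup_steps) * kl_loss

print(kl_weight(0), kl_weight(2500), kl_weight(10000))  # 0.0 0.5 1.0
```

Because the weight depends only on global_step, resuming from a mid-stream checkpoint continues the ramp where it left off.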
Methods
__init__
Section titled “__init__”__init__(model: 'nn.Module', dataset: Any = None, val_dataset: Optional[Any] = None, total_steps: int = 100000, warmup_steps: int = 5000, optimizer: Optional[Any] = None, scheduler: Optional[Any] = None, device: str = 'auto', checkpoint_dir: Optional[str] = None, log_every: int = 100, eval_every: int = 5000, save_every: int = 10000, gradient_accumulation_steps: int = 1, max_grad_norm: float = 1.0, learning_rate: float = 0.0001, dataset_config: Optional[Any] = None) -> NoneSource: ll_stepnet/stepnet/training/streaming_vae_trainer.py:77
train_step
Section titled “train_step”train_step(batch: Dict[str, Any]) -> Dict[str, float]Source: ll_stepnet/stepnet/training/streaming_vae_trainer.py:296
Execute a single training step.
Args: batch: Training batch.
Returns: Dictionary with step metrics.
validate
Section titled “validate”validate() -> Dict[str, float]Source: ll_stepnet/stepnet/training/streaming_vae_trainer.py:348
Run validation on the validation dataset.
Returns: Dictionary with validation metrics.
train
train() -> None
Source: ll_stepnet/stepnet/training/streaming_vae_trainer.py:406
Run the full training loop.
Iterates through the streaming dataset, training until total_steps is reached. Supports resuming from checkpoints and handles epoch boundaries for streaming datasets.
save_checkpoint
Section titled “save_checkpoint”save_checkpoint(filename: str) -> NoneSource: ll_stepnet/stepnet/training/streaming_vae_trainer.py:511
Save training checkpoint.
Args: filename: Checkpoint filename.
load_checkpoint
Section titled “load_checkpoint”load_checkpoint(filename: str) -> NoneSource: ll_stepnet/stepnet/training/streaming_vae_trainer.py:537
Load training checkpoint.
Args: filename: Checkpoint filename.
save_history
Section titled “save_history”save_history() -> NoneSource: ll_stepnet/stepnet/training/streaming_vae_trainer.py:563
Save training history to JSON.
class StructuralSummary
Section titled “class StructuralSummary”Source: ll_stepnet/stepnet/annotations.py:61
File-level structural summary.
Attributes: total_entities: Total number of entities in the graph. root_count: Number of root entities identified. max_depth: Maximum DFS depth reached. type_distribution: Counts of each entity type. dominant_category: Classified category (B-Rep, Geometry, Assembly, Mixed).
Methods
format
Section titled “format”format(max_types: int = 5) -> strSource: ll_stepnet/stepnet/annotations.py:78
Format summary as text string.
Args: max_types: Maximum number of type counts to include.
Returns: Formatted summary string with [SUMMARY] delimiters.
__init__
Section titled “__init__”__init__(total_entities: int, root_count: int, max_depth: int, type_distribution: dict[str, int] = dict(), dominant_category: str = 'unknown') -> NoneSource: ll_stepnet/stepnet/annotations.py
class StructuredDiffusion(nn.Module)
Section titled “class StructuredDiffusion(nn.Module)”Source: ll_stepnet/stepnet/diffusion.py:601
Four-stage sequential diffusion following BrepGen.
Stages (each with its own CADDenoiser):
1. Face positions
2. Face geometry
3. Edge positions
4. Edge-vertex geometry
Each stage is conditioned on the denoised output of the preceding stage via concatenation.
Args: config: DiffusionConfig with architectural hyperparameters.
Methods
__init__
Section titled “__init__”__init__(config: Optional[object] = None) -> NoneSource: ll_stepnet/stepnet/diffusion.py:624
forward_train
Section titled “forward_train”forward_train(stage_data: Optional[Dict[str, torch.Tensor]] = None, geometry: Optional[Dict[str, torch.Tensor]] = None) -> Dict[str, torch.Tensor]Source: ll_stepnet/stepnet/diffusion.py:738
Training forward: denoising loss per stage + codec reconstruction.
Each stage independently samples a random timestep, adds noise, and
predicts the noise (teacher-forced on a pooled summary of the previous
stage’s clean tokens). When geometry is supplied, the clean
per-stage token latents are the codec’s encoding of the real geometry,
so the diffusion learns to denoise in the codec’s latent space, and a
masked-MSE reconstruction term trains the codec itself — making the
latent<->geometry mapping coherent end to end.
Args:
stage_data: Optional explicit clean stage latents, each [B, S, D]
(or [B, D], which is promoted to a single token). Overrides the
geometry-derived targets per stage when both are given.
geometry: Optional dict with face_grids [B, N_faces, U, V, 3],
edge_points [B, N_edges, M, 3] and optional face_mask /
edge_mask [B, N] (True = padded/empty primitive).
Returns:
Dictionary with {stage_name}_loss denoising terms, optional
face_recon_loss / edge_recon_loss, and total_loss.
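The per-stage denoising term follows the standard DDPM recipe: draw a timestep, form x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps, and regress the predicted noise onto eps. The sketch below illustrates that recipe on a plain list. The linear beta schedule and its constants are the common DDPM defaults and are an assumption here, as are all helper names; the actual scheduler may differ.

```python
import math
import random

def alpha_bars(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule."""
    out, prod = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        prod *= 1.0 - beta
        out.append(prod)
    return out

def add_noise(x0, t, abar, rng):
    """Return (x_t, eps) with x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    a, b = math.sqrt(abar[t]), math.sqrt(1.0 - abar[t])
    return [a * x + b * e for x, e in zip(x0, eps)], eps

def noise_mse(eps_pred, eps):
    """Denoising loss: mean squared error between predicted and true noise."""
    return sum((p - e) ** 2 for p, e in zip(eps_pred, eps)) / len(eps)

rng = random.Random(0)
abar = alpha_bars()
t = rng.randrange(len(abar))             # random timestep for this stage
x_t, eps = add_noise([0.5, -1.0, 2.0], t, abar, rng)
loss = noise_mse([0.0] * len(eps), eps)  # a denoiser that always predicts zero
```

In forward_train this would be repeated once per stage, with each stage drawing its own timestep.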
sample
Section titled “sample”sample(batch_size: int = 1, device: Optional[torch.device] = None, use_pndm: bool = True) -> Dict[str, torch.Tensor]Source: ll_stepnet/stepnet/diffusion.py:831
Generate new structured CAD data via sequential denoising.
Args: batch_size: Number of samples. device: Target device. use_pndm: Whether to use PNDM accelerated sampling.
Returns:
Dictionary mapping each stage name to its denoised token latents
([B, N_faces or N_edges, D]) plus decoded geometry tensors
face_grids [B, N_faces, U, V, 3] and edge_points
[B, N_edges, M, 3].
sample_with_log_prob
Section titled “sample_with_log_prob”sample_with_log_prob(batch_size: int = 1, device: Optional[torch.device] = None, num_inference_steps: Optional[int] = None, eta: float = 1.0) -> Tuple[Dict[str, torch.Tensor], torch.Tensor, torch.Tensor]Source: ll_stepnet/stepnet/diffusion.py:907
DDPO sampling: geometry plus a differentiable trajectory log-prob.
Runs stochastic DDIM reverse diffusion (eta > 0) over all four
stages without torch.no_grad and accumulates the per-step
Gaussian log-probabilities from
DDPMScheduler.ddim_step_with_log_prob. Each step’s transition
mean depends on its denoiser’s epsilon prediction, so the summed
log-prob backpropagates into every denoiser (and the stage conditioning
projections) — enabling real diffusion policy-gradient (REINFORCE /
DDPO) reinforcement learning. This is the path that makes the RL signal
train the actual model parameters, replacing the previously decoupled
noise-prior stand-in.
Trajectory states are detached between steps (each x_t is treated as
a fixed state in the action history, as in DDPO), which bounds memory
while preserving the gradient inside each transition’s log-prob.
Args:
batch_size: Number of samples to draw.
device: Target device (defaults to the model’s device).
num_inference_steps: Reverse steps per stage (defaults to the
scheduler’s inference_steps).
eta: DDIM stochasticity. Coerced to 1.0 if <= 0 because a
deterministic trajectory has a degenerate (delta) policy that
cannot provide a usable policy gradient.
Returns:
Tuple (results, total_log_prob, total_entropy):
* results: {stage_name: token latent [B, N, D]} plus
decoded face_grids / edge_points (all detached).
* total_log_prob: [B] sum of per-step log-probs across all
stages, connected to the model parameters.
* total_entropy: [B] sum of per-step Gaussian entropies.
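The two accumulated quantities are sums of per-step isotropic Gaussian log-densities and entropies. A small sketch for a single sample, assuming each reverse step is N(mean, sigma^2 I); the trajectory values and helper names are made up for illustration, and in the real method the means come from the denoisers and the sigmas from the DDIM eta setting.

```python
import math

def gaussian_log_prob(x, mean, sigma):
    """Log-density of x under N(mean, sigma^2 I), summed over dimensions."""
    return sum(
        -0.5 * ((xi - mi) / sigma) ** 2
        - math.log(sigma)
        - 0.5 * math.log(2.0 * math.pi)
        for xi, mi in zip(x, mean)
    )

def gaussian_entropy(dim, sigma):
    """Entropy of N(mean, sigma^2 I) in `dim` dimensions."""
    return dim * (0.5 * math.log(2.0 * math.pi * math.e) + math.log(sigma))

# Hypothetical two-step trajectory: (x_prev, mean, sigma) per step.
trajectory = [([0.1, -0.2], [0.0, 0.0], 0.5), ([0.3, 0.1], [0.25, 0.0], 0.2)]
total_log_prob = sum(gaussian_log_prob(x, m, s) for x, m, s in trajectory)
total_entropy = sum(gaussian_entropy(len(x), s) for x, _, s in trajectory)
```

As sigma approaches 0 the log-density diverges, which is consistent with eta <= 0 being coerced to 1.0.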
class TextConditioner(nn.Module)
Section titled “class TextConditioner(nn.Module)”Source: ll_stepnet/stepnet/conditioning.py:131
Condition CAD generation on natural-language descriptions.
Wraps a frozen BERT or CLIP text encoder, projects its hidden states to the decoder dimension, and applies AdaptiveLayer blocks for cross-attention injection.
Following Text2CAD, the first skip_cross_attention_blocks decoder
blocks skip cross-attention to allow initial CAD structure formation
before conditioning kicks in. When block_index is passed to
forward(), the cross-attention layers are only applied when
block_index >= skip_cross_attention_blocks.
Args: encoder_name: Hugging Face model identifier (e.g. “bert-base-uncased”). conditioning_dim: Dimension of the conditioning embeddings (must match the decoder hidden dim). freeze_encoder: Whether to freeze the pretrained encoder weights. num_adaptive_layers: Number of AdaptiveLayer blocks. skip_cross_attention_blocks: Number of initial decoder blocks that skip cross-attention (Text2CAD default = 2).
Methods
__init__
Section titled “__init__”__init__(encoder_name: str = 'bert-base-uncased', conditioning_dim: int = 1024, freeze_encoder: bool = True, num_adaptive_layers: int = 1, skip_cross_attention_blocks: int = 2) -> NoneSource: ll_stepnet/stepnet/conditioning.py:155
encode_text
Section titled “encode_text”encode_text(input_ids: torch.Tensor, attention_mask: Optional[torch.Tensor] = None) -> torch.TensorSource: ll_stepnet/stepnet/conditioning.py:253
Encode tokenised text into conditioning embeddings.
Args: input_ids: [B, L] token ids from the text tokenizer. attention_mask: [B, L] padding mask.
Returns: Conditioning embeddings [B, L, conditioning_dim].
forward
Section titled “forward”forward(hidden_states: torch.Tensor, text_input_ids: torch.Tensor, text_attention_mask: Optional[torch.Tensor] = None, block_index: Optional[int] = None) -> torch.TensorSource: ll_stepnet/stepnet/conditioning.py:281
Condition decoder hidden states on text.
Following Text2CAD, early decoder blocks can skip cross-attention
to let the initial CAD structure form before conditioning kicks in.
When block_index is provided and is less than
self.skip_cross_attention_blocks, the hidden states are returned
unchanged (no cross-attention applied).
Args:
hidden_states: [B, S, D] decoder hidden states.
text_input_ids: [B, L] text token ids.
text_attention_mask: [B, L] text padding mask.
block_index: Optional zero-based decoder block index. When
provided and < skip_cross_attention_blocks, the
cross-attention layers are bypassed entirely.
Returns: Conditioned hidden states [B, S, D].
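The gating rule described above reduces to a single comparison. A minimal sketch, with a hypothetical helper name that is not part of TextConditioner:

```python
def applies_cross_attention(block_index, skip_cross_attention_blocks=2):
    """True when the cross-attention layers should run for this block."""
    if block_index is None:
        return True  # no index given: conditioning is always applied
    return block_index >= skip_cross_attention_blocks

print([applies_cross_attention(i) for i in range(4)])  # [False, False, True, True]
```

With the default of 2, decoder blocks 0 and 1 return their hidden states unchanged and conditioning starts at block 2.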
class TrainingConfig
Section titled “class TrainingConfig”Source: ll_stepnet/stepnet/config.py:107
Configuration for training.
Methods
__init__
Section titled “__init__”__init__(batch_size: int = 8, learning_rate: float = 0.0001, num_epochs: int = 10, warmup_steps: int = 1000, max_grad_norm: float = 1.0, weight_decay: float = 0.01, save_every: int = 1, eval_every: int = 1, checkpoint_dir: str = 'checkpoints', log_dir: str = 'logs') -> NoneSource: ll_stepnet/stepnet/config.py
class VAEConfig
Section titled “class VAEConfig”Source: ll_stepnet/stepnet/config.py:154
Configuration for the STEP Variational Autoencoder.
Command types follow geotoken’s vocabulary: SOL=0, LINE=1, ARC=2, CIRCLE=3, EXTRUDE=4, EOS=5
Methods
__init__
Section titled “__init__”__init__(latent_dim: int = DEFAULT_QUANTIZATION_LEVELS, kl_weight: float = 1.0, kl_warmup_epochs: int = 10, encoder_vocab_size: int = 50000, encoder_embed_dim: int = DEFAULT_QUANTIZATION_LEVELS, encoder_layers: int = 6, decoder_layers: int = 6, num_command_types: int = NUM_COMMAND_TYPES, num_param_levels: int = DEFAULT_QUANTIZATION_LEVELS, max_seq_len: int = DEFAULT_MAX_SEQ_LEN) -> NoneSource: ll_stepnet/stepnet/config.py
class VAETrainer
Section titled “class VAETrainer”Source: ll_stepnet/stepnet/training/vae_trainer.py:31
Trainer for Variational Autoencoder models on CAD token sequences.
Extends the STEPTrainer concept with VAE-specific training:
- Beta-VAE warmup: linearly ramps KL weight from 0 to 1 over warmup epochs
- Reconstruction loss via cross-entropy on command tokens
- KL divergence regularization on the latent distribution
- Latent space visualization at each epoch
Supports two model output conventions:
- Dict output (STEPVAE): forward() returns a dict with keys command_logits, param_logits, mu, log_var, kl_loss, and optionally recon_loss and loss.
- Tuple output (legacy): forward() returns (reconstructed, mu, log_var).
The trainer auto-detects which convention is used on the first batch and adapts accordingly.
Args: model: VAE model with encode(), decode(), and reparameterize() methods. train_dataloader: Training data loader. val_dataloader: Optional validation data loader. optimizer: Optimizer instance. Creates AdamW with lr=1e-4 if None. device: Device string. ‘auto’ selects CUDA if available, else CPU. checkpoint_dir: Directory path for saving checkpoints and visualizations. kl_warmup_epochs: Number of epochs to linearly ramp beta from 0 to 1.
Methods
__init__
Section titled “__init__”__init__(model: nn.Module, train_dataloader: DataLoader, val_dataloader: Optional[DataLoader] = None, optimizer: Optional[Any] = None, device: str = 'auto', checkpoint_dir: Optional[str] = None, kl_warmup_epochs: int = 10) -> NoneSource: ll_stepnet/stepnet/training/vae_trainer.py:61
train_epoch
Section titled “train_epoch”train_epoch() -> Dict[str, float]Source: ll_stepnet/stepnet/training/vae_trainer.py:262
Train for one epoch with beta-VAE warmup.
Computes:
- Reconstruction loss (cross-entropy on command tokens)
- KL divergence with current beta weight
- Total loss = recon_loss + beta * kl_loss
Returns: Dictionary with keys: ‘total_loss’, ‘recon_loss’, ‘kl_loss’, ‘beta’.
validate
Section titled “validate”validate() -> Dict[str, float]Source: ll_stepnet/stepnet/training/vae_trainer.py:340
Run validation and compute reconstruction quality metrics.
Computes:
- Validation loss (recon + beta * KL)
- Command accuracy: exact match rate of predicted vs target command tokens
- Parameter MSE: mean squared error of continuous parameter predictions
Returns: Dictionary with keys: ‘val_loss’, ‘recon_loss’, ‘kl_loss’, ‘command_accuracy’, ‘param_mse’.
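The two quality metrics can be illustrated on plain lists. This sketch assumes per-token matching for command accuracy and an unmasked mean for the parameter MSE; how the trainer handles padding is not stated above, and the function names are illustrative.

```python
def command_accuracy(pred_tokens, target_tokens):
    """Fraction of positions where the predicted command equals the target."""
    matches = sum(p == t for p, t in zip(pred_tokens, target_tokens))
    return matches / len(target_tokens)

def param_mse(pred_params, target_params):
    """Mean squared error between predicted and target parameters."""
    errors = [(p - t) ** 2 for p, t in zip(pred_params, target_params)]
    return sum(errors) / len(errors)

# Command ids follow the SOL=0 ... EOS=5 vocabulary.
print(command_accuracy([0, 1, 2, 5], [0, 1, 3, 5]))  # 0.75
print(param_mse([0.5, 1.0], [0.0, 1.0]))             # 0.125
```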
visualize_latent_space
Section titled “visualize_latent_space”visualize_latent_space(epoch: int, max_samples: int = 5000) -> NoneSource: ll_stepnet/stepnet/training/vae_trainer.py:419
Encode validation set and visualize latent space in 2D.
Uses t-SNE (or UMAP if available) to reduce latent representations to 2D and saves the scatter plot to the checkpoint directory.
Args: epoch: Current epoch number, used for the filename. max_samples: Maximum number of samples to process (default 5000).
train
train(num_epochs: int, save_every: int = 1) -> None
Source: ll_stepnet/stepnet/training/vae_trainer.py:534
Train for multiple epochs with beta-VAE scheduling.
Orchestrates the full training loop with:
- Per-epoch training with beta warmup
- Validation after each epoch
- Latent space visualization
- Checkpointing
Args: num_epochs: Total number of epochs to train. save_every: Save a checkpoint every N epochs.
save_checkpoint
Section titled “save_checkpoint”save_checkpoint(filename: str) -> NoneSource: ll_stepnet/stepnet/training/vae_trainer.py:619
Save model checkpoint to disk.
Args: filename: Name of the checkpoint file.
load_checkpoint
Section titled “load_checkpoint”load_checkpoint(filename: str) -> NoneSource: ll_stepnet/stepnet/training/vae_trainer.py:643
Load model checkpoint from disk.
Args: filename: Name of the checkpoint file to load.
save_history
Section titled “save_history”save_history() -> NoneSource: ll_stepnet/stepnet/training/vae_trainer.py:669
Save training history to a JSON file in the checkpoint directory.
class VQVAEModel(nn.Module)
Section titled “class VQVAEModel(nn.Module)”Source: ll_stepnet/stepnet/vqvae.py:749
Complete VQ-VAE model for CAD generation.
Combines an encoder MLP, the DisentangledCodebooks vector
quantization layer, and a decoder MLP into an end-to-end model that
can:
- Encode continuous CAD feature vectors into compact discrete code sequences (10 codes per model, split 3/4/3 across topology, geometry, and extrusion codebooks).
- Decode discrete codes back to reconstructed feature vectors.
- Train the full pipeline with reconstruction loss + commitment loss.
The encoder splits its output into three equal-sized chunks that are fed to the topology, geometry, and extrusion codebook streams respectively. The decoder concatenates the three decoded streams and projects back to the original input dimensionality.
Args: input_dim: Dimensionality of the input feature vector (e.g. flattened STEP entity features). code_dim: Internal dimensionality for codebook vectors. topology_codes: Number of topology codebook entries. geometry_codes: Number of geometry codebook entries. extrusion_codes: Number of extrusion codebook entries. encoder_hidden_dim: Hidden dimension of the encoder MLP. decoder_hidden_dim: Hidden dimension of the decoder MLP.
Methods
__init__
Section titled “__init__”__init__(input_dim: int, code_dim: int = 256, topology_codes: int = 500, geometry_codes: int = 1000, extrusion_codes: int = 1000, encoder_hidden_dim: int = 512, decoder_hidden_dim: int = 512) -> NoneSource: ll_stepnet/stepnet/vqvae.py:779
set_epoch
Section titled “set_epoch”set_epoch(epoch: int) -> NoneSource: ll_stepnet/stepnet/vqvae.py:859
Propagate epoch to all codebooks for warmup tracking.
Args: epoch: Current training epoch (0-indexed).
forward
Section titled “forward”forward(x: torch.Tensor) -> Dict[str, torch.Tensor]Source: ll_stepnet/stepnet/vqvae.py:871
Full forward pass: encode -> quantize -> decode.
Args:
x: Input features of shape (batch, input_dim).
Returns: Dictionary containing:
- "reconstructed": Reconstructed features (batch, input_dim).
- "commitment_loss": Scalar commitment loss.
- "codes": Dictionary with "topology", "geometry", and "extrusion" index tensors.
- "reconstruction_loss": MSE reconstruction loss.
encode_to_codes
Section titled “encode_to_codes”encode_to_codes(x: torch.Tensor) -> Dict[str, torch.Tensor]Source: ll_stepnet/stepnet/vqvae.py:934
Encode input features to compact discrete codes.
Produces 10 codes per model (3 topology + 4 geometry + 3 extrusion), which serve as a compact discrete representation of the CAD model.
Args:
x: Input features of shape (batch, input_dim).
Returns:
Dictionary with "topology", "geometry", and
"extrusion" keys mapping to LongTensor codebook
indices.
decode_from_codes
Section titled “decode_from_codes”decode_from_codes(topology_codes: torch.Tensor, geometry_codes: torch.Tensor, extrusion_codes: torch.Tensor) -> torch.TensorSource: ll_stepnet/stepnet/vqvae.py:970
Decode discrete codebook indices back to reconstructed features.
Args:
topology_codes: (batch, 3) topology codebook indices.
geometry_codes: (batch, 4) geometry codebook indices.
extrusion_codes: (batch, 3) extrusion codebook indices.
Returns:
Reconstructed features of shape (batch, input_dim).
generate
Section titled “generate”generate(num_samples: int = 1, temperature: float = 1.0, top_k: Optional[int] = None) -> Dict[str, torch.Tensor]Source: ll_stepnet/stepnet/vqvae.py:1004
Generate new CAD models by sampling codes autoregressively.
Uses the three independent CodebookDecoder instances to
generate topology, geometry, and extrusion code sequences in
parallel, then decodes them back to feature space.
Args: num_samples: Number of CAD models to generate. temperature: Sampling temperature for all three decoders. top_k: Top-k filtering for sampling.
Returns: Dictionary containing:
- "reconstructed": Generated features (num_samples, input_dim).
- "codes": Dictionary with "topology", "geometry", and "extrusion" index tensors.
compute_ar_loss
Section titled “compute_ar_loss”compute_ar_loss(topology_codes: torch.Tensor, geometry_codes: torch.Tensor, extrusion_codes: torch.Tensor) -> Dict[str, torch.Tensor]Source: ll_stepnet/stepnet/vqvae.py:1079
Compute autoregressive next-code prediction losses.
Used for training the CodebookDecoder modules. Takes
ground-truth code sequences (from encode_to_codes) and
computes cross-entropy loss for each decoder.
Args:
topology_codes: (batch, 3) ground-truth topology indices.
geometry_codes: (batch, 4) ground-truth geometry indices.
extrusion_codes: (batch, 3) ground-truth extrusion indices.
Returns:
Dictionary with "topology_ar_loss",
"geometry_ar_loss", "extrusion_ar_loss", and
"total_ar_loss" scalar tensors.
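A sketch of a next-code cross-entropy on plain lists, to show how the four returned scalars relate. The per-position logits are hypothetical stand-ins for CodebookDecoder outputs, and summing the three losses without weights is an assumption.

```python
import math

def cross_entropy(logits, target):
    """-log softmax(logits)[target], computed stably."""
    peak = max(logits)
    log_z = peak + math.log(sum(math.exp(l - peak) for l in logits))
    return log_z - logits[target]

def ar_loss(step_logits, codes):
    """Mean next-code cross-entropy over one code sequence.

    step_logits[i] are the (hypothetical) decoder logits for position i
    given codes[:i]; codes[i] is the ground-truth index.
    """
    losses = [cross_entropy(l, c) for l, c in zip(step_logits, codes)]
    return sum(losses) / len(losses)

# Uniform logits over a 4-entry codebook give log(4) per position.
topology = ar_loss([[0.0] * 4] * 3, [0, 2, 1])
geometry = ar_loss([[0.0] * 4] * 4, [3, 3, 0, 1])
extrusion = ar_loss([[0.0] * 4] * 3, [1, 0, 2])
losses = {
    "topology_ar_loss": topology,
    "geometry_ar_loss": geometry,
    "extrusion_ar_loss": extrusion,
    "total_ar_loss": topology + geometry + extrusion,  # unweighted sum assumed
}
```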
codebook_utilization
Section titled “codebook_utilization”codebook_utilization() -> Dict[str, float]Source: ll_stepnet/stepnet/vqvae.py:1139
Report utilization (fraction of active entries) per codebook.
Returns:
Dictionary mapping codebook name to utilization float
in [0.0, 1.0].
total_codes_per_model
Section titled “total_codes_per_model”total_codes_per_model() -> intSource: ll_stepnet/stepnet/vqvae.py:1152
Number of discrete codes produced per CAD model.
Returns: Total number of codes (10 by default: 3 + 4 + 3).
num_parameters
Section titled “num_parameters”num_parameters() -> Dict[str, int]Source: ll_stepnet/stepnet/vqvae.py:1160
Count trainable parameters by component.
Returns: Dictionary mapping component names to parameter counts.
class VectorQuantizer(nn.Module)
Section titled “class VectorQuantizer(nn.Module)”Source: ll_stepnet/stepnet/vqvae.py:33
Core vector quantization layer with EMA codebook updates.
Maps continuous latent vectors to the nearest entry in a learned codebook embedding table. During training the codebook is updated via exponential moving average (EMA) rather than straight gradient descent, which is more stable for VQ-VAE training.
A warmup period (SkexGen stabilisation trick) bypasses quantization
for the first warmup_epochs epochs so the encoder can learn a
reasonable latent distribution before the codebook locks in.
Args: num_embeddings: Number of codebook entries (K). embedding_dim: Dimensionality of each codebook vector. commitment_cost: Weight beta for the commitment loss term. decay: EMA decay factor for codebook updates. warmup_epochs: Number of initial training epochs to skip quantization (pass-through mode).
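A minimal sketch of the lookup for a single latent vector, including the warmup pass-through. It uses plain lists instead of tensors, averages the squared distance over dimensions, and returns None as the index during warmup; all three choices are assumptions, as is the helper name quantize.

```python
def quantize(z_e, codebook, commitment_cost=0.25, epoch=0, warmup_epochs=25):
    """Map one latent vector to its nearest codebook entry.

    Returns (quantized, commitment_loss, index).
    """
    if epoch < warmup_epochs:
        return list(z_e), 0.0, None  # pass-through: quantization skipped

    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    index = min(range(len(codebook)), key=lambda k: sq_dist(z_e, codebook[k]))
    z_q = codebook[index]
    # beta * ||z_e - sg(z_q)||^2, averaged over dimensions (an assumption).
    commitment_loss = commitment_cost * sq_dist(z_e, z_q) / len(z_e)
    return list(z_q), commitment_loss, index

codebook = [[0.0, 0.0], [1.0, 1.0]]
print(quantize([0.9, 0.8], codebook, epoch=30))  # nearest entry is index 1
print(quantize([0.9, 0.8], codebook, epoch=0))   # warmup: input returned as-is
```

The EMA codebook update and the straight-through gradient are omitted; they only matter once the vectors are tensors under autograd.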
Methods
__init__
Section titled “__init__”__init__(num_embeddings: int, embedding_dim: int, commitment_cost: float = 0.25, decay: float = 0.99, warmup_epochs: int = 25) -> NoneSource: ll_stepnet/stepnet/vqvae.py:54
set_epoch
Section titled “set_epoch”set_epoch(epoch: int) -> NoneSource: ll_stepnet/stepnet/vqvae.py:106
Update the internal epoch counter (used for warmup gating).
Args: epoch: Current training epoch (0-indexed).
forward
Section titled “forward”forward(inputs: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor, torch.Tensor]Source: ll_stepnet/stepnet/vqvae.py:114
Quantize continuous input vectors to nearest codebook entries.
During warmup (epoch < warmup_epochs) quantization is skipped
and the inputs are returned as-is with zero commitment loss so
the encoder can pre-train freely.
Args:
inputs: Continuous latent vectors of shape
(batch, *, embedding_dim) where * is any number
of intermediate dimensions (typically sequence length).
Returns:
A tuple of (quantized, commitment_loss, encoding_indices)
where:
- quantized has the same shape as inputs and contains the selected codebook vectors (with straight-through gradient).
- commitment_loss is a scalar beta * ||z_e - sg(z_q)||^2.
- encoding_indices is a LongTensor of shape (batch * seq_len,) giving the selected codebook index for each input vector.
codebook_utilization
Section titled “codebook_utilization”codebook_utilization() -> floatSource: ll_stepnet/stepnet/vqvae.py:258
Fraction of codebook entries actively used (non-zero EMA count).
Returns:
A float in [0.0, 1.0] representing the proportion of
codebook entries that have received at least one assignment.
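As a sketch of the definition, utilization is the share of entries with a non-zero EMA count. The list of counts below is made up, and the helper name is not the library's.

```python
def utilization_from_counts(ema_counts):
    """Fraction of entries whose EMA assignment count is non-zero."""
    active = sum(1 for count in ema_counts if count > 0)
    return active / len(ema_counts)

print(utilization_from_counts([0.0, 2.5, 0.0, 1.0]))  # 0.5
```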
chinchilla_optimal_tokens
Section titled “chinchilla_optimal_tokens”chinchilla_optimal_tokens(num_params: int) -> intSource: ll_stepnet/stepnet/data_requirements.py:108
Estimate optimal number of training tokens based on Chinchilla scaling laws.
Rule: ~20-25 tokens per parameter for compute-optimal training.
Args: num_params: Number of model parameters (non-embedding)
Returns: Recommended number of training tokens/samples
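The rule of thumb amounts to one multiplication. A sketch assuming 20 tokens per parameter, the lower end of the stated range; the exact factor used by chinchilla_optimal_tokens is not given here, so it is left as a parameter of this hypothetical helper.

```python
def chinchilla_tokens_sketch(num_params, tokens_per_param=20):
    """Rough compute-optimal token budget for a model of num_params."""
    return int(num_params * tokens_per_param)

print(chinchilla_tokens_sketch(1_000_000))      # 20000000
print(chinchilla_tokens_sketch(1_000_000, 25))  # 25000000
```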
count_model_parameters
Section titled “count_model_parameters”count_model_parameters(model: nn.Module, exclude_embeddings: bool = True) -> intSource: ll_stepnet/stepnet/data_requirements.py:789
Count model parameters, excluding embeddings by default (for scaling law analysis).
Args: model: PyTorch model exclude_embeddings: Whether to exclude embedding parameters
Returns: Number of parameters
create_dataloader
Section titled “create_dataloader”create_dataloader(file_paths: List[str], labels: Optional[List] = None, batch_size: int = 8, max_length: int = 2048, use_topology: bool = True, shuffle: bool = True, num_workers: int = 0) -> DataLoaderSource: ll_stepnet/stepnet/data.py:189
Create DataLoader for STEP files.
Args: file_paths: List of STEP file paths labels: Optional labels batch_size: Batch size max_length: Maximum sequence length use_topology: Whether to build topology graphs shuffle: Whether to shuffle data num_workers: Number of worker processes
Returns: DataLoader instance
estimate_data_requirements
Section titled “estimate_data_requirements”estimate_data_requirements(model: nn.Module, target_accuracy: float, sample_sizes: np.ndarray, val_accuracies: np.ndarray) -> intSource: ll_stepnet/stepnet/data_requirements.py:759
Convenience function to estimate required dataset size for target accuracy.
Args: model: STEP model target_accuracy: Desired accuracy sample_sizes: Current dataset sizes tested val_accuracies: Corresponding validation accuracies
Returns: Estimated required dataset size
get_config
Section titled “get_config”get_config(task: str = 'classification', **kwargs)Source: ll_stepnet/stepnet/config.py:286
Get configuration for a specific task.
Args: task: Task name (‘classification’, ‘property’, ‘captioning’, ‘similarity’, ‘qa’) **kwargs: Additional config overrides
Returns: Configuration object
inverse_power_law_accuracy
Section titled “inverse_power_law_accuracy”inverse_power_law_accuracy(D: np.ndarray, a: float, b: float, c: float) -> np.ndarraySource: ll_stepnet/stepnet/data_requirements.py:90
Inverse power law for accuracy.
Acc(D) = c - a * D^(-b)
Args: D: Dataset size a: Scaling coefficient b: Scaling exponent c: Maximum achievable accuracy (asymptote)
Returns: Accuracy values
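Once a, b and c have been fitted, the formula can be inverted to estimate the dataset size needed for a target accuracy: D = (a / (c - target))^(1/b). The sketch below assumes a > 0 and b > 0, and uses illustrative function names and parameter values rather than the library's fitting code.

```python
def accuracy_sketch(D, a, b, c):
    """Acc(D) = c - a * D^(-b)."""
    return c - a * D ** (-b)

def required_dataset_size(target_accuracy, a, b, c):
    """Solve c - a * D^(-b) = target for D; None if target >= c."""
    if target_accuracy >= c:
        return None  # the asymptote c is never reached
    return (a / (c - target_accuracy)) ** (1.0 / b)

D = required_dataset_size(0.85, a=1.0, b=0.5, c=0.95)
print(round(D))  # 100
```

A target at or above the asymptote c has no finite solution, so the sketch returns None in that case.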
mask_tokens
Section titled “mask_tokens”mask_tokens(input_ids: torch.Tensor, mask_token_id: int, vocab_size: int, mask_prob: float = 0.15, replace_prob: float = 0.1, random_prob: float = 0.1) -> tuple[torch.Tensor, torch.Tensor]Source: ll_stepnet/stepnet/pretrain.py:430
Create masked input for BERT-style training.
Args:
- input_ids: [batch_size, seq_len] - original tokens
- mask_token_id: ID for [MASK] token
- vocab_size: Vocabulary size
- mask_prob: Probability of masking a token
- replace_prob: Probability of replacing with random token instead of [MASK]
- random_prob: Probability of keeping original token
Returns:
- masked_input: [batch_size, seq_len] - input with masks
- labels: [batch_size, seq_len] - targets (-100 for non-masked)
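The masking scheme described above can be illustrated with a pure-Python sketch that works on nested lists instead of tensors. This is not the package's code: mask_tokens_sketch and its rng parameter are hypothetical, and the way the three outcomes are split among selected tokens (random token with probability replace_prob, original kept with probability random_prob, [MASK] otherwise) is an assumption read from the argument descriptions.

```python
import random

IGNORE_INDEX = -100  # label value for positions that are not prediction targets


def mask_tokens_sketch(input_ids, mask_token_id, vocab_size,
                       mask_prob=0.15, replace_prob=0.1, random_prob=0.1,
                       rng=None):
    """Pure-Python illustration of BERT-style masking on nested lists."""
    if rng is None:
        rng = random.Random()
    masked_input, labels = [], []
    for row in input_ids:
        masked_row, label_row = [], []
        for token in row:
            if rng.random() >= mask_prob:
                # Not selected: keep the token and exclude it from the loss.
                masked_row.append(token)
                label_row.append(IGNORE_INDEX)
                continue
            label_row.append(token)  # selected: the model must predict this
            roll = rng.random()
            if roll < replace_prob:
                masked_row.append(rng.randrange(vocab_size))  # random token
            elif roll < replace_prob + random_prob:
                masked_row.append(token)  # keep the original token
            else:
                masked_row.append(mask_token_id)  # substitute [MASK]
        masked_input.append(masked_row)
        labels.append(label_row)
    return masked_input, labels


batch = [[5, 17, 42, 8, 23, 4], [9, 31, 7, 12, 6, 15]]
masked, labels = mask_tokens_sketch(batch, mask_token_id=1, vocab_size=50,
                                    rng=random.Random(0))
print(masked)
print(labels)
```

Positions labelled -100 match the default ignore_index of PyTorch's cross-entropy loss, so only the selected positions contribute to the training objective.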
plot_learning_curve_with_scaling_law
plot_learning_curve_with_scaling_law(sample_sizes: np.ndarray, val_losses: np.ndarray, val_accuracies: Optional[np.ndarray] = None, fitted_params: Optional[Dict] = None, fit_type: str = 'openai', extrapolate_to: Optional[int] = None, save_path: Optional[str] = None)
Source: ll_stepnet/stepnet/data_requirements.py:605
Plot learning curves with fitted scaling law.
Args:
- sample_sizes: Array of dataset sizes
- val_losses: Validation losses [num_sizes, num_iterations]
- val_accuracies: Optional validation accuracies
- fitted_params: Fitted power law parameters
- fit_type: 'openai' or 'standard'
- extrapolate_to: Optional target size for extrapolation
- save_path: Optional path to save figure
power_law_error
power_law_error(n: np.ndarray, a: float, b: float, c: float) -> np.ndarray
Source: ll_stepnet/stepnet/data_requirements.py:72
Power law for error as a function of dataset size.
Error(n) = a * n^(-b) + c
Args:
- n: Dataset size
- a: Scaling coefficient
- b: Scaling exponent
- c: Irreducible error (Bayes error rate)
Returns: Error values
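A practical reading of this formula is that only the a * n^(-b) term shrinks with more data, and halving it requires multiplying the dataset size by 2^(1/b). The stand-alone sketch below checks this numerically. It re-implements the documented formula without importing stepnet, and the parameter values are illustrative assumptions.

```python
def power_law_error(n, a, b, c):
    """Error(n) = a * n**(-b) + c; stand-alone copy of the documented formula."""
    return a * n ** (-b) + c


# Illustrative parameters only.
a, b, c = 2.0, 0.5, 0.05
n = 1_000

# Data multiplier that halves the reducible part of the error.
growth = 2 ** (1 / b)

reducible_before = power_law_error(n, a, b, c) - c
reducible_after = power_law_error(n * growth, a, b, c) - c
print(growth, reducible_before, reducible_after)
```

With b = 0.5 the multiplier is 4, so quadrupling the dataset halves the reducible error while the floor c stays unchanged.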
power_law_loss
power_law_loss(D: np.ndarray, D_c: float, alpha_D: float) -> np.ndarray
Source: ll_stepnet/stepnet/data_requirements.py:55
OpenAI-style power law for loss as a function of dataset size.
L(D) = (D_c / D)^alpha_D
Args:
- D: Dataset size (number of samples or tokens)
- D_c: Data scaling constant
- alpha_D: Data scaling exponent (~0.095 for language models)
Returns: Loss values
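Because log L = alpha_D * (log D_c - log D), this law is a straight line in log-log space with slope -alpha_D, so two noise-free measurements are enough to recover both parameters. The sketch below demonstrates that property. It is a stand-alone illustration, not the package's fitting routine: fit_from_two_points is a hypothetical helper and the values are illustrative.

```python
import math


def power_law_loss(D, D_c, alpha_D):
    """L(D) = (D_c / D) ** alpha_D; stand-alone copy of the documented formula."""
    return (D_c / D) ** alpha_D


def fit_from_two_points(D1, L1, D2, L2):
    """Recover (D_c, alpha_D) from two noise-free (size, loss) pairs."""
    # The slope of log L against log D equals -alpha_D.
    alpha_D = -(math.log(L2) - math.log(L1)) / (math.log(D2) - math.log(D1))
    # L1 = (D_c / D1) ** alpha_D  =>  D_c = D1 * L1 ** (1 / alpha_D)
    D_c = D1 * L1 ** (1.0 / alpha_D)
    return D_c, alpha_D


true_D_c, true_alpha = 5.0e6, 0.095  # illustrative values only
sizes = (1_000, 100_000)
losses = [power_law_loss(D, true_D_c, true_alpha) for D in sizes]
D_c, alpha_D = fit_from_two_points(sizes[0], losses[0], sizes[1], losses[1])
print(D_c, alpha_D)
```

Real validation losses are noisy, so a least-squares fit over many dataset sizes would normally replace the two-point solution shown here.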
reserialize_step
reserialize_step(step_text: str, config: Optional[STEPReserializationConfig] = None) -> STEPReserializedOutput
Source: ll_stepnet/stepnet/reserialization.py:388
Convenience function to reserialize STEP text via DFS.
Parses the raw STEP text into an entity graph, then performs DFS reserialization to produce semantically-grouped output.
Args:
- step_text: Raw STEP file text (DATA section entities).
- config: Optional reserialization configuration.
Returns: STEPReserializedOutput with reserialized text and metadata.
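The general idea of DFS reserialization can be sketched on a toy input: parse each entity line, treat #N references as edges, and emit entities in depth-first order from the roots so that each subtree ends up contiguous. The code below is a simplified illustration under several assumptions and is not the package's parser. dfs_reorder is a hypothetical helper, roots are taken to be entities that no other entity references, and the regular expressions ignore multi-line entities and '#' characters inside string literals.

```python
import re

ENTITY_RE = re.compile(r"#(\d+)\s*=\s*([A-Z0-9_]+)\s*\((.*)\)\s*;")
REF_RE = re.compile(r"#(\d+)")


def dfs_reorder(step_text):
    """Reorder entity lines so each root's subtree is contiguous (toy sketch)."""
    entities, children = {}, {}
    for line in step_text.splitlines():
        match = ENTITY_RE.fullmatch(line.strip())
        if match is None:
            continue  # skip blank or unsupported lines
        entity_id = int(match.group(1))
        entities[entity_id] = line.strip()
        children[entity_id] = [int(r) for r in REF_RE.findall(match.group(3))]

    # Assumption: a root is an entity that no other entity references.
    referenced = {ref for refs in children.values() for ref in refs}
    roots = [eid for eid in entities if eid not in referenced]

    ordered, visited = [], set()
    stack = list(reversed(roots))
    while stack:
        eid = stack.pop()
        if eid in visited or eid not in entities:
            continue  # already emitted, or a dangling reference
        visited.add(eid)
        ordered.append(entities[eid])
        # Push children in reverse so they are visited in written order.
        stack.extend(reversed(children[eid]))
    return ordered


sample = """
#1=CARTESIAN_POINT('',(0.,0.,0.));
#2=DIRECTION('',(0.,0.,1.));
#3=AXIS2_PLACEMENT_3D('',#1,#2,#4);
#4=DIRECTION('',(1.,0.,0.));
#5=PLANE('',#3);
"""
for line in dfs_reorder(sample):
    print(line)
```

In this sample the only root is #5, so the plane is emitted first, followed by its placement and then the point and directions that the placement references.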
suggest_dataset_size
suggest_dataset_size(model: nn.Module, task_type: str = 'classification', quality_level: str = 'good') -> Dict[str, int]
Source: ll_stepnet/stepnet/data_requirements.py:819
Suggest dataset size based on model size and task complexity.
Based on research guidelines:
- Classification: 1,000-5,000 samples per class
- Property prediction: 10,000-100,000 samples
- Fine-tuning: 100-1,000 samples (with pretrained model)
Args:
- model: STEP model
- task_type: 'classification', 'property', 'captioning', etc.
- quality_level: 'minimum', 'good', or 'excellent'
Returns: Dictionary with recommended dataset sizes