API Reference

Generated from the ll_ocadr package source. Each symbol links to its definition on GitHub.

Module `ll_ocadr.run_ll_ocadr_hf`

`build_model_and_tokenizer`

build_model_and_tokenizer(model_name: str, device: str = 'cpu', shape_depth: int | None = None) -> tuple[LatticelabsOCADRForCausalLM, AutoTokenizer, LLOCADRConfig, LLOCADRProcessor]

Source: ll_ocadr/run_ll_ocadr_hf.py:51

Build the OCADR model, tokenizer, config, and preprocessor for HF inference.

n_embed is derived from the chosen language model so the mesh embeddings line up with the LM’s embedding space. The <mesh> token is registered and the LM’s token-embedding matrix is resized to match.
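The token registration and resize step described above can be illustrated with a dependency-free toy. This is a sketch only: `add_special_token` and `resize_embeddings` are hypothetical helpers, and the vocabulary, the embedding width of 4, and the initialisation scale are assumptions. With Hugging Face objects the same step is usually done with `tokenizer.add_special_tokens(...)` followed by `model.resize_token_embeddings(len(tokenizer))`; the source does not confirm which calls `build_model_and_tokenizer` uses.

```python
import random


def add_special_token(vocab: dict, token: str) -> int:
    """Register `token` if it is missing and return its id (hypothetical helper)."""
    if token not in vocab:
        vocab[token] = len(vocab)
    return vocab[token]


def resize_embeddings(matrix: list, new_size: int, seed: int = 0) -> list:
    """Return a copy of `matrix` with `new_size` rows, keeping the existing rows.

    New rows get small random values; the 0.02 scale is an assumption.
    """
    n_embed = len(matrix[0])
    rng = random.Random(seed)
    resized = [row[:] for row in matrix[:new_size]]
    while len(resized) < new_size:
        resized.append([rng.gauss(0.0, 0.02) for _ in range(n_embed)])
    return resized


# Toy vocabulary and embedding table with n_embed = 4 (assumed values).
vocab = {"<pad>": 0, "hello": 1, "world": 2}
embeddings = [[0.0] * 4 for _ in vocab]

mesh_token_id = add_special_token(vocab, "<mesh>")
embeddings = resize_embeddings(embeddings, len(vocab))
```

After the resize, the row at index `mesh_token_id` exists and has the same width as every other row, so mesh embeddings of width n_embed can replace it position by position.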

`main`

main() -> None

Source: ll_ocadr/run_ll_ocadr_hf.py:151

`run_inference`

run_inference(model: LatticelabsOCADRForCausalLM, processor: LLOCADRProcessor, tokenizer, mesh_file: str | Sequence[str], prompt: str, max_new_tokens: int = 64, cropping: bool = True, do_sample: bool = False) -> str

Source: ll_ocadr/run_ll_ocadr_hf.py:94

Run the full mesh-file -> text pipeline and return the decoded output.

mesh_file may be a single path or a sequence of paths. The number of <mesh> placeholders in prompt must equal the number of mesh files; as a convenience, a single mesh with no placeholder gets one appended.

The target device is taken from the model itself (single source of truth), so the inputs always land on the same device as the model.
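The placeholder rule for `mesh_file` and `prompt` described above can be sketched as follows. `normalize_prompt` is a hypothetical helper, not a function of the package; the newline used when appending the placeholder and the `ValueError` raised on a mismatch are assumptions.

```python
from __future__ import annotations

from collections.abc import Sequence

MESH_TOKEN = "<mesh>"


def normalize_prompt(prompt: str, mesh_file: str | Sequence[str]) -> tuple[str, list[str]]:
    """Return (prompt, mesh_files) with one placeholder per mesh file."""
    # Accept a single path or a sequence of paths.
    mesh_files = [mesh_file] if isinstance(mesh_file, str) else list(mesh_file)
    n_placeholders = prompt.count(MESH_TOKEN)

    # Convenience case: one mesh and no placeholder, so append one.
    if n_placeholders == 0 and len(mesh_files) == 1:
        prompt = f"{prompt}\n{MESH_TOKEN}"  # separator is an assumption
        n_placeholders = 1

    if n_placeholders != len(mesh_files):
        raise ValueError(
            f"prompt has {n_placeholders} {MESH_TOKEN} placeholder(s) "
            f"but {len(mesh_files)} mesh file(s) were given"
        )
    return prompt, mesh_files
```

For example, `normalize_prompt("Describe this part.", "bracket.stl")` returns a prompt containing exactly one `<mesh>`, while two mesh files with a placeholder-free prompt raise an error.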

Module `ll_ocadr.vllm.latticelabs_ocadr`

`class LLOCADRMultiModalProcessor`

Source: ll_ocadr/vllm/latticelabs_ocadr.py:575

vLLM integration layer for LL-OCADR. Mirrors DeepseekOCRMultiModalProcessor structure.

Methods

`__init__`

__init__(config)

Source: ll_ocadr/vllm/latticelabs_ocadr.py:581

`class LLOCADRProcessingInfo`

Source: ll_ocadr/vllm/latticelabs_ocadr.py:520

Metadata about mesh processing for vLLM. Provides token count calculations for KV cache allocation.

Methods

`__init__`

__init__(config)

Source: ll_ocadr/vllm/latticelabs_ocadr.py:526

`get_num_mesh_tokens`

get_num_mesh_tokens(mesh_file: str, chunking: bool = True) -> int

Source: ll_ocadr/vllm/latticelabs_ocadr.py:532

Calculate token count based on mesh complexity. Critical for vLLM’s KV cache allocation.

Args:
- mesh_file: Path to mesh file
- chunking: Whether to use spatial chunking

Returns: Total number of tokens this mesh will generate

`class LatticelabsOCADRForCausalLM(nn.Module)`

Source: ll_ocadr/vllm/latticelabs_ocadr.py:15

Main model class integrating 3D geometry processing with an LLM. Mirrors the structure of DeepseekOCRForCausalLM, adapted to 3D meshes.

Methods

`__init__`

__init__(config)

Source: ll_ocadr/vllm/latticelabs_ocadr.py:21

`get_input_embeddings`

get_input_embeddings(input_ids: torch.Tensor, multimodal_embeddings: list[torch.Tensor] | None = None, image_embeddings: list[torch.Tensor] | None = None) -> torch.Tensor

Source: ll_ocadr/vllm/latticelabs_ocadr.py:338

Merge mesh (and optional rendered-image) embeddings with text embeddings. Mesh logic is identical to DeepSeek-OCR’s implementation; image tokens are spliced at image_token_id positions when the vision modality is enabled.

Args:
- input_ids: [batch, seq_len] with mesh_token_id (and optionally image_token_id) placeholders
- multimodal_embeddings: List of [num_mesh_tokens, n_embed] tensors
- image_embeddings: Optional list of [num_vision_tokens, n_embed] tensors (one per item)

Returns: inputs_embeds: [batch, seq_len, n_embed] merged embeddings
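The merge described above amounts to overwriting the text embedding at every placeholder position with the next mesh embedding row. The sketch below shows that splice for a single sequence using plain lists. `merge_mesh_embeddings` is a hypothetical name, and the real method operates on batched tensors and also handles image tokens, neither of which is reproduced here.

```python
def merge_mesh_embeddings(input_ids, text_embeds, mesh_embeds, mesh_token_id):
    """Splice mesh embedding rows into the text embeddings of one sequence.

    input_ids:   list of seq_len token ids
    text_embeds: seq_len rows of n_embed floats
    mesh_embeds: list of items, each a list of [n_embed] rows
    """
    # Flatten all mesh items into one stream of rows, in order.
    mesh_rows = [row for item in mesh_embeds for row in item]
    positions = [i for i, tok in enumerate(input_ids) if tok == mesh_token_id]

    if len(positions) != len(mesh_rows):
        raise ValueError(
            f"{len(positions)} placeholder position(s) but {len(mesh_rows)} mesh row(s)"
        )

    # Copy so the caller's text embeddings are left untouched.
    merged = [row[:] for row in text_embeds]
    for pos, row in zip(positions, mesh_rows):
        merged[pos] = row[:]
    return merged
```

The length check matters in practice: the number of placeholder tokens in `input_ids` must equal the total number of mesh embedding rows, otherwise the splice would silently misalign.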

`forward`

forward(input_ids: torch.Tensor, attention_mask: torch.Tensor | None = None, vertex_coords: torch.Tensor | None = None, vertex_normals: torch.Tensor | None = None, chunks_coords: torch.Tensor | None = None, chunks_normals: torch.Tensor | None = None, mesh_spatial_partition: torch.Tensor | None = None, pixel_values: torch.Tensor | None = None, **kwargs)

Source: ll_ocadr/vllm/latticelabs_ocadr.py:376

Full inference pipeline integrating 3D geometry + language.

Args:
- input_ids: [batch, seq_len] with mesh_token_id placeholders
- attention_mask: [batch, seq_len]
- vertex_coords: [batch, N, 3]
- vertex_normals: [batch, N, 3]
- chunks_coords: [batch, num_chunks, M, 3]
- chunks_normals: [batch, num_chunks, M, 3]
- mesh_spatial_partition: [batch, 3]
- pixel_values: optional rendered images, [batch, 3, H, W] or [batch, V, 3, H, W] (V views); used only when the vision modality is enabled (config.use_vision)

Returns: Language model outputs

`generate`

generate(input_ids: torch.Tensor, attention_mask: torch.Tensor | None = None, vertex_coords: torch.Tensor | None = None, vertex_normals: torch.Tensor | None = None, chunks_coords: torch.Tensor | None = None, chunks_normals: torch.Tensor | None = None, mesh_spatial_partition: torch.Tensor | None = None, pixel_values: torch.Tensor | None = None, **kwargs)

Source: ll_ocadr/vllm/latticelabs_ocadr.py:439

Autoregressive generation with 3D mesh (and optional rendered-image) conditioning.

Processes mesh inputs through the 3D encoders (and rendered images through the vision tower when config.use_vision is set), merges the resulting embeddings with text token embeddings, then delegates to the inner language model’s generate() (which inherits from GenerationMixin).

Accepts all keyword arguments supported by transformers.GenerationMixin.generate (e.g. max_new_tokens, temperature, top_p, do_sample).

Returns: Generated token IDs from the language model.

`build_ll_ocadr_model`

build_ll_ocadr_model(config)

Source: ll_ocadr/vllm/latticelabs_ocadr.py:503

Build LatticeLabs OCADR model.

Module `ll_ocadr.vllm.lattice_encoder.geometry_net`

`class GeometryNet(nn.Module)`

Source: ll_ocadr/vllm/lattice_encoder/geometry_net.py:241

Local geometry encoder for mesh chunks. It plays the role SAM plays for images, extracting fine-grained geometric features.

Architecture:
- PointNet++ with 2 set abstraction layers
- Multi-head attention for local context
- Outputs 256-dimensional features per sampled point

Input: coords [B, N, 3], normals [B, N, 3]

Output: features [B, 128, 256]

Methods

`__init__`

__init__()

Source: ll_ocadr/vllm/lattice_encoder/geometry_net.py:255

`forward`

forward(coords, normals)

Source: ll_ocadr/vllm/lattice_encoder/geometry_net.py:285

Args:
- coords: [B, N, 3] vertex coordinates
- normals: [B, N, 3] vertex normals

Returns: features: [B, 128, 256] - 128 sampled points with 256-dim features

`class PointNetSetAbstraction(nn.Module)`

Source: ll_ocadr/vllm/lattice_encoder/geometry_net.py:195

Set Abstraction layer for hierarchical point cloud feature learning.

Methods

`__init__`

__init__(npoint, radius, nsample, in_channel, mlp)

Source: ll_ocadr/vllm/lattice_encoder/geometry_net.py:200

`forward`

forward(xyz, points)

Source: ll_ocadr/vllm/lattice_encoder/geometry_net.py:213

Args:
- xyz: [B, N, 3] coordinates
- points: [B, N, D] features

Returns:
- new_xyz: [B, npoint, 3]
- new_points: [B, npoint, mlp[-1]]

`build_geometry_net`

build_geometry_net()

Source: ll_ocadr/vllm/lattice_encoder/geometry_net.py:311

Build GeometryNet encoder.

`farthest_point_sample`

farthest_point_sample(xyz, npoint)

Source: ll_ocadr/vllm/lattice_encoder/geometry_net.py:68

Farthest Point Sampling for downsampling point clouds.

Uses torch_cluster.fps (single fused CUDA kernel) when available, falling back to a Python loop implementation otherwise.

Args:
- xyz: [B, N, 3] point cloud
- npoint: number of samples

Returns: centroids: [B, npoint] sampled point indices
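For readers unfamiliar with the technique, the Python-loop fallback mentioned above can be pictured with the following minimal farthest point sampling sketch for a single point cloud. `fps_indices` is a hypothetical name; the package function works on batched tensors, and the deterministic starting index used here is an assumption made so the example is reproducible.

```python
def fps_indices(points, npoint, start=0):
    """Greedy farthest point sampling over a list of (x, y, z) tuples.

    Returns `npoint` indices, each chosen to maximise its distance
    to the set of points already selected.
    """
    n = len(points)
    if not 0 < npoint <= n:
        raise ValueError("npoint must be between 1 and the number of points")

    # min_dist[i] is the squared distance from point i to the nearest selected point.
    min_dist = [float("inf")] * n
    selected = []
    current = start
    for _ in range(npoint):
        selected.append(current)
        cx, cy, cz = points[current]
        for i, (x, y, z) in enumerate(points):
            d = (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2
            if d < min_dist[i]:
                min_dist[i] = d
        # The next sample is the point farthest from everything selected so far.
        current = max(range(n), key=min_dist.__getitem__)
    return selected
```

Each iteration costs O(N), so the loop is O(N * npoint) overall, which is why a fused kernel is preferred when it is available.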

`index_points`

index_points(points, idx)

Source: ll_ocadr/vllm/lattice_encoder/geometry_net.py:100

Index points based on indices.

Args:
- points: [B, N, C]
- idx: [B, S] or [B, S, K]

Returns: new_points: [B, S, C] or [B, S, K, C]
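The two accepted index shapes above correspond to a flat gather and a grouped gather. A minimal sketch for one batch element, using nested lists instead of tensors (`gather_points` is a hypothetical name):

```python
def gather_points(points, idx):
    """Gather rows of `points` ([N][C]) by `idx` of shape [S] or [S][K]."""
    # Nested indices: gather each group separately, giving [S][K][C].
    if idx and isinstance(idx[0], (list, tuple)):
        return [gather_points(points, group) for group in idx]
    # Flat indices: plain row lookup, giving [S][C].
    return [points[i] for i in idx]
```

The flat form is what a sampling step needs (pick centroids), while the nested form is what a grouping step needs (pick K neighbours for each of S centroids).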

`query_ball_point`

query_ball_point(radius, nsample, xyz, new_xyz)

Source: ll_ocadr/vllm/lattice_encoder/geometry_net.py:127

Find up to nsample points within radius of each query point.

Args:
- radius: local region radius
- nsample: max sample number in local region
- xyz: [B, N, 3] all points
- new_xyz: [B, S, 3] query points

Returns: group_idx: [B, S, nsample]
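Because the output has a fixed width of nsample, a ball query has to decide what to do when fewer points fall inside the radius. The sketch below follows the common PointNet++ convention of repeating the first neighbour found; whether the package pads in exactly this way is an assumption, and `ball_query` is a hypothetical single-cloud helper.

```python
def ball_query(radius, nsample, points, queries):
    """For each query, return `nsample` indices of points within `radius`.

    Neighbours are taken in index order, not sorted by distance.
    Groups with fewer than `nsample` neighbours are padded with the first one.
    """
    r2 = radius * radius
    groups = []
    for q in queries:
        inside = [
            i
            for i, p in enumerate(points)
            if sum((a - b) ** 2 for a, b in zip(p, q)) <= r2
        ][:nsample]
        if not inside:
            # Cannot happen when queries are drawn from `points` themselves.
            raise ValueError("query point has no neighbour within radius")
        inside += [inside[0]] * (nsample - len(inside))
        groups.append(inside)
    return groups
```

When the query points are centroids sampled from the same cloud, every ball contains at least its own centroid, so the empty case does not arise.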

`sample_and_group`

sample_and_group(npoint, radius, nsample, xyz, points)

Source: ll_ocadr/vllm/lattice_encoder/geometry_net.py:159

Sample and group points.

Args:
- npoint: number of centroids
- radius: ball query radius
- nsample: max number of samples per ball
- xyz: [B, N, 3] coordinates
- points: [B, N, D] point features

Returns:
- new_xyz: [B, npoint, 3]
- new_points: [B, npoint, nsample, 3+D]

`square_distance`

square_distance(src, dst)

Source: ll_ocadr/vllm/lattice_encoder/geometry_net.py:12

Calculate the squared Euclidean distance between each pair of points.

Args:
- src: [B, N, C]
- dst: [B, M, C]

Returns: dist: [B, N, M]
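Pairwise distances of this shape are commonly computed with the expansion ||s - d||^2 = ||s||^2 + ||d||^2 - 2 s·d, which turns the [N, M] table into one matrix product plus two vectors of norms. The sketch below applies that identity to plain lists for a single batch element. `pairwise_sq_dist` is a hypothetical name, and the source does not confirm that the package uses this particular formulation.

```python
def pairwise_sq_dist(src, dst):
    """Squared Euclidean distance between every row of src ([N][C]) and dst ([M][C])."""
    # Squared norms are computed once per point and reused across the table.
    src_sq = [sum(c * c for c in p) for p in src]
    dst_sq = [sum(c * c for c in p) for p in dst]
    return [
        [
            src_sq[i] + dst_sq[j] - 2.0 * sum(a * b for a, b in zip(src[i], dst[j]))
            for j in range(len(dst))
        ]
        for i in range(len(src))
    ]
```

With floating-point inputs the expansion can produce tiny negative values for nearly identical points, so callers that take a square root afterwards usually clamp the result at zero first.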

Module `ll_ocadr.vllm.lattice_encoder.shape_net`

`class PointPatchEmbedding(nn.Module)`

Source: ll_ocadr/vllm/lattice_encoder/shape_net.py:15

Tokenize point cloud into spatial patches. Divides point cloud into groups and embeds each group.

Methods

`__init__`

__init__(patch_size = 32, embed_dim = 768)

Source: ll_ocadr/vllm/lattice_encoder/shape_net.py:21

`forward`

forward(coords, normals)

Source: ll_ocadr/vllm/lattice_encoder/shape_net.py:41

Args:
- coords: [B, N, 3] vertex coordinates
- normals: [B, N, 3] vertex normals

Returns: patch_tokens: [B, num_patches, embed_dim]

`class ShapeNet(nn.Module)`

Source: ll_ocadr/vllm/lattice_encoder/shape_net.py:148

Global shape encoder for full mesh context. It plays the role CLIP plays for images, extracting high-level semantic features.

Architecture:
- Patch-based tokenization of point cloud
- Transformer encoder with positional encoding
- CLS token for global shape representation

Input: coords [B, N, 3], normals [B, N, 3]

Output: features [B, 257, 768] - CLS + 256 patch tokens

Methods

`__init__`

__init__(embed_dim = 768, depth = 12, num_heads = 12)

Source: ll_ocadr/vllm/lattice_encoder/shape_net.py:162

`forward`

forward(coords, normals)

Source: ll_ocadr/vllm/lattice_encoder/shape_net.py:195

Args:
- coords: [B, N, 3] downsampled mesh vertices
- normals: [B, N, 3] vertex normals

Returns: features: [B, 257, 768] - CLS token + 256 patch tokens

`class TransformerBlock(nn.Module)`

Source: ll_ocadr/vllm/lattice_encoder/shape_net.py:108

Standard Transformer block with self-attention and FFN.

Methods

`__init__`

__init__(embed_dim, num_heads, mlp_ratio = 4.0, dropout = 0.0)

Source: ll_ocadr/vllm/lattice_encoder/shape_net.py:113

`forward`

forward(x)

Source: ll_ocadr/vllm/lattice_encoder/shape_net.py:129

Args: x: [B, N, embed_dim]

Returns: x: [B, N, embed_dim]

`build_shape_net`

build_shape_net(embed_dim = 768, depth = 12, num_heads = 12)

Source: ll_ocadr/vllm/lattice_encoder/shape_net.py:237

Build ShapeNet encoder.
