ll_ocadr — Overview

ll_ocadr (LatticeLabs Optical CAD Recognition) is a DeepSeek-OCR-inspired pipeline for feeding 3D CAD/mesh geometry into a large language model. Instead of document images, it processes CAD (STEP) and mesh (STL/OBJ/PLY) files: it encodes a global, full-resolution view of the object together with tiled local chunks, projects those features into the LLM’s embedding space, and lets the LLM reason over the combined tokens.

Architecture

mesh / STEP file
   │  (process/: chunkers, mesh/STEP loaders)
   ▼
GeometryNet (PointNet++ local features) + ShapeNet (ViT global features)
   │  concatenate → MLP projector → LLM embedding space
   ▼
LatticelabsOCADRForCausalLM → HF language model (forward / generate)

vllm/latticelabs_ocadr.py — the multimodal model LatticelabsOCADRForCausalLM.
vllm/lattice_encoder/ — GeometryNet, ShapeNet, and the MLP projector.
vllm/process/ — file-content chunkers, mesh/STEP loaders, tokenizer glue.
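
The "concatenate → MLP projector → LLM embedding space" step in the diagram can be sketched roughly as follows. This is a minimal NumPy illustration, not the code in vllm/lattice_encoder/: the function name, the feature widths, the two-layer GELU MLP, and the choice to concatenate along the feature axis are all assumptions made for the example.

```python
import numpy as np


def gelu(x):
    # tanh approximation of GELU
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))


def project_to_llm_space(local_feats, global_feats, w1, b1, w2, b2):
    """Fuse local and global features per token, then map them to n_embed.

    Hypothetical helper: assumes both encoders emit the same number of tokens.
    """
    fused = np.concatenate([local_feats, global_feats], axis=-1)
    hidden = gelu(fused @ w1 + b1)
    return hidden @ w2 + b2


rng = np.random.default_rng(0)

# All sizes below are illustrative assumptions, not the project's real widths.
n_tokens, d_local, d_global, d_hidden, n_embed = 256, 128, 384, 512, 1536

local_feats = rng.standard_normal((n_tokens, d_local))    # stand-in for GeometryNet output
global_feats = rng.standard_normal((n_tokens, d_global))  # stand-in for ShapeNet output

w1 = rng.standard_normal((d_local + d_global, d_hidden)) * 0.02
b1 = np.zeros(d_hidden)
w2 = rng.standard_normal((d_hidden, n_embed)) * 0.02
b2 = np.zeros(n_embed)

mesh_tokens = project_to_llm_space(local_feats, global_feats, w1, b1, w2, b2)
print(mesh_tokens.shape)  # (256, 1536)
```

The only hard requirement the sketch captures is that the projector's output width equals the language model's embedding width, so the mesh tokens can sit in the same sequence as text embeddings.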

Inference (HF-native, supported)

python ll_ocadr/run_ll_ocadr_hf.py \
    --model Qwen/Qwen2-1.5B \
    --mesh part.step \
    --prompt "Describe this CAD part: <mesh>" \
    --max-new-tokens 64

The <mesh> placeholder token is registered (and the LM embeddings resized) at build time; n_embed is derived automatically from the chosen language model.
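
To illustrate how a single <mesh> placeholder can stand in for the projected geometry, here is a small pure-Python sketch that swaps the placeholder's embedding row for the mesh token rows. The function name, the token id, and the "exactly one placeholder" rule are assumptions for the example; the actual model may expand or scatter the tokens differently.

```python
def splice_mesh_embeddings(input_ids, text_embeds, mesh_embeds, mesh_token_id):
    """Replace the <mesh> placeholder row with the mesh embedding rows.

    Hypothetical helper: embeddings are plain lists of rows, one row per token.
    """
    positions = [i for i, tok in enumerate(input_ids) if tok == mesh_token_id]
    if len(positions) != 1:
        raise ValueError(
            f"expected exactly one <mesh> placeholder, found {len(positions)}"
        )
    p = positions[0]
    # Keep text rows before and after the placeholder; drop the placeholder row itself.
    return text_embeds[:p] + mesh_embeds + text_embeds[p + 1:]


MESH_TOKEN_ID = 9000  # hypothetical id assigned when <mesh> is registered
n_embed = 4           # toy width; the real value comes from the language model

input_ids = [11, 12, 13, MESH_TOKEN_ID, 14]
text_embeds = [[float(tok)] * n_embed for tok in input_ids]
mesh_embeds = [[0.5] * n_embed for _ in range(3)]  # three toy mesh tokens

combined = splice_mesh_embeddings(input_ids, text_embeds, mesh_embeds, MESH_TOKEN_ID)
print(len(combined))  # 7
```

The mesh rows must have the same width as the text rows, which is why n_embed has to match the chosen language model.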

vLLM serving — experimental / future

Native MLX — a trained, geometry-grounded model

ll_ocadr/mlx/train_ocadr_mlx.py trains the real geometry tower natively on Apple Silicon. The tower is a faithful MLX port of GeometryNet (PointNet++), ShapeNet (Point-BERT), and the projector, verified for forward parity at ~1e-6 against the PyTorch encoders (ll_ocadr/mlx/faithful_tower_mlx.py). It projects a CAD point cloud into 256 mesh tokens that are spliced into a frozen 4-bit Qwen2, with LoRA and the encoder trained jointly. On a held-out CAD point-cloud → class task, LLM-generation accuracy reaches 0.919, compared with 0.313 for a shuffled-mesh control and 0.374 for the majority-class baseline. In other words, the model reads the geometry and verbalizes it rather than guessing from the text prior.
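
The three numbers reported above (model accuracy, shuffled-mesh control, majority baseline) can be computed along these lines. This is a toy sketch: the helper names and the toy predictor are hypothetical, and it assumes the shuffled-mesh control means pairing each example's label with a mesh drawn from a random permutation of the evaluation set.

```python
import random
from collections import Counter


def accuracy(predictions, labels):
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def majority_baseline(labels):
    # Accuracy of always answering with the most frequent class.
    return Counter(labels).most_common(1)[0][1] / len(labels)


def shuffled_mesh_accuracy(predict, meshes, labels, seed=0):
    # Control: permute the meshes so geometry no longer matches the label.
    rng = random.Random(seed)
    shuffled = list(meshes)
    rng.shuffle(shuffled)
    return accuracy([predict(m) for m in shuffled], labels)


# Toy data: each "mesh" simply carries its class, and the toy model reads it back.
labels = ["bracket", "bracket", "gear", "flange"]
meshes = [{"true_class": y} for y in labels]


def toy_predict(mesh):
    return mesh["true_class"]


real = accuracy([toy_predict(m) for m in meshes], labels)
control = shuffled_mesh_accuracy(toy_predict, meshes, labels)
baseline = majority_baseline(labels)
print(real, control, baseline)
```

A model that ignores geometry should score about the same on real and shuffled meshes, so a large gap between the two is the evidence that the geometry is actually being used.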

python ll_ocadr/mlx/faithful_tower_mlx.py --mode parity   # prove the tower == real encoders
python ll_ocadr/mlx/train_ocadr_mlx.py    --mode train    # train encoder + projector + LoRA
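
A forward-parity check of the kind the parity mode performs usually reduces to comparing two outputs on identical inputs. The sketch below is an assumption about the general technique, not the contents of faithful_tower_mlx.py: the function name, the array shapes, and the 1e-5 tolerance are illustrative.

```python
import numpy as np


def max_abs_diff(reference, candidate):
    """Largest element-wise absolute difference between two outputs."""
    reference = np.asarray(reference, dtype=np.float64)
    candidate = np.asarray(candidate, dtype=np.float64)
    if reference.shape != candidate.shape:
        raise ValueError(f"shape mismatch: {reference.shape} vs {candidate.shape}")
    return float(np.max(np.abs(reference - candidate)))


rng = np.random.default_rng(0)

# Stand-ins for the two encoders' outputs on the same point cloud.
reference_out = rng.standard_normal((256, 64)).astype(np.float32)
noise = rng.uniform(-1e-7, 1e-7, reference_out.shape).astype(np.float32)
ported_out = reference_out + noise  # simulated float32 round-off

TOLERANCE = 1e-5  # hypothetical threshold for this example
diff = max_abs_diff(reference_out, ported_out)
print(diff < TOLERANCE)  # True
```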

Status

Use the sidebar for Installation, Usage, and the API Reference.