geotoken — Overview
geotoken is a geometric tokenizer with adaptive quantization. It converts 3D geometry — CAD (STEP, IGES, B-Rep) and meshes (STL, OBJ) — into discrete token sequences suitable for transformer-based models, at three levels:
- Mesh-level — raw vertex/face geometry tokenization.
- Parametric-level — construction history (sketch-and-extrude command sequences, DeepCAD format).
- Topology-level — B-Rep graph structures with feature vectors.
The key idea: adaptive precision
Section titled “The key idea: adaptive precision”Adaptive precision quantization allocates more bits to geometrically complex regions (high curvature, dense features) and fewer bits to flat/simple regions — reducing token count while preserving the features that matter.
| Tier | Bits | Levels | Use case |
|---|---|---|---|
DRAFT | 6 | 64 | Fast preview, low bandwidth |
STANDARD | 8 | 256 | Balanced quality/size (default) |
PRECISION | 10 | 1024 | High fidelity, lossless-adjacent |
A first taste
Section titled “A first taste”from geotoken import GeoTokenizer, QuantizationConfig, PrecisionTierimport numpy as np
vertices = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=np.float32)faces = np.array([[0, 1, 2], [0, 1, 3], [0, 2, 3], [1, 2, 3]], dtype=np.int64)
config = QuantizationConfig(tier=PrecisionTier.STANDARD, adaptive=True)tokenizer = GeoTokenizer(config)
tokens = tokenizer.tokenize(vertices, faces)reconstructed = tokenizer.detokenize(tokens)
impact = tokenizer.analyze_impact(vertices, faces)print(f"Mean error: {impact.mean_error:.6f}")Key classes
Section titled “Key classes”GeoTokenizer— mesh tokenization.CommandSequenceTokenizer— CAD command sequences.GraphTokenizer— B-Rep topology graphs (consumes a cadlingTopologyGraph).CADVocabulary— token → integer-ID encoding.
Status
Section titled “Status”Use the sidebar for Installation, Usage, and the API Reference.