Tutorial: Tokenize a mesh
In this tutorial you will tokenize a mesh with geotoken, reconstruct it, and compare precision tiers. Allow ~10 minutes. geotoken needs only NumPy (trimesh is optional).
1. Build a mesh
Section titled “1. Build a mesh”import numpy as np
# A unit tetrahedronvertices = np.array( [[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=np.float32)faces = np.array( [[0, 1, 2], [0, 1, 3], [0, 2, 3], [1, 2, 3]], dtype=np.int64)To start from a real mesh file instead, load it with trimesh and pass
mesh.vertices / mesh.faces.
2. Tokenize with the default (STANDARD, adaptive) config
Section titled “2. Tokenize with the default (STANDARD, adaptive) config”from geotoken import GeoTokenizer, QuantizationConfig, PrecisionTier
config = QuantizationConfig(tier=PrecisionTier.STANDARD, adaptive=True)tokenizer = GeoTokenizer(config)
tokens = tokenizer.tokenize(vertices, faces)reconstructed = tokenizer.detokenize(tokens)
print("reconstructed vertices:\n", reconstructed)The tokenizer normalizes into a unit cube, scores per-vertex complexity (curvature + feature density), allocates bits accordingly, and prevents distinct vertices from collapsing into the same quantized value.
3. Measure the quantization impact
Section titled “3. Measure the quantization impact”impact = tokenizer.analyze_impact(vertices, faces)print(f"mean error: {impact.mean_error:.6f}")print(f"hausdorff distance: {impact.hausdorff_distance:.6f}")4. Compare precision tiers
Section titled “4. Compare precision tiers”for tier in (PrecisionTier.DRAFT, PrecisionTier.STANDARD, PrecisionTier.PRECISION): t = GeoTokenizer(QuantizationConfig(tier=tier, adaptive=True)) impact = t.analyze_impact(vertices, faces) print(f"{tier.name:9s} mean_error={impact.mean_error:.6f}")Expect error to fall as bit-width rises (DRAFT 6-bit → STANDARD 8-bit → PRECISION 10-bit), at the cost of more tokens.
5. Encode to integer IDs for a model
Section titled “5. Encode to integer IDs for a model”from geotoken import CommandSequenceTokenizer, CADVocabulary
commands = [ {"type": "SOL", "params": [0.5, 0.5] + [0] * 14}, {"type": "LINE", "params": [0.0, 0.0, 0, 1.0, 0.0] + [0] * 11}, {"type": "EXTRUDE", "params": [0] * 15 + [5.0]}, {"type": "EOS", "params": [0] * 16},]seq = CommandSequenceTokenizer().tokenize(commands)ids = CADVocabulary().encode(seq.command_tokens)print("token ids:", ids[:10], "…")Where to next
Section titled “Where to next”- Understand why quantization is necessary: How geometry becomes tokens.
- Feed tokens to a model: ll_stepnet Usage.