Skip to content

geotoken — Usage

geotoken tokenizes geometry at three levels — mesh, parametric (command sequences), and topology (B-Rep graphs) — using adaptive quantization.

from geotoken import GeoTokenizer, QuantizationConfig, PrecisionTier
import numpy as np
vertices = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=np.float32)
faces = np.array([[0, 1, 2], [0, 1, 3], [0, 2, 3], [1, 2, 3]], dtype=np.int64)
config = QuantizationConfig(tier=PrecisionTier.STANDARD, adaptive=True)
tokenizer = GeoTokenizer(config)
tokens = tokenizer.tokenize(vertices, faces)
reconstructed = tokenizer.detokenize(tokens)
impact = tokenizer.analyze_impact(vertices, faces)
print(f"Mean error: {impact.mean_error:.6f}")
print(f"Hausdorff distance: {impact.hausdorff_distance:.6f}")
from geotoken import CommandSequenceTokenizer, CADVocabulary
commands = [
{"type": "SOL", "params": [0.5, 0.5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]},
{"type": "LINE", "params": [0.0, 0.0, 0, 1.0, 0.0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]},
{"type": "EXTRUDE", "params": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 5.0]},
{"type": "EOS", "params": [0] * 16},
]
token_seq = CommandSequenceTokenizer().tokenize(commands)
token_ids = CADVocabulary().encode(token_seq.command_tokens)
from geotoken import GraphTokenizer
import numpy as np
node_features = np.random.randn(10, 48).astype(np.float32) # 10 nodes, 48-dim
edge_index = np.array([[0, 1, 2, 3], [1, 2, 3, 0]], dtype=np.int64)
edge_features = np.random.randn(4, 16).astype(np.float32) # 4 edges, 16-dim
token_seq = GraphTokenizer().tokenize(node_features, edge_index, edge_features)
TierBitsLevelsMax errorUse case
DRAFT664< 0.5Preview, low bandwidth
STANDARD8256< 0.2Default
PRECISION101024< 0.05High fidelity

Adaptive allocation weights curvature (0.7) and feature density (0.3) into a complexity score, then assigns bits by percentile interpolation. After quantization, a spatial-hash pass detects and perturbs collapsed vertices so distinct features stay distinct.

The bridge cadling.backend.geotoken_integration.GeoTokenIntegration tokenizes a whole CADlingDocument:

from cadling.backend.geotoken_integration import GeoTokenIntegration
bridge = GeoTokenIntegration()
result = bridge.tokenize_document(
doc, include_mesh=True, include_graph=True,
include_commands=True, include_constraints=True,
)
mesh_tokens, graph_tokens = result.mesh_tokens, result.graph_tokens
token_ids = result.token_ids

geotoken and ll_stepnet share the same CommandType enum and feature dimensions (48-dim nodes, 16-dim edges, 16 command parameters), enabling zero-adapter integration. ll_stepnet provides a GeoTokenDataset PyTorch wrapper for tokenized data.

from geotoken.vertex import VertexValidator, VertexClusterer, VertexMerger
report = VertexValidator().validate(vertices, faces)
clustering = VertexClusterer(merge_distance=0.005).cluster(vertices)
merged_verts, clean_faces = VertexMerger.merge(vertices, faces, clustering)