Tutorial: Parse a STEP file
In this tutorial you will parse a CAD file with cadling, inspect the resulting document, chunk it for RAG, and export it. Allow ~10 minutes.
Prerequisites
- cadling installed (pip install -e ".[all]" inside the conda env; see Installation).
- A CAD file. Any .step, .stl, .brep, or .iges file works. The repo ships an example part.step at its root.
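If you are unsure which files in a folder qualify, a short standard-library sketch can list them by extension. The helper name find_cad_files is hypothetical and not part of cadling, and matching on the file suffix alone is an assumption about what counts as a usable input.

```python
from pathlib import Path

# Extensions named in the prerequisites above.
CAD_SUFFIXES = {".step", ".stl", ".brep", ".iges"}


def find_cad_files(root):
    """Return files under `root` whose suffix looks like a CAD format.

    Hypothetical helper: it only inspects the extension (case-insensitively)
    and never opens the files.
    """
    root = Path(root)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in CAD_SUFFIXES
    )
```

Calling find_cad_files(".") from the repo root should include the example part.step mentioned above, provided it is present in your checkout.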
1. Convert the file
    from cadling import DocumentConverter, ConversionStatus

    converter = DocumentConverter()
    result = converter.convert("part.step")

    assert result.status in (ConversionStatus.SUCCESS, ConversionStatus.PARTIAL)
    doc = result.document
    print(f"Parsed {len(doc.items)} items from a {result.status.name} conversion")

DocumentConverter detects the format, selects a backend, and runs the Build → Assemble → Enrich pipeline. The resulting CADlingDocument is available as result.document.
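The assert in the snippet above is fine for a tutorial, but Python skips assert statements when run with -O. A minimal sketch of an explicit check follows. require_document is a hypothetical helper, not a cadling API, and it relies only on the result.status.name and result.document attributes already used above.

```python
def require_document(result, accepted=("SUCCESS", "PARTIAL")):
    """Return result.document, or raise if the status name is not accepted.

    Assumes `result` exposes `.status.name` and `.document`, as in the
    tutorial snippet. Illustrative only; not part of cadling.
    """
    status_name = result.status.name
    if status_name not in accepted:
        raise RuntimeError(f"conversion ended with status {status_name}")
    return result.document
```

With this in place, doc = require_document(converter.convert("part.step")) would replace the assert and the attribute access in one step.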
2. Inspect the document
    for item in doc.items[:10]:
        print(type(item).__name__, getattr(item, "entity_type", ""))

    # Topology: the entity-reference graph
    topo = doc.topology
    if topo is not None:
        print("topology nodes:", len(topo.adjacency_list))

Items are typed: STEPEntityItem, MeshItem, AssemblyItem, AnnotationItem.
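Printing the first ten items gives only a partial view. Because items are typed, a count per class name summarizes the whole document. The sketch below uses only the standard library; count_item_types is a hypothetical helper name, and it assumes nothing about the items beyond their Python class.

```python
from collections import Counter


def count_item_types(items):
    """Count items by class name (illustrative helper, not part of cadling)."""
    return Counter(type(item).__name__ for item in items)
```

For example, count_item_types(doc.items).most_common() lists the item classes in the document from most to least frequent.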
3. Chunk it for RAG
    from cadling.chunker.hybrid_chunker import CADHybridChunker

    chunker = CADHybridChunker(max_tokens=512, overlap_tokens=50)
    chunks = list(chunker.chunk(doc))
    print(f"{len(chunks)} chunks")
    print(chunks[0].text[:200])

Each chunk carries meta with entity types, a topology subgraph, embeddings, and a 3D bounding box, so it is ready to index in a vector database.
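To hand chunks to an indexing job, one option is to serialize them as JSON Lines. The sketch below is an assumption-heavy illustration: chunks_to_jsonl is a hypothetical helper, the text attribute comes from the snippet above, and the meta attribute name and its JSON-compatibility are assumptions, so anything json cannot encode falls back to str().

```python
import json


def chunks_to_jsonl(chunks):
    """Serialize chunks to JSON Lines, one {"id", "text", "meta"} object per line.

    Illustrative only. Assumes each chunk has `.text`; `.meta` is read if
    present, and values json cannot encode are converted with str().
    """
    lines = []
    for index, chunk in enumerate(chunks):
        record = {
            "id": index,
            "text": chunk.text,
            "meta": getattr(chunk, "meta", None),
        }
        lines.append(json.dumps(record, default=str))
    return "\n".join(lines)
```

The string can then be written with Path("chunks.jsonl").write_text(...), giving one record per line.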
4. Export
    import json
    from pathlib import Path

    # export_to_json() returns a dict; export_to_markdown() returns a string.
    Path("part.json").write_text(json.dumps(doc.export_to_json(), indent=2))
    Path("part.md").write_text(doc.export_to_markdown())

Or do it all from the CLI

    cadling convert part.step --format json --pretty -o part.json
    cadling chunk part.step --max-tokens 512 --overlap 50 -o chunks.jsonl
    cadling info part.step

Where to next
- Tokenize this geometry: Tokenize a mesh.
- Generate Q&A training data: see cadling Usage → SDG.