cadling — Usage
cadling’s processing flow is: DocumentConverter → Backend → Pipeline (Build → Assemble → Enrich) → CADlingDocument → Chunking / SDG / Export.
Python API
Basic conversion

```python
from cadling import DocumentConverter, ConversionStatus

converter = DocumentConverter()
result = converter.convert("part.step")

if result.status == ConversionStatus.SUCCESS:
    doc = result.document
    print(f"Parsed {len(doc.items)} items")

    json_data = doc.export_to_json()
    markdown = doc.export_to_markdown()
```

With format options
```python
from cadling import DocumentConverter, FormatOption, InputFormat
from cadling.backend.step.step_backend import STEPBackend
from cadling.pipeline.hybrid_pipeline import HybridPipeline

converter = DocumentConverter(
    allowed_formats=[InputFormat.STEP],
    format_options={
        InputFormat.STEP: FormatOption(
            backend=STEPBackend,
            pipeline_cls=HybridPipeline,
        )
    },
)
result = converter.convert("assembly.step")
```

Chunking for RAG
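The chunker in this section is configured with a token budget and an overlap. To make those two numbers concrete, here is a library-independent sketch of sliding-window chunking over a token list. It is not cadling's implementation: `window_chunks` is a hypothetical helper, and the real chunker may split on entity or topology boundaries rather than on fixed windows.

```python
def window_chunks(tokens: list[str], max_tokens: int, overlap_tokens: int) -> list[list[str]]:
    """Split tokens into windows of at most max_tokens, each sharing
    overlap_tokens tokens with the previous window (hypothetical helper)."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap_tokens < max_tokens:
        raise ValueError("overlap_tokens must be in [0, max_tokens)")

    step = max_tokens - overlap_tokens  # how far each window advances
    chunks: list[list[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_tokens])
        if start + max_tokens >= len(tokens):
            break  # the last window already reaches the end of the input
    return chunks


# Ten placeholder tokens, windows of four, one token shared between neighbours.
tokens = [f"t{i}" for i in range(10)]
chunks = window_chunks(tokens, max_tokens=4, overlap_tokens=1)
for index, chunk in enumerate(chunks):
    print(index, chunk)
```

With `max_tokens=512` and `overlap_tokens=50`, the same arithmetic would advance each window by 462 tokens.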
```python
from cadling import DocumentConverter
from cadling.chunker.hybrid_chunker import CADHybridChunker

result = DocumentConverter().convert("part.step")

chunker = CADHybridChunker(max_tokens=512, overlap_tokens=50)
for chunk in chunker.chunk(result.document):
    print(f"Chunk {chunk.chunk_id}: {len(chunk.meta.entity_ids)} entities")
    # chunk.text → text representation
    # chunk.meta → entity types, topology subgraph, embeddings, bbox
```

Synthetic data generation (SDG)
```python
from pathlib import Path
from cadling.sdg.qa import (
    CADPassageSampler,
    CADGenerator,
    CADJudge,
    CADSampleOptions,
    CADGenerateOptions,
    CADCritiqueOptions,
    LlmProvider,
)

# 1. Sample passages from CAD files
sampler = CADPassageSampler(CADSampleOptions(sample_file=Path("samples.jsonl")))
sampler.sample([Path("part.step"), Path("assembly.step")])

# 2. Generate Q&A pairs
gen = CADGenerator(CADGenerateOptions(
    provider=LlmProvider.OPENAI,
    model_id="gpt-4o",
    generated_file=Path("generated.jsonl"),
))
gen.generate(Path("samples.jsonl"))

# 3. Critique and improve
judge = CADJudge(CADCritiqueOptions(
    provider=LlmProvider.OPENAI,
    model_id="gpt-4o",
    critiqued_file=Path("critiqued.jsonl"),
))
judge.critique(Path("generated.jsonl"))
```

cadling (main)
```bash
cadling convert part.step --format json --pretty -o part.json
cadling chunk part.step --max-tokens 512 --overlap 50 -o chunks.jsonl
cadling generate-qa part.step -n 100 -m gpt-4 -o qa.jsonl
cadling info part.step
```

cadling-sdg
```bash
cadling-sdg qa sample part.step assembly.step --chunker hybrid -o samples.jsonl
cadling-sdg qa generate samples.jsonl -p openai -m gpt-4o -o generated.jsonl
cadling-sdg qa critique generated.jsonl -p openai -m gpt-4o --rewrite -o critiqued.jsonl
```

The layers
| Layer | Where | Role |
|---|---|---|
| Backends | cadling/backend/ | Format-specific parsing. DeclarativeCADBackend (text) and RenderableCADBackend (vision). STEP and STL are dual-mode. |
| Pipelines | cadling/pipeline/ | Orchestrate Build → Assemble → Enrich. SimpleCADPipeline, STEPPipeline, STLPipeline, VisionPipeline, VlmPipeline, HybridPipeline. |
| Enrichment models | cadling/models/ | Optional post-processing: geometry analysis, topology validation, mesh quality, surface analysis, interference, GNN segmentation. |
| Chunkers | cadling/chunker/ | RAG chunking: CADHybridChunker, CADHierarchicalChunker, format-specific. |
| SDG | cadling/sdg/ | CADPassageSampler → CADGenerator → CADJudge (+ CADConceptualGenerator). |
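The Pipelines row describes a fixed three-stage sequence, Build → Assemble → Enrich. As a rough illustration of that shape only, the sketch below chains three stage functions over a stand-in document. Every name here (`SketchDocument`, `build`, `assemble`, `enrich`, `run_pipeline`) is hypothetical; cadling's actual pipeline classes and method signatures are not shown in this page and may differ.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class SketchDocument:
    """Hypothetical stand-in for a parsed document, not cadling's CADlingDocument."""
    items: list[str] = field(default_factory=list)
    notes: dict[str, str] = field(default_factory=dict)


Stage = Callable[[SketchDocument], SketchDocument]


def build(doc: SketchDocument) -> SketchDocument:
    # Build: pull raw entities out of the source (hard-coded placeholders here).
    doc.items.extend(["face_1", "edge_1"])
    return doc


def assemble(doc: SketchDocument) -> SketchDocument:
    # Assemble: put the raw entities into a stable, ordered structure.
    doc.items.sort()
    return doc


def enrich(doc: SketchDocument) -> SketchDocument:
    # Enrich: attach optional derived information after the structure exists.
    doc.notes["item_count"] = str(len(doc.items))
    return doc


def run_pipeline(doc: SketchDocument, stages: list[Stage]) -> SketchDocument:
    # Apply each stage in order, feeding the result of one into the next.
    for stage in stages:
        doc = stage(doc)
    return doc


result = run_pipeline(SketchDocument(), [build, assemble, enrich])
print(result.items, result.notes)
```

Keeping enrichment as the last stage matches the table's description of enrichment models as optional post-processing: dropping `enrich` from the list still yields a complete document.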
Core data models
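Of the models listed in this section, TopologyGraph is described as an entity reference graph stored as an adjacency list. As a generic illustration of that data structure (an assumption about the general idea, not cadling's class), the sketch below maps each entity ID to the IDs it references and walks the references breadth-first. `SketchTopologyGraph`, `add_reference`, and `reachable_from` are hypothetical names.

```python
from collections import deque


class SketchTopologyGraph:
    """Hypothetical adjacency-list graph: entity ID -> IDs it references."""

    def __init__(self) -> None:
        self._adjacency: dict[str, list[str]] = {}

    def add_reference(self, source: str, target: str) -> None:
        # Register both endpoints so leaf entities also appear as keys.
        self._adjacency.setdefault(source, []).append(target)
        self._adjacency.setdefault(target, [])

    def references(self, entity_id: str) -> list[str]:
        return list(self._adjacency.get(entity_id, []))

    def reachable_from(self, entity_id: str) -> set[str]:
        # Breadth-first walk over outgoing references, excluding the start node.
        seen: set[str] = set()
        queue = deque(self._adjacency.get(entity_id, []))
        while queue:
            current = queue.popleft()
            if current in seen or current == entity_id:
                continue
            seen.add(current)
            queue.extend(self._adjacency.get(current, []))
        return seen


graph = SketchTopologyGraph()
graph.add_reference("#1", "#2")
graph.add_reference("#1", "#3")
graph.add_reference("#3", "#4")
print(sorted(graph.reachable_from("#1")))
```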
```python
InputFormat        # STEP | STL | BREP | IGES | CAD_IMAGE
ConversionStatus   # SUCCESS | PARTIAL | FAILURE
CADlingDocument    # items, topology, segments, embeddings
CADItem            # STEPEntityItem | MeshItem | AssemblyItem | AnnotationItem
TopologyGraph      # entity reference graph (adjacency list)
```

Related
- Overview · Installation
- Tokenize cadling output with geotoken.