Unified data extraction and preprocessing toolkit for Retrieval-Augmented Generation (RAG) pipelines.