Vector Indexing and Hybrid Models in Information Retrieval
Kenan Matawie, Western Sydney University
Co-author: Sargon Hasso, Illinois Institute of Technology
Abstract: This study extends prior research on semantic enrichment and corpus quality in information retrieval by evaluating the impact of using the Weaviate vector database and document embeddings for indexing. Focusing on ranking effectiveness, it compares vector-based retrieval, the traditional BM25 model, and a hybrid approach. All methods are assessed on the same TREC dataset using Mean Average Precision (mAP) to maintain comparability. Statistical significance testing is applied to analyze differences and correlations in retrieval performance. The findings offer a rigorous evaluation of vector-based methods, both standalone and combined with BM25, and provide insights into their relative effectiveness within modern retrieval frameworks.
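As an illustration of the evaluation metric named in the abstract, the following is a minimal sketch of how Mean Average Precision could be computed for two ranked runs over the same relevance judgments. The function names, the run and qrels dictionaries, and the toy document IDs are hypothetical and are not taken from the study; the actual experiments use a TREC dataset and presumably standard evaluation tooling.

```python
from typing import Dict, List, Set


def average_precision(ranked_ids: List[str], relevant_ids: Set[str]) -> float:
    """Average precision for one query.

    Sums precision@k at every rank k where a relevant document appears,
    then divides by the total number of relevant documents for the query.
    """
    if not relevant_ids:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant_ids)


def mean_average_precision(
    run: Dict[str, List[str]], qrels: Dict[str, Set[str]]
) -> float:
    """Mean of per-query average precision over all judged queries."""
    if not qrels:
        return 0.0
    total = sum(
        average_precision(run.get(query_id, []), relevant)
        for query_id, relevant in qrels.items()
    )
    return total / len(qrels)


# Hypothetical toy data: two queries, relevance judgments, and two ranked runs.
qrels = {"q1": {"d1", "d3"}, "q2": {"d2"}}
bm25_run = {"q1": ["d1", "d2", "d3"], "q2": ["d1", "d2"]}
vector_run = {"q1": ["d3", "d1", "d2"], "q2": ["d2", "d1"]}

bm25_map = mean_average_precision(bm25_run, qrels)
vector_map = mean_average_precision(vector_run, qrels)
print(f"BM25 MAP:   {bm25_map:.4f}")
print(f"Vector MAP: {vector_map:.4f}")
```

Because every run is scored against the same judgments, the per-query average precision values form paired samples, which is the kind of input a paired significance test would take.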