APPLICATION OF INTELLIGENT DATA ANALYSIS METHODS TO EMPLOYEE INFORMATION PROCESSING USING DuckDB and ChromaDB
DOI:
https://doi.org/10.54309/IJICT.2025.23.3.009
Keywords:
Vector databases, Semantic search, OLAP analysis, DuckDB, ChromaDB, Employee clustering, Text data analysis
Abstract
Intelligent analysis of employee data is becoming increasingly relevant in the context of digital transformation of human resource management processes and the growing need to process both structured and unstructured sources of information. Modern approaches require the implementation of solutions capable of efficiently handling tabular data while simultaneously extracting meaning from textual employee descriptions. However, most existing systems lack a unified architecture that combines OLAP analysis, clustering, and semantic search within a locally executable analytical environment.
The objective of this study is to develop a local analytical subsystem for comprehensive analysis of employee information, built on the embedded OLAP DBMS DuckDB and the vector database ChromaDB. To achieve this objective, the following tasks were carried out: data preprocessing, aggregation using SQL queries, employee clustering with the k-means method, vectorization of textual descriptions, and implementation of semantic search based on the all-MiniLM-L6-v2 model.
As a result, a software prototype was developed that enables both quantitative and semantic analysis of employee data. Running locally, the system demonstrated high performance, segmented personnel quickly, and supported semantic search through natural language queries without the need for external services. Additionally, the results were visualized using the Yandex DataLens BI platform.
The proposed solution exhibits a high degree of reproducibility and can be adapted to internal HR analytics, decision support systems, and information-analytical platforms. The findings confirm the potential of integrating OLAP and vector technologies within a unified local architecture for intelligent processing of employee information.
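The clustering step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a plain-Python k-means written only with the standard library, and the employee records, the chosen features (age, years of experience, salary), the min-max scaling, and the helper names `normalize` and `kmeans` are all hypothetical assumptions made for illustration.

```python
import math
import random

# Hypothetical employee records: (name, age, years of experience, salary).
# These values are invented for illustration and do not come from the study.
employees = [
    ("A", 24, 1, 900),
    ("B", 26, 2, 1000),
    ("C", 25, 1, 950),
    ("D", 48, 20, 3000),
    ("E", 51, 25, 3200),
    ("F", 46, 19, 2900),
]


def normalize(rows):
    """Min-max scale every column to [0, 1] so no feature dominates the distance."""
    columns = list(zip(*rows))
    scaled_columns = []
    for col in columns:
        lo, hi = min(col), max(col)
        span = (hi - lo) or 1.0  # avoid division by zero for constant columns
        scaled_columns.append([(v - lo) / span for v in col])
    return [tuple(point) for point in zip(*scaled_columns)]


def kmeans(points, k, iterations=50, seed=0):
    """Basic Lloyd's algorithm: assign points to the nearest centroid, then update."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for each point.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(vals) / len(members) for vals in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:  # converged: centroids stopped moving
            break
        centroids = new_centroids
    return labels, centroids


features = normalize([row[1:] for row in employees])
labels, centroids = kmeans(features, k=2)

for (name, *_), label in zip(employees, labels):
    print(name, "-> cluster", label)
```

On this toy data the junior and senior employees end up in separate clusters. Which cluster receives index 0 depends on the random initialization, so only the grouping is meaningful, not the label values.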
Copyright (c) 2025 INTERNATIONAL JOURNAL OF INFORMATION AND COMMUNICATION TECHNOLOGIES

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
https://creativecommons.org/licenses/by-nc-nd/3.0/deed.en