AI-Assisted Data Modelling & Design: A Systematic Literature Review
Abstract
Purpose: This paper explores how artificial intelligence (AI) supports Data Modelling and Design (DM&D) across its lifecycle and how human-in-the-loop (HITL) mechanisms can enhance model quality in enterprise governance.
Methodology/Approach: A PRISMA 2020-guided systematic review of studies published between 2023 and 2025, drawn from Scopus, Web of Science, and the ACM Digital Library, identified 28 eligible studies. Evidence was synthesised along the DAMA P–D–C–O cycle, with a focus on HITL.
Findings: AI supports planning, building, reviewing, and managing data models through schema generation, enrichment, validation, and optimisation. Results vary with model accuracy, data quality, and semantic gaps. Effective use relies on HITL workflows such as propose–validate, tutoring, co-editing, and feedback loops. A scorecard combining technical, performance, efficiency, and governance indicators traceable via metadata helps demonstrate impact.
Research Limitation/Implication: Findings reflect studies published between 2023 and 2025; results may evolve as AI capabilities progress.
Originality/Value of paper: This study presents the first lifecycle synthesis of AI-assisted DM&D. It organises evidence from 28 studies, defines key HITL patterns for quality assurance, and outlines ways to evaluate AI’s role in modelling governance. The findings provide a basis for further research and for developing frameworks that connect AI-driven modelling with enterprise data governance.
Authors
Copyright (c) 2025 Stepan Stanek, Ota Novotny

This work is licensed under a Creative Commons Attribution 4.0 International License.
Published in Quality Innovation Prosperity, an open access journal whose content is freely available without charge to the user or their institution.
