AI-Assisted Data Modelling & Design: A Systematic Literature Review

Stepan Stanek (1), Ota Novotny (2)
(1) Department of Information Technology, Faculty of Informatics and Statistics, Prague University of Economics and Business, Prague, Czech Republic
(2) Department of Information Technology, Faculty of Informatics and Statistics, Prague University of Economics and Business, Prague, Czech Republic

Abstract

Purpose: This paper explores how artificial intelligence (AI) supports Data Modelling and Design (DM&D) across its lifecycle and how human-in-the-loop (HITL) mechanisms can enhance model quality in enterprise governance.


Methodology/Approach: A PRISMA 2020–guided systematic review (2023–2025) of Scopus, Web of Science, and ACM Digital Library identified 28 eligible studies. Evidence was synthesised along the DAMA P–D–C–O cycle, with a focus on HITL.


Findings: AI supports planning, building, reviewing, and managing data models through schema generation, enrichment, validation, and optimisation. Results vary with model accuracy, data quality, and semantic gaps. Effective use relies on HITL workflows such as propose–validate, tutoring, co-editing, and feedback loops. A scorecard that combines technical, performance, efficiency, and governance indicators, each traceable via metadata, helps demonstrate impact.


Research Limitation/Implication: Findings reflect studies published between 2023 and 2025; results may evolve as AI capabilities progress.


Originality/Value of paper: This study presents the first lifecycle synthesis of AI-assisted DM&D. It organises evidence from 28 studies, defines key HITL patterns for quality assurance, and outlines ways to evaluate AI’s role in modelling governance. The findings provide a basis for further research and for developing frameworks that connect AI-driven modelling with enterprise data governance.



Authors

Stepan Stanek
stepan.stanek@vse.cz (Primary Contact)
Ota Novotny

Author Biographies

Stepan Stanek, Department of Information Technology, Faculty of Informatics and Statistics, Prague University of Economics and Business, Prague, Czech Republic

PhD student and lecturer in Data Governance at the Department of Information Technology, Prague University of Economics and Business.

Ota Novotny, Department of Information Technology, Faculty of Informatics and Statistics, Prague University of Economics and Business, Prague, Czech Republic

Head of the Department of Information Technology, Prague University of Economics and Business; Associate Professor of Informatics since 2009.

Stanek, S., & Novotny, O. (2025). AI-Assisted Data Modelling & Design: A Systematic Literature Review. Quality Innovation Prosperity, 29(3), 39–55. https://doi.org/10.12776/qip.v29i3.2253
