Data Integration and Data Engineering Techniques

Data Integration and Data Engineering, Industry 4.0, Internet of Things (IoT), Artificial Intelligence (AI), Machine Learning (ML), Smart Manufacturing (SM), Computer Science, Data Science, Vehicle, Vehicle Reliability


July 13, 2024


The Keeneland Science Application Partnership (K-SAP) is a National Science Foundation (NSF) funded project, now in its third year, that provides short-term, on-demand, large-scale computational capability to 21 scientific computing teams working on computation- and data-intensive projects, for a combined total of 700,000 Service Units (SUs). A unique aspect of the K-SAP is its strong alignment with NSF-supported research and the level of engagement from both the National Institute for Computational Science (NICS) and the Texas Advanced Computing Center (TACC), which are partners in the K-SAP. This report outlines the technical plan of the K-SAP and discusses the experiences of the first two years.

Data Integration is the process of taking data of various shapes and structures from a wide range of sources and making it available to users and processes. The need for it stems from advances in technology and from international regulations, which together cause large amounts of data to be generated every day across many sectors. To yield useful insights, this data must be accessible, of high quality, relevant to an organization’s objectives, and able to satisfy an ever-increasing regulatory and compliance load. Data Engineering is the first step in making data available to stakeholders. It involves integrating the data, transforming it, and loading it into a database where it can be managed by data professionals, business analysts, and data scientists.
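The integrate, transform, and load steps described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two sample sources (one CSV, one JSON), the field names, the rule that rows with a missing temperature are dropped, and the `readings` table are all hypothetical, and only the Python standard library is used.

```python
import csv
import io
import json
import sqlite3

# Hypothetical sample inputs: two sources that describe the same kind of
# record (vehicle temperature readings) in different shapes and units.
CSV_SOURCE = """vehicle_id,temp_c,recorded_at
V1,71.5,2024-07-01T10:00:00
V2,,2024-07-01T10:05:00
"""

JSON_SOURCE = json.dumps([
    {"vehicle": "V3", "temperature_f": 190.4, "timestamp": "2024-07-01T10:10:00"},
])


def extract_csv(text):
    """Read rows from a CSV source as dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def extract_json(text):
    """Read a list of records from a JSON source."""
    return json.loads(text)


def transform(csv_rows, json_rows):
    """Map both sources onto one schema: (vehicle_id, temp_c, recorded_at)."""
    unified = []
    for row in csv_rows:
        if row["temp_c"] == "":
            continue  # assumed data-quality rule: skip missing readings
        unified.append((row["vehicle_id"], float(row["temp_c"]), row["recorded_at"]))
    for row in json_rows:
        # Convert Fahrenheit to Celsius so both sources share one unit.
        temp_c = round((row["temperature_f"] - 32) * 5 / 9, 2)
        unified.append((row["vehicle"], temp_c, row["timestamp"]))
    return unified


def load(conn, rows):
    """Write the unified rows into a database table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings "
        "(vehicle_id TEXT, temp_c REAL, recorded_at TEXT)"
    )
    conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", rows)
    conn.commit()


def run_pipeline(conn):
    """Run extract, transform, and load in order; return the loaded rows."""
    rows = transform(extract_csv(CSV_SOURCE), extract_json(JSON_SOURCE))
    load(conn, rows)
    return rows


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    for record in run_pipeline(connection):
        print(record)
```

In this sketch the transform step is where the differently shaped sources are reconciled into one schema and unit, so the database only ever receives consistent rows. A production pipeline would typically add validation, logging, and incremental loading, none of which is shown here.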