Data Integration and Data Engineering Techniques
The Keeneland Science Application Partnership (K-SAP) is a National Science Foundation (NSF) funded project, now in its third year, that provides short-term, on-demand, large-scale computational capability to 21 scientific computing teams working on computation- and data-intensive projects, for a combined total of 700,000 Service Units (SUs). A distinctive aspect of the K-SAP is its strong alignment with NSF-supported research and the level of engagement from its two partners, the National Institute for Computational Sciences (NICS) and the Texas Advanced Computing Center (TACC). This report outlines the technical plan of the K-SAP and discusses the experiences of the first two years.
Data Integration is the process of taking data of various shapes and structures from a wide range of sources and making it available to users and processes. The need for it follows from advances in technology and from international regulations, both of which cause large amounts of data to be generated every day across many sectors. To yield useful insights, this data must be accessible, of high quality, relevant to an organization’s objectives, and able to meet an ever-increasing regulatory and compliance load. Data Engineering is the first step in making data available to stakeholders: it involves integrating data from its sources, transforming it, and loading it into a database so that it can be managed and used by data professionals, business analysts, and data scientists.
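The integrate, transform, and load steps described above can be illustrated with a minimal sketch. It is not taken from the report: the two sample sources, their field names, the `customers` table, and every function name are hypothetical, chosen only to show differently shaped data being mapped onto one schema and loaded into a database (here an in-memory SQLite database from the Python standard library).

```python
import csv
import io
import json
import sqlite3

# Hypothetical source 1: a CSV export with its own column names.
CSV_SOURCE = """customer_id,full_name,country
1,Ada Lovelace,uk
2,Grace Hopper,us
"""

# Hypothetical source 2: JSON records with a different, nested shape.
JSON_SOURCE = (
    '[{"id": 3, "name": {"first": "Alan", "last": "Turing"}, "country": "UK"}]'
)


def extract_csv(text):
    """Read CSV rows into dictionaries keyed by header name."""
    return list(csv.DictReader(io.StringIO(text)))


def extract_json(text):
    """Parse a JSON array of records."""
    return json.loads(text)


def transform(csv_rows, json_rows):
    """Map both source shapes onto one schema: (id, name, country)."""
    unified = []
    for row in csv_rows:
        unified.append(
            (int(row["customer_id"]), row["full_name"].strip(), row["country"].upper())
        )
    for row in json_rows:
        name = f'{row["name"]["first"]} {row["name"]["last"]}'
        unified.append((int(row["id"]), name, row["country"].upper()))
    return unified


def load(rows, conn):
    """Create the target table if needed and insert the unified rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, country TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO customers (id, name, country) VALUES (?, ?, ?)", rows
    )
    conn.commit()


def run_pipeline(conn):
    """Run extract, transform, and load, then return the stored rows by id."""
    rows = transform(extract_csv(CSV_SOURCE), extract_json(JSON_SOURCE))
    load(rows, conn)
    return conn.execute(
        "SELECT id, name, country FROM customers ORDER BY id"
    ).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    for record in run_pipeline(connection):
        print(record)
    connection.close()
```

Keeping extraction, transformation, and loading in separate functions means a new source only needs its own extract function and a mapping in `transform`; the target schema and the load step stay unchanged. A production pipeline would also need validation, error handling, and incremental loads, which this sketch leaves out.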
![Creative Commons License](http://i.creativecommons.org/l/by/4.0/88x31.png)
This work is licensed under a Creative Commons Attribution 4.0 International License.