Diplom thesis on ETL processes in current data warehouses

Motivation

The data warehouse has established itself as a data source for strategic reporting in larger companies. The underlying data warehouse technology is usually implemented in much the same way by the various providers. Although the technology used is often comparable, the definition of a data warehouse leaves enough room for manoeuvre to allow differentiation between products. In addition to data modelling and reporting, data acquisition is one of the main aspects of a data warehouse. It is handled by the ETL process: the extraction, transformation and loading of data. The research literature has already made many proposals as to how this process should be implemented. In practice, however, implementations can deviate from these guidelines, either intentionally or unintentionally.

Aim of the work

The aim of this Diplom thesis is to examine how the ETL process has been realised by the data warehouse providers SAP, Oracle and Microsoft. The products SAP BW (NetWeaver2004s), Oracle 10g and Microsoft SQL Server 2005 are compared with each other and with the recommendations from the research literature. The following points are to be analysed:

  • How is the ETL process described in theory? What requirements are specified, and what are the objectives? What needs particular attention during extraction, transformation and loading?
  • How was the ETL process implemented in detail by the providers SAP, Oracle and Microsoft?
  • Extraction: Which data sources are involved? What is the quality of the data stored in these sources? Is delta extraction possible? What is required of the source system? What is crucial for performance?
  • Transformation: How can the data be cleansed? How efficient is this process? Are there automated routines for conversions or even for transposing tables?
  • Loading: What does the data flow within the DWH look like? How can data integration be realised? Is there master data integration? Can the data still be modified?
  • What advantages and potential, and what disadvantages and weaknesses, are evident in the approaches?
  • How do the providers differ, and what are the reasons for this?
  • What other methods are there for loading data into the DWH? What other data sources can be used? Is the ETL process the only sensible way to load data into a DWH?
  • What can be improved in the ETL process?

If you are interested, please contact or .

(Changed: 11 Feb 2026)  Shortlink: https://uol.de/p40768en

This page contains automatically translated content.