This project is maintained by C-EB
Creation of a Data Integration System in a Data Warehouse
The project consisted of two distinct parts:
Part 1: In this phase, I set up a PostgreSQL database (COM_INGESTION_DB) and developed SQL scripts for schema creation, table creation, and data insertion. I managed the ingestion process end to end, using Talend's data integration capabilities to load the flat files into the database. Key tasks included:
Part 2: In this phase, I extended the PostgreSQL database with a data warehouse schema (VENTE_DWH) for advanced analytics and reporting. As in Part 1, I developed SQL scripts for schema creation, table creation, and data manipulation. Within the Talend project (ICOMMERCE_REPORTING), I designed Talend jobs to orchestrate the flow of data into the data warehouse (DWH), ensuring consistency and accuracy. Key tasks included:
- Extracting data from multiple sources (six CSV files)
- Loading data into the target database
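As an illustration of the flat-file ingestion step, a PostgreSQL `COPY` such as the one below could load one of the daily CSV files into an ODS table. The column list, file path, and delimiter are hypothetical, since only the table name ODS_VENTE appears in the documentation.

```sql
-- Hypothetical ODS table for one of the six daily CSV feeds.
CREATE TABLE IF NOT EXISTS ods_vente (
    id_vente    INTEGER,
    id_client   INTEGER,
    id_produit  INTEGER,
    date_vente  DATE,
    quantite    INTEGER,
    montant     NUMERIC(10, 2)
);

-- Bulk-load the flat file (server-side path; \copy would be used client-side).
COPY ods_vente
FROM '/data/daily/ventes.csv'
WITH (FORMAT csv, HEADER true, DELIMITER ';');
```

In the actual project, a Talend job (e.g. a tFileInputDelimited component feeding a tDBOutput) performs this load rather than a raw `COPY`, but the effect on the database is the same.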
| Tool | Purpose |
|:----------:|:---------------------------------:|
| PostgreSQL | Managing ODS and DWH tables |
| Talend | ETL processes |
| GitHub | Hosting the project documentation |
- Creation of a directory containing the daily data
- Creation of the various ODS jobs (data transformation and loading)
- Connection to the database: COM_INGESTION_DB
- Execution of the SQL scripts:
  - `script_create_schema_dwh.sql`: creates the VENTE_DWH schema
  - `script_create_table_dwh.sql`: creates the various DWH tables
  - `script_create_table_ods.sql`: updates the ODS_VENTE table
  - `script_insert_context_variable.sql`: inserts data into the CONTEXT table
  - `script_select_table_dwh.sql`: selects the data inserted into the various tables
  - `script_truncate_table_dwh.sql`: deletes the data present in the various DWH tables
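The scripts listed above could look like the following sketch. Only the schema name (VENTE_DWH) comes from the documentation; the table and column definitions are hypothetical placeholders.

```sql
-- script_create_schema_dwh.sql: create the warehouse schema.
CREATE SCHEMA IF NOT EXISTS vente_dwh;

-- script_create_table_dwh.sql: one of the DWH tables (columns are illustrative).
CREATE TABLE IF NOT EXISTS vente_dwh.dim_client (
    id_client  INTEGER PRIMARY KEY,
    nom        TEXT,
    ville      TEXT
);

-- script_truncate_table_dwh.sql: empty the DWH tables before a full reload.
TRUNCATE TABLE vente_dwh.dim_client;

-- script_select_table_dwh.sql: verify the loaded rows.
SELECT * FROM vente_dwh.dim_client;
```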
- Creation of the DWH jobs
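A DWH job of this kind ultimately moves data from the ODS layer into the warehouse tables. In SQL terms, the movement a Talend job orchestrates could be sketched as follows; the fact-table name, column list, and date filter are hypothetical:

```sql
-- Illustrative ODS-to-DWH load: deduplicate today's rows and insert them
-- into a (hypothetical) sales fact table in the VENTE_DWH schema.
INSERT INTO vente_dwh.fait_vente (id_vente, id_client, id_produit, date_vente, montant)
SELECT DISTINCT id_vente, id_client, id_produit, date_vente, montant
FROM ods_vente
WHERE date_vente = CURRENT_DATE;
```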