
ETL / Data Warehousing


Opulent Soft Consultants in Data Warehousing and ETL specialize in Data Warehouse and Data Mart tools, databases, and approaches:

Relational Database Management Systems:

    •    Oracle
    •    Sybase
    •    DB2

ETL (Extraction, Transformation, and Loading) Tools:

    •    Informatica PowerCenter/PowerMart
    •    Ab Initio
    •    Oracle Warehouse Builder
    •    Business Objects Data Integrator
    •    Ascential DataStage
    •    Hyperion Application Link
    •    Cognos DecisionStream

ETL is short for extract, transform, load: three database functions combined into one tool that pulls data out of one database and places it into another. Extract is the process of reading data from a database.

EXTRACT

Our software can extract data from a single source or from multiple data sources.

Our clients use our customised data extraction software to retrieve data from internal or external environments such as ERP backups, mail servers, remote servers, attendance systems, data entry systems, and employee systems, for further data processing or data storage (data migration).

The usual practice is to import data into an intermediate extracting system followed by data transformation with the addition of metadata prior to export to another stage for further processing.

Data extraction usually brings data into a computer from primary sources, such as measuring or recording devices, or through an electrical connector (e.g. USB) over which ‘raw data’ can be streamed into any computer.

External data sources include web pages, emails, documents, PDFs, scanned text, mainframe reports, spool files, etc. Extracting data from these sources is a technical challenge, and we also deal with changes in physical hardware formats. Data extraction from the web is called web scraping.

We add structure to unstructured data by:

Using text pattern matching, such as regular expressions, to identify small or large-scale structure, e.g. records in a report and their associated data from headers and footers.

Using a table-based approach to identify common sections within a limited domain, e.g. in emailed resumes, identifying skills, previous work experience, qualifications, etc. using a standard set of commonly used headings (these would differ from language to language); for example, education might be found under Education/Qualification/Courses.

Using text analytics to attempt to understand the text and link it to other information.
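The first two approaches above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the report layout, the field names, and the heading table are hypothetical and would differ for every client source.

```python
import re

# Hypothetical report: one header line, detail records, one footer line.
REPORT = """\
REPORT DATE: 2024-01-31  BRANCH: NORTH
1001  Widget A      12   4.50
1002  Widget B       3  19.99
TOTAL RECORDS: 2
"""

HEADER_RE = re.compile(r"REPORT DATE: (?P<date>\d{4}-\d{2}-\d{2})\s+BRANCH: (?P<branch>\w+)")
RECORD_RE = re.compile(
    r"^(?P<item_id>\d+)\s+(?P<name>.+?)\s+(?P<qty>\d+)\s+(?P<price>\d+\.\d{2})$",
    re.MULTILINE,
)
FOOTER_RE = re.compile(r"TOTAL RECORDS: (?P<count>\d+)")


def parse_report(text):
    """Pattern matching: attach header data to each record, check the footer."""
    header = HEADER_RE.search(text)
    if header is None:
        raise ValueError("report header not found")
    records = [
        {
            "item_id": int(m["item_id"]),
            "name": m["name"],
            "qty": int(m["qty"]),
            "price": float(m["price"]),
            "date": header["date"],
            "branch": header["branch"],
        }
        for m in RECORD_RE.finditer(text)
    ]
    footer = FOOTER_RE.search(text)
    if footer and int(footer["count"]) != len(records):
        raise ValueError("footer count does not match parsed records")
    return records


# Table-based approach: a hypothetical table of headings per canonical section.
HEADING_SYNONYMS = {
    "education": {"education", "qualification", "qualifications", "courses"},
    "experience": {"experience", "work experience", "previous work experience"},
    "skills": {"skills"},
}
HEADING_LOOKUP = {syn: canon for canon, syns in HEADING_SYNONYMS.items() for syn in syns}


def split_sections(text):
    """Group the lines of a resume under the canonical section they belong to."""
    sections, current = {}, None
    for line in text.splitlines():
        key = HEADING_LOOKUP.get(line.strip().rstrip(":").lower())
        if key is not None:
            current = key
            sections.setdefault(current, [])
        elif current is not None and line.strip():
            sections[current].append(line.strip())
    return sections
```

Checking the footer count against the number of parsed records is a cheap way to notice when a pattern has silently stopped matching part of a report.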

TRANSFORM

We transform the data so that it is stored in the proper format or structure for querying and analysis in the data warehouse. Our data transformation program converts a set of data values from the data format of a source data system into a usable data format.

Data element to data element mapping is frequently complicated by complex transformations that require one-to-many and many-to-one transformation rules.
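To make the two kinds of rule concrete, here is a minimal Python sketch. The field names (`full_name`, `street`, `city`, `postcode`) and the rule functions are hypothetical and chosen only for illustration.

```python
def split_full_name(record):
    """One-to-many: one source element feeds two target elements."""
    first, _, last = record["full_name"].partition(" ")
    return {"first_name": first, "last_name": last}


def join_address(record):
    """Many-to-one: several source elements feed one target element."""
    parts = (record["street"], record["city"], record["postcode"])
    return {"address": ", ".join(p for p in parts if p)}


# The mapping is an ordered list of rules; each rule returns target elements.
RULES = [split_full_name, join_address]


def transform(record):
    """Apply every mapping rule to a source record and merge the results."""
    target = {}
    for rule in RULES:
        target.update(rule(record))
    return target
```

For example, `transform({"full_name": "Ada Lovelace", "street": "12 High St", "city": "London", "postcode": ""})` yields a record with `first_name`, `last_name`, and a single `address` element built from the non-empty parts.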

Our code generation takes the data element mapping specification and creates an executable program that can be run on a computer system. The generated transformation is written in an easy-to-maintain computer language such as Java or XSLT.
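The idea of generating a program from a mapping specification can be illustrated with a small sketch. It emits Python rather than Java or XSLT purely to keep the example short, and the specification format and column names (`CUST_NO`, `CUST_NM`, `BAL_AMT`) are assumptions, not our production format.

```python
# Hypothetical mapping specification:
# (target field, source field, conversion expression over the variable `value`)
MAPPING_SPEC = [
    ("customer_id", "CUST_NO", "int(value)"),
    ("name", "CUST_NM", "value.strip().title()"),
    ("balance", "BAL_AMT", "float(value)"),
]


def generate_source(spec, func_name="transform_row"):
    """Turn the mapping specification into readable source code."""
    lines = [f"def {func_name}(row):", "    out = {}"]
    for target, source, expr in spec:
        lines.append(f"    value = row[{source!r}]")
        lines.append(f"    out[{target!r}] = {expr}")
    lines.append("    return out")
    return "\n".join(lines) + "\n"


def compile_transform(spec, func_name="transform_row"):
    """Compile the generated source into a callable.

    exec() runs arbitrary code, so this assumes the specification
    comes from a trusted author.
    """
    namespace = {}
    exec(generate_source(spec, func_name), namespace)
    return namespace[func_name]
```

Because `generate_source` returns plain text, the generated program can be reviewed, versioned, and maintained like hand-written code.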

We also undertake a master data recast, in which the entire database of data values is transformed or recast without extracting the data from the database. All data in a well-designed database is directly or indirectly related to a limited set of master database tables by a network of foreign key constraints. Each foreign key constraint depends on a unique database index from the parent database table. Therefore, when the proper master database table is recast with a different unique index, the directly and indirectly related data are also recast or restated. The related data can still be viewed in its original form, since the original unique index still exists with the master data. The database recast is also done in such a way that it does not impact the application architecture or software.

LOAD

In the load phase we load the data into the end target, which could be a simple delimited flat file or a data warehouse. We process the data according to client requirements. Our design allows overwriting existing information with cumulative information, and updating extracted data on a daily, weekly, or monthly basis. The timing of data updates and retrieval, and the choice to replace or append, are based on strategic design choices and client business needs. Our software systems can maintain a history and audit trail of all changes to the data loaded in the data warehouse.
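A minimal sketch of a load step with a replace-or-append choice and an audit trail is shown here, using Python's built-in `sqlite3` module as a stand-in target. The `sales` and `sales_audit` tables and their columns are hypothetical.

```python
import sqlite3
from datetime import datetime, timezone


def init_db(conn):
    """Create the hypothetical target table and its audit table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (region TEXT PRIMARY KEY, total REAL NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_audit "
        "(region TEXT, old_total REAL, new_total REAL, mode TEXT, loaded_at TEXT)"
    )


def load(conn, rows, mode="replace"):
    """mode='replace' overwrites the stored total; mode='append' adds to it."""
    if mode not in ("replace", "append"):
        raise ValueError(f"unknown load mode: {mode}")
    now = datetime.now(timezone.utc).isoformat()
    with conn:  # one transaction per batch: all rows load or none do
        for region, amount in rows:
            found = conn.execute(
                "SELECT total FROM sales WHERE region = ?", (region,)
            ).fetchone()
            old = found[0] if found else None
            new = amount if (mode == "replace" or old is None) else old + amount
            conn.execute(
                "INSERT OR REPLACE INTO sales (region, total) VALUES (?, ?)",
                (region, new),
            )
            # Every change is recorded with its old and new value.
            conn.execute(
                "INSERT INTO sales_audit VALUES (?, ?, ?, ?, ?)",
                (region, old, new, mode, now),
            )
```

Keeping the old and new values side by side in the audit table means the history of any figure in the warehouse can be reconstructed after the fact.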

As the load phase interacts with a database, the constraints defined in the database schema, as well as in triggers activated upon data load, apply (for example, uniqueness, referential integrity, mandatory fields). These constraints also contribute to the overall data quality performance of the ETL process.
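The effect of schema constraints on a load can be sketched with Python's built-in `sqlite3` module. The `department` and `employee` tables are hypothetical; the point is that rows violating uniqueness, a mandatory field, or referential integrity are rejected by the database and can be collected for review.

```python
import sqlite3


def build_db():
    """Create an in-memory database with a hypothetical constrained schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
    conn.executescript(
        """
        CREATE TABLE department (
            dept_id INTEGER PRIMARY KEY,
            name    TEXT NOT NULL UNIQUE
        );
        CREATE TABLE employee (
            emp_id  INTEGER PRIMARY KEY,
            email   TEXT NOT NULL UNIQUE,                           -- mandatory and unique
            dept_id INTEGER NOT NULL REFERENCES department(dept_id) -- referential integrity
        );
        INSERT INTO department VALUES (1, 'Finance');
        """
    )
    return conn


def load_rows(conn, rows):
    """Load rows one at a time, collecting those the constraints reject."""
    loaded, rejected = [], []
    for row in rows:
        try:
            with conn:  # commit on success, roll back on failure
                conn.execute("INSERT INTO employee VALUES (?, ?, ?)", row)
            loaded.append(row)
        except sqlite3.IntegrityError as exc:
            rejected.append((row, str(exc)))
    return loaded, rejected
```

Loading row by row is slower than a bulk insert, but it lets a single bad record be set aside without failing the whole batch.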

Data elements are typically stored by different departments with different labels. Our ETL code can bundle all these data elements and consolidate them into a uniform presentation, for storing in a database or data warehouse.
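As a rough illustration of this consolidation, the sketch below maps department-specific labels onto one set of field names. The alias table and the labels in it are hypothetical.

```python
# Hypothetical alias table: canonical field name -> labels used by departments.
FIELD_ALIASES = {
    "customer_id": {"cust_no", "customer number", "client_id"},
    "amount": {"amt", "invoice total", "value"},
}
ALIAS_TO_FIELD = {
    alias: field for field, aliases in FIELD_ALIASES.items() for alias in aliases
}


def consolidate(record):
    """Rename each label to its canonical field; unknown labels are lower-cased."""
    unified = {}
    for label, value in record.items():
        key = label.strip().lower()
        unified[ALIAS_TO_FIELD.get(key, key)] = value
    return unified
```

With this table, a sales record labelled `Cust_No`/`AMT` and a finance record labelled `Client_ID`/`Invoice Total` both come out with the fields `customer_id` and `amount`.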

Our ETL code can also move information to another application permanently. ETL can be used to transform the data into a format suitable for a new application to use.

An example of this would be an Expense and Cost Recovery System (ECRS), such as those used by accountancies, consultancies, and lawyers. The data usually ends up in the time and billing system, although some businesses may also use the raw data for employee productivity reports to Human Resources (personnel dept.) or equipment usage reports to Facilities Management.

Data Warehousing

In computing, a data warehouse (DW or DWH), also known as an enterprise data warehouse (EDW), is a system used for reporting and data analysis. DWs are central repositories of integrated data from one or more disparate sources. They store current and historical data and are used for creating trend reports for senior management, such as annual and quarterly comparisons.

The data stored in the warehouse is uploaded from the operational systems. The data may pass through an operational data store for additional operations before it is used in the DW for reporting.