JsonEDI automates ETL in much the same way that WordPress automates building a website. Choose plugins based on what you're trying to accomplish, such as data warehousing or loading a search engine. Our wizards then guide the developer by extracting information from several locations, including source data, schema and data models, to populate a data dictionary style repository. The metadata can be further modified, if needed, with simple tags to implement ETL functionality. Minimal or no coding is required. JsonEDI's metadata is centrally managed, responsive to schema changes, created semi- or fully automatically, synchronized with the database schema, and reused to automate DBA, QA and requirements management tasks. We use Json documents for internal processing and can codelessly transform between hierarchical and tabular data.
Equator is a software-as-a-service provider that feeds complex data to 3 of the 4 largest banks in the US. It has half a million data attributes in 5,000 tables under ETL management. This is managed with just one ETL DBA and 1.5 ETL developers. This small ETL team handles change management and production support at a software company with 155 employees. JsonEDI has transformed client integration from an expensive, brittle pain point into a low cost, reliable, customer valued feature. The OLTP source systems are a hybrid of relational and semistructured data populating multiple normalized ODS style databases.
JsonEDI, as its name implies, uses Json documents for internal processing. Source data is converted to Json if needed. During data persistence, the Json is converted to the native data format of the destination. We have several reasons for using Json:
1. We built ETL for hierarchical data first; adding relational data was easy
2. Lightweight and fast
3. Easy to manipulate
4. Everyone is familiar with it
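As a rough illustration of moving between hierarchical and tabular shapes, the sketch below flattens a parsed Json document into a single row with dotted column names. It is not JsonEDI's implementation: the class `JsonRowFlattener`, the dotted and indexed column naming, and the use of plain `Map`/`List` values in place of a Json library are all assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of JsonEDI: flattens a parsed Json document
// (nested Maps and Lists) into one tabular row keyed by column name.
final class JsonRowFlattener {

    static Map<String, Object> flatten(Map<String, Object> document) {
        Map<String, Object> row = new LinkedHashMap<>();
        walk("", document, row);
        return row;
    }

    private static void walk(String path, Object node, Map<String, Object> row) {
        if (node instanceof Map) {
            // Nested object: extend the column name with ".key"
            for (Map.Entry<?, ?> e : ((Map<?, ?>) node).entrySet()) {
                String key = String.valueOf(e.getKey());
                walk(path.isEmpty() ? key : path + "." + key, e.getValue(), row);
            }
        } else if (node instanceof List) {
            // Array: extend the column name with "[index]"
            List<?> items = (List<?>) node;
            for (int i = 0; i < items.size(); i++) {
                walk(path + "[" + i + "]", items.get(i), row);
            }
        } else {
            // Scalar (or null) becomes a column value
            row.put(path, node);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> address = new LinkedHashMap<>();
        address.put("city", "Austin");
        address.put("zip", "78701");

        Map<String, Object> customer = new LinkedHashMap<>();
        customer.put("id", 42);
        customer.put("address", address);
        customer.put("phones", List.of("555-0100", "555-0101"));

        System.out.println(flatten(customer));
    }
}
```

A real destination writer would more likely normalize arrays into child tables than into indexed columns; the indexed form is used here only to keep the sketch short.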
Declarative programming separates the "what needs to be done" from the "how to do it". This contrasts with the imperative style of object oriented and procedural programming, which explicitly defines everything in code. We placed the "what needs to be done" into our data dictionary as tags. The technical implementation, the "how to implement ETL", is already coded using ETL best practices and now sits behind the scenes. Keeping the data dictionary in a metadata repository sets the foundation for our other innovations.
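The split between "what" and "how" can be sketched in a few lines. In this hypothetical example the dictionary maps each field to an ordered list of tags (the "what"), and a fixed table of handlers supplies the prebuilt behavior (the "how"). The tag names `trim`, `upper` and `digits_only`, and the class `DeclarativeTagEngine`, are invented for illustration and are not JsonEDI's actual tags or API.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical sketch of tag-driven ETL; not JsonEDI's real engine.
final class DeclarativeTagEngine {

    // The "how": each tag name maps to a prebuilt implementation.
    private static final Map<String, UnaryOperator<String>> HANDLERS = Map.of(
            "trim", String::trim,
            "upper", s -> s.toUpperCase(Locale.ROOT),
            "digits_only", s -> s.replaceAll("\\D", ""));

    // The "what": dictionary maps a field name to the ordered tags to apply.
    static Map<String, String> apply(Map<String, List<String>> dictionary,
                                     Map<String, String> record) {
        Map<String, String> out = new LinkedHashMap<>(record);
        for (Map.Entry<String, List<String>> entry : dictionary.entrySet()) {
            String value = out.get(entry.getKey());
            if (value == null) {
                continue; // field absent from this record
            }
            for (String tag : entry.getValue()) {
                UnaryOperator<String> handler = HANDLERS.get(tag);
                if (handler == null) {
                    throw new IllegalArgumentException("Unknown tag: " + tag);
                }
                value = handler.apply(value);
            }
            out.put(entry.getKey(), value);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<String>> dictionary = new LinkedHashMap<>();
        dictionary.put("state", List.of("trim", "upper"));
        dictionary.put("phone", List.of("digits_only"));

        Map<String, String> record = new LinkedHashMap<>();
        record.put("state", " tx ");
        record.put("phone", "(555) 010-0100");

        System.out.println(apply(dictionary, record));
    }
}
```

Changing the behavior for a field means editing its tag list in the dictionary; the handler code stays untouched.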
We maintain the data dictionary in a metadata database. This metadata is a combination of the data model and ETL requirements. This allows us to combine the management of database schema and ETL. This metadata can also be managed as a master data management solution, something we call Master Schema Management. This allows application developers to synchronize with data integration. There are countless value propositions to data managed ETL.
Since we're data managed with metadata, you can choose between a design time ETL designer and a dynamically managed, rules based ETL server. These concepts work well together. Start with your initial design; then, as new schema changes occur at the source, they can be automatically added to the data dictionary. Schema changes can also be applied to the destination.
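One way to picture source schema changes flowing into the data dictionary is the sketch below. It compares an incoming record against the known dictionary, registers any field it has not seen, and guesses a datatype from the value. The class `SchemaDriftDetector`, the datatype names, and the representation of the dictionary as a simple field-to-type map are assumptions for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of schema change detection; not JsonEDI's real logic.
final class SchemaDriftDetector {

    // Adds unknown fields from the record to the dictionary (field -> datatype)
    // and returns only the fields that were newly added.
    static Map<String, String> syncNewFields(Map<String, String> dictionary,
                                             Map<String, Object> record) {
        Map<String, String> added = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : record.entrySet()) {
            if (!dictionary.containsKey(e.getKey())) {
                String type = detectType(e.getValue());
                dictionary.put(e.getKey(), type);
                added.put(e.getKey(), type);
            }
        }
        return added;
    }

    // Very small datatype guesser based on the Java type of the value.
    static String detectType(Object value) {
        if (value instanceof Integer || value instanceof Long) {
            return "INTEGER";
        }
        if (value instanceof Number) {
            return "DECIMAL";
        }
        if (value instanceof Boolean) {
            return "BOOLEAN";
        }
        return "STRING";
    }

    public static void main(String[] args) {
        Map<String, String> dictionary = new LinkedHashMap<>();
        dictionary.put("id", "INTEGER");

        Map<String, Object> record = new LinkedHashMap<>();
        record.put("id", 7);
        record.put("email", "ada@example.com");
        record.put("score", 1.5);

        System.out.println("Added: " + syncNewFields(dictionary, record));
        System.out.println("Dictionary: " + dictionary);
    }
}
```

The returned map of added fields is what a follow-up step could use to apply the matching change to the destination schema.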
Our foundation is a Java framework with plugins added to implement ETL. You choose plugins based on the rules you want to implement. These could be NoSQL to SQL, data warehousing, dimensional modeling etc. The metadata completes the ETL configuration. The framework supplies a multi-threaded ETL engine plus management of metadata, jobs/batches, schema, errors and cache. Our Java technology can leverage nearly any Java library to connect to almost any data source.
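The framework-plus-plugins arrangement might look roughly like the following. The `EtlPlugin` interface and `PluginPipeline` class are hypothetical names chosen for this sketch; JsonEDI's actual plugin contract is not shown in this document, and the real engine is described as multi-threaded, which this single-threaded example leaves out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical plugin contract: each plugin transforms one Json-like document.
interface EtlPlugin {
    Map<String, Object> transform(Map<String, Object> document);
}

// Hypothetical framework core: runs the registered plugins in order.
final class PluginPipeline {
    private final List<EtlPlugin> plugins = new ArrayList<>();

    PluginPipeline register(EtlPlugin plugin) {
        plugins.add(plugin);
        return this;
    }

    Map<String, Object> run(Map<String, Object> document) {
        // Copy first so the caller's document is never modified by the pipeline itself.
        Map<String, Object> current = new LinkedHashMap<>(document);
        for (EtlPlugin plugin : plugins) {
            current = plugin.transform(current);
        }
        return current;
    }

    public static void main(String[] args) {
        // Example plugin: adds a surrogate key derived from the natural key.
        EtlPlugin surrogateKey = d -> {
            Map<String, Object> copy = new LinkedHashMap<>(d);
            copy.put("customer_sk", "SK-" + d.get("customer_id"));
            return copy;
        };
        // Example plugin: drops a field the destination does not need.
        EtlPlugin dropScratch = d -> {
            Map<String, Object> copy = new LinkedHashMap<>(d);
            copy.remove("scratch");
            return copy;
        };

        Map<String, Object> input = new LinkedHashMap<>();
        input.put("customer_id", 42);
        input.put("scratch", "tmp");

        PluginPipeline pipeline = new PluginPipeline().register(surrogateKey).register(dropScratch);
        System.out.println(pipeline.run(input));
    }
}
```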
Lookups (SQL, REST or Cached)
Split (Several Methods)
Merge or Union
Primary Keys/Foreign Keys
Surrogate Key Management
Master Data Management Integration
External Data Integration
Primitive Datatype Cleansing
Form Value Cleansing (Address, Email etc)
External Data Quality Integration
Bad Data Error Logging
Master Schema Management
Write Buffering per Node/Partition
Isolation of Read/Transform/Write for Scalability
Integrate Hadoop Java Libraries
Standard Job Management
Incremental/Full Load Job Integration
Destination Error Automatic Retry
Detailed Job Monitoring/Reporting
Record Count Reporting
Data Compares Using Metadata
Detailed Error Logging/Reporting
Compute Columns or Json Elements
User Defined Functions
Extensible Java Framework for ETL
Metadata Supports Hierarchical Data
Automatic Detection of Schema Changes
Automatic Normalization of Semistructured Data
Automatic Datatype Detection
Meet HIPAA, EU, GSA PII Requirements
Tracking/Audit of PII Data
Remove or Obfuscate PII Data
Data Dictionary Report for Documentation
If you gave 10 ETL programmers identical non-trivial requirements, you would get back 10 different ETL coding packages. JsonEDI has simply prebuilt all the technical functionality. All the subjective programming decisions have been replaced with ETL best practices.
Some people assume metadata managed ETL must be slow. In fact, JsonEDI is very lightweight and fast. We are not an interpreted technology: after startup, JsonEDI runs as fully compiled Java code. We are memory and CPU efficient, using only lightweight Json documents during data processing. JsonEDI is fully multithreaded.
Our core innovation is that ETL can be implemented using the declarative programming paradigm. We separated the technology of "how to implement ETL" from our data dictionary, which states "what ETL to do". We also discovered that this approach allows database schema to be coordinated with ETL.