Category: Integration

  • Integration platform vs ETL/AI cloud platform

    An integration platform and a cloud ETL platform are not the same thing.

    They both move data.
    That is why they are often confused.

    But they solve different problems.

    The integration platform runs the business.
    The ETL and AI platform studies the business.


    The business example

    Return to the seafood processing company.

    It needs operational integrations:

    • Orders from ERP to warehouse
    • Batch data from production to quality control
    • Stock updates from warehouse to ERP
    • Shipment booking to transport partners
    • Invoice data from ERP to finance

    These flows affect daily operations. If they stop, the business is disrupted.

    The same company also needs analytical data:

    • Production yield trends
    • Stock forecasting
    • Customer demand analysis
    • Transport cost analysis
    • Quality deviation patterns
    • AI models that predict demand, waste or capacity problems

    These flows help the business understand itself and plan better.

    Both are important.
    But they should not be the same platform.

    Two different responsibilities

    %%{init: {'theme': 'dark'}}%%
    graph TD
        OPS[Operational Systems] --> INT[Integration Platform]
        INT --> OPS2[Operational Systems]
    
        OPS --> ETL[Cloud ETL Platform]
        ETL --> LAKE[Data Lake / Warehouse]
        LAKE --> AI[AI / Analytics]

    The integration platform handles operational movement.

    • Fast enough for business processes
    • Reliable enough for production
    • Traceable enough for support
    • Controlled enough for change
    • Clear enough for ownership

    The ETL and AI platform handles analytical consumption.

    • Historical storage
    • Aggregation
    • Data modeling
    • Reporting
    • Forecasting
    • Machine learning and AI workloads

    One is about running the business.
    The other is about learning from the business.

    Why separation matters

    If the ETL platform becomes the integration platform, analytical concerns start leaking into operational flows.

    That creates problems.

    • Batch timing starts affecting business processes
    • Reporting transformations become operational dependencies
    • AI pipelines consume data before operational meaning is clear
    • Failure handling is optimized for data loads, not business recovery
    • Ownership becomes unclear between operations, analytics and development

    This is dangerous because ETL pipelines often tolerate things operational systems cannot tolerate.

    • Late data
    • Duplicate data
    • Partial loads
    • Reprocessing
    • Schema drift
    • Eventually correct results

    Those may be acceptable in analytics.

    They are not always acceptable when booking transport, reserving stock, approving quality documentation or sending invoice data.

    Operational integration needs correctness at the point of business action.
    Analytics can often correct itself later.
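
    One concrete reading of "correct at the point of business action" is that an operational consumer must not act twice when the same message is delivered again. The sketch below is a minimal illustration, not a reference implementation. The `StockReservationHandler` class, the `message_id` field and the status strings are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class StockReservationHandler:
    """Hypothetical operational consumer that applies each message at most once."""

    stock: dict[str, int]
    processed_ids: set[str] = field(default_factory=set)

    def handle(self, message: dict) -> str:
        message_id = message["message_id"]
        # A redelivered message is acknowledged but not applied a second time.
        if message_id in self.processed_ids:
            return "duplicate_ignored"

        item, quantity = message["item"], message["quantity"]
        available = self.stock.get(item, 0)
        # Reject the request instead of reserving stock that does not exist.
        if quantity > available:
            return "rejected_insufficient_stock"

        self.stock[item] = available - quantity
        self.processed_ids.add(message_id)
        return "reserved"
```

    An analytical load could accept the duplicate and deduplicate later. A stock reservation cannot. In a real platform the set of processed ids would live in durable storage rather than in memory.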

    The wrong architecture

    A common mistake is to route everything through the cloud data platform because it already has connectors, storage and transformation tools.

    %%{init: {'theme': 'dark'}}%%
    graph LR
        ERP[ERP] --> ETL[Cloud ETL Platform]
        WH[Warehouse] --> ETL
        PROD[Production] --> ETL
        QA[Quality System] --> ETL
    
        ETL --> ERP
        ETL --> WH
        ETL --> TMS[Transport Portal]
        ETL --> FIN[Finance]
        ETL --> AI[AI / Analytics]

    This looks efficient on paper.

    But the platform is now responsible for two very different jobs:

    • Running operational business processes
    • Feeding analytical and AI workloads

    Those jobs have different priorities.

    | Concern          | Integration platform              | ETL / AI platform              |
    |------------------|-----------------------------------|--------------------------------|
    | Main purpose     | Operational system communication  | Analytics and insight          |
    | Failure handling | Immediate support, retry, alerting | Reload, reprocess, reconcile   |
    | Time sensitivity | Business-process dependent        | Often batch or near-real-time  |
    | Data shape       | Contract-driven messages          | Analytical models and history  |
    | Ownership        | Integration / application operations | Data / analytics team       |
    | Correctness      | Correct when used                 | Can often become correct later |

    The better architecture

    The integration platform should own operational flows.

    The ETL and AI platform should consume from stable, governed sources.

    %%{init: {'theme': 'dark'}}%%
    graph TD
        ERP[ERP] --> INT[Integration Platform]
        WH[Warehouse] --> INT
        PROD[Production] --> INT
        QA[Quality System] --> INT
        TMS[Transport Portal] --> INT
        FIN[Finance] --> INT
    
        INT --> ERP
        INT --> WH
        INT --> PROD
        INT --> QA
        INT --> TMS
        INT --> FIN
    
        INT --> EVENTS[Operational Events / Canonical Data]
        EVENTS --> ETL[Cloud ETL Platform]
        ETL --> LAKE[Data Lake / Warehouse]
        LAKE --> AI[AI / Analytics]

    This separation gives each platform a clean responsibility.

    • The integration platform keeps operations running
    • The ETL platform prepares data for analysis
    • The AI platform consumes governed and explainable data

    The AI platform should not become the owner of operational truth.

    It should consume from systems and integration events that are already understood, validated and traceable.

    Why AI makes this more important

    AI does not remove the need for integration architecture.

    It increases it.

    An AI model can only be useful if the data it consumes is meaningful.

    • What does this stock number represent?
    • When was it valid?
    • Was it corrected later?
    • Was this batch approved or only produced?
    • Was the order cancelled, delayed or partially fulfilled?
    • Which system owns the truth?

    These are integration and data lineage questions before they are AI questions.

    If the operational data flow is unclear, AI will automate the confusion.
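
    The questions above can be answered by the data itself if each event carries its own context. The sketch below shows one possible shape for such an event. The `StockLevelEvent` class and all of its field names and values are hypothetical, chosen only to mirror the questions in the list.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class StockLevelEvent:
    """Hypothetical event envelope for a single stock figure."""

    event_id: str
    item: str
    quantity_kg: float
    meaning: str            # what the number represents, e.g. "available_to_sell"
    owning_system: str      # which system owns the truth
    valid_at: datetime      # when the figure was valid
    corrects_event_id: Optional[str] = None  # set when this event corrects an earlier one

    def is_correction(self) -> bool:
        return self.corrects_event_id is not None


# Illustrative values only.
original = StockLevelEvent(
    event_id="evt-1",
    item="salmon-fillet",
    quantity_kg=1200.0,
    meaning="available_to_sell",
    owning_system="warehouse",
    valid_at=datetime(2024, 3, 1, 6, 0, tzinfo=timezone.utc),
)
correction = StockLevelEvent(
    event_id="evt-2",
    item="salmon-fillet",
    quantity_kg=1150.0,
    meaning="available_to_sell",
    owning_system="warehouse",
    valid_at=original.valid_at,
    corrects_event_id=original.event_id,
)
```

    A model that receives only `1200.0` has a number. A model that receives the full event has a fact with an owner, a time and a correction history.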

    AI should consume from controlled flows

    In the seafood company, an AI model may predict demand, production capacity or waste.

    But it needs data that has operational meaning:

    • Confirmed orders, not abandoned drafts
    • Approved quality data, not temporary production notes
    • Actual stock, not stale warehouse extracts
    • Real shipment status, not optimistic planning data
    • Corrected historical data, not raw failure states

    The integration platform helps define and expose these flows clearly.

    The ETL platform can then store, model and prepare the data for reporting and AI.
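
    As a small illustration of "confirmed orders, not abandoned drafts", the hand-over to the analytical side can be an explicit filter on operational status. This is a sketch under assumptions: the `status` field and the status values are invented for the example and would come from the owning system in practice.

```python
# Assumed status values with settled operational meaning.
GOVERNED_ORDER_STATUSES = {"confirmed", "shipped", "invoiced"}


def select_for_analytics(orders: list[dict]) -> list[dict]:
    """Keep only orders whose status has settled operational meaning."""
    return [o for o in orders if o.get("status") in GOVERNED_ORDER_STATUSES]


orders = [
    {"order_id": "A-1", "status": "draft"},
    {"order_id": "A-2", "status": "confirmed"},
    {"order_id": "A-3", "status": "cancelled"},
    {"order_id": "A-4", "status": "shipped"},
]
governed = select_for_analytics(orders)
```

    The point is not the filter. The point is that the rule is written down in one place and owned, instead of being rediscovered inside each report or model.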

    The rule of separation

    A practical rule:

    If the flow is needed to run the business, it belongs in the integration platform.
    If the flow is needed to understand the business, it belongs in the ETL and AI platform.

    Some data will move between both worlds.

    That is fine.

    But the direction of responsibility should be clear:

    %%{init: {'theme': 'dark'}}%%
    graph LR
        OPS[Operational Systems] --> INT[Integration Platform]
        INT --> BUSINESS[Business Processes]
        INT --> GOVERNED[Governed Operational Data]
        GOVERNED --> ETL[ETL / Data Platform]
        ETL --> AI[AI / Analytics]

    The integration platform should feed the analytical platform.
    The analytical platform should not quietly become the operational backbone.

    The real lesson

    Cloud ETL platforms are useful.

    AI platforms are useful.

    But they do not replace integration architecture.

    Before a company can safely use AI on operational data, it needs to know:

    • Where the data comes from
    • Which system owns it
    • How it moves
    • How failures are detected
    • How corrections are handled
    • What the data means in business terms

    AI does not make integration less important.
    It makes poor integration more expensive.

    The integration platform keeps business data reliable in motion.

    The ETL and AI platform turns reliable data into insight.

    Both are needed.
    They should not be the same thing.

  • Introducing the integration platform

    In the previous post, we looked at how point-to-point integrations decay as systems grow.

    The problem was not that systems needed to exchange data.
    The problem was that no one owned the exchange.

    This is where an integration platform becomes important.

    An integration platform is not just a technical runtime.
    It is the control layer for how systems communicate.


    Back to the business example

    Our example company is a mid-sized seafood processing business.

    It has production, warehouse, transport, finance, quality control and reporting systems. Each system has a valid reason to exist. The problem is that business processes cross all of them.

    A customer order may touch:

    • ERP for customer, item and order data
    • Production for batch and packing information
    • Warehouse for stock and lot handling
    • Quality system for certificates and approvals
    • Transport portal for shipment booking
    • Data platform for reporting and forecasting

    Without an integration platform, each system learns too much about the others.

    %%{init: {'theme': 'dark'}}%%
    graph LR
        ERP[ERP] --> WH[Warehouse]
        ERP --> PROD[Production]
        ERP --> FIN[Finance]
        WH --> PROD
        WH --> TMS[Transport Portal]
        PROD --> QA[Quality System]
        PROD --> DATA[Data Platform]
        QA --> DATA
        WH --> DATA

    Every system becomes both a business application and an integration engine.

    That is not a clean separation of responsibility.

    What the integration platform changes

    An integration platform introduces a controlled middle layer.

    Systems no longer need to know every other system directly.
    They communicate through a layer that owns transport, transformation, routing, retries, monitoring and traceability.

    %%{init: {'theme': 'dark'}}%%
    graph LR
        ERP[ERP] --> INT[Integration Platform]
        WH[Warehouse] --> INT
        PROD[Production] --> INT
        QA[Quality System] --> INT
        TMS[Transport Portal] --> INT
        FIN[Finance] --> INT
        DATA[Data Platform] --> INT
    
        INT --> ERP
        INT --> WH
        INT --> PROD
        INT --> QA
        INT --> TMS
        INT --> FIN
        INT --> DATA

    This does not remove complexity.
    It moves complexity into a place where it can be owned.

    The purpose of an integration platform is not to make integration disappear.
    It is to make integration visible, governable and supportable.

    What an integration platform owns

    A useful integration platform owns the concerns that should not be scattered across every application.

    • Message routing
    • Protocol handling
    • Data transformation
    • Contract validation
    • Retries and error handling
    • Monitoring and alerting
    • Traceability and logging
    • Operational documentation
    • Security boundaries
    • Versioning and change control

    These are not minor details.
    They decide whether a business process can be trusted in production.
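
    Two of these concerns, contract validation and message routing, can be sketched in a few lines. The example below is a simplified illustration and does not describe any specific product. The `CONTRACTS` table, the `order.created` message type and the `Router` class are all assumptions made for the example.

```python
from typing import Callable

# Hypothetical contract: required fields and expected types per message type.
CONTRACTS = {
    "order.created": {"order_id": str, "item": str, "quantity": int},
}


def validate(message_type: str, payload: dict) -> list[str]:
    """Return contract violations; an empty list means the payload is valid."""
    contract = CONTRACTS.get(message_type)
    if contract is None:
        return [f"unknown message type: {message_type}"]
    errors = []
    for field_name, expected_type in contract.items():
        if field_name not in payload:
            errors.append(f"missing field: {field_name}")
        elif not isinstance(payload[field_name], expected_type):
            errors.append(f"wrong type for field: {field_name}")
    return errors


class Router:
    """Validates each message once, then forwards it to every registered target."""

    def __init__(self) -> None:
        self.routes: dict[str, list[Callable[[dict], None]]] = {}
        self.rejected: list[tuple[str, dict, list[str]]] = []

    def register(self, message_type: str, target: Callable[[dict], None]) -> None:
        self.routes.setdefault(message_type, []).append(target)

    def dispatch(self, message_type: str, payload: dict) -> bool:
        errors = validate(message_type, payload)
        if errors:
            # Invalid messages are kept for support instead of being dropped silently.
            self.rejected.append((message_type, payload, errors))
            return False
        for target in self.routes.get(message_type, []):
            target(payload)
        return True
```

    The sending system does not know who receives the order. The receiving systems do not each re-implement the same validation. Both responsibilities sit in the middle layer, where they can be changed and monitored in one place.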

    The integration platform as a business layer

    In the seafood company, an order is not just a row in an ERP system.

    It becomes a business event.

    • An order was created
    • A batch was produced
    • Stock was reserved
    • A quality certificate was approved
    • A shipment was booked
    • An invoice was prepared

    The integration platform should understand these flows at a technical level.

    Not because it replaces the business systems, but because it connects them in a controlled way.

    %%{init: {'theme': 'dark'}}%%
    graph TD
        ORDER[Order Created] --> INT[Integration Platform]
        INT --> STOCK[Reserve Stock]
        INT --> PROD[Notify Production]
        INT --> QA[Check Quality Requirement]
        INT --> TRANSPORT[Prepare Transport]
        INT --> FINANCE[Prepare Invoice Data]
        INT --> TRACE[Trace and Monitor Flow]

    Now the full process can be observed from one place.
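
    "Observed from one place" usually means that every step of a flow is recorded under a shared identifier. The sketch below assumes a correlation id that follows the order through all steps. The `FlowTrace` class, the step names and the status values are hypothetical.

```python
from collections import defaultdict
from typing import Optional


class FlowTrace:
    """Hypothetical in-memory trace store keyed by a correlation id per business flow."""

    def __init__(self) -> None:
        self._steps: dict[str, list[tuple[str, str]]] = defaultdict(list)

    def record(self, correlation_id: str, step: str, status: str) -> None:
        self._steps[correlation_id].append((step, status))

    def steps(self, correlation_id: str) -> list[tuple[str, str]]:
        return list(self._steps.get(correlation_id, []))

    def stopped_at(self, correlation_id: str) -> Optional[str]:
        """Return the first recorded step that did not succeed, or None if all succeeded."""
        for step, status in self._steps.get(correlation_id, []):
            if status != "ok":
                return step
        return None


trace = FlowTrace()
trace.record("order-1001", "reserve_stock", "ok")
trace.record("order-1001", "notify_production", "ok")
trace.record("order-1001", "prepare_transport", "failed")
```

    With this in place, the answer for order `order-1001` is a lookup, not an investigation across six systems.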

    Why this matters operationally

    When something fails, the question should not be:

    Which system do we log into first?

    The question should be:

    Where did this business flow stop?

    An integration platform gives operations and developers a central place to answer that question.

    • Was the message received?
    • Was the message valid?
    • Was it transformed correctly?
    • Was the target system available?
    • Was it retried?
    • Was anyone alerted?
    • Can the message be replayed safely?

    Without this, teams investigate by guessing.

    With it, they investigate by following the flow.
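
    Several of these questions (was it retried, was anyone alerted, can it be replayed) come down to how delivery failures are handled. The sketch below shows one common approach: a bounded number of attempts, then a dead-letter list and an alert. The function name, its parameters and the alert text are assumptions for illustration only.

```python
import time
from typing import Callable, Optional


def deliver_with_retry(
    send: Callable[[dict], None],
    message: dict,
    dead_letters: list[dict],
    max_attempts: int = 3,
    base_delay_s: float = 1.0,
    alert: Optional[Callable[[str], None]] = None,
) -> bool:
    """Try to deliver a message; park it for replay and alert if every attempt fails."""
    last_error: Optional[Exception] = None
    for attempt in range(1, max_attempts + 1):
        try:
            send(message)
            return True
        except Exception as exc:  # broad on purpose in this sketch
            last_error = exc
            if attempt < max_attempts:
                # Exponential backoff: base delay, then double for each further attempt.
                time.sleep(base_delay_s * 2 ** (attempt - 1))
    # The message is kept, not lost, so it can be replayed once the target recovers.
    dead_letters.append(message)
    if alert is not None:
        alert(f"delivery failed after {max_attempts} attempts: {last_error}")
    return False
```

    Replay is only safe if the receiving side tolerates seeing the same message twice. That is a property of the target and the contract, and it has to be designed, not assumed.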

    Integration platform does not mean one product

    An integration platform does not have to mean one specific vendor or one monolithic tool.

    It can be built from several technologies:

    • API gateway
    • Message broker
    • Integration runtime
    • File transfer handling
    • Transformation services
    • Monitoring and logging
    • Metadata and documentation
    • DevOps pipelines

    The important part is not the brand.
    The important part is that integration is treated as a platform responsibility, not as scattered side effects inside each application.

    What good looks like

    A good integration platform gives the organization:

    • One place to see important flows
    • Clear ownership of integration behavior
    • Reusable patterns for APIs, files, messages and events
    • Consistent logging and alerting
    • Safer change management
    • Better separation between business systems and transport logic
    • More reliable data for reporting and AI

    The integration platform is where data movement becomes an owned capability.

    The real goal

    The goal is not to make every integration complicated.

    The goal is to make important integrations boring.

    Boring means:

    • Visible
    • Logged
    • Monitored
    • Documented
    • Repeatable
    • Supportable
    • Safe to change

    That is what a business needs from its integration platform.


    In the next post, we separate the integration platform from another important layer: the cloud ETL and AI data platform.

  • Why Point-to-Point Integrations Break at Scale

    Most systems don’t fail because of complex business logic.
    They fail because of how they are connected.

    Point-to-point integrations are not usually chosen as an architecture.
    They are what happens when integration architecture is not owned.

    A system needs data, so it calls another system.
    A report needs numbers, so a scheduled export is added.
    A new department needs the same data, so another connection appears.

    No one designs the whole.
    Connections accumulate.

    Over time, the system landscape stops being a set of systems.
    It becomes a network of implicit dependencies.


    A plausible business example

    Imagine a mid-sized seafood processing company.

    It has production facilities, cold storage, transport partners, sales teams, finance, quality control, and customers who expect accurate delivery information.

    The company uses different systems for different jobs:

    • ERP for orders, invoicing, items and customers
    • Warehouse system for stock, lots and cold storage
    • Production system for batches, yield and packing
    • Transport portal for shipment booking and tracking
    • Quality system for certificates, deviations and approvals
    • Data platform for reporting, forecasting and AI analysis

    At first, the integrations are simple.

    The ERP sends orders to the warehouse.
    The warehouse returns stock status.
    The production system sends batch information.
    Finance receives invoice data.

    Each integration makes sense in isolation.
    The problem appears when they become dependent on each other.

    %%{init: {'theme': 'dark'}}%%
    graph LR
        ERP[ERP] --> WH[Warehouse]
        WH --> ERP
        ERP --> PROD[Production]
        PROD --> QA[Quality System]
        PROD --> WH
        WH --> TMS[Transport Portal]
        ERP --> FIN[Finance]
        PROD --> DATA[Data Platform]
        WH --> DATA
        QA --> DATA

    Now a delayed batch update can affect stock levels, transport booking, customer delivery promises, invoice timing, reports and AI forecasts.

    The integration is no longer just moving data.
    It is carrying business trust.


    The hidden complexity

    What starts as a few integrations quickly becomes a dependency graph.

    Each connection introduces:

    • Knowledge of another system’s contract
    • Dependency on another system’s availability
    • Responsibility for retries, errors and edge cases
    • Unclear ownership when data is wrong
    • Manual investigation when something fails

    Individually, these are manageable.
    Collectively, they create a system no one fully understands.

    There is no single place where the full data flow is visible or owned.

    This is the core problem:
    logic is no longer inside systems — it is distributed between them.

    The scaling law

    Point-to-point integrations do not scale linearly.

    The number of possible connections grows quadratically with the number of systems. With n systems there can be up to n(n-1)/2 direct connections:

    • 3 systems → 3 connections
    • 6 systems → 15 connections
    • 10 systems → 45 connections
    • 20 systems → 190 connections

    Every new system increases the number of relationships that must be understood, tested and maintained.
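
    The figures in the list are the number of distinct pairs among the systems. A short sketch to reproduce them (the function name is only for this example):

```python
def max_connections(systems: int) -> int:
    """Number of distinct pairs among `systems` systems: n * (n - 1) / 2."""
    return systems * (systems - 1) // 2


for n in (3, 6, 10, 20):
    print(f"{n} systems -> {max_connections(n)} connections")
# 3 systems -> 3 connections
# 6 systems -> 15 connections
# 10 systems -> 45 connections
# 20 systems -> 190 connections
```

    This is an upper bound. Few landscapes connect every system to every other, but each connection that does exist is one that someone has to understand.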

    This is where most teams lose control.
    Not because the individual systems are too complex, but because the connections are.


    Failure is not binary

    Integrations rarely fail in obvious ways.

    Failures are often:

    • Partial — only some data is transferred
    • Delayed — data arrives too late to be useful
    • Silent — an error is logged, but no one observes it
    • Semantic — the data arrives, but means something different than expected

    This creates a dangerous state:

    Systems appear to work — until someone depends on the data.

    The system does not crash.
    It becomes unreliable in ways that are hard to detect.
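
    Partial failures in particular can be made visible with a simple reconciliation between what was sent and what arrived. The sketch below assumes both sides can list the identifiers they handled. The function and the lot ids are hypothetical.

```python
def reconcile(sent_ids: list[str], received_ids: list[str]) -> dict[str, list[str]]:
    """Compare identifiers on both sides of a transfer and report the differences."""
    sent, received = set(sent_ids), set(received_ids)
    return {
        "missing": sorted(sent - received),      # sent but never arrived
        "unexpected": sorted(received - sent),   # arrived but never sent
    }


result = reconcile(
    sent_ids=["lot-1", "lot-2", "lot-3"],
    received_ids=["lot-1", "lot-3"],
)
```

    This catches partial transfers. It does not catch semantic failures, where the data arrives intact but means something else than the receiver assumes.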

    The scheduled script problem

    A common pattern is scheduled scripts moving data between systems.

    They work because they are:

    • Simple to implement
    • Low cost
    • Easy to understand in isolation

    But they often lack:

    • Central monitoring
    • Alerting
    • End-to-end traceability
    • Retry handling
    • Clear ownership

    In the seafood company, a scheduled stock export fails on Friday night.

    The warehouse system still works.
    The ERP still works.
    The reporting platform still works.

    But the data is stale.

    By Monday morning, sales sees stock that is no longer available. Transport planning is based on old volumes. Finance expects invoices for orders that should have been delayed. The AI forecast consumes incorrect inventory data and produces confident nonsense.

    The result is not constant failure.
    It is uncertain correctness.
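
    The stale export in this story would have been caught by a freshness check on the last successful run. The sketch below is a minimal version of that idea. The function name, the two-hour threshold and the timestamps are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def is_stale(last_success: datetime, now: datetime, max_age: timedelta) -> bool:
    """True when the last successful run is older than the allowed age."""
    return now - last_success > max_age


# Illustrative timestamps: the last good export ran in the evening,
# and the check runs three days later in the morning.
last_good_export = datetime(2024, 3, 1, 22, 0, tzinfo=timezone.utc)
check_time = datetime(2024, 3, 4, 7, 0, tzinfo=timezone.utc)

stale = is_stale(last_good_export, check_time, max_age=timedelta(hours=2))
```

    A check like this only helps if it runs somewhere other than the script it is watching, and if its result reaches a person.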


    Why point-to-point breaks

    Point-to-point is not inherently wrong.

    It works when:

    • There are few systems
    • Integrations are simple
    • Change is infrequent
    • Ownership is clear

    It breaks when:

    • The number of systems grows
    • Systems evolve independently
    • Business processes cross many applications
    • No central ownership exists
    • No one can see the whole flow

    The problem is not connections.
    It is unmanaged connections.

    The real issue

    At small scale, point-to-point is efficient.

    At medium scale, it becomes fragile.

    At large scale, it becomes unmanageable.

    By the time teams realize this, the problem is no longer fixing one integration.

    It is regaining control over a system where:

    • No one owns the full data flow
    • No one sees the full picture
    • No one can change it safely
    • No one trusts the data completely

    Point-to-point does not fail suddenly.
    It decays until change becomes dangerous.


    In the next post, we introduce the control layer that prevents this: the integration platform.