Orchestrate your data pipelines in one managed platform
STOIX orchestrates, schedules and monitors everything in your data ecosystem. Combine your pipelines into complex dependency chains. Take control of your data infrastructure and orchestrate your custom code or external tools such as dbt, Snowflake and Tableau.


Features
Monitoring
STOIX monitors your data pipelines and makes it easy to visualise their status in the real-time dashboard. Set up alert policies to be notified through your favourite communication tool when a pipeline fails.
Dependencies
STOIX lets you build complex dependency trees of data pipelines that must run in a specific order. You can run chains of pipelines that depend on the outputs of other pipelines, including external tools. For example, your dbt transformation can run as soon as your PySpark ingestion job to BigQuery has succeeded.
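As an illustration of the idea, and not STOIX's actual API or configuration format, a dependency chain like the one described above can be modelled as a graph and resolved into a run order with a topological sort. The pipeline names below are hypothetical, and the sketch uses only Python's standard library.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical pipeline names (not STOIX identifiers).
# Each key maps to the set of pipelines that must succeed before it runs.
dependencies = {
    "pyspark_ingest_to_bigquery": set(),
    "dbt_transform": {"pyspark_ingest_to_bigquery"},
    "tableau_refresh": {"dbt_transform"},
}


def execution_order(deps):
    """Return pipeline names ordered so each one comes after its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

A topological sort also rejects circular dependencies: `TopologicalSorter` raises `graphlib.CycleError` if two pipelines depend on each other.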
Backfills
Easily backfill date- or time-partitioned pipelines with historical data. Backfills also let you update datasets and tables after a bug fix, and refresh datasets when the source data has changed.
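Conceptually, backfilling a date-partitioned pipeline means re-running it once for each partition in a historical range. The sketch below shows that loop in plain Python. The `backfill` and `daily_partitions` helpers and the `run_pipeline` callback are hypothetical names used for illustration, not part of STOIX.

```python
from datetime import date, timedelta


def daily_partitions(start, end):
    """Yield every date from start to end, inclusive."""
    if start > end:
        raise ValueError("start must not be after end")
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)


def backfill(run_pipeline, start, end):
    """Run the pipeline once per daily partition; return the partitions processed."""
    processed = []
    for partition in daily_partitions(start, end):
        run_pipeline(partition)  # hypothetical callback that processes one partition
        processed.append(partition.isoformat())
    return processed


# Example: re-run a (no-op) pipeline for four days spanning a month boundary.
runs = backfill(lambda partition: None, date(2024, 1, 30), date(2024, 2, 2))
print(runs)
```

The same loop covers both use cases mentioned above: re-running a range after a bug fix, or refreshing partitions whose source data has changed.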
Write code in any language
Because pipelines run in containers, you are free to choose any programming language or tool when building them. Everyone in the organisation with development experience can contribute to your data ecosystem.
Ownership
Collaborate with your team in real time. See who is responsible for each data pipeline so that issues with output data can be resolved efficiently.
Infrastructure management
STOIX manages your infrastructure and handles the allocation of computing resources. It automatically scales the nodes in your infrastructure to keep your data pipelines cost-effective for your organisation.
System Architecture
The Conductor
The Conductor takes care of all the complex orchestration, scheduling and monitoring logic for you. Manage your data infrastructure from the real-time cloud dashboard, with the flexibility to meet your business requirements.
The Controller
A Kubernetes controller that runs in your cluster and executes containers as instructed by the Conductor. Execution metadata is reported back to the Conductor. Running the Controller in your cluster ensures that data never leaves your infrastructure, making it easy to comply with security standards. Learn more about how the Controller works here.