How Sail Utilizes and Extends Apache DataFusion
Sail adopts Apache DataFusion’s trusted query engine and extends it as part of a larger, distributed compute framework with Spark-compatible semantics.

Everett focuses on go-to-market strategy and bringing advanced data infrastructure to organizations at scale. A Bay Area native, he holds a bachelor’s degree in Business from the University of Colorado Boulder and went on to lead marketing, growth, and consulting initiatives, scaling audiences into the millions and advising organizations on brand and market expansion.
The future of data is composable. Sail brings distributed computation into that vision—modular, Arrow-native, and Spark-compatible.
Sail 0.4 introduces native Apache Iceberg support and major improvements to Delta Lake integration.
Sail turns one! Celebrate with us as we reflect on our journey, and look ahead to the future of unified data and AI workloads.
Sail 0.3.2 brings native Delta Lake read/write support and expanded object storage integration with Azure, GCS, Cloudflare R2, and AWS S3.
Sail 0.3 adds support for Spark 4.0 while maintaining compatibility with Spark 3.5. This is the latest evidence of our long-term commitment to your data processing needs.
With the Sail MCP server, data analytics in Spark is possible for both LLM agents and humans.
Join us for a close look at Sail’s new in-house SQL parser built using parser combinators and Rust procedural macros.
Rust is redefining big data infrastructure by offering superior performance, memory safety, and scalability over traditional JVM-based systems.
Join us in a brief tour of how the enhanced UDF support opens up possibilities to bridge the gap between traditional ETL workloads and AI.