Blog

Engineering deep-dives and updates from the LakeSail team.

Agent Skills for Spark Workloads

Meet the new Sail CLI feature for one-shot execution of any PySpark script. Access instant Spark-compatible compute in a single command, and give your agents data and AI engineering capabilities on demand.

2 min read Mar 2026

Release

Sail 0.5: Resilient and Observable Distributed Execution

Sail 0.5 introduces a redesigned control plane for distributed execution with task region scheduling, unified shuffle, failure recovery, and a system catalog queryable via SQL.

6 min read Feb 2026

Engineering

Spark's Python Problem and How Sail Solves It

AI workflows rely on Python, but Spark isolates Python behind an inter-process boundary. Sail executes Python UDFs natively in-process for true high-performance distributed compute.

5 min read Dec 2025

Engineering

How Sail Utilizes and Extends Apache DataFusion

Sail adopts Apache DataFusion's trusted query engine and extends it as part of a larger, distributed compute framework with Spark-compatible semantics.

3 min read Nov 2025

Engineering

Sail, the Last Piece of the Composable Data Stack

The future of data is composable. Sail brings distributed computation into that vision: modular, Arrow-native, and Spark-compatible.

6 min read Nov 2025

Release

Sail 0.4: Native Apache Iceberg Support

Sail 0.4 introduces native Apache Iceberg support and major improvements to Delta Lake integration.

4 min read Oct 2025

Engineering

Sail Turns One

Sail turns one! Celebrate with us as we reflect on our journey and look ahead to the future of unified data and AI workloads.

5 min read Sep 2025

Release

Sail 0.3: Long Live Spark

Sail 0.3 adds support for Spark 4.0 while maintaining compatibility with Spark 3.5, along with faster object store performance and revamped documentation.

5 min read Jul 2025

Release

Sail 0.3.2: Start the Journey from Your Lakehouse

Sail 0.3.2 brings native Delta Lake read/write support and expanded object storage integration with Azure, GCS, Cloudflare R2, and AWS S3.

4 min read Aug 2025

Release

Announcing Sail 0.2.6

Sail 0.2.6 delivers enhancements across temporal data handling, SQL compatibility, Parquet integration, and the MCP server.

2 min read May 2025

Engineering

Sail MCP Server: Spark Analytics for LLM Agents

With the Sail MCP server, data analytics in Spark is possible for both LLM agents and humans.

7 min read Mar 2025

Engineering

Writing a Rust SQL Parser in One Week

A close look at Sail's new in-house SQL parser built using parser combinators and Rust procedural macros.

9 min read Mar 2025

Engineering

Beyond the JVM: How Rust is Redefining Big Data for the AI Era

Rust is redefining big data infrastructure by offering superior performance, memory safety, and scalability over traditional JVM-based systems.

5 min read Feb 2025

Release

Sail 0.2.1: Enhanced UDF Support

How enhanced UDF support in Sail opens up possibilities to bridge the gap between traditional ETL workloads and AI.

4 min read Jan 2025

Engineering

Why It's Possible Now: Sail 0.2 and the Evolution of Distributed Compute Frameworks

Announcing Sail 0.2, the latest milestone in the evolution of distributed compute frameworks. Explore how advancements in programming languages and data infrastructure make it possible to unify batch, stream, and AI workloads into a high-performance framework.

6 min read Dec 2024

Engineering

Sail 0.2 and the Future of Distributed Processing

We are thrilled to unveil the preview release of Sail 0.2, which introduces support for distributed processing on Kubernetes. This post offers a detailed architectural deep dive and an overview of our expanded Spark support.

5 min read Nov 2024

Engineering

Introducing Sail Enterprise Support

Discover how Sail Enterprise Support empowers your team with dedicated, flexible, and customizable solutions to meet the needs of your organization.

1 min read Sep 2024

Engineering

A Sail Recipe: Tackling an Out-of-Control Redshift Bill

Deriving insights from your data shouldn't cost you an arm, a leg, and a kidney. Learn how you can work directly with your data in Amazon S3 using Sail, saving both time and money.

3 min read Sep 2024

Release

The First PySail Release

We are thrilled to announce the 0.1 release of Sail. Get started with the PySail package today, and check out the documentation site.

2 min read Aug 2024

Benchmarks

Supercharge Spark: Quadruple Speed, Cut Costs by 94%

The preview of Sail is here. In the derived TPC-H benchmark, Sail achieves nearly 4x speed-up and 94% hardware cost reduction, with the same PySpark code.

6 min read Jul 2024

Ready to get started?

Get early access to LakeSail.

