Because your pipelines shouldn’t have to suffer
Upgrading your workloads to a faster, more cost-efficient engine has never been easier.
Built for data engineers
A Rust-native engine, Spark compatibility, and the day-to-day tooling you need.
Rust-Native Engine
Zero-copy Arrow execution with no JVM. Up to 8x faster than Spark.
Drop-in Compatibility
Run existing Spark Connect workloads without rewrites. Same API, faster engine.
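As a sketch of what "drop-in" means: a stock PySpark client can point at a Sail endpoint over Spark Connect. The host and port below are hypothetical; the only change from a classic Spark job is `.remote(...)` in place of `.master(...)`.

```python
# Hypothetical Sail endpoint; any Spark Connect URL works here.
SAIL_ENDPOINT = "sc://localhost:50051"

def run_job(endpoint: str = SAIL_ENDPOINT):
    # Deferred import so this module also loads without pyspark installed.
    from pyspark.sql import SparkSession  # pip install "pyspark[connect]"

    # Same DataFrame API as before; only the connection target changes.
    spark = SparkSession.builder.remote(endpoint).getOrCreate()
    df = spark.range(1_000).selectExpr("id", "id * 2 AS doubled")
    return df

# Usage (against a live Sail endpoint): run_job().show(5)
```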
Proven at Scale
Arrow Flight data exchange, pipelined shuffles, and automatic failure recovery keep jobs scaling without reconfiguration.
On-Demand Provisioning and Autoscaling
Nodes are provisioned automatically per job, and scale down after completion. Pay only for the compute you're actively using.
Job Orchestration
Dependencies, cron schedules, and automatic retries built in. Connects with Airflow, Dagster, and the orchestration tools you already use.
Open Formats
Read and write with Rust-native Delta Lake and Iceberg support. Ingest from any modality.
Python and SQL Jobs
Write, run, and schedule jobs from a single workspace. One place for ad hoc analysis and production pipelines alike.
Runs in Your Cloud
Deploys inside your AWS account. Retain full control over security, networking, and data residency.
How LakeSail takes your workloads to the next level
The engineering advantages that save you time and money every day.
Seconds to Ready
Lightweight native processes start in seconds instead of the minutes heavyweight runtimes need, so your jobs begin processing immediately. No more long waits before any real work begins.
Native-Speed Python UDFs
Sail embeds a Python interpreter directly in the engine process. No data serialization or copying between built-in operations and your Python UDFs.
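To illustrate, the UDF itself is ordinary Python, registered through the standard PySpark API; the function and column names below are made up for the example.

```python
def fahrenheit(celsius: float) -> float:
    """Plain Python logic; this is what runs inside the engine process."""
    return celsius * 9.0 / 5.0 + 32.0

def apply_udf(spark):
    # Deferred imports so the pure function above is usable without Spark.
    from pyspark.sql.functions import udf
    from pyspark.sql.types import DoubleType

    # Standard PySpark UDF registration; no code changes needed for Sail.
    to_f = udf(fahrenheit, DoubleType())
    df = spark.createDataFrame([(0.0,), (100.0,)], ["celsius"])
    return df.select(to_f("celsius").alias("fahrenheit"))

# Usage (with a live session): apply_udf(spark).show()
```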
Compile-Time Safety
Sail is built in Rust, which guarantees memory safety and prevents data races at compile time. No garbage collection pauses, and fewer bugs in production.
Lower Infrastructure Costs
LakeSail finishes the same workloads on smaller instances. No more paying for capacity you don't need.
Bring your data in. Keep it open.
Connect any source or sink. Land in open lakehouse tables with no lock-in.
Native format support
Read and write any data modality natively. No external connectors or conversion steps needed.
First-class lakehouse tables
Read and write with Rust-native Delta Lake and Iceberg support.
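As a minimal sketch, lakehouse tables go through the standard DataFrame `format(...)` API; the S3 path below is hypothetical.

```python
DELTA_PATH = "s3://my-bucket/tables/events"  # hypothetical table location

def write_then_read(spark, path: str = DELTA_PATH):
    # Write a small DataFrame as a Delta table, then read it back.
    df = spark.range(100).withColumnRenamed("id", "event_id")
    df.write.format("delta").mode("overwrite").save(path)
    return spark.read.format("delta").load(path)

# Iceberg follows the same pattern with .format("iceberg"), typically
# addressed through a configured catalog rather than a raw path.
```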
Python Data Sources
Can't find your data source? Define custom readers and writers in Python to connect to any system.
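A sketch of what a custom reader can look like, assuming Sail accepts sources built with PySpark's Python Data Source API (`pyspark.sql.datasource`). The source name, schema, and record generator are all invented for the example; the record logic is plain Python.

```python
def fetch_records(start, end):
    """Stand-in for calls to an external system: yield (id, value) rows."""
    for i in range(start, end):
        yield (i, f"value-{i}")

def register(spark):
    # Deferred import so the generator above is usable without Spark.
    from pyspark.sql.datasource import DataSource, DataSourceReader

    class DemoReader(DataSourceReader):
        def read(self, partition):
            # Rows are yielded as plain tuples matching the schema below.
            yield from fetch_records(0, 5)

    class DemoSource(DataSource):
        @classmethod
        def name(cls):
            return "demo"

        def schema(self):
            return "id INT, value STRING"

        def reader(self, schema):
            return DemoReader()

    spark.dataSource.register(DemoSource)

# Usage (with a live session):
#   register(spark)
#   spark.read.format("demo").load().show()
```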
Zero rewrites required
LakeSail drops into your existing stack: same APIs, same data, faster engine.
- Spark Connect compatibility: same API, faster engine, zero rewrites
- Run existing jobs faster on smaller instances
- Keep your data where it is: your S3, your VPC
- Existing orchestrators work as-is
Simple to get started
LakeSail runs in your AWS account, so there are a few setup steps. Here’s what to expect.
Create Account
Sign up with email, verify via code, and set up mandatory 2FA.
Connect AWS
Launch a CloudFormation template in your account. Requires admin access.
Create Cluster
Set up a VPC with your CIDR block, then create a cluster.
Run Your First Query
Open the SQL editor, point to your data, and go.