Big Data Processing for the AI Era
Sail: An open-source computation framework with a mission to unify batch processing, stream processing, and compute-intensive AI workloads.
Announcing Sail 0.2.1: Enhanced UDF support and better Spark compatibility.
The LakeSail Slack Community is now live.
Get Started with Sail
Sail features a drop-in replacement for Spark SQL and the Spark DataFrame API in both single-host and distributed settings.
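Install Sail with Spark support:

```bash
pip install "pysail[spark]"
```

Then start a Sail server in one of the following ways.

From the command line:

```bash
sail spark server --port 50051
```

From Python:

```python
from pysail.spark import SparkConnectServer

# Run the server in the foreground (background=False) on port 50051.
server = SparkConnectServer(port=50051)
server.start(background=False)
```

On Kubernetes, given a `sail.yaml` manifest for the Sail deployment:

```bash
kubectl apply -f sail.yaml
kubectl -n sail port-forward service/sail-spark-server 50051:50051
```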
Once you have a running Sail server, you can connect to it in PySpark.
No changes are needed in your PySpark code beyond pointing the session at the Sail server!
```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.remote("sc://localhost:50051").getOrCreate()
spark.sql("SELECT 1 + 1").show()
```
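The same session also accepts DataFrame API calls. The following is a minimal sketch, not an official Sail example: `wait_for_server`, `remote_url`, and `run_demo` are hypothetical helper names introduced here, and the sample rows are made up for illustration. It assumes a Sail server is listening on `localhost:50051` as started above, and that `pysail[spark]` has installed PySpark with Spark Connect support.

```python
import socket
import time


def wait_for_server(host: str = "localhost", port: int = 50051, timeout: float = 30.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper: it only checks that the port accepts connections,
    not that the process behind it is actually a Sail server.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.5)  # server not reachable yet; retry shortly
    return False


def remote_url(host: str = "localhost", port: int = 50051) -> str:
    """Build the Spark Connect URL used by SparkSession.builder.remote()."""
    return f"sc://{host}:{port}"


def run_demo(host: str = "localhost", port: int = 50051) -> None:
    """Run a small DataFrame aggregation against the server (assumed to be Sail)."""
    # Imported here so the helpers above work without PySpark installed.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    if not wait_for_server(host, port):
        raise RuntimeError(f"No server reachable at {host}:{port}")

    spark = SparkSession.builder.remote(remote_url(host, port)).getOrCreate()

    # Illustrative data only.
    df = spark.createDataFrame([("a", 1), ("b", 2), ("a", 3)], ["key", "value"])
    df.groupBy("key").agg(F.sum("value").alias("total")).orderBy("key").show()

    spark.stop()
```

Calling `run_demo()` with the server running should print one aggregated row per key. The port check is optional; it is there only to fail with a clear message when the server has not finished starting.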
Sail Support Options
LakeSail offers commercial support for Sail, with flexible coverage tailored to your needs. Get in touch for more details.