GlareDB

About

Data exists everywhere: on your laptop, in Postgres, in Snowflake, and as files in S3. It comes in various formats such as Parquet, CSV, and JSON. Regardless, getting the insights you need will always involve multiple steps spanning several destinations.

GlareDB is designed to query your data wherever it lives using SQL that you already know.

Install

Install/update glaredb in the current directory:

curl https://glaredb.com/install.sh | sh

It may be helpful to install the binary in a location on your PATH. For example, ~/.local/bin.

If you prefer manual installation, download, extract, and run the GlareDB binary from a release on our releases page.

Getting Started

After installing, get up and running with:

Local CLI

To start a local session, run the binary:

./glaredb

Or, you can execute SQL and immediately return (try it out!):

# Query a CSV on Hugging Face
./glaredb --query "SELECT * FROM \
'https://huggingface.co/datasets/fka/awesome-chatgpt-prompts/raw/main/prompts.csv';"

To see all options use --help:

./glaredb --help

Hybrid Execution

  1. Sign up at https://console.glaredb.com for a free fully-managed deployment of GlareDB

  2. Copy the connection string from GlareDB Cloud, for example:

    ./glaredb --cloud-url="glaredb://user:pass@host:port/deployment"
    # or
    ./glaredb
    > \open "glaredb://user:pass@host:port/deployment"

Read our announcement on Hybrid Execution for more information.

Using GlareDB in Python

  1. Install the official GlareDB Python library

    pip install glaredb
  2. Import and use glaredb.

    import glaredb
    con = glaredb.connect()
    con.sql("select 'hello world';").show()

To use Hybrid Execution, sign up at https://console.glaredb.com and use the connection string for your deployment. For example:

import glaredb
con = glaredb.connect("glaredb://user:pass@host:port/deployment")
con.sql("select 'hello hybrid exec';").show()

GlareDB works with Pandas and Polars DataFrames out of the box:

import glaredb
import polars as pl

df = pl.DataFrame(
    {
        "A": [1, 2, 3, 4, 5],
        "fruits": ["banana", "banana", "apple", "apple", "banana"],
        "B": [5, 4, 3, 2, 1],
        "cars": ["beetle", "audi", "beetle", "beetle", "beetle"],
    }
)

con = glaredb.connect()

df = con.sql("select * from df where fruits = 'banana'").to_polars()

print(df)

Local Server

The server subcommand can be used to launch a server process for GlareDB:

./glaredb server

To see all options for running in server mode, use --help:

./glaredb server --help

When launched as a server process, GlareDB can be reached on port 6543 using a Postgres client. The following example uses psql to connect to a locally running server:

psql "host=localhost user=glaredb dbname=glaredb port=6543"

Configure the First Data Source

You can use a demo Postgres instance at pg.demo.glaredb.com. Adding this Postgres instance as a data source is as easy as running the following command:

CREATE EXTERNAL DATABASE my_pg
    FROM postgres
    OPTIONS (
        host = 'pg.demo.glaredb.com',
        port = '5432',
        user = 'demo',
        password = 'demo',
        database = 'postgres'
    );

Once the data source has been added, it can be queried using fully qualified table names:

SELECT *
FROM my_pg.public.lineitem
WHERE l_shipdate <= date '1998-12-01' - INTERVAL '90 days'
LIMIT 5;

Check out the docs to learn about all supported data sources. Many data sources can be connected to the same GlareDB instance.

Done with this data source? Remove it with the following command:

DROP DATABASE my_pg;

Supported Data Sources

Source Read INSERT INTO COPY TO Table Function External Table External Database
Databases -- -- -- -- -- --
MySQL
PostgreSQL
MariaDB (via mysql)
MongoDB
Microsoft SQL Server 🚧 🚧
Snowflake 🚧 🚧
BigQuery 🚧 🚧
Cassandra/ScyllaDB 🚧 🚧
ClickHouse 🚧 🚧
Oracle 🚧 🚧 🚧 🚧 🚧 🚧
ADBC 🚧 🚧 🚧 🚧 🚧 🚧
ODBC 🚧 🚧 🚧 🚧 🚧 🚧
Database Files -- -- -- -- -- --
SQLite 🚧
Microsoft Excel 🚧 🚧
DuckDB 🚧 🚧 🚧 🚧 🚧 🚧
File Formats -- -- -- -- -- --
Apache Arrow 🚧
Apache Parquet 🚧
CSV 🚧
JSON 🚧
BSON 🚧
Apache Avro 🚧 🚧 🚧 🚧 🚧
Apache ORC 🚧 🚧 🚧 🚧 🚧
Table Formats -- -- -- -- -- --
Lance
Delta
Iceberg 🚧 🚧

✅ = Supported ➖ = Not Applicable 🚧 = Not Yet Supported

Building from Source

Building GlareDB requires Rust/Cargo to be installed. Check out rustup for an easy way to install Rust on your system.

Running the following command will build a release binary:

just build --release

The compiled release binary can be found in target/release/glaredb.

Documentation

Browse the GlareDB documentation at docs.glaredb.com.

Contributing

Contributions welcome! Check out CONTRIBUTING.md for how to get started.

License

See LICENSE. Unless otherwise noted, this license applies to all files in this repository.

Acknowledgements

GlareDB is proudly powered by Apache DataFusion and Apache Arrow. We are grateful for the work of the Apache Software Foundation and the community around these projects.


Issues

all: Remove unused/old crates

In particular, the coretypes crate has been replaced with the lemur crate.

lemur and sqlengine are the two core crates we need to make sure to keep. diststore has an accord module that might be useful to use/revamp in the future.

sqlengine: Return column names in query result

Currently only the resulting dataframe from executing a query is returned.

I don't know how we want to return the columns yet, but all the information required is in the Scope struct.

lemur: Implement "And" and "Or" for scalar expressions

This will require and, and_scalar, or, and or_scalar implementations in the column compute module.

E.g.

pub fn and(left: &Column, right: &Column) -> Result<Column> {
    let left = left
        .try_downcast_bool()
        .ok_or_else(|| internal!("left not a bool column"))?;
    let right = right
        .try_downcast_bool()
        .ok_or_else(|| internal!("right not a bool column"))?;
    let out = arrow2::compute::boolean::and(left.0, right.0);
    Ok(out.into())
}
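
The or variant would mirror this exactly. A rough sketch, assuming the same Column/BoolColumn helpers and internal! macro as above (not checked against the current lemur code):

pub fn or(left: &Column, right: &Column) -> Result<Column> {
    let left = left
        .try_downcast_bool()
        .ok_or_else(|| internal!("left not a bool column"))?;
    let right = right
        .try_downcast_bool()
        .ok_or_else(|| internal!("right not a bool column"))?;
    // arrow2 provides the element-wise boolean `or` kernel.
    let out = arrow2::compute::boolean::or(left.0, right.0);
    Ok(out.into())
}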

sqlengine: Add rewrite rule for constant folding

For each scalar expression, we should constant fold as much as possible. This pretty much entails walking scalar expressions and folding subtrees that don't contain a Column or Aggregate into a Constant.

A method should be added to PlanExpr for this which takes self and produces a new PlanExpr. And then during plan rewriting, we can just call that method for each expression we encounter.
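
A minimal, standalone sketch of the folding idea, using a toy Expr enum rather than the real PlanExpr (which has more variants that block folding):

#[derive(Clone, Debug)]
enum Expr {
    Constant(i64),
    Column(usize),
    Add(Box<Expr>, Box<Expr>),
}

impl Expr {
    fn constant_fold(self) -> Expr {
        match self {
            Expr::Add(l, r) => match (l.constant_fold(), r.constant_fold()) {
                // Both subtrees folded down to constants, so fold the add too.
                (Expr::Constant(a), Expr::Constant(b)) => Expr::Constant(a + b),
                // A Column (or Aggregate) somewhere below prevents folding.
                (l, r) => Expr::Add(Box::new(l), Box::new(r)),
            },
            other => other,
        }
    }
}

fn main() {
    // (1 + 2) + col0  folds to  3 + col0
    let expr = Expr::Add(
        Box::new(Expr::Add(
            Box::new(Expr::Constant(1)),
            Box::new(Expr::Constant(2)),
        )),
        Box::new(Expr::Column(0)),
    );
    println!("{:?}", expr.constant_fold());
}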

sqlengine: Design initial system catalogs

The primary goal here is to get a database catalog up and running to allow for simple queries (create, insert, select, query). Adding index information should come later.

A previous version of the sqlengine crate had a very minimal proof-of-concept catalog.

sqlengine: Implement a subset of Postgres wire protocol

Implement a subset of the Postgres wire protocol (see the Postgres frontend/backend protocol documentation) to allow for execution of simple queries.

There's currently no concept of users/authentication, so when implementing AuthenticationCleartextPassword, it's sufficient to use a hardcoded username and an empty password.

Inspirations:

See also msql-srv for how this could be structured in a way to allow stubbing out the database itself. This isn't necessary to do, but just might make testing easier.
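
For reference, a minimal standalone sketch of the handshake bytes for the cleartext-password flow (message formats per the Postgres protocol docs; this assumes the startup message has already been read off the socket and skips ParameterStatus/BackendKeyData):

use std::io::Write;
use std::net::TcpStream;

// Send the minimal authentication exchange. Reading and discarding the
// frontend's PasswordMessage ('p') between the two writes is omitted here.
fn send_auth_handshake(stream: &mut TcpStream) -> std::io::Result<()> {
    // AuthenticationCleartextPassword: 'R', length 8, auth code 3.
    stream.write_all(&[b'R', 0, 0, 0, 8, 0, 0, 0, 3])?;
    // ...frontend replies with a PasswordMessage, which we accept unconditionally...
    // AuthenticationOk: 'R', length 8, auth code 0.
    stream.write_all(&[b'R', 0, 0, 0, 8, 0, 0, 0, 0])?;
    // ReadyForQuery: 'Z', length 5, transaction status 'I' (idle).
    stream.write_all(&[b'Z', 0, 0, 0, 5, b'I'])?;
    Ok(())
}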

sqlengine: Push filtering expressions past joins

The current filter pushdown rewriter only pushes a filtering expression down if the child node is a ScanSource node. We should do the same for joins.

There are two things of note here: we should be able to push down predicates in actual Filter nodes, as well as predicates in the Join node itself.

For example, this query produces the following plan.

select * from t1 inner join t2 on t1.b = t2.b where t1.a > 5
plan: Read(
    Project(
        Project {
            columns: [
                Column(
                    0,
                ),
                Column(
                    1,
                ),
                Column(
                    2,
                ),
                Column(
                    3,
                ),
            ],
            input: Filter(
                Filter {
                    predicate: Binary {
                        op: Gt,
                        left: Column(
                            0,
                        ),
                        right: Constant(
                            Int8(
                                Some(
                                    5,
                                ),
                            ),
                        ),
                    },
                    input: Join(
                        Join {
                            left: ScanSource(
                                ScanSource {
                                    table: TableReference {
                                        catalog: "catalog",
                                        schema: "schema",
                                        table: "t1",
                                    },
                                    filter: None,
                                    schema: Schema {
                                        types: [
                                            Int8,
                                            Int8,
                                        ],
                                    },
                                },
                            ),
                            right: ScanSource(
                                ScanSource {
                                    table: TableReference {
                                        catalog: "catalog",
                                        schema: "schema",
                                        table: "t2",
                                    },
                                    filter: None,
                                    schema: Schema {
                                        types: [
                                            Int8,
                                            Int8,
                                        ],
                                    },
                                },
                            ),
                            join_type: Inner,
                            on: Binary {
                                op: Eq,
                                left: Column(
                                    1,
                                ),
                                right: Column(
                                    2,
                                ),
                            },
                        },
                    ),
                },
            ),
        },
    ),
)

Note that the filter is above the join, and no filters are provided to any of the scans. We (I think) have enough context in the plan to be able to push the above filters and predicates into their respective scans.

This will require extracting sub expressions for more complex expressions that reference both sides of the join. For example, the filtering expression t1.a > 5 and t2.b > 10 would need to be split so the relevant sub expressions can be passed to the source table scans.
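
A standalone sketch of that splitting step, using a toy predicate type (the real ScalarExpr and column numbering will differ; the left_cols cutoff assumes the join output is left columns followed by right columns):

#[derive(Clone, Debug)]
enum Pred {
    And(Box<Pred>, Box<Pred>),
    Leaf { columns: Vec<usize> }, // stand-in for any non-AND comparison
}

// Collect every column index referenced by a predicate.
fn columns(pred: &Pred, out: &mut Vec<usize>) {
    match pred {
        Pred::And(l, r) => {
            columns(l, out);
            columns(r, out);
        }
        Pred::Leaf { columns: cols } => out.extend(cols),
    }
}

/// Split a conjunction into (push to left scan, push to right scan, keep at the join).
fn split(pred: Pred, left_cols: usize) -> (Vec<Pred>, Vec<Pred>, Vec<Pred>) {
    match pred {
        Pred::And(l, r) => {
            let (mut a, mut b, mut c) = split(*l, left_cols);
            let (a2, b2, c2) = split(*r, left_cols);
            a.extend(a2);
            b.extend(b2);
            c.extend(c2);
            (a, b, c)
        }
        leaf => {
            let mut cols = Vec::new();
            columns(&leaf, &mut cols);
            if cols.iter().all(|&c| c < left_cols) {
                (vec![leaf], vec![], vec![])
            } else if cols.iter().all(|&c| c >= left_cols) {
                (vec![], vec![leaf], vec![])
            } else {
                // References both sides: stays as the join predicate/filter.
                (vec![], vec![], vec![leaf])
            }
        }
    }
}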

lemur(arrow2): Add execution context for query executors

The primary use case will be for capping memory usage during query execution. Right now, we can be very coarse with how we determine how much memory an executor is using.

This will also be used to allow for disk spillage once we run out of memory. Disk spillage won't be too fancy, and we'll make use of arrow2's io utilities for writing out to files.

This will need to be done after the majority of the query executors are implemented so that we have a good understanding of what we need from the execution context. In the meantime, we should note in the code where we should be spilling.

This will require a minimal design doc prior to implementation.
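
A rough sketch of the kind of shared context executors could check before allocating; the names here are illustrative, not an existing lemur API:

use std::sync::atomic::{AtomicUsize, Ordering};

pub struct ExecContext {
    mem_limit: usize,
    mem_used: AtomicUsize,
}

impl ExecContext {
    pub fn new(mem_limit: usize) -> Self {
        Self { mem_limit, mem_used: AtomicUsize::new(0) }
    }

    /// Try to reserve `bytes`; returning false signals the executor to spill.
    pub fn try_reserve(&self, bytes: usize) -> bool {
        let prev = self.mem_used.fetch_add(bytes, Ordering::SeqCst);
        if prev + bytes > self.mem_limit {
            // Roll back the reservation so other executors aren't penalized.
            self.mem_used.fetch_sub(bytes, Ordering::SeqCst);
            return false;
        }
        true
    }

    /// Release a previously successful reservation.
    pub fn release(&self, bytes: usize) {
        self.mem_used.fetch_sub(bytes, Ordering::SeqCst);
    }
}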

lemur: Add scan_values_equal method to ReadTx trait

/// A readable source is able to read dataframes and dataframe schemas.
#[async_trait]
pub trait ReadTx: Sync + Send {
    /// Read from a source, returning a stream of dataframes.
    ///
    /// An optional filter expression can be provided.
    ///
    /// Returns `None` if the table doesn't exist.
    async fn scan(
        &self,
        table: &RelationKey,
        filter: Option<ScalarExpr>,
    ) -> Result<Option<DataFrameStream>>;

    /// Get the schema for a given table.
    ///
    /// Returns `None` if the table doesn't exist.
    async fn get_schema(&self, table: &RelationKey) -> Result<Option<Schema>>;
}

/// A writeable source is able to write dataframes to underlying tables, as well
/// as create, alter, and delete tables.
#[async_trait]
pub trait WriteTx: ReadTx + Sync + Send {
    async fn commit(self) -> Result<()>;
    async fn rollback(self) -> Result<()>;

    /// Create a table with the given schema. Errors if the table already
    /// exists.
    async fn create_table(&self, table: RelationKey, schema: Schema) -> Result<()>;

    /// Drop a table. Errors if the table doesn't exist.
    async fn drop_table(&self, table: &RelationKey) -> Result<()>;

    /// Insert data into a table. Errors if the table doesn't exist.
    async fn insert(&self, table: &RelationKey, data: DataFrame) -> Result<()>;
}

The approximate api should look something like the following:

async fn scan_values_equal(&self, table: &RelationKey, values: &[(usize, Value)]) -> Result<Option<DataFrameStream>>

values here represents a list of tuples, each pairing a column index with the value expected to be found in that column.

This can be added as a default implementation on the trait, since it can make use of the existing scan method.
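
A hypothetical default implementation, assuming ScalarExpr has Column, Constant, and Binary variants and that BinaryOp has Eq and And (names approximate; adjust to the real expression types):

async fn scan_values_equal(
    &self,
    table: &RelationKey,
    values: &[(usize, Value)],
) -> Result<Option<DataFrameStream>> {
    // Build `col_i = v_i AND col_j = v_j AND ...` and reuse the existing scan.
    let filter = values
        .iter()
        .map(|(idx, value)| ScalarExpr::Binary {
            op: BinaryOp::Eq,
            left: Box::new(ScalarExpr::Column(*idx)),
            right: Box::new(ScalarExpr::Constant(value.clone())),
        })
        .reduce(|acc, expr| ScalarExpr::Binary {
            op: BinaryOp::And,
            left: Box::new(acc),
            right: Box::new(expr),
        });
    self.scan(table, filter).await
}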

storageengine: Implement transactions on top of RocksDB

The current RocksStore has some stuff related to transactions. It currently uses an atomic u64 for generating transaction ids. These ids are not currently referenced when inserting or retrieving data from RocksDB.

This issue encompasses:

  • Introducing key representations that include transaction timestamps for MVCC.
  • Introducing/modifying the RocksTx type.
    • Holds the transaction timestamp.
    • Implements the new lemur execution interfaces.
  • Introducing "write intents" to allow for interactive transactions.

A RocksTx should be able to be instantiated with an arbitrary timestamp. Timestamps will be generated from a hybrid logical clock. This can be used both for running locally as well as running in a cluster. When running locally, the "node id" portion of the timestamp can just be 0.

When converting the HLC to bytes, order must be maintained (such that byte representations of HLCs can be compared directly).
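
A standalone sketch of an order-preserving encoding (field names and widths are illustrative): big-endian wall time, then logical counter, then node id, so lexicographic byte comparison matches the natural timestamp ordering.

#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
pub struct HlcTimestamp {
    pub wall_time_ms: u64,
    pub logical: u32,
    pub node_id: u32,
}

impl HlcTimestamp {
    /// Big-endian concatenation keeps byte order equal to the natural
    /// ordering of (wall_time_ms, logical, node_id).
    pub fn to_bytes(&self) -> [u8; 16] {
        let mut out = [0u8; 16];
        out[..8].copy_from_slice(&self.wall_time_ms.to_be_bytes());
        out[8..12].copy_from_slice(&self.logical.to_be_bytes());
        out[12..].copy_from_slice(&self.node_id.to_be_bytes());
        out
    }
}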

This will require a minimal design doc (particularly for key/value layouts and how those will fit in with secondary indexes).

sqlengine: Add rewrite rule to push down projections

Followup from #22

This will allow us to make use of the projection fields on table scan nodes added in the above issue.

This will inspect Project nodes, extract column indices, and place those indices in the table scan nodes.

As an example, let's say we have a table t1 with the following columns:

Col Type Index
a int 0
b int 1
c int 2

A plan for select b, b + 100 from t1 would have a Projection node with the expressions Column(1), BinaryOp{op: Add, left: Column(1), right: Constant(100)}. Pushing this down is relatively straightforward: we go through each expression and pull out every column index that we find, deduplicate, then add those indices to the projection field on the table scan node. In this case, our projection field would look like Some(vec![1]).

Once we have the indices we want to project, it's important that we rewrite the original expressions to reference the appropriate columns after projection. In this case, our expressions would be Column(0), BinaryOp{op: Add, left: Column(0), right: Constant(100)} since the table scan will only be returning a dataframe with a single column (which happens to be column b).

Note that joins and aggregates will add complexity. The above example is one of the simpler cases.
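
A minimal standalone sketch of the extract/dedupe/rewrite steps on a toy expression type (the real PlanExpr has many more variants):

use std::collections::BTreeMap;

#[derive(Clone, Debug)]
enum Expr {
    Column(usize),
    Constant(i64),
    Add(Box<Expr>, Box<Expr>),
}

// Pull out every column index referenced by an expression.
fn collect_columns(expr: &Expr, out: &mut Vec<usize>) {
    match expr {
        Expr::Column(i) => out.push(*i),
        Expr::Add(l, r) => {
            collect_columns(l, out);
            collect_columns(r, out);
        }
        Expr::Constant(_) => {}
    }
}

// Rewrite column references to their post-projection positions.
fn rewrite(expr: Expr, mapping: &BTreeMap<usize, usize>) -> Expr {
    match expr {
        Expr::Column(i) => Expr::Column(mapping[&i]),
        Expr::Add(l, r) => Expr::Add(Box::new(rewrite(*l, mapping)), Box::new(rewrite(*r, mapping))),
        other => other,
    }
}

fn main() {
    // select b, b + 100 from t1  =>  expressions Column(1), Column(1) + 100
    let exprs = vec![
        Expr::Column(1),
        Expr::Add(Box::new(Expr::Column(1)), Box::new(Expr::Constant(100))),
    ];

    // Deduplicated, sorted indices to push into the scan's projection field.
    let mut cols = Vec::new();
    for e in &exprs {
        collect_columns(e, &mut cols);
    }
    cols.sort_unstable();
    cols.dedup();

    // Map original index -> position in the projected output (here, 1 -> 0).
    let mapping: BTreeMap<usize, usize> =
        cols.iter().enumerate().map(|(pos, &c)| (c, pos)).collect();
    let rewritten: Vec<Expr> = exprs.into_iter().map(|e| rewrite(e, &mapping)).collect();
    println!("projection: {:?}, rewritten: {:?}", cols, rewritten);
}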

lemur: Implement scalar expression execution for rows

In some cases (particularly on the storage side), we don't have a full "data frame" available, and instead we're working with sets of rows. In those cases, it's inefficient to transform a row into a data frame just to be able to execute an expression on it.

This may require some changes to the traits related to execution, but I'm unsure of how much.

This issue should be completed in conjunction with building out the storage engine, as that's the primary area where we'll want to execute on individual rows.

ci: Add miri to test unsafe code

Miri is quite slow, so we need to be selective about which crates to test.

Also, miri requires recompiling everything, so we'll have to figure out the caching situation there. I don't want it overwriting the build cache that we're using for testing.

sqlengine: Plan other join types

Planning currently handles only inner and cross joins.

left = match join.join_operator {
    ast::JoinOperator::Inner(constraint) => {
        let on = self.translate_expr(scope, on_expr(constraint)?)?;
        ReadPlan::Join(Join {
            left: Box::new(left),
            right: Box::new(right),
            join_type: JoinType::Inner,
            on: on.lower_scalar()?,
        })
    }
    ast::JoinOperator::CrossJoin => ReadPlan::CrossJoin(CrossJoin {
        left: Box::new(left),
        right: Box::new(right),
    }),
    other => return Err(anyhow!("unsupported join operator: {:?}", other)),
};

We should properly plan left, right, and full outer joins as well.
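
Hypothetical additional match arms mirroring the Inner case above, assuming JoinType grows Left, Right, and Full variants (left outer shown; right and full follow the same pattern):

ast::JoinOperator::LeftOuter(constraint) => {
    let on = self.translate_expr(scope, on_expr(constraint)?)?;
    ReadPlan::Join(Join {
        left: Box::new(left),
        right: Box::new(right),
        join_type: JoinType::Left,
        on: on.lower_scalar()?,
    })
}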

I'm currently unsure what it would take to represent these joins as a CrossJoin followed by a Filter.

ci: Setup dependabot

Ideally we're as up to date as possible for the major crates we're building around (datafusion).

lemur+sqlengine: Add optional "projection" argument for scans

/// Read from a source, returning a stream of dataframes.
///
/// An optional filter expression can be provided.
///
/// Returns `None` if the table doesn't exist.
async fn scan(
    &self,
    table: &RelationKey,
    filter: Option<ScalarExpr>,
) -> Result<Option<DataFrameStream>>;

Queries often only need a subset of columns, so we should be able to pass in Option<&[usize]> (or Option<Vec<usize>>) to select which columns we want to return. Note that this uses column indices and not a ScalarExpr since we want to return a minimal amount of data (it will eventually be going over the network). An expression might produce more data than necessary, and that extra work is better off being done on the "compute" side of things.

E.g. if we were to pass in an expression for the query select a, a * 10 from t, we would have to return a data frame containing both the values for a and a * 10. Whereas passing in indices means the storage source is only returning a, and a * 10 can be computed later.

This will require adding an additional field to the scan nodes in lemur's RelationExpr and sqlengine's ReadPlan. During initial planning, this field will be initialized to None.
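
A possible updated signature (sketch only; whether it takes a slice or an owned Vec is up for discussion):

async fn scan(
    &self,
    table: &RelationKey,
    filter: Option<ScalarExpr>,
    projection: Option<Vec<usize>>,
) -> Result<Option<DataFrameStream>>;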

sqlengine: Ensure system tables exist on startup

Should be enough to call ensure_system_tables for now.

Ideally I should be able to create a test table in the client repl:

~/Code/github.com/glaredb/glaredb [2] $ cargo run --bin glaredb -- client localhost:6543
    Finished dev [unoptimized + debuginfo] target(s) in 0.04s
     Running `target/debug/glaredb client 'localhost:6543'`
connected to GlareDB
glaredb> create table test (a int)
error: unexpected response: Error("missing table system.gl_internal.columns")
glaredb>

Note: In the future, we might want to look into more synchronization to ensure only one node in the cluster is creating these tables, either with an explicit "bootstrap" mode, or by relying on transactional semantics.

lemur: Implement other join types

Currently only cross joins are implemented. When lowering to lemur, all joins are converted to a cross join followed by a filter node. This is obviously not efficient.

The current cross join implementation keeps all dataframes in memory, and we'll want to do the same for these other joins as well for the time being.

Nested loop joins would probably be the most straightforward to implement (the existing cross join implementation is pretty much a nested loop join with no predicate). Merge joins should also be somewhat straightforward.

Hash joins will require a bit more design work. We'll also need to be able to detect when a join predicate is an equality predicate between two tables.
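
For reference, a standalone sketch of the nested loop shape over plain in-memory rows (lemur operates on dataframes/chunks, but the loop structure is the same):

type Row = Vec<i64>;

/// Join two in-memory relations, keeping pairs that satisfy the predicate.
fn nested_loop_join(
    left: &[Row],
    right: &[Row],
    predicate: impl Fn(&Row, &Row) -> bool,
) -> Vec<Row> {
    let mut out = Vec::new();
    for l in left {
        for r in right {
            if predicate(l, r) {
                // Output row is the left row concatenated with the right row.
                let mut joined = l.clone();
                joined.extend_from_slice(r);
                out.push(joined);
            }
        }
    }
    out
}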

ci: Fail CI on warnings

Ideally this is a separate step from the "test" step so that it's easy to see at a glance if tests pass on PRs or not.

storageengine: Implement Raft networking

Implement the RaftNetwork and RaftNetworkFactory traits for openraft.

Openraft has a lot of types that need to be implemented, but for networking, these are relatively straightforward. They provide a BasicNode implementation that we can use for the Node type in the type config. NodeId can just be a type alias to u64.

Note that we want to use the main branch of openraft (pinned to a specific commit in Cargo.toml). There are some API changes between what's in main and the docs on docs.rs. I would recommend pulling down openraft and running cargo doc locally.

dev-docs: Write up how to use nix for noobs

I like the idea of nix, and we already have a few nix files in the repo. Unfortunately, nix can be hard to pick up, which might cause a bit of chafing as new devs are onboarded.

I'm proposing that we have a section in the README (or a separate CONTRIBUTING doc) detailing how to use nix (e.g. opening a dev shell, creating a docker image, etc.). This should be aimed at someone who hasn't used nix before, so we need to make sure things like needing to enable flakes and the new nix3 commands are accounted for.

lemur: Implement `Filter` for arrow2 refactor

The current implementation is here:
https://github.com/GlareDB/glaredb/blob/main/crates/lemur/src/execute/stream/read.rs#L258-L275

We'll need to implement it for the arrow2 refactor:
https://github.com/GlareDB/glaredb/blob/arrow2/crates/lemur/src/arrow/queryexec/filter.rs

Everything will be working on Chunk, and the Chunk type has an appropriate filter method. The flow is as follows:

  • For each chunk, execute the scalar expression against that chunk.
  • Get the column result from the expression result, and down cast to a BoolColumn.
  • Call filter with the downcasted BoolColumn.
  • Return the result.

https://docs.rs/futures/latest/futures/prelude/stream/trait.StreamExt.html#method.map will likely be the method of choice to use on the input stream.
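
A hypothetical sketch of that flow; Chunk, ScalarExpr, BoolColumn, and the internal! macro are the arrow2-refactor types and are only approximated here:

use futures::stream::{Stream, StreamExt};

fn filter_stream<S>(input: S, expr: ScalarExpr) -> impl Stream<Item = Result<Chunk>>
where
    S: Stream<Item = Result<Chunk>>,
{
    input.map(move |chunk| {
        let chunk = chunk?;
        // Execute the scalar expression against the chunk.
        let evaluated = expr.evaluate(&chunk)?;
        // Downcast the result to a boolean column to use as the mask.
        let mask = evaluated
            .try_downcast_bool()
            .ok_or_else(|| internal!("predicate did not produce a bool column"))?;
        // Filter the chunk with the mask and return the result.
        chunk.filter(mask)
    })
}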
