oap-project
Name: Optimized Analytics Package for Spark Platform (OAP)
Type: Organization
Apache Arrow is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. It also provides computational libraries and zero-copy streaming messaging and interprocess communication. Languages currently supported include C, C++, Java, JavaScript, Python, and Ruby.
Spark DataSource plugin for reading files in various formats, such as Parquet, into Arrow-compatible columnar vectors.
Cloud Scale Platform for Distributed Analytics and AI
Native SQL Engine plugin for Spark SQL with vectorized SIMD optimizations.
Gluten: Plugin to Boost Trino's Performance
HDFS file read access for ClickHouse
A native C/C++ HDFS client (downstream fork from apache-hawq)
Optimized Spark package to accelerate machine learning algorithms in Apache Spark MLlib.
The OAP project web site
Tools for building and packaging OAP and for integrating it with public cloud platforms such as AWS EMR, Google Dataproc and K8S.
Common library for accessing PMEM native library functions including memkind, vmemcache and so on.
Spark* shuffle plugin that supports shuffling through remote persistent memory over fabrics. It leverages the RDMA network and remote persistent memory (for reads) to provide an extremely high-performance, low-latency shuffle solution for Spark*.
Spark plug-in package for accelerating Spark runtime spill functions using PMem, such as the RDD cache PMem extension.
An Intel-customized version of Protocol Buffers, Google's data interchange format
English SDK for Apache Spark
RayDP provides simple APIs for running Spark on Ray and integrating Spark with AI libraries.
Spark* shuffle plugin that supports shuffling data through a remote Hadoop-compatible file system, as opposed to vanilla Spark's local disks.
Example solutions or code for using OAP features.
Gluten: Plugin to Double SparkSQL's Performance
Spark* plug-in for accelerating Spark* SQL performance by using caching and indexing at the SQL data source layer.
A new C++ vectorized database acceleration library aimed at optimizing query engines and data processing systems.
A high-throughput and memory-efficient inference and serving engine for LLMs