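The background and motivation for Taichi are as follows. Visual computing programs are usually implemented in C++ and CUDA, and this approach has several drawbacks:
- Performance tuning is hard and requires deep familiarity with C++ and CUDA. Engineers with that skill set are scarce and expensive in industry.
- Portability is poor. Nvidia may not remain the only GPU vendor, so this problem will only get worse.
- Productivity is low. By analogy with web development: writing backends in C++ was inefficient, and modern languages such as Go and Java greatly improved productivity.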
Within CG, however, Taichi also has some non-goals:
- tasks with domain-specific hardware support, and
- coarse-granularity tasks, where function call overheads and data transfer time are negligible, with well-optimized libraries.
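The paper gives a few examples, such as rendering tasks that Vulkan and OpenGL already optimize very well, or deep learning training that TensorFlow and PyTorch already handle very well.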
from papers-notebook.
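CG workloads have some domain-specific characteristics, as the figure above shows. CG data is spatially sparse, and exploiting that property enables optimizations that general-purpose compilers and libraries cannot make.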
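To make the spatial-sparsity point concrete, here is a minimal sketch in plain Python. It is not Taichi code: the function `active_blocks`, the grid size, the block size, and the diagonal "curve" data are all assumptions chosen for illustration. It compares how many cells a dense grid stores against how many a block-sparse layout would allocate when only blocks containing data are kept.

```python
def active_blocks(points, block_size):
    """Return the set of block coordinates that contain at least one point."""
    return {(x // block_size, y // block_size) for x, y in points}


# Hypothetical data: a thin diagonal curve inside a 256 x 256 grid.
grid_size = 256
block_size = 8
points = [(i, i) for i in range(grid_size)]

blocks = active_blocks(points, block_size)

# A dense layout stores every cell of the bounding grid.
dense_cells = grid_size * grid_size
# A block-sparse layout stores only the blocks that were touched.
sparse_cells = len(blocks) * block_size * block_size

print(f"dense cells:  {dense_cells}")
print(f"sparse cells: {sparse_cells}")
```

In this toy case the sparse layout allocates a small fraction of the dense grid, which is the kind of headroom a domain-specific compiler can exploit.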
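Taichi makes the following design decisions (only the ones that stood out to me):
- Decouple data structures from computation. My understanding is that the data structure the user sees is declarative: you only state what you want. Dense and sparse data structures look the same to use (for a counterexample, see TF Tensor/SparseTensor?).
- Regular grids are the basic data structure. I don't fully understand this yet.
- Backend code is generated automatically. The user only specifies whether they want CUDA, Vulkan, or something else, and the code is generated.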
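The first decision, decoupling data structures from computation, can be sketched roughly as below. This is plain Python, not Taichi's API: `DenseGrid`, `SparseGrid`, `active_indices`, and `scale` are hypothetical names invented for this example. The point is only that the computation is written once and runs unchanged on either layout.

```python
class DenseGrid:
    """Hypothetical dense 1D grid: every cell is stored."""

    def __init__(self, n, default=0.0):
        self.n = n
        self.data = [default] * n

    def __getitem__(self, i):
        return self.data[i]

    def __setitem__(self, i, value):
        self.data[i] = value

    def active_indices(self):
        return range(self.n)


class SparseGrid:
    """Hypothetical sparse 1D grid: only written cells are stored."""

    def __init__(self, n, default=0.0):
        self.n = n
        self.default = default
        self.data = {}

    def __getitem__(self, i):
        return self.data.get(i, self.default)

    def __setitem__(self, i, value):
        self.data[i] = value

    def active_indices(self):
        return sorted(self.data)


def scale(grid, factor):
    """Computation written once; it does not know which layout it runs on."""
    for i in list(grid.active_indices()):
        grid[i] = grid[i] * factor


dense = DenseGrid(4)
dense[1] = 2.0
scale(dense, 3.0)

sparse = SparseGrid(1000)
sparse[7] = 2.0
scale(sparse, 3.0)
```

Switching from `DenseGrid` to `SparseGrid` changes one line of declaration and nothing in `scale`, which is the usage contrast the note draws with TF Tensor/SparseTensor.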