sanqui / discard
Python tool for medium-scale Discord server archival operations
License: ISC License
```
<Sanqui> I've been compressing jsonl with gzip
<Sanqui> should I look into zstd?
<b> I'd say you should
<b> Yeah, zstd is bloody magic
<b> I got like 3-4x better compression out of zstd
<c> Zstd is magic we use it at work for VM backups/snapshots
```
Let's say that if a log file goes over 100 MB, I want to start partitioning it.
Some options:
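As an illustration only, and not necessarily one of the options under consideration, size-based partitioning could look roughly like the sketch below. The class name `PartitionedJsonlWriter`, the `stem.NNNN.jsonl.gz` naming scheme, and the choice to count uncompressed bytes are all assumptions made for the example; none of them come from Discard itself.

```python
import gzip
import json
from pathlib import Path

# Threshold from the text: start a new part once a log passes 100 MB.
# Counting uncompressed bytes is an assumption of this sketch.
MAX_BYTES = 100 * 1024 * 1024


class PartitionedJsonlWriter:
    """Hypothetical helper: write JSON lines to numbered gzip parts."""

    def __init__(self, directory, stem, max_bytes=MAX_BYTES):
        self.directory = Path(directory)
        self.stem = stem
        self.max_bytes = max_bytes
        self.part = 0
        self.written = 0  # uncompressed bytes written to the current part
        self.file = None

    def _open_next_part(self):
        if self.file is not None:
            self.file.close()
        self.part += 1
        self.written = 0
        path = self.directory / f"{self.stem}.{self.part:04d}.jsonl.gz"
        self.file = gzip.open(path, "wb")

    def write(self, record):
        line = (json.dumps(record) + "\n").encode("utf-8")
        # Roll over before a write that would exceed the limit, but never
        # leave a part empty: a single oversized record is still written.
        over_limit = self.written and self.written + len(line) > self.max_bytes
        if self.file is None or over_limit:
            self._open_next_part()
        self.file.write(line)
        self.written += len(line)

    def close(self):
        if self.file is not None:
            self.file.close()
            self.file = None
```

Records are never split across parts in this sketch, so each part remains a valid JSONL file on its own.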
Discard currently does not support downloading files. Nonetheless, it is clear that a backup of a server is not complete without fetching the attachments. Still, given the possible scope of archival operations, I'm concerned about download time and disk space, and wouldn't want to enable file downloads by default.
Now, here are some thoughts on how I want to implement downloading files (including avatars, custom emoji, etc.).
First, it's important to note that all Discord files are served from a CDN that does not perform any authorization. I can just get the link to an attachment from any server and post it somewhere else and anybody can download it. This means that we don't actually need a specialized tool like Discard to fetch them. It also means they can benefit from being included in the Wayback Machine.
My first step towards supporting files will be similar to the reader: parse a completed run and output a list of URLs for files to download by another tool (e.g. wget or ArchiveBot).
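A minimal sketch of that first step might look like the following. It makes several assumptions that are not confirmed by Discard: that a completed run is a directory of `.jsonl` or `.jsonl.gz` files, that file URLs appear somewhere in each JSON record as plain strings, and that they start with one of the CDN prefixes listed in `CDN_PREFIXES`. The function names are hypothetical. Rather than relying on a particular record schema, it scans every string in every record.

```python
import gzip
import json
from pathlib import Path

# Assumed CDN hosts; adjust if the run contains URLs on other hosts.
CDN_PREFIXES = ("https://cdn.discordapp.com/", "https://media.discordapp.net/")


def open_text(path):
    """Open a plain or gzip-compressed file as UTF-8 text."""
    if path.suffix == ".gz":
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, encoding="utf-8")


def iter_strings(value):
    """Yield every string found anywhere inside a decoded JSON value."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for item in value.values():
            yield from iter_strings(item)
    elif isinstance(value, list):
        for item in value:
            yield from iter_strings(item)


def collect_file_urls(run_directory):
    """Return unique CDN URLs from a run directory, in first-seen order."""
    root = Path(run_directory)
    paths = sorted(list(root.rglob("*.jsonl")) + list(root.rglob("*.jsonl.gz")))
    urls = {}  # dict used as an ordered set
    for path in paths:
        with open_text(path) as handle:
            for line in handle:
                line = line.strip()
                if not line:
                    continue
                for text in iter_strings(json.loads(line)):
                    if text.startswith(CDN_PREFIXES):
                        urls.setdefault(text, None)
    return list(urls)
```

Writing the result one URL per line gives a file that can be handed to another tool, for example `wget -i urls.txt`.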
My main question here to potential users of Discard is whether outputting a list of file URLs is enough for you, or if you would prefer Discard to download files as a part of a run. And if so, whether you want just the files, or WARCs.
Your input is appreciated!
Arguments for:
Arguments against:
Technically, Discord doesn't require a user account to join a Discord server and read history. With a public invite and permissive moderation levels on the server, you can join with just a nick, not even a password needed. While spamming open servers with daily puppet accounts is clearly not advisable, archiving through these "light" accounts might reduce load on verified user accounts (for more restricted servers).