codex's Introduction

What This Is

A container for discussion and early exploratory work towards a new package repository for Haskell (or at least ). Our mission is to be transparent about our design process and welcoming of community support and feedback, and to actually ship.

We are not the Hackage 2 project.

We intend to help Hackage 2 succeed, and to provide a vehicle for experimenting with complementary ideas that enable Hackage-style tooling for a rich set of use cases.

We expect to lean heavily on the decisions that its members have made on major design issues, and perhaps even borrow some code as permitted by its license. We are dedicated to a process that everyone can participate in, and to previews becoming usable and useful from early on.

As for specific design decisions, no individual decision is so important as to be part of our mission statement here. But see our issue tracker for our latest thinking on some key questions.

What This Is Not

There's no code yet.

codex's People

Contributors

cartazio, ireneknapp

codex's Issues

Description of server state

Okay, so the idea is that all state is divided into three parts: configuration, which describes how parts of the system can find each other; database, which is relational information about packages, users, and so on; and blob repository, which is a bunch of, er, blobby files.

I'd like the config files to use JSON (I am a fan of Aeson as the interface to it), because it's reasonably standard and doesn't have any really bizarre or surprising syntactic rules like YAML does. That also saves us time on writing a parser; the other obvious option would be a simple plaintext format that we define.

I'd like the database to use SQLite3, because I know and trust it, and it's trivial to set up and use and even back up and restore. Another sane option would be PostgreSQL, but that has substantially more administrative overhead.
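To make the "trivial to set up" point concrete, here is a minimal sketch of what the relational part might look like. It is written in Python only because its standard library ships SQLite bindings; the table and column names are assumptions for illustration, not a proposed schema.

```python
import sqlite3

# Hypothetical schema: every table and column name here is illustrative only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS users (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS packages (
    id       INTEGER PRIMARY KEY,
    name     TEXT NOT NULL UNIQUE,
    owner_id INTEGER NOT NULL REFERENCES users(id)
);
CREATE TABLE IF NOT EXISTS versions (
    package_id INTEGER NOT NULL REFERENCES packages(id),
    version    TEXT NOT NULL,
    PRIMARY KEY (package_id, version)
);
"""

def open_database(path=":memory:"):
    """Open (or create) the database file and apply the sketch schema."""
    conn = sqlite3.connect(path)
    # SQLite leaves foreign-key enforcement off unless asked.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn

def versions_of(conn, package_name):
    """List the versions of a package, ordered as plain text (not as version numbers)."""
    rows = conn.execute(
        "SELECT v.version FROM versions v "
        "JOIN packages p ON p.id = v.package_id "
        "WHERE p.name = ? ORDER BY v.version",
        (package_name,),
    )
    return [row[0] for row in rows]
```

The whole setup is a single file path and one call, which is the administrative advantage over PostgreSQL argued for above.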

I'd like the blob repository to live in Amazon S3. This makes distribution of files almost trivial, since we can simply grant public access to the appropriate parts of it. I've already poked around a bit and created a possible folder hierarchy we might use; see issue #16. The alternative to S3 would be the local, per-mirror filesystem, but this runs into size constraints, and means that each mirror, which doesn't really need to poke at the contents of packages except when it's in the act of building them, has to do a large up-front download before it can come online. It also introduces complications of synchronizing this state across federated mirrors. Now, S3 is not without its administrative hassles, but they relate to assigning permissions, which feels like a cleaner category of problem to have.

business user remarks / opinions

What follows is a transcript of bullet points from someone using Haskell in their business (and I think it articulates a number of ideas better than I would have).

  • internal use [editor: of a hackage-like, e.g. codex], and security policies are pretty much essential features to using it, i believe
  • It's hard to justify building a package management system today without signatures (for author validation) given the Ruby Debacle
  • I want a place where I can put a proxy in front of hackage for any internal packages
  • And where I can trust the package I downloaded and compiled into my system as the same one being vetted across the community
  • because how often do you grep your cabal install downloads for use of unsafePerformIO? If you're like me: never
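Two of the points above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it compares a download against a published SHA-256 digest (a plain checksum, not the author signatures asked for, which would have to be layered on top), and it does the "grep" over an in-memory mapping of file names to source text. The function names are hypothetical.

```python
import hashlib
import hmac

def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of a downloaded package tarball."""
    return hashlib.sha256(data).hexdigest()

def matches_published_digest(tarball: bytes, published_hex: str) -> bool:
    """True if the download is byte-for-byte the artifact whose digest was published.

    This only shows that two parties hold the same bytes. It says nothing
    about who produced them; that is what signatures are for.
    """
    return hmac.compare_digest(sha256_hex(tarball), published_hex.lower())

def files_mentioning(sources, needle="unsafePerformIO"):
    """Names of source files whose text contains `needle`.

    `sources` maps a file name to its contents.
    """
    return sorted(name for name, text in sources.items() if needle in text)
```

A mirror could run both checks at download time, so that nobody has to remember to do them by hand.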

near term plan

There are a lot of nice things we want to do over time, but a dead-easy Hackage mirror is the first step. There's a lot we can build on top of that, but that's really step one.

Make a command-line tool to assist in the creation of a new mirror

See issue #17 first, for background reading. I'm moving forward on the assumption that we're going to use the technologies I suggest there - JSON for a config file, SQLite3 for a database engine, and Amazon S3 for blob storage and distribution - but I could be talked out of any of these.

Make a command-line tool to assist in the creation of a new mirror. I'm envisioning a multi-command executable that takes a subcommand name as its first argument, with the first subcommand to be implemented being "config", which simply asks questions, pokes at the systems it's ostensibly connecting to a little, and spits out a config file.
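The multi-command shape might look roughly like the following Python sketch. The program name "codex" and the --output flag are assumptions made for illustration; only the "config" subcommand comes from the description above, and here it merely reports what it would do.

```python
import argparse

def build_parser():
    """Top-level parser whose first positional argument names a subcommand."""
    parser = argparse.ArgumentParser(prog="codex")
    subcommands = parser.add_subparsers(dest="command", required=True)

    # "config" is the first subcommand to be implemented.
    config = subcommands.add_parser(
        "config", help="ask questions and write a config file"
    )
    # Hypothetical flag: where to write the resulting file.
    config.add_argument("--output", default="config.json")
    return parser

def main(argv=None):
    """Dispatch on the subcommand; returns a description instead of acting."""
    args = build_parser().parse_args(argv)
    if args.command == "config":
        return f"would write {args.output}"
    raise AssertionError("unreachable: argparse rejects unknown subcommands")
```

Further subcommands would each get their own add_parser call, so each one carries its own flags and help text.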

The config file should at the very least contain S3 credentials and bucket identifier. The credentials are two fields, an access key and a secret. (A "bucket" is the top-level container of stuff in S3.)
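As a rough illustration of that minimum, here is a Python sketch of reading and writing such a file. The JSON field names are assumptions, not a settled format; the database path default is the one quoted in the dialogue further down this issue.

```python
import json
from dataclasses import asdict, dataclass, fields

@dataclass(frozen=True)
class MirrorConfig:
    # Field names are hypothetical; the issue only says what must be present.
    access_key: str
    secret: str
    bucket: str
    database_path: str = "/var/lib/codex/database"

REQUIRED = ("access_key", "secret", "bucket")

def parse_config(text: str) -> MirrorConfig:
    """Parse config JSON, failing loudly if a required field is missing or empty."""
    raw = json.loads(text)
    if not isinstance(raw, dict):
        raise ValueError("config must be a JSON object")
    missing = [key for key in REQUIRED if not raw.get(key)]
    if missing:
        raise ValueError("missing config fields: " + ", ".join(missing))
    known = {field.name for field in fields(MirrorConfig)}
    # Unknown keys are ignored in this sketch.
    return MirrorConfig(**{key: value for key, value in raw.items() if key in known})

def render_config(config: MirrorConfig) -> str:
    """Serialize the config back to JSON with stable key order."""
    return json.dumps(asdict(config), indent=2, sort_keys=True)
```

Validating at load time means a mirror with a half-written config fails at startup rather than on its first S3 request.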

So I'm in a hurry now and want to get all these thoughts down, so I'm going to just describe the flow of the steps I envision "config" doing. I originally thought the command line might be suitable, but now that I see how many steps there are, I'm thinking something more like the "dialog" program, which is that great set of tools that Debian and the Linux kernel makefiles both use for graphical terminal-based configuration.

I think something interactive of this nature is necessary because I was trying to document these steps and they're fairly error-prone. Plus, Amazon's console is subject to change; its API is not.

"This command will assist you in configuring a new mirror of Codex, a Haskell software-distribution system. You probably don't need to run your own mirror, unless you have code which you wish to publish internally but not to the world at large. I'll assume since you haven't ^Ced out of the program that you wish to continue..."

"First, do you wish to set up the first server in a federation of servers, or a mirror of an existing federation?"

(User chooses first in a federation.)

"Okay. You will need to have an existing Amazon Web Services account. This tool can create the resources it needs therein, which consist of an S3 bucket, an IAM group, and an IAM user with an access key. The tool can also utilize existing resources, if you wish to create them manually. If you wish to go with the automated solution, you will need to supply an access key and secret which will not be stored, only used to create the credentials which will actually be used. Which would you like to do - automated, or manual?"

(User chooses automated.)

"I'm pleased to hear that." [Software should be polite! :D] "What is your access key?" (User does so.) "And your secret?" (User does.) "Checking - okay, these are valid. If there is an existing IAM group you wish to use for the machines in this federation, please select it now; otherwise, just choose "create" to create a new one. The following are the IAM groups extant: ..."

(User chooses "create".)

"Okay. Do you have a preference for the name of this group? If so, specify it now. If not, I will use "codex"."

(User chooses the default.)

"Good. I will use the group "codex" as the IAM group to create my user in. Or have you already created the user? There are no IAM users in the codex group, and fifteen users overall, as follows: ..."

(User chooses "create".)

"I notice that this computer's hostname is "silly-cat-joke". Would you like the user to be named that as well, or do you have a preference, or should it be set to something arbitrary?"

(User chooses "silly-cat-joke".)

"Good. Next, would you like to create an S3 bucket, or use an existing one? There are 3 extant buckets, as follows: ..."

(User chooses "create".)

"What should it be called? If you have no preference, I will use "codex"."

(User chooses the default.)

"All right. The bucket has been created." [Conveniently, we don't need to create directory structure; it doesn't really exist.] "I have also granted the "codex" group the appropriate permissions on it."

"We need to know where to keep our local database. The default is /var/lib/codex/database."

(User chooses the default.)

"Okay. The new config file is written to config.json in the current working directory; move it to wherever your init.d script will be able to find it. Note that this file contains precious information, so don't casually delete it to start over; doing so will leave inaccessible resources that require cleanup work by the federation administrator."
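The question-with-a-default pattern that runs through the dialogue above can be sketched in Python as follows. It deliberately leaves out every AWS call and the "dialog"-style interface; the function names and prompt wording are hypothetical, while the defaults are the ones from the dialogue.

```python
def ask(question, default, read=input):
    """Ask one question; an empty answer selects the default."""
    answer = read(f"{question} [{default}] ").strip()
    return answer or default

def gather_answers(hostname, read=input):
    """Walk through the naming questions in the order the dialogue asks them.

    `read` is injectable so the flow can be driven by a script or a test
    instead of a terminal.
    """
    return {
        "group": ask("Name of the IAM group?", "codex", read),
        "user": ask("Name of the IAM user?", hostname, read),
        "bucket": ask("Name of the S3 bucket?", "codex", read),
        "database_path": ask(
            "Where should the local database live?",
            "/var/lib/codex/database",
            read,
        ),
    }
```

Keeping the reading function as a parameter also makes it easy to swap the plain prompt for a fancier front end later without touching the flow itself.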

Is building a hoogle database in-scope?

Is building a hoogle database in-scope? That would be rather nice, but it's not clear to me how much work it would be or whether it is deployable. If, for example, it were based on acid-state, that would consume far more server resources than we (or anybody) could afford.

Is building documentation in-scope?

Is building documentation in-scope? I think that the documentation repository is the main nice thing about Hackage 1, as it stands, so we really want this to be.

Is building packages in-scope?

Is building packages for testing and documentation purposes in-scope? We should consider this carefully. It may be an "in a few weeks" thing rather than a "right this moment" thing.

What is our backup strategy?

I am the author of direct-sqlite, which is the layer underneath sqlite-simple. I think it would be very little work to add support for SQLite3's online-backup API to both these projects. Then, as long as we store everything in the database (except for configuration, which properly belongs outside it for ops reasons), we can easily create self-contained backups of everything.
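For a feel of what the online-backup API offers, here is a minimal sketch using Python's standard sqlite3 module, which already wraps it as Connection.backup. It says nothing about how the Haskell bindings would expose the same thing; the function name is an assumption.

```python
import sqlite3

def backup_database(source: sqlite3.Connection, dest_path: str) -> None:
    """Copy a live database into a self-contained file at `dest_path`.

    SQLite's online-backup API copies pages while the source connection
    stays usable, so the server would not need to stop for the backup.
    """
    dest = sqlite3.connect(dest_path)
    try:
        source.backup(dest)
    finally:
        dest.close()
```

The result is an ordinary SQLite file, so restoring amounts to putting that file where the config says the database lives.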

Are user accounts in-scope? What is their nature?

Specifically, do we have both user and organization accounts? I think that we must. Is there any sort of privilege system to allow users to be designated as curators of all or part of the package tree? If so, what are its specifics?

What is our strategy for integrating with Cabal?

What is our strategy for integrating with Cabal? That is, do we shell around cabal-install, or do we link directly against the Cabal library? I strongly favor the latter; it's just a lot easier...

Is TLS support in-scope?

Is TLS support in-scope? I think that it almost certainly has to be. This is a crucial flaw in our infrastructure at present, and a shameful one.

What web framework are we using?

What web framework are we using? You mentioned Snap, which I haven't previously used, but I just looked at it and it seems like a fine choice. Shall we proceed on that assumption?
