rfcs's Introduction

OARepo RFCs

The OARepo RFC (Request For Comments) process is a communication tool inspired by Invenio RFCs. Its purpose is to:

  • coordinate the design process
  • document design decisions
  • produce consensus among OARepo stakeholders

RFCs are not meant to be a heavy, lengthy process but rather an agile one that aids communication between geographically dispersed teams and documents OARepo development, so that we avoid knowledge loss when people leave and ease knowledge transfer when people join.

The OARepo RFC process is not an official approval process of the kind you might know from other RFC processes.

TL;DR

Process overview

  1. Request RFC (focus on scope): Before starting to write and implement a new RFC, you first have to request the approval of the OARepo architects by opening an issue. This helps to scope the RFC, avoid duplication, and save everyone time.
  2. Write RFC (focus on content): If the architects accept the request, the RFC is written using the RFC template. This part of the process focuses on the content. The goal is for most discussions and alignment on a solution to happen in this writing phase.
  3. Review RFC (focus on quality): Once the RFC document is written, it needs to be reviewed and approved. The review focuses on the quality of the RFC and its readiness for implementation, not on its content (which should already have been agreed upon in the writing phase).
  4. Merge RFC: The RFC is merged into the repository (an RFC does not need to be complete to be merged, as long as its unresolved questions are listed in the RFC document and its quality has passed the review).
  5. Start implementing RFC: Create implementation issues for the RFC features in the repositories affected by the RFC and assign developers to implement these features.
  6. Finish RFC implementation: When all implementation issues are resolved and closed, the user stories from the Motivation section are checked against the implementation and the RFC document is then marked as Implemented.

When to write an RFC?

You need to write an RFC to change the functionality of OARepo modules. A change could be:

  • Adding/removing larger features and/or modules.
  • Changing existing features/APIs.
  • Changing design patterns, idiomatic usage or conventions.

Step 1: Request an RFC

Requestor
  • Open an issue.
    • Document 1) the motivation, 2) a summary of the proposed changes, and 3) the expected resources needed.
  • Label and assign the issue
    • All new RFCs should have the "Proposal: Pending" label and a label for the product if applicable (e.g. "NR").
  • Review the request
  • If rejected:
    • Add a justification to the comments of the issue.
    • Change the label to "Proposal: Rejected".
    • Close the issue.
  • If accepted:
    • Change the label to "Proposal: Accepted".
    • Create an RFC draft document in a new branch by running the new-rfc GitHub Action, which:
      1. Takes the issue number (e.g. 10) as user input
      2. Creates and checks out a new branch named in the format rfc-10
      3. Copies 0000-template.md to docs/0010-your-rfc-issue-title-slugified.md
      4. Fills the RFC document header: the current date as Start Date, the issue author as Authors, and the issue title in place of <RFC title>
      5. Creates a Pull Request from rfc-10 to the main branch
      6. Adds a link to the created PR to the RFC document header
      7. Links the issue with the PR
    • Assign the point-of-contact architect (the person responsible for drafting the RFC document)
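
The naming convention used by the action steps above can be sketched in a few lines of Python. This is only an illustration, not the actual new-rfc action: the helper names `slugify` and `rfc_branch_and_path` are hypothetical, and the exact slug rules (lowercase ASCII, hyphen-separated) are an assumption.

```python
import re
import unicodedata


def slugify(title: str) -> str:
    """Lowercase the title, strip accents, and join alphanumeric runs with hyphens."""
    ascii_title = (
        unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")


def rfc_branch_and_path(issue_number: int, issue_title: str) -> tuple[str, str]:
    """Return the branch name and the draft document path for an accepted request."""
    branch = f"rfc-{issue_number}"
    # The issue number is zero-padded to four digits, as in 0000-template.md.
    path = f"docs/{issue_number:04d}-{slugify(issue_title)}.md"
    return branch, path


print(rfc_branch_and_path(10, "Your RFC issue title"))
```

Zero-padding the issue number keeps the documents in docs/ sorted in the order in which the requests were opened.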

Step 2: Write the RFC

The following is optional. It is only advice on writing the RFC in a collaborative and efficient manner:

  • Choose an editor (a person) responsible for this phase (e.g. the architect or another OARepo team member)
  • Brainstorming phase:
    • Fill the template with unstructured bullet points and high-level outline.
    • Try to clearly define scope - what's included, and what's excluded.
    • Try to identify multiple options for solutions.
    • What issues should be addressed?
    • Identify possible stakeholders, and include them in the discussions.
  • Reading phase (moderated by editor):
    • Add a "Questions" subsection to each section.
    • Read the RFC and add questions/comments to the questions sections (prefix each question with <name>: ...). Be clear about the purpose of each question and support it with examples.
    • The purpose is to identify sections that need further discussion.
  • Discussion phase (moderated by editor):
    • Expect discussions on semantics, naming and scope to be long (i.e. hold these discussions first; subsequent discussions will be much faster).
    • Identify discussion points and list them in the document
    • Meet live to discuss discussion points
      • Moderator takes notes as bullet points for each discussion point/question.
      • Moderator must ensure everybody is explicitly asked about their opinion.
      • Conclusion:
        • Ask for preferred solution: Once sufficient discussion has taken place, the moderator asks each person for their preferred solution. The goal is to identify whether there is consensus or disagreement.
        • Propose conclusion: The moderator looks for a consensus solution and proposes it.
        • Explicitly ask everyone whether they agree
        • If consensus is not possible, the conclusion can be TBD; it may need more research and/or input from non-designated architects.
    • Meet live to discuss all questions
      • Use same procedure as for discussion points.
  • Cleaning phase:
    • Clean up the RFC document - it should be readable and coherent for a third party who was not part of the discussions.
    • Write up the summary, focusing on explaining the gist of the RFC to a third party.
  • Reviewing phase:
    • Ask for input from the non-designated architects and other stakeholders.

You can jump around between phases.

Disagreement resolution

Fight for what you believe, but gracefully accept defeat.

Please do your utmost to not have unresolvable disagreements! The more senior you are, the more responsible you are to not have unresolvable disagreements.

In case all attempts to reach consensus have failed, and really only as a very very last resort, the architects can resolve the conflict by taking a decision. This decision should be properly documented.

Step 3: Review the RFC

  • Comment on quality of the RFC, not the chosen solutions (this was already done in step 2)!
  • Can the RFC be understood by an experienced third-party that didn't participate in the discussions?
  • Is the RFC coherent?
  • Are the unresolved questions properly documented?
  • Are there sufficient resources to implement the RFC?
  • Set RFC status to Ready

Step 4: Merge the RFC

As soon as the RFC has reached sufficient quality level and consensus it can be merged into this RFC repository. The RFC does not need to be fully completed to be merged, as long as unresolved questions have been listed in the RFC.

Step 5: Start RFC implementation

  • Create implementation issues in repositories affected by the RFC, assign developers
  • Update RFC document's Implemented in: header with links to issues
  • Set RFC status to Being implemented

Step 6: Finish RFC implementation

When all implementation issues are resolved and closed:

  • Check that the user stories from the Motivation section are fulfilled by the implementation
  • Set RFC status to Implemented

RFC States overview

  • Draft: The RFC has reached sufficient quality to be merged and some discussions have happened, but there are open questions and it is not yet ready to be implemented.
  • Ready: The design is ready, enough discussions have happened to reach a reasonable consensus, and the quality of the RFC is good.
  • Being implemented: The RFC implementation has started.
  • Implemented: The RFC has been implemented in the community or code.
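
As an illustration, the four states can be modelled as a small forward-only state machine. This is a sketch under the assumption that an RFC only ever moves forward in the order given by Steps 3, 5 and 6; the names `RFCStatus`, `NEXT_STATUS` and `advance` are hypothetical and not part of any OARepo tooling.

```python
from enum import Enum


class RFCStatus(Enum):
    DRAFT = "Draft"
    READY = "Ready"
    BEING_IMPLEMENTED = "Being implemented"
    IMPLEMENTED = "Implemented"


# Assumed forward-only order: review sets Ready, implementation start sets
# Being implemented, and closing all implementation issues sets Implemented.
NEXT_STATUS = {
    RFCStatus.DRAFT: RFCStatus.READY,
    RFCStatus.READY: RFCStatus.BEING_IMPLEMENTED,
    RFCStatus.BEING_IMPLEMENTED: RFCStatus.IMPLEMENTED,
}


def advance(status: RFCStatus) -> RFCStatus:
    """Return the next status, or raise if the RFC is already in its final status."""
    if status not in NEXT_STATUS:
        raise ValueError(f"{status.value} is a final status")
    return NEXT_STATUS[status]


print(advance(RFCStatus.DRAFT).value)
```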

The OARepo RFC process owes its initial inspiration to the Invenio RFC process.

rfcs's People

Contributors

mirekys

rfcs's Issues

[Proposal] Submission of rfcs

Motivation

We need to define a process to handle change requests for OARepo platform from multiple parties.

Summary

  1. Request RFC (focus on scope): Before starting to write a new RFC, you first have to request the approval of the OARepo architects by opening an issue. This helps to scope the RFC, avoid duplication, and save everyone time.
  2. Write RFC (focus on content): If the architects accept the request, you start writing the RFC document collaboratively using the template. This part of the process focuses on the content, and an architect is assigned to support you in writing the RFC. The goal is for most discussions and alignment on a solution to happen in the writing phase.
  3. Review RFC (focus on quality): Once the RFC is complete, you submit a pull request with the new RFC for final review. The review focuses on the quality of the RFC, not on its content (which should already have been agreed upon in the writing phase).
  4. Merge RFC: The RFC is merged into the repository (an RFC does not need to be complete to be merged, as long as its unresolved questions are listed in the RFC and its quality has passed the review).

Resources

Timeline ASAP. Should be done by architects.

[Proposal] Communities backend

Motivation

Organize users and record submissions into community interest groups (similar to Invenio communities). Members of a community with elevated roles (editor, curator) should be able to manage the approval process of record submissions inside the community. We should take into account synchronization of community members from Perun AAI. Each community could have a different approval process (e.g. some steps could be skipped), with each role having different permissions.

Summary

Implement a library that provides:

  • REST APIs to manage & fetch configured user communities
  • DB models to store community configuration (permissions, approval process...)
  • models or necessary request classes that enable record submission to communities
  • synchronization tools for synchronization of community members with Perun AAI groups
  • admin interface

Resources

High priority - at least a basic approval workflow (skipping most of the approval steps) is needed for the next planned milestone

Primary assignees: @mirekys

[Proposal] Loose validation

OARepo Loose validation

Motivation

To process data harvested from external sources, it must be possible to store invalid records. The relevant error messages must be indexed so that they can later be aggregated and retrieved. Invalid records also need to be marked as non-valid.

Summary

Within the loosely validated models both valid and non-valid records are stored. Errors in data can be of two types:

  • Structural errors (wrong data type, non-existent field…)
  • Non-structural errors (too many characters in string, not enough values in object…)

In case of a structural error

The problematic field is removed from the record, and its value is stored in the field for non-valid values together with information about the original field. The corresponding error message is stored in the field for error messages, also with information about the original field. The record is labeled as non-valid.

In case of a non-structural error

The problematic field is stored as it is. The corresponding error message is stored in the field for error messages with information about the original field. The record is labeled as non-valid.
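
The two cases can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the function names, the shape of the `errors` input (dicts with `path`, `message` and a boolean `structural` flag) and the use of paths relative to the metadata dictionary are all assumptions made for the example.

```python
import copy


def pop_path(data: dict, path: str):
    """Remove and return the value at a dotted path such as 'authors.something'."""
    *parents, leaf = path.split(".")
    node = data
    for key in parents:
        node = node[key]
    return node.pop(leaf)


def apply_loose_validation(metadata: dict, errors: list[dict]) -> dict:
    """Return a copy of metadata with the errors recorded under 'oarepo:validity'."""
    result = copy.deepcopy(metadata)
    validity = {"valid": not errors, "errors": [], "invalid_fields": []}
    for error in errors:
        validity["errors"].append({"path": error["path"], "message": error["message"]})
        if error["structural"]:
            # Structural error: move the offending value out of the record.
            validity["invalid_fields"].append(
                {"path": error["path"], "content": pop_path(result, error["path"])}
            )
        # Non-structural error: the field stays in the record as it is.
    result["oarepo:validity"] = validity
    return result


harvested = {"title": "jej", "authors": {"first_name": "yxyxy", "something": "wrong"}}
found = [
    {"path": "title", "message": "Length must be between 5 and 10.", "structural": False},
    {"path": "authors.something", "message": "Unknown field.", "structural": True},
]
stored = apply_loose_validation(harvested, found)
print(stored["oarepo:validity"]["invalid_fields"])
```

Working on a deep copy keeps the harvested input untouched, which makes it possible to re-run validation after the rules change.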

Detailed design

oarepo:validity field

A field added to the top level of the JSON schema, the Elasticsearch mapping and the Marshmallow schema. It is used to store non-valid values and error messages.

JSON schema
{
  "oarepo:validity": {
    "type": "object",
    "properties": {
      "valid": {
        "type": "boolean"
      },
      "errors": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "path": {
              "type": "string"
            },
            "message": {
              "type": "string"
            }
          }
        }
      },
      "invalid_fields": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "path": {
              "type": "string"
            },
            "content": {}
          }
        }
      }
    }
  }
}
Elasticsearch mapping
{
  "oarepo:validity": {
    "type": "object",
    "properties": {
      "valid": {
        "type": "boolean"
      },
      "errors": {
        "type": "nested",
        "properties": {
          "path": {
            "type": "keyword"
          },
          "message": {
            "type": "keyword"
          }
        }
      },
      "invalid_fields": {
        "type": "object",
        "properties": {
          "path": {
            "type": "keyword"
          },
          "content": {
            "type": "flattened"
          }
        }
      }
    }
  }
}
Marshmallow schema
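from marshmallow import Schema, fields as ma_fields

class ErrorsSchema(Schema):
    # One validation error; mirrors the "errors" items of the JSON schema above.
    path = ma_fields.String()
    message = ma_fields.String()

class InvalidSchema(Schema):
    # One removed value together with the path of its original field.
    path = ma_fields.String()
    content = ma_fields.Raw()

class ValiditySchema(Schema):
    valid = ma_fields.Bool(required=True)
    errors = ma_fields.List(ma_fields.Nested(ErrorsSchema))
    invalid_fields = ma_fields.List(ma_fields.Nested(InvalidSchema))

class RecordMetadataSchema(Schema):
    _validity = ma_fields.Nested(ValiditySchema(), data_key='oarepo:validity', attribute='oarepo:validity', required=True)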

Record model

JSON schema

The JSON schema will be as unrestrictive as possible. Fields will have no constraints defined, and all validation will be handled by Marshmallow. That is, fields in the JSON schema may define only their names and types (and, for objects, the names and types of their properties). Additional properties are allowed.

Marshmallow

The Marshmallow schema fields contain all restrictions defined for the model. The base record schema inherits from a modified Marshmallow base schema, which will be defined in the oarepo-loose-validity library and will provide the entire validation logic.

Error analysis

The type of an error will be detected from the content of its error message. Marshmallow defines a set of basic error messages, and it is also possible to define custom validation rules and errors. It therefore needs to be decided how to distinguish structural errors from non-structural ones.
Options:

  1. Use regular expressions, treating everything as a structural error by default unless the error message matches the list of specified non-structural error messages. Custom error messages would carry a marker saying that the error is non-structural (for example the suffix “Loose validation”).
  2. Specify all non-structural errors in the application config (as regular expressions, because of errors of the type “less than XY” etc.)
  3. Something else?
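
A minimal sketch of the regex-based approach follows. It assumes the messages use Marshmallow's default wording for the `Length` validator and for missing required fields; the pattern list and the name `is_structural` are hypothetical, and in option 2 the same list would be read from the application config instead of being hard-coded.

```python
import re

# Hypothetical list of non-structural messages. Anything that does not match
# one of these patterns is treated as a structural error.
NON_STRUCTURAL_PATTERNS = [
    re.compile(r"^Length must be between \d+ and \d+\.$"),
    re.compile(r"^Shorter than minimum length \d+\.$"),
    re.compile(r"^Longer than maximum length \d+\.$"),
    # Treated as non-structural, following the discussion notes in this proposal.
    re.compile(r"^Missing data for required field\.$"),
    # Assumed marker suffix for custom non-structural errors.
    re.compile(r"Loose validation$"),
]


def is_structural(message: str) -> bool:
    """Return True unless the message matches a known non-structural pattern."""
    return not any(pattern.search(message) for pattern in NON_STRUCTURAL_PATTERNS)


for message in ("Unknown field.", "Length must be between 5 and 10."):
    print(message, "->", "structural" if is_structural(message) else "non-structural")
```

Defaulting to "structural" is the conservative choice: an unrecognized message causes the value to be moved out of the record rather than stored in a field whose type might not match the mapping.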

Example

Data Schema

from marshmallow import Schema, fields as ma_fields, validate as ma_valid

# ValiditySchema is the schema defined in the detailed design above.
class AuthorSchema(Schema):
    first_name = ma_fields.String(validate=[ma_valid.Length(min=5, max=None)])
    last_name = ma_fields.String(validate=[ma_valid.Length(min=5, max=None)])

class RecordMetadataSchema(Schema):
    title = ma_fields.String(validate=[ma_valid.Length(min=5, max=10)], required=True)
    authors = ma_fields.Nested(AuthorSchema)

    _validity = ma_fields.Nested(ValiditySchema(), data_key='oarepo:validity', attribute='oarepo:validity', required=True)

Harvested data

{
  "metadata": {
    "title": "jej",
    "authors": {
      "first_name": "yxyxy",
      "last_name": "xyxyx",
      "something": "wrong"
    }
  }
}

Stored data

{
  "updated": "1970-10-19",
  "id": "hmb7c-ryf20",
  "created": "1970-10-19",
  "metadata": {
    "oarepo:validity": {
      "valid": false,
      "errors": [{"path": "metadata.title", "message": "Length must be between 5 and 10."},
                 {"path": "metadata.authors.something", "message": "Unknown field."}],
      "invalid_fields": [{"path": "metadata.authors.something", "content": "wrong"}]
    },
    "authors": {
      "first_name": "yxyxy",
      "last_name": "xyxyx"
    },
    "title": "jej"
  },
  "links": {
    "self": "/validity_example/hmb7c-ryf20"
  }
}

Discussion

  • The oarepo:validity field will be a system field, i.e. it will not live at the metadata level but as a separate attribute in the table
  • Handle error messages with a loose validation wrapper plus generation of custom messages, because of problems with, for example, errors of the type "wrong date format"
  • Where do taxonomy errors belong? If the value has the correct structure, it can be stored, i.e. it is a non-structural error even when the value is not in the vocabulary
  • A missing required item is a non-structural error
  • Rename oarepo:validity to oarepo:metadataValidity to make it clear that it concerns validation errors in metadata and not in documents
  • A plugin for the model builder needs to be created in a separate library and attached via `oarepo:use: [loose-validity]`
  • The UI must account for the possibility that any item may be left unfilled
  • In the future, also introduce strict errors that can never be stored (for example for the needs of a user form)
