cortex-cms / cortex

:pencil: A headless, multitenant dynamic content platform powered by Rails, GraphQL and Elasticsearch

Home Page: https://docs.cortexcms.org/

License: Apache License 2.0

Ruby 66.39% JavaScript 8.40% CSS 15.75% HTML 9.46%

cms rails api headless custom-content content distribution multitenant reporting graphql

cortex's Introduction

Cortex CMS Engine

Cortex CMS is a multitenant identity, custom content distribution/management and reporting platform built by the Content Enablement team at CareerBuilder. Its purpose is to provide central infrastructure for next-generation applications, exposing a single point of management while enabling quicker build-out of new software.

Cortex adheres to a headless, API-only architecture, avoiding the monolithic, all-in-one architecture associated with CMSs like WordPress or Drupal.

Documentation

Cortex CMS features a comprehensive documentation portal. To get started, please refer to either the Docker Compose guide (recommended) or the manual setup guide.

Contributing

Anyone and everyone is encouraged to fork Cortex and submit pull requests, propose new features and create issues.

Review CONTRIBUTING for complete instructions before you submit a pull request or feature proposal.

License

Cortex utilizes the Apache 2.0 License. See LICENSE for details.

Copyright

cortex's People

Contributors

Stargazers

Watchers

Forkers

vikksr vashadow anandlakshman sboracb xenisys thedigitaloctopus

cortex's Issues

Redesign Header/Navigation

The header is a bit difficult to navigate. Currently, you must read each header item and familiarize yourself with the interface before you can begin. If common-sense operations were grouped together (e.g., user functions underneath a user dropdown), and icons added, the admin interface would become much more intuitive.

For inspiration, see: https://github.com/google/material-design-lite/tree/master/templates

Refactor ContentItem/FieldItem Model Implementations

They're complicated, unwieldy and slow!

More details forthcoming.

Upgrade Doorkeeper + Devise

These are significantly out of date now, and need an upgrade.

Dynamic Plugin Installation

Engineers wishing to prop up their own Cortex instance shouldn't have to modify core, and installation of plugins in a SaaS environment should be self-service. To accomplish this, we cannot rely on the standard Bundler implementation - we'll have to get dynamic!

Let's use this issue to map out potential solutions. Some places to start:

Restrict Admin Interface to Authenticated Users

Currently, states aren't protected by authentication - just the API resources they depend on. So an unauthenticated user can navigate directly to an administration-related route and will be presented with a view not populated with any data. This is confusing for many users - we should figure out a good solution.

Debug times out in RubyMine

Currently, debugging in RubyMine can be a bit of a pain - it times out and resumes execution after a certain amount of time. Look into and fix this, or get better at CLI debugging! Yeah!

Remove wildcard_redirect_uri

wildcard_redirect_uri represents a security risk, and we should instead be using the state OAuth parameter. See: doorkeeper-gem/doorkeeper@fd57c47

This precludes an upgrade to the latest Doorkeeper, as the feature has been removed as of 2.1.1.
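
As a rough illustration of the state parameter approach, the round trip can be sketched in plain Ruby. This is a minimal sketch and not Doorkeeper's API: issue_state, valid_state? and secure_compare are hypothetical helper names, and session is a plain Hash standing in for a real session store.

```ruby
require 'securerandom'

# Hypothetical helpers (not part of Doorkeeper or Cortex).
# `session` is a plain Hash standing in for a real session store.

# Before redirecting to the authorization endpoint, generate an
# unguessable state value and remember it for this browser session.
def issue_state(session)
  session[:oauth_state] = SecureRandom.hex(16)
end

# Compare two strings without short-circuiting on the first mismatch.
def secure_compare(a, b)
  return false unless a.bytesize == b.bytesize

  diff = 0
  a.bytes.zip(b.bytes) { |x, y| diff |= x ^ y }
  diff.zero?
end

# On the OAuth callback, accept the response only if the returned state
# matches the stored one. The stored value is deleted, so it is single-use.
def valid_state?(session, returned_state)
  expected = session.delete(:oauth_state)
  return false if expected.nil? || returned_state.nil?

  secure_compare(expected, returned_state)
end
```

With this in place, the redirect URI registered for a client can stay fixed, and any per-request context travels in state rather than in a wildcard-matched URI.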

Email as Unique Identifier for Bulk User Upload

What it says on the tin! Right now, if an administrator wants to update existing users with the Bulk Job tool, they need to manually set the id column for each existing user. In future, we should remove the id column entirely from the Bulk User Job template and treat email as the primary identifier for new/non-new users.
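
A minimal sketch of what email-keyed upserts could look like, using an in-memory Hash instead of the real user table. The method name upsert_users_by_email and the row shape are assumptions for illustration only; the actual Bulk Job code is not shown in this issue.

```ruby
# Hypothetical sketch: `existing` maps a normalized email to a user's
# attributes; `rows` are parsed rows from the Bulk User Job template.
def upsert_users_by_email(existing, rows)
  rows.each do |row|
    # Normalize so " Ann@Example.com " and "ann@example.com" match.
    email = row.fetch(:email).strip.downcase
    attrs = row.reject { |key, _| key == :email }

    # Update the user if the email is already known, otherwise create one.
    existing[email] = (existing[email] || { email: email }).merge(attrs)
  end
  existing
end
```

No id column is needed: a row whose email is already present updates that user, and any other row creates a new one.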

Rudimentary RESTful API

To liberate our data sooner rather than later, we should spin up a simple RESTful API for ContentTypes/ContentItems utilizing our existing RESTful infrastructure.

Redesign Login Page

The current login page is not the good face Cortex should be putting forward. It should be clean, informative, and have some small amount of branding. It should also contain some marketing for the CMS, and release and team updates. I'll update this issue with mockups, examples, and so on.

Rails 5

This issue exists to track Rails 5 upgrade feasibility. Rails 5 is now stable, so we should start evaluating our dependent libraries over the next couple of months:

acts-as-taggable-on: master branch contains stable 5.x support, release imminent
sprockets: Does not yet support Rails 5
sidekiq + sinatra: Sidekiq requires Sinatra for its management interface, which relies on an old version of Rack that is not supported by Rails 5.
redis-rails: 5.x support present in 5.0.0.pre version of this gem
paranoia: Very alpha-looking 5.x support present in master branch
rails-observers: 5.x support present in master branch

Once we can tick off all these libraries as stable/working, we can upgrade! During upgrading, we must follow this guide, as a decent bit has changed. We should also step through the release notes as a team.

Move Grape classes to /lib

The /app/api directory + modules, which contain Resources, Entities, and Grape Helpers, might make more sense living in /lib. This is how GitLab and other Grape/Rails projects structure things, and it makes the API classes more explicitly portable.

More justification of this forthcoming!

Cortex cannot be set up for local development with existing seeds - missing newsfeed

There doesn't appear to be any seed data.

Commenting out the code that pulls this in the helper lets me load the sign in page.

Expand Readme

The Readme is functional, but doesn't contain Linux instructions, nor details of Cortex's inner workings, methodologies and dependencies. Additionally, a good Readme should include contribution guidelines, so that the rest of the company can start getting involved with Cortex.

Replace Pagination Helper

There are now libraries available to set pagination headers for Grape APIs. We should look into replacing our own pagination helper methods with these. Here are some options:

Kaminari only (Grape): https://github.com/monterail/grape-kaminari
Kaminari + will_paginate (Rails + Grape): https://github.com/davidcelis/api-pagination
will_paginate only (Grape): https://github.com/remind101/grape-pagination

Tenants as Subdomains

Given our nested tenancy structure, this merits some debate re: value and implementation. Usually, SaaS products use a subdomain to indicate to the platform and to a user that they're restricted to content/config in that tenant. Our platform may be more 'flexible', however, and could make this paradigm more complicated and unnecessary. At the very least, it may be used to let supervisors bookmark a tenant and be dropped into it upon visiting/logging into that URL.

Let's use this issue to debate.

Enforce OAuth Scopes

Currently, OAuth Scopes are not being enforced for either user authorization or client authorization. This is a potentially massive security oversight, and should be taken care of immediately.
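
At its core, enforcement is a subset check between the scopes a token was granted and the scopes an endpoint requires. The following is a minimal sketch under that assumption; scopes_satisfied? is a hypothetical helper, and the scope names in the usage note are invented for illustration.

```ruby
# Hypothetical check. Scopes are space-delimited strings, as in OAuth 2.0.
# Returns true only if every required scope is present in the granted set.
def scopes_satisfied?(granted, required)
  (required.split - granted.split).empty?
end
```

For example, a token granted "view:posts" would satisfy an endpoint requiring "view:posts" but not one requiring "modify:posts".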

Cortex SASS variable auditing

As we approach the end of V2 and the dawn of V3, it's very important that we get our ducks in a row in terms of our style variables and our adherence to the style base that Susan drew up for us. Basically, we need to go through our SASS to make sure that every variable that's there should be there, and ensure that we are using our variables everywhere we can.

Hopefully it'll make our lives easier going into the Cortex React Renaissance!

Media ContentItems updated to Assets with different extensions will not be reflected in pre-existing Blog Posts

Currently I have fixed the inconsistencies with the image Asset thumb and style urls not changing with the Asset URL after updating by reusing the asset_field_type_id used in the existing asset endpoints. So this ensures that the original assets will get overwritten in S3.

This works just fine as long as the new asset has the same extension as the previous asset. If the updated asset shares the same extension, updates will be reflected immediately in all Blog posts referencing the ContentItem. But if the extension changes after updating, all pre-existing blog posts will still reference and display the previous asset.

Here is an example of an embedded image:

<media id="9bbf1db1-3d1b-493d-91b8-9c8b3e86cd04">
<img  src="http://localhost:3000/asset_field_types/assets/careerbuilder-original-391a9f00-8604-4b60-a53f-0a09ac9b0a70.png?1488226812" alt="" height="" style="" width="" />
</media>

With the above embedded image, the new image will not be reflected if the new image has a different extension like .jpg (but styled URLs will change).

Right now this is the metadata we pass to paperclip:

{:path=>":class/:attachment/careerbuilder-:style-:id.:extension",
 :styles=>
  {:mini=>{:format=>"jpg", :geometry=>"100x100>"},
   :large=>{:format=>"jpg", :geometry=>"1800x1800>"},
   :micro=>{:format=>"jpg", :geometry=>"50x50>"},
   :medium=>{:format=>"jpg", :geometry=>"800x800>"},
   :default=>{:format=>"jpg", :geometry=>"300x300>"},
   :post_tile=>{:format=>"jpg", :geometry=>"1140x"}},
 :processors=>["thumbnail", "paperclip_optimizer"],
 :s3_headers=>{:"Cache-Control"=>"public, max-age=315576000"},
 :preserve_files=>true}

All image assets, whether gif, png or svg, will use the .jpg extension, resulting in something like careerbuilder-large-376fcb4e-e270-40cd-aa09-51bdd2aaca20 using :class/:attachment/careerbuilder-:style-:id.:extension as a template.

Possible Options:

Save all original image asset URLs as .jpg or .image and allow the browser to infer the type. Of course for the styled URLs .gif will be converted into a .jpg
Change the embedding code to use a ContentItem id and use that to look up the asset's URL
Set up a has_and_belongs_to_many relationship for ContentItems and have updates with different extensions trigger an update background job to swap out the url.

EDIT:

To emphasize: because we pass format to paperclip in styles, if we send in a gif, the :default, :medium and all other styled URLs will no longer be an animated gif, but a jpeg of the first frame.
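
To make the failure mode concrete, here is a simplified stand-in for path interpolation. It is not Paperclip's actual code: interpolate is a hypothetical helper, and the class and attachment values are inferred from the example URL above.

```ruby
# Simplified stand-in for Paperclip-style path interpolation
# (hypothetical helper, not Paperclip's implementation).
TEMPLATE = ':class/:attachment/careerbuilder-:style-:id.:extension'

# Replace each :name placeholder with the matching value.
def interpolate(template, values)
  template.gsub(/:(\w+)/) { values.fetch(Regexp.last_match(1).to_sym).to_s }
end

asset = {
  class: 'asset_field_types',
  attachment: 'assets',
  style: 'original',
  id: '391a9f00-8604-4b60-a53f-0a09ac9b0a70'
}

# Same asset id, different uploaded extension.
png_path = interpolate(TEMPLATE, asset.merge(extension: 'png'))
jpg_path = interpolate(TEMPLATE, asset.merge(extension: 'jpg'))
```

Because :extension is part of the template, png_path and jpg_path differ even though the id is unchanged, which is why an embed that stored the .png URL keeps pointing at the old file.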

Endpoint Throttling

If Cortex is going to serve more than its internal users (especially if it's going to be open sourced), it needs endpoint protection, in the form of per-client request throttling.

Some questions to answer:

What should the default throttle rate be?
Certain trusted applications, like A&R, will be throttled significantly less by Cortex, so that it (A&R, or the trusted application) can handle its own rate limiting. What should the trusted application rate be?
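
One way to frame the two questions above is a fixed-window counter with a default limit and per-client overrides for trusted applications. This is a minimal in-memory sketch, not a proposal for specific rates: ClientThrottle and its parameters are hypothetical, and a real deployment would need shared storage and cleanup of old windows.

```ruby
# Hypothetical fixed-window limiter. Limits are placeholders, and the
# counters live in memory and are never pruned in this sketch.
class ClientThrottle
  def initialize(default_limit:, window_seconds:, trusted_limits: {})
    @default_limit = default_limit
    @window_seconds = window_seconds
    @trusted_limits = trusted_limits # client id => higher limit
    @counters = Hash.new(0)
  end

  # Returns true if the request is allowed, false if the client has
  # already used up its limit for the current window.
  def allow?(client_id, now: Time.now)
    window = now.to_i / @window_seconds
    key = [client_id, window]
    return false if @counters[key] >= limit_for(client_id)

    @counters[key] += 1
    true
  end

  private

  # Trusted clients get their own limit; everyone else gets the default.
  def limit_for(client_id)
    @trusted_limits.fetch(client_id, @default_limit)
  end
end
```

A trusted application would be registered in trusted_limits with a much higher limit, leaving it to handle its own rate limiting downstream.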

Search by Media Taxon Broken

Currently, searching on the Taxon/UID field via the Media search endpoint does not work.

Refactor Searchable Concerns

Currently, all the Searchable concerns for our models feature repeated code. Either they do not belong in a concern, or we need to abstract the repeated logic into the shared Searchable concern.

See: https://github.com/cbdr/cortex/tree/develop/app/models/concerns

Implement State Machine

This Issue will hold our discussions re: what state engine we go with. It looks like there are two options: state_machine and aasm. The former was abandoned 4 years ago, but now has a fork with a decent amount of activity. The latter looks just as active and fleshed out. Therefore, I'd like to go through a quick research phase where we suss out the value of each. I'll be doing some research, but please comment if you have any further insight.

Repair Testing on Travis

It's broken! Bypass their ElasticSearch versions and install 2.x so that tests can pass.

Revisit ES Indexing for all new Field Types

Every field type has an elasticsearch mapping. Unfortunately, many of the currently created field types are not correctly represented by their mapping. Revisit the mapping for each Field Type and ensure that the mapping / indexing for its data accurately reflects the relevant ES concerns.

Secure Password Reset Flow

Our current Password Reset implementation immediately changes a user's password - this could allow a malicious actor to reset a user's password without their consent. This is definitely a security risk and should be resolved before we consider open sourcing Cortex.
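
A common alternative is a two-step, token-based flow in which requesting a reset changes nothing until the user proves control of their email. The following is a minimal in-memory sketch under that assumption; PasswordReset, TOKEN_TTL and the method names are hypothetical, and a real implementation would persist the digest and email the raw token.

```ruby
require 'securerandom'
require 'digest'

# Hypothetical sketch of a token-based reset flow (in-memory only).
class PasswordReset
  TOKEN_TTL = 3600 # seconds; placeholder value

  def initialize
    @pending = {} # token digest => { email:, expires_at: }
  end

  # Step 1: a reset is requested. The password is NOT changed here.
  # Only a digest of the token is stored; the raw token goes to the user.
  def request(email, now: Time.now)
    token = SecureRandom.urlsafe_base64(32)
    digest = Digest::SHA256.hexdigest(token)
    @pending[digest] = { email: email, expires_at: now + TOKEN_TTL }
    token
  end

  # Step 2: the user presents the token. Returns the email whose password
  # may now be changed, or nil if the token is unknown, used or expired.
  def redeem(token, now: Time.now)
    entry = @pending.delete(Digest::SHA256.hexdigest(token.to_s))
    return nil if entry.nil? || now > entry[:expires_at]

    entry[:email]
  end
end
```

Someone who only knows a user's email address can trigger step 1, but cannot complete step 2 without access to that user's inbox.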

Replace require_scope! and other OAuth helpers

There are now Doorkeeper-specific libraries that provide DSL extensions for Grape for enforcing scope requirements, as well as various helpers for grabbing access token, etc. We should switch to using them, especially as Doorkeeper has changed their API a decent bit.

See:
https://github.com/fuCtor/grape-doorkeeper
https://github.com/antek-drzewiecki/wine_bouncer
And in Doorkeeper 3.x: https://github.com/doorkeeper-gem/doorkeeper/tree/master/lib/doorkeeper/grape

Decorator Plugin System

@ElliottAYoung will be executing on this story, and he will be commenting on this Issue with additions to what's been laid out here. Naming for this system has yet to be fully decided - 'plugin system' is awfully vague, and this proposal should try to account for a better vocabulary.

From JIRA:

As a Cortex end-user, I'd like to see informative bits of data related to my ContentItems presented in a user-friendly fashion, so that I may more easily find and edit my content.
Currently, our decorator system features the ability to render 'methods'. If we want to use this beyond demonstration stuff, we'll need to build a plugin system that removes presentation details from our models (methods on our ContentItem model that render out HTML for the Decorator Cells to consume) and extracts them to Cells. This will also allow for more flexibility in what we can display - the best example of this is the little robots that appear in our Index pages. It is not currently reasonable or responsible to display a ContentItem author's gravatar with our current 'method' system, but would be easy to do with a plugin system that utilized Cells.

AC:

Upon inclusion in a Decorator, this system can render out things such as:
- Media Info boxes
- Publisha-bility
- Author's (not creator's, which is a standard AR field) gravatar
- Media thumbnail
System should be able to handle creation/edit gracefully. i.e.:
- If it's a publishable box, it should be displayable both during creation and edit stages, and user's selections should be persisted.
- If it's a Media Info box, it should only display during edit, and not error during creation
Multiple cells should be allowed to be arranged anywhere within the wizard
CSS classes can be applied, grid_width can be applied
System should allow for configuration, such as:
- Media info box needs a way to know what field contains the data it should render out. i.e. some field (and its concrete fielditem) possesses the data necessary for Media Info to render out filename, etc.

Technical AC:

Plugins live alongside other plugins (FieldTypes, Cells) in the cortex-plugins-core gem
Plugin system that allows the rendering of cells, dynamically, within both the Wizard and Index decorators
- Configuration/data can be passed to the cell
- By default, 'show' method is called to render the cell
- Optionally, a different method (something other than 'show') can be invoked for the render output
- Analyze use cases for index decorator (convenience methods?). You have the freedom to determine whether the Plugin system is appropriate for the Index Decorator, but the 'Media Thumbnail in Index' use case must be accounted for as part of this story, either way.

Rails 5.1 Upgrade

This issue tracks our upgrade path for Rails 5.1. Please log potential upgrade issues as a comment in this issue.

ElasticSearch 5.x Upgrade

This issue tracks our upgrade path for ElasticSearch 5.x. Please log potential upgrade issues as a comment in this issue.

Remove "Rails Assets"

Rails Assets is occasionally a bother. Packages are frequently out of date, the re-packaging queue trudges along slowly (though this is better lately), and there are occasionally incompatibilities with malformed Bower configs, or poorly configured projects. It would warm my heart to switch to a tool like bower-rails.

This is low-priority, I suppose!

Move Organization

To accommodate all the repositories we're going to be creating, we need to move Cortex and its dependencies over to the cortex-cms organization.

Dashboard/Timeline

There's a black hole of information in Cortex - a CMS should be very good at telling an administrator what's changing within their realm. We need a ticker/timeline either as a drawer or as a dashboard. Will update with potential gems and mockups.

If we switch to JCR/CMIS/etc, we could make a wonderful contribution to the community by creating a gem/JCR extension (?) that appends a bit of metadata to the base content object: timeline icon, display format, etc.

Cortex/Plugin Testing Infrastructure

In progress: https://cb-content-enablement.atlassian.net/browse/COR-631

Content Libraries

We've mostly coalesced behind this idea as our main content distribution model. Content creators would be able to create a 'library' and publish content to it. This library would be able to be distributed outside of a tenant. Upon creation, a user would be able to generate a link that would enable sharing of that content, or specify an email/tenant admin to send a private link to. Upon clicking that link, a different tenant administrator would be able to add that library to their list of libraries for that tenant (or Org?). From there, they'd be able to distribute individual pieces of content or set up distribution rules to automatically make the content available as part of their tenant's feed. Uniqueness checks would be performed when distribution is applied, and uniqueness conflicts could be resolved at that point.

In the longer term, we may want to consider even more distribution models - this one has its limitations. At that point, however, we would have to prove the use case before we made things more complicated with another way to distribute.

I'll be uploading a diagram sometime this quarter or next that digs into this idea a bit more.

Things to consider:

Permissions. Upon distribution, who controls the library's settings? Is a copy of the library made, or is it a reference? Is there a thin layer on top that merely controls the settings of the library rather than copies it entirely?
After distribution, can the original library creator continue adding content?
Can someone who picks up the library from a distributor add content to the library themselves? This may be akin to the 'view vs edit' setting you select when creating a public link in Google Drive
Can the library be distributed outside of the hosted instance entirely? This may be very powerful for some users. Could use a webhook/pubsub/microservice pattern to implement something like this. Would we lose flexibility this way?
Automatic distribution rules
Uniqueness validations
Relationship with tenants and organizations
Use in API

ElasticSearch after suite is breaking Semaphore

The after_suite hook in spec_helper is causing the following error on Semaphore when executed:

/home/runner/.rbenv/versions/2.3.1/bin/rspec: No such file or directory - elasticsearch

An error occurred in an `after(:suite)` hook.
Failure/Error: Elasticsearch::Extensions::Test::Cluster.stop(port: 9200) if Elasticsearch::Extensions::Test::Cluster.running? on: 9200

NoMethodError:
  undefined method `empty?' for nil:NilClass
# ./spec/spec_helper.rb:74:in `block (2 levels) in <top (required)>'

RSS Decorator Proposal

We are now going to be serving RSS Feeds via custom RSS decorators - this issue will serve as the documentation ground for all discussions surrounding that.

Feed Index (Search/Relevancy) Not Filtering Expired Posts

Currently, we're seeing stale, expired Posts appear in our Feed Index endpoints, both for search and relevancy. These should already be filtered out, so we'll need to research why that is no longer occurring.

Use RFC-5988 for Pagination Web Linking

See: http://tools.ietf.org/html/rfc5988
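
As a rough sketch of what RFC 5988 pagination links look like in practice, the helper below builds a Link header value for page-based pagination. The name pagination_link_header and the page/per_page query parameters are assumptions for illustration, not Cortex's existing helper.

```ruby
require 'uri'

# Hypothetical helper: builds an RFC 5988 Link header value with
# first/last and, where applicable, prev/next relations.
def pagination_link_header(base_url, page:, per_page:, total:)
  last_page = [(total.to_f / per_page).ceil, 1].max

  rels = { 'first' => 1, 'last' => last_page }
  rels['prev'] = page - 1 if page > 1
  rels['next'] = page + 1 if page < last_page

  rels.map do |rel, number|
    query = URI.encode_www_form(page: number, per_page: per_page)
    %(<#{base_url}?#{query}>; rel="#{rel}")
  end.join(', ')
end
```

Clients can then follow rel="next" until it is absent, rather than computing page numbers themselves.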

Replace Redactor/CKEditor with Scribe

After disappointing trials with both Redactor and CKEditor, I've been made aware of a really interesting new WYSIWYG editor tool produced by The Guardian. It's built on top of ContentEditable and is focused on extensibility at its core. This meshes very well with our in-page editing desires, and our complex content editing suite. I believe more research should be done, and a POC generated for this tool before we move forward, but it looks incredibly promising. There also exists ContentTools, which looks superb as well!
See:

On top of Scribe, there exists a "web content" manipulator that we could use in conjunction with our custom media types. See: http://madebymany.github.io/sir-trevor-js/

Dockerize

To be frank, no one wants to use Cloud 66's Rails provisioning. Let's dockerize Cortex.

Flexible Content Model

Most recently, Cortex has started using a more flexible content model, but this has led to confusing ActiveRecord associations. In our opinion, the cleanest way to support the kind of complex content model we desire is by using a graph database of some kind.

We should evaluate content-specific repositories that build on the concept of a graph database, such as JCR, CMIS, etc. We should also research using a graph database on its own, and implementing versioning, etc ourselves.

The best way to clean the domain, in my mind, is to allow system administrators/developers to create custom content types on the fly. For example, an A&R Post would inherit from a normal Post and add a couple extra ONET fields. JCR and other systems make this very easy.

Unfortunately, JCR is tied to Java. At the very least, it seems we would have to switch to JRuby (fine with me!) and start using/maintaining safety-pin. Alternatively, we could re-write Cortex's backend in something like Groovy, as suggested by Colin. Popular JCR implementations include the reference implementation, Apache Jackrabbit, and ModeShape. At first blush, Jackrabbit's documentation seems sparse, while ModeShape's implementation seems more advanced and friendly, and less tied down to implementing the complete spec.

CMIS is language agnostic, exposing its functionality purely via a SOAP API. It can act as an abstraction layer on top of JCR, much like ODBC standardizes different SQL services. Another plus is its ability to store data in a SQL database. The JCR's storage backend, on the other hand, is completely opaque and can only be interacted with via the JCR API. In critiques of CMIS, however, I've read that its content model is too focused on streams of binary data/files, rather than more human-readable storage.

Youtube API v2 -> v3

Youtube Media creation currently uses the YouTube v2 API, which is technically deprecated past April 20, 2015. Google no longer guarantees its operation and it could disappear suddenly.

We should prepare for the inevitable by replacing it with the v3 API. The relevant endpoint is discussed here: https://developers.google.com/youtube/v3/docs/videos/list
General implementation guide: https://developers.google.com/youtube/v3/guides/implementation

UPDATE: The V2 API has been deprecated for the endpoints we interact with.

Remove Jargon

Cortex's Jargon dependency represents great technical debt, and a burden on people setting up Cortex for the first time. Let's debate its usefulness in this Issue, and ensure removing it is the right decision before we do so. I'll update this issue with specific issues with Jargon, and its pros/cons.

FieldType 'Dynamic' Service Layer

Currently, a good chunk of Plugin-specific logic occurs at Cortex's Service/Model layer (i.e. tags, tree logic, etc). It should be the responsibility of the plugin to orchestrate a FieldType's side effects. To this end, we should create a pattern that allows developers to utilize Service objects in their engines that wrap around the FieldType models. Cortex will work with this layer when it wants to write/update/retrieve relevant data. This will allow us to not only clean up the core domain, but also gain access to previously off-limits functionality - such as current_user. Additionally, this feature is mostly necessary for a responsible AssetFieldType implementation using shrine.

This issue will track risks, weaknesses, and questions about this potential new piece of infrastructure.

Set ETags on Feeds and S3 Assets

We should consider using ETags in our headers for Feeds (Posts and Webpages) and for S3 assets (so that Cloudfront busts its cache). This is a more secure, standardized and less brittle solution than our current methods for busting cache (busting it via a non-authorized endpoint, or via low TTL).
See: http://www.w3.org/2005/MWI/BPWG/techs/CachingWithETag.html

I'm unsure if Cloudfront obeys ETags, but I've seen other asset delivery networks that do.
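
For the Feed side, the conditional GET mechanics can be sketched without any framework. This is a simplified illustration: etag_for and conditional_response are hypothetical helpers, the ETag is a strong tag derived from the response body, and only a single exact If-None-Match value is handled (no weak tags or comma-separated lists).

```ruby
require 'digest'

# Hypothetical helper: derive a strong ETag from the response body.
def etag_for(body)
  %("#{Digest::SHA256.hexdigest(body)}")
end

# Returns [status, headers, body] for a conditional GET.
# If the client's If-None-Match value equals the current ETag, the body
# is omitted and 304 Not Modified is returned.
def conditional_response(body, if_none_match)
  etag = etag_for(body)
  return [304, { 'ETag' => etag }, ''] if if_none_match == etag

  [200, { 'ETag' => etag }, body]
end
```

When the feed content changes, its digest changes, so caches revalidate and receive the new body without needing a separate cache-busting endpoint.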

Swappable Asset Backend

There are multiple use cases for a flexible asset backend. If we move Paperclip configuration to Tenant Management, we can:

Select buckets based in other countries. For example, Greece A&R would want a European-based S3 bucket.
Customize asset URLs per-tenant.
Create adapters to use non-Cortex DAMs as the backend, should a customer or application need it.
Doesn't necessarily require a swappable asset backend: Automatically spin up new buckets for every tenant to truly segregate asset data.

Research Tenancy Libraries

Evaluate new Tenancy libraries in the wild, consider replacing home-grown solution (which is working well enough). In JIRA: https://cb-content-enablement.atlassian.net/browse/COR-679

ContentItem FieldType

This is a proposal for a FieldType that will allow a content creator to associate one ContentItem with another ContentItem. There's a few requirements that make this feature tricky and non-obvious:

Once selected, the system needs to be able to display the selected ContentItem in a logical way. For example, if the ContentType represents 'Media', you'd expect to see a thumbnail and perhaps some basic info about the Media's asset in the selection box. A good example of this is the Media selection for Posts in the Legacy system:

The tricky bit is the dynamic nature of Cortex. We can't know if a ContentType is going to conform to our expectations. In one system, a Media may have one associated AssetField - in another, Media may have 3 AssetFields. We need a way to suss out how to inform the user of their selection.
In Cortex, a popup will be used to display potential ContentItem reference selections. The popup should work like any other index page on Cortex, and utilize Index Decorators. This means we must be able to specify a Contract to use.
Within this popup's Contract index, results must be further refined. For CareerBuilder's 'Tile Media' example, a Media popup would utilize our standard Media contract for index display, but we'd only want to display images and videos as possible options for association. A Tile Media that's a PDF wouldn't make much sense, and couldn't be rendered by any reasonable frontend. To accomplish this filtering, we'd probably want a way to pass in search criteria.
Indexing and API presentation. We'll want a way to succinctly return information about the referenced ContentItem. Usually, we'd want to see just the relevant FieldItem. We'd use this in the API representation, and also use its data hash to populate the ElasticSearch index. Since we'll be using GraphQL, the user could also request the entire referenced ContentItem, and the system should be able to return it.

With all of these requirements in mind, the solution I'm proposing should be accommodating. When defining a Field, you'll be able to pass in some configuration in the metadata field:

blog.fields.new(name: 'Tile Image', field_type: 'content_item_field_type',
                metadata: {
                  contract_id: 1, # How to render Selection Index?
                  field_id: 1, # How to present Selected ContentItem?
                  queries: [ # How to filter Selection Index?
                    {
                      filter_by_asset_content_type: ['image/png']
                    }
                  ]
                })

There's no need to specify the ContentType, as that's implied by the field_id's parent. queries is a sorted array that specifies queries to be executed, in the sorted order, to filter or sort the data to be presented in the specified Contract. This utilizes the yet-to-be-built Dynamic Query system, which will build ElasticSearch queries on-the-fly using query builder helpers contained within FieldType plugins.

To display the selected ContentItem in the creation interface, we need a way to represent the FieldType associated with the field_id specified in the metadata config hash for the ContentItemField. To accomplish this, I feel there should be an interface that all FieldTypes can conform to in order to implement this feature. By exposing a public render method, association, a FieldType's Cell can help present information to the user about the selected ContentItem.

Tenant Switcher/Indicator

This is less about the visual representation for users of a switcher, but more of the technical implications:

Showing all available tenants based upon logged in user
Seamless transition between tenants without requiring a re-login, etc.

In JIRA: