flipkart-incubator / krystal


License: Apache License 2.0

Java 98.45% Groovy 1.20% ANTLR 0.35%

krystal's Introduction

The Krystal Project

Contribution

All java code should be formatted using the google-java-formatter.

Latest Releases

///// This doc is a WIP

Super-charging the development of synchronous and asynchronous business workflows.

Introduction

The Krystal Project facilitates a developer-friendly way to write high-performance, complex business logic which is easily maintainable and observable. The project encompasses the following major components.

Vajram: A programming model which allows developers to design and write code for synchronous scatter gather business logic in a 'bottom-up' (choreographed) manner.
Krystex: A runtime environment which executes synchronous parts of the code written in vajram programming model in an optimal way by understanding static dependencies between pieces of code, creating a logical Directed-Acyclic-Graph, and executing the DAG with maximal concurrency.
Honeycomb: An asynchronous workflow orchestrator which orchestrates the asynchronous parts of workflows written in the vajram programming model.

Bottom-up vs. Top-down programming

The Krystal project supports two ways of writing business logic - Bottom-up and top-down.

Bottom-up programming or Choreography

Bottom-up programming is a paradigm in which a developer focuses on one atomic piece of business logic and its dependencies without being aware of when, how and in what order the code will be executed with respect to other business logic in the application. Orchestrating the execution of the code is the responsibility of the platform runtime, which analyzes the static dependencies declared by each piece of business logic, builds a Directed Acyclic Graph, and executes all the different pieces of business logic in the most optimal fashion. This is also called Choreography.

Top-down programming or Orchestration

Top-down programming is a paradigm in which the developer wants complete control over when and in what order a piece of business logic is executed. This means that the developer has an awareness of the uber workflow and a design intent for ordering of the various operations which are part of the workflow in a specific pre-determined order. This is also called Orchestration.

Example 1

Let's say a developer wants to write a piece of business logic that returns the population of the capital of a country, and let us say there are two existing APIs which return the capital of a given country and the population of a given city respectively.

Bottom-up programming

//Pseudo-code
name: populationOfCapital
inputs: country
dependencies: 
  city = capital(country)
  population = population(city)
return:
  population

Top-down programming

inputs: country
workflow:
  return 
    getCapitalOfCountry(country)
    .getPopulationOfCity()
    .returnValue()
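
The top-down form of Example 1 can be sketched in plain Java with CompletableFuture. This is only an illustration and not the Vajram API: the class name, the `capital` and `population` methods, and the stub data (a made-up country and city) are all hypothetical stand-ins for the two existing APIs mentioned above.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;

/** Illustrative sketch only; this is plain Java, not the Vajram API. */
final class PopulationOfCapitalSketch {

  // Hypothetical stub data standing in for the two existing APIs.
  private static final Map<String, String> CAPITALS = Map.of("Freedonia", "Fredville");
  private static final Map<String, Long> POPULATIONS = Map.of("Fredville", 1_000L);

  /** Stand-in for the existing "capital of a country" API. */
  static CompletableFuture<String> capital(String country) {
    return CompletableFuture.supplyAsync(() -> CAPITALS.get(country));
  }

  /** Stand-in for the existing "population of a city" API. */
  static CompletableFuture<Long> population(String city) {
    return CompletableFuture.supplyAsync(() -> POPULATIONS.get(city));
  }

  /** Top-down: the author spells out the order, first the capital, then its population. */
  static CompletableFuture<Long> populationOfCapital(String country) {
    return capital(country).thenCompose(PopulationOfCapitalSketch::population);
  }

  public static void main(String[] args) {
    System.out.println(populationOfCapital("Freedonia").join()); // prints 1000
  }
}
```

In the bottom-up form, the author would only declare that `population` needs a `city` and that `city` comes from `capital(country)`, leaving the chaining itself to the runtime.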

Example 2

Let's say a developer is coding the recipe for making a pizza

Bottom-up programming

//Pseudo-code
----------
name: dough
owner: dev1 (kneader)
inputs: water, flour
dependencies: []
return:
  rest(water + flour, 5hours)
----------
name: pizzabase
owner: dev2 (base_maker)
inputs: water, flour, thickness, radius
dependencies: 
  dough = dough(water, flour)
return 
  shape(dough, thickness, radius)
----------
name: preheated_oven
owner: dev3 (oven_manager)
inputs: temp, time
dependencies: []
return
  preheat_oven(temp, time)
----------
name: standard_wheat_pizza
owner : dev4 (storefront)
inputs: water, wheat_flour, cheese, toppings[]
dependencies:
  base = pizzabase(water, wheat_flour, thickness = 5mm, radius = 9inch)
  oven = preheated_oven(200degCel, 15min)
return
  bake(oven, base + cheese + toppings, 20min, 200degCel)

As you can see, each piece of business logic is owned by a separate owner, and no single owner knows the complete end-to-end workflow. With each kryon declaring its local dependencies and interacting with those dependencies, the final pizza gets made without any single developer knowing the complete recipe. Each developer just needs to focus on their part of the problem statement (recipe).
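
One consequence of declaring only local dependencies is that independent pieces can run concurrently. The following is a minimal sketch of that idea in plain Java, assuming CompletableFuture as a stand-in for the runtime; it is not Krystex code. The method names mirror the pseudo-code above, and the string "results" are purely illustrative.

```java
import java.util.concurrent.CompletableFuture;

/** Illustrative sketch only: plain CompletableFuture code, not the Krystex runtime. */
final class PizzaDagSketch {

  static CompletableFuture<String> dough(String water, String flour) {
    return CompletableFuture.supplyAsync(() -> "dough(" + water + "+" + flour + ")");
  }

  /** Depends only on dough. */
  static CompletableFuture<String> pizzaBase(
      String water, String flour, String thickness, String radius) {
    return dough(water, flour)
        .thenApply(d -> "base(" + d + "," + thickness + "," + radius + ")");
  }

  /** Has no dependencies, so it can start immediately. */
  static CompletableFuture<String> preheatedOven(String temp, String time) {
    return CompletableFuture.supplyAsync(() -> "oven(" + temp + "," + time + ")");
  }

  /** Depends on base and oven; those two do not depend on each other and may overlap. */
  static CompletableFuture<String> standardWheatPizza(
      String water, String wheatFlour, String cheese, String toppings) {
    CompletableFuture<String> base = pizzaBase(water, wheatFlour, "5mm", "9inch");
    CompletableFuture<String> oven = preheatedOven("200degCel", "15min");
    return base.thenCombine(
        oven, (b, o) -> "bake(" + o + "," + b + "+" + cheese + "+" + toppings + ")");
  }

  public static void main(String[] args) {
    System.out.println(standardWheatPizza("water", "wheat_flour", "cheese", "olives").join());
  }
}
```

Here the ordering is still written by hand inside `standardWheatPizza`; in the bottom-up model described above, the runtime would derive it from the declared dependencies.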

Top-down programming

//Pseudo-code
---------
Owner: Dev5 (chef)
return:
       startWorkflow("Standard_wheat_pizza")
          .preheatOven(200degCel, 15min)
          .as(preheatedOven)
          .take(water, wheatFlour)
          .makeDough(water, wheatFlour)
          .as(dough)
          .makeBase(dough, 5mm, 9inch)
          .as(base)
          .add(cheese).on(base)
          .add(toppings).on(cheese)
          .as(unbakedPizza)
          .bake(unbakedPizza).in(preheatedOven, 15min)
          .return()

As you can see here, a single developer has the complete knowledge of the recipe and has coded that logic into a single uber workflow.

Design Goals

Components of the Krystal Project try to adhere to the following design goals:

Separation of functional contracts and non-functional implementation details of dependencies: Developers coding a piece of functional business logic which depends on some other piece of business logic should be completely shielded from non-functional implementation details of the dependency as well as the runtime details of the environment in which the business logic is executing. Here non-functional requirements include:
- Session-level Caching
- Concurrency mode (multithreaded thread-per-task model, vs. reactive vs. hybrid model)
- Batching and batch size of dependency inputs
Minimize lateral and upward impact of code changes: If all the business logic executing to fulfill a request is seen as a Directed acyclic graph, a code change being made in a kryon in the graph should not impact the implementation of any other kryon in the graph which is a direct descendant of it.
Zero glue code: If there is business logic A and business logic B, all the relevant business logic should be contained within the artefact which represents each piece of business logic.
Out-of the box Non-intrusive batching of service-call inputs
Optimal end-to-end execution: This programming model is designed to be adopted for executing complex business logic in systems that power features and user experiences which directly are accessed by end users of websites with heavy traffic, thus strongly needing a low-latency, high-throughput execution runtime.
- Minimize use of native OS threads
- Avoid the possibility of developers erroneously blocking on thread-blocking code prematurely.
- Streaming-over-network-connection capable
Avoid unnecessary bottlenecks and latency long poles: ...
Developer friendliness:
- Minimal bootstrapping overhead
- Dependency discovery
- Code generation
- Single point of coding - One-to-one atomic mapping of coding units and functional requirements.
- Minimize mental-overhead and avoid anti patterns inherent to reactive coding
- Programming-language-native developer experience: For example, type safety.
Observability - application owners get this Out of the Box:
- Circuit Breaking
- Live Service Call Dashboards
- Degradation levers
- Metrics
- Logging
- DAG visualization of a request
Testability
- Backward-incompatibility detection
- Mocking-out-of-the box
- Declarative unit test definition
- Auto-generated unit test code/templates
Programming language agnostic spec definitions: Although the project currently supports the java programming language, keeping the programming-model spec language-agnostic allows the programming model to have implementations in different languages and allows application owners/feature-developers to choose a language of their choice.

Krystal developer FAQs

Version bump up to Krystal

Follow the steps below to bump the Krystal version:

Build and publish vajram-codegen with the new version to the local maven repo. Ensure that the versions of vajram-codegen's dependencies are not updated.
Update the version in the Krystal project's build.gradle file to the new version and revert the vajram-codegen version to the previous version. Do a complete build and publish to the local maven repo.
Update vajram-codegen to the new version.

Example : Need to update version from 1.6 to 1.7

vajram-codegen (build.gradle) update

update version from 1.6 to 1.7
Build and gradle publishToMavenLocal in krystal root directory

krystal (build.gradle) update

update version from 1.6 to 1.7
set classpath 'com.flipkart.krystal:vajram:'+ project.krystal_version in vajram-codegen's buildscript block to classpath 'com.flipkart.krystal:vajram:1.6'
Build and gradle publishToMavenLocal in krystal root directory

Final update

revert classpath 'com.flipkart.krystal:vajram:1.6' in vajram-codegen's buildscript block to classpath 'com.flipkart.krystal:vajram:'+ project.krystal_version
Build, run gradle publishToMavenLocal, and publish in the krystal root directory

krystal's People

Contributors

Stargazers

Watchers

Forkers

ramanvesh prabhfk vinisha-parwal vaibhav-kole nahata-p-aditya brajeshprashantfk pratyay surbhi-1206 digvijay-rawat ajitraj88 mdjunaidmahmood dibyanideb

krystal's Issues

#BindableVajramLogicParams Add ability to bindFrom specific inputs in the VajramLogic #Vajram #Vajram-codegen

@VajramLogic
public static methodName(@BindFrom("dependency_name") DependencyResponse response){
 //
}

should be allowed

#ValidateNoDuplicateResolvers

vajram build should fail if two resolvers resolve the same input of the same dependency
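
A minimal sketch of how such a check could work, assuming a simplified model in which each resolver names one dependency and the inputs it resolves. The `Resolver` class and `findDuplicates` method are hypothetical and not part of vajram-codegen.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative sketch only; the model below is hypothetical, not vajram-codegen's. */
final class DuplicateResolverCheck {

  /** Hypothetical model: a resolver targets one dependency and resolves some of its inputs. */
  static final class Resolver {
    final String dependency;
    final List<String> resolvedInputs;

    Resolver(String dependency, List<String> resolvedInputs) {
      this.dependency = dependency;
      this.resolvedInputs = resolvedInputs;
    }
  }

  /** Returns every "dependency:input" pair that more than one resolver claims to resolve. */
  static List<String> findDuplicates(List<Resolver> resolvers) {
    Set<String> seen = new HashSet<>();
    List<String> duplicates = new ArrayList<>();
    for (Resolver resolver : resolvers) {
      for (String input : resolver.resolvedInputs) {
        String key = resolver.dependency + ":" + input;
        if (!seen.add(key)) { // add() returns false if the pair was already claimed
          duplicates.add(key);
        }
      }
    }
    return duplicates;
  }
}
```

A build step could fail whenever `findDuplicates` returns a non-empty list.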

#FanoutAwareVajramModels Add ability to declare a dependency as fanout enabled/disabled in #vajram #vajram-codegen

In *.vajram.yaml developer should be able to mark a dependency as fanoutEnabled: true/false.
This should tell the vajram code generator that AllInputs.java should not contain a dependency response. Instead it should have a ValueOrError, simplifying the developer's task of accessing dependency values.

#VajramLang Create the grammar for a new programming language for vajrams

This is currently for educational, readability and explicability purposes.
The idea is to design a high level programming language grammar that allows us to express the ideas of vajram/knode in a clear and concise manner.
This concise grammar and code can guide the design of vajrams in java to be as close to the ideal as possible.

At a later time, this programming language grammar can be developed into a full-fledged programming language (either JVM based, or otherwise) allowing us to write native vajrams without annotations and code generation.

Add support for optional and mandatory dependencies #Vajram #Krystex

Open question: What happens when a subset of requests fail in batch-call to a dependency?

#dependencyInjection Add support for dependency Injection of arbitrary named inputs in #krystex #vajram-krystex

Krystex should be able to inject inputs which have not been provided by client nodes.
This has to be done without adding any external 3rd party dependencies and clients should be able to use their own custom DI frameworks like Guice.

Complete support for IO Vajrams and IO Nodes #Vajram #Krystex

The current implementation is aimed to be generic and should work for IO Vajrams and Nodes as well. But most of the code is gated behind entity.isBlocking() checks. Remove these checks, test that IO Vajrams and Nodes are working, make necessary changes and add test cases for the same

Add code generator for VajramImpl in #vajram #vajram-codegen

Make `Inputs` class an interface which can be implemented per vajram

Currently, Inputs is a final class encapsulating a map. This means every time we need to access any single input from the class we need to do an ImmutableMap lookup. While an ImmutableMap lookup is O(1) and might be acceptable in common scenarios, this can add up to consume some significant (single digit %) CPU in a framework like Krystal where we process thousands of requests per second and each request accesses inputs multiple times. While the ImmutableMap lookup might be inevitable in some scenarios, it can be avoided in other scenarios where the call is being made from inside vajram code, which is data type and input schema aware, rather than krystex code, which is data type and input schema unaware.

To take advantage of this difference, we can do the following:
We make the Inputs class an interface and let vajram-codegen auto-generate an implementation of Inputs. This generated class will be able to access individual inputs without map lookups if the caller knows the input name at compile time. Even in cases where the inputs are accessed from krystex, we can auto-generate a switch-cased input accessor which could be faster than a hash map lookup. (This needs to be verified and quantified. Also, if switching over strings (input names) is slower, we can consider assigning a unique int id to each input and using that to refer to inputs instead. This is expected to improve the performance of random access of inputs from krystex, since switching over integers is typically faster than switching over strings or looking up hash maps; this improvement needs to be quantified as well.)
Refs:
https://stackoverflow.com/questions/27993819/hashmap-vs-switch-statement-performance
https://web.archive.org/web/20180818151450/http://java-performance.info/string-switch-implementation/
https://stackoverflow.com/questions/12020048/how-does-javas-switch-work-under-the-hood
https://www.artima.com/articles/control-flow-in-the-java-virtual-machine

The implication of this is that Inputs equality can get affected. Currently two Inputs are equal if they have the same key-value pairs. But with auto-generated Inputs implementations, this behaviour might not be possible to achieve. I don't think Krystal relies on this behaviour anywhere, so this change should be fine, but this has to be verified first.
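
To make the proposal concrete, here is a minimal sketch of what an interface plus a generated implementation might look like. Everything in it is hypothetical: the `GreetingInputs` class, its two inputs, and the shape of the `Inputs` interface are assumptions for illustration, not the actual Krystal types. Whether the switch is actually faster than a map lookup is left open, as noted above.

```java
import java.util.Objects;

/** Illustrative sketch only; all types here are hypothetical. */
final class InputsSketch {

  /** Hypothetical schema-unaware view of a vajram's inputs, as a runtime might use it. */
  interface Inputs {
    Object getInputValue(String inputName);
  }

  /** Hypothetical example of what a generated, schema-aware implementation could look like. */
  static final class GreetingInputs implements Inputs {
    private final String userId;
    private final int maxLength;

    GreetingInputs(String userId, int maxLength) {
      this.userId = userId;
      this.maxLength = maxLength;
    }

    // Typed accessors: callers that know the input at compile time skip any lookup.
    String userId() { return userId; }
    int maxLength() { return maxLength; }

    // Name-based access for schema-unaware callers, via a switch instead of a map.
    @Override
    public Object getInputValue(String inputName) {
      switch (inputName) {
        case "user_id": return userId;
        case "max_length": return maxLength;
        default: throw new IllegalArgumentException("Unknown input: " + inputName);
      }
    }

    // Value equality has to be generated explicitly once the backing map is gone.
    @Override
    public boolean equals(Object o) {
      if (!(o instanceof GreetingInputs)) return false;
      GreetingInputs other = (GreetingInputs) o;
      return maxLength == other.maxLength && Objects.equals(userId, other.userId);
    }

    @Override
    public int hashCode() { return Objects.hash(userId, maxLength); }
  }
}
```

In this sketch equality only holds between instances of the same generated class, which is one way the equality concern raised above could show up.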

#OutputPriming Allow vajram V to prime the output of target Vajram T for a set of inputs. #vajramKrystex

TypeSafe -
- Devs should use type safe auto generated models to prime the cache
- We should not allow wrong data types to be used; this should fail at compile time
StrictPermissions
- Cache Priming should be disabled by default - means no one should be able to prime Vajram T unless vajram T allows it explicitly

Open question:

CommonCachePrimer? or Vajram specific cache primer

Add support for map datatype in #vajram and #vajram-codegen

Move `VajramKryonGraph.validateMandatory` from `VajramKryonGraph` to auto-generated code #Vajram

Currently VajramKryonGraph.validateMandatory is taking ~10% of the CPU of Kryon.executeMainLogicIfPossible. Most of this time is going in Stream processing and iteration.

The idea is to move this validation into auto-generated vajram models code so that we can omit iteration and stream processing. Instead we can custom-generate a validation method which accesses each input explicitly. This way we can avoid the Inputs.getInputValue lookup as well.
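
A minimal sketch of what such generated validation could look like, assuming a hypothetical vajram with two mandatory inputs. The `GreetingInputs` class and its fields are invented for illustration and do not come from the Krystal code base.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; the inputs class below is hypothetical. */
final class MandatoryValidationSketch {

  /** Hypothetical generated inputs model with two mandatory inputs. */
  static final class GreetingInputs {
    final String userId;
    final String locale;

    GreetingInputs(String userId, String locale) {
      this.userId = userId;
      this.locale = locale;
    }
  }

  /** One explicit null check per mandatory input: no iteration, streams, or name lookups. */
  static List<String> missingMandatoryInputs(GreetingInputs inputs) {
    List<String> missing = new ArrayList<>(2);
    if (inputs.userId == null) missing.add("user_id");
    if (inputs.locale == null) missing.add("locale");
    return missing;
  }

  /** Throws if any mandatory input is absent. */
  static void validateMandatory(GreetingInputs inputs) {
    List<String> missing = missingMandatoryInputs(inputs);
    if (!missing.isEmpty()) {
      throw new IllegalArgumentException("Missing mandatory inputs: " + missing);
    }
  }
}
```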

Simplifying IO Vajrams supporting Input modulation


New @demodulation annotation to consume the service client response
Codegen for service client response failure handling in VajramImpl. This will include completing the futures (successfully or exceptionally)

#MandatoryJavaDoc Make Javadoc mandatory on Vajrams and its facets

The javadoc must be read and added to the facet metadata so that it is available as documentation

#ValidateResolverAndVajramLogicDataTypes

All data types should match the corresponding definitions in .vajram.yaml files instead of failing at runtime.

#VajramTestHarness

Provide a test harness where devs can mock dependency vajrams and write unit test cases

#VajramOutputType declaration in the vajram yaml config file

Vajram output type should be defined in the vajram yaml config file. This will eliminate adding vajram response type declaration in the dependency details. Also this can be used in identifying duplicate vajrams during build based on input and output type if required.

#NodeExecutionTracing Add support for Krystex to emit node execution traces (https://opentelemetry.io) #Krystex

Refs:
https://opentelemetry.io/docs/
https://github.com/open-telemetry/opentelemetry-java

#OnlyStaticMethodsInVajrams Add check that mandates that methods MUST be static in #vajrams

This ensures that pure functional programming is achieved

#VajramCodeGenOnFileSave Vajrams models should be auto-generated when user saves .vajram.yaml file

Implement a feature where vajram models are auto-generated whenever user edits and saves a .vajram.yaml file

#UnitTestCoverageReport Integrate with jacoco code coverage plugin to get code coverage reports

#MinimizeCommandQueueUsage #Krystex

NodeCommands should be written to the command queue iff the nodeCommand originates from a thread different from the main Thread (Like IO reactor threads). This way, we minimize monitor locking - since command queue is thread safe, and use monitor locking only when needed (when different threads need to access the command queue).

In past local NFR runs it was found that Krystex throughput and latency saw significant improvement when the commandQueue usage was optimized by replacing ThreadPoolExecutor (which uses ArrayBlockingQueue internally) with ForkJoinPool (which has a more sharded queue model to maximize concurrency) (see #83). This seems to be because of contention in the command queue.

With more and more compute nodes being added (we expect the number of compute nodes in the call graph to be at least an order of magnitude more than IO nodes), avoiding command queue usage when the node command originates from the main thread can give us performance close to, or in the same ballpark as, a java method call.
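
The dispatch rule proposed here can be sketched as follows. This is a simplified illustration, not Krystex code: the class and method names are hypothetical, commands are modelled as plain `Runnable`s, and a `ConcurrentLinkedQueue` stands in for whatever thread-safe queue the runtime uses.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Illustrative sketch only; names and structure are hypothetical. */
final class CommandDispatchSketch {

  private final Thread mainThread;
  // Thread-safe hand-off point for commands that originate on other threads.
  private final Queue<Runnable> commandQueue = new ConcurrentLinkedQueue<>();

  CommandDispatchSketch(Thread mainThread) {
    this.mainThread = mainThread;
  }

  /** Runs the command inline when called from the main thread; otherwise enqueues it. */
  void submit(Runnable command) {
    if (Thread.currentThread() == mainThread) {
      command.run(); // no queue, no cross-thread synchronization
    } else {
      commandQueue.add(command); // e.g. a callback arriving on an IO thread
    }
  }

  /** Called on the main thread to run commands handed over by other threads. */
  int drain() {
    int executed = 0;
    Runnable command;
    while ((command = commandQueue.poll()) != null) {
      command.run();
      executed++;
    }
    return executed;
  }
}
```

With this split, only commands crossing a thread boundary pay for the thread-safe queue.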

#InferDepNodesFromDepChainsInDependantChainConstruction

When creating a dependent chain via fromTriggerOrder method, traverse the call graph and infer the node id.
In DependentChain class, change Optional<NodeId> to NodeId
Remove custom equals and hashcode

#RemoveResponseTypeParamFromVajramRequest

Instead, record it as an annotation on the request class so that annotation processor has access to it.

Avoid recomputing if session Input injection is needed by a Vajram #Vajram

Currently Kryon.getSortedDecorators is taking up ~7.5% of the CPU of Kryon.executeMainLogicIfPossible.
Out of this 50% is being taken up by

logicExecutionContext ->
                vajram.getInputDefinitions().stream()
                    .filter(inputDefinition -> inputDefinition instanceof Input<?>)
                    .map(inputDefinition -> ((Input<?>) inputDefinition))
                    .anyMatch(
                        input ->
                            input.sources() != null
                                && input.sources().contains(InputSource.

which is inside VajramKryonGraph.registerInputInjector.
Computing this every time is not needed as this is static and does not change during the runtime of the application. We can cache this inside VajramDefinition and reuse it.
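
A minimal sketch of the caching idea, assuming a simplified definition class that computes the flag once at construction time. The class, the `InputDef` model, and its `injectable` flag are hypothetical and do not reproduce the actual check inside VajramKryonGraph.registerInputInjector.

```java
import java.util.List;

/** Illustrative sketch only; this is not the real VajramDefinition. */
final class VajramDefinitionSketch {

  /** Hypothetical input model; `injectable` stands in for the real source check. */
  static final class InputDef {
    final String name;
    final boolean injectable;

    InputDef(String name, boolean injectable) {
      this.name = name;
      this.injectable = injectable;
    }
  }

  private final List<InputDef> inputs;
  // Derived from static definitions, so it is computed once and then reused.
  private final boolean needsInputInjection;

  VajramDefinitionSketch(List<InputDef> inputs) {
    this.inputs = List.copyOf(inputs);
    boolean any = false;
    for (InputDef input : this.inputs) {
      if (input.injectable) {
        any = true;
        break;
      }
    }
    this.needsInputInjection = any;
  }

  /** Constant-time read on the hot path; no stream pipeline per execution. */
  boolean needsInputInjection() {
    return needsInputInjection;
  }
}
```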

#FlushSkippedDependencies Add support for Flushing complete dependency call graph when a node is skipped in #Krystex

Problem:
Currently the ResolverCommand.Skip feature and the FlushCommand feature do not interoperate well.
When a node is skipped, it immediately returns, and does not interact with its dependencies.
But the Flush functionality of Krystex only works if every dependantChain reliably flushes all its dependencies.

Requirement:
When a node receives a SkipCommand, force flush all its dependencies.

Testing:
Add test case to cover this case :
A depends on B and B depends on C.
A skips B
A flushes B
B should Flush C

Added a "disabled" test case:
KrystexVajramExecutorTest#flush_skippingADependency_flushesCompleteCallGraph

Fix this bug and enable this test case.

#VajramClassNameIsVajramId Validate that vajram class simple name is same as vajramId

#VajramModelsByAnnotationProcessing Generate vajram models using annotation processing in java compiler

Provide a way to define inputs, dependencies and output type of a vajram in annotations.
Implement an annotation processor which parses these annotations and auto-generates vajram models.
In effect, deprecate the *.vajram.yaml files

#WhatsWrongWithMe Implementation in #Krystex

Users of Krystex should be able to

Dump the complete call graph with inputs, outputs, exceptions and timestamps/latencies of each Kryon.
The dump can be JSON, or it can be in a format which can be loaded in a browser where the complete call graph and the facet values can be viewed as an interactive graph. The user should be able to filter nodes in the graph based on tags (IO, compute, service).

#OptionalsInFunctionArguments Accept Optional<T>, ValueOrError<T> and @Nullable T in resolver and vajram logic arguments

Resolvers and VajramLogic should be able to accept
Optional<T> for and only for optional inputs and optional dependencies with no fanout
ValueOrError<T> for and only for optional dependencies with no fanout

#ChangeResolverNaming Change annotations and fields related to resolvers to be more intuitive and readable

Change
@Resolve(value="dep_name", inputs = {"i1","i2"} ) to
@To(dep="dep_name", depInputs = {"i1","i2"} )
OR
@ResolveInputsOf(dep="dep_name", depInputs = {"i1","i2"} )

Change
@BindFrom to @From

#CachingDecorator

Implement Caching decorator and remove the caching logic from Kryons so that we can pick and choose when to use caching and when not.

#StandardizeFacetsTerminology

Rename Inputs class to Facets
Rename @VajramLogic to @Output

#resilience4JLiveMetrics Add support for resilience4J live metrics collection and real-time dashboards #LogicDecorator

#DefaultValueFunction Add support for computing default value for the vajram

In cases where the VajramLogic cannot be called due to reasons like circuitBreaker being open or semaphore exhaustion, a simple function in the vajram with an annotation like @DefaultValue should be called to compute and return the default value of the vajram.

The default value function's code cannot exit the JVM's boundaries (e.g no IO calls).
The default value function should have access to the input values of the vajram (not the dependency values)

Open questions:

Should the default value function have access to the exception thrown? (Need to collect use cases for this) (ref)
If the above is true, should we allow multiple default value methods with different exception signatures? (ref)
Should the default value be used only as a fallback in case of circuit open and bulkhead full? Or should clients have access to the default value for other scenarios like Timeouts, and other generic exceptions? (Need to collect use cases for this)
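
The basic requirement can be sketched as below, leaving the open questions aside. This is a hypothetical illustration, not the proposed API: `callPermitted` stands in for a circuit breaker or semaphore check, and both functions receive only the vajram's input value.

```java
import java.util.function.BooleanSupplier;
import java.util.function.Function;

/** Illustrative sketch only; not the proposed @DefaultValue mechanism itself. */
final class DefaultValueSketch {

  /**
   * Calls the main logic when permitted; otherwise computes the default value
   * from the same input, without calling the main logic at all.
   */
  static <I, O> O execute(
      I input,
      BooleanSupplier callPermitted, // e.g. "circuit is closed and a permit is available"
      Function<I, O> mainLogic,
      Function<I, O> defaultValue) {
    if (!callPermitted.getAsBoolean()) {
      return defaultValue.apply(input);
    }
    return mainLogic.apply(input);
  }
}
```

This sketch covers only the "logic cannot be called" case; whether exceptions and timeouts should also trigger the default is one of the open questions above.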

#KrystalIntellijPlugin Create Intellij plugin for #Krystal

Plugin should be able to

Show data type errors
Show error for missing resolvers for dependency inputs
Show cyclic dependency errors
Show sub-optimal resolution errors
Show error for duplicate resolvers
Auto-generate resolver code
Auto-generate vajram file skeleton via Create New (⌘N) action, and via quick actions on errors.

#RequestScopedInputInjection

Separate Request and Session scoped injection into separate decorators and apply this where ever needed.

#StaticFanoutInference Allow for vajram framework to infer fanout statically from resolver method return type

Change the DependencyCommand hierarchy to remove the SkipCommand interface, and introduce skipExecute() and skipMultiExecute().
Add validation such that resolvers CANNOT return DependencyCommand; they can return either SingleExecute or MultiExecute.

This will allow the code generator to infer which dependencies have fanout enabled.

#EasyBatchingConfig Provide a way to configure shared batchers without having to specify the complete dependent chains

Today, to create a shared Input modulator, the only way is to list the complete dependent chains which have to share the input modulator.

Provide a way to more easily configure shared modulators

Develop a java virtual-thread based Krystal Executor #Krystex

The existing implementation of Krystal Executor is based on a single-threaded, event-loop based model. Implement an executor based on java virtual threads of JDK 20.

The idea is that such an executor will have simpler call stacks potentially making understanding error logs and debugging easier.

#VajramDefinitionInOneMethod Design a way to define a vajram in a single method

Provide a way to define a vajram in a single method (instead of the current construct of one class per vajram) to reduce boilerplate and ceremony. (For example: Producers Framework)

#MakeInputResolversVajramTypeSafe

Try to make the compiler check that the dependency vajram and the input being resolved are the same.

Reduce CPU cost of using `java.util.Stream`s in Krystal code

Rough analysis of CPU flame charts shows significant usage of CPU by Stream pipelines.
For example, in one instance, filtering a list by applying a condition and finding the first match is taking 5x more CPU than the condition itself.
This suggests that, across the streams used in Krystal, up to 4x the CPU of the underlying logic is being spent on Stream overhead.
If this can be reclaimed by replacing streams with for loops over iterators, we should do that.
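
The kind of replacement being suggested looks like the following sketch, which shows a filter-and-find-first written both ways. The class and method names are hypothetical; the two methods return the same result, and any CPU difference between them would still need to be measured.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

/** Illustrative sketch only; not taken from the Krystal code base. */
final class FirstMatchSketch {

  /** Stream pipeline: filter by a condition and take the first match. */
  static <T> Optional<T> firstMatchWithStream(List<T> items, Predicate<? super T> condition) {
    return items.stream().filter(condition).findFirst();
  }

  /** Equivalent enhanced for loop (iterator-based), with no pipeline set-up. */
  static <T> Optional<T> firstMatchWithLoop(List<T> items, Predicate<? super T> condition) {
    for (T item : items) {
      if (condition.test(item)) {
        return Optional.of(item);
      }
    }
    return Optional.empty();
  }
}
```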

#deadlinePropagation Add support for deadline propagation #Krystex

#NoLombokInGeneratedCode Remove lombok annotations from generated models in #vajram

Change Blocking and NonBlocking terminology across the code base #Vajram #Krystex

Code which makes IO calls that return CompletableFutures is called BlockingVajram and BlockingNode today. This is confusing: the code making IO calls doesn't actually block, since it uses the non-blocking network call paradigm. These names need to be changed.

Suggestion:
Change Blocking to IO and NonBlocking to Compute (IOVajram, ComputeVajram, IONode, ComputeNode etc..)

#SimpleForwardingResolvers Support simple resolvers where 1-to-1 input resolution happens in a single line instead of a whole @Resolve method #Vajram #VajramCodeGen