nasa-pds / registry-crawler-service

DEPRECATED. Server app providing the functionality for crawling PDS4 products. It has to be used with other components, such as RabbitMQ message broker, Harvest Server and Harvest Client to enable performant ingestion of large data sets into PDS Registry (https://github.com/NASA-PDS/registry).

License: Other

Java 88.92% Shell 7.05% Batchfile 1.19% Dockerfile 2.84%
crawler data-loader etl java pds pds4

registry-crawler-service's Introduction

Crawler Web Service

Server application providing the functionality for crawling PDS4 products. It has to be used with other components, such as RabbitMQ message broker, Harvest Server and Harvest Client to enable performant ingestion of large data sets into PDS Registry.

The description of the full application is available at https://nasa-pds.github.io/registry-harvest-service/ . Facilities to launch the full application (including this component) are provided in the registry repository (see https://github.com/NASA-PDS/registry/tree/main/docker).

📀 Installation

This is a Java application. You need a Java 11 JDK and Maven to build it. To create a binary distribution (ZIP and TGZ archives), run the following Maven command:

mvn package

Binary archives (such as "registry-crawler-service-1.0.0-SNAPSHOT-bin.zip") will be created in the "target" directory.

Prebuilt binaries are available at https://github.com/NASA-PDS/registry-crawler-service/releases

To install, extract a binary archive into a folder of your choice, such as "/opt/crawler".
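
As a rough illustration of the install step, the sketch below wraps the extraction in a small shell function. The function name `install_crawler` and the example paths are assumptions made for illustration, and the sketch handles only the TGZ archive; it is not a script shipped with the distribution.

```shell
# Hypothetical helper (not part of the distribution): extract a crawler
# TGZ archive into an install directory, creating the directory if needed.
install_crawler() {
    archive=$1
    dest=$2
    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi
    mkdir -p "$dest" || return 1
    # -x extract, -z gunzip, -f archive file, -C target directory
    tar -xzf "$archive" -C "$dest"
}

# Example call (archive name and destination are illustrative):
# install_crawler target/registry-crawler-service-1.0.0-SNAPSHOT-bin.tar.gz /opt/crawler
```

Whether the archive unpacks into a top-level subdirectory depends on how the distribution is assembled, so check the contents of the destination folder after extraction.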

๐Ÿ’โ€โ™€๏ธ Usage

👥 Contributing

Within the NASA Planetary Data System, we value the health of our community as much as the code. Towards that end, we ask that you read and practice what's described in these documents:

  • Our contributor's guide delineates the kinds of contributions we accept.
  • Our code of conduct outlines the standards of behavior we practice and expect of everyone who participates with our software.

🔢 Versioning

We use the SemVer philosophy for versioning this software. Or not! Update this as you see fit.

Manual Publication

NOTE: Requires using the PDS Maven Parent POM to ensure the release profile is set.

Update Version Numbers

Update pom.xml for the release version or use the Maven Versions Plugin, e.g.:

# Skip this step if this is a RELEASE CANDIDATE, we will deploy as SNAPSHOT version for testing
VERSION=1.15.0
mvn versions:set -DnewVersion=$VERSION
git add pom.xml
git add */pom.xml
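
Before running `mvn versions:set`, it can help to check that the version string has the expected shape. The following is a minimal sketch, assuming versions follow MAJOR.MINOR.PATCH with an optional -SNAPSHOT suffix; `is_valid_version` is a hypothetical helper, not part of Maven or this repository.

```shell
# Hypothetical pre-check: succeed only for MAJOR.MINOR.PATCH,
# optionally followed by -SNAPSHOT (for example 1.15.0 or 1.16.0-SNAPSHOT).
is_valid_version() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+(-SNAPSHOT)?$'
}

# Example use before setting the version:
# VERSION=1.15.0
# is_valid_version "$VERSION" && mvn versions:set -DnewVersion="$VERSION"
```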

Update Changelog

Update the changelog using GitHub Changelog Generator. Note: Make sure you set $CHANGELOG_GITHUB_TOKEN in your .bash_profile or use the --token flag.

# For RELEASE CANDIDATE, set VERSION to future release version.
GITHUB_ORG=NASA-PDS
GITHUB_REPO=registry-crawler-service
github_changelog_generator --future-release "v$VERSION" --user "$GITHUB_ORG" --project "$GITHUB_REPO" --configure-sections '{"improvements":{"prefix":"**Improvements:**","labels":["Epic"]},"defects":{"prefix":"**Defects:**","labels":["bug"]},"deprecations":{"prefix":"**Deprecations:**","labels":["deprecation"]}}' --no-pull-requests --token "$CHANGELOG_GITHUB_TOKEN"

git add CHANGELOG.md
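
Since the changelog step depends on a token being present in the environment, a small guard can catch a missing token before the generator runs. This is only a sketch: `require_changelog_token` is a hypothetical function name, and it checks the `CHANGELOG_GITHUB_TOKEN` variable mentioned in the note above.

```shell
# Hypothetical guard: fail early when CHANGELOG_GITHUB_TOKEN is unset or empty.
require_changelog_token() {
    if [ -z "${CHANGELOG_GITHUB_TOKEN:-}" ]; then
        echo "CHANGELOG_GITHUB_TOKEN is not set" >&2
        return 1
    fi
}

# Example use:
# require_changelog_token && echo "token found, safe to generate the changelog"
```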

Commit Changes

Commit changes using the following template commit message:

# For operational release
git commit -m "[RELEASE] registry-crawler-service v$VERSION"

# Push changes to main
git push -u origin main

Build and Deploy Software to Maven Central Repo

# For operational release
mvn clean site site:stage package deploy -P release

# For release candidate
mvn clean site site:stage package deploy

Push Tagged Release

# For Release Candidate, you may need to delete old SNAPSHOT tag
git push origin :v$VERSION

# Now tag and push
REPO=registry-crawler-service
git tag v${VERSION} -m "[RELEASE] $REPO v$VERSION" -m "See [CHANGELOG](https://github.com/NASA-PDS/$REPO/blob/main/CHANGELOG.md) for more details."
git push --tags

Deploy Site to GitHub Pages

From cloned repo:

git checkout gh-pages

# Copy the staged site over to the version-specific and default sites
rsync -av target/staging/ .

git add .

# For operational release
git commit -m "Deploy v$VERSION docs"

# For release candidate
git commit -m "Deploy v${VERSION}-rc${CANDIDATE_NUM} docs"

git push origin gh-pages

Update Versions For Development

Update pom.xml with the next SNAPSHOT version, either manually or using the Maven Versions Plugin.

For RELEASE CANDIDATE, ignore this step.

git checkout main

# For release candidates, skip to push changes to main
VERSION=1.16.0-SNAPSHOT
mvn versions:set -DnewVersion=$VERSION
git add pom.xml
git commit -m "Update version for $VERSION development"

# Push changes to main
git push -u origin main
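
The next development version can also be derived from the release version instead of typed by hand. The sketch below assumes the convention implied by the example values in this README (release 1.15.0 followed by 1.16.0-SNAPSHOT), that is, bump the minor number and reset the patch number; `next_snapshot_version` is a hypothetical helper and your project may use a different convention.

```shell
# Hypothetical helper: given MAJOR.MINOR.PATCH (with or without -SNAPSHOT),
# print the next development version as MAJOR.(MINOR+1).0-SNAPSHOT.
next_snapshot_version() {
    release=${1%-SNAPSHOT}      # drop a trailing -SNAPSHOT if present
    major=${release%%.*}        # text before the first dot
    rest=${release#*.}          # text after the first dot
    minor=${rest%%.*}           # text between the first and second dot
    printf '%s.%s.0-SNAPSHOT\n' "$major" "$((minor + 1))"
}

# Example use:
# VERSION=$(next_snapshot_version 1.15.0)
# mvn versions:set -DnewVersion="$VERSION"
```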

Complete Release in GitHub

Currently the process to create more formal release notes and attach Assets is done manually through the GitHub UI, but should eventually be automated via script.

NOTE: Be sure to add the tar.gz and zip from the target/ directory to the release assets, and use the CHANGELOG generated above to create the RELEASE NOTES.

CI/CD

The template repository comes with our two "standard" CI/CD workflows, stable-cicd and unstable-cicd. The unstable build runs on any push to main (apart from changes to certain ignored files), and the stable build runs on push of a release branch of the form release/<release version>. Both make use of our GitHub Actions build step, Roundup. The unstable-cicd workflow generates (and constantly updates) a SNAPSHOT release. If you haven't done a formal software release, you will end up with a v0.0.0-SNAPSHOT release (see NASA-PDS/roundup-action#56 for specifics). Additionally, tests are executed on any non-main branch push via branch-cicd.
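
To make the release/<release version> naming concrete, here is a minimal sketch of how a script could pull the version out of such a ref name. `release_version_from_ref` is a hypothetical helper written for this example; it is not part of Roundup or the workflows.

```shell
# Hypothetical helper: print the version part of a ref named release/<version>.
# Returns non-zero for any other ref name (for example "main" or "release/").
release_version_from_ref() {
    case $1 in
        release/?*) printf '%s\n' "${1#release/}" ;;
        *) return 1 ;;
    esac
}

# Example use:
# release_version_from_ref release/1.2.3
```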

📃 License

The project is licensed under the Apache version 2 license. Or it isn't. Change this after consulting with your lawyers.

registry-crawler-service's People

Contributors

dependabot[bot], jordanpadams, nutjob4life, pdsen-ci, ramesh-maddegoda, tdddblog, testpersonal, tloubrieu-jpl

registry-crawler-service's Issues

Stable Roundup can no longer trigger Imaging workflow

๐Ÿ› Describe the bug

After pushing a tag like release/x.y.z, the Roundup Action performs a release and then deletes the release/x.y.z tag. However, the Stable workflow then triggers the Imaging workflow via repository dispatch. The Imaging workflow then checks out release/x.y.z, but it no longer exists.

📜 To Reproduce

  1. Push a tag like release/x.y.z
  2. 🍿

🕵️ Expected behavior

Docker images are constructed and pushed to the Hub.

unstable build failed

@tloubrieu-jpl commented on Tue Dec 13 2022

๐Ÿ› Describe the bug

See action https://github.com/NASA-PDS/registry-crawler-service/actions/runs/3687311354

The snapshot release is done and accessible, but at a different URL (see https://github.com/NASA-PDS/registry-crawler-service/releases/download/untagged-066a2eae3757652aed44/registry-crawler-service-1.1.0-SNAPSHOT-bin.tar.gz ) than the one Roundup is looking for (see https://github.com/NASA-PDS/registry-crawler-service/releases/download/v1.1.0-SNAPSHOT/registry-crawler-service-1.1.0-SNAPSHOT-bin.tar.gz ).


@nutjob4life commented on Wed Dec 21 2022

The workflow that had the problem wasn't a Roundup Action workflow, though. It was the Imaging workflow. I believe to fix this in registry-crawler-api we need a fix similar to this one.
