tap-sharepointsites

tap-sharepointsites is a Singer tap for SharePoint lists, files, and pages, accessed through Microsoft Graph.

Built with the Meltano Tap SDK for Singer Taps.

Capabilities

  • catalog
  • state
  • discover
  • about
  • stream-maps
  • schema-flattening

Settings

| Setting | Required | Default | Description |
|---|---|---|---|
| api_url | True | None | The URL for the API service |
| lists | False | None | The name of the list to sync |
| files | False | None | Files to sync |
| pages | False | None | Whether or not to sync pages |
| client_id | False | None | Managed Identity Client ID |
| stream_maps | False | None | Config object for stream maps capability. For more information check out Stream Maps. |
| stream_map_config | False | None | User-defined config values to be used within map expressions. |
| flattening_enabled | False | None | 'True' to enable schema flattening and automatically expand nested properties. |
| flattening_max_depth | False | None | The max depth to flatten schemas. |
| batch_config | False | None | |

A full list of supported settings and capabilities is available by running: tap-sharepointsites --about

File config

The file configuration accepts an array of objects, with keys:

  • name: Name given to the stream/table
  • file_pattern: regex-like pattern for filenames to load
  • folder: Subfolder where the files are located
  • file_type: Type (format) of file to load, either csv or excel.
  • delimiter: Field delimiter for CSV files. Default: ","
  • clean_colnames: Whether to convert column names to snake_case. Default: false
  • sheet_name: Name of the sheet to read. Default: Sheet1
  • min_row: First row to read from the sheet. Optional.
  • max_row: Last row to read from the sheet. Optional.
  • min_col: First column to read from the sheet. Optional.
  • max_col: Last column to read from the sheet. Optional.
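
To make the effect of min_row, max_row, min_col and max_col concrete, here is a minimal sketch of how such bounds could carve a window out of a sheet. It assumes the bounds are 1-based and inclusive, which is the convention openpyxl uses for its identically named iter_rows arguments; whether the tap follows the same convention is an assumption, and select_window is a hypothetical helper, not part of the tap.

```python
from typing import Any, List, Optional, Sequence


def select_window(
    rows: Sequence[Sequence[Any]],
    min_row: Optional[int] = None,
    max_row: Optional[int] = None,
    min_col: Optional[int] = None,
    max_col: Optional[int] = None,
) -> List[List[Any]]:
    """Return the cells inside 1-based, inclusive row/column bounds.

    Hypothetical helper: a bound left as None means "no limit on that side".
    """
    row_start = (min_row or 1) - 1  # convert 1-based start to 0-based index
    col_start = (min_col or 1) - 1
    # A 1-based inclusive end equals a 0-based exclusive slice end.
    selected_rows = rows[row_start:max_row]
    return [list(row[col_start:max_col]) for row in selected_rows]


# Illustrative data standing in for a worksheet.
sheet = [
    ["id", "name", "team"],
    [1, "Ada", "core"],
    [2, "Lin", "data"],
    [3, "Sam", "ops"],
]

window = select_window(sheet, min_row=1, max_row=3, min_col=1, max_col=2)
print(window)
```

Under these assumptions the window keeps the first three rows and the first two columns, so the header row is included when min_row is 1.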

Example config:

...
  config:
    ...
    files:
    - name: employees
      file_pattern: employees_.*\.xlsx
      folder: hr_data/raw
      file_type: excel
      clean_colnames: true
      sheet_name: Sheet1
      min_row: 1
      max_row: 10
      min_col: 1
      max_col: 10
  ...
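
The file_pattern in the example above can be checked locally before running a sync. The sketch below uses Python's re module and assumes the pattern must match the whole filename (re.fullmatch). The tap may apply the pattern differently, for example with re.match or re.search, in which case a name such as employees_2024.xlsx.bak would also be selected. The filenames and the matching_files helper are made up for illustration.

```python
import re
from typing import Iterable, List


def matching_files(file_pattern: str, filenames: Iterable[str]) -> List[str]:
    """Return the filenames that the pattern matches in full.

    Hypothetical helper; full-string matching is an assumption about the tap.
    """
    regex = re.compile(file_pattern)
    return [name for name in filenames if regex.fullmatch(name)]


# Made-up folder listing for illustration.
candidates = [
    "employees_2023.xlsx",
    "employees_2024.xlsx",
    "employees_2024.csv",
    "contractors_2024.xlsx",
    "employees_2024.xlsx.bak",
]

selected = matching_files(r"employees_.*\.xlsx", candidates)
print(selected)
```

Escaping the dot (\.) matters here: an unescaped dot matches any character, so the pattern would be looser than intended.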

Web pages

You can sync the content of SharePoint web pages, which is typically useful for LLM/RAG use cases. The Microsoft Graph endpoint for pages is still in beta and does not work when logged in as a personal user. For it to work, you need to use a Managed Identity.

Example config:

...
  config:
    ...
    pages: true
  ...

Configuration

Accepted Config Options

A full list of supported settings and capabilities for this tap is available by running:

tap-sharepointsites --about

Configure using environment variables

When --config=ENV is provided, this Singer tap automatically imports environment variables from the .env file in the working directory. A config value is then picked up if a matching environment variable is set either in the terminal context or in the .env file.
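
As a rough illustration of what "a matching environment variable" could mean, the sketch below maps prefixed variables onto setting names. It assumes the common Singer SDK naming convention, where the prefix is the tap name uppercased with dashes replaced by underscores (TAP_SHAREPOINTSITES_) followed by the uppercased setting name. That prefix, the config_from_env helper, and the example values are assumptions for illustration; check tap-sharepointsites --about for the authoritative names.

```python
from typing import Dict, Mapping, Sequence


def config_from_env(
    environ: Mapping[str, str],
    prefix: str = "TAP_SHAREPOINTSITES_",  # assumed naming convention
    settings: Sequence[str] = ("api_url", "client_id", "pages"),
) -> Dict[str, str]:
    """Collect config values from environment variables named PREFIX + SETTING.

    Hypothetical helper that mimics the lookup; it is not the tap's own code.
    """
    config: Dict[str, str] = {}
    for setting in settings:
        value = environ.get(prefix + setting.upper())
        if value is not None:
            config[setting] = value
    return config


# Stand-in for os.environ, with a placeholder URL.
example_env = {
    "TAP_SHAREPOINTSITES_API_URL": "https://example.invalid/placeholder",
    "UNRELATED_VARIABLE": "ignored",
}

print(config_from_env(example_env))
```

Passing os.environ instead of example_env would apply the same lookup to the real terminal context.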

Source Authentication and Authorization

Usage

You can easily run tap-sharepointsites by itself or in a pipeline using Meltano.

Executing the Tap Directly

tap-sharepointsites --version
tap-sharepointsites --help
tap-sharepointsites --config CONFIG --discover > ./catalog.json
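
CONFIG in the commands above is a path to a config file. A minimal sketch of producing one from Python follows, assuming the tap accepts a JSON file as Singer taps conventionally do. The api_url value is a placeholder, the files entry reuses the example from the File config section, and the command is only assembled, not executed.

```python
import json
import tempfile
from pathlib import Path

# Placeholder values: replace api_url with the URL for your own site.
config = {
    "api_url": "https://example.invalid/replace-with-your-api-url",
    "files": [
        {
            "name": "employees",
            "file_pattern": r"employees_.*\.xlsx",
            "folder": "hr_data/raw",
            "file_type": "excel",
            "clean_colnames": True,
            "sheet_name": "Sheet1",
        }
    ],
}

# Write the config to a temporary directory so the sketch has no side effects.
config_path = Path(tempfile.mkdtemp()) / "config.json"
config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")

# Build (but do not run) the discovery command shown above.
discover_command = ["tap-sharepointsites", "--config", str(config_path), "--discover"]
print(" ".join(discover_command))
```

Keeping secrets out of this file and supplying them through environment variables instead is usually the safer choice.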

Developer Resources

Follow these instructions to contribute to this project.

Initialize your Development Environment

pipx install poetry
poetry install

Create and Run Tests

Create tests within the tap_sharepointsites/tests subfolder and then run:

poetry run pytest

You can also test the tap-sharepointsites CLI directly using poetry run:

poetry run tap-sharepointsites --help

Testing with Meltano

Note: This tap will work in any Singer environment and does not require Meltano. Examples here are for convenience and to streamline end-to-end orchestration scenarios.

Next, install Meltano (if you haven't already) and any needed plugins:

# Install meltano
pipx install meltano
# Initialize meltano within this directory
cd tap-sharepointsites
meltano install

Now you can test and orchestrate using Meltano:

# Test invocation:
meltano invoke tap-sharepointsites --version
# OR run a test `elt` pipeline:
meltano elt tap-sharepointsites target-jsonl

SDK Dev Guide

See the dev guide for more instructions on how to use the SDK to develop your own taps and targets.
