Corrected ProgrammableWeb dataset

Introduction

This repo contains the data and utilities used in the paper Data Correction and Evolution Analysis of the ProgrammableWeb Service Ecosystem.

Here is the content of each folder:

  • crawler: The crawlers that fetch the data from ProgrammableWeb, including an API crawler and a Mashup crawler.
  • data: The formatted datasets used in the paper. Its subfolder raw contains the raw data fetched by the crawlers, in JSON format.
  • visualization: Some of the statistical results and the visualization process.

Data

We put our raw and corrected datasets in the data folder.

Formatted Dataset

The raw data has been formatted as CSV. Column names shared across the files, and their meanings, are as follows (PW stands for ProgrammableWeb):

Column name   Meaning
tp            Type: API or Mashup
url           The URL of an API or Mashup
name          The API or Mashup's title
st            Submit date
et            Corrected dead date
oet           Dead date provided in PW
c             Category
oac           Corrected accessibility
ac            Accessibility provided in PW

  • api_nodes_estimator.csv is the API data. Each line represents an API.
  • m-a_edges.csv is the data of Mashups invoking APIs.
  • mashup_nodes_estimator.csv is the Mashup data.
  • split_nodes.csv is the split API data.
  • transfer_nodes.csv is the transferred API data.
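
As a rough illustration of how the columns above might be used, the sketch below reads node rows with Python's standard csv module and lists the entries whose corrected dead date (et) differs from the one provided by PW (oet). The sample rows, the date format, and the accessibility values are all made up for the example; only the column names come from the table above, and the real files may contain additional columns.

```python
import csv
import io

# Hypothetical sample rows. Only the column names are taken from the README;
# the values, the date format, and the accessibility encoding are assumptions.
SAMPLE_CSV = """tp,url,name,st,et,oet,c,oac,ac
API,https://example.com/api/a,Alpha,2010-01-01,2015-06-01,2016-01-01,Mapping,True,True
API,https://example.com/api/b,Beta,2012-03-05,2018-02-10,2018-02-10,Social,False,True
Mashup,https://example.com/mashup/c,Gamma,2013-07-20,2014-01-01,2019-09-09,Mapping,False,False
"""


def load_nodes(handle):
    """Read node rows from an open CSV file object into a list of dicts."""
    return list(csv.DictReader(handle))


def corrected_dead_dates(rows):
    """Return the names of rows whose corrected dead date differs from PW's."""
    return [row["name"] for row in rows if row["et"] != row["oet"]]


rows = load_nodes(io.StringIO(SAMPLE_CSV))
changed = corrected_dead_dates(rows)
print(changed)
```

To try this on a real file, pass an open file object instead of the in-memory sample, for example one opened on api_nodes_estimator.csv with newline="" as the csv module recommends. The exact path inside the data folder is not stated here, so adjust it to your checkout.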

Raw data

  • active_apis_data.txt, deadpool_apis_data.txt, active_mashups_data.txt, deadpool_mashups_data.txt: The APIs and Mashups that are marked as active or dead on ProgrammableWeb.
  • accessibility: This subfolder contains the accessibility of APIs and Mashups. The data was collected by visiting the Homepage URL, API Endpoint, or API Portal. With this dataset we can tell whether a service is still working.
  • all_pairs.txt: A Mashup may invoke many APIs. We collect the APIs that are used together in the same Mashup, and how often each combination occurs, as API pairs.
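
The idea behind API pairs can be sketched as follows: group the APIs by the Mashup that invokes them, then count every unordered pair of APIs that appears in the same Mashup. This is only a minimal sketch of the counting step, not the script used to produce all_pairs.txt; the edge list, the identifiers, and the function name count_api_pairs are hypothetical, and the actual file layout of all_pairs.txt is not assumed.

```python
from collections import Counter
from itertools import combinations

# Hypothetical (mashup, api) edges, standing in for Mashup-invokes-API data.
EDGES = [
    ("m1", "maps"), ("m1", "photos"), ("m1", "social"),
    ("m2", "maps"), ("m2", "photos"),
    ("m3", "maps"),
]


def count_api_pairs(edges):
    """Count how often each unordered pair of APIs shares a Mashup."""
    apis_by_mashup = {}
    for mashup, api in edges:
        # A set ignores duplicate edges between the same Mashup and API.
        apis_by_mashup.setdefault(mashup, set()).add(api)

    pair_counts = Counter()
    for apis in apis_by_mashup.values():
        # Sorting gives each pair one canonical order, e.g. ("maps", "photos").
        for pair in combinations(sorted(apis), 2):
            pair_counts[pair] += 1
    return pair_counts


pairs = count_api_pairs(EDGES)
for pair, frequency in pairs.most_common():
    print(pair, frequency)
```

A Mashup that invokes a single API contributes no pairs, which is why m3 does not affect the counts in this sample.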

Citation

If you make use of this code, we would appreciate it if you cite our paper as follows:

@article{liu2021data,
  title={Data correction and evolution analysis of the ProgrammableWeb service ecosystem},
  author={Liu, Mingyi and Tu, Zhiying and Zhu, Yeqi and Xu, Xiaofei and Wang, Zhongjie and Sheng, Quan Z},
  journal={Journal of Systems and Software},
  pages={111066},
  year={2021},
  publisher={Elsevier}
}
