Awesome-Data-Centric-GraphML

A collection of papers and resources about Data-centric Graph Machine Learning (DC-GML).

We comprehensively review data-centric graph machine learning (DC-GML), outline its prospects, and propose a systematic DC-GML framework that covers every stage of the graph data lifecycle: graph data collection, exploration, improvement, exploitation, and maintenance. More details can be found in our work:

How To Enhance Graph Data Availability and Quality?

The answer to this question corresponds to the 'Graph Data Improvement' stage in the DC-GML framework, which covers four aspects of graph data characteristics: Graph Structure Enhancement, Graph Feature Enhancement, Graph Label Enhancement, and Graph Size Enhancement.

Graph Structure Enhancement

Graph Structure Learning

  • [KDD'2020-Pro-GNN] Graph structure learning for robust graph neural networks. [paper]
  • [ICML'2019-LDS] Learning discrete structures for graph neural networks. [paper]
  • [WWW'2021-GEN] Graph structure estimation neural networks. [paper]
  • [CVPR'2019-GLCN] Semi-supervised learning with graph learning convolutional networks. [paper]
  • [NIPS'2020-IDGL] Iterative deep graph learning for graph neural networks: Better and robust node embeddings. [paper]

Graph Sparsification

  • [AIS'2016] Graph sparsification approaches for laplacian smoothing. [paper]
  • [SIGMOD'2011] Local graph sparsification for scalable clustering. [paper]
  • [SICOMP'2011] Spectral sparsification of graphs. [paper]
  • [NIPS'2019] On differentially private graph sparsification and applications. [paper]
  • [ICDM'2022-GraphSparsify] A generic graph sparsification framework using deep reinforcement learning. [paper]

Graph Diffusion

  • [ICLR'2019-PPNP/APPNP] Predict then propagate: graph neural networks meet personalized pagerank. [paper]
  • [NIPS'2019-GDC] Diffusion improves graph learning. [paper]
  • [ICLR'2021] Adaptive universal generalized pagerank graph neural network. [paper]
  • [NIPS'2021-ADC] Adaptive diffusion in graph neural networks. [paper]
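
Several of the diffusion entries above build on personalized PageRank (PPR). The following is a minimal sketch of PPR by power iteration, not code from any of the listed papers. The function name `personalized_pagerank`, the adjacency-dict input format, the toy path graph, and the defaults `alpha=0.15` and `iters=100` are all assumptions made for illustration.

```python
def personalized_pagerank(adj, seed, alpha=0.15, iters=100):
    """Approximate personalized PageRank scores by power iteration.

    adj:   dict mapping each node to a list of its neighbors
           (assumed undirected, with no isolated nodes)
    seed:  node at which the random walk restarts
    alpha: restart (teleport) probability, an illustrative default
    """
    nodes = list(adj)
    # Start with all probability mass on the seed node.
    scores = {v: 1.0 if v == seed else 0.0 for v in nodes}
    for _ in range(iters):
        # With probability alpha the walk jumps back to the seed.
        nxt = {v: (alpha if v == seed else 0.0) for v in nodes}
        # Otherwise it moves to a uniformly chosen neighbor.
        for u in nodes:
            share = (1.0 - alpha) * scores[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        scores = nxt
    return scores


# Toy path graph 0 - 1 - 2 - 3 (hypothetical data).
toy_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
ppr = personalized_pagerank(toy_graph, seed=0)
```

The resulting scores sum to one and decay with distance from the far end of the path, which is the kind of multi-hop weighting that diffusion-based methods use in place of the raw adjacency matrix.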

Graph Feature Enhancement

Graph Feature Completion

  • [NN'2020-GINN] Missing data imputation with adversarially-trained graph convolutional networks. [paper]
  • [FGCS'2021-GCN_MF] Graph convolutional networks for graphs containing missing features. [paper]
  • [TPAMI'2020-SAT] Learning on attribute-missing graphs. [paper]
  • [WWW'2021-HGNN-AC] Heterogeneous graph neural network via attribute completion. [paper]
  • [IEEETransCybern'2022-Amer] Amer: A new attribute-missing network embedding approach. [paper]
  • [arxiv'2021-SAGA] Siamese attribute-missing graph auto-encoder. [paper]

Graph Feature Denoising

  • [SPM'2013] The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains. [paper]
  • [GlobalSIP'2014] Signal denoising on graphs via graph filtering. [paper]
  • [IET-SP'2018] Graph polynomial filter for signal denoising. [paper]
  • [AIS'2015] Trend filtering on graphs. [paper]
  • [ICASSP'2020] Graph auto-encoder for graph signal denoising. [paper]
  • [TSP'2021] Graph unrolling networks: Interpretable neural networks for graph signal denoising. [paper]
  • [TSP'2022] Untrained graph neural networks for denoising. [paper]
  • [WWW'2023-MAGNET] Robust graph representation learning for local corruption recovery. [paper]

Graph Label Enhancement

Graph Pseudo-labeling

  • [AAAI'2018] Deeper insights into graph convolutional networks for semi-supervised learning. [paper]
  • [AAAI'2020] Multi-stage self-supervised learning for graph convolutional networks on graphs with few labeled nodes. [paper]
  • [CIKM'2021-IFC-GCN] Rectifying pseudo labels: Iterative feature clustering for graph representation learning. [paper]
  • [arXiv'2019-DSGCN] Dynamic self-training framework for graph convolutional networks. [paper]
  • [WSDM'2022-RS-GNN] Towards robust graph neural networks for noisy graphs with sparse labels. [paper]
  • [DMKD'2023-InfoGNN] Informative pseudo-labeling for graph neural networks with few labels. [paper]
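
As a rough illustration of the basic pseudo-labeling idea, the sketch below keeps only confident predictions on unlabeled nodes. It is a generic confidence-threshold rule, not the selection criterion of any specific paper above. The function name `select_pseudo_labels`, the input format, the threshold of 0.9, and the toy probabilities are assumptions.

```python
def select_pseudo_labels(probs, labeled, threshold=0.9):
    """Pick confident predictions on unlabeled nodes as pseudo-labels.

    probs:     dict mapping node -> list of predicted class probabilities
    labeled:   set of nodes that already have ground-truth labels
    threshold: minimum top-class probability to accept (illustrative default)
    """
    pseudo = {}
    for node, p in probs.items():
        if node in labeled:
            continue  # never overwrite a ground-truth label
        best = max(range(len(p)), key=p.__getitem__)  # index of the top class
        if p[best] >= threshold:
            pseudo[node] = best
    return pseudo


# Hypothetical model outputs for four nodes and two classes.
toy_probs = {
    0: [0.95, 0.05],  # labeled node, skipped
    1: [0.55, 0.45],  # too uncertain
    2: [0.03, 0.97],  # confident, becomes a pseudo-label
    3: [0.60, 0.40],  # too uncertain
}
pseudo_labels = select_pseudo_labels(toy_probs, labeled={0})
```

In a self-training loop, the selected nodes would be added to the training set and the model retrained. Many of the listed methods replace the fixed threshold with a more careful reliability estimate.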

Graph Label Denoising

  • [WSDM'2023-CLNode] CLNode: Curriculum learning for node classification. [paper]
  • [arXiv'2019-D-GNN] Learning graph neural networks with noisy labels. [paper]
  • [CIKM'2021-IFC-GCN] Rectifying pseudo labels: Iterative feature clustering for graph representation learning. [paper]
  • [KDD'2021-NRGNN] Nrgnn: Learning a label noise resistant graph neural network on sparsely and noisily labeled graphs. [paper]
  • [WSDM'2023-RTGNN] Robust training of graph neural networks via noise governance. [paper]

Graph Class-imbalanced Sampling

  • [WSDM'2021-GraphSMOTE] Graphsmote: Imbalanced node classification on graphs with graph neural networks. [paper]
  • [KDD'2021-ImGAGN] Imgagn: Imbalanced network embedding via generative adversarial graph networks. [paper]
  • [WWW'2021-PC-GNN] Pick and choose: a GNN-based imbalanced learning approach for fraud detection. [paper]
  • [WWW'2021-GraphMixup] Mixup for node and graph classification. [paper]
  • [ICLR'2022-GraphENS] GraphENS: Neighbor-aware ego network synthesis for class-imbalanced node classification. [paper]
  • [arXiv'2023-GraphSR] GraphSR: A Data Augmentation Algorithm for Imbalanced Node Classification. [paper]
  • [NIPS'2021-ReNode] Topology-imbalance learning for semi-supervised node classification. [paper]
  • [arXiv'2022-TopoImb] TopoImb: Toward topology-level imbalance in learning from graphs. [paper]
  • [IJCAI'2013-igBoost] Graph classification with imbalanced class distributions and noise. [paper]
  • [CIKM'2022-G2GNN] Imbalanced graph classification via graph-of-graph neural networks. [paper]

Graph Size Enhancement

Graph Size Reduction

  • [ICML'2009-Herding] Herding dynamical weights to learn. [paper]
  • [CVPR'2017-ICARL] ICARL: Incremental classifier and representation learning. [paper]
  • [ICLR'2018-K-center] Active learning for convolutional neural networks: A core-set approach. [paper]
  • [ICAIS'2020-Coarsening] Graph coarsening with preserved spectral properties. [paper]
  • [arXiv'2021] Graph domain adaptation: A generative view. [paper]
  • [ICLR'2022-GCond] Graph condensation for graph neural networks. [paper]
  • [KDD'2022-DosCond] Condensing graphs via one-step gradient matching. [paper]
  • [NeurIPS-Workshop'2022] Faster hyperparameter search on graphs via calibrated dataset condensation. [paper]
  • [arXiv'2023-SFGC] Structure-free graph condensation: From large-scale graphs to condensed graph-free data. [paper]

Graph Data Augmentation

  • [ACM SIGKDD Explorations Newsletter'2022-Survey] Data augmentation for deep graph learning: A survey. [paper]
  • [arXiv'2022-Survey] Graph data augmentation for graph machine learning: A survey. [paper]
  • [ICLR'2020-DropEdge] DropEdge: Towards deep graph convolutional networks on node classification. [paper]
  • [NeurIPS'2020-GRAND] Graph random neural networks for semi-supervised learning on graphs. [paper]
  • [AAAI'2022-NASA] Regularizing graph neural networks via consistency-diversity graph augmentations. [paper]
  • [KDD'2020-NodeAug] NodeAug: Semi-supervised node classification with data augmentation. [paper]
  • [AAAI'2021-GAUG] Data augmentation for graph neural networks. [paper]
  • [AAAI'2021-GraphMix] Graphmix: Improved training of gnns for semi-supervised learning. [paper]
  • [WWW'2021-GraphMixup] Mixup for node and graph classification. [paper]
  • [WSDM'2021-GraphSMOTE] Graphsmote: Imbalanced node classification on graphs with graph neural networks. [paper]
  • [CVPR'2022-FLAG] Robust optimization as data augmentation for large-scale graphs. [paper]
  • [ICML'2022-G-Mixup] G-mixup: Graph data augmentation for graph classification. [paper]
  • [ICML'2022-LAGNN] Local augmentation for graph neural networks. [paper]
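
Edge dropping is one of the simplest structure-level augmentations in the list above. The sketch below removes a fixed fraction of edges uniformly at random, in the spirit of DropEdge. It is an illustration under stated assumptions rather than the authors' implementation: the function name `drop_edges`, the edge-list format, and the toy graph are hypothetical.

```python
import random


def drop_edges(edges, drop_rate, seed=None):
    """Return a copy of an edge list with a fraction of edges removed.

    edges:     list of (u, v) tuples, one entry per undirected edge
    drop_rate: fraction of edges to remove, in [0, 1)
    seed:      optional seed for reproducible sampling
    """
    if not 0.0 <= drop_rate < 1.0:
        raise ValueError("drop_rate must be in [0, 1)")
    rng = random.Random(seed)
    n_keep = len(edges) - int(len(edges) * drop_rate)
    # Sample the indices to keep, then restore the original edge order.
    kept = sorted(rng.sample(range(len(edges)), n_keep))
    return [edges[i] for i in kept]


# Toy graph with five edges (hypothetical data).
toy_edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
augmented = drop_edges(toy_edges, drop_rate=0.4, seed=0)
```

During training, a fresh call with a different seed (or no seed) at each epoch would give the model a different perturbed graph each time.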

How To Learn From Graph Data With Limited Availability and Low Quality?

The answer to this question corresponds to the 'Graph Data Exploitation' stage in the DC-GML framework, which covers four strategies for learning from graph data with low quality and limited availability: Graph Self-supervised Learning, Graph Semi-supervised Learning, Graph Active Learning, and Graph Transfer Learning.

Graph Self-supervised Learning

  • [TKDE'2022-Survey] Graph self-supervised learning: A survey. [paper]
  • [arXiv'2016-GAE] Variational graph auto-encoders. [paper]
  • [CIKM'2017-MGAE] MGAE: Marginalized graph autoencoder for graph clustering. [paper]
  • [IJCAI'2018-ARGA] Adversarially regularized graph autoencoder for graph embedding. [paper]
  • [ICLR'2019-DGI] Deep graph infomax. [paper]
  • [ICML'2020-MVGRL] Contrastive multi-view representation learning on graphs. [paper]
  • [NeurIPS'2020-GraphCL] Graph contrastive learning with augmentations. [paper]
  • [arXiv'2020-PairwiseDistance/NodeProperty] Self-supervised learning on graphs: Deep insights and new direction. [paper]
  • [NeurIPS'2020-GROVER] Self-supervised graph transformer on large-scale molecular data. [paper]
  • [WWW'2020-GMI] Graph representation learning via graphical mutual information maximization. [paper]
  • [ICML'2020] When does self-supervision help graph convolutional networks? [paper]
  • [WWW'2021-GCA] Graph contrastive learning with adaptive augmentation. [paper]
  • [ICML'2021-JOAO] Graph contrastive learning automated. [paper]
  • [NeurIPS'2021-AD-GCL] Adversarial graph augmentation to improve graph contrastive learning. [paper]
  • [KDD'2022-GraphMAE] GraphMAE: Self-supervised masked graph autoencoders. [paper]
  • [Information Sciences'2022-S2GRL] A new self-supervised task on graphs: Geodesic distance prediction. [paper]
  • [ICLR'2022-AutoSSL] Automated self-supervised learning for graphs. [paper]

Graph Semi-supervised Learning

  • [ICML'2003] Semi-supervised learning using gaussian fields and harmonic functions. [paper]
  • [NeurIPS'2003] Learning with local and global consistency. [paper]
  • [ICML'2005] Learning from labeled and unlabeled data on a directed graph. [paper]
  • [AAAI'2018] Deeper insights into graph convolutional networks for semi-supervised learning. [paper]
  • [KDD'2020-NodeAug] NodeAug: Semi-supervised node classification with data augmentation. [paper]
  • [NeurIPS'2020-GRAND] Graph random neural networks for semi-supervised learning on graphs. [paper]
  • [AAAI'2020-M3S] Multi-stage self-supervised learning for graph convolutional networks on graphs with few labeled nodes. [paper]
  • [WSDM'2021-SimP-GCN] Node similarity preserving graph convolutional networks. [paper]
  • [ACM-TIS'2021-GCN-LPA] Combining graph convolutional neural networks and label propagation. [paper]
  • [AAAI'2021-CG3] Contrastive and generative graph convolutional networks for graph-based semi-supervised learning. [paper]
  • [NeurIPS'2021-CGPN] Contrastive graph poisson networks: Semi-supervised learning with extremely limited labels. [paper]
  • [AAAI'2022-Meta-PN] Meta propagation networks for graph few-shot semi-supervised learning. [paper]
  • [World Wide Web'2022-CycProp] Cyclic label propagation for graph semi-supervised learning. [paper]

Graph Active Learning

  • [arXiv'2017-AGE] Active learning for graph embedding. [paper]
  • [IJCAI'2018-ANRMAB] Active discriminative network representation learning. [paper]
  • [IJCAI'2019-ActiveHNE] ActiveHNE: active heterogeneous network embedding. [paper]
  • [arXiv'2019-FeatProp] Active learning for graph neural networks via node feature propagation. [paper]
  • [WWW'2020-ATNE] Active domain transfer on network embedding. [paper]
  • [KDD'2020-ASGN] ASGN: An active semi-supervised graph neural network for molecular property prediction. [paper]
  • [NeurIPS'2020-GPA] Graph policy network for transferable active learning on graphs. [paper]
  • [ACML'2020-MetAL] Metal: Active semi-supervised learning on graphs via meta-learning. [paper]
  • [TNNLS'2020-SEAL] Seal: Semisupervised adversarial active learning on attributed graphs. [paper]
  • [VLDB Endowment'2021-GRAIN] GRAIN: improving data efficiency of graph neural networks via diversified influence maximization. [paper]
  • [NeurIPS'2021-RIM] RIM: Reliable influence-based active learning on graphs. [paper]
  • [WWW'2021-Attent] Attent: Active attributed network alignment. [paper]
  • [ICMD'2021-ALG] ALG: Fast and accurate active learning framework for graph convolutional networks. [paper]
  • [WWW'2022-ALLIE] ALLIE: Active learning on large-scale imbalanced graphs. [paper]
  • [AAAI'2022-BIGENE] Batch active learning with graph neural networks via multi-agent deep reinforcement learning. [paper]
  • [ICLR'2022-IGP] Information Gain Propagation: A new way to graph active learning with soft labels. [paper]
  • [KDD'2022-JuryGCN] JuryGCN: quantifying jackknife uncertainty on graph convolutional networks. [paper]

Graph Transfer Learning

  • [IJCAI'2019-DANE] DANE: domain adaptive network embedding. [paper]
  • [WWW'2020-UDA-GCN] Unsupervised domain adaptive graph convolutional networks. [paper]
  • [AAAI'2020-ACDNE] Adversarial deep network embedding for cross-network node classification. [paper]
  • [ICDM'2020-OpenWGL] Openwgl: Open-world graph learning. [paper]
  • [ICML'2020-PGL] Progressive graph learning for open-set domain adaptation. [paper]
  • [NeurIPS'2021-SRGNN] Shift-robust gnns: Overcoming the limitations of localized graph training data. [paper]
  • [arXiv'2021-SOGA] Source free unsupervised graph domain adaptation. [paper]
  • [arXiv'2021] Graph domain adaptation: A generative view. [paper]
  • [NeurIPS-Workshop'2022-SRNC] Shift-robust node classification via graph clustering co-training. [paper]

How To Build a Graph MLOps System: The Graph Data-centric View

The answer to this question corresponds to three stages of the DC-GML framework: Graph Data Collection, Graph Data Exploration, and Graph Data Maintenance. Together with Graph Data Improvement and Graph Data Exploitation, these stages form a graph MLOps system built from the graph data-centric view.

Graph Data Collection

  • Amazon Mechanical Turk: https://www.mturk.com/
  • [SIGIR-Workshop'2011] Semi-supervised consensus labeling for crowdsourcing. [paper]
  • [Cloud Computing'2021] Knowledge graphs meet crowdsourcing: a brief survey. [paper]
  • [Journal of Classification'1997] Estimation and prediction for stochastic blockmodels for graphs with latent block structure. [paper]
  • Probabilistic graphical models: principles and techniques. [book]
  • [NeurIPS'2019] Gnnexplainer: Generating explanations for graph neural networks. [paper]
  • [KDD'2014] Focused clustering and outlier detection in large attributed graphs. [paper]
  • [Journal of Machine Learning Research'2023] Graph clustering with graph neural networks. [paper]
  • [WWW'2021] Pathfinder discovery networks for neural message passing. [paper]
  • [arXiv'2020] Benchmarking graph neural networks. [paper]
  • [arXiv'2022] Synthetic graph generation to benchmark graph learning. [paper]
  • [KDD'2022] Graphworld: Fake graphs bring real insights for gnns. [paper]

Graph Data Exploration

Graph Data Maintenance

  • [arXiv'2022-TrustworthyGNN-Survey] Trustworthy graph neural networks: Aspects, methods and trends. [paper]
  • [IEEE Network'2010] Privacy and security for online social networks: challenges and opportunities. [paper]
  • [SIGMOD'2008] Towards identity anonymization on graphs. [paper]
  • [Multimedia Tools and Applications'2018] Privacy preservation based on clustering perturbation algorithm for social network. [paper]
  • [EDBT/ICDT Workshops'2015] Privacy-Integrated Graph Clustering Through Differential Privacy. [paper]
  • [Information Sciences'2020] PGAS: Privacy-preserving graph encryption for accurate constrained shortest distance queries. [paper]
  • [KDD'2022] Federatedscope-gnn: Towards a unified, comprehensive and efficient package for federated graph learning. [paper]
  • [AAAI'2023] Federated learning on Non-IID graphs via structural knowledge sharing. [paper]
  • [IEEE Communications Magazine'1994] Access control: principle and practice. [paper]
  • [Global Summit on Computer and Information Technology'2014] Implementation of elliptic curve digital signature algorithm (ECDSA). [paper]
  • [Computers and Security'2021] Threat detection and investigation with system-level provenance graphs: a survey. [paper]

Graph MLOps
