bytedance / fedlearner

A multi-party collaborative machine learning framework

License: Apache License 2.0

Dockerfile 0.15% Makefile 0.07% Shell 2.86% Go 3.96% Lua 0.20% Python 59.38% C++ 0.63% JavaScript 11.70% HTML 0.09% Mako 0.01% TypeScript 19.53% CSS 0.19% Less 0.36% Mustache 0.87%

fedlearner's Introduction

Fedlearner™

Fedlearner is a collaborative machine learning framework that enables joint modeling of data distributed between institutions.

Trademark Usage Policy

Fedlearner welcomes everyone to build on or modify Fedlearner open source software for their own projects. The license of the software doesn’t grant permission to use the licensor's trademarks or product names. However, you may use the trademark or product name if:

You use a wordmark to refer to Fedlearner program, product or technology;
You use a wordmark in text to indicate the compatibility of your project with Fedlearner project;
You use a wordmark in text to indicate your project is built based on Fedlearner technology.

If you would like to use the Fedlearner trademark,

to combine a trademark with your own brand, trademark, product, project, service or domain name;
for any other commercial use;
as a verb or noun, rather than only as an adjective followed by the generic name/noun;
in a modified, abbreviated or altered form, or in the plural or possessive form;

please feel free to contact us for express permission.

If you use a trademark in a way not set forth above, or for any illegal purpose with the program, the licensor reserves the right in its sole discretion to terminate or modify your permission to display or use a trademark and to take action against any use that does not conform to these terms and conditions, or violates applicable law.

fedlearner's Issues

Lack of introduction documents

Hi, is there a simple way to find out what the framework can do without having to dive into the code?
A document describing which kinds of computation are supported (e.g. vertically or horizontally split datasets? Neural networks or tree models? Inference or training?) and linking to an example for each kind would be very helpful.

There are some examples under /example/, but it's not clear which kind of computation each one belongs to, or whether they are the only kinds of computation supported.

No gRPC auto-generated files found?

Hi there, I found that some Python files import the *pb2.py and *pb2_grpc.py modules generated by gRPC's protoc, but I couldn't find where these auto-generated files are.

For example, in fedlearner/data_join/rsa_psi/rsa_psi_signer.py, lines 26-28:

from fedlearner.common import common_pb2 as common_pb
from fedlearner.common import data_join_service_pb2 as dj_pb
from fedlearner.common import data_join_service_pb2_grpc as dj_grpc

But I could not find any *pb2.py or *pb2_grpc.py in the fedlearner/common directory.

I also checked the content of fedlearner/common/__init__.py; it only imports tensorflow_io on line 18:

import tensorflow_io

The same question arises in another example. In fedlearner/web_console_v2/api/fedlearner_webconsole/rpc/client.py, lines 20-22:

from fedlearner_webconsole.proto import (
    service_pb2, service_pb2_grpc, common_pb2
)

I also couldn't find any auto-generated files in the fedlearner_webconsole.proto package.

So, in short, where are these auto-generated gRPC-related files located?

Your reply will be greatly appreciated!
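
Generated *_pb2.py and *_pb2_grpc.py modules are often not committed to a repository and are instead produced from .proto files at build time. The following is a minimal sketch of how such files can be generated, assuming grpcio-tools is installed; the directory name "protocols" and the file names are hypothetical and are not confirmed by this issue.

```python
import subprocess
import sys
from pathlib import Path


def build_protoc_command(proto_root, proto_files, out_dir="."):
    """Build the argv for `python -m grpc_tools.protoc` without running it."""
    return [
        sys.executable, "-m", "grpc_tools.protoc",
        f"-I{proto_root}",                # where imports between .proto files resolve
        f"--python_out={out_dir}",        # emits *_pb2.py
        f"--grpc_python_out={out_dir}",   # emits *_pb2_grpc.py
        *[str(p) for p in proto_files],
    ]


def generate(proto_root, out_dir="."):
    """Compile every .proto file under proto_root (requires grpcio-tools)."""
    proto_files = sorted(Path(proto_root).rglob("*.proto"))
    if not proto_files:
        raise FileNotFoundError(f"no .proto files under {proto_root}")
    subprocess.run(build_protoc_command(proto_root, proto_files, out_dir), check=True)


# Hypothetical layout, shown only to illustrate the resulting command line.
example = build_protoc_command("protocols", ["protocols/example.proto"])
```

Calling generate("protocols") would then write the modules next to the package layout implied by the .proto paths, which is why the imports only resolve after this step has run.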

Does fedlearner support horizontal and vertical multi-party federated models?

Does fedlearner support horizontal and vertical multi-party federated models, as in the picture on the left?

If so, how can this be implemented?

deploy: manifest files not found

fedlearner/docs/tutorials/deploy.md

kubectl create -f deploy/kubernetes_operator/manifests/service_account.yaml
kubectl apply -f deploy/kubernetes_operator/manifests/cluster_role.yaml
kubectl apply -f deploy/kubernetes_operator/manifests/cluster_role_binding.yaml
kubectl apply -f deploy/kubernetes_operator/manifests/fedlearner.k8s.io_flapps.yaml
kubectl apply -f deploy/kubernetes_operator/manifests/controller.yaml
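
Before running the commands above, it can help to confirm which of the referenced files actually exist in the checkout. A minimal sketch, assuming it is run against the repository root; the helper name missing_manifests is illustrative only.

```python
from pathlib import Path

# Paths taken verbatim from the kubectl commands in the tutorial.
MANIFESTS = [
    "deploy/kubernetes_operator/manifests/service_account.yaml",
    "deploy/kubernetes_operator/manifests/cluster_role.yaml",
    "deploy/kubernetes_operator/manifests/cluster_role_binding.yaml",
    "deploy/kubernetes_operator/manifests/fedlearner.k8s.io_flapps.yaml",
    "deploy/kubernetes_operator/manifests/controller.yaml",
]


def missing_manifests(repo_root, manifests=MANIFESTS):
    """Return the manifest paths that are not regular files under repo_root."""
    root = Path(repo_root)
    return [m for m in manifests if not (root / m).is_file()]
```

An empty result from missing_manifests(".") means every file named in the tutorial is present; anything else lists exactly what the tutorial expects but the checkout lacks.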

There is a vulnerability in lodash 4.17.20, upgrade recommended

fedlearner/web_console/package.json

Line 103 in 70f3c5b

"lodash": "4.17.20",

CVE-2021-23337 CVE-2020-28500

Recommended upgrade version: 4.17.21
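
A pinned dependency like this can be checked mechanically. The sketch below compares exact "x.y.z" pins in a package.json against a table of minimum versions; it is a simplification that does not handle version ranges or pre-release tags, and the function names are illustrative.

```python
import json


def parse_version(text):
    """Turn '4.17.20' (optionally prefixed with ^ or ~) into a tuple of ints."""
    return tuple(int(part) for part in text.lstrip("^~").split("."))


def outdated_dependencies(package_json_text, minimum_versions):
    """Return {name: pinned_version} for dependencies below their minimum."""
    deps = json.loads(package_json_text).get("dependencies", {})
    return {
        name: pinned
        for name, pinned in deps.items()
        if name in minimum_versions
        and parse_version(pinned) < parse_version(minimum_versions[name])
    }


# The lodash pin comes from this issue; the rest of the file is omitted.
sample = '{"dependencies": {"lodash": "4.17.20"}}'
report = outdated_dependencies(sample, {"lodash": "4.17.21"})
```

Here report contains the lodash entry because 4.17.20 sorts below the recommended 4.17.21.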

How to run the deploy demo

fedlearner/docs/tutorials/deploy.md

I didn't find these files in this project.

What is the correct way to run this tutorial successfully?

Support tensorflow2 in the future?

Could it be used for recommendation?

Hi,
based on Douyin's or Toutiao's technology, could this repo be used for recommendation,
such as video recommendation or news recommendation?

Looking forward to the best technology being open-sourced.

channel_pb2 is not defined

The problem is:

from fedlearner.channel import channel_pb2, channel_pb2_grpc

ImportError: cannot import name 'channel_pb2'

There is a vulnerability in immer 7.0.14, upgrade recommended

fedlearner/web_console/package.json

Line 93 in 70f3c5b

"immer": "7.0.14",

CVE-2020-28477

Recommended upgrade version: 8.0.1

[Help needed] Cannot deploy and test fedlearner with a local k8s cluster on minikube

Following "fedlearner/deploy/README.md" with minikube v1.11.0, kubectl v1.18.5 and helm 3.2, I could install "fedlearner-stack" in the "default" namespace and "fedlearner" in the "leader" namespace successfully. However, I could not install the second "fedlearner" in the "follower" namespace, and got the following error message:

"Error: rendered manifests contain a resource that already exists. Unable to continue with install: existing resource conflict: kind : ClusterRole, namespace: , name: fedlearner-apiserver"

It seems the ClusterRole is a global resource, which causes the conflict. Does the current code base support testing fedlearner with a local k8s cluster on minikube? If so, I need help solving this issue. Also, there are several README.md files in the repo; which one should be followed? Thanks!
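
The conflict described above follows from ClusterRole being cluster-scoped: its name must be unique across the whole cluster, regardless of the namespace a release is installed into. A small sketch of that rule, where each release is modelled as a list of (kind, namespace, name) tuples; apart from the ClusterRole named in the error message, the resources shown are hypothetical.

```python
# Kubernetes kinds that live outside any namespace (not an exhaustive list).
CLUSTER_SCOPED_KINDS = {
    "ClusterRole",
    "ClusterRoleBinding",
    "CustomResourceDefinition",
    "Namespace",
}


def cluster_scoped_conflicts(release_a, release_b):
    """Return (kind, name) pairs that both releases try to create cluster-wide."""
    def cluster_keys(resources):
        return {
            (kind, name)
            for kind, _namespace, name in resources
            if kind in CLUSTER_SCOPED_KINDS
        }
    return sorted(cluster_keys(release_a) & cluster_keys(release_b))


leader = [
    ("ClusterRole", "", "fedlearner-apiserver"),
    ("Deployment", "leader", "fedlearner-apiserver"),    # hypothetical resource
]
follower = [
    ("ClusterRole", "", "fedlearner-apiserver"),
    ("Deployment", "follower", "fedlearner-apiserver"),  # hypothetical resource
]
conflicts = cluster_scoped_conflicts(leader, follower)
```

Only the ClusterRole is reported: the two Deployments share a name but sit in different namespaces, so they do not collide.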

The python command in quick_start is missing the checkpoint_path and save-checkpoint-steps parameters

It only ran successfully after I added these two parameters.

Template fedlearner.k8s.io_flapps.yaml validation error?

Hi there,

I am new to fedlearner and ran into an error when executing the following command:
helm install fedlearner ./deploy/charts/fedlearner --namespace leader

W0208 13:36:33.061422 40892 warnings.go:70] apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
W0208 13:36:33.507972 40892 warnings.go:70] networking.k8s.io/v1beta1 Ingress is deprecated in v1.19+, unavailable in v1.22+; use networking.k8s.io/v1 Ingress
W0208 13:36:33.518213 40892 warnings.go:70] networking.k8s.io/v1beta1 Ingress is deprecated in v1.19+, unavailable in v1.22+; use networking.k8s.io/v1 Ingress
W0208 13:36:34.452027 40892 warnings.go:70] apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
W0208 13:36:34.637682 40892 warnings.go:70] apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
Error: failed to create resource: CustomResourceDefinition.apiextensions.k8s.io "flapps.fedlearner.k8s.io" is invalid: [spec.validation.openAPIV3Schema.properties[spec].properties[flReplicaSpecs].additionalProperties.properties[template].properties[spec].properties[initContainers].items.properties[ports].items.properties[protocol].default: Required value: this property is in x-kubernetes-list-map-keys, so it must have a default or be a required property, spec.validation.openAPIV3Schema.properties[spec].properties[flReplicaSpecs].additionalProperties.properties[template].properties[spec].properties[containers].items.properties[ports].items.properties[protocol].default: Required value: this property is in x-kubernetes-list-map-keys, so it must have a default or be a required property]

I am using the latest code, and the K8S platform is RKE 1.19.7. The fedlearner-stack deployed successfully with no issues.
Is there anything I missed?

Thanks in advance.
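
The error message states the rule being enforced: every property named in x-kubernetes-list-map-keys must either be required or carry a default. A minimal sketch of a checker for that rule over a schema loaded as nested Python dicts; the function name and the sample schema are illustrative, not taken from the chart.

```python
def find_list_map_key_problems(schema, path="schema"):
    """Return paths of list-map keys that are neither required nor defaulted."""
    problems = []
    if isinstance(schema, dict):
        keys = schema.get("x-kubernetes-list-map-keys", [])
        items = schema.get("items", {})
        if keys and isinstance(items, dict):
            required = set(items.get("required", []))
            properties = items.get("properties", {})
            for key in keys:
                has_default = "default" in properties.get(key, {})
                if key not in required and not has_default:
                    problems.append(f"{path}.items.properties[{key}]")
        # Recurse into every nested value to reach deeper arrays.
        for name, child in schema.items():
            problems.extend(find_list_map_key_problems(child, f"{path}.{name}"))
    elif isinstance(schema, list):
        for index, child in enumerate(schema):
            problems.extend(find_list_map_key_problems(child, f"{path}[{index}]"))
    return problems


# Illustrative fragment shaped like a container "ports" schema.
ports = {
    "type": "array",
    "x-kubernetes-list-map-keys": ["containerPort", "protocol"],
    "items": {
        "type": "object",
        "required": ["containerPort"],
        "properties": {
            "containerPort": {"type": "integer"},
            "protocol": {"type": "string"},
        },
    },
}
```

For this fragment the checker flags protocol; adding a default (for example "TCP") to that property, or listing it under required, makes the result empty.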

A word is misspelled in README.md

"Fedlearner is collaborative machine learning frameowork that enables joint modeling of data distributed between institutions." The word "frameowork" here should be "framework", right?

Could the deployment documentation be written in more detail?

Error while running the example

I am using the tf-1.15.2 docker image to install fedlearner.

The following error is reported when running the example code:

python leader.py --local-addr=localhost:50051 --peer-addr=localhost:50052 --data-path=data/leader &

  File "leader.py", line 20, in <module>
    import fedlearner.trainer as flt
  File "/work/Algorithms/FederatedLearning/fedlearner/fedlearner/__init__.py", line 22, in <module>
    from fedlearner import trainer
  File "/work/Algorithms/FederatedLearning/fedlearner/fedlearner/trainer/__init__.py", line 19, in <module>
    from fedlearner.trainer import bridge
  File "/work/Algorithms/FederatedLearning/fedlearner/fedlearner/trainer/bridge.py", line 29, in <module>
    from fedlearner.common import common_pb2 as common_pb
  File "/work/Algorithms/FederatedLearning/fedlearner/fedlearner/common/common_pb2.py", line 22, in <module>
    create_key=_descriptor._internal_create_key,
AttributeError: module 'google.protobuf.descriptor' has no attribute '_internal_create_key'

The versions of protobuf and protoc are both 3.11.2

Any help is appreciated!
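
This kind of AttributeError typically appears when a generated *_pb2.py file was produced by a newer protoc than the protobuf Python runtime that imports it. That is a general observation about protobuf, not a confirmed diagnosis of this report. A small sketch that checks whether the installed runtime exposes the attribute the generated code expects; the function name and return values are illustrative.

```python
import importlib


def check_protobuf_runtime():
    """Report whether the installed protobuf runtime has _internal_create_key.

    Returns "missing" if protobuf is not importable, "ok" if the attribute
    exists, and "too-old" if the runtime predates the generated code.
    """
    try:
        descriptor = importlib.import_module("google.protobuf.descriptor")
    except ImportError:
        return "missing"
    if hasattr(descriptor, "_internal_create_key"):
        return "ok"
    return "too-old"


status = check_protobuf_runtime()
```

If the result is "too-old", the two usual options are upgrading the protobuf package or regenerating the *_pb2.py files with a protoc that matches the installed runtime.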

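Is fedlearner a finished project (one that supports model training and online inference once deployed)?

Is fedlearner a finished project (one that supports model training and online inference once deployed)? I have a few questions to confirm and would appreciate answers, thanks.
1. During deployment, the leader and follower conflict when they are in the same k8s cluster. How can this be resolved? (For now I have deployed the leader and follower in two separate clusters.)
2. After deployment I found that the MySQL database was not initialized. Is that because the project is still being refined and the initialization steps have not been added yet?
3. After manually adding a user and logging in to web-console, I cannot create jobs or perform other operations.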

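Questions about deployment and starting jobs

Do the authors have a deployment guide, similar to FATE's, that allows quick deployment?
Also API documentation: what parameters each interface requires and what results it returns.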

Is the project still under maintenance?

flask api run error

/api/v2/auth/users/1:

ERROR:root:Uncaught exception wrapper() got an unexpected keyword argument 'user_id', stack trace:
 Traceback (most recent call last):
  File "/home/guotie/.local/lib/python3.7/site-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/guotie/.local/lib/python3.7/site-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/guotie/.local/lib/python3.7/site-packages/flask_restful/__init__.py", line 468, in wrapper
    resp = resource(*args, **kwargs)
  File "/home/guotie/.local/lib/python3.7/site-packages/flask/views.py", line 89, in view
    return self.dispatch_request(*args, **kwargs)
  File "/home/guotie/.local/lib/python3.7/site-packages/flask_restful/__init__.py", line 583, in dispatch_request
    resp = meth(*args, **kwargs)
TypeError: wrapper() got an unexpected keyword argument 'user_id'

127.0.0.1 - - [10/Mar/2021 10:11:55] "GET /api/v2/auth/users/1 HTTP/1.0" 500 -
INFO:werkzeug:127.0.0.1 - - [10/Mar/2021 10:11:55] "GET /api/v2/auth/users/1 HTTP/1.0" 500 -

python 3.7
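
One common cause of "wrapper() got an unexpected keyword argument" is a decorator whose inner function does not forward the arguments that the framework passes to the view method. The sketch below reproduces that pattern in isolation; the class and decorator names are hypothetical and are not taken from the web console code.

```python
import functools


def broken_decorator(func):
    def wrapper(self):                 # accepts no URL parameters
        return func(self)
    return wrapper


def forwarding_decorator(func):
    @functools.wraps(func)             # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):      # pass through whatever the router supplies
        return func(*args, **kwargs)
    return wrapper


class UserApi:
    @broken_decorator
    def get_broken(self, user_id=None):
        return {"id": user_id}

    @forwarding_decorator
    def get(self, user_id):
        return {"id": user_id}


api = UserApi()
ok = api.get(user_id=1)
try:
    api.get_broken(user_id=1)
    error = None
except TypeError as exc:
    error = str(exc)
```

The forwarding version returns the user record, while the broken one raises a TypeError of the same shape as the one in the log, because the route's user_id keyword reaches a wrapper that does not accept it.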

Has anyone deployed it successfully?

13 INTERNAL: Received RST_STREAM with code 2 (Internal server error)
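After deploying with minikube, I created a Federation, and clicking Check Connection reports the error above.
Has anyone successfully deployed and run the demo? If so, how?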

Cannot clone this repo on a Windows PC

There is an illegal character in a path:

... <git clone output>
error: invalid path 'web_console_v2/client/src/services/mocks/v2/auth/users/:id.ts'
fatal: unable to checkout working tree
warning: Clone succeeded, but checkout failed.
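
The checkout fails because ":" is not allowed in Windows file names. A short sketch that flags path components containing characters Windows reserves; the helper name is illustrative, and the check covers reserved characters only, not reserved device names such as CON or NUL.

```python
# Characters that Windows does not allow inside a file or directory name.
WINDOWS_RESERVED_CHARS = set('<>:"|?*\\')


def invalid_windows_components(repo_path):
    """Return the components of a '/'-separated path that Windows rejects."""
    return [
        part
        for part in repo_path.split("/")
        if any(ch in WINDOWS_RESERVED_CHARS for ch in part)
    ]


bad = invalid_windows_components(
    "web_console_v2/client/src/services/mocks/v2/auth/users/:id.ts"
)
```

For the path in the error above, only the final component ":id.ts" is flagged, so renaming that one file would be enough for this path to check out.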

python leader.py error

fix the error

AttributeError: 'tuple' object has no attribute 'make_one_shot_iterator'

code

    def _get_features_and_labels_from_input_fn(self, input_fn, mode):
        return input_fn(self._bridge, self._trainer_master)

ERROR 1

python leader.py --local-addr=192.168.208.62:50051 --peer-addr=192.168.208.62:50052 --data-path=data/leader

WARNING:root:Bridge failed to connect: <_InactiveRpcError of RPC that terminated with:
        status = StatusCode.UNAVAILABLE
        details = "failed to connect to all addresses"
        debug_error_string = "{"created":"@1604474548.697105063","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":4165,"referenced_errors":[{"created":"@1604474491.601303236","description":"failed to connect to all addresses","file":"src/core/ext/filters/client_channel/lb_policy/pick_first/pick_first.cc","file_line":397,"grpc_status":14}]}"
>. Retry in 1 second...
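
StatusCode.UNAVAILABLE with "failed to connect to all addresses" generally means nothing accepted the TCP connection at the peer address, for example because the other party has not been started yet. A minimal sketch for checking reachability with the standard library before digging into gRPC itself; the function name is illustrative.

```python
import socket


def can_connect(address, timeout=1.0):
    """Return True if a TCP connection to 'host:port' succeeds within timeout."""
    host, _, port = address.rpartition(":")
    try:
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Running can_connect("192.168.208.62:50052") from the leader's machine shows whether the follower's port is open; if it returns False, the retry messages will continue until the peer process is listening.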

ERROR 2

python3.6 leader.py --local-addr=localhost:50051 --peer-addr=localhost:50052 --data-path=data/leader --sparse-estimator=True

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/requests/__init__.py:80: RequestsDependencyWarning: urllib3 (1.25.11) or chardet (3.0.4) doesn't match a supported version!
  RequestsDependencyWarning)
Failed to load fedlearner operators from /Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/fedlearner-0.1-py3.6.egg/cc/embedding.so
Traceback (most recent call last):
  File "leader.py", line 113, in <module>
    model_fn, serving_input_receiver_fn)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/fedlearner-0.1-py3.6.egg/fedlearner/trainer/trainer_worker.py", line 244, in train
    save_checkpoint_secs=args.save_checkpoint_secs)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/fedlearner-0.1-py3.6.egg/fedlearner/trainer/estimator.py", line 287, in train
    input_fn, ModeKeys.TRAIN)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/fedlearner-0.1-py3.6.egg/fedlearner/trainer/sparse_estimator.py", line 235, in _get_features_and_labels_from_input_fn
    slot_configs = self._set_model_configs(mode) # features, labels, mode)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/fedlearner-0.1-py3.6.egg/fedlearner/trainer/sparse_estimator.py", line 223, in _set_model_configs
    self._model_fn(M, None, None, mode) # features, labels, mode)
  File "leader.py", line 51, in model_fn
    x = features['x']
TypeError: 'NoneType' object is not subscriptable

env

python3.6 -V
Python 3.6.6

protoc --version
libprotoc 3.6.1

How can I run an rsa-psi demo?

Hi there, I want to use your implementation of RSA blind-signature-based PSI to do set intersection. How can I do that? Is there a tutorial about this? By the way, have you tested the time efficiency of your implementation? For example, if private sets A and B both contain about 100K elements, how long will this implementation take?

Your reply will be greatly appreciated!
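
For readers unfamiliar with the technique, the following is a toy sketch of PSI based on RSA blind signatures. It is not Fedlearner's implementation: the key is built from two small known primes, both parties run in one process, and all function names are illustrative. It requires Python 3.8+ for modular inverses via pow.

```python
import hashlib
import math
import secrets

# Toy RSA key from two known Mersenne primes; far too small for real use.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent, held by the server


def hash_to_int(item):
    """Hash an item into the integers modulo N (simplified)."""
    return int.from_bytes(hashlib.sha256(item.encode()).digest(), "big") % N


def fingerprint(signature):
    """Second hash applied to a signature before it is compared."""
    return hashlib.sha256(str(signature).encode()).hexdigest()


def client_blind(items):
    """Client: multiply each hashed item by r^E for a fresh random r."""
    blinded, factors = [], []
    for item in items:
        r = secrets.randbelow(N - 2) + 2
        while math.gcd(r, N) != 1:
            r = secrets.randbelow(N - 2) + 2
        blinded.append(hash_to_int(item) * pow(r, E, N) % N)
        factors.append(r)
    return blinded, factors


def server_sign(blinded):
    """Server: sign blinded values without learning the underlying items."""
    return [pow(b, D, N) for b in blinded]


def server_fingerprints(items):
    """Server: fingerprints of the signatures on its own items."""
    return {fingerprint(pow(hash_to_int(i), D, N)) for i in items}


def client_intersect(items, signed, factors, server_fps):
    """Client: unblind each signature and keep items the server also holds."""
    result = set()
    for item, s, r in zip(items, signed, factors):
        signature = s * pow(r, -1, N) % N   # removes r, leaving H(item)^D
        if fingerprint(signature) in server_fps:
            result.add(item)
    return result


def psi(client_items, server_items):
    client_items = list(client_items)
    blinded, factors = client_blind(client_items)
    signed = server_sign(blinded)
    return client_intersect(client_items, signed, factors,
                            server_fingerprints(server_items))
```

Calling psi(["alice", "bob", "carol"], ["bob", "carol", "dave"]) yields the two shared names. The cost is dominated by one modular exponentiation with the private exponent per element on the server side, so running time depends mostly on key size and set size; this sketch says nothing about the performance of the real implementation.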

example/mnist fails to run: embedding.so or libtensorflow_io_golang.so

After starting leader.py and follower.py, the following exception stack is thrown:

Failed to load fedlearner operators from /fedlearner/cc/embedding.so
Traceback (most recent call last):
  File "follower.py", line 20, in <module>
    import fedlearner.trainer as flt
  File "/fedlearner/fedlearner/__init__.py", line 21, in <module>
    from fedlearner import trainer
  File "/fedlearner/fedlearner/trainer/__init__.py", line 23, in <module>
    from fedlearner.trainer import trainer_worker
  File "/fedlearner/fedlearner/trainer/trainer_worker.py", line 32, in <module>
    from fedlearner.trainer.data_visitor import DataPathVisitor,
  File "/fedlearner/fedlearner/trainer/data_visitor.py", line 32, in <module>
    from fedlearner.data_join.data_block_visitor import DataBlockVisitor
  File "/fedlearner/fedlearner/data_join/data_block_visitor.py", line 20, in <module>
    import tensorflow_io # pylint: disable=unused-import
  File "/usr/local/lib/python3.7/site-packages/tensorflow_io/__init__.py", line 21, in <module>
    from tensorflow_io.core.python.ops.io_info import version as version
  File "/usr/local/lib/python3.7/site-packages/tensorflow_io/core/python/ops/__init__.py", line 69, in <module>
    core_golang_ops = _load_library('libtensorflow_io_golang.so')
  File "/usr/local/lib/python3.7/site-packages/tensorflow_io/core/python/ops/__init__.py", line 67, in _load_library
    "{}, from paths: {}\ncaused by: {}".format(filename, filenames, errs))
NotImplementedError: unable to open file: libtensorflow_io_golang.so, from paths: ['/usr/local/lib/python3.7/site-packages/tensorflow_io/core/python/ops/libtensorflow_io_golang.so']
caused by: ['dlopen: cannot load any more object with static TLS']
Failed to load fedlearner operators from /fedlearner/cc/embedding.so
Traceback (most recent call last):
  File "leader.py", line 20, in <module>
    import fedlearner.trainer as flt
  File "/fedlearner/fedlearner/__init__.py", line 21, in <module>
    from fedlearner import trainer
  File "/fedlearner/fedlearner/trainer/__init__.py", line 23, in <module>
    from fedlearner.trainer import trainer_worker
  File "/fedlearner/fedlearner/trainer/trainer_worker.py", line 32, in <module>
    from fedlearner.trainer.data_visitor import DataPathVisitor,
  File "/fedlearner/fedlearner/trainer/data_visitor.py", line 32, in <module>
    from fedlearner.data_join.data_block_visitor import DataBlockVisitor
  File "/fedlearner/fedlearner/data_join/data_block_visitor.py", line 20, in <module>
    import tensorflow_io # pylint: disable=unused-import
  File "/usr/local/lib/python3.7/site-packages/tensorflow_io/__init__.py", line 21, in <module>
    from tensorflow_io.core.python.ops.io_info import version as version
  File "/usr/local/lib/python3.7/site-packages/tensorflow_io/core/python/ops/__init__.py", line 69, in <module>
    core_golang_ops = _load_library('libtensorflow_io_golang.so')
  File "/usr/local/lib/python3.7/site-packages/tensorflow_io/core/python/ops/__init__.py", line 67, in _load_library
    "{}, from paths: {}\ncaused by: {}".format(filename, filenames, errs))
NotImplementedError: unable to open file: libtensorflow_io_golang.so, from paths: ['/usr/local/lib/python3.7/site-packages/tensorflow_io/core/python/ops/libtensorflow_io_golang.so']