neptune-ai / neptune-client
The experiment tracker for foundation model training
Home Page: https://neptune.ai
License: Apache License 2.0
When creating an experiment with
neptune.create_experiment()
it returns the experiment object.
It could also print a link to the experiment to make it easier to find.
Are there any plans for notifications via Telegram, or even an app?
I am trying to add a feature version tag of '1.0' and it results in an error:
neptune.api_exceptions.ExperimentValidationError: Tags: [1.0] are invalid. Valid tags may contain only lowercase letters, digits, underscores and dashes.
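Since tags allow only lowercase letters, digits, underscores, and dashes, a version string like '1.0' has to be normalized before it can be used as a tag. This is a workaround sketch, not part of the client; the helper name is hypothetical:

```python
import re

def to_valid_tag(text):
    # Hypothetical helper: lowercase the string, then replace every character
    # that is not a lowercase letter, digit, underscore, or dash with a dash.
    return re.sub(r"[^a-z0-9_-]", "-", text.lower())

print(to_valid_tag("1.0"))           # -> '1-0'
print(to_valid_tag("Feature v1.0"))  # -> 'feature-v1-0'
```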
Hey,
It would be good to be able to add many tags at once, for example: npt_exp.append_tag(['tag1', 'tag2', 'tag3'])
so a list of tags. In this case, adding a single tag as a string would remain valid, like: npt_exp.append_tag('tag1')
@aniezurawski @jakubczakon what do you think?
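The proposed behavior can be sketched as a thin helper on top of the existing single-tag call (send_one here is just any callable standing in for the client's append_tag; the helper itself is not part of the API):

```python
def append_tags(append_one, tags):
    # Sketch of the proposal: accept either a single tag string or a list
    # of tags, delegating each one to the existing single-tag call.
    if isinstance(tags, str):
        tags = [tags]
    for tag in tags:
        append_one(tag)

added = []
append_tags(added.append, ['tag1', 'tag2', 'tag3'])  # list form
append_tags(added.append, 'tag4')                    # single-string form
```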
I'm trying to run an experiment but I am getting an error:
neptune.init("blackarbsceo/hpo-es-features", api_token=api_key)
neptune.create_experiment(
"hyperparameter-optuna-rf-test", upload_source_files=["*.py"]
)
neptune_callback = optuna_utils.NeptuneCallback(log_study=True, log_charts=True)
Traceback (most recent call last):
File "C:\Users\kngka\Anaconda3\envs\mlfinlab2\lib\site-packages\IPython\core\interactiveshell.py", line 3418, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-2-63e8b42c81c5>", line 17, in <module>
import neptune
File "C:\Program Files\JetBrains\PyCharm Community Edition 2020.1.2\plugins\python-ce\helpers\pydev\_pydev_bundle\pydev_import_hook.py", line 21, in do_import
module = self._system_import(name, *args, **kwargs)
File "C:\Users\kngka\Anaconda3\envs\mlfinlab2\lib\site-packages\neptune\__init__.py", line 24, in <module>
from neptune.internal.backends.hosted_neptune_backend import HostedNeptuneBackend
File "C:\Program Files\JetBrains\PyCharm Community Edition 2020.1.2\plugins\python-ce\helpers\pydev\_pydev_bundle\pydev_import_hook.py", line 21, in do_import
module = self._system_import(name, *args, **kwargs)
File "C:\Users\kngka\Anaconda3\envs\mlfinlab2\lib\site-packages\neptune\internal\backends\hosted_neptune_backend.py", line 58, in <module>
from neptune.oauth import NeptuneAuthenticator
File "C:\Program Files\JetBrains\PyCharm Community Edition 2020.1.2\plugins\python-ce\helpers\pydev\_pydev_bundle\pydev_import_hook.py", line 21, in do_import
module = self._system_import(name, *args, **kwargs)
File "C:\Users\kngka\Anaconda3\envs\mlfinlab2\lib\site-packages\neptune\oauth.py", line 21, in <module>
from oauthlib.oauth2 import TokenExpiredError, OAuth2Error
File "C:\Program Files\JetBrains\PyCharm Community Edition 2020.1.2\plugins\python-ce\helpers\pydev\_pydev_bundle\pydev_import_hook.py", line 21, in do_import
module = self._system_import(name, *args, **kwargs)
ModuleNotFoundError: No module named 'oauthlib.oauth2'
oauthlib 3.1.0 pypi_0 pypi
neptune-client 0.4.132+2.g26cdb5a pypi_0 pypi
neptune-contrib 0.25.0 pypi_0 pypi
I think there is some regression. For recent experiments, experiment.get_properties()
returns either an empty dictionary or {'key1': 'value1', 'key2': '17', 'key3': 'other-value'}
(whatever that is) in the case of the sandbox project.
For older experiments, I can still get the properties.
This is probably a backend issue, but there is no better place to report it.
I would like to be able to run scripts with neptune tracking, but when running without tracking (for instance in tests), I want a flag that turns off neptune tracking.
Something like this would be great
import neptune
SILENT = True

if SILENT:
    neptune.set_silent()

neptune.init()
neptune.create_experiment()
...
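Until such a flag exists, one workaround is to route all tracking calls through a wrapper object that either forwards to neptune or swallows everything. A minimal sketch, assuming you control every tracking call in your own code (NoOpTracker and get_tracker are hypothetical names, not part of the client):

```python
class NoOpTracker:
    # Stand-in that silently ignores every tracking call: any attribute
    # access returns a do-nothing function.
    def __getattr__(self, name):
        return lambda *args, **kwargs: None

def get_tracker(silent):
    if silent:
        return NoOpTracker()
    import neptune  # only touch neptune when tracking is enabled
    neptune.init()
    neptune.create_experiment()
    return neptune

tracker = get_tracker(silent=True)
tracker.send_metric('accuracy', 0.9)  # silently ignored in tests
```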
I am just getting started with my first neptune project and am running into a problem with the logger. I haven't seen any information online about this particular problem so I wanted to post it here.
The following is an example of a project and experiment initialization that passes in a python logger object.
import logging
import neptune
from src.keys import NEPTUNE_TOKEN
from neptune.experiments import Experiment
neptune.init('richt3211/thesis', api_token=NEPTUNE_TOKEN)
logger = logging.getLogger()
exp = neptune.create_experiment(
name='test log',
description='testing logger',
logger=logger
)
logger.info('Starting experiment')
When I run this in a jupyter notebook, the experiment is created, but no logs appear.
I believe this is the correct way to capture a Python logger in neptune according to the docs. If it's not, the docs might need to be updated to give a clearer example. If it is correct, there may be a bug.
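While the logger kwarg behavior is unclear, one possible workaround is a standard logging.Handler that forwards each record to any sink you choose, e.g. lambda msg: exp.log_text('logs', msg) (whether the experiment exposes log_text is an assumption; SinkHandler below is a generic sketch, not the client's mechanism):

```python
import logging

class SinkHandler(logging.Handler):
    # Generic handler: forwards every formatted log record to a callable sink.
    def __init__(self, sink):
        super().__init__()
        self.sink = sink

    def emit(self, record):
        self.sink(self.format(record))

messages = []  # stand-in sink; replace with a call into the experiment
logger = logging.getLogger('neptune-demo')
logger.setLevel(logging.INFO)
logger.addHandler(SinkHandler(messages.append))
logger.info('Starting experiment')
```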
Hi,
I am trying to download the /public/dsb_2018_data/ data from the public Neptune drive. I can't find any sample code for downloading data from the Neptune public drive.
Any help is much appreciated.
Thanks
I get the following error message when importing neptune:
import neptune
  File "E:\01_Programs\01_Anaconda\envs\mlenv\lib\site-packages\neptune\__init__.py", line 19, in <module>
    from neptune import envs, projects, experiments
  File "E:\01_Programs\01_Anaconda\envs\mlenv\lib\site-packages\neptune\projects.py", line 23, in <module>
    from neptune.experiments import Experiment, push_new_experiment
  File "E:\01_Programs\01_Anaconda\envs\mlenv\lib\site-packages\neptune\experiments.py", line 31, in <module>
    from neptune.internal.utils.image import get_image_content
  File "E:\01_Programs\01_Anaconda\envs\mlenv\lib\site-packages\neptune\internal\utils\image.py", line 19, in <module>
    from PIL import Image
  File "E:\01_Programs\01_Anaconda\envs\mlenv\lib\site-packages\PIL\Image.py", line 94, in <module>
    from . import _imaging as core
ImportError: cannot import name '_imaging'
Python 3.6
Please help
It would be useful to download images from the image channel.
For example:
exp = project.get_experiment(id='PROJ-1')
exp.download_image_channel('local/dest/to/image/dir')
And the images would land in my 'local/dest/to/image/dir' folder.
Some properties of the experiments need to be downloaded from the backend. I understand this logic, but it is impractical. My case:
I have 1000 experiments and want to download some of them, filtered by name, but calling experiment.name
requires 1000 calls to the server, which takes ages. This is a pain. Another server call happens when I ask for experiment.get_properties(),
and another for experiment.state.
That makes 3000 requests to the server just for such simple data.
I think it would be perfectly reasonable to make get_experiments()
download those data and keep them static; if that is too convoluted, then please at least allow filtering experiments by name in get_experiments().
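A client-side workaround sketch: pull each experiment's remote fields exactly once into plain dicts, then filter locally for free. This assumes only that each experiment object exposes name, state, and get_properties() as described above; snapshot and filter_by_name are hypothetical helper names:

```python
def snapshot(experiments):
    # One pass over the experiments: each remote field is fetched a single
    # time, instead of on every later attribute access.
    return [
        {'name': e.name, 'state': e.state, 'properties': e.get_properties()}
        for e in experiments
    ]

def filter_by_name(snapshots, substring):
    # Free local filtering over the cached metadata.
    return [s for s in snapshots if substring in s['name']]
```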
I think that an experiment's parameters should be a property, so instead of npt_exp.get_parameters()['dropout']
I can simply do npt_exp.parameters['dropout'].
What do you think, @pitercl @jakubczakon?
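The requested ergonomics can be sketched as a thin wrapper around an existing experiment object, assuming only the get_parameters() call shown above (ExperimentView is a hypothetical name, not a client class):

```python
class ExperimentView:
    # Wraps an experiment so its parameters read as a property instead of
    # requiring a get_parameters() call at every use site.
    def __init__(self, experiment):
        self._experiment = experiment

    @property
    def parameters(self):
        return self._experiment.get_parameters()
```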
It is a bit frustrating to type project_qualified_name
every time I run neptune.init().
Could that be changed to project_name?
Also, is there any other way to pass the project name,
like the CLI?
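One possible alternative to typing the argument every time is setting the project through an environment variable before calling init(). Whether the client reads it, and the variable name NEPTUNE_PROJECT, are both assumptions here, as is the project value shown:

```python
import os

# Assumption: the client falls back to this env var when neptune.init()
# is called without a project argument. Variable name and value are
# hypothetical.
os.environ['NEPTUNE_PROJECT'] = 'my-workspace/my-project'
# import neptune
# neptune.init()
```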
I ran the following script:
import neptune
neptune.init('jakub-czakon/examples')
with neptune.create_experiment():
neptune.send_metric('score', 0.9)
neptune.set_property('was_logged', True)
and no property is set for this experiment. There is no error; it just doesn't send it.
But if I run the following with str(True):
import neptune
neptune.init('jakub-czakon/examples')
with neptune.create_experiment():
neptune.send_metric('score', 0.9)
neptune.set_property('was_logged', str(True))
everything works just fine.
Maybe we should have an automatic str(value)
in set_property(),
or raise a warning or something.
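Until the client coerces values itself, a small wrapper can do the str(value) conversion before delegating to the real call. A sketch assuming only the set_property(key, value) signature shown above (set_property_safe is a hypothetical name):

```python
def set_property_safe(experiment, key, value):
    # Coerce non-string values to str before calling the real set_property,
    # since non-string values are currently dropped silently.
    if not isinstance(value, str):
        value = str(value)
    experiment.set_property(key, value)
```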
I got an Invalid JSON received from frontend.
message when logging in with neptune account login.
It may be because I am using a SOCKS proxy. Please kindly suggest a way to solve this problem.
I modified the neptune/client.py
file, in the function def _upload_tar_data(self, experiment, api_method, data):,
to:
proxies = {"http": "socks5h://127.0.0.1:9999", "https": "socks5h://127.0.0.1:9999"}
return session.send(session.prepare_request(request), proxies=proxies)
so that it works for neptune-client without the command line.
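An alternative that avoids patching client.py: requests (the HTTP library neptune-client uses) honors the standard proxy environment variables, and SOCKS support comes from installing the requests[socks] extra (pip install "requests[socks]"). A sketch, assuming the proxy runs on 127.0.0.1:9999 as in the patch above:

```shell
# Point requests at the local SOCKS proxy via the standard env vars;
# this applies to every request the client makes, no code changes needed.
export HTTP_PROXY="socks5h://127.0.0.1:9999"
export HTTPS_PROXY="socks5h://127.0.0.1:9999"
```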
Hi, I'm running into the same issue that Richard reported in #280. I'm creating neptune experiments within a loop, testing the same model on multiple subsets of a dataset.
The model is the same each time and the JsonDecodeError occurs randomly, i.e. the code will run for hours then randomly throw the error.
Sorry if this is the wrong place to post, I would have commented in the other thread but the issue was already closed.
Once I call neptune.create_experiment(),
I am unable to change: name,
params,
description.
I think that neptune-client
should allow the user to do this. My use case is that I want to append stuff to params and assign a name and description right before the training loop.
What do you guys think?
@lukasz-walkiewicz @jakubczakon
I tried to upload my pytorch model weights as an artifact through a BytesIO:
buffer = BytesIO()
print(getsizeof(buffer)) # 96
torch.save(model.state_dict(), buffer)
print(getsizeof(buffer)) # 101291839
experiment.log_artifact(artifact=buffer, destination="fold0.pth")
But the artifact is empty (0 B) and the python script doesn't crash.
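A likely cause (my reading, not confirmed by the client's source): after torch.save writes into the buffer, the stream position sits at the end, so any upload that reads from the current position gets zero bytes. Rewinding with seek(0) before log_artifact is a plausible fix. A minimal illustration with a stand-in write instead of torch.save:

```python
import io

buffer = io.BytesIO()
buffer.write(b'fake-weights')   # stand-in for torch.save(model.state_dict(), buffer)
assert buffer.read() == b''     # position is at EOF: a reader now sees nothing
buffer.seek(0)                  # rewind before handing the buffer to the upload
assert buffer.read() == b'fake-weights'
```

With the real code, the sketch would be buffer.seek(0) right before experiment.log_artifact(artifact=buffer, destination="fold0.pth").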
What is the reason for not allowing inf and NaN values as metrics? I imagine it is impossible to plot them, but they still carry some information.
I would like to be able to send a list of values to a channel rather than send one value in a for loop.
Something like the following would be great:
neptune.send_metric('accuracy', [0.8, 0.4, 0.9])
neptune.send_artifacts(['model.pkl','report.pdf'])
neptune.send_images('diagnostics', ['roc_auc.png','pred_dist.png', 'conf_matrix.png'])
neptune.set_property({'data_version': 'f23fasdqw122312',
'data_path': 'data/raw/table.csv',
'model_path': 'models/model_v1.csv'})
some text goes here, also @kamil-kaczmarek
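Until batch variants exist, the list forms above can be sketched as thin helpers looping over the existing single-value calls (send_one below stands in for a call like neptune.send_metric; the helper name is hypothetical):

```python
def send_metrics(send_one, channel, values):
    # Sketch of the proposed batch behavior on top of the existing
    # single-value call: one delegation per value, same channel.
    for value in values:
        send_one(channel, value)

sent = []
send_metrics(lambda ch, v: sent.append((ch, v)), 'accuracy', [0.8, 0.4, 0.9])
```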
It would be good to add an option to do something like this:
...
experiment.get_output_files('my_model.h5', '/path/to/local/storage')
So that I could access (download) the artifacts from code.
sorry to bother you guys, really great app
# Connect your script to Neptune
PARAMS = {'boosting_type': 'gbdt',
          'objective': 'binary',
          'metric': 'auc',
          'bagging_fraction': 0.7,
          'seed': 2020,
          }

# Create an experiment and log hyperparameters
neptune.create_experiment(name='test-lgb-1',
                          description='1st test on LC data, train-test-split, basic-fe',
                          params={**PARAMS,
                                  # 'num_boosting_round': NUM_BOOSTING_ROUNDS
                                  },
                          upload_source_files=['train.py', 'environment.yaml'],
                          )

# read data
train = pd.read_csv('cfn-train.csv')
test = pd.read_csv('cfn-testa.csv')
fea = [f for f in train.columns if f not in ['id', 'isDefault']]
X_train = train[fea]
X_test = test[fea]
y_train = train['isDefault']

folds = 5
seed = 2008
kf = KFold(n_splits=folds, shuffle=True, random_state=seed)

# _______________________________
cv_scores = []
for i, (train_ind, valid_ind) in enumerate(kf.split(X_train, y_train)):
    print('************************************ {} ************************************'.format(str(i + 1)))
    X_train_split, y_train_split, X_val, y_val = X_train.iloc[train_ind], y_train[train_ind], X_train.iloc[valid_ind], y_train[valid_ind]
    train_matrix = lgb.Dataset(X_train_split, label=y_train_split)  # categorical_feature = ['grade, subGrade']
    valid_matrix = lgb.Dataset(X_val, label=y_val)
    gbm = lgb.train(PARAMS,
                    train_set=train_matrix,
                    # num_boost_round=NUM_BOOSTING_ROUNDS,
                    valid_sets=valid_matrix,
                    verbose_eval=200,
                    early_stopping_rounds=200,
                    # valid_names=['train', 'valid'],
                    callbacks=[neptune_monitor()],  # monitor learning curves (prefix)
                    )
    val_pred = gbm.predict(X_val, num_iteration=gbm.best_iteration)
    cv_scores.append(roc_auc_score(y_val, val_pred))
print(cv_scores)
I am having this bug. I don't know how to fix it. I searched and found an article, but it really confused me.
Failed to send channel value.
Traceback (most recent call last):
File "F:\Anaconda\envs\kaggle\lib\site-packages\neptune\internal\channels\channels_values_sender.py", line 156, in _send_values
self._experiment._send_channels_values(channels_with_values)
File "F:\Anaconda\envs\kaggle\lib\site-packages\neptune\experiments.py", line 1138, in _send_channels_values
self._backend.send_channels_values(self, channels_with_values)
File "F:\Anaconda\envs\kaggle\lib\site-packages\neptune\utils.py", line 210, in wrapper
return func(*args, **kwargs)
File "F:\Anaconda\envs\kaggle\lib\site-packages\neptune\internal\backends\hosted_neptune_backend.py", line 560, in send_channels_values
raise ChannelsValuesSendBatchError(experiment.id, batch_errors)
neptune.api_exceptions.ChannelsValuesSendBatchError: Received batch errors sending channels' values to experiment LEN-9. Cause: Error(code=400, message='X-coordinates must be strictly increasing for channel: 4176ea13-112e-41f7-a353-fcd8348b3379. Invalid point: InputChannelValue(timestamp=2020-09-30T07:11:18.643Z, x=0.0, numericValue=0.7006723423306764, textVa', type=None) (metricId: '4176ea13-112e-41f7-a353-fcd8348b3379', x: 0.0) Skipping 100 values.
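The "X-coordinates must be strictly increasing" message suggests (my reading, not confirmed) that each fold of the cross-validation loop restarts the step counter at x=0 on the same channel. Two possible workarounds: give each fold its own channel (for example via a per-fold prefix, if the monitoring callback supports one, which is an assumption), or keep one strictly increasing step across all folds. A sketch of the latter; log_metric and the send_metric callable are hypothetical stand-ins:

```python
import itertools

step = itertools.count()  # one global, strictly increasing x across all folds

def log_metric(send_metric, channel, value):
    # send_metric stands in for a call that takes (channel, x, y)
    send_metric(channel, next(step), value)

xs = []
for fold in range(2):                 # two folds logging to the same channel
    for value in (0.7, 0.8):
        log_metric(lambda ch, x, y: xs.append(x), 'auc', value)
```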
Hi,
Running in VSCode Jupyter Notebooks with Python environment 3.8.2
neptune-client 0.4.130
when I try to:
import neptune
I get this error:
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input> in <module>
----> 1 import neptune
      2
      3 import pandas as pd
      4 import numpy as np
      5

ModuleNotFoundError: No module named 'neptune'
And yes, the library is installed:
!pip list
Package Version
... 6.0.7
nbformat 5.0.8
neptune-client 0.4.130
neptune-notebooks 0.0.16
nest-asyncio 1.4.3
...
I have tried different versions, but nothing works. Any ideas?
Thank you
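A frequent cause of "installed but not found" in notebooks: the kernel runs a different interpreter than the pip that installed the package (this is a general observation, not a confirmed diagnosis of this report). Printing the kernel's interpreter and comparing it with where pip installed to is a quick check; `python -m pip install neptune-client` then targets exactly that interpreter:

```python
import sys

# Run this inside the notebook: the path printed is the interpreter the
# kernel actually uses. If `pip --version` reports a different environment,
# the package was installed somewhere the kernel never looks.
print(sys.executable)
```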
In my setup, the discovery of the git repo location fails. Let me give the context: I have a main script where the experiment is defined, but the execution of the experiment happens from a different script, which is outside my repo (to be precise, I run my experiments using ray). In this context, the git repo of my experiment script is not located correctly. The problem is in:
def discover_git_repo_location():
    import __main__

    if hasattr(__main__, '__file__'):
        return os.path.dirname(os.path.abspath(__main__.__file__))
    return None
which asks for __main__,
which, in my case, will be some external ray module.
I would appreciate any workaround tips, e.g., how to change my __main__.
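Given that the discovery function above only reads __main__.__file__, one workaround sketch is to point that attribute at a script inside your repo before neptune initializes (the path below is purely hypothetical, and mutating __main__ is a hack, not a supported API):

```python
import __main__

# Make discover_git_repo_location() resolve to your repo's directory by
# overriding what "the main script" appears to be. Do this before the
# experiment is created. The path is a placeholder for illustration.
__main__.__file__ = '/path/to/your/repo/train.py'
```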
Related to neptune-ai/neptune-contrib#55
I have a machine whose compute nodes have no access to the internet. Is it possible to use the offline mode during compute and load the results to the web interface after the compute job is finished?
Hi,
something like loading the experiment and getting the output files via 'experiment.download_artifact'.
The code might look like this:
previous_experiment = neptune.load_experiment(id='SAN-1')  # loading a previously made experiment
my_csv = previous_experiment.download_artifact('xyz.csv')
Is it possible?
Hi there,
I enjoy neptune very much and on my macbook everything works fine. But when I run the same code on my Windows 10 machine, I get an error when calling create_experiment().
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\ProgramData\Anaconda3\envs\rl_insurance\lib\site-packages\neptune\__init__.py", line 177, in create_experiment
    notebook_id=notebook_id
  File "C:\ProgramData\Anaconda3\envs\rl_insurance\lib\site-packages\neptune\projects.py", line 400, in create_experiment
    click.echo(str(experiment.id))
  File "C:\ProgramData\Anaconda3\envs\rl_insurance\lib\site-packages\click\utils.py", line 218, in echo
    file = _default_text_stdout()
  File "C:\ProgramData\Anaconda3\envs\rl_insurance\lib\site-packages\click\_compat.py", line 675, in func
    rv = wrapper_func()
  File "C:\ProgramData\Anaconda3\envs\rl_insurance\lib\site-packages\click\_compat.py", line 436, in get_text_stdout
    rv = _get_windows_console_stream(sys.stdout, encoding, errors)
  File "C:\ProgramData\Anaconda3\envs\rl_insurance\lib\site-packages\click\_winconsole.py", line 295, in _get_windows_console_stream
    func = _stream_factories.get(f.fileno())
AttributeError: 'StdOutWithUpload' object has no attribute 'fileno'
It happens when I run:
import neptune
import cfg
neptune.init(api_token=cfg.neptune_token, project_qualified_name=cfg.neptune_project_name)
neptune.create_experiment()
I run it in conda environments both times.
I'm trying to use the advanced search option with NQL but am running into an issue trying to write a query for a specific field. My field is named "trained epochs" which is a nice indication to quickly check the progress of a model without looking at the charts.
I would like to write a query for experiments that only have at least one value for the trained epochs metric. However, because there is a space in the field name, I can't write a query to accomplish this.
So far I have tried the following queries
"trained epochs" > 0
epochs > 0
trained\ epochs > 0 # trying to escape the space character
Is it possible to allow spaces in NQL fields? Another solution would be an option to rename the metric so that it is compatible with NQL. If allowing spaces isn't possible, it would be nice to mention this in the docs on field naming, so that users are encouraged (or even forced) not to use spaces in metric and log names.
Running
import neptune
neptune.init()
neptune.send_metric('metric', 0.3)
results in the generic python list error
IndexError: list index out of range
when it could say that you need to run:
neptune.create_experiment()
When using argparse with neptune, the arguments are not recognized correctly.
Example Code:
import argparse
parser = argparse.ArgumentParser("cifar")
parser.add_argument('--data', type=str, default='../data', help='location of the data corpus')
# parser.add_argument('--batch_size', type=int, default=64, help='batch size')
args = parser.parse_args()
# import neptune
If they are commented out as above, running:
python temp.py --data ./data
causes no problem.
However, if I uncomment them:
import argparse
parser = argparse.ArgumentParser("cifar")
parser.add_argument('--data', type=str, default='../data', help='location of the data corpus')
parser.add_argument('--batch_size', type=int, default=64, help='batch size')
args = parser.parse_args()
import neptune
Run
python temp.py --batch-size 100
will raise unrecognized argument error.
Any solution?
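One thing worth checking before blaming the neptune import (an observation about the example, not from the issue): the flag is defined as --batch_size but the command passes --batch-size, and argparse treats those as different option strings, which on its own produces an "unrecognized arguments" error. A minimal demonstration:

```python
import argparse

parser = argparse.ArgumentParser('cifar')
parser.add_argument('--batch_size', type=int, default=64, help='batch size')

# '--batch-size' does not match the defined '--batch_size' option string,
# so it ends up in the unknown list (parse_args() would error out instead).
args, unknown = parser.parse_known_args(['--batch-size', '100'])
```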
I have tried downloading the artifact programmatically:
from neptune.sessions import Session
session = Session()
project = session.get_project(project_qualified_name='jakub-czakon/blog-hpo')
exp = project.get_experiments(id='BLOG-97')[0]
exp.download_artifact('forest_results.pkl', '.')
It results in the following error
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-17-8f3ed3e70168> in <module>
6 exp = project.get_experiments(id='BLOG-97')[0]
7
----> 8 exp.download_artifact('forest_results.pkl', '.')
~/.envs/npt_dev/lib/python3.5/site-packages/neptune/experiments.py in download_artifact(self, filename, destination_dir)
328 raise NotADirectory(destination_dir)
329
--> 330 self._client.download_data(self._project, path, destination_path)
331
332 def send_graph(self, graph_id, value):
~/.envs/npt_dev/lib/python3.5/site-packages/neptune/client.py in wrapper(*args, **kwargs)
51 def wrapper(*args, **kwargs):
52 try:
---> 53 return func(*args, **kwargs)
54 except requests.exceptions.SSLError:
55 raise SSLError()
~/.envs/npt_dev/lib/python3.5/site-packages/neptune/client.py in download_data(self, project, path, destination)
700 query_params={
701 "projectId": project.internal_id,
--> 702 "path": path
703 }) as response:
704 if response.status_code == NOT_FOUND:
~/.envs/npt_dev/lib/python3.5/site-packages/neptune/client.py in _download_raw_data(self, api_method, headers, path_params, query_params)
823 url = self.api_address + api_method.operation.path_name + "?"
824
--> 825 for key, val in path_params.iteritems():
826 url = url.replace("{" + key + "}", val)
827
AttributeError: 'dict' object has no attribute 'iteritems'
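The traceback ends in dict.iteritems(), which existed only in Python 2; under Python 3 the equivalent is dict.items(), so _download_raw_data breaks on any Python 3 interpreter. A minimal illustration of the failing pattern and its fix (the URL template and values are illustrative, not the client's real ones):

```python
# Python 3 dicts have no iteritems(); calling it raises exactly the
# AttributeError above. items() is the py3 replacement.
path_params = {'projectId': 'abc123', 'path': 'forest_results.pkl'}

url = 'https://example.invalid/{projectId}'  # illustrative template
for key, val in path_params.items():         # .iteritems() would fail here on py3
    url = url.replace('{' + key + '}', val)
```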
Hi!
First of all, thanks for sharing your amazing library!
I am quite new to Neptune, and am trying to run (and log) some trainings on 1 GPU. Everything went smoothly for ~20 hours, but then I got an error (Failed to send channel value.).
I am wondering what might have caused this.
It happened at the same time to all 3 of my jobs. I see a few possibilities:
I would highly appreciate any hint on how I can deal with this :-)
Logs from Neptune's stderr:
20:17:01 | E1023 04:07:48.158644 35187409351088 channels_values_sender.py:164] Failed to send channel value.
20:17:01 | Traceback (most recent call last):
20:17:01 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/utils.py", line 210, in wrapper
20:17:01 | return func(*args, **kwargs)
20:17:01 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/internal/backends/hosted_neptune_backend.py", line 556, in send_channels_values
20:17:01 | channelsValues=input_channels_values
20:17:01 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/bravado/http_future.py", line 239, in response
20:17:01 | six.reraise(*sys.exc_info())
20:17:01 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/six.py", line 693, in reraise
20:17:01 | raise value
20:17:01 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/bravado/http_future.py", line 200, in response
20:17:01 | swagger_result = self._get_swagger_result(incoming_response)
20:17:01 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/bravado/http_future.py", line 124, in wrapper
20:17:01 | return func(self, *args, **kwargs)
20:17:01 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/bravado/http_future.py", line 303, in _get_swagger_result
20:17:01 | self.request_config.response_callbacks,
20:17:01 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/bravado/http_future.py", line 334, in unmarshal_response
20:17:01 | raise_on_unexpected(incoming_response)
20:17:01 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/bravado/http_future.py", line 408, in raise_on_unexpected
20:17:01 | raise make_http_exception(response=http_response)
20:17:01 | bravado.exception.HTTPInternalServerError: 500 : {"code":500,"errorType":"INTERNAL_SERVER_ERROR","title":"Internal Server Error (2fb6177e655)"}
20:17:01 | During handling of the above exception, another exception occurred:
20:17:01 | Traceback (most recent call last):
20:17:01 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/internal/channels/channels_values_sender.py", line 156, in _send_values
20:17:01 | self._experiment._send_channels_values(channels_with_values)
20:17:01 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/experiments.py", line 1138, in _send_channels_values
20:17:01 | self._backend.send_channels_values(self, channels_with_values)
20:17:01 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/utils.py", line 221, in wrapper
20:17:01 | raise ServerError()
20:17:01 | neptune.api_exceptions.ServerError: Server error. Please try again later.
20:18:56 | Traceback (most recent call last):
20:18:56 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/utils.py", line 210, in wrapper
20:18:56 | return func(*args, **kwargs)
20:18:56 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/internal/backends/hosted_neptune_backend.py", line 556, in send_channels_values
20:18:56 | channelsValues=input_channels_values
20:18:56 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/bravado/http_future.py", line 239, in response
20:18:56 | six.reraise(*sys.exc_info())
20:18:56 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/six.py", line 693, in reraise
20:18:56 | raise value
20:18:56 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/bravado/http_future.py", line 200, in response
20:18:56 | swagger_result = self._get_swagger_result(incoming_response)
20:18:56 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/bravado/http_future.py", line 124, in wrapper
20:18:56 | return func(self, *args, **kwargs)
20:18:56 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/bravado/http_future.py", line 303, in _get_swagger_result
20:18:56 | self.request_config.response_callbacks,
20:18:56 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/bravado/http_future.py", line 334, in unmarshal_response
20:18:56 | raise_on_unexpected(incoming_response)
20:18:56 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/bravado/http_future.py", line 408, in raise_on_unexpected
20:18:56 | raise make_http_exception(response=http_response)
20:18:56 | bravado.exception.HTTPInternalServerError: 500 : {"errorType":"INTERNAL_SERVER_ERROR","code":500,"title":"Internal Server Error (e50dc164b5c)"}
20:18:56 | During handling of the above exception, another exception occurred:
20:18:56 | Traceback (most recent call last):
20:18:56 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/internal/channels/channels_values_sender.py", line 156, in _send_values
20:18:56 | self._experiment._send_channels_values(channels_with_values)
20:18:56 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/experiments.py", line 1138, in _send_channels_values
20:18:56 | self._backend.send_channels_values(self, channels_with_values)
20:18:56 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/utils.py", line 221, in wrapper
20:18:56 | raise ServerError()
20:18:56 | neptune.api_exceptions.ServerError: Server error. Please try again later.
20:19:02 | E1023 04:09:49.150690 35187409351088 channels_values_sender.py:164] Failed to send channel value.
20:19:02 | Traceback (most recent call last):
20:19:02 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/utils.py", line 210, in wrapper
20:19:02 | return func(*args, **kwargs)
20:19:02 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/internal/backends/hosted_neptune_backend.py", line 556, in send_channels_values
20:19:02 | channelsValues=input_channels_values
20:19:02 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/bravado/http_future.py", line 239, in response
20:19:02 | six.reraise(*sys.exc_info())
20:19:02 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/six.py", line 693, in reraise
20:19:02 | raise value
20:19:02 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/bravado/http_future.py", line 200, in response
20:19:02 | swagger_result = self._get_swagger_result(incoming_response)
20:19:02 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/bravado/http_future.py", line 124, in wrapper
20:19:02 | return func(self, *args, **kwargs)
20:19:02 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/bravado/http_future.py", line 303, in _get_swagger_result
20:19:02 | self.request_config.response_callbacks,
20:19:02 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/bravado/http_future.py", line 334, in unmarshal_response
20:19:02 | raise_on_unexpected(incoming_response)
20:19:02 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/bravado/http_future.py", line 408, in raise_on_unexpected
20:19:02 | raise make_http_exception(response=http_response)
20:19:02 | bravado.exception.HTTPInternalServerError: 500 : {"errorType":"INTERNAL_SERVER_ERROR","code":500,"title":"Internal Server Error (d98440aac5e)"}
20:19:02 | During handling of the above exception, another exception occurred:
20:19:02 | Traceback (most recent call last):
20:19:02 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/internal/channels/channels_values_sender.py", line 156, in _send_values
20:19:02 | self._experiment._send_channels_values(channels_with_values)
20:19:02 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/experiments.py", line 1138, in _send_channels_values
20:19:02 | self._backend.send_channels_values(self, channels_with_values)
20:19:02 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/utils.py", line 221, in wrapper
20:19:02 | raise ServerError()
20:19:02 | neptune.api_exceptions.ServerError: Server error. Please try again later.
20:19:06 | E1023 04:09:53.008028 35187409351088 channels_values_sender.py:164] Failed to send channel value.
20:19:06 | Traceback (most recent call last):
20:19:06 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/utils.py", line 210, in wrapper
20:19:06 | return func(*args, **kwargs)
20:19:06 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/internal/backends/hosted_neptune_backend.py", line 556, in send_channels_values
20:19:06 | channelsValues=input_channels_values
20:19:06 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/bravado/http_future.py", line 239, in response
20:19:06 | six.reraise(*sys.exc_info())
20:19:06 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/six.py", line 693, in reraise
20:19:06 | raise value
20:19:06 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/bravado/http_future.py", line 200, in response
20:19:06 | swagger_result = self._get_swagger_result(incoming_response)
20:19:06 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/bravado/http_future.py", line 124, in wrapper
20:19:06 | return func(self, *args, **kwargs)
20:19:06 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/bravado/http_future.py", line 303, in _get_swagger_result
20:19:06 | self.request_config.response_callbacks,
20:19:06 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/bravado/http_future.py", line 334, in unmarshal_response
20:19:06 | raise_on_unexpected(incoming_response)
20:19:06 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/bravado/http_future.py", line 408, in raise_on_unexpected
20:19:06 | raise make_http_exception(response=http_response)
20:19:06 | bravado.exception.HTTPInternalServerError: 500 : {"errorType":"INTERNAL_SERVER_ERROR","code":500,"title":"Internal Server Error (b34b09c679a)"}
20:19:06 | During handling of the above exception, another exception occurred:
20:19:06 | Traceback (most recent call last):
20:19:06 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/internal/channels/channels_values_sender.py", line 156, in _send_values
20:19:06 | self._experiment._send_channels_values(channels_with_values)
20:19:06 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/experiments.py", line 1138, in _send_channels_values
20:19:06 | self._backend.send_channels_values(self, channels_with_values)
20:19:06 | File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/utils.py", line 221, in wrapper
20:19:06 | raise ServerError()
20:19:06 | neptune.api_exceptions.ServerError: Server error. Please try again later.
Logs from my machine (jobs are still working, but not logging anything)
Traceback (most recent call last):
File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/internal/threads/ping_thread.py", line 37, in run
self.__backend.ping_experiment(self.__experiment)
File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/utils.py", line 210, in wrapper
return func(*args, **kwargs)
File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/internal/backends/hosted_neptune_backend.py", line 611, in ping_experiment
self.backend_swagger_client.api.pingExperiment(experimentId=experiment.internal_id).response()
File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/bravado/client.py", line 279, in __call__
request_config=request_config,
File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/bravado/requests_client.py", line 399, in request
self.authenticated_request(sanitized_params),
File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/bravado/requests_client.py", line 440, in authenticated_request
return self.apply_authentication(requests.Request(**request_params))
File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/bravado/requests_client.py", line 445, in apply_authentication
return self.authenticator.apply(request)
File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/oauth.py", line 90, in apply
self.auth.refresh_token_if_needed()
File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/utils.py", line 210, in wrapper
return func(*args, **kwargs)
File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/oauth.py", line 51, in refresh_token_if_needed
self._refresh_token()
File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/neptune/oauth.py", line 54, in _refresh_token
self.session.refresh_token(self.session.auto_refresh_url)
File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/requests_oauthlib/oauth2_session.py", line 446, in refresh_token
self.token = self._client.parse_request_body_response(r.text, scope=self.scope)
File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/oauthlib/oauth2/rfc6749/clients/base.py", line 421, in parse_request_body_response
self.token = parse_token_response(body, scope=scope)
File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/oauthlib/oauth2/rfc6749/parameters.py", line 431, in parse_token_response
validate_token_parameters(params)
File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/oauthlib/oauth2/rfc6749/parameters.py", line 438, in validate_token_parameters
raise_from_error(params.get('error'), params)
File "/gpfs/share/skynet/apps/anaconda3/envs/wmlce_env_1.6.1/lib/python3.6/site-packages/oauthlib/oauth2/rfc6749/errors.py", line 405, in raise_from_error
raise cls(**kwargs)
oauthlib.oauth2.rfc6749.errors.InvalidGrantError: (invalid_grant) Offline user session not found
I have some metrics that are logged every step and others that are logged every 10 steps; unfortunately, this makes the graphs non-comparable in the interface.
Could neptune.send_metric
take a step count, so that both graphs become comparable when displayed?
Thanks
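If I read the legacy client correctly, send_metric can take an explicit x value alongside y, which could serve as the shared step axis. Below is a minimal sketch of that call pattern; the StubExperiment class is hypothetical, standing in for a real Neptune experiment, and the assumed signature send_metric(name, x, y) should be verified against the client docs.

```python
# Sketch: pass an explicit step (x) with every value (y) so metrics logged
# at different frequencies share the same x-axis. StubExperiment is a
# hypothetical stand-in that records what a real experiment would receive.

class StubExperiment:
    """Collects (x, y) pairs per channel, mimicking send_metric(name, x, y)."""
    def __init__(self):
        self.channels = {}

    def send_metric(self, name, x, y):
        self.channels.setdefault(name, []).append((x, y))

exp = StubExperiment()
for step in range(20):
    exp.send_metric("loss_every_step", step, 1.0 / (step + 1))
    if step % 10 == 0:
        # Logged less often, but anchored to the same step axis.
        exp.send_metric("metric_every_10_steps", step, step * 0.5)

print(exp.channels["metric_every_10_steps"])  # [(0, 0.0), (10, 5.0)]
```

With both channels keyed by the same step value, the charts line up even though one channel has ten times fewer points.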
Currently, when fetching experiment data I have two options:
exp = project.get_experiment(id=['PROJ-28'])[0]
exp.get_numeric_channel_values('auc_train')
exp.get_numeric_channel_values('auc_train', 'auc_valid')
I would like to be able to pass a list without unpacking it.
Today:
channel_list = ['auc_train', 'auc_valid']
exp.get_numeric_channel_values(*channel_list)
I would like to:
channel_list = ['auc_train', 'auc_valid']
exp.get_numeric_channel_values(channel_list)
What do you think?
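As a user-side workaround until the API accepts lists directly, a thin wrapper can accept either a list or plain varargs and unpack before calling the real method. A sketch (the wrapper name is mine; only get_numeric_channel_values comes from the client):

```python
# Workaround sketch: accept either a list of channel names or varargs,
# then forward to the real API by unpacking. The wrapper name is hypothetical.

def get_channel_values(exp, *channels):
    # Allow a single list/tuple argument as well as plain varargs.
    if len(channels) == 1 and isinstance(channels[0], (list, tuple)):
        channels = tuple(channels[0])
    return exp.get_numeric_channel_values(*channels)

# Demonstration with a stub in place of a real experiment object:
class StubExp:
    def get_numeric_channel_values(self, *names):
        return list(names)

exp = StubExp()
assert get_channel_values(exp, ['auc_train', 'auc_valid']) == ['auc_train', 'auc_valid']
assert get_channel_values(exp, 'auc_train', 'auc_valid') == ['auc_train', 'auc_valid']
```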
Hi,
Just discovered neptune.ai and it's really great!
I have the following error:
Failed to send channel value: SSL certificate validation failed. Set NEPTUNE_ALLOW_SELF_SIGNED_CERTIFICATE environment variable to accept self-signed certificates.
Even though I set NEPTUNE_ALLOW_SELF_SIGNED_CERTIFICATE
in my .bashrc,
neptune-client still throws this error.
.bashrc:
export NEPTUNE_ALLOW_SELF_SIGNED_CERTIFICATE=True
python:
Python 3.7.7 (default, Mar 10 2020, 15:16:38)
[GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.getenv("NEPTUNE_ALLOW_SELF_SIGNED_CERTIFICATE")
'True'
>>>
System:
ubuntu 18.04
neptune-client 0.4.116
python 3.7.7
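One way to rule out shell-sourcing issues (e.g. the variable being exported only in interactive shells, so a job launched elsewhere never sees it) is to set the variable in-process before neptune is imported. A sketch; whether the client accepts "TRUE" vs "True" and exactly when it reads the variable are assumptions to check against the docs:

```python
import os

# Set the flag before any neptune import so the client sees it at import/init
# time. (Assumption: neptune-client reads this environment variable when the
# backend is created; a value set only in .bashrc may be invisible to jobs
# started outside that shell.)
os.environ["NEPTUNE_ALLOW_SELF_SIGNED_CERTIFICATE"] = "TRUE"

# import neptune  # import only after the variable is set

print(os.environ["NEPTUNE_ALLOW_SELF_SIGNED_CERTIFICATE"])
```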
Hello!
First off, I absolutely love the interface and the easy integration/documentation you provide!! And hats off to @jakubczakon, who provided great tutorials! However, I stumbled on a problem when retrieving the best results: I saved artifacts, but I'm not able to retrieve them.
>>> exp = project.get_experiments('BAY-59')[0]
>>> artifacts = exp.download_artifacts()
>>> print(artifacts)
None
And if I try retrieving the best results via project.get_leaderboard('BAY-59'), the result is a string.
Problem
my code
PARAMS = {
    'lr': 0.0005,
    'dropout': 0.2,
    'batch_size': 64,
    'optimizer': 'adam',
}
project = neptune.Session().get_project('kamil/Tensor-Cell-Demo')
npt_exp = project.create_experiment(
    name='neural-net-mnist',
    params=PARAMS,
)
Now, when I do npt_exp.get_parameters()['batch_size']
, a string is returned, while it should be an int.
Solution
npt_exp.get_parameters()
should return a dict with the original types.
What do you guys think: @pitercl @lukasz-walkiewicz
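Until get_parameters() preserves types, one user-side workaround is to coerce the string values back with the stdlib ast.literal_eval, falling back to the raw string for non-literals. A sketch (the raw dict below mirrors what the API returns for the PARAMS above; restore_types is my helper name):

```python
import ast

# get_parameters() currently returns every value as a string; this helper
# tries to recover the original Python type, keeping the raw string when the
# value is not a Python literal (e.g. 'adam').
def restore_types(params):
    restored = {}
    for key, value in params.items():
        try:
            restored[key] = ast.literal_eval(value)
        except (ValueError, SyntaxError):
            restored[key] = value
    return restored

raw = {'lr': '0.0005', 'dropout': '0.2', 'batch_size': '64', 'optimizer': 'adam'}
params = restore_types(raw)
print(type(params['batch_size']), params['batch_size'])  # <class 'int'> 64
```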
When creating the experiment I need to list files by name:
neptune.create_experiment(upload_source_files=['main.py', 'utils.py', 'config.yaml'])
I would like to be able to just use glob patterns. For example:
neptune.create_experiment(upload_source_files=['*.py', '*.yaml'])
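Until the client supports wildcards natively, the patterns can be expanded with the stdlib glob module before the call. A sketch (expand_patterns is my helper name; the create_experiment call is commented out):

```python
import glob
from itertools import chain

# Expand wildcard patterns into a concrete, de-duplicated file list with the
# stdlib glob module, then pass the result to create_experiment as usual.
def expand_patterns(patterns):
    return sorted(set(chain.from_iterable(glob.glob(p) for p in patterns)))

source_files = expand_patterns(['*.py', '*.yaml'])
# neptune.create_experiment(upload_source_files=source_files)
print(source_files)
```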
Given neptune-client
newly installed on macOS Catalina:
pip3 install neptune-client
when importing neptune:
python3 -c "import neptune"
Python crashes with:
Path: /usr/local/Cellar/python/3.7.4_1/Frameworks/Python.framework/Versions/3.7/Resources/Python.app/Contents/MacOS/Python
Identifier: Python
Version: 3.7.4 (3.7.4)
Code Type: X86-64 (Native)
Parent Process: Python [7526]
Responsible: Terminal [7510]
User ID: 501
Date/Time: 2019-10-07 20:59:20.675 +0530
OS Version: Mac OS X 10.15 (19A582a)
Report Version: 12
Anonymous UUID: CB7F20F6-96C0-4F63-9EC5-AFF3E0989687
Time Awake Since Boot: 3000 seconds
System Integrity Protection: enabled
Crashed Thread: 0 Dispatch queue: com.apple.main-thread
Exception Type: EXC_CRASH (SIGABRT)
Exception Codes: 0x0000000000000000, 0x0000000000000000
Exception Note: EXC_CORPSE_NOTIFY
Application Specific Information:
/usr/lib/libcrypto.dylib
abort() called
Invalid dylib load. Clients should not load the unversioned libcrypto dylib as it does not have a stable ABI.
This seems to be exactly the problem described in this SO question.
I was able to simply fix the problem by following this SO answer and doing:
pip3 uninstall cryptography
pip3 install cryptography
Not sure what the solution would be, other than noting this in the README, or even alerting the user on stdout during import, if that is at all possible.
Hi,
Sending metrics when resuming an experiment works fine. However, the experiment state does not change from "succeeded" back to "running", nor does it log system information (CPU/GPU) or stdout when resuming an experiment. Code example:
import time
import neptune
from neptune.sessions import Session
# Initialize experiment
neptune.init(project_qualified_name='Test/project')
experiment = neptune.create_experiment(name='test')
experiment_id = experiment.id
# Send metrics
for i in range(10):
    time.sleep(2)
    print('logging this to STDOUT channel.')
    experiment.send_metric('iter', i)
experiment.stop()
# Resume experiment
session = Session()
project = session.get_project(project_qualified_name='Test/project')
experiment = project.get_experiments(id=experiment_id)[0]
# Send metrics
for i in range(10):
    time.sleep(2)
    print('Unable to log this to STDOUT channel.')
    experiment.send_metric('iter', 3*i)
experiment.stop()
It seems that when I send a metric through the send_metric
method, the value gets sent as the x dimension, not the y dimension.
It is later visualized as if it were y.
I've noticed that the python logging module only writes to Neptune's STDOUT channel when I initialize the Neptune experiment first. Like so:
import logging
import sys
import neptune
neptune.init(project_qualified_name='test_project')
experiment = neptune.create_experiment(name='test')
logger = logging.getLogger()
logger.setLevel(logging.INFO)
handler = logging.StreamHandler(sys.stdout)
handler.setLevel(logging.INFO)
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
handler.setFormatter(formatter)
logger.addHandler(handler)
logger.info('Initializing experiment')
for i in range(5):
    experiment.send_metric('iter', i)
    logger.info('Iteration: {}'.format(i))
logger.info('Wrapping up experiment')
experiment.stop()
The STDOUT channel of my Neptune experiment does not show the output of the logger instance when I initialize things in the reverse order:
import logging
import sys
import neptune
logger = logging.getLogger()
logger.setLevel(logging.INFO)
handler = logging.StreamHandler(sys.stdout)
handler.setLevel(logging.INFO)
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
handler.setFormatter(formatter)
logger.addHandler(handler)
neptune.init(project_qualified_name='test_project')
experiment = neptune.create_experiment(name='test')
logger.info('Initializing experiment')
for i in range(5):
    experiment.send_metric('iter', i)
    logger.info('Iteration: {}'.format(i))
logger.info('Wrapping up experiment')
experiment.stop()
A solution to this, if possible, would benefit me in a project with a complicated sequence of class initializations.
Hi,
the default value of upload_source_files
in create_experiment
is None
, which is not an iterable. Therefore the following loop causes an error:
for filepath in upload_source_files:
    expanded_source_files |= set(glob.glob(filepath))
(line 391-392 in neptune/projects.py)
Running code with:
project.create_experiment(name='foo', upload_source_files=[])
does solve this problem, yet is not an elegant solution.
According to the doc-string, it is an optional argument:
upload_source_files (:obj:`list`, optional, default is ``['main.py']``):
Could you please fix this issue?
Thanks in advance!
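A defensive one-line change, in the library or in a user-side wrapper, would be to treat None as an empty iterable before looping. A sketch of the pattern (collect_source_files is my name for it, mirroring the loop at lines 391-392 of neptune/projects.py):

```python
import glob

# Sketch of the fix: coerce None to an empty iterable before iterating, so the
# default upload_source_files=None no longer raises TypeError.
def collect_source_files(upload_source_files=None):
    expanded_source_files = set()
    for filepath in (upload_source_files or []):
        expanded_source_files |= set(glob.glob(filepath))
    return expanded_source_files

print(collect_source_files())      # set() -- no crash on the default None
print(collect_source_files(None))  # set()
```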
I have code like this:
from torch.utils.data import DataLoader
# ...
def main():
    # ...
    train_loader = DataLoader(
        datasets.train,
        shuffle=True,
        batch_size=ARGS.batch_size,
        num_workers=ARGS.num_workers,
        pin_memory=True,
    )
where num_workers
determines how many worker processes PyTorch uses to read the data. If it is set to 0, PyTorch reads the data in the main process. On macOS, setting it to any value other than 0 seems to mess up Neptune (the problem does not seem to appear on Linux; I have not tried Windows).
What seems to happen is that neptune starts a new experiment for each worker (this is with num_workers=4
):
https://ui.neptune.ai/tmk/fcm/e/FCM-38
https://ui.neptune.ai/tmk/fcm/e/FCM-39
https://ui.neptune.ai/tmk/fcm/e/FCM-40
https://ui.neptune.ai/tmk/fcm/e/FCM-41
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/multiprocessing/spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/multiprocessing/spawn.py", line 125, in _main
prepare(preparation_data)
File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/multiprocessing/spawn.py", line 236, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/runpy.py", line 263, in run_path
return _run_module_code(code, init_globals, run_name,
File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/runpy.py", line 96, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/Users/tk324/PycharmProjects/fair-dist-matching/run_clust.py", line 4, in <module>
main()
File "/Users/tk324/PycharmProjects/fair-dist-matching/clustering/optimisation/train.py", line 170, in main
input_shape = get_data_dim(context_loader)
File "/Users/tk324/PycharmProjects/fair-dist-matching/shared/utils/utils.py", line 44, in get_data_dim
x = next(iter(data_loader))[0]
File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 291, in __iter__
return _MultiProcessingDataLoaderIter(self)
File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 737, in __init__
w.start()
File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/multiprocessing/process.py", line 121, in start
self._popen = self._Popen(self)
File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/multiprocessing/context.py", line 283, in _Popen
return Popen(process_obj)
File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
super().__init__(process_obj)
File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
self._launch(process_obj)
File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 42, in _launch
prep_data = spawn.get_preparation_data(process_obj._name)
File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/multiprocessing/spawn.py", line 154, in get_preparation_data
_check_not_importing_main()
File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/multiprocessing/spawn.py", line 134, in _check_not_importing_main
raise RuntimeError('''
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
Traceback (most recent call last):
File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 779, in _try_get_data
data = self._data_queue.get(timeout=timeout)
File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/multiprocessing/queues.py", line 107, in get
if not self._poll(timeout):
File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/multiprocessing/connection.py", line 257, in poll
return self._poll(timeout)
File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/multiprocessing/connection.py", line 424, in _poll
r = wait([self], timeout)
File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/multiprocessing/connection.py", line 930, in wait
ready = selector.select(timeout)
File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/selectors.py", line 415, in select
fd_event_list = self._selector.poll(timeout)
File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/site-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
_error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 53749) exited unexpectedly with exit code 1. Details are lost due to multiprocessing. Rerunning with num_workers=0 may give better error trace.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "run_clust.py", line 4, in <module>
main()
File "/Users/tk324/PycharmProjects/fair-dist-matching/clustering/optimisation/train.py", line 170, in main
input_shape = get_data_dim(context_loader)
File "/Users/tk324/PycharmProjects/fair-dist-matching/shared/utils/utils.py", line 44, in get_data_dim
x = next(iter(data_loader))[0]
File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 363, in __next__
data = self._next_data()
File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 974, in _next_data
idx, data = self._get_data()
File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 941, in _get_data
success, data = self._try_get_data()
File "/Users/tk324/anaconda/envs/pytorch/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 792, in _try_get_data
raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str))
RuntimeError: DataLoader worker (pid(s) 53749) exited unexpectedly
With num_workers=0, this runs fine.
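On macOS, Python 3.8 uses the spawn start method by default, so each DataLoader worker re-imports the main module; any module-level neptune.init/create_experiment then runs once per worker, which would explain the duplicate experiments. The usual fix is to keep experiment creation behind an `if __name__ == "__main__":` guard or inside main(). A sketch (the neptune and torch calls are commented out, and the return value is only for illustration):

```python
# Sketch: keep experiment creation out of module import scope so spawned
# DataLoader workers, which re-import this module, do not create new runs.

def main():
    # neptune.init(...)                # runs only in the parent process
    # neptune.create_experiment(...)
    # train_loader = DataLoader(..., num_workers=4)
    return "experiment created once"

if __name__ == "__main__":
    main()
```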
Hello,
It's a pleasure working with Neptune, thanks.
I am trying to log a StringIO as an artifact. I wish to avoid saving a temporary file just to log it and then remove it.
Neptune version: 0.4.126
It was installed using Conda on Ubuntu 18.04.
Python version 3.7.8
Here is a minimal example to reproduce:
import neptune
from io import StringIO
summary_string_io = StringIO()
summary_string_io.write("something, something.")
neptune.init('a/b')
neptune.create_experiment(name='minimal_example')
neptune.send_artifact(summary_string_io, destination="summary.txt")
The error I'm getting is (I omitted some of my paths):
Traceback (most recent call last):
File "", line 12, in
neptune.send_artifact(summary_string_io, destination="summary.txt")
File "lib/python3.7/site-packages/neptune/init.py", line 355, in send_artifact
return get_experiment().log_artifact(artifact, destination)
File "lib/python3.7/site-packages/neptune/experiments.py", line 620, in log_artifact
experiment=self)
File "lib/python3.7/site-packages/neptune/internal/storage/storage_utils.py", line 230, in upload_to_storage
upload_api_fun(**dict(kwargs, data=file_chunk_stream, progress_indicator=progress_indicator))
File "lib/python3.7/site-packages/neptune/internal/backends/hosted_neptune_backend.py", line 692, in upload_experiment_output
query_params={})
File "lib/python3.7/site-packages/neptune/internal/backends/hosted_neptune_backend.py", line 847, in _upload_loop
ret = with_api_exceptions_handler(self._upload_loop_chunk)(fun, part, data, **kwargs)
File "lib/python3.7/site-packages/neptune/utils.py", line 211, in wrapper
return func(*args, **kwargs)
File "lib/python3.7/site-packages/neptune/internal/backends/hosted_neptune_backend.py", line 865, in _upload_loop_chunk
response = fun(data=part.get_data(), headers=headers, **kwargs)
File "lib/python3.7/site-packages/neptune/internal/storage/datastream.py", line 34, in get_data
return io.BytesIO(self.data)
TypeError: a bytes-like object is required, not 'str'
I tried looking for examples but I couldn't find any.
Thank you,
Ben.
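The upload path apparently expects a bytes-like stream, so one workaround is to encode the StringIO contents into a BytesIO before logging. A sketch reproducing the minimal example above (the send_artifact call itself is commented out, since whether it accepts a BytesIO this way is an assumption to verify):

```python
import io

# Workaround sketch: the client's upload path expects bytes, so encode the
# text buffer into a BytesIO before passing it as an artifact.
summary_string_io = io.StringIO()
summary_string_io.write("something, something.")

summary_bytes_io = io.BytesIO(summary_string_io.getvalue().encode("utf-8"))
# neptune.send_artifact(summary_bytes_io, destination="summary.txt")

print(summary_bytes_io.getvalue())  # b'something, something.'
```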
This didn't happen before with the same code; I am using the fastai Neptune callback.
Traceback (most recent call last):
File "finetune.py", line 116, in <module>
neptune.init(project_qualified_name='natsume/electra-glue')
File "/home/yisiang/miniconda3/envs/ml/lib/python3.7/site-packages/neptune/__init__.py", line 148, in init
backend = HostedNeptuneBackend(api_token, proxies)
File "/home/yisiang/miniconda3/envs/ml/lib/python3.7/site-packages/neptune/utils.py", line 210, in wrapper
return func(*args, **kwargs)
File "/home/yisiang/miniconda3/envs/ml/lib/python3.7/site-packages/neptune/internal/backends/hosted_neptune_backend.py", line 91, in __init__
self._client_config = self._create_client_config(self.credentials.api_token, backend_client)
File "/home/yisiang/miniconda3/envs/ml/lib/python3.7/site-packages/neptune/utils.py", line 210, in wrapper
return func(*args, **kwargs)
File "/home/yisiang/miniconda3/envs/ml/lib/python3.7/site-packages/neptune/internal/backends/hosted_neptune_backend.py", line 930, in _create_client_config
config = backend_client.api.getClientConfig(X_Neptune_Api_Token=api_token).response().result
File "/home/yisiang/miniconda3/envs/ml/lib/python3.7/site-packages/bravado/http_future.py", line 200, in response
swagger_result = self._get_swagger_result(incoming_response)
File "/home/yisiang/miniconda3/envs/ml/lib/python3.7/site-packages/bravado/http_future.py", line 124, in wrapper
return func(self, *args, **kwargs)
File "/home/yisiang/miniconda3/envs/ml/lib/python3.7/site-packages/bravado/http_future.py", line 303, in _get_swagger_result
self.request_config.response_callbacks,
File "/home/yisiang/miniconda3/envs/ml/lib/python3.7/site-packages/bravado/http_future.py", line 337, in unmarshal_response
op=operation,
File "/home/yisiang/miniconda3/envs/ml/lib/python3.7/site-packages/bravado/http_future.py", line 374, in unmarshal_response_inner
content_value = response.json()
File "/home/yisiang/miniconda3/envs/ml/lib/python3.7/site-packages/bravado/requests_client.py", line 160, in json
return self._delegate.json(**kwargs)
File "/home/yisiang/miniconda3/envs/ml/lib/python3.7/site-packages/requests/models.py", line 898, in json
return complexjson.loads(self.text, **kwargs)
File "/home/yisiang/miniconda3/envs/ml/lib/python3.7/site-packages/simplejson/__init__.py", line 525, in loads
return _default_decoder.decode(s)
File "/home/yisiang/miniconda3/envs/ml/lib/python3.7/site-packages/simplejson/decoder.py", line 370, in decode
obj, end = self.raw_decode(s)
File "/home/yisiang/miniconda3/envs/ml/lib/python3.7/site-packages/simplejson/decoder.py", line 400, in raw_decode
return self.scan_once(s, idx=_w(s, idx).end())
simplejson.errors.JSONDecodeError: Expecting value: line 1 column 1 (char 0)