
Reputation
Badges 1
119 × Eureka!` [package_manager.force_repo_requirements_txt=true] Skipping requirements, using repository "requirements.txt"
Using base prefix '/opt/conda'
New python executable in /home/ramon/.clearml/venvs-builds/3.7/bin/python3.7
Also creating executable in /home/ramon/.clearml/venvs-builds/3.7/bin/python
Installing setuptools, pip, wheel...
2021-06-10 09:57:56
done.
Collecting pip<20.2
Using cached pip-20.1.1-py2.py3-none-any.whl (1.5 MB)
Installing collected packages: pip
Attempting uninstall: p...
AgitatedDove14 task.set_archived(True)
+ the cleanup service should do it π If we run in debug mode the experiment goes directly to the archive and gets cleaned and we donβt pollute the main experiment page.
I feel itβs easier not to report than cleaning after but please correct me if I am overthinking it. Iβll check if I could wrap the code in something that calls the Task.delete if debugging
Yes! I think thats what I will do π Let me know if there is a way to contribute a mode to keep logging off. We just donβt want to pollute the server when debugging.
Thanks SuccessfulKoala55 !
I am using pytorch_lightning
, I'll try to create a snippet I can share! Thanks π
Hey CostlyOstrich36 ! I am using clearml==1.1.2
and clearml-agent==1.1.0
. Stopped is not the right word, more like frozen, it just froze at an epoch. The console on the agent shows epoch 33 first batch and the one at the server epoch 32 last batch. The experiment was running for ~6 hours.
` File "/home/ramon/.trains/venvs-builds/3.7/lib/python3.7/site-packages/trains/backend_api/session/token_manager.py", line 72, in _get_token_exp
return jwt.decode(token, verify=False).get('exp', sys.maxsize)
File "/home/ramon/.trains/venvs-builds/3.7/lib/python3.7/site-packages/jwt/api_jwt.py", line 113, in decode
decoded = self.decode_complete(jwt, key, algorithms, options, **kwargs)
File "/home/ramon/.trains/venvs-builds/3.7/lib/python3.7/site-packages/jwt/api_jwt.py", line 80, in decode_c...
Hey AgitatedDove14 do you have an implementation for gcloud? this is awesome
Not yet AgitatedDove14 , does the agent use by default the python version the command is run with? I installed conda and tried using package_manager.type=conda
but then get an error:clearml_agent: ERROR: 'NoneType' object has no attribute 'lower'
With pip
I get the first error I showed, I tried conda
and it starts running but at some point crashes with:clearml_agent: ERROR: 'NoneType' object has no attribute 'lower'
My bad :man-facepalming: It was just specifying weights_path=dirpath
since the first argument is weights_filename
Managed to get:
clearml_agent: ERROR: Command '['/home/ramon/.clearml/venvs-builds/3.9/bin/python', '-m', 'pip', '--disable-pip-version-check', 'install', '-r', '/var/tmp/requirements_tb0x2i3j.txt', '--extra-index-url', '
died with <Signals.SIGKILL: 9>.
while building the task with the id on the agent
Yes! What env variables should I pass
Yes AgitatedDove14 , I am not sure what they use by default. Here is a simple working example:
` from typing import Optional
import torch
from clearml import Task
from pytorch_lightning import LightningDataModule, LightningModule
from pytorch_lightning.utilities.cli import LightningCLI
from torch.utils.data import DataLoader, Dataset, Subset
class RandomDataset(Dataset):
def init(self, size, length):
self.len = length
self.data = torch.randn(length, size)
def ...
There are also ways to override the parameters as stated https://pytorch-lightning.readthedocs.io/en/latest/common/lightning_cli.html#use-of-command-line-arguments .
AgitatedDove14 Downloading a dataset would not be possible using this right? I want to be able to access the data just avoid reporting the experiment results
Works like a charm π thanks!
No, I have all the packages with a version. I just want to know if there is a way to override the requirements versions detected by Pigar when using detect_with_pip_freeze: false
. I have locally cloudpickle==1.4.1
but when running the code and sending the task to the node the environment uses cloudpickle==1.6.0
. I have to manually change the version on the UI. Is there a way to force this single package to have a version? Maybe on the requirments.txt or something similar
Thats really cool! But I would still prefer avoid using pip_freeze, is there a way?
AgitatedDove14 I am not sure why the packages get different versions, maybe since the package is not directly imported in my code it is possible to get a different version to what I have locally (?). Should all the libraries versions match exactly between local and the code that runs in the agent? The Task.add_requirements(package_name, package_version=None)
workaround works perfectly! I just add the previous version that doesnβt break the code. Yes, definitely a force flag could help ...