Hi CleanPigeon16
Put the specific git repository into the "installed packages" section.
It should look like: ... git+ ... (no need for the specific commit, you can just take the latest)
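For illustration only, a hypothetical entry (made-up repository URL, use your own):
git+https://github.com/your-org/your-repo.git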
I assume ClearML has some period of time after which it shows this message. Am I right?
Yes you are 🙂
is this configurable?
It is 🙂
task.set_resource_monitor_iteration_timeout(seconds_from_start=1800)
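A minimal sketch of where that call goes, assuming a task created with Task.init (project/task names are made up):
from clearml import Task

task = Task.init(project_name="examples", task_name="slow warmup")
# give the run up to 30 minutes to report its first iteration before
# the resource monitor message shows up
task.set_resource_monitor_iteration_timeout(seconds_from_start=1800)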
BTW:
Error response from daemon: cannot set both Count and DeviceIDs on device request.
Googling it points to a docker issue (which makes sense considering):
https://github.com/NVIDIA/nvidia-docker/issues/1026
What is the host OS?
Ohh so even easier:
print(client.workers.get_all())
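A self-contained sketch, assuming the APIClient bundled with clearml:
from clearml.backend_api.session.client import APIClient

client = APIClient()
# list all workers currently registered with the server
print(client.workers.get_all())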
Hi MelancholyElk85
Can I manually delete .zip files with datasets in the .clearml/cache/storage_manager/datasets directory?
Yes, you can. I "think" the .zip is stored for easier access, but you can delete it; as long as the "extracted" folder exists, it should be fine.
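If it helps, a hedged sketch of that cleanup; the assumption that each extracted folder shares its archive's name is mine, so verify it against your cache layout first:
from pathlib import Path

cache = Path.home() / ".clearml/cache/storage_manager/datasets"
for zip_file in cache.glob("*.zip"):
    extracted = zip_file.with_suffix("")  # assumed naming: "<name>.zip" -> "<name>"
    if extracted.is_dir():  # only delete when the extracted copy exists
        zip_file.unlink()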
Hmm is "model_monitoring_eps" another version of the model and it does not have all the properties of the "original" one?
Hmm are you getting the warning on the client side, or in the clearml-server?
SoggyFrog26 there is a full pythonic interface, why don't you use this one instead, much cleaner 🙂
Yes the clearml-server AMI - we want to be able to back it up and encrypt it on our account
I think the easiest and safest way for you is to actually have full control over the AMI, and recreate it once from scratch.
Basically any Ubuntu/CentOS + docker and docker-compose should do the trick, wdyt?
/opt/clearml/data/fileserver is on the host machine, and it is mounted into the container at /mnt/fileserver
ThickDove42 Windows conda python3.6 was exactly what I was using,
started the jupyter with:
"python -m jupyter notebook"
Then opened / created a new notebook, everything worked.
Tested on latest clearml 0.17.2
Maybe it's something with the path to the repo that breaks it? Because obviously the issue is it is looking in the wrong folder.
Hmm... that's what happens with the exception of None/'' if the type is str... There is no way to differentiate between them in the UI.
This is why we opted for type=str: it will "cast" everything to str so you always get a str, while not specifying a type will leave the variable as-is... If you have an idea on how to support both, feel free to suggest 🙂
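For illustration, a minimal argparse sketch of the two behaviors (argument names are made up):
import argparse

parser = argparse.ArgumentParser()
# with type=str, whatever comes back from the UI is cast to str,
# so None and '' cannot be told apart there
parser.add_argument("--suffix", type=str, default=None)
# with no type, the value is left as-is (None stays None)
parser.add_argument("--prefix", default=None)
args = parser.parse_args()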
I have to assume that I do not know the dataset ID
Sorry I mean:
from clearml import Dataset

datasets = Dataset.list_datasets(dataset_project="some_project")
for d in datasets:
    d["version"] = Dataset.get(dataset_id=d["id"]).version
wdyt?
So could you re-explain, assuming my pipeline object is created by pipeline = PipelineController(...)?
pipe.add_step(
    name='stage_train',
    parents=['stage_process', ],
    monitor_artifact=['my_created_artifact'],
    base_task_project='examples',
    base_task_name='pipeline step 3 train model',
    parameter_override={'General/dataset_task_id': '${stage_process.id}'},
)
This will put the artifact named "my_created_artifact" from the step Task...
So are you saying the large file size download is the issue? (i.e. network issues)
MysteriousBee56
Well we don't want to ask for sudo permission automatically, and usually setups do not change, but you can definitely call this one before running the agent 🙂
sudo chmod 777 -R ~/.trains/
Very lacking wrt how things interact with one another
If I'm reading it correctly, what you are saying is that some of the "big picture" / holistic approach on how different parts interact with one another is missing, is that correct?
I think ClearML would benefit itself a lot if it adopted a documentation structure similar to numpy ecosystem
Interesting thought, what exactly would you suggest we "borrow" in terms of approach?
Hi UnevenDolphin73
Does ClearML somehow remove any loggers from the logging module? We suddenly noticed that we have some handlers missing when running in ClearML
I believe it adds a logger; it should not remove any loggers.
What's the clearml version you are using?
Weird issue, I'll make sure we fix compatibility with python 3.9
About .get_local_copy... would that then work in the agent though?
Yes it would work both locally (i.e. without agent) and remotely
Because I understand that there might not be a local copy in the Agent?
If the file does not exist locally it will be downloaded and cached for you
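A minimal sketch of that flow; the task id and artifact name are placeholders:
from clearml import Task

task = Task.get_task(task_id="<task-id>")  # placeholder id
# downloads the artifact if it is not already cached, then returns the local path
local_path = task.artifacts["my_artifact"].get_local_copy()
print(local_path)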
I don't know whether you have access to the backend,
Creepy, no I do not 🙂
I can't make anything appear in the console part of the ui
clearml_task.logger.report_text("some text") should work
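A self-contained sketch (project/task names are made up); note it uses the documented task.get_logger() accessor:
from clearml import Task

task = Task.init(project_name="examples", task_name="console demo")
# lines reported here show up under the task's CONSOLE section in the UI
task.get_logger().report_text("some text")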
As I understand it, providing this param in the Task.init() inside the subtask is too late, because the step has already started.
If you are running the task on an agent (which I assume you do), then one way would be to configure the "default_output_uri" in the agent's clearml.conf file.
The other option is to change the task at creation time: task.storage_uri = 's3://...'
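A minimal sketch of the creation-time option (the bucket name is a placeholder); Task.init also accepts output_uri directly:
from clearml import Task

# either pass the destination when creating the task...
task = Task.init(project_name="examples", task_name="train",
                 output_uri="s3://my-bucket/models")  # placeholder bucket

# ...or set it on the task object afterwards
task.storage_uri = "s3://my-bucket/models"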
In our case this is not possible due to client security (e.g. training data from clients can potentially be 'reverse engineered' from trained models in future).
Hmm I see, wouldn't it make more sense to separate clients like a multi-tenant SaaS solution?
The problem is of course filling in all the configuration details, so that they are viewable.
Other than that, check out:
https://allegro.ai/docs/task.html#trains.task.Task.export_task
https://allegro.ai/docs/task.html#trains.task.Task.import_task
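A short sketch of that flow (the task id is a placeholder):
from clearml import Task

# export the full task definition (configuration included) as a dict
source_task = Task.get_task(task_id="<task-id>")  # placeholder id
task_data = source_task.export_task()

# ...fill in / adjust the configuration details here...

# re-create a task from the (edited) definition
new_task = Task.import_task(task_data)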
Sounds good?
Hi BlandPuppy7 , is this Trains related, are you trying to integrate it, and need help?
BTW is it cheaper than an EC2 instance? Why not use the AWS autoscaler?
My typos are killing us, apologies:
change -t to -it, it will make it interactive (i.e. you can use bash 🙂)
DefeatedMoth52 how many agents do you have running on the same GPU ?