Yep... they are pushing "heavy" users away from these instances. Nothing really you can do, maybe switch to Azure/GCP, but it might be the same there
LudicrousDeer3 when using Logger you can provide 'iteration' argument, is this what you are looking for?
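For reference, a minimal sketch of what I mean, assuming scalar reporting (project/task names are placeholders):
from clearml import Task

task = Task.init(project_name='examples', task_name='manual reporting')
logger = task.get_logger()
# 'iteration' controls the x-axis position of the reported point
logger.report_scalar(title='loss', series='train', value=0.23, iteration=100)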
Now I suspect what happened is it stayed on another node, and your k8s never took care of that
You can switch to docker-mode for better control over CUDA drivers, or use conda and specify cudatoolkit (this feature will be part of the next RC; meanwhile it will install the cudatoolkit based on the global cuda_version).
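If it helps, this is roughly the relevant part of clearml.conf on the agent machine; a sketch only, the exact keys and values may differ between versions:
agent {
    # force the CUDA toolkit version resolved for the environment
    cuda_version: 11.2

    package_manager: {
        # use conda so cudatoolkit can be installed into the venv
        type: conda,
    }
}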
Hmm I think you are correct:
:param auto_create: Create new dataset if it does not exist yet
It should have created it, this seems like a bug, I'll make sure to pass it along
Well it should fail, but I think the error message should be fixed
maybe:
ValueError: dataset 'tmp_datset' not found in project 'lavi-testing'
wdyt?
Hi GreasyPenguin14
Sure you can, although a bit convoluted (I'll make sure we have a nice interface):
import hashlib

title = hashlib.md5('epoch_accuracy_title'.encode('utf-8')).hexdigest()
series = hashlib.md5('epoch_accuracy_series'.encode('utf-8')).hexdigest()
task_filter = {
    'page_size': 2,
    'page': 0,
    'order_by': ['last_metrics.{}.{}'.format(title, series)]
}
queried_tasks = Task.get_tasks(project_name='examples', task_filter=task_filter)
I guess. or pipelines that you can compose after running experiments to see that experiments are connected to each other
hmm what do you mean by "compose after running experiments"? Like a way to group them? What is the relation between one "item" and another?
If this is a sequence of Tasks, are they executed by a controller?
Won't it be too harsh to have a system-wide restriction like that?
Just dropping this here but I've had some funky compressions with very small datasets!
Odd deflate behavior ...?!
You might need to play around a bit; it might be that StorageHelper.get('gs://bucket') and then helper.list('folder/*') will do the trick.
Let me know what worked
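Something along these lines (untested; the bucket name and prefix are just placeholders):
from clearml.storage.helper import StorageHelper

helper = StorageHelper.get('gs://bucket')
# list remote objects under the given prefix
remote_objects = helper.list('folder/*')
print(remote_objects)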
Assuming TensorFlow (which would be an entire folder):
local_folder_or_files = model.get_weights_package()
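For context, a rough sketch of getting to that call, assuming the model is referenced by id (the id is a placeholder):
from clearml import InputModel

model = InputModel(model_id='<model-id>')
# for packaged weights (e.g. a TF SavedModel folder) this returns the local copy
local_folder_or_files = model.get_weights_package()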
I think the ClearmlLogger is kind of deprecated ...
Basically all you need is Task.init at the beginning, the default TensorBoard logger will be caught by clearml
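A minimal sketch, assuming PyTorch's TensorBoard writer (project/task names are placeholders):
from clearml import Task
from torch.utils.tensorboard import SummaryWriter

task = Task.init(project_name='examples', task_name='tensorboard auto-logging')

writer = SummaryWriter(log_dir='./runs')
for step in range(10):
    # anything reported to TensorBoard is picked up automatically by clearml
    writer.add_scalar('loss/train', 1.0 / (step + 1), global_step=step)
writer.close()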
Hi MinuteGiraffe30
Are you saying that when you are running your code locally with a gitea repository, clearml incorrectly adds a link to gitlab?
DeliciousBluewhale87 could you restart the pod, ssh to the host, and make sure the folder /opt/clearml/agent exists and there is no *.conf file in it?
HugeArcticwolf77 from the CLI you cannot control it (but we could probably add that), from code you can:
https://github.com/allegroai/clearml/blob/d17903d4e9f404593ffc1bdb7b4e710baae54662/clearml/datasets/dataset.py#L646
pass compression=ZIP_STORED
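Roughly like this (a sketch; the dataset name/project and paths are placeholders):
from zipfile import ZIP_STORED
from clearml import Dataset

ds = Dataset.create(dataset_name='my_dataset', dataset_project='datasets')
ds.add_files('./data')
# ZIP_STORED disables deflate, useful when the files are already compressed
ds.upload(compression=ZIP_STORED)
ds.finalize()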
WorriedParrot51 I now see ...
Two solutions that I can quickly think of:
1. In the code add:
import sys
sys.path.append('./my_sub_module')
Assuming you always have to add the sub-directories to make the code work, and assuming they are part of the repository, this is probably the stable solution
2. In the UI, in the Docker base image, add -e PYTHONPATH=/folder
or from code (which is exactly what you did)
a clean interface: task.set_base_docker('nvidia/cuda -e PYTHONPATH=/folder')
@<1577468638728818688:profile|DelightfulArcticwolf22>
How can I tell clearml-agent not to run pip install unless my requirements.txt file was changed?
the agent has a built-in cache, it will reuse the previous venv if nothing changed (cached locally on the agent's machine).
Make sure this line is not commented out:
None
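In case the link does not render, this is presumably the venvs_cache section in the agent's clearml.conf (a sketch; defaults may differ per version), where the path line needs to be uncommented:
agent {
    venvs_cache: {
        # maximum number of cached venvs
        max_entries: 10
        # minimum required free space to allow for a cache entry
        free_space_threshold_gb: 2.0
        # uncomment to enable virtual environment caching
        path: ~/.clearml/venvs-cache
    },
}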
Would it suffice to provide the git credentials ...
That should be enough, basically this is where they should be:
https://github.com/allegroai/clearml-agent/blob/0462af6a3d3ef6f2bc54fd08f0eb88f53a70724c/docs/clearml.conf#L18
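That section looks roughly like this (a sketch; values are placeholders):
agent {
    # Set GIT user/pass credentials (if user/pass are used instead of an ssh key)
    git_user: "myuser"
    git_pass: "mytoken"
}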
So if I do this in my local repo, will it mess up my git state, or should I do it in a fresh directory?
It will install everything fresh into the target folder (including venv and code + uncommitted changes)
But from the log it seems that:
1. you are not running as root in the docker?
2. Python 3.8 is installed (and not Python 3.6 as before)
Hmm, I think it is this line:
WARNING - Model configuration only supports dictionary or string objects
done
Let me check something.
hmm that is odd, it should have detected it, can you verify the issue still exists with the latest RC?
pip3 install clearml-agent==1.2.4rc3
GrittyHawk31 by default any user can login (i.e. no need for password), if you want user/password access:
https://clear.ml/docs/latest/docs/deploying_clearml/clearml_server_config/#web-login-authentication
Notice there is no need to have anything else in the apiserver.conf, just the user/pass section; everything else will just be the default values.
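For reference, a sketch of that section (username/password are placeholders):
auth {
    # Fixed users login credentials; no other user will be able to login
    fixed_users {
        enabled: true
        pass_hashed: false
        users: [
            {
                username: "jane"
                password: "12345678"
                name: "Jane Doe"
            },
        ]
    }
}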
Hi UnevenDolphin73
I cannot initialize a task before loading the file, but the docs for
connect_configuration
Yes, that's basically the problem. You have to decide where the main driver is.
If you are executing the code "manually" (i.e. not via the agent) then there is no problem, obviously you have the local file and you can use it to load the "project name" etc, then you just call Task.connect_configuration to log the content.
If you are running the same code via the agent...
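To make the "manual driver" flow concrete, a minimal sketch (the file name and keys are placeholders):
import yaml
from clearml import Task

# local run: the file exists, read it first to get e.g. the project name
with open('config.yaml') as f:
    cfg = yaml.safe_load(f)

task = Task.init(project_name=cfg['project_name'], task_name='my task')
# log the content; when later executed by the agent this returns the stored copy instead
config_path = task.connect_configuration('config.yaml', name='my config')
with open(config_path) as f:
    cfg = yaml.safe_load(f)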
Hi ExcitedFish86
Of course, this is what it was designed for. Notice in the UI under Execution you can edit this section (Setup Shell Script). You can also set it via task.set_base_docker
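For example, something along these lines (the image name and commands are placeholders; older clearml versions take a single string argument instead):
from clearml import Task

task = Task.init(project_name='examples', task_name='docker setup script')
task.set_base_docker(
    docker_image='nvidia/cuda:11.8.0-runtime-ubuntu22.04',
    docker_arguments='-e PYTHONPATH=/folder',
    docker_setup_bash_script=['apt-get update', 'apt-get install -y git'],
)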
the question remains though: why docker containers won't launch on
services
Maybe something with the way it launched on the docker-compose?
(I'm assuming it will fail on any docker container regardless, right?!)
try:
import os
from clearml import Dataset
...
dataset_path = Dataset.get(
dataset_name=dataset_name,
dataset_project=dataset_project,
alias="0013_Dataset"
).get_local_copy()
dataset_path = os.path.join(dataset_path, "data.yaml")
...
Hmm, you can delete the artifact with:
task._delete_artifacts(artifact_names=['my_artifact'])
However this will not delete the file itself.
To delete the file I would do:
remote_file = task.artifacts['delete_me'].url
h = StorageHelper.get(remote_file)
h.delete(remote_file)
task._delete_artifacts(artifact_names=['delete_me'])
Maybe we should have a proper interface for that? wdyt? What's the actual use case?