At the top there should be the URL of the notebook (I think)
Basically when running remotely, the first argument to any configuration (whether object or string, or whatever) is ignored, right?
Correct 🙂
Is there a planned documentation overhaul?
you mean specifically for the connect_configuration ? or in general on the connect approach rationale ?
Different question. How can I pass the PYTHONPATH env variable to a task run by an agent (so python can find classes inside my subdirectories)?
Hi HelpfulHare30
By default the working directory will be added to the python path. This means that if under Execution I have
Working Dir: "."
Script: "src/script.py"
then the root of the git repo will be added to the python path.
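A minimal sketch of the effect described above, not the agent's actual implementation: once the repository root is on `sys.path`, modules in its subdirectories can be imported. The package and module names (`my_models`, `net.py`) are made up for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway "repository" with a package in a subdirectory.
repo_root = Path(tempfile.mkdtemp())
pkg_dir = repo_root / "my_models"  # hypothetical package name
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "net.py").write_text("class Net:\n    name = 'demo'\n")

# Putting the repository root on sys.path is what makes the import resolvable.
# Setting PYTHONPATH to the same directory before launching python has the same effect.
sys.path.insert(0, str(repo_root))
importlib.invalidate_caches()

net_module = importlib.import_module("my_models.net")
print(net_module.Net.name)
```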
BTW: with the next RC you will be able to add a flag to the agent to always add the git repo
And what is exactly missing from the "installed packages" ? Is "help_models" an additional wheel you have to install ?
Just making sure here, but remember that if your original code did not have a git repo, the only thing that is "copied" to the trains-server is the initial script, so any accompanying scripts will be missing in the trains-agent environment
Basically it hooks into any torch.save function (monkey patching in realtime)
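To illustrate what "monkey patching in realtime" means here, below is a generic sketch that wraps a save function at runtime and records every file it writes. It uses a stand-in module object (`fake_framework`, hypothetical) instead of torch, and it is not how clearml itself is implemented.

```python
import functools
import os
import pickle
import tempfile
import types

# Stand-in for a framework module exposing a save() function (hypothetical).
fake_framework = types.SimpleNamespace()


def _save(obj, path):
    with open(path, "wb") as f:
        pickle.dump(obj, f)


fake_framework.save = _save

saved_paths = []  # what a tracker could later report to its backend


def patch_save(module):
    """Replace module.save with a wrapper that also records the target path."""
    original = module.save

    @functools.wraps(original)
    def wrapper(obj, path, *args, **kwargs):
        result = original(obj, path, *args, **kwargs)  # keep the original behaviour
        saved_paths.append(str(path))                   # then record the written file
        return result

    module.save = wrapper


patch_save(fake_framework)
target = os.path.join(tempfile.mkdtemp(), "model.pkl")
fake_framework.save({"w": [1, 2, 3]}, target)  # user code is unchanged
print(saved_paths)
```

The point of the wrapper is that user code keeps calling `save` as before, and the recording happens as a side effect.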
Oh that makes sense:
` # Create a child process using os.fork() method
pid = os.fork()
if pid > 0:
    # pid greater than 0 represents the parent process
    print("I am parent process:")
    print("Process ID:", os.getpid())
    print("Child's process ID:", pid)
else:
    # pid equal to 0 represents the created child process
    print("\nI am child process - this is still fully auto logged")
    print("Process ID:", os.getpid())
    print("Parent's process ID:", o...
Hi @<1695969549783928832:profile|ObedientTurkey46>
Why do tags only show on a version level, but not on the dataset-level? (see images)
Tags of datasets are tags on "all the dataset versions" i.e. to help someone locate datasets (think locating projects as an analogy). Dataset Version tags are tags on a specific version of the dataset, helping users to locate a specific version of the dataset. Does that make sense ?
yey 🙂 Notice that when executed by the agent, the call to execute_remotely is skipped, and so is the if statement I added (since running_locally will return False when the process is executed by the agent)
Hi ReassuredTiger98
It's clearml that needs to support subparser, and it does support it.
What are you seeing in the Args section ?
(Notice that in the end all the parsed args are stored on the global "args" variable after you call parse_args(); clearml will basically take those variables and put them into the Args section)
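As a small illustration of the point above, this stdlib-only sketch shows that arguments from the main parser and from the selected subparser end up on one flat namespace after parsing. The program and argument names (`train`, `fit`, `--lr`, ...) are invented for the example, and it makes no claim about how clearml reads the namespace internally.

```python
import argparse

parser = argparse.ArgumentParser(prog="train")
parser.add_argument("--seed", type=int, default=0)
subparsers = parser.add_subparsers(dest="command")

fit = subparsers.add_parser("fit")
fit.add_argument("--lr", type=float, default=0.01)

evaluate = subparsers.add_parser("evaluate")
evaluate.add_argument("--split", default="val")

# Explicit argv so the example does not depend on the real command line.
args = parser.parse_args(["--seed", "7", "fit", "--lr", "0.1"])

# Top-level and subparser arguments live side by side on the same namespace.
flat_args = vars(args)
print(flat_args)
```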
Could it be you have old OS environment overriding the configuration file ?
Can you change the IP of the server in the conf file, and make sure it has an effect (i.e. the error changed)?
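A rough sketch of why a leftover environment variable can hide changes made in the conf file. The helper `resolve_setting` and the example values are hypothetical, and the precedence shown (environment over file) is assumed from the question above; it is not copied from clearml's code.

```python
import os


def resolve_setting(conf_value, env_name, environ=os.environ):
    """Return (value, source). A non-empty environment variable wins over the conf file."""
    env_value = environ.get(env_name)
    if env_value:
        return env_value, "environment variable " + env_name
    return conf_value, "configuration file"


conf_host = "http://10.0.0.5:8008"                          # hypothetical value from the conf file
stale_env = {"CLEARML_API_HOST": "http://old-server:8008"}  # hypothetical leftover variable

with_stale_env = resolve_setting(conf_host, "CLEARML_API_HOST", stale_env)
with_clean_env = resolve_setting(conf_host, "CLEARML_API_HOST", {})
print(with_stale_env)
print(with_clean_env)
```

If editing the conf file changes nothing (as in the first call), checking the process environment is the next thing to try.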
Hi @<1556450111259676672:profile|PlainSeaurchin97>
While testing the migration, we found that all of our models had their MODEL URL set to the IP of the old server.
Yes all the artifacts/models/debug-samples are stored "as is" , this means that if you configured your original setup with IP, it is kind of stuck there, this is why it is always preferred to use host-name ...
you apparently also need to rename all model URLs
Yes 😞
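For the renaming itself, here is a minimal sketch of rewriting only the host part of stored URLs. It covers the string manipulation only; where the URLs are stored and how to write them back is outside this sketch, and the addresses (`10.0.0.5`, `files.example.com`) are placeholders.

```python
from urllib.parse import urlsplit, urlunsplit


def replace_host(url, old_host, new_host):
    """Swap old_host for new_host, keeping scheme, port, path and query untouched."""
    parts = urlsplit(url)
    if parts.hostname != old_host:
        return url  # not pointing at the old server, leave as is
    netloc = new_host if parts.port is None else "{}:{}".format(new_host, parts.port)
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


old_urls = [
    "http://10.0.0.5:8081/project/task.123/models/model.pt",
    "s3://bucket/models/model.pt",
]
new_urls = [replace_host(u, "10.0.0.5", "files.example.com") for u in old_urls]
print(new_urls)
```

Note that any user:password part of a URL would be dropped by this sketch, so it only suits plain host[:port] URLs.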
To summarize: The scheduler should assign tasks to the agent first, which gives a queue the highest priority.
The issue here you assume both are idle and you need global priority based on resource preference. I understand your scenario now, but it will only hold if enqueuing order is "optimal". For example, if machine Y is running a Task B that is about to be completed (e.g. in a minute) then still machine X will pick the new Task B, and again we end up in the scenario where Task A i...
So that agent on different nodes will probably require different cuda-version images.
That makes sense SarcasticSquirrel56
I would edit the helm chart (or deploy manually) based on a selector that will select the different nodes/gpus and assign the correct containers (i.e. matching CUDA versions to the diff GPUs / drivers)
BTW: you can also play around with k8s glue, which would dynamically spin pods based on clearml Tasks.
wdyt?
Hi ItchyJellyfish73
You can always archive a Task/Model even when published
In the UI you can right-click and choose archive.
From code you need to add a system tag "archived":
` from clearml import Task
t = Task.get_task(task_id='aabb')
t.set_system_tags(t.get_system_tags() + ['archived']) `
And similarly for Model(model_id='aabb')
Just making sure I understand: you are trying to upload your models with clearml to the Yandex-compatible S3 storage?
, but it seems like I can only trigger a task using a Task scheduler but not a pipeline.
@<1523701132025663488:profile|SlimyElephant79> Maybe we should better state it, but Pipeline is "just" another type of Task. so triggering a Task with the Pipeline ID is essentially triggering the pipeline (do notice you need to select the "services" queue to be used so that the pipeline runs on the correct resource). Make sense ?
Hi TightDog77
HTTPSConnectionPool(host='', port=443): Max retries exceeded with url: /upload/storage/v1/b/models/o?uploadType=resumable (Caused by SSLError(SSLError(1, '[SSL: DECRYPTION_FAILED_OR_BAD_RECORD_MAC] decryption failed or bad record mac (_ssl.c:2633)')))
This seems like a network error to GCP (basically the GCP python package throws it)
Are you always getting this error? is this something new ?
Yes, but as you mentioned everything is created inside the lib, which means the python is not able to intercept the metrics so that clearml can send them to the backend.
Could you extend on the use case of #18 ? how would you use it? what problem will it be solving ?
You can always specify diff clearml.conf files with --config-file 🙂
Hi DilapidatedDucks58
how to force-reinstall package from github in Installed Packages
You mean make sure that the agent installs it from github?
The "Installed packages" section is equivalent to "requirements.txt" anything you can put in requirements.txt, you can put there.
For example, adding to "Installed Packages":
git+
will make sure you install the latest clearml from GitHub.
Notice that you cannot have two packages with the same name (just like with regular requirements.txt)...
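A quick way to spot that kind of clash before pasting a list into "Installed Packages" is sketched below. The helper names are hypothetical, the parsing is deliberately simple (plain `name==version` style lines and `name @ url` lines), and the example URL is a placeholder; bare `git+...` lines are skipped because their package name cannot be read from the line.

```python
import re
from collections import Counter

NAME_RE = re.compile(r"^([A-Za-z0-9][A-Za-z0-9._-]*)")


def requirement_name(line):
    """Return the normalized package name of a requirement line, or None if unknown."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    if "://" in line and "@" not in line.split("://")[0]:
        return None  # bare URL: the name is not visible in the line itself
    match = NAME_RE.match(line)
    if match is None:
        return None
    # Normalize so "Foo_Bar", "foo-bar" and "foo.bar" count as the same package.
    return re.sub(r"[-_.]+", "-", match.group(1)).lower()


def duplicate_names(lines):
    counts = Counter(name for name in map(requirement_name, lines) if name)
    return sorted(name for name, count in counts.items() if count > 1)


packages = [
    "numpy==1.26.4",
    "clearml>=1.0",
    "Clearml @ git+https://example.com/hypothetical/repo.git",
    "# a comment line",
]
print(duplicate_names(packages))
```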
Yes EnviousStarfish54, the comparison is line by line, and each experiment is compared only to the left one. Like any multi-way comparison, you have to set a baseline, which here is always the left column. Do notice you can reorder the columns and the comparison will be updated.
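The baseline idea can be sketched with the stdlib `difflib`: every column is diffed against the first (left) one, so reordering the columns changes the result. This is only an illustration with made-up experiment contents, not the UI's actual comparison code.

```python
import difflib


def compare_to_baseline(columns):
    """columns: ordered list of (name, lines). The first entry is the baseline."""
    (base_name, base_lines), *others = columns
    report = {}
    for name, lines in others:
        diff = list(difflib.unified_diff(
            base_lines, lines, fromfile=base_name, tofile=name, lineterm="", n=0))
        # Drop the two file header lines and the "@@" hunk markers, keep +/- lines.
        report[name] = [d for d in diff[2:] if not d.startswith("@@")]
    return report


exp_a = ["lr=0.1", "batch=32", "epochs=10"]
exp_b = ["lr=0.01", "batch=32", "epochs=10"]
exp_c = ["lr=0.1", "batch=64", "epochs=10"]

report = compare_to_baseline([("A", exp_a), ("B", exp_b), ("C", exp_c)])
reordered = compare_to_baseline([("B", exp_b), ("A", exp_a), ("C", exp_c)])
print(report)
print(reordered)
```

With A on the left, C differs only in `batch`. With B on the left, C is reported against B instead, so it differs in `lr` as well.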
Wait, with the Port it does not work?
Notice that since this is an external S3, you have to have the port specified so it knows this is not AWS S3 but a different compatible service
Hi BurlyRaccoon64
What do you mean by "custom_build_script" ? not sure I found it in "clearml.conf"
https://github.com/allegroai/clearml-agent/blob/master/docs/clearml.conf
ScantChimpanzee51 what's the use case for the full path without specific artifact?
This should have worked with the latest clearml RC.
And you verified it is not working?
when you say "configuration files" are you referencing the dict in the mock example ?