Why can we even change the pip version in the clearml.conf?
LOL mistakes learned the hard way 🙂
Basically, too many times in the past a pip release was a bit broken. That is fine when pip is used manually and users can reinstall a different version, but it is horrible when you have an automated process like the agent, so we added the ability to pin the pip version, giving you greater control. Make sense?
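For reference, a minimal sketch of what that pinning looks like in clearml.conf (the exact version string is only an example):
```
agent {
    package_manager {
        # pin the pip version the agent installs inside its venvs
        # (an empty string would mean "latest")
        pip_version: "<20.2"
    }
}
```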
Hi @<1624941407783358464:profile|GrievingTiger47>
I think you should try to contact the sales guys here: None
I was using clearml == 0.17.5 and I also had this issue
I think it was introduced when we moved to subprocess reporting, with 0.17.5
You can disable it with the following in clearml.conf:
sdk.development.report_use_subprocess = false
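In the nested form of clearml.conf that same setting would look roughly like:
```
sdk {
    development {
        # report metrics/logs from the main process instead of a spawned subprocess
        report_use_subprocess: false
    }
}
```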
Hi ReassuredTiger98
Could you add some prints before/after the artifact upload?
Also, what's the clearml version you are using?
I see, that means xarray is not an actual package but a folder added to the python path.
This explains why Task.add_requirements fails, as it is meant to add python packages to the equivalent of "requirements.txt" ...
Is the folder part of the git repository? How would you pass it to the remote machine the clearml-agent is running on?
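For context, a minimal sketch of how Task.add_requirements is normally used (project/task names are placeholders), assuming the package is actually pip-installable:
```
from clearml import Task

# Task.add_requirements() must be called before Task.init(), and it only
# works for installable pip packages, not for local folders on the python path
Task.add_requirements("xarray")
task = Task.init(project_name="examples", task_name="requirements demo")
```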
Hi MuddySquid7 issue is verified, v1.1.1 will be released in a few hours with a fix.
Thank you for noticing!
Hi ReassuredTiger98
Could you send the logs of both runs?
(I'm not sure whether this is a bug or some misconfiguration, but the scenario should have worked...)
Can you verify by adding the following to your extra_docker_shell_script:
https://github.com/allegroai/clearml-agent/blob/a5a797ec5e5e3e90b115213c0411a516cab60e83/docs/clearml.conf#L152
extra_docker_shell_script: ["echo machine example.com > ~/.netrc", "echo login MY_USERNAME >> ~/.netrc", "echo password MY_PASSWORD >> ~/.netrc"]
What's the error you are getting ?
That said, if you could open a GitHub issue and explain the idea behind it, I think a lot of people would be happy to have such a process, i.e. a CI process verifying the code. And I think we should have a "CI" flag doing exactly what we have in the "hack". wdyt?
MagnificentSeaurchin79 are you using the latest RC ?
(I think this was exactly the issue)
EDIT:
try to create the version with the file removed after you upgrade to the latest RC (0.17.5rc3); in the summary you should see 1 file removed.
I think it should look something like:
files {
  gsc {
    contents: """{"type": "service_account", "project_id": "ai-platform", "private_key_id": "9999", "private_key": "-----BEGIN PRIVATE KEY-----==\n-----END PRIVATE KEY-----\n", "client_email": "a@ai.iam.gserviceaccount.com", "client_id": "111", "auth_uri": "...", "token_uri": "...", "auth_provider_x509_cert_url": "...", "client_x509_cert_url": "..."}"""
    path: "~/gs.cred"
  }
}
SweetGiraffe8
That might be it, could you test with the Demo server ?
Hi @<1554275802437128192:profile|CumbersomeBee33>
what do you mean by "will the dependencies will be removed or not" ?
The next time the agent spins up a new Task, it will create a new venv and delete the previous one.
It's the same, but done from the outside; you want the same thing but "offline" as well, right?
Hi @<1523702307240284160:profile|TeenyBeetle18>
and the URL of the model refers to a local file, not to the remote storage.
Do you mean that in the Model tab when you look into the model details the URL points to a local location (e.g. file:///mnt/something/model) ?
And your goal is to get a copy of that model (file) from your code, is that correct ?
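If so, a rough sketch of fetching a local copy of a registered model (the model id below is a placeholder):
```
from clearml import InputModel

model = InputModel(model_id="<model-id>")   # placeholder id, copy it from the UI
local_path = model.get_local_copy()         # downloads the weights file if needed
print(local_path)
```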
JitteryCoyote63 if this is simulating an agent, the assumption is that the Task was already created, hence the task ID.
If i am working with Task.set_offline(True)
How would the two combine? I mean offline is by definition not executed by an agent, what am I missing?
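For reference, the usual offline flow looks roughly like this (project/task names and the path are placeholders):
```
from clearml import Task

# offline mode: everything is recorded locally instead of being sent to the server
Task.set_offline(offline_mode=True)
task = Task.init(project_name="examples", task_name="offline run")
# ... the actual code ...
task.close()

# later, from a machine that can reach the server (path is a placeholder):
# Task.import_offline_session("/path/to/offline/session.zip")
```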
Now I'm curious what's the workaround ?
Regarding the artifact, yes that makes sense. I guess this is why there is an "input" type for an artifact; the actual use case was never found (I guess until now?! what's your point there?)
Regarding the configuration
It's very useful for us to be able to see the contents of the configuration and understand
Wouldn't that just do exactly what you are looking for:
local_config_file_that_i_can_always_open = task.connect_configuration("important", "/path/to/config/I/only/have/on/my/machi...
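A more complete sketch of that pattern, assuming the connect_configuration(configuration, name=...) signature (path and names are placeholders):
```
from clearml import Task

task = Task.init(project_name="examples", task_name="config demo")
# running locally: the file content is stored with the Task;
# running under an agent: a local copy of that stored content is returned
local_config = task.connect_configuration("/path/to/config.yaml", name="important")
with open(local_config) as f:
    print(f.read())
```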
Because it lives behind a VPN and github workers don’t have access to it
makes sense
If this is the case, I have to admit that combining offline-mode and remote execution makes sense, no?
Hmm, you mean how long it takes for the server to time out on a registered worker? I'm not sure this is easily configured.
bash: line 1: 1031 Aborted (core dumped)
@<1570583227918192640:profile|FloppySwallow46> seems like the process crashed.
why are there indefinitely growing anonymous tasks, even after i've closed the main schedulers.
The anonymous Tasks are the Dataset you are creating (a Dataset version is also a Task of a certain type with artifacts; the idea is that Datasets are usually created from code, hence the need to combine the two).
Make sense ?
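For illustration, a minimal sketch of that flow (names and paths are placeholders):
```
from clearml import Dataset

# creating a Dataset version implicitly creates a Task behind the scenes
ds = Dataset.create(dataset_name="my_dataset", dataset_project="datasets")
ds.add_files("/path/to/files")
ds.upload()
ds.finalize()
```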
That depends on the HPO algorithm. Basically they will be pushed based on the limit of "concurrent jobs", so you do not end up exploding the queue. It also might be a Bayesian process, i.e. based on the previous set of parameters and runs, like how hyper-band works (optuna/hpbandster).
Make sense ?
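As a rough sketch of where that concurrency limit is set (the base task id, metric names and values are all placeholders):
```
from clearml.automation import HyperParameterOptimizer, UniformIntegerParameterRange
from clearml.automation.optuna import OptimizerOptuna

optimizer = HyperParameterOptimizer(
    base_task_id="<base-task-id>",  # placeholder
    hyper_parameters=[
        UniformIntegerParameterRange("General/epochs", min_value=2, max_value=12, step_size=2),
    ],
    objective_metric_title="validation",
    objective_metric_series="accuracy",
    objective_metric_sign="max",
    optimizer_class=OptimizerOptuna,
    # only this many Tasks are pushed to the queue at any given time
    max_number_of_concurrent_tasks=2,
    execution_queue="default",
    total_max_jobs=10,
)
optimizer.start()
optimizer.wait()
optimizer.stop()
```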
Let me rerun the code and check
Okay, this is odd, the request returned exactly 100 out of 100.
It seems not all of them were reported?!
Could you post the toy code, I'll check what's going on.