Hi PlainSquid19
Did you check the website https://allegro.ai ?
If you need more info I would just fill in the contact form, I'm sure the sales guys will get back to you soon 🙂
```python
t = Task.get_task('aabbcc')
t.update_task(task_data={'task_type': "optimizer"})
```
Is this caused by running the script with the arguments
Yep 🙂
Assuming the git repo looks something like:
```
.git
readme.txt
module
  +---- script.py
```
The working directory should be "."
The script path should be: "-m module.script"
And under the Configuration/Args, you should have:
```
args1 = value
args2 = another_value
```
Make sense?
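To make that concrete, here is a minimal sketch of what `module/script.py` could look like (the argument names `args1`/`args2` follow the example above; the `Task.init` call is left out so the sketch stays self-contained):

```python
# module/script.py -- illustrative only; run as: python -m module.script
import argparse


def parse_args(argv=None):
    # Defaults mirror the Configuration/Args example above
    parser = argparse.ArgumentParser()
    parser.add_argument("--args1", default="value")
    parser.add_argument("--args2", default="another_value")
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print(args.args1, args.args2)
```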
BTW: how did it get there ?
Hi JitteryCoyote63
experiments logs ...
You mean the console outputs ?
Hi JitteryCoyote63
`report_frequency_sec=30` controls how frequently monitoring events are sent to the server; the default is every 30 seconds (you can change the UI display to wall-time to review). You can change it to 180 so it will only send an event every 3 minutes (for example).
`sample_frequency_per_sec` is the sampling frequency it uses internally; it will then average the results over the course of the `report_frequency_sec` time window, and send the averaged result on the repo...
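As a rough illustration of that sampling/averaging behaviour (my own sketch, not the actual clearml internals):

```python
def average_reports(samples, sample_frequency_per_sec, report_frequency_sec):
    """Average raw samples into one reported value per reporting window.

    Samples are assumed to arrive sample_frequency_per_sec times per second;
    each report then covers report_frequency_sec seconds worth of samples.
    """
    window = int(sample_frequency_per_sec * report_frequency_sec)
    return [
        sum(samples[i:i + window]) / len(samples[i:i + window])
        for i in range(0, len(samples), window)
    ]


# e.g. 2 samples/sec averaged into 1-second reports:
# average_reports([1, 2, 3, 4], 2, 1) -> [1.5, 3.5]
```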
Would be cool to let it get untracked as well, especially if we want to as an option
How would you decide what should be tracked?
you can run md5 on the file as stored in the remote storage (nfs or s3)
s3 is implementation specific (i.e. minio, weka, Wasabi etc. might not support it) and I'm actually not sure regarding nfs (I mean you can run it, but it actually means you are reading the data; that said, nfs by definition I'm assuming is relatively fast access)
wdyt?
Also, how would one ensure immutability ?
I guess this is the big question, assuming we "know" a file was changed, this will invalidate all versions using it, this is exactly why the current implementation stores an immutable copy. Or are you suggesting a smarter "sync" function ?
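For reference, computing an md5 of a stored file as suggested above could look like this; note that verifying the hash still means reading all the data, which is exactly the cost concern raised for nfs/s3:

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    # Stream the file in 1 MB chunks so large files never need
    # to fit in memory -- but every byte is still read.
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```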
Hi RoughTiger69
but still get the semantics of knowing when an (external) file changed?
How would you know it changed?
This implies you have a way to verify the hash, which means you download the data, no?
I think you are correct, the env variable is not resolved in "time". It might be that it's resolved at import, not at Task.init
Actually it cannot be deferred; long story short, when the agent is running the same code we have to verify and pass arguments at import time. I have to wonder: I'm expecting the env variables to be preset (i.e. previously set for the entire environment), how come they are manually set inside the code (and wouldn't that break when running with an agent)?
A quick fix will be:
```python
import dotenv
dotenv.load_dotenv('~/.env')

from clearml import Task  # Now we can load it.
import argparse

if __name__ == "__main__":
    # do stuff
```
wdyt?
BTW: which clearml version are you using ?
(I remember there was a change in the last one, or the one before, making the config loading deferred until accessed)
Hmm UnevenDolphin73 I just checked with v.1.1.6, the first time the configuration file is loaded is when calling Task.init (if not running with an agent, which is your case).
But the main point I just realized I missed 🤯
`"http://"${CLEARML_ENDPOINT}":8080"`
The code does not try to resolve OS environment variables there!
Which, well, is a nice feature to add
https://github.com/allegroai/clearml/blob/d3e986393ac8d1a1ea48302224962570ab8e6f9e/clearml/backend_api/session/session.py#L576
should p...
That makes total sense. The question was about the Mac users and OS environment in the configuration file and having that os environment set in code (this is my assumption as it seems that at import time it does not exist). What am I missing here?
Are they expanded in the "api_server" ? (I verified on a linux machine, same error, the env in the api_server is not being resolved)
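For what it's worth, resolving OS environment variables inside such a config value is a one-liner with the standard library; something along these lines (a sketch of the proposed feature, not what clearml currently does):

```python
import os

# Simulate the environment variable being preset for the process
os.environ["CLEARML_ENDPOINT"] = "127.0.0.1"

# Expand ${VAR} references inside the configured api_server string
api_server = os.path.expandvars("http://${CLEARML_ENDPOINT}:8080")
print(api_server)  # http://127.0.0.1:8080
```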
This feature is however available in the Enterprise Version as HyperDatasets. Am I correct?
Correct
BTW you could do:
```python
datasets_used = dict(dataset_id="83cfb45cfcbb4a8293ed9f14a2c562c0")
task.connect(datasets_used, name='datasets')

from clearml import Dataset
dataset_path = Dataset.get(dataset_id=datasets_used['dataset_id']).get_local_copy()
```
This will ensure that not only you have a new section called "datasets" on the Task's configuration, but you will also be able to replace the datase...
So we basically have two options: one is when you call `Dataset.get_local_copy()` , we register it on the Task automatically; the other is more explicit, with something like:
```python
ds = Dataset.get(...)
folder = ds.get_local_copy()
task.connect(ds, name='train')
...
ds_val = Dataset.get(...)
folder = ds_val.get_local_copy()
task.connect(ds_val, name='validate')
```
wdyt?
Or is this a feature of HyperDatasets and I just mixed them up.
Ohh yes, this is it. HyperDatasets are part of the UI (i.e. there is a Tab with the HyperDataset query); Dataset usage is currently listed on the Task. Make sense ?
I'm Jax, not Manoj! lol.
I know 🙂 I just mentioned that this issue is being actively discussed
ClearML maintains a github action that sets up a dummy clearml-server,
You have one, it's http://app.clear.ml (not a dummy one, but for this purpose it will work)
thoughts ?
JitteryCoyote63 of course there is 🙂
`Task.debug_simulate_remote_task(task_id="<task_id_here>")`
That said, if you could open a github issue and explain the idea behind it, I think a lot of people will be happy to have such a process, i.e. a CI process verifying code. And I think we should have a "CI" flag doing exactly what we have in the "hack". wdyt?
Because it lives behind a VPN and github workers donβt have access to it
makes sense
If this is the case, I have to admit that combining offline-mode and remote execution makes sense, no?
JitteryCoyote63 if this is simulating an agent, the assumption is that the Task was already created, hence the task ID.
If I am working with `Task.set_offline(True)`
How would the two combine? I mean offline is by definition not executed by an agent, what am I missing?
Hmm, we could add an optional test for the python version, and then fail the Task if the python version is not found. wdyt?
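A minimal sketch of such an optional check (the function name and error handling are my own illustration, not an existing clearml API):

```python
import shutil


def require_python(version):
    """Fail early if the requested interpreter is not on PATH."""
    exe = shutil.which("python" + version)
    if exe is None:
        raise RuntimeError(
            "python%s not found -- failing the Task before execution" % version
        )
    return exe
```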
UpsetTurkey67 are you saying there is a sym link in the original repository, and when it copies it, it breaks the symlink ?