when you say use
Task.current_task()
for logging? which I'm guessing the fastai binding should do, right?
Right, this is a fancy way of saying: make sure the actual sub-process initializes ClearML so all the automagic kicks in. Since this is not "forked" but a whole new process, calling Task.current_task() is equivalent to calling Task.init() with the same arguments (which you can also do; I'm not sure which one is more straightforward, wdyt?)
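A minimal sketch of the spawned-process case (the project/task names and the spawn setup here are placeholders, not from your code):
```python
from multiprocessing import get_context

from clearml import Task


def train_worker():
    # Re-attach to the experiment inside the spawned process so the
    # automagic framework bindings kick in; calling Task.init() with the
    # same arguments as the parent process would work just as well.
    task = Task.current_task()
    # ... run the fastai training here ...


if __name__ == "__main__":
    task = Task.init(project_name="examples", task_name="fastai spawn")
    ctx = get_context("spawn")
    worker = ctx.Process(target=train_worker)
    worker.start()
    worker.join()
```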
Could I just build it and log these parameters using
task.set_parameters()
so that I can call
task.get_parameters()
later?
Instead of manually calling set/get, you call task.connect(some_dict_or_object); it does both:
When running manually (i.e. without an agent) it logs the keys/values on the Task;
when running with an agent, it takes the values from the backend (Task) and sets them on the dict/object.
Make sense?
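Something along these lines (project/task names and the params dict are just placeholders):
```python
from clearml import Task

task = Task.init(project_name="examples", task_name="connect params")

params = {"learning_rate": 0.001, "batch_size": 64}

# Running manually: logs the keys/values on the Task.
# Running under an agent: the values stored on the Task override the dict.
params = task.connect(params)
print(params["learning_rate"])
```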
setting max_workers to 1 prevents the error (but, I assume, it may come at the cost of slower sequential uploads).
This seems like a question for GS storage; maybe we should open an issue there, since their backend does the rate limiting.
My main concern now is that this may happen within a pipeline leading to unreliable data handling.
I'm assuming the pipeline code will have max_workers, but maybe we could have a configuration value so that we can set it across all workers, wdyt?
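In the meantime, a rough sketch of serializing the uploads (assuming your clearml version exposes max_workers on Dataset.upload; names are placeholders):
```python
from clearml import Dataset

ds = Dataset.create(dataset_name="my_dataset", dataset_project="examples")
ds.add_files("data/")
# Upload sequentially to avoid the GS rate limit; slower, but reliable.
ds.upload(max_workers=1)
ds.finalize()
```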
If
...
TartSeal39 please let me know if it works, conda is a strange beast and we do our best to tame it.
Specifically when you execute manually on a conda env we collect (separately) the conda packages & the python packages (so later we can replicate on both conda & pip, or at least do our best)
Are you running both the development env and the agent with conda?
Hi @<1561885921379356672:profile|GorgeousPuppy74>
- Could you copy the 3 messages here into your original message, it helps keep things tidy (press the 3-dot menu and select edit)
- What do you mean by "currently it's not executing in queue-01"? You changed it so it should be pushed to queue-02, no? Also notice that you can run the entire pipeline as sub-processes for debugging,
just call pipe.start_locally(run_pipeline_steps_locally=True) (see the sketch below)
You also need an agent on the ser...
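For reference, a minimal sketch of the local-debug call (the pipeline and step definitions are placeholders):
```python
from clearml import PipelineController

pipe = PipelineController(name="my pipeline", project="examples", version="1.0")
pipe.add_step(
    name="step_1",
    base_task_project="examples",
    base_task_name="step 1 base task",
    execution_queue="queue-02",  # queue used when running with agents
)

# Debugging: run the controller and every step as local sub-processes,
# no agent or queue needed.
pipe.start_locally(run_pipeline_steps_locally=True)
```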
Hi OutrageousGrasshopper93
When the Task is executed on a worker, the presence of spaces breaks the URLs and from the UI I cannot access the resources on the bucket
You are saying the URLs generated in a remote execution are "broken" and on local execution are working, even though it is the same project/task name?
BattyLion34 Okay, I'll try to see if we can solve the multi-instance issue on Windows (because obviously it should be automatic)
Hi ShinyWhale52
This is just a suggestion, but this is what I would do:
1. Use clearml-data and create a dataset from the local CSV file:
clearml-data create ...
clearml-data sync --folder (where the csv file is)
2. Write a python code that takes the csv file from the dataset and creates a new dataset of the preprocessed data
from clearml import Dataset

original_csv_folder = Dataset.get(dataset_id=args.dataset).get_local_copy()
# process csv file -> generate a new csv
# preproces...
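A rough sketch of how that step could continue (the dataset names and the processed file name are placeholders):
```python
from clearml import Dataset

# after preprocessing the original CSV into, e.g., processed.csv
new_dataset = Dataset.create(
    dataset_name="preprocessed data",
    dataset_project="examples",
    parent_datasets=[args.dataset],  # keep the lineage to the original dataset
)
new_dataset.add_files("processed.csv")
new_dataset.upload()
new_dataset.finalize()
```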
BattyLion34
if I simply clone the nntraining stage and run it in the default queue - everything goes fine.
When you compare the Task you clone manually and the Task created by the pipeline, what's the difference?
BattyLion34 let me see if I understand.
The same base_task_id, when cloned from the UI and enqueued on the same queue as the pipeline, will work, but when the pipeline runs the same Task it fails?!
Could it be that you enqueue them on different queues?
Could you send me the console log of both tasks, the failing and the passing one?
No, I mean actually compare using the UI, maybe the arguments are different or the "installed packages"
I'm assuming you are building for x86
sets up the venv correctly, prints
Starting Task Execution:
then does nothing
Can you provide a log?
Do you see the code/git reference in the Pipeline Task details - Execution Tab?
ShinyWhale52 any time 🙂
Feel free to followup with more questions
GreasyPenguin14 I think the default is reporting on failed tasks only? could that be?
@<1523711619815706624:profile|StrangePelican34> are you saying that after the "with" block the task is marked completed? How is that possible? Is this done manually?
Hi SkinnyPanda43
This issue was fixed with clearml-agent 1.5.1, can you verify?
Is this still an issue? (If you provide a queue name, the default tag is not used, so no error should be printed.)
Thanks HelpfulHare30, I would love to know what you find out, please feel free to share 🙂
First, that is awesome to hear PanickyFish98!
1. Can you send the full exception? You might be on to something...
2. Actually we thought of it, but could not find a use case, can you expand?
3. I'm not sure I follow, do you mean you expect the first execution to happen immediately?
Hi SkinnyPanda43
I realized that the params are not being saved anymore
Could you test with clearml==1.0.4 ?
Can you verify this example is not working for you?
https://github.com/allegroai/clearml/blob/master/examples/frameworks/hydra/hydra_example.py
EnviousStarfish54 following up on this issue, the root cause is that dictConfig will clear all handlers if not passed "incremental": True
conf_logging = { "incremental": True, ... }
Since you pointed out that Kedro is internally calling logging.config.dictConfig(conf_logging), this seems like an issue with Kedro, as this call will remove all logging handlers, which seems problematic. wdyt?
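A minimal example of the workaround on the Kedro side (the logger names here are only illustrative):
```python
import logging.config

conf_logging = {
    "version": 1,
    # keep already-installed handlers (e.g. the ones ClearML attaches)
    # instead of wiping them
    "incremental": True,
    "loggers": {"kedro": {"level": "INFO"}},
}
logging.config.dictConfig(conf_logging)
```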
Hi ProudChicken98
How about saving it as a local YAML and uploading the file itself as an artifact?
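For example, something like this sketch (the parameter values and file name are placeholders):
```python
import yaml

from clearml import Task

task = Task.init(project_name="examples", task_name="yaml as artifact")

params = {"lr": 0.001, "batch_size": 32, "augment": True}
with open("params.yaml", "w") as f:
    yaml.safe_dump(params, f)

# upload the local YAML file itself as an artifact
task.upload_artifact(name="params_yaml", artifact_object="params.yaml")
```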
ClumsyElephant70
Could it be the virtualenv package is not installed on the host machine?
(From the log it seems you are running in venv mode, is that correct?)
Is the fact that clearml-agent needs to be installed from system python mentioned anywhere in the docs? If not, I suggest it gets added.
You are right, I will check and fix if not 🙂
Thank you so much for helping.
My pleasure