What I'm trying to do is to filter between two datetimes... Is that possible?
Could you expand?
It is the folder that ClearML creates and the folder we create ourselves to store the predictions
I see... If that is the case, the only solution I can think of is manually uploading the files with StorageManager(...), then getting the URL and registering it as debug media or an artifact:
logger.report_media("image", "type a", iteration=iteration, url="...")
task.upload_artifact('a link', artifact_object='...')
So to conclude: it has to be executed manually first, then with trains-agent?
Yes. That said, as you mentioned, you can always edit the "installed packages" manually once; from that point on you are basically cloning the experiment, including the "installed packages", so it should work if the original worked.
Makes sense?
And when exactly are you getting the "user aborted" message?
How do you start the process (are you manually running it, or is it an agent, or maybe PyCharm)?
Can you provide the full log?
Hi CooperativeFox72 ,
From the backend guys, long story short: upgrade your machine => more CPU cores, more processes, it is that easy 🙂
This is very odd ... let me check something
So the way it works: when you run a component, the return value together with the entire function execution is cached. Basically:
this did NOT add the artifact to the pipeline via caching on subsequent runs
you just need to do:
PipelineDecorator.upload_artifact(name='images', artifact_object=img_dir, wait_on_upload=True)
return Task.current_task().artifacts['images'].url
This will return the URL of the uploaded images (i.e. the S3 bucket),
which means if this is cached you will get it...
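The caching here is essentially memoization on the component's inputs; a pure-Python analogy (not the ClearML implementation) of why the cached return value, including the artifact URL, comes back on subsequent runs:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def component(x: int) -> str:
    # Imagine expensive work plus an artifact upload here; the
    # returned URL is stored alongside the cached execution.
    return f"s3://bucket/artifacts/images_{x}"

first = component(1)
second = component(1)  # cache hit: the body does not run again
assert first == second
```

Because the URL is part of the return value, a cache hit hands it back without re-running the upload.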
Thanks for the details TroubledJellyfish71 !
So the agent should have automatically resolved this line: torch == 1.11.0+cu113
into the correct torch version (based on the cuda version installed, or cpu version if no cuda is installed)
Can you send the Task log (console) as executed by the agent (and failed)?
(you can DM it to me, so it's not public)
Hi @<1577106212921544704:profile|WickedSquirrel54>
We are self hosting it using Docker Swarm
Nice!
and were wondering if this is something that the community would be interested in.
Always!
what did you have in mind? I have to admit I'm not familiar with the latest in Docker Swarm, but we all love Docker, the product and the company
Ohh I see. Okay, the next pipeline version (coming very very soon 🙂) will have the option of function as Task, would that be better for your use case?
(Also, in case of local execution, and I can totally see why this is important, how would you specify where the current code base is? Are you expecting it to be local?)
I'm all for trying to help with debugging pipelines, because this is really challenging.
BTW: you can run your code as if it is executed from an agent (including the param ove...
Sure 🙂
BTW: clearml-agent will mount your host's .ssh into the docker container at /root/.ssh by default, so no need to do that manually.
DefeatedCrab47 if TB has it as an image, you should find it under "debug_samples" as an image.
Can you locate it there?
SarcasticSparrow10 sure, see execute_remotely, it does exactly that:
https://allegro.ai/docs/task.html#trains.task.Task.execute_remotely
It will stop the current process (after syncing everything) and launch itself remotely (i.e. enqueue itself)
When the same code is run by the "trains-agent", the execute_remotely call becomes a no-op and is basically skipped
Because we are working with very big files, having them stored at multiple locations is something we try to avoid
Just so I better understand, is this for storing files as part of a dataset, or as debug samples?
In other words can two diff processes create the exact same file (image) ?
BTW, we figured out that the ' belongs to the echo
yep, once you see the full command it is apparent
Hi PanickyMoth78, an RC with a fix is out, let me know if it works (notice you can now set max_workers from the CLI or the Dataset functions): pip install clearml==1.8.1rc1
It is deployed on an on premise, secured network that has no access to the outside world.
Is it password protected or something of that nature?
Perhaps we could find a different solution or work around, rather than solving a technical issue.
Solving it means allowing the python code to ask the JupyterLab server for the notebook file
However, once working with ClearML and using a venv (and not the default python kernel),
Are you saying on your specific setup (i.e. OpenShif...
Yep, it is the scale 🙂 and yes, it should appear once you upgrade
I am logging debug images via TensorBoard (via the add_image function), however apparently these debug images are not collected within the fileserver,
ZanyPig66 what do you mean "not collected to the file server"? Are you saying TB's add_image is not automatically uploading images, or that you cannot access the files on your file server?
It seems like there is no way to define that a Task requires docker support from an agent, right?
Correct, basically the idea is you either have workers working in venv mode or docker.
If you have a mixture of the two, then you can have the venv agents pulling from one queue (say default_venv) and the docker mode agents pulling from a different queue (say default_docker). This way you always know what you are getting when you enqueue your Task
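For example, the two workers would be started along these lines (the queue names here are just a convention, not required values):

```shell
# venv-mode worker pulls only from the venv queue
clearml-agent daemon --queue default_venv

# docker-mode worker pulls only from the docker queue
clearml-agent daemon --queue default_docker --docker
```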
but when I run the same task again it does not map the keys...
SparklingElephant70 what do you mean by "map the keys" ?
StraightDog31 how did you get these?
It seems like it is coming from matplotlib, no?
using caching where specified but the pipeline page doesn't show anything at all.
What do you mean by "the pipeline page doesn't show anything at all"? Are you running the pipeline? How?
Notice PipelineDecorator.component needs to be top level, not nested inside the pipeline logic, like in the original example:
@PipelineDecorator.component(
    cache=True,
    name=f'append_string_{x}',
)
I would guess that for some reason the loglevel is DEBUG, could that be the case?
BTW: latest PyCharm plugin with 2022 support was just released:
https://github.com/allegroai/clearml-pycharm-plugin/releases/tag/1.1.0
ClearML does not work easily with Google Drive.
Yes, Google Drive is not Google Storage (which ClearML supports 🙂)
Seems like you solved it?