Hi GrittyKangaroo27
Is it possible to import user-defined modules when wrapping tasks/steps with functions and decorators?
Sure, any package (local packages included) can be imported, and it will be automatically listed in the "installed packages" section of the pipeline component Task.
(This of course assumes that on a remote machine you could run `pip install <package>`.)
Make sense?
Now I suspect what happened is it stayed on another node, and your k8s never took care of that
What I try to do is have the DSes use some lightweight base class that is independent of clearml, while a framework holds all the clearml-specific code. This will allow them to experiment outside of clearml and only switch to it when they are in an OK state. This will also help avoid polluting clearml spaces with half-baked ideas.
So you want the DS to manually tell the base class what to store?
then the base class will store it for them, for example with joblib
, is this the...
I'll make sure we add the reference somewhere on GitHub
So obviously the straightforward solution is to normalize the step value when reporting to TB, i.e. int(step/batch_size). Which makes sense, as I suppose the batch size is known and is part of the hyper-parameters. Normalization itself can be done when comparing experiments in the UI, and the backend can do that if given the correct normalization parameter. I think this feature request should actually be posted on GitHub, as it is not as simple as one might think (the UI needs to a...
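The normalization itself is a one-liner, assuming batch_size comes from the hyper-parameters:

```python
def normalize_step(step, batch_size):
    # Map a per-sample step index onto a per-batch axis so that runs
    # trained with different batch sizes line up when compared.
    return int(step / batch_size)
```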
ThickFox50 I also have to point out that there is a free hosted server here 🙂 https://app.community.clear.ml
Is this consistent on the same file? can you provide a code snippet to reproduce (or understand the flow) ?
Could it be two machines are accessing the same cache folder ?
Notice the pipeline step/Task at execution is not aware of the pipeline context
Hi GiganticTurtle0
dataset_task = Task.get_task(task_id=dataset.id)
Hmmm I think that when it gets the Task, "output_uri" is not updated from the predefined Task (you can obviously set it again).
This seems like a bug that is unrelated to Datasets.
Basically any Task that you retrieve will default to the default output_uri (not the stored one)
That was the idea behind the feature (and BTW any feedback on usability and debugging will be appreciated here, pipelines are notoriously hard to debug 🙂 )
The ability to execute without an agent. I was just talking about this functionality the other day in the community channel.
What would be the use case ? (actually the infrastructure now supports it)
Yes that makes sense, if the overhead of the additional packages is not huge, I do not think it is worth the maintenance 🙂
BTW clearml-agent has full venv caching that you can turn on, so when running remotely you are not "paying" for the additional packages being installed:
Un-comment this line 🙂
https://github.com/allegroai/clearml-agent/blob/51eb0a713cc78bd35ca15ed9440ddc92ffe7f37c/docs/clearml.conf#L116
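For reference, the section to un-comment looks roughly like this (defaults shown; verify against your own clearml.conf):

```
agent {
    venvs_cache: {
        # maximum number of cached venvs
        max_entries: 10
        # unmark to enable virtual environment caching
        path: ~/.clearml/venvs-cache
    }
}
```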
named as `venv_update` (I believe it's still in beta). Do you think enabling this parameter significantly helps to build environments faster?
This is deprecated... it was a test to use a package that can update pip venvs, but it was never stable; we will remove it in the next version
Yes, I guess. Since pipelines are designed to be executed remotely, it may be pointless to enable an `output_uri` parameter in the `PipelineDecorator.componen...`
BitterStarfish58 could you open a GitHub issue on it? I really want to make sure we support it (and I think it should not be very difficult)
BTW: any specific reason for going the RestAPI way and not using the python SDK ?
So what is the difference?!
GiganticTurtle0 quick update, a fix will be pushed, so that casting is based on the Actual value passed, not the type hints 🙂
(this is only in case there is no default value, otherwise the default value type is used for casting)
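A sketch of that casting rule (illustrative only, not the actual clearml implementation; parameters come back from the UI as strings):

```python
def cast_param(raw, default=None):
    """Cast a parameter coming back from the UI (always a string).

    If the step has a default value, its type is used for casting;
    with no default, the raw value is returned as-is (best effort).
    """
    if default is None:
        return raw
    if isinstance(default, bool):
        # bool("False") is True, so handle booleans explicitly
        return str(raw).lower() in ("true", "1", "yes")
    return type(default)(raw)
```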
bash: line 1: 1031 Aborted (core dumped)
@<1570583227918192640:profile|FloppySwallow46> seems like the processes crashed,
Hi SubstantialElk6
try: `--docker "<image_name> --privileged"`
Notice the quotes.
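For example, assuming this is the agent's `--docker` flag (the image name below is just a placeholder):

```
clearml-agent daemon --queue default --docker "nvidia/cuda:11.8.0-runtime-ubuntu22.04 --privileged"
```

The quotes keep the image and the extra docker argument together as a single value.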
and: "clearml_agent: ERROR: 'charmap' codec can't encode character '\u0303' in position 5717: character maps to <undefined>"
Ohh that's the issue with LC_ALL missing in the docker itself (i.e. unicode characters will break it)
Add locales into the container; in your clearml.conf
add the following: agent.extra_docker_shell_script: ["apt-get install -y locales",]
Let me know if that solves the issue (as you pointed, it has nothing to do with importing package X)
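Putting it together, the relevant clearml.conf fragment would look something like this (installing locales is one reasonable fix, not the only one):

```
agent {
    # run inside the container before the task starts,
    # so unicode characters no longer break clearml_agent
    extra_docker_shell_script: [
        "apt-get install -y locales",
    ]
}
```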
So why is it trying to upload to "//:8081/files_server:" ?
What do you have in the trains.conf on the machine running the experiment ?
And when running `get`, the files on the parent dataset will be available as links.
BTW: if you call get_mutable_copy() the files will be copied, so you can work on them directly (if you need)
StaleButterfly40 are you sure you are getting the correct image on your TB (toy255) ?
E.g., I'm creating a task using `clearml.Task.create`; often it doesn't get the git diff correctly,
ShakyJellyfish91 Task.create does not store any "git diff" automatically, is there a reason not to use Task.init?
Hmm could it be this is on the "helper functions" ?
So the TB issue was reported images were not logged.
We are now talking about the caching, which is actually a UI thing. Which clearml-server version are you using?
And where are the images stored (the default files server or is it S3/GS etc.) ?
trains-agent runs a container from that image, then clones ...
That is correct
I'd like the base_docker_image to not only be defined at runtime
I see, may I ask why not just build it once, push it into artifactory, and then have trains-agent use it? (it will be much faster)
BTW, how can I run 'execute_orchestrator' concurrently?
It is launched simultaneously, i.e. if you are not processing the output of the pipeline step function, the execution will not wait for its completion. Notice that the call itself might take a few seconds, as it creates a task and enqueues/sub-processes it, but it is Not waiting for it.
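The same fire-and-forget vs. wait-on-result distinction can be sketched with plain futures (an analogy, not clearml code):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def step(x):
    time.sleep(0.1)  # stand-in for a pipeline step's remote execution
    return x * 2

with ThreadPoolExecutor() as pool:
    # Launching both steps returns immediately; neither call blocks.
    f1 = pool.submit(step, 1)
    f2 = pool.submit(step, 2)
    # Only once we consume the outputs do we actually wait for completion.
    results = [f1.result(), f2.result()]
```

In the same way, two pipeline-step calls whose outputs are never consumed run concurrently; touching a step's return value is what introduces the wait.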
GiganticTurtle0 where in the code you set the output destination to "file:///home/mount/user/server_local_storage" ?