that is because my own machine has CUDA 10.2 (not the docker, the machine the agent is on)
No that has nothing to do with it, the CUDA is inside the container. I'm referring to this image https://allegroai-trains.slack.com/archives/CTK20V944/p1593440299094400?thread_ts=1593437149.089400&cid=CTK20V944
Assuming this is the output from your code running inside the docker, it points to CUDA version 10.2
Am I missing something ?
Specifically, your error seems to be an issue with the NVIDIA Triton container upgrade
Nice debugging experience
Kudos on the work !
BTW, I feel weird adding an issue on their GitHub, but someone should; this generic setup will break all sorts of things ...
Oh, so is it a bug and you should have seen two series on each graph? (I think it is... not sure how to actually name the second instance other than by a running number)
I don't want a new task every 5 minutes as that will create a lot of tasks over a day. It would be better if I had just one task.
Oh you mean the Task that will be launched will override the previous "instance", correct ?
but can it NOT use /tmp for this? I'm merging about 100GB
You mean to configure your temp folder for squashing?
you can hack the following:
```
import tempfile

tempfile.tempdir = "/my/new/temp"
# ... Dataset squash ...
tempfile.tempdir = None
```
But regardless, I think this is worth a GitHub issue with a feature request to set the temp folder.
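In context it would look roughly like this (just a sketch; the path, dataset name and IDs are placeholders):
```
import tempfile
from clearml import Dataset

# Point Python's temp directory at a volume with enough free space,
# since squashing stages the merged data under tempfile.gettempdir()
tempfile.tempdir = "/mnt/big_disk/tmp"  # placeholder path
try:
    merged = Dataset.squash(
        dataset_name="merged_dataset",                 # placeholder name
        dataset_ids=["dataset_id_a", "dataset_id_b"],  # placeholder IDs
    )
finally:
    # restore the default temp-dir resolution afterwards
    tempfile.tempdir = None
```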
BitterLeopard33
How do I create a parent-child Dataset with the same dataset_id and only access the child?
Dataset ID is unique, the child will have a different UID. The name of the Dataset can be the same though.
Specifically, to create a child Dataset:
https://clear.ml/docs/latest/docs/clearml_data#datasetcreate
child = Dataset.create(..., parent_datasets=['parent_dataset_id'])
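For example, a rough sketch (the dataset/project names and parent ID are placeholders):
```
from clearml import Dataset

# create a child Dataset that builds on the parent's content
child = Dataset.create(
    dataset_name="my_dataset",              # can be the same name as the parent
    dataset_project="datasets_project",     # placeholder project
    parent_datasets=["parent_dataset_id"],  # placeholder parent ID
)
child.add_files("/path/to/new_or_changed_files")  # only the delta on top of the parent
child.upload()
child.finalize()
print(child.id)  # the child still gets its own unique ID
```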
Are there any ways to access the parent dataset (assuming it's large and I don't want to download it)?
...
GiganticTurtle0 the fix was pushed 🙂
you can test with: pip install git+
🤞
what if for some old tasks I get WARNING:root:Could not delete Task ID=a0908784a2a942c3812f947ec1f32c9f, 'Task' object has no attribute 'delete'? What's the best way of cleaning them?
This seems like an old SDK, no?
Shouldn't this be a real value and not a template
you mean the value being pulled to the pod that failed?
Hm GiganticTurtle0, let me quickly check it
OutrageousGrasshopper93 could you send an example of the two links from the artifacts (one local one remote) ?
I mean to use a function decorated with PipelineDecorator.pipeline inside another pipeline decorated in the same way.
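Something like this, just to illustrate what I mean (a sketch only; the names are made up and I'm not assuming nesting is actually supported):
```
from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component(return_values=["result"])
def step(x):
    # a regular pipeline step
    return x * 2

@PipelineDecorator.pipeline(name="inner", project="examples", version="0.1")
def inner_pipeline(x):
    return step(x)

@PipelineDecorator.pipeline(name="outer", project="examples", version="0.1")
def outer_pipeline():
    # calling one decorated pipeline from inside another one
    return inner_pipeline(21)
```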
Ohh... so would it make sense to add "helper_functions" so that a function will be available in the step's context ?
Or maybe we need a new way to support a "standalone" decorator?! Currently, to actually "launch" the function step, you have to call it from the "pipeline" main logic function, but, at least in theory, one could do without the Pipeline itself.....
seems like I'm passing in my own docker image which is then used at run time?
You are passing the Default docker image, if the Task does not list a specific docker image it will use the one you passed.
Yes this is "docker mode" (in venv mode no dockers are used, it just creates a new venv per experiment and installs everything inside the venv)
And having a pdf is easier/better than sharing a link to the results page ?
Hi AverageBee39
It seems the json is corrupted, could that be ?
(We should probably better state it in the GitHub readme)
Hi @<1566596960691949568:profile|UpsetWalrus59>
Could it be the two experiments have the exact same name?
(It sounds like a bug in the UI, but I'm trying to make sure, and also to understand how to reproduce it)
What's your clearml-server version ?
```
from clearml.automation.parameters import LogUniformParameterRange

sampler = LogUniformParameterRange(name='test', min_value=-3.0, max_value=1.0, step_size=0.5)
sampler.to_list()

Out[2]:
[{'test': 1.0},
 {'test': 3.1622776601683795},
 {'test': 10.0},
 {'test': 31.622776601683793},
 {'test': 100.0},
 {'test': 316.22776601683796},
 {'test': 1000.0},
 {'test': 3162.2776601683795}]
```
Unfortunately this sounds like a classic case of RBAC (role based access control), and only the enterprise version has that feature (I think there is a contact us button on the website for those queries).
The easiest way to support the use case you describe is to share on a Task level 😞
Hi @<1614069770586427392:profile|FlutteringFrog26>
So since you have the Task id, you do:
task = Task.get_task("task id here")
Then to get the models:
models = task.models["output"]
the models is both a list and a dict; if you want the last one you do last_model = models[-1]
if you know the best model name you do model = models["best model"]
(notice the model name is the exact one you see in the UI). Once you have the model object you can get a copy with `model.get_lo...
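Putting it together, roughly (the task ID and model name below are placeholders, and I'm assuming get_local_copy() is the call cut off above):
```
from clearml import Task

task = Task.get_task("task_id_here")       # placeholder task ID
output_models = task.models["output"]      # behaves like both a list and a dict

last_model = output_models[-1]             # last reported output model
best_model = output_models["best model"]   # or look it up by the exact name shown in the UI

# fetch a local copy of the model weights file
weights_path = last_model.get_local_copy()
print(weights_path)
```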
And is "requirements-dev.txt" in your git root folder?
What is your clearml-agent version?
when you say use Task.current_task(), you mean for logging? which I'm guessing the fastai binding should do, right?
right, this is a fancy way of saying: make sure the actual sub-process is initializing ClearML so all the automagic kicks in. Since this is not "forked" but a whole new process, calling Task.current_task is the equivalent of calling Task.init with the same arguments (which you can also do; I'm not sure which one is more straightforward, wdyt?)
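As a sketch of what that looks like in the spawned process (project/task names are placeholders):
```
# entry point of the spawned sub-process (a brand-new process, not a fork)
from clearml import Task

# re-attach to ClearML here so the automagic bindings (e.g. fastai) kick in;
# Task.current_task() returns None if nothing initialized ClearML in this process,
# in which case Task.init() with the parent's arguments does the job
task = Task.current_task() or Task.init(
    project_name="my_project",  # placeholder
    task_name="my_task",        # placeholder
)
```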
Could you clarify the question for me, please?
...
Could you please point me to the piece of ClearML code related to the downloading process?
I think I mean this part:
https://github.com/allegroai/clearml/blob/e3547cd89770c6d73f92d9a05696018957c3fd62/clearml/datasets/dataset.py#L2134
Hi TroubledJellyfish71
What do you have listed in the Task's execution "installed packages" section (of the original Task)?
How did it end up with an HTTP link for PyTorch?
Usually it would be torch==1.11
...
EDIT:
I'm assuming the original Task was executed on a Mac M1; what are you getting when calling pip freeze?
And where is the agent running ? (and is it venv or docker mode?)
That said, it might be a different backend; I'll test with the demo server