
Thanks @<1694157594333024256:profile|DisturbedParrot38> !
Nice catch.
Could you open a GitHub issue so that at least we output a more informative error?
Hi JuicyDog96
The easiest way at the moment (apologies for the lack of REST API documentation, it is coming :) is actually the code itself (full docstring documentation):
https://github.com/allegroai/trains/tree/master/trains/backend_api/services/v2_8
You can access it all with an easy Pythonic interface, for example:
from trains.backend_api.session.client import APIClient

client = APIClient()
tasks = client.tasks.get_all()
Interesting!
I would also add that the Task name is not unique, so you can use it to describe the "process / goal etc.", which makes it pretty obvious to search / review from the UI.
Regarding models and branches, I would use the Task tags (you can have as many as you like) to tag the specific model type (or the dev branch if the algorithm is different); this means you can also easily filter based on the Tags in the UI.
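A minimal sketch of that, assuming the clearml SDK (project/task names and the tags here are just placeholders):

from clearml import Task

task = Task.init(project_name="examples", task_name="resnet training")  # placeholder names
# tags are free-form; add as many as you like, then filter by them in the UI
task.add_tags(["resnet", "dev-branch"])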
can you use the Web UI to compare the artifacts from two separate subprojects?
Yes comp...
each of it gets pushed as a separate Model entity right?
Correct
But there's only one unique model with multiple different versions of it
Do you see multiple lines in the Model repository? (Every line is an entity.) Basically, if you store it under the same local file it will override the model entry (i.e. reuse it and update the file itself); otherwise you are creating a new model, and "version" will just be its progress over time?
Are you suggesting the default "ubuntu:18.04" is somehow contaminated ?
This is an official Ubuntu container (nothing to do with ClearML), this is Very Very odd...
DilapidatedDucks58 how exactly are you "relaunching/continuing" the execution? And what exactly are you setting?
but I cannot compare between them
I think we noticed it, and this will be fixed in the next server update (again, some plotly.js issue there)
Hi UnsightlySeagull42
But now I need the hyperparameters in every python file.
You can always get the Task from anywhere:
main_task = Task.current_task()
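For example, to read the hyperparameters from any other module, a minimal sketch (the parameter name is illustrative):

from clearml import Task

main_task = Task.current_task()  # the Task created by Task.init in the entry script
params = main_task.get_parameters()  # flat dict, e.g. {"General/learning_rate": "0.001"}
learning_rate = float(params.get("General/learning_rate", 0.001))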
MelancholyBeetle72 thanks! I'll see if we could release an RC with a fix soon, for you to test :)
I start the TaskScheduler, register a task, and stop the scheduler. How do I restart the TaskScheduler in a way that re-registers the tasks?
if it's aborted, just re-enqueue it?
(it serializes itself and stores its state on the Task object, so when re-launched it will deserialize from the last state)
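A rough sketch of that flow, assuming the clearml.automation TaskScheduler API (the task id and queue names are placeholders):

from clearml.automation import TaskScheduler

scheduler = TaskScheduler()
scheduler.add_task(schedule_task_id="<task_id>", queue="default", hour=0, minute=30)
# run the scheduler as its own Task; if that Task is aborted, re-enqueuing it
# deserializes the last stored state, including the registered tasks
scheduler.start_remotely(queue="services")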
Long story short, not any longer (in previous versions of k8s it was possible, but after the runtime container change it is not supported)
RobustGoldfish9 I see.
So in theory spinning an experiment on an agent would be: clone code -> build docker -> mount code -> execute code inside docker?
(no need for requirements etc.?)
Hi RoundMosquito25
The main problem here is there is no way to know before running the Task how much memory it would need ... And without that parameter maximizing GPUs is quite challenging. wdyt?
Correct (basically pip freeze results)
Hi SharpDove45
what was suggested about how it fails on bad/missing credentials
Yes, this is correct; since you specifically set the hosts, worst case you will end up with wrong credentials
mostly by using Task.create instead of Task.init.
UnevenDolphin73, now I'm confused. Task.create is Not meant to be used as a replacement for Task.init; it is there so you can manually create an additional Task (not the current process Task). How are you using it?
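To make the distinction concrete, a minimal sketch (project/task names are placeholders):

from clearml import Task

# Task.init attaches the current process to a Task (creating or resuming one)
task = Task.init(project_name="examples", task_name="current run")

# Task.create only registers an additional Task entity; it does not track this process
extra = Task.create(project_name="examples", task_name="manually created task")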
Regarding the second - I'm not doing anything per se. I'm running in offline mode and I'm trying to create a dataset, and this is the error I get...
I think the main thing we need to...
-rw------- 1 1000 1000 0 Feb 28 23:41 config
BoredGoat1 where exactly do you think that happens ?
https://github.com/allegroai/trains/blob/master/trains/utilities/gpu/gpustat.py#L316
?
https://github.com/allegroai/trains/blob/master/trains/utilities/gpu/gpustat.py#L202
Clearml 1.13.1
Could you try the latest (1.16.2)? I remember there was a fix specific to Datasets
However, once I extract the zips (or download the dataset through Python API or CLI) not all the files are there.
And all the files are registered in the metadata? Could you add --verbose to the sync command to see what it is doing?
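Something along these lines (the dataset id is a placeholder):

clearml-data sync --id <dataset_id> --folder ./data --verbose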
"clearml-data add --folder ./*" seems to fix this issue though it doesn't preserve my directory structure
This is also odd, it should Not flatten the folder structure. What is your OS / Python / clearml version?
Is this reproducible? If so, how...
Hi @<1600299043865497600:profile|MagnificentSeaurchin90>
Any chance you can provide more info on the error?
if I want to compare two experiments the scalar plots do not load (loading forever).
I'm assuming the issue is the Plots tab? or is it the Scalars? what do you have in the Plots? can you send an image of the single experiment ?
Hi PanickyMoth78
Yes, I think you are correct, this looks like gs (Google Storage) throttling your connection. You can control the number of concurrent uploads with max_workers=1
https://github.com/allegroai/clearml/blob/cf7361e134554f4effd939ca67e8ecb2345bebff/clearml/datasets/dataset.py#L604
Let me know if it works
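For example, a minimal sketch (project and dataset names are placeholders; the argument is max_workers in recent clearml versions):

from clearml import Dataset

ds = Dataset.create(dataset_project="examples", dataset_name="my_dataset")
ds.add_files("./data")
ds.upload(max_workers=1)  # a single concurrent upload avoids hitting GS rate limits
ds.finalize()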
In theory this would be doable, but wouldn't it be a bit confusing? Also, why not always use containers if the host supports them? There is no real downside, just set the default docker image to something that is a good starting point.
Hi PompousParrot44
What do you have in the Execution/"script path" ?
Hi @<1533620191232004096:profile|NuttyLobster9>
Hi All, is there a way to clone a pipeline from the web UI like you can with a task?
Right click on the pipeline and select Run (it is basically the same thing as cloning it)