Hi SubstantialElk6
Quick update: once clearml 1.1 is out, we will push the clearml-data improvement supporting chunks per version (i.e. packaging the changeset into multiple zip files, instead of the single one the current version uses).
Regarding (1), the storage limit on the server:
Ideally, we should be able to specify the batch size that we want to download, or even better, tie this in with the training by parallelising the data download, data preprocessing and batch training.
With the nex...
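For context, a rough sketch of how the chunked flow could look once it lands (parameter names are illustrative and may still change):
```
from clearml import Dataset

# create a new version and pack the changeset into multiple zip chunks
ds = Dataset.create(dataset_name='my_dataset', dataset_project='datasets')
ds.add_files('/data/changeset/')
ds.upload(chunk_size=512)  # e.g. ~512MB per zip instead of a single archive
ds.finalize()

# later, fetch only a part of the version instead of the full changeset
local_path = Dataset.get(dataset_id=ds.id).get_local_copy(part=0, num_parts=4)
```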
Also, the IDs as an entry in the Configuration will not be clickable in the web interface, right?
No, but on the other hand, it will be editable if you clone the Task.
Which brings me to a different scenario,
In the original one, the Main Task created the Dataset, i.e. Output Dataset (and stored it both ways).
I can think of a situation where the Task is using the Dataset as input (say preprocessing or training); then we might want to enable users to clone and change the input dataset. wdyt?
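To make the scenario concrete, a minimal sketch (the 'input_dataset_id' entry is hypothetical) of connecting the input dataset ID so a cloned Task can override it:
```
from clearml import Dataset, Task

task = Task.init(project_name='examples', task_name='preprocess')

# expose the input dataset ID as an editable configuration entry;
# cloning the Task then lets you swap the dataset before launching
config = {'input_dataset_id': '<dataset-id>'}
task.connect(config, name='datasets')

local_path = Dataset.get(dataset_id=config['input_dataset_id']).get_local_copy()
```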
You need to use tf.summary.image and not summary_ops_v2.image
Fixed on main branch (see github issue), RC later today
The image needs to be in range [0, 1], not [0, 255] (matplotlib and TensorBoard can handle either one).
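For reference, a minimal working sketch (float images in [0, 1], NHWC shape):
```
import numpy as np
import tensorflow as tf

writer = tf.summary.create_file_writer('./logs')
with writer.as_default():
    # float images must be in [0, 1]; shape is (batch, height, width, channels)
    img = np.random.rand(1, 64, 64, 3).astype(np.float32)
    tf.summary.image('toy', img, step=0, max_outputs=10)
```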
Is there code to reproduce it?
So it seems to get the "hint" from the type:
This will work:
```
tf.summary.image('toy255', (ex * 255).astype(np.uint8), step=step, max_outputs=10)
```
wdyt, should it actually check min/max and manually cast it ?
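Roughly what I have in mind (just a sketch of the heuristic, not what the binding currently does):
```
import numpy as np

def normalize_image(img):
    # guess the range from the dtype/values and cast to uint8
    if img.dtype != np.uint8:
        if float(img.max()) <= 1.0:
            img = img * 255.0
        img = img.astype(np.uint8)
    return img
```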
StaleButterfly40 are you sure you are getting the correct image on your TB (toy255) ?
I get the same "white" image in both TB & ClearML 😞
I mean, can you install it with something like:
```
pip install git+...
```
Basically the agent will install the main repository, and any git submodules. But it cannot install multiple repositories, as reconstructing the combined directory structure becomes too ambiguous.
wdyt?
Hi ConvolutedChicken69
but when running the script it only clones the repo the clearml task is on; how can it also get the other repo?
Do you have a wheel or a git repo you can install it from?
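If it is pip-installable from git, you could hint the agent to install it as a requirement, something like this (hypothetical URL; note add_requirements must be called before Task.init):
```
from clearml import Task

# ask the agent to pip-install the second repo alongside the main one
Task.add_requirements('git+https://github.com/org/other_repo.git')

task = Task.init(project_name='examples', task_name='multi repo')
```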
Hi BattyLizard6
Not that I'm aware of, which TF version are you using, and which clearml version?
Is there code that can reproduce it?
BTW: see if this works:
```
$ CLEARML_API_HOST_VERIFY_CERT=0 clearml-init
```
Hi Team, I'm currently trying to install ClearML-Server on a PowerPC server with RedHat 7.
You are a brave man LividCrab90 !
Are there dockerfiles for the ClearML-Server stack somewhere?
The main issue is replacing the DB containers; do you have elastic/mongo/redis images for PowerPC?
```
Task.current_task().connect(training_args, name='huggingface args')
```
And you should be able to change them when launching remotely 😉
SmallDeer34 btw: "set_parameters_as_dict" will replace all the arguments (and is one-way) ...
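A quick sketch of the difference (values here are illustrative):
```
from clearml import Task

task = Task.init(project_name='examples', task_name='hf trainer')

# two-way: connected values are overridden from the UI when running remotely
training_args = {'learning_rate': 5e-5, 'epochs': 3}
task.connect(training_args, name='huggingface args')

# one-way: replaces all the task's parameters, UI edits do not flow back
# task.set_parameters_as_dict({'learning_rate': 5e-5, 'epochs': 3})
```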
Then in theory (since the backend is python based) you just need to find a base docker image to build it on.
UnevenDolphin73 go to the profile page, I think at the bottom right corner you should see it
(Also ctrl-F5 to reload the web application, if you upgraded the server 🙂 )
DisgustedDove53 , TrickySheep9
I'm all for it!
I can think of two options here: (1) use the k8s glue + apply template with ports mode, see the discussion at https://clearml.slack.com/archives/CTK20V944/p1628091020175100
(2) create an interface (queue) to launch an arbitrary job on the k8s cluster, with the full pod definition on the Task. This would allow the clearml-session to set everything up from the get-go.
How would you interface with the k8s operator, and what exactly will it do?
(BTW: the reas...
MiniatureCrocodile39 from the screenshot I imagine you are running inside a docker container; this means that when you restart the container, the configuration file is lost.
Could that be the case ?
And your ~/clearml.conf?
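If that is the case, you can persist the configuration by mounting it from the host when starting the container, something like (paths are illustrative):
```
docker run -it -v $HOME/clearml.conf:/root/clearml.conf <your-docker-image>
```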
but I cannot compare between them
I think we noticed it, and this will be fixed in the next server update (again, some plotly.js issue there)
Did you experience any performance drop using forkserver?
No, seems to be working properly for me.
If yes, did you test the variant suggested in the pytorch issue? If yes, did it solve the speed issue?
I haven't tested it; that said, it seems like a generic optimization of the DataLoader.
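i.e. something along these lines (untested on my end; 'forkserver' is a standard DataLoader multiprocessing context):
```
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.randn(1024, 16))

# use forkserver as the worker start method instead of the default fork
loader = DataLoader(dataset, batch_size=32, num_workers=4,
                    multiprocessing_context='forkserver')
```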
RobustSnake79 let's assume that the trace figure above is probably too much to get into the WebUI; which simpler figures would still have value in your scenario?
Three options:
1. In your code: Task.init(..., output_uri='s3://.../')
2. Configure a default output_uri to be used by all tasks: https://github.com/allegroai/clearml/blob/64042f6c4fdaaf15b6c5f816f2fbf50f89c313e2/docs/clearml.conf#L156
3. In the UI, after you clone a Task, under the Execution tab: "Output" > "Destination"
In all cases output_uri can be:
/mnt/share/folder (if you have a shared folder between all machines)
http://trains-server:8081/
gs://bucket
azure://bucket/
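For example, option 2 in ~/clearml.conf would look something like (bucket name is illustrative):
```
sdk {
    development {
        # every Task.init() will upload models to this destination by default
        default_output_uri: "s3://my-bucket/models"
    }
}
```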
I mean what is the actual link?
file:// is a path to a file.
If your machine cannot access that path you get an error.
For example:
file:///home/user/file.bin
translates to /home/user/file.bin
If you do not have the file /home/user/file.bin on your machine you get an error.
GrievingTurkey78 make sense ?
Note that by default trains/clearml will not upload your weights file anywhere; it will only do that if you set "output_uri" to a specific location.