"Requested version: 2.28, Used version 1.0" for some reason
This is fine, it means there is no change in that API
containing the Extension module
Not sure I follow, what is the Extension module ? what were you running manually that is not just pip install /opt/keras-hannd ?
OutrageousSheep60 so this should work, no?
ds.upload(output_url='gs://<BUCKET>/', compression=0, chunk_size=100000000000)
Notice the chunk size is the maximum size (in bytes) per chunk, so it should basically be very large
OmegaConf is the configuration, the overrides are in the Hyperparameters "Hydra" section
could be nice to have a direct "task comparison" link in the UI somewhere,
you mean like a "cart" for comparison ? or just to "save the state" so you can move between projects ?
Hi DefeatedCrab47
You should be able to change the Web server port, but the API port (8008) cannot be changed. If you can log in to the web app and create a project it means everything is okay. Notice that when you configure trains ( trains-init ) you should make sure the port numbers are correct
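For reference, a minimal sketch of the relevant section in trains.conf, assuming a local server on the default ports (the exact keys may differ between versions):
api {
    # the API server port (8008) is fixed
    api_server: http://localhost:8008
    # the web server port can be whatever you mapped it to
    web_server: http://localhost:8080
}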
What do you already have working from the above steps ? and which parts are missing or could we think of automating ?
Hi @<1697056701116583936:profile|JealousArcticwolf24>
Awesome deployment!
Yes, if you need to scale model serving you can just run another instance of the clearml-serving-inference container
https://github.com/allegroai/clearml-serving/blob/7ba356efc97a6ae2159283d198d981b3c1ab85e6/docker/docker-compose.yml#L77
So you end up with two of them, one per model's environ...
The issue is the 400 returned from the server, let me check with the backend guys
ZanyPig66 it sounds like you need to add the docker args for binding, just add to the Task.create call the argument:
docker_args="-v /mnt/host:/mnt/container"
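For example, a minimal Python sketch (the project, repo and image names are placeholders):
from clearml import Task

task = Task.create(
    project_name="examples",                     # placeholder project
    task_name="docker bind mount",
    repo="https://github.com/example/repo.git",  # placeholder repo
    docker="nvidia/cuda:11.0-base",              # placeholder base image
    docker_args='-v /mnt/host:/mnt/container',   # bind the host folder into the container
)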
Hi BitingKangaroo95
Are you running the agent in docker mode or venv mode ?
basically, clearml-session will work on clearml-agents that are running in docker mode
(I think we already have a fix for the documentation, probably will be deployed soon)
Hi @<1618780810947596288:profile|ExuberantLion50>
I'm trying to containerize a task using clearml-agent build, following instructions from the docs online.
Do you mean to create a container with the Task's environment for debugging ?
If this is for running the Task there is no need to create a specific container for it, both code and python env are cached.
Hi ItchyJellyfish73
The behavior should not have changed.
"force_repo_requirements_txt" was always a "catch all option" to set a behavior for an agent, but should generally be avoided
That said, I think there was an issue with v1.0 (clearml-server) where when you cleared the "Installed Packages" it did not actually clear it, but set it to empty.
It sounds like the issue you are describing.
Could you upgrade the clearml-server and test?
This is done in the background while accessing the cache, so it should not have any slowdown effect
Hi @<1549202366266347520:profile|GorgeousMonkey78>
how do I integrate sagemaker with clearml ,
you mean to launch an experiment, or just to log it?
Yes MuddySquid7 it automatically detects it (regardless of you uploading the DF as an artifact).
How are you saving the dataframe ?
(it will auto log any joblib.save call, is that it?)
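For reference, explicit upload looks roughly like this (project and artifact names are placeholders):
import pandas as pd
from clearml import Task

task = Task.init(project_name="examples", task_name="df artifact")
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
# explicit upload; saves via joblib are also picked up automatically as noted above
task.upload_artifact(name="my_dataframe", artifact_object=df)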
Using the dataset.create command and the subsequent add_files, and upload commands I can see the upload action as an experiment but the data is not seen in the Datasets webpage.
ScantCrab97 it might be that you need the latest clearml package installed on the client end (as well as the new server with the UI)
What is your clearml package version ?
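To check, something like:
import clearml
print(clearml.__version__)  # upgrade with: pip install -U clearml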
clearml-agent deployment file
What do you mean by that? is that the helm chart of the agent ?
Hi @<1523715429694967808:profile|ThickCrow29>
clearml.automation.auto_scaler.AutoScaler which runs smoothly (kudos!!).
NICE!
The only thing I am missing is in the clearml dashboard/orchestration --> Is there a way to make it
hmm kind of needs backend support for that
For now, I can just see the log of the clearML task to monitor what's happening
Or is this restricted to pro users ?
Yeah, the GCP and AWS autoscaler dashboards are a paid tier feature. But...
Why do you ask? is your server sluggish ?
Looking at the supervisor method of the base AutoScaler class, where are the worker IDs kept?
Is it in the class attribute queues ?
Actually the supervisor is passing a fixed prefix, then it queries the clearml-server for workers whose names start with this prefix.
This way we can have a fixed init script for all agents, while we still can differentiate them from the other agent instances in the system. Make sense ?
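A rough sketch of that lookup against the server API (the prefix value is illustrative):
from clearml.backend_api.session.client import APIClient

client = APIClient()
prefix = "aws_autoscaler:"  # illustrative worker-name prefix
# list all registered workers and keep the ones whose IDs carry the supervisor's prefix
my_workers = [w for w in client.workers.get_all() if w.id.startswith(prefix)]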
I want the task of human tagging a model to be "just another step in the pipeline"
That makes total sense.
Quick question, would you prefer the pipeline controller to "wait" for the tagging and then continue, or would it make more sense to create a trigger on the tagging ?
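If you go the trigger route, a minimal sketch with TriggerScheduler (the tag name is a placeholder):
from clearml.automation import TriggerScheduler

def on_model_tagged(model_id):
    # continue the pipeline once a human approved the model
    print("model approved:", model_id)

trigger = TriggerScheduler(pooling_frequency_minutes=3)
trigger.add_model_trigger(
    name="human tagging step",
    schedule_function=on_model_tagged,
    trigger_on_tags=["human-approved"],  # placeholder tag
)
trigger.start()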
Let's say I don't have the data on my local machine but only an S3 bucket.
You can still register it, but make sure you do not delete it from the S3 bucket because it will keep a link to it
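Roughly (the bucket path is a placeholder):
from clearml import Dataset

ds = Dataset.create(dataset_name="my_dataset", dataset_project="examples")
# register by link only; nothing is copied off the bucket,
# so the objects must stay in place for the dataset to remain valid
ds.add_external_files(source_url="s3://<BUCKET>/data/")
ds.upload()
ds.finalize()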
Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known')': /
what did you put in output_uri ?
compression=ZIP_DEFLATED if compression is None else compression
wdyt?
The docker crashes and I want to be able to debug it exactly as it is run by the agent
On your machine (any machine)
pip install clearml-agent
clearml-agent build --id <taskID> --docker "local_mydocker_name"
docker run -it local_mydocker_name bash
Hi @<1619867971730018304:profile|WhimsicalGorilla67>
No, only the "admin" (owner) of the workspace has access to it
SoreDragonfly16 as SmallDeer34 mentioned, you can iterate over the Tasks, pull metrics (with either task.get_last_scalar_metrics or task.get_reported_scalar ), then report them on the Task that runs the loop itself with the Logger.
wdyt?
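Something along these lines (project and metric names are placeholders):
from clearml import Task

loop_task = Task.init(project_name="examples", task_name="aggregate loop")
logger = loop_task.get_logger()

for i, task in enumerate(Task.get_tasks(project_name="examples")):
    # nested dict: {title: {series: {"last": ..., "min": ..., "max": ...}}}
    metrics = task.get_last_scalar_metrics()
    value = metrics.get("accuracy", {}).get("top1", {}).get("last")  # placeholder metric
    if value is not None:
        logger.report_scalar(title="accuracy", series="top1", value=value, iteration=i)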
Hi SmallDeer34
The clearml-agent has its own clearml.conf file, there you should put the S3 credentials and they will be passed to any Task the agent executes:
https://github.com/allegroai/clearml-agent/blob/176b4a4cdec9c4303a946a82e22a579ae22c3355/docs/clearml.conf#L234
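For reference, the relevant block in the agent's clearml.conf looks roughly like this (values are placeholders):
sdk {
    aws {
        s3 {
            # default credentials used for any s3:// URI
            key: "<ACCESS_KEY>"
            secret: "<SECRET_KEY>"
            region: ""
        }
    }
}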
(Go to the profile page, and click "Disable HiDPI browser scale override" see if that helps)
2023-02-15 12:49:22,813 - clearml - WARNING - Could not retrieve remote configuration named 'SSH'
This is fine, it means it uses the default identity keys
The thing is - when I try to connect with normal SSH there are no issues
Now I'm lost, so when exactly do you see the issue ?