Hi DepressedChimpanzee34
This is not a query call, this is a reporting call. See the docs below:
https://clear.ml/docs/latest/docs/references/api/workers#post-workersstatus_report
It is used by the worker to report its own status.
I think this is what you are looking for:
https://clear.ml/docs/latest/docs/references/api/workers#post-workersget_stats
GleamingGrasshopper63 can you ping your API server?
!ping api.server.here
Also what's the api server you configured ? (ip:8008 ?)
Any chance this is a local machine, i.e. the Colab machine cannot reach the clearml-server because it is running locally?
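If ping is blocked, a TCP check from the notebook can answer the same question. This is only a minimal sketch: can_reach is a hypothetical helper name, and port 8008 is assumed from the "ip:8008" question above.

```python
import socket


def can_reach(host: str, port: int = 8008, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False


# Usage (replace the placeholder with the configured API server host):
# print(can_reach("api.server.here"))
```

If this returns False from Colab but True from your own machine, the server is most likely only reachable on the local network.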
Hi @<1566596960691949568:profile|UpsetWalrus59>
you should call it before initializing the Task
Task.ignore_requirements("pywin32")
task = Task.init(...)
Hi @<1600299043865497600:profile|MagnificentSeaurchin90>
Any chance you can provide more info on the error?
if I want to compare two experiments the scalar plots do not load ( loading forever ).
I'm assuming the issue is the Plots tab? or is it the Scalars? what do you have in the Plots? can you send an image of the single experiment ?
(But in venv mode it also hangs the same way)
Hmm this is strange, could it be you are running out of storage ?
AdventurousButterfly15 this one is quite self-contained:
https://github.com/allegroai/clearml/blob/master/examples/reporting/scalar_reporting.py
So I guess pip install finished working
But the task is evidently not being executed.
This is very odd ... you can run the agent with debugging with --debug --foreground to see all the outputs and logs
If the same happens in venv mode, see if the pip process actually finished (you can find it with ps -Af | grep pip)
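The same check can be scripted, for example to poll while the agent is running. A rough sketch, assuming a Linux machine where ps -Af is available; pip_lines and pip_is_running are hypothetical helper names.

```python
import subprocess
from typing import List


def pip_lines(ps_output: str) -> List[str]:
    """Return the lines of `ps -Af` output that mention pip (ignoring grep itself)."""
    return [
        line
        for line in ps_output.splitlines()
        if "pip" in line and "grep" not in line
    ]


def pip_is_running() -> bool:
    """Run `ps -Af` and report whether any pip process is still alive."""
    result = subprocess.run(
        ["ps", "-Af"], capture_output=True, text=True, check=True
    )
    return bool(pip_lines(result.stdout))
```

If pip_is_running() keeps returning True long after the install should have finished, the hang is probably in pip rather than in the task itself.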
your account has 2FA enabled and you must use a personal access token instead of a password.
I'm assuming you have created the personal access token and used it, not the password
Hi @<1523702786867335168:profile|AdventurousButterfly15>
Make sure you pass output_uri=True in Task.init
It will automatically upload your model to the file server. You can also configure it in the clearml.conf, look for default_output_uri
Hi @<1523702786867335168:profile|AdventurousButterfly15>
I am running cross_validation, training a bunch of models in a loop like this:
Use the wildcard or disable it altogether:
task = Task.init(..., auto_connect_frameworks={"joblib": False})
You can also do
task = Task.init(..., auto_connect_frameworks={"joblib": ["realmodelonly.pkl", ]})
The Cloud Access section is in the Profile page.
Any storage credentials (S3 for example) are only stored on the client side (never on the trains-server), which is why we need to configure them in the trains.conf. When the browser needs to access those URLs (e.g. downloading an artifact) it also needs the secret/key, so it automatically displays a popup requesting them and stores them in this section. Notice they are stored in the browser session (as a cookie).
OutrageousGrasshopper93 could you send an example of the two links from the artifacts (one local one remote) ?
yes, looks like. Is it possible?
Sounds odd...
What's the exact project/task name?
And what is the output_uri?
Thanks OutrageousGrasshopper93
I will test it "!".
By the way the "!" is in the project or the Task name?
How are you getting:
beautifulsoup4 @ file:///croot/beautifulsoup4-split_1681493039619/work
Is this what you had in the original manual execution? (i.e. not the one executed by the agent) - you can also look under the "org _pip" dropdown in the "installed packages" of the failed Task
Since the error says network error, is it possible because I'm in Taiwan? Like downloading from Asia leads to this kind of issue
Can you download it from the browser ? (I mean the file size after download , is it 400mb?)
BTW: I suspect this is the main issue:
https://github.com/python-poetry/poetry/issues/2179
... transformed to 'str' when passed to a function decorated with PipelineDecorator.component at the time of calling it in the pipeline itself. Again, is this something intentional?
Are you sure about that? Notice the example code specifies int as well...
It seems the code is trying to access an s3 bucket, could that be the case? PanickyMoth78 any chance you can post the full execution log? (Feel free to DM so it won't end up being public)
I wonder if I just need to join 2 docker-compose files to run everything in one session
Actually that could also work
But for reference, when I said IP I meant the actual host network IP, not 127.0.0.1 (which is the same as localhost)
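One way to look up that host network IP from Python, as a small sketch: host_network_ip is a hypothetical helper name, and the target address 10.255.255.255 is arbitrary, used only so the OS picks the outgoing interface.

```python
import socket


def host_network_ip() -> str:
    """Best-effort guess of this machine's IPv4 address on the host network."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # connect() on a UDP socket sends no packets; it only makes the OS
        # choose the local interface that would be used for that route.
        sock.connect(("10.255.255.255", 1))
        return sock.getsockname()[0]
    except OSError:
        # No usable route: fall back to loopback.
        return "127.0.0.1"
    finally:
        sock.close()
```

On a machine with several interfaces this returns only one of them, so double check it against the address the other containers can actually reach.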
Hi @<1523706266315132928:profile|DefiantHippopotamus88>
The idea is that clearml-server acts as a control plane and can sit on a different machine; obviously you can run both on the same machine for testing. Specifically, it looks like clearml-serving is not configured correctly, as the error points to an issue with the initial handshake/login between the triton containers and the clearml-server. How did you configure the clearml-serving docker compose?
Hi JitteryCoyote63
I would like to switch to using a single auth token.
What is the rationale behind that?
if I want to run the experiment the first time without creating the template?
You mean without manually executing it once ?
that means I need to pass a single zip file to the path argument in add_files, right?
Actually the opposite: you pass a folder (of files) to add_files. Then add_files remembers the files' location (and pre-calculates the hash of each file's content). When you call upload, it will actually compress the files that changed into a zip file (or several files, depending on the chunk size) and upload them to the destination (as specified in the upload call).
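To make the flow concrete, here is a toy sketch of the "hash first, zip only what changed" idea using only the standard library. It is not the ClearML implementation: hash_folder and zip_changed are hypothetical names, and chunking and the actual upload are left out.

```python
import hashlib
import zipfile
from pathlib import Path
from typing import Dict


def hash_folder(folder: Path) -> Dict[str, str]:
    """Map each file's relative path to the SHA-256 of its content."""
    return {
        path.relative_to(folder).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(folder.rglob("*"))
        if path.is_file()
    }


def zip_changed(folder: Path, previous: Dict[str, str], archive: Path) -> Dict[str, str]:
    """Zip only new or modified files into `archive`; return the current hashes.

    `archive` should live outside `folder`, otherwise it would be hashed
    as part of the data on the next call.
    """
    current = hash_folder(folder)
    changed = [name for name, digest in current.items() if previous.get(name) != digest]
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for name in changed:
            zf.write(folder / name, arcname=name)
    return current
```

Calling zip_changed with an empty previous dict packs everything; passing the returned hashes back in on the next call packs only the files whose content differs.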
MelancholyElk85
After I set base docker for pipeline controller task, I cannot clone the repo...
What do you mean by that?
Also, how do you set the PipelineController base_docker_image? (I'm assuming this is needed to run the pipeline logic, is that correct?)
Also in the same open docker session, can you try:
$LOCAL_PYTHON -m clearml_agent execute --disable-monitoring --id <task_id_here>
Where the Task ID is one of the failed executions (just reset it first)