
The second seems like a botocore issue:
https://github.com/boto/botocore/issues/2187
AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_DEFAULT_REGION
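A quick way to confirm those three variables are actually visible to the process is a check like the one below. This is only a sketch: the helper name `missing_aws_vars` is made up for illustration, and it just inspects the environment, it does not validate the credentials.

```python
import os

# The three variables named above.
REQUIRED_AWS_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_DEFAULT_REGION")


def missing_aws_vars(env=None):
    """Return the required AWS variable names that are unset or empty.

    `env` defaults to the current process environment; passing a dict
    makes the check easy to try without touching real settings.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_AWS_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_aws_vars()
    print("missing:", ", ".join(missing) if missing else "none")
```

Run it in the same shell (or container) that launches the job, since that is the environment botocore will see.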
Ohh "~/trains.conf" is root probably
You should have a download button when you hover over the table, I guess that would be the easiest.
If needed I can send SDK code, but unfortunately there is no single call for that.
Is "my_package" a local package?
What is the output of: pip freeze | grep my_package
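If running the shell pipeline is awkward, roughly the same question can be asked from inside Python with the standard library. This is a sketch, not an exact equivalent: `installed_version` is a hypothetical helper, and unlike pip freeze it only reports the version, not whether the package was installed from a local path.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of `package_name`, or None if it is not installed."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "my_package" is the placeholder name used in the question above.
    print(installed_version("my_package"))
```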
Hi @<1523701066867150848:profile|JitteryCoyote63>
Thank you for bringing it up! Can you verify with the latest clearml-agent, 1.5.3rc2?
Hi OddShrimp85
If you pass 'output_uri=True' to Task.init, it will upload the model automatically, or, as you said, you can do it manually with the OutputModel class.
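For reference, the automatic option looks roughly like this. It is a minimal sketch: the helper names and the project/task names are placeholders, and the clearml import is kept inside the function so the snippet can be read (and loaded) without the package installed.

```python
def task_init_kwargs(project_name, task_name, upload_models=True):
    """Build the keyword arguments for Task.init.

    output_uri=True asks ClearML to upload models automatically;
    None leaves the default behaviour.
    """
    return {
        "project_name": project_name,
        "task_name": task_name,
        "output_uri": True if upload_models else None,
    }


def start_task(project_name, task_name, upload_models=True):
    """Start a ClearML task; requires the clearml package and a configured server."""
    from clearml import Task  # imported lazily so the sketch loads without clearml

    return Task.init(**task_init_kwargs(project_name, task_name, upload_models))


if __name__ == "__main__":
    # Placeholder names, replace with your own project and task.
    print(task_init_kwargs("examples", "auto upload demo"))
```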
Hello guys, I have 4 workers (2 in the default queue and 2 in the service queue, on the same machine)
Hi @<1526734437587357696:profile|ShaggySquirrel23>
I think what happens is that one agent is deleting its cfg file when it is done, but at least in theory each one should have its own cfg.
One last request: can you try with the agent's latest RC version, 1.5.3rc2?
Hi @<1600299043865497600:profile|MagnificentSeaurchin90>
Any chance you can provide more info on the error?
if I want to compare two experiments the scalar plots do not load (loading forever).
I'm assuming the issue is the Plots tab? Or is it the Scalars? What do you have in the Plots? Can you send an image of the single experiment?
Hi SubstantialElk6, I'll start at the end: you can run your code directly on the remote GPU machine.
See the clearml-task documentation on how to create a task from existing code and launch it:
https://github.com/allegroai/clearml/blob/master/docs/clearml-task.md
That said, the idea is that you add the Task.init call while you are writing the code itself; then later, when you want to run it remotely, you already have everything defined in the UI.
Make sense?
CluelessFlamingo93 I would also fix the pip version requirements to: pip_version: ["<20.2 ; python_version < '3.10'", "<22.3 ; python_version >= '3.10'"]
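To make the intent of those two markers explicit, here is how I read them, written out as plain Python. This is only an illustration of the selection logic, not how the agent actually resolves the setting; `pip_constraint` is a made-up name.

```python
import sys


def pip_constraint(python_version=None):
    """Return the pip version constraint that applies to a Python version.

    Mirrors the two markers above: "<20.2" below Python 3.10,
    "<22.3" for Python 3.10 and newer.
    """
    if python_version is None:
        python_version = sys.version_info
    major_minor = tuple(python_version[:2])
    return "<20.2" if major_minor < (3, 10) else "<22.3"


if __name__ == "__main__":
    print("pip" + pip_constraint())
```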
the SDK is unable to see each of the nodes?
Exactly! I mean I love the idea of a "nested" component, but implementation-wise this is not trivial, and it would also hurt the ability to cache individual components. The workaround is to have all the "business logic" in the pipeline function itself; routing data between components is basically "free". The data does not actually go through the pipeline logic, it only passes references (unless the pipeline logic actually tries to access the data o...
Yes
BTW: do you guys do remote machine development (i.e. Jupyter / vscode-server)?
Hi WittyOwl57
Are you starting a new server from scratch or is it running on previously stored data?
Yes MuddySquid7, it automatically detects it (regardless of whether you upload the DF as an artifact).
How are you saving the dataframe?
(It will auto-log any joblib.dump call, is that it?)
Based on this:
https://clear.ml/docs/latest/docs/references/api/endpoints#post-debugping
http://localhost:8080/debug.ping
BTW: what's the usage scenario?
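If you want to script the debug.ping check mentioned above instead of opening it in a browser, something like this should do. It is a sketch under assumptions: the host and port are the ones from the message, the helper names are made up, and I am assuming the endpoint answers a plain GET (the docs list it as POST).

```python
import urllib.request


def ping_url(base_url):
    """Build the debug.ping URL for a server base address."""
    return base_url.rstrip("/") + "/debug.ping"


def ping(base_url, timeout=5.0):
    """Call debug.ping and return (HTTP status, response body as text).

    Raises urllib.error.URLError if the server cannot be reached.
    """
    with urllib.request.urlopen(ping_url(base_url), timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")


if __name__ == "__main__":
    # Address taken from the message above; adjust to your deployment.
    print(ping("http://localhost:8080"))
```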
Hi SubstantialElk6
Generically, we would 'export' the preprocessing steps, set up an inference server, and then pipe data through the above to get results. How should we achieve this with ClearML?
We are working on integrating the OpenVino serving and Nvidia Triton serving engines into ClearML (they will both be available soon)
Automated retraining
In cases of data drift, retraining of models would be necessary. Generically, we pass newly labelled data to fine...
I guess it won't due to the nature of services?
Correct, k8s glue works differently. That said, I would actually use helm to spin up a pod with the agent in services mode and venv mode.
BTW:
Task.add_requirements('tensorflow', '2.2') will make sure you get the specified version
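In context that call would sit before Task.init, roughly as below. This is a sketch: `pin_and_init` and the project/task names are placeholders, and the clearml import is done lazily so the snippet loads even where the package is not installed.

```python
def pin_and_init(project_name, task_name, package="tensorflow", version="2.2"):
    """Pin a package version for the task, then start the task.

    Requires the clearml package and a configured server.
    """
    from clearml import Task  # imported lazily so the sketch loads without clearml

    # The requirement has to be registered before Task.init is called.
    Task.add_requirements(package, version)
    return Task.init(project_name=project_name, task_name=task_name)


if __name__ == "__main__":
    # Placeholder names, replace with your own project and task.
    pin_and_init("examples", "pinned tensorflow")
```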
Could it be the model storing? Could it be that the peak is at the end of the epoch?
Hi @<1523704157695905792:profile|VivaciousBadger56>
No, these are 3 different ways of building pipelines.
Creating from decorators is recommended when each component can be easily packaged into a single function (every function can have an accompanying repository).
Here the idea is that it is very easy to write complex execution logic; basically the automagic does the serialization/deserialization, so you can write pipelines like you would write Python code.
Creating from Tasks is a good match if you need to ...
Looks great, let me see if I can understand what's missing, because it should have worked ...
Hi ShallowCat10
What's the TB you are using?
Is this example working correctly for you?
https://github.com/allegroai/clearml/blob/master/examples/frameworks/tensorflow/tensorboard_pr_curve.py
HighOtter69
Could you test with the latest RC? I think this fixed it:
https://github.com/allegroai/clearml/issues/306
What's the OS / Python version?
I think the crux of the issue is the subprocess calls I removed.
That kind of makes sense, though if the subprocess function also had a Task.init call it should have worked.
Would that be the setup to try to replicate?
Something is off here ... Can you try to run the TB examples and the artifacts example and see if they work?
https://github.com/allegroai/clearml/blob/master/examples/frameworks/tensorflow/tensorflow_mnist.py
https://github.com/allegroai/clearml/blob/master/examples/reporting/artifacts.py