1st: is it possible to make a pipeline component call another pipeline component (as a substep)?
It should work as long as they are in the same file; you can, however, launch and wait on any Task (see "pipelines from tasks").
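For illustration, a minimal sketch of what that setup could look like, assuming both components are defined in the same file (names and logic are made up):

from clearml.automation.controller import PipelineDecorator


@PipelineDecorator.component(return_values=["prepared"], cache=True)
def prepare_data(raw):
    # first component (the sub-step)
    return [x * 2 for x in raw]


@PipelineDecorator.component(return_values=["total"], cache=True)
def train(raw):
    # calling the other component from inside this one; per the answer above,
    # this should work as long as both are defined in the same file
    prepared = prepare_data(raw)
    return sum(prepared)


@PipelineDecorator.pipeline(name="component-calls-component", project="examples", version="0.1")
def pipeline_logic():
    print(train([1, 2, 3]))


if __name__ == "__main__":
    # run everything locally for quick testing (adjust for real execution)
    PipelineDecorator.run_locally()
    pipeline_logic()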
2nd: I am trying to call a function defined in the same script, but I am unable to import it. I am passing the repo parameter to the component decorator, but nothing changes; it always comes back with "No module named <module>" after my
from module import function
c...
Hi FunnyTurkey96
Let me check what's the status here
(BTW: Is this for a specific Task or for a specific Project?)
Hi @<1610083503607648256:profile|DiminutiveToad80>
Yes, it does. They are also cached by default (on the machine with the agent)
None
Hi CurvedHedgehog15
I would like to optimize hparams saved in Configuration objects.
Yes, this is a tough one.
Basically, the easiest way to optimize is with hyperparameter sections, as they are key/value pairs you can control from the outside (see the HPO process).
Configuration objects are, well, blobs of data that "someone" can parse. There is no real restriction on them, since there are many standards to store them (yaml, json, ini, dot notation, etc.)
The quickest way is to add...
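To make the distinction concrete, a minimal sketch (project, section, and parameter names are illustrative): key/value hyperparameters connected with task.connect() can be overridden by the optimizer, while connect_configuration() stores a single blob:

from clearml import Task

task = Task.init(project_name="examples", task_name="hpo-friendly params")

# key/value hyperparameters: stored in a hyperparameter section,
# so the HPO process can override them from the outside
params = {"lr": 0.001, "batch_size": 64}
params = task.connect(params, name="training")

# a Configuration object: stored as one blob (yaml/json/ini/...),
# so individual values inside it cannot be addressed by the optimizer
model_cfg = task.connect_configuration({"layers": [128, 64], "dropout": 0.1}, name="model_config")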
Hi @<1566596960691949568:profile|UpsetWalrus59>
Could it be the two experiments have the exact name ?
(It sounds like a bug in the UI, but I'm trying to make sure, and also to understand how to reproduce it)
What's your clearml-server version ?
And the Slack thing is actually a good workaround for this, since people can just comment easily.
Any reference for a similar integration between Slack and other platforms ?
I'm thinking maybe the easiest way is a Slack bot, so you can @ a task id ?
function and just seem to be getting an "IsADirectoryError"?
Can you post here what you are getting ? which clearml version are you using ?!
also tried manually adding
leap==0.4.1
in the task UI which didn't work.
That has to work, if it did not, can you send the log for the failed Task (or the Task that did not install it)?
The environment in the logs does show that leap is being installed potentially from a cache?
- leap @ file:///opt/keras-hannd...
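If the UI entry keeps being overridden by the locally-resolved reference, one option (a sketch, using the package name and version from above) is to record the requirement explicitly in code:

from clearml import Task

# explicitly record "leap==0.4.1" as a pip requirement instead of the
# locally resolved file:/// reference; must be called before Task.init()
Task.add_requirements("leap", "0.4.1")

task = Task.init(project_name="examples", task_name="pin leap requirement")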
PompousBeetle71 quick question, will you ever want to pass an empty string? The reason for asking is that it is either one or the other; there is no way for Trains to actually differentiate (from the web UI perspective, this is just an empty string field...)
Also I can't call the "preprocess" function since there is no valid endpoint to be hitting
Wait, now I'm confused: when you are calling "None" you are actually calling the preprocess function running on the inference container, and this one in turn (automatically) calls the Triton container.
Are you calling the Triton manually?
Could you share your preprocess.py, and the command line you used to register the two model versions?
(based on ...
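For reference, a clearml-serving preprocess.py is roughly shaped like this (a minimal sketch; the payload keys are made up):

from typing import Any


class Preprocess(object):
    # the inference container instantiates this class and calls
    # preprocess/postprocess around the engine (e.g. Triton) request
    def __init__(self):
        pass

    def preprocess(self, body: dict, state: dict, collect_custom_statistics_fn=None) -> Any:
        # convert the incoming JSON request into the model input
        return body.get("data")

    def postprocess(self, data: Any, state: dict, collect_custom_statistics_fn=None) -> dict:
        # convert the raw model output into a JSON-serializable response
        return {"output": data}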
I saw the documentation, but I can't build the proper dict object for the hyperparams
I see, this is what you are after (I think)
https://github.com/allegroai/clearml/blob/fb644fe9ec6be36b8f2f70a34256fbdc593d663a/clearml/backend_api/services/v2_20/tasks.py#L3138
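In case it helps, at the SDK level the dict is just {section: {name: value}} (a sketch; section and parameter names are made up):

from clearml import Task

task = Task.init(project_name="examples", task_name="set hyperparams")

# nested dict: {section: {parameter_name: value}}
task.set_parameters_as_dict({
    "General": {"lr": 0.001, "epochs": 10},
    "Augmentation": {"flip": True},
})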
What is the difference to
file_history_size
The number of unique files per title/series combination (i.e. how many images to store in the history when the iteration is constantly increasing).
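It can also be tuned in clearml.conf (a sketch, assuming the standard sdk.metrics section):

sdk {
    metrics {
        # keep the last 5 uploaded files (e.g. debug images) per title/series
        file_history_size: 5
    }
}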
LOL totally 🙂
Could not install packages due to an EnvironmentError: [Errno 2] No such file or directory: '/tmp/build/80754af9/attrs_1604765588209/work'
Seems like pip failed creating a folder.
Could it be you are out of space ?
I think the main issue is running with python -m module.name --args
This is a bit different when trying to "understand" what the actual repository is.
Can you try to run it from the repository folder (same command), just to see if it will have any effect on the detected packages?
Hi ProudMosquito87, trains-agent will automatically clone your code into the docker, no need to worry about it 🙂 Make sure you configure https://github.com/allegroai/trains-agent/blob/master/docs/trains.conf#L16 , or that the trains-agent machine contains the git ssh keys in the home folder of the user executing the trains-agent.
CooperativeSealion8 let me know if you managed to solve the issue, also feel free to send the entire trains-server log. I'm assuming one of the dockers failed to boot...
Hi SkinnyPanda43
On your local machine, do not pass output_uri at all, so nothing will be uploaded.
In the agent's configuration file, configure default_output_uri to point to the S3 bucket.
(Notice you can always override them in the UI, see the bottom of the execution Tab)
https://github.com/allegroai/clearml-agent/blob/e93384b99bdfd72a54cf2b68b3991b145b504b79/docs/clearml.conf#L312
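For example, on the agent machine (a sketch; the bucket path is a placeholder):

sdk {
    development {
        # every task executed by this agent uploads artifacts/models here
        default_output_uri: "s3://my-bucket/clearml"
    }
}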
(BTW: any reason not to use the agent?)
Unfortunately that is correct. It continues as if nothing happened!
oh dear, let me make sure this is taken care of
And thank you for the reproduce code!!!
The worker just installs by name from pip, and it installs a package that is not mine!
Oh dear ...
Did you configure additional pip repositories in the Agent's clearml.conf ? https://github.com/allegroai/clearml-agent/blob/178af0dee84e22becb9eec8f81f343b9f2022630/docs/clearml.conf#L77 It might be that (1) is not enough, as pip will first search for the package in the pip repository, and only then in the private one. To avoid that, in your code you can point directly to an https URL of your package Ta...
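For reference, the relevant clearml.conf section looks roughly like this (a sketch; the repository URL is a placeholder):

agent {
    package_manager {
        # additional pip repositories the agent passes to pip as extra index URLs
        extra_index_url: ["https://my.private.pypi/simple"]
    }
}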
Hi @<1523701066867150848:profile|JitteryCoyote63>
I found a memory leak
in
Logger.report_matplotlib_figure
Are you sure this is not a Matplotlib leak but rather the Logger's fault? I'm trying to think how we could create such a mem leak.
wdyt?
This smells like a driver/image issue on the instance VM
What are you getting if you add this inside your code?
os.system('nvidia-smi')
Hi @<1600299043865497600:profile|MagnificentSeaurchin90>
Any chance you can provide more info on the error?
if I want to compare two experiments, the scalar plots do not load (loading forever).
I'm assuming the issue is the Plots tab? Or is it the Scalars? What do you have in the Plots? Can you send an image of the single experiment?
Hi @<1600661423610925056:profile|StrongMouse81>
using the serving base url and also another model endpoint we added using:
clearml-serving model add
we get the attached response:
And other model endpoints are working for you?
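For comparison, a typical registration command looks roughly like this (a sketch; the service id, names, and paths are placeholders):

clearml-serving --id <service_id> model add \
    --engine sklearn \
    --endpoint "test_model" \
    --preprocess "preprocess.py" \
    --name "my model" \
    --project "serving examples"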
agentservice...
Not related; the job of the agent-services is to run control jobs, such as pipelines and HPO control processes.
I think that the first model saved gets the task name as its name and the following models take
f"{task_name} - {file_name}"
Hmm, I'm not sure what would be a good way to make it consistent; would it make sense to always use the model file name?
I guess it takes some time before the correct names are assigned?
Hmm that is odd, I have a feeling it has to do with calling Task.close()?!
I just tried with the latest clearml version and it seemed to work as expected
However, are you thinking of including this callback feature in the new pipelines as well?
Can you see a good use case ? (I mean the infrastructure supports it, but sometimes too many arguments is just confusing, no?!)