
You can also specify additional packages on the decorator:
```python
@PipelineDecorator.component(..., packages=["tqdm>=2.1", "scikit-learn"])
def step_one(...):
    # code here
```
True, this is exactly the reason. That said, you can always manually add it. You can see the default values: https://github.com/allegroai/trains-agent/blob/master/docs/trains.conf
Hi FriendlyKoala70, you can edit the installed packages section and add the missing package. See more details on how trains-agent works here (although it's about conda, the same rules apply to pip): https://github.com/allegroai/trains-agent/issues/8
If you have the checkpoint (see output_uri for automatically uploading it) then you can always load it. Do you mean whether you can change the iteration/step counter? Or do you mean with trains-agent?
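For context, a minimal sketch of how output_uri is passed (project/task names and the bucket path are hypothetical; the same argument exists in the older trains package):
```python
from clearml import Task

# output_uri tells the SDK where to automatically upload saved checkpoints/models
task = Task.init(
    project_name="examples",             # hypothetical
    task_name="train",                   # hypothetical
    output_uri="s3://my-bucket/models",  # hypothetical bucket
)
```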
Hi LivelyLion31
Yes, the reason we designed Trains with an automagic integration is exactly that: users do not need to learn another package, and with almost no effort you get most of the benefits.
Regarding the TB files, from experience most users will use the TB files shortly after they executed the experiment, usually for debugging and in-depth capabilities (like the network debugger, profiler, etc.); metric view is something that is much easier to do on a centralized server (like on...
Hi LivelyLion31 I missed your S3 question, apologies. What did you guys end up doing?
BTW you could always upload the entire TB log folder as an artifact, it's as simple as task.upload_artifact('tensorboard', './tblogsfolder')
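Roughly like this (a sketch; project/task names are hypothetical):
```python
from clearml import Task

task = Task.init(project_name="examples", task_name="tb-upload")  # hypothetical names

# Upload the whole TensorBoard log directory as a single artifact
task.upload_artifact(name="tensorboard", artifact_object="./tblogsfolder")
```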
You can already sort and filter experiments based on any hyperparameter or metric that the experiment reports, so there is no need for a custom query language. Also, any filtered/sorted table can be shared exactly as it is, so you can create leaderboards and share specific filters. You can also use the search bar to filter based on experiment name/comment. Tags will be added soon as well 🙂
Example of custom columns is here (the screen grab is a bit old, now there is als...
Also, finally, the columns will be movable and resizable. I can't wait for the next version ;)
BTW, VexedKangaroo32 are you using torch launch?
Hi VexedKangaroo32 , funny enough this is one of the fixes we will be releasing soon. There is a release scheduled for later this week, right after that I'll put here a link to an RC containing a fix to this exact issue.
Since this fix is all about synchronizing different processes, we wanted to be extra careful with the release. That said I think that what we have now should be quite stable. Plan is to have the RC available right after the weekend.
Hmmm, that actually connects with something we were thinking about: introducing sections to the hyperparameters. This way we could easily differentiate between command line arguments and other types of parameters. DilapidatedDucks58 what do you think?
It's dead simple to install:
```
pip install trains-agent
```
then you can simply do:
```
trains-agent execute --id myexperimentid
```
FYI all the git pulls are cached even in docker mode so there is no "tax" to pay for pulling the sub-modules (only the first time of course)
AstonishingSeaturtle47 that's awesome! Could you explain the hack, it might be helpful for others (I assume :))
Hi WorriedParrot51, what do you mean by "call get_parameters_as_dict() from agent"?
Do you mean like change the trains-agent to run the task differently?
Or inside your code while the trains agent runs it?
From the code itself (regardless of how you run it) you can always call and get the current state's parameters (i.e. from the backend if running with trains-agent, or copied from the code, if running manually): task.get_parameters_as_dict()
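A minimal sketch of that call (shown with the current clearml package, which exposes the same method; names and values are hypothetical):
```python
from clearml import Task

task = Task.init(project_name="examples", task_name="params-demo")  # hypothetical names

args = {"batch_size": 32, "lr": 0.001}
task.connect(args)  # register the parameters with the task

# Returns the backend values when executed by an agent,
# or the values from the code when running manually
params = task.get_parameters_as_dict()
print(params)
```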
I think for it to work you have to have ssh running on the host machine (the socket client itself), no?
So there is no copying of the data to the pod, it is simply referenced via the EFS
Correct
However, when 'extra' is a positional argument, it is transformed to 'str'
Hmm... okay let me check something
Hi DistinctToad76
Why not just report scalars? You can use the x-axis as "iterations" if this is running in real time to collect the prompts.
If this is a summary, then just report a scatter plot (you can also specify the names of the axes and the series)
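For instance, a minimal sketch of both options (names and values are hypothetical):
```python
from clearml import Task

task = Task.init(project_name="examples", task_name="prompt-metrics")  # hypothetical
logger = task.get_logger()

# Real time: report each prompt's score as a scalar, using the prompt index as "iteration"
logger.report_scalar(title="prompt score", series="model-a", value=0.87, iteration=42)

# Summary: a scatter plot with named axes and series
logger.report_scatter2d(
    title="prompt scores",
    series="run 1",
    scatter=[(0, 0.91), (1, 0.87), (2, 0.78)],
    iteration=0,
    xaxis="prompt #",
    yaxis="score",
)
```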
BTW, how can I run 'execute_orchestrator' concurrently?
It is launched simultaneously (i.e. if you are not processing the output of the pipeline step function, the execution will not wait for its completion). Notice that the call itself might take a few seconds, as it creates a task and enqueues/sub-processes it, but it is not waiting for it.
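To illustrate, a sketch of concurrent step launching with the pipeline decorator (step logic and names are hypothetical):
```python
from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component(return_values=["doubled"])
def step(i):
    return i * 2

@PipelineDecorator.pipeline(name="concurrent-demo", project="examples", version="0.1")
def run_pipeline():
    # Each call returns almost immediately with a lazy reference,
    # so both steps are launched concurrently
    a = step(1)
    b = step(2)
    # Only using the returned values forces a wait for the steps to complete
    print(f"results: {a} {b}")

if __name__ == "__main__":
    PipelineDecorator.run_locally()  # for testing: run steps as local sub-processes
    run_pipeline()
```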
The 'on-premise' server fails to connect to the ClearML server because of the VPN I think
I think you are correct.
You can quickly test it: try to run curl http://local-server:8008 and see if that works
CurvedHedgehog15 the agent has two modes of operation:
1. A single script file (or Jupyter notebook), where the Task stores the entire file on the Task itself.
2. Multiple files, which is only supported if you are working inside a git repository (basically the Task stores a reference to the git repository and the agent pulls it from the git repo).
Seems you are missing the git repo, could that be?
LOL EnormousWorm79 you should have a "do not show again" option, no?
One additional question: if you import clearml after you import torch, does it work?
EnviousPanda91 'connect' will log the object properties; the automagic logging is controlled in the Task.init call. Specifically, which framework produces metrics that are not logged? Your sample code manually reports some scalars/values, do you see these as well?
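For reference, a sketch of the difference (names and values are hypothetical):
```python
from clearml import Task

# Automagic framework logging is toggled here, at Task.init time
task = Task.init(
    project_name="examples",   # hypothetical
    task_name="logging-demo",  # hypothetical
    auto_connect_frameworks={"pytorch": True, "matplotlib": False},
)

# 'connect' logs object properties as (hyper)parameters, not metrics
config = {"lr": 0.001, "batch_size": 32}
task.connect(config)
```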
ReassuredTiger98
(for some reason it kind of jumps over PyTorch, but then installs torchvision?!)
Could you run the latest with --debug?
We just added it, but you will have to install from git: pip3 install git+
Then run with --debug: clearml-agent --debug daemon ...
Hi RoundMosquito25
Hi, are there examples available somewhere of testing in ClearML? For example, unit tests that check if parameters are passed correctly to new tasks etc.?
What do you mean by "testing in ClearML" ?
For example unit tests that check if parameters are passed correctly
Passed where / how? Are we thinking agents here?
Hi CrookedWalrus33
I think that happens if you are already logged in and you pressed the "signup" tab instead of the "login" tab (the frontend team is working on a solution)
In the meantime just make sure you are clicking on the "login" tab