Reputation
Badges 1
25 × Eureka!RobustGoldfish9 do you see the trains-agent listed as a machine in the UI (under workers)
I think what you are looking for is clearml-agent daemon
https://clear.ml/docs/latest/docs/clearml_agent
https://clear.ml/docs/latest/docs/getting_started/video_tutorials/agent_remote_execution_and_automation
It would be nice to have some documentation proclaiming how randomness behaves when running tasks (in all their variations). E.g. Should I trust seeds to be reset or should I not assume anything and do my own control over seeds.
That is a good point, I'll make sure we mention it somewhere in the docs. Any thoughts on where?
So how do I solve the problem? Should I just relaunch the agents? Because they can't execute jobs now
Are you running in docker mode ?
If so you can actually delete mapped files (they will still be available inside the docker), just make sure you delete them X hours after they were created, and you should be fine.
wdyt?
BTW: if you only need the git diff you can just copy them from the UI into a txt file and do:git apply <copied-diff.txt>
we concluded that we don't want to run it through ClearML after all, so we ran it standalone
out of curiosity, what was the conclusion and why?
you mean The Task already exists or you want to create a Task from the code ?
CrookedWalrus33 can you send the entire log? (you can DM it to me)
Hi SmarmyDolphin68
You have two options:
Automatically upload the models when training pass output_uri
to Task.init. For example output_uri=True
will upload to the clearml-server, output_uri='
s3://bucket/folder '
will upload to S3 etc. Manually upload a model that you have locally: https://github.com/allegroai/clearml/blob/9ff52a8699266fec1cca486b239efa5ff1f681bc/examples/reporting/model_config.py#L37
doing some extra "services"
what do you mean by "services" ? (from the system perspective any Task that is executed by an agent that is running in "services-mode" is a service, there are no actual limitation on what it can do 🙂 )
I think for it to work you have to have ssh running on the host machine (the socket client itself), no?
yes 🙂
But I think that when you get the internal_task_representation.execution.script you are basically already getting the API object (obviously with the correct version) so you can edit it in place and pass it too
Yes including this. (There was a fix to an issue with trains-agent
and disabling frameworks, it is already part of 0.16.3 )
MysteriousBee56 okay look for the folder ~/.trains/vcs_cache you will find the git repo there, just overwrite the content with your local copy
Hi @<1576381444509405184:profile|ManiacalLizard2>
Yeah that should work, assuming credentials are set in your clearml.conf
Hi UpsetTurkey67
repository discovery stores github repo in the form:
...
while for others
git@github.com:...
Yes that depends on how they locally cloned the repo (via SSH or user/pass/token)
Interestingly in the former case the ssh config is ignored and cloning repository breaks on the worker
If you have passed git user/pass to the agent it should use them not SSH, how did you configure the agent ?
BoredHedgehog47 were you able to locate the issue ?
SpicyCrab51 you can change the task to complete, it is just a state change nothing will actually change other than the status. Task.get_task(pass_dataset_id_here).mark_complete()
I think this issue was fixed in clearml-server 1.3.0 (released after the weekend),
Let me check
Lol, :)
I think the issue is that you do not need to manually set the initial iteration, it's supposed to get it , as it is stored on the Task itself
Hey JoyousKoala59 , it seems the helm chart for the clearml server is due to be released tomorrow. My apologies for the confusion :(
Everything seems correct...
Let's try to set it manually.
create a file ~/trains.conf , then copy paste the credentials section from the UI, it should look something like:api { web_server: http:127.0.0.1:8080 api_server: http:127.0.0.1:8008 files_server: http:127.0.0.1:8081 credentials { "access_key" = "access" "secret_key" = "secret" } }
Let's see if that works
SweetGiraffe8 no need to import it, any report to TB is automatically logged by ClearML 🙂
Maybe we should add it to Storage Manager? What do you think?
Hi LudicrousParrot69
Not sure I follow, is this pyfunc running remotely ?
Or are you looking for interfacing with previously executed Tasks ?
Is there a way to move existing pipelines between projects?
You should be able to, go to your settings page and turn on "show hidden folders"
Then go to your project, you should see " .pipeline
" sub project there, right click it and move it to another folder.