My worker is a remote instance with admin access only, so i cannot really run experiments manually there. I can try to set up a minimal environment locally and check what happens
Hi CostlyOstrich36 is there a default location for the agent's local log?
unfortunately the experiment is run in docker and the container is down already... I don't know if this happened at the same time. So you're saying it might be memory issues? Any other hints i might check while running a new experiment?
CostlyOstrich36 hi, thanks for the answer. Unfortunately, both my CLI run experiment (using clearml-task) and the one cloned from this have the same setup in the INFO/worker and INFO/queue. I do set the queue as one of the clearml-task arguments.
CostlyOstrich36 no, the code is always running remotely. I use two ways to start an experiment. The first is using the clearml-task interface https://github.com/allegroai/clearml/blob/master/docs/clearml-task.md where I define which script to run from a repo + branch, and also define a queue (remote worker) for where to run the experiment. The second way is by manually clicking Clone in the API dashboard (then modifying some params in configuration objects) then setting Enqueue to th...
Hi AgitatedDove14 thanks for the suggestion. I might have a go with this, i just need a bit more help to clear things up. When running with clearml-task, i also use the --repo and --branch options to set up the code version, and other options (--docker_args --docker_bash_setup_script --packages --output-uri) to export some env variables, install some dependencies etc. How can i do this when running a script locally and switching to remote? Is this somewhere close ...
Ok thanks. I'm also using --skip-task-init because i'm manually initializing a task and connecting configurations in my_script. I guess then the get_parameters won't give what i'm looking for?
I have a specific situation, maybe you can help me with this
my_script looks something like this:
thirdparty_argparser1()
[...my_code...]
thirdparty_argparser2()
I cannot interfere with implementations of these thirdparty argparsers. The first argparser expects arg1=val1 --arg2=val2, and the sec...
I'm sorry for the bad explanations... My problem is that I need to pass an argument to one function, but then I call a second function that breaks because of this argument. So i'd like to use the command line argument in the first argparse, and then hide/delete/override it before running the second argparse.
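The hide/delete step described above can be sketched in plain Python, independent of ClearML. This is only an illustration under stated assumptions: the two parser functions below are hypothetical stand-ins for the third-party parsers, strip_option is a made-up helper name, and the option being removed is assumed to take a value. Whether this alone is enough when the script runs under an agent is not something the sketch checks.

```python
import argparse


def strip_option(argv, name):
    """Return a copy of argv without `--name value` or `--name=value`.

    Assumes the option always takes a value (not a bare flag).
    """
    cleaned = []
    skip_next = False
    for token in argv:
        if skip_next:
            # This token is the value of the option we just dropped.
            skip_next = False
            continue
        if token == f"--{name}":
            skip_next = True
            continue
        if token.startswith(f"--{name}="):
            continue
        cleaned.append(token)
    return cleaned


def first_parser(argv):
    # Hypothetical stand-in: accepts both arg1 and arg2.
    parser = argparse.ArgumentParser()
    parser.add_argument("--arg1")
    parser.add_argument("--arg2")
    return parser.parse_args(argv)


def second_parser(argv):
    # Hypothetical stand-in: accepts arg1 only and would fail on arg2.
    parser = argparse.ArgumentParser()
    parser.add_argument("--arg1")
    return parser.parse_args(argv)
```

If the third-party parsers read sys.argv directly instead of taking a list, the same idea would be applied as sys.argv[1:] = strip_option(sys.argv[1:], "arg2") between the two calls.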
Uh, why not Current task? Couldn't this do? (seems that wasn't the final question 😞)
task = Task.init()
thirdparty_argparser1()  # this one takes both arg1 and arg2
[...my_code...]
task.set_parameters({'Args/arg1': <leave_as_is>, 'Args/arg2': None})
thirdparty_argparser2()  # this one takes arg1, but not arg2
Ok, getting is easy then with an additional argparser, and what about manipulating? I really hope this is my final question. Can i modify an argument among CONFIGURATION > HYPER PARAMETERS - Args from code?
AgitatedDove14 thanks so much for your help. I finally managed it by updating the parameter, i.e. the argument that was in the way 🙂
Not in this setup, no. I don't have the memory or processing resources for these training tasks. If i run it locally for simple tests, i just run my script directly and pass the arguments from the command line, without the clearml agent
(it will also expose exactly how the argparse is being stored on the Task, so later you understand how to pass arguments from clearml-task command line)
does this mean that Task stores --args (and propagates these further through the code as CLI arguments) somewhere where i can get and manipulate them from my code? Are they propagated from Configuration ARGS? In this case, can i just update them from my code between different argparse calls (using connect_configuration or sth...
Ok good idea thanks, will do in the next run
ps. i've tried calling the thirdparty_argparser2 with subprocess
subprocess.call(["python3", "thirdparty_argparser2.py", arg1_val])
but somehow (?) it still got the second arg2 and failed due to type error 😅
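One way to check what a child process actually receives on its command line is to launch a tiny probe that reports its own sys.argv. This is a diagnostic sketch only; child_argv is a hypothetical helper, and it says nothing about why arg2 reached the real script in the run described above.

```python
import json
import subprocess
import sys


def child_argv(*args):
    """Start a child interpreter and return the arguments it reports seeing."""
    # The probe prints everything after the "-c" entry of sys.argv as JSON.
    probe = "import sys, json; print(json.dumps(sys.argv[1:]))"
    result = subprocess.run(
        [sys.executable, "-c", probe, *args],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)
```

If the probe shows only the value that was passed explicitly, the extra argument is not coming from the subprocess call itself and has to be injected somewhere else.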
Hi, thanks for the answer. Loading manually set params is fine as you have explained, but I would need the file reference as well. The thing is, I'm using both the config file contents (params) and the file path (re-loading some params in other processes that don't share all the resources).
Does what you are saying mean that the file I have modified in the app overrides the existing file under the same file name in repo clone on my training instance?