Would be very cool if you could include this use case!
I totally think we should, any chance you can open an Issue, so this feature is not lost?
eval built-in. wdyt?
eval is never recommended as basically you could do Args/float='os.system("rm ...")'
In theory the type is stored on the hyperparameter (this is a relatively new feature the backend supports).
The casting, though, is done based on the original value type, which means Task.connect needs to be called with the original dict. Is there a specific reason for using get_parameters instead of task.connect?
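Roughly, the connect flow would look something like this (a minimal sketch, parameter names are just placeholders):
```python
from clearml import Task

task = Task.init(project_name="examples", task_name="connect params")

# the original dict carries the real python types, so values overridden
# in the UI are cast back to these types when the task runs remotely
params = {"lr": 0.001, "batch_size": 32, "use_augmentation": True}
params = task.connect(params)

print(type(params["lr"]))  # stays a float even after a remote override
```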
PompousParrot44
Check out the task.execute_remotely()
You can call it right after the task init, and it will enqueue your running Task, and leave the process (if you want).
https://github.com/allegroai/trains/blob/65a4aa7aa90fc867993cf0d5e36c214e6c044270/trains/task.py#L1437
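A minimal sketch (the queue name here is just an example):
```python
from clearml import Task

task = Task.init(project_name="examples", task_name="run me remotely")

# enqueue this running Task on the given queue and exit the local process
task.execute_remotely(queue_name="default", exit_process=True)

# everything below only runs when an agent picks up and executes the Task
```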
Hi @<1541954607595393024:profile|BattyCrocodile47>
Did you check None ?
You are not supposed to do 2,3,4
After (1) you should just do
ssh root@localhost -p 8022
and provide the password that is written in the CLI
(Notice: pass --public-ip if your remote machine is using a public IP you can access)
ok so i accidentally (probably with luck) noticed the max_connection: 2 in the azure.storage config.
NICE!!!!
But wait where is that set?
None
Should we change the default or add a comment ?
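For reference, a rough sketch of how overriding it in clearml.conf could look (the exact section layout is an assumption, the key name follows the one you found):
```
sdk {
    azure.storage {
        # assumption: raise the connection limit from the default of 2
        max_connection: 10
    }
}
```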
Interesting question, should work and looks like an interesting combination, I'm curious what you come up with.
btw: grafana itself can already provide a lot of alerts for drift etc, this is basically their histogram delta feature
Hi @<1523701066867150848:profile|JitteryCoyote63>
Hi, how does agent.enable_git_ask_pass work?
Basically it pushes the password through stdin to git when it asks for it (it is a git feature)
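A minimal sketch of turning it on in the agent section of clearml.conf:
```
agent {
    # have the agent feed the git password to git when git asks for it
    enable_git_ask_pass: true
}
```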
Hi JitteryCoyote63, I have to admit we have not thought of this scenario... what's the exact use case for cloning a Task and changing the type?
Obviously you can always change the task type; a bit of a hack, but should work:
task._edit(type='testing')
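For completeness, a sketch of that hack on the cloned Task (note _edit is an internal call, and the task ID here is a placeholder):
```python
from clearml import Task

# fetch the cloned task by its ID
task = Task.get_task(task_id="<cloned_task_id>")

# internal API: override the task type stored on the backend
task._edit(type="testing")
```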
Hi @<1684735407637401600:profile|WonderfulJellyfish65>
BTW, the training script connects to apiserver via the internal IP address
That is a big issue, because as you noticed the links to the data generated by the code will have the internal IP ...
You basically need every component to use the same address (url)
'-v', '/tmp/clearml_agent.ssh.cbvchse1:/.ssh',
It's my bad, after that inside the container it does cp -Rf /.ssh ~/.ssh
The reason is that you cannot know the user's home folder before spinning up the container
Anyhow the point is, are you sure that you have ~/.ssh on the Host machine configured?
And if you do, are you saying this is part of your AMI? if not how did you put it there?
BTW: you should probably update the server, you're missing out on a lot of cool features
but we run everything in docker containers. Will it still help?
As long as you are running with clearml-agent (in docker mode), all the cache folders (this one included) are mounted on the host machine for persistence
Hi AntsySeagull45
Any chance the original code was running with python2?
Which version of trains-agent are you using?
Working on it as we speak, probably a day, worst case 2. This is quite strange and we are not sure where the fault is, as nothing in the code itself changed...
Hi ShallowArcticwolf27
from the command line to a remote machine while loading a local .env file as a configuration object?
Where would the ".env" go? Are we trying to pass it to the remote machine somehow?
should i only do mongodb
No, you should do all 3 DBs: ELK, Mongo, Redis
Notice the parents argument when creating a new Dataset
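Something along these lines (a minimal sketch, names are placeholders; in the SDK the keyword is parent_datasets):
```python
from clearml import Dataset

# create a new dataset version that inherits the files of its parent(s)
child = Dataset.create(
    dataset_name="my-dataset",
    dataset_project="datasets",
    parent_datasets=["<parent_dataset_id>"],
)
child.add_files(path="new_data/")
child.upload()
child.finalize()
```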
The main reason to add the timeout is because the warning was annoying to users
The secondary reason was that clearml will start reporting based on seconds from start, then once iterations start it will switch back to iterations. But if the iterations are "epochs" the numbers are lower, so you end up with a graph that does not match the expected "iterations" x-axis. Make sense?
Any chance your code needs more than the main script, but it is Not in a git repo? Because the agent supports either single script file, or a git repo with multiple files
Oh I see, what you need is to pass '--script script.py' as the entry point and '--cwd folder' as the working dir
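Assuming this is the clearml-task CLI, the full command would look roughly like this (project / repo / queue values are placeholders):
```
clearml-task --project examples --name my-experiment \
    --repo https://github.com/user/repo.git \
    --script script.py --cwd folder \
    --queue default
```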
Hi @<1523701868901961728:profile|ReassuredTiger98> when you get to it...
please download the wheel, then install it with
pip3 install -U clearml_agent-0.17.3rc0-py3-none-any.whl
Then run the daemon with the additional --debug argument, basically:
clearml-agent --debug daemon --foreground ...
Once the agent is running, please send the Task's log from your console
It seems the code is trying to access an s3 bucket, could that be the case? PanickyMoth78 any chance you can post the full execution log? (Feel free to DM so it won't end up being public)
I did nothing to generate a command-line. Just cloned the experiment and enqueued it. Used the server GUI.
Who/What created the initial experiment ?
I noticed that if I run the initial experiment by "python -m folder_name.script_name"
"-m module" as script entry is used to launch entry points like python modules (which is translated to "python -m script")
Why isn't the entry point just the python script?
The command line arguments are passed as arguments on the Args section of t...
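For reference, a minimal sketch of how that looks in code (module and argument names are made up):
```python
# folder_name/script_name.py -- launched with: python -m folder_name.script_name --lr 0.1
import argparse
from clearml import Task

task = Task.init(project_name="examples", task_name="module entry point")

parser = argparse.ArgumentParser()
parser.add_argument("--lr", type=float, default=0.01)
args = parser.parse_args()  # argparse arguments are auto-logged under the Args section
```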
Interesting use case, do you already have the connect_configuration call in the code, or do we need to somehow create it?
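In case it helps, a minimal connect_configuration sketch (the file name is just an example):
```python
from clearml import Task

task = Task.init(project_name="examples", task_name="config demo")

# registers the local file as a configuration object on the Task;
# when running remotely, the returned path points to the server-stored copy
config_path = task.connect_configuration("config.yaml", name="my config")
```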
I think that clearml should be able to do parameter sweeps using pipelines in a manner that makes use of parallelisation.
Use the HPO, it is basically doing the same thing with a more sophisticated algorithm (BOHB):
https://github.com/allegroai/clearml/blob/master/examples/optimization/hyper-parameter-optimization/hyper_parameter_optimizer.py
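A minimal sketch of what that could look like (the base task ID, metric names, and queue are assumptions):
```python
from clearml.automation import (
    HyperParameterOptimizer,
    UniformParameterRange,
    RandomSearch,
)

optimizer = HyperParameterOptimizer(
    base_task_id="<base_task_id>",          # the template experiment to clone
    hyper_parameters=[
        UniformParameterRange("Args/lr", min_value=1e-4, max_value=1e-1),
    ],
    objective_metric_title="validation",    # assumed metric title
    objective_metric_series="accuracy",     # assumed metric series
    objective_metric_sign="max",
    optimizer_class=RandomSearch,           # or the BOHB optimizer from the example
    max_number_of_concurrent_tasks=4,
    execution_queue="default",
)
optimizer.start()
optimizer.wait()
optimizer.stop()
```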
For example - how would this task-based example be done with pipelines?
Sure, you could do something like:
` from clearml import Pi...