Continuing on this line of thought... Is it possible to call task.execute_remotely
on a CPU-only machine (a data scientist's laptop, for example) and have the agent that fetches this task run it using a GPU? I'm asking because it is mentioned that it replicates the running environment of the task creator... which is exactly what I'm not trying to do 😄
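For reference, this is the pattern I have in mind - a minimal sketch using the clearml SDK (with the older trains package the import would be from trains import Task); the queue name 'gpu' and the project/task names are placeholders:
` from clearml import Task

task = Task.init(project_name='examples', task_name='train-on-gpu')
# stop executing locally and enqueue this task for an agent;
# an agent running on a GPU machine and listening on the (example) 'gpu' queue picks it up
task.execute_remotely(queue_name='gpu', exit_process=True)

# everything from here on runs only on the remote agent
print('running on the agent') `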
SuccessfulKoala55 here it is
👍
Searched for "custom plotly" and "log plotly" in the search, didn't think of "report plotly"
I'd go for
` from trains.utilities.pyhocon import ConfigFactory

# CONF_FILE_PATH is the path to the HOCON file you want to load
config = ConfigFactory.parse_file(CONF_FILE_PATH) `
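and then values can be read with dotted keys, e.g. (assuming the file has an api.api_server key, the way trains.conf does):
` api_server = config.get('api.api_server') `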
I think you are talking about two separate problems - the "WARNING DIFF IS TOO LARGE" is only a UI issue, meaning you can't see the diff in the UI - correct me if I'm wrong on this
Maria seems to be saying that the execution FAILS when she has uncommitted changes, which is not the expected behavior - am I right, Maria?
Oh, I get it. I thought it was only a UI issue... but it actually doesn't send it O_O
SuccessfulKoala55 AppetizingMouse58
` [ec2-user@ip-10-0-0-95 ~]$ df -h
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs        3.9G     0  3.9G   0% /dev
tmpfs           3.9G     0  3.9G   0% /dev/shm
tmpfs           3.9G  880K  3.9G   1% /run
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/nvme0n1p1  8.0G  6.5G  1.5G  82% /
tmpfs           790M     0  790M   0% /run/user/1000 `
We try to break everything up into independent tasks and group them using a pipeline. The dependency on an agent caused unnecessary overhead, since we just want to execute locally. It became a burden once new data scientists joined the project: instead of just telling them "yeah, just execute this script", you now have to teach them about ClearML, the role of agents, how to launch them, how they behave, how to remove them, and so on... things you want to avoid with data scientists
AgitatedDove14 sorry for the late reply,
It's right after executing all the steps. We have the following block, which determines whether we run locally or remotely:
` if not arguments.enqueue:
    pipe.start_locally(run_pipeline_steps_locally=True)
else:
    pipe.start(queue=arguments.enqueue) `
And right after we have a method that calls Task.current_task()
which returns None
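Roughly, the full flow looks like this - a sketch; the pipeline steps and the --enqueue argument are placeholders standing in for our real setup:
` import argparse
from clearml import Task
from clearml.automation import PipelineController

parser = argparse.ArgumentParser()
parser.add_argument('--enqueue', default=None)
arguments = parser.parse_args()

pipe = PipelineController(name='my-pipeline', project='examples', version='1.0.0')
# ... pipe.add_step(...) calls go here ...

if not arguments.enqueue:
    # controller and all steps run in the local process
    pipe.start_locally(run_pipeline_steps_locally=True)
else:
    # controller is enqueued for an agent to pick up
    pipe.start(queue=arguments.enqueue)

print(Task.current_task())  # this is what comes back as None `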
I also ran it without $(pwd) in the "Create ClearML task templates" section. I added it because of CostlyOstrich36's comments, but it didn't help
Cool, now I understand the auto detection better
the level of configurability in this thing is one of the best I've seen
anyway, my ultimate goal is to create templates for other tasks... Is that possible in any other way through the CLI?
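In case the CLI route doesn't pan out: as far as I know, the SDK can also create a draft task (never executed) to use as a template that can later be cloned and enqueued - a sketch, where the project, repo, and script names are placeholders:
` from clearml import Task

# creates a draft task that acts as a template for clones
template = Task.create(
    project_name='examples',                  # placeholder
    task_name='my-template',                  # placeholder
    repo='https://github.com/user/repo.git',  # placeholder
    script='train.py',                        # placeholder
)
print(template.id) `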
and also in the extra_vm_bash_script
variables, I have them under export TRAINS_API_ACCESS_KEY
and export TRAINS_API_SECRET_KEY
I only found Project ID, and I'm not sure what it refers to - I have the project name
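A possible way to resolve the ID from the name (a sketch; I believe recent SDK versions have Task.get_project_id, but that's an assumption worth verifying):
` from clearml import Task

# 'my-project' is a placeholder for the actual project name
project_id = Task.get_project_id(project_name='my-project')
print(project_id) `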
Worth mentioning: nothing changed before we executed this - it worked before, and now, after the update, it breaks
Committing that notebook with changes solved it, but I wonder why it failed
I mean, I barely have 20 experiments
so basically - if she has new commits locally that weren't pushed, it won't work
But if she did not commit her latest changes, and now she enqueues - it will work?