It only happens in the clearml environment; it works fine locally.
Hi BoredHedgehog47
what do you mean by "in the clearml environment" ?
UnevenDolphin73 if you have the time to help fix it / make it work, it would be greatly appreciated 🙂
Hmm you mean how long it takes for the server to time out on a registered worker? I'm not sure this is easily configurable
I guess I got confused since the color choices in
One of the most beloved features we added 🙂
Yep 🙂
Also maybe worth changing the entry point of the agent docker to always create a queue if it is missing?
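(for reference, the agent CLI can already do this when launching a daemon, something along the lines of: clearml-agent daemon --queue default --create-queue ; flag name from memory of the clearml-agent help, worth double-checking)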
SmugDog62 so on plain vanilla Jupyter/lab everything seems to work.
What do you think is different in your setup ?
Is there an option to do this from a pipeline, from within the add_step method? Can you link a reference to cloning and editing a task programmatically?
Hmm, I think there is an open GitHub issue requesting a similar ability, let me check on the progress ...
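In the meantime, cloning and editing a Task programmatically looks roughly like this (just a sketch; project/task names and the parameter section are placeholders, worth checking against the docs):
from clearml import Task

# get the "template" task and clone it
template = Task.get_task(project_name="examples", task_name="my_base_task")
cloned = Task.clone(source_task=template, name="my_base_task (edited)")

# edit its hyper-parameters before enqueueing
# (the "General/" prefix depends on how the parameters were originally connected)
cloned.set_parameters({"General/learning_rate": 0.001})
Task.enqueue(cloned, queue_name="default")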
nope, it works well for the pipeline when I don't choose continue_pipeline
Could you send the full log please?
Sure: Dataset.create(..., use_current_task=True)
This will basically attach/make the main Task the Dataset itself (Dataset is a type of a Task, with logic built on top of it)
wdyt ?
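For example, something along these lines (just a sketch, project/dataset names are placeholders):
from clearml import Task, Dataset

task = Task.init(project_name="examples", task_name="create dataset")
# the current Task also becomes the Dataset (a Dataset is a Task under the hood)
dataset = Dataset.create(
    dataset_project="examples", dataset_name="my dataset", use_current_task=True
)
dataset.add_files("./data")
dataset.upload()
dataset.finalize()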
Hi SubstantialElk6
32 CPU cores, 64GB ram
Should be plenty; this sounds like a network bottleneck issue, I can't imagine the server is actually CPU bound
Wait I might be completely off.
Is this the line that "hangs"?
task.execute_remotely(..., exit_process=True)
If the only issue is this line: task.execute_remotely(..., exit_process=True)
It has to finish the static analysis of the entire repository (which usually happens in the background but now we have to wait for it). If the repo is large this could actually take 20sec (depending on CPU/drive of the machine itself)
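For context, the usual pattern is something like this (a sketch, not your exact script):
from clearml import Task

task = Task.init(project_name="examples", task_name="remote run")
# stop the local process and relaunch the task on an agent pulling from the "default" queue
task.execute_remotely(queue_name="default", exit_process=True)
# everything below this line only runs on the remote agent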
Thanks SubstantialElk6 !
Happy new year! 🎉
If you want to change the Args, go to the Args section in the Configuration tab; when the Task is in draft mode you can edit them there
Hi QuaintPelican38
Assuming you have opened the default SSH port 10022 on the EC2 instance (and assuming the AWS permissions are set so that you can access it), you need to use the --public-ip flag when running clearml-session. Otherwise it "thinks" it is running on a local network and registers itself with the local IP. With the flag on it gets the public IP of the machine, so the clearml-session running on your machine can connect to it.
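e.g. clearml-session --public-ip true (assuming the flag takes an explicit true/false value, double-check with clearml-session --help)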
Make sense ?
DilapidatedDucks58 Nice!
but it would be great to see predecessors of each experiment in the chain
So maybe we should add a "manual pipeline" to create the connection post-execution? Is this a one-time thing?
Maybe a service creating these flow charts ?
Should we put them in the Project's readme? Or in the Pipeline section (coming soon)?
Hmm can you try: --args overrides="['log.clearml=True','train.epochs=200','clearml.save=True']"
WackyRabbit7 if this is a single script running without a git repo, you will actually get the entire code in the uncommitted changes section.
Do you mean get the code from the git repo itself ?
But the artifacts and the datasets of my old experiments still use the old address for the download (is there a way to change that)?
MotionlessCoral18 the old artifacts are stored with direct links, hence the issue, as SweetBadger76 noted you might be able to replace the links directly inside the backend databases
Yes. Because my old issue has never been resolved (though closed), we use the dataset object to upload e.g. local files needed for remote execution.
Ohh no, I remember... following this line, can I assume these files are reused, i.e. this is not "per instance"? I have to admit I have a feeling this is a very unique use case, and maybe the "old" way Datasets were shown is better suited?
No, I mean why does it show up in the task view (see attached image), forcing me to click...
Hi ProudChicken98
How about saving it as a local YAML and uploading the file itself as an artifact?
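Something along these lines (just a sketch, names are placeholders):
import yaml
from clearml import Task

task = Task.init(project_name="examples", task_name="yaml artifact")
config = {"lr": 0.001, "epochs": 10}  # whatever you need to store

with open("config.yaml", "w") as f:
    yaml.safe_dump(config, f)

# upload the local YAML file itself as an artifact
task.upload_artifact(name="config", artifact_object="config.yaml")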
In both cases, if I get the element from the list I am not able to get when the task started. Where is this info stored?
If you are using client.tasks.get_all(...) it should be under the started field
Specifically you can probably also do:
queried_tasks = Task.query_tasks(additional_return_fields=['started'])
print(queried_tasks[0]['id'], queried_tasks[0]['started'])
Could you send the logs?
DefiantCrab67
Where will you copy it from ?
Then as you suggested, I would just use sys.path; it is probably the easiest and actually very safe (because the subfolders are always next to the "main" source code)
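i.e. something like this (a sketch; the "utils" subfolder and module name are just examples):
import sys
from pathlib import Path

# make the sibling "utils" subfolder importable, relative to the main script
sys.path.insert(0, str(Path(__file__).resolve().parent / "utils"))

import my_helper  # lives in ./utils/my_helper.py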
Hi SmugSnake6
I think it was just fixed, let me check if the latest RC includes the fix
Hmm check if this one works:
optimizer._get_child_tasks_ids(
    parent_task_id=optimizer._job_parent_id or optimizer._base_task_id,
    order_by=optimizer._objective_metric._get_last_metrics_encode_field(),
    additional_filters={'page_size': int(top_k), 'page': 0},
)
If it does, let's PR it as a dedicated function