I'm not in the best position to answer these questions right now.
I have the server running now, and for now it seems that I'm able to get the dataset even in the other file. I'll mess around with it now to get the hang of it and see how it actually works.
I was getting a different error when I posted this question. Now I'm just getting this connection error.
I'm at the point where I don't really know what to search for.
So I got my answer for the first one: I found where the data is stored on the server.
Say there are files in a specific folder on Machine A. A script on Machine A creates a Dataset, adds the files located in that folder, and publishes it. Can you then look at that dataset on the server machine? Not from the ClearML interface, but inside normal directories, like /opt/clearml etc. (that directory is just an example).
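To make that concrete, here's roughly what I mean by the script on Machine A (a minimal sketch using the clearml Dataset API; the project/name values and the folder path are placeholders, and the final publish step may vary slightly by SDK version):

from clearml import Dataset

# Create a new dataset version (project and name are placeholders)
ds = Dataset.create(
    dataset_project="my_project",
    dataset_name="my_dataset",
)

# Add the files sitting in the specific folder on Machine A
ds.add_files(path="/data/incoming")

# Upload the file contents to the ClearML server / configured storage
# and close this dataset version
ds.upload()
ds.finalize()

# The finalized dataset then gets published (from the UI or programmatically),
# and my question is where those uploaded files end up on the server's disk.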
I'll try to see how to use the sdk method you just shared
I know how to enqueue using the UI. I'm trying to do it programmatically.
So the API is something new for me; I've already seen the SDK. Am I misremembering being able to send a Python script and its requirements to run on an agent directly from the CLI? Was there no such way?
Basically, as soon as I get the trigger that a new dataset has been published, I want to pass the dataset id to the script as a CLI argument and send the code to the agent.
Found it.
https://clear.ml/docs/latest/docs/guides/clearml-task/clearml_task_tutorial/
The second example here, executing a local script, is the one I was thinking of. Thank you for the help.
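For anyone else looking, the command I had in mind is along these lines (project, task name, script path, queue, and the dataset_id argument are placeholders for my setup):

clearml-task --project my_project --name train_on_new_dataset --script train.py --args dataset_id=<DATASET_ID> --queue default

That sends the local script to the queue (if I remember correctly there's also a --requirements flag for a requirements file), and the agent pulls it and runs it with the given argument.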
The scheduler is set to run once per hour, but even now I've got 40+ anonymous running tasks.
Let me tell you what I think is happening and you can correct me where I'm going wrong.
Under certain conditions at certain times, a Dataset is published, and that activates a Dataset Trigger. So if I publish one dataset every day, I activate a Dataset Trigger that day once it's published.
N publishes = N Triggers = N Anonymous Tasks, right?
Because those spawned processes come from a file called register_dataset.py; however, I'm personally not using any file like that, and I think it's a file from the library.
Can you spot something here? Because to me it still looks like it should only create a new Dataset object if the batch size requirement is fulfilled, after which it creates and publishes the dataset and empties the directory.
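In rough terms, this is what I expect the hourly function to do (BATCH_SIZE, the folder path, and project/name values are placeholders):

import os
from clearml import Dataset

WATCH_DIR = "/data/incoming"   # placeholder folder being filled with new files
BATCH_SIZE = 100               # placeholder threshold

files = os.listdir(WATCH_DIR)
if len(files) >= BATCH_SIZE:
    # Only when the batch size requirement is met: create, upload and close a dataset
    ds = Dataset.create(dataset_project="my_project", dataset_name="my_dataset")
    ds.add_files(path=WATCH_DIR)
    ds.upload()
    ds.finalize()
    # ...publish the dataset, then empty the directory
    for f in files:
        os.remove(os.path.join(WATCH_DIR, f))

So if the folder doesn't have enough files, nothing should be created at all, which is why the extra anonymous tasks confuse me.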
Once the dataset is published, a dataset trigger is activated in the checkbox_.... file, which creates a clearml-task for training the model.
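Roughly, the trigger side looks like this to me (a sketch with the clearml TriggerScheduler; the ids, queue, and project names are placeholders, and the exact add_dataset_trigger arguments may differ slightly by version):

from clearml.automation import TriggerScheduler

# Poll the server for newly published datasets
trigger = TriggerScheduler(pooling_frequency_minutes=60)

# When a dataset in this project is published, clone the training task
# template and enqueue it for an agent to run
trigger.add_dataset_trigger(
    name="train_on_publish",
    schedule_task_id="<TRAINING_TASK_TEMPLATE_ID>",
    schedule_queue="default",
    trigger_project="my_dataset_project",
)

trigger.start()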
Let me share the code with you, and how I think the pieces interact with each other.
I'll test it with the updated one.
I'm still a bit confused: since my function runs once per hour, why is the number of anonymous tasks growing indefinitely, even after I've closed the main schedulers?
Apparently it keeps calling this register_dataset.py script.
For anyone reading this: apparently there aren't any credentials for my own custom server for now. I just ran it without credentials and it seems to work.
{"meta":{"id":"c3edee177ae348e5a92b65604b1c7f58","trx":"c3edee177ae348e5a92b65604b1c7f58","endpoint":{"name":"","requested_version":1.0,"actual_version":null},"result_code":400,"result_subcode":0,"result_msg":"Invalid request path /","error_stack":null,"error_data":{}},"data":{}}
I get the following error.
This is the console output.
Thank you for the help.
I just copied the commands from the page in order and pasted them, all of the Linux ones specifically.
I don't think I changed anything.