Shouldn't I get redirected to the login page if I'm not logged in, instead of the dashboard? 😞
Big thank you though.
Thank you, I found the solution to my issue once I started reading about the default output URI.
Basically, when I have to re-run the experiment with different hyperparameters, I should clone the previous experiment, change the hyperparameters, and then put it in the queue?
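In SDK terms, that flow would look roughly like the sketch below (the task ID, parameter names and queue name are placeholders, not anything from my setup):

from clearml import Task

# Grab the finished experiment and clone it (task ID is a placeholder)
template = Task.get_task(task_id='<previous_task_id>')
cloned = Task.clone(source_task=template, name='same experiment, new hyperparameters')

# Override only the hyperparameters that should change (section/names are examples)
cloned.set_parameters({'General/learning_rate': 0.001, 'General/batch_size': 64})

# Send the edited clone to an execution queue
Task.enqueue(cloned, queue_name='default')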
For anyone reading this: apparently there aren't any credentials for my own custom server for now. I just ran it without credentials and it seems to work.
Thanks for the help.
Additional Error info
Launching job: ScheduleJob(name='Watch Checkbox Detector Buffer Folder', base_task_id='', base_function=<function watch_folder at 0x7f6e308b6840>, queue=None, target_project=None, single_instance=False, task_parameters=None, task_overrides=None, clone_task=True, _executed_instances=['140111227815680', '140111227815680', '140111227815680', '140111227815680'], execution_limit_hours=None, recurring=True, starting_time=datetime.datetime(2021, 11, 25, 9, 45, 41, 175873), min...
The setup is on a single machine. I have a NAS mounted and I'm watching a folder on it; if there are sufficient images, it should publish the data. But since I was using start_remotely, the code was running somewhere else and couldn't access the folder.
the one mentioned on the page.
For anyone reading this: I think I've got an understanding now. I can add folders to a dataset, so I'll be creating a single dataset and will just keep adding folders to it, then keep records of it in a database.
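As a rough sketch of what I mean (the dataset/project names and folder path are placeholders), each new batch of folders can be added as a child version of the existing dataset:

from clearml import Dataset

# Latest version of the dataset (names are placeholders)
parent = Dataset.get(dataset_project='data-repo', dataset_name='images')

# New version on top of it, containing the newly added folder
child = Dataset.create(
    dataset_name='images',
    dataset_project='data-repo',
    parent_datasets=[parent],
)
child.add_files('/mnt/nas/new_folder')  # placeholder path
child.upload()
child.finalize()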
Understandable. I mainly have regular image data, not video sequences, so I can do the train/test splits like you mentioned normally. What about the epochs, though? Is there a recommended number of epochs when you train on that new batch?
When I pass the repo to clearml-task with the parameters, it runs fine and finishes. But when I clone and run the task again, I get the above assert error; I don't know why.
from clearml import Dataset

# Create the dataset, register the local files and upload them
dataset = Dataset.create(dataset_name=data_name, dataset_project=project_name)
print('Dataset created, adding files...')
dataset.add_files(data_dir)
print('Files added successfully, uploading files...')
dataset.upload(output_url=upload_dir, show_progress=True)
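For completeness, after upload() the dataset version is normally closed with finalize(), and publishing it (which is what a publish trigger listens for) is a separate call; assuming the same dataset object as above:

dataset.finalize()  # close this dataset version
dataset.publish()   # optional: marks the version as published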
Also, my execution just completed, and as of yet I can only see the hyperparameters as a report, not in a configurable form. I've just started with ClearML and am running into these issues.
So in my head, every time I publish a dataset, the trigger should fire and run that task.
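For reference, a dataset-publish trigger is typically wired up along these lines (a sketch assuming clearml's TriggerScheduler; the trigger name, task ID, queue and project are placeholders):

from clearml.automation import TriggerScheduler

trigger = TriggerScheduler(pooling_frequency_minutes=3)

# Run the prepared task whenever a dataset in this project is published
trigger.add_dataset_trigger(
    name='retrain-on-new-data',
    schedule_task_id='<task_to_run_id>',
    schedule_queue='default',
    trigger_project='data-repo',
    trigger_on_publish=True,
)
trigger.start()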
When I try to access the server with the IP I set as CLEARML_HOST_IP, it looks like this. I set that IP to the one assigned to me by the network.
Currently a checkpoint is saved every 2000 iterations; that's just part of the code. Since output_uri=True, it gets uploaded to the ClearML server.
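In code terms that's roughly the following (the project/task names here are placeholders); with output_uri=True, checkpoints written by the training framework are uploaded to the ClearML file server:

from clearml import Task

task = Task.init(
    project_name='checkbox-detector',  # placeholder
    task_name='train',                 # placeholder
    output_uri=True,  # upload framework checkpoints to the ClearML server
)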
I just copied the commands from the page in order and pasted them, all of the Linux ones specifically.
The server is on a different machine. I'm experimenting on the same machine though.
It works; however, it shows the task as enqueued and pending. Note I'm using .start() and not .start_remotely() for now.
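Roughly the difference between the two calls, assuming a TaskScheduler-style object as in the log above (the schedule and function body are placeholders):

from clearml.automation import TaskScheduler

def check_buffer_folder():
    # placeholder for the folder-watching logic
    pass

scheduler = TaskScheduler()
scheduler.add_task(
    name='Watch Checkbox Detector Buffer Folder',
    schedule_function=check_buffer_folder,
    minute=30,        # placeholder schedule
    recurring=True,
)

scheduler.start()                             # runs in the current process
# scheduler.start_remotely(queue='services')  # would run on a remote agent instead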
I get the following error.
import os

from clearml import Dataset


def watch_folder(folder, batch_size):
    count = 0
    classes = os.listdir(folder)
    class_count = len(classes)
    files = []
    dirs = []
    # Count the files in every class sub-folder
    for cls in classes:
        class_dir = os.path.join(folder, cls)
        fls = os.listdir(class_dir)
        count += len(fls)
        files.append(fls)
        dirs.append(class_dir)
    # Once enough images have accumulated, version and upload them
    if count >= batch_size:
        dataset = Dataset.create(dataset_name='buffer-batch',  # placeholder name
                                 dataset_project='data-repo')
        dataset.add_files(folder)
        dataset.upload()
        dataset.finalize()
I checked, and it seems when I run an example from git, it works as it should, but when I try to run my own script, the draft is in read-only mode.
So I only published a dataset once, but it keeps scheduling the task.
I actually just asked about this in another thread. Here's the link. It was about the usage of upload_artifact.
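For completeness, the usage being asked about looks roughly like this (the artifact name and object are just placeholders):

from clearml import Task

task = Task.current_task()  # or the handle returned by Task.init(...)
# Attach an arbitrary object (dict, dataframe, file path, ...) to the task
task.upload_artifact(name='validation_results', artifact_object={'accuracy': 0.93})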