Also I need to modify the code to only keep the N best checkpoints as artifacts and remove the others.
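Not knowing your exact training loop, here is a rough sketch of one way to prune checkpoint artifacts, assuming Task.delete_artifacts is available in your clearml version and that the scores dict (artifact name to validation loss) is something you maintain yourself:

from clearml import Task

def keep_n_best_checkpoints(task: Task, scores: dict, n: int = 3):
    # rank artifact names by score, best (lowest loss) first
    ranked = sorted(scores, key=scores.get)
    # everything beyond the first n that is still registered gets removed
    to_delete = [name for name in ranked[n:] if name in task.artifacts]
    if to_delete:
        # delete_artifacts also removes the uploaded files from storage by default
        task.delete_artifacts(to_delete)

# illustrative usage:
# keep_n_best_checkpoints(Task.current_task(), {"ckpt_epoch_3": 0.21, "ckpt_epoch_7": 0.18}, n=1)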
Retrying (Retry(total=239, connect=239, read=240, redirect=240, status=240)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fb2191dcaf0>: Failed to establish a new connection: [Errno 111] Connection refused')': /auth.login
Retrying (Retry(total=238, connect=238, read=240, redirect=240, status=240)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fb2191e10a0>: Failed to establish a new connection: ...
can you point me to where I should look?
from sklearn.datasets import load_iris
import tensorflow as tf
import numpy as np
from clearml import Task, Logger
import argparse
def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--epochs', metavar='N', default=64, type=int)
    args = parser.parse_args()
    parsed_args = vars(args)
    task = Task.init(project_name="My Workshop Examples", task_name="scikit-learn joblib example")
    iris = load_iris()
    data = iris.data
    target = iris.target
    ...
Were you able to reproduce it, CostlyOstrich36?
If there aren't N datasets, the function step doesn't squash the datasets and instead just returns -1.
Thus, if I get -1, I want the pipeline execution to end or the subsequent task to be skipped.
I have checked the args; the value is indeed -1. Unless there is some other way to do conditional pipeline step execution.
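One way to get conditional execution is the pre_execute_callback of PipelineController.add_step: returning False from it skips the node. A sketch along those lines, assuming the callback receives the resolved parameter_override dict and that the step, project, and parameter names below are illustrative:

from clearml.automation import PipelineController

def skip_if_no_merge(pipeline, node, parameters):
    # ClearML stores parameters as strings, hence the "-1" comparison
    return parameters.get("General/dataset_id") != "-1"

pipe = PipelineController(name="conditional-example", project="My Workshop Examples", version="1.0")
pipe.add_step(name="merge_step",
              base_task_project="My Workshop Examples",
              base_task_name="merge datasets")
pipe.add_step(name="train_step", parents=["merge_step"],
              base_task_project="My Workshop Examples",
              base_task_name="train",
              parameter_override={"General/dataset_id": "${merge_step.parameters.General/merged_dataset_id}"},
              pre_execute_callback=skip_if_no_merge)
pipe.start()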
So in my case, where I schedule a task every time I publish a dataset, when I publish my dataset once, it triggers and starts a new task.
Shouldn't checkpoints be uploaded immediately? That's the purpose of checkpointing, isn't it?
To me it still looks like the only difference is that the non-mutable copy is downloaded to the cache folder, while the mutable copy downloads to the directory I want. I could delete files from both sets, so it seems like it's up to the user to make sure not to mutate the non-mutable download in the cache folder.
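For reference, this is roughly how the two calls differ, assuming the standard Dataset API (project and folder names below are illustrative):

from clearml import Dataset

ds = Dataset.get(dataset_project="data-repo", dataset_name="my-dataset")

# cached copy: lands in the ClearML cache folder and may be shared between
# tasks and processes, so by convention it should be treated as read-only
cached_path = ds.get_local_copy()

# mutable copy: copied into a folder you own, safe to modify or delete from
work_path = ds.get_mutable_local_copy(target_folder="./dataset_work_copy")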
It's a simple DAG pipeline.
I have a step at which I want to run a task that finds the model I need.
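If that step only needs to look up a model, a minimal sketch using Model.query_models might look like this (the project name and tag filter are assumptions):

from clearml import Model

def find_model():
    # query published models in the project; filters are illustrative
    models = Model.query_models(project_name="My Workshop Examples",
                                tags=["production"],
                                only_published=True)
    if not models:
        return None
    # hand the model id on to the next step
    return models[0].id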
The only issue is that even though it's a bool, it's stored as "False", since ClearML stores the args as strings.
From what I recall, resume was set to false both originally and in the cloned task.
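Since the values come back as strings, bool("False") evaluates to True, so a common workaround is to parse booleans explicitly in argparse. A small sketch:

import argparse

def str2bool(value):
    # accept the string forms that may be handed back ("False", "true", "1", ...)
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() in ("true", "1", "yes", "y")

parser = argparse.ArgumentParser()
parser.add_argument("--resume", type=str2bool, default=False)
args = parser.parse_args()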
For anyone reading this: apparently there aren't any credentials for my own custom server for now. I just ran it without credentials and it seems to work.
I have a lot of anonymous tasks running which I would like to close immediately.
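One way to stop tasks in bulk is to query them and mark them stopped; how you filter down to the anonymous ones depends on your setup, so the project name and status filter below are assumptions:

from clearml import Task

# fetch running tasks from the project the anonymous tasks ended up in
tasks = Task.get_tasks(project_name="data-repo",
                       task_filter={"status": ["in_progress"]})
for t in tasks:
    t.mark_stopped()  # force the task into the stopped state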
Ok. I kind of have a confusion now. Suppose I have an agent listening to some Queue X. If someone else on some other machine enqueues their task on Queue X, will my agent run it?
Oh oh oh. Wait a second. I think I get what you're saying. When I originally create the clearml-task, since I'm not passing the argument myself, it just uses the value False.
This problem occurs when I'm scheduling a task. Copies of the task keep being put on the queue even though the trigger only fired once.
I checked, and it seems that when I run an example from git, it works as it should, but when I try to run my own script, the draft is in read-only mode.
It keeps retrying and failing when I use Dataset.get.
Is this the correct way to upload an artifact?
checkpoint.split('.')[0] is the name I want assigned to it, and the second argument is the path to the file.
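For comparison, the call shape I would expect with Task.upload_artifact looks roughly like this, assuming checkpoint holds the file name (the names and paths below are illustrative):

import os
from clearml import Task

task = Task.current_task()
checkpoint = "model_epoch_10.pt"                  # illustrative file name
checkpoint_path = os.path.join("checkpoints", checkpoint)

# first argument: the artifact name; second: the object or path to upload
task.upload_artifact(name=checkpoint.split('.')[0], artifact_object=checkpoint_path)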
After the step which gets the merged dataset, should I use pipe.stop if it returned -1?
import os
from clearml import Dataset

def watch_folder(folder, batch_size):
    count = 0
    classes = os.listdir(folder)
    class_count = len(classes)
    files = []
    dirs = []
    # count the files under each class sub-folder
    for cls in classes:
        class_dir = os.path.join(folder, cls)
        fls = os.listdir(class_dir)
        count += len(fls)
        files.append(fls)
        dirs.append(class_dir)
    # once enough files have accumulated, version them as a new dataset
    if count >= batch_size:
        dataset = Dataset.create(dataset_project='data-repo')
        dataset.add_files(folder)
        dataset.upload()
        dataset.finalize()
let me check
Or is there any specific link you can recommend for trying to create my own server?
Quick follow-up question: once I parse args, should they be directly available before I even enqueue it for the first time, or will I only be able to access the hyperparameters after running it once?
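For reference, with Task.init in the script, the argparse arguments are picked up automatically and appear under the Args section once the script has run at least once; a cloned or enqueued copy then receives whatever values are edited in the UI. A minimal sketch:

import argparse
from clearml import Task

task = Task.init(project_name="My Workshop Examples", task_name="argparse example")

parser = argparse.ArgumentParser()
parser.add_argument("--epochs", type=int, default=64)
args = parser.parse_args()  # logged automatically as hyperparameters

# when a clone runs under an agent, args reflects any values edited in the UI
print(args.epochs)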
However, I have another problem. I have a dataset trigger that has a scheduled task.
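For context, a dataset trigger that launches a scheduled task is typically set up roughly like this; the task id, queue, and project names are assumptions, and if the trigger fires repeatedly for a single publish it may be worth checking the trigger_on_publish and polling settings:

from clearml.automation import TriggerScheduler

trigger = TriggerScheduler()  # default polling settings

# fire whenever a dataset in `trigger_project` is published, and enqueue a copy
# of the template task onto the given queue (ids and names are illustrative)
trigger.add_dataset_trigger(
    name="retrain-on-new-data",
    schedule_task_id="<template_task_id>",
    schedule_queue="default",
    trigger_project="data-repo",
    trigger_on_publish=True,
)

trigger.start_remotely(queue="services")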