
I haven't tried it yet, but I thought about dataset.upload(output_url=)
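A minimal sketch of the `output_url` idea, assuming the `clearml` package, whose `Dataset.upload` accepts an `output_url` argument. The helper name `upload_dataset_to`, the bucket URL, and the project and dataset names are hypothetical placeholders, not taken from the thread.

```python
def upload_dataset_to(dataset, output_url):
    """Upload the dataset's pending files to `output_url`.

    `dataset` is expected to be a clearml.Dataset; any object with a
    compatible `upload(output_url=...)` method works as well.
    """
    # Forward the destination unchanged, e.g. "s3://my-bucket/datasets" (hypothetical).
    dataset.upload(output_url=output_url)
    return output_url


def example_usage():
    # Needs a configured ClearML environment; all names here are placeholders.
    from clearml import Dataset

    ds = Dataset.create(dataset_name="example", dataset_project="example_project")
    ds.add_files("data/", wildcard="*.csv")
    upload_dataset_to(ds, "s3://my-bucket/datasets")
```

`example_usage` is only defined, not called, so the snippet can be imported without a ClearML server.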
AgitatedDove14 I didn't know about dataset.squish(). Thank you. I'll check this variant today
TimelyPenguin76 , rechecking this situation with clearml-agent 1.0.0 now...
TimelyPenguin76 , thank you. I'll try now
TimelyPenguin76 , any news regarding this?
Do I understand correctly that I can avoid termination of a task (including a dataset task) if I update it somehow periodically (say, by sending a log line)?
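The "send a log line once a period" idea could be sketched as a small background heartbeat. Whether this actually keeps the server from aborting the task is exactly the open question above and is not confirmed here. The helper `start_heartbeat`, the 600-second interval, and the project and task names are hypothetical; `Task.get_logger()` and `Logger.report_text()` are existing `clearml` calls.

```python
import threading


def start_heartbeat(report, interval_sec, message="still alive"):
    """Call `report(message)` every `interval_sec` seconds in a daemon thread.

    Returns a threading.Event; set it to stop the heartbeat.
    """
    stop = threading.Event()

    def _loop():
        # Event.wait returns False on timeout and True once `stop` is set.
        while not stop.wait(interval_sec):
            report(message)

    threading.Thread(target=_loop, daemon=True).start()
    return stop


def example_usage():
    # Assumption: a configured ClearML environment; names are placeholders.
    from clearml import Task

    task = Task.init(project_name="example_project", task_name="long dataset update")
    stop = start_heartbeat(task.get_logger().report_text, interval_sec=600)
    try:
        pass  # long-running work goes here
    finally:
        stop.set()
```

Passing the reporting function in as a plain callable keeps the heartbeat independent of ClearML, so the same loop can be tried with `print` first.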
SuccessfulKoala55 the second option
SuccessfulKoala55 , I have the following structure now (maybe it's not best practice and you can suggest a better one). There is a sequence of tasks that are run manually or from a pipeline. Every task updates some dataset at the end. The dataset should be closed only after the whole sequence is finished (and some tasks in the sequence can take more than two days). The issue I want to avoid is the dataset task that these regular tasks update being aborted.
SuccessfulKoala55 thank you. It worked
Hi SuccessfulKoala55 Thank you for the response. So, it's not possible if we use the community server, right?
` Current configuration (clearml_agent v0.17.2, location: /home/olga/clearml.conf):
api.version = 1.5
api.verify_certificate = true
api.default_version = 1.5
api.http.max_req_size = 15728640
api.http.retries.total = 240
api.http.retries.connect = 240
api.http.retries.read = 240
api.http.retries.redirect = 240
api.http.retries.status = 240
api.http.retries.backoff_factor = 1.0
api.http.retries.backoff_max = 120.0
api.http.wait_on_maintenance_forever = true
api.http.pool_...
Hi AgitatedDove14 . Thank you. Yes, pipelines and a clearml-agent on an environment that runs some parallelization framework are options. I'll look in this direction
TimelyPenguin76 , the same behaviour with clearml-agent 1.0.0
As I remember, I added it because it was not added automatically. But I'll recheck now...
TimelyPenguin76 , sorry, I didn't see this comment. No. I mean that when I run the task locally (from PyCharm and without task.execute_remotely()), the model is uploaded and registered. But when I do the same with task.execute_remotely() and it runs on an agent, the model cannot be found in the task afterwards. I'm talking about the same script I sent in the second thread
TimelyPenguin76 , it worked. Thank you!
VexedCat68 I would try to find the process on the machine with something like 'ps aux | grep clearml ' and kill it
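The `ps aux | grep clearml` step can be sketched in Python as below. It assumes the usual 11-column `ps aux` layout with the command in the last column; the helper names `find_pids` and `list_clearml_pids` are hypothetical.

```python
import subprocess


def find_pids(ps_output, pattern):
    """Return PIDs of `ps aux` rows whose command column contains `pattern`."""
    pids = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        # Split into at most 11 fields so the command keeps its spaces.
        fields = line.split(None, 10)
        if len(fields) == 11 and pattern in fields[10]:
            pids.append(int(fields[1]))
    return pids


def list_clearml_pids():
    # Runs the real `ps aux`; only meaningful on a Unix-like machine.
    out = subprocess.run(
        ["ps", "aux"], capture_output=True, text=True, check=True
    ).stdout
    return find_pids(out, "clearml")
```

A PID returned by `list_clearml_pids()` could then be stopped with `os.kill(pid, signal.SIGTERM)`, after checking that it really is the stuck process.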
TimelyPenguin76 , thank you. Trying...
SuccessfulKoala55 Without the commented line, files are uploaded to http://files.community.clear.ml instead of my S3 bucket
TimelyPenguin76 , thank you for the explanation. 1) Great. 2) As you can see from my screenshot, a Data Processing task is created, but I don't see the Datasets tab as I see it in https://clear.ml/blog/construction-feat-tf2-object-detection-api/ 3) I see. So I need to specify it with every CLI command/SDK method call
TimelyPenguin76 OK, when no explicit artifact upload is done, it indeed uploads the model when run locally, but not when run remotely
TimelyPenguin76 , thank you for being willing to help. A small project is attached. load_mnist.py generates a dataset, model_train.py is the script in question (it uses the dataset generated by load_mnist.py)
SuccessfulKoala55 To be more specific, I mean situations when training is long and its parts can be parallelized in some way, as in Spark or Dask. I suspect that such functionality is framework-specific, and it's hard to believe it is a focus of ClearML, which is more or less framework-agnostic. On the other hand, ClearML has many integrations with specific frameworks. So I'd like to understand whether there is any kind of support at the general ClearML level or as part of the integrations with fra...
Hi SuccessfulKoala55 Here is the code snippet
` task = Task.init(project_name=PROJECT_NAME, task_name=section)
task.connect(params)
print('params', params)
dataset = Dataset.create(dataset_name=params['dataset'], dataset_project=PROJECT_NAME)
dataset_local_dir = dataset.get_local_copy()
dataset._task.output_uri = task.output_uri
KeywordProcessor(params['es_host'], params['es_port'], True, DOCS_ROOT)
dataset.add_files(DOCS_ROOT, wildcard='*.csv')
dataset.upload() `
I add several files to a da...
ReassuredTiger98 , Ah, interesting. Thank you. I'll recheck it