using the helm charts
https://github.com/allegroai/clearml-helm-charts
I'm looking for the bucket URI
I think my workflow needs to change.
get the data into the bucket and then create the Dataset using add_external_files,
and then be able to consume the data locally or stream it. And then I can use link_entries
Is the parent relationship only available via the _get_parents() method?
we want to use the dataset output_uri as a common ground to create additional dataset formats such as https://webdataset.github.io/webdataset/
In order to create a webdataset
we need to create tar files -
so we need to unzip and then recreate the tar file.
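The unzip-then-retar step can be sketched with the standard library alone; `zip_to_tar` and the paths below are hypothetical names for illustration, not part of clearml or webdataset:

```python
import os
import tarfile
import tempfile
import zipfile

def zip_to_tar(zip_path: str, tar_path: str) -> None:
    """Extract a zip archive and repack its files into an (uncompressed) tar,
    e.g. as a shard for webdataset-style consumption."""
    with tempfile.TemporaryDirectory() as tmp:
        with zipfile.ZipFile(zip_path) as zf:
            zf.extractall(tmp)
        with tarfile.open(tar_path, "w") as tf:
            for root, _dirs, files in os.walk(tmp):
                for name in sorted(files):
                    full = os.path.join(root, name)
                    # store paths relative to the extraction root
                    tf.add(full, arcname=os.path.relpath(full, tmp))
```

webdataset expects plain tar shards, so writing with mode "w" (no compression) keeps the output directly consumable.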
Additionally when the files are in GCS in the raw format you can easily review them with the preview (e.g. a wav file can be directly listened within the GCP console - web browser).
I think the main difference is that I can see a value of having access to the raw format within the cloud vendor and not only have it as an archive
from the example - since the `mp_handler` runs
```python
cmd = [sys.executable, sys.argv[0],
       '--counter', str(counter - 1),
       '--num_workers', str(args.num_workers),
       '--use-subprocess' if args.subprocess else '--no-subprocess']
p = subprocess.Popen(cmd, cwd=os.getcwd())
```
can I run another subprocess in the mp_worker?
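In general yes - a multiprocessing worker can itself spawn a subprocess. A minimal standalone sketch (this `mp_worker` is a stand-in for illustration, not the function from the example):

```python
import subprocess
import sys
from multiprocessing import Pool

def mp_worker(idx: int) -> int:
    # a worker process can itself spawn a subprocess;
    # here the child interpreter just echoes a computed value back
    out = subprocess.check_output(
        [sys.executable, "-c", f"print({idx} * 2)"]
    )
    return int(out.strip())

if __name__ == "__main__":
    with Pool(2) as pool:
        print(pool.map(mp_worker, [1, 2, 3]))  # → [2, 4, 6]
```

Whether this plays well with clearml's task monitoring is a separate question, but at the OS level nothing prevents nesting.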
CostlyOstrich36 - but we will use any method that will allow us to save the files as parquet.
We are not yet using clearml Dataset
- I'm not sure if this is a solution
Well - that will convert it to a binary pickle format, but not to parquet -
since the artifact will be accessed from other platforms, we want to use parquet
Thx CostlyOstrich36 for your reply
Can't see the reference to parquet.
We are currently using the above functionality, but the pd.DataFrame
is only saved as csv compressed with gz
SmugDolphin23 Where can I check the latest RC? I was not able to find it in the clearml github repo
Hi HugeArcticwolf77
I've run the following code - which uploads the files with compression, even though compression=None
```python
ds.upload(show_progress=True, verbose=True, output_url='…', compression=None)
ds.finalize(verbose=True, auto_upload=True)
```
Any idea why?
shape -> tuple([int],[int])
I decided to use
._task.upload_artifact(name='metadata', artifact_object=metadata)
where metadata is a dict
metadata = {**metadata, **{"name":f"{Path(file_tmp_path).name}", "shape": f"{df.shape}"}}
We need to convert it to a DataFrame since
Displaying metadata in the UI is only supported for pandas Dataframes for now. Skipping!
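Wrapping the dict in a one-row DataFrame is enough to make it previewable; a minimal sketch with hypothetical values standing in for the real metadata:

```python
from pathlib import Path

import pandas as pd

# hypothetical values, mirroring the metadata dict above
file_tmp_path = "/tmp/data_000.wav"
metadata = {"name": Path(file_tmp_path).name, "shape": "(100, 4)"}

# a one-row DataFrame can be previewed in the UI, while a plain dict cannot
metadata_df = pd.DataFrame([metadata])
print(metadata_df)
```

Note that `pd.DataFrame([metadata])` treats the dict as a single row; passing the dict directly would instead try to build columns from its values.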
This is my current solution:
```python
[ds for ds in dataset.list_datasets() if ds['project'].split('/')[0] == <PROJECT_NAME>]
```
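Assuming, as in that snippet, that each entry returned by `list_datasets()` is a dict with a 'project' path, the filter can be wrapped in a small helper (the names and sample entries below are illustrative):

```python
def filter_by_root_project(datasets, project_name):
    """Keep datasets whose top-level project (before the first '/') matches."""
    return [ds for ds in datasets if ds["project"].split("/")[0] == project_name]

# illustrative entries shaped like the dicts returned by list_datasets()
datasets = [
    {"id": "1", "project": "audio/raw"},
    {"id": "2", "project": "video/raw"},
]

print(filter_by_root_project(datasets, "audio"))  # → [{'id': '1', 'project': 'audio/raw'}]
```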
AgitatedDove14 -
I also tried https://github.com/allegroai/clearml-session , running the session
within docker, but got the same error
```shell
clearml-session --docker --git-credentials
```
(there is a typo in the flag: it is spelled --git-credentilas instead of --git-credentials)
and still got the same error
```
clearml_agent: ERROR: Can not run task without repository or literal script in script.diff
```
I found the task in the UI -
and in the UNCOMMITTED CHANGES
execution section there is
No changes logged
Any other suggestions?
I'm checking whether our firewall, which sits between the clearml-agent
machine and the local computer running the session, could be the cause
Hi SuccessfulKoala55
I've run the daemon via docker:
```shell
CLEARML_WORKER_ID=XXXX clearml-agent daemon --queue MY_QUEUE --docker --detached
```
and then run the session via docker:
```shell
clearml-session --docker nvidia/cuda:10.1-cudnn7-runtime-ubuntu18.04 \
  --packages "clearml" "tensorflow>=2.2" "keras" \
  --queue MY_QUEUE \
  --verbose
```
However I'm still getting the same error
```
clearml_agent: ERROR: Can not run task without repository or literal script in script.diff
```
Well, it seems we have a similar issue: https://github.com/allegroai/clearml-agent/issues/86
currently we are just creating a new worker and on a separate queue
updated the clearml.conf with an empty worker_id/name, then ran
```shell
clearml-agent daemon --stop
```
ran top | grep clearml and killed the pids, then ran
```shell
clearml-agent list
```
still both of the workers are listed
not sure I understand
running `clearml-agent list` I get
```
workers:
- company:
    id: d1bd92...1e52b
    name: clearml
  id: clearml-server-...wdh:0
  ip: x.x.x.x
...
```
yes - the pre_installations.sh runs and completes, but the pytorch/main.py
file doesn't run,
so the Task completes successfully but without running the script