I saw https://clear.ml/docs/latest/docs/references/sdk/dataset/#verify_dataset_hash - but I don't think it is the correct one. the https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.shape.html property
@AgitatedDove14 -
I'm getting the following error when running the following code within the mp_worker
command = ["ffmpeg","-i",f"{url}","-vcodec","libx264", "output.mp4"]
subprocess.run(command, stderr=subprocess.STDOUT)
TypeError: fork_exec() takes exactly 21 arguments (17 given)
Any suggestions?
from the example -
since the `mp_handler` runs
cmd = [sys.executable, sys.argv[0],
'--counter', str(counter - 1),
'--num_workers', str(args.num_workers),
'--use-subprocess' if args.subprocess else '--no-subprocess']
p = subprocess.Popen(cmd, cwd=os.getcwd())
can I run another subprocess in the mp_worker?
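For reference, a minimal sketch of a worker that launches its own subprocess. It does not diagnose the fork_exec() TypeError above; it only shows the nesting pattern. The worker body and the demo commands are assumptions for illustration (the Python interpreter stands in for ffmpeg so the snippet runs anywhere), and only the name mp_worker is taken from the example.

```python
import subprocess
import sys
from multiprocessing import Pool


def mp_worker(command):
    """Run one external command and return (exit code, combined output)."""
    completed = subprocess.run(
        command,
        stdout=subprocess.PIPE,    # capture stdout ...
        stderr=subprocess.STDOUT,  # ... and merge stderr into it
        text=True,
        check=False,               # let the caller inspect the exit code
    )
    return completed.returncode, completed.stdout


if __name__ == "__main__":
    # Hypothetical stand-in commands; replace with the real ffmpeg call.
    commands = [[sys.executable, "-c", f"print('job {i}')"] for i in range(3)]
    with Pool(processes=2) as pool:
        for code, output in pool.map(mp_worker, commands):
            print(code, output.strip())
```

Passing stdout=subprocess.PIPE together with stderr=subprocess.STDOUT returns both streams in one string, which makes the ffmpeg log easier to inspect from the parent process.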
I'm looking for the bucket URI
I think my workflow needs to change:
get the data into the bucket, then create the Dataset using add_external_files,
and then be able to consume the data locally or stream it. And then I can use link_entries
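A rough sketch of that workflow, assuming the files are already in the bucket. The function names and arguments are hypothetical; the ClearML calls used are Dataset.create, add_external_files, upload, finalize, Dataset.get and get_local_copy. It needs a configured ClearML server and credentials for the bucket, so treat it as an outline rather than a tested recipe.

```python
def create_external_dataset(bucket_uri, dataset_name, dataset_project):
    """Register files that already live in a bucket as a ClearML dataset."""
    # Imported here so the sketch can be loaded without ClearML configured.
    from clearml import Dataset

    ds = Dataset.create(dataset_name=dataset_name, dataset_project=dataset_project)
    # Only links to the bucket objects are recorded; the data is not copied.
    ds.add_external_files(source_url=bucket_uri)
    ds.upload()    # commits the dataset state
    ds.finalize()  # closes the version so it can be consumed
    return ds.id


def fetch_local_copy(dataset_id):
    """Return a local folder containing a cached copy of the dataset."""
    from clearml import Dataset

    return Dataset.get(dataset_id=dataset_id).get_local_copy()
```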
not sure I understand
running clearml-agent list
I get
`
workers:
- company:
id: d1bd92...1e52b
name: clearml
id: clearml-server-...wdh:0
ip: x.x.x.x
... `
shape -> tuple([int],[int])
I decided to use
._task.upload_artifact(name='metadata', artifact_object=metadata)
where metadata is a dict
metadata = {**metadata, **{"name":f"{Path(file_tmp_path).name}", "shape": f"{df.shape}"}}
We need to convert it to a DataFrame since
Displaying metadata in the UI is only supported for pandas Dataframes for now. Skipping!
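A minimal sketch of that conversion, assuming a one-row DataFrame is acceptable for display. The helper name build_metadata_frame, the sample data and the file path are made up for illustration; the resulting frame is what would be passed as artifact_object in the upload_artifact call above.

```python
from pathlib import Path

import pandas as pd


def build_metadata_frame(df, file_path, extra=None):
    """Collect file name and shape into a one-row DataFrame."""
    metadata = dict(extra or {})
    metadata.update({"name": Path(file_path).name, "shape": str(df.shape)})
    # A list holding one dict becomes a single row, one column per key.
    return pd.DataFrame([metadata])


# Hypothetical sample data and path.
data = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
meta_df = build_metadata_frame(data, "/tmp/example.parquet", {"source": "demo"})
print(meta_df)
```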
Strange
I ran clearml-agent daemon --stop
and after 10 min I ran clearml-agent list
and I still see a worker
ClearML key/secret provided to the agent
When is this provided? Is this during the build?
I found the task in the UI - and in the UNCOMMITTED CHANGES execution section there is "No changes logged"
Any other suggestions?
Hi SuccessfulKoala55
I've run the daemon via docker
CLEARML_WORKER_ID=XXXX clearml-agent daemon --queue MY_QUEUE --docker --detached
and then run the session via docker
clearml-session --docker nvidia/cuda:10.1-cudnn7-runtime-ubuntu18.04 \
  --packages "clearml" "tensorflow>=2.2" "keras" \
  --queue MY_QUEUE \
  --verbose
However I'm still getting the same error
clearml_agent: ERROR: Can not run task without repository or literal script in `script.diff`
Looking in the repo I was not able to find an example - only a reference to https://github.com/allegroai/clearml/blob/b9b0a506f35a414f6a9c2da7748f3ec3445b7d2d/docs/clearml.conf#L13 - do I just need to add company.id or user.id in the credential dict?
Thx for investigating - What is the use case for such behavior ?
How would you use the user properties as part of an experiment?
I'm guessing .1 is because there were datasets that I could not see - but actually they were there (as sub projects). So everything is related
Dataset.list_datasets(dataset_project='XXXX')
Always returns an empty list
Hi AgitatedDove14
OK - the issue was the firewall rules that we had.
Now both the jupyter lab and vscode servers are up.
But now there is an issue with the Setting up connection to remote session
After the
Environment setup completed successfully
Starting Task Execution:
ClearML results page:
There is a WARNING
clearml - WARNING - Could not retrieve remote configuration named 'SSH'...
Well - that will convert it to a binary pickle format but not as parquet -
since the artifact will be accessed from other platforms we want to use parquet
Thx CostlyOstrich36 for your reply
Can't see the reference to parquet. We are currently using the above functionality, but the pd.DataFrame is only saved as csv compressed by gz
This is my current solution
[ds for ds in Dataset.list_datasets() if ds['project'].split('/')[0] == <PROJECT_NAME>]
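The same filter as a small helper, run here on made-up entries so it works without a server. It assumes each item is a dict with a 'project' key, as in the comprehension above; the helper name and the sample project strings are hypothetical.

```python
def datasets_under_project(datasets, root_project):
    """Keep datasets whose top-level project equals root_project."""
    return [
        ds for ds in datasets
        if ds.get("project", "").split("/")[0] == root_project
    ]


# Hypothetical entries standing in for the result of Dataset.list_datasets().
sample = [
    {"name": "images", "project": "XXXX/.datasets/images"},
    {"name": "other", "project": "YYYY/.datasets/other"},
    {"name": "top", "project": "XXXX"},
]
matches = datasets_under_project(sample, "XXXX")
print([ds["name"] for ds in matches])
```

Comparing only the first path segment is what lets datasets stored in sub projects show up alongside those in the top-level project.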
Hi CostlyOstrich36 ,
After verifying - I can confirm that there is no custom certificate.
any other ideas?
Hi HugeArcticwolf77
I've run the following code - which uploads the files with compression, although compression=None
ds.upload(show_progress=True, verbose=True, output_url='
', compression=None)
ds.finalize(verbose=True, auto_upload=True)
Any idea why?
Possibly - thinking more of https://github.com/pytorch/data/blob/main/examples/vision/caltech256.py - using clearml dataset as root path.
Well, it seems that we have a similar issue: https://github.com/allegroai/clearml-agent/issues/86
Currently we are just creating a new worker on a separate queue