@<1523701087100473344:profile|SuccessfulKoala55> this is the execution section of the task.
It either cannot create the local code file from the uncommitted changes, or it can't find python...
@<1523701087100473344:profile|SuccessfulKoala55> Yes there is no docker involved and I have nothing in the venvs-builds folder.
@<1528546301493383168:profile|ThoughtfulElephant4> , why would you clone a dataset?
@<1523701070390366208:profile|CostlyOstrich36> I want to create a new project and use an existing dataset that others have already created on the ClearML server.
Hi @<1528546301493383168:profile|ThoughtfulElephant4> , where did you upload the dataset? Can you add the full log? If your colleague clones and enqueues - the code assumes that the files are local, no?
Execution log
from clearml import Dataset

ds = Dataset.create(
    dataset_project='Asteroid_Solution/.datasets/raw_asteroid_dataset',
    dataset_name='raw_asteroid_dataset',
    dataset_version='None'
)
ds.add_files(
    path='/tmp/nasa.csv',
    wildcard=None,
    local_base_folder=None,
    dataset_path=None,
    recursive=True
)
ds.upload(
    show_progress=True,
    verbose=False,
    output_url=None,
    compression=None
)
ds.finalize()
Are you running the task from a git repo? (also, can you show the top of the execution section?)
Hi @<1523701070390366208:profile|CostlyOstrich36> The ClearML server is on AWS. It created a dataset artifact when my colleague uploaded it, but when I try to clone and enqueue it, it fails.
How did you create the dataset originally, can you share a snippet that reproduces this?
@<1528546301493383168:profile|ThoughtfulElephant4> how is the ClearML files server configured on your machine? Is it None?
but what does your clearml.conf define as the files host address?
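For anyone checking the same thing: the files host address is normally the `files_server` entry in the `api` section of clearml.conf. The sketch below is only a rough, assumption-laden way to pull that value out of the file text with a regular expression. It is not a real HOCON parser, and the helper name `find_files_server` and the sample addresses are made up for illustration.

```python
import re


def find_files_server(conf_text):
    """Return the first non-commented files_server value found in conf_text, or None.

    Hypothetical helper: a line-based regex scan, not a full HOCON parse.
    """
    for line in conf_text.splitlines():
        stripped = line.strip()
        # Skip comment lines so a commented-out setting is not reported.
        if stripped.startswith("#") or stripped.startswith("//"):
            continue
        match = re.search(r'\bfiles_server\s*[:=]\s*"?([^"\s,}]+)', stripped)
        if match:
            return match.group(1)
    return None


# Made-up sample content, only to show the expected shape of the section.
SAMPLE_CONF = """
api {
    # files_server: "http://old-host:8081"
    web_server: "http://localhost:8080"
    files_server: "http://localhost:8081"
}
"""

if __name__ == "__main__":
    print(find_files_server(SAMPLE_CONF))
```

If the function returns None for your real clearml.conf, the entry is either missing or written in a form this simple scan does not handle.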
Hi @<1523701070390366208:profile|CostlyOstrich36> here is the snippet
from clearml import Task, Dataset
import global_config
from data import database

task = Task.init(
    project_name=global_config.PROJECT_NAME,
    task_name='get data',
    task_type='data_processing',
    reuse_last_task_id=False
)

config = {
    'query_date': '2022-01-01'
}
task.connect(config)

# Get the data and a path to the file
query = 'SELECT * FROM asteroids WHERE strftime("%Y-%m-%d", `date`) <= strftime("%Y-%m-%d", "{}")'.format(config['query_date'])
df, data_path = database.query_database_to_df(query=query)
print(f"Dataset downloaded to: {data_path}")
print(df.head())

# Create a ClearML dataset
dataset = Dataset.create(
    dataset_name='raw_asteroid_dataset',
    dataset_project=global_config.PROJECT_NAME
)

# Add the local files we downloaded earlier
dataset.add_files(data_path)

dataset.get_logger().report_table(title='Asteroid Data', series='head', table_plot=df.head())

# Finalize and upload the data and labels of the dataset
dataset.finalize(auto_upload=True)
print(f"Created dataset with ID: {dataset.id}")

print(f"Data size: {len(df)}")
I have to clone a dataset that others have uploaded into a new project. What is the best way to do it?
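One possible alternative to cloning and enqueuing the dataset task (not something confirmed in this thread) is to fetch the existing dataset with `Dataset.get` and register a child dataset in the new project through `parent_datasets`, so the files are inherited rather than re-read from the original uploader's disk. This is a minimal sketch: the function name `reuse_dataset`, the target project name in the usage comment, and the `dataset_cls` parameter (there only so the logic can be exercised without a server) are assumptions for illustration.

```python
def reuse_dataset(source_project, source_name, target_project, dataset_cls=None):
    """Create a child dataset in target_project that inherits an existing dataset.

    Hypothetical helper. Returns the ID of the newly created child dataset.
    """
    if dataset_cls is None:
        # Imported lazily; requires clearml to be installed and configured.
        from clearml import Dataset as dataset_cls

    # Look up the dataset that a colleague already uploaded.
    parent = dataset_cls.get(
        dataset_project=source_project,
        dataset_name=source_name,
    )

    # Register a new dataset version in the other project, pointing at the parent.
    child = dataset_cls.create(
        dataset_name=source_name,
        dataset_project=target_project,
        parent_datasets=[parent.id],
    )

    # Same finalize call as in the snippet earlier in the thread.
    child.finalize(auto_upload=True)
    return child.id


# Example call (project names are placeholders, adjust to your setup):
# new_id = reuse_dataset('Asteroid_Solution', 'raw_asteroid_dataset', 'My_New_Project')
```

If you only need to read the data from code in the new project, calling `Dataset.get(...)` followed by `get_local_copy()` may be enough, with no new dataset created at all.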
Yes @<1523701087100473344:profile|SuccessfulKoala55> same configuration as you mentioned before.
I see this is not running using docker - can you just go to the venv directory C:/Users/guruprasad.j/.clearml/venvs-builds,
look under the last venv used, and see what files you have there?
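If browsing the folder by hand is awkward, the check can be scripted. The sketch below assumes the "last venv used" is simply the most recently modified subdirectory of venvs-builds, which is a guess rather than documented agent behaviour, and `latest_venv_files` is a hypothetical helper name.

```python
from pathlib import Path


def latest_venv_files(builds_dir):
    """Return (latest_venv_path, sorted entry names) for builds_dir.

    Returns (None, []) when the directory is missing or has no subdirectories.
    """
    root = Path(builds_dir)
    if not root.is_dir():
        return None, []

    venvs = [entry for entry in root.iterdir() if entry.is_dir()]
    if not venvs:
        return None, []

    # Assumption: the most recently modified subdirectory is the last venv used.
    latest = max(venvs, key=lambda entry: entry.stat().st_mtime)
    return latest, sorted(child.name for child in latest.iterdir())


if __name__ == "__main__":
    # Path taken from the message above.
    venv, names = latest_venv_files("C:/Users/guruprasad.j/.clearml/venvs-builds")
    if venv is None:
        print("No venv directories found")
    else:
        print(f"Latest venv: {venv}")
        for name in names:
            print(f"  {name}")
```

An empty result here would match the report elsewhere in the thread that the venvs-builds folder contains nothing.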
Can you show the task's execution section in the UI?