Answered
Hi All, We Have A Self Hosted Clearml Server, My Colleague Uploaded A Dataset From His Machine And When I Try To Clone And Enqueue The Dataset Into Different Project From My Machine The Task Gets Failed Prompting "The System Cannot Find The File Specified

Hi All,
We have a self-hosted ClearML server. My colleague uploaded a dataset from his machine, and when I try to clone and enqueue the dataset into a different project from my machine, the task fails with "The system cannot find the file specified", and it is not visible in the Datasets tab either. Is there any way to recreate the dataset, or to view the original dataset from the cloned one?

  
  
Posted 12 months ago

Answers 26


image

  
  
Posted 12 months ago

This is where I cloned from.
image

  
  
Posted 12 months ago

This is the artifact URL.
image

  
  
Posted 12 months ago

This is what I cloned.
image

  
  
Posted 12 months ago

image

  
  
Posted 12 months ago

Are you running the task from a git repo? (also, can you show the top of the execution section?)

  
  
Posted 12 months ago

Hi @<1523701070390366208:profile|CostlyOstrich36> here is the snippet

from clearml import Task, Dataset
import global_config
from data import database

task = Task.init(
    project_name=global_config.PROJECT_NAME,
    task_name='get data',
    task_type='data_processing',
    reuse_last_task_id=False
)

config = {'query_date': '2022-01-01'}
task.connect(config)

# Get the data and a path to the file
query = 'SELECT * FROM asteroids WHERE strftime("%Y-%m-%d", `date`) <= strftime("%Y-%m-%d", "{}")'.format(config['query_date'])
df, data_path = database.query_database_to_df(query=query)
print(f"Dataset downloaded to: {data_path}")
print(df.head())

# Create a ClearML dataset
dataset = Dataset.create(
    dataset_name='raw_asteroid_dataset',
    dataset_project=global_config.PROJECT_NAME
)

# Add the local files we downloaded earlier
dataset.add_files(data_path)
dataset.get_logger().report_table(title='Asteroid Data', series='head', table_plot=df.head())

# Finalize and upload the data and labels of the dataset
dataset.finalize(auto_upload=True)
print(f"Created dataset with ID: {dataset.id}")
print(f"Data size: {len(df)}")
  
  
Posted 12 months ago

Execution log

from clearml import Dataset

ds = Dataset.create(dataset_project='Asteroid_Solution/.datasets/raw_asteroid_dataset', dataset_name='raw_asteroid_dataset', dataset_version='None')
ds.add_files(
    path='/tmp/nasa.csv', 
    wildcard=None, 
    local_base_folder=None, 
    dataset_path=None, 
    recursive=True
)
ds.upload(
    show_progress=True, 
    verbose=False, 
    output_url=None, 
    compression=None
)
ds.finalize()
  
  
Posted 12 months ago

@<1523701087100473344:profile|SuccessfulKoala55> Yes there is no docker involved and I have nothing in the venvs-builds folder.

  
  
Posted 12 months ago

What is the artifact URL from the task?
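The reason I'm asking: if the URL points at a path on your colleague's machine rather than at the files server or a bucket, other machines will not be able to fetch it. A quick way to check, as a rough sketch using only the standard library (the helper name `is_local_artifact_url` is made up for illustration, it is not part of ClearML):

```python
from urllib.parse import urlparse


def is_local_artifact_url(url):
    """Return True if the URL looks like a path on a local disk."""
    scheme = urlparse(url).scheme.lower()
    # No scheme ("/tmp/nasa.csv") or file:// means a local path.
    # A one-letter scheme is a Windows drive letter, e.g. "C:/Users/...".
    return scheme in ("", "file") or len(scheme) == 1
```

If this returns True for the artifact URL shown in the UI, the data was most likely never uploaded anywhere your machine can reach.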

  
  
Posted 12 months ago

Can you attach the full task log?

  
  
Posted 12 months ago

I mean the one that failed...

  
  
Posted 12 months ago

@<1523701070390366208:profile|CostlyOstrich36> I want to create a new project and use an already existing dataset that others created on the ClearML server.

  
  
Posted 11 months ago

Yes @<1523701087100473344:profile|SuccessfulKoala55> same configuration as you mentioned before.

  
  
Posted 12 months ago

@<1528546301493383168:profile|ThoughtfulElephant4> , why would you clone a dataset?
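Cloning re-runs the original script, which expects the source files to be on the local disk. If the goal is just to use a dataset that someone else already uploaded, fetching it by project and name is usually enough. Below is a minimal sketch, assuming the dataset was finalized and its files are reachable from your machine. The two function names and the new project/dataset names are placeholders of mine, not anything from your setup; `Dataset.get`, `get_local_copy`, `Dataset.create` with `parent_datasets`, and `finalize` are the regular `clearml` calls.

```python
def get_shared_dataset_copy(dataset_project, dataset_name):
    """Fetch an existing dataset and return the path of a local cached copy."""
    # Imported here so the helpers can be defined before clearml is configured.
    from clearml import Dataset

    ds = Dataset.get(dataset_project=dataset_project, dataset_name=dataset_name)
    return ds.get_local_copy()


def create_child_dataset(parent_project, parent_name, new_project, new_name):
    """Register a dataset in another project that inherits the parent's files."""
    from clearml import Dataset

    parent = Dataset.get(dataset_project=parent_project, dataset_name=parent_name)
    child = Dataset.create(
        dataset_name=new_name,
        dataset_project=new_project,
        parent_datasets=[parent.id],  # the child references the parent's content
    )
    # Same finalize call as in the snippet posted earlier in this thread.
    child.finalize(auto_upload=True)
    return child.id
```

For example, `get_shared_dataset_copy("<your project>", "raw_asteroid_dataset")` should give you a folder containing the files, and `create_child_dataset(...)` is only needed if you want the dataset to also show up under a different project.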

  
  
Posted 11 months ago

Can you show the task's execution section in the UI?

  
  
Posted 12 months ago

Hi @<1528546301493383168:profile|ThoughtfulElephant4> , where did you upload the dataset? Can you add the full log? If your colleague clones and enqueues - the code assumes that the files are local, no?

  
  
Posted 12 months ago

@<1528546301493383168:profile|ThoughtfulElephant4> how is the ClearML files server configured on your machine? Is it None?

  
  
Posted 12 months ago

It either cannot create the local code file from the uncommitted changes, or it can't find python...

  
  
Posted 12 months ago

@<1523701087100473344:profile|SuccessfulKoala55> this is the execution section of the task.

  
  
Posted 12 months ago

Hi @<1523701070390366208:profile|CostlyOstrich36> The ClearML server is on AWS. It created a dataset artifact when my colleague uploaded it, but when I try to clone and enqueue, it fails.

  
  
Posted 12 months ago

How did you create the dataset originally, can you share a snippet that reproduces this?

  
  
Posted 12 months ago

I see this is not running using docker - can you just go to the venv directory C:/Users/guruprasad.j/.clearml/venvs-builds, under the last venv used, and see what files you have there?

  
  
Posted 12 months ago

but what does your clearml.conf define as the files host address?

  
  
Posted 12 months ago

Task console log

  
  
Posted 12 months ago

I have to clone a dataset that others have uploaded into a new project... what is the best way to do it?

  
  
Posted 12 months ago