Answered
Hi, I have a question regarding ClearML Datasets. In the web UI, what causes the "Content" tab to show a list of the files in the dataset? It used to show automatically, but recently it shows "No data to show" even though all files are definitely in the dataset.

Hi, I have a question regarding ClearML Datasets. In the web UI, what causes the "Content" tab to show a list of the files in the dataset? It used to show automatically, but recently it shows "No data to show" even though all the files are definitely in the dataset. I'm using the following process to create a dataset:

dataset = Dataset.create(
    dataset_name="my-dataset",
    dataset_project="classification_datasets",
    parent_datasets=["abcd1234"]
)
dataset.add_files("/mnt/data/my-dataset")
dataset.finalize(auto_upload=True)

Note: I also tried
dataset.add_files("/mnt/data/my-dataset", local_base_folder="mnt/data/my-dataset")
but got the same result.

  
  
Posted 26 days ago

Answers 17


Hi @<1523701070390366208:profile|CostlyOstrich36> , update for you here. I had noticed that the issue was not present for smaller datasets, which led us to discover that the problem was being caused by some nginx (I think) settings with the new server deployment. This was blocking the upload of the "dataset content" object. So our devops team was able to resolve the issue. Thanks very much for your help.

  
  
Posted 24 days ago

Hi @<1523701070390366208:profile|CostlyOstrich36> , this is what our devops engineer said:

The proxy-body-size limit was being hit for the ClearML API. For the Web and File Server I had set it to unlimited, but I didn't change it for the API.
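For reference, with a plain nginx reverse proxy in front of the API server the relevant directive is `client_max_body_size`; a minimal sketch (the hostname and upstream are illustrative assumptions, not taken from this deployment):

```nginx
server {
    listen 80;
    server_name api.clearml.example.com;  # illustrative hostname

    location / {
        # 0 disables nginx's request-body size limit, so large
        # "Dataset Content" uploads to the API are no longer rejected
        client_max_body_size 0;
        proxy_pass http://clearml-apiserver:8008;  # default API port in the docker-compose deployment
    }
}
```

On Kubernetes with the nginx ingress controller, the equivalent is the `nginx.ingress.kubernetes.io/proxy-body-size: "0"` annotation on the API server's ingress.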

  
  
Posted 24 days ago

Hi @<1533620191232004096:profile|NuttyLobster9> , thank you for the update. Can you please point out what were the changes that were done?

  
  
Posted 24 days ago

Do you see any errors in the mongo or elastic containers when you create a dataset?

Also, does this issue reproduce when you create datasets via the CLI rather than the SDK?
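For anyone wanting to try this, the CLI equivalent of the SDK snippet above would look roughly like the following session (project, dataset name, and parent ID are the ones from the original post; this requires a machine with `clearml` installed and configured against the server):

```shell
clearml-data create --project classification_datasets --name my-dataset --parents abcd1234
clearml-data add --files /mnt/data/my-dataset
clearml-data close
```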

  
  
Posted 24 days ago

Hi @<1533620191232004096:profile|NuttyLobster9> , are you self hosting ClearML?

  
  
Posted 25 days ago

Hi John, yes we are.

  
  
Posted 25 days ago

I think this is what you're looking for but let me know if you meant something different:

{
    "meta": {
        "id": "76fffdf3b04247fa8f0c3fc0743b3ccb",
        "trx": "76fffdf3b04247fa8f0c3fc0743b3ccb",
        "endpoint": {
            "name": "tasks.get_by_id_ex",
            "requested_version": "2.30",
            "actual_version": "1.0"
        },
        "result_code": 200,
        "result_subcode": 0,
        "result_msg": "OK",
        "error_stack": "",
        "error_data": {}
    },
    "data": {
        "tasks": [
            {
                "comment": "Auto-generated at 2024-08-21 14:39:28 UTC by root@a37cd5bc31c4",
                "configuration": {
                    "Dataset Struct": {
                        "name": "Dataset Struct",
                        "value": "{\n  \"0\": {\n    \"job_id\": \"177265adca46459f8b19d7669ab5e5d5\",\n    \"status\": \"in_progress\",\n    \"last_update\": 1724251291,\n    \"parents\": [],\n    \"job_size\": 840767386,\n    \"name\": \"my-dataset\",\n    \"version\": \"1.0.0\"\n  }\n}",
                        "type": "json",
                        "description": "Structure of the dataset"
                    }
                },
                "id": "177265adca46459f8b19d7669ab5e5d5",
                "name": "my-dataset",
                "runtime": {
                    "orig_dataset_name": "my-dataset",
                    "orig_dataset_id": "177265adca46459f8b19d7669ab5e5d5",
                    "version": "1.0.0",
                    "ds_file_count": 29608,
                    "ds_link_count": 0,
                    "ds_total_size": 840767386,
                    "ds_total_size_compressed": 845212762,
                    "ds_change_add": 29608,
                    "ds_change_remove": 0,
                    "ds_change_modify": 0,
                    "ds_change_size": 840767386
                },
                "status": "completed"
            }
        ]
    }
}
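For anyone following along, the dataset stats in that response are easy to pull out programmatically; a minimal sketch, assuming the `tasks.get_by_id_ex` response above has been loaded into a dict (reconstructed here in trimmed form for illustration):

```python
import json

# Trimmed-down version of the tasks.get_by_id_ex response above (illustrative).
resp = {
    "data": {
        "tasks": [
            {
                "id": "177265adca46459f8b19d7669ab5e5d5",
                "name": "my-dataset",
                "configuration": {
                    "Dataset Struct": {
                        "name": "Dataset Struct",
                        "type": "json",
                        # The per-version structure is stored as a JSON *string*
                        # inside the configuration value.
                        "value": json.dumps({
                            "0": {
                                "job_id": "177265adca46459f8b19d7669ab5e5d5",
                                "status": "in_progress",
                                "job_size": 840767386,
                                "name": "my-dataset",
                                "version": "1.0.0",
                            }
                        }),
                    }
                },
                "runtime": {"ds_file_count": 29608, "ds_total_size": 840767386},
            }
        ]
    }
}

task = resp["data"]["tasks"][0]

# Decode the nested JSON string to get the dataset structure.
struct = json.loads(task["configuration"]["Dataset Struct"]["value"])

# A healthy dataset task also carries a "Dataset Content" entry here;
# its absence is what leaves the UI's "Content" tab empty (see below).
has_content = "Dataset Content" in task["configuration"]

print(task["runtime"]["ds_file_count"])  # 29608
print(struct["0"]["version"])            # 1.0.0
print(has_content)                       # False
```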
  
  
Posted 25 days ago

Server (see screenshot). Thanks!

  
  
Posted 25 days ago

Please open the developer tools (F12) and check what is returned in the Network tab for the tasks.get_by_id_ex call.
Also, please check whether there are any errors in the console.

  
  
Posted 25 days ago

What version of the server are you running? What version of the SDK also?

  
  
Posted 25 days ago

I don't see any console errors

  
  
Posted 25 days ago

SDK: clearml==1.15.1

  
  
Posted 25 days ago

Do any of these API calls have a "Dataset Content" field anywhere in the "configuration" section?

  
  
Posted 25 days ago

It seems so, yes. I'm not the one who did the server migration, but as a user I believe this is when I started noticing the issue for new datasets created after the migration.

  
  
Posted 25 days ago

No, I'm not seeing that "Dataset Content" section. We have some older datasets that were copied from a prior server deployment that do have the section, and it appears in the UI.

  
  
Posted 25 days ago

So you migrated your server and since then this issue appeared?

  
  
Posted 25 days ago

I'll check

  
  
Posted 25 days ago