Answered
Hi, I Have A Question Regarding Clearml Datasets. In The Web Ui, What Causes The "Content" Tab To Show A List Of The Files In The Dataset? It Used To Show Automatically, But Recently It Now Has "No Data To Show" Even Though All Files Are Definitely In The

Hi, I have a question regarding ClearML Datasets. In the web UI, what causes the "Content" tab to show a list of the files in the dataset? It used to show automatically, but recently it shows "no data to show" even though all files are definitely in the dataset. I'm using the following process to create a dataset:

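from clearml import Dataset

# Create a new dataset version that inherits from an existing parent dataset
dataset = Dataset.create(
    dataset_name="my-dataset",
    dataset_project="classification_datasets",
    parent_datasets=["abcd1234"],
)
# Register the local files, then upload them and close the version
dataset.add_files("/mnt/data/my-dataset")
dataset.finalize(auto_upload=True)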

Note that I also tried
dataset.add_files("/mnt/data/my-dataset", local_base_folder="mnt/data/my-dataset")
but the result was the same.

  
  
Posted 3 months ago

Answers 17


Hi @<1523701070390366208:profile|CostlyOstrich36> , update for you here. I had noticed that the issue was not present for smaller datasets, which led us to discover that the problem was being caused by some nginx (I think) settings with the new server deployment. This was blocking the upload of the "dataset content" object. So our devops team was able to resolve the issue. Thanks very much for your help.

  
  
Posted 3 months ago

Server (see screenshot). Thanks!
image

  
  
Posted 3 months ago

Please open the browser developer tools (F12) and check what tasks.get_by_id_ex returns in the Network tab.
Also, please check whether there are any errors in the console.

  
  
Posted 3 months ago

Hi @<1533620191232004096:profile|NuttyLobster9> , thank you for the update. Can you please describe what changes were made?

  
  
Posted 3 months ago

Do any of these API calls have a "Dataset Content" field anywhere in the "configuration" section?

  
  
Posted 3 months ago

Hi John, yes we are.

  
  
Posted 3 months ago

Do you see any errors in the mongo or elastic containers when you create a dataset?

Also, does this issue reproduce when you create datasets via the CLI rather than the SDK?

  
  
Posted 3 months ago

Hi @<1533620191232004096:profile|NuttyLobster9> , are you self hosting ClearML?

  
  
Posted 3 months ago

So you migrated your server and since then this issue appeared?

  
  
Posted 3 months ago

I don't see any console errors

  
  
Posted 3 months ago

SDK: clearml==1.15.1

  
  
Posted 3 months ago

I think this is what you're looking for but let me know if you meant something different:

{
    "meta": {
        "id": "76fffdf3b04247fa8f0c3fc0743b3ccb",
        "trx": "76fffdf3b04247fa8f0c3fc0743b3ccb",
        "endpoint": {
            "name": "tasks.get_by_id_ex",
            "requested_version": "2.30",
            "actual_version": "1.0"
        },
        "result_code": 200,
        "result_subcode": 0,
        "result_msg": "OK",
        "error_stack": "",
        "error_data": {}
    },
    "data": {
        "tasks": [
            {
                "comment": "Auto-generated at 2024-08-21 14:39:28 UTC by root@a37cd5bc31c4",
                "configuration": {
                    "Dataset Struct": {
                        "name": "Dataset Struct",
                        "value": "{\n  \"0\": {\n    \"job_id\": \"177265adca46459f8b19d7669ab5e5d5\",\n    \"status\": \"in_progress\",\n    \"last_update\": 1724251291,\n    \"parents\": [],\n    \"job_size\": 840767386,\n    \"name\": \"my-dataset\",\n    \"version\": \"1.0.0\"\n  }\n}",
                        "type": "json",
                        "description": "Structure of the dataset"
                    }
                },
                "id": "177265adca46459f8b19d7669ab5e5d5",
                "name": "my-dataset",
                "runtime": {
                    "orig_dataset_name": "my-dataset",
                    "orig_dataset_id": "177265adca46459f8b19d7669ab5e5d5",
                    "version": "1.0.0",
                    "ds_file_count": 29608,
                    "ds_link_count": 0,
                    "ds_total_size": 840767386,
                    "ds_total_size_compressed": 845212762,
                    "ds_change_add": 29608,
                    "ds_change_remove": 0,
                    "ds_change_modify": 0,
                    "ds_change_size": 840767386
                },
                "status": "completed"
            }
        ]
    }
}
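
The check requested in this thread (whether a "Dataset Content" entry appears under "configuration") can also be run against a saved copy of a response like the one above rather than by eye. This is only a minimal sketch, assuming the response body has been saved as JSON text; the helper names `configuration_sections` and `missing_dataset_content` are made up for illustration and are not part of the ClearML SDK.

```python
import json


def configuration_sections(response_text):
    """Map each task id in a tasks.get_by_id_ex response to its configuration section names."""
    response = json.loads(response_text)
    tasks = response.get("data", {}).get("tasks", [])
    return {
        task.get("id"): sorted(task.get("configuration", {}))
        for task in tasks
    }


def missing_dataset_content(response_text):
    """Return the ids of tasks that have no "Dataset Content" configuration section."""
    return [
        task_id
        for task_id, sections in configuration_sections(response_text).items()
        if "Dataset Content" not in sections
    ]


# Trimmed-down stand-in for the response above (only the fields the check needs)
sample = json.dumps({
    "data": {"tasks": [{
        "id": "177265adca46459f8b19d7669ab5e5d5",
        "configuration": {"Dataset Struct": {"type": "json"}},
    }]}
})
print(missing_dataset_content(sample))
```

For the response posted here, the task only has a "Dataset Struct" section, so its id would be reported as missing "Dataset Content".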
  
  
Posted 3 months ago

Hi @<1523701070390366208:profile|CostlyOstrich36> , this is what our devops engineer said:

The proxy-body-size limit was what broke the ClearML API: I had set it to unlimited for the web server and the file server, but I had not changed it for the API.
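
For readers hitting the same symptom, here is a rough sketch of why a body-size limit would only affect some requests. It assumes nginx-style semantics (sizes such as "1m" or "512k", with "0" disabling the check); the actual limit values on this deployment are not stated in the thread, and the helper names and request sizes below are hypothetical.

```python
def parse_nginx_size(value):
    """Convert an nginx-style size such as "8m", "512k" or "0" to bytes."""
    text = value.strip().lower()
    units = {"k": 1024, "m": 1024 ** 2}
    if text and text[-1] in units:
        return int(text[:-1]) * units[text[-1]]
    return int(text)


def body_allowed(body_bytes, limit):
    """True if a request body of body_bytes passes the limit; a limit of "0" means unlimited."""
    limit_bytes = parse_nginx_size(limit)
    return limit_bytes == 0 or body_bytes <= limit_bytes


# Hypothetical request sizes, for illustration only
print(body_allowed(200 * 1024, "1m"))      # small request under a 1 MiB limit
print(body_allowed(5 * 1024 ** 2, "1m"))   # larger request under the same limit
print(body_allowed(5 * 1024 ** 2, "0"))    # same request with the limit disabled
```

Under these assumptions, small requests pass while larger ones are rejected until the limit is raised or disabled, which is consistent with the issue appearing only for larger datasets.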

  
  
Posted 3 months ago

It seems so, yes. I'm not the one who did the server migration, but as a user I believe this is when I started noticing the issue for new datasets created after the migration.

  
  
Posted 3 months ago

No, I'm not seeing that "Dataset Content" section. We have some older datasets that were copied from a prior server deployment that do have the section, and for those it appears in the UI.

  
  
Posted 3 months ago

What version of the server are you running? And what version of the SDK?

  
  
Posted 3 months ago

I'll check

  
  
Posted 3 months ago