Hi, Lately While Trying To Create A New Dataset We Encounter The Following Error

Answered

Hi, lately we have been hitting the error below while trying to create a new dataset.

My code:

from clearml import Dataset, Task
dataset_1 = Dataset.create(dataset_project='test', dataset_name='unit_test', dataset_version='1.0.0')

the error:

2024-01-15 13:22:06,490 - clearml.storage - ERROR - Exception encountered while uploading Failed uploading object test/.datasets/unit_test/unit_test.ca32cecc5a464ae1b7c60ffa92db6293/artifacts/state/state.json (500): <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<title>500 Internal Server Error</title>
<h1>Internal Server Error</h1>
<p>The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.</p>
Traceback (most recent call last):
  File "C:\Users\TomerRoditi\venvs\ca\lib\site-packages\IPython\core\interactiveshell.py", line 3508, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-2-140a1aae43c9>", line 2, in <module>
    dataset_1 = Dataset.create(dataset_project='test', dataset_name='unit_test', dataset_version='1.0.0')
  File "C:\Users\TomerRoditi\venvs\ca\lib\site-packages\clearml\datasets\dataset.py", line 1316, in create
    instance._serialize()
  File "C:\Users\TomerRoditi\venvs\ca\lib\site-packages\clearml\datasets\dataset.py", line 2175, in _serialize
    self._task.upload_artifact(
  File "C:\Users\TomerRoditi\venvs\ca\lib\site-packages\clearml\task.py", line 2162, in upload_artifact
    raise exception_to_raise
  File "C:\Users\TomerRoditi\venvs\ca\lib\site-packages\clearml\task.py", line 2143, in upload_artifact
    if self._artifacts_manager.upload_artifact(
  File "C:\Users\TomerRoditi\venvs\ca\lib\site-packages\clearml\binding\artifacts.py", line 781, in upload_artifact
    uri = self._upload_local_file(local_filename, name,
  File "C:\Users\TomerRoditi\venvs\ca\lib\site-packages\clearml\binding\artifacts.py", line 965, in _upload_local_file
    StorageManager.upload_file(local_file.as_posix(), uri, wait_for_upload=True, retries=ev.retries)
  File "C:\Users\TomerRoditi\venvs\ca\lib\site-packages\clearml\storage\manager.py", line 78, in upload_file
    return CacheManager.get_cache_manager().upload_file(
  File "C:\Users\TomerRoditi\venvs\ca\lib\site-packages\clearml\storage\cache.py", line 97, in upload_file
    result = helper.upload(
  File "C:\Users\TomerRoditi\venvs\ca\lib\site-packages\clearml\storage\helper.py", line 2486, in upload
    res = self._do_upload(
  File "C:\Users\TomerRoditi\venvs\ca\lib\site-packages\clearml\storage\helper.py", line 2926, in _do_upload
    raise last_ex
  File "C:\Users\TomerRoditi\venvs\ca\lib\site-packages\clearml\storage\helper.py", line 2910, in _do_upload
    if not self._upload_from_file(local_path=src_path, dest_path=canonized_dest_path, extra=extra):
  File "C:\Users\TomerRoditi\venvs\ca\lib\site-packages\clearml\storage\helper.py", line 2883, in _upload_from_file
    res = self._driver.upload_object(
  File "C:\Users\TomerRoditi\venvs\ca\lib\site-packages\clearml\storage\helper.py", line 291, in upload_object
    return self.upload_object_via_stream(iterator=stream, container=container,
  File "C:\Users\TomerRoditi\venvs\ca\lib\site-packages\clearml\storage\helper.py", line 211, in upload_object_via_stream
    raise ValueError('Failed uploading object %s (%d): %s' % (object_name, res.status_code, res.text))
ValueError: Failed uploading object test/.datasets/unit_test/unit_test.ca32cecc5a464ae1b7c60ffa92db6293/artifacts/state/state.json (500): <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<title>500 Internal Server Error</title>
<h1>Internal Server Error</h1>
<p>The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.</p>

Is something going on with ClearML's servers? The code (and conf file) haven't changed lately, and it's a pretty simple task, so I tend to think the problem is not on my side. Any idea what is causing the issue?
FriendlyBee37
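
For readers triaging a similar failure: the final `ValueError` in the traceback is built from the format string `'Failed uploading object %s (%d): %s'`, so the object path and HTTP status can be pulled out of the message. The sketch below is only an illustration using the standard library; `parse_upload_error` and `describe_status` are hypothetical helper names, not part of ClearML, and the pattern assumes the object name contains no spaces.

```python
import re
from typing import Optional, Tuple

# Mirrors the format string seen in the traceback:
#   'Failed uploading object %s (%d): %s'
# Assumption: the object name contains no whitespace.
_UPLOAD_ERROR = re.compile(
    r"Failed uploading object (?P<name>\S+) \((?P<status>\d+)\):"
)


def parse_upload_error(message: str) -> Optional[Tuple[str, int]]:
    """Return (object_name, http_status), or None if the message does not match."""
    match = _UPLOAD_ERROR.search(message)
    if match is None:
        return None
    return match.group("name"), int(match.group("status"))


def describe_status(status: int) -> str:
    """Rough, generic classification of an HTTP status code (hypothetical helper)."""
    if 500 <= status < 600:
        return "the host that answered failed while handling the request"
    if 400 <= status < 500:
        return "the request was rejected (URL, credentials or payload)"
    return "unexpected status for a failed upload"


# Message text copied from the error above, with the HTML body shortened.
message = (
    "Failed uploading object test/.datasets/unit_test/"
    "unit_test.ca32cecc5a464ae1b7c60ffa92db6293/artifacts/state/state.json "
    "(500): <!DOCTYPE HTML PUBLIC ...>"
)

parsed = parse_upload_error(message)
if parsed is not None:
    object_name, status = parsed
    print(object_name)
    print(status, "-", describe_status(status))
```

A 5xx status only says that whichever host received the upload could not process it. It does not say whether that host is the intended files server, so the configured server URLs are still worth checking.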

  				
Posted 
	one year ago

					DangerousBee35
				


Answers 2

conf file:

# ClearML SDK configuration file
api { 
    # Tomer Roditi's workspace
    web_server:


    api_server:


    files_server:


    # corractions server
    credentials {"access_key": "****", "secret_key": "****"}
}
sdk {
    # ClearML - default SDK configuration

    storage {
        cache {
            # Defaults to system temp folder / cache
            default_base_dir: "~/.clearml/cache"
            # default_cache_manager_size: 100
        }

        direct_access: [
            # Objects matching are considered to be available for direct access, i.e. they will not be downloaded
            # or cached, and any download request will return a direct reference.
            # Objects are specified in glob format, available for url and content_type.
            { url: "file://*" }  # file-urls are always directly referenced
        ]
    }

    metrics {
        # History size for debug files per metric/variant. For each metric/variant combination with an attached file
        # (e.g. debug image event), file names for the uploaded files will be recycled in such a way that no more than
        # X files are stored in the upload destination for each metric/variant combination.
        file_history_size: 100

        # Max history size for matplotlib imshow files per plot title.
        # File names for the uploaded images will be recycled in such a way that no more than
        # X images are stored in the upload destination for each matplotlib plot title.
        matplotlib_untitled_history_size: 100

        # Limit the number of digits after the dot in plot reporting (reducing plot report size)
        # plot_max_num_digits: 5

        # Settings for generated debug images
        images {
            format: JPEG
            quality: 87
            subsampling: 0
        }

        # Support plot-per-graph fully matching Tensorboard behavior (i.e. if this is set to true, each series should have its own graph)
        tensorboard_single_series_per_graph: false
    }

    network {
        # Number of retries before failing to upload file
        file_upload_retries: 3

        metrics {
            # Number of threads allocated to uploading files (typically debug images) when transmitting metrics for
            # a specific iteration
            file_upload_threads: 4

            # Warn about upload starvation if no uploads were made in specified period while file-bearing events keep
            # being sent for upload
            file_upload_starvation_warning_sec: 120
        }

        iteration {
            # Max number of retries when getting frames if the server returned an error (http code 500)
            max_retries_on_server_error: 5
            # Backoff factor for consecutive retry attempts.
            # SDK will wait for {backoff factor} * (2 ^ ({number of total retries} - 1)) between retries.
            retry_backoff_factor_sec: 10
        }
    }
    aws {
        s3 {
            # S3 credentials, used for read/write access by various SDK elements

            # The following settings will be used for any bucket not specified below in the "credentials" section
            # ---------------------------------------------------------------------------------------------------
            region: ""
            # Specify explicit keys
            key: ""
            secret: ""
            # Or enable credentials chain to let Boto3 pick the right credentials. 
            # This includes picking credentials from environment variables, 
            # credential file and IAM role using metadata service. 
            # Refer to the latest Boto3 docs
            use_credentials_chain: false
            # Additional ExtraArgs passed to boto3 when uploading files. Can also be set per-bucket under "credentials".
            extra_args: {}
            # ---------------------------------------------------------------------------------------------------


            credentials: [
                # specifies key/secret credentials to use when handling s3 urls (read or write)
                # {
                #     bucket: "my-bucket-name"
                #     key: "my-access-key"
                #     secret: "my-secret-key"
                # },
                # {
                #     # This will apply to all buckets in this host (unless key/value is specifically provided for a given bucket)
                #     host: "my-minio-host:9000"
                #     key: "12345678"
                #     secret: "12345678"
                #     multipart: false
                #     secure: false
                # }
            ]
        }
        boto3 {
            pool_connections: 512
            max_multipart_concurrency: 16
        }
    }
    google.storage {
        # # Default project and credentials file
        # # Will be used when no bucket configuration is found
        # project: "clearml"
        # credentials_json: "/path/to/credentials.json"
        # pool_connections: 512
        # pool_maxsize: 1024

        # # Specific credentials per bucket and sub directory
        # credentials = [
        #     {
        #         bucket: "my-bucket"
        #         subdir: "path/in/bucket" # Not required
        #         project: "clearml"
        #         credentials_json: "/path/to/credentials.json"
        #     },
        # ]
    }
    azure.storage {
        # max_connections: 2

        # containers: [
        #     {
        #         account_name: "clearml"
        #         account_key: "secret"
        #         # container_name:
        #     }
        # ]
    }

    log {
        # debugging feature: set this to true to make null log propagate messages to root logger (so they appear in stdout)
        null_log_propagate: false
        task_log_buffer_capacity: 66

        # disable urllib info and lower levels
        disable_urllib3_info: true
    }

    development {
        # Development-mode options

        # dev task reuse window
        task_reuse_time_window_in_hours: 72.0

        # Run VCS repository detection asynchronously
        vcs_repo_detect_async: true

        # Store uncommitted git/hg source code diff in experiment manifest when training in development mode
        # This stores "git diff" or "hg diff" into the experiment's "script.requirements.diff" section
        store_uncommitted_code_diff: true

        # Support stopping an experiment in case it was externally stopped, status was changed or task was reset
        support_stopping: true

        # Default Task output_uri. if output_uri is not provided to Task.init, default_output_uri will be used instead.
        default_output_uri: ""

        # Default auto generated requirements optimize for smaller requirements
        # If True, analyze the entire repository regardless of the entry point.
        # If False, first analyze the entry point script, if it does not contain other to local files,
        # do not analyze the entire repository.
        force_analyze_entire_repo: false

        # If set to true, *clearml* update message will not be printed to the console
        # this value can be overwritten with os environment variable CLEARML_SUPPRESS_UPDATE_MESSAGE=1
        suppress_update_message: false

        # If this flag is true (default is false), instead of analyzing the code with Pigar, analyze with `pip freeze`
        detect_with_pip_freeze: false

        # Log specific environment variables. OS environments are listed in the "Environment" section
        # of the Hyper-Parameters.
        # multiple selected variables are supported including the suffix '*'.
        # For example: "AWS_*" will log any OS environment variable starting with 'AWS_'.
        # This value can be overwritten with os environment variable CLEARML_LOG_ENVIRONMENT="[AWS_*, CUDA_VERSION]"
        # Example: log_os_environments: ["AWS_*", "CUDA_VERSION"]
        log_os_environments: []

        # Development mode worker
        worker {
            # Status report period in seconds
            report_period_sec: 2

            # The number of events to report
            report_event_flush_threshold: 100

            # ping to the server - check connectivity
            ping_period_sec: 30

            # Log all stdout & stderr
            log_stdout: true

            # Carriage return (\r) support. If zero (0) \r treated as \n and flushed to backend
            # Carriage return flush support in seconds, flush consecutive line feeds (\r) every X (default: 10) seconds
            console_cr_flush_period: 10

            # compatibility feature, report memory usage for the entire machine
            # default (false), report only on the running process and its sub-processes
            report_global_mem_used: false

            # if provided, start resource reporting after this amount of seconds
            #report_start_sec: 30
        }
    }

    # Apply top-level environment section from configuration into os.environ
    apply_environment: false
    # Top-level environment section is in the form of:
    #   environment {
    #     key: value
    #     ...
    #   }
    # and is applied to the OS environment as `key=value` for each key/value pair

    # Apply top-level files section from configuration into local file system
    apply_files: false
    # Top-level files section allows auto-generating files at designated paths with a predefined contents
    # and target format. Options include:
    #  contents: the target file's content, typically a string (or any base type int/float/list/dict etc.)
    #  format: a custom format for the contents. Currently supported value is `base64` to automatically decode a
    #          base64-encoded contents string, otherwise ignored
    #  path: the target file's path, may include ~ and inplace env vars
    #  target_format: format used to encode contents before writing into the target file. Supported values are json,
    #                 yaml, yml and bytes (in which case the file will be written in binary mode). Default is text mode.
    #  overwrite: overwrite the target file in case it exists. Default is true.
    #
    # Example:
    #   files {
    #     myfile1 {
    #       contents: "The quick brown fox jumped over the lazy dog"
    #       path: "/tmp/fox.txt"
    #     }
    #     myjsonfile {
    #       contents: {
    #         some {
    #           nested {
    #             value: [1, 2, 3, 4]
    #           }
    #         }
    #       }
    #       path: "/tmp/test.json"
    #       target_format: json
    #     }
    #   }
}

  				
Posted 
	one year ago

					DangerousBee35
				

Hi DangerousBee35, all the data and logs point to a configuration error on your side that caused your code to reach an incorrect server. I suggest verifying the configuration on your end. For further discussion (and since personal usage is concerned), I suggest moving to a DM 🙂
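
One way to start verifying the configuration is to look at which server URLs the conf file actually sets. The sketch below is a simplified, line-based scan written for illustration, not a HOCON parser and not a ClearML API; `read_server_entries` and `missing_servers` are hypothetical names, and the example URLs are placeholders. In the conf as posted above, the three server keys appear without values (possibly removed before posting), which is the case the scan flags.

```python
import re
from typing import Dict, List

SERVER_KEYS = ("web_server", "api_server", "files_server")

# Matches lines such as `api_server: https://...` or `api_server = "https://..."`.
_ENTRY = re.compile(r"^(web_server|api_server|files_server)\s*[:=]\s*(.*)$")


def read_server_entries(conf_text: str) -> Dict[str, str]:
    """Collect the value written on the same line as each server key.

    Simplified scan: ignores nesting, includes and multi-line values.
    """
    entries: Dict[str, str] = {}
    for line in conf_text.splitlines():
        match = _ENTRY.match(line.strip())
        if match:
            # Strip surrounding whitespace and optional quotes.
            entries[match.group(1)] = match.group(2).strip().strip("\"'")
    return entries


def missing_servers(entries: Dict[str, str]) -> List[str]:
    """Return the server keys that are absent or have an empty value."""
    return [key for key in SERVER_KEYS if not entries.get(key)]


# The api section as it appears in the post (values are blank there).
posted_conf = """
api {
    # Tomer Roditi's workspace
    web_server:

    api_server:

    files_server:

    credentials {"access_key": "****", "secret_key": "****"}
}
"""

# A filled-in variant with placeholder URLs (not real endpoints).
filled_conf = """
api {
    web_server: https://app.example.invalid
    api_server: https://api.example.invalid
    files_server: "https://files.example.invalid"
}
"""

print(missing_servers(read_server_entries(posted_conf)))
print(read_server_entries(filled_conf))
```

To run this against a real file, read its text and pass it to `read_server_entries`, then compare the printed URLs with the servers shown in your workspace settings.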

  				
Posted 
	one year ago

					SuccessfulKoala55
				

