Answered
Is there any testing suite that ships with ClearML? If we'd like to make some unit tests for our code?

Is there any testing suite that ships with ClearML? If we'd like to make some unit tests for our code?

  
  
Posted 2 years ago

Answers 30


Seems like Task.create is the correct use-case then, since again this is about testing flows using e.g. pytest

Makes sense

This seems to be fine for now, ...

Sounds good! Thanks UnevenDolphin73

  
  
Posted 2 years ago

This seems to be fine for now. If any future lookup finds this thread, btw:
`with mock.patch('clearml.datasets.dataset.Dataset.create'): ...`
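For future lookups, a self-contained sketch of that pattern. The `Dataset` class below is only a stand-in so the sketch runs without clearml installed; in a real test the patch target would be the string `'clearml.datasets.dataset.Dataset.create'` as above:

```python
from unittest import mock

class Dataset:
    """Minimal stand-in for clearml's Dataset, just to keep the sketch runnable."""
    @classmethod
    def create(cls, **kwargs):
        raise RuntimeError("would touch the backend / offline folder")

def build_dataset():
    # code under test that would normally create a real dataset
    return Dataset.create(dataset_project="proj", dataset_name="ds")

def test_build_dataset():
    # patch out Dataset.create so no ClearML machinery runs in the unit test
    with mock.patch.object(Dataset, "create", return_value="fake-dataset") as patched:
        result = build_dataset()
    patched.assert_called_once_with(dataset_project="proj", dataset_name="ds")
    return result
```

The same shape works as a pytest test function; the mock both prevents any backend/offline work and lets the test assert on how the dataset was requested.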

  
  
Posted 2 years ago

Seems like Task.create is the correct use-case then, since again this is about testing flows using e.g. pytest, so the task is not the current process.

I've at least seen references in dataset.py's code that seem to apply to offline mode (e.g. in Dataset.create there is `if output_uri and not Task._offline_mode:`, so someone did consider datasets in offline mode)

  
  
Posted 2 years ago

mostly by using Task.create instead of Task.init.

UnevenDolphin73, now I'm confused. Task.create is not meant to be used as a replacement for Task.init; it is for manually creating an additional Task (not the current-process Task). How are you using it?

Regarding the second - I'm not doing anything per se. I'm running in offline mode and I'm trying to create a dataset, and this is the error I get...

I think the main thing we need to test is offline Dataset creation... I do not think this was actually supported. What is the use case here? Or should it be a pass-through?

Or if it wasn't clear, that chunk of code is from clearml's dataset.py

Oh! that makes sense 🙂

  
  
Posted 2 years ago

Or if it wasn't clear, that chunk of code is from clearml's dataset.py

  
  
Posted 2 years ago

Yeah I managed to work around those former two, mostly by using Task.create instead of Task.init . It's actually the whole bunch of daemons running in the background that takes a long time, not the zipping.

Regarding the second - I'm not doing anything per se. I'm running in offline mode and I'm trying to create a dataset, and this is the error I get...
There is a data object in it, but there is no script object attached to it (presumably again because of pytest?)

  
  
Posted 2 years ago

Last but not least - can I cancel the offline zip creation if I'm not interested in it

you can override with OS environment, would that work?

Or well, because it's not geared for tests, I'm just encountering weird shit. Just calling task.close() takes a long time

It actually zips the entire offline folder so you can later upload it. Maybe we can disable that part?!

```
    # generate the script section
    script = (
        "from clearml import Dataset\n\n"
        "ds = Dataset.create(dataset_project='{dataset_project}', dataset_name='{dataset_name}', dataset_version='{dataset_version}')\n".format(
            dataset_project=dataset_project, dataset_name=dataset_name, dataset_version=dataset_version
        )
    )

      task.data.script.diff = script

E   AttributeError: 'NoneType' object has no attribute 'diff'
```
Well this is something you should probably not do: there is no "data" object in offline mode, it never accessed the backend

  
  
Posted 2 years ago

Another example - trying to validate dataset interactions ends with

```
else:
    self._created_task = True
    dataset_project, parent_project = self._build_hidden_project_name(dataset_project, dataset_name)
    task = Task.create(
        project_name=dataset_project, task_name=dataset_name, task_type=Task.TaskTypes.data_processing)
    if bool(Session.check_min_api_server_version(Dataset.__min_api_version)):
        get_or_create_project(task.session, project_name=parent_project, system_tags=[self.__hidden_tag])
        get_or_create_project(
            task.session,
            project_name=dataset_project,
            project_id=task.project,
            system_tags=[self.__hidden_tag, self.__tag],
        )
    # set default output_uri
    task.output_uri = True
    task.set_system_tags((task.get_system_tags() or []) + [self.__tag])
    if dataset_tags:
        task.set_tags((task.get_tags() or []) + list(dataset_tags))
    task.mark_started()
    # generate the script section
    script = (
        "from clearml import Dataset\n\n"
        "ds = Dataset.create(dataset_project='{dataset_project}', dataset_name='{dataset_name}', dataset_version='{dataset_version}')\n".format(
            dataset_project=dataset_project, dataset_name=dataset_name, dataset_version=dataset_version
        )
    )

      task.data.script.diff = script

E   AttributeError: 'NoneType' object has no attribute 'diff'
```

  
  
Posted 2 years ago

Or well, because it's not geared for tests, I'm just encountering weird shit. Just calling task.close() takes a long time

  
  
Posted 2 years ago

Last but not least - can I cancel the offline zip creation if I'm not interested in it 🤔
EDIT: I see not, guess one has to patch ZipFile ...
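A hedged sketch of that patch, assuming the zip step goes through the stdlib zipfile module (which may not hold for every ClearML version). The `archive_offline_folder` helper is a stand-in for the zipping that `task.close()` performs at the end of an offline run:

```python
import os
import zipfile
from unittest import mock

def archive_offline_folder(zip_path):
    # stand-in for the zip step task.close() runs after an offline session
    with zipfile.ZipFile(zip_path, "w") as zf:
        zf.writestr("task.json", "{}")
    return zip_path

def close_without_zip():
    # patch ZipFile so the archive step becomes a no-op MagicMock;
    # in a real test the same context manager would wrap task.close()
    with mock.patch("zipfile.ZipFile") as patched:
        archive_offline_folder("offline.zip")
    return patched.call_count, os.path.exists("offline.zip")
```

With the patch active the zip "happens" (the call is recorded) but no archive is ever written to disk, which is usually all a unit test needs.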

  
  
Posted 2 years ago

Note that it would succeed if e.g. run with pytest -s

  
  
Posted 2 years ago

```
# test_clearml.py
import pytest
import shutil
import clearml

@pytest.fixture
def clearml_task():
    clearml.Task.set_offline_mode(True)
    task = clearml.Task.init(project_name="test", task_name="test")
    yield task
    shutil.rmtree(task.get_offline_mode_folder())
    clearml.Task.set_offline_mode(False)

class TestClearML:  # note: by default pytest only collects classes named Test*
    def test_something(self, clearml_task):
        assert True
```
run with pytest test_clearml.py

  
  
Posted 2 years ago

😢 any chance you have a toy pytest that replicates it ?

  
  
Posted 2 years ago

I dunno 🤷 but Task.init is clearly incompatible with pytest and friends

  
  
Posted 2 years ago

But that should not mean you cannot write to them, no?!

  
  
Posted 2 years ago

I'm running tests with pytest , it consumes/owns the stream

  
  
Posted 2 years ago

UnevenDolphin73 are you saying offline does not work?

`stream.write(msg + self.terminator)`
`ValueError: I/O operation on closed file.`
This is an internal Python error, how come there is no stream?

  
  
Posted 2 years ago

Is Task.create the way to go here? 🤔

  
  
Posted 2 years ago

This is with:
Task.set_offline_mode(True)
task = Task.init(..., auto_connect_streams=False)

  
  
Posted 2 years ago

Coming back to this; ClearML prints a lot of error messages in local tests, supposedly because the output streams are not directly available:
```
--- Logging error ---
Traceback (most recent call last):
  File "/usr/lib/python3.10/logging/__init__.py", line 1103, in emit
    stream.write(msg + self.terminator)
ValueError: I/O operation on closed file.
Call stack:
  File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/task.py", line 3504, in _at_exit
    self.__shutdown()
  File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/task.py", line 3595, in __shutdown
    self._wait_for_repo_detection(timeout=10.)
  File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/task.py", line 3468, in _wait_for_repo_detection
    self.log.info('Waiting for repository detection and full package requirement analysis')
Message: 'Waiting for repository detection and full package requirement analysis'
Arguments: ()
--- Logging error ---
Traceback (most recent call last):
  File "/usr/lib/python3.10/logging/__init__.py", line 1103, in emit
    stream.write(msg + self.terminator)
ValueError: I/O operation on closed file.
Call stack:
  File "/usr/lib/python3.10/threading.py", line 973, in _bootstrap
    self._bootstrap_inner()
  File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/backend_interface/task/task.py", line 284, in _update_repository
    self.reload()
  File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/backend_interface/task/task.py", line 468, in reload
    return super(Task, self).reload()
  File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/backend_interface/base.py", line 176, in reload
    self._data = self._reload()
  File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/backend_interface/task/task.py", line 564, in _reload
    res = self.send(tasks.GetByIdRequest(task=self.id))
  File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/backend_interface/base.py", line 109, in send
    return self._send(session=self.session, req=req, ignore_errors=ignore_errors, raise_on_errors=raise_on_errors,
  File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/backend_interface/base.py", line 70, in _send
    log.error(error_msg)
Message: 'Action failed <400/101: tasks.get_by_id/v1.0 (Invalid task id: id=offline-833fb11175d14a1cabb5344e3366dce0, company=d1bd92a3b039400cbafc60a7a5b1e52b)> (task=offline-833fb11175d14a1cabb5344e3366dce0)'
Arguments: ()
--- Logging error ---
Traceback (most recent call last):
  File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/backend_interface/base.py", line 176, in reload
    self._data = self._reload()
  File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/backend_interface/task/task.py", line 564, in _reload
    res = self.send(tasks.GetByIdRequest(task=self.id))
  File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/backend_interface/base.py", line 109, in send
    return self._send(session=self.session, req=req, ignore_errors=ignore_errors, raise_on_errors=raise_on_errors,
  File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/backend_interface/base.py", line 103, in _send
    raise SendError(res, error_msg)
clearml.backend_interface.session.SendError: Action failed <400/101: tasks.get_by_id/v1.0 (Invalid task id: id=offline-833fb11175d14a1cabb5344e3366dce0, company=d1bd92a3b039400cbafc60a7a5b1e52b)> (task=offline-833fb11175d14a1cabb5344e3366dce0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.10/logging/__init__.py", line 1103, in emit
    stream.write(msg + self.terminator)
ValueError: I/O operation on closed file.
Call stack:
  File "/usr/lib/python3.10/threading.py", line 973, in _bootstrap
    self._bootstrap_inner()
  File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/backend_interface/task/task.py", line 284, in _update_repository
    self.reload()
  File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/backend_interface/task/task.py", line 468, in reload
    return super(Task, self).reload()
  File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/backend_interface/base.py", line 178, in reload
    self.log.error("Failed reloading task {}".format(self.id))
Message: 'Failed reloading task offline-833fb11175d14a1cabb5344e3366dce0'
Arguments: ()
--- Logging error ---
Traceback (most recent call last):
  File "/usr/lib/python3.10/logging/__init__.py", line 1103, in emit
    stream.write(msg + self.terminator)
ValueError: I/O operation on closed file.
Call stack:
  File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/task.py", line 3504, in _at_exit
    self.__shutdown()
  File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/task.py", line 3595, in __shutdown
    self._wait_for_repo_detection(timeout=10.)
  File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/task.py", line 3478, in _wait_for_repo_detection
    self.log.info('Finished repository detection and package analysis')
Message: 'Finished repository detection and package analysis'
Arguments: ()
--- Logging error ---
Traceback (most recent call last):
  File "/usr/lib/python3.10/logging/__init__.py", line 1103, in emit
    stream.write(msg + self.terminator)
ValueError: I/O operation on closed file.
Call stack:
  File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/task.py", line 3504, in _at_exit
    self.__shutdown()
  File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/task.py", line 3605, in __shutdown
    self.log.info('Waiting to finish uploads')
Message: 'Waiting to finish uploads'
Arguments: ()
```
Any way to avoid this?
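One stdlib-level knob that should silence those blocks (this is about Python's logging module, not a ClearML feature): the "--- Logging error ---" output comes from logging's internal handleError, which is skipped when `logging.raiseExceptions` is False. A minimal reproduction of the closed-stream situation:

```python
import io
import logging
import sys

def emit_on_closed_stream():
    # simulate pytest having closed the stream a handler still holds
    handler = logging.StreamHandler(io.StringIO())
    handler.stream.close()
    log = logging.getLogger("clearml-demo")
    log.handlers = [handler]
    log.propagate = False

    captured = io.StringIO()
    old_stderr, sys.stderr = sys.stderr, captured
    try:
        logging.raiseExceptions = False  # swallow handler errors instead of printing them
        log.error("shutdown message")    # would normally dump "--- Logging error ---"
    finally:
        sys.stderr = old_stderr
        logging.raiseExceptions = True
    return captured.getvalue()
```

The record itself is dropped (the stream is gone either way); the flag only suppresses the noisy internal traceback, so it is best set only in test runs.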

  
  
Posted 2 years ago

I'm working on the config object references 😉

  
  
Posted 2 years ago

If you create an initial code base maybe we can merge it?

  
  
Posted 2 years ago

I'll try it out, but I would not like to rewrite that code myself and maintain it, that's my point 😅

Or are you suggesting I use Task.import_offline_session?

  
  
Posted 2 years ago

Yes exactly that AgitatedDove14
Testing our logic maps correctly, etc for everything related to ClearML

  
  
Posted 2 years ago

I guess the thing that's missing from offline execution is being able to load an offline task without uploading it to the backend.

UnevenDolphin73 you mean as in getting the Task object from it?
(This might be doable, the main issue would be the metrics / logs loading)
What would be the use case for the testing ?

  
  
Posted 2 years ago

It does, but I don't want to guess the JSON structure (what if ClearML changes it, or the folder structure it uses for offline execution?). If I do this, I'm writing a test that relies on ClearML's implementation of offline mode, which is tangential to the unit test

  
  
Posted 2 years ago

I think it basically runs in offline mode and populates all the relevant fields (Task attributes) in some json (or some other config file). I think you could read this file and compare to something that is expected thus having an ability to run something offline and then verify it's contents.
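A sketch of that idea. The `task.json` file name and field layout here are assumptions about the offline format (exactly the implementation detail one would be coupling to); in practice the folder would come from `task.get_offline_mode_folder()`:

```python
import json
from pathlib import Path

def verify_offline_task(offline_folder, expected_project):
    # load the task snapshot written during the offline run and assert
    # on the fields the test cares about (file name/layout assumed here)
    data = json.loads((Path(offline_folder) / "task.json").read_text())
    assert data.get("project") == expected_project
    return data
```

The test then becomes: run the code under test in offline mode, then call `verify_offline_task(...)` on the resulting folder instead of querying any backend.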

  
  
Posted 2 years ago

I guess the thing that's missing from offline execution is being able to load an offline task without uploading it to the backend.
Or is that functionality provided by setting offline mode and then importing an offline task?

  
  
Posted 2 years ago

no, at least not yet, someone definitely needs to do that though haha

Currently all the unit tests are internal (the hardest part is providing a server they can run against and verifying the results, hence the challenge)

For example, if ClearML would offer a TestSession that is local and does not communicate with any backend

Offline mode? It stores everything into a folder, then zips it; you can access the target folder or the zip file and verify all the data/states

  
  
Posted 2 years ago