I'm running tests with pytest; it consumes/owns the stream.
UnevenDolphin73 are you saying offline does not work?
`stream.write(msg + self.terminator)` → `ValueError: I/O operation on closed file.`
This is an internal Python error; how come there is no stream?
But that should not mean you cannot write to them, no?!
This seems to be fine for now. Btw, if any future lookup finds this thread: `with mock.patch('clearml.datasets.dataset.Dataset.create'): ...`
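For reference, the mock-based approach can be sketched like this. The `Dataset` class and `build_dataset` function below are hypothetical stand-ins for the code under test; for the real library you would pass the target string `'clearml.datasets.dataset.Dataset.create'` to `mock.patch` instead, following the same pattern:

```python
from unittest import mock

# Hypothetical stand-in for the real clearml Dataset class.
class Dataset:
    @staticmethod
    def create(dataset_project, dataset_name):
        raise RuntimeError("would talk to the backend")

def build_dataset():
    # Hypothetical caller that we want to test without a backend.
    return Dataset.create(dataset_project="test", dataset_name="test")

# Replace Dataset.create for the duration of the block so no backend
# call can happen.
with mock.patch.object(Dataset, "create", return_value="fake-dataset") as fake_create:
    result = build_dataset()

assert result == "fake-dataset"
fake_create.assert_called_once()
```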
Or if it wasn't clear, that chunk of code is from clearml's dataset.py
If you create an initial code base, maybe we can merge it?
` # test_clearml.py
import pytest
import shutil

import clearml


@pytest.fixture
def clearml_task():
    clearml.Task.set_offline_mode(True)
    task = clearml.Task.init(project_name="test", task_name="test")
    yield task
    shutil.rmtree(task.get_offline_mode_folder())
    clearml.Task.set_offline_mode(False)


# pytest only collects classes named Test*, so the class needs that prefix
class TestClearML:
    def test_something(self, clearml_task):
        assert True `
Run with `pytest test_clearml.py`
I guess the thing that's missing from offline execution is being able to load an offline task without uploading it to the backend.
Or is that functionality provided by setting offline mode and then importing an offline task?
I'm working on the config object references 😉
Last but not least - can I cancel the offline zip creation if I'm not interested in it?
You can override it with an OS environment variable, would that work?
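A minimal sketch of the environment-variable route: set the variable in the test environment before clearml is imported (e.g. at the top of `conftest.py`). `CLEARML_OFFLINE_MODE` is the variable name as I understand it; double-check it against your clearml version.

```python
import os

# Flip offline mode without touching the code under test. This must run
# before clearml is imported, since the variable is read at startup.
os.environ["CLEARML_OFFLINE_MODE"] = "1"

assert os.environ["CLEARML_OFFLINE_MODE"] == "1"
```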
Or well, because it's not geared for tests, I'm just encountering weird shit. Just calling `task.close()` takes a long time.
It actually zips the entire offline folder so you can later upload it. Maybe we can disable that part?!
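If you only need the offline folder and not the archive, one workaround (a sketch, not an official switch) is to make `zipfile.ZipFile` a no-op around `task.close()`. Caveat: if clearml does `from zipfile import ZipFile`, you would have to patch the name inside clearml's module instead.

```python
import os
import zipfile
from unittest import mock

# Replace zipfile.ZipFile with a MagicMock so the archive step records
# calls instead of writing a .zip; the offline folder is left untouched.
with mock.patch.object(zipfile, "ZipFile", mock.MagicMock()) as fake_zipfile:
    # task.close() would go here in a real test; simulate what it would do:
    archive = zipfile.ZipFile("offline.zip", "w")  # hits the mock, no file written
    archive.write("some_file")                     # recorded, not executed

assert fake_zipfile.called
assert not os.path.exists("offline.zip")
```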
` # generate the script section
script = (
    "from clearml import Dataset\n\n"
    "ds = Dataset.create(dataset_project='{dataset_project}', dataset_name='{dataset_name}', dataset_version='{dataset_version}')\n".format(
        dataset_project=dataset_project, dataset_name=dataset_name, dataset_version=dataset_version
    )
)
task.data.script.diff = script
E   AttributeError: 'NoneType' object has no attribute 'diff' `
Well, this is something you should probably not do; there is no "data" object in offline mode, since it never accessed the backend.
Note that it would succeed if e.g. run with `pytest -s`
Another example - trying to validate dataset interactions ends with
` else:
    self._created_task = True
    dataset_project, parent_project = self._build_hidden_project_name(dataset_project, dataset_name)
    task = Task.create(
        project_name=dataset_project, task_name=dataset_name, task_type=Task.TaskTypes.data_processing)
    if bool(Session.check_min_api_server_version(Dataset.__min_api_version)):
        get_or_create_project(task.session, project_name=parent_project, system_tags=[self.__hidden_tag])
        get_or_create_project(
            task.session,
            project_name=dataset_project,
            project_id=task.project,
            system_tags=[self.__hidden_tag, self.__tag],
        )
    # set default output_uri
    task.output_uri = True
    task.set_system_tags((task.get_system_tags() or []) + [self.__tag])
    if dataset_tags:
        task.set_tags((task.get_tags() or []) + list(dataset_tags))
    task.mark_started()
    # generate the script section
    script = (
        "from clearml import Dataset\n\n"
        "ds = Dataset.create(dataset_project='{dataset_project}', dataset_name='{dataset_name}', dataset_version='{dataset_version}')\n".format(
            dataset_project=dataset_project, dataset_name=dataset_name, dataset_version=dataset_version
        )
    )
    task.data.script.diff = script
E   AttributeError: 'NoneType' object has no attribute 'diff' `
This is with: `Task.set_offline_mode(True)` and `task = Task.init(..., auto_connect_streams=False)`
I'll try it out, but I would not like to rewrite that code myself and maintain it, that's my point 😅
Or are you suggesting I use `Task.import_offline_session`?
I think it basically runs in offline mode and populates all the relevant fields (Task attributes) in some JSON (or some other config file). I think you could read this file and compare it to something that is expected, thus having the ability to run something offline and then verify its contents.
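The verification idea can be sketched with the stdlib only. The file name (`task.json`) and the field names below are assumptions for illustration, not the documented offline format; in a real test you would point `folder` at `task.get_offline_mode_folder()` and check what clearml actually writes there.

```python
import json
import tempfile
from pathlib import Path

# Simulate an offline folder containing a task state file.
folder = Path(tempfile.mkdtemp())
(folder / "task.json").write_text(json.dumps({"name": "test", "project": "test"}))

# Verification side: load the state and compare against expectations.
state = json.loads((folder / "task.json").read_text())
assert state["name"] == "test"
assert state["project"] == "test"
```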
Should not be complicated, it's basically here
https://github.com/allegroai/clearml/blob/1eee271f01a141e41542296ef4649eeead2e7284/clearml/task.py#L2763
wdyt?
Yeah, I managed to work around the former two, mostly by using `Task.create` instead of `Task.init`. It's actually the whole bunch of daemons running in the background that takes a long time, not the zipping.
Regarding the second - I'm not doing anything per se. I'm running in offline mode and I'm trying to create a dataset, and this is the error I get...
There is a data object, but there is no script object attached to it (presumably again because of pytest?)
Coming back to this: ClearML prints a lot of error messages in local tests, supposedly because the output streams are not directly available:
` --- Logging error ---
Traceback (most recent call last):
File "/usr/lib/python3.10/logging/__init__.py", line 1103, in emit
stream.write(msg + self.terminator)
ValueError: I/O operation on closed file.
Call stack:
File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/task.py", line 3504, in _at_exit
self.__shutdown()
File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/task.py", line 3595, in __shutdown
self._wait_for_repo_detection(timeout=10.)
File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/task.py", line 3468, in _wait_for_repo_detection
self.log.info('Waiting for repository detection and full package requirement analysis')
Message: 'Waiting for repository detection and full package requirement analysis'
Arguments: ()
--- Logging error ---
Traceback (most recent call last):
File "/usr/lib/python3.10/logging/__init__.py", line 1103, in emit
stream.write(msg + self.terminator)
ValueError: I/O operation on closed file.
Call stack:
File "/usr/lib/python3.10/threading.py", line 973, in _bootstrap
self._bootstrap_inner()
File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
self.run()
File "/usr/lib/python3.10/threading.py", line 953, in run
self._target(*self._args, **self._kwargs)
File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/backend_interface/task/task.py", line 284, in _update_repository
self.reload()
File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/backend_interface/task/task.py", line 468, in reload
return super(Task, self).reload()
File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/backend_interface/base.py", line 176, in reload
self._data = self._reload()
File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/backend_interface/task/task.py", line 564, in _reload
res = self.send(tasks.GetByIdRequest(task=self.id))
File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/backend_interface/base.py", line 109, in send
return self._send(session=self.session, req=req, ignore_errors=ignore_errors, raise_on_errors=raise_on_errors,
File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/backend_interface/base.py", line 70, in _send
log.error(error_msg)
Message: 'Action failed <400/101: tasks.get_by_id/v1.0 (Invalid task id: id=offline-833fb11175d14a1cabb5344e3366dce0, company=d1bd92a3b039400cbafc60a7a5b1e52b)> (task=offline-833fb11175d14a1cabb5344e3366dce0)'
Arguments: ()
--- Logging error ---
Traceback (most recent call last):
File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/backend_interface/base.py", line 176, in reload
self._data = self._reload()
File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/backend_interface/task/task.py", line 564, in _reload
res = self.send(tasks.GetByIdRequest(task=self.id))
File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/backend_interface/base.py", line 109, in send
return self._send(session=self.session, req=req, ignore_errors=ignore_errors, raise_on_errors=raise_on_errors,
File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/backend_interface/base.py", line 103, in _send
raise SendError(res, error_msg)
clearml.backend_interface.session.SendError: Action failed <400/101: tasks.get_by_id/v1.0 (Invalid task id: id=offline-833fb11175d14a1cabb5344e3366dce0, company=d1bd92a3b039400cbafc60a7a5b1e52b)> (task=offline-833fb11175d14a1cabb5344e3366dce0)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.10/logging/__init__.py", line 1103, in emit
stream.write(msg + self.terminator)
ValueError: I/O operation on closed file.
Call stack:
File "/usr/lib/python3.10/threading.py", line 973, in _bootstrap
self._bootstrap_inner()
File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
self.run()
File "/usr/lib/python3.10/threading.py", line 953, in run
self._target(*self._args, **self._kwargs)
File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/backend_interface/task/task.py", line 284, in _update_repository
self.reload()
File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/backend_interface/task/task.py", line 468, in reload
return super(Task, self).reload()
File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/backend_interface/base.py", line 178, in reload
self.log.error("Failed reloading task {}".format(self.id))
Message: 'Failed reloading task offline-833fb11175d14a1cabb5344e3366dce0'
Arguments: ()
--- Logging error ---
Traceback (most recent call last):
File "/usr/lib/python3.10/logging/__init__.py", line 1103, in emit
stream.write(msg + self.terminator)
ValueError: I/O operation on closed file.
Call stack:
File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/task.py", line 3504, in _at_exit
self.__shutdown()
File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/task.py", line 3595, in __shutdown
self._wait_for_repo_detection(timeout=10.)
File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/task.py", line 3478, in _wait_for_repo_detection
self.log.info('Finished repository detection and package analysis')
Message: 'Finished repository detection and package analysis'
Arguments: ()
--- Logging error ---
Traceback (most recent call last):
File "/usr/lib/python3.10/logging/__init__.py", line 1103, in emit
stream.write(msg + self.terminator)
ValueError: I/O operation on closed file.
Call stack:
File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/task.py", line 3504, in _at_exit
self.__shutdown()
File "/home/idan/CC/git/ds-platform/.venv/lib/python3.10/site-packages/clearml/task.py", line 3605, in __shutdown
self.log.info('Waiting to finish uploads')
Message: 'Waiting to finish uploads'
Arguments: () `
Any way to avoid this?
It does, but I don't want to guess the JSON structure (what if ClearML changes it, or the folder structure it uses for offline execution?). If I do this, I'm writing a test that's reliant on ClearML's implementation of offline mode, which is tangential to the unit test.
Or well, because it's not geared for tests, I'm just encountering weird shit. Just calling `task.close()` takes a long time
Seems like `Task.create` is the correct use-case then, since again this is about testing flows using e.g. pytest.
Makes sense
This seems to be fine for now, ...
Sounds good! thanks UnevenDolphin73
I dunno :man-shrugging: but Task.init is clearly incompatible with pytest and friends
mostly by using `Task.create` instead of `Task.init`.
UnevenDolphin73, now I'm confused. `Task.create` is not meant to be used as a replacement for `Task.init`; it is so you can manually create an additional Task (not the current process Task). How are you using it?
Regarding the second - I'm not doing anything per se. I'm running in offline mode and I'm trying to create a dataset, and this is the error I get...
I think the main thing we need to test is offline Dataset creation... I do not think this was actually supported. What is the use case here? Or should it be a pass-through?
Or if it wasn't clear, that chunk of code is from clearml's dataset.py
Oh! that makes sense 🙂
no, at least not yet, someone definitely needs to do that though haha
Currently all the unit tests are internal (the hardest part is providing a server they can run against and verifying the results, hence the challenge)
For example, if ClearML would offer a `TestSession` that is local and does not communicate with any backend
Offline mode? It stores everything in a folder, then zips it; you can access the target folder or the zip file and verify all the data/states
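Inspecting the resulting archive needs nothing beyond the stdlib. Here is a sketch against a synthetic in-memory zip standing in for the one `task.close()` produces; the entry name `task.json` is illustrative, not guaranteed to match clearml's actual layout:

```python
import io
import zipfile

# Build a synthetic offline archive in memory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("task.json", '{"name": "test"}')

# Verification side: list the entries and read the state back out.
with zipfile.ZipFile(buf) as zf:
    names = zf.namelist()
    payload = zf.read("task.json").decode()

assert names == ["task.json"]
assert "test" in payload
```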
😢 any chance you have a toy pytest that replicates it ?
Seems like `Task.create` is the correct use-case then, since again this is about testing flows using e.g. pytest, so the task is not the current process.
I've at least seen references in dataset.py's code that seem to apply to offline mode (e.g. in `Dataset.create` there is `if output_uri and not Task._offline_mode:`, so someone did consider datasets in offline mode)
Yes exactly that AgitatedDove14
Testing our logic maps correctly, etc for everything related to ClearML
I guess the thing that's missing from offline execution is being able to load an offline task without uploading it to the backend.
UnevenDolphin73 you mean like as to get the Task object from it?
(This might be doable, the main issue would be the metrics / logs loading)
What would be the use case for the testing ?
Last but not least - can I cancel the offline zip creation if I'm not interested in it 🤔
EDIT: I see not; guess one has to patch ZipFile...