Hi JitteryCoyote63,
Can you try increasing the polling_interval_time_min to 5-7 minutes for the check? Do you get double machines with it too?
One of the following objects: Numpy.array, pandas.DataFrame, PIL.Image, dict (json), or pathlib2.Path.
Also, if you used pickle, the pickle.load return value is returned, and for strings a txt file (as it was stored).
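For context, a minimal sketch of how those return values show up, assuming a task with an artifact named "data" (the task id and artifact name are placeholders):
from clearml import Task

task = Task.get_task(task_id="your task id")
artifact = task.artifacts["data"]        # Artifact object
obj = artifact.get()                     # deserialized value (array / DataFrame / dict / image / path, depending on what was stored)
local_path = artifact.get_local_copy()   # path to a local cached copy of the underlying file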
I guess not many people use the local file storage
I’m using it 🙂
How can I reproduce this issue? What should I have as cache_dir? ~/.clearml?
I think the only way you can get it is from the task attribute:
ds = Dataset.get(dataset_id="your dataset id")
ds_uri = ds._task.artifacts.get("data").url
👍 let me try to reproduce with it. Can you share the change you made in the docker-compose?
Hi AverageRabbit65,
Is this part of a repository? If so, you can specify the repo in the add_function_step call.
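For example, a minimal sketch assuming you are building the pipeline with PipelineController (the repo URL, branch, and step function are placeholders):
from clearml import PipelineController

def step_one(data_path):
    # placeholder step function
    return data_path

pipe = PipelineController(name="example pipeline", project="examples", version="1.0.0")
pipe.add_function_step(
    name="step_one",
    function=step_one,
    function_kwargs=dict(data_path="/tmp/data"),
    function_return=["data_path"],
    repo="https://github.com/your-org/your-repo.git",  # placeholder repository with the step's code
    repo_branch="main",
)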
Thanks SmugTurtle78, checking it
So just after the clone, before creating the env?
Basically we can have Pigar or pip freeze for getting the packages & versions (plus change them or create a template in the UI). What is the specific scenario you have? Maybe we can think about another solution.
How do you run it locally? The same way?
Can you attach the full log of the instance? Did the AWS autoscaler output any logs?
The Hyperparameter Optimizer can give you such a table, but I’m not sure this is what you are looking for ( https://allegro.ai/clearml/docs/docs/examples/frameworks/pytorch/notebooks/image/hyperparameter_search.html and https://medium.com/pytorch/accelerate-your-hyperparameter-optimization-with-pytorchs-ecosystem-tools-bc17001b9a49 )
And you can use https://github.com/allegroai/clearml-agent/blob/21c4857795e6392a848b296ceb5480aca5f98e4b/docs/clearml.conf#L140 for running scripts at docker startup.
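Assuming that line refers to the extra_docker_shell_script option (an assumption on my part about what that line of the conf contains), a minimal sketch of the relevant part of the agent section would be:
agent {
    # shell commands executed inside the docker right after it starts, before the experiment runs
    extra_docker_shell_script: ["apt-get update", "apt-get install -y libsm6"]
}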
Not sure I’m getting that; if you are loading the latest dataset task in your experiment’s task code, it should take the most updated one.
You can change the dataset _task object to have your storage location as output_uri.
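A rough sketch of what I mean, relying on the internal _task attribute mentioned above (the storage URI is a placeholder):
from clearml import Dataset

ds = Dataset.get(dataset_id="your dataset id")
ds._task.output_uri = "s3://your-bucket/datasets"  # placeholder storage location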
And the LABELS section is empty when running with the agent? Does running locally work?
The training task (the child): is this the task the HPO is cloning? Do you have the packages listed in this task?
BTW, why use the API calls and not the ClearML SDK?
Hi GrievingTurkey78
If you’d like to have the same environment in the trains-agent, you can use the detect_with_pip_freeze option on your local machine, in your ~/trains.conf file.
Just change it to detect_with_pip_freeze: true ( https://github.com/allegroai/trains/blob/master/docs/trains.conf#L168 is an example).
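For reference, a minimal sketch of the relevant section in ~/trains.conf, assuming the option sits under sdk.development as in the linked example:
sdk {
    development {
        # use the full `pip freeze` output instead of analyzing the script's imports
        detect_with_pip_freeze: true
    }
}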
Hi UnevenDolphin73 , the fix is ready, can you try it with the latest rc?
pip install clearml==1.4.2rc0
So if you have a ~/trains.conf, work with it, and if you don’t, work offline?
Hi JitteryCoyote63, so just the print is doubled?
Hi DeliciousBluewhale87
Can you share the version you are using? Did you get any other logs? maybe from the pod?
If you are accessing a specific task artifact, you’ll get an Artifact object (trains.binding.artifacts.Artifact).
Hi IntriguedRat44
If you don’t want to send the framework’s outputs, you can disable those with auto_connect_frameworks=False in your Task.init call.
You can find more options at https://github.com/allegroai/trains/blob/master/trains/task.py#L328
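For example, a minimal sketch (the project and task names are placeholders):
from trains import Task

task = Task.init(
    project_name="examples",
    task_name="no framework auto-logging",
    auto_connect_frameworks=False,  # disable automatic logging of framework outputs
)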
RoughTiger69, can you share the Python version and the logs?
When you are not using the StorageManager, you don’t get the OSError: [Errno 9] Bad file descriptor errors?
Not everything is managed with a git repo; if your script is standalone, the full script will be in the uncommitted changes section (EXECUTION tab).
The repository information is the repository location, the uncommitted changes, and the branch with commit id / tag.
BTW, the full link to the docs - https://allegro.ai/clearml/docs/