:) Yes, on your gateway/firewall set http://demoapi.trains.allegro.ai to 127.0.0.1. That's always good practice ;)
What's the exact error you are getting ?
(Maybe this is a permissions error on the cache folder. Which folders is it using? You can see them in the configuration as well.)
could you send the entire log here?
i.e. from the "docker-compose" command line and onward
Hi LazyLeopard18 ,
See details below, are you using the win10 docker-compose yaml?
https://github.com/allegroai/trains-server/blob/master/docs/install_win.md
Specifically, notice steps (1) and (2): they are important for the Windows docker service to be able to run the elastic container and the mongo container.
LazyLeopard18 nice. maybe we should add it in the FAQ / Install. Could you send the exact docker-compose you used and command line, I'll ask the guys to add it 🙂
Hi CurvedHedgehog15
I would like to optimize hparams saved in Configuration objects.
Yes, this is a tough one.
Basically the easiest way to optimize is with hyperparameter sections as they are basically key/value you can control from the outside (see the HPO process)
Configuration objects are, well, blobs of data that "someone" can parse. There is no real restriction on them, since there are many standards to store them (yaml, json, ini, dot notation, etc.)
The quickest way is to add...
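To make the key/value vs. blob distinction concrete, here is a minimal sketch (not necessarily what the truncated sentence above goes on to suggest). It assumes the configuration has already been parsed into a nested dict; `flatten_config`, `unflatten_config` and the "/" separator are hypothetical names chosen for illustration. The flat dict is the kind of key/value structure that can be exposed as hyperparameters and controlled from the outside.

```python
def flatten_config(config, parent_key="", sep="/"):
    """Turn a nested dict into flat key/value pairs (hypothetical helper)."""
    flat = {}
    for key, value in config.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
        if isinstance(value, dict):
            # Recurse into sub-sections, prefixing keys with the section name.
            flat.update(flatten_config(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


def unflatten_config(flat, sep="/"):
    """Rebuild the nested dict from the flat key/value pairs."""
    nested = {}
    for full_key, value in flat.items():
        *parents, leaf = full_key.split(sep)
        node = nested
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


config = {"model": {"lr": 0.1, "layers": 3}, "seed": 1}
flat = flatten_config(config)
# `flat` could then be registered as hyperparameters (e.g. with task.connect),
# and unflatten_config(flat) gives back the structure the training code expects.
```

The round trip only works if no key contains the separator, so pick one that cannot appear in your section names.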
I'm going to follow your suggestion and just put the extra effort into distributing a pre-built image.
That sounds good 🙂
If you feel the need is important, I do have a hack in mind; it will be doable once we have support for entrypoint "-c python_code_here". But since this is still not available, I believe you are right and building an image would be the easiest.
A note on the docker image, remember that when running inside the docker we inherit the system packages installed on the d...
Hi BattyCrocodile47
Can you help me make the case for ClearML pipelines/tasks vs Metaflow?
Based on my understanding
- Metaflow cannot have custom containers per step (at least I could not find where to push them)
- DAG only execution. I.e. you cannot have logic driven flows
- Cannot connect git repositories to different components in the pipeline
- Visualization of results / artifacts is rather limited
- Only Kubernetes is supported as underlying prov...
Hi FierceFly22
You called execute_remotely a bit too soon. If you have any manual configuration calls, they have to be made before it, so that they are stored in the Task. This includes task.connect and task.connect_configuration.
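The ordering can be sketched like this. It is only an illustration: `configure_then_enqueue`, the "preprocessing" name and the example values are made up, and the function expects a Task object created by the caller.

```python
def configure_then_enqueue(task, params, config, queue_name="default"):
    """Register all manual configuration first, then hand the Task to an agent."""
    # 1. Hyperparameters: stored on the Task as key/value pairs.
    params = task.connect(params)
    # 2. Configuration object: stored on the Task as a named blob.
    config = task.connect_configuration(config, name="preprocessing")
    # 3. Only now stop the local run and enqueue the Task.
    #    Anything connected after this call would not be stored before enqueueing.
    task.execute_remotely(queue_name=queue_name)
    return params, config


# Typical usage (requires the clearml package and a configured server):
#   from clearml import Task
#   task = Task.init(project_name="examples", task_name="remote run")
#   params, config = configure_then_enqueue(task, {"lr": 0.01}, {"resize": 224})
```

When the same script later runs on the agent, the connect calls return the values stored on the Task, which is why they need to be in place before the hand-off.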
This is a part of a bigger process which takes quite some time and resources, I hope I can try this soon if this will help get to the bottom of this
No worries, if you have another handle on how/why/when we lose the current Task, please share 🙂
We already redesigned the implementation so it should be quite easy to extend to GCP and Azure, what are you planning ?
Hi WittyOwl57
I'm guessing clearml is trying to unify the histograms for each iteration, but the result is in this case not useful.
I think you are correct, the TB histograms are actually 3D histograms (i.e. 2D histograms over time, which would be the default for kernel/bias etc.)
is there a way to ungroup the result by iteration, and, is it possible to group it by something else (e.g. the tags of the two plots displayed below side by side).
Can you provide a toy example...
Hi ShinyWhale52
This is just a suggestion, but this is what I would do:
1. Use clearml-data and create a dataset from the local CSV file: clearml-data create ... followed by clearml-data sync --folder (where the csv file is)
2. Write a python code that takes the csv file from the dataset and creates a new dataset of the preprocessed data
` from clearml import Dataset
original_csv_folder = Dataset.get(dataset_id=args.dataset).get_local_copy()
# process csv file -> generate a new csv
preproces...
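As a rough illustration of the "process csv file -> generate a new csv" step, here is a standard-library-only sketch. The function name `preprocess_csv_folder` and the cleaning rule (drop rows with empty cells) are placeholders, not part of the original suggestion; the input folder would be the path returned by get_local_copy() in the snippet above.

```python
import csv
from pathlib import Path


def preprocess_csv_folder(src_folder, dst_folder):
    """Copy every CSV from src_folder to dst_folder, dropping rows with empty cells.

    Returns the number of rows written (header rows included).
    """
    src, dst = Path(src_folder), Path(dst_folder)
    dst.mkdir(parents=True, exist_ok=True)
    kept = 0
    for csv_path in sorted(src.glob("*.csv")):
        out_path = dst / csv_path.name
        with csv_path.open(newline="") as fin, out_path.open("w", newline="") as fout:
            writer = csv.writer(fout)
            for row in csv.reader(fin):
                # Placeholder preprocessing rule: keep only fully populated rows.
                if row and all(cell.strip() for cell in row):
                    writer.writerow(row)
                    kept += 1
    return kept


# The output folder could then be registered as a new dataset version, e.g. with
# Dataset.create(...), dataset.add_files(dst_folder), dataset.upload() and
# dataset.finalize() from the clearml package.
```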
AdventurousRabbit79 are you passing cache_executed_step=False to the PipelineController?
https://github.com/allegroai/clearml/blob/332ceab3eadef4997e897d171957975a247a6dc1/clearml/automation/controller.py#L129
Could you send a usage example ?
my pipeline controller always updates to the latest git commit id
This will only happen if the Task the pipeline creates has no specific commit ID, and instead just uses the latest from the git repo. Is this the case ?
should I update nodejs in centos image ?
I think so, it might have been forgotten
the only problem with it is that it will start the task even if the task is completed
What are the criteria?
DepressedChimpanzee34
so parsing back is done via a yaml reader:
https://github.com/allegroai/clearml/blob/49fcbd7bbf3236f4175cdff29fa951847b0923cc/clearml/backend_interface/task/args.py#L506
We could add an extra test here, checking for \ in the string; that should solve it and will be backwards compatible (I think)
https://github.com/allegroai/clearml/blob/49fcbd7bbf3236f4175cdff29fa951847b0923cc/clearml/backend_interface/task/task.py#L935
Hi JitteryCoyote63
Could you please push the code for that version on github?
Oh, seems like it is not synced, thank you for noticing (it will be taken care of immediately)
Regarding the issue:
Look at the attached images
None does not contain a specific wheel for cuda117 for x86, they use the pip default one
![image](https://clearml-web-assets.s3.amazonaws.com/scoold/images/TT9ATQXJ5-F05744CK09L/screenshot...
AgitatedTurtle16 from the screenshot, it seems the Task is stuck in the queue, which means there is no agent running to actually run the interactive session.
Basic setup:
- A machine running clearml-agent (this is the "remote machine")
- A machine running clearml-session (let's call it laptop 🙂)
You need to first start the agent on the "remote machine" (basically call clearml-agent daemon --docker --queue default). Once the agent is running on the remote machine, from your laptop ru...
I assume the task is being launched sequentially. I'm going to prepare a more elaborate example to see what happens.
Let me know if you can produce a mock test, I would love to make sure we support the use case, this is a great example of using pipeline logic 🙂
JitteryCoyote63 so now everything works as expected ?
GiganticTurtle0 quick update, a fix will be pushed, so that casting is based on the actual value passed, not even type hints 🙂
(this is only in case there is no default value, otherwise the default value type is used for casting)
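The rule described above can be pictured roughly as follows. This is a hypothetical illustration of the logic, not the actual implementation: `cast_stored_value` and its parameters are invented names, and it only handles simple scalar types.

```python
def cast_stored_value(stored, default=None, passed_value=None):
    """Cast a value stored as a string back to a Python type.

    Rule being illustrated: if the argument has a default value, use the
    default's type; otherwise use the type of the value actually passed.
    """
    reference = default if default is not None else passed_value
    if reference is None:
        # Nothing to infer a type from: leave the stored string untouched.
        return stored
    target_type = type(reference)
    if target_type is bool:
        # bool("False") is True in Python, so booleans need explicit parsing.
        return stored.strip().lower() in ("1", "true", "yes")
    return target_type(stored)


cast_stored_value("3", default=1.5)       # default is a float, so cast to float
cast_stored_value("3", passed_value=7)    # no default, an int was passed, so cast to int
```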
Hi BattyCrocodile47
see here: None
Try with app.clearml.mlops-club.org
and the rest of them
Basically it is the same as "report_scatter2d"
Hi JitteryCoyote63
Thank you for bringing it up! Can you verify with the latest clearml-agent 1.5.3rc2?
Hi SoreDragonfly16
The warning you mention means that someone changed the state of the experiment to aborted, which in turn will actually kill the process.
What do you mean by "If I disable the logger," ?
Hi JumpyPig73
Funny enough this is being fixed as we speak 🙂
The main issue is that as you mentioned, ClearML does not "detect" the exit code when os.exit() is called, and this is why it is "missing" the failed test (because as mentioned, all exceptions are caught). This should be fixed in the next RC
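For anyone wondering why an exit hook can miss this case, here is a small standalone sketch. It assumes the call in question is os._exit() (Python has no os.exit()), and it says nothing about how ClearML itself hooks process exit; it only shows the general Python behavior that os._exit() skips exit handlers while sys.exit() does not. `run_child` is a made-up helper name.

```python
import subprocess
import sys

# Child script: registers an exit hook, then exits with the given call.
CHILD = """
import atexit, os, sys
atexit.register(lambda: print("atexit hook ran", flush=True))
{exit_call}
"""


def run_child(exit_call):
    """Run the child script and return (exit code, captured stdout)."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD.format(exit_call=exit_call)],
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout.strip()


# sys.exit() raises SystemExit, so exit hooks still get a chance to run.
print(run_child("sys.exit(3)"))
# os._exit() terminates immediately: same exit code, but no hook is called.
print(run_child("os._exit(3)"))
```

Both children return exit code 3, but only the first one prints from its hook, so code that relies on exit handlers or caught exceptions never sees the second kind of failure.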
Hi JumpyPig73, I think it was synced to github. You can already test with: pip install git+https://github.com/allegroai/clearml.git