I'm not running in docker mode though
hmmm that might be the first issue. it cannot skip venv creation, it can however use a pre-existing venv (but it will change it every time it installs a missing package)
so setting CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1 in non-docker mode has no effect
So if everything works you should see "my_package" package in the "installed packages"
the assumption is that if you do: pip install "my_package"
It will set "pandas" as one of its dependencies, and pip will automatically pull pandas as well.
That way we do not list the entire venv you are running on, just the packages/versions you are using, and we let pip sort the dependencies when installing with the agent
Make sense ?
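To make the idea concrete, here is a minimal sketch of "list only what you import, let pip resolve the rest". It is not how ClearML actually detects packages; the function name `direct_requirements` and the sample venv contents are made up for illustration.

```python
from typing import Dict, Iterable, List


def direct_requirements(imported: Iterable[str], installed: Dict[str, str]) -> List[str]:
    """Pin only the packages the code imports directly.

    `installed` maps package name -> version for the whole venv; anything
    not in `imported` is left for pip to resolve as a dependency.
    """
    wanted = sorted(set(imported))
    return [f"{name}=={installed[name]}" for name in wanted if name in installed]


# Hypothetical venv: "my_package" depends on pandas, which depends on numpy.
venv = {"my_package": "0.1.0", "pandas": "2.2.0", "numpy": "1.26.4"}

# Only the directly imported package is listed; pandas and numpy are omitted.
print(direct_requirements(["my_package"], venv))
```

With this approach pandas never shows up in the list, yet pip still installs it because "my_package" declares it as a dependency.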
Hi JealousParrot68
This is the same as:
https://clearml.slack.com/archives/CTK20V944/p1627819701055200
and,
https://github.com/allegroai/clearml/issues/411
There is something odd happening in the files-server: it replaces the header (i.e. it guesses the content of the stream), and this breaks the download (what happens is that the clients automatically ungzip the csv).
We are working on a hotfix for the issue (BTW: if you are using object-storage / shared folders, this will not happen)
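Until a fix lands, a client-side guard could look something like the sketch below. It is only an illustration, not the actual fix: it checks the gzip magic bytes and decompresses only when the payload is still compressed. The helper name `read_csv_bytes` is hypothetical.

```python
import gzip

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def read_csv_bytes(payload: bytes) -> bytes:
    """Return raw CSV bytes whether or not the payload is still gzip-compressed."""
    if payload[:2] == GZIP_MAGIC:
        return gzip.decompress(payload)
    # Already decompressed by the client (or never compressed), use as-is.
    return payload


raw = b"a,b\n1,2\n"
print(read_csv_bytes(gzip.compress(raw)) == read_csv_bytes(raw))
```

This way the calling code does not care whether the download was transparently ungzipped on the way.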
And having a pdf is easier/better than sharing a link to the results page ?
ElegantKangaroo44 good question, that depends on where we store the score of the model itself. You can obviously parse the file name task.models['output'][-1].url and retrieve the score from it. You can also store it in the model name task.models['output'][-1].name, or put it as a general-purpose blob of text in what is currently model.config_text (for convenience you can have the model parse JSON-like text and use model.config_dict)
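A rough sketch of the two parsing options, using plain strings instead of real model objects. The file naming convention (`..._score_0.93.pt`) and the `"score"` JSON key are assumptions for illustration; adapt them to however you actually name and configure your models.

```python
import json
import re
from typing import Optional

# Assumed naming convention: the file name contains "score_<number>".
_SCORE_RE = re.compile(r"score[_=-](\d+(?:\.\d+)?)")


def score_from_url(url: str) -> Optional[float]:
    """Extract the score embedded in a model file name/url, if any."""
    match = _SCORE_RE.search(url)
    return float(match.group(1)) if match else None


def score_from_config_text(config_text: str) -> Optional[float]:
    """Read a "score" key from JSON-like configuration text, if present."""
    try:
        value = json.loads(config_text).get("score")
    except (json.JSONDecodeError, AttributeError):
        return None
    return float(value) if value is not None else None


print(score_from_url("https://files.example.com/models/model_score_0.93.pt"))
print(score_from_config_text('{"score": 0.93}'))
```

In practice you would pass task.models['output'][-1].url (or the model's config text) into these helpers.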
Hi SubstantialElk6
I can't see that it was removed, could you send the full log ?
Did you experience any drop in performance using forkserver?
No, seems to be working properly for me.
If yes, did you test the variant suggested in the pytorch issue? If yes, did it solve the speed issue?
I haven't tested it, that said it seems like a generic optimization of the DataLoader
Hi MistakenDragonfly51
Hello everyone! First, thanks a lot to everyone that made ClearML possible,
❤
To your questions:
Long story short, no, unless you really want to compile the dockers, and I can't see the real upside here.
Yes, add the following /opt/clearml.conf:/root/clearml.conf here: https://github.com/allegroai/clearml-server/blob/5de7c120621c2831730e01a864cc892c1702099a/docker/docker-compose.yml#L154
and configure your host's "/opt/clearml.conf" with ...
JitteryCoyote63 hacky but sure
` from trains.config import config_obj
print(config_obj) `
Hi DilapidatedDucks58 just making sure, is the link to the pytorch nightly artifactory, or is it a direct link to the package? Reason for asking: I was not aware they have a proper artifactory... When the task runs, the trains agent will update the installed packages with all the packages it actually used. Could you verify you have the correct version?
Regarding the extra files, you are correct, the docker container is reset every run, so they will get lost. What are those files for? Could you add ...
it does not include the "internal.repo" as a package dependency, so it crashes.
understood
And for the time being we have not used the decorators,
So how are you building the pipeline component ?
LOL love that approach.
Basically here is what I'm thinking,
` from clearml import Task, InputModel, OutputModel
task = Task.init(...)
# run this part once
if task.running_locally():
    my_auxiliary_stuff = OutputModel()
    my_auxiliary_stuff.system_tags = ["DATA"]
    my_auxiliary_stuff.update_weights_package(weights_path="/path/to/additional/files")
    input_my_auxiliary = InputModel(model_id=my_auxiliary_stuff.id)
    task.connect(input_my_auxiliary, "my_auxiliary")
    task.execute_remotely()
my_a...
Hi, what is host?
The IP of the machine running the ClearML server
Thanks, new doc site is scheduled for next week, it will also be on github, so pr-ing fixes will be a breeze :)
Hi MistyFly99
Notice that the files server needs to have an "address" that can be accessed from the browser, since data is stored in a federated manner. This means your browser accesses the files server directly, not through the API server. I'm assuming the address is not valid?
How can I find the queue name?
You can create as many as you like. The default one is called "default", but you can add new queues in the UI (go to the Workers & Queues page, then Queues, and click "+ New Queue").
Is Task.current_task() creating a task?
Hmm it should not, it should return a Task instance if one was already created.
That said, I remember there was a bug (not sure if it was in a released version or an RC) that caused it to create a new Task if there isn't an existing one. Could that be the case ?
Thanks JitteryCoyote63 !
Any chance you want to open github issue with the exact details or fix with a PR ?
(I just want to make sure we fix it as soon as we can)
potential sources of slow down in the training code
Is there one?
p.s. you should remove this line: extra_index_url: ["git@github.com:salimmj/xxxx"]
Actually I saw that the `RuntimeError: context has already been set` appears when the task is initialised outside `if __name__ == "__main__":`
Is this when you execute the code or when the agent does ?
Also what's the OS of your machine/ agent ?
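For reference, one common way to hit "context has already been set" in plain Python is calling multiprocessing.set_start_method more than once, for example from module-level code that runs again when a spawn/forkserver child re-imports the main module. Whether that is what happens here is an assumption, not a confirmed diagnosis. The sketch below uses only the standard library; the helper name `ensure_start_method` is made up.

```python
import multiprocessing as mp


def ensure_start_method(method: str = "spawn") -> str:
    """Set the start method unless one is already fixed; return the active one."""
    try:
        mp.set_start_method(method)
    except RuntimeError:
        # Raised with "context has already been set" when a start method
        # was already chosen earlier in this process.
        pass
    return mp.get_start_method()


if __name__ == "__main__":
    # Keeping process-wide setup under the main guard means child processes
    # that re-import this module do not run it a second time.
    print(ensure_start_method("spawn"))
```

Keeping the setup code (including task initialisation) under the main guard avoids running it again in every child process.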
JitteryCoyote63 s3 should work, you can go to your profile page, see if you do not have some old credentials already there, maybe this is the issue.
Hi StrangePelican34 , you mean poetry as the package manager of the agent? The venvs cache will only work for pip and conda, poetry handles everything internally :(
Ohh sorry. task_log_buffer_capacity is actually an internal buffer for the console output: it sets how many lines are stored before they are flushed to the server.
To be honest, I can't think of a reason to expose / modify it...
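Conceptually, a capacity-bound console buffer behaves roughly like the sketch below. This is not the ClearML implementation, just an illustration of "store N lines, then flush"; the class `LineBuffer` and its callback are hypothetical.

```python
from typing import Callable, List


class LineBuffer:
    """Collect console lines and hand them off in batches of `capacity`."""

    def __init__(self, capacity: int, flush_fn: Callable[[List[str]], None]) -> None:
        self.capacity = capacity
        self._flush_fn = flush_fn
        self._lines: List[str] = []

    def write(self, line: str) -> None:
        self._lines.append(line)
        # Once the buffer is full, send the batch (e.g. to a server).
        if len(self._lines) >= self.capacity:
            self.flush()

    def flush(self) -> None:
        if self._lines:
            self._flush_fn(self._lines)
            self._lines = []


sent: List[List[str]] = []
buf = LineBuffer(capacity=2, flush_fn=sent.append)
for line in ["a", "b", "c"]:
    buf.write(line)
buf.flush()  # send whatever is left at the end
print(sent)
```

A larger capacity means fewer, bigger batches; a smaller one means the console output reaches the server sooner.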
And is there an easy way to get all the metrics associated with a project?
Metrics are per Task, but you can get the min/max/last of all the tasks in a project. Is that it?
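As a rough idea of what a "min/max/last per task" summary looks like, here is a small sketch over plain Python data. The input shape (task name mapped to the reported values of a single metric) is an assumption for illustration and is not the structure the server returns.

```python
from typing import Dict, List


def summarize(per_task: Dict[str, List[float]]) -> Dict[str, Dict[str, float]]:
    """Compute min/max/last of one metric for every task that reported values."""
    return {
        task: {"min": min(values), "max": max(values), "last": values[-1]}
        for task, values in per_task.items()
        if values  # skip tasks that never reported this metric
    }


# Hypothetical loss values reported by two tasks in the same project.
project_losses = {"task_a": [0.9, 0.5, 0.6], "task_b": [1.2, 0.8], "task_c": []}
print(summarize(project_losses))
```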
(basically python abusing types/casting, where the value can be both str/bool on the same argparser argument)
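To illustrate the kind of argument being described, here is a minimal sketch of an argparse option whose value may end up as either a bool or a str. The flag name `--resume` and the converter `str_or_bool` are hypothetical and not taken from the original code.

```python
import argparse
from typing import Union


def str_or_bool(value: str) -> Union[str, bool]:
    """Turn true/false-like strings into bools, keep anything else as str."""
    lowered = value.strip().lower()
    if lowered in ("true", "yes", "1"):
        return True
    if lowered in ("false", "no", "0"):
        return False
    return value


parser = argparse.ArgumentParser()
# Hypothetical flag: either a boolean switch or a checkpoint path.
parser.add_argument("--resume", type=str_or_bool, default=False)

print(parser.parse_args(["--resume", "true"]).resume)
print(parser.parse_args(["--resume", "checkpoint.pt"]).resume)
```

Because the same argument can come back with two different types, anything that records or overrides it has to guess which type was intended.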
Plan is to have it out in the next couple of weeks.
Together with a major update in v0.16
Ohhhh, okay as long as you know, they might fail on memory...