Thanks BroadSeaturtle49
I think I was able to locate the issue: `!=` breaks the PyTorch lookup.
I will make sure we fix it asap and release an RC.
BTW: how come 0.13.x has no Linux x64 support? And the same for 0.12.x:
https://download.pytorch.org/whl/cu111/torch_stable.html
HelplessCrocodile8
Basically the file URI might be different on a different machine (out of my control) but they point to the same artifact storage location
We might have thought of that...
in your clearml.conf file:
` sdk {
    storage {
        path_substitution = [
            # Replace registered links with local prefixes,
            # solve mapping issues, and allow for external resource caching.
            {
                registered_prefix = file:///mnt/data/...
                local_prefix = ...
            }
        ]
    }
} `
HugeArcticwolf77 changing the color is definitely a feature we will have in the next version; right now I think you cannot 😞 It is chosen randomly based on the title/series, and I think your example is a great failure case of that randomness 😅
Nice! I'll see if we can have better error handling for it, or solve it altogether 🙂
JumpyPig73 you should be able to find it at the bottom of the page, try scrolling down (it should be after the installed packages)
We're wondering how many on-premise machines we'd like to deprecate.
I think you can see that in the queues tab, no?
Oh I see, the pipeline controller itself (not the components) is the one with the repo.
To fix that, add the following at the top of the script:
` from clearml import Task
Task.force_store_standalone_script()
@PipelineDecorator.pipeline(...) `
That should do the trick
Thanks BattyLion34 I fixed the code snippet :)
Hi VexedCat68
So if I understand correctly, the issue is this argument:
parameter_override={'Args/dataset_id': '${split_dataset.split_dataset_id}', 'Args/model_id': '${get_latest_model_id.clearml_model_id}'},
I think what is missing is telling it this is an artifact:
parameter_override={'Args/dataset_id': '${split_dataset.artifacts.split_dataset_id.url}', 'Args/model_id': '${get_latest_model_id.clearml_model_id}'},
You can see the example here:
https://clear.ml/docs/latest/docs/ref...
For .git-credentials, remove git_pass/git_user from the clearml.conf.
If you want to use SSH you also need to add:
force_git_ssh_protocol: true
https://github.com/allegroai/clearml-agent/blob/a2db1f5ab5cbf178840da736afdc370cfff43f0f/docs/clearml.conf#L25
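For reference, a minimal sketch of that entry (assuming it lives under the agent section, as in the linked default clearml.conf):

```
agent {
    # force the agent to clone repositories over SSH instead of HTTPS
    force_git_ssh_protocol: true
}
```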
We are here if you need further help 🙂
Hi JumpyDragonfly13
- is "10.19.20.15" accessible from your machine (i.e. can you ping to it)?
- Can you manually SSH to 10.19.20.15 on port 10022 ?
I think there was an issue with the entire .ml domain name (at least for some dns providers)
I would like to bypass this behavior because my code has a need for a specific version of PyTorch.
DilapidatedCow43 you will get exactly the PyTorch version you need, but compiled against the CUDA version that is installed (the PyTorch people actually maintain multiple builds for different CUDA versions)
OutrageousSheep60 so if this is the case, I think you need to add "external links", i.e. upload the individual files to GCS, then register the links to GCS. Does that make sense?
If you have a requirements file then you can specify it:
Task.force_requirements_env_freeze(requirements_file='requirements.txt')
If you just want the pip freeze output to be shown in your "Installed Packages" section then use:
Task.force_requirements_env_freeze()
Notice that in both cases you should call the function before you call Task.init()
btw, what do you mean by "Packages will be installed from projects requirements file" ?
Hmmm, can you view the settings? that's the only thing I can think of at the moment that will be diff between your setup and the working one...
Also, is there a way for you to put the trains-server behind HTTPS (on your GCP)?
Was going crazy for a short amount of time yelling to myself: I just installed clear-agent init!
oh noooooooooooooooooo
I can relate so much, happens to me too often that copy pasting into bash just uses the unicode character instead of the regular ascii one
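To illustrate (a generic Python check, nothing ClearML-specific): the dash that rich-text pages substitute is a different codepoint from the ASCII hyphen, which is why bash does not recognize the flag:

```python
# Rich-text sources often replace the ASCII hyphen with a typographic dash
ascii_hyphen = "-"
en_dash = "\u2013"  # what sometimes gets copy-pasted from formatted pages

print(ascii_hyphen == en_dash)          # False: different codepoints
print(ord(ascii_hyphen), ord(en_dash))  # 45 8211
```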
I'll let the front-end guys know, so we do not make ppl go crazy 😉
BTW: is this on the community server or self-hosted (aka docker-compose)?
The problem is due to tight security on this k8s cluster; the k8s pod cannot reach the public file server URL associated with the dataset.
Understood, that makes sense. If this is the case then the path_substitution feature is exactly what you are looking for.
Hi ReassuredTiger98
An agent's queue priority translates to the order in which the agent will pull jobs from its queues.
Now let's assume we have two agents with priorities A,B for one and B,A for the other. If we only push a Task to queue A, and both agents are idle (implying queue B is empty), there is no guarantee which one will pull the job.
Does that make sense ?
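A toy sketch of that pull order (not the actual agent code; queues and jobs are stand-ins):

```python
from collections import deque

def pull_next(queues, priority):
    """Return the next job for an agent, scanning its queues in priority order."""
    for name in priority:
        if queues[name]:
            return queues[name].popleft()
    return None  # all queues empty -> the agent stays idle

queues = {"A": deque(["task-1"]), "B": deque()}
# Agent 1 has priority A,B; agent 2 has priority B,A.
# With only queue A populated, both agents fall through to queue A,
# so whichever polls first gets the job -- no guarantee which one.
print(pull_next(queues, ["B", "A"]))  # task-1 (the B,A agent falls through to A)
```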
What is the use-case you are trying to solve/optimize for ?
but it is not optimal if one of the agents is only able to handle tasks of a single queue (e.g. if the second agent can only work on tasks of type B).
How so?
Sure thing 🙂
BTW: ReassuredTiger98 this is definitely an interesting use case, and I think you can actually write some code to solve it if you like.
Basically let's follow up on your setup:
Machine X: agent listening to queues A, B_machine_a (notice we have two agents here)
Machine Y: agent listening to queue B_machine_b
Now we (the users) will push our jobs into queues A and B
Now we have a service that does the following:
` see if we have a job in queue B
check if machine Y is working... `
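A rough sketch of what that service could look like; the two check functions are placeholders for real ClearML server queries (hypothetical names, not an actual API):

```python
PENDING = {"B": ["job-42"]}  # stand-in for the server's queue state
BUSY = {"machine_y": True}   # stand-in for agent/worker status

def queue_jobs(queue_name):
    # Placeholder: in practice, query the ClearML server for pending tasks
    return PENDING.get(queue_name, [])

def machine_busy(machine):
    # Placeholder: in practice, check whether that machine's agent is running a task
    return BUSY.get(machine, False)

def route_queue_b():
    """If queue B has a job while machine Y is busy, send it to machine X's queue."""
    if queue_jobs("B") and machine_busy("machine_y"):
        return "B_machine_a"  # machine X's agent will pick it up
    return "B_machine_b"      # leave it for machine Y's agent

print(route_queue_b())  # B_machine_a
```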
Hi ReassuredTiger98
I think DefiantCrab67 solved it 🙂
https://clearml.slack.com/archives/CTK20V944/p1617746462341100?thread_ts=1617703517.320700&cid=CTK20V944
With pleasure 🙂
Hmm ConvincingSwan15
WARNING - Could not find requested hyper-parameters ['Args/patch_size', 'Args/nb_conv', 'Args/nb_fmaps', 'Args/epochs'] on base task
Is this correct? Can you see these arguments on the original Task in the UI (i.e. the Args section, parameter epochs)?
Hi ConvincingSwan15
A few background questions:
Where is the code that we want to optimize? Do you already have a Task of that code executed?
"find my learning script"
Could you elaborate ? is this connect to the first question ?
Hmm, maybe the original Task was executed with older versions? (before the section names were introduced)
Let's try:
DiscreteParameterRange('epochs', values=[30]),
Does that give a warning?
You put it there 🙂 so the assumption is you know what you are looking for, or use glob? wdyt?