MysteriousBee56 when you run the trains-agent with --foreground, before it starts the docker it prints the full command line, could you send it please?
I can't figure out where the extra ' came from...
Also could you send the trains.conf file?
(feel free to redact any confidential information)
btw: both should work fine
DilapidatedDucks58 Nice!
but it would be great to see predecessors of each experiment in the chain
So maybe we should add a "manual pipeline" to create the connection post execution? Is this a one time thing?
Maybe a service creating these flow charts?
Should we put them in the Project's readme? Or in the Pipeline section (coming soon)?
Could it be that the credentials are actually incorrect? It seems like you can access the server. (I assume you were able to browse to it and generate credentials, right?)
Hi StraightCoral86
When I run an experiment using Task.create() ,
Use Task.init
Task.create is meant to create an external Task (i.e. Job) in the system, not to auto-generate a job from the running code. Make sense?
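To make the difference concrete, here is a minimal sketch, assuming the `clearml` package (the later name of `trains`). The function names are made up for illustration, and the project, task, repo and script values are placeholders, not anything from your setup:

```python
def track_current_run(project_name, task_name):
    """Register the script that is running right now as a Task (the usual case)."""
    # Imported lazily; actually calling this needs a configured server.
    from clearml import Task

    return Task.init(project_name=project_name, task_name=task_name)


def define_external_job(project_name, task_name, repo, script):
    """Create a Task for code that lives elsewhere, without running it in this process."""
    from clearml import Task

    return Task.create(
        project_name=project_name,
        task_name=task_name,
        repo=repo,      # placeholder: git URL of the code the job should run
        script=script,  # placeholder: entry point inside that repository
    )
```

So inside your training script you would call something like `track_current_run("examples", "my experiment")`, and keep `Task.create` for jobs you want to define from the outside.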
CLI? Code ?
Yes it is reproducible, do you want a snippet?
Already fixed, please ping tomorrow, I think an RC should be out soon with the fix
Thanks FiercePenguin76 , I can totally understand your point on running proper tests, and reluctance to break other things.
I suggest adding a comment with the temp fix that solved the problem for you, and we will make sure the team takes it from there. wdyt?
SweetGiraffe8
That might be it, could you test with the Demo server ?
Hi JuicyFox94 ,
Actually we just added that (still on GitHub, RC soon)
https://github.com/allegroai/clearml/blob/400c6ec103d9f2193694c54d7491bb1a74bbe8e8/clearml/automation/controller.py#L696
HelplessCrocodile8 I just tried it, everything seems to work (ubuntu 20.04)
What's the OS you are using? Python version? Is it conda?
ReassuredTiger98 thank you so much for testing it!
Hi RoundCat60
anyone with access to the server
Is that a thing? If someone has access to the server, I'm not sure how "protected" you are even if using a key ring...
(unfortunately I do not think we support anything else, but what did you have in mind?)
You can do that programmatically: clone the pipeline Task (a pipeline is also a Task) and change the Args section of that Task, wdyt?
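A rough sketch of that idea, assuming the `clearml` package and that the pipeline parameters live under the "Args" section as mentioned above. The helper names, the parameter names you pass in, and the "services" queue are my assumptions here, so adjust them to your setup:

```python
def to_args_section(overrides):
    """Prefix plain parameter names with the 'Args/' section."""
    return {"Args/{}".format(name): value for name, value in overrides.items()}


def clone_pipeline_with_args(pipeline_task_id, overrides, queue_name="services"):
    """Clone an existing pipeline Task, override some Args, and enqueue the clone."""
    # Imported lazily; actually calling this needs a configured server.
    from clearml import Task

    template = Task.get_task(task_id=pipeline_task_id)
    cloned = Task.clone(source_task=template, name="{} (clone)".format(template.name))
    for key, value in to_args_section(overrides).items():
        cloned.set_parameter(key, value)  # only touches the listed parameters
    Task.enqueue(cloned, queue_name=queue_name)
    return cloned
```

Setting the parameters one by one leaves the rest of the cloned Task's configuration as it was in the original pipeline.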
Because submodules inside a git repo are basically a requirement for the repo to run. Skipping over a few or selecting them manually will break the agent. That said, maybe a shallow clone would be easier or faster. Regardless, it should be an environment passed per Task. Feel free to add a GH issue request; if this is not a unique edge case we will add it
Yes that's the part that is supposed to only pull the GPU usage for your process (and sub processes) instead of globally on the entire system
Hi EnviousStarfish54
Docker on Windows with nvidia runtime support is only available with WSL (I think)
https://docs.nvidia.com/cuda/wsl-user-guide/index.html#installing-wip
https://medium.com/@dalgibbard/docker-with-gpu-support-in-wsl2-ebbc94251cf5
hmmm I see...
It seems to miss the fact that your process does use the GPU.
Maybe it only happens later, that the GPU is used?
Does that make sense ?
(i.e. importing the trains package is enough to patch the argparser; the arguments are only logged when you call Task.init, until then they are stored in memory)
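A small sketch of that order of events, assuming the `clearml` package (the later name of `trains`). The `--lr` argument and the project and task names are just placeholders:

```python
import argparse


def build_parser():
    """A plain argparse parser, nothing specific to the tracking package."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--lr", type=float, default=0.01)  # hypothetical argument
    return parser


def main(argv=None):
    # Importing the package is the step that patches argparse.
    from clearml import Task

    # Parsed before Task.init: the values are only kept in memory at this point.
    args = build_parser().parse_args(argv)

    # Only now are the already-parsed arguments logged on the Task.
    task = Task.init(project_name="examples", task_name="argparse demo")
    return task, args
```

The point is only the ordering: import first, parse whenever you like, and the arguments show up once `Task.init` is called.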
PompousBeetle71 you can check this example:
https://github.com/allegroai/trains/blob/master/examples/distributed/example_torch_distributed.py
I think it should help. If you want a more manual approach, you can check the Popen subprocesses here:
https://github.com/allegroai/trains/blob/master/examples/distributed/example_subprocess.py
BoredHedgehog47 could it be that "python" points to python 2.7 inside your container, as opposed to python3 on your machine?
(this error is python2 trying to run python 3 code)
https://stackoverflow.com/questions/20555517/using-multiple-versions-of-python
"Training classifier with command:\n python -m sfi.imagery.models.bbox_predictorv2.train
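If the training is launched from your own code with a plain `python -m ...` command (an assumption on my side, based on the log line), one way to avoid picking up python 2.7 is to start the child process with the same interpreter that runs the parent script. A minimal sketch, with made-up helper names:

```python
import subprocess
import sys


def build_module_command(module, *args):
    """Build a command that runs `module` with the current interpreter."""
    # sys.executable is the interpreter running this script, so the child
    # does not depend on whatever "python" resolves to inside the container.
    return [sys.executable, "-m", module] + list(args)


def run_module(module, *args):
    """Run the module and raise CalledProcessError if it exits with an error."""
    return subprocess.check_call(build_module_command(module, *args))
```

For example, `run_module("sfi.imagery.models.bbox_predictorv2.train")` would use python3 whenever the parent script itself runs under python3.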
Also, can you right click on the image and save it on your machine, to see if it is cropped or if it is just a UI issue?
EnviousStarfish54 data versioning on the open source leverages the artifacts and storage and caching capabilities of Trains.
A simple workflow:
- Upload data
https://github.com/allegroai/events/blob/master/odsc20-east/generic/dataset_artifact.py
- Preprocessing data
https://github.com/allegroai/events/blob/master/odsc20-east/generic/process_dataset.py
- Using data
https://github.com/allegroai/events/blob/master/odsc20-east/scikit-learn/sklearn_jupyter.ipynb
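In short, the upload and use steps boil down to something like the sketch below. It assumes the `clearml` package (the later name of `trains`); the function names, the project and task names, and the "dataset" artifact name are placeholders I picked, not what the linked examples use:

```python
def upload_dataset(local_path, project_name="data", task_name="dataset v1"):
    """Store a local file or folder as an artifact on a dedicated Task."""
    # Imported lazily; actually calling this needs a configured server.
    from clearml import Task

    task = Task.init(project_name=project_name, task_name=task_name)
    task.upload_artifact(name="dataset", artifact_object=local_path)
    return task.id


def fetch_dataset(project_name="data", task_name="dataset v1"):
    """Return a cached local copy of the artifact stored by upload_dataset."""
    from clearml import Task

    source = Task.get_task(project_name=project_name, task_name=task_name)
    return source.artifacts["dataset"].get_local_copy()
```

A preprocessing step would do both: fetch the raw artifact, process it, and upload the result as a new artifact on its own Task.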
Hi ShortElephant92
You could get a local copy from the local server, then switch credentials to the hosted server and upload again, would that work?