Reputation
Badges 1
25 × Eureka!Could not install packages due to an EnvironmentError: [Errno 2] No such file or directory: '/tmp/build/80754af9/attrs_1604765588209/work'
Seems like pip failed creating a folder
Could it be you are out of space ?
SubstantialElk6
Regrading cloning the executed Task:
In the pip requirements syntax, "@" is a hint that tells pip where to find the package if it is not preinstalled.
Usually when you find the @ /tmp/folder
It means the packages was preinstalled (usually pre installed in the docker).
What is the exact scenario that caused it to appear (this was always the case, before v1 as well).
For example zipp
package is installed from pypi be default and not from local temp file.
Your fix b...
Hi DefeatedCrab47
Not really sure, and it smells bad ...
Basically you can always use the TB logger, and call Task.init.
BoredHedgehog47 if you are running it on K8s, then the setup script is running before everything else, even before an agent appears on the machine, unfortunately this means the output is not logged yet, hence the missing console lines (I think the next version of the glue will fix that)
In order to test you can do:export TEST_ME
then inside your code you will be able to see itos.environ['TEST_ME']
Make sense ?
Hi SteadySeagull18
What does the intended workflow for making a "pipeline from tasks" look like?
The idea is if you have existing Tasks in the system and you want to launch them one after the other with control over inputs (or outputs of them) you can do that, without writing any custom code.
Currently, I have a script which does some
Task.create
's,
Notice that your script should do Task.init - Not Task.create, as Task create is designed to create additional ...
Hey SarcasticSparrow10 see here π
https://allegro.ai/clearml/docs/docs/deploying_clearml/clearml_server_linux_mac.html#upgrading
When I'm setting up my Pipeline, I can't go "here are some brand new tasks, please run them",
I think this is the main point. Can you create those Tasks via Task.create and get what you want? If so, then sure you can do that:
` def create_step_task(a_node):
task = Task.create(...)
return task
pipe.add_step(
name="stage_process",
parents=["stage_data"],
base_task_factor=create_step_task
) `wdyt?
As for the node, this confusing bit is that this is text from the docs...
SteadySeagull18 btw: in post-callback the node.job will be completed
because it is a called after the Task is completed
Hi LazyTurkey38
What do you mean the git repo is not recognized? When execute_remotely leaves you should see on the task a ref to the git repo with the exact commit ID you have locally pulled, do you see it under the Execution tab?
Hmm I would have the docker file contain the default Azure credentials/output_uri, and then have the users clearml credentials passed as env variable in runtime. wdyt?
(I'm checking if you can pass the azure credentials as env in a minute)
In the UI you can edit the base container image + add "SETUP SHELL SCRIPT", with any missing "apt update && apt-get install -y ..."
A true mystery π
That said, I hardly think it is directly related to the trains-agent
...
Do you have any more insights on when / how it happens ?
GreasyPenguin66 you can pass:AZURE_STORAGE_ACCOUNT AZURE_STORAGE_KEY
As the default azure access/secret π
Do you have any experience and things to watch out for?
Yes, for testing start with cheap node instances π
If I remember correctly everything is preconfigured to support GPU instances (aka nvidia runtime).
You can take one of the templates from here as a starting point:
https://aws.amazon.com/blogs/compute/running-gpu-accelerated-kubernetes-workloads-on-p3-and-p2-ec2-instances-with-amazon-eks/
Yes, that sounds like a good start, DilapidatedDucks58 can you open a github issue with the feature request ?
I want to make sure we do not forget
DeliciousBluewhale87 Is it ML or DL serving you are after ?
Can you see all the agent in the UI (that basically means they are configured correctly and can connect to the server)
Try to add '--network host' to the docker args on the task you are launching
The fact is that I use docker for running clearml server both on Linux and Windows.
My question was on running the agent, is it running with --docker
flag, i.e. docker mode
Also, just forgot to note, that I'm running clearml-agent and clearml processes in virtual environment - conda environment on Windows and venv on Linux.
Yep that answers my question above π
Does it make any sense to chdngeΒ
system_site_packages
Β toΒ
true
Β if I r...
so far I understand, clearml tracks each library called from scripts and saves the list of this libraries somewhere (as I assume, this list is saved as requirements.txt file somewhere - which is later loaded into venv, when pipeline is running).
Correct
Can I edit this file (just to comment the row with "object-detection==0.1)?
BTW, regarding the object-detection library. My training scripts have calls like:
Yes in the UI, iu can right click on the Task select "reset", then it...
To automate the process, we could use a pipeline, but first we need to understand the manual workflow
https://stackoverflow.com/questions/60860121/plotly-how-to-make-an-annotated-confusion-matrix-using-a-heatmap
MagnificentSeaurchin79 see plotly example here:
https://allegro.ai/clearml/docs/docs/examples/reporting/plotly_reporting.html
SmilingFrog76
there is no internal scheduler in Trains
So obviously there is a scheduler built into Trains, this is the queues (order / priority)
What is missing from it is multi node connection, e.g. I need two agents running the exact same job working together.
(as opposed to, I have two jobs, execute them separately when a resource is available)
Actually my suggestion was to add a SLURM integration, like we did with k8s (I'm not suggesting Kubernetes as a solution for you, the op...
adding the functionality to clearml-task sounds very attractive!
Hmm, what do you think?parser.add_argument('--configuration', type=str, default=None, help='Specify local configuration file' ) parser.add_argument('--configuration-name', type=str, default=None, help='configuration section name' ) ... with open(args.configuration, 'rt') as f: create_populate.task.set_configuration_object(args.name, config_text=f.read())
Add h...
pip cache & git cache & venvs cache
Are all supported, you just need to map the folders.
If you do not want to spin a PVC with NFS mount, you can just mount an S3 bucket with s3fs as part of the container extra bash script,
https://github.com/allegroai/clearml-agent/blob/b39b54bbafab39e6731cb742fdf317bc6dcae54a/docs/clearml.conf#L140
s3 FUSE fuse filesystems:
https://github.com/kahing/goofys
https://github.com/s3fs-fuse/s3fs-fuse
WDYT?
Hi SuperiorDucks36
Could you post the entire log?
(could not resolve host seems to be coming from the "git clone" call).
Are you able to manually clone the repository on the machine running trains-agent
Please attach the log π