I think this is the main issue. Is this reproducible? How can we test that?
Hmm I suspect the 'set_initial_iteration' does not change/store the state on the Task, so when it is launched, the value is not overwritten. Could you maybe open a GitHub issue on it?
WickedGoat98
for such pods instantiating additional workers listening on queues
I would recommend creating a "devops" user and spreading its credentials across all agents. Sounds good?
EDIT:
There is no limit on number of users on the system, so login as a new one and create credentials in the "profile" page :)
WickedGoat98 until the next RC release (should not take long) this will solve it:
df = pd.concat([tickerDf.Close, tickerDf_Change.Close_pcent], axis=1)
df = df[1:]
df.index = df.index.astype(str)
setattr(df, 'ticker', args.symbol)
Basically removing the nan and converting the datetime to string representation (so plotly.js likes it)
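A minimal, self-contained sketch of the same workaround on made-up data. The prices, dates, and variable names here are hypothetical stand-ins for the ticker data in the thread; only the two fix-up steps (dropping the NaN row, casting the datetime index to strings) come from the messages above.

```python
import pandas as pd

# Hypothetical stand-in for the ticker data: a datetime-indexed close price
# and its fractional change. The first change has no previous value, so it is NaN.
index = pd.date_range("2021-01-04", periods=4, freq="D")
close = pd.Series([100.0, 102.0, 101.0, 105.0], index=index, name="Close")
close_pcent = (close / close.shift(1) - 1).rename("Close_pcent")

df = pd.concat([close, close_pcent], axis=1)
df = df[1:]                      # drop the first row, which holds the NaN
df.index = df.index.astype(str)  # datetime index -> plain strings

print(df)
```

After these two steps the frame holds only finite numbers and a string index, which is the shape the workaround aims for before the figure is built.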
WickedGoat98 Nice!!!
BTW: The fix should solve both (i.e. no need to manually cast), I'll make sure the fix is on GitHub so you'll be able to verify 🙂
WickedGoat98 what's the clearml version you are using?
WickedGoat98 Same for me, let me ask the UI guys, I think this is a UI bug.
Also maybe before you post the article we could release a fix to both, what do you think?
EDIT:
Never mind 🙂 i just saw the medium link, very cool!!!
Okay, I was able to reproduce it (this is odd) let me check ...
Hi WickedGoat98
I try to write an article on Medium about ClearML and face a problem with plotly figures.
This is awesome !
I ran the plotly_reporting.py example locally and the uploaded plot was ok.
So are you saying the same example code from the repository worked okay on your server but showed nothing on the hosted server ?
Hey WickedGoat98
I found the bug: the numpy array (passed to plotly) contains both datetime and nan values, and plotly.js does not like it. I'll make sure this is fixed. In the meantime you can just remove the first row (it contains the nan):
df = pd.concat([tickerDf.Close, tickerDf_Change.Close_pcent], axis=1)
df = df[1:]
WickedGoat98 this is awesome! Let me know how I could help 🙂
BTW: I checked regarding the plot comparison, this is a BE issue due to the size of the plot, I was told a fix will be deployed in a day or two.
WickedGoat98 is this related to plotly opening a web page when you call the show() method?
You can do:
if not Task.running_locally():
    fig.show()
clearml-task does not seem to allow me to pass the run argument without a value
EnviousStarfish54 did you try --args run=True ?
I'm assuming run is a boolean of some sort?
a. The submitted job would automatically download data from an internal data repository, but it will be time consuming if the data is re-downloaded every time. Does ClearML cache the data somewhere?
What do you mean by the agent will download the data? Are you referring to Dataset?
Will they get ordered ascending or descending?
Good point, I'll check the docs... but I think they do not specify
https://clear.ml/docs/latest/docs/references/sdk/task#taskget_tasks
From the code it seems the order is not guaranteed.
You can however pass order_by: '-last_update', which will give you the most recently updated first
task_filter = {
    'page_size': 2,
    'page': 0,
    'order_by': ['last_metrics.{}.{}'.format(title, series), '-last_update']
}
Task.get_tasks(...
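The filter above could be wrapped in a small helper so the ordering is explicit at the call site. This is only a sketch: `build_task_filter` is a hypothetical name, the `validation`/`accuracy` values and the project name are made up, and the `last_metrics.<title>.<series>` key format is copied from the snippet as-is without checking how the server resolves it. The actual query is left as a comment because it needs a configured server.

```python
def build_task_filter(title, series, page_size=2, page=0):
    """Build a task_filter dict ordered by a metric, then by latest update."""
    return {
        "page_size": page_size,
        "page": page,
        "order_by": [
            # Key format taken from the thread, not verified against the server.
            "last_metrics.{}.{}".format(title, series),
            # A leading '-' asks for descending order, i.e. latest update first.
            "-last_update",
        ],
    }


task_filter = build_task_filter("validation", "accuracy")
print(task_filter["order_by"])

# With a configured server, the filter would be passed along these lines:
#   from clearml import Task
#   tasks = Task.get_tasks(project_name="my_project", task_filter=task_filter)
```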
Still feels super hacky tho, I think it would be nice to have a simpler way or at least some nice documentation
YES you are absolutely correct, we should add it to the Task interface.
Any chance you add a GitHub issue so we do not forget ?
BTW: is this on the community server or self-hosted (aka docker-compose)?
NastyOtter17
Usually the first report will happen after 30 seconds, could that be the difference ?
Yes, you are too quick for the resource monitoring 🙂
Hi EagerOtter28
Let's say we query another time and get 60k images. Now it is not trivial to create a new dataset B but only upload the diff: ...
Use Dataset.sync (or clearml-data sync) to check which files were changed/added.
All files are already hashed, right? I wonder why clearml-data does not keep files in a semi-flat hierarchy and group them together into datasets?
It kind of does, it has a full listing of all the files with their hash (SHA2) values, ...
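To make the idea of hash-based change detection concrete, here is a minimal standard-library sketch: it records a hash per file, then compares two listings to find what was added, modified, or removed. This is an illustration only, not how clearml-data is implemented; `hash_folder` and `diff_manifests` are hypothetical names, and SHA-256 is assumed as the "SHA2" variant.

```python
import hashlib
import tempfile
from pathlib import Path


def hash_folder(root):
    """Map each file's relative path to the SHA-256 hex digest of its content."""
    root = Path(root)
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def diff_manifests(old, new):
    """Return (added, modified, removed) relative paths between two manifests."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    modified = sorted(k for k in set(old) & set(new) if old[k] != new[k])
    return added, modified, removed


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "a.jpg").write_bytes(b"first image")
    (folder / "b.jpg").write_bytes(b"second image")
    before = hash_folder(folder)  # listing for the first version

    (folder / "b.jpg").write_bytes(b"second image, edited")
    (folder / "c.jpg").write_bytes(b"third image")
    after = hash_folder(folder)   # listing after the next query

    added, modified, removed = diff_manifests(before, after)
    print(added, modified, removed)
```

Only the paths in `added` and `modified` would need to be uploaded for the new version; unchanged files can be referenced from the earlier listing.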
If I checkout/download dataset D on a new machine, it will have to download/extract 15GB worth of data instead of 3GB, right? At least I cannot imagine how you would extract the 3GB of individual files out of zip archives on S3.
Yes, I'm not sure there is an interface to extract only partial files from the zip (although worth checking).
I also remember there is a GitHub issue about uploading a 50GB dataset, and the bottom line is, we should support setting the chunk size, so that we can uploa...
I have to leave, I'll be back online in a couple of hours.
Meanwhile, check that the ports are correct (just curl each port and see if you get an answer). If everything is okay, try running the text example again.
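If curl is not at hand, a rough equivalent of the port check can be sketched in Python. The host and the port numbers below are assumptions (8081 is the files server port quoted in this thread; 8080 and 8008 are assumed defaults for the web and API servers), so adjust them to your deployment. Note that this only tests whether a TCP connection succeeds, not whether the service answers HTTP correctly.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Assumed values: adjust the host and ports to match your own deployment.
SERVER_HOST = "localhost"
SERVER_PORTS = {"web_server": 8080, "api_server": 8008, "files_server": 8081}

for name, port in SERVER_PORTS.items():
    state = "reachable" if is_port_open(SERVER_HOST, port, timeout=1.0) else "no answer"
    print("{} (port {}): {}".format(name, port, state))
```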
And the agent section on this machine is:
api_server:
web_server:
files_server:
Is that correct?
Hi WickedGoat98
"Failed uploading to //:8081/files_server:"
Seems like the problem. What do you have defined as files_server in the trains.conf?
or do you mean the machine I ran the experiment locally?
Yes this one
WickedGoat98 Actually the fileserver replied, so it all looks fine to me.
Try to run the text example again, and see if you are still getting the fileserver error.