Also, we use Dagster for orchestration.
@<1523701070390366208:profile|CostlyOstrich36> here is the INFO section:
Here is a sample from the end of the logs where it failed:
/home/megan/code/direct-relief-forecasting/forecast.py:277: FutureWarning:
The behavior of DatetimeProperties.to_pydatetime is deprecated, in a future version this will return a Series containing python datetime objects instead of an ndarray. To retain the old behavior, call `np.array` on the result
2024-07-04 12:14:04.589 | INFO | forecast:infer:281 - Inferring for dates: ['2024-06-13']
2024-07-04 12:14:04.734 | INFO | forecast:infer:298 - Inferring for 29,838 materials
2024-07-04 12:14:04.735 | INFO | forecast:infer:302 - With MA, forecast is missing for 24,888 (83.4%) materials
2024-07-04 12:14:04.736 | INFO | forecast:infer:302 - With LGBM, forecast is missing for 0 (0.0%) materials
2024-07-04 12:14:04.737 | INFO | forecast:infer:302 - With LGBM_lower, forecast is missing for 0 (0.0%) materials
2024-07-04 12:14:04.737 | INFO | forecast:infer:302 - With LGBM_upper, forecast is missing for 0 (0.0%) materials
2024-07-04 12:14:04.742 | INFO | forecast:infer:320 - Fixing all forecasts (adjusting to guardrails)
2024-07-04 12:14:04.743 | INFO | src.forecasting.inference:fix_forecast:175 - 42 (0.1%) forecasts are negative
2024-07-04 12:14:04.743 | INFO | src.forecasting.inference:fix_forecast:178 - Setting negative forecasts to 0.
2024-07-04 12:14:04.744 | INFO | src.forecasting.inference:fix_forecast:197 - 177 (0.6%) of forecasts have negative lower bound
2024-07-04 12:14:04.745 | INFO | src.forecasting.inference:fix_forecast:203 - Setting negative lower bounds to zero
2024-07-04 12:14:04.746 | INFO | src.forecasting.inference:fix_forecast:220 - 81 (0.3%) of lower bounds are greater than the forecast
2024-07-04 12:14:04.747 | INFO | src.forecasting.inference:fix_forecast:226 - Adjusting lower bounds to forecasts
2024-07-04 12:14:04.748 | INFO | src.forecasting.inference:fix_forecast:248 - 88 (0.3%) of upper bounds are lower than the forecast
2024-07-04 12:14:04.749 | INFO | src.forecasting.inference:fix_forecast:254 - Adjusting upper bounds to forecasts
2024-07-04 12:14:04.765 | INFO | src.forecasting.inference:forecast:307 - 29,838 forecasted materials
2024-07-04 12:14:04.766 | INFO | src.forecasting.inference:forecast:312 - Columns to be used as predictors: ['LGBM_lower', 'LGBM', 'LGBM_upper', 'MA']
2024-07-04 12:14:04.767 | INFO | src.forecasting.inference:forecast:313 - Mapping columns following {'LGBM_lower': 'forecast_lower', 'LGBM': 'forecast', 'LGBM_upper': 'forecast_upper', 'MA': 'baseline'}
2024-07-04 12:14:04.772 | INFO | src.forecasting.inference:forecast:329 - Getting forecasts for existing products
2024-07-04 12:14:04.772 | INFO | src.forecasting.inference:forecast:335 - We have at least one non-NA material -- proceed to merge with forecasts for existing materials
2024-07-04 12:14:04.780 | INFO | src.forecasting.inference:forecast:360 - 1 products in the current offer have an existing forecast
2024-07-04 12:14:04.781 | INFO | src.forecasting.inference:forecast:375 - Identifying existing products and new products
2024-07-04 12:14:04.783 | INFO | src.forecasting.inference:forecast:394 - Number of existing materials: 1
2024-07-04 12:14:04.784 | INFO | src.forecasting.inference:forecast:397 - Number of new materials: 23
2024-07-04 12:14:04.785 | INFO | src.forecasting.inference:forecast:411 - 23 missing -- use similarities
2024-07-04 12:14:04.785 | INFO | src.forecasting.new_product:forecast:1027 - Finding similarities by NDC
2024-07-04 12:14:04.792 | INFO | src.forecasting.new_product:forecast_by_ndc:924 - 0 existing products with same NDCs
2024-07-04 12:14:04.793 | INFO | src.forecasting.new_product:forecast:1036 - 23 products still missing
2024-07-04 12:14:04.794 | INFO | src.forecasting.new_product:forecast:1037 - Finding similarities by generic name
2024-07-04 12:14:04.800 | INFO | src.forecasting.new_product:forecast_by_generic_name:811 - 207 existing products with same generic name
2024-07-04 12:14:04.800 | INFO | src.forecasting.new_product:forecast_by_generic_name:817 - Incorporating forecast by shared generic name
2024-07-04 12:14:04.807 | INFO | src.forecasting.new_product:drop_new_forecasts_of_existing_products:190 - There are 1 existing products and 0 of them share the same `generic_name`
2024-07-04 12:14:04.821 | INFO | src.forecasting.new_product:forecast:1058 - No more missing products -- skipping NLP matching
/home/megan/code/direct-relief-forecasting/src/forecasting/new_product.py:1104: FutureWarning:
The behavior of DataFrame concatenation with empty or all-NA entries is deprecated. In a future version, this will no longer exclude empty or all-NA columns when determining the result dtypes. To retain the old behavior, exclude the relevant entries before the concat operation.
2024-07-04 12:14:04.829 | INFO | forecast:infer:350 - Fixing forecasts (adjusting to guardrails)
2024-07-04 12:14:04.830 | INFO | src.forecasting.inference:fix_forecast:175 - 0 (0.0%) forecasts are negative
2024-07-04 12:14:04.831 | INFO | src.forecasting.inference:fix_forecast:197 - 0 (0.0%) of forecasts have negative lower bound
2024-07-04 12:14:04.831 | INFO | src.forecasting.inference:fix_forecast:220 - 0 (0.0%) of lower bounds are greater than the forecast
2024-07-04 12:14:04.832 | INFO | src.forecasting.inference:fix_forecast:248 - 0 (0.0%) of upper bounds are lower than the forecast
2024-07-04 12:14:04.833 | INFO | forecast:infer:366 - Forecast output column list: ['pd_id', 'date', 'offer', 'vendor_name', 'material', 'ndc', 'description', 'description_level1', 'description_level2', 'description_level3', 'description_level4', 'description_level5', 'generic_name', 'dosage_form', 'administration_route', 'strength_name', 'redbook_family', 'expiration_date', 'base_unit', 'price', 'offered', 'baseline', 'forecast_lower', 'forecast', 'forecast_upper', 'existing_new']
2024-07-04 12:14:04.833 | INFO | forecast:infer:368 - Missing forecast output column list: ['restriction']
2024-07-04 12:14:04 +0000 - dagster - INFO - forecasting_allocation_job - 5be181a6-2e8b-477d-a752-f7f1fe7236e0 - forecasting_op - Saving model run instance with url
to the database
2024-07-04 14:14:16
2024-07-04 12:14:16 +0000 - dagster - INFO - forecasting_allocation_job - 5be181a6-2e8b-477d-a752-f7f1fe7236e0 - forecasting_op - Getting forecast predictions for materials from the database for forecast_date 2024-06-13...
2024-07-04 14:14:21
2024-07-04 12:14:21 +0000 - dagster - INFO - forecasting_allocation_job - 5be181a6-2e8b-477d-a752-f7f1fe7236e0 - forecasting_op - Inserting forecast predictions for all materials into the database.
2024-07-04 14:14:25
2024-07-04 12:14:25 +0000 - dagster - ERROR - forecasting_allocation_job - 5be181a6-2e8b-477d-a752-f7f1fe7236e0 - forecasting_op - You are trying to merge on datetime64[ns] and object columns for key 'forecast_date'. If you wish to proceed you should use pd.concat
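For what it's worth, that last ERROR line is a pandas error, not an orchestration one: pandas refuses to merge when the key is `datetime64[ns]` on one side and `object` (for example strings or `datetime.date` values read back from a database) on the other. Below is a minimal sketch of one common way around it, assuming the key can be parsed as dates on both sides. The frames `predictions` and `from_db`, the `model_run_id` column, and the helper `align_datetime_key` are hypothetical stand-ins, not names from the actual project.

```python
import pandas as pd

# Hypothetical stand-in for the freshly computed forecasts (datetime64 key).
predictions = pd.DataFrame({
    "forecast_date": pd.to_datetime(["2024-06-13", "2024-06-13"]),
    "material": ["A", "B"],
    "forecast": [10.0, 20.0],
})

# Hypothetical stand-in for rows read back from the database, where the
# date column arrived as plain strings (object dtype).
from_db = pd.DataFrame({
    "forecast_date": ["2024-06-13", "2024-06-13"],
    "material": ["A", "B"],
    "model_run_id": [1, 1],
})


def align_datetime_key(df: pd.DataFrame, key: str = "forecast_date") -> pd.DataFrame:
    """Return a copy of df whose key column is parsed with pd.to_datetime."""
    out = df.copy()
    out[key] = pd.to_datetime(out[key])
    return out


# Normalising both sides before merging gives the key the same dtype
# on the left and the right, so the merge no longer raises.
merged = align_datetime_key(predictions).merge(
    align_datetime_key(from_db),
    on=["forecast_date", "material"],
    how="left",
)
print(merged.dtypes)
```

Checking `df.dtypes` on both frames right before the failing merge should show which side carries the `object` column, and therefore where the conversion belongs.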
Hi @<1654294828365647872:profile|GorgeousShrimp11> , can you provide a log for such a task? What is the status change in the INFO section?